The Acoustically Optimized Dragon

Is there an echo in here? It turns out the Dragon hasn't been completely healthy all this time. It worked, but not as well as it could have. Since installation, closing the application always resulted in an error, and although this didn't seem to affect the program's operation, it was an indication that something wasn't quite right.

I wasn't able to run the Acoustic Optimizer module, which is the part of the application that, as the name implies, attempts to understand and compensate for the acoustics of my particular speaking environment. Since I don't use a headset, and also like to wander around when I'm dictating—and in fact frequently dictate while driving—this function is particularly important. As you can imagine, changes in acoustics during dictation are quite capable of introducing error during the recognition process. For example, a slight echo or delay resulting from sound waves bouncing around in an enclosed space can be easily translated—in the Dragon's mind—to extra syllables in a word, when in fact it's only a reflected sound wave.

Yesterday, I finally had the opportunity to track this problem down and—big surprise—the culprit was a Microsoft application interfering with the Dragon's operation. Before I received the Dragon software I had been messing about with a rudimentary Microsoft speech application, which as I recall had been installed as part of the MS Office suite. Uninstalling the Microsoft application solved the problem.

The Acoustic Optimizer module is essentially an intense number-crunching operation that updates and modifies my user files; these are the files the Dragon uses to identify my particular voice characteristics, among other things. Although the system initially predicted a 2 1/2-hour duration for this process, the reality was closer to 20 minutes. The result of this optimization routine is a noticeable improvement in recognition accuracy, especially in the translation of those slurred contractions I've so often mentioned.

So life is good again, and the plan for the weekend is to read the Dragon a few more of the included sample texts to further improve its ability to understand what I'm saying. One side note in this regard: although it's obviously possible to read anything for the purpose of training, doing so requires subsequent corrections, since the Dragon has no way of knowing beforehand what the word ought to be. Dictating the included samples, on the other hand, means the Dragon already knows what the words should be, thereby eliminating the need for correction afterward.

Ideally, this investment in training will pay off in the long term, but from what I've read about the experiences of others, combined with my own, confidence is high.


P.S. For those of you keeping track of such things, this monologue is the result of about 12 minutes of dictation, and better yet, required virtually no corrections this time. Woot! The Acoustic Optimizer in action, I think.
