Dictate

Speaker-dependent speech recognition software from the company MacSpeech.

These notes are clippings and/or a synthesis of materials, mainly from the MacSpeech support mailing list.

Dictate stores training sessions in a profile, along with other settings such as the type of microphone you are using and the amount of background noise. Profiles live in a folder called MacSpeech Profiles, which occupies about 40 MB and must reside in the user's Documents folder. In terms of personal data, the MacSpeech Profiles folder is the critical one to back up. The application itself is placed in the Applications folder. Within the user's Library folder, a MacSpeech folder is placed in Application Support, and the file com.macspeech.dictate.plist is placed within Application Support > Preferences.
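Since the MacSpeech Profiles folder is the one to back up, the copy can be scripted. The following is a minimal sketch, not a MacSpeech tool: the function name and the timestamped destination naming are illustrative assumptions, and only the folder name and its location in Documents come from the notes above.

```python
import shutil
import time
from pathlib import Path


def backup_profiles(documents_dir: Path, backup_root: Path) -> Path:
    """Copy the MacSpeech Profiles folder into a timestamped backup folder."""
    source = documents_dir / "MacSpeech Profiles"
    if not source.is_dir():
        raise FileNotFoundError(f"No profiles folder at {source}")
    # The timestamp suffix is an assumption chosen to keep backups distinct.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    destination = backup_root / f"MacSpeech Profiles {stamp}"
    # copytree refuses to overwrite an existing destination, so an older
    # backup is never silently replaced.
    shutil.copytree(source, destination)
    return destination
```

A typical call would be backup_profiles(Path.home() / "Documents", Path("/Volumes/Backup")), where the backup volume path is hypothetical. Quit Dictate first so the profile is not being written to while it is copied.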

Dictate creates and maintains a separate dictation session per document opened or created, allowing it to keep track of words and punctuation for each window.

The program's Microphone Setup assistant should suitably adjust the levels of most microphones. If you use a microphone that is not MacSpeech-certified and are getting accuracy below 95%, that is a good indication the microphone is not compatible with MacSpeech Dictate. Most problems with achieving an acceptable level of accuracy with MacSpeech Dictate can be traced to improper setup; see "Troubleshooting Training" on approximately page 12 of the manual.

Until MacSpeech provides a Correction feature to automate the correction of misrecognition and at the same time improve users' voice profiles, correction is achieved by selecting the word(s) — which you can optionally command Dictate to do — and either re-dictating or typing the correct word(s) manually.

Note: accuracy is also improved by ensuring proper microphone positioning, enunciating clearly, thinking before you speak, reading more stories (Tools menu > Voice Training > Choose Story) and, of course, making sure you have not selected the wrong profile if you have more than one.

Incompatibilities

  • TextExpander has been found incompatible. Example: speaking "Page 2: Because we have no figures from Late Antiquity, it is common to estimate the numbers in the Roman Empire." produces "iBctht o:a er  taoehav niyfxeursl ta eitc,i  tic moomt  eotsmita et emnebsri nht eoRamnE prie."
    • compatibility apparently fixed in v 2.2 of TextExpander?

Support

MacSpeech greatly expanded their support capabilities in conjunction with MacSpeech Dictate. Support for MacSpeech Dictate is now handled by a firm that provides free telephone support between the hours of 8 AM and midnight Eastern Monday through Friday excluding federally mandated holidays. You may also contact the support team by e-mail or fax. (Anything they can't handle then gets escalated.)

Phone: 410-568-3645 Fax: 410-891-0215 Email: dictate-support@macspeech.com

Manually updating to new versions

If you prefer not to use the automatic update, you can download the new version directly from a link. The following was current as of May 1, 2008: http://www.macspeech.com/release/MacSpeech_Dictate_2226.zip

  • Download the compressed application from the URL above
  • Unzip the application if your browser has not already done so for you
  • Please make sure that MacSpeech Dictate is not running
  • Drag the application into the folder where you have previously installed MacSpeech Dictate
  • When it asks if you wish to Replace the file, click Replace

You are now free to use your updated version of Dictate. You do not have to retrain your voice profile.

Canadian accents

MacSpeech Dictate has six different dialects of the English language built in. Just as with the previous iListen program, where you could have accuracy issues if you needed the UK English version but were using the US English version instead, choosing the appropriate English dialect when creating a voice profile is critical to achieving the best results.

Let me give you one example from a Canadian customer of ours: she chose the British English (also known as UK English) dialect when creating the voice profile in order to get British spellings, because she is in Canada. However, the Canadian style of speech more closely resembles US English, so she was having a number of accuracy issues. Once she switched to US English and created a new voice profile at my instruction, things improved dramatically. The downside was that she got American rather than British spellings for some words. But since most of what she did was in a word processor, its search and replace functions took care of that.
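For text that never passes through a word processor, the same search-and-replace cleanup can be scripted. This is only a sketch: the word list is a hypothetical sample, not anything shipped with Dictate, and it replaces whole words only.

```python
import re

# Hypothetical sample mapping; extend it with the words you actually use.
US_TO_UK = {
    "color": "colour",
    "center": "centre",
    "neighbor": "neighbour",
}

# Match any listed word as a whole word, regardless of case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, US_TO_UK)) + r")\b", re.IGNORECASE
)


def to_british(text: str) -> str:
    """Replace whole-word American spellings, keeping a leading capital."""
    def swap(match):
        word = match.group(0)
        replacement = US_TO_UK[word.lower()]
        return replacement.capitalize() if word[0].isupper() else replacement
    return _PATTERN.sub(swap, text)
```

Because only whole words match, derived forms such as "colorful" are left alone unless they are added to the mapping. A word typed entirely in capitals comes back with only its first letter capitalized.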

Command and control

MacSpeech Dictate depends on an application's menu item being enabled at the time the command is called. To check whether it is, Option-double-click the "Access Find… window" command in the Available Commands window, or open the Commands window while in MacSpeech Dictate. Then identify the running application of interest in the left column and click it to see its command set and whether the command is "active" (enabled).

If MacSpeech Dictate is accessing a menu item, that menu item has to be enabled at the time you issue the command.

If the access find window command does not work despite the application command being enabled, a bug report should be filed with the application's maintainers.

Importing command sets from iListen

Export from iListen, then import into Dictate. You have to quit and relaunch Dictate to see the imported command sets listed; MacSpeech knows about this and has it as a "to do".

Data corruption repair

There is a Tech Note about the Dictate Data First Aid utility, which helps fix corrupt data and related issues that arise during the creation of profiles. http://www.macspeech.com/article_info.php?articles_id=296

The cause of a data disk issue has been repaired.

Microphones

The Sennheiser ME3 is comfortable and said to be very good for improving accuracy. The Sennheiser microphone does work quite well with NaturallySpeaking on the PC. One user, in Dictate, was at 95% with it, whereas with the much less expensive Plantronics from the Dictate list they were at 98%. The reason it isn't on the MacSpeech list yet is that it has not been tested with Dictate, which means the company cannot certify it as compatible with the software. Does that mean the microphone will not work and produce good results with MacSpeech Dictate? Not necessarily.

It is important to remember that although MacSpeech Dictate is in fact based on the Dragon speech recognition engine (which is admittedly a lot more tolerant about which microphone you use with it) the Windows operating system and the Macintosh operating system do handle audio entirely differently.

Feel free to try a microphone that is not on the company list. It may work quite well for you. But if the microphone is not on the list they cannot officially support that microphone or assist you in troubleshooting problems that may be related to the microphone.

Changing microphones

A new user profile created for a new microphone will still be able to access existing commands, because Dictate associates commands on a per-user basis as opposed to a per-profile basis. Dictate command sets are common to, and shared among, all of a user's profiles.

Microphone placement

The rear USB port on a MacBook is on an internal hub, whereas the front one is not; the front port may therefore be more reliable and provide better throughput. Users have noticed interference when using the port near the screen and discovered that the front one was better. Reference: http://blog.wired.com/gadgets/2008/05/the-macbook-all.html

Capitalization

To enter "wordcap" mode for a single word, for example to produce "Word", say "cap" or "cap next" instead of "caps" (iListen used the plural, but Dictate uses the singular).

Open parenthesis and other characters

When MacSpeech Dictate types, high-ASCII characters are typed using multiple keystrokes. The code that does this used to work but is now broken for some reason.

The cursor ends up in the wrong place because, instead of one character (just the Euro symbol), three characters were typed, and there is some other cursor positioning going on due to the keystrokes generated. There is no workaround at the moment short of dictating some other one-character symbol and then correcting by hand.

Dictate: open parenthesis 100,000

resulting in: (100,000

Then say "Do Select open parenthesis"

Type the Euro Sign

changing it to: €100,000

Then don't forget to put the cursor back by saying "go to end".

Yes, I know this is not pretty, nor what it should be, but it is a way around this temporary limitation.

We have a different means of typing under development, that is supposed to be less susceptible to keyboard and operating system changes. So until MacSpeech Dictate can properly "type" these characters the workaround is what I described.

Now if you don't like "open parenthesis" you can certainly use a character that has the following characteristics:

  • The character is one character long and is typed correctly.
  • It "clings right grammatically": grammar rules indicate no following space.

"Dollar sign" is shorter and fits those rules! (veiled attempt to take over the world's money :-))

Seriously though, we will fix this as quickly as we can.
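As background to the "3 characters" observation in this section: the notes do not say how Dictate represents these symbols internally, but one possible explanation, offered here purely as an assumption, is that the Euro sign occupies three bytes in UTF-8 while the symbols that type correctly occupy one. The sketch below only demonstrates that encoding fact; the helper name is illustrative.

```python
def utf8_byte_count(symbol: str) -> int:
    """Return how many bytes one character occupies when encoded as UTF-8."""
    return len(symbol.encode("utf-8"))


# Compare the symbols mentioned in this section.
for symbol in ["(", "$", "€"]:
    print(repr(symbol), utf8_byte_count(symbol))
```

This prints 1 for the open parenthesis and the dollar sign and 3 for the Euro sign, which at least matches the symptom described.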

Text macros and capitalization

Dictate can track and allow you to modify a text macro verbally, but unfortunately, right now any special capitalization that you put into the text macro will be ignored. Another item we are working on fixing. iListen would faithfully reproduce your text macro, but you could never modify any text that was dictated prior to the text macro. MacSpeech Dictate can modify text dictated before a text macro. One step forward, one step backward. Next, we step forward again.

When creating simple text insert commands like "signature block" and "email address", ensure that the macro text does not contain high-ASCII characters (foreign language characters, etc.), because high-ASCII characters will cause text macros to fail.
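Macro text can be checked for such characters before it is pasted into a command. A minimal sketch, assuming "high ASCII" means any character with a code point above 127; the function name is illustrative and not part of Dictate.

```python
def find_non_ascii(macro_text: str) -> list:
    """Return (position, character) pairs for every non-ASCII character."""
    return [(i, ch) for i, ch in enumerate(macro_text) if ord(ch) > 127]
```

An empty list means the text is plain ASCII. Otherwise each pair gives the zero-based position of an offending character, so it can be replaced or removed before the macro is saved.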

Vocabulary training options

At present MacSpeech Dictate has a limitation: you can only "add words" via vocabulary training once. They intend to remove this limitation in a future release. In the meantime, you can maintain a single file that you use to add words. Whenever you have something new to add to your vocabulary, append it to this "vocabulary addition" file and then run Vocabulary Training on the whole file. This way you will not lose words added in prior sessions.
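Keeping the "vocabulary addition" file up to date can be scripted. The sketch below assumes a plain UTF-8 text file with one term per line, which is a convenience chosen here rather than a format the notes specify. The function name is illustrative.

```python
from pathlib import Path


def add_to_vocabulary_file(vocab_file: Path, new_words: list) -> list:
    """Append words not already in the file; return the words actually added."""
    existing = []
    if vocab_file.exists():
        existing = vocab_file.read_text(encoding="utf-8").splitlines()
    known = {line.strip() for line in existing}
    added = []
    for word in new_words:
        word = word.strip()
        if word and word not in known:
            known.add(word)
            added.append(word)
    if added:
        # Rewrite the whole file so earlier entries are always kept.
        vocab_file.write_text("\n".join(existing + added) + "\n", encoding="utf-8")
    return added
```

After each call, run Vocabulary Training on the whole file as described above, so that words from earlier sessions are trained again along with the new ones.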

The Vocabulary Training option is found in the menu labeled "Tools".

In that same menu you will also find an option for training from a selection. For this option to be active you need to have an open document in the built-in Note Pad. Once you have such an open document you can paste text into that open document and then select the text.

This is an easy way to get a new term into the vocabulary "on-the-fly". The drawback to doing this is any contextual information (statistical information about where words typically appear in documents in relation to other words) that had been added to your voice profile as a result of analyzing files gets thrown out.

Some people prefer to train from a selection and add terminology on-the-fly. Others prefer to analyze documents. While analyzing documents does take a little more work, it will achieve better results in the long run.

Voice recognition tips

There are other things you can do to assure yourself of good accuracy. Probably the most important of these has to do with how you actually dictate. Although the current crop of speech recognition software uses continuous speech, it does not use conversational speech. By this I mean that, for high accuracy, every single word must be enunciated clearly, and it helps to talk in phrases because the system analyzes groups of words for context clues about the surrounding words. On the eMicrophones website's Links/Articles section at https://www.emicrophones.com/articles/index.asp, the very first reference has sound files on how to sound and how not to sound. In a nutshell, we try to feel our mouth form every word as we are dictating. It takes a little bit of practice, but the reduction in errors is well worth it.

Having a good microphone will also help. The microphone must not only pass the words you speak accurately, but must also reject background noise. If the microphone you currently have is the old VXI Parrot and not the TalkPro Parrot, you will benefit from a better microphone, as that one was not very good at noise canceling. In our experience, even in quiet environments a better noise-canceling microphone yields higher accuracy.

Technical

Dual core processors

Dictate (and iListen) run multiple threads, but both programs also follow Apple's recommendations of not explicitly telling the operating system how to delegate tasks to the various cores.

iListen and Dictate

Both programs can be installed without messing up either one. The only resource they share is the microphone, so if you have both programs running at the same time, you can only have the microphone turned on in one program at a time.

Now all that said, both programs do watch all other applications and all windows on the system, and they consume resources to do this. Memory consumption will go up and performance will go down (two programs monitoring everything you do will slow your machine down).

Macintosh "line in" audio inputs

Reposted from http://www.atpm.com/12.09/ilisten.shtml#33052, accessed May 2, 2008

All Macs since the very first Bondi Blue iMac and G3 PowerBook have had a "line in" sound port, but no "mic in" port. The difference is important. On Windows machines, most sound cards have inputs for both line in and mic in. Line in assumes an amplified signal, whereas mic in assumes an unamplified signal. In other words, if you plug an analog microphone into a line in port, there is not enough signal strength there to get any kind of decent results. Before going any further, I also need to mention that Macs want an audio signal (called "gain") about 10 dB higher than most Windows machines. This has been independently verified by Martin Markoe of eMicrophones.com. (This becomes important a bit later.)
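To put the 10 dB figure in perspective, decibels can be converted to plain ratios with the standard formulas. The sketch below is generic arithmetic, not a MacSpeech measurement, and the notes do not say whether the figure refers to signal amplitude or power, so both are shown.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert decibels to a voltage (amplitude) ratio: 10 ** (dB / 20)."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Convert decibels to a power ratio: 10 ** (dB / 10)."""
    return 10 ** (db / 10)
```

For 10 dB these give an amplitude ratio of roughly 3.16 and a power ratio of 10, so a microphone designed for the lower level would deliver about a third of the signal amplitude the Mac expects.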

OK, so you need a USB adapter of some kind to bring mic level sound into a Mac. There aren't that many chipsets out there for USB audio, but each manufacturer who uses a particular chipset can produce their own firmware. Those of you who have many microphones may have noticed they show up as "AK5370," "VXI 7.0x," "C-Media Headset" (even when the mic is not a headset), "Plantronics," and even "Unknown." When something shows up as "unknown" it means the manufacturer didn't label the firmware they are using, btw.

One more thing you need to understand: all microphones are analog - even "USB" microphones. What that means is that the sound is listened to by an analog mic, then converted to digital before being sent through the USB port. When the Mac gets the digital stream, it is identified as audio and then decoded back to analog by the Mac OS. From there, we can do something with it.

So here is where sound in on the Mac becomes "so tricky." If you take a microphone that was designed without the Mac in mind, it usually has a signal strength too low to get good results. This is perhaps a definitive example of "Garbage In, Garbage Out." If you search the web for the right things, you will find references from Skype for Windows users who complain that Mac users are harder to hear. This is because those Mac users are using Logitech microphones, none of which were designed with the Mac's higher gain requirement in mind. (Further corroboration of the 10 dB difference mentioned above.)

With iListen, this was especially important. With Dictate it may be less important simply because of the higher accuracy, but also, our engineers are now able to concentrate exclusively on Core Audio, so additional tweaking may improve results and broaden the gamut of microphones we are able to certify.

Professional microphones costing $400 or more may work less well than models that sell for less than $100. For the most part, this is because the expensive microphones lack the noise-canceling properties that have quite a bit to do with the accuracy one is able to achieve.

Typographers' quotes

The typographer's quote "`" also known as "grave accent" is apparently properly named "backtick" and is so-recognized by Dictate.

For Smart Quotes the following scripts work:

Begin forwarded message:

From: Leonard Rothbart
Date: May 29, 2008 7:31:03 AM PDT (CA)
Subject: Re: [MacSpeech-Support] Dictate is allergic to typographer's quotes...

These scripts work for me:


-- Open Smart Quote

set _currentAppName to short name of (info for (path to frontmost application as alias))
try
    tell application "System Events"
        tell process _currentAppName
            key code 33 using {option down}
        end tell
    end tell
end try


-- Close Smart Quote

set _currentAppName to short name of (info for (path to frontmost application as alias))
try
    tell application "System Events"
        tell process _currentAppName
            key code 33 using {option down, shift down}
        end tell
    end tell
end try


Topic revision: 21 Jun 2008, JamesBusser
 