Instruction, Training, Technology
Case Studies in Instruction with Speech-recognition Software
The following case studies are excerpted from There's a Dragon in My School by Literacy Solutions Director Shelley Lacey-Castelot, ATACP, MS. They are synopses of some of her work with individual students at Literacy Solutions.
The descriptions below appear mid-way through Ms. Lacey-Castelot's book, after several chapters devoted to describing the principles and operation of digital speech recognition. Thus, they were created for an audience already introduced to the somewhat specialized vocabulary of speech-recognition technology. They contain references that may be unfamiliar to the average person. When a parent, guardian, or educator arranges for speech-recognition or assistive-technologies instruction through Literacy Solutions, however, our staff explains every part of each instruction-progress report — in simple English, and in a relaxed and private setting.
David is a bright young man with dyslexia. When he was introduced to voice recognition, David was entering the fifth grade. His fine motor difficulties were greater than Tyler's [NOTE: Tyler was the subject of a previous, unexcerpted, chapter in There's a Dragon in My School]. His cognitive disorganization was even greater than Tyler's. When sitting in front of the keyboard, David was virtually at a loss. He was, however, able to manage the mouse with ease.
David was trained on Dragon NaturallySpeaking version 3.52. He was trained on a 300 MHz processor. While the enrollment process was somewhat shorter than it was with Tyler, it was still long.
David was a marvelous mimic of intonation, and his ability to hold phonemes in auditory memory was quite good, IF he was provided with simultaneous visual feedback. There were times, however, when David had some difficulty with repeating the words that I dictated for him during the enrollment process. Because David had a phonological processing disorder, he still had difficulty at times with sound sequencing in words; most notably, in words containing r's and l's. When David had to repeat phrases which contained more than two r or l phonemes, some consonant migration was noted and this did affect recognition of what he was saying during the enrollment process. Therefore, it was very important for the trainer to recognize this and to anticipate this difficulty. When a sentence contained several words with r's and l's, I needed to dictate smaller phrases to David than at other times. Once this adjustment was made, David was able to repeat phrases effectively and he continued on with the enrollment process. It did not take him long to understand that the computer could always hear him unless he selected the pause button, and he began to turn the microphone off using a mouse click early on. David also recognized after a short period of time that there were times when the computer did not recognize him if his speech was not clear. After just one demonstration, David began to know when he had to repeat a word and to recognize when he needed to click on the skip word button. About a third of the way into the training David clicked on the pause button, turned to me and said, "It understands me better when I say more words together." I was delighted by his observation. David is the first student that I have worked with who has spontaneously figured this out.
Total time to enroll David's voice was about an hour. His enrollment was accomplished in two half-hour sessions. Although the process was long and tedious, David remained highly motivated and attentive, as long as he had auditory and visual feedback. The interactive nature of the process and the ability to have a trainer beside him during the entire training period enabled David to avoid frustration.
The next step was for David to begin actually dictating sentences and to try some simple correction. While students are beginning to learn the process of voice recognition, it is often difficult for them to formulate sentences, even orally, while attending to the initial training process. Therefore, I often approach initial dictation in one of two ways. One of the methods is to suggest a sentence for the student to dictate. This leaves the student free to focus on the process of speaking into the microphone while watching their words converted to text on the screen before them. It also enables me to choose a sentence that I feel the student will be able to enunciate clearly enough that the computer will recognize what he has said with a high degree of accuracy. In David's case, he was fairly cognizant of the need to turn the microphone off at all times except when he was directly dictating. David also had good diction. Therefore, after suggesting only one sentence for David to repeat, I elected to encourage David to formulate his own sentences for dictation. This was a task that was challenging for David. He tended to go off on tangents when speaking and he had not had a lot of experience with composing in writing. I needed to model for David first how to shorten his verbiage into discrete thought units. I taught him to verbally rehearse what he wanted to say, at first out loud (with the microphone off) and later in his head. It was helpful for David to learn to think and to dictate in phrases rather than in long sentences. David was quite pleased to see his words "translated" into printed words on the screen before him.
David learned to make simple corrections without a great deal of difficulty as long as his trainer or teacher was beside him to help him read the words in the correction box. Although it is more helpful for some students to compose an entire paragraph before correcting, it helped David to stay focused when he corrected after every sentence or two. This forced him to continually review what he was writing, which enabled him to remain more focused and to keep his current train of thinking in the forefront of his mind.
David is learning to match his verbal output speed with the pace of the text production on the monitor screen. This helps keep David on target with the focus of his composition.
Nick is a personable young man with learning disabilities in the areas of reading and writing. Charm must surely be Nick's middle name. Nick was adopted from an overseas country at the age of two and a half. Although he did speak when adopted, he did not speak English. The younger of two children, Nick was raised by a family who immersed him in language and enrichment activities. At the time that Nick was introduced to voice recognition, he was just beginning the second grade. Nick was trained on a Compaq Deskpro 4000 with a Celeron chip running at 400 MHz, and 64 MB of RAM. By this time, Dragon NaturallySpeaking Professional version 4.0 had arrived. On version 4.0, training time for adults was just over five minutes (on a 400 MHz machine). With a child with a learning disability, training time averaged about fifteen to twenty minutes. The speech model that was chosen for Nick was the Student BestMatch Model. The vocabulary that was chosen for Nick was the Student General English BestMatch Model. One of the easier enrollment texts was chosen (Getting a New Bike). This enrollment text was only eleven paragraphs long, yet the trained voice file was just as accurate as Nick would have gotten from an earlier version of Dragon NaturallySpeaking with a longer and more difficult enrollment text.
Unlike the majority of students that I have trained, the visual anchor of text on the screen during the enrollment process was not helpful for Nick. In fact, this proved to be a distracter for Nick. Because it was so difficult for him to remember the short sequence of syllables that I read for him to repeat into the microphone, Nick automatically tried to read what was on the screen to help himself out. Unfortunately, he was unable to read quite a few of the words in the enrollment text, and what words he could read, he could not read in smooth phrases. As much as Nick tried to focus on my words, his eyes were constantly drawn back to the screen, causing him to be distracted from focusing on the words that I was reading to him from the enrollment text. Finally, I turned Nick's chair completely away from the computer screen so that he could not see it at all. Then I used my facial expressions and some strong eye contact to keep Nick's gaze focused on my face so that he would be better able to repeat the words that I read. Once I did this, Nick's progress through the enrollment text went much more smoothly. I did, however, need to limit the number of phonemes that I presented for Nick to repeat after me.
Once enrollment was completed, Nick began to do some simple dictation and correction. Nick's recognition accuracy initially was best when he dictated in phrases of three to four words only. When he tried to speak in longer phrases, Nick tended to stumble over his words in his effort to say the entire sentence before he forgot a piece of it. Although Nick was able to recognize at sight quite a few more words than he was able to write, he needed a high level of auditory support and feedback. The screen reader/spell check/homophone check program Keystone 99 was used with DNS to facilitate Nick's writing process. This program allowed Nick to use his visual strengths and to support his auditory weakness when writing so that he could devote his attention to the task of composing and dictating his thoughts. Keystone enabled Nick to have what he had written read back to him with word by word or sentence by sentence highlighting. Keystone also enabled Nick to use a pop-up homophone checker and to have his words echoed back to him as he dictated them. He found this echoing to be tremendously helpful to him. The echoing acted as an auditory cue for Nick.
In order to assess how helpful such features as this would be to him, we took Nick through the enrollment process in DragonDictate (a discrete speech recognition program), using Keystone 99 to "read" the enrollment text to him, bit by bit. Enrolling using this auditory prompting process and using the shorter phrases in DragonDictate seemed to be quite helpful to Nick. Yet, when Nick began to compose in DragonDictate, he was frustrated by the need to speak only one word at a time and to pause in between words. Therefore, we decided to continue his training in Dragon NaturallySpeaking with the support of Keystone 99. This proved to be the most successful venue for Nick's writing.
Although Nick needed to dictate in shorter phrases than many students, Dragon NaturallySpeaking Professional 4.0 was able to understand him quite well. Having Keystone echo what he had just dictated helped free up Nick's working memory to focus on his next thought. When Nick lost his train of thought, it was very helpful to him to be able to use a single keystroke to have Keystone read back to him what he had just written. This seemed to give him a nice running start into his next thought. Using Keystone to read the selections in the correction box enabled Nick to be much more independent in his writing process. Increasing the pause time between phrases seemed to help Dragon NaturallySpeaking recognize what Nick was saying when he spoke his words too slowly.
Nick became a fairly fluent writer. His ability to write by longhand also began to improve; perhaps because his ability to organize his thoughts and to fluently express them improved as he used Dragon NaturallySpeaking.
© Shelley Lacey-Castelot