Case Studies in Instruction with Speech-recognition Software
The following case studies are excerpted from There's a Dragon in My School by Literacy Solutions Director Shelley Lacey-Castelot, ATACP, MS. They are synopses of some of her work with individual students at Literacy Solutions.
The descriptions below appear mid-way through Ms. Lacey-Castelot's book, after several chapters devoted to describing the principles and operation of digital speech recognition. Thus, they were created for an audience already introduced to the somewhat specialized vocabulary of speech-recognition technology. They contain references that may be unfamiliar to the average person. When a parent, guardian, or educator arranges for speech-recognition or assistive-technologies instruction through Literacy Solutions, however, our staff explains every part of each instruction-progress report — in simple English, and in a relaxed and private setting.
David
David is a bright young man with dyslexia. When he was introduced to voice recognition,
David was entering the fifth grade. His fine motor difficulties were greater than Tyler's [NOTE: Tyler was the subject of a previous, unexcerpted, chapter in There's a Dragon in My School].
His cognitive disorganization was even greater than Tyler's. When sitting in front of the
keyboard, David was virtually at a loss. He was, however, able to manage the mouse with
ease.
David was trained on Dragon NaturallySpeaking version 3.52, running on a computer with a
300 MHz processor. While the enrollment process was somewhat shorter than it was with Tyler, it
was still long.
David was a marvelous mimic of intonation, and his ability to hold phonemes in auditory
memory was quite good, IF he was provided with simultaneous visual feedback. There were
times, however, when David had some difficulty repeating the words that I dictated for
him during the enrollment process. Because David had a phonological processing disorder,
he still had difficulty at times with sound sequencing in words; most notably, in words
containing r's and l's. When David had to repeat phrases which contained more than two r
or l phonemes, some consonant migration was noted and this did affect recognition of what
he was saying during the enrollment process. Therefore, it was very important for the
trainer to recognize and anticipate this difficulty. When a sentence contained
several words with r's and l's, I needed to dictate smaller phrases to David than at other
times. Once this adjustment was made, David was able to repeat phrases effectively and he
continued on with the enrollment process. It did not take him long to understand that the
computer could always hear him unless he selected the pause button, and he began to turn
the microphone off using a mouse click early on. David also recognized after a short
period of time that there were times when the computer did not recognize him if his speech
was not clear. After just one demonstration, David began to know when he had to repeat a
word and to recognize when he needed to click on the skip word button. About a third of
the way into the training, David clicked on the pause button, turned to me, and said, "It
understands me better when I say more words together." I was delighted by his
observation. David is the first student that I have worked with who has spontaneously
figured this out.
Total time to enroll David's voice was about an hour. His enrollment was accomplished
in two half-hour sessions. Although the process was long and tedious, David remained
highly motivated and attentive, as long as he had auditory and visual feedback. The
interactive nature of the process and the ability to have a trainer beside him during the
entire training period enabled David to avoid frustration.
The next step was for David to begin actually dictating sentences and to try some
simple correction. While students are beginning to learn the process of voice recognition,
it is often difficult for them to formulate sentences, even orally, while attending to the
initial training process. Therefore, I often approach initial dictation in one of two
ways. One of the methods is to suggest a sentence for the student to dictate. This leaves
the student free to focus on the process of speaking into the microphone while watching
their words converted to text on the screen before them. It also enables me to choose a
sentence that I feel the student will be able to enunciate clearly enough that the
computer will recognize what he has said with a high degree of accuracy. In David's case,
he was fairly cognizant of the need to turn the microphone off at all times except when he
was directly dictating. David also had good diction. Therefore, after suggesting only one
sentence for David to repeat, I elected to encourage David to formulate his own sentences
for dictation. This was a task that was challenging for David. He tended to go off on
tangents when speaking, and he had not had a lot of experience with composing in
writing. I needed to model for David first how to shorten his verbiage into discrete
thought units. I taught him to verbally rehearse what he wanted to say, at first out loud
(with the microphone off) and later in his head. It was helpful for David to learn to
think and to dictate in phrases rather than in long sentences. David was quite pleased to
see his words "translated" into printed words on the screen before him.
David learned to make simple corrections without a great deal of difficulty as long as
his trainer or teacher was beside him to help him read the words in the correction box.
Although it is more helpful for some students to compose an entire paragraph before
correcting, it helped David to stay focused when he corrected after every sentence or two.
This forced him to continually review what he was writing, which enabled him to remain
more focused and to keep his current train of thinking in the forefront of his mind.
David is learning to match his verbal output speed with the pace of the text production
on the monitor screen. This helps keep David on target with the focus of his composition.
Nick
Nick is a personable young man with learning disabilities in the areas of reading and
writing. Charm must surely be Nick's middle name. Nick was adopted from an overseas
country at the age of two and a half. Although he did speak when adopted, he did not speak
English. The younger of two children, Nick was raised by a family who immersed him in
language and enrichment activities. At the time that Nick was introduced to voice
recognition, he was just beginning the second grade. Nick was trained on a Compaq Deskpro
4000 with a Celeron chip running at 400 MHz, and 64 MB of RAM. By this time, Dragon
NaturallySpeaking Professional version 4.0 had arrived. On version 4.0, training time for
adults was just over five minutes (on a 400 MHz machine). With a child with a learning
disability, training time averaged about fifteen to twenty minutes. The speech model that
was chosen for Nick was the Student BestMatch Model. The vocabulary that was chosen for
Nick was the Student General English BestMatch Model. One of the easier enrollment texts
was chosen (Getting a New Bike). This enrollment text was only eleven paragraphs long, yet
the trained voice file was just as accurate as Nick would have gotten from an earlier
version of Dragon NaturallySpeaking with a longer and more difficult enrollment text.
Unlike the majority of students that I have trained, the visual anchor of text on the
screen during the enrollment process was not helpful for Nick. In fact, this proved to be
a distracter for Nick. Because it was so difficult for him to remember the short sequence
of syllables that I read for him to repeat into the microphone, Nick automatically tried
to read what was on the screen to help himself out. Unfortunately, he was unable to read
quite a few of the words in the enrollment text, and what words he could read, he could
not read in smooth phrases. As much as Nick tried to focus on my words, his eyes were
constantly drawn back to the screen, causing him to be distracted from focusing on the
words that I was reading to him from the enrollment text. Finally, I turned Nick's chair
completely away from the computer screen so that he could not see it at all. Then I used
my facial expressions and some strong eye contact to keep Nick's gaze focused on my face
so that he would be better able to repeat the words that I read. Once I did this, Nick's
progress through the enrollment text went much more smoothly. I did, however, need to
limit the number of phonemes that I presented for Nick to repeat after me.
Once enrollment was completed, Nick began to do some simple dictation and correction.
Nick's recognition accuracy initially was best when he dictated in phrases of three to
four words only. When he tried to speak in longer phrases, Nick tended to stumble over his
words in his effort to say the entire sentence before he forgot a piece of it. Although
Nick was able to recognize at sight quite a few more words than he was able to write, he
needed a high level of auditory support and feedback. The screen reader/spell
check/homophone check program Keystone 99 was used with DNS to facilitate Nick's writing
process. This program allowed Nick to use his visual strengths and to support his auditory
weakness when writing so that he could devote his attention to the task of composing and
dictating his thoughts. Keystone enabled Nick to have what he had written read back to him
with word by word or sentence by sentence highlighting. Keystone also enabled Nick to use
a pop-up homophone checker and to have his words echoed back to him as he dictated them.
He found this echoing to be tremendously helpful to him. The echoing acted as an auditory
cue for Nick.
In order to assess how helpful such features as this would be to him, we took Nick
through the enrollment process in DragonDictate (a discrete speech recognition program),
using Keystone 99 to "read" the enrollment text to him, bit by bit. Enrolling
using this auditory prompting process and using the shorter phrases in DragonDictate
seemed to be quite helpful to Nick. Yet, when Nick began to compose in DragonDictate, he
was frustrated by the need to speak only one word at a time and to pause in between words.
Therefore, we decided to continue his training in Dragon NaturallySpeaking with the
support of Keystone 99. This proved to be the most successful venue for Nick's writing.
Although Nick needed to dictate in shorter phrases than many students, Dragon
NaturallySpeaking Professional 4.0 was able to understand him quite well. Having Keystone
echo what he had just dictated helped free up Nick's working memory to focus on his next
thought. When Nick lost his train of thought, it was very helpful to him to be able to use
a single keystroke to have Keystone read back to him what he had just written. This seemed to
give him a nice running start into his next thought. Using Keystone to read the selections
in the correction box enabled Nick to be much more independent in his writing process.
Increasing the pause time between phrases seemed to help Dragon NaturallySpeaking
recognize what Nick was saying when he spoke his words too slowly.
Nick became a fairly fluent writer. His ability to write by longhand also began to
improve; perhaps because his ability to organize his thoughts and to fluently express them
improved as he used Dragon NaturallySpeaking.
© Shelley Lacey-Castelot
This document may not be reproduced
in whole or in part without the express
written permission of Literacy Solutions, LLC