Remote CART: How I Survived My Ph.D. Research
By Kathryn Woodcock, Ph.D.
Remote CART offered several benefits to a deaf researcher interviewing hearing subjects.
For my Ph.D. research, I struggled for months trying to think of ways to avoid a fundamental barrier: I am deaf and I was not doing deaf-related research, as many deaf researchers are. Even worse, I was not doing research with computer models, fruit flies, rats, chemicals in test tubes, machines or structures. I was doing ergonomic research that needed to examine human reasoning, knowledge and problem solving. I tried to think of many other ways to look at problems, but I kept returning to the same types of questions that interested me.
There was no way around it. I needed to interview people — hearing, non-signing people — and I needed to use naturalistic, or qualitative, methods. I sign, and I normally prefer to use that method for access to ordinary meetings. It is 100 times more reliable than lipreading, and results in 1/100 of the fatigue. However, this type of research requires accurate and complete notes, and sign language reception monopolizes vision. Using sign language interpreters in my interviews would have enabled me to understand the interview concurrently, but not to take reliable and thorough notes. The chosen alternative was to use a realtime reporter.
With CART, a specially trained and prepared realtime reporter transcribes the interview as it is conducted, using a steno machine. The proceedings are recorded by entering a chord (key combination) for each syllable. Realtime reporters have a background in court stenography. Ordinarily, homonyms such as “to/two/too” could be keyed alike in the courtroom and deciphered from context in preparing the court transcript in a subsequent operation. Because these distinctions must be made in real time to make the transcript useful to the client, realtime reporters have supplemented court reporting training with additional training and experience to key homonyms distinctly and eliminate these so-called “conflicts.”
Computerized translation is performed from “steno-language” using a prepared English vocabulary, with the interview transcript displayed on a video monitor in real time. A delay of only seconds is typical with a skilled reporter, with an accuracy rate in excess of 95 percent. (“Inaccuracies” in this context are normally words for which the phonetic input fails to find a match in the English vocabulary, possibly because it is a technical term, or one of the syllables was miskeyed. It is usually possible to decipher the intended words without great difficulty if the reporter has adequate skill.)
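To make the translation step concrete, here is a minimal Python sketch of a dictionary lookup of this kind. It is an illustration only, not how Captivator/TurboCAT or any other steno software actually works. The chord spellings, the `STENO_DICT` table and the `translate` function are invented placeholders and do not follow any real steno theory. The sketch shows two ideas from the description above: homonyms receive distinct chords so that no "conflict" remains to be resolved later, and a stroke with no dictionary entry is displayed untranslated and counted against the accuracy rate.

```python
# Hypothetical chord-to-English dictionary. The chord spellings are made up
# for illustration; each homonym has its own distinct chord.
STENO_DICT = {
    "THE": "the",
    "TO": "to",
    "TWO": "two",
    "TAO": "too",
    "WENT": "went",
}


def translate(strokes, dictionary):
    """Translate a list of chords; return (text, fraction translated)."""
    words = []
    matched = 0
    for stroke in strokes:
        if stroke in dictionary:
            words.append(dictionary[stroke])
            matched += 1
        else:
            # No dictionary entry: show the raw stroke so the reader can
            # try to decipher the intended word from context.
            words.append("[" + stroke + "]")
    accuracy = matched / len(strokes) if strokes else 1.0
    return " ".join(words), accuracy


text, accuracy = translate(["THE", "TWO", "SUBJ", "WENT", "TAO"], STENO_DICT)
print(text)      # the two [SUBJ] went too
print(accuracy)  # 0.8
```

In this toy example the stroke "SUBJ" stands for a term missing from the vocabulary. Adding it to the dictionary beforehand would let the lookup succeed.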
Normally, realtime transcription is used for the closed captioning of live television events or for the display of transcription on a monitor or projection screen for a meeting, conference presentation or similar function. In the present instance, there were concerns that the presence of the reporter would be intrusive, as well as space-consuming, in the small interview location. The reporter could be set up in an adjacent lab, but this approach would incur travel expenses for several interview periods. Limiting travel expenses and paid idle time for the reporters would also have restricted the possible times for interviews, whereas maximizing the number of interviewees required being as flexible as possible about their schedules.
The Technical Arrangement
The solution was found in telecommunications. While most realtime television captioning is performed in specially equipped studios from network feeds and satellite transmission, telephone lines are increasingly used as captioning of local broadcasts becomes more common but local captioners are unavailable. By connecting the reporter’s remote site to the interview site through two telephone lines, personal computers with modems and software, and an enhanced microphone in the interview site, the reporter could be situated anywhere.
This remote realtime transcription arrangement had been devised by Valerie Waite for use by the National Technical Institute for the Deaf (Rochester, N.Y.) for large group meetings, but it had not been tested for private meetings or interviews. A trial session was conducted with the reporter in Montreal, and the actual interviews were transcribed from Whitby, Ontario, and Syracuse, N.Y., by Valerie Waite, RMR, CRR, and Michele LaPointe of Waite and Associates. The reporters used steno software marketed as Captivator/TurboCAT by Cheetah International, mainly intended for broadcast captioning. (Compatibility of TurboCAT and Windows had not yet been verified, so pcAnywhere for DOS was used.) The University of Toronto Department of Services to Persons with a Disability (Special Services) paid for the reporting service as a reasonable accommodation of student disability. The National Technical Institute for the Deaf contributed the use of the special microphone and coupler required, as well as technician assistance to set up and test the system. At the time of this research, as a doctoral candidate at the University of Toronto (Mechanical and Industrial Engineering), I held a joint appointment to the faculty of the National Technical Institute for the Deaf and Rochester Institute of Technology College of Engineering. The interviews were held in my laboratory in the College of Engineering.
Prior to each scheduled interview, I set my personal computer to the remote mode of the software product pcAnywhere (Symantec) and set the audio-line auto-coupler (Gentner) to “auto” mode. At the arranged time, the reporter called on the modem telephone line to initiate the remote display of the transcription, and on the audio line, activating the auto-coupler and enabling remote listening through the microphone/mixer apparatus. Because the reporter had no visual channel, an “above average” audio connection was required. A suitable microphone and mixer (Shure SE30 Gated Compressor/Mixer) were used.
Once the lines were connected, my PC displayed exactly what the reporter’s host PC displayed, and the reporter could hear everything in my office (including lawnmowers and helicopters outdoors). This two-way communication capability was used before and after interviews to communicate instructions and coordinate connection schedules as well as to display the transcript during the interview.
The transcript display could be customized for font size, screen width, text and background color, spacing and other parameters. Based on the interviewer’s personal preference, a normal 80-character word processing line width and character size were chosen, double spaced with yellow letters on a blue background. As text was added, old text scrolled off the top of the screen. The video monitor was aligned so that I could glance readily back and forth from the monitor to the subject’s face. The interviewees could also see the display, to allay any concerns that anything other than a simple verbatim transcript was appearing. To address confidentiality, the interviewees’ names were never known to the reporters. Each interview was identified to them by number only. On the displayed transcript, the speakers were denoted “Kathryn>” for the interviewer, and a simple “>” or “Interviewee>” for the subject.
How it Worked
At the outset of the series of interviews, I provided the reporter with the interview outline and a list of jargon, proper names and other vocabulary terms that might be anticipated. The reporter thereby was able to maximize the number of terms that would be correctly translated by the computer from simple steno keystrokes rather than necessitating manual entry by fingerspelling. Between interviews, additional terminology arising was also added. This is typical of the professional preparation that is essential for competent realtime service.
When invited to participate in the project, and again when invited for an interview, the subjects were told that the interview was to be held at my laboratory due to the lack of portability of “interpreting.” At the start of each interview, I explained to the subjects that the interpreting was being provided via the evident microphone and displayed on the monitor, since this would let me know what he or she had said literally in its English form, rather than having to draw inferences from a sign language translation of the same statement. The subjects had the opportunity to observe that the monitor did indeed display what I had just said. At that point, most subjects appeared to forget about the entire interpreting issue. One subject initially remarked that the microphone made him nervous (but his behavior soon appeared quite comfortable). Another subject wondered midway through the interview about the disposition of the transcript after the interview, and I reiterated the confidentiality policies.
Because the reporter was not physically present, this functioned as a very unobtrusive method of communication. The unobtrusiveness failed only twice in 16 interviews. Once, a cable failed on the reporter’s steno equipment and was replaced in minutes. In a second instance, a translation dictionary failed to load and the start of the interview was briefly delayed, also by just a few minutes. In general, the realtime technology caused fewer disruptions than subjects being paged by their offices (or, on one occasion, extremely loud helicopter maneuvers overhead that interfered as much with the subject hearing the question as with the reporter hearing the interview).
This technology provided several significant benefits. Foremost, it permitted the investigator to understand the subject during the interview, replacing sign language interpretation. It also produced an ASCII transcript on disk that could later be coded and analyzed as data and excerpted verbatim. This benefit would be worth considering by any researcher conducting field interviews as an alternative to transcription from audiotapes. Each two-hour interview session was “cleaned up” by the reporter and ready to be transmitted within hours, in contrast to the typical delay entailed with audiotapes.
Following the interviews, the ASCII transcripts were downloaded by the reporter onto the interviewer’s computer, again using the pcAnywhere utility. After the interviews were completed and all transcripts obtained, transmission of ASCII transcripts was also successfully tested using the Internet, providing future researchers with another option.
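For researchers considering the same approach, a plain ASCII transcript with speaker labels is easy to prepare for coding. The Python sketch below is a hypothetical illustration, not part of the original study. It assumes each turn begins with one of the speaker labels described earlier ("Kathryn>" for the interviewer, ">" or "Interviewee>" for the subject) and that unlabeled lines continue the previous turn. The function name `split_turns` and the sample text are invented for the example.

```python
import re

# A turn starts with "Kathryn>", "Interviewee>" or a bare ">" (assumed layout).
SPEAKER = re.compile(r"^(Kathryn|Interviewee)?>\s*(.*)$")


def split_turns(transcript):
    """Split a labeled transcript into (speaker, text) turns."""
    turns = []
    for raw_line in transcript.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines from double spacing
        match = SPEAKER.match(line)
        if match:
            speaker = "interviewer" if match.group(1) == "Kathryn" else "subject"
            turns.append([speaker, match.group(2)])
        elif turns:
            # Unlabeled line: treat it as a continuation of the current turn.
            turns[-1][1] = (turns[-1][1] + " " + line).strip()
    return [(speaker, text) for speaker, text in turns]


sample = (
    "Kathryn> How did you decide?\n"
    "\n"
    "> I looked at the report\n"
    "and then asked.\n"
    "Interviewee> Yes.\n"
)
for speaker, text in split_turns(sample):
    print(speaker + ": " + text)
```

Each resulting turn can then be tagged with analysis codes or quoted verbatim, without retyping anything from audiotape.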
About the Author
Kathryn Woodcock, Ph.D., is an associate professor in the School of Occupational and Public Health at Ryerson University teaching courses in occupational safety and ergonomics. She is the author of Deafened People: Adjustment and Support (University of Toronto Press, 2000). You can visit her Web site at www.deafened.org.
