

Digital dictation and voice transcription software enhances outpatient clinic letter production: a crossover study
Kinesh Patel, Marcus Harbord
Department of Gastroenterology, Chelsea and Westminster NHS Foundation Trust, London, UK
Correspondence to Marcus Harbord, Chelsea and Westminster NHS Foundation Trust, Department of Gastroenterology, London SW10 9NH, UK; marcus.harbord{at}


Background Digital voice transcription has been introduced widely in the National Health Service (NHS), though primarily in radiology departments. There has been a long-standing problem with recruitment of medical secretaries within the NHS, leading to long delays in the production of correspondence from outpatient clinics.

Objective To determine whether use of widely available digital transcription software improves efficiency and the time taken to produce correspondence.

Methods The project used a prospective, crossover trial design in a ‘real-world’ environment. Correspondence from clinics was transcribed after dictation by a secretary using conventional analogue audio tape or the dictation software. After a 2-week washout period the same clinics' dictations were transcribed using the other method to produce identical correspondence. The two sets of letters were compared.

Results The mean time for the secretary to produce letters for a complete clinic using digital dictation was 66 min whereas analogue dictation took 121 min (p<0.00002). There was no difference in the number of mistakes per letter (p>0.05).

Conclusion Voice transcription software significantly decreased the time taken to transcribe outpatient clinic letters with minimal training of secretarial staff, resulting in improved efficiency.



Proprietary digital dictation and voice transcription have been widely introduced throughout the National Health Service (NHS), primarily in radiology departments, although there is still doubt that it is as good as transcription by a secretary.1–3

Recruitment and retention of medical secretaries has been a problem for the NHS for many years.4 The combination of low rates of pay with a high-pressure environment has led to posts often being filled by temporary staff, which itself can add to difficulties in the workplace.

One of the traditional roles of secretaries has been typing correspondence, predominantly from outpatient clinics. Our department was plagued by long delays between dictation and transcription with resultant frustration on the part of clinicians, secretaries and patients.

The delay between dictation and transcription of correspondence was audited and had reached up to 2 months owing to an increasing number of patients seen in the gastroenterology department.

More staff could not be employed owing to budgetary constraints. A new solution was needed to improve productivity, as part of the national Quality, Innovation, Productivity and Prevention agenda,5 without increasing costs or staffing levels.

We describe the trial and introduction of a commercially available automated speech recognition system (Dragon NaturallySpeaking Medical version 10, Nuance Communications, Burlington, MA, USA) into the gastroenterology department of Chelsea and Westminster NHS Foundation Trust to try to overcome some of these problems.


The project used a prospective, crossover trial design in a ‘real-world’ environment.

Two clinicians (MH, KP) dictated letters from six entire outpatient clinics using an analogue passive noise-cancelling headset (Andrea NC-81; Andrea UK, Bohemia, NY, USA) connected to a standard personal computer. The dictation time was measured. The output was saved as a Windows WAV file (16 bit mono, 22 000 Hz).
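As a rough illustration of what this recording format implies for storage, the sketch below writes silence as an uncompressed 16 bit mono WAV file using Python's standard `wave` module. It assumes a sampling rate of 22 000 samples per second; the function names are hypothetical and are not part of the study's software. At that rate, uncompressed audio occupies about 2.6 MB per minute of dictation.

```python
import io
import wave

SAMPLE_RATE_HZ = 22_000   # assumed: 22 000 samples per second
SAMPLE_WIDTH_BYTES = 2    # 16 bit samples
CHANNELS = 1              # mono


def bytes_per_minute(rate=SAMPLE_RATE_HZ, width=SAMPLE_WIDTH_BYTES, channels=CHANNELS):
    """Uncompressed PCM audio data produced by one minute of recording."""
    return rate * width * channels * 60


def write_silent_wav(seconds):
    """Return the bytes of a WAV file holding `seconds` of silence in this format."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        # One zero-valued 16 bit sample per frame.
        wav.writeframes(b"\x00\x00" * (SAMPLE_RATE_HZ * seconds))
    return buffer.getvalue()


megabytes_per_minute = bytes_per_minute() / 1_000_000
```

The same parameters can be read back from an existing recording with `wave.open(path, "rb")` to confirm that a file matches the format before it is passed to transcription software.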

Letters from these clinics were then recorded onto conventional mini-dictation cassettes (Philips, Guildford Business Park, Guildford, Surrey, UK), in a sound-proofed environment, resulting in identical electronic and analogue copies.

A secretary, who was unfamiliar with voice transcription software, was trained in the use of Dragon NaturallySpeaking Medical. The training lasted for less than 30 min.

Correspondence from each clinic was then ‘typed’ twice, once using voice transcription software with the secretary proofreading and correcting the text; and once by conventional typing using a headset, foot pedal and keyboard, similarly with proofreading. Patient and general practitioner name and address were added in both cases. There was a ‘washout interval’ of 2 weeks between producing the clinic letters using analogue and digital methods, with an equal number of analogue and digital clinic letters typed first time (figure 1).

Figure 1 Crossover trial design.
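The counterbalanced order described above (equal numbers of clinics typed first by each method) can be sketched as follows. The article does not state how the order was assigned to each clinic, so the random assignment here is an assumption, and the function and clinic labels are hypothetical.

```python
import random


def allocate_crossover(clinics, seed=None):
    """Assign half of the clinics to each method order (counterbalanced)."""
    if len(clinics) % 2:
        raise ValueError("an even number of clinics is needed for equal allocation")
    rng = random.Random(seed)
    shuffled = list(clinics)
    rng.shuffle(shuffled)  # assumption: order is assigned at random
    half = len(shuffled) // 2
    schedule = {}
    for clinic in shuffled[:half]:
        schedule[clinic] = ("analogue", "digital")
    for clinic in shuffled[half:]:
        schedule[clinic] = ("digital", "analogue")
    return schedule


# Six clinics, as in the study; the labels are placeholders.
schedule = allocate_crossover([f"clinic_{i}" for i in range(1, 7)], seed=0)
```

Each clinic appears once with a first and a second method, and the second transcription would follow the first after the 2-week washout interval.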

The total time taken using each method was measured. The letters from the clinics were saved separately so that an error analysis could be performed. Acceptability was assessed qualitatively.


In total, 45 letters were dictated during the trial, equating to 3589 words.

The mean time for the secretary to produce letters for a complete clinic using digital dictation was 66 min, whereas analogue dictation took 121 min (paired, two-tailed t test; p<0.00002). There was no difference in the number of mistakes per letter (p>0.05). Results for each investigator and each clinic are shown (table 1 and figure 2).
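The reported means correspond to a reduction in transcription time of about 45% ((121 − 66)/121). The sketch below shows how a paired t statistic of the kind used here is computed from per-clinic times. The per-clinic values in the example are hypothetical placeholders, not the study's data, and the function names are illustrative only.

```python
import math
import statistics


def paired_t(analogue_minutes, digital_minutes):
    """Return the paired t statistic and its degrees of freedom."""
    if len(analogue_minutes) != len(digital_minutes):
        raise ValueError("each clinic needs one time per method")
    diffs = [a - d for a, d in zip(analogue_minutes, digital_minutes)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


def relative_reduction(before, after):
    """Fractional reduction in time when moving from `before` to `after`."""
    return (before - after) / before


# Hypothetical per-clinic times in minutes (NOT the study's data).
example_analogue = [100, 110, 120, 130, 140, 150]
example_digital = [60, 62, 64, 66, 68, 70]
t_stat, dof = paired_t(example_analogue, example_digital)

# Reduction implied by the reported means (121 min vs 66 min).
reported_reduction = relative_reduction(121, 66)
```

A two-tailed p value would then be obtained from the t distribution with `dof` degrees of freedom, for example from statistical tables or a statistics package.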

Figure 2 Voice transcribing clinic letters decreases their production time (p<0.00002).

Table 1 Transcription time using voice recognition is significantly reduced (p<0.00002)

User satisfaction did not differ between the two methods, and recording the digital files took no longer than analogue dictation. The secretary actively preferred using voice transcription, and continued using it after the end of the study. The system was then adopted by other secretaries in her team.


Voice transcription software significantly decreased the time taken to transcribe outpatient clinic letters with minimal training of secretarial staff, resulting in improved efficiency. A further reduction in transcription time would be expected as the software is further adapted to each user's voice, thereby increasing accuracy.

It is easy to use voice transcription software in an office environment. The generic software and simple headset cost less than £100 per user, with an almost immediate improvement in efficiency. We find it reduces the time spent on email correspondence, particularly for those who cannot touch-type, facilitating more creative thinking based on the spoken word.

Hand-held digital dictation devices have been available for some years. Recently, voice recognition applications have become freely available for smart phones and tablets, and will probably become much more widely used. Converting all outpatient clinicians to digital dictation/voice recognition is probably best undertaken hospital-wide, with hand-held devices; however, this study demonstrates its utility for early adopters of this technology.

We believe this to be the first crossover trial comparing digital and analogue dictation in a medical environment and the first trial that demonstrates effectiveness in an outpatient setting.

Voice transcription software has been assessed previously in hospitals, although few studies have been reported. Most concluded that accuracy and/or time efficiency is less than with conventionally typed letters,6–9 although these studies used earlier software versions that are known to be less accurate. Two reports have described a more efficient ‘turn-around’ time to generate electronic patient records, in the emergency room9 and the pathology department,10 probably reflecting the types of record that are produced in these environments.

A retrospective comparative trial in a radiology department showed that no additional errors occurred when the users of voice transcription were experienced.11 This was confirmed more recently,12 in a study in which headset use and templates were associated with transcription efficiency.

Integration of voice transcription into the military outpatient clinic has been shown to be acceptable to the majority of users,13 which supports our demonstration that modern platforms produce an efficient and effective service.

Within the past decade, the NHS has increasingly used offsite digital transcription services14 as a result of shortages in secretarial staff and to save money.15 The clinician dictates correspondence onto a digital voice recorder, and the recording is then sent via secure electronic means to a transcriptionist, who produces a document to send back to the clinician. There has been no published randomised evaluation of the effect on quality or price of remote transcription compared with conventional onsite transcription.

What is known already?

  • Digital voice transcription is used widely throughout the NHS, but predominantly in imaging departments with proprietary software.

  • Within imaging departments, efficiency has been improved by using this technology.

What this study adds

  • Improvements in productivity can be achieved in a medical outpatient setting with commercial non-proprietary digital transcription software.

  • With only minimal training of users and secretarial staff, there was no increase in the error rate over conventionally produced correspondence.

The demonstrated efficiency and accuracy of voice transcription in this trial have resulted in Chelsea and Westminster Hospital procuring digital dictation for all outpatient work, to be followed by voice transcription based on the Nuance Communications speech engine. This is expected to reduce the use of temporary secretarial staff, resulting in cost savings, and allow permanent staff to be redeployed in line with modern working practices. Similar cost savings could be made in other hospitals within the NHS.


This paper shows good results in using electronic transcription compared with conventional methods of producing correspondence in outpatients. However, there are limitations associated with this study. Both authors were born in the UK: the ability of the software to recognise the voices of those with either strong regional or foreign accents is not known.

As this was intentionally designed as a real-world trial, it was not possible to completely isolate the secretary from distractions while performing the typing. The impact of these interruptions on the data analysis was minimised, however, as they occurred during both analogue and digital transcription.

An alternative methodology would have been to record and transcribe a script, using both analogue and digital methods, in a silent isolated environment. However, these results would have been less applicable to everyday clinical practice, even though it is likely that the digital data would have shown an even lower error rate.


Grateful thanks to Catherine O'Meara for secretarial support during the study.




  • Competing interests None.

  • Provenance and peer review Not commissioned; externally peer reviewed.
