Interobserver variability in comfort scores for screening colonoscopy
David N Naumann1, Sarah Potter-Concannon2, Sharad Karandikar3
1Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
2University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
3Gastrointestinal Endoscopy, Heart of England NHS Foundation Trust, Birmingham, UK
Correspondence to Mr Sharad Karandikar, Gastrointestinal Endoscopy, Heart of England NHS Foundation Trust, Birmingham, UK; sharad.karandikar{at}


Objective To investigate the agreement in comfort scores between patients, endoscopist and specialist screening practitioner (SSP) for colonoscopy, and which factors influence comfort.

Design Prospective observational study.

Setting Single-centre UK Bowel Cancer Screening Program colonoscopy service from April 2017 to March 2018.

Patients 498 patients undergoing bowel cancer screening colonoscopy, with median age of 68 (IQR 64–71). 320 (64.3%) were men.

Intervention All patients underwent screening colonoscopy.

Main outcome measure Comfort scores on a validated 1 (best) to 5 (worst) ordinal scale were assigned for each colonoscopy by the patient, endoscopist and SSP. Inter-rater agreement of discomfort scores between endoscopist, patient and SSP was investigated using Cohen’s Kappa statistic. Multivariate ordinal logistic regression was used to investigate the effects of patient and colonoscopy factors on comfort scores.

Results SSPs had better comfort score agreement with patients (0.638; ‘moderate agreement’) than endoscopists had with the same patients (0.526; ‘weak agreement’). Male patients reported lower scores than female patients (OR 0.499 [95% CI 0.344 to 0.723]; p<0.001). Endoscopists reported lower scores when there was better bowel preparation (OR 0.512 [95% CI 0.279 to 0.938]; p=0.030). Agreement was worse at higher levels of discomfort.

Conclusion There is variability in perceived comfort levels between healthcare providers and patients during screening colonoscopy, which is greater at worse levels of discomfort. Endoscopists who undertake screening colonoscopies may wish to consider both patient and healthcare provider comfort scores in order to improve patient experience while ensuring optimal quality assurance.

  • colonoscopy
  • endoscopy
  • abdominal pain
  • colorectal cancer screening
  • screening
Key messages

What is already known about this subject?

  • Patient comfort is key during colonoscopy, both to improve patient well-being and satisfaction and to ensure a high-quality procedure.

  • There is considerable variability in patient comfort during colonoscopy, and the level of comfort is usually recorded by both healthcare professionals and the patients themselves.

  • It is unknown whether healthcare professionals and patients agree on the comfort scores during screening colonoscopy.

What are the new findings?

  • There is considerable disagreement in comfort scores between healthcare professionals and patients.

  • Disagreement in comfort scores is worse at higher levels of patient discomfort.

How might it impact on clinical practice in the foreseeable future?

  • A patient-centred approach to screening colonoscopy may benefit from greater consideration of the patient’s own experience, rather than relying on scores assigned by healthcare professionals alone.


Patient experience is a key aspect of colonoscopy service quality and comfort is an important factor in the overall satisfaction of patients following colonoscopy.1 Comfort is a dynamic phenomenon during and after a colonoscopy, and is concurrently monitored by the endoscopist and specialist screening practitioner (SSP) during a bowel cancer screening programme (BCSP) colonoscopy. The Global Rating Scale is a component of the Joint Advisory Group on Gastrointestinal Endoscopy accreditation process in the UK and requires the capture of data regarding patient comfort.2 Quality assurance is essential during such screening programmes so that the benefits of screening outweigh any potential harms.3 As well as being important factors in the overall well-being of patients, pain and discomfort during colonoscopy may increase the risk of an incomplete procedure.4 Comfort levels are quantified using an ordinal scale ranging from 1 (best) to 5 (worst) comfort, corresponding to (1) ‘Comfortable’ (talking and comfortable throughout), (2) ‘Minimal’ (1 or 2 episodes of mild discomfort without distress), (3) ‘Mild’ (>2 episodes of discomfort without distress), (4) ‘Moderate’ (significant discomfort with some distress) and (5) ‘Severe’ (frequent discomfort with significant distress).5

Interobserver variability for comfort scores between patient, SSP and endoscopist may be recognised, but has not been investigated in a cohort of patients having bowel cancer screening colonoscopy in the National Health Service (NHS). Since patient experience is a key aspect of endoscopy service quality, it is important to explore differences in perceived comfort levels between the patient and their healthcare providers in order to ensure that differences in perception do not adversely affect the clinical care of patients. It is also important to explore patient and endoscopy factors that may influence comfort, in order to ensure high standards of bowel cancer screening colonoscopy.

The primary aim of the study was to investigate the interobserver variability for comfort scores between patient, SSP and endoscopist in a cohort of patients having bowel cancer screening colonoscopy. The secondary aim was to study the factors influencing comfort scores such as patient or endoscopy characteristics. Identification of modifiable factors may inform practitioners who wish to aim for improvements in patient comfort.


Study design and setting

A prospective observational study was performed to include successive patients who underwent colonoscopy as part of the bowel cancer screening programme or surveillance, at a high-volume screening centre. There were four accredited Bowel Cancer Screening endoscopists and eight SSPs who participated in the current study. Institutional approval was granted before the collection of any data (institutional registration number 3847). This observational study made no modifications in clinical management of any kind.

Patient selection

Patients were eligible for the current study if they attended for a colonoscopy as part of the BCSP during the 12-month period between April 2017 and March 2018, either following a positive faecal occult blood test or as part of the surveillance programme. Only screening (ie, asymptomatic) patients were included. Patients were ineligible if they were referred for colonoscopy due to symptomatic disease.

Colonoscopy comfort scores

Three individual comfort scores were assigned for every colonoscopy: one each by the patient, the SSP and the endoscopist. The endoscopists and SSPs recorded patient comfort scores on dedicated endoscopy databases, using Endobase (Olympus Medical Systems) and Bowel Cancer Screening Open Exeter (National Health Application and Infrastructure Services), respectively. All healthcare professionals had received training in the assessment of comfort during their training and accreditation as screening endoscopists and SSPs. Patient comfort scores were obtained from patients by discussion with an endoscopy nurse before they were discharged, and then by a subsequent telephone call the following day. During these conversations, the patients were informed verbally what each of the scores indicated according to the 5-point scale (as discussed earlier). These were recorded on a dedicated database held within the NHS Trust. Since each of these individuals assigns a score independently of the others, they were effectively blinded to each other’s scores. This standard clinical practice represents an ideal opportunity to study the interobserver variability.

Colonoscopy technique

All patients were assessed in the pre-assessment clinics by the SSP and given standard bowel preparation before arriving on the day of their procedure. They were also offered the choice to have sedation, opiate analgesia, nitrous oxide (Entonox) or none during the procedure. For patients who chose to have intravenous medication, peripheral access was achieved for the delivery of a combination of buscopan, midazolam and fentanyl. Standard observations were monitored throughout the procedure, including oxygen saturations and heart rate. A digital rectal examination was performed, and colonoscopy was performed to recommended BCSP standards including withdrawal time. Inspection of the intraluminal surface was undertaken, and rectal retroversion was performed. After every colonoscopy, a standardised electronic form was completed regarding the procedure technique and findings.

Data collection

Patient comfort score and endoscopy findings were prospectively recorded in a centralised BCSP database at the endoscopy centre. This database was interrogated to update the purpose-designed database. Data included age, gender, completeness of endoscopy, choice of analgesic/sedation (intravenous sedation or Entonox), quality of bowel preparation (good, adequate or poor, according to the endoscopist’s own assessment), colonoscopy findings, the endoscopist who performed the procedure, and comfort scores (as assigned by patient, endoscopist and SSP).

Data analysis

Data are reported as median and IQR for continuous data and number and percentage for categorical data. Continuous data are compared between the four endoscopists using Kruskal-Wallis tests, and categorical data are compared using χ2 analysis. Ordinal logistic regression was used to investigate the effects of age, gender, completeness (complete or incomplete), findings (abnormal or normal), bowel preparation quality (good or not good), type of sedation (nitrous oxide or intravenous) and endoscopist on patient comfort score, as well as on endoscopist-assigned discomfort score. Both unadjusted univariate and adjusted multivariate analyses were performed using these prespecified covariates. ORs are used to indicate a move from a lower to a higher patient comfort score (ie, an OR >1 indicates a move towards worse comfort). Inter-rater agreement of discomfort scores between endoscopist, patient and SSP was investigated using Cohen’s Kappa statistic. First, all three raters were analysed together, and then separate pairwise comparisons were made. A p value of <0.05 was considered significant.
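The pairwise agreement statistic described above can be sketched in a few lines of Python. This is a minimal illustration of unweighted Cohen's Kappa for two raters, not the authors' analysis code (the software used is not stated); the function name `cohen_kappa` and the ten example scores are hypothetical. It covers only the pairwise comparisons, since the method used to combine all three raters is not specified in the text.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases where the two scores are identical.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over every score category.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical comfort scores (1 = best, 5 = worst) for ten colonoscopies.
# These are illustrative values, not study data.
patient = [1, 2, 2, 3, 1, 4, 2, 1, 5, 3]
endoscopist = [1, 2, 1, 3, 1, 3, 2, 1, 4, 2]
print(round(cohen_kappa(patient, endoscopist), 3))
```

In this toy example the raters agree on 6 of 10 cases, but 0.26 of that agreement is expected by chance alone, which is why Kappa is lower than the raw agreement rate.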


Patient and endoscopy characteristics

There were 498 patients with a median age of 68 (IQR 64–71), of whom 320 (64.3%) were men. Patient characteristics and endoscopy details are summarised in table 1. The number of colonoscopies performed by each of the four endoscopists ranged from 78 to 194. The majority of patients (68.3%) had intravenous sedation, and the remainder had Entonox. None had both of these together. Most patients (91.8%) had good bowel preparation. Polyp detection rate was 45.1%. Patient and endoscopy characteristics were not significantly different between endoscopists, except for choice of analgesia (p=0.030).

Table 1

Patient and endoscopic characteristics compared between endoscopists

Patient-reported comfort scores

A total of 490 patients recalled their own comfort score during colonoscopy, with scores of 1 (N=160), 2 (N=206), 3 (N=83), 4 (N=27) and 5 (N=14). All patients assigned their scores before discharge from hospital, and none changed their minds about their comfort score on the subsequent day when telephoned. The only factor on univariate analysis that was significantly associated with patient-recalled comfort score was gender, with male patients being more likely to report lower scores (better comfort) than female patients (OR 0.483 [95% CI 0.344 to 0.723]; p<0.001). This association was confirmed on multivariate analysis (OR 0.499 [95% CI 0.344 to 0.723]; p<0.001) (table 2).

Table 2

Ordinal logistic regression to investigate the influence of patient and endoscopy characteristics on patient-recalled comfort score

Endoscopist-reported and SSP-reported comfort scores

Comfort scores were assigned by endoscopists for 493 patients, including scores of 1 (N=183), 2 (N=205), 3 (N=86), 4 (N=15) and 5 (N=4). Comfort scores were assigned by SSPs for all 498 patients, including scores of 1 (N=151), 2 (N=219), 3 (N=87), 4 (N=37) and 5 (N=4). Similar to patient-reported comfort scores, there was an association between gender and endoscopist-reported comfort scores on univariate analysis, with male patients scoring lower (better comfort) than female patients (OR 0.589 [95% CI 0.416 to 0.832]; p=0.003). There were further associations between completeness of colonoscopy and bowel preparation quality, with improved comfort scores for complete investigations (OR 0.238 [95% CI 0.065 to 0.837]; p=0.026) and better bowel preparation (OR 0.512 [95% CI 0.279 to 0.938]; p=0.030). These associations between gender, quality of bowel preparation and completeness of investigation were confirmed on multivariate analysis (table 3). Individual endoscopists reported significantly different patient comfort scores to each other (table 3), even though there were no significant differences in patient-reported scores (table 2).
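To make the reported ORs easier to read, the sketch below converts them to the log-odds scale on which ordinal logistic regression coefficients are estimated. It assumes the 95% CIs are Wald-type intervals (symmetric on the log scale), which the text does not state; under that assumption the OR should sit at the geometric midpoint of its CI, and the standard error can be recovered from the CI width. The helper name `implied_from_ci` is hypothetical.

```python
import math

def implied_from_ci(ci_low, ci_high, z=1.96):
    """Return (OR, SE of log-OR) implied by a Wald-type 95% CI.

    Assumes the interval was built as exp(beta +/- z * SE); this is an
    assumption for illustration, not something stated in the paper.
    """
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    implied_or = math.exp((log_low + log_high) / 2)  # midpoint on the log scale
    std_error = (log_high - log_low) / (2 * z)
    return implied_or, std_error

# Univariate endoscopist-reported results quoted in the text: (OR, CI low, CI high).
reported = {
    "male gender": (0.589, 0.416, 0.832),
    "good bowel preparation": (0.512, 0.279, 0.938),
}
for factor, (odds_ratio, low, high) in reported.items():
    implied_or, std_error = implied_from_ci(low, high)
    print(f"{factor}: reported OR {odds_ratio}, log-odds {math.log(odds_ratio):.3f}, "
          f"CI-implied OR {implied_or:.3f}, SE {std_error:.3f}")
```

A negative log-odds value corresponds to an OR below 1, that is, a shift towards lower (better) comfort scores.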

Table 3

Ordinal logistic regression to investigate the influence of patient and endoscopy characteristics on endoscopist-reported patient comfort score

Interobserver agreement in comfort scores

Table 4 illustrates the Kappa scores for inter-rater agreement for all raters, as well as paired comparisons between raters. When all comfort scores were compared between all three raters (endoscopists, SSPs and patients), the Kappa value was 0.593, indicating borderline ‘weak’ to ‘moderate’ agreement. SSPs had superior agreement with patients (0.638; ‘moderate agreement’) than endoscopists had with the same patients (0.526; ‘weak agreement’). Agreement between all three raters as well as between pairs of raters was highest for lower scores (better comfort), but got progressively worse for higher scores (worse comfort) (table 4 and figure 1).
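The verbal labels attached to the Kappa values above are consistent with the interpretation bands proposed by McHugh (2012), although the scale used is not named in this section, so that attribution is an assumption. The sketch below encodes those bands; treating them as half-open intervals with cut-offs at 0.20, 0.40, 0.60, 0.80 and 0.90 is a simplification made here for illustration, and `agreement_label` is a hypothetical helper.

```python
def agreement_label(kappa):
    """Map a kappa value to a verbal label (bands assumed, after McHugh 2012)."""
    bands = [
        (0.20, "none"),
        (0.40, "minimal"),
        (0.60, "weak"),
        (0.80, "moderate"),
        (0.90, "strong"),
    ]
    for upper_bound, label in bands:
        if kappa < upper_bound:
            return label
    return "almost perfect"

# Kappa values quoted in the text.
for raters, kappa in [("all three raters", 0.593),
                      ("SSP vs patient", 0.638),
                      ("endoscopist vs patient", 0.526)]:
    print(f"{raters}: kappa {kappa} -> {agreement_label(kappa)}")
```

With these cut-offs the three-rater value of 0.593 falls just under the 0.60 threshold, which matches its description as borderline between ‘weak’ and ‘moderate’.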

Table 4

Kappa values for inter-rater agreement in comfort scores

Figure 1

Kappa statistics for interobserver agreement at each comfort score for all raters, and pairwise comparisons between raters. SSP, specialist screening practitioner.


The current prospective observational study of 498 consecutive bowel cancer screening patients over 12 months confirms an interobserver variability in comfort scores between patients and their healthcare providers during colonoscopy. The majority of patients recalled comfort scores of 1 or 2, and male patients were more likely to recall lower comfort scores than female patients. Comfort scores recalled by patients did not appear to be associated with age, completeness of endoscopy, quality of bowel preparation, diagnosis or type of sedation. Conversely, endoscopist-reported comfort scores were more likely to be higher (worse comfort) when the procedure was incomplete and the bowel preparation was poor. This suggests that endoscopists were more likely to consider the comfort worse for more difficult procedures, which is in keeping with published evidence that lower-quality colonoscopy is associated with greater discomfort.5 6 The differences between patient and endoscopist scores in this context may reflect a scoring bias among endoscopists (ie, they were more likely to consider the patient in greater discomfort because of the difficulty of the procedure).

There was relatively poor agreement in comfort scores between patients and endoscopists, and agreement was worse at higher scores (ie, when the patient was in more discomfort). Other investigators have found similar discrepancies between comfort reported by the endoscopists and that recalled by the patient.7 SSPs had better agreement in comfort scores with patients, but there was a similar worsening of agreement when comfort was worse. Higher disagreement at greater levels of discomfort may be in part due to the setting of relative distress to the patient. Such disagreement may suggest that scores attributed to patients on their behalf may not be accurate enough to reflect their experiences. However, the rationale for the endoscopist-assigned scores is that the patient may not necessarily recall or accurately reflect their own discomfort throughout the entire investigation due to the effects of sedation and analgesia. Indeed, when greater discomfort was noted by the endoscopist, subsequent higher levels of sedation and analgesia may make the patient less likely to accurately recall that experience. Unfortunately, the current study did not record the exact dosage of sedation according to comfort scores, but this would be a factor of interest in future studies. There is some evidence that recall of pain may also be time dependent, with recollection of pain reducing over time.8 Although there were no changes in comfort scores assigned by patients between their discharge and the subsequent day, they assigned their scores after the procedure, and just before being discharged, meaning that there may still have been a risk of recollection bias.

There were similar numbers of patients within each individual comfort score between endoscopists and patients (for example, there were 206 patients who gave themselves a comfort score of ‘2’, and there were 205 patients who were assigned a score of ‘2’ by endoscopists). However, the considerable interobserver variability implies that these were not always the same patients. For example, some patients who scored themselves a ‘2’ may have been assigned a ‘1’ or ‘3’ by their endoscopist, or vice versa. This is an important factor when interpreting summary data from comfort scores, since a similar spread of scores between endoscopists and patients may be falsely reassuring and does not necessarily imply agreement.
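The point about summary data can be illustrated with a small constructed example. The eight pairs of scores below are hypothetical (not study data) and were chosen so that both raters use each score exactly the same number of times, yet they agree on only a quarter of individual patients.

```python
from collections import Counter

# Hypothetical paired comfort scores for eight colonoscopies (not study data).
patient = [1, 1, 2, 2, 2, 3, 3, 4]
endoscopist = [2, 1, 1, 2, 3, 2, 4, 3]

# Summary view: how many times each score was used by each rater.
same_spread = Counter(patient) == Counter(endoscopist)

# Paired view: share of patients for whom the two scores match exactly.
exact_agreement = sum(p == e for p, e in zip(patient, endoscopist)) / len(patient)

print("identical score distributions:", same_spread)
print("exact agreement:", exact_agreement)
```

Comparing the distributions alone would suggest the raters are interchangeable; only the paired comparison reveals the disagreement.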

The current study findings may suggest that efforts to reduce patient discomfort during colonoscopy should perhaps be targeted on two distinct but related levels: the more ‘objective’ assessment of patient discomfort by the endoscopist or SSP during the procedure, and the patient’s own subjective overall experience. Although no endoscopist caused more pain than any other (as reported by the patient; table 2), they did assess levels of comfort differently to each other (table 3). Endoscopist C graded patient discomfort higher on average for their sample of patients than endoscopists B and D did for theirs. This discrepancy may illustrate the importance of obtaining comfort scores from patients as well as the endoscopists themselves in order to determine whether their assessment of comfort translated to the patients’ own experiences. Furthermore, individual practitioners may wish to use patient-recalled scoring systems when optimising individual procedures and practice, but use healthcare practitioner–assigned scores for the overall indication of quality assurance.9 Patient involvement in the overall assessment of colonoscopy is a valuable tool for quality assurance and benchmarking of practice.10

The current study did not identify any modifiable factors in reducing patients’ subjective experience of discomfort during colonoscopy since the only factor of influence was patient gender, in keeping with the findings of other investigators.8 11 Since the endoscopists were more likely to consider the patient in discomfort with poor bowel preparation, this is a modifiable factor which would be important to optimise if the risk of discomfort were to be reduced. Our finding that comfort scores were not different between patients having intravenous medication or nitrous oxide is in keeping with previous evidence,12 but is likely to be affected by selection bias since patients were not randomised to one treatment or the other. This finding does suggest, however, that patients’ own choice (intravenous medication or not) tends to be appropriate for their requirements. Although patient gender is a non-modifiable factor for colonoscopy, these findings may prompt the endoscopist to anticipate greater discomfort in female patients and prepare the appropriate levels of sedation and analgesia. The increased risk of discomfort in female patients may also be an important feature in the process of informed consent, including the decision regarding sedation and analgesia. Furthermore, other non-pharmaceutical techniques to reduce discomfort may also be considered in cases where more severe symptoms are anticipated.13 14


The current study was undertaken at a single centre, with four bowel cancer screening accredited male Consultant-grade endoscopists, which may limit its generalisability. In particular, there is some evidence that comfort scores and colonoscopy techniques were different when undertaken by trainees,15 and therefore the current study findings may not necessarily be applicable to colonoscopy undertaken by non–Consultant-grade endoscopists. We were not able to investigate the effect of gender of the healthcare professionals when studying comfort scores. This may be a source of bias since women may score higher than men on both affective and cognitive empathy.16 Others have suggested that study of gender differences in empathy might be improved by designing studies with greater statistical power and considering variables implicit in gender.17 We were also not able to investigate the influence of other factors on comfort scores, such as previous abdominal surgery (eg, hysterectomy) and pathology (eg, diverticulosis) due to the relatively low number of patients. Further investigations of all possible risk factors for discomfort may require greater numbers with increased granularity of data. All colonoscopies were conducted as part of a screening programme, and therefore the study findings may not necessarily translate to other types of colonoscopy, such as those for symptomatic patients.

The number of patients who had higher scores (scores of ‘4’ and ‘5’) was relatively low, which may have increased the risk of bias in our finding that higher scores were associated with poorer agreement. Greater numbers may be required for more conclusive evidence of this effect. The numbers of patients with poor bowel preparation and with incomplete endoscopy were also relatively low, and the exact reasons for incomplete colonoscopy were not quantified. Furthermore, the assessment of bowel preparation quality may differ between individuals, making it subject to bias. Therefore, caution is warranted when interpreting the effects of these factors.


There is important interobserver variability in perceived comfort levels between healthcare professionals and patients during screening colonoscopy, and this variability increases at worse levels of comfort. Female gender and poor bowel preparation were markers of worse comfort. Endoscopists who undertake screening colonoscopies may wish to consider both patient and healthcare provider comfort scores in order to both improve patient experience and also ensure optimal quality assurance.


The authors thank the endoscopy department for participating in the study.


  • Contributors SK and SP-C planned the study. SK and SP-C conducted the study and collected all data. DNN analysed and interpreted the data. DNN wrote the first manuscript. SK and SP-C provided critical appraisal and editing. All authors approved the final version of the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.

