Patient defined dichotomous end points for remission and clinical improvement in ulcerative colitis
  1. P D R Higgins,
  2. M Schwartz,
  3. J Mapili,
  4. I Krokos,
  5. J Leung,
  6. E M Zimmermann
  1. University of Michigan, Michigan, USA
  1. Correspondence to:
    Dr P Higgins
    6510 MSRB I, Box 0682, University of Michigan, Ann Arbor, MI, 48109, USA; phiggins{at}umich.edu

Measurement of disease activity in ulcerative colitis is critical in determining whether new therapies are effective, but there is no gold standard for measuring disease activity in ulcerative colitis. For the purposes of clinical trials, continuous scores on disease activity indices offer good statistical power, but the clinical significance of a change of an arbitrary number of points on a particular index is questionable. In rheumatoid arthritis and in Crohn’s disease, standard definitions for significant clinical improvement and clinical remission on standard indices have come into common use, allowing helpful comparisons between trials which are measuring the same dichotomous end points. This has not been the case for ulcerative colitis where no single disease activity index is widely accepted and no generally accepted definition of improvement or remission exists.

The initial attempt at quantifying disease activity was by Truelove and Witts, who defined mild, moderate, and severe colitis1 in 1955. These broad definitions were transformed into a continuous point scale by Powell-Tuck et al, the St Mark’s index, in 1978,2 when endoscopy was added to the measurement of ulcerative colitis. As the St Mark’s index is a cumbersome measure of 11 items, simplified versions have come into use. Sutherland et al developed the ulcerative colitis disease activity index (UCDAI)3 and Schroeder et al developed the Mayo disease activity index,4 both of which have four components and include endoscopy. The first non-invasive index was developed by Seo et al, who developed an index based on symptoms and levels of haemoglobin, albumin, and Westergren erythrocyte sedimentation rate.5 Walmsley and colleagues6 developed the simple clinical colitis activity index (SCCAI), a survey of six questions about symptoms, and showed that it correlated well with the St Mark’s index, the Seo index, and with a validated quality of life measure, the inflammatory bowel disease quality of life index (IBDQ).7 None of these four disease activity indices have been formally validated. In the absence of a validated gold standard for disease activity, experts in the field have advocated different indices, leading individuals performing clinical trials to measure multiple indices in their subjects.

It is important to measure whether clinically significant end points are achieved in therapeutic trials. Many investigators have attempted to use clinically significant improvement and clinical remission as end points in therapeutic trials in ulcerative colitis. These end points are important goals for clinical trials, but have never been clearly defined for ulcerative colitis, in part because Truelove and Witts never included remission in their original scale and in part because of the multiple indices available. Many authors have made educated guesses as to what index score equates to remission, and have empirically selected what amount of change in a given index equates to clinically significant improvement.8,9,10 However, there is no justification cited for choosing any particular cut off. There is no valid definition of clinically significant improvement or of remission on standard indices of disease activity in ulcerative colitis. Given this situation, an ad hoc definition of remission has been used by regulatory authorities, although this definition has never been validated and may not be equivalent to patient defined remission.

The most important definitions of remission, and of clinically significant improvement, are those made by the patient. If the patient does not consider her or himself significantly improved or in remission, the disease will likely continue to impair the patient’s life, and (s)he will continue to seek additional health care. Patient centred definitions of significant improvement and remission are important end points for clinical trials in ulcerative colitis. However, we are unlikely to obtain objective data by simply asking patients in clinical trials if they are improved or in remission. Subjects in therapeutic trials both want to believe they are getting better and want to please the investigator, and are likely to over report improvement and remission in the setting of a clinical trial. Therefore, it is important to derive objective measures that can identify patients in remission outside of the context of a therapeutic trial.

In this study, our aims were: (1) to determine the most sensitive and specific cut offs for patient defined remission in four standard disease activity indices and the IBDQ; and (2) to determine the most sensitive and specific cut offs for significant clinical improvement at return visits for the two non-invasive indices and the IBDQ.

MATERIALS AND METHODS

Recruitment and enrolment of consecutive patients with ulcerative colitis who were scheduled for lower endoscopy for this study have been previously described.11 Informed consent was obtained before sedation and endoscopy, and survey forms to collect data for the SCCAI, Seo index, UCDAI, St Mark’s index, and IBDQ (a total of 50 questions) were administered. Fifty six subjects were additionally asked if their ulcerative colitis was in remission, with the survey question, “Is your ulcerative colitis in remission (not active)?” to which only yes or no answers were accepted. Blood for haemoglobin, albumin, and Westergren sedimentation rate for the Seo index was drawn before endoscopy, via an intravenous line when available. Primary gastroenterologists or endoscopists were asked to assess disease severity before endoscopy for the UCDAI. Endoscopists performing the procedure were asked to assess the endoscopic appearance for the UCDAI and St Mark’s indices. At a return visit between one and 14 months later, subjects were again asked if they were in remission, and if their ulcerative colitis was better or worse than at their previous visit on a seven point Likert scale (1 = much better, 2 = some better, 3 = a little better, 4 = about the same, 5 = a little worse, 6 = some worse, 7 = much worse). They also completed the surveys and blood tests for the SCCAI, IBDQ, and Seo indices.

Logistic regression and receiver operating characteristic (ROC) curve analyses were conducted to determine whether a cut off point on disease activity indices could be reliably used to predict patient defined remission. Sensitivity and specificity were calculated for each cut off. To obtain a robust end point for future trials, only cut offs with at least 80% specificity for remission were considered. Within these, the cut off with the highest sensitivity×specificity product was chosen as the optimal cut off in each case. Ties (differences of less than 1%) were settled by favouring cut offs with higher specificity for the clinical end point. The sensitivity and specificity of these cut offs were also calculated for the regulatory definition of remission.
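The selection rule described above can be sketched in Python. This is a minimal illustration and not the authors' Stata code: the function names, the list of candidate cut offs, and the reading of a "tie" as a difference of less than 0.01 in the sensitivity×specificity product are all assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple


def sens_spec(scores: Sequence[float], in_remission: Sequence[bool],
              cutoff: float, lower_is_remission: bool = True) -> Tuple[float, float]:
    """Sensitivity and specificity of 'score beyond cutoff' for remission.

    Assumes at least one subject in remission and one not in remission.
    """
    tp = fn = tn = fp = 0
    for score, remission in zip(scores, in_remission):
        predicted = score < cutoff if lower_is_remission else score > cutoff
        if remission and predicted:
            tp += 1
        elif remission:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def select_cutoff(scores: Sequence[float], in_remission: Sequence[bool],
                  candidates: Sequence[float], min_specificity: float = 0.80,
                  tie_tolerance: float = 0.01,
                  lower_is_remission: bool = True) -> Optional[Tuple[float, float, float]]:
    """Return (cutoff, sensitivity, specificity) or None if no cut off qualifies."""
    eligible: List[Tuple[float, float, float]] = []
    for cutoff in candidates:
        se, sp = sens_spec(scores, in_remission, cutoff, lower_is_remission)
        if sp >= min_specificity:  # only cut offs with at least 80% specificity
            eligible.append((cutoff, se, sp))
    if not eligible:
        return None
    best_product = max(se * sp for _, se, sp in eligible)
    # Assumed tie rule: products within tie_tolerance of the best count as tied.
    tied = [e for e in eligible if best_product - e[1] * e[2] < tie_tolerance]
    # Among tied cut offs, favour the higher specificity.
    return max(tied, key=lambda e: (e[2], e[1] * e[2]))


# Hypothetical scores for ten subjects (not study data).
example_scores = [0, 1, 1, 2, 2, 3, 4, 5, 6, 7]
example_remission = [True, True, True, True, False, True, False, False, False, False]
example_candidates = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
chosen = select_cutoff(example_scores, example_remission, example_candidates)
```

For an index where higher scores indicate better status, such as the IBDQ, the same sketch would be called with `lower_is_remission=False`.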

Similar analyses with identical criteria for selecting optimal cut offs were performed for the non-invasive indices and the IBDQ to determine if changes in these indices could be used to predict a patient defined clinically significant improvement. All statistical calculations were done with Stata 8.2 (College Station, Texas, USA). The study protocol was approved by the University of Michigan IRB-MED Institutional Review Board (NIH Assurance # M-1184) on 14 November 2002. Subjects were not paid for their participation.

RESULTS

Seventy four consecutive subjects were approached for this study. Two subjects could not read or understand the consent form and two subjects refused to participate. Two subjects were excluded from data analysis because of incomplete data (random failure to collect all laboratory tests required for the Seo index). Two additional subjects were excluded from data analysis after consent was obtained because of ischaemic colitis in one and metastatic breast cancer in the other.

Subjects were asked to return for a follow up visit, and 56 of the final 66 subjects were able to do this. Demographic and treatment characteristics and disease activity of the total sample, as well as the 56 who returned for follow up visits are presented in table 1. No significant differences were found between the two groups.

Table 1

 Characteristics of the subject sample and disease activity and disease extent

Patient defined remission correlates with <3.5 points for the St Mark’s index and <2.5 points for the UCDAI

In our evaluation of the invasive indices, the St Mark’s index and the UCDAI, we had complete data on 56 subjects who reported whether or not they were in remission at the time of endoscopy. Twenty eight subjects reported that they were in remission while 28 subjects stated that they were not in remission. ROC curves showed that both the St Mark’s index (c statistic, that is, the area under the ROC curve, of 0.91) and the UCDAI (c statistic = 0.94) were good at predicting remission (fig 1A, B). Individual cut offs were then tested for their sensitivity and specificity for detection of patient defined remission. The optimal cut off value for remission with the St Mark’s index was <3.5 points, which had a sensitivity and specificity of 0.75 and 0.93, respectively. The optimal cut off value for remission with the UCDAI was <2.5 points, which had a sensitivity and specificity of 0.82 and 0.89, respectively. The distribution of index scores by remission status is shown in fig 2A and 2B.
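The c statistic can be read as the probability that a randomly chosen subject in remission has a lower index score than a randomly chosen subject not in remission, with tied scores counted as one half. The following Python sketch computes it by direct pairwise comparison; the function name and the example scores are hypothetical and are not study data.

```python
from typing import Sequence


def c_statistic(remission_scores: Sequence[float],
                active_scores: Sequence[float]) -> float:
    """Area under the ROC curve for an index where lower scores mean less disease.

    Counts every (remission, active) pair: 1 if the remission score is lower,
    0.5 if the two scores are tied, 0 otherwise.
    """
    concordant = 0.0
    for r in remission_scores:
        for a in active_scores:
            if r < a:
                concordant += 1.0
            elif r == a:
                concordant += 0.5
    return concordant / (len(remission_scores) * len(active_scores))


# Hypothetical scores: three subjects in remission, three with active disease.
example_c = c_statistic([0, 1, 2], [2, 3, 4])
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.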

Figure 1

 Receiver operating characteristic (ROC) curves demonstrate that invasive indices predict patient defined remission accurately. ROC curves (sensitivity v 1−specificity plots) for the St Mark’s index (A) and ulcerative colitis disease activity index (UCDAI) (B) for the end point of remission are presented. Areas under the ROC curve were 0.91 and 0.94, respectively.

Figure 2

 Optimal remission cut offs for the invasive indices are sensitive and specific. The cut offs were chosen to optimise sensitivity and specificity for remission. For the St Mark’s index (A), this cut off was <3.5 points, resulting in a sensitivity of 0.75 and a specificity of 0.93 for patient defined remission. For the ulcerative colitis disease activity index (UCDAI) (B), the optimal cut off was <2.5 points, resulting in a sensitivity of 0.82 and a specificity of 0.89 for patient defined remission.

Patient defined remission correlates with <2.5 points for the SCCAI, <120 points for the Seo index, and >205 points for the IBDQ

For the non-invasive indices, we had complete data on 106 patient visits at which subjects reported whether or not they were in remission. Subjects reported that they were in remission at 60 of these visits and that they were not in remission at 46. ROC curves showed that the SCCAI (c statistic = 0.91), the Seo index (c statistic = 0.92), and the IBDQ (c statistic = 0.84) were good at predicting remission (fig 3A–C). Individual cut offs were then tested for their sensitivity and specificity for detection of patient defined remission. The optimal cut off value for remission with the SCCAI was <2.5 points, which had a sensitivity of 0.79 and specificity of 0.82. The optimal cut off value for remission with the Seo index was <120 points, which had a sensitivity of 0.96 and specificity of 0.82. The optimal cut off value for remission with the IBDQ was >205 points, which had a sensitivity of 0.81 and a specificity of 0.82. Side by side box plots of the disease activity values for the two non-endoscopic indices and the IBDQ with respect to remission status are presented in fig 4A–C.
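As an illustration of how the remission cut offs reported in these results might be applied as a dichotomous end point, the Python sketch below encodes the five thresholds in a lookup table. The table layout and function name are assumptions for the example; only the threshold values come from the text.

```python
# Remission cut offs as reported in the Results: (direction, threshold).
REMISSION_CUTOFFS = {
    "St Mark's": ("<", 3.5),
    "UCDAI": ("<", 2.5),
    "SCCAI": ("<", 2.5),
    "Seo": ("<", 120),
    "IBDQ": (">", 205),  # quality of life measure: higher is better
}


def meets_remission_cutoff(index: str, score: float) -> bool:
    """True if the score falls on the remission side of the reported cut off."""
    direction, threshold = REMISSION_CUTOFFS[index]
    return score < threshold if direction == "<" else score > threshold


# Hypothetical visit: SCCAI of 2 and IBDQ of 198.
sccai_remission = meets_remission_cutoff("SCCAI", 2)
ibdq_remission = meets_remission_cutoff("IBDQ", 198)
```

The two measures can disagree for the same visit, as in this hypothetical case, which is one reason the cut offs are reported separately for each index.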

Figure 3

 Receiver operating characteristic (ROC) curves demonstrate that non-invasive indices predict patient defined remission accurately. ROC curves (sensitivity v 1−specificity plots) for the simple clinical colitis activity index (SCCAI) (A), Seo index (B), and inflammatory bowel disease quality of life index (IBDQ) (C) for the end point of remission are presented. Areas under the ROC curve were 0.91, 0.92, and 0.84, respectively.

Figure 4

 Optimal remission cut offs for the non-invasive indices are sensitive and specific. The cut offs were chosen to optimise sensitivity and specificity for remission. For the simple clinical colitis activity index (SCCAI) (A), this cut off was <2.5 points, resulting in a sensitivity of 0.79 and a specificity of 0.82. For the Seo index (B), this cut off was <120 points, resulting in a sensitivity of 0.96 and a specificity of 0.82. For the inflammatory bowel disease quality of life index (IBDQ) (C), this cut off was >205 points, resulting in a sensitivity of 0.81 and a specificity of 0.82.

These end points also have good sensitivity and specificity for a “regulatory definition” of remission

A common definition of remission used by pharmaceutical companies and regulatory authorities to assess the outcome of clinical trials is the combination of (a) no more than grade I or grade II changes on a modified Baron endoscopic score12 (absence of friability) and (b) absence of visible blood reported by the patient. While this definition has never been tested or formally validated, it is in common use. For the end points defined above to achieve general acceptance, they must be tested against this “gold standard”, which we have termed the “regulatory definition of remission”.
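The two part regulatory definition can be written as a simple conjunction. In this Python sketch the inputs are hypothetical boolean flags; the mapping from a modified Baron grade to "friable" is deliberately left to the caller, because the grade numbering is not spelled out here.

```python
def regulatory_remission(mucosa_friable: bool, visible_blood_reported: bool) -> bool:
    """Regulatory definition as described in the text: no friability on
    endoscopy and no visible blood reported by the patient."""
    return (not mucosa_friable) and (not visible_blood_reported)


# Hypothetical subject: non-friable mucosa but patient reports visible blood.
example_regulatory = regulatory_remission(mucosa_friable=False,
                                          visible_blood_reported=True)
```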

The sensitivity and specificity of the remission end points defined for each of the indices were therefore calculated with the regulatory definition of remission as the gold standard. These are presented in table 2 in comparison with their sensitivity and specificity for patient defined remission. The results are similar to those with patient defined remission, suggesting that these cut offs for remission work well with either patient defined remission or the regulatory definition of remission as the gold standard. The UCDAI, which includes the two items in the regulatory definition of remission, performs extremely well in detecting regulatory remission, as would be expected by direct correlation. The St Mark’s index, which is also invasive but includes a number of other items, has reasonable sensitivity and specificity for both definitions of remission. The SCCAI, despite the lack of direct correlation and lack of endoscopic information, performs nearly as well as the UCDAI, and better than the St Mark’s index for the regulatory definition of remission. Surprisingly, the Seo index is actually more sensitive than the UCDAI for the regulatory definition of remission, and has reasonable specificity. The IBDQ cut off of >205 is not as sensitive or specific for the regulatory definition of remission, as might be expected for a quality of life measure that does not solely measure disease activity. These index end points for remission for both invasive and non-invasive indices are effective in identifying subjects in regulatory remission.

Table 2

 Sensitivity and specificity of defined cut offs for patient defined remission and regulatory definition of remission

Patient defined change in clinical status correlates with common disease activity indices

For the non-invasive indices, we had complete return visit data after 1–14 months of follow up for 56 individuals who reported the change in their clinical status on a seven point Likert scale (fig 5A). Subjects selected one point on the scale, which ranged from 1 = much better to 7 = much worse. A box plot for each point on the Likert scale in each non-invasive activity index versus the change in score of each index is presented in fig 5B–D. The Spearman correlation coefficients between the Likert scale for improvement and the SCCAI, Seo index, and IBDQ were 0.70, 0.59, and −0.64, respectively. The magnitude of the change in non-invasive indices was an accurate measure of patient perceived change in clinical status.
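A Spearman coefficient is the Pearson correlation of the ranks, with tied values given their average rank. The Python sketch below shows the calculation on hypothetical data. It assumes that the change in an index is computed as follow up minus baseline, which would explain why the coefficients are positive for the SCCAI and Seo index (lower Likert score and falling index both mean improvement) and negative for the IBDQ.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation (Pearson correlation of average ranks)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: Likert response and change in SCCAI (follow up - baseline).
likert = [1, 2, 3, 4, 5, 6, 7]
sccai_change = [-5, -3, -1, 0, 1, 1, 4]
rho = spearman(likert, sccai_change)
```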

Figure 5

 Likert scale and correlation of scores with patient defined improvement. (A) Likert scale used at the return visits, showing that scores of 1 (much better) and 2 (somewhat better) were grouped as clinically significant improvement, while scores in the range of 3–7 were considered to represent no significant improvement. (B–D) Correlation between the Likert scale improvement scores and changes in the scores on the simple clinical colitis activity index (SCCAI) (B), Seo index (C), and the inflammatory bowel disease quality of life index (IBDQ) (D).

The effect of patient recall on the correlations between the subjects’ perception of improvement on the seven point scale and the change in each non-invasive index was evaluated by calculating the Pearson correlation in different deciles of time between visits. The correlations were quite good until 240 days (80% of follow up), with correlations of 0.77 for the change in SCCAI, 0.72 for the change in Seo index, and −0.70 for the change in IBDQ. For the last 20% of subjects (241–421 days from initial visit until follow up), correlations were significantly worse, with 0.37 for the change in SCCAI, 0.33 for the change in Seo, and −0.64 for the change in IBDQ.

Patient defined significant improvement correlates with decreases of >1.5 points on the SCCAI and >30 points on the Seo index, or increases of >20 points on the IBDQ

Subjects who reported being either “much improved” or “somewhat improved” were defined as significantly improved. Those who were “slightly improved”, “about the same”, or one of the three levels of worsening, were not considered significantly improved. One could argue that individuals who report themselves “slightly improved” should be included in the group with clinically significant improvement, but a robust end point was desired, which would represent clinically important improvement in the judgment of the subject. Twenty one subjects reported that they were significantly improved while 35 subjects stated that they were not significantly improved.
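The grouping of Likert responses into a dichotomous outcome can be expressed directly. In this Python sketch the function name and the example responses are hypothetical; the rule itself (responses 1 and 2 count as significant improvement, 3 to 7 do not) follows the description above.

```python
from typing import List


def significantly_improved(likert_response: int) -> bool:
    """True for 1 (much better) or 2 (some better); False for 3 to 7."""
    if likert_response not in range(1, 8):
        raise ValueError("Likert response must be an integer from 1 to 7")
    return likert_response <= 2


# Hypothetical responses from six return visits (not study data).
responses: List[int] = [1, 2, 3, 4, 2, 7]
improved_count = sum(significantly_improved(r) for r in responses)
```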

Using the dichotomous outcome of improvement, we constructed ROC curves to determine how well change from baseline to follow up visit in each non-invasive measure would predict significant improvement. ROC curves showed that the SCCAI (c statistic = 0.84), the Seo index (c statistic = 0.82), and the IBDQ (c statistic = 0.82) were good at predicting significant improvement (fig 6A–C). Individual cut offs were then tested for their sensitivity and specificity for detection of patient defined improvement, and specificity was favoured in each case in order to produce a robust end point for improvement.

Figure 6

 Receiver operating characteristic (ROC) curves demonstrate that non-invasive indices predict improvement accurately. ROC curves for the simple clinical colitis activity index (SCCAI) (A), the Seo index (B), and the inflammatory bowel disease quality of life index (IBDQ) (C) for the end point of improvement are presented. Areas under the ROC curves were 0.84, 0.82, and 0.82, respectively.

The optimal cut off value for significant improvement in the SCCAI was a decrease of >1.5 points, which had a sensitivity and specificity of 0.67 and 0.80, respectively. The optimal cut off value for significant improvement with the Seo index was a decrease of >30 points, which had a sensitivity and specificity of 0.67 and 0.91, respectively. The optimal cut off value for significant improvement with the IBDQ was an increase of >20 points, which had a sensitivity and specificity of 0.62 and 0.91, respectively. These cut off thresholds are illustrated with the distributions of the change in index scores divided by improvement status in fig 7A–C.
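The improvement cut offs apply to the change between two visits rather than to a single score. The Python sketch below is a hypothetical helper that applies them, assuming change is computed as follow up minus baseline, and treating improvement on the IBDQ as an increase, as stated in the section heading and the figure 7 legend.

```python
def meets_improvement_cutoff(index: str, baseline: float, follow_up: float) -> bool:
    """True if the change between visits passes the reported improvement cut off."""
    change = follow_up - baseline
    if index == "SCCAI":
        return -change > 1.5   # decrease of more than 1.5 points
    if index == "Seo":
        return -change > 30    # decrease of more than 30 points
    if index == "IBDQ":
        return change > 20     # increase of more than 20 points
    raise KeyError(f"no improvement cut off defined for {index!r}")


# Hypothetical subject: SCCAI falls from 6 to 4, IBDQ rises from 150 to 165.
sccai_improved = meets_improvement_cutoff("SCCAI", baseline=6, follow_up=4)
ibdq_improved = meets_improvement_cutoff("IBDQ", baseline=150, follow_up=165)
```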

Figure 7

 Optimal improvement cut offs for the non-invasive indices are sensitive and specific. The cut offs were chosen to optimise sensitivity and specificity for improvement. For the simple clinical colitis activity index (SCCAI) (A) this cut off was a decrease by >1.5 points, resulting in a sensitivity of 0.67 and a specificity of 0.80. For the Seo index (B), this cut off was a decrease by >30 points, resulting in a sensitivity of 0.67 and a specificity of 0.91. For the inflammatory bowel disease quality of life index (IBDQ) (C), this cut off was an increase by >20 points, resulting in a sensitivity of 0.82 and a specificity of 0.91.

DISCUSSION

Dichotomous clinically significant end points are appealing for clinical trials, as they are easy to understand and, when achieved, they support the clinical significance of the outcome of a clinical intervention. While dichotomous clinically significant end points for remission and improvement are commonly accepted for Crohn’s disease using the Crohn’s disease activity index, no end points for ulcerative colitis have been defined. In this study we determined appropriate end points for significant improvement and for remission in subjects with ulcerative colitis on standard indices, and measured the sensitivity and specificity of each end point.

Patient centred definitions of clinically significant outcomes are necessary to determine if therapies will be perceived as beneficial by patients with disease. If improvement by an arbitrary number of points on a validated scale is not perceived as significant improvement or remission by patients, they will seek additional or alternative health care. The use of alternative health care is prevalent in the USA,13 and is relatively common in inflammatory bowel disease.14,15 Some part of this is due to dissatisfaction with current therapy. Future clinical trials must measure end points that are objective and are also important to patients. By determining objective index end points for patient defined remission and significant improvement in common disease activity indices, we hope to provide objective and meaningful goals for future clinical studies. The finding that these index end points derived from patient defined remission can also predict a standard regulatory definition of remission supports the validity of this approach.

Additional data from the literature also support the use of this approach to define remission in ulcerative colitis. Jowett et al determined the appropriate cut off for the SCCAI for clinician defined relapse of ulcerative colitis in a cohort in Britain, and found this to be ⩾5 points.16 This is consistent with our finding that remission is ⩽2 points on the same scale. By subtracting the two points needed for clinically significant improvement on the SCCAI (this study) from Jowett’s five point threshold for relapse, a SCCAI score of 3 is obtained, which is very close to the threshold of 2 points for achieving remission determined in this study. This internal consistency between relapse, significant improvement, and remission among different studies suggests that these results are reasonably consistent in US and British populations with ulcerative colitis.

The gap between these two end points of remission and relapse is explained in part by the difference between patient defined remission and clinician defined relapse, and in part by the different populations in each study. In our experience, three or four points on the SCCAI appears to be a “grey zone” in which patients are not entirely in remission (our study), yet not clearly in relapse (Jowett data). We would prefer to set a rigorous standard (2 points or lower) for achieving remission on the SCCAI, rather than leave patients dissatisfied with a “remission” of 3 or 4 points on the SCCAI that they do not perceive as remission.
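The three ranges discussed here can be summarised as a small classifier. This Python sketch is illustrative only: the function name and labels are hypothetical, the remission threshold (2 points or lower) is from this study, and the relapse threshold (5 points or higher) is the one attributed above to Jowett et al. It assumes an integer SCCAI score.

```python
def sccai_zone(score: int) -> str:
    """Classify an integer SCCAI score into the three ranges discussed above."""
    if score <= 2:
        return "remission"   # patient defined remission (this study)
    if score >= 5:
        return "relapse"     # clinician defined relapse (Jowett et al, as cited)
    return "grey zone"       # 3 or 4 points: neither remission nor clear relapse


zones = [sccai_zone(s) for s in range(0, 7)]
```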

It is important to note that the Seo index, which does not correlate as well with the endoscopic indices as does the SCCAI,11 performs very well in predicting clinically significant end points, with a sensitivity×specificity product for patient defined remission higher than both the SCCAI and the St Mark’s index. Assessment of systemic inflammation with laboratory values in the Seo index is unique among the existing disease activity indices, and gives this index predictive value for clinical end points that is greater than would be expected from its correlation with other indices. The future addition of laboratory measures of inflammation to existing indices may yield a new disease activity index that would be superior to existing indices in the prediction of important clinical outcomes.
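The comparison of sensitivity×specificity products can be checked from the values reported in the Results. The short Python sketch below recomputes the products for patient defined remission; the dictionary layout is an assumption, and the invasive and non-invasive indices were assessed on different samples (56 subjects and 106 visits, respectively), so the ranking is only a rough guide.

```python
# (sensitivity, specificity) for patient defined remission, from the Results.
REPORTED = {
    "St Mark's": (0.75, 0.93),
    "UCDAI": (0.82, 0.89),
    "SCCAI": (0.79, 0.82),
    "Seo": (0.96, 0.82),
    "IBDQ": (0.81, 0.82),
}

products = {name: se * sp for name, (se, sp) in REPORTED.items()}
ranked = sorted(products, key=products.get, reverse=True)

for name in ranked:
    print(f"{name}: {products[name]:.2f}")
```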

An important limitation of this study is that repeated measurements of the invasive indices were not done, as a second endoscopy was not clinically indicated. Therefore, we were unable to determine how well the invasive indices predict patient defined improvement. As these invasive indices are able to predict patient defined remission well, it is reasonable to think that they may also predict improvement well, but that was not investigated in this study.

A second important limitation is that there was only a single follow up time point at which improvement was measured. In typical longitudinal clinical studies, multiple time points are measured. However, we did measure patients at a wide range of follow up times (30–421 days) and found that the measures did perform well. The degradation in correlation after 240 days may reflect decreased function of the activity indices but we feel that it is more likely due to a mixture of worsening recall by the subjects and increased noise in the smaller sample size (only 10 patients had follow up after 240 days).

This study identified objective index end points for remission and significant clinical improvement in US patients with ulcerative colitis which can be used as goals for clinical therapy and as end points in evaluating the results of clinical trials. While continuous end points can provide more statistical power, by the use of these dichotomous end points, we can objectively identify clinically significant changes in study subjects and allow readers of the literature to compare therapeutic efficacy between trials.

While the ability of these indices to predict clinical improvement and remission is encouraging, and suggests that these indices are useful measures of disease activity in ulcerative colitis, these indices have never been formally validated. In clinical trials with repeated measures of subjects, it is critical to know that the disease activity instrument used is stable in patients who have no improvement and is sensitive to change in patients who either improve or worsen. Formal evaluation of the validity of these indices for use in clinical trials is needed.

Acknowledgments

We would like to acknowledge the contributions of Sheryl Korsnes, MS, in recruiting patients in our endoscopy unit, and of Brenda Gillespie, PhD, who assisted with the statistical analysis.

Dr Higgins is supported by NIH K12 RR-017607-01. Dr Zimmermann is supported by NIH R01 DK-56750-01.

Footnotes

  • Conflict of interest: None declared.