Research
Development and validation of a new disease severity index: the Inflammatory Bowel Disease Index (IBDEX)
  1. Laith Alrubaiy,
  2. Phedra Dodds,
  3. Hayley Anne Hutchings,
  4. Ian Trevor Russell,
  5. Alan Watkins,
  6. John Gordon Williams
  1. College of Medicine, Swansea University, Swansea, UK
  1. Correspondence to Dr Laith Alrubaiy, College of Medicine Swansea University, Room 220, Grove building, Swansea SA2 8PP, UK; l.alrubaiy{at}swansea.ac.uk

Abstract

Objective To develop, validate and apply a generic clinical severity index applicable to all adult patients with inflammatory bowel disease (IBD).

Design A review of the literature and an expert focus group consultation were carried out in order to draw out relevant items from existing literature. The new index was called the IBD Index (IBDEX). Standard psychometric analysis was carried out. The construct validity was assessed against biochemical markers, clinical and endoscopic indices. The new index was completed again within 6 weeks to check responsiveness and reproducibility.

Results IBDEX was used to assess 255 adult patients with IBD (125 with Crohn’s disease and 130 with ulcerative colitis), and 64 patients were re-evaluated within 6 weeks. It had good internal consistency (Cronbach's α=0.79) and correlated very well with the Harvey Bradshaw Index (r=0.94), the Simple Clinical Colitis Activity Index (r=0.92), the Mayo Clinic Index (r=0.87) and the Simple Endoscopic Score (r=0.76), all with p values <0.05. IBDEX had a moderate positive correlation with C reactive protein (r=0.51) and erythrocyte sedimentation rate (r=0.36), both with p values <0.05. The test–retest reliability was good (intraclass correlation coefficient 0.97) and the responsiveness ratio was 2.27.

Conclusions IBDEX is the first properly validated Clinical Disease Severity Index in IBD. Our results showed that it is valid, reliable and reproducible and has the potential to be used in clinical practice.

  • INFLAMMATORY BOWEL DISEASE
  • CROHN'S DISEASE
  • CROHN'S COLITIS
  • ULCERATIVE COLITIS

Introduction

Inflammatory bowel disease (IBD) affects approximately one person in every 250 in the UK population.1 It includes ulcerative colitis (UC) and Crohn's disease (CD). Careful assessment of disease severity is required to inform appropriate treatment and assess progress. Clinical assessment of disease severity is increasingly being used in choosing the method of treatment and monitoring response.2–4 In clinical practice, a standardised and quantitative evaluation of the severity of IBD is needed.

The severity of disease in IBD can be assessed using clinical, laboratory, endoscopic, histopathological and radiological indices. Although histopathological or endoscopic examinations can accurately assess inflammation in the intestinal mucosa, they are invasive, time consuming and expensive and are, therefore, not routinely used in outpatient clinics. Imaging techniques5 ,6 can be used to assess the severity of IBD, especially in patients with CD rather than UC, but they are cumbersome and not readily available. Commonly used laboratory markers to assess the activity of IBD are erythrocyte sedimentation rate (ESR) and C-reactive protein (CRP). However, laboratory results may underestimate the severity of disease that results from the structural damage associated with IBD, especially in CD.7 In UC, laboratory markers are not useful in distal proctitis because of the small area of inflammation involved.8 ,9 Faecal markers are being increasingly used in assessing inflammation in patients with established IBD. Commonly used faecal markers are lactoferrin, polymorphonuclear elastase and calprotectin.7 However, faecal markers are not specific for IBD, since they can be increased in mucosal inflammation or infection. The main limitation of symptomatic clinical indices is the subjective definition of symptoms such as frequency of bowel movement, urgency, well-being and global assessment. Perception of symptoms can vary between men and women.10 A few symptoms, such as abdominal pain and frequency of bowel movement, can be due to functional intestinal disorder. Diarrhoea in CD can be due to other causes, such as bacterial overgrowth or malabsorption, rather than a genuine flare-up of IBD. However, clinical indices are still widely used in clinical trials because they are simpler and easier to use than endoscopic, histopathological and imaging techniques.

A number of clinical indices have been put forward using different parameters based on different principles.11 ,12 These indices are routinely used in clinical trials to assess response to therapy and are becoming more commonly used in clinical practice. However, none of these clinical indices have been properly validated using a robust methodology.11 ,12

The need for a simple, reliable and valid severity index that can be quickly completed in the clinical setting and is applicable to the majority of patients with IBD is still unmet. Such an index will aid clinical decision making, help to assess response to treatment and to detect relapse early, and will be a useful tool in any future IBD registry.13

The aim of this study was to develop a clinical severity index for patients with IBD that could be completed based on clinical assessment only and that had proven validity and reliability on rigorous psychometric testing.14

Methods

Devising the items

Items were generated through a literature search to identify questions that can be used to assess the severity of disease in IBD clinically. We asked an expert panel of seven gastroenterologists, one IBD nurse and two specialist registrars to review the questions. They were asked to rate the relevance of each question in assessing the disease severity clinically (extremely relevant, very relevant, slightly relevant and not relevant), and only items that were extremely relevant or very relevant were included in the new index.14 We called the new index the IBD Index (IBDEX).

To test for acceptability and lack of ambiguity, IBDEX was pretested by two gastroenterologists and one IBD specialist nurse in a pilot study of 20 patients with IBD. Users were asked four supplementary questions about whether they would suggest any changes or additions to the new severity index, and were invited to explain their responses:

  1. Did you find any question difficult to understand?

  2. Was there any question you did not want to answer?

  3. Do you want to add an additional question?

  4. Do you want to remove any of the questions?

Main validation study and sample size

IBDEX was validated on patients with IBD in four large hospitals. The inclusion criteria were adult patients with confirmed diagnosis of UC or CD according to the European Crohn’s and Colitis Organisation criteria,15 ,16 and the extent of the disease was classified according to the Montreal classification.17 We excluded patients who were in a vulnerable group (such as people with mental illness or memory problems, learning difficulties or physical disabilities) and those who were unable to consent.

There is no fixed rule in the literature about the number of patients required to validate outcome measures. However, a ratio of 5 or 10 patients per item has been suggested.18 Recent guidelines suggested that a sample of at least 100 patients is sufficient for a proper validation study.19 We, therefore, aimed for a sample size of at least 100 patients for the purpose of validating IBDEX.

IBDEX was recorded by the healthcare professionals when reviewing patients with IBD. Data were also collected about patients’ current disease severity using the Harvey Bradshaw Index20 (HBI) or Simple Clinical Colitis Activity Index21 (SCCAI) for CD or UC, respectively, endoscopic indices (Mayo Clinic Score22 and Rachmilewitz Index23 for UC and Simple Endoscopic Score24 for CD) and biochemical markers (haemoglobin, white cell count, CRP, ESR and albumin).

This study was approved by the South East Wales Research Ethics Committee (Reference 11/WA/0239), and the National Health Service code of confidentiality and data protection was followed.

Psychometric analysis

Data were analysed using the Statistical Package for Social Sciences (SPSS) V.19 licensed for Swansea University. We measured the following psychometric properties:

  1. Principal component analysis (PCA)6 was used to assess the underlying dimensions of IBDEX. PCA is a statistical technique for determining which questions fit together as specific factors (components or domains) and which account for the greatest variance in the scale. A factor was considered important if its eigenvalue (a statistical measure of its power to explain variation between patients) exceeded 1.0.14 Questions were considered as contributing to IBDEX if they had a factor loading of at least 0.4 on one of the important factors, and had face and content validity as judged by the focus group. Questions not contributing to any of the important factors in this way were considered for removal from the final instrument. We checked the Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy (which should be more than 0.5) and Bartlett's test of sphericity (which should be significant) to confirm that the sample was suitable for PCA.25–27

  2. Internal consistency is the correlation between the different questions in IBDEX. The internal consistency of the IBDEX was assessed by computing item-total correlations (which should be between 0.2 and 0.8) and Cronbach’s α (which should be more than 0.7).14 ,28

  3. Construct validity is the correlation of IBDEX with other instruments that assess the severity of IBD. Construct validity is commonly assessed by Pearson's correlation coefficient (r).14 The construct validity of IBDEX was assessed against biochemical markers (CRP, white cell count, haemoglobin, albumin and ESR), and clinical indices: HBI20 for CD, and the SCCAI21 for UC. These clinical indices were selected because they are easy to use, correlate well with more complex indices and are widely cited in the literature. Endoscopic indices were also recorded (Mayo Clinic Score22 in UC and Simple Endoscopic Score24 in CD). A Pearson's correlation coefficient of more than 0.4 was regarded as acceptable.14 ,28

  4. Discriminative validity is the ability of IBDEX to differentiate between patients with active IBD and those in remission. Active disease was defined as an HBI Score of 5 or more, which corresponds to a Crohn’s Disease Activity Index Score of over 150,29 ,30 for CD, or an SCCAI Score of 3 or more21 ,31 for UC. Patients were stratified into active and remission groups, and a t test was used to assess the discriminative validity of IBDEX.

  5. Reproducibility, or test-retest reliability, assesses the consistency between successive applications of IBDEX.14 ,28 To assess the reproducibility, IBDEX was used to assess the disease severity of a subgroup of patients who were reviewed on two occasions within 2–6 weeks. In addition to IBDEX, a retest questionnaire asked the healthcare provider to assess the changes in the disease condition (improved, got worse or remained the same) since the last assessment, based on clinical judgement, biochemical markers and endoscopic findings if available. Patients whose condition remained the same were included in the reproducibility analysis. We assessed the reproducibility of scores for these stable patients using the intraclass correlation coefficient.32 An intraclass correlation between the first and second sets of IBDEX Scores should exceed 0.75 for good reproducibility.28 ,33

  6. Responsiveness is the ability of IBDEX to detect changes in the clinical condition of patients. By contrast with reproducibility, we assessed responsiveness in the retested subgroup of patients whose bowel condition had changed (got worse or improved) between the two assessments 2–6 weeks apart, as rated by the healthcare provider. To assess IBDEX responsiveness, we computed the responsiveness ratio,32 which is calculated by dividing the mean change in scores for patients who had a change by the SD of the scores of stable patients. This ratio should exceed 0.5 for good responsiveness.28 ,33

  7. Interobserver reliability measures the consistency between IBDEX Scores completed by different assessors on the same patients. To evaluate this feature, we asked two healthcare professionals to independently assess the same patients using IBDEX on the same day, and calculated the intraclass correlation coefficient between their scores. A value of >0.75 was regarded as acceptable.14 ,28

  8. Stepwise regression is a statistical technique for exploring the relationship between a dependent (‘predicted’) variable (ie, IBDEX total score) and several independent (‘predictor’) variables (ie, IBDEX questions). We used this method to identify a shorter version of IBDEX by finding the smallest combination of questions that best predicts the total IBDEX Score.34 ,35
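As an illustration of the internal consistency measures in item 2 above, the following is a minimal sketch of Cronbach's α and item-total correlations in plain Python. It is not the SPSS procedure used in the study. The function names are hypothetical, sample variances are assumed, and the item-total correlation is computed in its "corrected" form (each item against the sum of the other items), which the text does not specify.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(rows):
    """rows: one list of item scores per patient, all of the same length."""
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # one tuple of scores per item
    sum_item_var = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum_item_var / total_var)


def corrected_item_total(rows):
    """Correlation of each item with the sum of the remaining items
    (assumption: the 'corrected' form of the item-total correlation)."""
    items = list(zip(*rows))
    totals = [sum(r) for r in rows]
    return [
        pearson_r(col, [t - v for t, v in zip(totals, col)])
        for col in items
    ]
```

With the thresholds quoted above, an index would be judged internally consistent if `cronbach_alpha(rows)` exceeds 0.7 and each value returned by `corrected_item_total(rows)` lies between 0.2 and 0.8. The sketch does not guard against items with zero variance.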

Results

Devising the items and pretesting

The literature review identified 18 questions used to assess the severity of disease in IBD (table 1). The expert panel reviewed these questions to check their suitability for inclusion in the new index. More than half of the experts rated the use of antidiarrhoeal drugs and faecal incontinence as not relevant or slightly relevant in assessing the severity of IBD clinically, and we, therefore, considered them as candidates for removal. Nausea and anorexia were mentioned only in the Powell-Tuck Index,36 and they were considered non-specific for IBD. Physician global assessment is a crude method of assessing the severity of IBD; we used it in assessing patients at baseline (table 2), but it was not included in the IBDEX list of questions. Additionally, a few minor changes in the wording and ordering of the questions were suggested by the focus group. The resultant 14-item index was called the IBDEX (see online supplementary file appendix 1).

Table 1

Commonly used clinical severity indices

Table 2

The characteristics of the patient sample for IBDEX validation*

Scoring of IBDEX

IBDEX is completed by the healthcare provider while assessing patients. It includes a combination of patient-reported symptoms and clinical observations. Eleven questions are scored using a Likert Scale, with a list of answer options each carrying a numerical score. Three questions, about frequency of stools, nocturnal diarrhoea and pain, have open numerical answers. This change in response options was made to avoid the ceiling effect19 and to improve the discriminative ability of IBDEX in acutely unwell patients. The total score of IBDEX is the sum of all answers and ranges from zero to more than 23; the higher the score, the more severe the IBD condition.
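The scoring rule described above can be sketched as a simple sum. The function and parameter names below are hypothetical, and the answer options and their numerical scores are those of the questionnaire in the online appendix, which are not reproduced here. The check for negative values is an assumption based on the stated minimum total of zero.

```python
def ibdex_total(likert_scores, stool_frequency, nocturnal_diarrhoea, pain):
    """Sum the 11 Likert-scored answers and the three open numerical answers.

    likert_scores: the numerical scores of the 11 Likert-scaled questions.
    The remaining three arguments are the open numerical answers.
    """
    if len(likert_scores) != 11:
        raise ValueError("expected 11 Likert-scored items")
    answers = list(likert_scores) + [stool_frequency, nocturnal_diarrhoea, pain]
    if any(a < 0 for a in answers):
        # Assumption: item scores start at zero, as the total does.
        raise ValueError("item scores cannot be negative")
    return sum(answers)
```

Because the three open answers have no upper bound, the total has no fixed maximum, which is consistent with the stated aim of avoiding a ceiling effect in acutely unwell patients.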

Pilot study

The pilot study on 20 patients with stable IBD (10 UC and 10 CD), with ages from 30 to 55 years, showed that the questions were easy to complete and well received by the healthcare professionals. No additional questions were added or suggested as a result of the pilot study. The mean completion time for IBDEX was 5 min (±3 min).

Main validation study

IBDEX was validated on 255 patients with IBD aged 18–90 years (125 patients with CD and 130 patients with UC; table 2), and 64 patients were re-evaluated within 6 weeks. Sixty patients (about 24%) were inpatients and the rest were assessed in outpatient clinics. All forms were completed and there were no missing data. Apart from perianal CD, the patients represented all the presentations of IBD (table 3): ileal CD (30 patients), ileocolonic CD (28 patients), colonic CD (67 patients), ulcerative proctitis (22 patients), left-sided UC (75 patients) and extensive UC (33 patients).

Table 3

Types of IBD of patients, according to the Montreal classification17

Psychometric analysis

Internal consistency and underlying dimensions

All items, with the exception of abdominal mass, had good item-total correlation between 0.2 and 0.8 (table 4). The internal consistency was good with Cronbach's α of 0.79.

Table 4

Internal consistency and stepwise regression of the IBDEX questions

The KMO measure of sampling adequacy (KMO=0.86) and Bartlett's test of sphericity (p<0.001) confirmed that the sample was suitable for PCA. We identified four important factors with eigenvalues greater than 1.0. Together they accounted for 60.4% of the total variance in the score. All items had a good factor loading of more than 0.4 (table 5). To facilitate interpretation, we attributed each question to one of the principal factors according to its factor loading. Attribution of the 14 questions to their factors showed that the first factor covers bowel symptoms, the second factor covers general well-being, the third factor covers general examination findings and the fourth factor mainly covers abdominal examination findings.

Table 5

Principal component analysis of the IBDEX questions

Construct and discriminative validity

IBDEX had very good correlation with other clinical severity indices: HBI (r=0.94), SCCAI (r=0.92), Mayo Clinic Index (r=0.87) and Simple Endoscopic Score (r=0.76), all with p values <0.05. IBDEX had moderate correlation with CRP (r=0.51) and ESR (r=0.36), both with p values <0.05. IBDEX did not correlate well with haemoglobin (r=0.01), white cell count (r=0.01) or albumin (r=−0.21).
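For readers who want to reproduce this kind of check on their own data, the sketch below computes Pearson's correlation coefficient and compares it with the acceptability threshold of 0.4 stated in the Methods. The function names are hypothetical and the p value calculation is omitted.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson's r between paired scores, e.g. IBDEX totals and HBI scores."""
    if len(x) != len(y):
        raise ValueError("x and y must be paired")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def acceptable_construct_validity(r, threshold=0.4):
    """True when r is above the threshold quoted in the Methods (more than 0.4)."""
    return r > threshold
```

Applied to the coefficients reported above, r=0.51 for CRP is above the 0.4 threshold, whereas r=0.36 for ESR is below it.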

To examine the discriminative validity of IBDEX and show that its scores differ significantly between patients in remission and those with active IBD, patients were stratified according to their disease activity into two groups: remission and active, according to HBI and SCCAI. There was a significant difference (p<0.05) in mean total IBDEX Scores of patients with active and inactive IBD.
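The stratification and comparison described above can be sketched as follows. This is an illustration only: the names are hypothetical, the cut-offs are those given in the Methods (HBI of 5 or more for CD, and SCCAI read here as 3 or more for UC), and a pooled-variance Student's t statistic is assumed because the text does not say which form of the t test was used. The p value lookup is omitted.

```python
from math import sqrt
from statistics import mean, variance

# Cut-offs for active disease, taken from the Methods (SCCAI value as read there).
ACTIVE_CUTOFF = {"CD": 5, "UC": 3}


def is_active(diagnosis, index_score):
    """diagnosis is "CD" (index_score is HBI) or "UC" (index_score is SCCAI)."""
    return index_score >= ACTIVE_CUTOFF[diagnosis]


def pooled_t(group_a, group_b):
    """Student's t statistic and degrees of freedom for two independent groups."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / se, na + nb - 2
```

In use, the IBDEX totals of patients for whom `is_active` returns true would form one group and the remaining totals the other, and the resulting statistic would be compared with the t distribution on the returned degrees of freedom.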

Reproducibility

Of the 64 patients who were assessed within a 2–6 week period, 31 patients had their disease condition unchanged in the second visit, and were included in the reproducibility analysis. The correlation between the test and retest IBDEX Scores was very good (intraclass correlation coefficient 0.97, p value <0.05).
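The intraclass correlation coefficient can be computed in several ways, and the text does not state which model was selected in SPSS. The sketch below assumes the simplest form, the one-way random-effects single-measure coefficient, often written ICC(1,1); the function name is hypothetical.

```python
from statistics import mean


def icc_oneway(scores):
    """One-way random-effects ICC for n patients each scored k times.

    scores: one row per patient, e.g. (first_visit_score, second_visit_score).
    Assumption: ICC(1,1); other ICC models give different values.
    """
    n, k = len(scores), len(scores[0])
    grand = mean([v for row in scores for v in row])
    row_means = [mean(row) for row in scores]
    # Between-patient and within-patient mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(scores, row_means) for v in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

The same function could be applied to pairs of scores from two assessors on the same day to estimate interobserver reliability, with values above 0.75 meeting the criterion set out in the Methods.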

Responsiveness

Responsiveness was assessed for 33 patients whose disease severity had changed, as rated by the health professionals (10 had improved and 23 had worsened). The number was not large enough to support separate analyses for those who improved and those who worsened. The responsiveness ratio was 2.27, which suggests that IBDEX is highly responsive to change.
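A responsiveness ratio of this kind can be sketched as below. Two points are assumptions rather than details taken from the study: the change in score is taken as an absolute value, so that improvement and worsening do not cancel when analysed together, and the denominator is the SD of the score changes in stable patients. The function name is hypothetical.

```python
from statistics import mean, stdev


def responsiveness_ratio(changed_deltas, stable_deltas):
    """Mean absolute score change in changed patients divided by the SD of
    score changes in stable patients (both choices are assumptions)."""
    mean_change = mean([abs(d) for d in changed_deltas])
    return mean_change / stdev(stable_deltas)
```

Here `changed_deltas` and `stable_deltas` are the differences between second and first IBDEX totals in the changed and stable subgroups. A result above 0.5 would meet the criterion for good responsiveness given in the Methods.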

Interobserver reliability

Interobserver reliability was assessed for 32 patients who were assessed independently by two healthcare professionals during the same visit to the hospital. The intraclass correlation coefficient was excellent with a value of 0.9 (p<0.05).

Stepwise regression and identification of a shorter version of IBDEX

Stepwise regression of the 14 items on the total IBDEX Score (table 4) showed that abdominal pain or discomfort, stool frequency, stool consistency, general well-being, nocturnal diarrhoea and blood in stool contributed to more than 97% of the total score variance. Therefore, these items were considered as strong candidates to be included in a short version of IBDEX.

Discussion

The purpose of this study was to develop a valid and reliable tool to assess the severity of IBD using clinical parameters and to remove redundant items that do not add information to the index. We used clinical and psychometric approaches in developing IBDEX. Items were identified based on literature review, the opinion of a focus group of IBD experts and psychometric analysis.28 IBDEX consisted of 14 items and was well received by the health professionals.

We tested the questionnaire on 255 patients with IBD with almost equal numbers of patients with CD and UC (125 with CD and 130 with UC). Sixty patients (about 24%) were inpatients and 195 patients (about 76%) were assessed in outpatient clinics. With the exception of perianal CD, the patients represented all the presentations of IBD: Crohn’s disease (ileal CD (55 patients), ileocolonic CD (45 patients), colonic CD (25 patients)) and UC (proctitis (10 patients), left-sided (69 patients) and extensive (51 patients)).

The internal consistency (or homogeneity) of the items was good (Cronbach's α of 0.79), with item-total correlations between 0.2 and 0.8 as suggested by other authors.14 ,28 PCA suggested that there were four important factors. All items had a factor loading of more than 0.4 on at least one of these factors, meaning that all items contributed to the total IBDEX Score.14 ,28

IBDEX measures the severity of IBD using clinical parameters. Therefore, we expected IBDEX to have a positive correlation with other measures of severity, with the largest correlation with other clinical severity indices, as they both measure clinical parameters. Indeed, IBDEX had very good correlation with other clinical indices (HBI for CD and SCCAI for UC) and endoscopic indices (Mayo Score and Rachmilewitz Score for UC, Simple Endoscopic Score for CD), with Pearson's correlation coefficients (r) of more than 0.4.14 We used these indices because they are simple and easily obtainable on the same day and had good correlation with disease severity. It is well known that the clinical findings of IBD can be due to structural damage rather than the inflammatory process. This might explain the borderline correlation with CRP and ESR and the lack of correlation with haemoglobin, white cell count and albumin levels. Findings also showed that IBDEX was a very useful tool to differentiate between patients with active and inactive IBD. We did not use faecal markers of inflammation, such as faecal calprotectin, in our study due to local hospital policies and lack of availability in certain sites.

We have shown that IBDEX had excellent reproducibility (test–retest reliability) and responsiveness when it was repeated on a small subgroup of patients within 6 weeks. IBDEX, therefore, has the potential to be a useful tool for longitudinal monitoring of patients in clinical practice.

More than 50% of the expert focus group rated the use of antidiarrhoeal drugs (ie, part of CD severity index) and faecal incontinence as not relevant and slightly relevant, respectively. Nausea, vomiting and anorexia were mentioned only in the Powell-Tuck Index,36 and they were considered non-specific for IBD. Physician global assessment is a crude method of assessing the severity of IBD; we used it in assessing patients at baseline (table 2), although it was not included in the IBDEX list of questions. The finding of abdominal mass on examination had poor item-total correlation. We also attempted to shorten the index and remove redundancy by carrying out stepwise regression of the total IBDEX Score on the individual questions. The items that were candidates for the shorter version of IBDEX were abdominal pain or discomfort, stool frequency, stool consistency, general well-being, nocturnal diarrhoea and blood in stool, which contributed to more than 97% of the total score variance.

Assessing the clinical severity of IBD is an important part of clinical practice. Several clinical indices have been developed to aid this assessment. However, most of the indices were designed for use in particular trials or in a certain group of patients, and none has been properly validated.2 ,11 ,12 Therefore, there is no ‘gold standard’ clinical severity index in IBD, and investigators choose their clinical indices based on their patient group or individual preferences. This issue has been identified by investigators who are involved in designing and implementing clinical trials in CD and UC.11 ,12 The newly established UK IBD Registry will need a simple measure that allows longitudinal monitoring of patients’ response to therapy in outpatient and inpatient settings. Therefore, having a clinical severity index that is valid, reliable and suitable for all presentations of IBD will be very useful. Although CD and UC differ from the histopathological point of view, there is much clinical overlap and we have, therefore, chosen to develop an index that can be used in both conditions. We believe the IBDEX will have wide applicability in the clinical management of patients with IBD and in research. We have not assessed its usefulness in patients with perianal CD, nor in patients with a stoma, and further development and/or validation will be needed for these groups. Although we identified the items for the short version of IBDEX, further studies are needed to validate the short version in a larger group of patients.

In conclusion, we developed a valid and reliable tool that is useful in assessing the severity of IBD, both UC and CD, using clinical findings. The new index will facilitate quick decision making in outpatient and inpatient settings and help to monitor patients’ management. IBDEX will be freely available for anyone to use, subject to approval from the corresponding author.

SIGNIFICANCE OF THIS STUDY

  • What is already known about this subject?

  • Assessing the severity of IBD is an important part of clinical practice.

  • A number of clinical indexes have been put forward using different parameters.

  • However, none of these clinical indexes have been properly validated using a robust methodology.

  • What are the new findings?

  • We used a thorough clinical and psychometric approach to develop a new clinical index to assess the severity of IBD, called the Inflammatory Bowel Disease Index (IBDEX).

  • IBDEX was well received by healthcare professionals.

  • IBDEX demonstrated good validity and reliability when used to assess the disease severity of 255 patients with IBD.

  • How might it impact on clinical practice in the foreseeable future?

  • IBDEX will facilitate quick decision-making in outpatient and inpatient settings and help to monitor patients' management.

  • IBDEX will also be a useful tool in clinical research to assess patients with IBD and their response to new therapies.

Acknowledgments

The authors would like to thank Dr Ian Rees, Dr Barney Hawthorne, Dr Dharmaraj Durai, Dr Linzi Thomas, Dr Ian Arnott, Dr Mesbah Rahman, Dr Ahsan Malik, Dr Sinan Al-Rubaye, Dr Peter Neville, Mr Dean Harris, Dr Keith Bodger and Dr Simon Travis for their help in reviewing IBDEX questions, identifying patients and allowing us to recruit patients under their care. Most of all, the authors wish to thank the patients who participated in the study.

Footnotes

  • Contributors LA is the chief investigator of the study. He contributed in writing the manuscript, designing the questionnaires, data collection and analysis. PD contributed in recruiting patients, data collection and all drafts of the manuscript. AW contributed to the statistical analysis of the study. HAH, ITR, and JGW contributed to designing the questionnaires, all drafts of the manuscript and data analysis.

  • Funding This work was supported by the Welsh Clinical Academic Training scheme and is a collaboration between Swansea University, the Wales Deanery and the Welsh Government.

  • Competing interests None.

  • Ethics approval South Wales research ethics committee.

  • Provenance and peer review Not commissioned; externally peer reviewed.