Big data is defined as being large, varied or frequently updated, and usually generated from real-world interaction. With the unprecedented availability of big data, comes an obligation to maximise its potential for healthcare improvements in treatment effectiveness, disease prevention and healthcare delivery. We review the opportunities and challenges that big data brings to gastroenterology. We review its sources for healthcare improvement in gastroenterology, including electronic medical records, patient registries and patient-generated data. Big data can complement traditional research methods in hypothesis generation, supporting studies and disseminating findings; and in some cases holds distinct advantages where traditional trials are unfeasible. There is great potential power in patient-level linkage of datasets to help quantify inequalities, identify best practice and improve patient outcomes. We exemplify this with the UK colorectal cancer repository and the potential of linkage using the National Endoscopy Database, the inflammatory bowel disease registry and the National Health Service bowel cancer screening programme. Artificial intelligence and machine learning are increasingly being used to improve diagnostics in gastroenterology, with image analysis entering clinical practice, and the potential of machine learning to improve outcome prediction and diagnostics in other clinical areas. Big data brings issues with large sample sizes, real-world biases, data curation, keeping clinical context at analysis and General Data Protection Regulation compliance. There is a tension between our obligation to use data for the common good and protecting individual patient’s data. We emphasise the importance of engaging with our patients to enable them to understand their data usage as fully as they wish.
- health service research
- clinical trials
Statistics from Altmetric.com
Twitter @drjamiec, @Rutter_Matt
Contributors JC wrote the definitions, Sources, research and issues sections of the manuscript, compiled others’ section and submitted the paper. JC addressed reviewers comments. BB wrote the machine learning and artificial intelligence section, contributed to the definitions and issues sections, and reviewed drafts. EM contributed to the power of linkage section and reviewed drafts. MR wrote the power of linkage section, significantly edited the entire manuscript and reviewed drafts.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Commissioned; externally peer reviewed.
If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.