

Curriculum based clinical reviews
The molecular genetics of colorectal cancer
Iain Ewing (1), Joanna J Hurley (2), Eleni Josephides (3), Andrew Millar (1)

  1. Department of Gastroenterology, North Middlesex University Hospital, London, UK
  2. Institute of Molecular Genetics, Cardiff University
  3. Department of Gastroenterology, Queen's Hospital Romford, London, UK

Correspondence to Dr Iain Ewing, Department of Gastroenterology, North Middlesex University Hospital, London N18 1QX, UK; iainewing{at}


Colorectal cancer is a common but heterogeneous disease, which arises through the accumulation of genetic mutations. Knowledge of the molecular basis of colorectal cancer has advanced at a rapid pace in recent years, reflecting progress made in the field of genomic medicine. Targeted therapies have come into mainstream use, and the exciting prospect of treatment regimens tailored to the mutation profile of individual tumours is beginning to emerge. In order to understand the development and application of the next generation of colorectal cancer treatments, it is important that gastroenterologists have a working knowledge of the pathological mechanisms that drive the disease. This review examines our current understanding of the molecular genetics of colorectal carcinogenesis.

  • Colonic Neoplasms
  • Colonic Polyps
  • Colorectal Adenomas
  • Colorectal Cancer
  • Colorectal Cancer Genes

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:



Colorectal cancer (CRC) is the third most common malignancy diagnosed in the UK, with over 40 000 new cases identified annually, and a lifetime risk approaching 1 in 15 for men and 1 in 19 for women.1 CRC does not represent a single pathological entity, but rather a heterogeneous group of diseases arising through various molecular pathways that influence individual susceptibility to cancer and have the potential to determine responsiveness or resistance to antitumour agents.2 ,3

Competency 2.C. of the 2010 Gastroenterology curriculum (box 1) states that UK trainees must have an appreciation for the pathology of CRC, including awareness of the molecular genetics of colorectal carcinogenesis, the adenoma-carcinoma sequence and the range of predisposing inherited and acquired conditions. This review summarises current concepts of the molecular basis of CRC, using examples of CRC syndromes caused by germline mutations to illustrate the effects of acquired somatic mutations. Therapeutic targets within the signalling pathways that drive CRC tumourigenesis are also explored.

Box 1

Gastroenterology curriculum 2010

Competency 2.C. Intestinal disorders: large intestinal tumours

  • Knows the pathology of benign and malignant tumours of the colon and rectum

  • Has awareness of the molecular genetics of colorectal carcinogenesis and the adenoma-carcinoma sequence

  • Knows the range of predisposing conditions, including inherited syndromes and acquired colonic diseases

The adenoma carcinoma sequence

CRC arises as a result of the accumulation of genetic mutations and epigenetic changes, which transform normal glandular epithelial cells into benign neoplasms (adenomas) and subsequently into invasive carcinomas.3 ,4 Progression of tubulovillous and tubular adenomas has long been recognised, but there is also evidence that serrated adenomas have the potential for malignant transformation.4 This represents an alternative pathway for carcinogenesis, in which a subset of hyperplastic polyps progress to serrated adenomas and ultimately a smaller proportion to carcinomas.4

The National Polyp Study5 carried out in the USA between 1980 and 1990 provided proof of the concept of malignant transformation of colorectal adenomas to adenocarcinoma, and the evidence base to support the effectiveness of removing adenomatous polyps at colonoscopy. In this study, the incidence of CRC in a cohort of 1418 patients who had undergone colonoscopy with polypectomy was compared with reference groups including two cohorts in whom colonic polyps were left in situ. Colonoscopic polypectomy resulted in a significantly lower than expected incidence of CRC.

Genomic instability

Loss of genomic stability facilitates the acquisition of multiple mutations that drive the development of CRC.3 Genomic instability can take a number of forms (table 1), including chromosomal instability (CIN), microsatellite instability, aberrant DNA methylation and DNA repair defects.3 ,4 Genome-wide analysis of gene mutations in CRCs has identified acquired somatic mutations in several hundred genes, and an average of 80 mutations in any single CRC, highlighting the heterogeneity of the disease.6 Table 2 highlights some important genes implicated in CRC tumourigenesis.

Table 1

Examples of genomic instability in CRC

Table 2

Examples of gene mutations implicated in CRC

Chromosomal instability

CIN, defined as the presence of structural aberrations or changes in chromosome copy number, is found in up to 85% of CRCs.4 Loss of function of tumour-suppressor genes, whose normal function is to oppose tumourigenesis, has been implicated in the development of CIN; APC is one such gene.7 APC regulates spindle microtubules, and is required to detect misaligned chromosomes during mitosis. The significance of this role in maintaining mitotic fidelity is highlighted by the CIN observed in CRCs bearing APC mutations.7

Loss of function of the APC gene is further illustrated by the autosomal dominant condition familial adenomatous polyposis (FAP), in which hundreds to thousands of adenomatous colonic polyps develop, leading to an almost 100% lifetime risk of developing CRC in the absence of pre-emptive colectomy.12 ,13 FAP is the consequence of a germline mutation in the APC gene, which gives rise to a non-functional truncated protein, leading to accumulation of β-catenin and unregulated expression of a number of genes that drive colorectal tumourigenesis.14

Microsatellite instability

Microsatellites are mononucleotide or dinucleotide repeats found throughout the entire genome. Their repetitive nature makes them vulnerable to errors during DNA replication. Microsatellite unstable tumours are distinct from those with CIN as they display a normal karyotype.4 The mechanism of tumourigenesis in microsatellite instability involves inactivation of genes responsible for DNA mismatch repair (MMR) through somatic mutation or aberrant methylation.4 This loss of MMR gene function, and the resulting inability to repair strand slippage within nucleotide repeats, changes the size of microsatellites.3 This is of particular importance if the microsatellite lies within the coding region of a gene, as it may lead to altered gene function or a change in the protein product of gene expression.12 Somatic inactivation of MMR genes is found in approximately 15% of cases of sporadic CRC. These tumours are associated with older age, female sex and a proximal distribution.3
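
The effect of unrepaired strand slippage within a coding microsatellite can be illustrated with a small Python sketch. This is a toy example rather than anything taken from the studies cited: the sequence and the function names (`longest_run`, `slip`, `codons`) are hypothetical, and slippage is modelled simply as the loss or gain of repeat units in the longest mononucleotide run.

```python
def longest_run(seq, base):
    """Return (start, length) of the longest run of `base` in `seq`."""
    best_start, best_len = -1, 0
    i = 0
    while i < len(seq):
        if seq[i] == base:
            j = i
            while j < len(seq) and seq[j] == base:
                j += 1
            if j - i > best_len:
                best_start, best_len = i, j - i
            i = j
        else:
            i += 1
    return best_start, best_len


def slip(seq, base, delta):
    """Model strand slippage: change the longest run of `base` by `delta` units."""
    start, length = longest_run(seq, base)
    new_len = max(length + delta, 0)
    return seq[:start] + base * new_len + seq[start + length:]


def codons(seq):
    """Split a sequence into complete triplets, dropping any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical coding sequence containing an A9 mononucleotide repeat.
wild_type = "ATGGCT" + "A" * 9 + "GGCTGCTACTGA"
mutant = slip(wild_type, "A", -1)  # one repeat unit lost and not repaired

print(codons(wild_type))
print(codons(mutant))
```

In this sketch the loss of a single adenine shortens the repeat from nine to eight bases and shifts every downstream codon, so the original in-frame stop codon (TGA) is no longer read. This is the sense in which a size change in a coding microsatellite can alter the protein product.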

Germline mutations of MMR genes are responsible for Lynch syndrome, or hereditary non-polyposis colorectal cancer (HNPCC), which carries a lifetime risk of CRC of about 80%.3 ,13 Mutations leading to loss of function have been identified in four genes involved in MMR: MLH1, MSH2 (accounting for the majority of cases), MSH6 and PMS2.12 ,13 Lynch syndrome is the most common hereditary CRC syndrome, accounting for 2%–3% of all cases.12 ,10 Acceleration of the adenoma to carcinoma sequence is seen relative to sporadic CRC, with cancers evident at a median age of around 45 years.3 ,12 Affected individuals are also at increased risk of developing extra-colonic malignancy, in particular, endometrial and ovarian cancers.14 The inheritance of Lynch syndrome is autosomal dominant. Affected individuals carry a germline mutation in a single copy of an MMR gene. This alone is not thought to account for the observed increased risk of CRC, which occurs only when somatic mutation has affected the remaining wild-type parental allele.3
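
The requirement for a second, somatic event can be expressed as a short sketch. The Python below is a deliberately simplified model, assuming two copies of the MMR gene per cell and treating each inactivating event as affecting one copy; the function names are hypothetical.

```python
GENE_COPIES = 2  # one maternal and one paternal allele


def hits_needed(germline_mutant_alleles):
    """Somatic inactivating events still required to lose both gene copies."""
    if not 0 <= germline_mutant_alleles <= GENE_COPIES:
        raise ValueError("expected 0, 1 or 2 germline mutant alleles")
    return GENE_COPIES - germline_mutant_alleles


def mmr_function_lost(germline_mutant_alleles, somatic_hits):
    """True once every copy of the gene has been inactivated."""
    return somatic_hits >= hits_needed(germline_mutant_alleles)


# Lynch syndrome carrier: one germline mutant allele in every cell.
print(hits_needed(1))           # somatic events needed in a carrier
# Sporadic tumour: no germline mutation, so both copies must be lost somatically.
print(hits_needed(0))
print(mmr_function_lost(1, 0))  # carrier cell before any somatic event
print(mmr_function_lost(1, 1))  # carrier cell after one somatic event
```

Because every colonic cell in a carrier starts one event away from MMR deficiency, the model is consistent with the earlier age of onset described above, although it says nothing about the actual rates at which somatic events occur.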

Germline deletion mutations in the EPCAM gene have recently been identified as a novel cause of Lynch syndrome. The mechanism is disruption of the 3′ end of EPCAM, which leads to epigenetic silencing of the neighbouring MSH2 MMR gene.10

The correct identification of patients with Lynch syndrome is clinically relevant as it allows for targeted CRC surveillance for the index case and family members. A definitive molecular diagnosis can be made by germline mutation analysis of the four DNA MMR genes implicated in the pathogenesis of Lynch syndrome.4 This process is expensive and a more pragmatic approach is to test for loss of MMR gene products by immunohistochemistry and for microsatellite instability (MSI) using PCR.4

Aberrant DNA methylation

Aberrant methylation of DNA is a further mechanism of gene silencing in patients with CRC that can lead to loss of MMR function.3 ,6 Methylated cytosine is incorporated in the normal genome, representing a fifth DNA base. It occurs outside of exons within CpG dinucleotides, but is largely absent from CpG-rich islands in the promoter regions of approximately half of all genes.3 ,15 In the CRC genome, there is aberrant methylation within promoter-associated CpG islands, leading to silencing of gene expression.3 ,15 Hypermethylation of promoters containing CpG islands is known as the CpG island methylator phenotype (CIMP).4 This phenomenon is observed in about 15% of CRCs, most of which show loss of MLH1 expression resulting in MMR deficiency and microsatellite instability.3 ,16
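
To make the notion of a CpG-rich region concrete, the following Python sketch counts CpG dinucleotides in a sequence and computes the observed-to-expected CpG ratio, a measure widely used in sequence analysis to describe CpG density. The measure and the two short sequences are illustrative assumptions and are not drawn from the studies cited here; the sketch describes sequence composition only and says nothing about methylation status.

```python
def cpg_stats(seq):
    """Return CpG count, GC fraction and observed/expected CpG ratio."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    # Count every position where a C is immediately followed by a G.
    cpg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were distributed independently: c * g / n.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"cpg": cpg, "gc_fraction": gc_fraction, "obs_exp": obs_exp}


# Hypothetical sequences: one CpG-dense, one CpG-poor.
island_like = "CGCGGCGCGACGCGTCGCGG"
cpg_poor = "ATTGCATTAGCTTAACTGAA"

print(cpg_stats(island_like))
print(cpg_stats(cpg_poor))
```

A promoter-associated CpG island would score highly on both CpG count and observed-to-expected ratio, whereas most of the genome resembles the CpG-poor example.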

DNA base excision repair genes

The MYH gene is a base excision repair gene, responsible for repairing DNA damaged by reactive oxygen species.12 ,17 Polyposis develops in the presence of germline mutation of both MYH alleles.3 The resulting clinical syndrome, MYH-associated polyposis, is therefore autosomal recessive.17 The mechanism of disease following germline inactivation of MYH is via subsequent somatic mutation of the APC gene, causing CIN.11 The risk of CRC approaches 100% by age 60.3 ,12 Thus far only germline inactivating mutations of MYH are recognised, with no somatic equivalent. The diagnosis should be suspected in individuals with more than 15 colonic adenomas, and can be confirmed by genetic testing.3
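
The difference between recessive inheritance (MYH-associated polyposis) and the dominant inheritance of FAP and Lynch syndrome can be worked through by enumerating parental alleles. The Python sketch below assumes simple Mendelian segregation of a single gene; the function name and the allele labels (`"m"` for mutant, `"+"` for wild type) are hypothetical, and penetrance is not modelled.

```python
from fractions import Fraction
from itertools import product


def offspring_mutant_allele_distribution(parent1, parent2):
    """Probability that a child inherits 0, 1 or 2 mutant alleles.

    Each parent is a pair of alleles; one allele is passed on at random.
    """
    dist = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
    for a, b in product(parent1, parent2):
        mutant_count = int(a == "m") + int(b == "m")
        dist[mutant_count] += Fraction(1, 4)
    return dist


carrier = ("m", "+")      # heterozygous for the mutation
non_carrier = ("+", "+")

# Recessive condition: two carrier parents, child affected only if biallelic.
recessive = offspring_mutant_allele_distribution(carrier, carrier)
print(recessive[2])

# Dominant condition: one heterozygous parent, child at risk with one allele.
dominant = offspring_mutant_allele_distribution(carrier, non_carrier)
print(dominant[1])
```

Under these assumptions a child of two MYH carriers has a 1 in 4 chance of inheriting both mutant alleles, whereas a child of one heterozygous parent with a dominant syndrome has a 1 in 2 chance of inheriting the mutation.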

Tumour suppressor genes

Somatic mutations resulting in loss of function of the APC gene are the most commonly observed tumour suppressor gene defects in sporadic CRC.3 ,4 ,14 Other important examples include loss of TP53 and TGFβ function.

TP53 is a key tumour suppressor gene that is mutated in about half of all CRCs.4 The wild-type p53 protein has a regulatory role in mediating cell-cycle arrest and cell death.3 Inactivation of the TP53 gene often coincides with malignant transformation of adenomas.3 ,4 The detection of TP53 mutation currently does not have any prognostic or clinical significance.4

Transforming growth factor β (TGFβ) signalling is an important tumour suppressor pathway. Deregulation of this pathway is a frequent observation in CRC, mediated by inactivating mutations of receptor genes (TGFBR1, TGFBR2) or postreceptor signalling pathway genes (SMAD2, SMAD4).4 Mutation of the TGFβ receptor genes commonly occurs in association with malignant transformation, and is seen in tumours with microsatellite instability.3 ,4 SMAD4 deletion has been shown to be associated with malignant transformation in murine models, and loss of expression correlates with lymph node metastases and possibly prognosis in human CRC.4

Juvenile polyposis syndrome (JPS) is a rare autosomal dominant disease. It carries an increased risk of development of gastrointestinal cancers. There is some discrepancy in the reported lifetime risk of developing CRC, perhaps reflecting the rarity of the condition. One relatively large registry reported that 14% of patients developed gastrointestinal cancer either by the time of diagnosis or during surveillance.18 A number of germline mutations, ultimately leading to downregulation of TGFβ signalling, have been reported, including inactivating mutations of SMAD4.12


Fundamental cellular activities, including differentiation, proliferation and apoptosis, are mediated through intracellular signalling pathways. Oncogenic mutation of genes responsible for controlling these pathways can lead to loss of cellular regulation and subsequent development of invasive, immortal cancer cells.4 Examples of such pathways exhibiting oncogenic mutations in CRC include the epidermal growth factor receptor (EGFR)/mitogen-activated protein kinase (MAPK) pathway and the phosphatidylinositol 3-kinase (PI3K) pathway.3 ,4 ,19

EGFR activation triggers an intracellular phosphorylation cascade through downstream effectors RAS and BRAF, amplified through the MAPK pathway to promote cell growth.4 RAS and BRAF are implicated as oncogenes in a number of human cancers. Activating mutations promoting CRC have been identified in both genes.19 Mutations in KRAS are found in about 40% of CRCs, occurring as a relatively early event in the adenoma-carcinoma sequence.4 ,8 This is clinically relevant as there is concordance between the KRAS mutation status of primary tumour and metastases. Genetic analysis of tissue from the colorectal lesion can therefore predict response to targeted therapy in metastatic disease. This has been shown in trials of cetuximab, an immunoglobulin G1 monoclonal antibody against EGFR, which reduces the risk of progression of metastatic CRC, an effect limited to patients with KRAS wild-type tumours.8
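
The way KRAS status acts as a predictive biomarker can be restated as a minimal Python sketch. It encodes only the relationship described above (benefit from cetuximab limited to KRAS wild-type tumours, with testing performed on the primary lesion); the class and function names are hypothetical, and the sketch is an illustration of the logic rather than a treatment algorithm.

```python
from enum import Enum


class KrasStatus(Enum):
    WILD_TYPE = "wild-type"
    MUTATED = "mutated"


def anti_egfr_benefit_expected(primary_tumour_kras):
    """Benefit from anti-EGFR antibody is expected only in KRAS wild-type disease.

    The status of the primary tumour is used as a proxy for metastases,
    reflecting the concordance between the two described in the text.
    """
    return primary_tumour_kras is KrasStatus.WILD_TYPE


for status in KrasStatus:
    print(status.value, anti_egfr_benefit_expected(status))
```

The biological rationale is that an activating KRAS mutation switches on signalling downstream of EGFR, so blocking the receptor would not be expected to interrupt the cascade.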

Mutations of the PIK3CA gene, leading to upregulation of PI3K signalling, are present in approximately 15%–20% of CRCs. Resulting enhanced prostaglandin E2 synthesis inhibits apoptosis of CRC cells.9 Aspirin may block the PI3K pathway. Use of aspirin after diagnosis of CRC has been shown to significantly increase survival among patients with mutated PIK3CA tumours, in contrast to those with wild-type PIK3CA, who do not benefit.9 These findings suggest a role for the use of PIK3CA mutation status as a biomarker for targeted adjuvant therapy.


CRCs are complex and heterogeneous solid tumours, exhibiting multiple genetic mutations. Individuals may carry predisposing germline mutations and accumulate further somatic mutations at various stages in the transition from normal mucosa, through adenomatous polyp, to invasive cancer.

Enhanced understanding of the molecular basis of CRC has led to new insights into the pathogenesis of familial forms of the disease, and how these relate to the accumulation of somatic mutations in sporadic tumours. Genetic testing for patients at high risk of germline mutations, for example, testing for MSI in Lynch syndrome, has led to targeted surveillance for CRC that extends to at-risk family members.

Genetic biomarkers that predict response to treatment are beginning to come into routine practice in the management of CRC. KRAS-mutational testing to guide anti-EGFR therapy is one of the first examples of individualised targeted cancer therapy, and illustrates how molecular analysis of CRC tissue can improve outcomes by directing therapy to the most appropriate patients.4

The introduction of targeted therapies has already had an impact on the management of metastatic CRC. Cetuximab, an EGFR-blocking monoclonal antibody, reduces progression of KRAS wild-type metastatic CRC, and can lead to more curative resections of liver metastases.20

The pace of recent advances in our understanding of the molecular basis of CRC and the success of the first wave of targeted treatments provide an optimistic outlook for the future management of CRC. The armoury of specific drugs designed to inhibit oncogenes and signalling pathways is expanding.4 There is a real hope that the evolving application of molecular techniques to diagnosis, risk-stratification and management of CRC will translate to reduced disease burden in the future.

Multiple choice questions

  1. A 64-year-old man undergoes colonoscopy as part of the national CRC screening programme. An exophytic adenocarcinoma is found in the ascending colon. Staging CT of the chest, abdomen and pelvis is performed, which demonstrates the primary tumour and multiple liver lesions. Following review in the CRC multidisciplinary meeting, the diagnosis is confirmed as CRC with liver metastases that are not amenable to surgical resection.

When considering cetuximab in the adjuvant treatment of metastatic CRC, which of the following statements is most accurate?

  i) KRAS mutation testing should be performed on samples from both primary and metastatic tumours

  ii) Cetuximab should be considered alongside conventional chemotherapy

  iii) Cetuximab should be considered if the KRAS gene is wild-type on genetic testing of the primary tumour

  iv) Cetuximab should be considered if the KRAS gene is mutated on genetic testing of the primary tumour

  v) Cetuximab should be considered if the KRAS gene is mutated on genetic testing of both primary and metastatic tumours

Answer: iii) Cetuximab should be considered if the KRAS gene is wild-type on genetic testing of the primary tumour.

Mutations of KRAS occur relatively early in the adenoma to carcinoma sequence, and there is good concordance in the mutation status of primary and metastatic disease.4 Genetic testing of KRAS can therefore be performed on the primary tumour only. Cetuximab has been shown to reduce the risk of progression of metastatic CRC, but this effect is limited to patients with KRAS wild-type tumours.8

  2. A 45-year-old woman undergoes colonoscopy because of a family history of CRC affecting her father aged 43 and brother aged 47, and endometrial cancer in a paternal aunt. At colonoscopy, she is found to have four adenomatous polyps and an adenocarcinoma of the sigmoid colon. Subsequent genetic testing confirms the diagnosis of Lynch syndrome (HNPCC).

Which of the following statements is the most accurate?

  i) The inheritance of this condition is autosomal recessive

  ii) Genetic testing will confirm multiple germline mutations of mismatch repair genes

  iii) The median age of development of CRC is 45

  iv) The genetic defect is an example of CIN

  v) Affected individuals are usually found to have hundreds of colonic polyps

Answer: iii) The median age of development of CRC is 45

Lynch syndrome is autosomal dominant, caused by germline mutation of a single mismatch repair gene. Large numbers of colorectal polyps are not characteristic of this condition. Affected individuals are at increased risk of extracolonic malignancy, particularly ovarian and endometrial cancers. The genetic defect is an example of microsatellite instability rather than CIN. The median age of development of CRC is 45.




  • Contributors IE wrote the article and prepared the manuscript. JH collaborated on the content of the review and editing of the manuscript. EJ contributed text on inherited polyposis syndromes. AM is supervising senior author.

  • Competing interests None.

  • Provenance and peer review Not commissioned; externally peer reviewed.
