From gene discovery, to database, to cancer treatment. COSMIC celebrates 15 years of fuelling cancer research.
By Ali Cranage, Science Writer at the Wellcome Sanger Institute
Finding that the BRAF gene had a role in cancer was the first success for Cancer Genome Project scientists at the Wellcome Sanger Institute. It was a big success. It has led to improved treatments for melanoma patients whose cancer is caused by changes to the BRAF gene. In some cases, the treatment causes melanoma to disappear completely.
Cancer is, at its core, a genomic disease, caused by changes to a cell’s DNA. The hunt for the precise DNA changes, or mutations, that turn a healthy cell into a cancerous one took off in the 1980s and was well underway by the time BRAF was discovered in 2002.
At that time, driven by rapid advances in genome sequencing technology, the amount of data about cancer genomes was exploding. Genome-wide data was providing a bigger and more detailed picture than ever before, enriching understanding and leading to discoveries.
But it also brought a problem: how to manage, view and search data from disparate locations, publications and databases.
At the time that BRAF was identified, it was one of around 260 genes shown to have mutations in cancer cells. Not all of those genes have a role in cancer as clear-cut as BRAF. It wasn’t certain, and in many cases still isn’t clear, if changes to all those genes are a cause of cancer, or a consequence – but they are present nonetheless.
The question was how to capture all of that information.
COSMIC assault on cancer
The Cancer Genome Project researchers developed a tool to bring all the data on mutations in cancer together – making research faster and easier. Previously they had been logging data in a spreadsheet, but the volume and speed of information that was being published outstripped their ability to keep it up to date. So the Catalogue of Somatic Mutations in Cancer (COSMIC) was born. It captures data on somatic DNA mutations, which we accumulate over a lifetime, rather than germline mutations, which are inherited from our parents. Two full-time expert curators were employed, feeding in data.
At its launch in 2004, the COSMIC database was populated with information on four genes, one of which was BRAF. The other three included at the start were HRAS, KRAS and NRAS – these genes have functions in a cell that interact with BRAF in the same molecular chain of events. Much of the early work was standardising the terms and data formats used, in order to make it possible to compare data between genes, between different types of cancer, as well as from different research groups around the world. After a year, the database had information from 1,700 scientific papers on 1,800 mutations in 21 genes. The numbers have increased ever since.
The type of information included has expanded too, as new techniques and understanding of cancer genomes emerged. Data on fusion genes – hybrids formed of two separate genes – was added in 2007. Data from whole genomes, not just genes, was included in 2009. Copy number variation data, detailing how many repeats of a gene are present, was added in 2013. Data on gene expression – how active a gene is in a cell – was added in 2015. As the nature of research papers changed, purely manual curation was no longer realistic. The team created systems to import whole genome sequence data alongside expert analysis by curators. Ten years after its launch, in 2014, COSMIC had details of mutations in almost all of the approximately 20,000 human genes.
Finding which of those mutations has a role to play in cancer is essential. As a tumour cell divides, its DNA becomes more and more disordered and mutations accumulate – genome sequence data in cancer is noisy.
To become cancerous, a cell must acquire the ability to grow unchecked. It must also sidestep all of the mechanisms that kick in when DNA damage occurs – from DNA repair to programmed cell death. There is a distinctive and complementary set of molecular events which occur as a cell goes on its journey from an ordinary one to a cancerous one. These key processes, termed hallmarks, were first described in 2000 and underpin the understanding of the biology of cancer.
Hallmarks were brought into COSMIC in 2017, to help simplify the description of how genetic functions go wrong in cancer progression. This work was supported by Open Targets and GSK. The aim is to bring further evidence to the database to show which mutations, among the hundreds of thousands seen and catalogued, are important. COSMIC currently describes mutations in 283 genes as hallmarks – again BRAF is one.
From mutation to function
To understand if mutations in a patient are a cause or an effect of cancer, COSMIC also gathers evidence about the roles of genes in cancer. What do they do in a cell? Are they driving a cancer to grow or suppressing its growth? Where evidence meets a set threshold, the genes and the mutations within them are included in the Cancer Gene Census (CGC) section of COSMIC.
BRAF was an early entry in the CGC, together with well-known genes like TP53 and EGFR. As of 2019, a total of 723 cancer-driving genes are in the CGC.
This knowledge is vital for developing new treatments. If the mutations affect the functions of a gene such that it causes the cell to divide uncontrollably, then those processes are targets for treatments. In the case of BRAF, the gene contains the instructions to make the BRAF protein – which controls cell growth. Mutations in the gene cause the protein to be permanently active within a cell, resulting in cells growing uncontrollably. In healthy cells, its activity is controlled.
“This is the type of response that was seen. On the left all the lumps. On the right is the patient after a few weeks of treatment. To some extent the cancer goes to sleep, and dissolves away.
“We were very pleased to see pictures like this. This is why we’re here.
“All of these discoveries are the products of our imagination as human beings, and it’s a tribute to the power of the human mind.”
Professor Sir Mike Stratton, Director of the Wellcome Sanger Institute
Quote adapted from speaking with George McGavin in BBC 4’s ‘A Year to Save My Life: George McGavin and Melanoma’. Mike initiated the Cancer Genome Project and discovered the mutations in the BRAF gene associated with melanoma.
From sequence to structure
The COSMIC team have worked with a number of others around the world – both academic groups and commercial companies, including the NCI Genomic Data Commons, Cancer Genomics Cloud and St Jude Children’s Research Hospital. This has ensured COSMIC is available where it can be most effective. It has developed as a result of each collaboration.
One important collaboration has been with Astex Pharmaceuticals. This work resulted in COSMIC-3D. COSMIC-3D maps DNA mutations onto the 3D structures of the proteins they code for. Researchers can visualise, in the context of proteins, the changes in DNA sequence. This enables researchers to identify and characterise potential targets for drug design.
In the case of BRAF, the molecular shape of the BRAF protein subtly changes when the BRAF gene is mutated, making it a viable target for drugs; specifically, drugs that would reduce its pro-cancer activity.
This type of precision in cancer medicine is what researchers are striving for across all cancers and all mutations which cause
COSMIC was set up with a big ambition – to be the source of all cancer genomic knowledge. Now, 15 years later, the team have catalogued 6,233,984 mutations in 29,494 genes, and over 19 million mutations in other areas of the genome. Data from 26,878 papers is included. The database is the most comprehensive knowledge hub available. Scientists are able to view the data in multiple ways, as well as download what they need.
But Simon Forbes, Director of COSMIC, says there is still more to do.
“The database is a curation of the world’s knowledge of cancer mutations. It’s a huge resource supporting cancer research across the globe – from molecular biologists trying to untangle cellular growth processes, to bioinformaticians looking for patterns of DNA mutations in cancer cells, to pharmaceutical companies developing new drugs.”
“As genomic testing of cancer moves into everyday healthcare, the need for a comprehensive knowledge base is ever greater. COSMIC is now fuelling the next advances in cancer research and treatment.”
Find out more
- COSMIC database
- The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers
- COSMIC: the Catalogue of Somatic Mutations in Cancer (Tate et al., 2018)
- COSMIC-3D provides structural perspectives on cancer genetics for drug discovery (Jubb et al., 2018)
- COSMIC launch in 2004
- Open Targets
- St Jude Children’s Research Hospital
- ISB Cancer Genomics Cloud
- Seven Bridges CGC (Cancer Genomics Cloud)
- Astex Pharmaceuticals