
By: Alison Cranage
The Sanger Institute was set up to uncover the code of life – the human genome. We opened our doors 25 years ago and became the largest single contributor to the Human Genome Project. The principles behind those endeavours are still fundamental to our work – tackling the biggest challenges, openness and collaboration. Those principles have also helped to make Sanger one of the world’s leaders in genomics and biodata.
The Human Genome Project transformed science. The seemingly simple order of four letters of DNA has changed how we understand life. Vast new areas of research have opened up, impacting biology, medicine, agriculture, the environment, businesses and governments.
Alongside our sequencing facilities, our activities and research have grown to utilise genomic knowledge. Now we are using genomics to give us an unprecedented understanding of human health, disease and life on earth.
Sequencing at scale
From the completion of the first human genome in 2003, we moved to the 1,000 and 10,000 genomes projects. Being able to compare sequences between individuals enables the understanding of diversity, evolution and the genetic basis of disease.
One of our latest projects is to work with UK Biobank to sequence the genomes of 50,000 individuals. Participants have already provided a wealth of data about their health and their lives – from blood samples to details of their diet. Linking this information to sequence data means we can understand more than ever before about the connections between our genomes and our health.
Sanger researchers also sequence the genomes of pathogens and other organisms, as well as people. We have published the genomes of thousands of species – from deadly bacteria to worms to the gorilla. This enables research into evolution, infections, drug resistance, outbreaks, symbiosis, biology and host-parasite interactions.
Our sequencing teams, led by Dr Cordelia Langford, are constantly developing the technology to improve both accuracy and speed. In early 2018, we celebrated sequencing over five petabases of DNA (if you typed it all out, it would take 23 million years). The first petabase took just over five years to produce. The fifth, just 169 days. The amount of genomic data now rivals that of the biggest data sources in the world – YouTube, Twitter and astrophysics.
The Sanger Institute is not only developing sequencing technology but also leading research in computational science, IT and bioinformatics, developing new ways to store and analyse petabytes of genomic and biodata.

From sequence to clinic
It hasn’t always been clear how genome sequencing, or the sequence of any given individual, can be used. But in the case of rare genetic diseases, it can change lives.
The Deciphering Developmental Disorders (DDD) study started eight years ago, led by Dr Matt Hurles at the Sanger Institute. Over 13,600 children with rare developmental conditions, but without a diagnosis, joined the study. Sanger researchers, working together with clinical geneticists, have used genome sequencing to diagnose their conditions. Forty per cent of the children now have a diagnosis – giving the families some of the answers they were searching for. Knowing the genetic cause of a condition can help doctors manage it, and can help families connect with others and plan for the future.
The ability of researchers to rapidly sequence and analyse bacterial genomes is also leading to advances for patients.
Dr Julian Parkhill and colleagues showed it was possible to track an MRSA outbreak in a neonatal ward in real time. By sequencing MRSA isolates from patients and staff, they could follow the outbreak’s path from person to person. This kind of tracking enables clinicians to prevent further transmission and bring an outbreak under control.
Now, it is UK policy to sequence the genomes of pathogens in an outbreak.
But disease knows no borders. Pathogens can easily spread around the globe. Professor David Aanensen, group leader at the Sanger Institute, is also Director of the recently established Centre for Genomic Pathogen Surveillance. The centre co-ordinates global surveillance of pathogens (such as MRSA and the flu virus) using whole genome sequencing. The data is openly available. Countries around the world can monitor the rise and spread of pathogens as well as their growing resistance to antibiotics. This enables swift action – with the aim of stopping transmission and saving lives.
The forefront of human genomics
The rapid development of technology has made it possible for researchers to sequence the DNA, or RNA, from a single cell. Previously, much larger quantities of material were needed. Single-cell RNA sequencing is a powerful tool. It allows the study of an individual cell’s activity, functions and composition. And high-throughput machines mean hundreds of thousands of cells can be analysed at once.
The Human Cell Atlas is capitalising on these advances. The international collaboration, launched in 2016, is co-led by Dr Sarah Teichmann at the Sanger Institute. Scientists are using next-generation sequencing to sequence 30-100 million single cells from the human body – out of a total of roughly 37 trillion. The aim is to create a comprehensive, 3D reference map of all human cells. This will lead to a deeper understanding of cells as the building blocks of life. It will form a new basis for understanding human health and diagnosing, monitoring, and treating disease.
Like the Human Genome Project before it, this huge project will disrupt science and human biology. And like the Human Genome Project, it will drive the development of the technology needed to make it possible.
The diversity of life
Beyond human health, genome sequence data allows the study of evolution, biology and biodiversity.
For our 25th anniversary we have sequenced a more diverse range of species than ever before: 25 different species that represent biodiversity in the UK – from the golden eagle to the humble blackberry. Sequencing new species will push development of our technologies as each presents unique challenges. The sequences themselves will aid research into population genetics, evolution, biodiversity management, conservation and climate change.
But 25 species is just the beginning. Every single living thing has a genome, made up of exactly the same molecules of DNA or RNA. We want to uncover how the order of those molecules leads to the diversity of life on earth.
It took 13 years to sequence the first human genome. When the project began, no one knew where it would lead. Now we sequence the equivalent of one gold-standard (30x) human genome in 24 minutes. Faster and deeper genomic insights are enabling discoveries that improve health and our understanding of biology. These insights are happening right now, and they will lead to unimagined benefits for future generations – all possible from a sequence of four letters of DNA code.
Links:
- News story: First 'non-gene' mutations behind neurodevelopmental disorders discovered
- News story: The Finished Human Genome - Welcome to The Genomic Age
- News story: New global health initiative for genomic surveillance of antimicrobial resistance funded by NIHR
- News story: Tracking MRSA in real time
- News story: What have we got in common with a Gorilla
- Profile: David Aanensen
- Profile: Matt Hurles
- Profile: Cordelia Langford
- Profile: Julian Parkhill
- Profile: Sarah Teichmann
- Project: 25 Genomes Project webpage
- Project: The Centre for Genomic Pathogen Surveillance
- Project: Deciphering Developmental Disorders (DDD)
- Project: Human Cell Atlas web page
- Organisation: Wellcome Sanger Institute