30 impacts for 30 years

From sequencing the human genome, to enabling the diagnosis of rare disease patients around the world, we highlight some of our impacts over the last 30 years.


Categories: Sanger Science

4 October 2023

This year the Wellcome Sanger Institute celebrates its 30th birthday. Here we highlight some of the impacts we’ve had over that time – from sequencing the human genome, to supporting the ducklings in the lakes on site, to mapping the trillions of cells in the human body.

For a more in-depth look at some of our scientific impacts, take a look at our reports.

DNA sequencing to underpin biological research

This is the one that the Sanger Institute is famous for. By June 2023, staff had sequenced a total of 46.6 petabases (46,600,000,000,000,000 bases) of DNA and RNA over the last 30 years. The Sanger Institute has one of the largest DNA sequencing facilities in the world, and DNA sequencing underpins the large-scale science carried out at the Sanger Institute and beyond.

All of the sequencing data generated are made publicly available for researchers around the world, to improve understanding of biology, health and disease.

Human genome sequence

Our founding mission in 1993 was to contribute to the Human Genome Project. A landmark in the history of science, the first draft of the human genome sequence was announced in 2000. The Sanger Institute sequenced one-third, the largest single contribution.

The human genome sequence is now a fundamental resource underpinning research into most aspects of human biology and disease, and is used in many areas of medicine.

Read more:
The draft human genome sequence.


A less well-known, and more local, impact is that the grounds team estimate they have come to the aid of around 720 ducklings over the 30 years. The ducklings are often found wandering around the plaza, and the team helps guide them back to the nearest water, which is often the main lake. Or, as in the video above, staff have steered young families on their way to the River Cam.

First species sequenced

In 1996, Sanger Institute staff made the single largest contribution to sequencing the first complete genome from a eukaryotic organism – Saccharomyces cerevisiae. The yeast, used for millennia in winemaking, brewing and baking, is also an experimental organism widely used to study the functions of genes. Over half of the 6,000 genes revealed by the DNA sequence were previously unknown.

The genome sequence of Mycobacterium tuberculosis followed in 1998. It was one of the first bacteria sequenced. The sequence contains 4,000 genes and offers new strategies for tackling tuberculosis.

Also in 1998, the founding Director of the Sanger Institute, John Sulston, led the completion of the first genome sequence of an animal – Caenorhabditis elegans. The nematode worm is widely used as an experimental model organism. The sequence has enabled C. elegans to provide profound insights into development, neurobiology and ageing.

Laying the foundations for the next generation

375 PhD students and 39 MPhil students have studied at the Sanger Institute so far, with their degrees awarded from the University of Cambridge. 27 new students are joining us this month. Congratulations to everyone who has earned their degrees with us!  

As well as students, over 1,000 postdoctoral researchers have trained at the Sanger Institute.

Computational tools

Over 200 tools, software packages, protocols and services have been created by staff at the Sanger Institute. These products support data processing and analysis, and the management of data and laboratories.

Freely available to the scientific community, some of these software tools have been adopted as standard practice and many remain in use, such as the Burrows-Wheeler Aligner (BWA), developed in 2009. This tool aligns short reads of DNA sequence produced by next-generation technologies against a reference genome sequence. By 2023 it had been used and cited in further research over 28,000 times.
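For readers curious about the idea behind the tool's name, here is a minimal Python sketch of exact read matching using a Burrows-Wheeler transform. It is an illustration only and not BWA's implementation: the real tool uses compressed indexes and also handles mismatches and gaps. The function names `build_index` and `find_read`, and the toy reference sequence, are made up for this example.

```python
from collections import Counter


def build_index(reference):
    """Build a toy Burrows-Wheeler index of a reference sequence."""
    text = reference + "$"  # "$" is an end marker that sorts before A, C, G, T
    # Naive suffix array: start positions of all suffixes, in sorted order.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    last_column = "".join(text[i - 1] for i in suffix_array)

    # first_row[c]: number of characters in the text that sort before c.
    counts = Counter(text)
    first_row, total = {}, 0
    for c in sorted(counts):
        first_row[c] = total
        total += counts[c]

    # occ[c][i]: how many times c appears in last_column[:i].
    occ = {c: [0] * (len(last_column) + 1) for c in counts}
    for i, ch in enumerate(last_column):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)

    return suffix_array, first_row, occ


def find_read(read, index):
    """Return the sorted start positions where `read` matches exactly."""
    suffix_array, first_row, occ = index
    lo, hi = 0, len(suffix_array)  # half-open range of matching suffixes
    for ch in reversed(read):      # backward search: last character first
        if ch not in first_row:
            return []
        lo = first_row[ch] + occ[ch][lo]
        hi = first_row[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[i] for i in range(lo, hi))


index = build_index("GATTACAGATTACA")
print(find_read("TTAC", index))  # [2, 9]
```

The index is built once per reference and then reused for every read, which is part of what makes this family of methods practical when millions of short reads must be placed on the same genome.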

Gnomes vs genomes

A small family of gnomes have made the Wellcome Genome Campus their home. They reside by a lake outside the Morgan Building, having moved in when the structure was commissioned in 2004. They were “installed” by the Sanger Institute’s web team and have lived there proudly over the years, only taking a week’s holiday from duty when the Princess Royal came to officially open the South Campus. They are a proud beacon of the “Gnome Campus”.

121 million cells analysed

Staff from the Sanger Institute co-founded and co-lead the international Human Cell Atlas consortium. To date, 3,000 researchers from 97 countries have analysed the genome and transcriptome of 121 million cells, from 16,600 samples, from 9,890 individuals, in 18 different types of tissues. This project will produce a map of the 37 trillion cells of a human body, to transform medical research and healthcare worldwide.


Sanger Institute staff have published over 9,900 scientific research articles and reviews over the last 30 years. The publications represent new findings, methods, data, and resources in genetic and genomic research. Around a quarter of papers by researchers at the Sanger Institute are in the top five per cent of the world’s most cited publications.

In total, by August 2023, research articles and reviews authored by Sanger Institute researchers had been cited 1,381,468 times.

More species sequenced – model organisms

As well as yeast and the nematode worm, the Sanger Institute provided the reference genome sequences of most of the widely used model organisms in biomedical research. These include Escherichia coli – a bacterium used in molecular biology, biotechnology and genetic engineering, critical for our understanding of cellular biology and cellular processes, such as DNA replication. 

The Sanger Institute also sequenced the genomes of the mouse, the most widely used model organism for studying human disease, and the zebrafish, a model organism with particular use in studying embryonic development.

More recently, the Sanger Institute, as part of a national initiative, has taken on the challenge of sequencing all eukaryotes (plants, animals, fungi and protists) in Britain and Ireland – some 70,000 species. So far over 1,000 species have had their DNA sequence determined for the first time. The data will underpin biological research for the future.


The Sanger Institute manages deliveries on a significant scale, having received over 1 million orders and 1.8 million products, and delivered 454,440 products, over 30 years. This logistical feat ensures that researchers have seamless access to the resources they need, keeping the Sanger Institute ticking. The stores team estimate they have walked 360 million steps to maintain the integrity and timely delivery of everything from biological samples, research materials and equipment to high-throughput sequencing data.

Paying heed to parasites

Sanger Institute researchers have used genomics to understand the parasites behind neglected tropical diseases that collectively infect more than 1 billion people annually. 

They generated reference genomes for the protozoa that cause malaria, leishmaniasis and African trypanosomiasis, and parasitic worms including schistosomes, tapeworms, roundworms, hookworms, threadworms, and whipworms. 

This enabled the identification of genes involved with disease and interactions between parasites and their hosts. Researchers at the Sanger Institute also developed software and databases of vital importance to neglected tropical disease research, enabling innovation and supporting interventions, such as the prioritisation of novel drug targets and compounds to treat parasitic worm infections. The data have also provided the basis for diagnostics and have allowed insight into and tracking of drug resistance.

Read more:
Discovery of vaccine target for devastating livestock disease.

Six spin outs

The Sanger Institute has a culture and history of developing innovative large-scale platform technologies. Several of these are now being further developed and commercialised by our spin-out companies. They represent an important step towards providing societal benefit in today’s healthcare sector, bringing our science out into the world to solve real-world challenges. 

To date, the Sanger Institute has spun out six companies, working with high-calibre investors from an early stage to build opportunities leveraging our unique science. Four of these companies are active today, and they have raised £320.5M of investment in total. Sanger’s first ever spin-out, Kymab, is developing therapeutics designed to modulate the immune system to overcome cancer immune tolerance and to treat autoimmune diseases. In 2021, it was acquired by Sanofi in a $1.1bn deal.

Congenica is the exclusive Clinical Decision Support partner for the NHS Genomic Medicine Service and supports customers in 25 countries.  In the UK alone, Congenica’s platform has helped the NHS to increase their diagnostic yield by 50 per cent, while reducing analysis times 20-fold.

Microbiotica raised a record £50M series B round in 2022, the largest microbiome-related financing in Europe to date. It is considered a global leader in gut bacterial microbiology, genomics and identification of bacteria associated with health, disease or therapy from analysis of large clinical datasets.

Mosaic Therapeutics, the latest Sanger spin-out, raised an initial investment of £22.5M. Its aim is to develop a world-class drug discovery platform to identify genetic complexities in multiple different types of cancer that can be exploited to generate new medicines for patients.

Databases created

Some of the largest and most impactful genomic databases in the world were initiated at the Sanger Institute, freely sharing the resources from various projects with the global scientific community. This spirit of open science was championed at the Sanger Institute from the beginning. Some of the largest databases include Pfam, Ensembl, WormBase ParaSite, ENCODE, DECIPHER, COSMIC, HipSci, HCA and the Cancer Dependency Map.

Improving life for rare disease patients

Working with colleagues in the NHS, Sanger Institute researchers established the DECIPHER platform in 2004, to help children with undiagnosed rare genetic conditions.

DECIPHER brings together genetic and health data from rare disease patients from around the world. Genetic and genomic variants are often difficult to interpret, making clinical diagnosis challenging. It can take many years for patients to get a rare disease diagnosis, meaning an arduous diagnostic odyssey for families. Sharing rare disease data helps clinicians to identify the genetic causes underpinning these disorders and assists in diagnosis. Today, DECIPHER openly shares records from over 47,000 people, and recently transitioned to being hosted by EMBL’s European Bioinformatics Institute (EMBL-EBI). It is used globally by researchers and healthcare providers, and 3,000 papers have cited the resource in their studies.

Using DECIPHER, the Deciphering Developmental Disorders study was established to sequence the genes of over 14,000 children with rare diseases. Researchers have uncovered 60 new genetic variations involved in rare disease and 67 new genetic disorders, and have published over 300 papers. This has made it possible to provide 5,500 new diagnoses, which in many cases have led to improved healthcare and support for families.

By understanding the genetic causes of rare diseases, researchers are able to start work to design new treatments.

These projects were also the foundations of the NHS genomics service that launched in 2018 – enabling whole genome sequencing for all rare disease patients who might benefit.

Read more:
5,500 people diagnosed with rare genetic diseases in major research study

Connecting people with science

Open days, school visits and tours are a regular part of life at the Sanger Institute. The Wellcome Connecting Science team, who run programmes for schools and young people on site and in the local area, estimate that over 25,000 school children have visited the Sanger Institute so far. In addition, over 1 million beads have been used to make DNA bracelets with visitors, as well as at local science fairs, talks and shows.

The Wellcome Connecting Science team also deliver courses and conferences for scientists and health professionals. Starting with a summer school on DNA Related Methods in Human Genetics at Guy’s and St Thomas’ Hospital in 1988, the programme celebrated 30 years in 2018. In that time they have run 471 courses and 349 conferences, reaching 58,385 scientists and clinicians from over 130 countries, at the Wellcome Genome Campus or virtually. A further 3,083 researchers have attended global training courses in Africa, Asia and Latin America. In addition, over 173,000 people have joined their massive open online courses since these launched in 2018.

Cancer discoveries

Researchers at the Sanger Institute have sequenced thousands of cancer genomes. For many cancers – all of them caused by DNA mutations – it was the first time their DNA sequence was determined.

The data and subsequent analysis have enabled more research, and led to new cancer treatments. For example, in 2002, Sanger Institute researchers discovered ‘driver mutations’ in the BRAF gene that convert normal cells into cancer cells. These were present in 70 per cent of malignant melanomas, a disease that was untreatable at the time. Small molecule drug inhibitors targeting BRAF were developed by multiple pharmaceutical companies. The drugs cause BRAF mutant malignant melanomas to regress, and are now clinically standard treatments.

In 2008, researchers at the Sanger Institute helped found the International Cancer Genome Consortium to sequence 25,000 cancer genomes, describe the mutational landscapes of 50 types of common cancer and identify the mutated genes causing each type. By 2020 this work had completely transformed understanding of the mutations that convert normal cells into cancer cells.

The Pan-cancer Project followed, and has explored the nature and consequences of DNA variations in cancer across the entire genome, including in areas of DNA that do not code for proteins. This international, collaborative effort has provided comprehensive insights into many aspects of cancer genomes, and data are available to the research community, to help accelerate discoveries.

Read more:
Genetic mutations in healthy skin reveal a battlefield.

The multitudes within us

Sanger Institute staff established the Host-Microbiota Interactions Lab to explore the relationship between humans and the bacteria and viruses that live within and on us – our microbiota. They have a particular focus on early life development, inflammatory disease and cancer. 

The group has cultured and generated high-quality reference genomes for more than 1,000 bacterial symbionts, around a third of which are newly discovered species.

They also identified an additional 2,000 bacterial species by reconstructing over 90,000 metagenome-assembled genomes from nearly 12,000 human gut microbiomes. 

Mining of global human gut metagenomes and reference genomes also identified 142,000 non-redundant viral genomes. The lab’s approach of coupling high-throughput genomics and culturing has underpinned the creation of an industry-leading company that will shortly be launching clinical trials for microbiome-based therapeutics.

A quarter of bacterial pathogen genomes

Working with 300 collaborators in 69 countries, by 2018 pathogen genomics researchers at Sanger had contributed more than 25 per cent of the high quality genomes for the top 20 most-sequenced bacterial pathogens stored in the European Nucleotide Archive, including Staphylococcus aureus, Escherichia coli, Salmonella enterica and Vibrio cholerae.

Through long term initiatives like the Global Pneumococcal Sequencing project, in which 137 scientists in 59 countries have sequenced more than 31,000 Streptococcus pneumoniae genomes, Sanger researchers have generated unprecedented insights into how microbial pathogens evolve over time and space, and in response to global disease control efforts.

Discoveries about human origins

Teams at the Sanger Institute have investigated human evolutionary history, providing genetic evidence to integrate with archaeological and cultural studies. Their studies of genomes in present-day north-east Africa support the theory of a northern exit route for early humans expanding out of Africa tens of thousands of years ago. Studies of New Guinea demonstrated that people there have evolved independently for 50,000 years, and that the present-day genetic landscape is dominated by an expansion that took place 10,000 years ago, associated with food production. Other studies have provided insights into the movements and mixtures of prehistoric and historic populations in the Middle East, and into Bronze Age Canaanites from Lebanon.

Researchers also led the sequencing and analysis of data from the Human Genome Diversity Project, providing a fully open-access reference resource from the most diverse set of publicly available samples. Their work shows the absence of fixed genetic differences between continents – demonstrating the lack of human ‘races’. These samples and data remain a key reference set for the foreseeable future because of their diversity and open access.

Read more:
DNA analysis shows Anglo-Saxon ancestry in East of England.


Despite progress in fighting the illness, malaria remains a pervasive global health challenge. Nearly half the world’s population – 3.3 billion people in 87 countries – are at risk, and hundreds of thousands die every year.

In 2013, researchers at the Sanger Institute discovered the molecule that makes the most deadly malaria parasite, Plasmodium falciparum, specifically infect humans, as opposed to other species. This molecule is a promising vaccine target for the disease. Researchers at the Sanger Institute have also led studies to characterise the complex life cycle of malaria parasites, using single-cell sequencing methods. 

They have also built reference genomes for the parasite and mosquitoes, and databases of mosquito and parasite genomes.

A key part of this work is MalariaGEN, a dynamic global data sharing network of dedicated researchers spanning over 50 countries. The collective efforts of MalariaGEN researchers have yielded 95 per cent of the world’s malaria genetic data, sourced by sequencing over 60,000 parasites and approximately 15,000 mosquito samples. The teams use genomics to help monitor and prevent the spread of anti-malarial drug resistance and inform the development of a vaccine.

The Sanger Institute continues to support the network to build capacity for malaria genomic surveillance in endemic countries.


The Catalogue of Somatic Mutations in Cancer (COSMIC) is a comprehensive database of cancer mutation data and analytic tools. Staff working on the database have recorded more than 23 million mutations and 6,800 precise forms of human cancer, from 29,000 peer reviewed publications and 42,000 whole genome screen samples. The COSMIC database has been cited more than 20,000 times, and has 50,000 academic and commercial registered users, with 30,000 users a month visiting the website.

Read more:
Journey to precision cancer treatment takes off


Sanger Institute scientists systematically knocked out 1,000 of the 20,000 genes in the mouse genome, deleting them one at a time in all cells of individual mice. The knockouts led to a variety of changes in the mice, providing insights into gene function.

Other teams produced mice that capture the entire diversity of the human immune system, expressing human, rather than mouse, antibodies. This work has become the basis of advancing antibody-based treatments for infectious diseases, cancer, and blood disorders.

Human genomes – from one, to everyone

After sequencing the human genome, Sanger Institute researchers went on to lead and collaborate in international projects and sequence further groups of people to better capture human genetic diversity. Researchers used each new technology as it emerged to study genomes in more detail and more accurately than had previously been possible.

These projects include the HapMap project – which provided an initial database of over 3 million human DNA variants present in 270 DNA samples. Information and methods developed by the HapMap Project fuelled a first generation of ‘Genome Wide Association Studies’ (GWAS) that have localised over 600 new genetic risk factors for common diseases such as diabetes, heart attack, inflammatory bowel disease, breast cancer, schizophrenia, and other disorders. 

Building on this, the 1000 Genomes Project Consortium exploited the first next-generation DNA sequencing technologies to develop a database including more human populations. The data provided the most comprehensive view of global human variation so far. This database contains all forms of variation – from single letter DNA changes to large alterations in the structure and copy number of segments of chromosomes. The reference data resources generated by the project remain heavily used by the biomedical science community.

The UK10K project followed – which aimed to better understand the link between low-frequency and rare genetic changes and human disease. In 2015, the consortium published findings that revealed new genetic variations linked to health, and described population structure and the functional annotation of rare and low-frequency genetic variants. Data and web-based tools to enable others to explore the results were made freely available.

Most recently, staff sequenced nearly 250,000 whole human genomes for the UK Biobank project, a study which brings together health, lifestyle and now genomic data for half a million people. The work is the largest DNA sequencing project of its kind, and data are available for researchers around the world to access. The resource is enabling research into human biology and health, and the ongoing, fundamental questions of how genetic variation relates to disease.

Recognising our people

Over the years, staff have been recognised with awards, medals, fellowships and memberships of prestigious organisations, and two Knighthoods – all for their outstanding contributions to science and medicine. 

Staff and students have been awarded poster prizes or talk prizes at conferences. 

We have also had dozens of staff awarded ‘best cake’ in the annual summer BBQ bake off.

Tracking SARS-CoV-2

When the pandemic hit in 2020, staff with expertise came together from across the Sanger Institute to sequence and analyse SARS-CoV-2 genomes. Our large-scale sequencing technology was adapted and sped up. The team converted virus samples into data quickly enough to help public health officials make decisions. At its peak, the team were able to sequence 70,000 SARS-CoV-2 genomes per week.

This led to the establishment of the Sanger Institute’s Genomic Surveillance Unit to continue this work and also share expertise and knowledge. Now, scientists are setting up genome sequencing for all respiratory viruses in the UK to detect variants, inform public health decisions and aid vaccine development, in a new project – the Respiratory Virus and Microbiome Initiative.

Understanding development

The Sanger Institute has become one of the world’s leading centres for understanding human development through single cell genomics. Our work has mapped the development of several organs, including the gut and lung, and of the entire immune system as it matures across multiple organs. We are learning the rules of how different cell types are made in the body, and now applying these rules in the laboratory to make tailored cell types for further research. This work is also informing studies of how childhood cancers form and can be treated.

Read more:
How the maternal immune system is modified in early pregnancy

Accessing vertebrate genome data

Built as a collaboration between the Sanger Institute and EMBL’s European Bioinformatics Institute (EMBL-EBI) in 2000, Ensembl contains genome databases for vertebrate species. Now run solely by EMBL-EBI, it is used extensively by the global research community. It enables access to genome sequences and important additional information on genome structure and function. It continues to expand and provide services today.

Technical experts

Technical staff are the foundation of scientific endeavours. Within the Sanger Institute, they are trained and skilled in the techniques, tools and technology of their subject and provide practical application of knowledge, including hands-on support, directly contributing to our teaching and learning, research and enterprise activities. Over 30 years, the number of technical staff has increased substantially, in parallel with the Institute’s size. Currently, we have more than 700 technical staff spread over 5 job families and 177 different job titles – about half of the Institute staff.

Thank you!

A huge thank you to all our staff and students, past and present, for being part of the Sanger family and making this a great place to work.
