Genomic surveillance – the world’s binoculars focused on infectious diseases

By Alison Cranage, Science Writer at the Wellcome Sanger Institute

Data visualisation from Microreact, showing lineages of the SARS-CoV-2 virus using COG-UK and GISAID data.

Genomic surveillance is in the spotlight as scientists race to track emerging variants of the coronavirus, SARS-CoV-2. It is helping to identify mutations that could affect how the virus functions or transmits. The sooner new variants are found, the sooner action can be taken against them.

The speed and scale of genomic surveillance that has been set up in the UK for SARS-CoV-2 is unparalleled. The COVID-19 Genomics UK (COG-UK) consortium, which was started in March 2020, has now sequenced over 300,000 SARS-CoV-2 genomes – around half of the global total. The sequences have been used by researchers across the country to identify and help stop local outbreaks, and across the world to study the evolution of the virus.

Genomic surveillance refers to the systematic and regular collection of genetic sequence information from a pathogen population.

COG-UK researchers helped identify the B.1.1.7 variant in the UK in late 2020, which was shown to be more transmissible. UK authorities, and the world, were alerted to the threat. Different surveillance projects have picked up other worrying variants as the pandemic continues. Researchers expect 2021 to bring even more.

Genomic surveillance is not just vital for COVID-19. The Sanger Institute has long-established programmes that monitor the microorganisms that cause malaria, cholera and a whole range of other diseases. Surveillance enables researchers to spot mutations that might affect an organism’s function. In the case of malaria parasites and bacteria, scientists are looking for drug resistance.

We spoke to three researchers at the forefront of global efforts in genomic surveillance as it moves from a research tool into a public health asset.

Sequencing Laboratory at the Wellcome Sanger Institute. Image Credit: Dan Ross / Wellcome Sanger Institute

Sequencing the SARS-CoV-2 virus

Professor Sharon Peacock is Director of COG-UK. Sharon and her collaborators set up the consortium in March last year, as the UK was heading into lockdown for the first time. Having worked in infectious disease genomics and public health for many years, she knew the value that surveillance could bring to tackling outbreaks of an infectious disease.

“Genome sequencing is now fast enough, and possible on a large enough scale, that genomic data can inform public health programmes for COVID-19,” says Sharon.

Scale is crucial for statistical analysis of the virus’s mutations. There needs to be enough data to ensure that findings are not biased, for example by small numbers of samples.

Timely feedback of information is important so that public health authorities can react. Sharon shared an example of an outbreak in a Cambridge hospital in March 2020. A cluster of identical viruses was identified in patients with renal failure who were attending the hospital for dialysis. Genomic sequence data helped rule out transmission within the hospital wards. Together with anonymised epidemiological data about the patients’ movements, the sequences indicated that shared patient transport was the source of infections. Uncovering previously hidden routes of transmission like this enables swift action – further cases were prevented after transport arrangements were changed.
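As a rough illustration of what “a cluster of identical viruses” means in practice, the sketch below groups samples whose aligned genome sequences match exactly, and counts the differences between two sequences that do not. It is a toy example only: the sample names and ten-letter sequences are invented, and real analyses work on full, quality-checked genomes with dedicated phylogenetic tools rather than simple string comparison.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def identical_clusters(samples: dict[str, str]) -> list[list[str]]:
    """Group sample IDs whose aligned sequences are exactly identical."""
    groups = defaultdict(list)
    for sample_id, seq in samples.items():
        groups[seq].append(sample_id)
    # Only groups containing more than one sample count as a cluster.
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Hypothetical toy data: invented IDs and very short "genomes".
samples = {
    "patient_A": "ACGTACGTAC",
    "patient_B": "ACGTACGTAC",
    "patient_C": "ACGTACGTAC",
    "patient_D": "ACGTTCGTAC",
    "patient_E": "ACGAACGTTC",
}

clusters = identical_clusters(samples)
print(clusters)
print(hamming(samples["patient_A"], samples["patient_D"]))
```

Here patients A, B and C would be flagged as one cluster. On its own that says nothing about where transmission happened; as in the Cambridge example, it is the combination with epidemiological data that points to a likely route.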

The consortium has now sequenced just under half of the world’s publicly available SARS-CoV-2 genomes. Work continues on tracing local outbreaks, and there is a parallel focus on identifying mutations that might affect the virus’s function. Several variants have now been identified, in South Africa, Brazil and the UK, that are more transmissible or may evade an immune response from a previous infection or vaccine.

The pattern of genetic mutations in some variants has surprised scientists. “We had been seeing a steady rate of evolution of the SARS-CoV-2 virus, with a few mutations occurring in its 30,000 base pair genome every month as it replicates. But the new B.1.1.7 variant had 23 new mutations,” explains Sharon. Many of these mutations are in the spike protein, which the virus uses to attach itself to human cells. Researchers were quickly able to detect that the variant was spreading faster and was more transmissible than previous variants.
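A back-of-the-envelope calculation shows why 23 new mutations stood out. The sketch below takes the genome length and mutation count from the figures quoted above, and assumes, purely for illustration, that “a few mutations” a month means two. That number is an assumption, not a measured rate.

```python
GENOME_LENGTH = 30_000       # base pairs, as quoted in the article
NEW_MUTATIONS_B117 = 23      # as quoted in the article
MUTATIONS_PER_MONTH = 2      # ASSUMPTION: one possible reading of "a few"


def months_to_accumulate(mutations: int, per_month: float) -> float:
    """Months needed to gather `mutations` changes at a steady monthly rate."""
    return mutations / per_month


def rate_per_site_per_year(per_month: float, genome_length: int) -> float:
    """Convert a whole-genome monthly rate into changes per site per year."""
    return per_month * 12 / genome_length


months_expected = months_to_accumulate(NEW_MUTATIONS_B117, MUTATIONS_PER_MONTH)
site_rate = rate_per_site_per_year(MUTATIONS_PER_MONTH, GENOME_LENGTH)

print(f"Months to accumulate 23 mutations at the assumed rate: {months_expected}")
print(f"Implied changes per site per year: {site_rate:.4f}")
```

Under that assumed rate, 23 mutations would represent close to a year of steady change, which is why a variant appearing with all of them at once was unexpected.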

It is likely that other variants have also arisen, undetected, in parts of the world where there isn’t large-scale genomic sequencing.

“As the virus continues to spread, we may see increasing complexity in the combination of mutations as they accumulate,” says Sharon. “This may not be limited to transmissibility. There is the possibility that mutations will make the disease it causes more severe, and there is increasing evidence that some mutations enable the virus to escape an immune response from either a previous infection or a vaccine.”

National surveillance programmes, like COG-UK, are helping researchers stay one step ahead of the virus as it evolves.

For older pathogens, even those that have been with humans for centuries, the same cutting-edge genomic approaches apply.

Sequencing malaria

Caused by single-celled Plasmodium parasites, malaria kills 400,000 people a year, mostly children under the age of five. While malaria used to be present on most continents, it has now been eliminated from affluent parts of the globe. Efforts to eradicate the disease have brought down numbers dramatically over the last 20 years, but progress has recently stalled.

“To eliminate malaria we need to do more, to do better and to do things differently,” says Professor Abdoulaye Djimde, from the University of Science, Techniques and Technologies of Bamako, Mali. Djimde is one of the founders of the Pathogens genomic Diversity Network Africa (PDNA), which brings together scientists from 16 countries on the continent.

“We used to think that Plasmodium parasites in Sub-Saharan Africa were too diverse to group together,” explains Djimde. “But analysis of 2,000 Plasmodium falciparum genomes showed us there are six groups. This information is vital for national malaria control programmes. Each parasite population may react differently when you add in interventions – that might be vaccines, drugs or insecticides. One size doesn’t fit all.”

Djimde and his colleagues work closely with policy makers in the region, so that they are well informed and can incorporate genomic data, something relatively new, into national control programmes or treatment guidelines.

His genomic surveillance work has also shown that parasites within each of the groups are becoming less diverse – evidence that certain genes or traits are being selected for. In this case, there are genetic markers that indicate the parasite populations are becoming resistant to some malaria treatments. In Ghana and Malawi, they have identified emerging resistance to the main antimalarial drug.
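One common way to put a number on “less diverse” is nucleotide diversity: the average fraction of positions at which two genomes drawn from a population differ. The sketch below computes it for two invented sets of short sequences standing in for an earlier and a later sample. The statistic and the data are illustrative assumptions; the article does not say which measures the PDNA team used.

```python
from itertools import combinations


def nucleotide_diversity(seqs: list[str]) -> float:
    """Average per-site difference across all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy data: invented 8-letter sequences, not real parasite genomes.
earlier_sample = ["ACGTACGT", "ACGAACGT", "TCGTACGA", "ACGTTCGT"]
later_sample = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGAACGT"]

pi_earlier = nucleotide_diversity(earlier_sample)
pi_later = nucleotide_diversity(later_sample)

print(f"Earlier sample diversity: {pi_earlier}")
print(f"Later sample diversity:   {pi_later}")
```

In this toy case the later sample is dominated by a single sequence, so its diversity is lower. A drop like this in real data is a prompt for closer investigation, since selection is only one of several possible causes.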

“This is an early warning for the region,” says Djimde. Real world impacts haven’t yet been seen – treatments are not yet failing in the clinic. “Now is our chance to zoom in and eliminate these parasites before they become fully resistant to treatments.”

Their genomic surveillance work is continuing, to characterise and monitor these parasites in full. And the network is expanding to sequence and monitor antibiotic-resistant bacteria, and COVID-19 in Africa.

Sequencing cholera

Vibrio cholerae bacteria. Image Credit: David Goulding / Wellcome Sanger Institute

Like malaria, cholera is seen as a disease of the past in the UK. One of the fathers of modern epidemiology found fame in tracing the cause of cholera to infected water supplies in Victorian London.

But in many parts of the world, cholera still poses a huge threat. Vibrio cholerae bacteria cause millions of infections a year. The disease is responsible for hundreds of thousands of deaths, mostly of children under five, and often in already extremely vulnerable populations.

Sequencing cholera genomes is a global effort. Sanger researchers, led by Professor Nicholas Thomson, work with teams across the world to understand its spread and its evolution.

Not all strains of the bacterium are equal. Some are endemic – persisting in a region but not usually causing any major problems. Other pandemic strains rapidly spread and cause severe disease.

“Using fine scale genomic approaches we know what pandemic cholera looks like,” says Nick. “We can see the patterns of spread. We can determine how strains in one location relate to those in another.”

Previously, it was thought that the bacterium was an environmental pathogen, living in certain areas where it would then infect humans. But genomic surveillance has helped show that for pandemic strains, this isn’t the case. These are transmitted via human-to-human contact; they move with people from place to place.

“We are able to layer genomic data onto information about the bacteria. We include data on its surface proteins (its serotype), the presence of the cholera toxin molecule, and epidemiological data about where and when a sample was collected. This gives us exquisite detail. We can build evolutionary trees, tracing the history and spread of cholera around the globe.”

Using historical samples, the team have reconstructed the seventh and most recent cholera pandemic, which has circulated the globe since the 1960s. A V. cholerae lineage termed 7PET is responsible. It emerged in the Bay of Bengal before spreading to Africa and then South America in the 1990s. The team have shown that the same strain was also responsible for the outbreak in Haiti in 2010 and then Yemen more recently.

“Evidence from genomic surveillance can help us plan interventions. Genomics can differentiate between endemic disease and dangerous epidemic strains that mean authorities or healthcare systems need to take action. We need to define the bacterium that is causing disease.” 

“Two lineages of V. cholerae have caused seven global pandemics throughout history. 2020 has taught us that we can’t rule out a new pandemic strain of any pathogen evolving.”

As recent months have shown, genomic surveillance is essential in the fight against infectious diseases – be they caused by pathogens that have been with humanity for centuries or newly emerged foes.
