Streptococcus pneumoniae - as seen under a scanning microscope. Image credit: Debbie Marshall, Wellcome Images

Categories: Sanger Science | 12 June 2019 | 8.8 min read

Tracking a deadly shapeshifter

A global child killer constantly changes its coat to evade destruction, but a worldwide network of scientists is using genetics to get one step ahead

Pneumonia killed 920,136 children under the age of five in 2015.1 It is caused by a range of invading organisms - Streptococcus pneumoniae is the most common cause of bacterial pneumonia. Streptococcus pneumoniae, or “the pneumococcus”, is a smart “shapeshifter”. Many of us carry Streptococcus pneumoniae in our throats, where it usually lives harmlessly. But for young children, and those with weakened immune systems, it can become deadly.

It has the extraordinary ability to dynamically alter its ‘coat’, or serotype. Every pneumococcus bacterial cell is surrounded by a coat made of carbohydrates, and there are at least 100 different forms. The coat provides an outer layer, protecting the bacteria from attacks by human immune cells.

The coat is also what current vaccines target – priming the body’s immune system to recognise it if the real thing invades. Current vaccines target the 10-13 serotypes which are most common in disease-causing strains of the bacteria.

Professor Robert Breiman, director of the Emory University Global Health Institute, has seen the effects of the vaccine first hand. He has devoted almost a decade to establishing a surveillance programme for infectious diseases in Kenya. Before the introduction of the vaccine, the country had high rates of pneumonia.

“During my stay in Kenya, I worked closely with local doctors who made daily diagnoses of bacterial infections in infants. Every count of pneumococcal death on the surveillance report was a tragedy to a family.”

Vaccination programmes have saved millions of children’s lives around the world. While we rejoice at this victory, our wily enemy remains. It begins its comeback by changing its serotype to escape from the vaccine.

Streptococcal bacteria showing their polysaccharide coats. Image credit: The Rockefeller University

DNA sequencing to track vaccine evasion

Professor Stephen Bentley, a team leader at the Sanger Institute, aims to better understand the evolution of the bacteria through whole genome sequencing – the process of determining the complete DNA sequence of an organism. This information will enable researchers to better understand how the bacteria make their coat and make their escape, and to improve the design of new vaccines.

He joined the Sanger Institute in 1998 to study bacterial genomes. In the early 2000s, with new technologies emerging to sequence genomes faster than ever before, he moved to studying whole bacterial populations. Although the technology had become more rapid and cost-effective, DNA sequencing of a large number of bacteria was still an enormous challenge.

Stephen and his colleagues made a breakthrough by adding a special tag, or label, to each DNA sample and pooling the tagged samples for sequencing, thus saving cost and time.

He recalled: “It was an ordinary chilly day in January 2010 when we published our first paper with this approach, but my colleagues and I were so excited by the success and the potential it opened up. It meant that observing evolution of bacterial populations in near real-time had become a realistic prospect.”

The technological breakthrough had also prompted Stephen to arrange to meet with Professor Keith Klugman, a prominent scientist in pneumococcal research. One Sunday in January 2009, over a pint of beer in The Queen’s Arms in Kensington, London, they decided that it would be feasible to carry out a genomic surveillance programme that would sequence the genomes of over 20,000 pneumococcal bacteria collected from around the world.

The Queen’s Arms in Kensington, London where Professor Bentley and Professor Klugman met to discuss the feasibility of sequencing 20,000 pneumococcal genomes on 18 January 2009. Image via

Initially, the focus was to better understand the frequency of antibiotic resistance and to discover new resistance mechanisms, but they were soon persuaded by the Bill & Melinda Gates Foundation that the primary focus should be on understanding vaccine escape.2

Global set-up

To achieve such a large-scale project, they needed help from other world-leading scientists. Dr Lesley McGee from the US Centers for Disease Control and Prevention, an expert in pneumococcal microbiology and vastly experienced in large-scale international studies, was brought on board as one of the project directors. They also established strategic collaborations with other experts: Dr Anne von Gottberg from the National Institute for Communicable Diseases, South Africa; Professor Martin Antonio from the Medical Research Council Unit The Gambia at the London School of Hygiene & Tropical Medicine; and Professor Dean Everett from the Malawi-Liverpool-Wellcome Trust Clinical Research Programme.

And when Keith took up a new role as Head of the Pneumonia Program at the Bill & Melinda Gates Foundation, Robert Breiman from Emory University took over the reins as Principal Investigator. With financial support from the Gates Foundation, the Sanger Institute and the US Centers for Disease Control and Prevention, the Global Pneumococcal Sequencing (GPS) project was established in 2011.

The GPS project is the largest genomic survey of any bacterial species to date, representing a global collaboration among investigators from 55 countries. By the end of 2018, 22,000 pneumococcus bacteria from 50 countries had been sequenced here at the Sanger Institute. Lesley and her US team made tremendous efforts to expand the GPS network, facilitating sampling from all over the world and ensuring the DNA quality was good enough for sequencing.

She said: “Handling an unprecedented number of samples from many geographical locations was a great challenge for our team. Thanks to the collaborative spirit of all the GPS partners, we overcame a lot of obstacles and have been able to create a fantastic database”.

Locations of the pneumococcal strains collected during the study

Sifting through the strains

Grouping the pneumococcal bacteria by their DNA variations and producing a standardised naming system was a challenge at first. Dr Nicholas Croucher from Imperial College London and Dr John Lees from New York University strived to go beyond the limitations of the existing methods. They developed a new software tool, named PopPUNK, to define strains using the complex genomic data. Using their method, ~20,000 pneumococcal bacteria were grouped into 621 pneumococcal strains, or Global Pneumococcal Sequence Clusters (GPSCs).

How PopPUNK got its name

“John submitted the paper two days after my daughter was born, and “pop punk” was one of the names we had suggested for my dad as an alternative to “grandad” - which he did not consider appropriate for himself, as an ex-punk. The option of having “Pop”-ulation at the start, and “K”-mers at the end, appealed for inventing an acronym for the software, so I stole it, and now my dad has to put up with being called grandad.”

Nicholas Croucher

The genomic data allowed the GPS group to also analyse the serotypes (coats) of the bacteria within each strain. Of the 35 dominant strains in the world, the majority contained both some bacteria with coats that were targeted by the vaccine, and some that weren’t - highlighting their potential to escape the vaccine and continue to cause disease[1].

New studies on vaccine resistance

A GPS Analyses and Writing Workshop held at the Wellcome Genome Campus on 22-26 April 2019, in which Paula Gagetti, Roly Malaker, Ekaterina Egorova, Samanta Cristine Grassi Almeida, Geetha Nagaraj, Stephanie Lo, Keith Klugman, Hani Kim, Kedibone Ndlangisa, Lesley McGee, Robert Breiman, Nicholas Croucher, Stephen Bentley, Rebecca Gladstone and Chrispin Chaguza got together to produce a series of publications on the impact of the vaccine in six low- and middle-income countries.

I joined the GPS and the Sanger Institute in 2016, and together with colleagues, have been working to understand the strains of Streptococcus pneumoniae which are causing deadly infections around the world. We were interested in strains that had escaped the vaccine and caused invasive disease among young children across the globe.

Like previous studies, we have seen that after the introduction of vaccines, pneumococcus bacteria with coats targeted by the vaccine were replaced by bacteria with coats the vaccine does not target.

But we have gone beyond previous studies, looking for the first time at all the strains circulating globally. Our most recent work, published in Lancet Infectious Diseases and EBioMedicine, has mapped all the strains and serotypes – showing which are present where, as well as their vaccine and antibiotic resistance.

Using the DNA data, we have also discovered nine potential new serotypes. They were in a variety of pneumococcus strains and in a number of different countries across the globe. The findings are a major advance in our understanding of the undiscovered diversity in pneumococcus bacteria coats.

Now, anyone can explore the data on the GPS website. You can “explore strains” using a visualisation tool called Microreact which interactively displays a phylogenetic tree (a diagram that presents evolutionary relationships - like a family tree), location and time of collection for each serotype within a strain. You can “explore countries” using the world map which shows you the number of samples collected from each location and in-depth analyses for countries with at least 100 samples. Researchers are able to use PopPUNK to assign Global Pneumococcal Sequence Clusters to their own pneumococcal genomic datasets.

Tackling a deadly disease

This disease kills nearly a million young children every year, but research and technology could help stem this tide and deliver a powerful public health strategy to prevent these tragic deaths.

Today’s research findings highlight the importance of genomic surveillance. Continued surveillance of the changing pneumococcal populations with increased geographical representation will generate crucial insights - defining the next generation of vaccines.

Vaccines prime our immune system to spot these deadly invaders. Our research can prime public health sectors around the world to devise vaccination policies and strategies to reduce child deaths. Now we have the chance to be one step ahead of this deadly shapeshifter.
