Category: Sanger Science

Exploring Sanger’s groundbreaking research

Giant Hogweed - one of the 25 genomes being read by the Wellcome Sanger Institute. Image credit: Appaloosa, Wikimedia Commons
25 Genomes, Sanger Life, Sanger Science

Giant Hogweed sampling, a retrospective

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 14/12/2017

Anticipating that the Giant Hogweed would not win the popular vote in the “I’m a scientist, get me out of here – 25 Genomes…” event, I decided to try to find some.

Let your fingers do the walking…

The National Biodiversity Network (NBN) atlas is incredibly useful for finding out where (and when) things are found, so I started there, looking for Heracleum mantegazzianum within 5km of the Sanger Institute:

Giant Hogweed locations in a 5km radius from the Wellcome Genome Campus

This was a bust, though: they’d been cleared out (the records are from 2004 and 2011, so no surprise there). So I went to a different source, the BSBI (Botanical Society of Britain and Ireland), who are linked to NBN but who I figured would be more specific. I was right. You need to register to get the information, but once you do, it is very good.

A nice chap called Kevin Walker sent me their records for the Cambridgeshire area. Turns out we’re in a bit of a hotspot, so that’s good news. Unfortunately, by the time I found this out it was November, and plants tend to die back in the autumn.

On the other hand I had read somewhere that giant hogweed will germinate in winter so I figured that it might be possible to find some youngish plants – these are ideal for DNA as growing parts are the best for extraction.

It’s a matter of record

From the records I found out who had seen the plants in question, and one of the most recent was a chap called Jonathan D. Shanklin, who’d seen one in central Cambridge, on Hobson’s Conduit. This, by the way, is a water channel cut approximately 400 years ago as a water source for the centre of Cambridge and is now protected as a scheduled ancient monument.

Digression aside, with a name like that it was relatively easy to find Mr Shanklin with a quick Google search. Turns out he works for the British Antarctic Survey. One slightly awkward phone conversation later, I had a clear idea of where this plant would be, not far from the Botanical Gardens. However, driving into the centre of Cambridge isn’t much fun, so this option went on the back burner.

Handle with care

Skin blistering caused by giant hogweed. Image credit: Cosima Pferdeliebe, Wikimedia Commons

Here’s a little tangent: this plant is not something to be trifled with. Giant hogweed is nasty stuff; its sap contains a sunlight (UV) activated toxin that can cause pretty horrible blistering (see the image above). So I made sure that I stocked up on a full face shield [liberated from a past position], plenty of nitrile gloves and a Tyvek suit (thanks to John Lovell at the Sanger).

The next location I wanted to scout out (I like to have backup plans) was the Bourne Brook area as this had a whole bunch of recorded sightings over the past few years (by a Ruth Hawksley) so I went for a little drive as it’s only 15 mins from work.

It turns out that Bourne Brook has been very effectively cleared of hogweed this year so I went to the workplace of Ruth Hawksley. Ruth works at the Bedfordshire and Cambridgeshire Wildlife Trust and they have an office that’s open to the public just 10 minutes from the brook. Sadly she wasn’t there.

However, her colleagues were in and they gave me her card. After a fruitful telephone chat, Ruth embarked on a mission to find some for me. This did not go well so we had an email exchange over the following days about getting hold of some seeds so that I could grow some myself. Again, no success as all the plants had been sprayed. Then Ruth remembered that there was a plant found and de-headed this past summer, rather handily just up the road from me in Ickleton, so off I went.

Lost in translation

Time for another aside. The location I was given was TL49384419 and a street. It seems that there are more ways to record location information than you might think. The above is an example of the Ordnance Survey National Grid coordinate system and it seems to be the standard for biological sample recordings in the UK.
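For anyone wanting to turn a grid reference like the one above into plain numbers, here is a minimal Python sketch. It assumes the standard two-letter, 100km-square lettering of the National Grid (25 letters, no “I”); the function name is made up for illustration and it does no checking that the square actually lies within Great Britain.

```python
def grid_ref_to_easting_northing(grid_ref):
    """Convert an OS National Grid reference (e.g. 'TL49384419') into
    (easting, northing, resolution), all in metres.

    The easting/northing returned is the south-west corner of the square
    that the reference identifies; resolution is that square's side length.
    """
    ref = grid_ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]

    # Two letters (A-Z without I), then an even number of digits (2 to 10).
    if len(letters) != 2 or not all("A" <= c <= "Z" and c != "I" for c in letters):
        raise ValueError("expected two grid letters, got %r" % letters)
    if not digits.isdigit() or len(digits) % 2 or len(digits) > 10:
        raise ValueError("expected an even number of digits, got %r" % digits)

    # Letter positions in the 25-letter alphabet used by the grid (no 'I').
    l1 = ord(letters[0]) - ord("A")
    l2 = ord(letters[1]) - ord("A")
    if l1 > 7:
        l1 -= 1
    if l2 > 7:
        l2 -= 1

    # Index of the 100km square, counted from the grid's false origin.
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = (19 - (l1 // 5) * 5) - (l2 // 5)

    # Split the digits in half and scale them up to metres.
    half = len(digits) // 2
    resolution = 10 ** (5 - half)
    easting = e100km * 100_000 + int(digits[:half]) * resolution
    northing = n100km * 100_000 + int(digits[half:]) * resolution
    return easting, northing, resolution


print(grid_ref_to_easting_northing("TL49384419"))
```

An eight-digit reference like this one pins the location down to a 10m square, which is about right for “beside the substation”.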

Another nugget I discovered earlier in the project is that iPhones record GPS coordinates in the metadata of pictures. It isn’t easy to extract without using third-party software. However, if you do, you can then translate it from “AA; B; CC.cccccc” to “AA.B.CC.ccccc”, which you can then copy and paste into Google Maps.
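If the three numbers in the photo metadata are degrees, minutes and seconds (that is my assumption about the “AA; B; CC.cccccc” layout, not something checked against the photo format), another route is to convert them to decimal degrees, which Google Maps also accepts. A rough Python sketch, with made-up function names and a made-up example coordinate:

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere="N"):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    South and West are returned as negative numbers.
    """
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere.upper() in ("S", "W") else value


def parse_dms(text, hemisphere="N"):
    """Parse a 'deg; min; sec' string (assumed layout) into decimal degrees."""
    degrees, minutes, seconds = (float(part) for part in text.split(";"))
    return dms_to_decimal(degrees, minutes, seconds, hemisphere)


# Hypothetical coordinate, not the real location of the plant.
latitude = parse_dms("52; 4; 30.0", "N")
longitude = parse_dms("0; 11; 6.0", "E")
print("%.6f, %.6f" % (latitude, longitude))
```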

Anyhoo, a short walk up the lane later and I find myself standing by an electricity substation in Ickleton, looking somewhat suspicious in a pair of bright orange gloves, and staring at this:

Giant hogweed in the wild (hiding beside an electricity substation in Ickleton)

This is the hogweed you’re looking for

One quick email later (thank you 4G connectivity) and Ruth confirms that this is the plant I’m looking for. Further confirmation came from around the back of the ‘station, where there’s a 2m dead stem that’s been de-headed. So I took one of the leaves back to the lab to deposit in the -80 degree freezer. Success!

About the author:

Dan Mead is the 25th Anniversary Sequencing Project Coordinator, for the 25 Genomes Project for the Wellcome Sanger Institute, Cambridge.

More on the 25 Genomes Project:

25 Genomes Project web page 

25 Genomes, Sanger Science

Concerning spiders, and where to find them

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 14/12/2017

Fen Raft Spider on the water, courtesy of Dr Helen Smith

About spiders (specifically the Fen Raft spider, Dolomedes plantarius) and where to get them from.

After visiting the Fen Raft spider website, I contacted the woman listed there, a Dr Helen Smith. Dr Smith was incredibly helpful and, I discovered, is the ‘grandma’ to the spiders living in the ditch we’re planning to source them from. Helen raised them in her house as part of the translocation programme she has been running for the past few years.

Also, I found out, spiders hibernate for the winter, something I probably should have intuited. The thing is, no one knows how these particular spiders do it! When farmers clear out the ditches to keep the water channels clear and healthy, a whole bunch of these raft spiders pop to the surface. It’s assumed they are hibernating under the water, but do they burrow into the mud? And how do they breathe during this time? Do they even breathe?

My guess is that they trap air around their bodies and this sustains them over winter, but they must also shut down their metabolism to virtually nil to avoid using it up. Also, how do they anchor themselves underwater? Do they burrow? Grab onto something? Wedge themselves under a rock? It’s a mystery…

Helen is going to send me photos and, maybe, videos of the collection as well. I’m hoping to see spiders appearing all over the surface of the water like some kind of hidden marine corps about to go into action.

This should all take place next Tuesday, weather permitting, so another species should be sampled by Christmas.

About the author:

Dan Mead is the 25th Anniversary Sequencing Project Coordinator, for the 25 Genomes Project for the Wellcome Sanger Institute, Cambridge.

More on the 25 Genomes Project:

25 Genomes Project web page 

Sanger Science

Life in the fast lane: rapid reproduction in malaria parasites

By: Brandon Invergo
Date: 01/12/17


Red blood cells infected with Plasmodium berghei gametocytes

It’s a classic image that many people have seen in school: an egg cell lying in wait while numerous sperm cells race towards it. The sperm that reaches the egg first gets to fertilise it. When multiple males are competing, this race can lead to the evolution of, for example, faster-swimming sperm.

Now imagine a more difficult scenario: before its sperm can swim to the egg, the male would first need to race to make the sperm. This is similar to what malaria parasites must do when they are first passed to a mosquito, and they have evolved to do it extremely fast. We wanted to know what’s happening inside the parasite that helps it prepare to reproduce so quickly, and we found that these species may not follow the usual rules.

First, the malaria parasite needs a way to say to itself, “Ok, I’m inside the mosquito now. It’s time to get ready to reproduce!” One way that a cell can do this is by modifying its proteins – the cell recognises a change in its environment and some protein is modified as a result. This protein can then pass on the message by modifying other proteins, and so on in a chain, eventually leading to important cellular machinery being turned on. We performed an experiment that allowed us to watch how the parasites use one such protein modification, called phosphorylation, while they prepare to sexually reproduce.

What is the parasite doing at this point? Before the mosquito drinks the infected blood, some of the parasites are already separated into males and females. Once they’re inside the mosquito midgut, they leap into action, with each male producing eight “microgametes” and each female becoming a “macrogamete.” Like sperm, the microgametes then have to race to find a macrogamete to fertilise.

In order to produce those eight microgametes, the male must copy its DNA and separate the copies three times, as well as build everything needed to make the gametes swim. In human embryonic cells, each cycle of copying and separation might take around 30 minutes, but the parasites can do all three cycles in only 10 minutes. We saw that even within the first 20 seconds, hundreds of proteins were being modified. We had been sure that we would see only a few proteins affected in such a short period.
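To make the numbers concrete, here is a back-of-the-envelope Python sketch. It assumes that each round of copying and separation exactly doubles the number of genome copies, and that the three reference cycles of roughly 30 minutes would simply run back to back; the function names are illustrative only.

```python
def genomes_after(rounds):
    """Number of genome copies after a given number of doubling rounds."""
    return 2 ** rounds


def speed_ratio(rounds, reference_minutes_per_round, parasite_total_minutes):
    """How many times faster the parasite completes all rounds than a
    reference cell taking reference_minutes_per_round for each round."""
    return rounds * reference_minutes_per_round / parasite_total_minutes


rounds = 3
print(genomes_after(rounds))        # three doublings: 1 -> 2 -> 4 -> 8
print(speed_ratio(rounds, 30, 10))  # 90 minutes versus 10 minutes
```

On these rough figures the parasite gets through its three cycles about nine times faster.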

However, when we looked at which proteins were being modified, we had an even bigger surprise: not only did we see the proteins needed for copying DNA, we also saw ones related to separating the copies, as well as ones for the motors needed for swimming. In most species, these steps happen one after the other. Thanks to our results, however, we are now starting to think that malaria parasites don’t wait for one step to finish before beginning the next.

This is the first really dynamic, “big-picture” look at how the parasite works “under the hood” and it’s shocking just how fast it is and just how much is happening in such a short period of time. This will be useful to other researchers because if we can figure out how the parasite goes about preparing for reproduction, we might be able to figure out new ways to block it from happening. Blocking this preparation could then help to stop the spread of the disease.

This was interdisciplinary research that required the collaboration of teams at the Sanger Institute and EMBL-EBI. The project was especially made possible thanks to the joint EMBL-EBI / Sanger Institute ESPOD postdoctoral fellowship, which gave me the opportunity to perform interdisciplinary research as a member of both institutes. The fellowship has been an excellent and unique experience, which I would enthusiastically recommend to anyone wanting to combine experimental and computational techniques in their research.

About the author:
Dr Brandon Invergo is an EBI–Sanger postdoctoral fellow (ESPOD) at EMBL-EBI and the Sanger Institute, working in the groups of Pedro Beltrao (EMBL-EBI), Oliver Billker (Sanger Institute), and Jyoti Choudhary (formerly Sanger Institute, now at the Institute of Cancer Research) on protein signalling in malaria parasites.

Related publication:
Invergo BM et al. (2017). Sub-minute Phosphoregulation of Cell Cycle Systems during Plasmodium Gamete Formation. Cell Reports Vol 21, issue 7, pages 2017-2029. DOI:10.1016/j.celrep.2017.10.071



Sanger Science

A step towards an effective, multi-target vaccine for malaria

DATE: 31/10/17
By: Gareth Powell and Leyla Bustamante, who carried out the research at the Wellcome Trust Sanger Institute

Editor’s Note:
Gareth is now based at University College London where he is researching the development of left-right asymmetry in the vertebrate brain, under the supervision of Professor Steve Wilson.
Leyla works at the Ferrier Research Institute, Victoria University of Wellington, where she is using synthetic biology for the efficient generation of valuable chemicals in fungi.

The malaria parasites (the white circles) make contact with the red blood cells (large transparent circles) before deforming the blood cells' surfaces and invading

Malaria is a public health scourge in many developing countries. It particularly affects the very young and the elderly, who are least able to cope with the waves of fever brought about by haemolysis – a process partially caused by red blood cells laden with parasites exploding, releasing their cargo to find and infect more red blood cells. Importantly, this process is the only narrow window within which the human immune system can recognise and adapt to the parasite invaders before they disappear within the safe confines of another red blood cell host. However, the parasite has evolved the ability to change in response to detection by the immune system – literally shedding its coat and replacing it with a new one, rendering it camouflaged again. A further challenge is that malaria is caused by not just one species of parasite but a family of them. The genetic variation between species, and even between different strains of the same species, can render a successful defence against one parasite ineffective against another.

Prompting an individual’s immune system to be ready for this attacker is very desirable, but how can we develop a vaccine against such a variable disease? As adaptable as the parasites are, there are some things that can’t be easily changed, like the proteins on the surface of the parasite that it uses to detect a red blood cell, attach itself to it and invade. These represent targets that are exposed to antibodies in the blood, and are important for the survival of the parasite. The targets can be similar between different species and strains of malaria. Because a good target must provoke a strong immune response, be essential to parasite viability, and be highly conserved across a broad spectrum of strains (to improve the chances of protection against heterologous challenge), targeting a single protein is unlikely to be enough. Therefore, an effective second-generation vaccine will almost certainly need to target multiple components simultaneously. So, how can we begin to work out which combination of these proteins is the best target for a universal vaccine?

We, whilst at the Wellcome Trust Sanger Institute, tried to answer this question. We curated a list of potential targets and generated antibodies against each of them, then tested these antibodies in blood parasite cultures. By doing this, we identified five new potential vaccine targets for which antibodies inhibited growth of two different strains of the parasite: CyRPA, EBA181, MSRP5, RAMA and SERA9. Then, we decided to explore how well these antibodies might work in combination (including another essential target: Rh5) by mixing them in different ratios and testing the different combinations in blood parasite cultures. Would they have the effect we would expect by just adding their efficacies together? Would we get a greater neutralising effect than would be predicted if the antibodies were acting individually? Or would we find that they interfere with each other, giving the parasites an easy ride? Surprisingly, we found examples of all three situations, suggesting that a multi-target vaccine is not as simple as just picking promising targets from a list and mixing them together.
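To illustrate what “adding their efficacies together” can mean in practice, here is a small Python sketch based on Bliss independence, one common way of predicting the combined effect of two agents that act independently. This is an assumption for illustration and not necessarily the model used in the study; the inhibition values and the tolerance are hypothetical.

```python
def bliss_expected(inhibition_a, inhibition_b):
    """Expected combined inhibition (as a fraction, 0 to 1) if two
    antibodies act independently (Bliss independence)."""
    return inhibition_a + inhibition_b - inhibition_a * inhibition_b


def classify_combination(inhibition_a, inhibition_b, observed, tolerance=0.05):
    """Label a combination by comparing the observed inhibition with the
    Bliss expectation. The tolerance is an arbitrary illustrative margin."""
    expected = bliss_expected(inhibition_a, inhibition_b)
    if observed > expected + tolerance:
        return "synergistic"
    if observed < expected - tolerance:
        return "antagonistic"
    return "additive"


# Hypothetical numbers: antibody A alone inhibits growth by 40%, B by 50%.
for observed in (0.90, 0.70, 0.45):
    print(observed, classify_combination(0.40, 0.50, observed))
```

With these made-up inputs the expected combined inhibition is 70%, so a measured 90% would count as more than additive, and 45% as the antibodies getting in each other’s way.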

What made this project even more special was being able to involve some great collaborators to examine other facets of this problem. With Tuan Tran and Peter Crompton at the NIH in the US, we were able to look at the real-world effect of these individual targets, and combinations of them, on the epidemiology of malaria in a human population. We wanted to answer the question of whether an immune response to the targets we identified offered an individual more protection from the disease, and found that combinations of some of the targets we identified in vitro were associated with reduced malaria risk. With Yen-Chun Lin and Pietro Cicuta, at the University of Cambridge in the UK, we were able to observe the process of invasion at the cellular level, as it was happening. We could begin to understand the mechanics of invasion and the stages at which these different parasite proteins functioned: allowing the parasites to detect and stick to a red blood cell, reorientate and form a strong anchor on the surface and then start to deform the membrane and push into the cell. Using this information, we were able to begin to theorise as to how different combinations of antibodies might work together to provide an improved protective effect or compete with each other and reduce protection.

All data combined, our research clearly showed that targeting multiple antigens is an effective tool against malaria. There may be a long way to go to making an inexpensive, widely available, multi-component universal vaccine for malaria. There are lots of other powerful factors besides scientific advancement playing an important part in its development and use, like politics, education and economics, but we like to think that we have at least helped to make a step in the right direction.

About the authors:

Dr Gareth Powell is now based at University College London where he is researching the development of left-right asymmetry in the vertebrate brain, under the supervision of Professor Steve Wilson.

Dr Leyla Bustamante works at the Ferrier Research Institute, Victoria University of Wellington, where she is using synthetic biology for the efficient generation of valuable chemicals in fungi.

Related publication:

Leyla Bustamante et al. (2017) Synergistic malaria vaccine combinations identified by systematic antigen screening. Proceedings of the National Academy of Sciences (PNAS). DOI: 10.1073/pnas.1702944114

Sanger Science

Results of the Chordoma Genome Project reveal genetic changes that drive chordoma

DATE: 12/09/17
By: Chordoma Foundation Team

Editor’s Note: This blog is reproduced from The Chordoma Foundation blog, a charity that has funded and collaborated on recent research by Sanger Institute scientists. For more on The Chordoma Foundation, please visit:

Sacral bone chordoma

This month, a group of chordoma scientists led by Dr. Adrienne Flanagan of University College London (UCL), Dr Sam Behjati and Dr. Peter Campbell of the Wellcome Trust Sanger Institute published results of the Chordoma Genome Project — the first major genetic sequencing study of sporadic (non-inherited) chordoma. Their findings, which appear in the leading research journal, Nature Communications, provide the most comprehensive insights to date about how chordoma forms, and important clues about how it could be treated.

This publication represents the culmination of a multi-year $215,000 investment by the Foundation, which enabled the UCL and Sanger Institute teams to use advanced DNA sequencing technologies to analyze 104 chordoma tumor samples.

Key findings

In recent years, new sequencing technologies have enabled a revolution in the understanding of cancer at a genetic level. This study represents the first time those techniques have been applied at a large scale to chordoma.

By comparing chordoma tumor DNA to normal DNA from the same individuals, the researchers were able to catalog the genetic changes that occur in chordoma tumors. This yielded several important findings, including:

  • Amplification of the brachyury gene is common in sporadic chordoma
    Previous work supported by the Chordoma Foundation found that inheriting an extra copy of brachyury is responsible for familial (inherited) chordoma. The current study revealed that more than 20 per cent of sporadic (non-inherited) chordomas also acquire an extra copy of the brachyury gene during the development of the tumor. This discovery confirms a close relationship between familial and sporadic chordoma and provides further evidence that brachyury plays an essential role in driving the disease. It also highlights the importance of research to understand better how brachyury functions in chordoma and to develop therapies that target brachyury.
  • Mutations in the PI3K pathway
    The PI3K/Akt/mTOR pathway is activated in the majority of chordomas and this study found that 16 per cent of tumor samples had a mutation in the PI3K signaling pathway. These findings provide further rationale to investigate therapeutics targeting the PI3K pathway in chordoma patients.
  • Mutations in SWI/SNF chromatin remodeling genes
    This publication is the first to implicate a specific subset of mutations affecting the SWI/SNF chromatin remodeling complex in chordoma patients. Mutations in genes in this complex were identified in 17 per cent of chordoma patients, suggesting that epigenetic changes may play an important role in the development of chordoma. This finding may also point to a new therapeutic approach for some chordoma patients.
  • Mutations in the LYST gene
    Novel mutations in lysosomal trafficking regulator gene, LYST, were identified in 10 per cent of chordoma patients – the first time the LYST gene has been implicated in cancer. These mutations are believed to prevent the LYST protein from functioning properly and cause cells to transform into chordoma. Further research will be needed to understand the full role of LYST in chordoma and determine if mutations in the gene can be used to help guide diagnosis or treatment.
  • Low mutational load
    Compared to other tumor types, chordoma has a relatively small number of genetic changes. In total, potentially causative genetic mutations were found in only 45 per cent of tumors. This indicates that the key drivers of many chordoma tumors may be epigenetic changes or structural rearrangements within the genome that are not obvious with current analysis techniques.

Exploring new leads

These findings provide several important leads to pursue which could yield new therapeutic options for chordoma patients.

In the near term, there is now strong rationale for evaluating drugs that target the PI3K and SWI/SNF complex defects observed in this study. Several research groups have already begun work in this area, and are collaborating with the Foundation to test drugs through our Drug Screening Program.

Additionally, the results of this study underscore the urgency to develop therapies that target brachyury. As one of the Foundation’s top research priorities, we plan to invest significantly in this area over the next several years.

Finally, building on this study, research is needed to characterize the epigenetic changes that contribute to chordoma development. By adding this additional layer of data on top of the genetic changes characterized by this study, we will likely gain a more complete picture of how chordoma forms and the key pathways responsible for driving it.

A team effort

These important findings are the product of an impressive collaboration among a team comprising pathologists, surgeons, cancer biologists, and bioinformaticians from over a dozen institutions in North America and Europe. Our Co-Founder and Executive Director Josh Sommer and Manager of Research Patty Cogswell were also intimately involved in orchestrating the project and are co-authors on the paper.

The work was enabled by the generous contributions of many in the chordoma community in the US, Canada, and the UK. Special thanks go to Gerry and Susan Fitz-Gibbon whose organization, Chordoma UK provided vital support to Dr. Flanagan’s lab.

“This critical addition to the scientific knowledge base about chordoma would not have been possible without the dedication and involvement of patients and family members around the world. Thanks to them, we now have a clearer roadmap for how to attack chordoma and improve treatment in the future.”
– Professor Adrienne Flanagan, University College London and the Royal National Orthopaedic Hospital NHS Trust

The Foundation is grateful to all the investigators and donors who by working together made this study possible. It represents a major step forward for patients, researchers, and clinicians in understanding the genetic basis of chordoma and paves the way for better and more personalized treatments to come.

Sanger Science

Placing drug-resistant strains of E. coli into a broader context

DATE: 18/07/17
By: Teemu Kallonen, Julian Parkhill and Sharon Peacock

Escherichia coli is commonly carried in the human gut, and is the leading cause of bloodstream infection in England, elsewhere in Europe and the United States.

Several observations have made the study of E. coli an urgent priority. There has been a marked increase in the rate of E. coli bloodstream infection in recent years. For example, annual rates increased in England by 80% between 2003 and 2011 (from 16,542 to 29,777 cases), after which mandatory surveillance documented a further 10% increase between 2012/13 and 2014/15 (from 32,309 to 35,676 cases).

There has also been an emergence of E. coli strains that are resistant to multiple antibiotics. As antibiotic resistance and how to tackle it have become a global priority, these strains have gained increasing research attention. One of the highest profile drug-resistant E. coli strains to be studied so far is the bacterial type classified as sequence type (ST) 131.

A drawback of focusing on drug-resistant E. coli is that this only represents part of the story, since drug-resistant strains co-exist alongside their susceptible counterparts. In our study, we used whole genome sequencing to characterize an unbiased E. coli collection of over 1500 isolates, most of which had been collected by the British Society of Antimicrobial Chemotherapy over 11 years from patients with bloodstream infection in 10 hospitals across England.

Our study captured the year (2002) in which ST131 emerged in England (see figure below). We noted that within a short space of time, the number of ST131 isolates reached an equilibrium with other types. Around the same time, another type (ST69, not a multidrug resistant strain) also emerged, and again quickly reached an equilibrium within the overall population. These findings draw a sharp contrast with other drug-resistant pathogens such as methicillin-resistant Staphylococcus aureus (MRSA) or vancomycin-resistant Enterococcus faecium, where one or a limited number of types dominate the population that causes human infection.


Proportions of STs by year of isolation for our collection of E. coli from patients with bloodstream infection. Percentage is plotted by year ordered by frequency at the start of the study (most common at the bottom). The emergence of ST131 and ST69 can be observed in 2003 and 2002, respectively.

These findings suggest that new types of E. coli may emerge quite frequently, but that emergence does not necessarily mean these types will out-compete others to become a dominant cause of infection in the human population. This includes drug-resistant strains, a finding which has important implications for the control of drug-resistant infections. In particular, control strategies should not restrict their focus to just drug-resistant strains.

The reason for this equilibrium may relate to the fact that all bacteria are constantly competing with others to survive. For example, E. coli has to compete with other strains of E. coli and with other bacterial species in the human gut. It is likely that some bacteria carry a genetic repertoire (beyond antibiotic genes) that provides a fitness advantage. The pattern of equilibrium suggests that this advantage is stronger when the bacterium is rare, but is reduced as it becomes more common, a process called negative frequency-dependent selection.
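The idea can be illustrated with a toy simulation, sketched below in Python. It is not the analysis from our study: it assumes just two competing types, a fitness advantage for the new type that shrinks linearly as it becomes more common, and made-up parameter values.

```python
def simulate_nfds(start_freq, equilibrium_freq=0.2, strength=0.5, generations=200):
    """Toy model of negative frequency-dependent selection.

    A new type competes with a resident type whose fitness is fixed at 1.
    The new type's fitness is above 1 while it is rarer than
    equilibrium_freq and below 1 once it is more common.
    Returns the new type's frequency in every generation.
    """
    p = start_freq
    trajectory = [p]
    for _ in range(generations):
        fitness_new = 1.0 + strength * (equilibrium_freq - p)
        mean_fitness = p * fitness_new + (1.0 - p) * 1.0
        p = p * fitness_new / mean_fitness  # frequency after selection
        trajectory.append(p)
    return trajectory


rare_start = simulate_nfds(0.001)    # a newly emerged type
common_start = simulate_nfds(0.9)    # a type that starts out dominant
print(round(rare_start[-1], 3), round(common_start[-1], 3))
```

Whether the type starts rare or common, it settles at the same intermediate frequency instead of taking over, which is the kind of equilibrium described above.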

E. coli type ST73 was the most common in our collection (ST131 was second commonest). ST73 was largely antibiotic-susceptible. We looked at whether drug resistance changed in these two types over time, and found this to be largely stable. This suggests that not every type is likely to develop resistance; furthermore, our findings indicated that most types were stably susceptible over the 11 years of study.

The fact that ST131 and ST73 were the most common types in the collection allowed us to compare whether there were specific genes that could explain their biological success. We found genes that were specific to ST131 and highly related types, and different genes that were specific to ST73. Their link to success can only be speculative based on sequence data, and these findings warrant further experimental studies to test whether they go some way to explaining their ability to be maintained in the human gut and elsewhere.

We did not look at E. coli types being carried by humans but not causing disease, which limits our ability to relate these findings to healthy people. But our findings provide some reassurance, at least in our setting, that ST131 associated with the severe end of the infectious disease spectrum (bloodstream infection) is in equilibrium with the overall population of E. coli strains, rather than increasing over time.

About the authors:

Dr Teemu Kallonen previously worked as a senior bioinformatician/postdoctoral researcher at the Wellcome Trust Sanger Institute and is currently working as a postdoctoral fellow in the Department of Biostatistics at the University of Oslo, Norway. He works with whole genome sequenced Enterobacteriaceae to investigate their evolution, virulence and resistance to antimicrobials.

Professor Julian Parkhill is Head of Infection Genomics and Senior Group Leader at the Wellcome Trust Sanger Institute, and Honorary Professor of Microbial Genomics at the University of Cambridge. His group is using high-throughput genomic approaches to understand the evolution of bacterial pathogens on short and long timescales; how they transmit between hosts on a local and global scale, how they adapt to different hosts and how they respond to natural and human-induced selective pressures.

Professor Sharon Peacock is an Honorary Faculty member at the Wellcome Trust Sanger Institute, an honorary Senior Research Fellow at the University of Cambridge and Professor of Microbiology at the London School of Hygiene & Tropical Medicine and University College Hospital. She has a long-term interest in bacterial infection, including antimicrobial resistance. Her group are currently using whole genome sequencing to study the reservoirs from which humans may acquire antibiotic-susceptible and drug-resistant bacteria.

Related publication:

Teemu Kallonen et al. (2017) Systematic longitudinal survey of invasive Escherichia coli in England demonstrates a stable population structure only transiently disturbed by the emergence of ST131. Genome Research. DOI 10.1101/gr.216606.116

Sanger Science

Malaria parasites: more susceptible to a knock-out blow than we thought?

DATE: 13/7/17
By: Theo Sanderson and Ellen Bushell


Profiling the effects of large-scale gene knock-outs reveals different evolutionary forces acting on different parts of the malaria parasite. Credit: Bushell, Gomes, Sanderson et al. (2017).

How does the malaria parasite work? Better answering that question will assist with developing drugs that destroy the parasite, and potentially with the design of vaccines. The PlasmoGEM project is working to shed light on the parasite’s biology by characterising the role of many of the five thousand genes that make up its genome.

One way to study what a gene does is to “knock it out”, deleting its DNA sequence, and to see whether anything changes in the parasite. Historically this has been a laborious process in malaria parasites, and so until now only a small proportion of parasite genes have been studied this way. We have been working to develop new technologies to speed up the process of knocking out parasite genes and measuring the resulting biological effect.

With these new technologies, we have now been able to test the majority of the parasite’s core genes, and this provides us with the first overall picture of the functionality of the malaria genome, which has been published today.

A general theme of many historic discoveries in malaria biology has been that the parasite often has several ways of carrying out crucial functions. This could make developing drugs against malaria difficult: if one pathway were blocked by a drug, the parasite might simply use another. In addition, there are genes which are only involved in certain parts of the parasite’s complex lifecycle. For example, some genes are needed only to carry it through the mosquito; others allow it to grow in the liver of the host.

We did all of our experiments in a single stage of the lifecycle, the period in the bloodstream that causes the symptoms of malaria, and so we expected that most genes we deleted would have no effect, either because they were needed in different stages or because alternative pathways could compensate for their loss. But what we found was a surprise. Two-thirds of the parasite genes we deleted either killed the parasite outright or significantly decreased its rate of growth.

This is a higher proportion of “essential” genes than has been observed in any other organism studied. It may be connected with the fact that as the parasite’s ancestor, which looked something like an alga, evolved to become a parasite, it drastically slimmed down its genome, shedding 6,000 genes. This may have been because living inside a host provided a more consistent environment than it had previously experienced in the outside world. This drastic genomic reduction seems to have left the parasite heavily dependent on most of its remaining genes, even at just a single stage of its lifecycle.

In our data we still see the redundant pathways that have long been known in malaria genetics, but these seem to be mostly limited to areas of close interaction between the host and the parasite, where evolutionary arms races lead to diversity and redundancy. In contrast, the remainder of the genes the parasite uses in the blood-stage appear extremely important for growth.

This is an exciting discovery not just as a quirk of evolutionary biology, but because it has important implications for drug development. Any of these essential proteins, if it could be targeted by a drug, would be expected to kill the parasite and cure a patient. The fact that there are many such genes offers some hope against a background of growing drug resistance.

The data for all of the genes we studied is now available in an online database to allow researchers to prioritise different parasite metabolic pathways for drug development. We hope that this resource advances the fight against malaria, by allowing researchers to quickly find out whether a gene they are interested in represents a redundant component that the parasite can do without, or if it plays a crucial role in allowing the bloodstream growth that causes the symptoms of malaria.

About the authors:

Dr Theo Sanderson is a Postdoctoral Fellow at the Wellcome Trust Sanger Institute, working in Dr Julian Rayner’s group, Human-parasite interactions in malaria, to create new genetic methodologies that allow malaria parasites to be functionally studied at large scale. His experimental work focuses on the mechanism by which the parasite invades the human red blood cell, and he additionally develops novel bioinformatic methods to analyse data from the PlasmoGEM project.

Dr Ellen Bushell is a Senior Staff Scientist at the Wellcome Trust Sanger Institute working in the Rodent Models of Malaria group of Dr Oliver Billker, whose work centers around generating scalable genetic tools and developing screens to functionally analyse the malaria parasite genome at scale. She is the project manager for the PlasmoGEM project.

Related publication:

Ellen Bushell, Ana Rita Gomes, Theo Sanderson, et al. (2017) Functional profiling of a Plasmodium genome shows a high incidence of essential genes in an intracellular parasite. Cell. DOI: 10.1016/j.cell.2017.06.030


‘Like sugar in milk’: Parsi populations from India and Pakistan

DATE: 29/06/17
By: Qasim Ayub


Image shows the location of Parsi samples from South Asia and an ADMIXTURE plot demonstrating that the Parsis from India and Pakistan are a homogenous population that are genetically closer to present day Iranians. Credit: Gyaneshwer Chaubey et al. (2017)

I have always been fascinated by the Parsi (or Parsee) population of the Indian sub-continent, whom I encountered for the first time when visiting the port city of Karachi in Pakistan as a child with my parents. My recollection was of a cultured, well educated, generous community with their own beliefs and customs and who served delicious mouth-watering food. When I grew up I began to admire their seminal contributions to education, medicine, commerce and the many excellent philanthropic charities they supported.

Who are the Parsis? They are a small ethnic group from South Asia and followers of one of the world’s earliest religions, Zoroastrianism, which flourished in pre-Islamic Persia (present day Iran). Legends record their arrival in Sanjan, off the coast of Gujarat in present day India, between the 7th and 10th centuries, where they were referred to as ‘Parsi’ (literally meaning ‘people from Paras’, the local term for Persia). Gyaneshwer Chaubey, from the Estonian Biocentre in Tartu, is working on contemporary Parsi samples from India and initiated the collaboration when he found out that I had genotyped Parsi samples from Pakistan. We decided to pool our resources and add results obtained from some ancient DNA extracted from human bones collected from the Sanjan dokhama (tower of silence), a site that the Indian archaeologists concluded was most likely one of the earliest Parsi burial sites in India. These ancient samples were analyzed under the supervision of Dr. Kumarasamy Thangaraj at the CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India. The study was recently published in the journal Genome Biology.

The analyses confirm that the Parsis from India and Pakistan are a homogenous population that is genetically closer to the ancient Neolithic Iranians, followed by present day Near Eastern, Iranian and Caucasian populations, than to local populations from the vicinity. The results also demonstrate that these migrants genetically admixed with the Indian population about 1,200 years ago, around the time of their reported arrival in the sub-continent as recorded in their legends. This study supports sex-specific admixture and prevailing female gene flow from South Asians to the Parsis, which was observed in earlier studies using male (Y chromosomal) and female (mitochondrial) specific markers. The mitochondrial DNA analysis of the bones recovered from the Parsi burial site indicated that this admixture occurred soon after their arrival.


World Zoroastrian Organisation – Senior Citizens Home, Navsari. Credit: Parzor Foundation, India

This genetic analysis supports the account of the Parsi arrival in India as recorded in the Qissa-e-Sanjan. It records that the local ruler sent a glass full of milk to the Parsi group seeking refuge to indicate that his kingdom was full to the brim and could not accept immigrants. The Zoroastrian priest responded by putting sugar into the full glass of milk to indicate that they would assimilate with the locals like “sugar in milk” and they did indeed do so.

About the authors: 

Dr Qasim Ayub has been working with the Human Evolution Team at the Wellcome Trust Sanger Institute since 2008. His research focuses on the analyses of DNA variation in human genomes in order to understand how they adapted to local diets, environments and pathogens as they established themselves in different parts of the world. Qasim continues to maintain his interest in human Y chromosomal variation and South Asian population genomics in health and disease.

Related publication:

Gyaneshwer Chaubey et al. (2017) “Like sugar in milk”: reconstructing the genetic history of the Parsi population. Genome Biology. DOI: 10.1186/s13059-017-1244-9


Tracking the movement of a deadly pathogen and biothreat agent

DATE: 30/01/17
By Sharon Peacock and Claire Chewapreecha


Burkholderia pseudomallei grown on agar. Credit: Patpong Rongkard

Melioidosis is a frequently fatal infectious disease caused by a bacterium (Burkholderia pseudomallei) found in soil in certain parts of the world. We have known about melioidosis for many years, but it’s only in the last 25 years that we have started to understand it better. So, what’s changed?

Melioidosis was considered to be something of a medical curiosity and a rare tropical disease for many years, but recent figures are putting the record straight. An estimated 165,000 people develop this infection every year, around 89,000 of whom will die as a result.

There is also a great deal of information being accumulated to help people understand if they are at risk, and how to prevent getting infected. Maps are being regularly updated on the global whereabouts of the organism, with new data often being generated when diagnostic microbiology laboratories start providing facilities to culture samples from people with fever for the first time. We now know that infection can result from bacteria being inoculated, ingested or inhaled, which implicates a wide range of human behaviour. This helps to shape guidelines on how to avoid getting a preventable disease.

The organism has also hit the headlines as a biothreat agent, meaning that the bacterium could be used by terrorists to infect people or contaminate the environment.

With the genomic revolution in full swing, we wanted to see if genome sequencing of a global collection of isolates would allow us to plot the organism’s travel history. This is important, given its patchy global distribution. Our findings are consistent with the modern-day organism originating from Australia. From there, it seems to have been introduced just once into Asia, but after that the story changes, with evidence for repeated spread between countries bordered by the Mekong river, and between Malaysia and Singapore.


Transatlantic slave trade routes and sampling locations. Credit: Nature Microbiology doi:10.1038/nmicrobiol.2016.263

From SE Asia, the organism was introduced into Africa and onwards to South America. The evolutionary clock of the bacterium can be used to estimate broad dates for these introductions. From this, we estimated that the organism was introduced into South America between 1682 and 1849, which overlaps with the height of the slave trade.

Another feature of the infection which has puzzled clinicians is that the way that people manifest infection (which organs are affected) can differ depending on where the bacterium was acquired, with a particular difference between Australia and SE Asia. For example, infection of the prostate and a particular type of brain involvement are well recognised in Australia, but not in SE Asia. An obvious explanation is that bacteria in the two regions have different gene sets in their genomes, which lead to different patterns of interactions with humans.

We tested this possibility by looking for gene patterns, and found numerous examples of genes or gene variants that differed between bacteria from Australia and SE Asia. This included genes that encode virulence factors, such as those that allow bacteria to adhere to cells, and genes that promote survival inside of cells. This catalogue of genes has been shared with the scientific community to explore and mine.

Our findings provide some reassurance that the bacterium does not spread over long distances very often, although measures to prevent the transport of material contaminated with the organism are just as important as ever. Next steps are to complete targeted experimental work to confirm which of the gene differences found here relate to particular disease types in melioidosis. Our goal is to understand how we can modulate the disease process to improve patient outcome.

About the authors:

Professor Sharon Peacock is an Honorary Faculty member at the Wellcome Trust Sanger Institute, an honorary Senior Research Fellow at the University of Cambridge and Professor of Microbiology at the London School of Hygiene and Tropical Medicine and University College Hospital. She has a long-term interest in melioidosis and the causative bacterium (Burkholderia pseudomallei) and has published around 150 papers on the topic. Her group are currently using a combination of sequence-based methods and other laboratory approaches to understand the mechanisms by which B. pseudomallei causes disease.

Dr Claire Chewapreecha is a Wellcome Trust Sir Henry Wellcome post-doctoral Fellow at the University of Cambridge, UK, and a lecturer at King Mongkut’s University of Technology Thonburi, Bangkok, Thailand.

Related publication:

Claire Chewapreecha et al. (2017) Global and regional dissemination and evolution of Burkholderia pseudomallei. Nature Microbiology. DOI: 10.1038/nmicrobiol.2016.263


Out of Africa: more data must give us definite answers, soon ….

Mr. Aubrey Lynch, elder from the Wongatha Aboriginal language group, who participated in the study. Credit: Preben Hjort, Magus Film

DATE: 26/09/16
By Yali Xue

On 21st September, three papers presenting genome sequences from diverse humans went online in Nature, with much subsequent coverage in the scientific and general press. How did it feel to be involved in them?

Working in a genomic institute in a team called ‘Human Evolution’, I naturally want genomic data from people from all over the world to investigate our evolutionary history. A few years ago, the 1000 Genomes Project started, providing for the first time whole genome sequences from all over the world. This was my dream, exactly what I wanted, and would surely allow us to answer all our questions! It was certainly a great project, yet ……. the sequences were not completely accurate (low coverage), and more importantly, some key populations were missing.

The basics of human evolutionary history are clear: we evolved in Africa. But now people are present all over the world. Surprisingly, some of the earliest evidence for human presence outside Africa is found on the other side of the world, in Australia. Understanding this early migration by studying the migrants’ living descendants should help us understand how humans changed from being an endangered African species to a worldwide one that endangers every other species.

I wished and dreamed of having an accurate dataset (high-coverage) from Aboriginal Australians and as many other worldwide populations as possible. With that we would surely be able to answer all our questions, including one of the biggest and most debated ones in the field: one exit or two exits from Africa 60,000 years ago or earlier?

Now we have that dataset; even beyond my dreams, we have three, all of which I was privileged to be involved in, and all published this week in the same issue of Nature. Three dedicated international teams worked for years with sample donors and got them interested, sequenced their DNA (the easy bit) and then sweated for ages analysing the sequences. So what was the answer?

All three studies agree that there was one main exit. But the word “main” is crucial here. One study talks about one exit, one says there was at least a 2% contribution in some populations from a second exit, and the third allows both of these possibilities. Despite such rich datasets, involvement of almost all the top scientists in the field, and the best available analytic methods, we cannot agree 100% on this question.

Human history is complicated, and we are trying to use modern population data to infer human history 60,000 years ago. So 787 high-quality DNA sequences from key populations are great, but ……. now I dream of ancient DNA sequences from 60,000 years ago and many years on either side, from all over the world. With those, we will surely be able to answer this question and lots of others …….

In China, we say that if you are satisfied with what you have, you will be happy. We are really happy with what we have achieved now, but in science, we would always like to be happier.

So I have to keep dreaming …

Yali Xue is a senior staff scientist in the Human Evolution group at The Wellcome Trust Sanger Institute. Her initial interest was in using variation on the Y chromosome to provide insights into aspects of human history and evolution. Now her work has extended to study patterns of variation throughout the entire human genome and to reveal further evolutionary insights, including medically-relevant ones.


Anna-Sapfo Malaspinas et al. (2016) A genomic history of Aboriginal Australia. Nature. DOI: 10.1038/nature18299 

Swapan Mallick et al. (2016) The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature. DOI: 10.1038/nature18964

Luca Pagani et al. (2016) Environmental challenges and complex migration events during the peopling of Eurasia. Nature. DOI: 10.1038/nature19792

Further links:

Human evolution group at the Sanger Institute:

The genetic history of Aboriginal Australians and Papuans: