Category: Sanger Science

Exploring Sanger’s groundbreaking research

25 Genomes, Sanger Life, Sanger Science

The Beast from the East? Vespa velutina

Words and pictures by: Alex Cagan
Date: 17.09.18

Prelude: Death from above

The Asian hornet - one of the 25 genomes being read by the Wellcome Sanger Institute

Today, you are a honeybee and today you are going to die.

You enjoyed a summer full of industry, dance and frenetic activity collecting nectar for the hive along with your 10,000 sisters. But all of that is about to change.

They came from the East.

One arrived this morning, unnoticed, a vanguard. After spotting your hive it flew back to its nest to recruit others. Now, a storm is gathering above you. Shadows skate overhead, silent portents of impending calamity.

This darkness is cast by Asian hornets, Vespa velutina. They are specialised honeybee hunters. Each one patrols its own small territory above the hive. Their ferocious mandibles and instincts are designed to cleave through the carapaces of you and your sisters. Together the hornets create a tightening net through which you cannot escape.

If you were an Asian honeybee the arrival of the first hornet would have triggered one of the most extraordinary defensive maneuvers in the natural world. You and your fellow workers would surround the hornet. Rapidly twitching your wing muscles to generate heat, you would have smothered it in a pulsating mass, cooking the hornet alive in a blaze of cooperative fury before it could call for reinforcements. Together, you might just have survived.

But you are not an Asian honeybee.

You’re a European honeybee and neither you nor your ancestors ever faced such a threat, until now. As such your genetic and behavioural make-up is bereft of the tools or tricks with which you might hope to defend yourselves. What little resistance you can put up is futile.

After their grisly work is done, the hornets take no honey. That is not what brought them here. It is the sinews of you and your sisters that they feast upon.

This morning the chambers of your hive were alive with the buzz of activity. This evening things are different. Drips of honey echo in silence.

25 Genomes: The Asian Hornet

I’m Alex Cagan, a post-doctoral researcher in genetics at the Wellcome Sanger Institute. The Institute turns 25 this year and to celebrate we are sequencing the genomes of 25 species found in the UK that have never had their genomes sequenced before. Over the course of the year I’ll be going ‘behind-the-scenes’ to chronicle this ambitious project in various ways. In doing so I hope to throw some light on the scientists, institutes and species involved in doing this kind of large-scale science and the reasons why it’s being done in the first place.

The Asian hornet is one of these 25 species being sequenced by the Wellcome Sanger Institute. It is one of five species that were decided upon by a public vote, in a head-to-head competition with other species. The case for sequencing the Asian hornet was championed by the Sumner lab, a eusocial insect research group based at University College London. They tell us that the Asian hornet is ‘a dangerous invasive species that poses a huge threat to bee populations in the UK and elsewhere in Europe’. Of all the 25 species being sequenced it’s also the newest arrival in the UK. The first confirmed sighting of a nest was in Gloucestershire in September 2016, with a second found in Devon in 2017.

For many, the mere mention of the word ‘hornet’ is enough to set the pulse racing. Even if we’ve never encountered one, the image of a swarm of angry wasps on steroids looms large in the imagination. But what of the reality? Are they truly winged nightmares or is this reputation undeserved? We already have hornets in the UK, so what makes Asian hornets so special? In search of answers I headed to London to meet with Dr Gavin Broad, Principal Curator in charge of insects at the Natural History Museum.

Natural History Museum, London

Behind the scenes

Visiting the NHM is always a joy. Encounters with wonders there as a child inspired me, along with countless others I’m sure, to become a biologist.

I’ve always had a yearning to go behind the scenes and glimpse the vast collections that have never been on display. I have read stories of the museum’s legendary collections, particularly in the excellent book ‘Dry Store Room No. 1’, by Richard Fortey. I imagine endless corridors stacked with shelves containing specimens from all branches of the tree of life, gathered by naturalists across the globe. The biological equivalent to Borges’s fictitious Library of Babel, with specimens that display the near infinite variety of forms produced in nature. A library of life.

Pigeons at the Natural History Museum

It turns out my imagination is not far off. After signing in as a guest I meet Dr Gavin Broad at his desk. It is covered with a combination of books and wasp specimens of different sizes, encased in tubes or glass-windowed wooden boxes. I’ve come to the right place, I think to myself.

We begin by heading straight to the museum’s collection of Asian hornets. Dr Broad leads me to a corridor lined with tall grey metal cabinets as far as the eye can see. Each one has a number and a brief description of the contents within. After a few seconds I’m already disorientated. We stop at one case that looks to me like all the others. This one is labelled 58 – Vespidae / Vespinae?. He spins a wheel attached to the cabinets and the stacks begin to move, making a central row of cabinets accessible. Opening one of the top doors reveals a stack of wooden shelves that immediately fills the air with the rich smell of timber.

Dr Broad and his collection of Asian hornets at the Natural History Museum, London

Dr Broad retrieves the top two shelves and lays them on a table. Framed in glass are rows upon rows of hornets. They are beautifully laid out. A single pin connects each hornet to a paper record written with impossibly small, impossibly perfect handwriting. It is evocative of a golden age of natural history. An age, it dawns on me as I stand surrounded by cabinets of carefully documented insects, that may never really have ended.

Asian hornet collection

We take the shelves to a nearby room to take a closer look. Peering at them through the glass the first thing I notice is that they’re smaller than I expected. I’d imagined something at least thumb sized. They’re still larger than common wasps, but not by much. They’re darker too, a deep velvety black covers most of their body, in brilliant contrast to the yellow and orange hues that highlight their face, abdomen and the tips of their legs.

The population of Asian hornet that has begun to colonise Europe, Vespa velutina nigrithorax, has a particularly dark appearance. Up close it becomes clear that they are also covered in a fine layer of fuzzy hair. I think it would be wrong of me to say this makes them look cute, but it certainly adds… character. As we spoke Dr Broad took a high-resolution photo using the museum’s special insect imaging setup. Take a look and decide for yourself:

Close up of Asian hornet's head

A brief history of hornets

As I peer at hornets Dr Broad provides me with the broad brush strokes of their evolutionary history. Hornets, it turns out, are the result of a series of key evolutionary innovations. Hymenoptera, the order of insects containing hornets (as well as wasps, bees and ants), emerged in the Triassic, over 200 million years ago. One of the keys to their success was the evolution of the ovipositor, a specialised organ for laying eggs. This appendage really opened doors for the hymenopterans, as they could now lay their eggs in new places. These places could be hard to access crevices, where eggs would be safer from predators. Or, perhaps most ingeniously, on and even inside food sources such as fruits, or, in the case of the devious parasitic wasps, other insects.

Fast forward millions of years and we arrive at the emergence of the subclade Aculeata. In these hymenoptera the ovipositor underwent a startling transformation. Instead of delivering eggs it became dedicated to delivering venom. The dreaded stinger was born. An organ used to create life metamorphosed into one that takes it. This sent the Aculeata down a very different evolutionary road. While their ancestors laid their eggs on the food and then effectively abandoned them, the Aculeata started to hunt and provision food for their offspring instead.

The next key development in the path leading to hornets was the emergence of eusocial societies. Eusociality is considered the highest level of organization among animal societies, defined by cooperative breeding, with breeding and non-breeding members dividing labour and working together. There are many startling parallels with our own societies here, yet they remain distinctly other and fascinatingly so. Eusociality appears to have arisen independently multiple times in the Aculeata, in the social wasps, bees and ants. It may be that eusociality was made possible by the previous innovation, the stinger, which provided these fledgling societies with a viable way to defend their nests from intruders.

And what nests they are.

Not only does the NHM collection include specimens of the hornets themselves, they also have rows of boxes containing nests of different species. Dr Broad opens one box, at least an arm span wide, that is filled with a near complete example of an Asian hornet nest. It’s truly an architectural wonder. Created by wood pulp chewed and mixed with hornet saliva, it is both incredibly light and unbelievably sophisticated. The nest would have originally hung from a tree, and indeed a few remnant branches are still embedded in it. In this nest large swathes of the rippled paper shell are missing, revealing the interior structure. Horizontal discs, inlaid with hexagons that housed the larvae, are suspended by vertical columns that connect the levels. All the more remarkably, each nest is created and then abandoned within a single year.

Asian hornet's nest

It’s amazing to think that insects are capable of building such structures. One wonders what lessons there might be here for our own architects, given what these hornets can achieve with only a little wood and a little spit. Indeed, the field of biodesign continues to draw inspiration from these invertebrate constructs. Though for all their beauty the fact that they are usually packed with thousands of hornets would make me hesitant to admire one in the wild, where they are more akin to a devil’s piñata.

Having seen the rather beautiful hornets and their astonishing nests I’m starting to wonder what the big deal is about the threat of Asian hornets colonising the UK. After all, we have native hornets in Europe already. What harm could another species possibly do?

Apparently quite a lot. It turns out that unlike our native hornet species, which tend to be generalist hunters of many different insects, the Asian hornet is more like a precision honeybee-seeking missile. Asian hornets will search for honeybee hives and once they find one they will engage in ‘hawking’ behavior. This involves hovering near the entrance to the beehive, killing bees as they fly to and from the nest or try to defend themselves. The hornet, covered in a tough carapace, is practically impervious to the attack, while its own powerful mandibles are quite capable of tearing the poor bees apart.

In Asia, the local honeybees have coevolved with these hornets for millennia. This has resulted in the honeybees developing more effective defence mechanisms, the most famous being the ‘bee ball’ formed by Japanese honeybees. While European honeybees have been observed forming these bee balls, they do not appear to be of a sufficient intensity to effectively kill attacking hornets.


So for now the Asian hornets in Europe have the upper hand, mounting raids on honeybees that do not yet have a way to defend themselves. European honeybees have been in a perilous state for many years, with colonies collapsing at alarming rates due to a variety of factors, such as heavy pesticide use, that remain poorly understood. They really deserve a break. The arrival of a new and voracious specialised predator is the last thing that they need. Scientists and farmers alike are concerned that if Asian hornets become well established throughout Europe this will have a devastating impact on agricultural productivity. Or, in the worst case scenario, it could be the last nail in the coffin for the European honeybee.

How did they get here?

While there have only been two confirmed nests in the UK, Asian hornets are already established in France and Spain. How did they get to Europe in the first place? The most likely scenario, Dr Broad tells me, is that a hibernating queen arrived as a stowaway in a crate. It turns out that, like us, social insects such as the Asian hornet are particularly well-suited to becoming international colonisers. Though the reasons for their success are perhaps quite different from our own. The queen hornet mates with males only once in her life, during an initial mating flight. For the rest of her life the queen will carry the sperm she received in a specialised organ. She will use this to fertilise all future eggs she lays (apparently this remarkable process of internal fertilisation on demand is so precise that on average less than two sperm are released per fertilisation). Therefore, it only takes a single queen to establish a viable population. A lone voyager who contains within her the seeds of an entire society, waiting to bloom.

Vespa velutina have a particularly wide geographic range in Asia, and this ability to tolerate a variety of climates likely contributes to their success as invaders. However, where exactly within Asia these invertebrate pilgrims came from remains a mystery. Here is where DNA could provide the answers. The DNA sequence of an individual is a rich source of information, but to understand it we first need a road map, a ‘reference genome’ from the same species that tells us the general shape and structure of the genome so that we can make sense of the pieces from any given individual. We hope that sequencing and assembling such a reference genome for the Asian hornet can shed light on this mystery, as well as many of their other secrets. What else can we hope to learn from sequencing the Asian hornet genome?

A genomic perspective

As well as helping us identify where exactly in Asia the hornets are likely to have come from the genome will enable us to design markers to help monitor their spread across Europe. This kind of information could be crucial to monitor which populations are spreading the fastest and which management strategies are proving most effective.

The Asian hornet genome can help us to understand the biology that underpins the amazing adaptations of this species. While this knowledge is fascinating in its own right, it could one day be used to help control their spread. For example, the Sumner lab proposes that if we can identify the chemical odorants and compounds that this species is sensitive to that information could be used to try and manipulate their behavior to limit their production.

Their website has more ideas on how the genome might be used, including the potential for biocontrol by gene-editing – and a list of great references if you want to get into the details.

So far, at least in the UK, vigilance has paid off. Both Asian hornet nests were quickly identified by the public and removed before they had a chance to spread. If you’d like to get involved there is an ‘Asian Hornet Watch’ app you can download (what a world!) to report sightings of Asian hornets, and a handy identification poster.


I leave the Natural History Museum filled with a sense of awe, both for the Asian hornet and the museum’s collections which make such intimate encounters with the natural world possible. The carefully preserved and recorded specimens are a priceless trove of information. Sequencing the genome will be the latest chapter in our efforts to catalogue and understand the hornet. A new piece in an old game.

Asian hornet from above

About the author:

Dr Alex Cagan is a postdoctoral fellow in Iñigo Martincorena’s research team at the Wellcome Sanger Institute, studying mutation and selection in healthy tissues and how this relates to cancer and ageing.



25 Genomes, Sanger Science

The golden eagle genome has landed

By: Kat Arney and Rob Ogden
Date: 03.09.18


The golden eagle genome has been sequenced as part of the Sanger Institute’s 25 Genomes Project

The golden eagle is undoubtedly one of the UK’s most iconic birds. With an impressive 2 metre wingspan and striking yellow feathered legs, it’s a stirring sight if you’re lucky enough to spot one soaring over the Scottish Highlands and Islands.

While golden eagles may not be critically endangered – the IUCN’s Red List of Threatened Species lists them as being of ‘least concern’ – their habitat is shrinking. Many of the already small populations scattered through Europe, Japan and other areas are continuing to decline, which is why today’s announcement of a new golden eagle genome is so important.

The eagle genome has been completed as part of our 25 genomes project, sequencing the genomes of 25 significant UK species ranging from pipistrelle bats and Eurasian otters to spiders, starfish and summer truffles. But while the announcement of a newly-sequenced species is undoubtedly exciting to fans of genomics, having a complete golden eagle genome is also a vital tool to help conservationists protect and manage these fabulous birds.

Genetics meets conservation

Conservation geneticist Dr Rob Ogden at the University of Edinburgh has been using simple DNA profiling and sequencing to monitor the genetic makeup of animal populations for at least 20 years, studying species as diverse as endangered gazelles, manta rays and (of course) golden eagles.

But, as Rob explains, while these tests can provide useful information about a population – such as how genetically diverse it is, and how individuals are related – they can only tell us so much.

“This basic information can help us when we come to make decisions about how to manage populations, but it’s based on looking at differences in small ‘snapshots’ of DNA scattered throughout the genome,” he says. “What we don’t really understand is what any of these genetic differences relate to in biological terms. If you keep a couple of populations separate for multiple generations, parts of their DNA will gradually drift apart, but we don’t know if they have any biological relevance at all.”

To draw an analogy with language, a simple alteration in spelling – for example, switching recognise to recognize – makes no difference to the meaning of the word. But more significant changes might alter the meaning of a word altogether, like changing ‘recognise’ to ‘organise’.

The simple DNA tools that Rob and his team have been using up until now can spot that individual ‘letters’ have changed, but they can’t identify the context of the biological ‘words’ in order to tell whether the difference is meaningful. And to read the ‘words’ in DNA (genes), you need to read the full genome.

The DNA that was used to create the new golden eagle genome came from a chick that was found dead in a Scottish nest during a raptor health study and was read using PacBio SMRT technology. Unlike other DNA reading methods, PacBio’s technique generates very long, high-quality stretches of sequence from which it’s easier to build a whole genome. This allowed the researchers to build what’s known as a ‘reference genome’, against which DNA from other golden eagles around the world can be compared.

Adapting to a changing world


The full genome sequence for the golden eagle will help conservation efforts

Having a high-quality full genome sequence for the golden eagle opens up a treasure trove of biological information that conservationists can use to manage species more effectively in the wild.

“Now we have the whole genome we can identify specific genes and work out what they do, so we can see whether a specific change is likely to affect what happens in a cell or in a whole animal,” Rob says.

“Golden eagles are spread around the world in lots of different habitats and climates – there are hot weather birds and cold ones, eagles in forests and others in the hills – so are their genes adapted to their local environment? Do the genetic differences we see relate to important differences in the physiology of the animals which are related to how they can best survive in that particular environment?”

This knowledge is vital for managing endangered populations effectively as the global climate changes. One conservation tool is land management – generating certain types of habitats that will encourage particular species. Another option is translocation, moving animals from one area to another or releasing captive animals back into the wild. But if those creatures are poorly adapted to the environment they’re being put into, then there’s a good chance they’ll fail to thrive.

As temperatures across Europe are expected to increase over the next century and habitats change, it’s unlikely that large species like eagles will be able to adapt fast enough to cope. Instead, the most likely solution is for populations to move north in search of cooler climes.

“We know that Mediterranean golden eagles are genetically quite different from the Scottish birds, so perhaps we might see a situation where eagles from warmer climates become better adapted to a changing habitat type in northern Europe than the existing population that’s there now,” Rob explains.

“But if it’s taken 10,000 years to evolve a particular trait, there’s no way that’s going to adapt to climate change in the next hundred years, so understanding how these locally adapted populations have come about is really important for predicting how we can manage species in the future.”

Taking flight


Golden eagle in flight. Image credit: Martin Mecnarowski, Wikimedia Commons.

The completion of the golden eagle genome as part of the 25 genomes project is only the first part of the story. The golden eagle has been selected as one of the species to go forward into the Genome 10K project, carrying out detailed analysis of DNA from around 10,000 vertebrate species. Researchers will be using a technique called optical mapping to get an even more detailed picture of how the eagle genome is organised and to make sure they haven’t missed any bits.

“The golden eagle has been promoted up to the Premier League in terms of the quality of genome that we are going to obtain for it in the future,” Rob says. “The genome we have now is more detailed than anything that has been done before with golden eagles by quite a long way, but the next step is to make it way better – the best of all wild bird species.”

Having been lucky enough to watch a pair take flight in the hills on the Scottish island of Skye, watching with rapt attention as they swooped and circled round each other in a charming courtship dance, it’s easy to argue that golden eagles themselves are better than a lot of other birds.

“They certainly are very cool!” laughs Rob. “They’re an iconic species in the UK – people recognise them and are proud of them, but that’s true in every culture where you find golden eagles. It’s something that helps with conservation education because people can really relate to these animals and support projects that focus on saving them. They’re fantastic birds to work on.”

About the authors:

Dr Kat Arney is a science writer, public speaker and broadcaster, and author of the popular genetics books Herding Hemingway’s Cats and How to Code a Human. 

Dr Rob Ogden is Head of Conservation Genetics at the University of Edinburgh and a scientific adviser to the South of Scotland Golden Eagle Project.




Human Cell Atlas, Sanger Science

A trusty guide for exploring the complexity of cells

By: Martin Hemberg and Vladimir Kiselev
Date: 14.05.18

Scmap can map individual cells from a query sample to cell types or individual cells in a reference. Previously identified cell types are coloured, unknown types are grey.

Ever since scientists first used a microscope to inspect cells, it has been recognized that they can be grouped into distinct cell-types based on their morphology. The difference between cell-types, both in terms of form and function can be striking, even though all somatic cells in an organism share the same DNA. The reason why cells may exhibit such striking differences can be attributed to the fact that each cell-type expresses only ~10,000 of the ~20,000 genes that are present in our genomes.

Traditionally, cell-types are defined based on morphology – shape. However, recent technological advances have made it possible to measure the level of all approximately 20,000 genes expressed in individual cells. The technology is known as single-cell RNA-seq (scRNA-seq) and it builds upon the powerful methods that were initially developed as part of the Human Genome Project.

To carry out a scRNA-seq experiment, the biological sample provided (e.g. some blood, a piece of skin or a biopsy from an organ) is dissociated and the cells are isolated individually. A set number of cells are then randomly selected to have their mRNA extracted and profiled. Using computational analysis methods, cells with similar profiles are grouped together, making it possible to identify cell-types based on which genes are expressed.

In the fall of 2016, the Human Cell Atlas (HCA), a hugely ambitious international project to “generate a comprehensive map of all 37 trillion cells in the human body” was launched. The HCA uses scRNA-seq to profile cells from the human body and one of the goals is to define cell-types based on mRNA profiles. Most likely, the first release of the HCA will contain more than 100 million cells that have been profiled using scRNA-seq.

One of the key challenges will be to make sure that the HCA reference can be queried in a way that supports the questions that are likely to be asked most frequently, such as comparing cells from a new sample to the reference. This could be important for example in a clinical setting, where a doctor would be able to compare a patient sample (e.g. from an unhealthy liver) to the reference. Such a query would allow the doctor to determine if there is a major imbalance in the composition of cells, or even if there are cells that have acquired a disease state (e.g. cancer) that is not present in healthy individuals.

To support such queries, we have developed a novel computational method called scmap, which takes a query and a reference scRNA-seq dataset as the input. For each cell in the query, scmap can identify both the cell-type and the individual cell from the reference that provides the best match, as in the Figure above.

Comparing scRNA-seq profiles is challenging, mainly for two reasons: the data is high-dimensional (approximately 20,000 genes) and it is noisy.

Scmap is based on a recently developed feature selection algorithm for scRNA-seq data from the Hemberg lab. The algorithm is able to identify the subset of genes that are most informative for clustering in an unsupervised manner, and it uses state-of-the-art machine learning methods to achieve high specificity and sensitivity. Moreover, scmap is very fast, which means that it can be used for real-time searches of very large references.

Another key feature is that scmap’s internal representation of the reference is greatly compressed which means that it can be run on an ordinary workstation. Finally, scmap is modular which means that a new dataset can be added to the reference without having to re-compute previously added datasets.
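
The idea of projecting query cells onto a compressed reference can be illustrated with a small sketch. This is not scmap's actual implementation (scmap itself is distributed as an R package); it is a minimal Python illustration that assumes each reference cell type is summarised by a single averaged profile and that similarity is measured with cosine similarity. The gene values, the cell type names, the function names and the 0.7 cut-off are all hypothetical choices made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length expression vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def centroids(reference):
    """Average the reference cells of each cell type into one profile per type.

    This is the 'compression' step: the reference shrinks from one vector
    per cell to one vector per cell type.
    """
    out = {}
    for cell_type, cells in reference.items():
        n = len(cells)
        out[cell_type] = [sum(col) / n for col in zip(*cells)]
    return out

def project(query_cell, type_centroids, threshold=0.7):
    """Return the best-matching cell type, or 'unassigned' below the threshold."""
    best_type, best_sim = max(
        ((t, cosine(query_cell, c)) for t, c in type_centroids.items()),
        key=lambda pair: pair[1],
    )
    return best_type if best_sim >= threshold else "unassigned"

# Toy reference: three hypothetical genes measured in two cell types.
reference = {
    "T cell": [[9.0, 1.0, 0.0], [8.0, 2.0, 1.0]],
    "hepatocyte": [[0.0, 1.0, 9.0], [1.0, 0.0, 8.0]],
}
type_centroids = centroids(reference)

# A query cell that resembles the T cell profile.
print(project([7.0, 1.0, 0.5], type_centroids))
# A query cell that resembles neither reference type.
print(project([3.0, 9.0, 3.0], type_centroids))
```

In this sketch, adding a new reference dataset only means computing centroids for its cell types and merging them into the dictionary, which is one simple way to picture the modularity described above. Query cells that match nothing well are left unassigned rather than forced into the nearest type.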

Even though the HCA is years from completion, there are already large collections of scRNA-seq datasets available. In addition to the HCA, researchers are also building cell atlases for many of the model organisms that are widely used in biomedical research. The most impressive results to date are two large collections of reference data for the mouse. Researchers have already used scmap to compare the two mouse datasets and so evaluate the different methodologies for collecting the data, providing an excellent demonstration of how scmap can help analyse large datasets.

Since scmap carries out a simple yet fundamental operation –  comparison of cells from two datasets – we anticipate that it will become an integral part of many scRNA-seq analysis pipelines, and that other, more complex tasks will come to rely on it. In particular, we believe that the speed and compression afforded by scmap will ensure that the HCA becomes an accessible and easy to use reference for the community.

About the authors:

Dr Martin Hemberg is a Group Leader at the Wellcome Sanger Institute, interested in quantitative models of gene expression.

Dr Vladimir Kiselev is currently the Head of the Cellular Genetics Informatics group at the Wellcome Sanger Institute and used to be a postdoctoral researcher in Dr Martin Hemberg’s group.

Related publication:
Kiselev VY, Yiu A and Hemberg M. (2018). Scmap – projection of single-cell RNA-seq data across datasets. Nature Methods. DOI: 10.1038/nmeth.4644



Human Cell Atlas, Sanger Science

New computational method reveals where genes are expressed

By: Valentine Svensson
Date: 06.04.18

SpatialDE automatically identifies sub-structures (middle), and links these to genes that depend on spatial location (right) in mouse olfactory bulb data from Stahl et al 2016.

In the body, cells are often considered the atomic fundamental units. In a similar way to how atoms are structurally joined to form molecules, cells form tissues. The organization of these tissues lets different cell types work together, to enable organs in the body to perform their functions. These structures have been studied and catalogued for hundreds of years in the field of histology, using microscopes.

During the 20th century, molecular techniques enabled researchers to investigate how different genes and proteins are used in different parts of tissues, to understand how cell types collaborate in tissues. Large scale projects such as the Protein Atlas or the Allen Brain Atlas have been systematically performing molecular measurements of individual genes and proteins in tissues.

In the last decade, tremendous advancements in the scale and cost effectiveness of molecular measurements have been made. This has led to the analysis of single cell gene expression (i.e. which genes are switched on in a cell). This lets researchers define cell types from molecular data. Similarly, spatially defined molecular measurements of gene expression can now be made on thousands of genes at single cell resolution. Projects that would previously have taken hundreds of people and long time schedules can now be done by individual labs, meaning more types of tissues in more conditions can be investigated.

The most powerful new high throughput methods generate measurements of expression levels for tens of thousands of genes. At this scale just looking at all the genes will not be possible. Typically these sorts of data have been analysed by only looking at a handful of known marker genes.

We have now developed a method that tells us if there is a relationship between genes expressed in cells, and where those cells are located.

Our SpatialDE method filters and sorts all the genes according to how certain we are that cell location matters for the expression levels. In the main data we analysed for our paper, out of close to 12,000 genes measured only 67 genes were filtered as “spatial”. By focusing on this shortlist of genes, researchers can quickly discover genes previously unknown to be related to tissue structure.
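
To make the idea of ranking genes by spatial dependence concrete, here is a small Python sketch. It deliberately swaps in a much simpler technique than the one in our paper: a permutation test on Moran's I, a classical spatial autocorrelation statistic. The grid of measurement spots, the two gene names, the Gaussian weighting and its length scale are all assumptions made for illustration. Note that this one-sided test only detects smooth patterns, where nearby spots have similar expression.

```python
import math
import random

def gaussian_weights(coords, length_scale=1.0):
    """Pairwise weights that decay with distance; a spot has no weight with itself."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
                w[i][j] = math.exp(-d2 / (2 * length_scale ** 2))
    return w

def morans_i(values, weights):
    """Moran's I: positive when nearby spots have similar expression."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return 0.0
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    total_w = sum(map(sum, weights))
    return (n / total_w) * (num / denom)

def permutation_p_value(values, weights, n_perm=200, seed=0):
    """Fraction of random spot shufflings with Moran's I at least as large as observed."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical 5 x 5 grid of spots and two made-up genes.
coords = [(x, y) for x in range(5) for y in range(5)]
expression = {
    "gradient_gene": [float(x) for x, y in coords],           # smooth trend across the tissue
    "alternating_gene": [float((x + y) % 2) for x, y in coords],  # no smooth trend
}

weights = gaussian_weights(coords)
p_values = {g: permutation_p_value(v, weights) for g, v in expression.items()}

# Genes sorted from most to least confidently spatial.
ranked = sorted(p_values, key=p_values.get)
for gene in ranked:
    print(gene, round(p_values[gene], 3))
```

Sorting by p-value and keeping only the genes below a chosen cut-off gives a shortlist in the same spirit as the 67 “spatial” genes mentioned above, although the statistical model used here is far cruder.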

Tissues are often divided into sub-structures, based on visual appearance or on the expression of particular proteins that indicate a specific function of that sub-structure. The brain, for example, has different layers, as does skin; the thymus, on the other hand, consists of connected lobules with medullas inside.

The sub-structures are defined by different cell type compositions. For cells to have major functional differences they need to express many genes together that are specific to the function, which will be reflected on a whole tissue level. We created a second method which uses this property to automatically define tissue substructures. In one go, researchers obtain the genes defining the regions, as well as labels for the regions themselves.

This allows researchers to zoom into the structures of the tissue. The markers allow design of downstream functional experiments to investigate which genes cause the structure and which are a consequence of the structure. The spatial labels then allow researchers to investigate the interaction between structures, the development of the structures, and how the tissue performs its function.

Relating cell types to their spatial structure and organization in tissues is a major component of the ongoing Human Cell Atlas project. But the technologies for spatial gene expression measurements are also feasible for individual labs that want to study their tissue of interest on a genomic level. With our methods, researchers can answer questions about the relation between genes and tissue structure that could not be answered before, as we demonstrate in our paper.

In the long term, genomic and quantitative spatial gene expression measurements, captured and analysed by methods such as SpatialDE, may form the basis of histology and pathology in the clinic. This would allow this area of medical diagnostics to become even more powerful and personalized.

About the author:
Dr Valentine Svensson was an EMBL PhD student supervised by Sarah Teichmann at the Wellcome Sanger Institute, collaborating with Oliver Stegle at the EMBL-EBI when this work was done. He is now a postdoctoral scholar in the Division of Biology and Biological Engineering at Caltech, working with Lior Pachter on statistics for omics-based cell biology.

Related publication:
Valentine Svensson, Sarah A Teichmann and Oliver Stegle. (2018). SpatialDE: identification of spatially variable genes. Nature Methods. DOI: 10.1038/nmeth.4636


Sanger Science

Sequencing a superbug: How typhoid became extensively drug-resistant

By: Gordon Dougan and Elizabeth Klemm, Wellcome Sanger Institute and the Department of Medicine, University of Cambridge
Date: 20.02.18

Reprinted from the Take on Typhoid website.

The bacterium that causes typhoid fever, Salmonella Typhi, is a smart one.

I know this because our laboratory has been sequencing the DNA of S. Typhi strains that infect people around the world, and we have found evidence for an accelerating evolution of resistance to antibiotics.

After antibiotics were first introduced to treat typhoid in the 1940s, typhoid’s mortality rate plummeted from around 26 percent to just 1 percent. But within 20 years the first cases of typhoid resistant to chloramphenicol, one of the three first-line treatments for typhoid, appeared, signaling a battle between antibiotic and bacteria. Typhoid strains resistant to all three first-line treatments, known as multidrug-resistant (MDR) typhoid strains, were quick to follow those resistant to only one antibiotic. And when doctors began using second-line antibiotics (more modern but expensive versions) such as fluoroquinolones, typhoid followed with resistance against those drugs, too.

A particularly aggressive strain (actually a genetic clone) of MDR typhoid, H58, first emerged in the 1990s. This H58 strain has grabbed our attention because, while other MDR typhoid strains have mostly remained in the local area where they first appeared, H58 has quickly spread across the globe. Currently, the majority of all global MDR typhoid strains can be classified as H58. It’s a quick learner that is able not only to evolve more easily, but also to multiply and spread more rapidly than other typhoid strains.

The global prevalence of H58 typhoid strains, 2017

Recently, the world saw yet another evolution of the H58 strain. In November 2016, doctors in Sindh, Pakistan, observed cases of a novel H58 S. Typhi strain that was resistant to not only the three first-line antibiotics and fluoroquinolones, but also a third-generation cephalosporin called ceftriaxone. This new strain is classified as extensively drug-resistant (XDR) typhoid. It is only susceptible to a limited number of antibiotics, which can be expensive and difficult to access, especially for low- and middle-income countries.

In an effort to learn more about this new XDR typhoid, our team, working closely with outstanding colleagues in Pakistan, quickly went to work to sequence its DNA, research that was recently published in mBio. We found three concerning issues. First, we found that S. Typhi has the ability to transform from MDR to XDR in a single step. By acquiring just one highly mobile DNA molecule (plasmid) from another bacterium such as E. coli, MDR H58 typhoid in any location can potentially become XDR typhoid.

Second, we found that the new XDR strain is an end product of a global chain of antibiotic resistant bacteria. The plasmid that created XDR typhoid is present in a variety of diverse geographic settings across the globe, and once created, XDR typhoid rapidly reproduces itself. This is a concerning development because previous reports of XDR typhoid have been sporadic and isolated, while this particular strain has already caused large-scale outbreaks and is spreading within and outside Pakistan. It has already been carried as far as the United Kingdom.

Finally, our findings confirm the fact that the antibiotic arsenal for typhoid treatment is fading. We can no longer rely on antibiotics to treat typhoid fever. We need to shift our paradigm away from treatment and toward prevention.

Fortunately, we now have a promising new preventative tool. Typhoid conjugate vaccines are a newly WHO-prequalified class of typhoid vaccines that, compared to older typhoid vaccines, are longer-lasting, require fewer doses, and can be given to children as young as 6 months of age. Because they can be given to young children, countries can include typhoid conjugate vaccines in routine immunization programs, developing widespread immunity to typhoid and stopping dangerous strains like H58 from spreading and evolving. When implemented alongside improvements in water, sanitation, and hygiene, these vaccines can have the power to take on typhoid for good.

Typhoid may be smart, but we know how to outsmart it. We just have to act now.

This blog is reposted from the Take on Typhoid website.

About the Author:
Professor Gordon Dougan is a Group Leader at the Wellcome Sanger Institute and University of Cambridge Department of Medicine.

Elizabeth Klemm is a postdoctoral researcher in Gordon Dougan’s research group at the Wellcome Sanger Institute.

Related publication:
Elizabeth Klemm et al. (2018) Emergence of an extensively drug-resistant Salmonella enterica Serovar Typhi clone harboring a promiscuous plasmid encoding resistance to Fluoroquinolones and third-generation Cephalosporins. mBio. DOI: 10.1128/mBio.00105-18

About spiders (specifically the Fen Raft spider, Dolomedes plantarius) and where to get them from.
25 Genomes, Sanger Science

Getting a hold of samples… [part 2]

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 16/02/2018

So far I’ve talked about the Golden Eagle and Red Squirrel, also known by the moniker “charismatic megafauna”, which is a fantastic description of large cute/interesting things that I first heard from Mark Blaxter.

So, I mentioned that some of the species are quite challenging to get, but there are also some that are easy to sample (listed here along with who provided them; thanks go to them):

  • Himalayan Balsam – Lisa Outhwaite, found on the Genome Campus
  • Oxford Ragwort – Lisa Outhwaite, found on the Genome Campus
  • Summer Truffle – from Dr Paul Thomas, commercial source (the exact location is confidential though)
  • Common Starfish – from Prof Maurice Elphick, keeps a tank full for other ongoing work
  • King Scallop – Dr Susanne Williams, bought from a fishmongers!
  • Asian Hornet – Dr Seirian Sumner, already had a collection
  • Turtle Dove – Dr Jenny Dunn, had samples from previous work
  • Otter – Dr Frank Hailer, from routine health surveys
  • Roesel’s Bush-cricket – Dr Björn Beckmann, they’re quite abundant now so easy to find
  • Fen Raft Spider – Dr Helen Smith, ditch maintenance means they ‘pop up’ at the time
  • Robin – Dr Jenny Dunn, had samples from previous work
  • Grey Squirrel – Kat Fingland, has samples from ongoing work

Although these were easy to get that doesn’t mean there aren’t some quite interesting anecdotes associated with the sample collection.

Summer Truffles

Summer truffles, for example, are pretty valuable (circa £400 per kilogram), so the reason we don’t have the exact location is to prevent rival hunters (not sure whether you hunt for a truffle or forage?) from plundering the area.

King Scallop, Great Scallop, Coquilles Saint-Jacques

Also, imagine the confusion in the voice of the chap at the other end of the phone when I rang up the fishmonger and asked if they had a GPS location for the source of their scallops. Then think what the guy must have been thinking when I tried to explain why. Hopefully he got it, but I’m not so sure! This is why we need to reach out and explain science to the public more: there’s not a great deal of exposure to genomes/genetic research if it’s not human-related.

Turns out they don’t know exactly where they came from anyway; the scallops hail from the Shetland Isles – might have to do some genotyping to find out!

Roesel’s Bush-Cricket

Crickets, it turns out, are quite the eaters and, not wanting to limit their diet, they are, like us, omnivorous. Unlike us, however (at least nowadays), they do practice cannibalism (not sure how you ‘practice’, mind you; maybe start with just a lick?!). It seems they can lose legs quite easily this way: one named Oscar had a run-in with Hannibal in their container and lost two legs, while the third (Heather) had just lost a single one prior to arrival.

Fen Raft Spider

Did you know you need a special licence to collect Fen Raft spiders? This is because they’re red-listed like the Eagle but, thankfully for me, Helen has one. She has also raised many thousands of spiderlings in her kitchen!

Check out her website and, if you fancy a challenge, see how easy it is to spot them (this is, after all, the largest UK spider) in their habitat here.

Clearing a fen ditch - home to the Fen Raft Spider (one of the 25 Genomes we are sequencing)

Grey Squirrel

Grey Squirrels are regarded as a pest species. This means that it’s legal to hunt them without a special licence, provided that you don’t cause any unnecessary suffering. We are NOT, however, doing this for the project as it’s not the most ethical thing when people are already collecting them for other research.

Also, did you know that you can buy squirrel pie? Not had it myself but could be tasty…

About the author:

Dan Mead is the 25th Anniversary Sequencing Project Coordinator, for the 25 Genomes Project for the Wellcome Sanger Institute, Cambridge.

More on the 25 Genomes Project:

25 Genomes Project web page 

Getting a hold of some samples… for the 25 Genomes Project
25 Genomes, Sanger Life, Sanger Science

Getting a hold of some samples…

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 29/01/2018

[Because gathering samples is proving to be quite a major task, I’m going to split this across several posts]

First things first – find a sample

The first, and often most difficult, part of getting a sample for the 25 genomes project is finding out where from.

There are a number of reasons for this but it essentially boils down to the fact that the Sanger Institute has always focused¹ on human health and disease so we don’t have a particularly great list of contacts for this project.

¹There have been some dalliances with other areas in the past, notably: Cod, Coelacanth (it’s a fish known as a ‘living fossil’, although I prefer something that implies it’s been a long-term success like ‘Pan-eon species’, a description I may have made up), Tasmanian Devil cancer, Tomato and a butterfly.

The ones that are most difficult to get are the ones that the steering group decided upon independently. This is because, without a scientist/researcher/expert putting forward the species, there isn’t anywhere to start from.

This is where working in science has a great advantage: collaboration. In the fields of Agricultural, Plant & Animal and Environment/Ecology sciences, half of all articles were written by multiple institutions by 2009² and, if the trend has continued, it should be over 60% by now.

²Gazni, A., Sugimoto, C. R. and Didegah, F. (2012), Mapping world scientific collaboration: Authors, institutions, and countries. J. Am. Soc. Inf. Sci., 63: 323–335. doi: 10.1002/asi.21688

This is one reason why we need to collaborate more, and it will be the subject of a later post.

How traditional biologists and computer biologists work together. #CartoonYourScience by @redpen/blackpen

(for more like this check out the wonderful @redpenblackpen)

In practice this should mean that we scientists are a helpful bunch, and it turns out this is true. Whereas cold-calling/emailing people about the ‘accident you’ve been recently involved in’ or ‘the security breach on your Microsoft device’ is extremely annoying [pro-tip: pass the phone to your pre-school child if this happens, the results are normally quite amusing], doing the same to a scientist to offer them free sequencing of their species of interest is generally quite warmly received!

Getting a Golden Eagle(‘s DNA)

So let’s have a closer look at some of the species, firstly the Golden Eagle.

I would have thought that this would be a tricky one – they’re protected by a bunch of laws/regulations which means that without special licences you can’t mess with them. In fact even the locations of the nests are a closely guarded secret as they are still being illegally killed or the eggs are taken by collectors.

Turns out that a quick google and one email can lead to a great result, although it’s tinged with a bit of sadness which I’ll get to in a bit. I initially contacted Professor Anna Meredith at Edinburgh University with a general ‘can you help me with blah, blah, blah’ as she works with a number of species we were interested in (in this case I was actually after Red Squirrels) and she forwarded this on to Dr. Rob Ogden, also at Edinburgh.

As it turns out he is already working on Golden Eagles and was planning on doing some sequencing with some collaborators in Japan (they have eagles there too). Even better he had samples already from (here’s the sad bit) chicks that had died in the nest (plus one found rather suspiciously in a long abandoned nest).

So, one sample down, 24 to go!

[By the way I’m not going to go into the logistics and ENORMOUS cost of shipping things on dry ice, just assume that things arrive magically, but I may expand on why they need shipping this way some other time.]

Something squirreled away

Anna couldn’t help out with the Red Squirrel however, so I asked the National Trust who maintain a lot of the areas where these cute little critters still live:

UK Squirrel Distribution Maps, 1945 and 2010. Image Credit: Craig Shuttleworth, RSST

A nice lady called Laura put me in touch with the Head of Conservation (David Bullock), who in turn linked me to Andrew Brockbank at Formby Point, who then led me to Kat Fingland (Nottingham Trent University) and Rachel Cripps (Red Squirrel Officer). All this took about a month and a bit, but I finally had the right people. Thankfully we didn’t need any extra licensing to get some samples as they were already collecting from animals that had died from natural or accidental causes.

2 down, 23 to go!

Ethical and responsible sampling

It’s worth mentioning at this point that for this project we want to limit the impact of our sampling as much as possible and therefore have had it approved by our AWERB (Animal Welfare and Ethical Review Body). What this means is that wherever possible we do not kill any animals solely for the project, although in practice this is easier said than done and it does create some difficulties.

  1. For some animals this is not a problem as they are large enough that we can take a small amount of blood (less than 1ml) but others are too small for this to be possible (pipistrelle bats for example weigh around 5g and have only 0.5ml blood in total). This means that we need to get hold of whole animals AND as some of our species are protected (Golden Eagle, Red Squirrel etc.) they need to have already passed away for us to be able to use them.
  2. Another related issue is that the protected species need special licences to take blood samples from even if they are large enough for this to be possible. Given the amount of time for the project it’s not really an option, so again we need naturally passed on animals.
  3. The nature of the sequencing technology we’re using means that we need to get really long bits of DNA (upwards of 150,000 base pairs – that’s the A-T/G-C parts of DNA). The problem is that when we use animals that have died of natural causes we need to find and sample them really quickly: as soon as the animal dies the DNA begins to break up through the natural decomposition process.
  4. The really small critters (invertebrates like the Roesel’s Cricket for example) are next to impossible to find when they’ve died, as they tend to be eaten by other things and are hard to spot unless they move. In these cases we have no choice but to take live creatures and euthanise them as humanely as possible.
  5. Plants and fungi are somewhere in the middle, we need quite a lot of material (DNA extraction is more difficult), but ethically it’s acceptable to take bigger samples, so in these cases we take cuttings or fruiting bodies.

So that’s it for this one, more on sample collection to come…

About the author:

Dan Mead is the 25th Anniversary Sequencing Project Coordinator, for the 25 Genomes Project for the Wellcome Sanger Institute, Cambridge.

More on the 25 Genomes Project:

25 Genomes Project web page 

On choosing the 25 species for our 25 Genomes project
25 Genomes, Sanger Science

On choosing the 25 species for the 25 Genomes Project

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 08/01/2018

For those that don’t know (and until recently I could include myself in this group) there are A LOT of species on and in the earth. Currently it’s estimated that there are 2 billion (2,000,000,000)! Most of these are bacteria, and we’re not looking at those for the 25 Genomes project, but this still leaves about 450 million to choose from.

To make it easier for ourselves, we also decided to limit ourselves only to the 1.5 MILLION species that have currently been described and catalogued. And, to help us along a bit more, we decided that only species found in the UK would count. According to the National Biodiversity Network, that brings the number down to ‘only’ 56,674. Now if you choose to only look at the local area surrounding the Sanger Institute then it’s a much more manageable 318.

However, it wasn’t going to be that easy. In the spirit of the Sanger’s inclusive approach to science, the Steering Group for the 25 Genomes project were concerned that such a narrow list was ‘too parochial’ and directed that the species sequenced should be a representative group of organisms from the whole of the UK.

So, how do you filter more than 56,000 species down to just 25?

The first thing to do was to break down the problem, and the idea of a 5×5 matrix was mooted, discussed and agreed upon surprisingly quickly. Rather unsurprisingly, coming up with five different categories was not as straightforward as it might first appear. While some were no-brainers (iconic species for instance), getting all five nailed down was tricky.

The wisdom of crowds

So we put out a call for suggestions to the whole Wellcome Genome Campus, to draw on the collective wisdom of the more than 2000 people who work here.

The results were, by turns, pleasing, odd, not-at-all-answering-the-question and esoteric. Here are some examples:

  • Species for which Britain has major global richness and conservation responsibility
  • Female emancipation in the wild
  • Unusual in terms of genetic load accumulation rate and mechanism
  • The three-toed sloth (which is neither a theme nor from the UK)
  • 25 local authors (and then we would really have 25 ‘novel’ genomes)
  • Species imported to the UK, which are making our lives healthier and happier (possibly a politically motivated suggestion)
  • What is ‘down there’ (in the detritus level down on the Ocean floor).

Finding five themes

Armed with these suggestions, the 25 Genomes Steering Group got back together to hammer out the final five categories. Here’s what we decided upon, reasoning that these themes should give a good breadth of types of organism and habitats to sample:

5 Themes for the 25 Genomes Projects: Flourishing, Floundering, Cryptic, Iconic and Dangerous

Critical criteria

We also came up with a list of criteria that the species must meet:

  1. Scientific justification must be solid – are there good questions that can be answered by the genome sequence being made available?
  2. No decent draft sequence currently available
  3. Sample availability – some organisms are too small, others are too protected, while others are too seasonal for collection
  4. Tractable genome – some organisms have genomes that are incredibly complex and would take up too much time and resource. For example, many plants have cells that contain multiple copies of the same[ish] chromosomes, a phenomenon known as increased ploidy. (A hexaploid genome has SIX copies of each chromosome, and some plants have even more.)

Now comes the hard part: actually getting the list of species. As mentioned in a previous post, our public engagement team suggested that we let the public decide five of the species, leaving us just 20.

Great you might think, as it means we don’t need to do as much work, but you’d be sadly mistaken. The reality was that I now needed a list of 20 to start collecting right away AND another 40+ that the public could vote on to decide the final five!

It’s who you know…

Rather splendidly, we have a senior member of the Natural History Museum, London on our steering group, which meant we could exploit their contact list of some 400+ partner groups of wildlife experts. With this in mind I made a SurveyMonkey survey (it’s still about so you can check it out here; feel free to fill it in, you never know, we might want to do more!) that, in my mind at least, cunningly hid the criteria in the questions. It also deliberately did not mention the themes so as not to steer people in any particular direction.

From this I got 99 responses (again discussed earlier) that made up most of the public vote and the 20* for getting on with; these latter ones are in the table below:

Cryptic | Dangerous | Floundering | Flourishing | Iconic
Brown Trout | Indian Balsam | Red Squirrel | Grey Squirrel | Golden Eagle
Common Pipistrelle | King Scallop | Water Vole | Ringlet butterfly | Blackberry
Carrington’s Featherwort | New Zealand Flatworm | Turtle Dove | Roesel’s Bush-Cricket | European Robin
Summer Truffle | British Mosquito | Northern February Red Stonefly | Oxford Ragwort | Orange-tailed Mining-bee

All in all, this took about 5 months to get to this stage as the species also needed to be individually reviewed to see if they met the criteria and then approved by the steering group.

Now the only problem is actually getting the species’ DNA; so, collecting specimens and some lab work to follow, the supposed easy part…

More on this to come!

*Why we chose the above 20 species

Why sequence it?
Summer Truffle There is disagreement in the literature as to whether this truffle is one or two separate species, plus it grows underground and is therefore largely unseen and difficult to locate. Prices for those collected in the UK remaining relatively stable at around 400GBP per kilo. Known as mycorrhizal, these fungi form a symbiotic association with a host plant on which they are dependent throughout their lifecycle. The sequencing of UK T. aestivum syn. uncinatum populations would be pivotal in helping to answer questions of modes of reproduction, life cycle questions as well as aiding in some core speciation questions.
Brown Trout The Brown Trout has three isoforms that differ in their migratory patterns, one form remains in the locality of its birth where it will live out its life, spawn and die. The second type migrates from lakes to streams and rivers to spawn but remains in fresh water. The third form migrates to the sea/ocean and remains there for much of its life, only returning to spawn. There appears to be no genetic difference between these forms, also known as anadromous (migratory) and sympatric (resident). Additionally the Wellcome Genome Campus is built around an 18th century red brick hall, Hinton Hall, also known as Trout Hall, where a carved stone trout is prominently displayed over the main door to the croquet lawns.
Carrington’s Featherwort This is selected as a representative of the liverworts, an ancient plant group predating flowering plants. It is one of the characteristic liverworts of very high rainfall areas in Scotland, and thus a representative of one of the very special groups of the British biota confined to such high-rainfall areas. Outside Scotland, it is only found in Ireland (extremely rare), the Faeroes and the Himalayas. The Scottish plants are apparently all male – like the Ents, the sexes have become separated in this species and the nearest females are in the Himalayas.
Common Pipistrelle Until recently this bat was believed to be a single species however it is now know to be a dual species (common/soprano), with one other (Nathusius’) also being resident in the UK. Studying the genome will allow us to investigate the origins of the split between the two species, when and why it occurred.
Indian Balsam Highly invasive weed species that substantial effort to control is undertaken, control methods based on finding would have important implications for wetland and river management.
King Scallop Pecten maximus has been found to contain the Amnesic Shellfish Poisoning toxin, domoic acid, which accumulates after they consume algae/diatoms- especially in the event of algal blooms. This risk is regarded as a significant threat to both public health and the shellfish industry. Some studies have suggested that global warming is resulting in greater reproductive success for P. maximus in the UK, however concerns have been raised over increasing mortality, declining recruitment and spawning stock biomass in several Scottish populations. Pecten maximus is also of interest scientifically because of its unusual vision and because its two shell valves are coloured differently. Identifying molecular pathways for shell pigment production in Mollusca has lagged behind studies of vertebrates and terrestrial invertebrates, and is a major gap in our understanding of how colour has evolved in the natural world. Vision in Mollusca is also of great interest because of the many different eye morphologies and the fact that very few species are thought to see in colour.
New Zealand Flatworm New Zealand flatworms prey on earthworms, posing a potential threat to native earthworm populations. Further spread could have an impact on wildlife species dependent on earthworms (e.g. Badgers, Moles) and could have a localised deleterious effect on soil structure.
British Mosquito Mosquitos are an important disease vector and there has been speculation that an increase in the distribution of other species due to climate change could allow the re-introduction of diseases such as malaria to the UK.
Red Squirrel Sequencing the whole genome of the native red squirrel will hopefully provide new tools and resources into reversing their decline and aiding their long-term conservation in the UK. For example, this research could reveal key insights into how red squirrels have adapted to living in an urban environment. This study could also provide further information for managing the spread of diseases and helping to protect the red squirrel from the fatal squirrelpox virus, as well as to gain a deeper understanding into the impact of newly-discovered diseases
Northern February Red Stonefly These stonefly only inhabit the purest of waters and as such are very limited in their habitats and may struggle to adapt to climate change. Brachyptera putata is an endemic UK stonefly. There has been suggestions that other European Brachyptera species may be synonyms of B. putata. Sequencing would determine whether it is a true UK endemic.
Turtle Dove Turtle Dove numbers have fallen by a staggering 93% since 1970 and now resides on the Global Red List for Endangered Species. Smaller than its collared cousin, the Turtle Dove is now only found in eastern England, where farmers are working with the RSPB to create feeding habitats, the destruction of which are blamed for the bird’s decline.
Water Vole The Water vole is the UK’s fastest declining mammal and efforts to help the population maintain genetic fitness would benefit from having the genome sequenced. Arvicola is a fantastic example of a small mammal genus that survived through the last glaciation, and has adapted to a range of habitats across Europe and much of northern Asia.
Oxford Ragwort The Oxford Ragwort is representative of a species being introduced and excelling in another habitat. It was collected from the slopes of Mount Vesuvius sometime in the 17th Century, and planted in Oxford where it rapidly colonised the area due to its natural hardiness, and could grow on urban landscapes too (sides of buildings, on stairs, etc.). When railways were introduced to the UK landscape, this facilitated the spread of Oxford Ragwort across the UK (it can be found growing along railway tracks today). Sequencing the genome would better increase our understanding of a non – native species excelling in a new habitat and may expand on our understanding of the ecology of flowering plants.
Roesel’s Bush-cricket Once restricted to the south coast and estuaries (saltmarshes) it is now widespread, possibly due to climate change and the spreading of salt on UK roads.
Ringlet butterfly Despite an overall decline in butterflies over the last 50 years the ringlet has increased its population by nearly 400%. It’s one of the few to fly on overcast days and has an interesting dwarf form that appears at 600ft, increasing until 100% of the population is this form at 1000ft.
Grey Squirrel As the anti-hero for the red squirrel, investigating how/why the squirrelpox virus is tolerated
Blackberry Good opportunity for citizen science, population genomics specifically for schools engagement. Also commercial soft-fruit genetics as it is an important and expanding food crop.
Golden Eagle This is an iconic UK species that has suffered from hunting and pesticide poisoning in the past, leading to extinction in all parts of the UK except Scotland, where there are still fewer than 500 breeding pairs.
Orange-tailed Mining-bee This species is conspicuous and attractive, one of the mining bees that is more likely to have come to the attention of the general public. It is widespread and common throughout the United Kingdom, flying in spring. It is a component of natural pollination services, which can ensure crop pollination in the absence of honeybees, and also the pollination of many wild and garden flowering plants, ensuring their genetic diversity and conservation. Of the 276 species of bee in the UK, there is only one honey bee and a score of bumblebees; the great majority of native bees are mining bees, including 68 species of Andrena. The genome sequence itself will be useful for comparing the genome of this solitary bee with the available genomes of social bees, in terms of gene composition relevant to sociality.
European robin Robins use vision-based magneto-reception, and the mechanism is not fully understood; it has been shown that it may involve quantum entanglement. Robins are also extremely territorial, unlike most other song birds, with up to 10% of all deaths occurring due to fights.

About the author:

Dan Mead is the 25th Anniversary Sequencing Project Coordinator for the 25 Genomes Project at the Wellcome Sanger Institute, Cambridge.

More on the 25 Genomes Project:

25 Genomes Project web page 

I'm a Scientist, Get me out of here - 25 Genomes
25 Genomes, Sanger Science

We let the public decide five of our species….

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 20/12/2017

We recently wrapped up the ‘I’m a scientist, get me out of here’ public engagement event. This was a fantastic exercise aimed at getting the public, specifically school children, excited about sequencing genomes and science in general.

Here’s how ‘I’m a scientist, get me out of here’ worked – 25 Genomes style:

We divided the species into five themes, each of which had its own ‘zone’:

  • Flourishing (species on the up in the UK)
  • Floundering (endangered and declining species)
  • Cryptic (species that are out of sight or indistinguishable from others based on looks alone)
  • Iconic (quintessentially British species that we all recognise)
  • Dangerous (invasive and harmful species)

In each zone were between seven and nine candidate species that had been proposed via an online poll of scientists, wildlife experts and interested members of the public.

Close, but no cigar…

The poll to suggest candidate species for the public vote ran throughout September and into early November and we had a pretty good response. Most of the replies were pretty sensible, and quite a few had very detailed justifications by experts (one ran to nearly 5,000 words, complete with references). But some suggestions were rather left field.

In the very first section of our explanation of the purpose of the poll, we say: “…we are embarking on a brand new project to sequence a cross-sample of UK biodiversity.”

Bearing this in mind, I suspect some people weren’t that keen on reading or were just chancing their arm. Here are some of the more ‘exotic’ suggestions:

  • Resplendent Quetzal – a cool-looking bird, with a cool name. If you’ve not heard of it that’s because it lives in Central America (not the UK).
  • The “Hoff” crab (Kiwa tyleri) – so named because of its hairy chest, reminiscent of Baywatch actor David Hasselhoff. The species can be found in UK overseas territorial waters, but it’s not in the UK.
  • Fire Salamander – yet another cool name, and it looks pretty sweet too. Unfortunately only found in mainland Europe.

Fire Salamander – pretty, but not UK-based. Image Credit: William Warby, Wikimedia Commons

Some non-UK resident species suggestions were a little easier to spot:

  • Greenland Shark
  • Mongolian Gerbil
  • Madagascar Paradise Flycatcher
  • Asiatic black bear
  • Italian Mediterranean buffalo
  • Alpine grasshopper
  • Tasmanian Devil (funnily enough, this species has already been sequenced right here at the Sanger Institute.)
  • Antarctic Krill

Back to I’m a scientist, get me out of here – 25 Genomes

The idea for the zones was that each species would be represented by a ‘champion’ (or team thereof) and they would answer in the first person, to keep things more fun and relatable. It worked well:

Screenshot of I’m a Scientist, Get me out of here – 25 Genomes online chat

While the ‘I’m a scientist get me out of here – 25 Genomes’ event was running, anyone who logged on could vote for their favourite species, one vote per zone. When the vote was finished, the winning species from each zone was added to the 25 Genomes project.

Getting engaged with the students was the most successful way of winning. In all the zones, the species that were among the top two most active in the live chats and answered more questions on average had a much better chance of becoming the zone winner.

The winners!

The winning 5 species of the public vote for the 25 Genomes Project – Common Starfish, Asian Hornet, Eurasian Otter, Fen Raft Spider, Lesser Spotted Catshark

In all, around 5,000 people participated in the events and there were over 150,000 page views, which sounds pretty successful to me.

One final invaluable piece of information that I learned from this whole process is that the Latin name for Scotch Thistle (Onopordum acanthium) means “donkey fart thistle”. In ye olden times people used to think that donkeys fart a lot if they eat it*.

*from the iconic zone Q&A

About the author:

Dan Mead is the 25th Anniversary Sequencing Project Coordinator for the 25 Genomes Project at the Wellcome Sanger Institute, Cambridge.

More on the 25 Genomes Project:

25 Genomes Project web page 

Sanger Science

Big data for mosquito control

By: Alistair Miles
Date: 20.12.17


Collecting mosquitoes via pyrethrum spray catch in The Gambia. Credit: Beniamino Caputo

Recently, Erica McAlister from the Natural History Museum in London posted a beautiful image of a specimen of Anopheles gambiae – the mosquito species that contributes most to malaria transmission in Africa – from the museum’s collection. It turns out this is probably the original type specimen, collected in 1900 by a zoologist called John Samuel Budgett during an expedition to The Gambia. In one of history’s tragic ironies, Budgett died a few years later from malaria, contracted while on another expedition.

Despite more than a century of study, there is still so much we don’t know about this deadly mosquito species. Basic questions about life history and ecology, such as how do they survive the dry season, do they migrate, and if so, how far can they travel. Questions about evolutionary history, such as why have they diversified into a cryptic species complex. Practical questions, like how is insecticide resistance spreading, and what can we do about it.


A sample of female Anopheles gambiae mosquitoes. Credit: Martin Donnelly

If Anopheles gambiae did not transmit malaria, we could spend a very fulfilling academic career carefully unravelling the answers. Unfortunately we don’t have that luxury. The global campaign to eliminate malaria has made great strides over the last decade, but malaria remains a major disease burden in many parts of Africa and there is still a long way to go. And we are almost completely dependent on insecticide-based methods of mosquito control. We don’t yet know how badly insecticide resistance will impact on current control programmes, and there is a lot of debate and uncertainty about what should happen next. But most people agree that we cannot continue trusting blindly that the same insecticides we’ve been using for decades will continue to be effective.

One way to overcome uncertainty is to collect data. Lots of data. And that’s fundamentally what the Anopheles gambiae 1000 Genomes Project is about. In the first phase of the project, reported recently, we sequenced the genomes of 765 mosquitoes collected from field sites in 8 African countries. We then compared the sequences and discovered more than 52 million genetic variants. These data on genetic variation in natural mosquito populations can serve a range of purposes. For example, they can be used to study the evolution and spread of insecticide resistance, and inform the design of new mosquito control technologies based on gene drive. They also give us insights into the structure, size and history of mosquito populations, and provide evidence for substantial migration between populations. When we looked at some of the most rapidly evolving insecticide resistance genes, we found dramatic demonstrations of how mosquito populations across the continent are inter-connected, enabling resistance mutations to spread over thousands of kilometres.


Distribution of insecticide-treated bed-nets for malaria control. Credit: Martin Donnelly

Responding to insecticide resistance is a complex challenge, and there are no easy answers. Resources are finite, and need to be allocated wisely. If we want to bridge the gap between research and practice, we will need to collect more data to fill in the geographical gaps, and study how mosquito populations are changing over time. We also need to collect data from mosquito populations before, during and after specific control interventions, so we can measure the impact and learn which interventions are most (or least) effective. But the data we have collected so far demonstrate a clear path forward. By continuing to build a public resource of mosquito genome sequence data, and integrating with other data on ecology, malaria epidemiology and insecticide resistance phenotype, we hope to provide a source of much-needed intelligence to support the malaria elimination campaign in Africa.

This blog is reposted from Nature Ecology and Evolution – Behind the paper

About the Author:
Alistair Miles is Head of Epidemiological Informatics in the group of Dominic Kwiatkowski, at the University of Oxford, and the Wellcome Trust Sanger Institute.

Related publication:
The Anopheles gambiae 1000 Genomes Consortium. Genetic diversity of the African malaria vector Anopheles gambiae. (2017) Nature DOI: 10.1038/nature24995

Further links: