Tag: 25 Genomes

Snapshots from the Fungarium at The Royal Botanic Gardens, Kew. Image Credit: The Royal Botanic Gardens, Kew
25 Genomes

Darwin Tree of Life: focusing on fungi and probing plants

By: Alison Cranage, Science Writer at the Wellcome Sanger Institute
Date: 12.11.18

Botanic gardens - such as Darwin Tree of Life Project partner Kew Gardens - contain approx 1/3rd of all plant life on earth

Darwin Tree of Life Project partner, The Royal Botanic Gardens, Kew has an estimated 1.25 million dried fungi specimens

All our lives depend on plants and fungi. Genome sequencing is beginning to uncover their incredible diversity, yet only a tiny fraction of the millions of species which inhabit the planet have been analysed. The Earth BioGenome Project, which aims to sequence the DNA of all complex life, will be cataloguing and sequencing all plants and fungi, together with all animals and protozoa.

The Royal Botanic Gardens, Kew is home to the largest and most diverse collection of plants and fungi in the world. It is a key partner in the Darwin Tree of Life Project to sequence the genomes of all eukaryotic life in the UK, providing scientific expertise and extensive plant and fungi collections.

The genomes of all fungi

Fungal Facts


Baker’s yeast (Saccharomyces cerevisiae) was the first eukaryote to have its whole genome sequenced back in 1996. Since then, over 1,500 species of fungi have had their genome sequenced. There are about 140,000 species known to science, with an average of 2,000 new species described every year. The total number of fungi species is estimated to be between 2.3 and 3 million.

Genome sequences are already helping people capitalise on some of the unique properties of fungi. They are widely used in industry for the large-scale production of a diverse array of chemicals – from food to pharmaceuticals.

We spoke to Dr. Ester Gaya, Senior Mycology Researcher at Kew Gardens, about some of the untapped resources of fungi.

“Many antibiotics come from fungi. Researching fungal diversity could lead to the discovery of new sources of antibiotics and medicines.

“In industry the genomes of several fungal species are being studied because of their ability to produce ‘mycodiesel’. It may be we can produce fuel sustainably and on an industrial scale.

“There is also research into how we can use fungi for bioremediation.  Some species have the ability to consume and break down environmental pollutants – so they could be used to clean up oil spills, for example.”

Dr Ester Gaya, Research Leader in Comparative Fungal Biology at The Royal Botanic Gardens, Kew. Photographed in the organisation's historical archive. Image Credit: The Royal Botanic Gardens, Kew


Not all fungi are beneficial. Many species are harmful to humans. For example, Pneumocystis jirovecii causes a type of pneumonia and Candida albicans causes thrush. These, together with hundreds of other harmful species, have had their genomes sequenced – helping researchers design better treatments and surveillance systems.

Other fungal species target plants, including key food crops. Researchers are studying their genomes to understand how their pathogenicity works – understanding which genes are active will enable researchers to develop new ways to tackle them and improve crop yield.


Tasty genomes

Plants underpin all aspects of our everyday life – from the food we eat to the air we breathe. As with fungi, only a tiny fraction of plant species on the planet have had their genome sequence determined.

Most plant species with genomes sequenced to date are crops, including the major cereals – rice, wheat and maize, as well as fruits and vegetables. Commercially important crops that make our favourite drinks like coffee, grapes and hops have also had their genomes sequenced. Studying these genomes helps enhance yield, as well as shedding light on the mechanisms of taste and quality.

Studying the genomes of relatives of crop species is also important. These plants harbour important genetic diversity, often lost in the domesticated crops that dominate world agriculture.  75 per cent of the world’s food supply depends on just 12 species of plants. Their wild relatives harbour essential genetic diversity which can be used for breeding resilience to disease and to climate change.

Beyond food

Plants are a hugely diverse group of organisms, from trees with 5,000-year lifespans to unicellular green algae. Their uses are equally diverse, from medicines to biofuels and materials.

Plant sciences have a vital role in addressing some of the most critical global challenges, such as climate change and food security. Plant science can provide the fundamental research required to protect biodiversity, as well as mitigate and adapt to climate change. Whole genome sequence data will enable researchers to drive the understanding of plant development and evolution and their potential contribution to sustainable agriculture. And new, detailed insights from genome sequences may help us understand medicinally important compounds.


How modern life is built on fungi


There is no doubt the project comes with challenges. The quality and amount of material available for DNA sequencing will be an issue. This is particularly a problem for microscopic fungi, as many cannot be cultivated outside their natural habitat. This makes it difficult to gather enough material for DNA sequencing. Getting good quality DNA from historical plant and fungal samples, like those housed at Kew, may also be difficult – though it is an area that is rapidly improving.

There are an estimated 1,500 plant species native to the UK, with a total of 400,000 around the world. Nearly all UK species have been catalogued and have seeds stored by Kew. The project is likely to discover new species of both plants and fungi though.

“In fungi there are what we call ‘dark taxa’,” says Dr. Gaya. “They’re hidden to the naked eye. And before the advent of DNA sequencing we didn’t have the tools to discover them.”

Scientists at the University of Exeter discovered a whole new phylum of fungi in 2011 – it was a whole new branch, right at the base of the fungal tree of life. The microscopic fungi were found living in a pond on the University campus.

We have only just started to scratch the surface of these remarkable groups of organisms.


Dr. Gaya is particularly interested in the genomes of these lichenised fungi – whose orange pigment acts like a sunscreen, protecting them from UV damage and allowing them to grow in some of the driest places on Earth. Image Credit: The Royal Botanic Gardens, Kew

About the Author

Alison Cranage is the Science Writer at the Wellcome Sanger Institute


The Darwin Tree of Life Project and the Earth BioGenome Project are aiming to sequence all animals, birds, fish, insects and plants in the UK and on earth, respectively
25 Genomes, Sanger Science

Sequencing All Life On Earth – Facts and Figures

Today (1 November 2018) a number of research organisations and funders announced the official launch of the Earth BioGenome Project – which aims to read the genomes of every species of animal, bird, fish, fungus, insect and plant on the planet. To help in this endeavour, the Wellcome Sanger Institute announced its intention to collaborate with a number of UK organisations to run the Darwin Tree of Life Project to sequence the DNA of all such life in the UK.

Below are 10 top facts that help to put the work into perspective…

1. Let’s run the numbers

There are currently around 1.5 million catalogued eukaryote species on earth – that’s the known animals, plants, protozoa and fungi. But for a true total, estimates vary from 10-15 million species[1],[2]. There are an estimated 66,000 eukaryote species in the UK.


2. Ages of extinction – we’re up to 6…

The planet is in the sixth great age of extinction[3]. The Living Planet Index reported a 60 per cent decline in vertebrate populations since 1970[4]. By the year 2050, up to 50 per cent of all existing species may become extinct, mainly due to human activity[3].

3. It won’t be cheap, but it will cost less than the very first human genome

To sequence an average vertebrate-sized genome costs about US $1,000. To sequence the genomes of all 1.5 million known eukaryotes, plus up to 100,000 new eukaryotic species, will cost US $4.7 billion. This is less than the cost of creating the first draft human genome sequence (US $5 billion in today’s money). The timescales are comparable too – the first human genome took 13 years to sequence; scientists aim to sequence all eukaryotes on Earth in the next 10 years.
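As a rough sanity check on these figures, the headline budget can be divided by the number of species to give an implied average spend per species. The short Python sketch below does only that arithmetic; the variable and function names are illustrative, and spreading the budget evenly across species is a simplifying assumption rather than how the project budgets its work.

```python
# Figures quoted in the text above (US dollars and species counts).
KNOWN_EUKARYOTES = 1_500_000
NEW_SPECIES_UPPER = 100_000
TOTAL_BUDGET_USD = 4_700_000_000
FIRST_HUMAN_GENOME_USD = 5_000_000_000


def average_cost_per_species(budget_usd: int, species: int) -> float:
    """Average spend per species if the budget were spread evenly (an assumption)."""
    return budget_usd / species


total_species = KNOWN_EUKARYOTES + NEW_SPECIES_UPPER
per_species = average_cost_per_species(TOTAL_BUDGET_USD, total_species)
share_of_human_genome = TOTAL_BUDGET_USD / FIRST_HUMAN_GENOME_USD

print(f"Species to sequence: {total_species:,}")
print(f"Implied average cost per species: ${per_species:,.0f}")
print(f"Budget as a share of the first human genome: {share_of_human_genome:.0%}")
```

The implied average comes out at just under US $3,000 per species, which is higher than the US $1,000 quoted for sequencing one vertebrate-sized genome.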

4. Beetle mania

There are believed to be approximately 1-1.5 million different species of beetles

There are 400,000 identified species of beetles (Coleoptera) in 30,000 genera across 176 families. This represents about 25 per cent of all classified eukaryotic life. There are a predicted 1.5 million beetle species inhabiting the planet[5].

There is a story, possibly apocryphal, of the distinguished British biologist, J.B.S. Haldane, who found himself in the company of a group of theologians. On being asked what one could conclude as to the nature of the Creator from a study of his creation, Haldane is said to have answered, “An inordinate fondness for beetles.”[6]

5. There’s a long way to go…

There are fewer than 3,500 eukaryotic species with sequenced genomes. This represents only about 0.2 per cent of known eukaryotes.


6. Botanical gardens of the world unite

Botanic gardens - such as Darwin Tree of Life Project partner Kew Gardens - contain approx 1/3rd of all plant life on earth


The collections of the botanical gardens of the world contain about a third of all species of plants, and more than 40 per cent of all endangered plant species[7].

7. It’ll take more than a few USB sticks

The Wellcome Sanger Institute has the largest biosciences data centre in Europe, capable of storing and processing genomes of all sizes and complexities


Storage and distribution of reference genomes and analyses will likely require less than 10 gigabytes per species, or about 20 petabytes in total, well within current capabilities.[8] Storage of the underlying sequence read data for the completed Earth BioGenome Project is estimated to be approximately 200 petabytes. Total project information is likely to exceed an exabyte of data.
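The storage figures can be checked with a few lines of arithmetic. The Python sketch below multiplies the 10 gigabyte per-species ceiling by a species count and converts the result to petabytes, using decimal units as in footnote [8]. The two species counts are assumptions chosen for illustration, since the text quotes only the totals, and the function name is hypothetical.

```python
# Decimal (SI) units, matching footnote [8]: one petabyte is 10**15 bytes.
GB = 10**9
PB = 10**15
EB = 10**18


def reference_storage_pb(species: int, gb_per_species: float = 10.0) -> float:
    """Upper-bound storage for reference genomes and analyses, in petabytes."""
    return species * gb_per_species * GB / PB


# Assumed species counts, for illustration only.
for species in (1_500_000, 2_000_000):
    print(f"{species:,} species -> {reference_storage_pb(species):.0f} PB")

READ_DATA_PB = 200  # read-data estimate quoted in the text
ratio = READ_DATA_PB / reference_storage_pb(2_000_000)
print(f"Read data is about {ratio:.0f} times the reference data")
print(f"One exabyte is {EB // PB:,} PB")
```

At the 10 gigabyte ceiling, the quoted 20 petabytes corresponds to about two million species, and the read data is roughly ten times larger than the reference data.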

8. DNA samples like it cold… very cold

DNA samples need to be stored at -80°C

For genome sequencing, DNA samples are ideally frozen immediately upon collection. For long-term storage, samples need to be kept at -80°C. This isn’t always possible, as resources may be limited at remote sites. Shipping samples over long distances can cause loss of DNA quality, e.g. through thawing or leaking of preservation liquid. National networks of freezers, like the CryoArk BioBank, will be used to store samples.

9. The world of fungi matters

Don’t dismiss fungi – there are an estimated 2.3-3 million different species, and they are vital for healthy ecosystems

Fungi form one of the largest eukaryotic kingdoms, with an estimated 2.3-3 million species. They form a diverse group with a wide variety of life cycles, including mutualism[9] and parasitism. They have a broad and profound impact on the Earth’s ecosystem.[10]

10. There are three domains of life on Earth

How life is divided up: the three domains of life explained

Life is categorised into three domains:

  • Bacteria
  • Archaea
  • Eukaryota.

A domain is further divided into kingdom, phylum, class, order, family, genus, species.
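The rank hierarchy can be pictured as an ordered list, with each organism assigned one name per rank. The Python sketch below is purely illustrative: the two example classifications (red fox and grey wolf) are standard taxonomy supplied here as examples, and the function names are hypothetical.

```python
# Ranks from broadest to narrowest, as listed in the text.
RANKS = ["domain", "kingdom", "phylum", "class", "order", "family", "genus", "species"]

# Example classifications, supplied for illustration.
RED_FOX = {
    "domain": "Eukaryota", "kingdom": "Animalia", "phylum": "Chordata",
    "class": "Mammalia", "order": "Carnivora", "family": "Canidae",
    "genus": "Vulpes", "species": "Vulpes vulpes",
}
GREY_WOLF = {
    "domain": "Eukaryota", "kingdom": "Animalia", "phylum": "Chordata",
    "class": "Mammalia", "order": "Carnivora", "family": "Canidae",
    "genus": "Canis", "species": "Canis lupus",
}


def lineage(classification: dict) -> str:
    """Join an organism's names from domain down to species."""
    return " > ".join(classification[rank] for rank in RANKS)


def shared_ranks(a: dict, b: dict) -> list:
    """Ranks, from domain downwards, at which two organisms still agree."""
    shared = []
    for rank in RANKS:
        if a[rank] != b[rank]:
            break
        shared.append(rank)
    return shared


print(lineage(RED_FOX))
print("Last shared rank:", shared_ranks(RED_FOX, GREY_WOLF)[-1])
```

In this example the two animals share every rank down to family (Canidae) and diverge at genus.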


News story: Genetic code of 66,000 UK species to be sequenced

News story: Launch of global effort to read genetic code of all complex life on earth


[1] Brendan B. Larsen et al, “Inordinate Fondness Multiplied and Redistributed: the Number of Species on Earth and the New Pie of Life,” The Quarterly Review of Biology 92, no. 3 (September 2017): 229-265.

[2] Hinchliff CE, et al. (2015) Synthesis of phylogeny and taxonomy into a comprehensive tree of life. Proc Natl Acad Sci USA 112:12764–12769.

[3] Ceballos G, Ehrlich PR, Dirzo R (2017) Biological annihilation via the ongoing sixth mass extinction signaled by vertebrate population losses and declines. Proc Natl Acad Sci USA 114:E6089–E6096. AND International Union for Conservation of Nature (2017) IUCN 2016: International Union for Conservation of Nature annual report 2016 (International Union for Conservation of Nature, Gland, Switzerland).

[4] https://www.wwf.org.uk/updates/living-planet-report-2018

[5] http://www.pnas.org/content/112/24/7519?ijkey=09ba717ae0cbbc6bb750bedacf9b7db6e7b7969a&keytype2=tf_ipsecsha

[6] G. E. Hutchinson, “Homage to Santa Rosalia or Why Are There So Many Kinds of Animals?” The American Naturalist, Volume XCIII, Number 870 (May-June 1959), Page 146 (via Wikipedia).

[7] https://www.nature.com/articles/s41477-017-0019-3

[8] One petabyte is 10^15 bytes. One petabyte is equivalent to 13.3 years of HDTV content.

[9] Mutualism is where two organisms of different species exist in a relationship in which each individual’s fitness benefits from the activity of the other.

[10] http://www.pnas.org/content/pnas/114/35/9391.full.pdf

The Darwin Tree of Life Project and the Earth BioGenome Project are aiming to sequence all animals, birds, fish, insects and plants in the UK and on earth, respectively
25 Genomes, Sanger Science

The quest to sequence all life

Today (1 November 2018) the Earth BioGenome Project – a mission to sequence the genomes of all life on earth – was launched to the world’s media.

Unimaginable secrets are hidden in the genomes of the known, and unknown, species on our planet. The Sanger Institute is taking a leading role in this historic undertaking, as we plan to sequence the genomes of all 66,000 eukaryotic species in the UK.

Associate Director of the Sanger Institute, Dr Julia Wilson, talks about the ambitions, scale and challenges of this remarkable endeavour.

What is the Earth BioGenome Project?

The Earth BioGenome Project is a global collaboration, which aims to sequence the genomes of all eukaryote species on earth in the next 10 years. The ambition is vast. The project will transform science. We can only begin to imagine the benefits for advancing research into conservation, evolution, agriculture, biology and medicine.

Why sequence all life on Earth?

How life is divided up

  • Bacteria – such as MRSA and E. coli – relatively simple single-celled lifeforms with no membrane-bound nucleus
  • Archaea – equally simple single-celled lifeforms that are seen as the oldest organisms on earth, and tend to be found in extreme environments
  • Eukaryota – everything else! These organisms have a nucleus enclosed in a membrane, and include animals (such as birds, fish and insects), plants and fungi

All cellular life descended from a common ancestor, and genome sequences are the products of billions of years of evolution. Knowing the DNA sequences of all species will provide fundamental, transformative insights into biology.

There are an estimated 10–15 million eukaryotic species, and trillions of bacterial and archaeal species, on Earth. But only a fraction of those – about 2.3 million – are actually known. We are only just beginning to understand the full splendour of life.

So far about 15,000 species, mostly microbes, have completed or partially sequenced genomes. From this, a wealth of knowledge has emerged, enabling enormous advances in agriculture, medicine, and biology-based industries and enhanced approaches for conservation.

Yet the world’s biodiversity remains largely uncharacterized. And the Earth has entered a period of unprecedented change. A new epoch – the Anthropocene – has been defined by human impact on the Earth’s geology and ecosystems. Human activity is threatening biodiversity through climate change, habitat destruction and species exploitation.

How life is divided up: the three domains of life explained

We have a responsibility to care for our increasingly compromised planet. The project will produce a complete inventory of all life on Earth, together with complete DNA sequences, transforming our ability to monitor life as part of global conservation efforts.

It is essentially a mission to acquire knowledge of the natural world. That knowledge will form a foundation for future biotechnology.

Why now?

The family tree of life on earth. Ancestral tree courtesy of the Earth BioGenome Project

For the first time in history it is possible to efficiently sequence the genomes of all known species. In particular, recent advances in DNA sequencing technology and the arrival of long sequence reads mean that the project is now feasible.

A number of projects to sequence species for the first time are ongoing around the world. These include initiatives to sequence all birds, insects or bats. They are invaluable, but understandably fragmented and often shaped by funding limitations. Now is the time to bring everyone together, to co-ordinate DNA sequencing efforts. Joining these projects together will ensure consistency and deliver the best possible resource for future research.

How will this help with research into evolution, conservation, biodiversity and health?

There is a broad set of scientific aims and outcomes for the Earth BioGenome Project (EBP). The first is to revise and reinvigorate our understanding of biology, ecosystems, and evolution. This includes understanding the evolutionary relationships between all life on Earth, discovering new species, and uncovering fundamental laws that describe and drive evolution.

The second is to enable the conservation, protection, and regeneration of biodiversity. This includes clarifying how climate change and human activity are affecting biodiversity.

Finally, the goal is to explore the potential benefits for society and human wellbeing. This encompasses discovery of new medicines, enhanced control of pandemics, identifying new ways to improve agriculture, discovering new biomaterials, energy sources and biochemicals.

What role will the Sanger Institute play?

Organisations working together to read the genomes of UK fish, birds, animals, insects and plants


The Darwin Tree of Life Project will be an inclusive consortium of UK scientists and organisations. Key organisations are: the Sanger Institute; the Natural History Museum; Royal Botanic Gardens, Kew; EMBL-EBI; Earlham Institute; and Edinburgh Genomics. Other institutes and organisations are expected to join. Together we will work to sequence all eukaryotic species in the UK, estimated at around 66,000 species.

We will also work with other countries to develop the global strategy for the EBP, and help to ensure that the benefits are shared.

What are the main challenges you can foresee?

Sample collection is a big challenge. It may be that we need to develop new machines or drones that can travel to hard-to-reach areas, such as sea beds. It’s possible they could be developed to extract DNA and store samples too.

The Wellcome Sanger Institute has the largest biosciences data centre in Europe, capable of storing and processing genomes of all sizes and complexities


Computing will also be a challenge. Requirements for data storage and processing are large – but tractable. In terms of computing power needed, mammalian-sized long-read genome assemblies currently require about 100 processor-weeks. The later phases of the EBP will require about 10,000 simultaneous assemblies running in parallel – a scale already approached by academic supercomputing centres.
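To put those numbers in perspective, the sketch below works out the total compute implied by 10,000 assemblies at 100 processor-weeks each, and an idealised wall-clock time per assembly. The number of processors per assembly is a hypothetical value chosen for illustration, and perfect parallel scaling is an optimistic assumption.

```python
HOURS_PER_WEEK = 7 * 24

# Figures quoted in the text above.
PROCESSOR_WEEKS_PER_ASSEMBLY = 100
SIMULTANEOUS_ASSEMBLIES = 10_000

# Hypothetical allocation, not a figure from the project.
PROCESSORS_PER_ASSEMBLY = 32


def wall_clock_weeks(processor_weeks: float, processors: int) -> float:
    """Idealised elapsed time, assuming the work divides perfectly across processors."""
    return processor_weeks / processors


total_processor_weeks = PROCESSOR_WEEKS_PER_ASSEMBLY * SIMULTANEOUS_ASSEMBLIES
total_processor_hours = total_processor_weeks * HOURS_PER_WEEK
processors_needed = PROCESSORS_PER_ASSEMBLY * SIMULTANEOUS_ASSEMBLIES
weeks_per_assembly = wall_clock_weeks(PROCESSOR_WEEKS_PER_ASSEMBLY, PROCESSORS_PER_ASSEMBLY)

print(f"Total compute: {total_processor_weeks:,} processor-weeks "
      f"({total_processor_hours:,} processor-hours)")
print(f"Processors needed at once: {processors_needed:,}")
print(f"Idealised time per assembly: {weeks_per_assembly:.1f} weeks")
```

Under these assumptions, one wave of assemblies amounts to a million processor-weeks of work.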

Current tools are already capable of completing the project. But there is no doubt that genome assembly, alignment, and annotation algorithms will all need to be improved. It is a huge opportunity to develop new computational methods to maximize our understanding and use of the vast volumes of data that the project will produce.

How will you find all the species in the UK?

Finding, extracting and storing samples of all eukaryotic life in the UK is no easy task, and the Sanger will be working closely with the Natural History Museum, the Royal Botanic Gardens, Kew and other biobank repositories to fulfil the Darwin Tree of Life project


For the Darwin Tree of Life Project, we’ll be working with UK organisations that have existing, extensive sample collections – including Royal Botanic Gardens, Kew; the Natural History Museum; the Culture Collection of Algae and Protozoa and others.

New sample collection will be required too. We’ll establish a dedicated team and strategy to survey the UK – gathering samples with the quality of DNA as their primary consideration. In the Darwin Tree of Life Project, we’ll be sequencing all eukaryotes in the UK. We won’t be sequencing non-native species, for example those in UK zoos.

Efforts to sequence all bacteria and archaea are already underway, so the EBP won’t be sequencing those.

Where will you start?

How species fit into the order of life. An animal such as a red fox would be in the domain of Eukaryota, in the Canidae family and the species Vulpes vulpes


We have three starting points. First, we will sequence a representative of each of the 3,849 families of species in the UK, plus a selected subset of species of particular interest.

Second, we will sequence all eukaryotic organisms from one or more ecosystems (e.g. St Kilda, Priests Pot or Wytham Woods).

Third, we will sequence all organisms from one or more clades in the British Isles (a clade is a group of organisms that consists of a common ancestor and all its descendants, e.g. vertebrates).

How much is it going to cost?

The current estimate, for the whole EBP, is that sequencing all eukaryotic species will cost about $4.7 billion. This cost covers sample collection, sequencing machines, data storage, analysis, visualization and dissemination, and project management. Incredibly, this cost is similar to the cost of sequencing the first human genome, which in today’s money was about $5 billion.

The Darwin Tree of Life project is estimated to cost approximately £100 million over the first five years.

Will the sequences be made public?

Yes. UK species data will be publicly released and freely available via a dedicated website. EMBL-EBI will aggregate, curate and distribute assembly and gene sequences to the scientific community via a range of services and tools including Ensembl.

The data from the whole EBP will become a permanent foundation for future scientific discovery. The project will be working within international legislation to ensure that all countries can benefit from their involvement. The EBP aims to provide fair and equitable access to genome sequence data and the benefits it will bring.


News story: Genetic code of 66,000 UK species to be sequenced

News story: Launch of global effort to read genetic code of all complex life on earth

25 Genomes, Sanger Science

Long live bats

By: Alison Cranage
Date: 29.10.18

The golden flying fox is the largest bat known


Bats hold an exclusive place in our collective consciousness – as creatures of the night, of vampires and witchcraft. They are truly unique mammals, essential to our ecosystems, with much to teach us about human health and longevity.

One in every five living mammal species is a bat. There are over 1,300 species, spread across the globe in a wide range of ecological niches. The largest bat is the giant golden-crowned flying fox, weighing 1.6 kg with an impressive 1.7 m wingspan.

Near the other end of the scale is the common pipistrelle bat (Pipistrellus pipistrellus) – the UK’s most abundant species. It weighs just 5g, or the same as a 20 pence piece. If you see a bat darting through the twilight it is likely to be a pipistrelle hunting down moths.

The common pipistrelle bat is one of the smallest bats known and easily fits in the palm of your hand


Emma Teeling is a Professor at University College Dublin and Founding Director of the Centre for Irish Bat Research. A world-renowned expert, she is our collaborator on the pipistrelle bat genome, which we sequenced as part of our 25 Genomes Project.

She is studying the whole range of exceptional adaptations of bats. They are the only mammals which can fly, and they use laryngeal echolocation. They also possess vocal learning, a rare feature among mammals. Her most recent studies are into their unusual longevity.

A healthy life

The relationship between mammal size and lifespan is, on the whole, linear. Smaller mammals, with faster metabolisms, tend to have shorter lives. But some bat species live up to 10 times longer than would be predicted by their size. Not only do they live long lives, they live healthy ones. They can harbour a range of usually deadly viruses, including Ebola, SARS and rabies, yet they don’t get ill. Studying their immune systems could reveal new ways to treat these infections in people.

Bats are resistant to cancer too. In humans, the risk of developing cancer increases with age. Almost 9 in 10 cancer cases in the UK are in people aged 50 or over. Many other mammals also get cancer, but it is extremely rare amongst bats, naked mole rats, grey squirrels and elephants.

Secrets of the bat genome

The Myotis myotis species of bat is one of the longest lived, with some known to live to the ripe age of 42 years old. Image credit: Gilles San Martin, Wikimedia Commons


Professor Teeling is combining a longitudinal study of one of the longest lived bat species with genomic and molecular analysis, to understand how bats age so healthily.

Her team visit Northern France every year to monitor a colony of about 700 Myotis bats – one of the longest lived species. Each member of the colony is tagged, and caught each year. A minute wing punch is taken and stored for later analysis, before the bat is released. One bat, first caught as an adult, has been caught again 42 years on, still healthy. The blood samples from the bats are used to study their genomes, cellular and gene function and immune system.

This enables the researchers to answer questions about the bat genome. Do they have the same age-related genes as other mammals? Do they regulate them differently? Or is there something else going on? Her work has already uncovered that genes involved in repairing the age-related damage at the ends of chromosomes may hold the key to bat longevity.

The pipistrelle bat lives five years on average. Comparing its genome to the longest lived species, like Myotis, will enable further discoveries.

Small genome

Bat Facts

  • Bats pollinate the flowers of the agave plant – essential to making tequila
  • In 1999 the soprano pipistrelle was formally identified as a separate species from the common pipistrelle, based on differing echolocation signatures
  • Male bats sing to attract females
  • A pipistrelle can eat 3,000 insects in one evening

Bats have the smallest genomes of all mammals. This could be an adaptation related to flight, as birds also have small genomes, but this is far from certain. Bats have highly active transposable elements in their genomes – these sequences of DNA copy and paste themselves, moving around the genome. In other mammals this leads to genomes growing over time as the copies stay – but bats must have a mechanism to remove them, because their genome has remained small. Studying bat genomes will help us understand this structural evolution, and uncover the minimal genome required to make a mammal.

1,000 Bats

Bat1K - 1000 bat genomes project


Professor Teeling has set up the Bat1K project – an effort to sequence the genomes of 1,000 different bat species. The sequence of the pipistrelle bat is just the start.

Bats are usually out of sight, so we perhaps don’t think about them often. The pipistrelle is an iconic UK species, and highly adapted to live among us. The genome sequence will help researchers understand the native fauna of the British Isles, as well as uncover the genetic basis of the bat’s unique features.

About the author:

Alison Cranage is a science writer for the Wellcome Sanger Institute.


The 25 Genomes Project

Bat1K Project

25 Genomes: The Common Starfish. Image credit: Ray Crundwell
25 Genomes, Sanger Life, Sanger Science

25 Genomes: The Common Starfish

By: Alison Cranage
Date: 04.10.18


The other-worldly, bright orange, five-limbed creature is instantly recognisable. Whether you are paddling on a Cornish beach or rockpooling on the Isle of Mull at low tide, it’s pretty likely you’ll come across one.

Lurking in the shallow waters of the UK and across the North Atlantic, the common starfish (Asterias rubens) is one of 1,500 starfish species in the world.

Asterias rubens was nominated by the scientific community and won a public vote to have its genome sequenced as part of our 25 Genomes Project. The common starfish falls into our ‘cryptic’ category of creatures: cryptic because their behaviour and many hidden talents are not well understood.

Hidden talents

Starfish sperm
The DNA we collected for Asterias rubens was from its sperm. Professor Elphick’s lab in central London is home to some 200 starfish, and it was there that he collected the sample for us to sequence.

Possibly the most remarkable feature of starfish is their ability to regenerate limbs. If a starfish is attacked or is in danger, it can lose an arm in order to escape. It then grows a new one in its place. Nobody’s exactly sure how this works, but the key to finding out will be in its genome. Understanding the process would have huge implications for regenerative medicine.

The starfish genome could also help research into glue, including surgical adhesives that are used to heal wounds. Asterias rubens feasts on mussels and other molluscs. To get to the meat inside a mussel, it attaches its tube feet to the shell, by secreting a glue, and pulls it apart. Researchers are interested in that glue, and the genome sequence might reveal more about its production and structure.


Professor Maurice Elphick is working with us on the starfish genome. His research interests lie in neuropeptides. These tiny molecules act in the brain to control a whole range of processes including pain, reward, food intake, metabolism, reproduction, social behaviours, learning and memory.

Starfish don’t have a brain, but they are more closely related to humans than they are to most invertebrates. They do have neuropeptides – and his team have discovered many already. Several are involved in the unusual feeding behaviour of starfish.

To eat a mussel, once it’s forced open the shell, a starfish pushes its stomach out of its mouth. It partially digests its prey, takes up the resulting mussel ‘chowder’ and then retracts its stomach.

“I’m interested in understanding the evolution of neuropeptide systems, and also want to compare their functions and to find out what homologous molecules are doing in very different biological contexts.”
Maurice Elphick, Professor of Animal Physiology & Neuroscience, Queen Mary University of London.
One of the molecules they discovered triggers the stomach retraction. The equivalent molecule in humans clearly has a very different role. Professor Elphick explained: “Interestingly, we have also found that the neuropeptide behind the stomach retraction is evolutionarily related to a neuropeptide that regulates anxiety and arousal in humans.”

Professor Elphick explained how the genome sequence will enhance their ability to discover and study more neuropeptides. Because neuropeptides are tiny, the genes encoding them are not always easy to find. The team will study the genome in places where other species are known to have neuropeptide genes, to see if they can pinpoint an equivalent in the starfish (an approach known as synteny). This is only possible because we are using ‘long-read’ technology in the 25 genomes project – so the genomes will be the best possible quality, with few gaps.
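To make the idea concrete, here is a minimal Python sketch of that kind of synteny-guided search. Everything in it is invented for illustration: the gene names, the scaffold coordinates and the `candidate_region` helper are hypothetical and are not the team's actual pipeline. It assumes we already know the order of genes around a neuropeptide gene in another species, and where the equivalent neighbouring genes sit in the starfish assembly.

```python
from typing import Dict, List, Optional, Tuple

# (scaffold name, start position, end position) -- hypothetical coordinates
Location = Tuple[str, int, int]

# Gene order around a known neuropeptide gene in some other species (made up).
reference_order: List[str] = ["geneA", "geneB", "neuropeptideX", "geneC", "geneD"]

# Where the equivalent neighbouring genes were found in the starfish (made up).
starfish_genes: Dict[str, Location] = {
    "geneA": ("scaffold_1", 10_000, 12_000),
    "geneB": ("scaffold_1", 15_000, 18_000),
    "geneC": ("scaffold_1", 30_000, 33_000),
    "geneD": ("scaffold_1", 40_000, 41_500),
}


def candidate_region(
    target: str, order: List[str], annotations: Dict[str, Location]
) -> Optional[Location]:
    """Return the stretch between the nearest located neighbours of `target`."""
    position = order.index(target)
    # Nearest neighbour on each side that has been located in the starfish.
    left = next((g for g in reversed(order[:position]) if g in annotations), None)
    right = next((g for g in order[position + 1:] if g in annotations), None)
    if left is None or right is None:
        return None
    # Sort by start position so the sketch also copes with an inverted region.
    first, second = sorted(
        [annotations[left], annotations[right]], key=lambda loc: loc[1]
    )
    if first[0] != second[0]:
        # Neighbours on different scaffolds: a gap in the assembly blocks the search.
        return None
    return (first[0], first[2], second[1])


region = candidate_region("neuropeptideX", reference_order, starfish_genes)
print(region)
```

In this sketch the search only succeeds when both neighbours sit on the same unbroken stretch of assembled sequence, which is one way to see why an assembly with few gaps matters.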

The future

The starfish genome is now sequenced and the raw data is available for any researcher to use. Over the coming months, our partners at EMBL-EBI will be assembling and annotating it, marking the position of genes and other features.

The finished genome will enable researchers to answer their own questions. About evolution, glue, neuropeptides or growing new arms.

About the author:

Alison Cranage is a science writer for the Wellcome Sanger Institute.


25 Genomes, Sanger Life, Sanger Science

10 surprises from sequencing 25 new species

By: Alison Cranage
Date: 04.10.18

Sequencing human genomes is now routine at the Sanger Institute. Bacteria, yeast, worms, malaria, and other pathogens are also all regularly sequenced in their thousands. Our people are pretty well known for sequencing the human genome, but we’ve also contributed to the first sequencing of many others, including the mouse, rat, zebrafish, pig and gorilla.

The 25 genomes project is an entirely different beast. It’s posing some new, and frankly very odd, challenges. The diversity of the new species means we’ve had a steep learning curve. Here’s a peek at some of the weird and wonderful things we’ve discovered so far:

New Zealand flatworms will explode if you freeze them – not terribly helpful when trying to extract DNA from samples… Image Credit: S. Rae, Wikimedia Commons

1. Don’t freeze flatworms

They explode.

You may well ask why we’d freeze them in the first place. But freezing samples, or in this case, whole worms, is standard practice to store them ready for DNA extraction.

Freezing New Zealand flatworms didn’t go so well though. The resulting sticky goop proved difficult to handle… and to get DNA from.

Is this the Oxford Ragwort you are looking for? The best way to know is to take a picture and send it to an Oxford expert… Image credit: Rosser1954, Wikimedia Commons

2. It’s good to get a second opinion when you’re identifying something

The Oxford ragwort was chosen for sequencing in our flourishing category. We have ragwort growing here on campus, so we took a plant for sequencing.

But once we started, we soon realised it was not the ragwort we were looking for. The plant we had was hexaploid (it has 6 copies of its genome in every cell). The Oxford ragwort, which we were hoping to sequence, is diploid (it has 2 copies).

We sent a photo of the plant to an expert at Oxford University, who informed us we had the common ragwort.

There are 300+ species of blackberry – and telling them apart can literally take years of observation. Image credit: Fir0002, Wikimedia Commons

3. There are over 300 species of blackberry in the UK

Yes, 300+.

They differ in a whole host of characteristics: sweetness, number of drupelets (the little blobs that make up the fruit), colour, size, thorns, flowers, lifecycle and more.

Finding the right one wasn’t easy, but this time we sequenced the correct one first time. Read more about the blackberry saga.

Fen Raft Spider – more popular than beavers, apparently. Image credit: Helen Smith, www.dolomedes.org.uk

4. Fen raft spiders are more popular than beavers

In a public vote, the fen raft spider won out over the beaver to have its genome sequenced.

Both were contenders in the flourishing category of the project. Over 5,000 votes were cast in total, as part of “I’m A Scientist Get Me Out Of Here”.

Scottish featherworts are a lonely bunch: they’re all male and their female partners are almost half a world away. Image credit: David Freeman, RSPB

5. All the featherworts in Scotland are male

Their potential partners are over 4,500 miles away in the Himalayas.

Botanists don’t know when the populations split, or how they got there. They only reproduce clonally in Scotland, and so it is uncertain how long they can last in this way.

Bush crickets have issues #1 – their genomes are 2.5 times bigger than we expected. Image credit: Richard Bartz

6. Genomes are not always what you expect

We estimated that the genome of the bush cricket would be 2Gb, about two-thirds the size of the human genome. We were wrong.

The estimate was based on the average cricket genome from the Animal Genome Size Database. But in fact the bush cricket genome is 2.5 times larger than the human genome, coming in at 8.5Gb.

Read more about how this affected the sequencing.

7. It’s good to share

We knew this already, but this project has been a huge collaborative effort. It wouldn’t have been possible without scientists giving their time and sharing their expertise.

The Natural History Museum are a key partner for the 25 genomes project. They are helping with species identification and collection, as well as providing a link to natural historians and species experts across the UK.

The sequencing itself wouldn’t have been possible without PacBio. They have provided a machine for the project and provided expert technical support to enable the sequencing of the new species.

Our other collaborators include EMBL-EBI, The National Trust, The Wildlife Trust, Royal Society for the Protection of Birds (RSPB), Nottingham Trent University, Edinburgh University, 10x Genomics, Illumina and many more. See the full list here.

Bush crickets have issues #2 – they have cannibal tendencies. Image credit: Richard Bartz

8. Don’t put bush crickets in a box together

They eat each other (or parts of each other).

Scallops are 20 times more genetically diverse than humans. Image credit: Asbjorn Hansen

9. Scallops are more diverse than people

We’ve found that scallops have 20 times the diversity of humans.

The king scallop was sequenced in the dangerous category of creatures. Human genomes are just 0.1 per cent different to each other – that is, only 0.1 per cent of your DNA code is different to any other person on the planet.

We have a pretty good idea why human genomes are so similar. It’s likely that events in our evolutionary past, like ice ages or infectious diseases, caused a genomic bottleneck, which meant only a small group survived.

In scallops, 1.7 per cent of the DNA differs between any two individuals.
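As a back-of-the-envelope check, the short Python sketch below compares the two percentages quoted here. The variable names are purely illustrative, and the calculation is just the ratio of the two quoted figures; it comes out at about 17-fold, in the same ballpark as the rounded "20 times" headline number.

```python
# Per cent of DNA that differs between two individuals, as quoted in the text.
human_diff_pct = 0.1
scallop_diff_pct = 1.7

# How many times more diverse scallops are than humans, on these figures.
fold_difference = scallop_diff_pct / human_diff_pct

# Expected number of differing positions in every 1,000 DNA letters.
human_per_1000 = human_diff_pct / 100 * 1000
scallop_per_1000 = scallop_diff_pct / 100 * 1000

print(f"Scallops: roughly {fold_difference:.0f} times the diversity of humans")
print(f"Differences per 1,000 letters: human ~{human_per_1000:.0f}, "
      f"scallop ~{scallop_per_1000:.0f}")
```

Put another way, on these numbers two people differ at about one DNA letter in every thousand, while two scallops differ at about seventeen.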

Using PacBio machines, we read 25 new genome sequences in less than 10 months. Image credit: Wellcome Sanger Institute, Genome Research Limited

10. We can go faster than we thought

This project started in January 2018. We’re barely into October.

We’ve sequenced 25 new genomes in less than 10 months.

The PacBio machines we are using have doubled the amount of data they produce, per run, in the last 12 months. Next year, they will quadruple capacity.

About the author:

Alison Cranage is a science writer for the Wellcome Sanger Institute.


25 Genomes, Sanger Life, Sanger Science

25 Genomes at New Scientist Live

By: Alison Cranage
Date: 25.09.18

Alongside robots, slime and VR machines, Sanger researchers were at New Scientist Live last week – talking genomes. Sarah Teichmann shared the latest on the Human Cell Atlas Project, and Peter Campbell closed a wonderful weekend of sharing the greatest stories from science by talking a fascinated audience through the latest on cancer science. On the main stage, our 25 Genomes Project was shared with an intrigued audience – many keen to understand more about the genomes of 25 UK species, from catfish to blackberries.

Julia Wilson and Cordelia Langford from the Sanger Institute took to the stage alongside Tim Littlewood from the Natural History Museum and Fergal Martin from the EMBL-European Bioinformatics Institute. They were discussing the project to sequence the genomes of 25 British species for the first time.

How it all began

Mike Dilger, TV broadcaster and naturalist, was asking the questions – first wondering how the project started.

“Only by understanding these species much better can we ever hope to protect our planet for ourselves and all the other species with which we share it.”

Mike Dilger, BBC One Show broadcaster and naturalist


The 25 Genomes Project being discussed at New Scientist Live. From left to right: Mike Dilger, Julia Wilson, Tim Littlewood, Cordelia Langford and Fergal Martin

Julia, Associate Director at the Sanger Institute, explained: “It came about because it’s our 25th anniversary. We celebrated with some parties, but we also wanted to leave a scientific legacy. And at the same time we wanted to celebrate the staff that we have at Sanger who are experts in DNA sequencing.”

It was a tough task to narrow down the ~66,000 species in the UK to just 25.

So the Sanger Institute connected with the Natural History Museum to help. The museum is home to over 80 million specimens from around the world, and Tim is providing the link between the Sanger Institute and natural historians who have detailed knowledge of the 66,000 UK species.

“Every species has a story to tell – it needs its champion.”

Tim Littlewood, Head of Life Sciences at the Natural History Museum


The 25 Genomes that the Wellcome Sanger Institute is sequencing to celebrate its 25th Anniversary

Categories of species helped the team to focus: flourishing, cryptic, iconic, dangerous and floundering. And every species had to have a valid scientific reason for sequencing its genome.

Julia continued: “We also realised that the great British public are fascinated by the rich heritage and diversity of life in the UK and so we wanted a project that would resonate not just with our scientists and scientists beyond but a project that would pique the interest of the general public as well.”

So the Public Engagement team at the Wellcome Genome Campus got together with “I’m A Scientist Get Me Out Of Here” to organise a public vote for the final five species – one from each category.

Rising to the challenge


The New Zealand flatworm – whose DNA has proved to be particularly difficult to extract

Mike asked the panel about the challenges of sequencing such a diverse range of creatures.

There was talk of ‘exploding flatworm goop’, tough plant skins and ‘difficult cellular structures’.

“We’re outside our comfort zone,” Julia admitted. “But that’s a good thing and is helping us explore and learn how to overcome these new challenges.”

Cordelia Langford, Head of Scientific Operations at the Sanger Institute, described how the sequencing teams have had to change and optimise protocols to deal with the new organisms – but the lessons learned have had huge benefits.

“Sequencing of 25 genomes is setting the foundation for an enormously ambitious future. Our partnership with PacBio will help develop technology we need. We’ll learn a lot from the challenges of this project.”

The teams are applying this new knowledge to sequencing human genomes, refining their approach.

The first human genome took 13 years and billions of dollars. Now, the Sanger Institute sequences the equivalent of a human genome in just 24 minutes, at a fraction of the cost.

Fergal described the excitement of sequencing a species for the first time. “It’s like a jigsaw. We have tiny fractions filled in. We don’t know what the big picture looks like. Once we fill it in we will have new questions, new science.”

Why sequence these genomes? What might you find?


Grey squirrels can resist the squirrel pox virus, but the red squirrel cannot. Comparing the grey squirrel’s genome with that of the red squirrel may show which gene(s) give immunity

Mike turned the panel’s attention to the ‘why’ of the project. Why sequence a genome at all? What do we expect to learn?

Tim was excited about the opportunities: “A massive amount of data is about to turn up. It’s going to reveal aspects of evolution we’ve not even dreamt of.”

Each species has secrets hidden in its genome. Robins can ‘see’ the magnetic fields of the earth – but we don’t know how. Starfish can re-grow limbs if they lose them. Grey squirrels are resistant to the squirrel pox virus whereas native red squirrels aren’t – and they’re dying out. Sequencing the genome will help researchers answer these puzzles. It will also drive research into conservation, climate change and evolution.

Fergal talked about how important it is that the data is publicly available for anyone to use.

“The sooner the data is public, the sooner science can be done on it.”

Fergal Martin, Ensembl Genebuild Project Leader, EMBL-European Bioinformatics Institute


Robins can see magnetic fields; it is hoped that reading their genome might reveal how

The EBI will be storing and publishing the data for the project. They will also be annotating the genomes – marking the position of genes and other features.

“It shortcuts downstream research. Annotating takes a couple of weeks for us. An individual would take weeks or a year, it allows other researchers to ask more questions,” added Fergal.

Peering into the crystal ball…


Starfish can regrow their limbs. If we can find out which genes give them this ability, we might be able to improve wound healing

Mike asked the panel to consider the future. It’s 15 years since the human genome project was completed. Now 25 new species are being sequenced. What’s next?

Tim described life as variations on a theme, where every species is built from a blueprint of DNA. Sequencing different species will allow researchers to compare those blueprints, to understand the genomic diversity of the UK, and beyond.

Julia summed up: “We’re on the precipice of something even more interesting. Can we scale the software, can we scale the storage? Can we visualise the future? What questions should be asked?

“It’s a feasible and tantalising prospect to scale up even further. Why not think about sequencing 66,000 species?”

About the author:

Alison Cranage is a science writer for the Wellcome Sanger Institute.


25 Genomes, Sanger Life, Sanger Science

The Beast from the East? Vespa velutina

Words and pictures by: Alex Cagan
Date: 17.09.18

Prelude: Death from above

The Asian hornet - one of the 25 genomes being read by the Wellcome Sanger Institute

Today, you are a honeybee and today you are going to die.

You enjoyed a summer full of industry, dance and frenetic activity collecting nectar for the hive along with your 10,000 sisters. But all of that is about to change.

They came from the East.

One arrived this morning, unnoticed, a vanguard. After spotting your hive it flew back to its nest to recruit others. Now, a storm is gathering above you. Shadows skate overhead, silent portents of impending calamity.

This darkness is cast by Asian hornets, Vespa velutina. They are specialised honeybee hunters. Each one patrols its own small territory above the hive, its ferocious mandibles and instincts designed to cleave through the carapaces of you and your sisters. Together the hornets create a tightening net through which you cannot escape.

If you were an Asian honeybee the arrival of the first hornet would have triggered one of the most extraordinary defensive maneuvers in the natural world. You and your fellow workers would surround the hornet. Rapidly twitching your wing muscles to generate heat, you would have smothered it in a pulsating mass, cooking the hornet alive in a blaze of cooperative fury before it could call for reinforcements. Together, you might just have survived.

But you are not an Asian honeybee.

You’re a European honeybee and neither you nor your ancestors ever faced such a threat, until now. As such your genetic and behavioural make-up is bereft of the tools or tricks with which you might hope to defend yourselves. What little resistance you can put up is futile.

After their grisly work is done, the hornets take no honey. That is not what brought them here. It is the sinews of you and your sisters that they feast upon.

This morning the chambers of your hive were alive with the buzz of activity. This evening things are different. Drips of honey echo in silence.

25 Genomes: The Asian Hornet

I’m Alex Cagan, a post-doctoral researcher in genetics at the Wellcome Sanger Institute. The Institute turns 25 this year and to celebrate we are sequencing the genomes of 25 species found in the UK that have never had their genomes sequenced before. Over the course of the year I’ll be going ‘behind-the-scenes’ to chronicle this ambitious project in various ways. In doing so I hope to throw some light on the scientists, institutes and species involved in doing this kind of large-scale science and the reasons why it’s being done in the first place.

The Asian hornet is one of these 25 species being sequenced by the Wellcome Sanger Institute (https://www.sanger.ac.uk/science/collaboration/25-genomes-25-years). It is one of five species that was decided upon by a public vote, in a head to head competition with other species. The case for sequencing the Asian hornet was championed by the Sumner lab, a eusocial insect research group based at University College London (http://www.sumnerlab.co.uk/im-a-scientist-get-me-out-of-here/). They tell us that the Asian hornet is ‘a dangerous invasive species that poses a huge threat to bee populations in the UK and elsewhere in Europe’. Of all the 25 species being sequenced it’s also the newest arrival in the UK. The first confirmed sighting of a nest was in Gloucestershire in September 2016, with a second found in Devon in 2017.

For many, the mere mention of the word ‘hornet’ is enough to set the pulse racing. Even if we’ve never encountered one, the image of a swarm of angry wasps on steroids looms large in the imagination. But what of the reality? Are they truly winged nightmares or is this reputation undeserved? We already have hornets in the UK, so what makes Asian hornets so special? In search of answers I headed to London to meet with Dr Gavin Broad, Principal Curator in charge of insects at the Natural History Museum.

Natural History Museum, London

Behind the scenes

Visiting the NHM is always a joy. Encounters with wonders there as a child inspired me, along with countless others I’m sure, to become a biologist.

I’ve always had a yearning to go behind the scenes and glimpse the vast collections that have never been on display. I have read stories of the museum’s legendary collections, particularly in the excellent book ‘Dry Store Room No. 1’, by Richard Fortey (for a review see: https://www.theguardian.com/books/2008/jan/26/history). I imagine endless corridors stacked with shelves containing specimens from all branches of the tree of life, gathered by naturalists across the globe. The biological equivalent to Borges’s fictitious Library of Babel, with specimens that display the near infinite variety of forms produced in nature. A library of life.

Pigeons at the Natural History Museum

It turns out my imagination is not far off. After signing in as a guest I meet Dr Gavin Broad at his desk. It is covered with a combination of books and wasp specimens of different sizes, encased in tubes or glass-windowed wooden boxes. I’ve come to the right place, I think to myself.

We begin by heading straight to the museum’s collection of Asian hornets. Dr Broad leads me to a corridor lined with tall grey metal cabinets as far as the eye can see. Each one has a number and a brief description of the contents within. After a few seconds I’m already disorientated. We stop at one case that looks to me like all the others. This one is labelled 58 – Vespidae / Vespinae?. He spins a wheel attached to the cabinets and the stacks begin to move, making a central row of cabinets accessible. Opening one of the top doors reveals a stack of wooden shelves that immediately fills the air with the rich smell of timber.

Dr Broad and his collection of Asian hornets at the Natural History Museum, London

Dr Broad retrieves the top two shelves and lays them on a table. Framed in glass are rows upon rows of hornets. They are beautifully laid out. A single pin connects each hornet to a paper record written with impossibly small, impossibly perfect handwriting. It is evocative of a golden age of natural history. An age, it dawns on me as I stand surrounded by cabinets of carefully documented insects, that may never really have ended.

Asian hornet collection

We take the shelves to a nearby room to take a closer look. Peering at them through the glass the first thing I notice is that they’re smaller than I expected. I’d imagined something at least thumb sized. They’re still larger than common wasps, but not by much. They’re darker too, a deep velvety black covers most of their body, in brilliant contrast to the yellow and orange hues that highlight their face, abdomen and the tips of their legs.

The population of Asian hornet that has begun to colonise Europe, Vespa velutina nigrithorax, has a particularly dark appearance. Up close it becomes clear that they are also covered in a fine layer of fuzzy hair. I think it would be wrong of me to say this makes them look cute, but it certainly adds… character. As we spoke Dr Broad took a high-resolution photo using the museum’s special insect imaging setup. Take a look and decide for yourself:

Close up of Asian hornet's head

A brief history of hornets

As I peer at the hornets Dr Broad provides me with the broad brush strokes of their evolutionary history. Hornets, it turns out, are the result of a series of key evolutionary innovations. Hymenoptera, the order of insects containing hornets (as well as wasps, bees and ants), emerged in the Triassic, over 200 million years ago. One of the keys to their success was the evolution of the ovipositor, a specialised organ for laying eggs. This appendage really opened doors for the hymenopterans, as they could now lay their eggs in new places. These places could be hard-to-access crevices, where eggs would be safer from predators. Or, perhaps most ingeniously, on and even inside food sources such as fruits, or, in the case of the devious parasitic wasps, other insects.

Fast forward millions of years and we arrive at the emergence of the subclade Aculeata. In these hymenoptera the ovipositor underwent a startling transformation. Instead of delivering eggs it became dedicated to delivering venom. The dreaded stinger was born. An organ used to create life metamorphosed into one that takes it. This sent the Aculeata down a very different evolutionary road. While their ancestors laid their eggs on the food and then effectively abandoned them, the Aculeata started to hunt and provision food for their offspring instead.

The next key development in the path leading to hornets was the emergence of eusocial societies. Eusociality is considered the highest level of organization among animal societies, defined by cooperative breeding, with breeding and non-breeding members dividing labour and working together. There are many startling parallels with our own societies here, yet they remain distinctly other and fascinatingly so. Eusociality appears to have arisen independently multiple times in the Aculeata, in the social wasps, bees and ants. It may be that eusociality was made possible by the previous innovation, the stinger, which provided these fledgling societies with a viable way to defend their nests from intruders.

And what nests they are.

Not only does the NHM collection include specimens of the hornets themselves, it also has rows of boxes containing nests of different species. Dr Broad opens one box, at least an arm span wide, that is filled with a near complete example of an Asian hornet nest. It’s truly an architectural wonder. Created from wood pulp chewed and mixed with hornet saliva, it is both incredibly light and unbelievably sophisticated. The nest would have originally hung from a tree, and indeed a few remnant branches are still embedded in it. In this nest large swathes of the rippled paper shell are missing, revealing the interior structure. Horizontal discs, inlaid with hexagons that housed the larvae, are suspended by vertical columns that connect the levels. All the more remarkably, each nest is created and then abandoned within a single year.

Asian hornet's nest

It’s amazing to think that insects are capable of building such structures. One wonders what lessons there might be here for our own architects, given what these hornets can achieve with only a little wood and a little spit. Indeed, the field of biodesign continues to draw inspiration from these invertebrate constructs. Though for all their beauty the fact that they are usually packed with thousands of hornets would make me hesitant to admire one in the wild, where they are more akin to a devil’s piñata.

Having seen the rather beautiful hornets and their astonishing nests I’m starting to wonder what the big deal is about the threat of Asian hornets colonising the UK. After all, we have native hornets in Europe already. What harm could another species possibly do?

Apparently quite a lot. It turns out that unlike our native hornet species, which tend to be generalist hunters of many different insects, the Asian hornet is more like a precision honeybee-seeking missile. Asian hornets will search for honeybee hives and once they find one they will engage in ‘hawking’ behaviour. This involves hovering near the entrance to the beehive, killing bees as they fly to and from the nest or try to defend themselves. The hornet, covered in a tough carapace, is practically impervious to the attack, while its own powerful mandibles are quite capable of tearing the poor bees apart.

In Asia, the local honeybees have coevolved with these hornets for millennia. This has resulted in the honeybees developing more effective defence mechanisms. The most famous is the ‘bee ball’ formed by Japanese honeybees. While European honeybees have been observed forming these bee balls, they do not appear to be of a sufficient intensity to effectively kill attacking hornets.


So for now the Asian hornets in Europe have the upper hand, mounting raids on honeybees that do not yet have a way to defend themselves. European honeybees have been in a perilous state for many years, with colonies collapsing at alarming rates due to a variety of factors, such as heavy pesticide use, that remain poorly understood. They really deserve a break. The arrival of a new and voracious specialised predator is the last thing that they need. Scientists and farmers alike are concerned that if Asian hornets become well established throughout Europe this will have a devastating impact on agricultural productivity. Or, in the worst case scenario, it could be the last nail in the coffin for the European honeybee.

How did they get here?

While there have only been two confirmed nests in the UK, Asian hornets are already established in France and Spain. How did they get to Europe in the first place? The most likely scenario, Dr Broad tells me, is that a hibernating queen arrived as a stowaway in a crate. It turns out that, like us, social insects such as the Asian hornet are particularly well-suited to becoming international colonisers. Though the reasons for their success are perhaps quite different from our own. The queen hornet mates with males only once in her life, during an initial mating flight. For the rest of her life the queen will carry the sperm she received in a specialised organ. She will use this to fertilise all future eggs she lays (apparently this remarkable process of internal fertilisation on demand is so precise that on average fewer than two sperm are released per fertilisation). Therefore, it only takes a single queen to establish a viable population. A lone voyager who contains within her the seeds of an entire society, waiting to bloom.

Vespa velutina have a particularly wide geographic range in Asia, and this ability to tolerate a variety of climates likely contributes to their success as invaders. However, where exactly within Asia these invertebrate pilgrims came from remains a mystery. Here is where DNA could provide the answers. The DNA sequence of an individual is a rich source of information, but to understand it we first need a road map, a ‘reference genome’ from the same species that tells us the general shape and structure of the genome so that we can make sense of the pieces from any given individual. We hope that sequencing and assembling such a reference genome for the Asian hornet can shed light on this mystery, as well as many of their other secrets. What else can we hope to learn from sequencing the Asian hornet genome?

A genomic perspective

As well as helping us identify where exactly in Asia the hornets are likely to have come from, the genome will enable us to design markers to help monitor their spread across Europe. This kind of information could be crucial to monitor which populations are spreading the fastest and which management strategies are proving most effective.

The Asian hornet genome can help us to understand the biology that underpins the amazing adaptations of this species. While this knowledge is fascinating in its own right, it could one day be used to help control their spread. For example, the Sumner lab proposes that if we can identify the chemical odorants and compounds that this species is sensitive to, that information could be used to try to manipulate their behaviour to limit their production.

Their website has more ideas on how the genome might be used, including the potential for biocontrol by gene-editing – and a list of great references if you want to get into the details.

So far, at least in the UK, vigilance has paid off. Both Asian hornet nests were quickly identified by the public and removed before they had a chance to spread. If you’d like to get involved there is an ‘Asian Hornet Watch’ app you can download (what a world!) to report sightings of Asian hornets (https://itunes.apple.com/gb/app/asian-hornet-watch/id1161238813) and a handy identification poster (https://www.pestmagazine.co.uk/en/library/posts/2015/asian-hornet-alert-poster).


I leave the Natural History Museum filled with a sense of awe, both for the Asian hornet and for the museum’s collections, which make such intimate encounters with the natural world possible. The carefully preserved and recorded specimens are a priceless trove of information. Sequencing the genome will be the latest chapter in our efforts to catalogue and understand the hornet. A new piece in an old game.

Asian hornet from above

About the authors:

Dr Alex Cagan is a postdoctoral fellow in Inigo Martincorena’s research team at the Wellcome Sanger Institute, studying mutation and selection in healthy tissues and how this relates to cancer and ageing.



25 Genomes, Sanger Science

The golden eagle genome has landed

By: Kat Arney and Rob Ogden
Date: 03.09.18


The golden eagle genome has been sequenced as part of the Sanger Institute’s 25 Genomes Project

The golden eagle is undoubtedly one of the UK’s most iconic birds. With an impressive 2-metre wingspan and striking yellow feathered legs, it’s a stirring sight if you’re lucky enough to spot one soaring over the Scottish Highlands and Islands.

While golden eagles may not be critically endangered – the IUCN’s Red List of Threatened Species lists them as being of ‘least concern’ – their habitat is shrinking. Many of the already small populations scattered through Europe, Japan and other areas are continuing to decline, which is why today’s announcement of a new golden eagle genome is so important.

The eagle genome has been completed as part of our 25 Genomes Project, which is sequencing the genomes of 25 significant UK species ranging from pipistrelle bats and Eurasian otters to spiders, starfish and summer truffles. But while the announcement of a newly sequenced species is undoubtedly exciting to fans of genomics, having a complete golden eagle genome is also a vital tool to help conservationists protect and manage these fabulous birds.

Genetics meets conservation

Conservation geneticist Dr Rob Ogden at the University of Edinburgh has been using simple DNA profiling and sequencing to monitor the genetic makeup of animal populations for at least 20 years, studying species as diverse as endangered gazelles, manta rays and (of course) golden eagles.

But, as Rob explains, while these tests can provide useful information about a population – such as how genetically diverse it is, and how individuals are related – they can only tell us so much.

“This basic information can help us when we come to make decisions about how to manage populations, but it’s based on looking at differences in small ‘snapshots’ of DNA scattered throughout the genome,” he says. “What we don’t really understand is what any of these genetic differences relate to in biological terms. If you keep a couple of populations separate for multiple generations, parts of their DNA will gradually drift apart, but we don’t know if they have any biological relevance at all.”

To draw an analogy with language, a simple alteration in spelling – for example, switching ‘recognise’ to ‘recognize’ – makes no difference to the meaning of the word. But more significant changes might alter the meaning of a word altogether, like changing ‘recognise’ to ‘organise’.

The simple DNA tools that Rob and his team have been using up until now can spot that individual ‘letters’ have changed, but they can’t identify the context of the biological ‘words’ in order to tell whether the difference is meaningful. And to read the ‘words’ in DNA (genes), you need to read the full genome.

The DNA that was used to create the new golden eagle genome came from a chick that was found dead in a Scottish nest during a raptor health study. It was read using PacBio SMRT technology. Unlike other DNA reading methods, PacBio’s technique generates very long, high-quality stretches of sequence, from which it’s easier to build a whole genome. This allowed the researchers to build what’s known as a ‘reference genome’, against which DNA from other golden eagles around the world can be compared.

Adapting to a changing world


The full genome sequence for the golden eagle will help conservation efforts

Having a high-quality full genome sequence for the golden eagle opens up a treasure trove of biological information that conservationists can use to manage species more effectively in the wild.

“Now we have the whole genome we can identify specific genes and work out what they do, so we can see whether a specific change is likely to affect what happens in a cell or in a whole animal,” Rob says.

“Golden eagles are spread around the world in lots of different habitats and climates – there are hot weather birds and cold ones, eagles in forests and others in the hills – so are their genes adapted to their local environment? Do the genetic differences we see relate to important differences in the physiology of the animals which are related to how they can best survive in that particular environment?”

This knowledge is vital for managing endangered populations effectively as the global climate changes. One conservation tool is land management – generating certain types of habitats that will encourage particular species. Another option is translocation, moving animals from one area to another or releasing captive animals back into the wild. But if those creatures are poorly adapted to the environment they’re being put into, then there’s a good chance they’ll fail to thrive.

As temperatures across Europe are expected to increase over the next century and habitats change, it’s unlikely that large species like eagles will be able to adapt fast enough to cope. Instead, the most likely solution is for populations to move north in search of cooler climes.

“We know that Mediterranean golden eagles are genetically quite different from the Scottish birds, so perhaps we might see a situation where eagles from warmer climates become better adapted to a changing habitat type in northern Europe than the existing population that’s there now,” Rob explains.

“But if it’s taken 10,000 years to evolve a particular trait, there’s no way that’s going to adapt to climate change in the next hundred years, so understanding how these locally adapted populations have come about is really important for predicting how we can manage species in the future.”

Taking flight


Golden eagle in flight. Image credit: Martin Mecnarowski, Wikimedia Commons.

The completion of the golden eagle genome as part of the 25 Genomes Project is only the first part of the story. The golden eagle has been selected as one of the species to go forward into the Genome 10K project, which is carrying out detailed analysis of DNA from around 10,000 vertebrate species. Researchers will be using a technique called optical mapping to get an even more detailed picture of how the eagle genome is organised and to make sure they haven’t missed any bits.

“The golden eagle has been promoted up to the Premier League in terms of the quality of genome that we are going to obtain for it in the future,” Rob says. “The genome we have now is more detailed than anything that has been done before with golden eagles by quite a long way, but the next step is to make it way better – the best of all wild bird species.”

Having been lucky enough to watch a pair take flight in the hills on the Scottish island of Skye, swooping and circling round each other in a charming courtship dance, I find it easy to argue that golden eagles themselves are better than a lot of other birds.

“They certainly are very cool!” laughs Rob. “They’re an iconic species in the UK – people recognise them and are proud of them, but that’s true in every culture where you find golden eagles. It’s something that helps with conservation education because people can really relate to these animals and support projects that focus on saving them. They’re fantastic birds to work on.”

About the authors:

Dr Kat Arney is a science writer, public speaker and broadcaster, and author of the popular genetics books Herding Hemingway’s Cats and How to Code a Human. 

Dr Rob Ogden is Head of Conservation Genetics at the University of Edinburgh and a scientific adviser to the South of Scotland Golden Eagle Project.




25 Genomes Project update
25 Genomes

25 Genomes update. Yes, it’s been a while …

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 12/06/2018

25 Genomes Project, Wellcome Sanger Institute

The project had been progressing at a steady-ish rate until a few weeks ago, but now we’ve run into some technical problems.

We’re using a number of different technologies to make the final genomes of our 25 species. They all serve slightly different purposes, with the aim that they complement each other. Combined, these technologies (and the clever people and computer programs that check the data) mean that we can make very, very good quality genomes in a matter of months (possibly better than the human genome, which took over 10 years with the old stuff).

So where are we now?

PacBio complete for 13

Pacific Biosciences SEQUEL system. This is the main thing we use. You can get a pretty good genome with this technology alone, as it uses long bits of DNA (about 50,000 letters). It works in a similar way to most other technologies: it labels the DNA letters with coloured dyes and takes photos of them as they are added to the bit of DNA in the well. The difference is the scale – this tech means you can ‘read’ tens of thousands of letters of DNA per well (and there are 1 million of those), leading to a better genome. See the video below for a better explanation.

10X complete for 16

10X Genomics Chromium system. This is a clever new use of existing Illumina sequencing capabilities. This tech basically allows us to map smaller bits of DNA into a larger picture.

Hi-C complete for 2

This was invented by Erez Lieberman Aiden and gives an even bigger picture of how the bits of DNA fit together, allowing it to be put together in chromosome-sized chunks.

Bionano Genomics SAPHYR.

This is another way of fitting DNA together, and it is especially useful for spotting large chunks of DNA that have moved around somewhat.

[basic] Genome assembly complete for 14

So, not bad progress. We’re a little delayed, but OK for now.

The trouble with starfish

However, some species are proving to be rather problematic, most notably the starfish. We got [a lot] of sperm from one starfish* a few months ago, thinking that, as the sole purpose of sperm is to deliver DNA to an egg, it would be a good place to start. Wrong.

For some as yet unknown reason, the DNA in starfish sperm is oddly fragile: when we tried to extract it from the cells it broke up into bits only 200 letters long – WAAAAY shorter than the 150,000 aimed for.

*You might wonder how we got starfish sperm. Apparently there’s a special chemical (called GSS, ‘gonad stimulating substance’) that you inject into the starfish that makes them – shall we say – ‘produce’ the sperm in surprisingly large quantities.

Flatworms aren’t too helpful, either

Working with flatworms hasn’t been straightforward either. Their sliminess is a problem, but not the only issue. The worms are essentially just a long gut surrounded by a bit of muscle and other anatomical odds and ends. This means they have a lot of nasty enzymes and other digestive juices inside that are specifically designed to break up long molecules (see below for a video).

When you combine sliminess and a large concentration of enzymes with the effects of freezing for storage, you end up with what was affectionately labelled ‘a zombie worm mush’ by our wormologists. Needless to say, the DNA was not of a usable quality.

And as for truffles…

Truffles, too, seem not to like having their DNA extracted. After a few unsuccessful attempts we’re going to try a technique from 1992 that gave good results in the paper it came from and seems simple, so fingers crossed…

About the author:

Dan Mead is the 25th Anniversary Sequencing Project Coordinator for the 25 Genomes Project at the Wellcome Sanger Institute, Cambridge.

More on the 25 Genomes Project:

25 Genomes Project web page