On choosing the 25 species for our 25 Genomes project
Categories: Tree of Life | 8 January 2018 | 11.7 min read

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 08/01/2018

For those who don’t know (and until recently I could include myself in this group), there are A LOT of species on and in the earth. Currently it’s estimated that there are 2 billion! (2,000,000,000; see http://www.journals.uchicago.edu/doi/10.1086/693564 for details). Most of these are bacteria, and we’re not looking at those for the 25 Genomes project, but this still leaves about 450 million to choose from.

To make it easier for ourselves, we also decided to limit ourselves only to the 1.5 MILLION species that have currently been described and catalogued. And, to help us along a bit more, we decided that only species found in the UK would count. According to the National Biodiversity Network, that brings the number down to ‘only’ 56,674. Now if you choose to only look at the local area surrounding the Sanger Institute then it’s a much more manageable 318.
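To get a feel for how drastic each of those cuts is, here is a minimal Python sketch that works out the share kept at each step. The counts are the ones quoted above; the labels and the `retention` helper are made up for illustration, and treating each figure as a strict subset of the one before is a simplification.

```python
# Species counts quoted in the post, from broadest to narrowest pool.
# Labels are illustrative; each pool is assumed to be a subset of the previous one.
funnel = [
    ("all species (estimated)", 2_000_000_000),
    ("excluding bacteria", 450_000_000),
    ("described and catalogued", 1_500_000),
    ("found in the UK", 56_674),
    ("local to the Sanger Institute", 318),
]


def retention(steps):
    """Return (label, count, fraction kept from the previous step) per step."""
    rows = []
    previous = None
    for label, count in steps:
        kept = 1.0 if previous is None else count / previous
        rows.append((label, count, kept))
        previous = count
    return rows


for label, count, kept in retention(funnel):
    print(f"{label:32s} {count:>13,d}  {kept:8.3%} of previous step")
```

Even the UK-only list is a tiny sliver of the described species, which is why filtering by geography alone was never going to get us to 25.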

However, it wasn’t going to be that easy. In the spirit of the Sanger’s inclusive approach to science, the Steering Group for the 25 Genomes project were concerned that such a narrow list was ‘too parochial’ and directed that the species sequenced should be a representative group of organisms from the whole of the UK.

So, how do you filter more than 56,000 species down to just 25?

The first thing to do was to break down the problem, and the idea of a 5×5 matrix was mooted, discussed and agreed upon surprisingly quickly. Rather unsurprisingly, coming up with five different categories was not as straightforward as it might first appear. While some were no-brainers (iconic species, for instance), getting all five nailed down was tricky.

The wisdom of crowds

So we put out a call for suggestions to the whole Wellcome Genome Campus, to draw on the collective wisdom of the more than 2000 people who work here.

The results were, by turns, pleasing, odd, not-at-all-answering-the-question and esoteric. Here are some examples:

  • Species for which Britain has major global richness and conservation responsibility
  • Female emancipation in the wild
  • Unusual in terms of genetic load accumulation rate and mechanism
  • The three-toed sloth (which is neither a theme nor from the UK)
  • 25 local authors (and then we would really have 25 ‘novel’ genomes)
  • Species imported to the UK, which are making our lives healthier and happier (possibly a politically motivated suggestion)
  • What is ‘down there’ (in the detritus level down on the Ocean floor).

Finding five themes

Armed with these suggestions, the 25 Genomes Steering Group got back together to hammer out the final five categories. Here’s what we decided upon, reasoning that these themes should give a broad range of organism types and habitats to sample:

5 Themes for the 25 Genomes Projects: Flourishing, Floundering, Cryptic, Iconic and Dangerous

Critical criteria

We also came up with a list of criteria that the species must meet:

  1. Scientific justification must be solid: are there good questions that can be answered by the genome sequence being made available?
  2. No decent draft sequence currently available
  3. Sample availability: some organisms are too small, others are too protected, while others are too seasonal for collection
  4. Tractable genome: some organisms have genomes that are incredibly complex and would take up too much time and resource. For example, many plants have cells that contain multiple copies of the same[ish] chromosomes, a phenomenon known as increased ploidy. (A hexaploid genome has SIX copies of each chromosome, and some plants have even more.)
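As a rough illustration of how the four criteria act as a filter, here is a minimal Python sketch. The `Candidate` record, its field names and the example entries are all hypothetical; in reality each species was reviewed individually rather than by ticking boxes.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one candidate species (field names are assumptions)."""
    name: str
    has_scientific_case: bool  # criterion 1: good questions the genome can answer
    has_decent_draft: bool     # criterion 2: must be False to qualify
    sample_obtainable: bool    # criterion 3: not too small, protected or seasonal
    genome_tractable: bool     # criterion 4: e.g. not highly polyploid


def meets_criteria(c):
    """A candidate passes only if all four criteria are satisfied."""
    return (
        c.has_scientific_case
        and not c.has_decent_draft
        and c.sample_obtainable
        and c.genome_tractable
    )


# Made-up examples, not real review outcomes.
candidates = [
    Candidate("Example species A", True, False, True, True),
    Candidate("Example species B", True, True, True, True),    # draft already exists
    Candidate("Example species C", True, False, True, False),  # genome too complex
]

shortlist = [c.name for c in candidates if meets_criteria(c)]
print(shortlist)  # ['Example species A']
```

A species has to pass every criterion, so a single failure (an existing draft, say, or a hexaploid genome) is enough to rule it out.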

Now comes the hard part: actually getting the list of species. As mentioned in a previous post, our public engagement team suggested that we let the public decide five of the species, leaving us just 20.

Great, you might think, as it means we don’t need to do as much work, but you’d be sadly mistaken. The reality was that I now needed a list of 20 to start collecting right away AND another 40+ that the public could vote on to decide the final five!

It's who you know...

Rather splendidly, we have a senior member of the Natural History Museum, London on our steering group, which meant we could exploit their contact list of some 400+ partner groups of wildlife experts. With this in mind I made a SurveyMonkey survey (it’s still about, so you can check it out here; feel free to fill it in, as you never know, we might want to do more!) that, in my mind at least, cunningly hid the criteria in the questions. It also deliberately did not mention the themes, so as not to steer people in any particular direction.

From this I got 99 responses (again, discussed earlier), which made up most of the list for the public vote and the 20* to be getting on with. The latter are in the table below:

Cryptic                  | Dangerous            | Floundering                   | Flourishing          | Iconic
Brown Trout              | Indian Balsam        | Red Squirrel                  | Grey Squirrel        | Golden Eagle
Common Pipistrelle       | King Scallop         | Water Vole                    | Ringlet butterfly    | Blackberry
Carrington’s Featherwort | New Zealand Flatworm | Turtle Dove                   | Roesel’s Bush-cricket | European Robin
Summer Truffle           | British Mosquito     | Northern February Red Stonefly | Oxford Ragwort       | Orange-tailed Mining-bee

All in all, it took about five months to get to this stage, as the species also needed to be individually reviewed to see if they met the criteria and then approved by the steering group.

Now the only problem is actually getting the species’ DNA, so collecting specimens and some lab work are to follow: the supposedly easy part…

More on this to come!

*Why we chose the above 20 species

Species | Why sequence it?
Summer Truffle | There is disagreement in the literature as to whether this truffle is one species or two. It also grows underground and is therefore largely unseen and difficult to locate. Prices for those collected in the UK remain relatively stable at around 400 GBP per kilo. These fungi are mycorrhizal: they form a symbiotic association with a host plant on which they depend throughout their life cycle. Sequencing UK T. aestivum syn. uncinatum populations would be pivotal in answering questions about modes of reproduction and life cycle, and would help with some core speciation questions.
Brown Trout | The Brown Trout has three forms that differ in their migratory patterns. The first remains in the locality of its birth, where it lives out its life, spawns and dies. The second migrates from lakes to streams and rivers to spawn but remains in fresh water. The third migrates to the sea and remains there for much of its life, only returning to spawn. There appears to be no genetic difference between the migratory (anadromous) and resident forms. Additionally, the Wellcome Genome Campus is built around an 18th-century red-brick hall, Hinxton Hall, also known as Trout Hall, where a carved stone trout is prominently displayed over the main door to the croquet lawns.
Carrington’s Featherwort | This was selected as a representative of the liverworts, an ancient plant group predating flowering plants. It is one of the characteristic liverworts of very high rainfall areas in Scotland, and thus a representative of one of the very special groups of the British biota confined to such areas. Outside Scotland, it is only found in Ireland (extremely rare), the Faeroes and the Himalayas. The Scottish plants are apparently all male: like the Ents, the sexes have become separated in this species, and the nearest females are in the Himalayas.
Common Pipistrelle | Until recently this bat was believed to be a single species; however, it is now known to be two species (common and soprano), with one other pipistrelle (Nathusius’) also being resident in the UK. Studying the genome will allow us to investigate the origins of the split between the two species: when and why it occurred.
Indian Balsam | A highly invasive weed that takes substantial effort to control. Control methods based on findings from the genome would have important implications for wetland and river management.
King Scallop | Pecten maximus has been found to contain the Amnesic Shellfish Poisoning toxin, domoic acid, which accumulates after the scallops consume algae/diatoms, especially in the event of algal blooms. This risk is regarded as a significant threat to both public health and the shellfish industry. Some studies have suggested that global warming is resulting in greater reproductive success for P. maximus in the UK; however, concerns have been raised over increasing mortality, declining recruitment and declining spawning stock biomass in several Scottish populations. Pecten maximus is also of scientific interest because of its unusual vision and because its two shell valves are coloured differently. Identifying molecular pathways for shell pigment production in Mollusca has lagged behind studies of vertebrates and terrestrial invertebrates, and is a major gap in our understanding of how colour has evolved in the natural world. Vision in Mollusca is also of great interest because of the many different eye morphologies and the fact that very few species are thought to see in colour.
New Zealand Flatworm | New Zealand flatworms prey on earthworms, posing a potential threat to native earthworm populations. Further spread could have an impact on wildlife species dependent on earthworms (e.g. badgers, moles) and could have a localised deleterious effect on soil structure.
British Mosquito | Mosquitoes are important disease vectors, and there has been speculation that an increase in the distribution of other species due to climate change could allow the re-introduction of diseases such as malaria to the UK.
Red Squirrel | Sequencing the whole genome of the native red squirrel will hopefully provide new tools and resources for reversing its decline and aiding its long-term conservation in the UK. For example, this research could reveal key insights into how red squirrels have adapted to living in an urban environment. This study could also provide further information for managing the spread of diseases and helping to protect the red squirrel from the fatal squirrelpox virus, as well as a deeper understanding of the impact of newly discovered diseases.
Northern February Red Stonefly | These stoneflies only inhabit the purest of waters, so they are very limited in their habitats and may struggle to adapt to climate change. Brachyptera putata is an endemic UK stonefly. There have been suggestions that other European Brachyptera species may be synonyms of B. putata. Sequencing would determine whether it is a true UK endemic.
Turtle Dove | Turtle Dove numbers have fallen by a staggering 93% since 1970, and the species is now on the Global Red List for Endangered Species. Smaller than its collared cousin, the Turtle Dove is now only found in eastern England, where farmers are working with the RSPB to create feeding habitats, the destruction of which is blamed for the bird’s decline.
Water Vole | The Water Vole is the UK's fastest declining mammal, and efforts to help the population maintain genetic fitness would benefit from having the genome sequenced. Arvicola is a fantastic example of a small mammal genus that survived through the last glaciation and has adapted to a range of habitats across Europe and much of northern Asia.
Oxford Ragwort | The Oxford Ragwort is representative of a species being introduced and excelling in another habitat. It was collected from the slopes of Mount Vesuvius sometime in the 17th century and planted in Oxford, where it rapidly colonised the area thanks to its natural hardiness and its ability to grow on urban landscapes too (sides of buildings, on stairs, etc.). When railways were introduced to the UK landscape, they facilitated the spread of Oxford Ragwort across the UK (it can be found growing along railway tracks today). Sequencing the genome would improve our understanding of a non-native species excelling in a new habitat and may expand our understanding of the ecology of flowering plants.
Roesel’s Bush-cricket | Once restricted to the south coast and estuaries (saltmarshes), it is now widespread, possibly due to climate change and the spreading of salt on UK roads.
Ringlet butterfly | Despite an overall decline in butterflies over the last 50 years, the ringlet has increased its population by nearly 400%. It’s one of the few to fly on overcast days and has an interesting dwarf form that appears at 600 ft, increasing until 100% of the population is this form at 1000 ft.
Grey Squirrel | The anti-hero to the red squirrel: sequencing would allow us to investigate how and why grey squirrels tolerate the squirrelpox virus.
Blackberry | A good opportunity for citizen science and population genomics, specifically for schools engagement. It is also relevant to commercial soft-fruit genetics, as it is an important and expanding food crop.
Golden Eagle | This is an iconic UK species that has suffered from hunting and pesticide poisoning in the past, leading to extinction in all parts of the UK except Scotland, where there are still fewer than 500 breeding pairs.
Orange-tailed Mining-bee | This species is conspicuous and attractive, one of the mining bees that is more likely to have come to the attention of the general public. It is widespread and common throughout the United Kingdom, flying in spring. It is a component of natural pollination services, which can ensure crop pollination in the absence of honeybees, as well as the pollination of many wild and garden flowering plants, ensuring their genetic diversity and conservation. In the UK, of 276 species of bee, there is only one honey bee and a score of bumblebees; the great majority of native bees are mining bees, including 68 species of Andrena. The genome sequence itself will be useful for comparing this solitary bee with the available genomes of social bees, in terms of gene composition relevant to sociality.
European Robin | Robins use vision-based magneto-reception, and the mechanism, which may involve quantum entanglement, is not fully understood. Robins are also extremely territorial, unlike most other songbirds, with up to 10% of all deaths occurring due to fights.

About the author:

Dan Mead is the 25th Anniversary Sequencing Project Coordinator for the 25 Genomes Project at the Wellcome Sanger Institute, Cambridge.

More on the 25 Genomes Project:

25 Genomes Project web page