Category: 25 Genomes

Providing 25 new genomes to support conservation and science

25 Genomes update. Yes, it’s been a while …

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 12/06/2018

25 Genomes Project, Wellcome Sanger Institute

The project had been progressing at a steady-ish rate for a while, up until a few weeks ago and now we’ve run into some technical problems.

We’re using a number of different technologies to make the final genomes of our 25 species. They all serve slightly different purposes, with the aim that they complement each other. Combined, these technologies (and the clever people and computer programs that check the data) mean that we can make very, very good quality genomes in a matter of months (possibly better than the human genome, which took over 10 years with the old stuff).

So where are we now?

PacBio complete for 13

Pacific Biosciences SEQUEL system. This is the main thing we use, you can get a pretty good genome with this technology alone, it uses long bits of DNA (about 50,000 letters). This works in a similar way to most other technologies as it labels the DNA with coloured dyes and takes photos of them as they are added to the bit in the well. The difference is the scale- this tech means you can ‘read’ 10s of thousands of letters of DNA per well (and there are 1 million of those), leading to a better genome. See the video below for a better explanation.

10X complete for 16

10X Genomics Chromium system. This is a clever new use of existing Illumina sequencing capabilities. This tech basically allows us to map smaller bits of DNA into a larger picture.

Hi-C complete for 2

This was invented by Erez Lieberman Aiden and gives an even bigger picture of how the bits of DNA fit together, allowing it to be put together in chromosome-sized chunks.

Bionano Genomics Saphyr.

Another way of fitting DNA together, this is especially useful to see large chunks of it that have moved around somewhat.

[basic] Genome assembly complete for 14

So not bad progress. We’re a little delayed, but ok for now.

The trouble with starfish

However, some species are proving to be rather problematic, most notably the starfish. We got [a lot] of sperm from one starfish* a few months ago thinking that as the sole purpose of sperm is to deliver DNA to an egg it would be a good place to start. Wrong.

For some, as yet unknown, reason the DNA in starfish sperm is oddly fragile- when we tried to extract it from the cells it broke up into bits only 200 letters long- WAAAAY shorter than the 150,000 aimed for.

You might wonder how we got starfish sperm. Apparently there’s a special chemical (called GSS- ‘gonad stimulating substance’) that you inject into the starfish that makes them- shall we say- ‘produce’ the sperm in surprisingly large quantities.

Flatworms aren’t too helpful, either

Working with flatworms hasn’t been straightforward either. Their sliminess is a problem, but not the only issue. The worms are essentially just a long gut surrounded by a bit of muscle and other anatomical odds and ends. This means they have a lot of nasty enzymes and other digestive juices inside that are specifically designed to break up long molecules (see below for a video).

When you combine sliminess and a large concentration of enzyme with the effects of freezing for storage, you end up with what was affectionately labelled ‘a zombie worm mush’ by our wormologists. Needless to say the DNA was not of a usable quality.

And as for truffles…

Truffles, too, seem not to like having their DNA extracted. After a few unsuccessful attempts we’re going to try a technique from 1992 that gave good results in the paper it came from and seems simple, so fingers crossed…

About the author:

Dan Mead is the 25th Anniversary Sequencing Project Coordinator, for the 25 Genomes Project for the Wellcome Sanger Institute, Cambridge.

More on the 25 Genomes Project:

25 Genomes Project web page 

Job satisfaction: helping flatworms to chill out

On Job Satisfaction

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 08/05/2018

People often seem to gripe about their job but I, however, am happy to put myself firmly in the ‘extremely satisfied’ category. Aside from working on one of the coolest projects at a world-renowned science-y place, I think the main reason might be the sheer diversity of what I do.

Here are some of the things that I’ve been up to over the past few weeks.

Communing with nature

I had a wander out to the Genome Campus wetlands to find the Himalayan Balsam. The plant’s Latin name, Impatiens glandulifera, comes from the way it spreads its seeds – when disturbed the seed pods explode, flinging the seeds out to a distance of up to 7 metres!

So why was I wandering about a nature reserve? Well, we had run out of the sample we collected last year for genome sequencing. The new sample will be used to test DNA recovery methods before the nice people at Reading University send samples of Himalayan Balsam that are resistant to the rust fungus used to control its spread.

Exporting Golden Eagle heart

I’ve drafted a CITES (it’s the treaty that governs endangered species samples) application for exporting some Golden Eagle heart for analysis in the US.

Being a Bat-man

Discussed the ins and outs [pardon the pun] of dissecting a bat.

Working the numbers

Made a list of all recorded species in the UK, then assigned them into families and worked out the average genome size and total amount of sequence that represents. In case you were wondering, it amounts to 85,000,000,000,000 letters [bases] of DNA- or the equivalent of over 20 million copies of War and Peace.
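For anyone who wants to check the book comparison, here is a back-of-envelope sketch. The character count for the novel is my own rough assumption (it is not a figure from this post), so treat the result as an order-of-magnitude sanity check only.

```python
# Back-of-envelope check of the "copies of War and Peace" comparison.
TOTAL_BASES = 85_000_000_000_000  # 85 trillion letters, the figure quoted above

# ASSUMPTION (not from the post): an English edition of War and Peace
# runs to very roughly 3 million characters.
WAR_AND_PEACE_CHARS = 3_000_000

copies = TOTAL_BASES / WAR_AND_PEACE_CHARS
print(f"roughly {copies / 1e6:.0f} million copies")
```

With that assumed book length the answer lands comfortably above the 20 million mark; a longer or shorter edition would shift the number but not the scale.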

Helping worms to chill out

Received some slimy worms in the post, put them in a fridge.

Bought (with my own money) some clay granules to try and make the worms a little more comfortable- they act as a contaminant-free soil that keeps them nice and moist.

Pre-fridge, this is what a reasonably happy flatworm looks like

Calling on the kindness of my wife

Booked on a conference in Vienna (nice), found out I need to fly from Manchester the day I’m coming back from a christening, and return to LHR (not so nice). This is not the main problem however- it means my wife needs to drive for 3 hours with two adorable over-sized bacterial/viral culture vessels (I call them Alex and Ben). Many apologies were given.

Becoming an accomplished host

Arranged catering for a meeting (also re-arranged the chairs/tables)

Had numerous tele-conferences with participants in Germany, USA, China, Hungary etc.

Roesel's Bush Cricket: The trouble with crickets and their ever increasing genomes... Image: Richard Bartz, Wikimedia Commons

The trouble with Crickets

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 23/04/2018

The type of cricket (Roesel’s Bush Cricket, Bicolorana roeselii or Metrioptera roeselii) we decided to sequence is interesting because it has spread out of its traditional salt-marsh environment to the interior of the country. We want to know if this is because it has adapted to live in less saline conditions or if it’s been possible due to the increased salt spreading on roads making corridors for the crickets to move along (or a combination of both).

This was one of the first species we received (from Björn Beckmann & Peter Sutton of the Orthoptera & Allied Insects group) late in the summer of 2017. We got three, all from a field in Oxfordshire, and it turns out they’re not averse to a little cannibalism – one of them ate the back legs of its roommate (the other was in a separate container although was also missing a leg) before I could separate them. Seeing as I was feeling a little mischievous I named them Hannibal, Oscar and Heather (despite them all being male – I took some creative licence).

Getting the DNA from Oscar was one of the easier extractions: good yield and reasonable (although not the best) quality, certainly good enough for PacBio sequencing though.

This is a Femto Pulse trace of the DNA fragment size, here it’s mostly in the 20Kb+ range, ideally it’d be bigger- a perfect trace has one giant peak at ~165Kb

Later extractions also gave better DNA for the 10X sequencing and so things were going swimmingly. I’d estimated that the genome size for this was ~2Gb, based on the average cricket genome from the Animal Genome Size Database, so quite large for an insect, but reasonable enough for this project.

Little did I know the seemingly unending horror show that now befalls us …

Initially things progressed as expected, the PacBio sequencing went well – producing >95Gb data. Likewise for the 10X, we got 120Gb from that, so ~50X coverage for both.

Things started to get a bit icky when the assembly first failed for PacBio, then for 10X. A PacBio miniasm assembly then came back with a revised genome size of 2.8Gb, bigger than expected but not too bad at this point, although the N50 was terrible (76Kb).
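For anyone who hasn’t met N50 before: it’s the contig length at which contigs of that length or longer hold at least half of all the assembled bases, so bigger is better. Here is a minimal sketch of the calculation; the contig lengths are made up for illustration and are not our real data.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    if not contig_lengths:
        raise ValueError("need at least one contig")
    half_total = sum(contig_lengths) / 2
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length

# Hypothetical contig lengths (in Kb), purely for illustration.
toy_contigs_kb = [400, 300, 150, 76, 50, 24]
print(n50(toy_contigs_kb))
```

An N50 of 76Kb on a multi-gigabase genome means the assembly is in a very large number of small pieces.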

The next thing that happened was a kmer-based quality control report – this gave the genome size as 4.6Gb! We’re definitely into the realm of the unexpected now … this reduces our effective coverage to ~20X, waaay less than is needed for a decent assembly.

Finally (after running out of memory a few times) Supernova ran on the 10X data. This returned a gut-wrenching estimated genome size of 7.5Gb!

Combine this with the heterozygosity estimate of around 3.04% and everything looks a little wonky.
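The coverage figures quoted above are just the amount of sequence data divided by the genome size, which is why every bigger size estimate hurts. A rough sketch using the numbers from this post (the function and variable names are only for illustration):

```python
def coverage(data_gb, genome_size_gb):
    """Approximate sequencing depth: total bases sequenced / genome size."""
    return data_gb / genome_size_gb

PACBIO_DATA_GB = 95   # ">95Gb" of PacBio data, as quoted above
TENX_DATA_GB = 120    # 120Gb of 10X data, as quoted above

# Genome size estimates quoted in this post, in Gb.
estimates_gb = {
    "initial guess": 2.0,
    "miniasm": 2.8,
    "kmer QC": 4.6,
    "Supernova": 7.5,
}

for source, size in estimates_gb.items():
    print(f"{source:>13}: PacBio ~{coverage(PACBIO_DATA_GB, size):.0f}X, "
          f"10X ~{coverage(TENX_DATA_GB, size):.0f}X")
```

At the original 2Gb guess both data sets sit at roughly 50X; at 4.6Gb the PacBio data drops to about 20X, and at 7.5Gb it is lower still.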

So what went wrong?

I’ve just been back to the genome size database and there is an outlier in the sizes – the camel cricket (Ceuthophilus stygius) which is a cave cricket from North America.

By James St. John – Ceuthophilus stygius (camel cricket) inside entrance to Great Onyx Cave (Flint Ridge, Mammoth Cave National Park, Kentucky, USA), CC BY 2.0

This beauty has a genome size of 9.55Gb!

So the question is how likely is this to be the case (or close to) for our cricket?

Taking all the crickets with known genome sizes from the database (there aren’t that many – 7 – one of which is the gloriously named ‘unwelcome mole cricket’ Neoscapteriscus borellii) and putting them into the phyloT tree generator and iTOL (Interactive Tree of Life) gives you this:

Brown curry mole crickets in a can: I don’t think you can buy these anymore

Sorry, that’s just a can of curried crickets, the tree looks like this:

Unwelcome mole crickets are unwelcome in NCBI apparently, there’s no taxon number so no tree entry.

From this it looks like our Tettigoniidae bush cricket pre-dates our large-genomed friend the camel cricket (a Gryllacrididae) and split from the ‘true’ crickets (the Gryllidae) a while back. But how far?

Then we used another online resource, the timetree, to see when this split occurred. From the below you can see it was ~270MYA, which is a long time, plenty of time for some weird genome expansion to have happened I guess.

Gryllidae and Gryllacrididae separated 100MY before Tettigoniidae diverged from Gryllacrididae (~172MYA).

You may have noticed that this tree is a little different, this is for two reasons:

  • It’s a simple expansion of the last shared taxon group, the Ensifera.
  • The Gryllacrididae and Tettigoniidae split from the Rhaphidophoridae, not the other way around.

Before you ask, no I don’t know why, but I assume the latter is correct as the first tree lacks all the taxon groups for an input.

The sole example of the Rhaphidophoridae taxon has a 1.55Gb genome and as this line goes back to the common ancestor of the Roesel’s cricket it could be that our initial estimate is true OR, more likely, there’s been some horrible expansion that involves (multiple?) genome duplication events.

The thing that’s really annoying is my own lack of knowledge and tendency to make (in this case stupid) assumptions – who knew that Gryllacrididae and Gryllidae are actually more distant than Gryllacrididae and Tettigoniidae? Taxonomists probably, or someone who studied classics.

Anyway we’re doing some more sequencing to get extra 10X data, hopefully this will answer the question once and for all… stay tuned!

A cautionary tale about blackberries [2/2]

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 09/03/2018

At the end of the last post things were looking up, a source of plant material was found, it was now just a case of waiting for the seeds to be delivered so we could grow some up.

This was just before I was due to go to the Plant and Animal Genome conference (PAG) in San Diego, California, in January this year. For those of you thinking “wow, that must be great, getting to go to a nice sunny place for work- must be like a free holiday”, no. Academic conferences are not an excuse for a jolly, the schedule for these generally means you work LONGER hours than normal.

PAG this year ran from the Friday 12th January to Wednesday the 17th- including the weekend, with talks etc scheduled from 8am-6pm (not including extra workshops in the evenings). Marry that with a full day’s travelling on either end and you can hopefully see my point.

Anyway, I’ve never had a need to go to this conference before as I’ve not worked with plants or animals (except malaria, but that’s a disease, even though technically an animal) so I thought I’d go to a wide variety of talks* to see what the craic was.

*There’s everything from wheat to water buffalo, insect genome assembly to livestock breeding.

One of these that seemed vaguely relevant to the project was a talk on the genetics of cherries. These are a big deal in Japan, where the speaker was from, with red skinned fruit and white flesh being the most desirable traits.

£106 for 40 cherries anyone?

Some interesting stuff in this talk- apparently cherries have ~44,000 genes in a 350Mbp genome (humans have ~20,000 in a 3,000Mbp genome) and most trees there are bred from only a few (2 or 3 I think) original sources.

This talk was coming to an end and I was about to leave when up pops a lady with an announcement about the “Rosaceae Rosexec meeting” that was in a couple of days’ time. Blackberries are a member of this fruit family so I decided to crash the meeting (not really, I asked politely and was invited to attend).

As is fairly routine at these sorts of things, there was a ‘stand up and introduce yourself and your research’ bit at the beginning which is all well and good. Being new, and having got a bit lost^ trying to find the room, I ended up near the back and was one of the last to speak up.

^PAG is a pretty big conference, there are over 3000 people there and there are dozens of rooms where talks/meetings/workshops happen.

Up I get and proceed to tell people that we’re sequencing the blackberry genome to a largely pleased audience when I notice one person giving me the daggers.

I’d already noticed her and was planning on chatting later as she mentioned blackberries in her intro. The rest of the meeting was great, lots of good work being done on soft fruit.

So at the end of the meeting I go and introduce myself again and explain what we’re doing in a bit more detail. Turns out that Margaret (Blackberry geneticist, not the Iron Lady) has already started sequencing the species and has sunk a bunch of her laboratory start-up seed money into it so that was why she was a bit miffed.

This was, however, the start of what I hope will be a very fruitful [ahem] partnership! After I’d told her that we were planning on releasing the data publicly and would be happy to finish the rest of the sequencing (as part of a new collaboration) things started looking up. We’re now working together, with other fruity people, to get this done. The combined efforts mean the cost is spread around nicely – and now I have actual experts in fruit genomes to help!

The lessons learnt here:

  • Don’t assume there’s only one species, there may be many that look the same (to the untrained eye)
  • Don’t be afraid to call people out of the blue, most often they’re as helpful as can be
  • Conferences are great for meeting people to work with
  • A bit of luck never hurts!

Finally, I met a chap at the Rosexec meeting who must have googled the 25 Genomes project whilst there as he approached me afterward with a bit of sage advice. We’re planning on sequencing the New Zealand Flatworm (it’s an invasive species in the UK) and he said we really should consult the Maori on this, which is now happening – updates to follow (if it’s interesting that is).

A cautionary tale about blackberries…[1/2]

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 08/03/2018

Blackberries – they’re everywhere right, in gardens, hedgerows, at the side of the road; in fact pretty much anywhere you go you can find a blackberry bush (?shrub, ?tree, ?thicket – who knows what the noun is!). This should make finding one for the project as easy as pie, I thought…

As an aside, before we included blackberry in the project I checked on the Kew Gardens plant database to see if they have a suitable genome and, hooray, they do. It’s 450Mb (about 1/13th the size of the human genome) and diploid as well – so no odd chromosome duplications* to worry about.

Genomes Assemble!

It’s important to have species that are haploid (single copy of each chromosome- lots of insects have haploid males) or diploid (two copies of each chromosome, like humans- we have 23 pairs). This is because putting together the bits of DNA from the sequencing is much more difficult if there are more than two copies of everything. It would be simple if the copies were exact but there are always small differences between them; sometimes single bits of DNA vary, sometimes small sections are missing or duplicated or mirrored (inverted) etc.

Imagine that your genome of interest only had one chromosome- pictured as say, a jigsaw puzzle, let’s then say as a picture of a cat- in this case the @genomecat, Quincy.

This is the easiest to assemble, to MASSIVELY oversimplify things, you just match the edges to get the picture.

For two copies (diploid) it’s a little more complicated, twice the number of pieces and some will be a little different. Here you filter out the bits that look different and put these in the second chromosome (or ‘alternative haplotype’).

Things get really hard when you enter the murky world of polyploidy (lots of copies of chromosomes). In this example our cat has 4 chromosomes (tetraploid) – all slightly different. The problem comes in trying to put each different piece in a separate chromosome, this is fine if 4 pieces all look different (or the same) but if there are 2 identical pieces how do you know where to put them?

It’s a simplistic way of looking at things but [sort of] gets to the point – assembling genomes like this is tricky, so we like to avoid this if possible!
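To make the jigsaw analogy slightly more concrete, here is a toy sketch of ‘matching the edges’ for the simple single-chromosome case: keep merging the two fragments whose ends overlap the most. It is a deliberately naive greedy approach for illustration only, not how real assemblers (or the tools used on this project) work, and the reads and function names are made up.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(reads)
    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_len:
                        best_len, best_i, best_j = size, i, j
        if best_len == 0:
            break  # no edges match any more; leave the pieces separate
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags

# Made-up "jigsaw pieces" cut from the sequence ATGGCGTGCAATCC.
toy_reads = ["ATGGCGT", "GCGTGCA", "TGCAATCC"]
print(greedy_assemble(toy_reads))
```

With one clean copy the pieces snap back together. Add a second, third or fourth nearly identical copy and the same overlaps stop telling you which chromosome a piece belongs to, which is exactly the polyploidy headache described above.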

Back to Black(berries)

Ok, on with the blackberry story. Finding a plant was, unsurprisingly, very easy- there’s a big thicket on the grounds that’s about 3m high and must cover 100m sq. or so. So we got some leaves and stored them to await extraction.

A few weeks later I went on a nice trip down to the Natural History Museum in London to chat to some botanists (Fred Rumsey and Mark Carine) about preserving some samples for their collection. This involves getting some of the plant and pressing it flat/drying it out to act as an example of what the species looks like – so people can check we’ve got what we say we have (anyone can go and look at the collection there, by the way, some of the plants were collected hundreds of years ago!).

I think this is a ragwort.

We got chatting and I mentioned that we were doing the blackberry as part of the project. At this point Fred drops a casual comment that nearly made me soil myself:


“That’s interesting, which species are you doing? There are well over 300 in the UK…”

“Go on,” I say, “pray tell me about these 300 species.” (the exact conversation escapes me, it’s like a half remembered feverish nightmare now)

“Oh yes, there’s a whole book on them- I think we’re up to about 360 by now- you know the only way to identify a species confidently is to observe its life-cycle for at least a full year, most likely two to be sure. Wait whilst I find the book…”

[large thud as this tome hits the desk]

“Here it is, you can see how to identify them from this.”

“Thanks Fred,” [I hope I said that and not what I was thinking] “very interesting.”

And thus began the blackberry saga.

The Blackberry Saga

First thing to do was to find out what the species was that we had, however if you remember the ‘observe for a year’ bit this would take too long. So the next option (seeing as it’s definitely some kind of blackberry) was to try to find out the ploidy, sequence it and get the species ID later. Now this isn’t ideal so I also thought it would be a good idea to try to find a source that already has a known diploid species.

Turns out neither of these things was quite so simple.

One of the best ways of finding out the number of copies of chromosomes is to actually count them by looking at them using a microscope (this is called karyotyping), noting how many look the same, and the total number. So I asked our specialist and it was bad news – it’s too difficult to do in the time-frame but he put me on to a Professor from the University of Leicester who knows about blackberry genomes and things.

In an unlikely case of serendipity, it turns out this particular Professor’s mother worked on blackberries for a book in the 1950s, so he sent me a bunch of stuff to read and we had a nice chat on the phone. Now I had information on which species I should be looking for, Rubus ulmifolius, a relatively common diploid native to the UK. This species is also found in the local area around the Genome Campus where we sampled our blackberry from so I had a small measure of hope we had stumbled upon the right one.

Red boxes are locations of R. ulmifolius. On a zoomed out view it’s apparent there wasn’t any surveying east of the red boxes.

Of course I still had to find out and, after a few phone calls and suggestions, I contacted Julie Graham at the James Hutton Institute who had a test that they could do. It took a few weeks and in the end this was a bust, the Campus plant was tetraploid.

I may have been a little disappointed

The only option now was to find the plant from somewhere else.

There are a number of institutions that do soft fruit research (NIAB/EMR, Hutton, ART, Reading University, Leicester University, Earlham Institute, etc) and I called them all. I also called a bunch of commercial growers, yet none of these had the R. ulmifolius I was looking for.

Eventually I did get one lead: the USDA clonal repository in Oregon (US) – they have a germplasm repository and, lo and behold, you can order plants from there!

So I did.

Job done. Sometimes things turn out to be easier than expected if you have the right information. Or so I thought…

About spiders (specifically the Fen Raft spider, Dolomedes plantarius) and where to get them from.

Getting a hold of samples… [part 2]

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 16/02/2018

So far I’ve talked about Golden Eagle and Red Squirrel, also known by the moniker “charismatic megafauna”, which is a fantastic description of large cute/interesting things I first heard from Mark Blaxter.

So, I mentioned that some of the species are quite challenging to get but there are some that are also easy to sample (along with who provided them – thanks go to them):

  • Himalayan Balsam – Lisa Outhwaite, found on the Genome Campus
  • Oxford Ragwort – Lisa Outhwaite, found on the Genome Campus
  • Summer Truffle – from Dr Paul Thomas, commercial source (the exact location is confidential though)
  • Common Starfish – from Prof Maurice Elphick, keeps a tank full for other ongoing work
  • King Scallop – Dr Susanne Williams, bought from a fishmongers!
  • Asian Hornet – Dr Seirian Sumner, already had a collection
  • Turtle Dove – Dr Jenny Dunn, had samples from previous work
  • Otter – Dr Frank Hailer, from routine health surveys
  • Roesel’s Bush-cricket – Dr Björn Beckmann, they’re quite abundant now so easy to find
  • Fen Raft Spider –  Dr Helen Smith, ditch maintenance means they ‘pop up’ at the time
  • Robin – Dr Jenny Dunn, had samples from previous work
  • Grey Squirrel – Kat Fingland, has samples from ongoing work

Although these were easy to get that doesn’t mean there aren’t some quite interesting anecdotes associated with the sample collection.

Summer Truffles

Summer truffles, for example, are pretty valuable (circa £400 per kilogram) so the reason we don’t have the exact location is to prevent rival hunters (?not sure you hunt for a truffle or forage?) from plundering the area.

King Scallop, Great Scallop, Coquilles Saint-Jacques

Also, imagine the confusion in the voice of the chap at the end of the phone when I ring up and ask the fishmonger if they have a GPS location for the source of their scallops. Then think what the guy must have been thinking when I try to explain why, hopefully he got it but I’m not so sure! This is why we need to reach out and explain science to the public more, there’s not a great deal of exposure to genomes/genetic research if it’s not human related.

Turns out they don’t know exactly where they came from anyway; the scallops hail from the Shetland Isles – might have to do some genotyping to find out!

Roesel’s Bush-Cricket

Crickets it turns out are quite the eaters and not wanting to limit their diet they are, like us, omnivorous. Unlike us, however, at least nowadays, they do practice cannibalism (not sure how you ‘practice’ mind you, maybe start with just a lick?!). It seems they can lose legs quite easily this way, one named Oscar had a run-in in their container with Hannibal and lost two legs, the third (Heather) just lost a single one prior to arrival.

Fen Raft Spider

Did you know you need a special licence to collect Fen Raft spiders? This is because they’re red-listed like the Eagle but, thankfully for me, Helen has one. She has also raised many thousands of spiderlings in her kitchen!

Check out her website and, if you fancy a challenge, see how easy it is to spot them (this is, after all, the largest UK spider) in their habitat.

Clearing a fen ditch - home to the Fen Raft Spider (one of the 25 Genomes we are sequencing)

Grey Squirrel

Grey Squirrels are regarded as a pest species. This means that it’s legal to hunt them without a special licence, provided that you don’t cause any unnecessary suffering. We are NOT, however, doing this for the project as it’s not the most ethical thing when people are already collecting them for other research.

Also, did you know that you can buy squirrel pie? Not had it myself but could be tasty…

Getting a hold of some samples… for the 25 Genomes Project

Getting a hold of some samples…

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 29/01/2018

[Because gathering samples is proving to be quite a major task, I’m going to split this across several posts]

First things first – find a sample

The first, and often most difficult, part of getting a sample for the 25 genomes project is finding out where from.

There are a number of reasons for this but it essentially boils down to the fact that the Sanger Institute has always focused¹ on human health and disease so we don’t have a particularly great list of contacts for this project.

¹There have been some dalliances into other areas in the past, notably: Cod, Coelacanth (it’s a fish, known as a ‘living fossil’, although I prefer something that implies it’s been a long-term success like ’Pan-eon species’, a description I may have made up), Tasmanian Devil Cancer, Tomato and a butterfly

The ones that are most difficult to get are the ones that the steering group decided upon independently, this is because without a scientist/researcher/expert putting forward the species there isn’t anywhere to start from.

This is where working in science has a great advantage- collaboration. In the fields of Agricultural, Plant & Animal and Environment/Ecology sciences half of all articles were written by multiple institutions by 2009² and if the trend has continued it should be over 60% by now.

²Gazni, A., Sugimoto, C. R. and Didegah, F. (2012), Mapping world scientific collaboration: Authors, institutions, and countries. J. Am. Soc. Inf. Sci., 63: 323–335. doi: 10.1002/asi.21688

This is one reason why we need to collaborate more, and it will be the subject of a later post.

How traditional biologists and computer biologists work together. #CartoonYourScience by @redpen/blackpen

(for more like this check out the wonderful @redpenblackpen)

In practice this should mean that us scientists are a helpful bunch, and it turns out this is true. Whereas cold-calling/emailing people about the ‘accident you’ve been recently involved in’ or ‘the security breach on your Microsoft device’ are extremely annoying [pro-tip, pass the phone to your pre-school child if this happens, the results are normally quite amusing] doing the same to a scientist to offer them free sequencing of their species of interest is generally quite warmly received!

Getting a Golden Eagle(‘s DNA)

So let’s have a closer look at some of the species, firstly the Golden Eagle.

I would have thought that this would be a tricky one – they’re protected by a bunch of laws/regulations which means that without special licences you can’t mess with them. In fact even the locations of the nests are a closely guarded secret as they are still being illegally killed or the eggs are taken by collectors.

Turns out that a quick google and one email can lead to a great result, although it’s tinged with a bit of sadness which I’ll get to in a bit. I initially contacted Professor Anna Meredith at Edinburgh University with a general ‘can you help me with blah, blah, blah’ as she works with a number of species we were interested in (in this case I was actually after Red Squirrels) and she forwarded this on to Dr. Rob Ogden, also at Edinburgh.

As it turns out he is already working on Golden Eagles and was planning on doing some sequencing with some collaborators in Japan (they have eagles there too). Even better he had samples already from (here’s the sad bit) chicks that had died in the nest (plus one found rather suspiciously in a long abandoned nest).

So, one sample down, 24 to go!

[By the way I’m not going to go into the logistics and ENORMOUS cost of shipping things on dry ice, just assume that things arrive magically, but I may expand on why they need shipping this way some other time.]

Something squirreled away

Anna couldn’t help out with the Red Squirrel however, so I asked the National Trust who maintain a lot of the areas where these cute little critters still live:

UK Squirrel Distribution Maps, 1945 and 2010. Image Credit: Craig Shuttleworth, RSST

A nice lady called Laura put me in touch with the Head of Conservation (David Bullock) who in turn linked me to Andrew Brockbank at Formby Point who then led me to Kat Fingland (Nottingham Trent University) and Rachel Cripps (Red Squirrel Officer). All this took about a month and a bit but I finally had the right people. Thankfully we didn’t need any extra licensing to get some samples as they were already collecting from animals that had died from natural or accidental causes.

2 down, 23 to go!

Ethical and responsible sampling

It’s worth mentioning at this point that for this project we want to limit the impact of our sampling as much as possible and therefore have had it approved by our AWERB (Animal Welfare and Ethical Review Body). What this means is that wherever possible we do not kill any animals solely for the project, although in practice this is easier said than done and it does create some difficulties.

  1. For some animals this is not a problem as they are large enough that we can take a small amount of blood (less than 1ml) but others are too small for this to be possible (pipistrelle bats for example weigh around 5g and have only 0.5ml blood in total). This means that we need to get hold of whole animals AND as some of our species are protected (Golden Eagle, Red Squirrel etc.) they need to have already passed away for us to be able to use them.
  2. Another related issue is that taking blood samples from protected species needs special licences, even if the animals are large enough for this to be possible. Given the limited time available for the project, applying for these is not really an option, so again we need animals that have died naturally.
  3. The nature of the sequencing technology we’re using means that we need to get really long bits of DNA (upwards of 150,000 base pairs – that’s the A-T/G-C parts of DNA). The problem is that when we use animals that have died of natural causes we need to find and sample them really quickly: as soon as the animal dies the DNA begins to break up through the natural decomposition process.
  4. The really small critters (invertebrates like the Roesel’s Cricket for example) are next to impossible to find when they’ve died, as they tend to be eaten by other things and are hard to spot unless they move. In these cases we have no choice but to take live creatures and euthanise them as humanely as possible.
  5. Plants and fungi are somewhere in the middle: we need quite a lot of material (DNA extraction is more difficult), but ethically it’s acceptable to take bigger samples, so in these cases we take cuttings or fruiting bodies.

So that’s it for this one, more on sample collection to come…

About the author:

Dan Mead is the 25th Anniversary Sequencing Project Coordinator, for the 25 Genomes Project for the Wellcome Sanger Institute, Cambridge.

More on the 25 Genomes Project:

25 Genomes Project web page 

On choosing the 25 species for the 25 Genomes Project

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 08/01/2018

For those that don’t know (and until recently I could include myself in this group) there are A LOT of species on and in the Earth. Currently it’s estimated that there are 2 billion (2,000,000,000)! Most of these are bacteria, and we’re not looking at those for the 25 Genomes project, but this still leaves about 450 million to choose from.

To make it easier for ourselves, we also decided to limit ourselves only to the 1.5 MILLION species that have currently been described and catalogued. And, to help us along a bit more, we decided that only species found in the UK would count. According to the National Biodiversity Network, that brings the number down to ‘only’ 56,674. Now if you choose to only look at the local area surrounding the Sanger Institute then it’s a much more manageable 318.

However, it wasn’t going to be that easy. In the spirit of the Sanger’s inclusive approach to science, the Steering Group for the 25 Genomes project were concerned that such a narrow list was ‘too parochial’ and directed that the species sequenced should be a representative group of organisms from the whole of the UK.

So, how do you filter more than 56,000 species down to just 25?

The first thing to do was to break down the problem and the idea of a 5×5 matrix was mooted, discussed and agreed upon surprisingly quickly. Rather unsurprisingly coming up with five different categories was not as straightforward as it might first appear. While some were no-brainers (iconic species for instance), getting all five nailed down was tricky.

The wisdom of crowds

So we put out a call for suggestions to the whole Wellcome Genome Campus, to draw on the collective wisdom of the more than 2000 people who work here.

The results were, by turns, pleasing, odd, not-at-all-answering-the-question and esoteric. Here are some examples:

  • Species for which Britain has major global richness and conservation responsibility
  • Female emancipation in the wild
  • Unusual in terms of genetic load accumulation rate and mechanism
  • The three-toed sloth (which is neither a theme nor from the UK)
  • 25 local authors (and then we would really have 25 ‘novel’ genomes)
  • Species imported to the UK, which are making our lives healthier and happier (possibly a politically motivated suggestion)
  • What is ‘down there’ (in the detritus level down on the Ocean floor).

Finding five themes

Armed with these suggestions, the 25 Genomes Steering Group got back together to hammer out the final five categories. Here’s what we decided upon, reasoning that these themes should give a good breadth of types of organism and habitats to sample:

5 Themes for the 25 Genomes Projects: Flourishing, Floundering, Cryptic, Iconic and Dangerous

Critical criteria

We also came up with a list of criteria that the species must meet:

  1. Scientific justification must be solid: are there good questions that can be answered by the genome sequence being made available?
  2. No decent draft sequence currently available
  3. Sample availability: some organisms are too small, others are too protected, while others are too seasonal for collection
  4. Tractable genome – some organisms have genomes that are incredibly complex and would take up too much time and resource. For example, many plants have cells that contain multiple copies of the same[ish] chromosomes, a phenomenon known as increased ploidy. (A hexaploid genome has SIX copies of each chromosome, and some plants have even more.)

Now there comes the hard part, actually getting the list of species. As mentioned in a previous post, our public engagement team suggested that we let the public decide five of the species, leaving us just 20.

Great you might think, as it means we don’t need to do as much work, but you’d be sadly mistaken. The reality was that I now needed a list of 20 to start collecting right away AND another 40+ that the public could vote on to decide the final five!

It’s who you know…

Rather splendidly we have a senior member of the Natural History Museum London on our steering group, which meant we could exploit their contact list of some 400+ partner groups of wildlife experts. With this in mind I made a SurveyMonkey survey (it’s still about so you can check it out here; feel free to fill it in, you never know, we might want to do more!) that, in my mind at least, cunningly hid the criteria in the questions. It also deliberately did not mention the themes so as not to steer people in any particular direction.

From this I got 99 responses (again, discussed earlier) that made up most of the public vote and the 20* for getting on with. These latter ones are in the table below:

Cryptic | Dangerous | Floundering | Flourishing | Iconic
Brown Trout | Indian Balsam | Red Squirrel | Grey Squirrel | Golden Eagle
Common Pipistrelle | King Scallop | Water Vole | Ringlet butterfly | Blackberry
Carrington’s Featherwort | New Zealand Flatworm | Turtle Dove | Roesel’s Bush-Cricket | European Robin
Summer Truffle | British Mosquito | Northern February Red Stonefly | Oxford Ragwort | Orange-tailed Mining-bee

All in all, this took about 5 months to get to this stage as the species also needed to be individually reviewed to see if they met the criteria and then approved by the steering group.

Now the only problem is actually getting the species’ DNA, so collecting specimens and some lab work are to follow: the supposed easy part….

More on this to come!

*Why we chose the above 20 species

Species | Why sequence it?
Summer Truffle | There is disagreement in the literature as to whether this truffle is one or two separate species, plus it grows underground and is therefore largely unseen and difficult to locate. Prices for those collected in the UK remain relatively stable at around 400 GBP per kilo. Known as mycorrhizal, these fungi form a symbiotic association with a host plant on which they are dependent throughout their lifecycle. The sequencing of UK T. aestivum syn. uncinatum populations would be pivotal in helping to answer questions about modes of reproduction and life cycle, as well as some core speciation questions.
Brown Trout | The Brown Trout has three forms that differ in their migratory patterns. One form remains in the locality of its birth where it will live out its life, spawn and die. The second type migrates from lakes to streams and rivers to spawn but remains in fresh water. The third form migrates to the sea/ocean and remains there for much of its life, only returning to spawn. There appears to be no genetic difference between these forms, also known as anadromous (migratory) and sympatric (resident). Additionally the Wellcome Genome Campus is built around an 18th century red brick hall, Hinxton Hall, also known as Trout Hall, where a carved stone trout is prominently displayed over the main door to the croquet lawns.
Carrington’s Featherwort | This is selected as a representative of the liverworts, an ancient plant group predating flowering plants. It is one of the characteristic liverworts of very high rainfall areas in Scotland, and thus a representative of one of the very special groups of the British biota confined to such high-rainfall areas. Outside Scotland, it is only found in Ireland (extremely rare), the Faeroes and the Himalayas. The Scottish plants are apparently all male: like the Ents, the sexes have become separated in this species and the nearest females are in the Himalayas.
Common Pipistrelle | Until recently this bat was believed to be a single species; however, it is now known to be two species (common and soprano), with one other (Nathusius’) also being resident in the UK. Studying the genome will allow us to investigate the origins of the split between the two species, when and why it occurred.
Indian Balsam | A highly invasive weed species that takes substantial effort to control. Control methods based on findings from the genome would have important implications for wetland and river management.
King Scallop | Pecten maximus has been found to contain the Amnesic Shellfish Poisoning toxin, domoic acid, which accumulates after they consume algae/diatoms, especially in the event of algal blooms. This risk is regarded as a significant threat to both public health and the shellfish industry. Some studies have suggested that global warming is resulting in greater reproductive success for P. maximus in the UK; however, concerns have been raised over increasing mortality, declining recruitment and spawning stock biomass in several Scottish populations. Pecten maximus is also of interest scientifically because of its unusual vision and because its two shell valves are coloured differently. Identifying molecular pathways for shell pigment production in Mollusca has lagged behind studies of vertebrates and terrestrial invertebrates, and is a major gap in our understanding of how colour has evolved in the natural world. Vision in Mollusca is also of great interest because of the many different eye morphologies and the fact that very few species are thought to see in colour.
New Zealand Flatworm | New Zealand flatworms prey on earthworms, posing a potential threat to native earthworm populations. Further spread could have an impact on wildlife species dependent on earthworms (e.g. Badgers, Moles) and could have a localised deleterious effect on soil structure.
British Mosquito | Mosquitoes are important disease vectors and there has been speculation that an increase in the distribution of other species due to climate change could allow the re-introduction of diseases such as malaria to the UK.
Red Squirrel | Sequencing the whole genome of the native red squirrel will hopefully provide new tools and resources for reversing their decline and aiding their long-term conservation in the UK. For example, this research could reveal key insights into how red squirrels have adapted to living in an urban environment. This study could also provide further information for managing the spread of diseases and helping to protect the red squirrel from the fatal squirrelpox virus, as well as a deeper understanding of the impact of newly-discovered diseases.
Northern February Red Stonefly | These stoneflies only inhabit the purest of waters and as such are very limited in their habitats and may struggle to adapt to climate change. Brachyptera putata is an endemic UK stonefly. There have been suggestions that other European Brachyptera species may be synonyms of B. putata. Sequencing would determine whether it is a true UK endemic.
Turtle Dove | Turtle Dove numbers have fallen by a staggering 93% since 1970 and the species is now on the Global Red List for Endangered Species. Smaller than its collared cousin, the Turtle Dove is now only found in eastern England, where farmers are working with the RSPB to create feeding habitats, the destruction of which is blamed for the bird’s decline.
Water Vole | The Water Vole is the UK’s fastest declining mammal and efforts to help the population maintain genetic fitness would benefit from having the genome sequenced. Arvicola is a fantastic example of a small mammal genus that survived through the last glaciation, and has adapted to a range of habitats across Europe and much of northern Asia.
Oxford Ragwort | The Oxford Ragwort is representative of a species being introduced and excelling in another habitat. It was collected from the slopes of Mount Vesuvius sometime in the 17th Century, and planted in Oxford where it rapidly colonised the area due to its natural hardiness, and could grow on urban landscapes too (sides of buildings, on stairs, etc.). When railways were introduced to the UK landscape, this facilitated the spread of Oxford Ragwort across the UK (it can be found growing along railway tracks today). Sequencing the genome would improve our understanding of a non-native species excelling in a new habitat and may expand our understanding of the ecology of flowering plants.
Roesel’s Bush-cricket | Once restricted to the south coast and estuaries (saltmarshes) it is now widespread, possibly due to climate change and the spreading of salt on UK roads.
Ringlet butterfly | Despite an overall decline in butterflies over the last 50 years the ringlet has increased its population by nearly 400%. It’s one of the few to fly on overcast days and has an interesting dwarf form that appears at 600ft, increasing until 100% of the population is this form at 1000ft.
Grey Squirrel | As the anti-hero to the red squirrel, it lets us investigate how and why the squirrelpox virus is tolerated.
Blackberry | Good opportunity for citizen science, population genomics specifically for schools engagement. Also commercial soft-fruit genetics as it is an important and expanding food crop.
Golden Eagle | This is an iconic UK species that has suffered from hunting and pesticide poisoning in the past, leading to extinction in all parts of the UK except Scotland, where there are still fewer than 500 breeding pairs.
Orange-tailed Mining-bee | This species is conspicuous and attractive, one of the mining bees that is more likely to have come to the attention of the general public. It is widespread and common throughout the United Kingdom, flying in spring. It is a component of natural pollination services which can ensure crop pollination in the absence of honeybees, and also the pollination of many wild and garden flowering plants, ensuring their genetic diversity and conservation. In the UK, of 276 species of bee, there is only one honey bee and a score of bumblebees; the great majority of native bees are mining bees, including 68 species of Andrena. The genome sequence itself will be useful for comparative study of the genome of this solitary bee with the available genomes of social bees, in terms of gene composition relevant to sociality.
European Robin | Robins use vision-based magneto-reception and the mechanism is not fully understood; it has been shown that it may involve quantum entanglement. Robins are also extremely territorial, unlike most other song birds, with up to 10% of all deaths occurring due to fights.

I'm a Scientist, Get me out of here - 25 Genomes

We let the public decide five of our species….

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 20/12/2017

We recently wrapped up the ‘I’m a scientist, get me out of here’ public engagement event. This was a fantastic exercise aimed at getting the public, specifically school children, excited about sequencing genomes and science in general.

Here’s how ‘I’m a scientist, get me out of here’ worked – 25 Genomes style:

We divided the species into five themes, each of which had their own ‘zone’:

  • Flourishing (species on the up in the UK)
  • Floundering (endangered and declining species)
  • Cryptic (species that are out of sight or indistinguishable from others based on looks alone)
  • Iconic (quintessentially British species that we all recognise)
  • Dangerous (invasive and harmful species)

In each zone were between seven and nine candidate species that had been proposed via an online poll of scientists, wildlife experts and interested members of the public.

Close, but no cigar…

The poll to suggest candidate species for the public vote ran throughout September and into early November and we had a pretty good response. Most of the replies were pretty sensible, and quite a few had very detailed justifications by experts (one ran to nearly 5,000 words, complete with references). But some suggestions were rather left field.

In the very first section of our explanation of the purpose of the poll, we say: “…we are embarking on a brand new project to sequence a cross-sample of UK biodiversity.”

Bearing this in mind I suspect some people weren’t that keen on reading or were just chancing their arm. Here are some of the more ‘exotic’ suggestions:

  • Resplendent Quetzal – a cool-looking bird, with a cool name. If you’ve not heard of it that’s because it lives in central America (not the UK).
  • The “Hoff” crab (Kiwa tyleri) – so named because of its hairy chest, reminiscent of Baywatch actor David Hasselhoff. The species can be found in UK overseas territorial waters, but it’s not in the UK.
  • Fire Salamander – yet another cool name, and it looks pretty sweet too. Unfortunately only found in mainland Europe.

Fire Salamander – pretty, but not UK-based. Image Credit: William Warby, Wikimedia Commons

Some non-UK resident species suggestions were a little easier to spot:

  • Greenland Shark
  • Mongolian Gerbil
  • Madagascar Paradise Flycatcher
  • Asiatic black bear
  • Italian Mediterranean buffalo
  • Alpine grasshopper
  • Tasmanian Devil (funnily enough, this species has already been sequenced right here at the Sanger Institute.)
  • Antarctic Krill

Back to I’m a scientist, get me out of here – 25 Genomes

The idea for the zones was that each species would be represented by a ‘champion’ (or team thereof) and they would answer in the first person, to keep things more fun and relatable. It worked well:

Screenshot of I'm a Scientist, Get me out of here - 25 Genomes online chat

While the ‘I’m a scientist, get me out of here – 25 Genomes’ event was running, anyone who logged on could vote for their favourite species, one vote per zone. When the vote was finished, the winning species from each zone was added to the 25 Genomes project.

Getting engaged with the students was the most successful way of winning. In all the zones, the species that were among the top two most active in the live chats, and that answered more questions on average, had a much better chance of becoming the zone winner.

The winners!

The winning 5 species of the public vote for the 25 Genomes Project – Common Starfish, Asian Hornet, Eurasian Otter, Fen Raft Spider, Lesser Spotted Catfish

In all around 5,000 people participated in the events and there were over 150,000 page views, which sounds pretty successful to me.

One final invaluable piece of information that I learned from this whole process is that the Latin name for Scotch Thistle (Onopordum acanthium) translates as “donkey fart thistle”. In ye olden times people used to think that donkeys fart a lot if they eat it*.

*from the iconic zone Q&A

Giant Hogweed - one of the 25 genomes being read by the Wellcome Sanger Institute. Image credit: Appaloosa, Wikimedia Commons

Giant Hogweed sampling, a retrospective

By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 14/12/2017

Anticipating that the Giant Hogweed would not win the popular vote in the “I’m a scientist, get me out of here – 25 Genomes…” event I decided to try to find some.

Let your fingers do the walking…

The National Biodiversity Network (NBN) atlas is incredibly useful for finding out where (and when) things are found, so I started there, looking for Heracleum mantegazzianum within 5km of the Sanger Institute:

Giant Hogweed locations in a 5km radius from the Wellcome Genome Campus

This was a bust though: they’d been cleared out (the records are from 2004 and 2011, so no surprise there). So I went to a different source, the BSBI (Botanical Society of Britain and Ireland), who are linked to NBN but who I figured would be more specific. I was right: you need to register to get the information, but once you do, it is very good.

A nice chap called Kevin Walker sent me their records for the Cambridgeshire area. Turns out we’re in a bit of a hotspot, so that’s good news. Unfortunately, by the time I found this out it was November, and plants tend to die back in the autumn.

On the other hand I had read somewhere that giant hogweed will germinate in winter so I figured that it might be possible to find some youngish plants – these are ideal for DNA as growing parts are the best for extraction.

It’s a matter of record

From the records I found out who had seen the plants in question, and one of the most recent was a chap called Jonathan D. Shanklin who’d seen one in central Cambridge, on Hobson’s Conduit. This, by the way, is a water channel cut approximately 400 years ago as a water source for the centre of Cambridge and is now protected as a scheduled ancient monument.

Digression aside, with a name like that it was relatively easy to find Mr Shanklin with a quick google search. Turns out he works for the British Antarctic Survey. One slightly awkward phone conversation later I had a clear idea of where this plant would be, not far from the Botanic Garden. However, driving into the centre of Cambridge isn’t much fun, so this option went on the back burner.

Handle with care

Skin blistering caused by giant hogweed. Image credit: Cosima Pferdeliebe, Wikimedia Commons

Here’s a little tangent: this plant is not something to be trifled with. Giant hogweed is nasty stuff; its sap contains a sunlight (UV) activated toxin that can cause pretty horrible blistering (see the picture above). So I made sure that I stocked up on a full face shield [liberated from a past position], plenty of nitrile gloves and a Tyvek suit (thanks to John Lovell at the Sanger).

The next location I wanted to scout out (I like to have backup plans) was the Bourne Brook area as this had a whole bunch of recorded sightings over the past few years (by a Ruth Hawksley) so I went for a little drive as it’s only 15 mins from work.

It turns out that Bourne Brook has been very effectively cleared of hogweed this year so I went to the workplace of Ruth Hawksley. Ruth works at the Bedfordshire and Cambridgeshire Wildlife Trust and they have an office that’s open to the public just 10 minutes from the brook. Sadly she wasn’t there.

However, her colleagues were in and they gave me her card. After a fruitful telephone chat, Ruth embarked on a mission to find some for me. This did not go well so we had an email exchange over the following days about getting hold of some seeds so that I could grow some myself. Again, no success as all the plants had been sprayed. Then Ruth remembered that there was a plant found and de-headed this past summer, rather handily just up the road from me in Ickleton, so off I went.

Lost in translation

Time for another aside. The location I was given was TL49384419 and a street. It seems that there are more ways to record location information than you might think. The above is an example of the Ordnance Survey National Grid coordinate system and it seems to be the standard for biological sample recordings in the UK.
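
For the curious, here is a rough sketch (in Python) of how a grid reference like the one above can be turned into plain numbers. It assumes the standard National Grid lettering, where the two letters pick out a 100km square and the digits split evenly into an easting and a northing within it. The function name is made up for illustration, and the result is the south-west corner of the square the reference describes (a 10m square for an 8-digit reference).

```python
def os_grid_to_easting_northing(ref: str) -> tuple[int, int]:
    """Convert an OS National Grid reference (e.g. 'TL49384419') to the
    easting and northing, in metres, of the square's south-west corner."""
    ref = ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if len(letters) != 2 or not letters.isalpha() or "I" in letters:
        raise ValueError("expected two grid letters (the letter I is not used)")
    if len(digits) % 2 != 0 or len(digits) > 10 or (digits and not digits.isdigit()):
        raise ValueError("expected an even number of digits (at most 10)")

    def letter_index(letter: str) -> int:
        # Letters run A-Z over a 5x5 grid, skipping I.
        i = ord(letter) - ord("A")
        return i - 1 if i > 7 else i

    first, second = letter_index(letters[0]), letter_index(letters[1])

    # The first letter selects a 500km square, the second a 100km square
    # inside it; the grid's origin sits at the south-west corner of square SV.
    east_100km = ((first - 2) % 5) * 5 + (second % 5)
    north_100km = (19 - (first // 5) * 5) - (second // 5)

    # Split the digits in half and pad each half out to metres (5 digits).
    half = len(digits) // 2
    easting = east_100km * 100_000 + int(digits[:half].ljust(5, "0"))
    northing = north_100km * 100_000 + int(digits[half:].ljust(5, "0"))
    return easting, northing


print(os_grid_to_easting_northing("TL49384419"))  # (549380, 244190)
```

These are still grid coordinates in metres, not latitude and longitude; getting to the latter needs a proper map projection library, which is beyond this aside.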

Another nugget I discovered earlier in the project is that iPhones record GPS coordinates in the metadata of pictures. It isn’t easy to extract without using third-party software. However, if you do, you can then translate it from “AA; B; CC.cccccc” to “AA.B.CC.ccccc”, which you can then copy and paste into Google Maps.
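
If you’d rather not reformat by hand, the sketch below shows one alternative in Python. It assumes the “AA; B; CC.cccccc” string is degrees; minutes; seconds, and converts it to decimal degrees, which map tools generally accept as well. The function name and the example coordinates are made up for illustration and are not real sample locations.

```python
def dms_to_decimal(dms: str, hemisphere: str = "N") -> float:
    """Convert 'degrees; minutes; seconds' text to decimal degrees.

    South and West values come out negative, following the usual convention.
    """
    degrees, minutes, seconds = (float(part) for part in dms.split(";"))
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere.upper() in ("S", "W") else value


# Hypothetical coordinates, purely to show the shape of the input.
latitude = dms_to_decimal("52; 4; 30.0", "N")
longitude = dms_to_decimal("0; 10; 48.0", "E")
print(f"{latitude:.6f}, {longitude:.6f}")
```

The printed pair can be pasted into a map search box as it is.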

Anyhoo, a short walk up the lane later and I find myself standing by an electricity substation in Ickleton, looking somewhat suspicious in a pair of bright orange gloves, and staring at this:

Giant hogweed in the wild (hiding beside an electricity substation in Ickleton)

This is the hogweed you’re looking for

One quick email later (thank you 4G connectivity) and Ruth confirms that this is the plant I’m looking for. Further confirmation came from around the back of the ‘station where there’s a 2m dead stem that’s been de-headed. So I took one of the leaves back to the lab to deposit in the -80 degree freezer. Success!
