By: Dan Mead, the 25th Anniversary Sequencing Project Coordinator
Date: 08/03/2018

Blackberries – they’re everywhere, right? In gardens, hedgerows, at the side of the road; in fact, pretty much anywhere you go you can find a blackberry bush (?shrub, ?tree, ?thicket – who knows what the noun is!). This should make finding one for the project as easy as pie, I thought…

As an aside, before we included blackberry in the project I checked the Kew Gardens plant database to see if they have a suitable genome and, hooray, they do. It’s 450Mb (about 1/13th the size of the human genome) and diploid as well – so no odd chromosome duplications* to worry about.

Genomes Assemble!

It’s important to have species that are haploid (a single copy of each chromosome – lots of insects have haploid males) or diploid (two copies of each chromosome, like humans – we have 23 pairs). This is because putting together the bits of DNA from the sequencing is much more difficult if there are more than two copies of everything. It would be simple if the copies were exact, but there are always small differences between them: sometimes single bits of DNA vary, sometimes small sections are missing, duplicated or mirrored (inverted), etc.

Imagine that your genome of interest only had one chromosome, pictured as, say, a jigsaw puzzle. Let’s then say the jigsaw is a picture of a cat – in this case the @genomecat, Quincy.

This is the easiest to assemble: to MASSIVELY oversimplify things, you just match the edges to get the picture.
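For anyone who prefers code to jigsaws, here is a minimal sketch of "matching the edges" for a single-copy genome. It is a toy greedy overlap merge, not what a real assembler does: the function names, the tiny made-up reads and the `min_len` cut-off are all assumptions for illustration only.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that matches a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly join the two pieces with the biggest matching edge."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)  # (overlap length, left piece, right piece)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no edges match any more, so stop with what we have
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


# Three made-up "jigsaw pieces" cut from one invented 15-letter sequence.
pieces = ["ATGGCGTA", "CGTACGTT", "CGTTAGC"]
print(greedy_assemble(pieces))  # prints ['ATGGCGTACGTTAGC']
```

With only one copy of the chromosome every edge has a single sensible partner, which is why this naive approach gets away with it here.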

For two copies (diploid) it’s a little more complicated: twice the number of pieces, and some will be a little different. Here you filter out the bits that look different and put these in the second chromosome (or ‘alternative haplotype’).

Things get really hard when you enter the murky world of polyploidy (lots of copies of chromosomes). In this example our cat has 4 copies of its chromosome (tetraploid) – all slightly different. The problem comes in trying to put each different piece in a separate chromosome. This is fine if all 4 pieces look different (or all the same), but if 2 of the pieces are identical, how do you know where to put them?

It’s a simplistic way of looking at things but [sort of] gets to the point – assembling genomes like this is tricky, so we like to avoid this if possible!
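To put a rough number on the jigsaw problem above, here is a small sketch that counts how many ways the copies of one piece could be shared out. It assumes (my simplification, not how real assemblers model it) that we can see which distinct versions of a piece exist but not how many chromosome copies carry each one. The function name `dosage_options` is made up for this example.

```python
from itertools import product


def dosage_options(ploidy, distinct_pieces):
    """Every way to share `ploidy` chromosome copies among the distinct
    versions of a piece, with each version used at least once."""
    return [
        counts
        for counts in product(range(1, ploidy + 1), repeat=distinct_pieces)
        if sum(counts) == ploidy
    ]


print(dosage_options(2, 2))  # diploid, 2 versions: [(1, 1)]
print(dosage_options(4, 4))  # tetraploid, all 4 different: [(1, 1, 1, 1)]
print(dosage_options(4, 1))  # tetraploid, all the same: [(4,)]
print(dosage_options(4, 3))  # tetraploid, 2 identical: [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
```

Only the last case has more than one answer: with three versions across four copies, one of them must be the doubled-up piece, and nothing in the piece itself says which.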

Back to Black(berries)

Ok, on with the blackberry story. Finding a plant was, unsurprisingly, very easy – there’s a big thicket on the grounds that’s about 3m high and must cover 100m sq. or so. So we got some leaves and stored them to wait for extraction.

A few weeks later I went on a nice trip down to the Natural History Museum in London to chat to some botanists (Fred Rumsey and Mark Carine) about preserving some samples for their collection. This involves getting some of the plant and pressing it flat and drying it out to act as an example of what the species looks like – so people can check we’ve got what we say we have (anyone can go and look at the collection there, by the way; some of the plants were collected hundreds of years ago!).

I think this is a ragwort.

We got chatting and I mentioned that we were doing the blackberry as part of the project. At this point Fred drops a casual comment that nearly made me soil myself:

[paraphrased]

“That’s interesting, which species are you doing? There are well over 300 in the UK…”

“Go on,” I say, “pray tell me about these 300 species.” (the exact conversation escapes me, it’s like a half remembered feverish nightmare now)

“Oh yes, there’s a whole book on them – I think we’re up to about 360 by now – you know the only way to identify a species confidently is to observe its life-cycle for at least a full year, most likely two to be sure. Wait whilst I find the book…”

[large thud as this tome hits the desk]

“Here it is, you can see how to identify them from this.”

“Thanks Fred,” [I hope I said that and not what I was thinking] “very interesting.”

And thus began the blackberry saga.

The Blackberry Saga

First thing to do was to find out what species we had; however, if you remember the ‘observe for a year’ bit, this would take too long. So the next option (seeing as it’s definitely some kind of blackberry) was to try to find out the ploidy, sequence it and get the species ID later. Now this isn’t ideal, so I also thought it would be a good idea to try to find a source that already has a known diploid species.

Turns out neither of these things was quite so simple.

One of the best ways of finding out the number of copies of chromosomes is to actually count them by looking at them using a microscope (this is called karyotyping), noting how many look the same, and the total number. So I asked our specialist and it was bad news – it’s too difficult to do in the time-frame but he put me on to a Professor from the University of Leicester who knows about blackberry genomes and things.

In an unlikely case of serendipity, it turns out this particular Professor’s mother worked on blackberries for a book in the 1950s, so he sent me a bunch of stuff to read and we had a nice chat on the phone. Now I had information on which species I should be looking for, Rubus ulmifolius, a relatively common diploid native to the UK. This species is also found in the local area around the Genome Campus where we sampled our blackberry from so I had a small measure of hope we had stumbled upon the right one.

Red boxes are locations of R. ulmifolius. On a zoomed out view it’s apparent there wasn’t any surveying east of the red boxes.

Of course I still had to find out and, after a few phone calls and suggestions, I contacted Julie Graham at the James Hutton Institute, who had a test that they could do. It took a few weeks and in the end this was a bust: the Campus plant was tetraploid.

I may have been a little disappointed

The only option now was to find the plant from somewhere else.

There are a number of institutions that do soft fruit research (NIAB/EMR, Hutton, ART, Reading University, Leicester University, Earlham Institute, etc.) and I called them all. I also called a bunch of commercial growers, yet none of these had the R. ulmifolius I was looking for.

Eventually I did get one lead: the USDA clonal repository in Oregon (US) – they have a germplasm repository and, lo and behold, you can order plants from there!

So I did.

Job done. Sometimes things turn out to be easier than expected if you have the right information. Or so I thought…

About the author:

Dan Mead is the 25th Anniversary Sequencing Project Coordinator for the 25 Genomes Project at the Wellcome Sanger Institute, Cambridge.

More on the 25 Genomes Project:

25 Genomes Project web page 

Posted by sangerinstitute

From the Wellcome Sanger Institute, a charitably funded genomic research organisation