

Gary Dillon, Business Development Manager at the Sanger Institute, gives us his view on some of the powers and pitfalls of genomics.
Yes, I've never had chickenpox, despite being exposed to the highly contagious virus multiple times, courtesy of my three children. So what, right? Maybe I've just had a subclinical infection? I'm certainly not a disease-immune superhero*: I get colds all the time and, as any parent can attest, plague-bearing children certainly bring everything home to pass straight on to you. So it’s just a quirk, an oddity, in an otherwise typical person; let's not over-interpret… Well yes, lots of that, but also no…
There is data in those quirks
There’s a lot of talk about personalised medicine these days. We will all have our genomes sequenced and our genetics decoded, and everyone will be given lifestyle advice and medicines tailored to who they are: sequence all the people, then we will prevent and cure with genome-driven precision. Exciting though this prospect is, and it certainly is exciting, I’m a little more intrigued by what other secrets the forthcoming omnigenomics era will uncover. There’s valuable data located in those quirks, and mining it may deliver whole new medical advances.
Take my curious imperviousness to chickenpox as an example: if it genuinely is resistance, what is the cause? As is often the case with genetics, it’s not possible to answer by studying me alone. Even looking at my family, it would be difficult to understand the whys. For example, all my children are susceptible. Is that because they haven’t inherited a version of the gene that makes them resistant? Is it because the trait needs two copies of a resistant gene (one from each parent) and they only have one? Or is it only unmasked in combination with another trait? These are the kinds of questions comparative genomics is well placed to answer. Once you have thousands of genomes to compare, and linked medical information, it becomes feasible to link particular genetic changes to specific characteristics. The more data you have, the more robust the comparison you can make and the more subtle, or complex, effects you can spot. That’s part of the driving rationale behind UK Biobank’s partnership with the Sanger Institute and the MRC, which, ultimately, plans to sequence half a million people.
Like we’ve approached biology in the past… but bigger
Indeed, this is nothing necessarily new. For example, we’ve long known that some individuals are resistant to HIV because they carry a mutant gene. CCR5, one of the proteins on a cell’s surface that HIV uses to recognise and infect cells, is mutated to the extent that a large chunk of it is missing in resistant individuals (the mutation is known as CCR5Δ32). In the absence of a target, the virus struggles to cling on to cells in people with truncated CCR5, leaving them resistant to infection. Controversially, the recent CRISPR baby scandal that has rocked the world’s scientific community involved an attempt to replicate this resistance through germline genome editing. More ethically, and far more practically, however, the mechanistic data revealed by identifying CCR5Δ32 has led to the development of drugs that can target CCR5 and block HIV infection.
One big experiment
Evolution is one huge biology experiment, constantly generating new variations within a species. The power of sequencing everyone is that we’ll be able to spot when natural experimentation has delivered a disease-beating novelty, or a disease-conferring susceptibility. That information can tell us something fundamental about the illnesses we seek to eliminate.
The effect of CCR5Δ32 is pronounced, and it was spotted before we really began to ramp up genomic efforts. However, it’s a good example of how informative mutations can be restricted to a specific population; in this case CCR5Δ32 is found predominantly in northern Europeans. It also highlights how important it will be to sample across populations if we really want to realise the value of genomics.
This point is heavily underscored by a recent publication that highlights just how much we may be missing by not sequencing across the globe. An analysis of 910 people of African descent showed that around 10% “extra” DNA was missing from the reference human genome – the human genome used as a guide in all genetic studies. That’s a fairly extraordinary amount of missing information and, of course, probably still a fraction of the variation out there in evolution’s grand experiment.
Even simple traits can have complex causes
Why worry about missing information, though? How important is it? Well, a recent publication that used hundreds of thousands of samples from the UK Biobank to explore the genetics of ginger hair delivers a fairly striking use case. Intuitively, you might imagine that hair colour is a simple and straightforward trait, but it really isn’t. While a single gene can be pinned down as a cause of red hair, it only accounts for 73% of inherited flaming locks. That leaves a sizeable fraction of redheads unexplained. The researchers went on to look at interactions between genes (epistasis), phenomena that are incredibly tricky to spot without a lot of data. Sure enough, they were able to hunt down several more genes that contributed to the likelihood of red hair. In fact, looking more broadly at all hair colours, blond hair alone was associated with over 200 genetic variants.
So if it’s this complicated to unpick something as straightforward as hair colour, you can imagine how much more challenging it may be to understand the interplay between the vast array of cellular machinery and a virus. Furthermore, you can get a feeling for the gargantuan amount of data integration required to process the huge datasets we must compile to generate leads. The wealth of data provided by basic research and human health records is why we will ultimately require AI to help us make sense of it all. Nevertheless, while it may be complex, the more complete the picture and the more links in the chain we have, the more chance projects like Open Targets have to rationally identify targets for treatments hidden in our genes.
Security, ethics and trust
Of course, I should note at this point that “with great power comes great responsibility” probably never applied more than it does to massive genomics programmes. The data gathered is sensitive, especially when linked with medical records, and we really need to remember we’re not just collecting information on an individual donor. Genetic information, by its very nature, reveals substantial snippets of other people’s (i.e. relatives’) data. Moreover, gathering people’s data is very politically sensitive. For example, access to healthcare is not universal across the globe, and pre-existing inequality means high income countries will benefit more from genome-driven drug development than low and middle income countries. Therefore richer nations should not run pell-mell into sequencing people in developing nations, or use their data, without equitable return. It’s important that research is not exploitative at any level: ethics will always need to be put at the forefront of everything we do. That’s going to be quite a job when we already have huge levels of inequality and technology moves so quickly it runs the risk of amplifying the disparity. Scientists will need to work with legislators across countries to ensure high standards are maintained and value is equitably shared.
There are, thankfully, reasons to be optimistic and expect we will get more right than wrong on the ethical front. There is a lot of government, scientific, clinical and business engagement around ethics and trust. Even at the start-up scale, it’s interesting to see that companies like Nebula Genomics and Luna DNA have adopted business models which explicitly emphasise personal ownership and control of your data. It’s not all of the solution of course, and big pitfalls remain, but there’s a growing realisation that tech moves very quickly and this can have negative consequences, so security and ethics are on everyone’s mind, as they should be.
What’s your superpower?
Returning to the possibilities: as we enter the omnigenomics era, we are at the beginning of a time where citizens who are not scientists may find themselves contributing directly to biological understanding on a scale never seen before. Done properly, it’s a real opportunity for individuals to understand more about how they fundamentally function and, at its best, will mean the public can come along for the ride with clinical and scientific professionals while improving their health. Time to start thinking about what magical-folklore resistance you have that genomics will reveal…
*Although, in my lighter moments I may tell my children I'm an X-Man...