Categories: Sanger Science | 11 September 2019 | 5.5 min read

Powering discovery in blood disorders and depression

The wealth of information available to researchers through UK Biobank is powering studies into human health and disease.

Researchers in Professor Nicole Soranzo’s team, part of the Human Genetics programme at the Wellcome Sanger Institute, are interested in the genetic landscape of complex human traits – how our DNA contributes to how our bodies function.

“Most human characteristics are genetically complex,” says Dr Dragana Vuckovic, a postdoctoral researcher in the Human Genetics programme. “There isn’t one gene, or one version of a gene, causing one thing. There may be many hundreds of different variants in our genome each contributing a tiny amount to any particular trait. Those variants may be in genes, or in other parts of the genome, like regulatory areas. The challenge is finding them and understanding what they do when they have such small effects.”


One of the team’s interests is in blood – how red blood cells, white blood cells and the sticky platelet cells form and function. Blood cells are vital for oxygen transport and the immune response, and have important roles in a whole range of other processes in the body including clearance of toxins and our responses to stress. If blood cells don’t function properly it can lead to cancer, anaemia, bleeding and immunodeficiencies. It is also thought that blood cells have a role in a range of other conditions – but whether changes in blood cell functions are a cause or an effect of many diseases remains unclear.

To find the areas in the genome involved in blood formation and function, Dragana and her colleagues use data from UK Biobank, together with data from other studies. UK Biobank includes health data from 500,000 UK volunteers. Detailed measurements from blood samples are available for researchers alongside a whole range of other data including longitudinal lifestyle data, and there will soon be whole genome sequence data available.

Dragana describes the potential of such a vast data set: “If we have data from enough people, we have enough statistical power to make the associations between a genetic variant – a specific DNA sequence – and any particular trait, like the function of a sub-set of white blood cells, for example.”

In an analysis published in 2016 the team, together with researchers at the University of Cambridge, tested 29.5 million genetic variants for association with 36 properties of red and white blood cells, and platelets[1]. They analysed data from 173,480 people.

They identified over 2,500 previously undiscovered locations in the genome that influence blood cell characteristics and functions. They showed that genetic differences between people affect some of these characteristics and are linked to increased risk of heart attack, or to rheumatoid arthritis and other common autoimmune diseases.

Whole-genome sequence data

“Having whole genome sequence data available will give us so much information to work with. There is genetic variation that just isn’t seen in any other data,” says Dragana.

“We are finding the ‘major genetic players’ that affect the healthy development of blood cells, but can also play a role in disease.”

Such findings give clues to the biology of blood cells – information that can inform studies into diseases where blood cells play a role, ranging from asthma to diabetes.

“We are also finding that genetic variants across the whole genome can influence a trait or regulate the effects of a major player. This could affect how genetic tests to diagnose certain conditions are developed – it’s not as simple as testing for one gene variant because that is going to be affected by variants in other areas of the genome too. Other researchers have seen this pattern in human characteristics too[2], not just blood cell development.”

The genetics of depression

Dr Na Cai is a postdoctoral fellow at EMBL’s European Bioinformatics Institute and the Sanger Institute, also working in the Human Genetics programme. She uses UK Biobank data too, but not information about blood cells. She is interested in another genetically complex disorder – depression.

There have been several studies over the past few years looking for the genetic variants that are associated with depression [3]. Data from millions of people from a whole range of studies, not just UK Biobank, have been analysed. Over a hundred genetic variants have been identified, though the findings haven’t always been consistent.

One explanation for this could be that those studies don’t all define depression in the same way[4]. Sometimes it is ‘self-reported’, by asking someone if they’ve seen a doctor for depression; other times it is assessed by asking people about their symptoms, or inferred from hospital admission data. Na is interested in how the different ways of measuring depression affect the genetic findings.

She is using UK Biobank data because it contains enough information to allow her to define depression in many ways, similar to those previously published, and to pick apart the effects of using different definitions. She’s shown that the definition used does have an effect on the genetic associations found, and needs to be carefully considered in studies that are hunting for genetic variations associated with depression[5]. This is likely true for some other conditions too.

“It’s great that UK Biobank is such a big dataset that has so many pieces of information collected from its participants. It’s extremely valuable to us, because we would not be able to do this comparison if not enough information has been collected. There aren’t any other datasets where it is possible to do this work.”

The power of UK Biobank

Professor Nicole Soranzo, Senior Group Leader at the Wellcome Sanger Institute, said: “UK Biobank data is so vast, and so detailed. It has changed the way we do research in human genetics. We are beginning not only to understand the complex genetic basis of a whole variety of devastating human diseases, but also how to better use this genetic information to understand how to predict and treat these diseases.”

Further reading