Professor Matt Hurles, Wellcome Sanger Institute. Image credit: Onur Pinar / Wellcome Sanger Institute

Categories: Innovation, Sanger Science

23 March 2023

Accidental Entrepreneur

By Alexandra Canet Font, Marketing Manager, Wellcome Genome Campus

In this third part of our innovator blog series, we spoke to Matt Hurles, Head of the Human Genetics programme and incoming Director of the Wellcome Sanger Institute. Matt defines himself as an accidental entrepreneur, even though he spearheads a number of innovative initiatives that showcase his motivation to ultimately bring benefits to patients, with a clear focus on rare developmental disorders.

Innovation takes many forms – from a tweak that improves technology, all the way to the development of new medicines. Translating science is about adapting research, moving it beyond the lab, or closing gaps in technologies, so that it can be used to improve our lives. Matt spoke to us about his experience in technology translation and the impact of his scientific endeavours.

Matt, you are Head of the Human Genetics programme, soon to be Director of the Wellcome Sanger Institute and you are also Scientific Director of Congenica, one of Sanger’s spin-outs. You are involved in major international research efforts such as the Atlas of Variant Effects. Where does this entrepreneurial spirit come from - and how do you find the time?

I definitely regard myself as being somewhat of an accidental entrepreneur. I’d say it started with the Deciphering Developmental Disorders (DDD) project, in which we asked ourselves back in 2010, can new genomic technologies help doctors understand what causes developmental disorders in patients without an obvious cause? For this, we collaborated with all 24 Regional Clinical Genetics Services, throughout the UK (NHS) and the Republic of Ireland. To give a little context on the project, developmental disorders are those where a baby does not develop normally due to an alteration in their genetic makeup. We often cannot find the underlying genetic alteration. What this means is that in most cases, we cannot tell the patient or their family why this has happened to them. And the project set out to change this.

We had two aims. The first one, more translational, was to have access to a genomic technology that could potentially diagnose children with severe developmental disorders. The second one, more academic, was to understand the genetic architecture, that is, the relative importance of different types of genetic variation. As the project evolved, the technology did so too - we were able to generate more and more data on each patient and identify more genetic variations. This was great news, but it did pose a conundrum. Whilst we were still searching for the one causal genetic variant that could potentially cause disease, we were generating much more data. It's as if the haystack was getting bigger and bigger - and bigger! - whilst the needle was still the same size.

The effort we had to make on the bioinformatics side of the project was huge. We needed the computer to take most of the interpretative load, as it would be too time-consuming for a person to do this. In this way, we’d whittle down the haystack into a very small number of genetic variants that would then need expert clinical interpretation.

“I’d say it started with the Deciphering Developmental Disorders (DDD) project, in which we asked can new genomic technologies help doctors understand what causes developmental disorders in patients without an obvious cause?”

So, you had to build the technology?

We had to, and we did. At the time there was no off-the-shelf solution, so we had to build it. This was key to feeding back potentially relevant diagnostic variants to the clinical teams that had recruited the patients in the study. Then, a few years in, we realised that this was a workable solution that would allow diagnostic laboratories to generate large amounts of data in a reasonably timely manner, then interpret the data and eventually identify the genetic variation.

At the time, we were having annual meetings and sharing the findings with the NHS and other collaborators. Our original thinking was that the 24 NHS laboratories we were working with would take on the bioinformatics workflows and processes that we had devised and integrate them into their own work. However, we soon realised that this wasn’t feasible. Trying to involve 24 different centres in what would be a complex software development exercise was inherently challenging. It became clear to us that, if we really wanted to have an impact, we needed to set up an entity that would professionally develop the software enabling the clinical interpretation of the data we were collecting, software that could then be used in every NHS centre and, potentially, around the world.

And that was Congenica?

So it was. That was Congenica. It was a way of translating what we'd learned internally, as well as with our collaborative partners in the NHS. It was actually our plan B. Plan A had been to integrate within the diagnostic centres themselves, but at the time, they often only had one or two extremely busy bioinformaticians.

Also, what we had built was far from ideal - it served our needs in terms of a research project, but it was quite clunky, as is the case with most academic software projects. It wasn't the kind of tool that was sustainable on its own, and it really needed professional software development expertise to turn it into a real clinical piece of software, rather than a research tool that only worked in one particular circumstance.

I, along with Richard Durbin, Nick Lench, Phil Beales, Tom Weaver and Andy Richards, put the effort into this, and with the help of the Technology Translation team, we were able to spin it out and actually create an impact, as it is doing today.

Is the DDD project still ongoing?

It is. It stopped recruiting patients in 2015, and we have been able to diagnose about 40 per cent of the families. The other 60 per cent still don’t have a diagnosis, but we have an incredibly motivated team and we are persevering in finding diagnoses. We've got plenty of good clues that set the direction for where we need to go next to achieve this. As the project has progressed, so have other international initiatives, and it has proven incredibly fruitful to combine DDD families’ data with the data of other families around the world with children with similar developmental disorders.

I keep talking about data. Data is key, especially for rare diseases like the ones we are diagnosing. This means we are highly collaborative. We are working with Genomics England, for example, which is undertaking the ongoing genetic testing of patients with the NHS. We collaborate with a big US diagnostics company (GeneDx), which does a lot of diagnostic testing on similar patients, and then we also work with the Radboud University Medical Center in the Netherlands, which is a leader in the field in Europe.

With rare diseases, data sharing becomes crucial. It may be that you can only really be confident that a particular genetic variant is causing disease when you bring together enough patients from around the world who all have similar developmental conditions and similar genetic variants. This is an international endeavour that puts patients at the centre, as they are the ones who will benefit from this sharing.

“With rare diseases, data sharing becomes crucial ... This is an international endeavour that puts patients at the centre, as they are the ones who will benefit from this sharing.”

You are also involved in the Atlas of Variant Effects, is it a follow on from the DDD project?

It is very much conceptually linked to the DDD project, although organisationally separate. This is an international effort that comes from the challenge of interpreting variants - how do we interpret a variant that we've never seen before? We know damaging genetic variation can cause a disorder, but we also know that most genetic variation in a gene doesn’t actually damage the gene. Genes are impressively robust to genetic change. Hence, when we identify a new variant that we’ve never seen before, how do we interpret it? How do you gather enough evidence to determine if it is causing disease or not? Interpretation is a challenge we faced in the DDD project. We hope that the Atlas of Variant Effects will help with this.

We are currently performing experiments in which we create a pool of human cells, each with a different mutation in the same gene, and identify the subset of cells in which the mutation is damaging the gene. Because we can do these experiments in a highly parallel fashion, we can quickly assess the functional impact of every possible genetic variant in a disease-associated gene, including variants that have not yet been observed in a patient. This is in contrast to the current way of working, in which variants that have already been observed in patients are assessed one-by-one for their functional impact retrospectively.

Currently, this technology can only be applied to a subset of genes. What we now envisage, propelled by new technology, is to prospectively determine the functional impact of every single possible change in a gene. Our aim is to create what could be called a variant effect map, where you literally just go through mutating every single possible base, and then you have a functional assay that allows you, in parallel, to read out which variants are damaging and which aren't. Essentially, an atlas comprising maps of every single disease-causing gene, with each map containing the effect of every single possible change in that gene.
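To make the idea of a variant effect map concrete, here is a minimal sketch in Python. It is an illustration only, not the Atlas's actual pipeline: the function names, the toy reference sequence, the score values and the 0.5 threshold are all assumptions made for the example. It enumerates every possible single-base substitution in a short sequence and labels each one from a hypothetical table of assay scores, leaving variants without a score marked as not yet measured.

```python
BASES = "ACGT"


def all_single_base_variants(reference):
    """Yield (position, ref_base, alt_base) for every possible single-base change.

    Positions are 1-based. Each position has three alternatives.
    """
    for position, ref_base in enumerate(reference, start=1):
        for alt_base in BASES:
            if alt_base != ref_base:
                yield (position, ref_base, alt_base)


def build_variant_effect_map(reference, assay_scores, damaging_threshold=0.5):
    """Label every possible substitution using hypothetical assay scores.

    assay_scores maps (position, alt_base) to a functional score, where a
    lower score is assumed to mean the gene works less well. The threshold
    is an arbitrary illustrative cut-off, not a published value.
    """
    effect_map = {}
    for position, ref_base, alt_base in all_single_base_variants(reference):
        score = assay_scores.get((position, alt_base))
        if score is None:
            label = "not measured"
        elif score < damaging_threshold:
            label = "damaging"
        else:
            label = "tolerated"
        effect_map[(position, ref_base, alt_base)] = label
    return effect_map


# Toy example: a three-base "gene" with two hypothetical measurements.
toy_reference = "ACG"
toy_scores = {(1, "C"): 0.1, (2, "T"): 0.9}
toy_map = build_variant_effect_map(toy_reference, toy_scores)
```

The point of the sketch is the shape of the resource: the map has an entry for every possible change, whether or not that change has ever been seen in a patient, so a newly observed variant can simply be looked up.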

“Our aim is to create ... an atlas comprising maps of every single disease-causing gene, with each map containing the effect of every single possible change in that gene.”

What do you think can be the impact of this collaboration?

There are multiple different impacts, from the very clinical to fundamental biology. Diagnosis is the obvious one. To give you an example, currently, we're not as good at diagnosing children from single-parent families as we are with children where DNA from both parents is available. And that's because the parental DNA really helps to work out whether a variant is causing disease or not. Now, if we had a map where we knew what the effect of each variant was, we could address that inequality, because we could see a variant in a single-parent family and be much more confident about whether or not it was causing disease. It will level the playing field that we have at the moment.

It helps us start to think about screening too. That is, not necessarily diagnosing patients who already have a disorder, but predicting future problems. For example, screening for breast cancer risk often involves looking at the sequence of breast cancer-associated genes. Again, there's a challenge there - is a variant damaging the gene or not?

There are also opportunities at the fundamental level of just understanding how proteins work - which parts of the protein are important, and which are dispensable? How does the sequence of a protein determine its function? In many respects, one can imagine that generating a large number of variant effect maps will be a bit like generating a large number of protein structures. And the revolution we've seen in protein structures recently has come through technologies like AlphaFold, which has used artificial intelligence to predict structures that we’ve never seen before. Once we've collated enough of these maps of variant effects, I think we'll also be able to predict the effect of different variants in proteins that we've never previously analysed.

Sounds like a herculean task - how did it all start?

There are some real leaders in this field, especially in Seattle at the University of Washington. David Adams and I got together with them in early 2020 to establish an international community that would work together to develop the Atlas. It’s a community that’s growing, even though we’ve had to work virtually since then.

We’ve also got some challenges on the table - what are the key data standards that will enable a distributed model whereby many different centres generate different maps but feed into a common resource in a standardised way? The goal is for a clinician to be able to understand which datasets can be trusted when helping them diagnose a patient - a centralised resource. If you have a damaging genetic variant in, say, 100 different genes, you don’t want to be going to 100 different platforms or databases to try and find that data - you want to go to one, and one that you can trust.

Of course, this means that we’re navigating a set of logistical and operational challenges. However, we know from previous large-scale genomic projects, such as the Human Genome Project, which elements are critical for success: having one place where the data are deposited, with a set of standards that define what constitutes good-quality data and, importantly, the metadata associated with that data that enable it to be reused.

“The goal is for a clinician to be able to understand which datasets can be trusted when helping them diagnose a patient - a centralised resource.”

Back to Congenica, what is your role now? And how does that fit in with your academic research?

I'm a non-executive director on the board. This means I have board director and high-level strategy responsibilities. I also consult for the company on the areas in which I can add value in terms of my scientific expertise, and dive down into the weeds a bit more, which, being a scientist, I love doing.

It takes up less than a day a week of my time and I find it very synergistic with the academic work that we do. For example, thinking about what things actually impact patients. How do our discoveries end up having an impact beyond the paper? But also from an organisational perspective - how do you organise a bunch of academic people around a common vision and get them to really buy into it? And how do you bring together different kinds of capabilities and expertise and make good use of them? There are definitely leadership and organisational lessons that, in retrospect, I would have missed out on had I not been involved in Congenica.

What is your involvement in the development of functional genomics in the UK, led by the Medical Research Council (MRC)? Can you tell us a little bit about this?

The field as a whole has come a long way in the last ten years. There used to be a huge bottleneck in discovering robust associations between genetic variation and disease. Back in 2005, for example, we had almost no genetic associations with common diseases, like type 2 diabetes, rheumatoid arthritis or other autoimmune diseases. We knew they had to exist, but we couldn’t identify them. Now, we're in a situation where we have hundreds of thousands of these associations, some for common diseases, some for rare diseases.

The question is - what do we do with that information? We can be statistically confident that genetic variant A increases the risk of disease B. But that was only ever a means to an end, with the end being understanding more about the mechanism of disease. This in turn allows us to think about how we might treat the disease. So, the challenge is how do we take all of these associations and turn them into a functional understanding of disease?

The problem is that this is not scalable, yet. It takes a lot of deep investigation over quite a period of time to really understand just one or two associations. The challenge is that the scalability of that endeavour doesn't match the huge number of associations that we now have. There's a consensus on this and we now see elements that are scalable and reusable. Thus, we need to start thinking about what platforms we are going to put in place nationally and internationally to try and enable a rapid understanding of how variants influence disease. That’s a major aim of the MRC functional genomics initiative.

“We need to start thinking about what platforms we are going to put in place nationally and internationally to try and enable a rapid understanding of how variants influence disease.”

Why do you think engaging in technology translation is important as a researcher?

It’s interesting, because it covers a lot of areas. Take clinical translation in the DDD study, for example: we have been instrumental in diagnosing thousands of families who might not otherwise have had a diagnosis. That is something the team and I are hugely passionate about.

At the end of the day, it’s about using your knowledge to have a positive impact and help people. That’s hugely rewarding and motivating.

On the Congenica side, it's not just about helping the people that are in your research study, but it's thinking about how the lessons that you've learned can be generalised and benefit people more widely.

“It’s about using your knowledge to help people. That’s hugely rewarding and motivating.”