Image credit: Wellcome Sanger Institute


Understanding the nature of life on Earth has been revolutionised by DNA sequencing. In the past we could only observe what was happening; now we can read (and alter) the blueprints of life to understand health and disease at the most intimate level. Yet none of this would be possible without the unsung work of bioinformaticians.
The deluge of biological data that began with high-throughput genome sequencing in the 1990s continues. As petabytes of genome sequence data accumulate, there is an ever-increasing need to analyse and interpret that data, and use it to answer fundamental questions about DNA, biology... and life itself.
Interpreting and managing biological data is a key task of bioinformaticians. While all scientists need to be able to analyse data from experiments, bioinformaticians undertake the task on a massive scale, by designing and utilising computer software.
People are drawn to bioinformatics for a range of reasons, but they are united by a desire to understand the code of life. We spoke to three bioinformaticians at the Sanger Institute to explore what they do at the cutting edge of genomics.
Decoding parasites
Bioinformaticians create, use and develop computer algorithms to analyse genomic and related data. The work brings together biology, computing, maths and statistics, with different aspects taking the lead depending on the exact task at hand.
Faye Rodgers is a bioinformatician working on helminths – parasitic worms that cause a range of diseases. “I always preferred the data analysis side of working in a lab. You know where you are with data, you have more control,” she explains.
Faye uses a range of languages to write programs or stitch together existing ones to get the analysis pipelines she needs. She works closely with colleagues in her team who are generating the data. Recently they have been infecting gut cells with helminths, and then sequencing the genes that are active in both the parasite and host.
Their aim is to understand the early interactions between the host and the parasite, which could open avenues to uncovering potential drug targets.
“My colleagues are immunologists or cell biologists and know what they are looking for in the data. Together we plan experiments that allow us to understand how the worms are interacting with their hosts. At the moment we are looking at which genes are switched on in the host’s gut cells when they encounter the worm.”
New and powerful single-cell sequencing technology used at the Sanger Institute is helping researchers look at how the genome is behaving in an individual cell. The combination of this technology and bioinformatics will open the door to a deeper understanding of how a cell functions.
“I find the work really absorbing. It has goals, it’s problem solving and it’s very motivating.”
Cracking cancer’s code
The collaborative nature of bioinformatics is something that appealed to Laura Riva too. She is a bioinformatician working on the Mutographs project at the Sanger Institute. Her collaborators are both within the Sanger Institute and beyond – mainly in the USA.
“We’re looking at patterns of change in DNA that are caused by a specific chemical or agent, and that are associated with cancer,” says Laura.
“We know that different carcinogens, like tobacco smoke, produce different patterns of damage to DNA, which can cause cancer. It’s like each carcinogen can leave a signature in DNA. In humans, there are about 30 distinct DNA patterns associated with cancer, and for many of them we don’t yet know the cause.”
She is studying data from mice to try to identify the links between potential carcinogens and those DNA patterns.
“It’s challenging, but that’s what I like best about it. There is lots of data, and the data can be noisy. But getting to grips with the data on such a large scale, to understand what it means, that’s what’s really exciting.”
Reading the blueprints of all life on Earth
Understanding cancer is what drew Ying Yan to bioinformatics, too, although in a very different way. She was a software developer, designing websites, when her dad developed cancer. She wanted to be able to help him, so she started reading scientific papers and decided to move into medical research. She got an internship at the Sanger Institute and then a PhD position at EMBL-EBI, picking up her knowledge of human biology and genetics along the way.
“I find bioinformatics answers my questions. It is very satisfying, rewarding work.”
Now, Ying works on the Vertebrate Genomes Project, whose mission is to sequence the genomes of all 66,000 species of vertebrates on Earth. Her role is to integrate data from a variety of sequencing technologies to determine the genomes of those species for the very first time.
It took 13 years to put together the first human genome sequence. Sequencing a species for the first time is still a complex process, but now it takes weeks or months rather than years. The team are working on completely automating the process, so they can undertake the task in just days, and on a massive scale.
“I love being at the forefront of technology developments – it’s what keeps my research going.”
Once a species has been sequenced and scientists have that ‘reference’ genome, sequencing subsequent individuals becomes simpler and quicker, taking just a few days. Bioinformaticians can then look at differences between individuals of a species to try to determine which sequences in the genome are associated with different traits. Comparing sequences between species enables researchers to understand how their genomes have evolved over time.
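To give a flavour of what comparing an individual to a reference genome means in practice, here is a minimal Python sketch. It is an illustration only, not the pipeline used at the Sanger Institute: the function name `find_differences` and the two short sequences are made up for the example, and it assumes the sequences are already aligned and the same length. Real analyses work on billions of bases and must also handle insertions, deletions and sequencing errors.

```python
def find_differences(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes both sequences are already aligned and equal in length,
    which is a simplification made for this toy example.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length for this simple comparison")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Hypothetical sequences, invented for illustration.
reference = "ACGTTGCA"
sample = "ACGATGCC"

for position, ref_base, sample_base in find_differences(reference, sample):
    print(f"position {position}: {ref_base} -> {sample_base}")
```

Each reported position is a place where the individual differs from the reference. Collecting such differences across many individuals is the starting point for asking which of them are associated with a particular trait.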
The future’s bright – the future’s bioinformatics
It’s likely that more and more biologists will need skills in bioinformatics, or close collaborations with bioinformaticians, to get the most out of their data.
Genome sequencing is ramping up. It’s moving into everyday healthcare. As well as the initiative to sequence all life on Earth, there are projects like the Human Cell Atlas, which aims to characterise and map every cell type in the human body. The continuing downpour of data offers promising insights into health, disease and evolution – if we can analyse it.
How to become a bioinformatician
There are many routes into bioinformatics. Ying was previously a software developer and Laura studied engineering. Faye was a cell biologist before becoming a bioinformatician. They all agree that an interest in both biology and computing is important.
Their advice for anyone interested in bioinformatics was to try it. Ying said: “It’s never too late to try something you might like. You can change careers at any age.”
For anyone starting out, there are undergraduate degrees in bioinformatics as well as internships and apprenticeships – including at the Sanger Institute.
And even before that, it’s possible to try out programming online, for free. Try Codecademy or Rosalind.
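As a taste of what a first bioinformatics exercise can look like, here is a small Python sketch that counts how often each of the four DNA bases appears in a sequence, in the spirit of the introductory problems on sites like Rosalind. The function name `count_bases` and the example sequence are our own choices for illustration.

```python
from collections import Counter


def count_bases(dna: str) -> dict[str, int]:
    """Count occurrences of A, C, G and T in a DNA string (case-insensitive)."""
    counts = Counter(dna.upper())
    # Report all four bases, including any that do not appear (count of 0).
    return {base: counts[base] for base in "ACGT"}


# A short made-up sequence, used only as an example.
print(count_bases("AGCTTTTCATTCTGAC"))
```

A few lines like these are enough to start exploring real sequence data, and the same idea scales up to whole genomes.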
Find out more
Bioinformatics Careers
- Bioinformatics Apprenticeship position (Closing Date 31 March 2019)
- Apprenticeships at the Sanger Institute
- Jobs at the Sanger Institute
Bioinformatics Courses at the Wellcome Genome Campus
- Summer School in Bioinformatics
- Bioinformatics for Immunologists
- Next Generation Sequencing Bioinformatics (links to 2018 course, 2019 course will be available soon)