Image credit: Onur Pinar / Wellcome Sanger Institute

Categories: Sanger Science

18 March 2024

Digital transformation and a new era in science

By Katrina Costa, Science Writer at the Wellcome Sanger Institute

Digital technologies are rapidly evolving. James McCafferty, Chief Information Officer at the Wellcome Sanger Institute, discusses the latest opportunities and challenges for research, including generative AI and enhanced data science.

Science is changing. Digital transformation, a process in which organisations undertake a cultural shift to integrate digital technologies at the core of everything they do, is changing the way organisations carry out research. The Wellcome Sanger Institute and its partners are mobilising to support these technological advancements, which will have a powerful impact on our understanding of health, disease, evolution, climate change and beyond. Moreover, with the astonishing rise of generative AI, the Sanger Institute will actively create publicly available generative models of aspects of human cells, alongside other biological systems. I met with the Sanger Institute’s James McCafferty, Chief Information Officer (CIO), to discover more.

Businesses have long embraced digital transformation to maximise the benefits of adopting digital technologies and ensure their work is effective, efficient and adaptable. We can see this in daily life, from self-service websites to digital assistants such as Apple’s Siri and Amazon’s Alexa. Science organisations have traditionally lagged behind, only gradually incorporating tools like digital lab notes and automation, but research is now set for a major step change. This is especially true with the rise of AI, or artificial intelligence, where scientists train computer systems to carry out complex tasks that would otherwise require human intelligence.

“Science, particularly genomics, is becoming increasingly dependent on IT and data. A big part of my role is to work with the scientists to make sure they get new and innovative IT solutions that help them achieve their scientific goals. And that’s why the IT community here plays an important role - and it’s what attracted me to Sanger over three years ago.”

James McCafferty,
CIO, Wellcome Sanger Institute

James’s role is crucial because, in addition to the usual CIO responsibilities of overseeing services such as data centres, servers, networks and laptops, his team provides vital support to bioinformaticians and co-creates innovative scientific software. Applying digital transformation to science is not an easy task and the team faces very specific obstacles.

Challenges for IT in scientific research

One fundamental challenge for the digital transformation of science is the sheer scale of the data, which are the results of large-scale biological experiments and measurements. The Sanger Institute alone manages over 90 petabytes (PB) of data, which is 90 million gigabytes (GB). A mid-range iPhone 15 stores 256 GB, so it would take approximately 351,600 of these phones to store 90 PB. Managing this much data requires both specialist storage facilities and responsible data handling.
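
The phone comparison can be checked with a few lines of Python. This is a minimal sketch that assumes decimal units (1 PB = 1,000,000 GB), as the "90 million gigabytes" figure implies; the variable names are illustrative only.

```python
import math

# Assumption: decimal (SI) units, so 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000

total_pb = 90    # data managed by the Sanger Institute, in petabytes
phone_gb = 256   # storage of a mid-range iPhone 15, in gigabytes

total_gb = total_pb * GB_PER_PB

# Round up, because a fraction of a phone still means one more phone.
phones_needed = math.ceil(total_gb / phone_gb)

print(f"{total_gb:,} GB / {phone_gb} GB per phone = {phones_needed:,} phones")
```

Using binary units instead (1 PB = 1,048,576 GB) would raise the total by about 5 per cent, which does not change the order of magnitude.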

Another complication is that scientists have diverse research needs, from analysing and comparing entire genomes to pinpointing individual differences between single cells. Moreover, world-leading researchers will have innovative ideas and new tools they want to explore, and these requirements must be balanced with the need to maintain cybersecurity and operational efficiency.

As part of addressing these challenges and balancing often-competing needs, James leads a dedicated Informatics and Digital Solutions (IDS) team at the Sanger Institute.

Digital focus at the Sanger Institute

The IDS team has a unique approach for an IT team at a research institute, with a focus on two-way conversations with the institute’s scientists.

“The goal for me was to build on Sanger’s existing success with IT and to work closely with the scientists - to give them new tools they need to do advanced, leading-edge science. I call these ‘Science Solutions’,” James adds.

This shift towards more innovative, science-focused thinking led to IDS launching a new informatics strategy to support the vast data generation and analysis at the Sanger Institute. The strategy covers all research groups and is instrumental in enhancing data science and AI, as well as supporting scientific discoveries.

Recognising the Sanger Institute’s ability to run advanced scientific technologies at scale, James saw the opportunity for the Institute to leverage digital transformation to tackle new translational challenges.

Data centre at the Wellcome Sanger Institute. Credit: Dan Ross / Wellcome Sanger Institute

AI enters the lab

It is hard to escape the growing importance of AI in society, especially with the advent of generative models such as OpenAI’s ChatGPT and Google’s Gemini. Whilst the concept of AI has existed for over 70 years, generative AI represents a significant leap forward because it can create new content, such as text and images, as well as deal with situations it has not encountered previously.

James emphasises that AI will have the greatest impact on research in the “dry lab”. This describes laboratories where computer models simulate and analyse experiments based on data, in contrast to “wet labs” where physical experiments take place.

Traditionally, scientists develop a hypothesis or idea about how something works, for example a particular chemical pathway in a human cell. They would then test this hypothesis under various conditions in an experimental lab. But now, as a result of the huge amounts of data produced by researchers at the Sanger Institute and organisations worldwide, it is possible to model aspects of human cells purely in-silico, or on a computer. These approaches will reduce the cost of doing science, reduce the ‘time-to-science’ for projects and enable a faster discovery cycle.

As AI improves, James anticipates the Sanger Institute will lead the way in building publicly available generative models of aspects of human cells, alongside other biological systems. These will in turn provide the foundations for more specific models, such as those that test reactions to particular drugs, and will speed up the process of scientific discovery, for example in developing new drugs and treatments. With this in mind, the IDS team is preparing extensively for generative AI in science.

Preparing for AI and digital change

Sanger’s IDS team is well positioned to support the advancement of generative AI and digital transformation, both at the Institute and in the wider research community. Whilst AI is already used to support operational tasks at Sanger, such as the auto-transcription of video conferences, it is clear that generative AI will also impact every area of scientific research.

To provide effective support, the IDS team is already investing in equipment and infrastructure, including enhanced data networks, supercomputers and increased data storage. Looking beyond technology, the team will provide training to all staff on AI and collaborate with the scientists to anticipate their needs. James is also setting up collaborative networks and communities to help develop and promote best practices for AI in science. This will include exploring the ethical implications of using generative AI in science, as well as ensuring the models are trained on diverse data. This reflects the Sanger Institute's wider commitment to participant diversity in scientific data and research.

A new research programme at the Sanger Institute, Generative and Synthetic Genomics, led by Ben Lehner, exemplifies this new direction in science. The team will combine large-scale biological data generation and AI to accelerate the scientific understanding of genomics and how cells react to different changes. Generative genomics involves creating in silico simulations of DNA sequences to understand the effects of genetic changes, whilst synthetic genomics involves creating artificial DNA sequences using computer-aided design and chemical techniques.

Digital transformation and the future of science

The UK government recently pledged £100 million to create dedicated AI research hubs and to train regulators on the technology. Given this wide-scale investment in AI, James recognises that it is vital for research institutes to collaborate closely and establish best practices for digital transformation and generative AI. For example, the Sanger Institute collaborates with EMBL’s European Bioinformatics Institute (EMBL-EBI), since biological data are central to this work. The IDS team is also exploring opportunities to work with other institutes, bringing together experts from the fields of both AI and genomics to progress digital transformation in the life sciences.

The Sanger Institute, with its advanced genomic research and dedicated IDS team, will continue to lead the way on digital transformation in science and the adoption of breakthrough technologies. James emphasises that by focusing on ways to do science better, faster and cheaper, scientists will be empowered to address bigger questions:

“We are at the tipping point where the very nature of science, and how we do science, will change. For the future, so much of science will be driven by the use of digital models.”

James McCafferty

As the Sanger Institute continues to push the boundaries of genomic research through digital transformation, a global effort to combine both digital and biological innovation will revolutionise the scientific understanding of life, health and the environment.