Category: Sanger Science

Exploring Sanger’s groundbreaking research

Gene responsible for diarrhoeal disease transmission identified

Clostridium difficile [Credit: CDC/Dr. Gilda Jones]

21 June 2012

Written by Laura Deakin

As a Ph.D student at the Wellcome Trust Sanger Institute, I have spent nearly four years investigating how Clostridium difficile bacteria are able to infect people. This bacterium, which is found in hospitals and is rife in the developing world, has been a hot topic of discussion in both the scientific literature and mainstream media in recent years.

C. difficile is the leading cause of antibiotic-associated diarrhoea in developed countries and has been responsible for a number of deaths in hospital patients. The bacterium releases spores that are highly infectious and cannot be killed by standard hospital cleaning routines. As a result, C. difficile bacteria are now widespread in many hospitals, where they can cause major epidemics that are becoming increasingly frequent and severe.

To understand how the bacteria are able to infect people and transmit from one person to the next, I have been investigating the role of a gene called spo0A. Working with the C. difficile team at the Institute, I infected mice with C. difficile, which allowed us to recreate and study many aspects of the disease, including its persistence and transmission in humans.

Using these mice as a model, we are able to mimic the transmission of C. difficile within hospitals and the effects of different techniques employed to minimise its spread. For example, we are able to explore the impact on transmission of patient-to-patient contact and shared rooms, and to study the effectiveness of patient isolation in lowering infection rates.

The study we published online in the journal Infection and Immunity looked at the role the spo0A gene plays in allowing C. difficile to transfer from person to person. We found that the bacterium had to have a normal version of the gene for it to be transmitted. The gene is essential for disease transmission.

Further study revealed that spo0A is also responsible for the persistent nature of C. difficile. This persistence is seen in patients who have been given vancomycin (a powerful antibiotic) to treat the disease. The treated patients recover and return home to an environment that contains C. difficile. The bacteria are then able to reinfect them, resulting in a second wave of disease. Some people can experience multiple episodes of infection over many years. Successful reduction of transmission would greatly reduce the threat of C. difficile as a cause of disease in hospitals.

Our findings suggest that the spo0A gene is a potential target for the development of therapies to disrupt or stop C. difficile transmission. The discovery of this gene’s role also has clinical implications for the management of patients in hospital to minimise transmission: for example, by isolating infected patients and by using ‘barrier nursing’ (that is, wearing gloves and gowns when treating the patients and employing heightened disinfection regimes).

This discovery is just the beginning: having identified the importance of spo0A in transmission and persistence, we are now expanding our search to find other, related genes that may also play a role. Finding these genes will allow us to identify points of intervention that might ultimately be used to contain the bacteria’s spore-mediated transmission and limit the spread of C. difficile.

Laura Deakin is a Ph.D student in the Microbial Pathogenesis team at the Wellcome Trust Sanger Institute.

Paper: Deakin L et al. Clostridium difficile spo0A gene is a persistence and transmission factor. Infect Immun 2012. doi: 10.1128/IAI.00147-12

Credit: Luc Viatour /

Creating a gold-standard, not a rotten, tomato genome

Recently, the full reference genome of the tomato (Solanum lycopersicum) was published in Nature (31 May 2012). Here, at the Wellcome Trust Sanger Institute, members of our sequencing teams took part in the international collaboration of 10 countries that developed the DNA sequence. Each research group was tasked with working on a different chromosome, and we sequenced Chromosome 4. By being part of the project, we were able to share our experience and knowledge of producing animal reference genomes, helping the plant genome research teams to work together to deliver high-quality, standardised data.

When the tomato genome sequencing project began, the teams estimated that the genome was 950 million base pairs (Mb) in size, split across 12 chromosomes. This was no small undertaking: it is one-third the size of the human genome (a project that had taken a worldwide collaboration 10 years to deliver). In addition, the project had limited funding, meaning that the work needed to be as tightly focused and efficient as possible.

Fortunately, only 25 per cent of the tomato genome contains gene-rich areas, so the project teams agreed that capturing and sequencing only these areas would provide the most valuable information in the most effective way. To achieve this, we used mapping techniques to identify the gene-rich areas and used clone-by-clone sequencing to fully sequence them using the fewest possible sequencing runs.

Clone-by-Clone sequencing

We took clones from existing libraries and digested them with restriction enzymes, producing a fingerprint signature for each. We processed these fingerprint signatures in a database known as FPC (Fingerprint Contigs). Sections of signature shared by two clones indicate an overlap between them, and these overlaps can often be verified if known markers can be placed in them. By knowing where each clone belonged on the chromosome, we were able to select only a minimal set of clones to cover the area of interest. We made the FPC database for all the chromosomes publicly available for the research community.

Fig 1. Screenshot showing the Fingerprint Contigs database. Clones highlighted in red and grey show the minimal tiling path selected for the sequencing project.
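For readers who like to see the idea in code, the two steps described above (spotting overlaps from shared fingerprint bands, then picking a minimal tiling path) can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the method FPC itself uses: the function names, the `Clone` type, the band-size tolerance and all of the clone positions are hypothetical, and the tiling step is a plain greedy interval cover.

```python
from typing import List, NamedTuple


class Clone(NamedTuple):
    """Hypothetical record: a clone and its estimated span on the chromosome."""
    name: str
    start: int
    end: int


def shared_bands(fp_a: List[int], fp_b: List[int], tolerance: int = 2) -> int:
    """Count fragment sizes that two fingerprints have in common.

    Two bands are treated as the same if their sizes differ by at most
    `tolerance` (an assumed value). Many shared bands suggest an overlap.
    """
    a, b = sorted(fp_a), sorted(fp_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared


def minimal_tiling_path(clones: List[Clone], region_start: int,
                        region_end: int) -> List[Clone]:
    """Greedily pick the fewest clones that cover [region_start, region_end)."""
    ordered = sorted(clones, key=lambda c: c.start)
    path: List[Clone] = []
    covered_to = region_start
    i = 0
    while covered_to < region_end:
        best = None
        # Among clones starting at or before the covered point,
        # keep the one that reaches furthest along the chromosome.
        while i < len(ordered) and ordered[i].start <= covered_to:
            if best is None or ordered[i].end > best.end:
                best = ordered[i]
            i += 1
        if best is None or best.end <= covered_to:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        path.append(best)
        covered_to = best.end
    return path


if __name__ == "__main__":
    # Made-up clones with positions in arbitrary units.
    library = [
        Clone("A", 0, 100), Clone("B", 50, 180), Clone("C", 90, 200),
        Clone("D", 150, 300), Clone("E", 190, 320),
    ]
    print([c.name for c in minimal_tiling_path(library, 0, 300)])
```

The greedy choice (always taking the overlapping clone that extends furthest) yields a smallest covering set when clone positions are known exactly; real fingerprint maps carry uncertainty, so the selected path would still need checking against markers.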

Using this approach, we mapped, sequenced and finished the gene-rich clones of Chromosome 4, which was estimated to be roughly 19 Mb long. The UK team was led by Principal Investigators Gerard Bishop from Imperial College London, Graham Seymour from Nottingham University, Glenn Bryan from the Scottish Crop Research Institute, and Jane Rogers from the Sanger Institute.

Finishing the genome

However, mapping and sequencing are not the whole story when producing a high-quality reference genome: the sequences need to be pieced together and inconsistencies resolved. In other words, the sequences need to be finished. This can be a lengthy process, especially if a project involves differing standards and approaches. Fortunately, we have long experience in finishing DNA sequencing data from our work on the human, mouse and zebrafish genome projects. So, to enable the other international teams to draw on our experience and to develop the common standards needed for efficient finishing, we organised two International Finishing Workshops.

In these, representatives of the different research groups from across the world met and discussed the various challenges of working with the sequencing data. It was a chance to pool experience and look at efficient ways to progress the data set for each of the chromosomes. Our discussions centred on techniques for improving the data for the clones, as well as ensuring that the metrics all the teams used to assess the quality of each clone were comparable.

Through meeting and talking through the issues, the teams ensured that the resulting genomic sequence from all the laboratories involved was of comparable quality. This data was then annotated and made publicly available for the wider Solanaceae research community.

We were also able to make a useful contribution by guiding the project teams through the challenges of incorporating new-technology sequencing data, which the project went on to adopt.

Funding bodies: BBSRC, EU-SOL, DEFRA and the Wellcome Trust