30 May 2014
By Alistair Rust
One approach to finding cancer genes (explained in this blog) is to use short sequences of DNA, such as transposons or retroviruses, which integrate into the genome and cause genes to malfunction.
However, it can be tricky to know which integration sites matter, and that is where large, complementary datasets can help by providing useful comparisons.
Not every place in the genome that these integrating DNA elements hop into causes a gene to malfunction. Some frequently hit sites, known as hotspots, are simply regions in which, for some reason, transposons prefer to cluster. Such regions need to be ignored when trying to identify the genuinely disrupted regions that drive the formation of tumours.
In a recent study led by collaborators at the Netherlands Cancer Institute in Amsterdam and the Delft University of Technology, also in the Netherlands, a wide range of publicly available epigenomic datasets was combined using statistical methods to generate landscapes of molecular signals, in order to better understand the integration patterns of transposons and retroviruses found in cancer.
The resulting genomic landscapes are a great resource, as they can be used to filter hotspots out of transposon- and retrovirus-based cancer studies. This study is another example of using systems biology approaches to analyse big biological datasets in order to better understand the complexities of smaller ones.
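To make the idea of filtering concrete, here is a minimal sketch in Python of how insertion sites might be screened against a list of hotspot regions. It is not the method used in the study: the `Region` and `Insertion` types, the `filter_hotspots` function and all coordinates are hypothetical and made up purely for illustration, and hotspots are assumed to be simple half-open intervals on a chromosome.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """A hypothetical hotspot region; start is inclusive, end is exclusive."""
    chrom: str
    start: int
    end: int


class Insertion(NamedTuple):
    """A hypothetical transposon or retrovirus insertion site."""
    chrom: str
    position: int


def in_any_region(insertion: Insertion, regions: List[Region]) -> bool:
    """Return True if the insertion falls inside at least one region."""
    return any(
        r.chrom == insertion.chrom and r.start <= insertion.position < r.end
        for r in regions
    )


def filter_hotspots(insertions: List[Insertion], hotspots: List[Region]) -> List[Insertion]:
    """Keep only the insertions that do not fall inside a known hotspot."""
    return [ins for ins in insertions if not in_any_region(ins, hotspots)]


# Made-up example data, for illustration only.
hotspots = [Region("chr1", 1000, 2000), Region("chr2", 500, 800)]
insertions = [
    Insertion("chr1", 1500),  # inside the chr1 hotspot, so dropped
    Insertion("chr1", 2500),  # outside any hotspot, so kept
    Insertion("chr2", 799),   # inside the chr2 hotspot, so dropped
    Insertion("chr3", 600),   # no hotspots on chr3, so kept
]

kept = filter_hotspots(insertions, hotspots)
for ins in kept:
    print(ins.chrom, ins.position)
```

A linear scan like this is fine for a handful of regions. For genome-scale data, an interval tree or a dedicated genomics toolkit would be the more usual choice.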
Alistair Rust is a Principal Bioinformatician in Dave Adams’ group at the Wellcome Trust Sanger Institute, uncovering cancer genes using mouse models.
- de Jong J, et al. (2014) Chromatin Landscapes of Retroviral and Transposon Integration Profiles. PLOS Genetics. DOI: 10.1371/journal.pgen.1004250