
24 September 2013
By Art Wuster

DeNovoGear discovers mutations in DNA sequencing data from mother-father-child families. This task is made more difficult by some variation being passed on to the child from its parents (inherited sites), as well as poor data quality (not shown).
Every mother knows that smoking is a particularly bad idea when expecting a baby. One reason for this is that the smoke can cause mutations that lead to severe disease in the infant. But, sadly, even parents leading an impeccably healthy lifestyle sometimes have babies that are seriously ill.
In many cases, this is due to random mutations of the DNA in the father's sperm or the mother's egg cells. All sperm and egg cells have mutations like this, and most of the time they are harmless. Only in rare cases does a mutation disrupt an important gene, which in turn can lead to disease.
In the last few years it has become possible to identify and study these mutations directly. This is done by sequencing the DNA of a mother-father-child family and looking for variants that are present in the child but not in either parent. If such a variant is real, the only explanation for its presence is that a mutation must have occurred in the father's sperm or the mother's egg.
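The comparison described above can be sketched in a few lines of Python. This is only a naive illustration, not how DeNovoGear works: it assumes that every family member's genotype at each site is already known without error, and the function name, the input layout and the example sites are all made up for this sketch.

```python
from typing import Dict, List, Tuple

# A genotype is the pair of DNA letters carried on the two copies of the genome.
Genotype = Tuple[str, str]


def candidate_de_novo_sites(calls: Dict[str, Dict[str, Genotype]]) -> List[str]:
    """Return the sites where the child carries a letter seen in neither parent.

    `calls` maps a site label to the genotypes of "mother", "father" and "child".
    The layout is hypothetical and assumes error-free genotypes.
    """
    candidates = []
    for site, family in calls.items():
        parental_alleles = set(family["mother"]) | set(family["father"])
        if any(allele not in parental_alleles for allele in family["child"]):
            candidates.append(site)
    return candidates


# Made-up example data for three sites.
calls = {
    # Child carries a G that neither parent has: a candidate new mutation.
    "chr1:1000": {"mother": ("A", "A"), "father": ("A", "A"), "child": ("A", "G")},
    # Child's T was inherited from the mother: not a new mutation.
    "chr1:2000": {"mother": ("C", "T"), "father": ("C", "C"), "child": ("C", "T")},
    # Everyone is identical: nothing to report.
    "chr2:500": {"mother": ("G", "G"), "father": ("G", "G"), "child": ("G", "G")},
}

print(candidate_de_novo_sites(calls))
```

With real data the genotypes themselves are uncertain, so a simple filter like this would report many false candidates.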
In principle this is straightforward, but in practice it is rather complicated, for two reasons. Firstly, DNA sequencing machines make errors. As a result, a site that differs between the child and the parents could indicate either a mutation or a sequencing error, and it is not always easy to tell which. Secondly, we have two copies of our genome in each cell. Variants can be present either in one copy (heterozygous) or in both copies (homozygous). A variant carried on only one copy shows up in only about half of the sequenced DNA fragments covering that site, so it can be missed in a parent. This makes it harder to decide whether a variant we see in the child was already present in one of the parents.
As a result, mutations can only be discovered using relatively complex statistical methods. To make this easier for other researchers, we have developed a software tool that we have called DeNovoGear. Given DNA sequence data from a mother-father-child family, DeNovoGear returns the sites that it considers likely to be mutations unique to the child, and it does so with good accuracy.
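To give a feel for why statistics are needed, here is a toy calculation that weighs "new mutation" against "sequencing error" at a single site. It is not DeNovoGear's actual model. It assumes both parents are known for certain to carry only the original DNA letter, that a sequencing error shows the new letter with a fixed probability, and that a true mutation in the child is heterozygous. The function name, the error rate of 0.01 and the prior mutation probability of 1e-8 are illustrative assumptions.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of seeing exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def posterior_de_novo(
    alt_reads: int,
    total_reads: int,
    error_rate: float = 0.01,      # assumed chance that a read shows the new letter by mistake
    prior_mutation: float = 1e-8,  # assumed prior chance of a new mutation at this site
) -> float:
    """Posterior probability that the child carries a new heterozygous mutation.

    Toy two-hypothesis comparison using Bayes' rule:
      - mutation: about half of the child's reads show the new letter
      - error:    the child matches the parents and new-letter reads are mistakes
    """
    like_mutation = binom_pmf(alt_reads, total_reads, 0.5)
    like_error = binom_pmf(alt_reads, total_reads, error_rate)
    numerator = like_mutation * prior_mutation
    return numerator / (numerator + like_error * (1 - prior_mutation))


# Two reads out of 30 showing the new letter: far more likely to be errors.
print(posterior_de_novo(2, 30))
# Fifteen reads out of 30: very hard to explain by errors alone.
print(posterior_de_novo(15, 30))
```

Under these assumptions, a couple of stray reads give a posterior close to zero, while an even split between old and new letters gives a posterior close to one. A real method also has to account for uncertainty in the parents' data, which this sketch ignores.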
For the time being, DeNovoGear is only suitable for academic research, but with sequencing becoming more important as a tool for diagnosis in clinical settings, an algorithm like DeNovoGear may soon be used to diagnose harmful mutations in hospitals as well.
DeNovoGear can be downloaded for free at http://sourceforge.net/projects/denovogear/. To use it correctly, you will need to have some prior experience with analysing DNA sequence data.
Art Wuster is a postdoctoral fellow at the Sanger Institute, where he develops methods to identify mutations that cause genetic disorders. Before joining the Sanger Institute, he worked at an international firm of management consultants.
References
- Ramu, A., Noordam, M.J., Schwartz, R.S., Wuster, A., Hurles, M.E., Cartwright, R.A. and Conrad, D.F. (2013) DeNovoGear: De novo indel and point mutation discovery and phasing. Nature Methods, PMID:23975140