SpatialDE automatically identifies sub-structures (middle) and links these to genes that depend on spatial location (right) in mouse olfactory bulb data from Ståhl et al. 2016.

Categories: Human Cell Atlas | 6 April 2018 | 3.9 min read

New computational method reveals where genes are expressed

In the body, cells are often considered the fundamental units. In a similar way to how atoms are joined to form molecules, cells form tissues. The organization of these tissues lets different cell types work together, enabling organs in the body to perform their functions. These structures have been studied and catalogued for hundreds of years in the field of histology, using microscopes.

During the 20th century, molecular techniques enabled researchers to investigate how different genes and proteins are used in different parts of tissues, and so to understand how cell types collaborate within them. Large-scale projects such as the Protein Atlas and the Allen Brain Atlas have been systematically performing molecular measurements of individual genes and proteins in tissues.

In the last decade, there have been tremendous advances in the scale and cost-effectiveness of molecular measurements. This has led to the analysis of single-cell gene expression, i.e. which genes are switched on in a cell, which lets researchers define cell types from molecular data. Similarly, spatially defined measurements of gene expression can now be made for thousands of genes at single-cell resolution. Projects that would previously have taken hundreds of people and long timescales can now be done by individual labs, meaning more types of tissue can be investigated in more conditions.

The most powerful new high-throughput methods generate measurements of expression levels for tens of thousands of genes. At this scale, it is not feasible to inspect every gene by eye. Typically, these sorts of data have been analysed by looking only at a handful of known marker genes.

We have now developed a method that tells us whether there is a relationship between the genes expressed in cells and where those cells are located.

Our SpatialDE method filters and sorts all the genes according to how certain we are that cell location matters for their expression levels. In the main data we analysed for our paper, out of close to 12,000 genes measured, only 67 were identified as “spatial”. By focusing on this shortlist of genes, researchers can quickly discover genes previously unknown to be related to tissue structure.
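To make the idea of "filtering and sorting genes by spatial dependence" concrete, here is a minimal sketch. It is not the SpatialDE model itself: it swaps in a much simpler statistic, Moran's I spatial autocorrelation with a permutation test, and runs on made-up toy data. All function names, the gene names, the 6 x 6 grid, and the 0.05 cut-off are assumptions chosen for illustration.

```python
import math
import random


def spatial_weights(coords, length_scale=1.0):
    """Gaussian weights: nearby spots get weights near 1, distant spots near 0."""
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = (coords[i][0] - coords[j][0]) ** 2 + (coords[i][1] - coords[j][1]) ** 2
                weights[i][j] = math.exp(-d2 / (2 * length_scale ** 2))
    return weights


def morans_i(weights, values):
    """Moran's I: positive when neighbouring spots have similar expression."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return 0.0  # a constant gene carries no spatial signal
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    total_w = sum(map(sum, weights))
    return (n / total_w) * (num / denom)


def permutation_pvalue(weights, values, n_perm=200, seed=0):
    """Share of random spot shufflings that look at least as spatial as the data."""
    rng = random.Random(seed)
    observed = morans_i(weights, values)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(weights, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def rank_genes(coords, expression):
    """Return (gene, p-value) pairs, most clearly spatial gene first."""
    weights = spatial_weights(coords)
    scored = [(gene, permutation_pvalue(weights, values))
              for gene, values in expression.items()]
    return sorted(scored, key=lambda item: item[1])


# Toy data (hypothetical): one gene follows a left-to-right gradient, one is noise.
rng = random.Random(1)
coords = [(x, y) for x in range(6) for y in range(6)]
expression = {
    "gradient_gene": [x + rng.uniform(-0.1, 0.1) for x, _ in coords],
    "noise_gene": [rng.uniform(0, 5) for _ in coords],
}

ranked = rank_genes(coords, expression)                      # sorted by evidence
shortlist = [gene for gene, p in ranked if p < 0.05]         # filtered
```

The shortlist here uses an uncorrected p-value for simplicity; an analysis across thousands of genes would need to correct for multiple testing before calling any gene spatial.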

Tissues are often divided into sub-structures, based on visual appearance or on the expression of particular proteins that indicate a specific function of that sub-structure. The brain, for example, has different layers, as does skin; the thymus, on the other hand, consists of connected lobules with medullas inside.

The sub-structures are defined by different cell type compositions. For cells to have major functional differences, they need to express many genes together that are specific to the function, and this co-expression is reflected at the level of the whole tissue. We created a second method that uses this property to automatically define tissue sub-structures. In one go, researchers obtain the genes defining the regions, as well as labels for the regions themselves.
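The principle of getting gene groups and region labels together can be illustrated with a small sketch. This is not the second method from the paper; it swaps in a simple correlation-based grouping on toy data. Genes whose expression rises and falls across the same spots are grouped, and each spot is then labelled by the group that is most active there. The function names, the gene names, the six-spot layout, and the 0.5 correlation threshold are all assumptions made for illustration.

```python
import math


def zscore(values):
    """Centre and scale one gene's expression across spots."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in values]


def correlation(z_a, z_b):
    """Pearson correlation of two already z-scored profiles."""
    return sum(a * b for a, b in zip(z_a, z_b)) / len(z_a)


def group_and_label(expression, min_corr=0.5):
    """Group co-expressed genes, then label each spot by its dominant group."""
    z = {gene: zscore(values) for gene, values in expression.items()}

    # Greedy grouping: a gene joins the first group whose seed gene it matches.
    groups = []
    for gene in z:
        for group in groups:
            if correlation(z[gene], z[group[0]]) > min_corr:
                group.append(gene)
                break
        else:
            groups.append([gene])

    # Each group's spatial pattern is the mean z-scored profile of its genes.
    n_spots = len(next(iter(z.values())))
    patterns = [
        [sum(z[g][s] for g in group) / len(group) for s in range(n_spots)]
        for group in groups
    ]

    # A spot belongs to the region whose pattern is highest at that spot.
    labels = [
        max(range(len(groups)), key=lambda k: patterns[k][s])
        for s in range(n_spots)
    ]
    return groups, labels


# Toy data (hypothetical): six spots in a row, two genes high on the left,
# two genes high on the right.
expression = {
    "a1": [5, 6, 5, 1, 0, 1],
    "a2": [4, 5, 4, 0, 1, 0],
    "b1": [0, 1, 0, 6, 5, 6],
    "b2": [1, 0, 1, 5, 4, 5],
}

groups, labels = group_and_label(expression)
```

On this toy input the two left-hand genes end up in one group and the two right-hand genes in another, and the spot labels split the row into a left and a right region.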

This allows researchers to zoom into the structures of the tissue. The markers allow design of downstream functional experiments to investigate which genes cause the structure and which are a consequence of the structure. The spatial labels then allow researchers to investigate the interaction between structures, the development of the structures, and how the tissue performs its function.

Relating cell types to their spatial structure and organization in tissues is a major component of the ongoing Human Cell Atlas project. The technologies for spatial gene expression measurements are also within reach of individual labs that want to study their tissue of interest on a genomic level. With our methods, researchers can answer questions about the relationship between genes and tissue structure that could not be addressed before, as we demonstrate in our paper.

In the long term, genomic and quantitative spatial gene expression measurements, captured and analysed by methods such as SpatialDE, may form the basis of histology and pathology in the clinic. This would allow this area of medical diagnostics to become even more powerful and personalized.

About the author

Dr Valentine Svensson was an EMBL PhD student supervised by Sarah Teichmann at the Wellcome Sanger Institute, collaborating with Oliver Stegle at EMBL-EBI, when this work was done. He is now a postdoctoral scholar in the Division of Biology and Biological Engineering at Caltech, working with Lior Pachter on statistics for omics-based cell biology.

Related publication

Valentine Svensson, Sarah A Teichmann and Oliver Stegle. (2018). SpatialDE: identification of spatially variable genes. Nature Methods. DOI:10.1038/nmeth.4636

Further Links