Trajectory space analysis: Leveraging computational models and single cell RNAseq to understand genetic programs defining intestinal lineages and infer colon stem cells in mouse
High-dimensional single cell profiling holds the potential to elucidate developmental sequences and define genetic programs directing cell lineages as well as determine putative progenitor cell types.
Existing algorithms have limited ability to elucidate branching developmental paths or to identify multiple branch points in an unsupervised manner. We introduce the concept of “trajectory space”, in which cells are defined not by their phenotype but by their distance along nearest neighbor trajectories to every other cell in a population. We implement “trajectory space” in the tSpace algorithm, and show that multidimensional profiling of small intestine cells in trajectory space, with minimal user input, allows unsupervised reconstruction of developmental sequences from intestinal stem cells to specialized epithelial phenotypes. tSpace delineates absorptive/enterocyte and secretory/enteroendocrine (EE) development, both arising from Lgr5+ crypt base columnar (CBC) cells, and positions cell types in developmentally meaningful relations. tSpace clearly positions short-lived, Dll1-expressing EE progenitors in trajectory space between CBC cells and mature EE populations, outperforming existing analytical tools (t-SNE, SPADE), which failed to define these cells either as a discrete subset or as a precursor population. Furthermore, our analysis reveals patterns of gene expression mirroring observations from decades of research on intestinal development.
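The "trajectory space" idea described above can be illustrated with a minimal sketch. This is not the tSpace implementation, and the published algorithm may differ in its details. The sketch assumes a plain k-nearest-neighbor graph with Euclidean edge weights and Dijkstra shortest paths as a stand-in for "distance along nearest neighbor trajectories". The function names (`knn_graph`, `shortest_path_lengths`, `trajectory_space`) and the toy data are hypothetical and chosen only for illustration.

```python
import heapq
import math


def knn_graph(points, k):
    """Build an undirected k-nearest-neighbor graph weighted by Euclidean distance."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # store both directions so the graph is undirected
    return graph


def shortest_path_lengths(graph, source):
    """Dijkstra: distance from `source` to every node, travelling along graph edges."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def trajectory_space(points, k=2):
    """Describe each cell by its graph distance to every other cell (assumed simplification)."""
    graph = knn_graph(points, k)
    n = len(points)
    per_source = [shortest_path_lengths(graph, s) for s in range(n)]
    # Row i holds cell i's coordinates: its distance to each cell s.
    return [[per_source[s][i] for s in range(n)] for i in range(n)]


# Hypothetical toy data: five "cells" lying along a bent path in a 2-D phenotype space.
cells = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
tspace = trajectory_space(cells, k=2)
```

In this toy example the two end cells are closer in straight-line phenotype distance than they are along the graph, which is the distinction the trajectory space representation is meant to capture. In practice the resulting distance matrix would typically be passed to a dimensionality reduction method for visualization.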
Using tSpace, we identify three transcription modules that specify cell fate within CBC progenitors. The module consisting of Foxa2, Foxa3, Neurog3, Sox4 and Sox9 is expressed by early cells but is maintained in late CBC cells selectively in the EE branch; these transcription factors have been associated with endocrine and pancreatic development and may coordinate secretory pathways within intestinal enteroendocrine cells. Two modules are expressed preferentially in the enterocyte branch. One activates lipid and cholesterol metabolism, known to be important for mature enterocytes, while the other is associated with the Nfe2l2/Nrf2-antioxidant response element (ARE) pathway. The trajectory analysis shows that the rapidly proliferating subsets of CBC and transit amplifying cells are already heterogeneous and express gene programs leading to secretory vs. absorptive phenotypes.
Furthermore, we apply similar approaches to understand developmental sequences within the mouse colon crypt.
We believe that tSpace will prove useful to the rapidly growing field of single cell analysis and that our intestinal analysis can be a resource for further studies of intestinal development. More can be seen in the pre-print here.