Project description
The genome of an animal encodes a large set of regulatory programs that give rise to the thousands of cell types that make up its tissues and organs. Despite recent progress in single-cell omics, our knowledge of the regulatory programs that control the establishment and maintenance of cell type identity remains limited, and methods are lacking to infer regulatory programs directly from the genome sequence. In this project, which lies at the interface between the genome and single-cell atlases, we ask how the genome sequence translates into cell types. We start with Drosophila as a model organism. Its compactness allows sampling of all its cell types and developmental trajectories from egg to adult, using whole-organism single-cell multi-omics, thus capturing the spectrum of activation states that emerge from the regulatory genome. Deep learning models will be trained on regulatory sequences to predict and explain gene regulatory networks (GRNs) and GRN transitions between cell states, encoded by enhancers, promoters, transcription factors (TFs), effector genes, and feedback loops. Building on this improved mechanistic understanding, we will translate the framework to other animals, including octopus, birds, and mammals, and ask how regulatory programs evolve, with a focus on neuronal diversity in the brain. Using new algorithms for cross-species deep learning and combinatorial optimization, we will study how combinations of expressed TFs co-evolve with genomic enhancer logic. Our approach is unique in that we will develop and use new technological assays, deep learning, and massively parallel reporter assays, and combine these with perturbation experiments and synthetic biology to test our hypotheses. After iteratively improving our regulatory models, we ultimately aim to predict which regulatory programs, and thus which cell types, are encoded in an animal’s genome, and how changes in these programs underlie changes in cell types during evolution.