TITLE: “Genomewide supervised prediction of activating and repressive regions in over hundred cell and tissue types”
ABSTRACT: While the vast majority of variants associated with common disease risk are distributed across the non-coding genome, our understanding of the regulatory elements it contains remains notably incomplete. Strategies for identifying and characterizing these regulatory elements, such as high-throughput reporter assays and CRISPR-dCas9 screens, have been essential in decoding this complex regulatory landscape, but they provide information only on the particular regions they cover and are currently available in only a limited number of cell types. By leveraging data from these functional assays together with epigenetic features (such as histone modifications and chromatin accessibility), we built a supervised model to estimate the activating and repressive potential of any segment of the genome in over one hundred cell and tissue types. We evaluate how our model learns regulatory activity from different datasets and investigate strategies for generalizing regulatory activity predictions to multiple cell types.