Functional, computational,
and statistical genetics

We aim to understand gene regulation by building broadly applicable computational tools to analyze high-throughput genomic assays. We do this by carefully studying the experimental design and developing data-driven models, grounded in computer science and high-dimensional statistics, to advance biomedical discovery.

Our lab is affiliated with the Departments of Computational Medicine and Human Genetics in the David Geffen School of Medicine at UCLA.

Research

Our goal is to better understand how genes regulate each other and how disease disrupts gene networks. We pursue this goal by developing software that enables us and others to analyze complex genomic datasets that sample the transcriptional landscape. With this mission in mind, we build models informed by the experimental protocols and biological questions, while using and extending the latest theory in computer science and statistics.

Structure learning for gene regulation

We are using and extending new developments in machine learning to model the behavior of modern genetic perturbations (e.g., CRISPR) and to develop tools for inferring gene network structure.

Active learning for experimental design

Modern experiments can measure multiple phenotypes at the single-cell level while capturing millions of cells in one library preparation. With seemingly unbounded sampling potential, naively sampling everything is likely to be prohibitively expensive. We are developing active machine learning techniques to guide experimentalists through iterative sampling towards the most informative data given prior data, goals, and cost.

Quantifying the genetic component of RNA

We are developing methods that integrate advances in RNA-seq quantification uncertainty, population-specific sequence alignment, and causal inference to understand genetic drivers of disease.

Software

Below are some representative tools that we developed in collaboration before the inception of the Pimentel lab.

sleuth

sleuth is a transcript-level differential expression analysis tool for RNA-Seq. Unlike most tools, it incorporates the inferential variability from transcript abundance estimation.

sleuth also allows scientists to interactively explore their data using an R Shiny application, which can be easily shared to encourage reproducibility and scientific sharing. A demo of this app can be found at the bear's lair.

software @ GitHub

kallisto

kallisto is an incredibly fast and accurate transcript abundance estimation tool for RNA-Seq. It forgoes the costly step of alignment by implementing a novel idea which we call pseudoalignment. Pseudoalignment, together with a re-parameterization of the likelihood, allows for extremely fast inference of transcript abundances, which is now possible on a laptop in about 5 minutes.

software @ GitHub

Join us!

We are currently looking for kind, curious, and ambitious researchers.

Graduate students

Graduate students in our lab will be on collaborative teams with the goal of building generalizable software to solve complex problems in genomics. The goal and expectation is for each student to graduate with expertise in modeling and data analysis as well as expertise in an area of functional genomics.

We take UCLA students from Biomathematics, Bioinformatics, Genetics and Genomics, Computer Science, and related programs.

The students we expect will thrive in our lab come from diverse backgrounds, including but not limited to:

students coming in with strong quantitative backgrounds interested in building models and software.
students coming in with mixed backgrounds (e.g. quantitative biology) interested in analyzing complicated data sets with specific biological questions in mind.
experimental students who are primarily advised by another principal investigator, but want a computational component in their dissertation.

If you think you might be a good fit and are already at UCLA, please send Harold an email with your CV and research interests.

Postdoctoral researchers

We currently have a position open jointly with Bogdan Pasaniuc. The job ad is here.

We are also open to co-advising postdoctoral researchers with other computational or experimental labs. An example might be an experimental postdoc with their own biological question for which there are no current analysis tools.

If interested, please send the following to Harold:

a short research statement on your goals
your CV
contact information for three references

Undergraduate students

Unfortunately, we are currently at capacity for undergrads. However, if you think that you might want to do some work in the lab in the future, please reach out so we can discuss.

Summer undergraduate students

UCLA sponsors B.I.G. Summer every year, an 8-week paid summer research internship. Please click here for more information.

Research assistants / other

We don’t have immediate plans to hire in other areas, but if you think you are an exceptional fit, please email Harold your CV along with a short description of what you hope to accomplish by joining the lab.

The Team

Principal Investigator

Harold Pimentel

Assistant Professor
HHMI Hanna H. Gray Fellow

Students

Albert Xue

Bioinformatics PhD student

Jingyou Rao

Computer Science PhD student

Nathan LaPierre

Computer Science PhD student

Sandy Kim

Bioinformatics PhD student

Alumni

Ashwin Ranade

BIG Summer 2020

Yiwen Chen

BIG Summer 2020