QCB encompasses a broad range of quantitative and computational biosciences research. We develop cutting-edge quantitative and computational tools ranging from statistical analysis and modeling approaches to physics-based algorithms and mechanistic modeling.

Computational Analysis and Software Tools by QCBio-affiliated Laboratories

Interpreting Next Gen Seq Data

GENESCISSORS
http://csbio.unc.edu/genescissors/
A comprehensive approach to detecting and correcting spurious transcriptome inference due to RNA-seq read misalignment.

GIREMI
https://www.ibp.ucla.edu/research/xiao/GIREMI.html
GIREMI is a method to identify RNA editing sites and distinguish them from SNPs using RNA-Seq data.
http://www.nature.com/nmeth/journal/v12/n4/full/nmeth.3314.html

LAPELS
https://github.com/shunping/lapels
Remaps reads aligned to the in silico genome back to the reference coordinate and annotates variants.

NMFP
(Non-negative Matrix Factorization based Preselection)
http://www.stat.ucla.edu/~jingyi.li/software-and-data.html
NMFP is a non-negative matrix factorization based pre-selection method to increase accuracy of identifying mRNA isoforms from RNA-seq data.

RASER
https://www.ibp.ucla.edu/research/xiao/RASER.html
Read aligner for SNPs and editing sites of RNA.
http://bioinformatics.oxfordjournals.org/content/early/2015/09/04/bioinformatics.btv505.abstract

RNA-Skim
https://github.com/zzj/RNASkim
A rapid method for RNA-Seq quantification at transcript level

ROP
https://github.com/smangul1/rop
A computational protocol that aims to discover the source of all reads, including those originating from complex RNA molecules, recombinant antibodies, and microbial communities.

SURVIV
https://github.com/Xinglab/SURVIV
Survival Analysis of mRNA Isoform Variation

SUSPENDERS
https://github.com/holtjma/suspenders
Merges multiple alignments of the same reads under different pretenses.

Exploring Molecular Genomics

CHROMHMM
http://compbio.mit.edu/ChromHMM/
ChromHMM allows for chromatin state discovery and characterization.
http://www.nature.com/nmeth/journal/v9/n3/full/nmeth.1906.html

CHROMIMPUTE
http://www.biolchem.ucla.edu/labs/ernst/ChromImpute/
ChromImpute imputes specific epigenomic datasets from a number of available datasets. This can fill in missing data or provide a means to identify and correct low-quality data.
http://www.nature.com/nbt/journal/v33/n4/full/nbt.3157.html

DYNAMIC REGULATORY EVENTS MINER (DREM)
http://www.sb.cs.cmu.edu/drem/
DREM is used for the analysis of dynamically changing TF binding events or mRNA abundances, as revealed in time-series NGS datasets.
http://msb.embopress.org/content/3/1/74

MATS
http://rnaseq-mats.sourceforge.net/
A computational tool to detect differential alternative splicing events from RNA-Seq data.

RRHO
(Rank-Rank Hypergeometric Overlap gene expression signature comparison)
http://systems.crump.ucla.edu/rankrank
Algorithm for comparing and visualizing overlap in two gene expression signatures input as ranked gene lists.
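
The core idea of a rank-rank overlap test can be sketched as follows. This is a minimal illustration, not the RRHO implementation itself: it scores a single pair of rank cutoffs with a one-sided hypergeometric test, whereas a full analysis would scan many cutoff pairs to build a map. The function names `hypergeom_sf` and `top_overlap_pvalue` and the toy gene lists are assumptions made for this example.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n items are drawn from N, of which K are 'marked'."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible terms drop out of the sum automatically.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def top_overlap_pvalue(ranked_a, ranked_b, i, j):
    """Overlap of the top i genes of list A with the top j genes of list B.

    Returns (overlap size, hypergeometric p-value). Only genes present in
    both lists are considered, so both rankings share one universe.
    """
    universe = set(ranked_a) & set(ranked_b)
    a = [g for g in ranked_a if g in universe]
    b = [g for g in ranked_b if g in universe]
    top_a, top_b = set(a[:i]), set(b[:j])
    k = len(top_a & top_b)
    return k, hypergeom_sf(k, len(universe), len(top_a), len(top_b))


# Toy ranked lists (hypothetical gene names).
list_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
list_b = ["g2", "g1", "g3", "g6", "g5", "g4"]
overlap, p_value = top_overlap_pvalue(list_a, list_b, 3, 3)
print(overlap, p_value)
```

Repeating this calculation over a grid of cutoffs (i, j) gives one value per cell, which is the kind of matrix a rank-rank overlap visualization would display.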

SAVANT
http://pathways.mcdb.ucla.edu/savant/
The web-based Signature Visualization Tool (SaVanT) visualizes cell-type-specific gene expression signatures in user-generated expression data.

SHORT TIME-SERIES EXPRESSION MINER (STEM)
http://sb.cs.cmu.edu/stem/
For clustering and analyzing short time series gene expression data.
http://www.biomedcentral.com/1471-2105/7/191

TROM
https://cran.r-project.org/web/packages/TROM/index.html
For comparing transcriptomes of two biological samples from the same or different species. The comparison (i.e., transcriptome mapping) is conducted based on the overlap of the associated genes of different samples. More examples and detailed explanations are available in the vignette.

WGCNA
(Weighted Gene Co-Expression Network Analysis)
https://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/

Interpreting Clinical Genetics

ASGENSENG
https://sourceforge.net/projects/asgenseng/
Software to detect allele-specific CNVs from both WGS and WES data.

BEAST
Bayesian Evolutionary Analysis by Sampling Trees, for Bayesian phylogenetic inference.
http://mbe.oxfordjournals.org/content/29/8/1969

CAVIAR
(CAusal Variants Identification in Associated Regions)
http://genetics.cs.ucla.edu/caviar/ or https://github.com/fhormoz/caviar
CAVIAR implements a new statistical framework that allows for the possibility of an arbitrary number of causal variants in genome-wide association studies.

http://www.genetics.org/content/early/2014/08/06/genetics.114.167908

FASTANOVA
http://compgen.unc.edu/wp/?page_id=275
An Efficient Algorithm for Genome-Wide Association Study

FOURSITE
https://github.com/LohmuellerLab/FourSite
Estimates heterozygosity from sequencing reads in low-coverage data

GAIA
https://sourceforge.net/projects/discriminatives/
The author's implementation of the discriminative subgraph pattern mining algorithm from “GAIA: graph classification using evolutionary computation” (SIGMOD’10).

GAIN
http://liuyi1.com/GAIN/
Efficient Genome Ancestry Inference in Complex Pedigrees

GENOTYPE SEQUENCE SEGMENTATION
http://compgen.unc.edu/wp/?page_id=253
http://compgen.unc.edu/wp/wp-content/uploads/2008/07/minseg-final.pdf

GENSENG
https://sourceforge.net/projects/genseng/
Software for detecting copy number variations (CNVs) from next-generation sequencing (NGS) data.

HTREEQA
http://www.csbio.unc.edu/htreeqa/
Using semi-perfect phylogeny trees in quantitative trait loci study on genotype data

IGMS
(Inferring Genome-wide Mosaic Structure)
http://compgen.unc.edu/wp/?page_id=256
http://compgen.unc.edu/minmosaic/

MACH-ADMIX
http://www.unc.edu/~yunmli/MaCH-Admix/
A genotype imputation software package that extends the capabilities of MaCH 1.0.

MENDEL
http://www.genetics.ucla.edu/software/download?package=1
Mendel is a comprehensive package for statistical analysis of qualitative and quantitative traits.
http://www.ncbi.nlm.nih.gov/pubmed/26567478
http://www.ncbi.nlm.nih.gov/pubmed/24955378

NPUTE
http://compgen.unc.edu/wp/?page_id=57
Fast Algorithm for Imputing Missing Genotypes in SNPs

PAINTOR
http://bogdan.bioinformatics.ucla.edu/software/paintor
PAINTOR integrates functional and association data in fine-mapping studies.

PASANIUC LAB TOOLS
Integrating functional data to prioritize causal variants in statistical fine-mapping studies.
PLoS Genet. 2014 Oct 30;10(10):e1004722.
Leveraging functional annotation data in trans-ethnic fine-mapping studies.
Am J Hum Genet. 2015, 97(2):260-71.

PREFERSIM
https://github.com/LohmuellerLab/PReFerSim
Performs forward in time population genetic simulations.
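
For readers unfamiliar with the term, a forward-in-time simulation steps a population through successive generations. The sketch below uses a plain Wright-Fisher model of genetic drift for one biallelic site. It is an illustration of the general idea only and is not PReFerSim's algorithm or interface; the function name `wright_fisher` and all parameter values are assumptions made for this example.

```python
import random


def wright_fisher(pop_size, start_freq, generations, seed=0):
    """Simulate allele-frequency drift forward in time.

    pop_size is the number of diploid individuals, so each generation
    holds 2 * pop_size allele copies. Each copy in the next generation
    is drawn independently from the current allele frequency.
    Returns the list of frequencies, starting with start_freq.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    n_copies = 2 * pop_size
    freq = start_freq
    trajectory = [freq]
    for _generation in range(generations):
        # Binomial sampling of the next generation's allele copies.
        count = sum(1 for _copy in range(n_copies) if rng.random() < freq)
        freq = count / n_copies
        trajectory.append(freq)
    return trajectory


trajectory = wright_fisher(pop_size=50, start_freq=0.5, generations=20, seed=1)
print(trajectory[-1])
```

Once the frequency reaches 0 or 1 it stays there, which is how loss and fixation appear in this model.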

REM
http://csbio.unc.edu/eQTL/

TREEQA
http://compgen.unc.edu/wp/?page_id=239
Tree-based quantitative genome-wide association mapping

TWAS
http://bogdan.bioinformatics.ucla.edu/software/twas/
Transcriptome-wide association study through expression imputation
Nat Genet. 2016 48(3):245-52

Integrating Clinical Data

ANGICART
https://github.com/mnewberry/angicart/
Angicart analyzes 3D radiographic images of blood vessels to determine the centerlines, topology, radius, length, and volume of blood vessel segments.
PLoS Comput Biol 11(8): e1004455.

FFSM
Fast Frequent Subgraph Mining
https://sourceforge.net/projects/ffsm/

LTS
https://sourceforge.net/projects/learning2search2/
An optimized Java implementation of the algorithm from “LTS: Discriminative Subgraph Mining by Learning from Search History” in Data Engineering (ICDE), IEEE 27th International Conference, pages 207-218, 2011.

MERGEOMICS
http://mergeomics.research.idre.ucla.edu
Mergeomics integrates multidimensional genomic data to identify biological pathways, gene networks, and key regulators of a disease or physiological trait.

http://biorxiv.org/content/early/2016/01/07/036012

Making Biomed BigData Accessible

LOS ANGELES DATA RESOURCE (LADR)
http://ctsi.ucla.edu/researcher-resources/pages/LADR
A joint project of major Los Angeles healthcare provider organizations (including UCLA, Cedars-Sinai (CSMC), Charles Drew University (CDU), USC, Children’s Hospital Los Angeles (CHLA), and the City of Hope) aimed at enabling research that improves the health of all people in the region, using data representing the continuum of care across the region’s major health systems. LADR allows investigators to conduct interactive searches across the participating organizations on patient demographics, diagnosis and procedure codes (ICD-9 and CPT), labs, and medications, and will be available to investigators and their research teams for study recruitment. LADR formally launched in May 2014 with two organizations, UCLA and CSMC, and a total of 6.8 million patient records. Three additional institutions, USC, CHLA, and CDU, have joined LADR since 2015. A key feature being developed for LADR is its “private record linkage” technology, which identifies data from the same patients across the participating organizations. By creating this linkage, LADR will enable institutions to assemble more data on patient treatments and other exposures, along with more data on their outcomes, empowering research that could not be conducted by any individual organization.

OHDSI
http://www.ohdsi.org
The Observational Health Data Sciences and Informatics (or OHDSI, pronounced “Odyssey”) program is a multi-stakeholder, interdisciplinary collaborative to bring out the value of health data through large-scale analytics.
http://www.ncbi.nlm.nih.gov/pubmed/26262116

REDCAP: RESEARCH ELECTRONIC DATA CAPTURE
http://ctsi.ucla.edu/researcher-resources/pages/REDCap
REDCap (Research Electronic Data Capture) is a secure, HIPAA-compliant, web-based application for quickly building and managing online surveys, data collection forms, and databases. REDCap provides audit trails for tracking data manipulation and user activity, as well as automated export procedures for seamless data downloads to Excel, PDF, and common statistical packages (SPSS, SAS, Stata, R).

UC-RESEARCH EXCHANGE
http://ctsi.ucla.edu/researcher-resources/pages/ucrex
The University of California Research eXchange (UC-ReX) is a joint activity of the five University of California (UC) CTSAs, charged with fostering multi-site clinical research by providing access to harmonized clinical data from the five health systems. The UC ReX Data Explorer is a secure online system designed to enable UC clinical investigators to identify potential research study cohorts spanning the five UC medical centers. The Data Explorer allows investigators to conduct interactive searches of data derived from patient care activities at Davis, Irvine, Los Angeles, San Diego, and San Francisco. Search criteria can include demographics, diagnosis and procedure codes (ICD-9 and CPT), labs, and medications. The output of each query from the UC ReX Data Explorer is a numeric count, by site, of the patients who match the criteria identified in the query. The numeric count helps investigators assess the feasibility of their study idea by identifying whether there are sufficient numbers of prospective subjects within the UC system.
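
The shape of such a query result, a count of matching patients per site, can be illustrated with a small sketch. This is not the UC ReX Data Explorer's interface or data model: the record fields, the function name `count_by_site`, and the toy records are all hypothetical and exist only for this example.

```python
from collections import Counter


def count_by_site(patients, min_age, diagnosis_code):
    """Count patients per site who meet an age cutoff and carry a diagnosis code.

    Each patient is a dict with hypothetical keys 'site', 'age', and
    'diagnoses' (a set of code strings). Only counts are returned,
    never the underlying records.
    """
    counts = Counter()
    for patient in patients:
        if patient["age"] >= min_age and diagnosis_code in patient["diagnoses"]:
            counts[patient["site"]] += 1
    return dict(counts)


# Toy records, invented for illustration only.
toy_patients = [
    {"site": "Davis", "age": 54, "diagnoses": {"250.00"}},
    {"site": "Davis", "age": 31, "diagnoses": {"250.00"}},
    {"site": "Irvine", "age": 67, "diagnoses": {"250.00", "401.9"}},
    {"site": "Los Angeles", "age": 72, "diagnoses": {"401.9"}},
    {"site": "Irvine", "age": 45, "diagnoses": {"250.00"}},
]
site_counts = count_by_site(toy_patients, min_age=40, diagnosis_code="250.00")
print(site_counts)
```

Returning only aggregate counts lets an investigator judge feasibility without seeing any individual record.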

Exploring Dynamical Cell Biology

CTCS MODELS: Co-transcriptional constitutive Splicing model
https://github.com/jdavisturak/CTCSmodel
Co-transcriptional splicing is a dynamic process that renders introns as potential bottlenecks for efficient mRNA processing. This model allows exploration of the parameters that affect the efficiency of constitutive mRNA processing.
Nucleic Acids Res. 2015 Jan;43(2):699-707

DSNICKFURY
https://github.com/michael-weinstein/dsNickFury2
A Python3 program to help select guide RNA sequences for use with any CRISPR/Cas system.

FLOWMAX
Interpreting Lymphocyte dynamics
http://www.signalingsystems.ucla.edu/models-and-code/
Dye dilution experiments (typically CFSE) are commonly used to investigate the population dynamics of lymphocytes in response to immunogenic stimulation. FlowMax interprets such data to derive cell biological parameters (such as probability to grow, time to division, and time to death) with a measure of confidence.
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0067620
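
The principle behind dye dilution is that the dye is split roughly in half at each cell division, so a cell's fluorescence indicates how many times it has divided. The sketch below applies that relationship directly. It is a simplified illustration, not FlowMax's fitting procedure: it ignores autofluorescence and measurement noise, and the function names and intensity values are assumptions made for this example.

```python
from math import log2


def division_number(undivided_intensity, cell_intensity):
    """Estimate how many divisions a cell has undergone.

    Assumes fluorescence halves at every division, so
    divisions = log2(undivided / observed), rounded to the nearest
    whole generation and floored at zero.
    """
    if undivided_intensity <= 0 or cell_intensity <= 0:
        raise ValueError("intensities must be positive")
    return max(0, round(log2(undivided_intensity / cell_intensity)))


def generation_counts(undivided_intensity, intensities):
    """Tally cells per estimated generation, ordered by generation."""
    counts = {}
    for value in intensities:
        generation = division_number(undivided_intensity, value)
        counts[generation] = counts.get(generation, 0) + 1
    return dict(sorted(counts.items()))


# Toy fluorescence values (arbitrary units), invented for illustration.
cells = [1000, 510, 490, 260, 120]
histogram = generation_counts(1000.0, cells)
print(histogram)
```

A per-generation tally like this is the raw material from which population-level parameters would then be fitted.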

KINETIC MODELS OF NFκB DYNAMICS
http://www.signalingsystems.ucla.edu/webmodel/view.DetailSelectModel.php
This web-interface allows access to a series of mathematical models that simulate NFκB dynamics in response to different stimuli. The user may investigate the effects of knockouts or kinetic reactions on the dynamics of NFκB.
Immunol Rev. 2012 Mar;246(1):221-38.

py-SUBSTITUTION
Distinguishing between kinetic and static features within a molecular network
http://www.signalingsystems.ucla.edu/models-and-code/
Typical formulations of dynamical systems models of molecular networks involve kinetic parameters that affect both the abundances and the flux of molecular species. py-Substitution allows for analytical expressions of the steady state that enable the study of abundances and fluxes separately.
PLoS Comput Biol. 2013 Feb; 9(2): e1002901

SPOTLITE
https://lbgsites2.bioinf.unc.edu/spotlite/
Web Application and Augmented Algorithms for Predicting Co-Complexed Proteins from Affinity Purification – Mass Spectrometry Data.