B.I.G. Summer 2024 – Institute for Quantitative and Computational Biosciences

2024 Bruins-In-Genomics Summer Undergraduate Research Program

2024 B.I.G. Summer Participants

Lab PIs	Mentors	Students
VALERIE ARBOLEDA	Maneesha Thaker	Samantha Scott
MEHDI BOUHADDOU		Matias Lee
		Ricardo Roure
PAUL BOUTROS	Nicole Zeltser	Isabella Lamont
	Helena Winata	Elise Stagaman
	Helena Winata	Jacob Valenzuela
	Jaron Arbet	Adriana Wiggins
JEFF CHIANG		Alex Chen
		Joy Cheng
TOM CHOU		Henry Li
LUIS DE LA TORRE UBIETA	Celine Vuong	Martin Ibarra
	Celine Vuong	Jesus Mauricio
MARIO DIPOPPA	Matteo Mariani and Timothy Lindsey	Marcello Berger
	Matteo Mariani and Timothy Lindsey	Kate Jackson
JASON ERNST	Jingyuan Fu	Gideon Shaked
NANDITA GARUD	Aina Martinez i Zurita	Evelyn Barajas
	Michael Wasney	Alexandria Hunt
DANIEL GESCHWIND		Jesus Velazquez
ALEXANDER HOFFMANN	Jennifer Chia	Aryahi Deorukhkar
	Haripriya Vaidehi Narayanan	Chengyuan Li
	Jennifer Chia	David Mastro
WILLIAM HSU	Lottie (Luoting) Zhuang	Leonard Garcia
JIMMY HU		Emily Nguyễn
		Troy Osborn
		Keera Puett
BEN KNOWLES		Naomi Barber-Choi
		Sarah Bonver
		Jacob Fisher
		Kyle Kalindjian
		Jacob Kelman
		Madelaine Leitman
		Carly Rabun
HUIYING LI		Paul Chou
CHONGYUAN LUO		Edward Chen
		Thea Traw
RENATE LUX		Jennifer Rios-Roodriguez
AARON MEYER	Andrew Ramirez & Jackson Chin	Salina Adhanom
	Meera Trisal & Jackson Chin	Ethan Hung
	Andrew Ramirez & Jackson Chin	Ayana Price
TIMOTHY O'SULLIVAN		Jingtong Liang
LOES OLDE LOOHUIS	Aditya Pimplaskar	Yael Beshaw
	Aditya Pimplaskar	Lotem Efrat
ROEL OPHOFF	Lingyu Zhan	Beyza Duymayan
	Lingyu Zhan	Alexander Everett
MATTEO PELLEGRINI	Fei-Man Hsu	Galen Heuer
	Fei-Man Hsu	Praveena Ratnavel
	Fei-Man Hsu	Lily Zello
FLAVIA PIRIH	Steven Gonzalez and Davi Silva	Angelina Savino
	Steven Gonzalez and Davi Silva	Gloria Wang
ANTONI RIBAS	Katie Campbell	Cadence Chang
MARLIN TOUMA		Charlotte Wolf
DANIEL TWARD	Siqi Fang	Josef Lied
	Siqi Fang	Faria Tavacoli
SHARMILA VENUGOPAL		Jenna Jabourian
WEI WANG		Jared Arroyo-Ruiz
		Anapaula Camou
FANG WEI	Rachel Fox	Louise Oh
		Hassoon Sarwar
JENNIFER WILSON		Lydia Longfritz
		Jiale Yang
DAVID WONG	Irene Choi	Dav Trivedi
	Irene Choi	Aaron Zander
XINSHU XIAO	Giovanni Quinones-Valdez	Ryan Barney
	Giovanni Quinones-Valdez	Taylor Harris
XIA YANG		Shunsuke Kikuchi
HONG ZHOU	Alex Stevens	Ethan Crofut

2024 B.I.G. Summer Poster Abstracts

ADHANOM: Identifying variation in COVID-19 pneumonia patients and predicting patient metadata using PARAFAC2 and logistic regression

Identifying variation in COVID-19 pneumonia patients and predicting patient metadata using PARAFAC2 and logistic regression

SALINA ADHANOM¹, Andrew Ramirez², Jackson Chin², Dr. Meyer²

¹Department of Bioengineering, University of California, Los Angeles, CA 90024, USA

²BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

Patients admitted to the intensive care unit (ICU) with COVID-19 pneumonia spend longer in the ICU than patients with pneumonia caused by other pathogens. To understand how COVID-19 affects patient outcomes and to identify patterns underlying the differences between pneumonia-causing pathogens, we applied the tensor factorization method PARAFAC2 to scRNA-seq measurements collected from bronchoalveolar lavages. PARAFAC2 identifies common variation in cells and genes across patients, allowing us to isolate trends specific to groups of patients. Using the results of the PARAFAC2 model, we performed principal component analysis (PCA) to identify which components are relevant to distinguishing COVID-19 pneumonia. We also related the PARAFAC2 results to the patients' electronic health records using logistic regression, linking biological variation to patient demographics.

Salina Adhanom-B.I.G-Summer-Poster-Dr.Meyer-Lab.pptx

ARROYO RUIZ: Investigating the Influence of Social Determinants of Health on Oral Microbiome Diversity

Investigating the Influence of Social Determinants of Health on Oral Microbiome Diversity

JARED ARROYO RUIZ1, Anapaula Camou1, Yan Wang2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Division of Oral and Systemic Health, Public and Population Health, School of Dentistry, UCLA

Understanding the impact of social determinants of health (SDOH) on oral microbiome diversity is essential for effective public health strategies. This study examines how composite SDOH scores affect alpha and beta diversity metrics of the oral microbiome using data from the NHANES 2009-2010 and 2011-2012 cycles. We also explore the interaction of oral health perceptions with SDOH and their impact on microbiome composition. Our analysis shows that lower SDOH scores correlate with increased alpha diversity but reduced beta diversity, suggesting a wider range of microbial species but less variation between individuals. Furthermore, positive oral health perceptions and experiences can partially counteract the negative effects of lower SDOH scores on microbiome diversity, highlighting the importance of improving oral health perceptions to mitigate the impacts of social inequalities on microbiome health.
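
The abstract does not state which alpha and beta diversity metrics were used. As an illustration only, the sketch below assumes the Shannon index for alpha diversity (diversity within one sample) and Bray-Curtis dissimilarity for beta diversity (difference between two samples). The function names and the toy taxon counts are hypothetical and are not taken from the study.

```python
import math


def shannon_index(counts):
    """Shannon diversity (natural log) of one sample's taxon counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample must contain at least one observation")
    # Taxa with zero counts contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples over the same taxa."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must list counts for the same taxa")
    total = sum(sample_a) + sum(sample_b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(sample_a, sample_b)) / total


# Toy counts for four taxa in two hypothetical oral samples.
even_sample = [10, 10, 10, 10]    # evenly spread: higher alpha diversity
skewed_sample = [37, 1, 1, 1]     # dominated by one taxon: lower alpha diversity

print("alpha (even):  ", shannon_index(even_sample))
print("alpha (skewed):", shannon_index(skewed_sample))
print("beta (even vs skewed):", bray_curtis(even_sample, skewed_sample))
```

Under these assumed metrics, "higher alpha diversity" means a more even spread of taxa within each person, while "lower beta diversity" means that pairwise dissimilarities between people are smaller on average.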

BARAJAS: Detecting parallel evolution of bacteria in infant gut microbiomes

Detecting parallel evolution of bacteria in infant gut microbiomes

EVELYN BARAJAS1, Aina Martinez Zurita2, 3, Nandita Garud2, 3

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Ecology and Evolutionary Biology, UCLA

3 Department of Human Genetics, UCLA

Infant gut microbiomes undergo significant evolutionary changes during the first year of life, due to an initial process of bacterial colonization followed by shifts in their diet. We aimed to investigate whether parallel allele frequency changes—multiple independent occurrences of the same evolutionary change across different individuals—occur in infants’ gut microbiomes. We analyzed temporally sampled data from the Backhed et al. 2015 dataset to determine allele frequency changes at different time points in the first year of life: birth, 4 months, and 12 months. By aggregating representative non-synonymous sites on a per-gene basis and using generalized linear models, we aimed to observe clear evidence of parallelism in these allele frequency changes. Discovering parallelism would indicate that certain bacterial genes provide adaptive benefits during early gut colonization, offering insights into microbial evolution and its impact on infant gut health.

Barajas_Evelyn_Poster

BARBER-CHOI: Finding Coexistence: Stochastic agent-based modeling of viral-host interactions

Finding Coexistence: Stochastic agent-based modeling of viral-host interactions

NAOMI BARBER-CHOI1, Isha Tripathi1, Benjamin Knowles2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Ecology and Evolutionary Biology, UCLA

The most widely used theoretical models of virus-host interaction are the century-old Lotka-Volterra ordinary differential equations (ODEs). However, our lab identified theoretical and empirical concerns with applying these models to virus-host interactions. While viruses and hosts are able to coexist in nature, coexistence is rarely observed in ODE models because the encounter term between viruses and hosts predicts more lysis events than are possible. We therefore created an agent-based model that is more realistic than the ODE model to examine virus-host interactions. By probing our agent-based model, we were able to identify parameter sets for viruses and hosts that led to virus-host coexistence, host proliferation, or systemwide extinction. Future work will continue using the agent-based model to probe competition dynamics between temperate and lytic viruses, giving insight into the composition and impacts of viral communities across ecosystems from coral reefs to the human gut.
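
For readers unfamiliar with the ODE framework mentioned above, a common textbook form of a Lotka-Volterra-style virus-host model is sketched below. The abstract does not give the lab's exact equations, so the symbols here are assumptions: H is host density, V is virus density, r is the host growth rate, \phi is the encounter (adsorption) rate, \beta is the burst size, and m is the viral decay rate.

```latex
\begin{aligned}
\frac{dH}{dt} &= rH - \phi H V, \\
\frac{dV}{dt} &= \beta \phi H V - mV.
\end{aligned}
```

The encounter term \phi H V is the mass-action product of the two densities, so the modeled lysis rate grows in proportion to both H and V. This is the term the abstract identifies as the source of unrealistically many lysis events.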

Naomi Barber-Choi-NAOMI_POSTER

BARNEY: Investigating the Effects of Allele-specific Alternative Polyadenylation on Alzheimer’s Disease

Investigating the Effects of Allele-specific Alternative Polyadenylation on Alzheimer’s Disease

RYAN BARNEY1,4, Giovanni Quinones-Valdez1,2,3, and Xinshu Xiao1,2,3,4

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Bioengineering, UCLA

3 Department of Integrative Biology and Physiology, UCLA

4 Bioinformatics Interdepartmental Master’s Program, UCLA

Single nucleotide polymorphisms (SNPs) could contribute to the development or progression of Alzheimer’s Disease (AD) pathology. Allele-specific processing at these SNPs has the potential to drastically alter cellular pathways, particularly at the RNA level. Discovering these alternative RNA processing events is critical to our effort to comprehend and combat AD. We investigate how allele-specific alternative polyadenylation (ASAPA) offers insight into the functional consequences of SNPs in AD patients. Using the allele-specific alternative mRNA processing pipeline (ASARP) developed by our lab, we identified SNPs in RNA-seq data from 364 human brains from the Mount Sinai Brain Bank (MSBB). We show that ASAPA events occur in genes with demonstrated AD relevance and have downstream effects on processes such as RNA-binding protein (RBP) binding. Taken together, our results provide strong evidence that ASAPA is a contributor to AD.

BIG_Summer_Poster_RyanBarney

BERGER, JACKSON: Inferring Cortical Circuit Structure from Patterns of Correlated Activity Across Cell Types

Inferring Cortical Circuit Structure from Patterns of Correlated Activity Across Cell Types

MARCELLO BERGER1,2, KATE JACKSON1,3, Timothy Lindsey4,5, Matteo Mariani5, Mario Dipoppa5

1BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2Department of Mathematics and Statistics, Williams College

3Department of Computer Science and Engineering, UCSD

4Bioinformatics Interdepartmental Program, UCLA

5Department of Neurobiology, David Geffen School of Medicine, UCLA

The cerebral cortex performs the most complex operations in the mammalian brain including sensory, motor, and cognitive processes. Neural circuits within the cortex exhibit correlated activity, which profoundly impacts brain computations. However, it is not yet fully understood how observed patterns of correlation across cortical cell types emerge from the underlying circuit. This project takes two approaches: first, using Bayesian inference with Markov Chain Monte Carlo methods to infer the parameters of neural circuits from patterns in neural activity; and second, creating a simplified mathematical model that relates correlations in neural activity to network parameters. Our results indicate that Bayesian inference can effectively infer network parameters from experimental data even at high levels of noise. Additionally, for large networks, average correlations in neural activity across cell types can be accurately approximated linearly with input correlations. Going forward, these methods could help construct models that validate and predict brain activity.

Kate Jackson-BIG_Summer_Poster

BESHAW, EFRAT: Predicting conversion to bipolar disorder in a cohort of major depressive disorder patients utilizing polygenic scores and EHR data

Predicting conversion to bipolar disorder in a cohort of major depressive disorder patients utilizing polygenic scores and EHR data

YAEL BESHAW1, LOTEM EFRAT1, Aditya Pimplaskar2, Loes Olde Loohuis2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, David Geffen School of Medicine, UCLA

Bipolar disorder (BD) is a severe mood disorder often initially diagnosed as major depressive disorder (MDD) because it typically begins with depressive episodes [1]. Delayed diagnosis can lead to inappropriate treatments and worsened patient outcomes. To predict MDD to BD conversion, we leverage electronic health records (EHR) from Colombia and publicly available genome-wide association studies (GWAS) to generate 13 polygenic scores (PGS) for psychiatric phenotypes. Our cohort included 1738 patients with a first severe mental illness diagnosis of MDD, 575 of whom had a delayed BD diagnosis. Logistic regression with age, sex, ICD-10 MDD subcodes, MDD hospitalization, and all 13 PGS revealed that, after accounting for multiple testing, BD PGS (p=0.003, OR=1.202) was the only significant PGS indicator of conversion. Furthermore, similar logistic regressions with each PGS individually showed that BD (p=8.75E-6, OR=1.279), Suicide Attempt (p=0.002, OR=1.185), SCZ (p=9.16E-5, OR=1.241), and Suicidality (p=0.006, OR=1.158) were significant. These results provide insights into genetic risk factors for MDD-BD conversion, with implications for psychiatric precision medicine and improvements in patient care.

[1] Service SK, De La Hoz J, Diaz-Zuluaga AM, Arias A, Pimplaskar A, Luu C, Mena L, Valencia J, Ramírez MC, Bearden CE, Sabbati C, Reus VI, López-Jaramillo C, Freimer NB, Loohuis LMO. Predicting diagnostic conversion from major depressive disorder to bipolar disorder: an EHR based study from Colombia. medRxiv [Preprint]. 2023 Oct 2:2023.09.28.23296092. doi: 10.1101/2023.09.28.23296092. PMID: 37873340; PMCID: PMC10593019.
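
The odds ratios (OR) quoted in the abstract above come from logistic regression, where an OR is the exponential of a fitted coefficient. The sketch below only illustrates that relationship using the OR values reported for the single-PGS models. It assumes each PGS was scaled so that one unit is a meaningful step (for example one standard deviation), which the abstract does not state. The function names are hypothetical.

```python
import math


def log_odds_coefficient(odds_ratio):
    """Logistic regression coefficient implied by an odds ratio."""
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    return math.log(odds_ratio)


def odds_multiplier(odds_ratio, units):
    """Factor by which the odds change when the predictor moves by `units`."""
    return odds_ratio ** units


# Odds ratios reported in the abstract for the individually fitted PGS models.
reported_or = {
    "BD": 1.279,
    "SCZ": 1.241,
    "Suicide Attempt": 1.185,
    "Suicidality": 1.158,
}

for name, value in reported_or.items():
    beta = log_odds_coefficient(value)
    two_units = odds_multiplier(value, 2)
    print(f"{name}: coefficient={beta:.3f}, odds multiplier for +2 units={two_units:.3f}")
```

An OR above 1 means that a higher score is associated with higher odds of conversion. The effect compounds multiplicatively, so a two-unit increase multiplies the odds by the square of the OR.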

Yael Beshaw-BIG-POSTER.pptx-2

BONVER: Quantifying Growth Rates in Coral Reef Microbes to Guide Conservation Efforts

Quantifying Growth Rates in Coral Reef Microbes to Guide Conservation Efforts

SARAH BONVER1, Grace Donoghue1, Natalie Falta1, Eleanor Gorham1, Madeleine Leitman1, Benjamin Knowles2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Ecology and Evolutionary Biology, UCLA

Understanding microbial growth dynamics in the ocean is crucial for environmental conservation. Growth rates are commonly assessed using culture-dependent techniques, which can prove challenging given that only about 1% of microbes are culturable. Metagenomics provides an alternative route to predict microbial growth. In this study we used MEGAHIT to assemble metagenomic samples collected from coral reefs across the central Pacific Ocean representing a spectrum of bacterial growth rates. We estimated growth rates from assembled contigs using Codon Usage Bias and three Peak-to-Trough Ratio approaches. We aim to use these tools to understand the relationship between coral reef degradation and bacterial growth rates, given that degraded coral reefs are overgrown with bacteria. Doing so will give us a better understanding of how bacterial growth rate can be targeted for conservation efforts and remediation of degraded coral reefs.
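
The idea behind a peak-to-trough ratio (PTR) is that, in a replicating population, sequencing coverage is highest near the replication origin and lowest near the terminus, so the ratio of the two reflects how actively cells are dividing. The sketch below is a deliberately simplified illustration and is not any of the three approaches used in the study. It assumes a single circular genome with read coverage already binned in genomic order, which is rarely the case for raw metagenomic contigs. The function names and coverage values are hypothetical.

```python
from statistics import median


def smooth_circular(coverage, window):
    """Median-smooth binned coverage, wrapping around a circular genome."""
    n = len(coverage)
    half = window // 2
    return [
        median(coverage[(i + k) % n] for k in range(-half, half + 1))
        for i in range(n)
    ]


def peak_to_trough_ratio(coverage, window=5):
    """Ratio of the highest to the lowest smoothed coverage bin."""
    if not coverage:
        raise ValueError("coverage must not be empty")
    smoothed = smooth_circular(coverage, window)
    trough = min(smoothed)
    if trough <= 0:
        raise ValueError("trough coverage must be positive")
    return max(smoothed) / trough


# Toy coverage: highest at the assumed origin (bin 0), lowest at the terminus.
toy_coverage = [20, 18, 16, 14, 12, 10, 12, 14, 16, 18]
print("PTR:", peak_to_trough_ratio(toy_coverage, window=3))
```

A ratio close to 1 suggests little active replication, while larger values suggest faster growth. Published tools add steps this sketch omits, such as GC-bias correction and handling of unordered contigs.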

SarahBonverPoster

CAMOU: Exploring the Influence of Social Determinants of Health and Oral Microbiome Alpha Diversity on Oral HPV Prevalence

Exploring the Influence of Social Determinants of Health and Oral Microbiome Alpha Diversity on Oral HPV Prevalence

ANAPAULA CAMOU1, Jared Arroyo-Ruiz1, Yan Wang2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Division of Oral and Systemic Health Sciences, Public and Population Health, School of Dentistry, UCLA

Oral Human Papillomavirus (HPV) affects approximately 10% of men and 3.6% of women, with prevalence increasing with age. HPV impacts the mouth and throat and is associated with oropharyngeal cancer, with the CDC estimating that HPV is responsible for 60% to 70% of such cases in the U.S. However, the influence of social determinants of health (SDOH) and the alpha diversity of the oral microbiome on oral HPV positivity remains underexplored. To address this, NHANES (2009-2012) data was analyzed using machine learning regression methods. It was found that overall SDOH scores and their association with alpha diversity scores may influence the likelihood of testing positive for oral HPV. While alpha diversity alone did not have a significant effect, SDOH and HPV seem to alter the oral microbiome. These findings emphasize the need to consider multiple factors when assessing oral HPV prevalence and suggest further research into these interactions.

CHANG: Establishing a semi-automated pipeline for parallelized processing of multiplexed imaging data across sequential melanoma biopsies

Establishing a semi-automated pipeline for parallelized processing of multiplexed imaging data across sequential melanoma biopsies

CADENCE CHANG1, Katie M. Campbell2,3, Antoni Ribas2

1 Bruins in Genomics Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, UCLA

3 Jonsson Comprehensive Cancer Center, UCLA

Multiplexed imaging technology enables the detection of tens to thousands of proteins, making it a powerful tool for analyzing cellular interactions in the tumor microenvironment. However, these platforms can be costly or time-consuming, challenging the detection of many features on a single biopsy slide. We developed a semi-automated workflow to process data from sequential slides to integrate their individual analytes, transferring regional patterns of protein expression and cell types between them. Sequential slides sectioned from clinical biopsies (N=112) were collected from patients (N=39) with anti-PD1 therapy-resistant melanoma. The Vectra Polaris platform processed slides on two protein panels to annotate melanoma and immune cells. Sequential images were registered using the VALIS package, enabling automated translation of melanoma regions from slide 1 to the tissue stained on slide 2; 18% of 73 completed samples required manual modification of registration parameters. This workflow will refine integrated analysis of spatial profiling experiments for orthogonal validation.

CHEN A.: Predicting recovery following traumatic brain injury using multimodal MRI

Predicting recovery following traumatic brain injury using multimodal MRI

ALEX S CHEN1, Yanai Halperin, Paul M Vespa, Martin M Monti, Jeffrey N Chiang1

1 Department of Computational Medicine, University of California, Los Angeles

Disorders of consciousness and coma are common consequences of traumatic brain injury (TBI). Quantifying and prognosticating patient outcomes remains a challenge in neurological care. Current prognostic models consider clinical and demographic factors collected at the bedside to predict favorable or unfavorable long-term outcomes, but do not currently incorporate imaging-derived features, which have been shown to be associated with recovery trajectories in disorders of consciousness. The goal of this study is to assess the added utility of multimodal MR imaging features in predicting 6-month outcomes following severe TBI. A clinically driven score (IMPACT), logistic regression models developed on non-parametric combination (NPC) inference of MR images, and a neural network model (CNN) were used to predict favorable and unfavorable 6-month functional impairment based on the Glasgow Outcome Scale-Extended (GOSe) score. The clinical IMPACT score and all models including multimodal MR features significantly outperformed logistic regression models using clinical and demographic covariates alone. Overall, machine learning models can predict 6-month outcomes following severe traumatic brain injury. Features derived from multimodal MR improve prognostic performance relative to currently used clinical variables.

CHEN E.: Divergent trajectories in pluripotency reprogramming: investigating chromatin accessibility and gene expression during cell fate transition

Divergent trajectories in pluripotency reprogramming: investigating chromatin accessibility and gene expression during cell fate transition

EDWARD CHEN1, Cuining Liu2, Terence Li2, Yu Sun3, Justin Langerman3, Kathrin Plath3, Chongyuan Luo4

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Bioinformatics Interdepartmental Graduate Program, University of California Los Angeles, Los Angeles, CA, USA

3 Department of Biological Chemistry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA

4 Department of Human Genetics, University of California Los Angeles, Los Angeles, CA, USA

The process of reprogramming various somatic cell types into induced pluripotent stem cells (iPSCs) involves large-scale remodeling of gene regulatory networks. We used 10X Multiome to analyze the changes in chromatin accessibility and gene expression involved in cell fate change at various time points during reprogramming. We inferred the possible branching trajectories into successful or failed end states using the partition-based graph abstraction (PAGA) algorithm and characterized their relative change in state using diffusion pseudotime. We found that both presumed iPSCs and stalled cells showed relatively large gene expression changes, as indicated by relatively high pseudotimes. In contrast, the relative pseudotimes showed that putative iPSCs underwent a larger change in chromatin accessibility than stalled cells. Our results suggest that chromatin accessibility and gene expression are not fully linked during the cell fate transition in fibroblast reprogramming.

CHENG: Segmentation of retinal fluid using foundation models

Segmentation of retinal fluid using foundation models

JOY CHENG1,2, Anthony Wu3, Jeffrey N Chiang3,4

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Computer Science, Henry Samueli School of Engineering, UCLA

3 Department of Computational Medicine, David Geffen School of Medicine, UCLA

4 Department of Neurosurgery, David Geffen School of Medicine, UCLA

Neovascular age-related macular degeneration (nAMD) is a leading cause of vision loss worldwide. Optical coherence tomography (OCT) B-scans can be examined for the presence of retinal fluid to assess disease progression. The time-consuming and subjective nature of manual OCT fluid segmentation demands clinically applicable computational segmentation methods; however, the development of these methods is hampered by a lack of high-quality annotated data. We approached automatic segmentation with deep learning by adapting and fine-tuning the vision transformer-based foundation model MedSAM using natural language prompts and benchmarked it against U-Net, a convolutional neural network model. The U-Net outperformed the foundation model in our experiments, but the foundation model offers an innovative approach to automatic segmentation with limited data, and additional research is required to determine its clinical applicability. Our work may be adapted for clinical uses such as predicting visual acuity and quantifying the effectiveness of nAMD treatments.

CHOU: Analysis of Oral Neutrophil Transcriptome Changes before and after Treatment in Periodontal Patients with and without Type 2 Diabetes

Analysis of Oral Neutrophil Transcriptome Changes before and after Treatment in Periodontal Patients with and without Type 2 Diabetes

PAUL CHOU1,2, Huiying Li2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Molecular and Medical Pharmacology, Crump Institute for Molecular Imaging, UCLA

Periodontitis is a significant oral disease affecting about half of American adults. It is associated with inflammation caused by immune cells in response to changes in the subgingival microbiome. Type 2 Diabetes (T2D) upregulates the inflammatory response and promotes tissue destruction, and thus increases the risk of developing periodontitis. It has been suggested that oral neutrophils are the main immune cells responsible for periodontitis progression. However, how neutrophils function differently in T2D patients compared to non-diabetic (ND) individuals remains to be understood. In this study, oral rinse samples were collected from periodontitis patients with or without T2D before and after periodontitis treatment. Oral neutrophils were isolated and RNA sequencing (RNA-Seq) was performed. PCA and hierarchical clustering analyses showed that the neutrophil transcriptome profiles differed between T2D and ND patients. Using DESeq2, we identified a set of genes that were differentially expressed between T2D and ND patients, including TCF7L2, CAPN10, and HHEX, which have been associated with T2D. Using Gene Set Enrichment Analysis (GSEA), we found two pathways, epithelial mesenchymal transition and interferon gamma response, significantly enriched in the differentially expressed genes. This study provides important molecular insights into the differences in neutrophil immune responses between T2D and ND patients in the context of periodontitis, and will help us better understand the interplay between periodontitis and T2D.

BIG_Poster_Paul_Chou

CROFUT: Structure of the native doublet microtubule from Trichomonas vaginalis reveals parasite-specific proteins as potential drug targets

Structure of the native doublet microtubule from Trichomonas vaginalis reveals parasite-specific proteins as potential drug targets

ETHAN H. CROFUT1,2,4, Alexander Stevens1,2,3, Saarang Kashyap1,2, Shuqi E. Wang1, Katherine A. Muratore1, Patricia J. Johnson1, Z. Hong Zhou1,2,3

1 Department of Microbiology, Immunology & Molecular Genetics, UCLA

2 California NanoSystems Institute, UCLA

3 Department of Chemistry and Biochemistry, UCLA

4 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

Doublet microtubules (DMTs) are flagellar components required for the parasite Trichomonas vaginalis (Tv) to swim through the human genitourinary tract and cause trichomoniasis, the most common non-viral sexually transmitted disease. The lack of high resolution DMT structures has prevented structure-guided drug design to manage Tv infection. Here, we determined the cryo-EM structure of native Tv DMTs, identifying 29 unique proteins, including 18 microtubule inner proteins and 9 microtubule outer proteins. Notably, the parasite-specific proteins TvFAP35 and TvFAP40 form filaments at the DMT junctions, providing structural stability important for Tv locomotion. Additionally, TvFAP40 has a small molecule coordinated within a charged binding pocket, which may be targeted by an inhibitor. These structural findings shed light on the diversity of flagellar adaptations and provide a framework to inform rational design of therapeutics.

Crofut_Ethan_Poster_BIGSummer_2024

DEORUKHKAR: Comparative Analysis of scRNA-seq Annotation Pipelines in Hematopoietic Stem and Progenitor Cells

Comparative Analysis of scRNA-seq Annotation Pipelines in Hematopoietic Stem and Progenitor Cells

ARYAHI DEORUKHKAR1,2, Jennifer J. Chia3,4,5, Alexander Hoffmann4,6,7

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Biochemistry, School of Medicine, Case Western Reserve University, Cleveland, OH

3 Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA

4 Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA

5 Broad Stem Cell Research Center, University of California, Los Angeles, CA

6 Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, CA

7 Institute for Quantitative and Computational Biosciences (QCB), University of California, Los Angeles, CA

Hematopoiesis is the process that gives rise to all blood cells, beginning with multipotent and self-renewing hematopoietic stem and progenitor cells (HSPCs) in the bone marrow. Single-cell RNA sequencing (scRNA-seq) is well-suited to study hematopoiesis in health and disease, yet accurate cell identification can be difficult. While mature blood cells have unique markers that allow straightforward annotation, discerning HSPC subtypes is challenging because their transcriptomes are similar. Here, we compared a classifier of total bone marrow cells (Bone Marrow Map; BMM) to our in-lab pipeline developed for purified HSPCs. We first found that BMM assigned purified CD34+ HSPCs to more mature identities than expected. Furthermore, high-confidence labels from our in-lab pipeline better corresponded to de novo clusters than BMM labels in total marrow cells with non-zero CD34. Together these findings suggest that transcriptomically similar HSPC subsets are better resolved by a dedicated classifier than one built for total bone marrow.

Deorukhkar_BIG-Summer-Poster_Final

DUYMAYAN: Identifying repeat expansions in individuals with autism spectrum disorder

Identifying repeat expansions in individuals with autism spectrum disorder

BEYZA DUYMAYAN1, Lingyu Zhan2, Roel A. Ophoff2,3,4

1BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA, Los Angeles, CA USA

2Department of Human Genetics, UCLA, Los Angeles, CA USA

3Center of Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, UCLA, Los Angeles, CA USA

4Department of Psychiatry, Erasmus University Medical Center, Rotterdam, The Netherlands

Repeat expansions are highly polymorphic structural variations in the human genome. When these expansions exceed pathogenic thresholds, they can contribute to various neurological genetic disorders. This study aimed to identify known repeat expansions in individuals with Autism Spectrum Disorder (ASD) using ExpansionHunter, a computational tool that estimates the sizes of the repeats defined in a variant catalog from short-read sequencing data. An extended ExpansionHunter variant catalog was developed, and a suite of advanced genomic analysis tools was employed to perform a pedigree genome-wide association study (PED-GWAS) on families with unaffected parents, a proband child, and an unaffected sibling. Starting from raw whole genome sequence data, associations between single nucleotide polymorphisms (SNPs) within a certain range and the repeat locus were observed. In addition, specific haplotypes linked to repeat expansions were identified. The findings may enhance the screening of patients for disease-causing genetic factors.

Duymayan_Beyza_BIGSUMMER_Poster

EVERETT: Using polygenic scores as a predictive model for bipolar disorder to assess pleiotropic effects

Using polygenic scores as a predictive model for bipolar disorder to assess pleiotropic effects

ALEX EVERETT1, Lingyu Zhan2, Roel Ophoff2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Human Genetics, David Geffen School of Medicine, UCLA

Bipolar disorder is a heritable neuropsychiatric condition characterized by recurring episodes of mania and depression and influenced by complex genetic factors. This study investigates the potential of polygenic scores (PGS) to reveal the genetic relationships between bipolar disorder and various other traits. Using summary statistics from 547 traits, we computed PGS based on genotype data from a cohort of 2111 Dutch patients, including 950 with bipolar disorder. By applying LASSO regression to build a predictive model for case/control diagnosis, our research aims to assess the pleiotropic effects of genetic variants linked with disease risk and to identify the specific traits that most significantly contribute to or are affected by bipolar disorder.

FISHER: Impact of Coral Reef-Derived Viruses on Host Carbon Metabolism: Insights from Genomic Sequencing

Impact of Coral Reef-Derived Viruses on Host Carbon Metabolism: Insights from Genomic Sequencing

JACOB FISHER1, Benjamin Knowles2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Ecology and Evolutionary Biology, UCLA

This research delves into the intricate relationships between marine viruses, derived from coral reefs, and the metabolic pathways of their hosts. Utilizing high-throughput genomic sequencing, we identify and analyze viral proteins that could modulate key enzymes involved in carbon metabolism, particularly focusing on the glycolytic pathway. The objective is to elucidate the strategies employed by these marine viruses to potentially redirect host metabolic processes to favor viral replication. Our initial analyses reveal that these viruses may engage in complex interactions with host cellular mechanisms to manipulate carbohydrate metabolism, suggesting a novel layer of viral influence on coral ecosystem health. Such insights are crucial for understanding the broader implications of viral presence in marine environments and for developing strategies to mitigate their effects on coral reef stability and resilience.

Jacob Fisher-Impact-of-Coral-Reef-Derived-Viruses-on-Host-Carbon

GARCIA: Integration of Imaging Features and Clinical Features for Early Detection of Lung Cancer

Integration of Imaging Features and Clinical Features for Early Detection of Lung Cancer

LEONARD GARCIA1, Luoting Zhuang2, Yannan Lin2, William Hsu2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Medical & Imaging Informatics, Department of Radiological Sciences, David Geffen School of Medicine at UCLA

Lung cancer is the leading cause of cancer-related mortality. Much work has been done on developing machine learning models that can predict the likelihood of lung cancer development from low-dose chest computed tomography scans. Although promising, imaging-based models often overlook valuable clinical information. Therefore, this project developed machine learning algorithms to integrate imaging features extracted from a lung cancer risk prediction model, Sybil, and PLCO clinical features to provide a more accurate and holistic lung cancer risk assessment. We concatenated imaging and PLCO features, trained multimodal machine learning models on NLST data, and evaluated them on a UCLA dataset. The best-performing multimodal model achieved an AUC of 0.933 and an AUPRC of 0.463, outperforming all unimodal models, the best of which achieved an AUC of 0.909 and an AUPRC of 0.454. Multimodal fusion of imaging and clinical features may therefore enhance early detection of lung cancer and help reduce mortality rates.
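The two steps named above, feature concatenation and evaluation by AUC and AUPRC, can be sketched in plain Python. This is a minimal illustration under stated assumptions: the feature values, labels, and scores are made up, no model is trained, and AUPRC is approximated by average precision, which may differ from the estimator used in the project.

```python
def concatenate_features(imaging, clinical):
    """Early fusion: join each patient's imaging and clinical feature lists."""
    return [img + clin for img, clin in zip(imaging, clinical)]


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical features for two patients: two imaging values, one clinical value.
fused = concatenate_features([[0.2, 0.7], [0.9, 0.1]], [[61.0], [54.0]])

# Hypothetical labels (1 = cancer) and model risk scores for five patients.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
auprc = average_precision(labels, scores)
```

AUPRC is sensitive to class imbalance, which is one reason it can sit far below AUC on a screening dataset where cancers are rare.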

Big-Summer-Poster-Leonard-Garcia

HARRIS: Utilizing isoLASER to Identify Allele-Specific Alternative Splicing of HLA Genes to Enable HLA typing

Utilizing isoLASER to Identify Allele-Specific Alternative Splicing of HLA Genes to Enable HLA typing

TAYLOR HARRIS1, Warren Xu2, Giovanni Quinones-Valdez3, Xinshu Xiao2,3

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA
2 Computational and Systems Biology Interdepartmental Program

3 Department of Integrative Biology and Physiology

Human Leukocyte Antigen (HLA) typing is a clinical practice that identifies the pattern of HLA genes in an individual, commonly to determine the match between patients and donors for cord blood or solid organ donation. In addition to the polymorphic nature of these genes, alternative splicing further diversifies the transcriptome repertoire from the HLA region. In this study, we propose a new approach to HLA typing through a computational method, isoLASER. isoLASER leverages long-read RNA-seq data to allow for HLA typing and the identification of allele-specific alternative splicing of these genes. By applying isoLASER to 88 colorectal cancer samples, we could identify the HLA allele groups found in each sample and their associated splicing events. Using isoLASER, we could establish a framework for a significant advancement in efficient HLA typing and processing, improve transplant medicine, and better comprehend the relationship between HLA gene variation and genetic diseases.

HEUER: Cell-free DNA fragmentation signatures for disease state prediction in amyotrophic lateral sclerosis

Cell-free DNA fragmentation signatures for disease state prediction in amyotrophic lateral sclerosis

GALEN HEUER1, Fei-man Hsu2, Christa Caggiano3,4, Noah Zaitlen5,6, Matteo Pellegrini2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Molecular, Cell, and Developmental Biology, UCLA

3 Department of Neurology, UCLA

4 Institute of Genomic Health, Icahn School of Medicine at Mt Sinai

5 Department of Human Genetics, David Geffen School of Medicine, UCLA

6 Department of Computational Medicine, David Geffen School of Medicine, UCLA

Cell-free DNA (cfDNA) is released into bodily fluids after cell death. During this process, DNA is preferentially cleaved at linker regions between nucleosome cores, leading to distinct fragmentation patterns that reflect a cell’s transcriptional accessibility. This has prompted investigation into cfDNA fragmentation as a disease biomarker, notably in cancer. However, this method’s application in neurodegenerative conditions such as amyotrophic lateral sclerosis (ALS) remains unexplored. Here, we analyze cfDNA extracted from blood plasma of two independent cohorts of ALS patients and healthy controls. We observe a distinct difference in cfDNA fragment length between patients and controls in both cohorts, with ALS patients exhibiting shorter fragments than their healthy counterparts on average (p=0.0001). Furthermore, we quantify decreased periodicity of fragment length probability density in ALS patients. Our work lends insight into cell death biology and cfDNA degradation in neurodegenerative processes, with the potential to inform diagnosis and patient outcome.

Galen Heuer-big_poster

HUNG: Endogenous Anti-Tumor Antibody Variation as a Predictor of Breast Cancer Prognosis

Endogenous Anti-Tumor Antibody Variation as a Predictor of Breast Cancer Prognosis

ETHAN HUNG1,2, Michelle Loui3, Jackson Chin3, Meera Trisal3, Crystal Xiao3, Allison Brookhart1,2, Aaron S Meyer2,3,4,5

1Computational and Systems Biology, University of California, Los Angeles (UCLA), USA

2Institute for Quantitative and Computational Biosciences, UCLA, USA

3Department of Bioengineering, UCLA, USA

4Jonsson Comprehensive Cancer Center, UCLA, USA

5Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, UCLA, USA

Cytotoxic antibody responses are regulated through selective binding to mutant or infected cells and subsequent engagement of immune cells via the antibody’s Fc region. Tumor-specific antibody responses have been demonstrated against various tumor-specific surface markers, notably EGFR/HER1 and HER2. However, while exogenous therapeutic monoclonal antibodies have been shown to treat various cancers effectively through their ability to engage immune cells, endogenous patient antitumor antibodies are ineffective at driving tumor elimination in vivo; as a result, they remain clinically underutilized. Here, we attempt to predict progression-free survival using measurements drawn from quantitatively profiling the Fc composition of tumor-specific antibodies through a novel systems serology panel in a cohort of high-grade serous ovarian cancer patients. While the data suggest antibody profiles with differential Fc specificities across patients, we find that survival analysis models such as the Cox Proportional Hazards model are unable to predict patient outcomes well.

HUNT: Detecting horizontal gene transfer during fecal microbiota transplants

Detecting horizontal gene transfer during fecal microbiota transplants

ALEXANDRIA HUNT1, Michael Wasney2,3, Nandita Garud2,3

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Human Genetics, David Geffen School of Medicine, UCLA

3 Department of Ecology and Evolutionary Biology, UCLA

Fecal microbiota transplants (FMTs) are a microbiome-based therapeutic intervention that has been used to successfully treat C. difficile infection, with positive health outcomes linked to the successful engraftment of donor bacterial strains in the recipient gut. However, this correlation is not observed for all diseases. Furthermore, it is unknown whether intraspecific horizontal gene transfer (HGT) occurs frequently during FMTs, or how it shapes clinical outcomes. We hypothesize that HGT during FMT may facilitate the transfer of genetic material from the donor strain to the recipient strain or vice versa, potentially influencing disease outcomes by introducing new functional capabilities. To test this, we developed a pipeline to identify HGT events between co-colonizing strains of the same species and determine the origin of these shared genetic regions. Our approach is an important tool for characterizing evolutionary dynamics in the context of FMT, and could enhance our understanding of the mechanisms underlying the success of microbiome-based therapies.

IBARRA, MAURICIO: Cell-Specific Gene Expression and Chromatin Changes Associated with 17q21.31 Locus Haplotypes During Neurogenesis

Cell-Specific Gene Expression and Chromatin Changes Associated with 17q21.31 Locus Haplotypes During Neurogenesis

MARTÍN IBARRA1, JESUS MAURICIO1, Celine K. Vuong2, Beck Shafie2, Angelo Salinda2, Pan Zhang3, Michael J. Gandal3, Luis de la Torre-Ubieta2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, University of California Los Angeles.

2 Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California Los Angeles.

3 Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania.

The 17q21.31 locus is associated with neuropsychiatric and brain structure phenotypes through genome-wide association studies. Here, we seek to understand cell-specific gene expression and gene-regulatory alterations associated with 17q21.31 haplotypes during neurodevelopment. We generated a single-cell multi-omic map (gene expression + ATAC-seq) from 200,000 differentiating primary human neural progenitor cells (phNPCs) carrying different haplotypes. We observed cell composition changes, including an increase in radial glial cells and a decrease in neuronal cells in the H2=1 haplotype. In addition to genes within the 17q21.31 locus, such as MAPT, KANSL1, and LRRC37A2, we identified 347 genes with significant expression changes at the cell type level. Ongoing analyses are focused on identifying changes in gene regulatory networks. This work enhances our understanding of the genetic mechanisms underlying neuropsychiatric and brain structure phenotypes and could inform targeted approaches for studying and potentially treating disorders associated with the 17q21.31 locus.

JABOURIAN: A correlation-based motif search algorithm for biological network graph models

A correlation-based motif search algorithm for biological network graph models

JENNA JABOURIAN1, Sharmila Venugopal2

1BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2Department of Neurology, David Geffen School of Medicine, UCLA

Brain functions are orchestrated by causal interactions between diverse biomolecules. Currently, there is a lack of systematic approaches to assemble interactions derived from disparate experimental datasets. We recently developed a workflow to gather evidence-based interactions focusing on neuronal functions. We created a novel directed network graph with neurotransmitters, neuroproteins and nucleic acids as nodes and with edges representing pairwise causal associations. Here, we implemented a Python software module for network analysis. The network was modeled as a connectivity matrix in which nodes were denoted by row and column numbers, and positive or negative causal effects were modeled as weighted integers, with zero indicating no association. Iterative reordering revealed subspaces with highly correlated nodes (corr. coeff. ≥ 0.5). As a proof of concept, we searched for and counted 3-node motifs. Ongoing work is automating statistical significance tests for network structure in order to predict novel motifs of multicellular signaling relevant to neuroinflammation and neural excitability functions.
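A signed connectivity matrix of this kind can be searched for 3-node motifs with a short loop. The sketch below is an assumption-laden illustration, not the module described above: it assumes `W[i][j]` holds the effect of node i on node j, it counts only one motif type (the feed-forward loop, where A acts on B, B acts on C, and A also acts on C directly), and the example matrix is invented.

```python
from itertools import permutations


def sign(value):
    """Return -1, 0, or 1 according to the sign of an integer weight."""
    return (value > 0) - (value < 0)


def count_feed_forward_loops(W):
    """Count feed-forward loops, split into (coherent, incoherent).

    A loop is coherent when the sign of the direct edge A->C matches the
    product of the signs along the indirect path A->B->C.
    """
    coherent = incoherent = 0
    for a, b, c in permutations(range(len(W)), 3):
        direct, first, second = W[a][c], W[a][b], W[b][c]
        if direct and first and second:
            if sign(first) * sign(second) == sign(direct):
                coherent += 1
            else:
                incoherent += 1
    return coherent, incoherent


# Hypothetical 4-node network: 0 activates 1, 0 inhibits 2, 1 activates 2, 2 activates 3.
W = [
    [0, 1, -1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
motif_counts = count_feed_forward_loops(W)
```

Testing every ordered triple costs on the order of n cubed operations, which is workable for small curated networks; comparing the counts against randomized networks would be one way to approach the significance tests mentioned as ongoing work.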

Automated development of novel functional interactomes across brain scales, D Takher, S Kadadi, S Venugopal, SACNAS National Diversity in STEM (NDiSTEM) Conference, Oct, 2023, in Portland, Oregon.
Constructing a novel functional interactome to study the dynamic crosstalk between neuroinflammation and neural excitability, JV Morgenland, A Allahverdian, R Knorr, S Venugopal, 50th Society for Neuroscience Meeting, Nov 2021, (Virtual).
Miller, P. (2018). “Connections between Neurons.” An Introductory Course in Computational Neuroscience. The MIT Press. Chapter 5.

Jenna Jabourian-BIG-Summer-Poster

KALINDJIAN: Interferon-responsive genes Oasl1 and Oasl2 are upregulated in aging and inflammation

Interferon-responsive genes Oasl1 and Oasl2 are upregulated in aging and inflammation

KYLE KALINDJIAN1, Dr. Pearl Quijada2, David Wong3, Matthew Tran2, Dr. Eric Small4

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Integrative Biology and Physiology, UCLA

3 Molecular, Cellular, and Integrative Physiology PhD Program, UCLA

4 Medical Center, University of Rochester

As the heart ages, structural and functional changes occur, driven by cellular senescence, oxidative stress, and inflammation. Cardiac fibroblasts contribute to increased stiffness and inflammation, promoting pathological remodeling and elevating disease risk. Although the phenotype of aged cardiac fibroblasts is understood, the specific roles and mechanisms by which inflammatory genes influence these processes in disease models remain unclear. In this study, we used in vitro and computational approaches to identify potential master regulators and key players in interferon signaling pathways that influence inflammation. Our in silico analysis revealed that Oasl1 and Oasl2 are prominent interferon-responsive genes significantly upregulated in aged mice, which we confirmed through genotyping of cardiac fibroblasts treated with reactive oxidative stress-inducing cocktails in vitro. These findings highlight Oasl1 and Oasl2 as potential therapeutic targets for reducing inflammation and fibrosis in age-related cardiac dysfunction, offering new avenues for treatment strategies.

KELMAN: Uncovering the Viral and Bacterial Dynamics Within the Human Gut Microbiome

Uncovering the Viral and Bacterial Dynamics Within the Human Gut Microbiome

JACOB KELMAN1, Michael Iter, Aydin Karatas, Madelaine Leitman1, Benjamin Knowles4

1BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2Department of Quantitative and Computational Biology, USC

3Department of Bioinformatics and Systems Biology, UCSD

4Department of Ecology and Evolutionary Biology, UCLA

The human microbiome has undergone significant changes over time, largely influenced by variations in lifestyle and diet. These changes are more pronounced when compared across industrial and non-industrial populations. A growing body of evidence links these differences in the microbiome to the contemporary rise of microbiome-associated diseases. Despite this, the role of gut viruses in such diseases remains unclear. Recent research from our lab indicates that viruses in the human gut have become increasingly pathogenic, particularly within industrial populations. This pathogenic shift is accompanied by a transition from lytic-dominated to temperate-dominated viral communities. Our research uses genomic protein clustering to develop a network of viral genomes across industrial, non-industrial, and premodern samples. We found high viral diversity and low viral connectedness across premodern metagenomes, and low viral diversity and high viral connectedness in industrial metagenomes, suggesting the homogenization of modern metagenomes. Given the association between temperate viral communities and bacterial growth rates, we investigated the replicative rate of bacteria in the human gut and its association with viral dynamics across a historical timeline, from premodern to modern eras. Through this analysis, we aim to uncover the metabolic and mechanistic foundations of viral lifestyle changes and their pathogenicity in the industrialized and frequently diseased human gut.

KIKUCHI: scGRNdb – Cell Type Level Gene Regulatory Network Database for Single-cell Analysis Framework

scGRNdb – Cell Type Level Gene Regulatory Network Database for Single-cell Analysis Framework

SHUNSUKE KIKUCHI1, Michael Cheng2, Xia Yang3

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Bioinformatics Interdepartmental Program, UCLA

3 Department of Integrative Biology & Physiology, UCLA

Recent advancements in sequencing technology have enabled the acquisition of high-resolution gene expression data at the cellular level. However, existing network databases primarily operate at the tissue level, limiting their ability to capture cellular heterogeneity. To address this gap, we proposed scGRNdb, a comprehensive database of cell type-specific networks. Using SCING, a model from our previous research, we constructed over a thousand cell type-specific gene regulatory networks (GRNs) from 8 human and mouse single-cell repositories. These GRNs facilitated various downstream analyses, including pathway enrichment and key driver analysis, producing interpretable results and offering an alternative method for single-cell data analysis. By comparing against existing tissue-level networks and elucidating disease mechanisms, we confirmed the advantages of this method. scGRNdb will be an interactive web server that allows researchers to perform these analyses and will thereby contribute to the discovery of knowledge across a wide range of biology.

LAMONT: Germline structural variation captured by short-read targeted sequencing in low-risk prostate cancer patients.

Germline structural variation captured by short-read targeted sequencing in low-risk prostate cancer patients.

ISABELLA LAMONT1, Nicole Zeltser2, Nicholas Wang2, Paul Boutros2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Human Genetics, David Geffen School of Medicine, UCLA

Prostate cancer (PC) is one of the most common cancers among men in the United States, and up to 60% of the risk of this cancer can be explained by inherited genetic factors. To capture these genetic factors, a targeted sequencing panel containing genomic regions associated with prostate cancer, including two known structural variants (SVs), was developed. To evaluate the panel’s ability to detect SVs for future PC risk stratification, two pilot cohorts (n = 48) of prostate cancer patients were sequenced with this panel by short-read sequencing (150 bp reads). Patient sequences were aligned to the GRCh38 reference genome using BWA-MEM2, and SVs were called using DELLY. Deletions and inversions were the most prominent structural variants detected, and 50% of the SVs called were also detected in an external cohort of 200 patients with whole-genome sequencing. Among 16 re-sequenced samples, we observed a 74.4% genotype concordance rate. Current methods for classifying PC risk are limited in accuracy; precise SV detection could therefore improve risk stratification.
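One simple way to compute a genotype concordance rate between two sequencing runs of the same sample is the fraction of variants called in both runs that receive the same genotype. The abstract does not state how its rate was defined, so the definition, the `genotype_concordance` helper, and the variant IDs below are all assumptions made for illustration.

```python
def genotype_concordance(run_a, run_b):
    """Fraction of variants called in both runs whose genotypes agree."""
    shared = run_a.keys() & run_b.keys()
    if not shared:
        raise ValueError("the two runs share no variant calls")
    matches = sum(run_a[variant] == run_b[variant] for variant in shared)
    return matches / len(shared)


# Hypothetical SV genotypes from an original run and a re-sequenced run.
original = {"sv1": "0/1", "sv2": "1/1", "sv3": "0/0", "sv4": "0/1"}
resequenced = {"sv1": "0/1", "sv2": "0/1", "sv3": "0/0", "sv4": "0/1", "sv5": "1/1"}

# sv5 is ignored because it was called in only one run.
rate = genotype_concordance(original, resequenced)
```

Whether variants called in only one run should count as discordant is a design choice; counting them would lower the rate.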

LEE: An algorithm to improve the recovery of low-abundant viral peptides responsible for HIV-1 reactivation from raw mass spectrometry data

An algorithm to improve the recovery of low-abundant viral peptides responsible for HIV-1 reactivation from raw mass spectrometry data

MATIAS LEE1,2,3,4, Prashant Kaushal3,4, Mehdi Bouhaddou3,4

1Department of Applied Mathematics, Brown University

2Bruins-In-Genomics (BIG) Summer Program, UCLA

3Department of Microbiology, Immunology, and Molecular Genetics (MIMG), UCLA

4Institute for Quantitative and Computational Biosciences (QCBio), UCLA

Human Immunodeficiency Virus 1 (HIV-1) presents unique challenges due to its ability to exist in both latent and actively replicating forms. The reactivation process is known to be controlled by chemical modifications on viral proteins, known as post-translational modifications (PTMs). This study utilizes mass spectrometry (MS) proteomics to identify viral HIV peptides and their PTM forms that may be missed by traditional mass spectrometry due to their low abundance. Here, we developed a new bioinformatics pipeline in R to uncover PTMs critical for reactivating latent HIV. Our pipeline successfully recovered 102 of the peptides found by our traditional search algorithm, Spectronaut, while simultaneously discovering 91 additional candidates not found by the algorithm. Future work will require the development of MS targeted proteomics methods to validate specific candidates. Once confirmed, probing the function of these PTMs during reactivation may lead to new therapeutic strategies to eradicate the HIV reservoir and cure HIV.

LEITMAN: Evolution of Gut Bacteriophages: Shifts in Phage Lifestyle and Pathogenicity from Ancient to Modern Populations

Evolution of Gut Bacteriophages: Shifts in Phage Lifestyle and Pathogenicity from Ancient to Modern Populations

MADELAINE LEITMAN, Aydin Karatas, Michael Iter, Benjamin Knowles

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Quantitative and Computational Biology, USC

3 Department of Bioinformatics and Systems Biology, UCSD

4 Department of Ecology and Evolutionary Biology, UCLA

The decline in gut microbial diversity in industrialized populations is closely linked to a spectrum of chronic health conditions. While research on the gut microbiome has predominantly focused on bacteria, the role of bacteriophages (phages) remains largely unexplored despite their ubiquity and critical role in modulating bacterial populations. In this study, we extracted phage genomes from pre-modern and modern fecal metagenomes to examine shifts in phage pathogenicity, lifestyle, and beta diversity over time. Our findings reveal a transition from lytic-dominated to temperate-dominated viral communities and an increase in viral pathogenicity in the gut over time. Connectivity analysis of a protein-sharing network among the viral genomes demonstrated significant temporal and lifestyle-specific clustering. Given the role of lysogenic conversion in driving bacterial evolution, the observed rise in temperate lifestyles and virulence factors among phages suggests that prophage gene transfer may contribute to heightened pathogenicity in modern gut bacteria.

LI C.: Topological measures of antibody phylogenies reveal B-cell fate decisions

Topological measures of antibody phylogenies reveal B-cell fate decisions

CHENGYUAN LI1, Haripriya Vaidehi Narayanan2,3, Alexander Hoffmann2,3

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Institute for Quantitative and Computational Biosciences, UCLA

3 Department of Microbiology, Immunology, and Molecular Genetics, UCLA

The immune response to vaccination generates an antibody repertoire through Darwinian evolution involving selection, mutation, and expansion of B-cells. Impaired immune responses may be due to alterations in B-cell fate decisions but in vivo they cannot be directly observed. We asked whether cellular dynamics can be revealed from phylogenetic trees constructed from end-point antibody repertoire sequencing. We developed a mathematical model parameterizing survival, mutation, and expansion probabilities of B-cells under sequence-based selection, performed Monte Carlo simulations of phylogenetic trees in the repertoire, and analyzed distributions of graph-theoretic measures of topology and sequence abundance. We found that purely topological measures like root-to-tip depth are sensitive to mutation and death rates, while abundance-weighted measures reveal selection stringency and expansion rates. This novel approach yields quantitative insights across biological scales, inferring control of B-cell fate decisions from the resulting antibody repertoire. As a potential diagnostic measure it may inform strategies for personalizing vaccination.
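To make the simulation idea concrete, the sketch below runs a toy branching process and reads off one topological and one abundance-weighted measure. It is a deliberately simplified stand-in for the model described above: it omits sequence-based selection, uses fixed per-generation death and mutation probabilities, and every function name, parameter, and the population cap are hypothetical.

```python
import random


def simulate_lineage(generations, p_death, p_mutate, seed=0, max_cells=10_000):
    """Simulate a clone; return (parent map of the sequence tree, cell counts per node)."""
    rng = random.Random(seed)
    parent = {0: None}  # node 0 is the unmutated root sequence
    cells = [0]         # each living cell stores the node id of its sequence
    for _ in range(generations):
        next_cells = []
        for node in cells:
            if rng.random() < p_death:
                continue  # the cell dies without dividing
            for _ in range(2):  # a surviving cell divides into two daughters
                if rng.random() < p_mutate:
                    child = len(parent)  # a mutation creates a new tree node
                    parent[child] = node
                    next_cells.append(child)
                else:
                    next_cells.append(node)
        cells = next_cells[:max_cells]  # crude cap to keep the toy model small
    abundance = {}
    for node in cells:
        abundance[node] = abundance.get(node, 0) + 1
    return parent, abundance


def depth(parent, node):
    """Number of mutation steps between the root and a node."""
    steps = 0
    while parent[node] is not None:
        node = parent[node]
        steps += 1
    return steps


def max_root_to_tip_depth(parent):
    """Purely topological measure: the deepest node in the tree."""
    return max(depth(parent, node) for node in parent)


def abundance_weighted_depth(parent, abundance):
    """Mean depth of surviving cells, weighting each node by its cell count."""
    total = sum(abundance.values())
    if total == 0:
        return 0.0
    return sum(depth(parent, n) * count for n, count in abundance.items()) / total


tree, counts = simulate_lineage(generations=3, p_death=0.0, p_mutate=1.0)
```

Running many seeds per parameter setting and comparing the resulting distributions of these measures is the kind of analysis the abstract describes, here reduced to its bare mechanics.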

LI H.: The true cause of Influenza: Excess Mortality Analysis Using Multilayer Perceptron Time Series Models

The true cause of Influenza: Excess Mortality Analysis Using Multilayer Perceptron Time Series Models

HENRY LI1, Tom Chou2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Mathematics, University of California, Los Angeles, CA 90024, USA

Influenza is a seasonal communicable disease responsible for over 45,000 deaths annually in the United States. Beyond its direct impact, influenza can exacerbate other health conditions, leading to increased mortality across various causes. However, the excess mortality indirectly caused by influenza remains unquantified. In this study, we analyze data from 2008-2018 on the top 15 causes of death, sourced from the Centers for Disease Control and Prevention (CDC). We employed Multilayer Perceptron (MLP) time series models to forecast cause-specific mortality in 2019, incorporating flu deaths as an exogenous variable. To assess influenza’s broader impact, we simulated varying levels of flu reduction and calculated a case reduction rate for each scenario. This approach aims to quantify the yearly excess mortality attributable to influenza, offering a more accurate measure of its public health impact. The findings could inform critical public health policies aimed at targeting flu reduction.

LIANG: A novel approach to identify and prioritize clinically significant cell-cell interactions in human solid cancers.

A novel approach to identify and prioritize clinically significant cell-cell interactions in human solid cancers.

Authors: JINGTONG LIANG1,2, Varchas Bharadwaj3, Kenneth Ho2, Johnny Ji2,4, Timothy E. O’Sullivan2,4

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Microbiology, Immunology, and Molecular Genetics, UCLA

3 Department of Computational and Systems Biology, UCLA

4 Molecular Biology Institute, UCLA

Cell-cell interactions, mainly through ligands, induce critical changes in gene expression in the tumor microenvironment. Algorithms including CellPhoneDB and NicheNet are widely used to predict these interactions using single-cell RNA sequencing data. However, their outputs are often difficult to interpret due to the large number of results and their lack of clinical significance. Therefore, a new method is necessary to prioritize clinically relevant cell-cell interactions. In our study, we combined fifteen publicly available single-cell RNA sequencing datasets to identify ligands with high regulatory potential to induce downstream gene expression changes in the tumor microenvironment. Next, we used survival data from The Cancer Genome Atlas (TCGA) to rank patients’ expression of cell lineage markers and ligands and determine which predicted cell-cell interactions led to a significant difference in patient outcomes. Our method has the potential to become an individualized tool for discovering and screening novel drug targets and next-generation cancer therapies.

Jingtong-Liang_poster

LIEM, TAVACOLI: Investigating moment kernel CNNs for rotation equivariance in bioimage analysis

Investigating moment kernel CNNs for rotation equivariance in bioimage analysis

JOSEF T. LIEM1, FARIA TAVACOLI1, Daniel J. Tward2,3

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Computational Medicine, David Geffen School of Medicine, UCLA

3 Brain Mapping Center, Department of Neurology, David Geffen School of Medicine, UCLA

The success of Convolutional Neural Networks over Multilayer Perceptrons for image analysis is driven by translational equivariance, meaning feature recognition is consistent across shifts of the input. In microscopy images, reflective and rotational equivariance can be used to exploit additional symmetries, but current methods have not achieved significant success. We developed a new approach called moment kernels, which are rotationally symmetric kernels multiplied by some power of “x”. We benchmarked performance on the MedMNIST dataset relative to alternatives and investigated applications to spatial transcriptomics data. We evaluated performance on 9 datasets and found our model did best on 4 of 9, in cases where rotation invariance holds or sample sizes are small. This method offers a simple and scalable approach to achieve equivariance, potentially reducing model complexity and improving accuracy for smaller datasets. Future work will extend this framework to 3D image analysis, where complexity reduction may be more critical.
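One way to read the moment kernel construction is as a radial profile f(|x|) multiplied by powers of the pixel offset x. The sketch below follows that reading for orders 0 and 1 only and checks how each kernel behaves under a 90 degree rotation of the grid. The Gaussian profile, the `moment_kernel` and `rot90` helpers, and the restriction to two orders are assumptions for illustration; in a trained network the radial profile would presumably be learned.

```python
import math


def moment_kernel(half_width, order):
    """Build f(|x|) * x^order on a square grid, for order 0 (scalar) or 1 (vector)."""
    offsets = range(-half_width, half_width + 1)
    # Assumed radial profile: a Gaussian of the squared distance from the center.
    radial = [[math.exp(-(x * x + y * y) / 2.0) for x in offsets] for y in offsets]
    if order == 0:
        return [radial]
    if order == 1:
        size = len(offsets)
        kx = [[radial[i][j] * x for j, x in enumerate(offsets)] for i in range(size)]
        ky = [[radial[i][j] * y for j in range(size)] for i, y in enumerate(offsets)]
        return [kx, ky]
    raise ValueError("this sketch only covers order 0 and order 1")


def rot90(grid):
    """Rotate a square grid by a quarter turn."""
    n = len(grid)
    return [[grid[j][n - 1 - i] for j in range(n)] for i in range(n)]


(k0,) = moment_kernel(2, 0)
kx, ky = moment_kernel(2, 1)

# Order 0 is unchanged by the rotation (invariance).
order0_invariant = rot90(k0) == k0

# Order 1 components turn into each other like the entries of a vector (equivariance).
order1_equivariant = (
    rot90(ky) == kx and rot90(kx) == [[-value for value in row] for row in ky]
)
```

Because only the radial profile carries free parameters, a kernel of this form needs far fewer weights than an unconstrained one of the same size, which is consistent with the reduced model complexity noted above.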

Josef Liem-BIG_SUMMER_2024_Poster

LONGFRITZ: Insights from network-predicted associations between drugs and cancer

Insights from network-predicted associations between drugs and cancer

LYDIA LONGFRITZ1, Jocelyn Yang1, Jennifer L. Wilson2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Bioengineering, UCLA

Network-based approaches have great potential in cancer research due to the quantity of genes implicated in cancer, but they tend to overpredict effects compared to clinically observed ones. Previously, we showed that these predictions still align with clinical data, suggesting potential for novel purposes and pathways in existing drugs. Here, we compared drugs predicted to treat different types of cancers by studying their pathway proteins. We consolidated a list of cancer-related phenotypes from multiple datasets, and generated association networks between 455 of these phenotypes and 1,436 clinical drugs. We split these phenotypes into 20 groups and used preexisting drug and gene classification systems to find patterns in each group. Strikingly, drugs in all groups had similar trends in therapeutic use, even though there were differences in the genes involved in drug-phenotype associations for each disease group. Our results suggest that there is potential for drug repurposing based on downstream protein effects and highlight genes involved in potential treatment options that could benefit from further study.

MASTRO: Identification of putative leukemic stem cells in B-Lymphoblastic Leukemia at a single cell resolution

Identification of putative leukemic stem cells in B-Lymphoblastic Leukemia at a single cell resolution

DAVID MASTRO1,2,3, Jennifer J. Chia4,5,6, Dinesh S. Rao4,5,6, and Alexander Hoffmann1,2,5

1BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2Department of Microbiology, Immunology, and Molecular Genetics, UCLA

3Department of Computer Science, UCLA

4Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, UCLA

5Jonsson Comprehensive Cancer Center, UCLA

6Broad Stem Cell Research Center, UCLA

B-Lymphoblastic Leukemia (B-ALL) is a cancer in which malignant B-lymphoblasts clonally expand in the bone marrow. Children and adolescents are often affected, with recurrence after initial treatment leading to high morbidity and mortality. In myeloid phenotype leukemia (Acute Myeloid Leukemia, AML), relapsed disease is partially attributed to Leukemic Stem Cells (LSCs), a rare population of quiescent malignant cells that can evade therapy and replenish the leukemic clone. However, whether LSCs contribute to relapse in B-ALL remains unknown. Thus, we asked whether putative LSCs could be identified in B-ALL single cell RNA sequencing (scRNA-seq) datasets. We first selected malignant cells in these datasets using known transcriptomic markers, then applied two established AML LSC gene signatures to recognize putative LSCs. This analysis identified rare putative LSCs in the evaluated datasets, suggesting LSCs may be an important contributor to disease relapse in B-ALL. These results may inform therapeutic strategies to prevent relapse.

David_Mastro_BIG_Poster_Final

NGUYEN, OSBORN, PUETT: Exploring the effects of lamins and microtubules on cellular gene expression in the mouse dental incisor epithelium using scRNA-seq

Exploring the effects of lamins and microtubules on cellular gene expression in the mouse dental incisor epithelium using scRNA-seq

EMILY NGUYEN1, TROY OSBORN1, KEERA PUETT1, Jimmy Hu2, Qianlin Ye2, Abinaya Thooyamani2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Section of Biosystems and Function, Division of Oral and Systemic Health Sciences, School of Dentistry, UCLA

The maintenance of many adult organs depends on the function of somatic stem cells, which undergo tightly regulated processes of self-renewal and differentiation in response to genetic and signaling cues. Generation of mice lacking lamins or microtubules in the incisor epithelium revealed that these structural proteins are critical for stem cell shapes and arrangements. This led us to perform single-cell RNA sequencing (scRNA-seq) to determine if mutant incisors also exhibit differentiation defects and fate changes. Clustering at a low resolution revealed 19-20 individual clusters, with 2-3 epithelial cell clusters in both the lamin and microtubule mutant mice. After subsetting the epithelial cells and reclustering at a higher resolution, we further analyzed gene set enrichment with Metascape, cell lineages with Dynverse, cell-cell communication with CellChat, and transcriptional regulatory networks with SCENIC. These findings provided insights into how the absence of lamins and microtubules affects epithelial gene expression in the mouse incisor.

Troy Osborn-HuLab_poster

OH, SARWAR: Saliva Cell-Free DNA as a Biomarker for Early Detection of Gastric Cancer

Saliva Cell-Free DNA as a Biomarker for Early Detection of Gastric Cancer

LOUISE OH1, HASSOON SARWAR1, Aaron Zander1, Dev Trivedi1, Irene Choi2, Neeti Swarup2, Mohammed Aziz2, David T.W. Wong2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 School of Dentistry, UCLA

Gastric cancer (GC) is one of the leading causes of cancer death, and its heterogeneity makes it difficult to detect at early stages. The tumor microenvironment (TME) includes various molecular components, such as immune cells, that favor tumor progression. Previous studies have utilized salivary cell-free DNA (cfDNA) as a non-invasive means to guide early cancer detection. This project explores non-mutational analyses of cfDNA and considers the TME of GC to investigate the underlying genetic differences between healthy individuals and cancer patients. We employed a low-coverage single-stranded library NGS pipeline on saliva samples from the two cohorts to study cfDNA characteristics including fragmentomics, G-quadruplex prevalence, and end-motif profiles. Our analysis showed a significant difference between the two cohorts for both saliva cfDNA characteristics and TME-specific biomarkers. These discoveries could potentially improve the application of cfDNA analysis in clinical settings for both early disease detection and progression monitoring.

BIGDOC2024_Hassoon_Louise

PRICE: Leveraging Tensor Decomposition and Statistical Methods to Unveil Patterns in Patient Outcomes for Severe Pneumonia and COVID-19 Using Health Metrics

Leveraging Tensor Decomposition and Statistical Methods to Unveil Patterns in Patient Outcomes for Severe Pneumonia and COVID-19 Using Health Metrics

AYANA PRICE1, Andrew Ramirez2, Jackson Chin2, Aaron Meyer2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Bioengineering, Samueli School of Engineering, UCLA

Unresolved secondary pneumonia, particularly ventilator-associated pneumonia (VAP), significantly impacts mortality in patients with severe respiratory illnesses, including COVID-19. While it is known that prolonged ICU stays and respiratory failure in COVID-19 patients increase the risk of developing VAP, the specific relationships between clinical variables and patient outcomes remain unclear. This study investigates these relationships through factors generated by Parallel Factor Analysis 2 (PARAFAC2) and mathematical modeling to uncover patterns for mortality predictions. Statistical tests highlighted which features hold the strongest associations with patient outcomes. We further examined the influence of these features on prolonged intubations and ICU stays, using survival analysis techniques to analyze relationships with mortality. This study identifies potential biomarkers that can inform treatment strategies and highlights how clinical parameters such as age, etiology, and cell type percentages manifest in underlying biological pathways. Understanding these connections ultimately aims to improve patient outcomes in severe respiratory illnesses.

Price_Ayana_Poster

RABUN: Application of Efficient Frontier Cost-Benefit Analysis to Lytic and Temperate Viral Life Cycles

Application of Efficient Frontier Cost-Benefit Analysis to Lytic and Temperate Viral Life Cycles

CARLY RABUN1, Ben Knowles2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Ecology and Evolutionary Biology, UCLA

All choices in life involve a cost and a benefit. For example, a virus can choose to follow a risky but rewarding lytic life cycle or a safe but less lucrative temperate life cycle. Evolution dictates that viruses must choose the option leading to the highest fitness, but how much risk is worth the reward? Drawing on ideas from financial portfolio optimization, we used the Efficient Frontier cost-benefit analysis to compare the deviation from normal returns, known as “risk”, to the expected returns, known as “reward”. Applying this methodology, we observed that temperate viruses earn only a tenth to a sixth of the reward of lytic viruses for the same amount of risk. Altogether, our work offers a quantitative understanding of viral life cycle trade-offs.
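The risk and reward quantities described above can be illustrated with a minimal Python sketch. It assumes that reward is the mean of a series of returns and that risk is their standard deviation, and it treats each life cycle as a single asset rather than optimizing mixtures. The function names and the numbers are hypothetical and are not taken from the poster.

```python
from statistics import mean, pstdev


def risk_reward(returns):
    """Return (risk, reward): standard deviation and mean of the returns."""
    return pstdev(returns), mean(returns)


def efficient_frontier(strategies):
    """Names of strategies that no other strategy dominates.

    A strategy is dominated when another one has risk <= and reward >=,
    with at least one of the two comparisons strict.
    """
    stats = {name: risk_reward(r) for name, r in strategies.items()}
    frontier = []
    for name, (risk, reward) in stats.items():
        dominated = any(
            o_risk <= risk and o_reward >= reward
            and (o_risk < risk or o_reward > reward)
            for other, (o_risk, o_reward) in stats.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical per-generation returns, invented purely for illustration.
strategies = {
    "lytic": [4.0, -2.0, 6.0, 0.0],      # high reward, high risk
    "temperate": [1.5, 0.5, 1.5, 0.5],   # low reward, low risk
    "dominated": [0.0, -4.0, 4.0, 0.0],  # riskier and less rewarding than temperate
}
frontier = efficient_frontier(strategies)
```

In this toy input, both the lytic and temperate strategies sit on the frontier because neither is better on both axes, while the third strategy is excluded.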

RATNAVEL: Alignment-free colorectal cancer classification approach using cell-free DNA

Alignment-free colorectal cancer classification approach using cell-free DNA

PRAVEENA RATNAVEL1, Lily Zello1, Fei-man Hsu2, Matteo Pellegrini2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Molecular Cell and Developmental Biology, UCLA

Previous research has shown that DNA methylation signatures in cell-free DNA (cfDNA) can effectively classify cancer patients with high specificity. The conventional method first maps sequencing reads to a reference genome and then applies comprehensive bioinformatics analyses. However, this pipeline is computationally demanding. Here we use a publicly available colorectal cancer MeDIP-Seq dataset to assess an alignment-free classification technique utilizing k-mer counting and compare it to the traditional alignment-based method. Our findings suggest that the alignment-free approach reduces computation time and resource usage while maintaining accuracy. These results indicate that k-mer counting could be more feasible and enable quicker diagnosis in healthcare settings.
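As a rough illustration of the k-mer counting step, the sketch below builds a normalized k-mer frequency profile directly from raw reads, with no alignment to a reference genome. It is an assumption-laden toy, not the pipeline used in the study: the function names, the choice to skip k-mers containing ambiguous bases, and the normalization are all hypothetical.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def count_kmers(read, k):
    """Count overlapping k-mers in one read, skipping any with non-ACGT bases."""
    read = read.upper()
    counts = Counter()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if set(kmer) <= VALID_BASES:
            counts[kmer] += 1
    return counts


def kmer_profile(reads, k):
    """Normalized k-mer frequencies pooled over all reads in one sample."""
    total = Counter()
    for read in reads:
        total.update(count_kmers(read, k))
    n = sum(total.values())
    return {kmer: c / n for kmer, c in total.items()} if n else {}


# Toy reads, invented for illustration only.
profile = kmer_profile(["ACGTAC", "ACNGT"], k=2)
```

Profiles of this kind could then serve as feature vectors for any standard classifier; which classifier and which value of k the study used is not stated in the abstract.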

Praveena-Ratnavel-BIG-Summer-Poster

RIOS-RODRIGUEZ: Evaluating the Clinical and Microbiological Impact of Laser Therapy as an Adjunct to Non-Surgical Treatment for Chronic Periodontitis in High-Risk Groups

Evaluating the Clinical and Microbiological Impact of Laser Therapy as an Adjunct to Non-Surgical Treatment for Chronic Periodontitis in High-Risk Groups

JENNIFER RIOS-RODRIGUEZ1, Renate Lux2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Division of Biosystems and Function, School of Dentistry, UCLA

Type 2 Diabetes Mellitus (T2DM) and smoking are significant risk factors for periodontitis and complicate treatment outcomes. While adjunctive laser therapy (ALT) combined with scaling and root planing (SRP) has gained popularity, its long-term benefits in managing chronic periodontitis across risk groups and impact on microbial profiles remain poorly understood. This study compared treatment outcomes and microbiome composition in patients with risk factors treated with either SRP alone or SRP combined with ALT. Using next-generation sequencing of the 16S rRNA gene, we characterized subgingival plaque samples collected at baseline and 1, 3, 6, 9, and 12 months following SRP or SRP+ALT. QIIME2 was employed to evaluate microbial community compositions and compare the treatment efficacy via analysis of alpha- and beta-diversity and taxonomy. This study will provide critical insights into the differential impacts of laser therapy and its potential role in improving treatment outcomes for chronic periodontitis across risk groups.
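For readers unfamiliar with the diversity measures mentioned above, here is a minimal Python sketch of one common alpha-diversity metric (the Shannon index) and one common beta-diversity metric (Bray-Curtis dissimilarity). QIIME2 offers many metrics, and the abstract does not state which ones were used, so these two are assumptions chosen for illustration; the function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Alpha diversity within one sample: H = -sum(p_i * ln(p_i))."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Beta diversity between two samples with taxa listed in the same order.

    Returns 0.0 for identical count vectors and 1.0 for samples sharing no taxa.
    """
    if len(a) != len(b):
        raise ValueError("count vectors must have the same length")
    denom = sum(a) + sum(b)
    if denom == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


# Toy taxon counts, invented for illustration only.
baseline = [6, 4]
follow_up = [4, 6]
shift = bray_curtis(baseline, follow_up)
```

Comparing a baseline sample with each follow-up time point in this way gives one number per visit that summarizes how far the community has shifted.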

JENNIFER-RIOS-RODRIGUEZ-B.I.G-SUMMER-2024

ROURE: Quantifying Host and Viral Protein Remodeling during HIV Reactivation from Latency

Quantifying Host and Viral Protein Remodeling during HIV Reactivation from Latency

RICARDO ROURE1,2,3, Dain Ryan Brademan4, Prashant Kaushal2,3, Ruth Huttenhain4, Mehdi Bouhaddou2,3

1 Bruins In Genomics (BIG) Summer Program, UCLA

2 Department of Microbiology, Immunology, and Molecular Genetics (MIMG), UCLA

3 Institute for Quantitative and Computational Biosciences (QCBio), UCLA

4 Department of Molecular & Cellular Physiology, Stanford University

After infection, human immunodeficiency virus 1 (HIV-1) enters a latent stage within the host genome. One approach in developing a cure for HIV-1 is reactivating the latent virus, allowing the host immune system to sense and destroy the virus. During reactivation, HIV produces viral proteins for transmission by utilizing host proteins and post-translational modifications, the full extent of which remains unknown. This project seeks to quantitatively compare the host proteins affected during HIV reactivation and identify their pathways. We developed a quantitative analysis pipeline using MSstats to assess quality control of mass spectrometry proteomics and phosphoproteomics during HIV reactivation using Phorbol-12-myristate-13-acetate (PMA), conduct statistical analyses between conditions (e.g., PMA vs Mock), and perform gene set overrepresentation analysis to reveal the biological pathways regulated. Our proteomics results showed that leukocyte activation was the most regulated pathway, while the phosphoproteomics results highlighted the cell adhesion pathway, possibly due to cytoskeleton modifications during egress.
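Gene set overrepresentation analysis is commonly based on a one-sided hypergeometric test. The Python sketch below shows that calculation under this assumption; the abstract does not say which tool or test the pipeline used, and the function and parameter names here are hypothetical.

```python
from math import comb


def overrepresentation_pvalue(n_background, n_in_set, n_hits, n_overlap):
    """One-sided hypergeometric p-value for gene set overrepresentation.

    n_background: number of proteins/genes quantified in the experiment
    n_in_set:     how many of those belong to the pathway being tested
    n_hits:       number of significantly regulated proteins/genes
    n_overlap:    regulated proteins/genes that are also in the pathway

    Returns P(X >= n_overlap) when n_hits items are drawn at random
    without replacement from the background.
    """
    upper = min(n_in_set, n_hits)
    total = comb(n_background, n_hits)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers, invented for illustration only.
p = overrepresentation_pvalue(n_background=10, n_in_set=5, n_hits=5, n_overlap=5)
```

In practice the p-values from many pathways would also need a multiple-testing correction before any pathway is reported as enriched.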

Ricardo Roure-UCLA-Poster

SAVINO, WANG: Optimizing a Genomic-Wide Association Study in LPS-Induced Peri-implantitis

Optimizing a Genomic-Wide Association Study in LPS-Induced Peri-implantitis

ANGELINA SAVINO1, GLORIA WANG1, Davi Silva2, Flavia Pirih2, Karolina Urbanowicz2, Sepehr Poormonajemzadeh2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Periodontics, School of Dentistry, UCLA

Dental implants are increasingly common as a desirable option for replacing missing teeth. Unfortunately, implants come with complications, and animal models are key to studying the pathophysiology of complications such as peri-implantitis (PI). PI is characterized by inflammation in the tissues around dental implants with progressive loss of supporting bone. This study outlines a genome-wide association study (GWAS) for peri-implantitis in female mice. VCF files from the Mouse Genomes Project were merged using bcftools. PLINK was employed to create filtered bed files. Phenotype files were generated separately for control and ligature-induced peri-implantitis groups, incorporating average linear and volumetric bone loss measurements. The GWAS analysis was performed using FaST-LMM. This approach allows for identifying genetic variants associated with peri-implantitis susceptibility, potentially revealing novel insights into the disease’s molecular mechanisms. The study’s focus on female mice aligns with the available genomic data, providing a foundation for future research in this field.

Wang_Savino_Poster

SCOTT: Multi-Omics Analysis of the Effects of BRPF1 Mutations in Rare Disease

Multi-Omics Analysis of the Effects of BRPF1 Mutations in Rare Disease

SAMANTHA E. SCOTT1,7, Maneesha Thaker2,4,6, Matan Shepes8, Valerie Arboleda3,4,5

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA
2 Department of Molecular and Medical Pharmacology, David Geffen School of Medicine (DGSOM), UCLA
3 Department of Human Genetics, DGSOM, UCLA
4 Department of Pathology and Laboratory Medicine, DGSOM, UCLA
5 Department of Computational Medicine, DGSOM, UCLA
6 Medical Scientist Training Program, DGSOM, UCLA
7 Department of Microbiology, Immunology, and Molecular Genetics (MIMG), UCLA
8 CIRM COMPASS Training Program (N-COMPASS), CSUN

Bromodomain and PHD finger-containing protein 1 (BRPF1) is an epigenetic reader and scaffolding protein that forms a complex with KAT6 and accessory proteins to facilitate histone acetylation at H3K23, H3K14, and H3K9. De novo germline pathogenic mutations identified in BRPF1 cause a rare disorder known as Intellectual Developmental Disorder with Dysmorphic Facies and Ptosis (IDDDFP). The specific molecular mechanisms linking BRPF1 mutations to disease are not well established. To understand the effects of BRPF1 mutations on the epigenome and transcriptome, we performed ATAC-seq and RNA-seq on wild-type and BRPF1-mutated cell lines (n = 12). Multi-omics integration identified dysregulation of genes in development-associated pathways, such as GJA5, SERP2, ADAMTS10, CAND2, and RAB11FIP1. Next steps include using the CUT&RUN assay to assess BRPF1-DNA binding and possible co-localization with histone H3 acetylation sites. A pilot CUT&RUN assay was conducted to test histone H3 antibodies, and our preliminary data validated the H3K9ac and H3K14ac antibodies. Ultimately, this work will define the molecular mechanisms linking BRPF1 mutations to disease and identify potential drug targets for IDDDFP patients.

Scott_Samantha_Poster-BIG-Summer

SHAKED: Identifying Functional Genomic Regions in iPSC Reprogramming with ChromHMM

Identifying Functional Genomic Regions in iPSC Reprogramming with ChromHMM

GIDEON SHAKED1, Jingyuan Fu2, Jason Ernst2,3,4

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Computer Science, UCLA

3 Department of Biological Chemistry, UCLA

4 Department of Computational Medicine, UCLA

The process of reprogramming human fibroblasts into induced pluripotent stem cells (iPSCs) involves extensive and complex changes in the genomic landscape, particularly regarding chromatin accessibility. This epigenomic reorganization of chromatin structure is essential for activating pluripotency-associated genes and silencing lineage-specific genes, thereby facilitating the transition from a differentiated fibroblast to a pluripotent stem cell. As such, this study aims to map and functionally annotate the genomic regions involved in iPSC reprogramming. To do so, we used ChromHMM to characterize chromatin states, which are patterns of chromatin modifications linked to functional elements and regulatory activities in the genome. We applied ChromHMM to ATAC-seq data annotated with cell type clusters. By cross-referencing these cell type clusters with ChromHMM-generated chromatin state annotations and other genomic data, we newly identified specific genomic regions with functional roles in the reprogramming process. These findings enhance our understanding of epigenomic changes during iPSC reprogramming and provide key insights into regulatory mechanisms. Such insights could improve reprogramming efficiency and advance therapeutic applications.

Gideon Shaked-poster1

STAGAMAN: Integrating Single-Cell and Bulk DNA Sequencing Results for Breast Cancer Subclonal Reconstruction

Integrating Single-Cell and Bulk DNA Sequencing Results for Breast Cancer Subclonal Reconstruction

ELISE STAGAMAN1, Helena Winata2, Dan Knight3, Paul Boutros3,4

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Graduate Programs in Biosciences, UCLA

3 Department of Human Genetics, David Geffen School of Medicine, UCLA

4 Institute for Urologic Oncology, David Geffen School of Medicine, UCLA

Many cancer evolution studies use bulk tumor DNA sequencing to gain valuable insights into cancer initiation and progression through subclonal reconstruction (SRC) analysis. However, this method of SRC relies on complex statistical inferences and simplifying assumptions. Alternatively, single-cell sequencing can provide additional resolution, but leaves gaps in knowledge due to low read depth and allele dropouts. Our goal is to integrate these two sequencing approaches while addressing their individual limitations, balancing the information available to produce an optimal evolutionary timeline. Using matched bulk and single-cell DNA sequencing data from a patient with metastatic breast cancer, we compared both approaches for mutation frequencies and inferred evolutionary relationships. To accomplish this, a combination of machine-learning and statistical methods was employed, correcting for low read depth in single-cell data and analyzing mutations present in each cell for consistency with the expected SRC from bulk sequencing.

Stagaman.Elise_.UCLA_.BIG_.final_

TRAW: Copy number variation profiling using multi-individual single-nucleus data in prostate cancer tumors

Copy number variation profiling using multi-individual single-nucleus data in prostate cancer tumors

THEA TRAW1, Terence Li2, Cuining Liu2, Chongyuan Luo3

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Bioinformatics Interdepartmental PhD Program, UCLA

3 Department of Human Genetics, David Geffen School of Medicine, UCLA

Cancer is a highly heterogeneous disease, and the molecular basis of its progression is not well understood. A common hallmark of cancerous cells is the development of copy number variations (CNVs) across the genome, which then propagate as the tumor grows. To determine the extent to which CNVs vary at an individual level, we analyzed a 5-patient prostate cancer dataset generated from single nucleus methyl-3C sequencing (sn-m3C-seq). After applying a single-cell based CNV caller, we investigated the differences between CNV profiles across individuals, cell types, and tumor spatial locations. For instance, we found that donor BS13497 carried significantly more CNVs, particularly in luminal cells. Also, CNV counts varied widely by spatial location, with benign cells containing the fewest. This project highlights the variability of CNV profiles in prostate cancer across individuals, cell types, and spatial locations, with the further potential to identify the biological significance of the most salient CNVs.

Traw_Thea_BIG_Poster

TRIVEDI, ZANDER: Unlocking the Potential of Non-Mutational Characteristics of Cell-Free DNA as a Biomarker for Oral Cancer

Unlocking the Potential of Non-Mutational Characteristics of Cell-Free DNA as a Biomarker for Oral Cancer

DEV TRIVEDI1, AARON ZANDER1, Louise Oh1, Hassoon Sarwar1, Irene Choi2, Neeti Swarup2, Mohammed Aziz2, David T.W. Wong2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Center for Oral/Head & Neck Oncology Research, School of Dentistry, UCLA

Plasma cell-free DNA (cfDNA) is a developing source of biomarkers for oral cancer and potentially other diseases. Previous studies have used Broad Range whole-genome sequencing to identify various patterns in cfDNA such as methylation patterns, acetylation patterns, and G-quadruplex abundance. In our study on plasma cfDNA for oral cancer detection, we analyzed end-motif profiles, short-to-long fragment ratios, and G-quadruplex abundance throughout the genome. For each feature, we examined three cfDNA populations: mononucleosomal (mncfDNA), ultrashort (uscfDNA), and short cfDNA (scfDNA). The results demonstrated distinct fragmentomic ratios across the different cfDNA populations and microenvironments. Additionally, G-quadruplex abundance was notably higher in certain cfDNA populations, indicating its potential as a biomarker for oral cancer detection. Our comprehensive approach highlights the diagnostic potential of cfDNA characteristics, providing insights into their role in the complex microenvironment of oral cancer.

BIGDOC2024_Aaron_Dev

VALENZUELA: Benchmarking Subclonal Reconstruction Tools: A Comparative Study with EMulSI-Phy and CONIPHER Using TRACERx Data

Benchmarking Subclonal Reconstruction Tools: A Comparative Study with EMulSI-Phy and CONIPHER Using TRACERx Data

JACOB VALENZUELA1, Helena Winata2, Yash Patel3, Paul Boutros3

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Graduate Programs in Biosciences, UCLA

3 Department of Human Genetics, David Geffen School of Medicine, UCLA

Subclonal reconstruction (SRC) provides a framework for studying genetic diversity within tumors by identifying subpopulations of cancer cells and key mutations and by clarifying mutation timing and origins. Initially performed with single biopsy samples, this method offered limited insight into the full spectrum of tumor mutations. Multi-sample reconstruction significantly improves resolution, but most methods struggle to accurately and efficiently reconstruct evolutionary paths from such data. We developed EMulSI-Phy to overcome this limitation, and this study aims to benchmark it against established tools such as DPClust, PyClone, PyClone-VI, Pairtree, and CONIPHER. The benchmarking is performed using the publicly available TRACERx data, involving 421 patients with multi-sample whole-exome sequencing data. The results highlight performance metrics such as runtime and memory usage for each tool, as well as each tool's accuracy relative to the published TRACERx results. Establishing a standardized benchmarking pipeline for multi-sample SRC will significantly improve the evaluation of future methods.
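To make the runtime and memory metrics concrete, the sketch below times a single Python callable and records its peak memory with the standard library. It is only an illustration: the SRC tools named above are external programs, so a real benchmarking pipeline would more likely measure whole processes, and the helper name `benchmark` and the stand-in workload are hypothetical.

```python
import time
import tracemalloc


def benchmark(func, *args, **kwargs):
    """Run func once and return (result, elapsed seconds, peak bytes).

    Peak memory covers only allocations made by the Python interpreter
    while func runs; memory used by native libraries is not tracked.
    """
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    finally:
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak


def toy_workload(n):
    """Stand-in for a reconstruction tool: builds and sums a list."""
    values = list(range(n))
    return sum(values)


result, seconds, peak_bytes = benchmark(toy_workload, 100_000)
```

Running every tool through the same wrapper on the same inputs is what makes the resulting numbers comparable across tools.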

Jacob Valenzuela-UCLA-Final-Poster

VELAZQUEZ: Sex Differences in Gene Expression: Transcriptomic Insights into Neurodevelopmental and Neuropsychiatric Disorders

Sex Differences in Gene Expression: Transcriptomic Insights into Neurodevelopmental and Neuropsychiatric Disorders

JESUS V VELAZQUEZ1, Ramin Ghoddousi1, Daniel Geschwind1

1 Department of Neurobiology, University of California, Los Angeles, CA, 90025

Sex differences in the prevalence, onset, and phenotypic profiles of various neurodevelopmental and neuropsychiatric disorders, such as Autism Spectrum Disorder (ASD), schizophrenia, bipolar disorder, and major depressive disorder, have been well-documented. These differences suggest underlying biological mechanisms that contribute to sex-specific vulnerabilities and manifestations of these conditions. Current research efforts aim to elucidate the genetic, transcriptomic, hormonal, and cellular factors driving these sex differences. This study focuses specifically on the transcriptomic aspect, exploring gene expression variations between male and female brains. Using adult bulk-sequencing data, the research seeks to identify differentially expressed genes (DEGs) between sexes in a specific brain region. By comparing gene expression profiles of male and female brains, it aims to uncover sex-specific DEGs and investigate their potential associations with neurodevelopmental and neuropsychiatric disorders. Identifying disease-associated risk genes within these DEGs could provide valuable insights into the molecular mechanisms contributing to sex differences in these disorders. The study involves a systematic analysis, starting with data acquisition and preprocessing, followed by differential gene expression analysis, and culminating in the identification and characterization of DEGs. The findings may contribute to a deeper understanding of sex-specific molecular underpinnings in neurodevelopmental and neuropsychiatric disorders, potentially guiding future research and therapeutic strategies.

JESUS-VELAZQUEZ-_-UCLA-BIG-POSTER.pptx

WIGGINS: Enhancing the Prediction of Prostate Cancer Recurrence with Multi-omic Molecular and Clinical Data

Enhancing the Prediction of Prostate Cancer Recurrence with Multi-omic Molecular and Clinical Data

ADRIANA WIGGINS1, Jaron Arbet2-4, Paul C. Boutros2-4

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA.

2 Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, California.

3 Institute for Precision Health, University of California, Los Angeles, Los Angeles, California.

4 Department of Human Genetics, University of California, Los Angeles, Los Angeles, California.

5 Department of Urology, David Geffen School of Medicine, University of California, Los Angeles, California.

Clinical measurements such as age, PSA, ISUP grade, and tumour stage are used to inform prostate cancer treatment decisions. However, heterogeneity remains when predicting the trajectory a patient’s cancer will take and the best route of treatment. To address the need for more robust biomarkers, we evaluated the predictive power of Random Forest and Cox proportional hazards models using various combinations of clinical and multi-omic tumour features including DNA methylation, CNA, RNA, and driver mutation data in a cohort of localized prostate cancer patients. As a baseline, our analysis found that a Random Forest trained only on clinical data achieved a test C-index of 0.807 for predicting time until biochemical recurrence (BCR). However, a Random Forest model trained on screened clinical, methylation, and RNA data yielded a C-index of 0.852. These findings suggest that combining molecular and clinical information can improve the accuracy of prognosis prediction and support personalized cancer treatment.
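The C-index reported above measures how often a model ranks pairs of patients in the correct order of recurrence. The Python sketch below implements Harrell's concordance index for right-censored data as an illustration; the abstract does not say which implementation or tie-handling rule was used, so the function name and conventions here are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times:       observed follow-up time for each patient
    events:      True if the event (e.g. recurrence) was observed,
                 False if the patient was censored
    risk_scores: model output, where a higher score predicts an earlier event

    A pair is comparable only when the patient with the shorter time had
    an observed event. Tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration only.
c_index = concordance_index(
    times=[1, 2, 3], events=[True, True, True], risk_scores=[3, 2, 1]
)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the 0.807 and 0.852 values quoted in the abstract.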

Adriana Wiggins-BIG-Poster.pptx

WOLF: Investigating Effects of Perinatal Hypoxia on the Neonatal Heart Transcriptome

Investigating Effects of Perinatal Hypoxia on the Neonatal Heart Transcriptome

CHARLOTTE WOLF1,3, Marlin Touma1,3

1 Neonatal/ Congenital Heart Laboratory, Cardiovascular Research Laboratories, David Geffen School of Medicine, University of California Los Angeles

2 Bruins in Genomics, Institute for Quantitative and Computational Biosciences, University of California Los Angeles

3 Department of Pediatrics, David Geffen School of Medicine, University of California Los Angeles

The development of the heart postnatally can be heavily influenced by perinatal stress factors. Environmental perinatal stress factors, such as hypoxia, have been shown to have significant pathological effects on the heart and can contribute to the progression of cyanotic congenital heart defects (CHDs). This project seeks to elucidate chamber-specific effects of perinatal hypoxia on neonatal heart transcriptome regulation. We are conducting transcriptome-wide analysis of both the right ventricle (RV) and left ventricle (LV) under hypoxic and normoxic conditions in neonatal mouse hearts over three time points of postnatal transition: days of life zero, three, and seven. Through sequential analysis of an RNA-seq-derived transcriptome dataset, significant differentially regulated genes and pathways will be identified. By uncovering how hypoxia changes the transcriptome of the neonatal heart, we can enhance our understanding of patients with cyanotic CHDs, and potentially identify novel therapeutic approaches to improve outcomes.

CharlotteWolfPoster

YANG: Drug-Phenotype Associations and Pathway Genes in Neurodegenerative Diseases: Insights from PathFX Analysis

Drug-Phenotype Associations and Pathway Genes in Neurodegenerative Diseases: Insights from PathFX Analysis

JIALE YANG1, Jennifer L Wilson1,2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Bioengineering, UCLA

Neurodegenerative diseases like Alzheimer’s and Parkinson’s pose significant challenges due to their complex pathology and lack of curative treatments. Computational models like protein-interaction networks can predict novel therapeutic approaches but require more robust validation. Our previous research demonstrated that analyzing pathway genes linking drugs to phenotypes could unveil clinical therapeutic effects, a method proven effective in predicting rare drug-drug interactions. Building on this, we used PathFX to analyze drug-phenotype associations across six major neurologic and neurodegenerative disease categories. We identified 1,010 pathway genes linking 4,260 drugs to 19 phenotypes. Despite low similarity among pathway genes, core genes such as CCL5, CCR2, and CXCR4 were consistently implicated across most neurodegenerative categories. Further analysis through GO enrichment and ATC code assessments demonstrated commonalities in key biological processes and targeted organs across these diseases, while also uncovering unique patterns specific to ALS and stroke. Our findings pave the way for developing new therapies for various neurodegenerative diseases by prioritizing pathway proteins with unique links to these conditions.

Yang_BIG_updated

ZELLO: An alignment-free approach to diagnose colorectal cancer using cell-free RNA

An alignment-free approach to diagnose colorectal cancer using cell-free RNA

LILY ZELLO1, Praveena Ratnavel1, Fei-man Hsu2, Matteo Pellegrini2

1 BIG Summer Program, Institute for Quantitative and Computational Biosciences, UCLA

2 Department of Molecular Cell and Developmental Biology, UCLA

Previous research has validated the use of liquid biopsy for disease detection, particularly for cancer diagnosis and prognosis. While this method offers significant advantages such as being minimally invasive and providing comprehensive tumor profiling, the traditional pipeline used to classify patients, which includes mapping sequencing reads to a reference genome and further bioinformatics analyses, is computationally expensive. Here we evaluate an alignment-free approach using k-mer counting and benchmark it against the alignment-based pipeline using a public colorectal cancer cell-free RNA-seq dataset. Our results show that the alignment-free method outperforms the alignment-based pipeline in computing time and resource usage while maintaining accuracy. Ultimately, these results support the possibility of using k-mer counting to achieve a faster diagnosis in the clinical setting.
