Workshop Description

High-throughput sequencing technologies have allowed researchers to generate DNA sequence data at the individual, population, and species levels. In this workshop, students will learn how to analyze and interpret population-level genetic information with PLINK and R. Students will also be introduced to the literature on each topic, followed by hands-on exercises and paper discussions.

Workshop Materials

At the end of the workshop, the students will be able to:

  • Describe what the Variant Call Format (VCF) is and how to manage VCF files.
  • Conduct quality assessment of a VCF.
  • Explain population structure and how to detect it with PLINK.
  • Explain linkage disequilibrium and how to estimate it with PLINK.
  • Describe basic association testing and genome-wide association studies (GWAS).
  • Describe copy number variants (CNVs) and how to test for common CNVs across individuals.
  • Discuss original literature on these subjects.

Workshop Schedule

  • Background lecture (45 minutes)
    • What is a VCF?
    • What is QC, and why is it so important?
  • Break (15 minutes)
  • Hands-on exercise (1 hour)
    • VCF data management (read, recode, reorder, merge, subset, compress data)
    • QC assessment
  • Break (15 minutes)
  • Paper discussion on quality control assessment (30 minutes)
  • Assignment explanation (15 minutes)
  • Background lecture (45 minutes)
    • What is population structure?
    • What is linkage disequilibrium?
    • How do population structure and LD affect association mapping?
  • Break (15 minutes)
  • Hands-on exercise (1 hour)
    • Population stratification detection
    • LD estimation
  • Break (15 minutes)
  • Paper discussion on genome-wide association studies
  • Assignment explanation (15 minutes)
  • Peer review of previous assignment (15 minutes)
  • Background lecture (45 minutes)
    • What is association testing and GWAS?
    • What is a Manhattan plot and a Q-Q plot?
    • What is a copy number variant?
  • Break (15 minutes)
  • Hands-on exercise (1 hour and 30 minutes)
    • Basic association testing
    • GWAS accounting for population structure
    • CNV detection
  • Break (15 minutes)
  • Assignment explanation (15 minutes)
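
As a rough illustration of what the VCF data management and QC exercises listed above involve, the sketch below parses a tiny in-memory VCF and computes two common per-variant QC summaries: the fraction of samples with a missing genotype and the alternate allele frequency among called alleles. It is written in Python purely for illustration, since the workshop itself uses PLINK and R. The toy data and the function name `site_stats` are hypothetical and are not taken from the workshop materials.

```python
# Toy VCF text with made-up genotypes (not workshop data).
# Columns 1-9 are the fixed VCF fields; sample genotypes start at column 10.
TOY_VCF = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2\tind3\tind4
1\t1001\trs1\tA\tG\t50\tPASS\t.\tGT\t0/0\t0/1\t1/1\t./.
1\t2002\trs2\tC\tT\t50\tPASS\t.\tGT\t0/0\t0/0\t0/1\t0/0
"""


def site_stats(vcf_text):
    """Return {variant_id: (missing_rate, alt_allele_freq)} for each record.

    Hypothetical helper: assumes biallelic sites and a GT entry in FORMAT.
    """
    stats = {}
    for line in vcf_text.splitlines():
        # Skip blank lines, meta-information (##) and the header line (#CHROM).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        variant_id = fields[2]
        gt_index = fields[8].split(":").index("GT")
        samples = fields[9:]

        n_missing = 0
        alt_count = 0
        called_alleles = 0
        for sample in samples:
            genotype = sample.split(":")[gt_index]
            # Treat phased (|) and unphased (/) genotypes the same way.
            alleles = genotype.replace("|", "/").split("/")
            if "." in alleles:
                n_missing += 1
                continue
            called_alleles += len(alleles)
            alt_count += sum(1 for allele in alleles if allele != "0")

        missing_rate = n_missing / len(samples)
        alt_freq = alt_count / called_alleles if called_alleles else float("nan")
        stats[variant_id] = (missing_rate, alt_freq)
    return stats


for variant_id, (missing_rate, alt_freq) in site_stats(TOY_VCF).items():
    print(variant_id, "missing rate:", missing_rate, "ALT frequency:", alt_freq)
```

In practice, summaries like these are typically used to filter out variants with too many missing genotypes or very rare alleles before downstream analyses.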

Technical Requirements

Attendees are required to have a Hoffman2 account. To apply for an account, click here. UCLA participants who lack a faculty sponsor and non-UCLA participants may apply for a temporary Hoffman2 account, requesting sponsorship from Collaboratory Workshops.


Dr. Cavassim Alves is a Postdoctoral Fellow in the Lohmueller Lab in the Department of Ecology and Evolutionary Biology. Her interests are in using comparative genomics to deepen our understanding of the evolutionary mechanisms that give rise to and maintain genetic variation. Currently, Izabel is investigating the impact of dominance on the distribution of fitness effects in human populations. Dr. Cavassim Alves earned her Bachelor's degree in Agriculture Engineering from the University of Sao Paulo (Brazil) and her Master's and Ph.D. in Bioinformatics from Aarhus University (Denmark).




Workshop Details

Prerequisites: Some familiarity with basic command line, R, and genetics
Length: 3 days, 3 hrs per day
Level: Intermediate
Location: Boyer 529
Seats Available: 28

Fall 2021 Dates

Nov. 30, Dec. 1 and 2
9:00 AM – 12:00 PM