Workshop Description (Intermediate Course)

The workshop focuses on the fundamentals of bacterial genomics and basic bioinformatics analysis. A Vibrio cholerae genome will serve as the practice dataset, and participants will try to reproduce the expected results. The workshop starts with detailed instructions on how to quality-control raw sequence data and assemble a genome. Next, basic statistics for checking the quality of the assembled genome (e.g., N50, number of contigs, number of core genes) will be explained. For genome analysis, average nucleotide identity (ANI) will be calculated to assign the genome to a known species, followed by prediction of open reading frames (ORFs), protein annotation, and comparison of the results with other genomes.
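As an illustration of the assembly statistics mentioned above, the following is a minimal sketch of how the number of contigs and N50 can be computed from a list of contig lengths. N50 is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length. The function name and the example lengths are hypothetical and are not taken from the workshop materials.

```python
def assembly_stats(contig_lengths):
    """Return (number of contigs, total length, N50) for a list of contig lengths in bp."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    # Sort contigs from longest to shortest.
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # N50: the contig length at which the cumulative sum
        # first covers at least half of the assembly.
        if running * 2 >= total:
            return len(lengths), total, length


# Hypothetical contig lengths (bp), for illustration only.
example_lengths = [100, 200, 300, 400, 500]
num_contigs, total_length, n50 = assembly_stats(example_lengths)
print(f"contigs={num_contigs} total={total_length} N50={n50}")
```

In general, fewer contigs and a larger N50 suggest a more contiguous assembly, although neither number alone says whether the assembly is correct.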

Workshop Materials

Genome assembly

  • Quality control of raw data
  • Genome assembly
  • How to measure the quality of the assembled genome

Taxonomy identification

  • A brief history of bacterial taxonomy
  • Type strain
  • Average nucleotide identity
  • Reference databases (EzBioCloud, NCBI)
  • Phylogenetic tree inference

Comparative genomics analysis

  • Genome annotation with Prokka
  • Searching for a specific gene of interest with BLAST
  • How to find specific types of features (antibiotic resistance genes, virulence factors)
  • Genome browser (Artemis)
  • Pan & core genome (Roary)
  • Visualization tools (BRIG)

Technical Requirements

  • Attendees are required to have a Hoffman2 account or a personal computer.
  • Mac users will work in the Terminal.
  • Windows users must install the Windows Subsystem for Linux (WSL) before coming to class (
  • You will be working in Linux or a Linux-like environment.
  • It is highly recommended to take the “Intro to Unix command line” workshop if you are not familiar with the Linux environment.


Dr. Daniel (Sung-min) Ha is a Postdoctoral Fellow at the Xia Yang Lab in the Department of Integrative Biology and Physiology. Dr. Ha's research focuses on bacterial genomics and epidemiology, and his interests include host-microbiome interaction studied with a multi-omics systems biology approach. He earned a Bachelor's degree in Mathematics and Genetics & Biotechnology at the University of Toronto in Canada, and a Ph.D. at Seoul National University in South Korea.



Workshop Details

Prerequisites: Attendees are advised to take W1 (Intro to UNIX command line) and W2 (Using NGS analysis tools) before this workshop, but neither is mandatory.
Length: 3 days, 3 hrs per day
Level: Intermediate
Location: Boyer 529
Seats Available: 28

Spring 2022 Dates

April 26, 27, and 28
1:30 PM-4:30 PM