High-throughput sequencing involves a number of concepts and techniques that shape a project before any application-specific analysis begins. This workshop first introduces the more “universal” aspects of high-throughput sequence analysis, from experimental design through sequencing and alignment methods. It then covers common file formats for sequence data and the limitations of sequencing technologies.
In a hands-on exercise on the Hoffman2 cluster, we will use programs designed to analyze read data, remove artifact and low-quality sequences, and align short reads to a reference genome. By the end of this workshop, students will better understand the logic behind many of these steps.
Students who are comfortable working on the command line will find this course significantly easier than those who are new to command-line interfaces. While Workshop 1 is not an absolute requirement, it is highly recommended for any student who is not already experienced in a command-line environment. Students who take this workshop at the same time as Workshop 1 should not run into problems with the timing of material between the two classes.