Genomics Workshop

This hands-on workshop introduces students to the basic analysis pipeline for Next-Generation Sequencing (NGS) reads from the Illumina platform, with a focus on population genetics analysis.

We’ll focus on whole-genome resequencing (WGS) data, though the same pipeline can also be applied, with minor modifications, to RAD-seq or amplicon-seq data if a reference genome is available.

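By the end of the workshop, participants will understand how to:

  • Check the quality of raw FASTQ reads and trim adapters
  • Map reads to a reference genome
  • Identify variants (SNPs and indels)
  • Filter variants
  • Analyze and visualize population structure
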
Prerequisites & Setup

Before joining the workshop, please make sure you can access a Unix/Linux shell environment, as all commands and tools will be run from the command line.

Windows users may install MobaXterm. Mac/Linux systems already have a terminal available.
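
Once a shell is available, it can help to confirm that the command-line tools named in this workshop are installed. The sketch below is one way to do that; the function name check_tools is hypothetical, and the executable names in the usage note (wget, fastqc, trim_galore, bwa) are assumed to be the standard ones for those tools.

```shell
# Report which of the named command-line tools are available on PATH.
# Returns 0 if every tool is found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "MISSING: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example: check_tools wget fastqc trim_galore bwa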

Recommended Resources

How Illumina Sequencing Works

Watch this video to understand the technology.

Linux Shell Basics

Advanced Linux Commands


Dataset Download Instructions

Download Here

  1. Go to the link above.
  2. Create a directory: mkdir name_of_directory
  3. Use wget to download the files into that directory.
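
The steps above can be sketched as a small shell function. This is only an illustration: the function name download_dataset is hypothetical, and the file URLs are not given here, so they must be copied from the download page and passed in as arguments.

```shell
# Download one or more files into a target directory.
# Usage: download_dataset <directory> <url> [<url> ...]
# The URLs are NOT known here; copy them from the workshop download page.
download_dataset() {
    if [ "$#" -lt 2 ]; then
        echo "usage: download_dataset <directory> <url> [<url> ...]" >&2
        return 2
    fi
    dest=$1
    shift
    # Create the directory (no error if it already exists)
    mkdir -p "$dest" || return 1
    for url in "$@"; do
        # -c resumes a partial download; -P sets the directory files are saved in
        wget -c -P "$dest" "$url" || return 1
    done
}
```

Passing the directory as the first argument keeps all downloaded files in one place, which makes the later steps easier to script.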

Workshop Pipeline

1. Initial Processing of FASTQ Files

FastQC video
This step includes quality checking (FastQC), adapter trimming (Trim Galore or Trimmomatic), and preparing cleaned reads for mapping.
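
A minimal sketch of this step for one paired-end sample, assuming Trim Galore is the trimmer chosen (Trimmomatic would need different commands). The function name qc_and_trim and the output directory layout are hypothetical and are not taken from the workshop scripts.

```shell
# Quality-check and trim one paired-end sample.
# Usage: qc_and_trim <reads_R1.fastq.gz> <reads_R2.fastq.gz> <output_dir>
qc_and_trim() {
    if [ "$#" -ne 3 ]; then
        echo "usage: qc_and_trim <R1.fastq.gz> <R2.fastq.gz> <output_dir>" >&2
        return 2
    fi
    r1=$1; r2=$2; outdir=$3
    # FastQC requires its output directory to exist already
    mkdir -p "$outdir/fastqc_raw" "$outdir/trimmed" || return 1
    # 1. Quality report on the raw reads
    fastqc -o "$outdir/fastqc_raw" "$r1" "$r2" || return 1
    # 2. Adapter and quality trimming of the read pair;
    #    --fastqc runs FastQC again on the trimmed reads
    trim_galore --paired --fastqc -o "$outdir/trimmed" "$r1" "$r2" || return 1
}
```

Comparing the FastQC reports before and after trimming is a quick way to confirm that adapter content has dropped.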

2. Mapping Reads to Reference Genome

BWA video
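
A minimal sketch of the mapping step for one paired-end sample. BWA is the aligner named above; using samtools to sort and index the alignments is an assumption, and the function name map_reads, the file naming, and the default of 4 threads are hypothetical.

```shell
# Map one paired-end sample to a reference and produce a sorted, indexed BAM.
# Usage: map_reads <reference.fa> <R1.fq.gz> <R2.fq.gz> <output_prefix> [threads]
map_reads() {
    if [ "$#" -lt 4 ]; then
        echo "usage: map_reads <ref.fa> <R1> <R2> <output_prefix> [threads]" >&2
        return 2
    fi
    ref=$1; r1=$2; r2=$3; prefix=$4; threads=${5:-4}
    # Build the BWA index only if it is not there yet (.bwt is one of its files)
    if [ ! -e "$ref.bwt" ]; then
        bwa index "$ref" || return 1
    fi
    # Align the read pair; BWA writes SAM to standard output
    bwa mem -t "$threads" "$ref" "$r1" "$r2" > "$prefix.sam" || return 1
    # Sort by coordinate and convert to BAM, then index the BAM
    samtools sort -@ "$threads" -o "$prefix.sorted.bam" "$prefix.sam" || return 1
    samtools index "$prefix.sorted.bam" || return 1
    # The intermediate SAM file is large and no longer needed
    rm -f "$prefix.sam"
}
```

Writing an intermediate SAM file is slower than piping BWA directly into samtools, but a failure in either tool is then easy to detect in a plain shell function.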

3. Identifying Variants (SNPs/Indels)

Script:

  • Variant Identification.
  • Variant Calling video
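
The variant caller used by the workshop script is not stated here, so the sketch below assumes bcftools purely for illustration. The function name call_variants and the intermediate file naming are hypothetical.

```shell
# Call SNPs and indels jointly from one or more sorted BAM files.
# Usage: call_variants <reference.fa> <output.vcf.gz> <sample1.bam> [<sample2.bam> ...]
call_variants() {
    if [ "$#" -lt 3 ]; then
        echo "usage: call_variants <ref.fa> <out.vcf.gz> <bam> [<bam> ...]" >&2
        return 2
    fi
    ref=$1; out_vcf=$2
    shift 2
    pileup="${out_vcf%.vcf.gz}.pileup.bcf"
    # 1. Compute genotype likelihoods at each position (uncompressed BCF)
    bcftools mpileup -f "$ref" -Ou -o "$pileup" "$@" || return 1
    # 2. Call variants: -m multiallelic caller, -v keep variant sites only
    bcftools call -m -v -Oz -o "$out_vcf" "$pileup" || return 1
    # 3. Index the compressed VCF so downstream tools can query it
    bcftools index "$out_vcf" || return 1
    rm -f "$pileup"
}
```

Passing all BAM files in a single call produces one multi-sample VCF, which is the usual input for population-level analyses.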

4. Filtering Variants

  • Explanation of the filters: Variant Filter Descriptions
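
As a rough illustration of what a filtering command can look like, the sketch below keeps biallelic SNPs above a minimum site quality. It assumes bcftools; the function name filter_variants and the default threshold of 30 are illustrative choices, not the workshop's actual filters, which are described in the linked document.

```shell
# Keep biallelic SNPs with site quality at or above a threshold.
# Usage: filter_variants <input.vcf.gz> <output.vcf.gz> [min_qual]
filter_variants() {
    if [ "$#" -lt 2 ]; then
        echo "usage: filter_variants <in.vcf.gz> <out.vcf.gz> [min_qual]" >&2
        return 2
    fi
    in_vcf=$1; out_vcf=$2; min_qual=${3:-30}
    # -m2 -M2 : exactly two alleles (biallelic sites)
    # -v snps : SNPs only, dropping indels
    # -i      : include only records matching the expression
    bcftools view -m2 -M2 -v snps -i "QUAL>=${min_qual}" \
        -Oz -o "$out_vcf" "$in_vcf" || return 1
    bcftools index "$out_vcf" || return 1
}
```

Counting records before and after filtering shows how strict a given threshold is for the dataset.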

5. Population Structure & Visualization

Script:

  • Basic Population Genetics.
  • PCA video
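
A minimal sketch of a PCA on the filtered variants, assuming PLINK 1.9 is used; the workshop script may rely on a different tool. The function name run_pca and the default of 10 components are hypothetical.

```shell
# Run a principal component analysis on a filtered VCF.
# Usage: run_pca <filtered.vcf.gz> <output_prefix> [n_components]
run_pca() {
    if [ "$#" -lt 2 ]; then
        echo "usage: run_pca <filtered.vcf.gz> <output_prefix> [n_components]" >&2
        return 2
    fi
    vcf=$1; prefix=$2; n_pcs=${3:-10}
    # --double-id              : use the VCF sample name as both family and individual ID
    # --allow-extra-chr        : accept non-human chromosome or scaffold names
    # --set-missing-var-ids    : name unnamed variants as chromosome:position
    plink --vcf "$vcf" --double-id --allow-extra-chr \
          --set-missing-var-ids '@:#' \
          --pca "$n_pcs" --out "$prefix" || return 1
}
```

PLINK writes the sample coordinates to <output_prefix>.eigenvec and the eigenvalues to <output_prefix>.eigenval; the first two components can then be plotted to visualize population structure.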

© Content adapted from PoODL-CES Genomics_learning_Workshop.