Variant Analysis Task List
Task 1: VCF to PLINK Format Conversion
- Convert VCF to PLINK binary format using plink --make-bed
- Edit the .fam file to add region information to the second column
- Edit the .bim file: replace chromosome name E2 with 1
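
The conversion and the .bim edit above could look roughly like the sketch below. The file names (input.vcf, dataset) and the helper names vcf_to_bed and rename_chrom are hypothetical. The --allow-extra-chr flag is included on the assumption that PLINK 1.9 would otherwise reject the non-standard chromosome name E2. The .fam edit is left out because the task list does not say where the region labels come from.

```shell
# Sketch only: helper and file names are hypothetical.

# Convert a VCF to PLINK binary files (PREFIX.bed/.bim/.fam).
# --allow-extra-chr lets PLINK 1.9 accept chromosome names such as E2.
vcf_to_bed() {
    plink --vcf "$1" --make-bed --allow-extra-chr --out "$2"
}

# Replace one chromosome name in column 1 of a tab-delimited .bim file.
# Only exact matches are changed, so E22 is left alone when OLD is E2.
rename_chrom() {
    awk -v old="$2" -v new="$3" \
        'BEGIN { FS = OFS = "\t" } $1 == old { $1 = new } { print }' \
        "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Example usage (assumed file names):
#   vcf_to_bed input.vcf dataset
#   rename_chrom dataset.bim E2 1
```

Doing the rename with awk on the first column, rather than a global search and replace, avoids touching variant IDs that happen to contain the string E2.
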
Task 2: Principal Component Analysis (PCA)
- Run PCA with plink --pca 5
- Visualize in R with ggplot2
- Save the PCA plot as .png and .pdf
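
A minimal sketch of the PCA step, assuming the binary files from Task 1 share a common prefix. The helper name run_pca and the prefixes dataset and dataset_pca are hypothetical.

```shell
# Sketch only: the helper name and prefixes are hypothetical.
# Runs PCA on PREFIX.bed/.bim/.fam and keeps the top 5 components.
# PLINK 1.9 writes OUT.eigenvec (FID, IID, PC1..PC5) and OUT.eigenval.
run_pca() {
    plink --bfile "$1" --pca 5 --out "$2"
}

# Example usage (assumed prefixes):
#   run_pca dataset dataset_pca
```

The .eigenvec file is the one to read into R for plotting. By default it has no header row, so column names need to be assigned after loading.
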
Task 3: File Transfer
- Use scp to copy output files from server to local machine
- Example: scp user@IP:path/to/Rplots.pdf .
Task 4: ADMIXTURE Analysis
- Create conda environment and install ADMIXTURE
- Run ADMIXTURE for K = 2 to 4 using a loop
- Generate output files .K.P and .K.Q for each K (for example, .2.P and .2.Q)
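
The loop over K might be written as in the sketch below. The helper name run_admixture, the environment name admixture_env, and the prefix dataset are hypothetical, and the conda command in the comments assumes ADMIXTURE is installed from the bioconda channel.

```shell
# Sketch only: helper, environment, and file names are hypothetical.
# Assumed one-time setup (bioconda channel):
#   conda create -n admixture_env -c bioconda admixture
#   conda activate admixture_env

# Run ADMIXTURE on PREFIX.bed for every K from K_MIN to K_MAX.
# Each run writes PREFIX.K.P and PREFIX.K.Q to the current directory.
run_admixture() {
    k="$2"
    while [ "$k" -le "$3" ]; do
        admixture "$1.bed" "$k" || return 1   # stop at the first failed run
        k=$((k + 1))
    done
}

# Example usage (assumed prefix):
#   run_admixture dataset 2 4
```

ADMIXTURE expects integer chromosome codes in the .bim file, which is the likely reason for the E2 to 1 rename in Task 1.
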
Task 5: ADMIXTURE Plotting
- Read the .Q and .fam files into R
- Extract group/population from sample IDs
- Reshape data and plot using ggplot2
- Save barplot as admixture_K3_grouped.png
Task 6: Heterozygosity Estimation
- Install rtg-tools using conda
- Run rtg vcfstats on the final VCF
- Save output as a .vcfstats file
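
A rough sketch of this step follows. The helper names vcf_stats and het_hom_lines and the file names are hypothetical. The conda command assumes the bioconda channel, and the grep pattern assumes the report labels its lines with "Sample Name" and "Het/Hom ratio".

```shell
# Sketch only: helper and file names are hypothetical.
# Assumed one-time setup (bioconda channel):
#   conda install -c bioconda rtg-tools

# Write the rtg vcfstats report for a VCF to a text file.
vcf_stats() {
    rtg vcfstats "$1" > "$2"
}

# Show only the sample names and heterozygous/homozygous ratio lines.
# Assumes the report uses the labels "Sample Name" and "Het/Hom ratio".
het_hom_lines() {
    grep -E 'Sample Name|Het/Hom ratio' "$1"
}

# Example usage (assumed file names):
#   vcf_stats final.vcf.gz final.vcfstats
#   het_hom_lines final.vcfstats
```
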
Task 7: Optional / Miscellaneous Tasks
- Activate/deactivate conda environments as needed
- Exit the server after analysis with the exit command