Variant Analysis Task List
Task 1: VCF to PLINK Format Conversion
- Convert VCF to PLINK binary format using plink --make-bed
- Edit the .fam file to add region information to the second column
- Edit the .bim file: replace chromosome name E2 with 1
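The two file edits above can be sketched in shell with awk. This is a minimal sketch on toy files, not necessarily the procedure used for the real data: the names demo_t1.fam, demo_t1.bim and demo_regions.txt are hypothetical, and "add region information to the second column" is read here as prefixing each sample ID with its region, which is an assumption. The PLINK call is shown only as a comment; --allow-extra-chr is usually needed while the data still contain a non-standard chromosome name such as E2.

```shell
# Conversion step, shown as a comment because it needs PLINK and a real VCF
# (input.vcf.gz and mydata are hypothetical names):
#   plink --vcf input.vcf.gz --allow-extra-chr --make-bed --out mydata

# Toy inputs standing in for the real PLINK files.
printf 'fam1 s1 0 0 0 -9\nfam2 s2 0 0 0 -9\n' > demo_t1.fam
printf 'E2\tsnp1\t0\t100\tA\tG\nE2\tsnp2\t0\t250\tC\tT\n' > demo_t1.bim
# Hypothetical lookup table: sample ID, then region.
printf 's1 North\ns2 South\n' > demo_regions.txt

# .fam: prefix the second column (sample ID) with the sample's region.
awk 'NR == FNR { region[$1] = $2; next }
     ($2 in region) { $2 = region[$2] "_" $2 }
     { print }' demo_regions.txt demo_t1.fam > demo_t1.region.fam

# .bim: rewrite chromosome code E2 (first column) as 1, keeping tabs.
awk 'BEGIN { FS = OFS = "\t" }
     $1 == "E2" { $1 = "1" }
     { print }' demo_t1.bim > demo_t1.chr1.bim

cat demo_t1.region.fam demo_t1.chr1.bim
```

Both edits write to new files so the results can be inspected first. For PLINK to pick them up, the edited files must replace the originals under the same prefix as the .bed file.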
Task 2: Principal Component Analysis (PCA)
- Run PCA with plink --pca 5
- Visualize in R with ggplot2
- Save the PCA plot as .png and .pdf
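Before plotting, it can help to give the PCA table column names so that R can read it with a header. A minimal sketch, assuming PLINK 1.9 output, where the .eigenvec file has no header row and holds the family ID, the sample ID and then one column per component. The file names and the toy values are invented for illustration, and the PLINK call appears only as a comment because it needs the real dataset.

```shell
# PCA step, shown as a comment (mydata is a hypothetical file prefix):
#   plink --bfile mydata --pca 5 --out mydata

# Toy stand-in for mydata.eigenvec: FID, IID, then PC1..PC5 (values invented).
printf 'f1 North_s1 0.11 -0.20 0.03 0.01 0.00\n' >  demo_pca.eigenvec
printf 'f2 South_s1 -0.09 0.15 -0.02 0.04 0.01\n' >> demo_pca.eigenvec

# Prepend a header line and convert the table to tab-separated text.
awk 'BEGIN { OFS = "\t" }
     NR == 1 {
         header = "FID" OFS "IID"
         for (i = 3; i <= NF; i++) header = header OFS "PC" (i - 2)
         print header
     }
     { $1 = $1; print }' demo_pca.eigenvec > demo_pca.tsv

cat demo_pca.tsv
```

The resulting table can be read in R with read.delim and passed to ggplot2. ggsave can then write the plot once as .png and once as .pdf.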
Task 3: File Transfer
- Use scp to copy output files from the server to the local machine
- Example: scp user@IP:path/to/Rplots.pdf .
Task 4: ADMIXTURE Analysis
- Create conda environment and install ADMIXTURE
- Run ADMIXTURE for K = 2 to 4 using a loop
- Generate output files .K.P and .K.Q
Task 5: ADMIXTURE Plotting
- Read the .Q and .fam files into R
- Extract group/population from sample IDs
- Reshape data and plot using ggplot2
- Save the barplot as admixture_K3_grouped.png
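The plotting input can also be prepared in the shell before R is involved. The sketch below swaps the R reshaping for paste and awk: it joins a .fam file to a K = 3 .Q file row by row (ADMIXTURE writes one .Q row per sample, in .fam order), takes the group as the part of the sample ID before the first underscore, and writes one row per sample and cluster. The file names, the Group_Sample ID pattern and the toy proportions are assumptions for illustration.

```shell
# Toy stand-ins for the real .fam and K = 3 .Q files (values invented).
printf 'North_s1 North_s1 0 0 0 -9\nNorth_s2 North_s2 0 0 0 -9\nSouth_s1 South_s1 0 0 0 -9\n' > demo_t5.fam
printf '0.90 0.05 0.05\n0.80 0.10 0.10\n0.10 0.20 0.70\n' > demo_t5.3.Q

# Join the files side by side, then reshape from wide to long:
# columns 1-6 come from the .fam file, columns 7 onward are the Q values.
paste -d ' ' demo_t5.fam demo_t5.3.Q |
awk 'BEGIN { print "sample\tgroup\tcluster\tproportion" }
     {
         split($2, parts, "_")   # group = text before the first underscore (assumed)
         for (k = 7; k <= NF; k++)
             printf "%s\t%s\tK%d\t%s\n", $2, parts[1], k - 6, $k
     }' > demo_K3_long.tsv

cat demo_K3_long.tsv
```

A long table of this shape maps directly onto a stacked barplot in ggplot2, with sample on the x axis, proportion on the y axis, cluster as the fill and group as the facet.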
Task 6: Heterozygosity Estimation
- Install rtg-tools using conda
- Run rtg vcfstats on the final VCF
- Save output as a .vcfstats file
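A guarded sketch of the steps above. The file name final.vcf.gz and the conda line in the comment are assumptions. The block calls rtg only when both the tool and the input file are present, and the final grep assumes the report contains lines mentioning heterozygosity (for example a het/hom ratio), which should be checked against the actual output.

```shell
# Installation, shown as a comment (channel and package name assumed):
#   conda install -c bioconda rtg-tools

vcf="final.vcf.gz"                  # hypothetical final VCF
stats="${vcf%.vcf.gz}.vcfstats"     # output name derived from the VCF name

if command -v rtg >/dev/null 2>&1 && [ -f "$vcf" ]; then
    rtg vcfstats "$vcf" > "$stats"
    # Show heterozygosity-related lines, if the report contains any.
    grep -i 'het' "$stats" || true
else
    echo "rtg or $vcf not available; nothing was run" >&2
fi
```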
Task 7: Optional / Miscellaneous Tasks
- Activate/deactivate conda environments as needed
- Exit the server after analysis with the exit command