Find genome variants and indels from full-genome NGS (workflow)
- Workflow title: Find genome variants and indels from full-genome NGS
- Provider: geneXplain GmbH
Workflow overview
Description
This workflow is based on the framework for discovering genotype variation in full-genome NGS data described by DePristo et al., Nature Genetics 43:491-498, 2011. The process comprises initial read mapping, local realignment around indels, base quality score recalibration, and SNP discovery and genotyping to find all potential variants.
In the first part of the workflow, the input sequences are mapped with the BWA tool (Galaxy). BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first is designed for Illumina sequence reads up to 100 bp, while the other two are designed for longer sequences ranging from 70 bp to 1 Mbp.
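As an illustration of the mapping step, the following minimal sketch assembles a paired-end `bwa mem` command line in Python. The description does not state which of the three algorithms the workflow runs, so the choice of BWA-MEM is an assumption, as are the helper name `bwa_mem_command`, the thread count and all file names. The reference is assumed to have been indexed beforehand with `bwa index`.

```python
import shlex


def bwa_mem_command(reference, forward_fastq, reverse_fastq, threads=4):
    """Build the argument list for a paired-end `bwa mem` run.

    Hypothetical helper for illustration; not part of the workflow itself.
    `bwa mem` writes SAM records to standard output, so the caller is
    expected to redirect or pipe the result.
    """
    if threads < 1:
        raise ValueError("threads must be >= 1")
    return [
        "bwa", "mem",
        "-t", str(threads),   # number of worker threads
        reference,            # FASTA file already indexed with `bwa index`
        forward_fastq,        # forward (first-in-pair) reads
        reverse_fastq,        # reverse (second-in-pair) reads
    ]


# Example file names are placeholders.
cmd = bwa_mem_command("reference.fa", "sample_R1.fastq", "sample_R2.fastq", threads=8)
print(shlex.join(cmd))
```

The argument list can be passed to `subprocess.run` with `stdout` pointing at an open file to capture the alignments.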
The second part covers local realignment around indels, base quality score recalibration, and SNP discovery and genotyping. Once the reads are mapped and duplicates and covariates have been identified, the workflow writes a new BAM file as its first output. This recalibrated BAM file is then used as input for SNP discovery and genotyping with GATK (Genome Analysis Toolkit) to find all potential variants.
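The order of the stages named in the description can be summarized in a short sketch that maps each stage to the file it would hand to the next one. The stage labels come from the text above. The file suffixes, the VCF extension for the final variant calls and the helper `plan_outputs` are assumptions made for illustration and are not the names the workflow actually uses.

```python
from pathlib import PurePosixPath

# Stage labels follow the description; the suffixes are hypothetical.
STAGES = [
    ("read mapping", "mapped.bam"),
    ("local realignment around indels", "realigned.bam"),
    ("base quality score recalibration", "recalibrated.bam"),
    ("SNP discovery and genotyping", "variants.vcf"),
]


def plan_outputs(output_folder, sample):
    """Return (stage, output path) pairs in execution order."""
    folder = PurePosixPath(output_folder)
    return [(stage, str(folder / f"{sample}.{suffix}")) for stage, suffix in STAGES]


# Folder and sample names are placeholders.
for stage, path in plan_outputs("results", "sampleA"):
    print(f"{stage:35s} -> {path}")
```

Each stage consumes the file produced by the previous one, so the recalibrated BAM file is the last intermediate before variant calling.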
Parameters
- Forward fastq
- Reverse fastq
- OutputFolder: results are written here