Find genome variants and indels from full-genome NGS (workflow)

From BioUML platform
Workflow title: Find genome variants and indels from full-genome NGS
Provider: geneXplain GmbH

Workflow overview

Find-genome-variants-and-indels-from-full-genome-NGS-workflow-overview.png

Description

This workflow is based on the framework for discovering genotype variation in full-genome NGS data described by DePristo et al., Nature Genetics 43:491-498, 2011. The process comprises initial read mapping, local realignment around indels, base quality score recalibration, and SNP discovery and genotyping to find all potential variants.

In the first part of the workflow, the input sequences are mapped with the BWA tool (Galaxy). BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100 bp, while the other two are designed for longer sequences ranging from 70 bp to 1 Mbp.
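The read-length ranges quoted above overlap between 70 bp and 100 bp, so more than one algorithm can apply to the same data. The following Python sketch simply lists which algorithms cover a given read length. It is illustrative only: the name `candidate_bwa_algorithms`, the table `BWA_RANGES`, and the lower bound of 1 bp for BWA-backtrack are assumptions made for this example and are not part of BWA or of the workflow.

```python
# Read-length ranges in bp, taken from the description above.
# The lower bound of 1 bp for BWA-backtrack is an assumption of this sketch.
BWA_RANGES = {
    "BWA-backtrack": (1, 100),
    "BWA-SW": (70, 1_000_000),
    "BWA-MEM": (70, 1_000_000),
}


def candidate_bwa_algorithms(read_length):
    """Return the BWA algorithms whose stated range covers read_length (in bp)."""
    if read_length < 1:
        raise ValueError("read length must be a positive number of bases")
    return [
        name
        for name, (low, high) in BWA_RANGES.items()
        if low <= read_length <= high
    ]


if __name__ == "__main__":
    for length in (50, 100, 250):
        print(length, candidate_bwa_algorithms(length))
```

For reads in the overlapping range, all three names are returned; which one the workflow actually runs is not stated on this page.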

The second part covers local realignment around indels, base quality score recalibration, and SNP discovery and genotyping. Once mapping is complete and duplicates and covariates have been identified, the workflow writes its first output, a new BAM file. This recalibrated BAM file is then used as input for SNP discovery and genotyping with GATK (Genome Analysis Toolkit) to find all potential variants.
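The order of the steps can be summarised as a chain in which each stage consumes what the previous one produces. The Python sketch below models that chain and checks that it is consistent. It is a simplified illustration, not the workflow's implementation: the `Stage` class, the `check_chain` helper, and the labels for intermediate files are hypothetical, and the position of duplicate identification directly after mapping is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str      # step as named in the description
    consumes: str  # hypothetical label for the input data
    produces: str  # hypothetical label for the output data


# Stage order follows the description; the placement of duplicate
# identification right after mapping is an assumption of this sketch.
STAGES = [
    Stage("read mapping (BWA)", "paired FASTQ", "aligned BAM"),
    Stage("duplicate identification", "aligned BAM", "duplicate-marked BAM"),
    Stage("local realignment around indels", "duplicate-marked BAM", "realigned BAM"),
    Stage("base quality score recalibration", "realigned BAM", "recalibrated BAM"),
    Stage("SNP discovery and genotyping (GATK)", "recalibrated BAM", "variant calls"),
]


def check_chain(stages):
    """Verify each stage consumes the previous stage's output; return stage names."""
    for previous, current in zip(stages, stages[1:]):
        if previous.produces != current.consumes:
            raise ValueError(
                f"{current.name!r} expects {current.consumes!r}, "
                f"but {previous.name!r} produces {previous.produces!r}"
            )
    return [stage.name for stage in stages]


if __name__ == "__main__":
    for step, name in enumerate(check_chain(STAGES), start=1):
        print(step, name)
```

Reordering the list, for example running recalibration before realignment, makes `check_chain` raise a `ValueError`, which mirrors the dependency of genotyping on the recalibrated BAM file.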

 

Parameters

Forward fastq
Reverse fastq
OutputFolder
Results are here