Find gene fusions from RNA-seq (workflow)
Latest revision as of 16:34, 12 March 2019
- Workflow title: Find gene fusions from RNA-seq
- Provider: geneXplain GmbH
Workflow overview
Description
This workflow discovers gene fusions from single-end (SE) or paired-end (PE) RNA-seq read data using the fast FusionFinder program (Francis et al., PLoS ONE 7:e39987, 2012). It accepts raw RNA-seq reads in FASTQ format and reports the gene fusions it finds in tabular form.
The FusionFinder program analyses FASTQ read data to identify gene fusion candidates. Reads must be at least 50 nucleotides long (see input example).
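Because of the 50-nucleotide minimum, it can be useful to check a FASTQ file before submitting it. The following is a minimal sketch, not part of the workflow itself: the function name `short_reads` is hypothetical, and the code assumes the simple four-lines-per-record FASTQ layout without wrapped sequences.

```python
import io

MIN_READ_LENGTH = 50  # minimum read length stated in the description


def short_reads(fastq_handle, min_length=MIN_READ_LENGTH):
    """Return the IDs of reads shorter than min_length.

    Assumes four lines per record: header, sequence, '+', quality.
    """
    too_short = []
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines
        sequence = fastq_handle.readline().strip()
        fastq_handle.readline()  # '+' separator line
        fastq_handle.readline()  # quality line
        if len(sequence) < min_length:
            too_short.append(header.strip().lstrip("@"))
    return too_short


# Toy input: one 60-nt read and one 30-nt read (invented for illustration).
records = [("read1", "A" * 60), ("read2", "C" * 30)]
text = "".join(f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n" for name, seq in records)

flagged = short_reads(io.StringIO(text))
print(flagged)
```

For a real file, the same function can be called on a handle from `open(path)`; compressed input would need `gzip.open(path, "rt")` instead.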
The analysis proceeds in four steps. First, the full-length reads are aligned against a normal coding reference transcriptome. Second, pseudo paired-end (PE) reads are created and aligned against the coding reference transcriptome. Third, the alignment results are analysed and false positives are filtered out. Finally, block filtering is applied and the fused exons and isoforms of the candidate fusion transcripts are identified.
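The idea of a pseudo paired-end read can be illustrated by cutting the two ends off a single read and treating them as a pair. This is only a sketch under stated assumptions: the function `make_pseudo_pair` is hypothetical, the end length of 25 nucleotides is an illustrative value rather than a documented setting of this workflow, and whether the program reverse-complements the second end is not stated in the description, so the sketch leaves it unchanged.

```python
def make_pseudo_pair(read, end_length=25):
    """Split one read into two 'pseudo paired-end' fragments.

    end_length is an illustrative assumption, not a documented workflow setting.
    """
    if len(read) < 2 * end_length:
        raise ValueError("read is too short to yield two non-overlapping ends")
    # First fragment: the leading end; second fragment: the trailing end.
    return read[:end_length], read[-end_length:]


read = "ACGT" * 15  # 60-nt toy read, invented for illustration
left, right = make_pseudo_pair(read)
print(len(left), len(right))
```

If the two ends of a read align to different genes, the read is a candidate for spanning a fusion junction, which is why the ends are aligned separately.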
The output table Fusion summary is a list of fusion candidates ranked by evidence strength, measured as the total number of sequence reads (total reads). For each candidate it gives the Ensembl and HGNC (HUGO Gene Nomenclature Committee) common-name identifiers of the two genes of the pair, G1 and G2 (G1_Ensembl_HGNC_ID and G2_Ensembl_HGNC_ID), the number of blocks on each gene (G1_blocks and G2_blocks), the number of isoforms that exist for each G1:G2 pair, and the category of fusion indicated by the pair.
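The ranking described above amounts to sorting candidates by their total read count in descending order. The sketch below shows this on invented toy rows: the gene identifiers are placeholders, and the key `total_reads` is an assumed name for the read-count column, which may be labelled differently in the actual output table.

```python
# Toy rows; identifiers and counts are placeholders, not real results.
candidates = [
    {"G1_Ensembl_HGNC_ID": "GENE_A", "G2_Ensembl_HGNC_ID": "GENE_B",
     "G1_blocks": 1, "G2_blocks": 2, "total_reads": 4},
    {"G1_Ensembl_HGNC_ID": "GENE_C", "G2_Ensembl_HGNC_ID": "GENE_D",
     "G1_blocks": 2, "G2_blocks": 1, "total_reads": 12},
    {"G1_Ensembl_HGNC_ID": "GENE_E", "G2_Ensembl_HGNC_ID": "GENE_F",
     "G1_blocks": 1, "G2_blocks": 1, "total_reads": 1},
]

# Strongest evidence first: sort by total reads, descending.
ranked = sorted(candidates, key=lambda row: row["total_reads"], reverse=True)

for row in ranked:
    pair = f'{row["G1_Ensembl_HGNC_ID"]}:{row["G2_Ensembl_HGNC_ID"]}'
    print(pair, row["total_reads"])
```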
The output table Fusion isoforms gives full details for each isoform of G1 and G2, including the genomic coordinates of the alignment blocks on G1 and G2 and the corresponding Ensembl exon IDs.
Parameters
- Input fastq
- Ensembl version
- Output folder