Quantification of RNA-seq with Cufflinks for multiple BAM files (workflow)

From BioUML platform
Workflow title
Quantification of RNA-seq with Cufflinks for multiple BAM files
Provider
geneXplain GmbH

Workflow overview

[Image: Quantification-of-RNA-seq-with-Cufflinks-for-multiple-BAM-files-workflow-overview.png]

Description

This workflow estimates the abundances of transcripts in several RNA-Seq samples using the Cufflinks method (Trapnell C, Williams BA, Pertea G, Mortazavi AM, Kwan G, van Baren MJ, Salzberg SL, Wold B, Pachter L. Transcript assembly and abundance estimation from RNA-Seq reveals thousands of new transcripts and switching among isoforms. Nature Biotechnology, doi:10.1038/nbt.1621).

In the first part of the workflow, the Cufflinks method accepts aligned RNA-Seq reads (in BAM files) and assembles the alignments into a set of transcripts using a reference annotation of transcripts and genes. Cufflinks then estimates the relative abundances of these transcripts and genes based on how many reads support each one.
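As an illustration of running Cufflinks once per BAM file, the following minimal sketch builds one command line per file in a folder. It is not the workflow's actual implementation: the function name, folder names, and the choice of the `-G` option (quantification against the reference annotation) are assumptions, and the commands are only constructed, not executed.

```python
from pathlib import Path


def build_cufflinks_commands(bam_dir, gtf_path, out_root, threads=1):
    """Return one cufflinks argument list per BAM file found in bam_dir.

    Hypothetical helper for illustration; the options used by the
    workflow itself are not documented on this page.
    """
    commands = []
    for bam in sorted(Path(bam_dir).glob("*.bam")):
        out_dir = Path(out_root) / bam.stem  # one output folder per sample
        commands.append([
            "cufflinks",
            "-p", str(threads),   # number of threads
            "-G", str(gtf_path),  # reference annotation (GTF), assumed option
            "-o", str(out_dir),   # per-sample output directory
            str(bam),             # aligned reads
        ])
    return commands


if __name__ == "__main__":
    # Folder and file names below are placeholders.
    for cmd in build_cufflinks_commands("bams", "genes.gtf", "cufflinks_out"):
        print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run` on a machine where Cufflinks is installed; keeping command construction separate from execution makes the per-sample layout easy to inspect first.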

The Result folder and CountsFolder are temporary folders that store the Cufflinks outputs and the intermediate quantification outputs. The FPKM folder receives the final output of the workflow. FPKM stands for Fragments Per Kilobase of transcript per Million mapped reads, a commonly used measure for this kind of data.
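The FPKM definition can be made concrete with a small calculation. The sketch below applies the plain definition (fragment count divided by transcript length in kilobases and by millions of mapped fragments); the function name and the example numbers are assumptions. Cufflinks itself derives abundances from a statistical model, so its reported values will not in general equal this naive calculation.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    Simplified textbook definition, not the Cufflinks estimator.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragment_count / (kilobases * millions)


# 500 fragments on a 2,000 bp transcript in a library of 20 million fragments
print(fpkm(500, 2_000, 20_000_000))  # prints 12.5
```

Normalizing by both length and library size is what makes values comparable between transcripts of different lengths and between samples of different sequencing depth.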

The results folder contains several tables of Ensembl type, one for each BAM file in the input folder, holding the quantification results. The FPKM value of a gene corresponds to its expression value. For RNA-Seq data, the relative expression of a transcript is proportional to the number of cDNA fragments that originated from it.
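When the per-sample tables are exported, they can be combined into a single gene-by-sample matrix for comparison. The following is a minimal sketch under stated assumptions: each table is tab-separated text with `gene_id` and `FPKM` columns (as in Cufflinks tracking files; the exported BioUML layout may differ), and the function names and sample labels are hypothetical.

```python
import csv
import io


def read_gene_fpkm(table_text):
    """Parse tab-separated text with 'gene_id' and 'FPKM' columns into a dict."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return {row["gene_id"]: float(row["FPKM"]) for row in reader}


def merge_samples(per_sample):
    """Combine {sample: {gene: fpkm}} into (samples, {gene: [fpkm per sample]}).

    Samples are ordered by name; a gene absent from a sample is given 0.0,
    which is a choice made for this sketch.
    """
    samples = sorted(per_sample)
    genes = sorted({g for table in per_sample.values() for g in table})
    matrix = {g: [per_sample[s].get(g, 0.0) for s in samples] for g in genes}
    return samples, matrix


# Two tiny made-up tables standing in for exported results
sample_a = "gene_id\tFPKM\nENSG_A\t1.5\nENSG_B\t0\n"
sample_b = "gene_id\tFPKM\nENSG_A\t2.5\nENSG_C\t4\n"
samples, matrix = merge_samples({
    "sample_a": read_gene_fpkm(sample_a),
    "sample_b": read_gene_fpkm(sample_b),
})
for gene, values in matrix.items():
    print(gene, dict(zip(samples, values)))
```

Treating a missing gene as 0.0 keeps the matrix rectangular; a missing-value marker may be preferable when absence should not be read as zero expression.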

Note. This workflow may take several hours to complete. Once it is started, the computation runs on the server, so you can switch off your computer (e.g. overnight) and check the results later. If you have any questions, please feel free to ask for details (info@genexplain.com).


Parameters

Input folder
Input folder with multiple BAM files
Sequence source
Select the species, genome build, and GENCODE version of the GTF file
Result folder
Temporary folder for storing Cufflinks results (created if it does not already exist)
CountsFolder
Temporary folder for storing information about transcript counts
FPKMfolder
Output folder for gene transcript counts in FPKM (Fragments Per Kilobase of transcript per Million mapped reads)