Difference between revisions of "Compute differentially expressed genes (Agilent probes) (workflow)"

From BioUML platform
m (Cosmetic changes in category formatting)
(Automatic synchronization with BioUML)
Line 6: Line 6:
 
[[File:Compute-differentially-expressed-genes-Agilent-probes-workflow-overview.png|400px]]
 
[[File:Compute-differentially-expressed-genes-Agilent-probes-workflow-overview.png|400px]]
 
== Description ==
 
== Description ==
This workflow is designed to identify upregulated, downregulated and non-changed genes for experimental data with three and more data points for each experiment and control. 
+
This workflow is designed to identify up-regulated, down-regulated and non-changed genes for experimental data with three and more data points for each experiment and control.  
  
As input, the normalized data with Agilent probe IDs can be submitted.
+
As input, normalized data with Agilent probeset IDs can be submitted. Such normalized files are the output of the “Normalize data” procedure.
  
Such normalized files are resulting from the output of the “Normalize data” procedure under  “analyses/Methods/Data normalization/Normalize Agilent experiment and control”.
+
In the next step, p-values for up- and down- regulated probes are calculated for all probes using the “Up and Down Identification”'' ''analysis,. This analysis applies Student’s T-test for p-value calculation, thus the number of data points should be at least three for each experiment and control. 
  
At the next step, p-value is calculated for up-and down-regulated Agilent probe IDs. This workflow applies Student T-test for p-value calculation, and therefore the number of data points should be at least three for each experiment and control.
+
Simultaneously, the log fold change is calculated for each probeset ID, resulting in  a table in which both log fold changes and p-values are assigned to each probeset ID. A histogram with log fold change distribution is calculated and generated as one of the output files.
  
Simultaneously, log fold change is calculated for Agilent probeIDs, and as the result of this step, a table is produced in which both LogFoldChange and p-value are assigned to each row.
+
In addition, this table is filtered by several conditions in parallel applying the “Filter table” method, to identify up-regulated, down-regulated, and non-changed Agilent probeset IDs. The filtering criteria are set as follows:
  
Further, this table is filtered by several conditions in parallel, to identify upregulated, downregulated, and non-changed Agilent probe IDs.
+
For up-regulated probes: LogFoldChange>0.5 and -log_P_value_>3.
  
The filtering criteria are set as the following.
+
For down- regulated probes: LogFoldChange<-0.5 and -log_P_value_<-3.
  
For upregulated probes: LogFoldChange>0.5 and -log_P_value_>3.
+
For non-changed genes : LogFoldChange<0.002 and LogFoldChange>-0.002
  
For downregulated probes: LogFoldChange<-0.5 and -log_P_value_<-3.
+
The resulting tables of the up-regulated, down-regulated, and non-changed Agilent probeset IDs are converted into a gene set via the “Convert table” method and annotated with additional information, gene descriptions, gene symbols, and species via “Annotate table”. Two tables are produced, with Ensembl Gene IDs and with Entrez IDs. 
  
For non-changed probes: LogFoldChange<0.01 and LogFoldChange>-0.01
+
A new folder is generated as output containing Ensemble and Entrez gene tables for up-regulated, down-regulates, up- and down-regulated together, and non-changed genes. After completion of the workflow, a script generates a report which gives the summary of the workflow output files. 
 
+
Resulting tables of the upregulated, downregulated, and non-changed Agilent probe IDs are annotated with additional information, gene description, gene symbols, species.
+
 
+
Finally, these tables are converted into the tables of genes. Two tables are produced, with Ensembl Gene IDs and with Entrez IDs.
+
  
 
== Parameters ==
 
== Parameters ==

Revision as of 13:34, 30 May 2013

Workflow title
Compute differentially expressed genes (Agilent probes)
Provider
geneXplain GmbH

Workflow overview

Compute-differentially-expressed-genes-Agilent-probes-workflow-overview.png

Description

This workflow is designed to identify up-regulated, down-regulated and non-changed genes in experimental data with three or more data points for each experiment and control.

As input, normalized data with Agilent probeset IDs can be submitted. Such normalized files are the output of the “Normalize data” procedure.

In the next step, p-values for up- and down-regulated probes are calculated for all probes using the “Up and Down Identification” analysis. This analysis applies Student’s t-test for p-value calculation, so each experiment and control needs at least three data points.

Simultaneously, the log fold change is calculated for each probeset ID, resulting in a table in which both a log fold change and a p-value are assigned to each probeset ID. A histogram of the log fold change distribution is generated as one of the output files.
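
The two per-probe quantities described above can be illustrated with a minimal Python sketch. It is not the BioUML implementation: the function names are hypothetical, the pooled-variance form of Student's t-test is an assumption (the exact variant used by “Up and Down Identification” is not stated here), and the log fold change is assumed to be the difference of group means of values that are already on a log scale. Only the t statistic and degrees of freedom are computed; converting them to a p-value requires the t distribution and is omitted.

```python
import math
from statistics import mean, variance


def log_fold_change(experiment, control):
    """Difference of group means; assumes the inputs are already log-scaled."""
    return mean(experiment) - mean(control)


def student_t(experiment, control, min_points=3):
    """Two-sample Student's t statistic with pooled variance (an assumption).

    Returns (t, degrees_of_freedom). min_points mirrors the workflow's stated
    requirement of at least three data points per group.
    """
    n1, n2 = len(experiment), len(control)
    if n1 < min_points or n2 < min_points:
        raise ValueError("each group needs at least %d data points" % min_points)
    df = n1 + n2 - 2
    # Pooled sample variance across both groups.
    pooled = ((n1 - 1) * variance(experiment) + (n2 - 1) * variance(control)) / df
    std_err = math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    if std_err == 0.0:
        raise ValueError("zero variance in both groups; t statistic is undefined")
    return (mean(experiment) - mean(control)) / std_err, df


# Hypothetical normalized intensities for a single probe.
experiment = [8.0, 8.2, 8.4]
control = [7.0, 7.1, 6.9]
lfc = log_fold_change(experiment, control)
t_stat, dof = student_t(experiment, control)
```

With fewer than three values in either group, `student_t` raises an error rather than returning an unreliable statistic, which matches the minimum stated for this workflow.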

In addition, this table is filtered by several conditions in parallel applying the “Filter table” method, to identify up-regulated, down-regulated, and non-changed Agilent probeset IDs. The filtering criteria are set as follows:

For up-regulated probes: LogFoldChange>0.5 and -log_P_value_>3.

For down-regulated probes: LogFoldChange<-0.5 and -log_P_value_<-3.

For non-changed probes: LogFoldChange<0.002 and LogFoldChange>-0.002.
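
The three filter conditions listed above can be expressed as a small Python sketch. This is only an illustration of the thresholds, not the “Filter table” method itself: the function name `classify_probe` is hypothetical, and the `-log_P_value_` column is assumed to be a signed score (positive for up-regulation, negative for down-regulation), which is what the negative threshold in the down-regulated filter implies.

```python
def classify_probe(log_fold_change, signed_score):
    """Apply the workflow's three filters to one row of the result table.

    signed_score stands for the -log_P_value_ column, assumed here to be
    signed by the direction of change. Returns "up", "down", "non-changed",
    or None when the row passes none of the filters.
    """
    if log_fold_change > 0.5 and signed_score > 3:
        return "up"
    if log_fold_change < -0.5 and signed_score < -3:
        return "down"
    if -0.002 < log_fold_change < 0.002:
        return "non-changed"
    return None


# Hypothetical rows: (probeset ID, LogFoldChange, -log_P_value_)
rows = [
    ("probe_a", 1.2, 4.5),
    ("probe_b", -0.9, -3.8),
    ("probe_c", 0.001, 0.1),
    ("probe_d", 0.3, 1.0),
]
labels = {probe: classify_probe(lfc, score) for probe, lfc, score in rows}
```

In the workflow the filters run in parallel and each produces its own table; the single function here merely shows that the three conditions are mutually exclusive, and that rows such as `probe_d` fall into none of the output tables.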

The resulting tables of the up-regulated, down-regulated, and non-changed Agilent probeset IDs are converted into gene sets via the “Convert table” method and annotated with additional information (gene descriptions, gene symbols, and species) via “Annotate table”. Two tables are produced for each set, one with Ensembl Gene IDs and one with Entrez IDs.

A new folder is generated as output, containing Ensembl and Entrez gene tables for up-regulated genes, down-regulated genes, up- and down-regulated genes together, and non-changed genes. After the workflow completes, a script generates a report that summarizes the workflow output files.

Parameters

Experiment normalized
Control normalized
Species
Results folder