Difference between revisions of "Upstream analysis (TRANSFAC(R) and TRANSPATH(R)) (workflow)"

Latest revision as of 13:34, 30 May 2013

Workflow title: Upstream analysis (TRANSFAC(R) and TRANSPATH(R))
Provider: geneXplain GmbH

Workflow overview

Description

This workflow is designed to perform a complete upstream analysis, including a search for putative transcription factor binding sites (TFBSs) in the promoters of the input gene set as well as an analysis of the pathways upstream of the suggested TFs. The resulting master regulatory molecules can be considered as new targets and are candidates for further experimental validation.

As input, any gene or protein table can be submitted. The input consists of a table with the genes under study (the Yes set) and a table with a background set (the No set).

At the first step, both input tables are converted into the corresponding tables with Ensembl Gene IDs.

At the next step, TFBSs are searched for in the promoters of the specified gene sets. Promoters in this workflow are defined as sequences from -1000 to +100 relative to the transcription start sites (TSSs), as they are annotated in Ensembl.
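The promoter definition above can be illustrated with a short Python sketch. It is not the workflow's own implementation: the function name `promoter_window`, the 1-based inclusive coordinates, and the treatment of the TSS as offset 0 are assumptions made for illustration only.

```python
from typing import NamedTuple


class Promoter(NamedTuple):
    chrom: str
    start: int   # 1-based, inclusive (assumed convention)
    end: int     # 1-based, inclusive (assumed convention)
    strand: str


def promoter_window(chrom, tss, strand, start_offset=-1000, end_offset=100):
    """Return the genomic interval spanning start_offset..end_offset around a TSS.

    Offsets are relative to the TSS (taken as offset 0) and follow the
    direction of transcription, so they are mirrored on the minus strand.
    """
    if strand == "+":
        start, end = tss + start_offset, tss + end_offset
    elif strand == "-":
        start, end = tss - end_offset, tss - start_offset
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip at the chromosome start so the interval never goes below position 1.
    return Promoter(chrom, max(start, 1), end, strand)


if __name__ == "__main__":
    print(promoter_window("chr1", 5000, "+"))
    print(promoter_window("chr1", 5000, "-"))
```

The default offsets correspond to the "Start of promoter" and "End of promoter" parameters; on the minus strand the upstream region lies at higher genomic coordinates, which is why the offsets are mirrored.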

Site search is done with the TRANSFAC® library of positional weight matrices (PWMs), specifically with the profile vertebrate_non_redundant_minSUM.

At the same step, frequencies of putative TFBSs are compared between the Yes set and the No set to identify sites that are overrepresented in the Yes set versus the No set.

The output of this step is a list of PWMs whose hits are overrepresented in the Yes set versus the No set.
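To make the idea of overrepresentation concrete, the following Python sketch compares how often one PWM hits the Yes promoters versus the No promoters. The statistic actually used by the workflow is not described on this page, so a one-sided Fisher's exact test on counts of promoters with at least one hit is substituted here as a generic stand-in. The function names and the example counts are hypothetical.

```python
from math import comb


def frequency_ratio(yes_hits, yes_total, no_hits, no_total):
    """Ratio of hit frequencies (Yes / No); infinite if the No set has no hits."""
    if no_hits == 0:
        return float("inf")
    # Cross-multiplied form of (yes_hits / yes_total) / (no_hits / no_total).
    return (yes_hits * no_total) / (no_hits * yes_total)


def fisher_right_tail(yes_hits, yes_total, no_hits, no_total):
    """One-sided p-value for seeing at least yes_hits promoters with a site
    in the Yes set, given the margins (hypergeometric upper tail)."""
    total = yes_total + no_total
    hits = yes_hits + no_hits
    upper = min(hits, yes_total)
    tail = sum(
        comb(hits, k) * comb(total - hits, yes_total - k)
        for k in range(yes_hits, upper + 1)
    )
    return tail / comb(total, yes_total)


if __name__ == "__main__":
    # Hypothetical counts: promoters with at least one hit for a single PWM.
    print(frequency_ratio(30, 100, 10, 100))
    print(fisher_right_tail(30, 100, 10, 100))
```

A PWM would be reported as overrepresented when its ratio is above 1 and its p-value falls below a chosen threshold; the threshold itself is left open here because the page does not state one.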

Next, the list of PWMs is converted into a table of transcription factors with TRANSPATH® IDs, which are used to search for master regulatory molecules in the TRANSPATH® network. For each potential master regulator, FDR, Score, and Z-score are calculated.

The results are filtered by Z_Score>1 and Score>0.2 to select statistically significant master regulators.
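The filter described above amounts to a simple row selection. The sketch below assumes the results table is available as a list of dictionaries with `Z_Score` and `Score` keys; this representation, the `ID` key, and the example rows are hypothetical and only illustrate the two thresholds.

```python
def filter_master_regulators(rows, min_z=1.0, min_score=0.2):
    """Keep rows with Z_Score > min_z and Score > min_score (both strict)."""
    return [
        row for row in rows
        if row["Z_Score"] > min_z and row["Score"] > min_score
    ]


if __name__ == "__main__":
    # Hypothetical candidate rows; IDs and values are made up for illustration.
    candidates = [
        {"ID": "MR_A", "Z_Score": 2.5, "Score": 0.4},
        {"ID": "MR_B", "Z_Score": 0.8, "Score": 0.9},
        {"ID": "MR_C", "Z_Score": 1.5, "Score": 0.1},
    ]
    for row in filter_master_regulators(candidates):
        print(row["ID"])
```

Both comparisons are strict, matching the ">" signs in the text, so a candidate with a Z_Score of exactly 1 is dropped.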

The table with the resulting master regulatory molecules is converted into a table with Ensembl Gene IDs and annotated with additional information: gene descriptions and gene symbols.

Finally, networks for the three top master regulatory molecules are visualized as diagrams in the hierarchical layout.

The output is a new folder with several tables, including a summary of the predicted TFBSs, genomic tracks of the Yes and No promoters and sites, as well as a table with candidate master regulators and network diagrams for the three top candidates.

This workflow is available together with valid TRANSFAC® and TRANSPATH® licenses.

Parameters

Input Yes gene set
Species
Input No gene set
Profile
Start of promoter: Position relative to TSS, bp
End of promoter: Position relative to TSS, bp
Results Folder: Folder to store results (will be created if it does not exist)

@@ Line 2: / Line 2: @@
 :Upstream analysis (TRANSFAC(R) and TRANSPATH(R))
 ;Provider
-:[[GeneXplain GmbH]]
+:[[geneXplain GmbH]]
 == Workflow overview ==
 [[File:Upstream-analysis-TRANSFAC-R-and-TRANSPATH-R-workflow-overview.png|400px]]
 == Description ==
-This workflow is designed to perform a complete upstream analysis including search for putative transcription factor binding sites, TFBSs, on the promoters of the input gene set as well as an analysis of the pathways upstream of the suggested TFs. Resulting master regulatory molecules can be considered as new targets, and are candidates for further experimental validations.
+This workflow is designed to perform a complete upstream analysis including a search for putative transcription factor binding sites (TFBSs), in the promoters of the input gene set as well as an analysis of the pathways upstream of the suggested TFs. The resulting master regulatory molecules can be considered as new targets, and are candidates for further experimental validations
 As input, any gene or protein table can be submitted. The input is a table with the genes under study (“Yes” set), and a background set, or No set.
 At the first step, both input tables are converted into the corresponding tables with Ensembl Gene IDs.
 At the next step, TFBSs are search in the promoters of the specified gene sets. Promoters in this workflow are defined as sequences from -1000 to +100 relative to the transcription start sites, as they are annotated in Ensembl.
 Site search is done with the TRANSFAC® library of positional weight matrices, PWMs, namely with the profile vertebrate_non_redundant_minSUM.
 At the same step, frequencies of putative TFBS are compared between Yes set and No set to identify sites that are overrepresented in Yes set versus No set.
 The output of this step is a list of PWMs the hits of which are overrepresented in Yes set versus No set.
@@ Line 30: / Line 30: @@
 The output is a new folder with several tables, including summary of the predicted TFBS, genomic tracks of the Yes and No promoters and sites, as well as a table with candidate master regulators and network diagrams for three top candidates.
-This workflow is available together with valid TRANSFAC® and TRANSPATH® licenses.
+This workflow is available together with valid TRANSFAC® and TRANSPATH® licenses..
 == Parameters ==
@@ Line 37: / Line 43: @@
 ;Input No gene set
 ;Profile
-;Start of promotor
+;Start of promoter
 :Position relative to TSS, bp
-;End of promotor
+;End of promoter
 :Position relative to TSS, bp
 ;Results Folder
@@ Line 45: / Line 51: @@
 [[Category:Workflows]]
+[[Category:GeneXplain workflows]]
 [[Category:Autogenerated pages]]
