Difference between revisions of "Analyze promoters (GTRD) (workflow)"

From BioUML platform
(GeneXplain -> geneXplain)
(Automatic synchronization with BioUML)
 
(6 intermediate revisions by one user not shown)
Line 6: Line 6:
 
[[File:Analyze-promoters-GTRD-workflow-overview.png|400px]]
 
[[File:Analyze-promoters-GTRD-workflow-overview.png|400px]]
 
== Description ==
 
== Description ==
This workflow is designed to search for putative transcription factor binding sites, TFBS, on the promoters of an input gene set.
+
This workflow is designed to search for putative transcription factor binding sites, TFBS, on the promoters of an input gene set.   As input, any gene or protein table can be submitted. '''The input table contains genes under study, and it is called “Yes” set'''.  At the first step, the input table is converted into a table with Ensembl Gene IDs.
  
As input, any gene or protein table can be submitted. The input table contains genes under study, and it is called “Yes” set.
+
At the next step, promoters are analyzed for potential cis-regulatory sites. Promoters in this workflow are defined as sequences from -1000 to +100 relative to the transcription start sites, as they are annotated in Ensembl. '''Site search is done with the help of the GTRD library of the positional weight matrices, PWMs, namely with the profile moderate threshold. '''
  
At the first step, the input table is converted into a table with Ensembl Gene IDs.
+
At the same step, frequencies of putative TFBSs are compared between Yes set and No set to identify sites overrepresented in Yes set versus No set. Default No set in the workflow is a set of housekeeping genes for the corresponding species (as they are published in PMID: 19534766)  . The result of this step is a list of PWMs the hits of which are overrepresented in Yes set versus No set. Next, the list of PWMs is converted into a table of transcription factors. Two tables are produced, with Ensembl Gene IDs and with Entrez IDs.  
  
At the next step, promoters are analyzed for potential cis-regulatory sites. Promoters in this workflow are defined as sequences from -1000 to +100 relative to the transcription start sites, as they are annotated in Ensembl.
+
Finally, both tables with transcription factors are annotated with additional information, gene description and gene symbols.
  
Site search is done with the GTRD library of positional weight matrices, PWMs, namely with the profile “moderate threshold”.
+
The output is a new folder with several tables, including a summary of the predicted TFBSs, genomic tracks of the Yes and No promoters and sites, as well as the tables with transcription factors potentially regulating the genes in the Yes set.
  
At the same step, frequencies of putative TFBSs are compared between Yes set and No set to identify sites overrepresented in Yes set versus No set. Default No set in the workflow is a set of housekeeping genes for the corresponding species (as they are published in PMID: 19534766) 
+
 
 
+
The result of this step is a list of PWMs the hits of which are overrepresented in Yes set versus No set.
+
 
+
Next, the list of PWMs is converted into a table of transcription factors. Two tables are produced, with Ensembl Gene IDs and with Entrez IDs.
+
 
+
Finally, both tables with transcription factors are annotated with additional information, gene description and gene symbols.
+
  
The outputis a new folder with several tables, including a summary of the predicted TFBS, genomic tracks of the Yes and No promoters and sites, as well as the tables with transcription factors potentially regulating the genes in the Yes set.
+
 
  
 
== Parameters ==
 
== Parameters ==
;Input gene set
+
;Input gene set (Yes set)
 +
;Profile
 
;Species
 
;Species
 +
;No set
 +
;5<nowiki>'</nowiki> flank
 +
:Position relative to TSS, bp
 +
;3<nowiki>'</nowiki> flank
 +
:Position relative to TSS, bp
 
;Results folder
 
;Results folder
  
 
[[Category:Workflows]]
 
[[Category:Workflows]]
 +
[[Category:GeneXplain workflows]]
 
[[Category:Autogenerated pages]]
 
[[Category:Autogenerated pages]]

Latest revision as of 16:34, 12 March 2019

Workflow title: Analyze promoters (GTRD)
Provider: geneXplain GmbH

Workflow overview

Analyze-promoters-GTRD-workflow-overview.png

Description

This workflow searches for putative transcription factor binding sites (TFBSs) in the promoters of an input gene set. Any gene or protein table can be submitted as input. The input table contains the genes under study and is called the “Yes” set. In the first step, the input table is converted into a table of Ensembl Gene IDs.

In the next step, promoters are analyzed for potential cis-regulatory sites. Promoters in this workflow are defined as sequences from -1000 to +100 relative to the transcription start sites (TSSs) annotated in Ensembl. The site search uses the GTRD library of positional weight matrices (PWMs), specifically the profile “moderate threshold”.
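The promoter definition can be illustrated with a minimal Python sketch. It is not the workflow's own code: the function name `promoter_window`, the 1-based inclusive coordinate convention, and the clamping at position 1 are assumptions made for illustration. Only the default offsets (-1000 and +100 relative to the TSS) come from the description.

```python
def promoter_window(tss, strand, five_prime=-1000, three_prime=100):
    """Return (start, end) genomic coordinates of a promoter window.

    five_prime and three_prime are positions relative to the TSS, in bp.
    Assumption (not from the workflow): 1-based inclusive coordinates,
    with the TSS itself at offset 0.
    """
    if strand == "+":
        start, end = tss + five_prime, tss + three_prime
    elif strand == "-":
        # On the reverse strand, upstream lies at higher coordinates.
        start, end = tss - three_prime, tss - five_prime
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clamp so the window never starts before the first base.
    return max(1, start), end


print(promoter_window(5000, "+"))  # (4000, 5100)
print(promoter_window(5000, "-"))  # (4900, 6000)
```

The two offsets correspond to the 5' flank and 3' flank parameters, so a different promoter definition only requires passing other values.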

In the same step, the frequencies of putative TFBSs are compared between the Yes set and the No set to identify sites overrepresented in the Yes set. The default No set in the workflow is a set of housekeeping genes for the corresponding species (as published in PMID: 19534766). The result of this step is a list of PWMs whose hits are overrepresented in the Yes set versus the No set. Next, the list of PWMs is converted into a table of transcription factors. Two tables are produced, one with Ensembl Gene IDs and one with Entrez IDs.
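The idea of the Yes versus No comparison can be sketched as follows. The description does not state which statistic the workflow uses, so this example substitutes a one-sided Fisher exact test on the number of promoters with at least one hit; the function names, the significance cutoff `alpha`, and the PWM names `PWM_A` and `PWM_B` are hypothetical.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b: Yes promoters with / without a hit
    c, d: No promoters with / without a hit
    Returns P(X >= a) under the hypergeometric null.
    """
    n = a + b + c + d
    yes_total = a + b
    with_hit = a + c
    upper = min(yes_total, with_hit)
    tail = sum(
        comb(yes_total, k) * comb(n - yes_total, with_hit - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, with_hit)


def overrepresented_pwms(hits, yes_total, no_total, alpha=0.05):
    """Return the PWMs whose hits are enriched in the Yes set.

    hits maps a PWM name to (yes_with_hit, no_with_hit).
    Assumption: a stand-in criterion, not the workflow's actual statistic.
    """
    selected = []
    for pwm, (yes_hit, no_hit) in hits.items():
        more_frequent = yes_hit / yes_total > no_hit / no_total
        p = fisher_right_tail(yes_hit, yes_total - yes_hit,
                              no_hit, no_total - no_hit)
        if more_frequent and p < alpha:
            selected.append(pwm)
    return selected


# Hypothetical counts: 10 Yes promoters and 10 No promoters.
print(overrepresented_pwms({"PWM_A": (8, 2), "PWM_B": (5, 5)}, 10, 10))
# ['PWM_A']
```

Counting promoters with at least one hit keeps the sketch short; a comparison of site densities per promoter would need a different test.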

Finally, both transcription factor tables are annotated with additional information: gene descriptions and gene symbols.

The output is a new folder with several tables, including a summary of the predicted TFBSs, genomic tracks of the Yes and No promoters and sites, as well as the tables with transcription factors potentially regulating the genes in the Yes set.

 

 

Parameters

Input gene set (Yes set)
Profile
Species
No set
5' flank: position relative to TSS, bp
3' flank: position relative to TSS, bp
Results folder