Difference between revisions of "ChIP-Seq - Identify composite modules on peaks (TRANSFAC(R)) (workflow)"

m (Cosmetic changes in category formatting)
(Automatic synchronization with BioUML)
Line 6:
[[File:ChIP-Seq-Identify-composite-modules-on-peaks-TRANSFAC-R-workflow-overview.png|400px]]
== Description ==
−
This workflow is designed to search for composite regulatory modules in DNA sequences identified by the ChIP-Seq approach. In the first step, potential binding sites are identified using the TRANSFAC(R) database. In the second step, composite modules are found using a genetic algorithm.
Revision as of 11:49, 30 July 2013

Workflow title: ChIP-Seq - Identify composite modules on peaks (TRANSFAC(R))
Provider: geneXplain GmbH

Workflow overview

ChIP-Seq-Identify-composite-modules-on-peaks-TRANSFAC-R-workflow-overview.png

Description

This workflow is designed to search for composite modules in DNA sequences identified by the ChIP-Seq approach. In fact, any dataset in BED format can be submitted as the input track for this workflow.
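Because any BED dataset is accepted, it can help to check a file before submitting it. The following is a minimal Python sketch, not part of the workflow itself; the names `BedInterval` and `parse_bed` are hypothetical, and only the first three BED columns (chromosome, start, end) are read.

```python
from typing import List, NamedTuple


class BedInterval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def parse_bed(text: str) -> List[BedInterval]:
    """Read the three mandatory BED columns, skipping header and comment lines."""
    intervals = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"BED line needs at least 3 columns: {line!r}")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if start < 0 or end < start:
            raise ValueError(f"invalid coordinates: {line!r}")
        intervals.append(BedInterval(chrom, start, end))
    return intervals


# Illustrative input only; these peaks are made up.
example = "track name=peaks\nchr1\t100\t250\tpeak_1\nchr2\t5000\t5300\tpeak_2\n"
peaks = parse_bed(example)
```

Extra columns such as the peak name are ignored here, since the sketch only validates coordinates.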

Two different tracks must be submitted as input: the dataset under study (Yes track) and a background dataset (No track). The default No track corresponds to far-upstream regions of housekeeping genes, where no functional composite modules are expected.

In the first step, both the Yes and No tracks are submitted to the Site search on track and Site search results optimization analyses to find transcription factor binding sites (TFBSs) that are enriched in the Yes track compared with the No track. The workflow uses the default profile vertebrate_non_redundant_minSUM from the TRANSFAC® library.
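The idea of enrichment in the Yes track relative to the No track can be illustrated with a simple ratio of site densities. This is only a sketch of the general concept, not the method used by the Site search results optimization analysis; the function names, the pseudocount, the matrix names, and all counts are hypothetical.

```python
from typing import Dict, List, Tuple


def enrichment_ratio(yes_sites: int, yes_bp: int, no_sites: int, no_bp: int,
                     pseudocount: float = 1.0) -> float:
    """Ratio of site density (sites per bp) in the Yes track to the No track."""
    if yes_bp <= 0 or no_bp <= 0:
        raise ValueError("track lengths must be positive")
    if pseudocount <= 0 and no_sites == 0:
        raise ValueError("a positive pseudocount is needed when no_sites is 0")
    # Equivalent to (yes density) / (no density), written without intermediate division.
    return ((yes_sites + pseudocount) * no_bp) / ((no_sites + pseudocount) * yes_bp)


def rank_matrices(counts: Dict[str, Tuple[int, int]],
                  yes_bp: int, no_bp: int) -> List[Tuple[str, float]]:
    """Sort site models by enrichment ratio, most enriched first."""
    ratios = [(name, enrichment_ratio(y, yes_bp, n, no_bp))
              for name, (y, n) in counts.items()]
    return sorted(ratios, key=lambda item: item[1], reverse=True)


# Made-up (Yes, No) site counts for three hypothetical site models.
counts = {"matrix_A": (120, 40), "matrix_B": (30, 60), "matrix_C": (50, 50)}
ranking = rank_matrices(counts, yes_bp=50_000, no_bp=50_000)
```

A ratio above 1 means the site model occurs more densely in the Yes track; the pseudocount only guards against division by zero when a model has no hits in the background.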

In the second step, composite modules are identified using the Construct composite modules on tracks analysis. By default, this analysis searches for 2 to 8 different pairs of TF binding sites. The minimum and maximum numbers of pairs can be adjusted in the input form. The genetic algorithm runs for 300 iterations by default; this value can also be changed in the input form.
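To show how the three settings (minimum pairs, maximum pairs, iterations) constrain a genetic search, here is a toy Python sketch. It is not the algorithm implemented by the Construct composite modules on tracks analysis: the selection scheme, the mutation operator, the population size, and the scoring function `toy_score` are all assumptions made for illustration.

```python
import random
from typing import Callable, FrozenSet, List, Tuple

Pair = Tuple[str, str]
Module = FrozenSet[Pair]


def random_module(pairs: List[Pair], min_pairs: int, max_pairs: int,
                  rng: random.Random) -> Module:
    """Draw a random set of pairs whose size lies within the allowed range."""
    return frozenset(rng.sample(pairs, rng.randint(min_pairs, max_pairs)))


def mutate(module: Module, pairs: List[Pair], min_pairs: int, max_pairs: int,
           rng: random.Random) -> Module:
    """Add or remove one pair while keeping the size within the allowed range."""
    members = set(module)
    unused = [p for p in pairs if p not in members]
    if unused and len(members) < max_pairs and rng.random() < 0.5:
        members.add(rng.choice(unused))
    elif len(members) > min_pairs:
        members.remove(rng.choice(sorted(members)))
    return frozenset(members)


def evolve(pairs: List[Pair], score: Callable[[Module], float],
           min_pairs: int = 2, max_pairs: int = 8, iterations: int = 300,
           population_size: int = 20, seed: int = 0) -> Module:
    """Keep the better half each iteration and refill with mutated survivors."""
    if not 1 <= min_pairs <= max_pairs <= len(pairs):
        raise ValueError("need 1 <= min_pairs <= max_pairs <= len(pairs)")
    rng = random.Random(seed)
    population = [random_module(pairs, min_pairs, max_pairs, rng)
                  for _ in range(population_size)]
    for _ in range(iterations):
        population.sort(key=score, reverse=True)
        survivors = population[: population_size // 2]
        children = [mutate(rng.choice(survivors), pairs, min_pairs, max_pairs, rng)
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    return max(population, key=score)


# Hypothetical candidate pairs and weights, used only to exercise the sketch.
candidate_pairs = [(f"site_{i}", f"site_{i + 1}") for i in range(10)]
weights = {pair: (i % 3) - 1 for i, pair in enumerate(candidate_pairs)}


def toy_score(module: Module) -> float:
    return sum(weights[pair] for pair in module)


best = evolve(candidate_pairs, toy_score)
```

The defaults of `evolve` mirror the values stated in the description (2 to 8 pairs, 300 iterations); everything else is a design choice of this sketch.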

In the next step, all site models identified as parts of the composite module are converted into a table of genes with Entrez IDs using the Composite module to proteins analysis. The resulting table of Entrez genes is additionally annotated with gene symbols and gene descriptions.

This workflow is available only with a valid TRANSFAC® license.

Parameters

Input Yes track
Sequence source
Species
Input No track
Minimal number of pairs
Maximal number of pairs
Number of iterations
Results folder
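The parameters above can be collected into a small record for bookkeeping or validation before filling in the input form. This Python sketch is an assumption, not a BioUML API: the class name, field names, and the placeholder paths are hypothetical, while the three numeric defaults are taken from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowParameters:
    input_yes_track: str
    sequence_source: str
    species: str
    input_no_track: str
    results_folder: str
    min_pairs: int = 2      # default minimal number of pairs
    max_pairs: int = 8      # default maximal number of pairs
    iterations: int = 300   # default number of genetic algorithm iterations

    def __post_init__(self) -> None:
        if self.min_pairs < 1:
            raise ValueError("min_pairs must be at least 1")
        if self.max_pairs < self.min_pairs:
            raise ValueError("max_pairs must not be smaller than min_pairs")
        if self.iterations < 1:
            raise ValueError("iterations must be at least 1")


# Placeholder values only; real paths and names depend on the installation.
params = WorkflowParameters(
    input_yes_track="path/to/yes_track",
    sequence_source="path/to/sequence_source",
    species="Human",
    input_no_track="path/to/no_track",
    results_folder="path/to/results",
)
```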