Difference between revisions of "Workflow"

From BioUML platform
(step-by-step instruction added)

Revision as of 17:36, 22 April 2013

This page or section is a stub. Please add more information here!

A workflow in bioinformatics can be defined as a series of computational or data manipulation steps. To compose and execute such sequences, a variety of special workflow management systems have been developed. All such systems are based on an abstract representation of how a computation proceeds in the form of a directed graph, where each node represents a task to be executed and edges represent either data flow or execution dependencies between tasks. Each system typically provides a visual front-end that allows the user to build and modify complex applications with little or no programming expertise.
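
The directed-graph view described above can be illustrated with a minimal Python sketch. It is not BioUML code: the task names and the dependency table are hypothetical, and the standard-library graphlib module stands in for whatever scheduling a real workflow system performs.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a task, each value is the set of
# tasks whose output it depends on (all names are illustrative only).
dependencies = {
    "normalize": {"import_data"},
    "filter_table": {"normalize"},
    "report": {"filter_table"},
}

def execution_order(deps):
    """Return one valid order in which the tasks can be executed."""
    return list(TopologicalSorter(deps).static_order())

order = execution_order(dependencies)
print(order)
```

A task appears in the result only after every task it depends on, which is the property a workflow engine needs before it can run the steps one after another.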

BioUML provides a workflow management system that is operated through a simple drag-and-drop interface. With BioUML-related products, users can either run pre-defined workflows or create their own for specific analysis purposes.


Pre-defined workflows

BioUML, and especially the geneXplain platform, facilitates standard analyses through a number of pre-composed workflows that chain together some of the most important modules. This allows users to start their first analyses right away, even before they have learned all the details of the platform.

With any workflow the following steps are normally taken:

  • importing data into the project (if necessary),
  • data normalization (if necessary),
  • selecting the appropriate workflow,
  • specifying the input file(s),
  • setting parameters,
  • specifying the output directory,
  • running the workflow,
  • viewing the output.

A few pre-defined workflows in BioUML workbench are available as example data (e.g. Data tab in navigation pane > data > Examples > ChIPMunk workflows). GeneXplain offers a somewhat richer choice of ready-made workflows (see the list of workflows below) with all the links launching them right from the start page.


Creating your own workflows

To create a new workflow in BioUML workbench, first go to the Data folder (or a subfolder within it) of the project in which the new workflow will be stored (e.g. Data tab in navigation pane > data > Collaboration > test project > Data folder). As soon as you click on the Data folder in the navigation pane, a number of operation icons appear in the navigation toolbar. The names of the operations are shown as tooltip text (just place the mouse pointer over an icon). Click on the New workflow option.

[Figure: Creating a new workflow file]

To create a workflow in geneXplain, go to the Start page and click Create your own workflow under the list of pre-defined workflow groups.

At this point you will be asked to specify the name of the new workflow in the pop-up dialog box. Here you can also choose a different location to save the workflow by navigating the directory tree and creating (sub)folders.

When you press Ok in the dialog box, a new tab opens in the workspace, where you can design a new workflow diagram. The workflow diagram shows the analysis functions connected by their input and output files. The resulting directed graph visualizes the sequence of analysis steps in the workflow. The diagram may also contain parameters that are to be defined by the user.


You can add nodes to the graph by dragging and dropping items directly from the Analyses tab of the navigation pane (Analyses tab > analyses > Methods > ...) or by using the toolbox within the tab. The toolbox contains icons for such types of nodes as analysis methods, analysis parameters, analysis expressions, cycles and analysis scripts, as well as for the Select tool, directed edges, notes and note edges. To add a node from the toolbox, click first on the icon, then within the workspace, set the parameters in the Create new node dialog box and press Ok; the node will appear in the graph. To add an edge, click on the icon in the toolbox, specify the output and input nodes in the Create new edge dialog box and press Ok.

Upon clicking on any component of the workflow you can see the information about this particular element in the operations field below.

Now, let's consider the following step-by-step example of creating a simple workflow for filtering table data.


Six steps to compose a simple workflow

This page or section is under construction right now.

Step 1. Suppose you have specified the name and directory for your new workflow and its tab is active in the workspace. The first step is to add the analysis function to the workflow diagram. You can either drag and drop the Filter table method from the subdirectory Methods/Data in the Analyses tab of the navigation pane, or add the analysis-method item from the toolbox and choose Data/Filter table from the drop-down list of the Create new node dialog box. The item will appear on the diagram as a light blue rectangle labelled "Filter table".

Step 2. To create the input table, click on the green element in the toolbar, place the cursor in the workspace where you would like to put this element, and click. A new window Create new node will pop up, where you define the parameters of the element:

  • Name: the title of the element.
  • Type: select “Data element” for any objects like tables.
  • Default value: here you can type the full folder path where the table is located. You can also use global variables, like “$project$”, that already contain the full path (by clicking on the “…” button you can access all the global variables defined for this workflow).
  • Rank (sort order): this number gives the position of this input element in the list of all elements shown when the workflow is started.
  • Role: "Input", since we are using this element for inputting a table into the workflow.

The selected item will appear on the diagram as a light green arrow-like pentagon labelled according to the Name parameter.
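
To make the role of these fields more concrete, here is a small Python sketch that models an input element as a plain record and orders several of them by Rank. The class name WorkflowParameter, the second element "Annotation" and the default value "$project$/Data" are assumptions made for illustration; they are not part of the BioUML API.

```python
from dataclasses import dataclass

@dataclass
class WorkflowParameter:
    """Hypothetical record mirroring the fields of the Create new node dialog."""
    name: str
    type: str
    default_value: str
    rank: int
    role: str

# Two illustrative input elements, deliberately listed out of order.
params = [
    WorkflowParameter("Annotation", "Data element", "$project$/Data", 2, "Input"),
    WorkflowParameter("Table", "Data element", "$project$/Data", 1, "Input"),
]

# Rank decides the position of each element in the start-up list.
ordered = sorted(params, key=lambda p: p.rank)
names = [p.name for p in ordered]
print(names)
```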

Step 3. To join elements on the diagram, click on the arrow symbol in the toolbar; a new window Create new edge will pop up. Click on the Table element in the workspace to select it as the Input node for this edge. Similarly, click on the left table symbol in the Filter table element to select it as the Output node of the new edge. After you press OK, a new connection (edge) appears.

Step 4. In the same way, you can now create an output table element on the diagram by selecting the yellow element in the toolbar (yellow because it is going to be an intermediate table for further use in the next steps of the workflow) and connecting it with the output icon of the Filter table element. In the Expression field you can now use a new global variable “$Table$”, which will contain, during the run of the workflow, the name of the table you have entered. In this case we create the name of the future output table, “$Table$ filtered”, by appending “filtered” to the name of the input table. We have now created one step of the workflow.
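
The effect of an expression such as “$Table$ filtered” can be sketched in a few lines of Python. This only illustrates the idea of placeholder substitution; the function expand and the sample table name are hypothetical, and the way BioUML actually evaluates expressions may differ.

```python
import re

def expand(expression, variables):
    """Replace each $name$ placeholder with its value from `variables`."""
    def lookup(match):
        name = match.group(1)
        # Leave unknown placeholders untouched rather than guessing a value.
        return variables.get(name, match.group(0))
    return re.sub(r"\$(\w+)\$", lookup, expression)

# Hypothetical run: the user selected a table named "expression_data".
variables = {"Table": "expression_data"}
output_name = expand("$Table$ filtered", variables)
print(output_name)
```

Because the output name is derived from the input at run time, the same workflow can be reused with any input table without editing the diagram.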

Step 5. To filter data, we have to define a filtering condition. Create a new element Filtering condition (yellow element in the toolbar) of the simple String type, and enter the filtering condition “Score > 2” in its Expression field. This element must now be connected to the analysis function Filter table so that the condition is applied at this step of the workflow. To do that, first click on the Filter table element and open its parameters in the operations field. Then click on the field “Filtering condition” (1) in the parameter list to select it (a blue background indicates that the field is selected). Click on the Bind property to variable button (2) in the toolbar of the operations field. Finally, move the cursor to the workspace and click on the Filtering condition element on the diagram (3).

The Filtering condition element is now bound to the corresponding parameter of the Filter table function.

Step 6. The workflow is now ready to be executed. To start the workflow, click on the Run workflow button in the toolbar of the operations field. In the pop-up dialog “Workflow parameters”, specify the input table: navigate to the folder with your tables, select a table that has a column “Score”, and press OK. The workflow will be executed, and a new table named after the input table with the suffix “filtered” will be created in the same folder as the input table.
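
What the finished workflow computes can be approximated outside the platform with a short Python sketch. It assumes a table held as a list of rows and a condition of the simple form "column operator number"; the expression syntax that Filter table really accepts is not described here, and the sample rows and helper names are invented for illustration.

```python
import operator

# Comparison operators understood by this simplified sketch.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
}

def parse_condition(condition):
    """Split a condition such as 'Score > 2' into column, comparison, threshold."""
    column, op, value = condition.split()
    return column, OPERATORS[op], float(value)

def filter_table(rows, condition):
    """Keep only the rows whose column value satisfies the condition."""
    column, compare, threshold = parse_condition(condition)
    return [row for row in rows if compare(row[column], threshold)]

# Hypothetical input table with the required "Score" column.
table = [
    {"ID": "gene_a", "Score": 3.5},
    {"ID": "gene_b", "Score": 1.2},
    {"ID": "gene_c", "Score": 2.0},
]
filtered = filter_table(table, "Score > 2")
print(filtered)
```

Only rows with a score strictly greater than 2 are kept, so a row with a score of exactly 2.0 is dropped.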


Complex workflows

This page or section is under construction right now.

Cycles and scripts

This page or section is under construction right now.

