Difference between revisions of "BioUML"

From BioUML platform
(Collaborative research)
m (Collaborative research)
 
(22 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{Stub}}
+
BioUML is an open source integrated Java platform for analysis of data generated by  [[wikipedia:omics|omics]] technologies using advanced tools of [[wikipedia:computational biology|computational biology]]. The BioUML vision is to provide a computational platform  to build [[virtual cell]] and [[wikipedia:Virtual Physiological Human|virtual physiological human]]. BioUML spans a comprehensive range of capabilities, including access to databases with experimental data, tools for formalized description of biological systems structure and functioning, as well as tools for their visualization, simulation, parameters fitting and analyses. Due to scripts ({{R}}, [[wikipedia:JavaScript|JavaScript]]) and workflow support it provides powerful possibilities for analyses of high-throughput data. The [[#See also|plug-in based architecture]] ([http://www.eclipse.org/ Eclipse] run time from IBM is used) allows to add new functionality using plug-ins.
 
+
BioUML is an open source integrated Java platform for analysis of data from omics sciences research and other advanced computational biology, for building the [[virtual cell]] and the [http://en.wikipedia.org/wiki/Virtual_Physiological_Human virtual physiological human]. It spans a comprehensive range of capabilities, including access to databases with experimental data, tools for formalized description of biological systems structure and functioning, as well as tools for their visualization, simulation, parameters fitting and analyses. Due to scripts (R, JavaScript) and workflow support it provides powerful possibilities for analyses of high-throughput data. The plug-in based architecture (Eclipse run time from IBM is used) allows to add new functionality using plug-ins.
+
  
 
The whole system aims at covering, with time, all areas of computational applications in bioinformatics and systems biology. The architecture is open, so that users' own scripts can be easily loaded into the system, and new modules can be programmed and added by any skilled person.
 
The whole system aims at covering, with time, all areas of computational applications in bioinformatics and systems biology. The architecture is open, so that users' own scripts can be easily loaded into the system, and new modules can be programmed and added by any skilled person.
The community is invited to contribute to the development, either as public-domain or as a commercial part of the platform. The developers are confident that this way a most powerful and needed system of tools can grow.
+
The community is invited to contribute to the development, either as public-domain or as a commercial part of the platform. [[BioUML team|The developers]] are confident that this way a most powerful and needed system of tools can grow.
 
   
 
   
The team also hopes that BioUML will contribute to creation of virtual cell, virtual physiological human and [http://en.wikipedia.org/wiki/Virtual_patient virtual patient], which would be extremely useful in medicine as a means of computational identification of the most effective therapeutic interventions and for prediction of potential outcomes of intended treatments for a given patient.
+
[[BioUML team|The team]] also hopes that BioUML will contribute to creation of virtual cell, virtual physiological human and [[wikipedia:virtual patient|virtual patient]], which would be extremely useful in medicine as a means of computational identification of the most effective therapeutic interventions and for prediction of potential outcomes of intended treatments for a given patient.
  
 
==Principles==
 
==Principles==
Line 14: Line 12:
 
[[File:Data flow in BioUML.png|thumb|Visual modeling within the data workflow in BioUML]]
 
[[File:Data flow in BioUML.png|thumb|Visual modeling within the data workflow in BioUML]]
  
Reconstruction of complex biological systems from a huge amount of experimental data requires a formal language that can be easily understood both by human and computer. It is known that graphical depiction of complex systems is the most suitable way of understanding their structure by human. Graphical notation allows human to completely and formally specify model so computer programs can analyze the model and simulate its behavior.
+
Reconstruction of complex biological systems from a huge amount of experimental data requires a formal language that can be easily understood both by the human and the computer. It is known that graphic depiction of complex systems is the most suitable way of understanding their structure by humans. [[Graphic notation]] allows humans to completely and formally specify model so computer programs can analyze the model and simulate its behavior.
  
 
This approach is widely used in engineering and computer science. Some examples are:
 
This approach is widely used in engineering and computer science. Some examples are:
Line 21: Line 19:
 
* UML (http://www.omg.org/uml/) - the most known graphical language for computer science.  
 
* UML (http://www.omg.org/uml/) - the most known graphical language for computer science.  
  
BioUML adopts the visual modeling approach for formal description and simulation of complex biological systems. Another distinctive feature of BioUML is a tight integration with databases on biological pathways, query engine allows user to find interacting components of the system and show results as an editable graph.  
+
BioUML adopts the visual modeling approach for formal description and simulation of complex biological systems.  
  
BioUML also fully exploits principles of [[modular modeling]].
+
Another distinctive feature of BioUML is a tight integration with databases on biological pathways, query engine allows user to find interacting components of the system and show results as an editable graph.  
  
 +
BioUML also fully exploits principles of [[modular modeling]].
  
 
===Meta-model===
 
===Meta-model===
 
[[File:Metamodel layers.png|thumb|System of two consecutive chemical reactions (a), its formal description using three meta model levels (b), and corresponding mathematical model (c), that can be generated automatically for system simulations]]
 
[[File:Metamodel layers.png|thumb|System of two consecutive chemical reactions (a), its formal description using three meta model levels (b), and corresponding mathematical model (c), that can be generated automatically for system simulations]]
  
The core of BioUML is meta-model. It provides an abstract layer (compartmentalized attributed graph) for comprehensive formal description of wide range of biological and other complex systems. The content of databases on biological pathways, SBML (Hucka M. et al., 2003) and CellML(Lloyd C.M. et al., 2004) models, as well as biological pathways in BioPAX format can be expressed in terms of the meta model and used by the BioUML workbench.
+
The core of BioUML is a meta-model. It provides an abstract layer (compartmentalized attributed graph) for comprehensive formal description of wide range of biological and other complex systems. The content of databases on biological pathways, [http://sbml.org/ SBML] (Hucka M. et al., 2003) and [http://www.cellml.org/ CellML](Lloyd C.M. et al., 2004) models, as well as biological pathways in [http://www.biopax.org/ BioPAX] format can be expressed in terms of the meta model and used by the BioUML workbench.
 
   
 
   
 
This formal description can be used both for visual depiction and editing of biological system structure and for automated code generation to simulate a model behavior.
 
This formal description can be used both for visual depiction and editing of biological system structure and for automated code generation to simulate a model behavior.
Line 41: Line 40:
 
BioUML supports the following mathematical elements: variable, formula,  equation, event, state and transition.  
 
BioUML supports the following mathematical elements: variable, formula,  equation, event, state and transition.  
  
The figure demonstrates how this approach is applied to modeling a system consisting of two consecutive chemical reactions. Here the graph nodes representing chemical substances are considered as variables and the corresponding graph edges contain right parts of corresponding differential equations. Using this information the BioUML workbench can generate MATLAB or Java code for model simulation.  
+
The figure demonstrates how this approach is applied to modeling a system consisting of two consecutive chemical reactions. Here the graph nodes representing chemical substances are considered as variables and the corresponding graph edges contain right parts of corresponding differential equations. Using this information the [[BioUML workbench]] can generate MATLAB or Java code for model simulation.  
  
  
Line 49: Line 48:
  
 
Detailed description of DML format is available at http://www.biouml.org/dml.shtml
 
Detailed description of DML format is available at http://www.biouml.org/dml.shtml
 +
  
 
===Reproducible research===
 
===Reproducible research===
{{draft}}
+
 
 +
Reproducible research is the common name for functionality for saving, representing and reproducing a set of data manipulation. BioUML provides special [[Project]] collections, [[workflows]] and [[Workflows#research diagram|research diagram]]s for this purpose.
  
 
===Collaborative research===
 
===Collaborative research===
{{draft}}
+
 
 +
Collaborative research is any research project, especially a large one involving various areas of expertise, that is carried out by at least two people. To enable collaborative research BioUML sustains the following functionality:
 +
 
 +
* [[BioStore#InvitingOthers|sharing projects with other users]],
 +
* [[collaborative diagram editing]],
 +
* access to [[diagram editing history]] (both session and version),
 +
* in-built [[User to user communication|group chat]].
  
 
==Architecture overview==
 
==Architecture overview==
 +
 
BioUML platform consists of 3 parts:
 
BioUML platform consists of 3 parts:
 
* [[BioUML server]] - provides access to data and analyses methods installed on the server side for BioUML clients (workbench and web edition) via the Internet.
 
* [[BioUML server]] - provides access to data and analyses methods installed on the server side for BioUML clients (workbench and web edition) via the Internet.
 
* [[BioUML workbench]] - Java application that can work standalone or as "thick" client for BioUML server.
 
* [[BioUML workbench]] - Java application that can work standalone or as "thick" client for BioUML server.
* [[BioUML web edition]] - "thin" client for BioUML server (you just need to start web browser) that provides most of functionality of BioUML workbench. It uses [http://en.wikipedia.org/wiki/Ajax_(programming) AJAX] and [http://en.wikipedia.org/wiki/Canvas_element HTML5 <canvas>] technology for visual modeling and interactive data editing.
+
* [[BioUML web edition]] - "thin" client for BioUML server (you just need to start web browser) that provides most of functionality of BioUML workbench. It uses [[wikipedia:Ajax (programming)|AJAX]] and [[wikipedia:Canvas element|HTML5 <canvas>]] technology for visual modeling and interactive data editing.
  
 
===Plug-in based architecture===
 
===Plug-in based architecture===
 
[[File:Plug-in based architecture.png|thumb|Plug-in based architecture scheme with extension points shown as sockets and plug-ins as plugs]]
 
[[File:Plug-in based architecture.png|thumb|Plug-in based architecture scheme with extension points shown as sockets and plug-ins as plugs]]
  
Plug-in based architecture provides extensibility of BioUML platform. This functionality is provided by [http://www.eclipse.org Eclipse] platform runtime kernel.  
+
The plug-in based architecture determines extensibility of the BioUML platform. This functionality is provided by [http://www.eclipse.org Eclipse] platform runtime kernel.  
  
 
The basic components of the plug-in based architecture are:
 
The basic components of the plug-in based architecture are:
* Plug-in - is the smallest unit of BioUML workbench function that can be developed and delivered separately into BioUML workbench. Plug-ins are coded in Java. A typical plug-in consists of Java code in a JAR library, some read-only files, and other resources such as images, message catalogs, native code libraries, etc. A plug-in is described in an XML manifest file, called plugin.xml. The parsed contents of plug-in manifest files are made available programmatically through a plug-in registry API provided by Eclipse runtime.
+
* '''[[:Category:Plugins|Plugin]]''', the smallest unit of [[BioUML workbench]] function that can be developed and delivered separately into the BioUML workbench. Plug-ins are coded in Java. A typical plug-in consists of Java code in a JAR library, some read-only files, and other resources such as images, message catalogs, native code libraries, etc. A plug-in is described in an XML manifest file, called plugin.xml. The parsed contents of plug-in manifest files are made available programmatically through a plug-in registry API provided by Eclipse runtime.
* Extension points are well-defined function points in the system where other plug-ins can contribute functionality.  
+
* '''[[:Category:Extension points|Extension points]]''', well-defined function points in the system where other plug-ins can contribute functionality.  
* Extension is a specific contribution to an extension point. Plug-ins can define their own extension points, so that other plug-ins can integrate tightly with them.
+
* '''Extension''', a specific contribution to an extension point. Plug-ins can define their own extension points, so that other plug-ins can integrate tightly with them.
  
== Features==
+
== See also ==
  
The following systems biology standards are applied in BioUML:
+
* [[Features]]
+
* [[BioUML_user_interface]]
* [http://sbml.org/ '''SBML''' - Systems Biology Markup Language].
+
* [[BioStore#InvitingOthers|Inviting other users to your project]]
BioUML supports SBML Level 1 versions 1-2, Level 2 versions 1-4 and Level 3 version 1. BioUML is the only simulator that has passed all the tests from the SBML test suite version 2.0 (test details).
+
* [[BioUML development history]]
* [http://www.sbgn.org/ '''SBGN''' - Systems Biology Graphic Notation].
+
* [[BioUML development roadmap]]
BioUML supports Process Diagrams as they are defined by SBGN version 1.0. 
+
* [http://biopax.org/ '''BioPAX''' - Biological Pathway Exchange].
+
BioUML can import data in BioPAX 2.0 format. Imported data can be stored as native BioPAX file, SQL or text database. 
+
* [http://www.psidev.info/node/60 '''PSI-MI''' - The Proteomics Standards Initiative Molecular Interaction XML format].
+
BioUML supports data in PSI-MI format.
+
* [http://www.geneontology.org/GO.format.obo-1_2.shtml '''OBO''' - Ontology Flat File Format].
+
BioUML can import ontologies in OBO 1.2 format. Imported data can be presented as a dependency diagram.
+
* [http://www.cellml.org/ '''CellML''' - Cell Markup Language]. BioUML can read and simulate biochemical models presented in CellML 1.0 format.
+
+
 
+
BioUML also supports '''JavaScript''' (script console, JavaScript editor, JavaScript debugger (BioUML workbench only), JavaScript preprocessor that makes it easy to embed R expressions), '''R''' (connection to R on a local or remote machine, conversion of BioUML data to R and saving of R results as BioUML data, R graphics support, R preprocessor for JavaScript) and '''SQL''' (SQL console, direct SQL access to analysis results tables).
+
 
+
 
+
It works with the main biological databases:
+
* catalogues: Ensembl, UniProt, ChEBI, GO
+
* pathways: KEGG, Reactome, EHMN, BioModels, SABIO-RK, TRANSPATH, EndoNet, BMOND
+
+
 
+
BioUML provides powerful search possibilities with such tools as:
+
* full text search ([http://lucene.apache.org/ Apache Lucene] is used),
+
* graph search - finds related pathway components and presents results as an editable graph.
+
+
BioUML combines
+
 
+
{| class="wikitable"
+
!a graph layout engine !!tools for visual modeling!!parameters fitting!!a genome browser
+
|-
+
|
+
* includes different layout algorithms:
+
** force directed layout,
+
** hierarchical layout,
+
** cross grid layout (Kato,M. et al., 2005: Automatic drawing of biological networks using cross cost),
+
** fast grid layout (Kaname, K., Masao, N. and Satoru, M., 2008: Fast grid layout algorithm for biological networks with sweep calculation);
+
* supports incremental graph layout;
+
* supports compartments;
+
* layout preview;
+
* possibility to reuse layout for similar diagrams;
+
|
+
* powerful diagram editor;
+
* virtual experiment - variations of diagram to simulate different experimental conditions, knock-outs, etc.;
+
* automated generation of optimized Java code for model simulation from corresponding pathway diagram;
+
* different solvers for differential equations:
+
** JVODE - a Java port of CVODE,
+
** RADAU IIA - (implicit Runge-Kutta method for stiff delay differential equations),
+
** Imex - (implicit Runge-Kutta method for stiff differential equations),
+
** Dormand-Prince - (explicit Runge-Kutta method),
+
** Euler (for debugging complex models);
+
* supports different model types:
+
** ODE - ordinary differential equations,
+
** DAE - differential algebraic equations,
+
** ODE/DAE with delay,
+
** 1D PDE (for blood flow simulation),
+
** hybrid models support (with events, states and transitions),
+
** hierarchical models;
+
* plots (using JFreeChart)
+
** time series,
+
** phase portrait;
+
 
+
|
+
* experimental data - time courses or steady states;
+
* experimental data - exact or relative values of substance or concentrations;
+
* multiexperiment fitting;
+
* global and local parameters for multiexperiment fitting;
+
* constraint support;
+
* different optimization methods:
+
** Adaptive Simulated Annealing,
+
** Cellular genetic algorithm,
+
** Evolution strategy (SRES),
+
** GLBSOLVE,
+
** Particle swarm optimization,
+
** Quadratic Hill-climbing;
+
* optimization and parallelization of computations;
+
* JavaScript API for parameters fitting;
+
|
+
* uses AJAX and HTML5 <canvas> technologies (BioUML web edition);
+
* interactive - dragging, semantic zoom;
+
* DAS support (Distributed Annotation System);
+
* tracks support:
+
**Ensembl tracks
+
**DAS tracks
+
**user-loaded BED/GFF/Wiggle files
+
|}
+
 
+
 
+
BioUML utilizes a wide variety of '''methods for data analyses''':
+
* supports a set of analysis methods,
+
* biosequence analysis,
+
* gene expression regulation modeling,
+
* model optimization,
+
* statistics,
+
* executing analysis from JavaScript,
+
+
and '''microarray analyses''':
+
* normalization,
+
* annotation,
+
* up and down identification,
+
* correlation analysis,
+
* hypergeometric meta-analysis,
+
* cluster analysis.
+
+
It also allows for '''workflows, reproducible research'''
+
* actions journal,
+
* Analysis,
+
* JavaScript,
+
* SQL requests,
+
* allows a set of actions to be presented in a research diagram,
+
* allows a workflow document to be built and executed,
+
+
and generating '''reports, templates'''
+
* different templates for representing data element info
+
* model reports
+
* Overview
+
* Reactions
+
* Parameters
+
* Variables
+
* ODE(model as differential equation system).
+

Latest revision as of 16:43, 9 October 2013

BioUML is an open source integrated Java platform for the analysis of data generated by omics technologies using advanced tools of computational biology. The BioUML vision is to provide a computational platform for building a virtual cell and a virtual physiological human. BioUML spans a comprehensive range of capabilities, including access to databases with experimental data, tools for the formalized description of the structure and functioning of biological systems, and tools for their visualization, simulation, parameter fitting and analysis. Thanks to scripting (R, JavaScript) and workflow support, it offers powerful means for analyzing high-throughput data. The plug-in based architecture (the Eclipse runtime from IBM is used) allows new functionality to be added as plug-ins.

The whole system aims at covering, with time, all areas of computational applications in bioinformatics and systems biology. The architecture is open, so that users' own scripts can be easily loaded into the system, and new modules can be programmed and added by any skilled person. The community is invited to contribute to the development, either as public-domain or as a commercial part of the platform. The developers are confident that this way a most powerful and needed system of tools can grow.

The team also hopes that BioUML will contribute to the creation of a virtual cell, a virtual physiological human and a virtual patient, which would be extremely useful in medicine as a means of computationally identifying the most effective therapeutic interventions and of predicting the potential outcomes of intended treatments for a given patient.


Principles

Visual modeling

Visual modeling within the data workflow in BioUML

Reconstructing complex biological systems from a huge amount of experimental data requires a formal language that can be easily understood by both humans and computers. Graphical depiction is widely regarded as the most suitable way for humans to understand the structure of complex systems. A graphic notation allows humans to specify a model completely and formally, so that computer programs can analyze the model and simulate its behavior.

This approach is widely used in engineering and computer science, for example:

  • UML (http://www.omg.org/uml/) - the best-known graphical language for computer science.

BioUML adopts the visual modeling approach for formal description and simulation of complex biological systems.

Another distinctive feature of BioUML is its tight integration with databases on biological pathways: a query engine lets the user find interacting components of the system and shows the results as an editable graph.

BioUML also fully exploits principles of modular modeling.

Meta-model

System of two consecutive chemical reactions (a), its formal description using three meta model levels (b), and corresponding mathematical model (c), that can be generated automatically for system simulations

The core of BioUML is a meta-model. It provides an abstract layer (a compartmentalized attributed graph) for the comprehensive formal description of a wide range of biological and other complex systems. The content of databases on biological pathways, SBML (Hucka M. et al., 2003) and CellML (Lloyd C.M. et al., 2004) models, as well as biological pathways in BioPAX format, can be expressed in terms of the meta-model and used by the BioUML workbench.

This formal description can be used both for visual depiction and editing of a biological system's structure and for automated code generation to simulate the model's behavior. The meta-model is problem-domain neutral and splits the system description into three interconnected levels:

  • graph structure - the system structure is described as a compartmentalized graph;
  • database level - each graph element can contain a reference to some database object;
  • mathematical model - any graph element can be an element of the mathematical model.
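The three levels can be pictured as optional layers attached to the same graph element. The following is only a minimal sketch under that reading: the class and field names (MetaModelSketch, Node, Edge, Diagram, databaseRef, formula) are hypothetical and are not taken from the BioUML code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the three meta-model levels; not BioUML's actual classes.
class MetaModelSketch {
    /** A graph node; the database and mathematical layers are optional. */
    static class Node {
        final String id;           // graph structure level
        final String compartment;  // graph structure level: enclosing compartment
        String databaseRef;        // database level (null when no reference is attached)
        String variable;           // mathematical level (null when not part of the model)

        Node(String id, String compartment) {
            this.id = id;
            this.compartment = compartment;
        }
    }

    /** A directed edge between two nodes, with the same optional layers. */
    static class Edge {
        final Node source;
        final Node target;
        String databaseRef;  // database level
        String formula;      // mathematical level: rate expression carried by the edge

        Edge(Node source, Node target) {
            this.source = source;
            this.target = target;
        }
    }

    static class Diagram {
        final List<Node> nodes = new ArrayList<>();
        final List<Edge> edges = new ArrayList<>();

        Node addNode(String id, String compartment) {
            Node node = new Node(id, compartment);
            node.variable = id;  // each substance becomes a model variable
            nodes.add(node);
            return node;
        }

        Edge addReaction(Node source, Node target, String formula) {
            Edge edge = new Edge(source, target);
            edge.formula = formula;
            edges.add(edge);
            return edge;
        }
    }

    /** Builds a small example: A -> B -> C inside one compartment. */
    static Diagram twoStepExample() {
        Diagram diagram = new Diagram();
        Node a = diagram.addNode("A", "cell");
        Node b = diagram.addNode("B", "cell");
        Node c = diagram.addNode("C", "cell");
        diagram.addReaction(a, b, "k1*A");
        diagram.addReaction(b, c, "k2*B");
        return diagram;
    }

    public static void main(String[] args) {
        for (Edge e : twoStepExample().edges) {
            System.out.println(e.source.id + " -> " + e.target.id + " : " + e.formula);
        }
    }
}
```

Keeping the database reference and the formula as nullable fields mirrors the wording above that an element "can" carry each layer; a real implementation would likely use richer types than plain strings.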


BioUML supports the following mathematical elements: variable, formula, equation, event, state and transition.

The figure demonstrates how this approach is applied to modeling a system consisting of two consecutive chemical reactions. Here the graph nodes representing chemical substances are treated as variables, and the corresponding graph edges contain the right-hand sides of the corresponding differential equations. Using this information, the BioUML workbench can generate MATLAB or Java code for model simulation.
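To make the idea concrete, here is a hand-written sketch of what simulation code for two consecutive reactions A -> B -> C might look like. It is not output of the BioUML code generator. Mass-action kinetics with rate constants k1 and k2 is an assumption, as are the class and method names, and a fixed-step explicit Euler scheme is used only because it is the simplest to read.

```java
// Hypothetical hand-written example; not code generated by BioUML.
class TwoStepReaction {
    /**
     * Right-hand sides for A -> B -> C, assuming mass-action kinetics:
     * dA/dt = -k1*A, dB/dt = k1*A - k2*B, dC/dt = k2*B.
     */
    static double[] derivatives(double[] y, double k1, double k2) {
        double v1 = k1 * y[0];  // rate of the first reaction (edge A -> B)
        double v2 = k2 * y[1];  // rate of the second reaction (edge B -> C)
        return new double[] { -v1, v1 - v2, v2 };
    }

    /** Fixed-step explicit Euler integration; returns the state after the given number of steps. */
    static double[] simulate(double[] y0, double k1, double k2, double h, int steps) {
        double[] y = y0.clone();
        for (int i = 0; i < steps; i++) {
            double[] dy = derivatives(y, k1, k2);
            for (int j = 0; j < y.length; j++) {
                y[j] += h * dy[j];
            }
        }
        return y;
    }

    public static void main(String[] args) {
        // Start with all material in A and integrate up to t = 1.
        double[] end = simulate(new double[] { 1.0, 0.0, 0.0 }, 1.0, 0.5, 0.001, 1000);
        System.out.printf("A=%.4f B=%.4f C=%.4f%n", end[0], end[1], end[2]);
    }
}
```

Each node contributes one state variable and each edge contributes one rate term, so the derivative vector is assembled by subtracting the rate from the source node and adding it to the target node. Because the three derivatives sum to zero, the total amount A + B + C stays constant during the run, which is a handy sanity check.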


A dedicated BioUML diagram markup language (DML) was developed to store a BioUML meta-model instance in XML format. A diagram description is divided into two parts:

  • graph structure - describes the location of diagram elements and contains references to the database objects associated with them;
  • executable model - stores the mathematical model associated with the graph.

Detailed description of DML format is available at http://www.biouml.org/dml.shtml


Reproducible research

Reproducible research is the common name for functionality that saves, represents and reproduces a set of data manipulations. BioUML provides special Project collections, workflows and research diagrams for this purpose.

Collaborative research

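Collaborative research is any research project, especially a large one involving various areas of expertise, that is carried out by at least two people. To enable collaborative research, BioUML provides the following functionality:

  • sharing projects with other users,
  • collaborative diagram editing,
  • access to diagram editing history (both session and version),
  • built-in group chat.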

Architecture overview

The BioUML platform consists of three parts:

  • BioUML server - provides access to data and analysis methods installed on the server side for BioUML clients (workbench and web edition) via the Internet.
  • BioUML workbench - a Java application that can work standalone or as a "thick" client for the BioUML server.
  • BioUML web edition - a "thin" client for the BioUML server (you just need to start a web browser) that provides most of the functionality of the BioUML workbench. It uses AJAX and HTML5 <canvas> technology for visual modeling and interactive data editing.

Plug-in based architecture

Plug-in based architecture scheme with extension points shown as sockets and plug-ins as plugs

The plug-in based architecture makes the BioUML platform extensible. This functionality is provided by the Eclipse platform runtime kernel.

The basic components of the plug-in based architecture are:

  • Plug-in, the smallest unit of BioUML workbench function that can be developed and delivered separately into the BioUML workbench. Plug-ins are coded in Java. A typical plug-in consists of Java code in a JAR library, some read-only files, and other resources such as images, message catalogs, native code libraries, etc. A plug-in is described in an XML manifest file called plugin.xml. The parsed contents of plug-in manifest files are made available programmatically through a plug-in registry API provided by the Eclipse runtime.
  • Extension point, a well-defined function point in the system where other plug-ins can contribute functionality.
  • Extension, a specific contribution to an extension point. Plug-ins can define their own extension points, so that other plug-ins can integrate tightly with them.
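The relationship between extension points and extensions can be illustrated with a toy registry. This is a plain-Java sketch of the concept only: it does not use or reproduce the Eclipse registry API, and the class name, method names and identifiers such as "example.solver" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the extension point / extension idea; not the Eclipse runtime API.
class ExtensionRegistrySketch {
    // extension point id -> ids of the extensions contributed to it
    private final Map<String, List<String>> contributions = new LinkedHashMap<>();

    /** A plug-in declares a point where other plug-ins may contribute. */
    void declareExtensionPoint(String pointId) {
        contributions.putIfAbsent(pointId, new ArrayList<>());
    }

    /** A plug-in contributes an extension to a previously declared point. */
    void contribute(String pointId, String extensionId) {
        List<String> extensions = contributions.get(pointId);
        if (extensions == null) {
            throw new IllegalArgumentException("Unknown extension point: " + pointId);
        }
        extensions.add(extensionId);
    }

    /** Read-only view of the extensions for a point; empty if the point is unknown. */
    List<String> extensionsFor(String pointId) {
        List<String> extensions = contributions.get(pointId);
        return extensions == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(extensions);
    }

    public static void main(String[] args) {
        ExtensionRegistrySketch registry = new ExtensionRegistrySketch();
        registry.declareExtensionPoint("example.solver");
        registry.contribute("example.solver", "example.plugin.euler");
        registry.contribute("example.solver", "example.plugin.rungeKutta");
        System.out.println(registry.extensionsFor("example.solver"));
    }
}
```

In the real platform the declarations come from each plug-in's plugin.xml manifest rather than from method calls, but the lookup pattern is the same: the host asks the registry which extensions exist for a given point and never needs to know the contributing plug-ins in advance.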

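See also

  • Features
  • BioUML user interface
  • Inviting other users to your project
  • BioUML development history
  • BioUML development roadmap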
