BioUML

From BioUML platform
Revision as of 23:58, 25 April 2013 by Fedor Kolpakov (Talk | contribs)


BioUML is an open source integrated Java platform for the analysis of data from omics research and other areas of advanced computational biology, and for building the virtual cell and the virtual physiological human. It spans a comprehensive range of capabilities, including access to databases with experimental data, tools for the formalized description of the structure and functioning of biological systems, and tools for their visualization, simulation, parameter fitting and analysis. Its support for scripts (R, JavaScript) and workflows makes it well suited to the analysis of high-throughput data. The plug-in based architecture (built on the Eclipse runtime from IBM) allows new functionality to be added as plug-ins.

The whole system aims to cover, over time, all areas of computational applications in bioinformatics and systems biology. The architecture is open, so users can easily load their own scripts into the system, and any skilled person can program and add new modules. The community is invited to contribute to the development, with contributions forming either a public-domain or a commercial part of the platform. The developers are confident that a powerful and much-needed system of tools can grow in this way.

The team also hopes that BioUML will contribute to the creation of a virtual cell, a virtual physiological human and a virtual patient. These would be extremely useful in medicine as a means of computationally identifying the most effective therapeutic interventions and of predicting the potential outcomes of intended treatments for a given patient.


Principles

Visual modeling

Visual modeling within the data workflow in BioUML

Reconstruction of complex biological systems from a huge amount of experimental data requires a formal language that can be easily understood by both humans and computers. Graphical depiction is widely regarded as the most suitable way for humans to understand the structure of complex systems. A graphical notation allows a person to specify a model completely and formally, so that computer programs can analyze the model and simulate its behavior.

This approach is widely used in engineering and computer science.

BioUML adopts the visual modeling approach for the formal description and simulation of complex biological systems. Another distinctive feature of BioUML is its tight integration with databases on biological pathways: a query engine allows the user to find interacting components of the system and shows the results as an editable graph.

BioUML also fully exploits the principles of modular modeling.


Meta-model

System of two consecutive chemical reactions (a), its formal description using the three meta-model levels (b), and the corresponding mathematical model (c), which can be generated automatically for system simulations

The core of BioUML is its meta-model. It provides an abstract layer (a compartmentalized attributed graph) for the comprehensive formal description of a wide range of biological and other complex systems. The content of databases on biological pathways, SBML (Hucka M. et al., 2003) and CellML (Lloyd C.M. et al., 2004) models, as well as biological pathways in BioPAX format, can be expressed in terms of the meta-model and used by the BioUML workbench.

This formal description can be used both for visual depiction and editing of the structure of a biological system and for automated code generation to simulate the behavior of a model. The meta-model is problem domain neutral and splits the system description into 3 interconnected levels:

  • graph structure - the system structure is described as a compartmentalized graph;
  • database level - each graph element can contain a reference to some database object;
  • mathematical model - any graph element can be an element of the mathematical model.


BioUML supports the following mathematical elements: variable, formula, equation, event, state and transition.

The figure demonstrates how this approach is applied to modeling a system consisting of two consecutive chemical reactions. Here the graph nodes representing chemical substances are treated as variables, and the corresponding graph edges contain the right-hand sides of the corresponding differential equations. Using this information, the BioUML workbench can generate MATLAB or Java code for model simulation.
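
As a rough illustration of the idea, the two consecutive reactions described above can be sketched in a few lines of Python. This is not BioUML code or its generated output: the class names, the species names A, B and C, the placeholder database references, the rate constants and the assumption of first-order mass-action kinetics are all hypothetical. Nodes carry the graph and database levels, edges carry the rate terms, and the right-hand sides of the differential equations are assembled from the edges.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Species:
    """Graph node: a substance that also acts as a model variable."""
    name: str
    compartment: str               # graph structure level
    initial: float                 # mathematical model level
    db_ref: Optional[str] = None   # database level (placeholder identifiers)


@dataclass
class Reaction:
    """Graph edge: converts `source` into `target`."""
    source: str
    target: str
    k: float  # assumed first-order mass-action rate constant

    def rate(self, state):
        return self.k * state[self.source]


def derivatives(state, reactions):
    """Assemble d[species]/dt from the rate terms stored on the edges."""
    dydt = {name: 0.0 for name in state}
    for reaction in reactions:
        v = reaction.rate(state)
        dydt[reaction.source] -= v
        dydt[reaction.target] += v
    return dydt


def simulate(species, reactions, t_end, dt):
    """Integrate the model with a fixed-step explicit Euler scheme."""
    state = {s.name: s.initial for s in species}
    for _ in range(int(round(t_end / dt))):
        dydt = derivatives(state, reactions)
        for name in state:
            state[name] += dt * dydt[name]
    return state


species = [
    Species("A", "cell", 1.0, db_ref="example-db:A"),
    Species("B", "cell", 0.0, db_ref="example-db:B"),
    Species("C", "cell", 0.0, db_ref="example-db:C"),
]
# A -> B -> C, i.e. dA/dt = -k1*A, dB/dt = k1*A - k2*B, dC/dt = k2*B
reactions = [Reaction("A", "B", k=1.0), Reaction("B", "C", k=0.5)]
final = simulate(species, reactions, t_end=10.0, dt=0.001)
```

Because every reaction removes from its source exactly what it adds to its target, the total amount of substance is conserved, which is a convenient sanity check on the assembled equations.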


A dedicated BioUML diagram markup language (DML) was developed to store BioUML meta-model instances in XML format. A diagram description is divided into two parts:

  • graph structure - describes the location of diagram elements and contains references to the database objects associated with them;
  • executable model - stores the mathematical model associated with the graph.

A detailed description of the DML format is available at http://www.biouml.org/dml.shtml

Reproducible research

Collaborative research

Architecture overview

The BioUML platform consists of 3 parts:

  • BioUML server - provides BioUML clients (workbench and web edition) with access, via the Internet, to the data and analysis methods installed on the server side.
  • BioUML workbench - a Java application that can work standalone or as a "thick" client for the BioUML server.
  • BioUML web edition - a "thin" client for the BioUML server (only a web browser is needed) that provides most of the functionality of the BioUML workbench. It uses AJAX and HTML5 <canvas> technology for visual modeling and interactive data editing.

Plug-in based architecture

BioUML is built on the Eclipse platform runtime kernel, which supports plug-ins, together with a set of plug-ins that provide database access, diagram editing, and simulation of biological systems.

Plug-in based architecture scheme with extension points shown as sockets and plug-ins as plugs

The plug-in based architecture makes the BioUML platform extensible. Its basic components are:

  • A plug-in is the smallest unit of BioUML workbench functionality that can be developed and delivered separately. Plug-ins are coded in Java. A typical plug-in consists of Java code in a JAR library, some read-only files, and other resources such as images, message catalogs and native code libraries. A plug-in is described in an XML manifest file called plugin.xml. The parsed contents of plug-in manifest files are made available programmatically through a plug-in registry API provided by the Eclipse runtime.
  • Extension points are well-defined function points in the system where other plug-ins can contribute functionality.
  • An extension is a specific contribution to an extension point. Plug-ins can define their own extension points, so that other plug-ins can integrate tightly with them.
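
The relationship between extension points and extensions can be sketched with a toy registry. This is a conceptual Python illustration only, not the Eclipse registry API (in Eclipse the declarations live in plugin.xml); the class `ExtensionRegistry`, its methods and all identifiers such as `example.solvers` are hypothetical.

```python
class ExtensionRegistry:
    """Toy registry mapping extension point ids to contributed extensions."""

    def __init__(self):
        self._points = {}

    def define_point(self, point_id):
        """A plug-in declares a socket that others may plug into."""
        self._points.setdefault(point_id, [])

    def contribute(self, point_id, plugin_id, extension):
        """A plug-in contributes an extension to an existing point."""
        if point_id not in self._points:
            raise KeyError(f"unknown extension point: {point_id}")
        self._points[point_id].append((plugin_id, extension))

    def extensions(self, point_id):
        """Return (plugin_id, extension) pairs in contribution order."""
        return list(self._points.get(point_id, []))


registry = ExtensionRegistry()
registry.define_point("example.solvers")  # hypothetical extension point

# Two hypothetical plug-ins each contribute a factory to the same point.
registry.contribute("example.solvers", "example.plugin.euler", lambda: "Euler")
registry.contribute("example.solvers", "example.plugin.jvode", lambda: "JVODE")

# The host discovers contributions without knowing the plug-ins in advance.
solver_names = [factory() for _, factory in registry.extensions("example.solvers")]
```

The point of the design is that the host code only iterates over whatever has been contributed, so adding a new solver means shipping a new plug-in rather than changing the host.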

Features

The following systems biology standards are applied in BioUML:

BioUML supports SBML Level 1 versions 1-2, Level 2 versions 1-4 and Level 3 version 1. BioUML is the only simulator that has passed all the tests from the SBML test suite version 2.0.

BioUML supports Process Diagrams as they are defined by SBGN version 1.0.

BioUML can import data in BioPAX 2.0 format. Imported data can be stored as a native BioPAX file or in an SQL or text database.

BioUML supports data in PSI-MI format.

BioUML can import ontologies in OBO 1.2 format. Imported data can be presented as a dependency diagram.


BioUML also supports:

  • JavaScript: script console, JavaScript editor, JavaScript debugger (BioUML workbench only), and a JavaScript preprocessor that makes it easy to embed R expressions;
  • R: connection to R on a local or remote machine, conversion of BioUML data to R and saving of R results as BioUML data, R graphics support, and an R preprocessor for JavaScript;
  • SQL: SQL console and direct SQL access to analysis result tables.


It works with the main biological databases:

  • catalogues: Ensembl, UniProt, ChEBI, GO
  • pathways: KEGG, Reactome, EHMN, BioModels, SABIO-RK, TRANSPATH, EndoNet, BMOND


BioUML provides powerful search capabilities, with tools such as:

  • full text search (Apache Lucene is used),
  • graph search - finds related pathway components and presents results as an editable graph.
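
The idea behind graph search can be sketched as a bounded breadth-first traversal that returns the neighbourhood of a starting component as a new graph. This is an illustrative Python sketch, not BioUML's query engine: the function name, the adjacency-dict representation and the toy pathway are assumptions, and the traversal follows outgoing edges only.

```python
from collections import deque


def related_components(graph, start, max_steps):
    """Return the subgraph reachable from `start` within `max_steps` edges.

    `graph` maps each node to a list of the nodes it acts on.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_steps:
            continue  # do not expand beyond the requested radius
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    # Keep only edges whose both ends were reached.
    return {n: [m for m in graph.get(n, ()) if m in distance] for n in distance}


# Hypothetical toy pathway used only for illustration.
pathway = {
    "ligand": ["receptor"],
    "receptor": ["kinase"],
    "kinase": ["transcription_factor"],
    "transcription_factor": ["target_gene"],
    "target_gene": [],
}
subgraph = related_components(pathway, "receptor", max_steps=2)
```

The result is itself an adjacency structure, which is what makes it natural to present the search result as an editable graph rather than as a flat list of hits.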

BioUML combines a graph layout engine, tools for visual modeling, parameter fitting and a genome browser.

Graph layout engine:

  • includes different layout algorithms:
    • force directed layout,
    • hierarchical layout,
    • cross grid layout (Kato, M. et al., 2005: Automatic drawing of biological networks using cross cost),
    • fast grid layout (Kaname, K., Masao, N. and Satoru, M., 2008: Fast grid layout algorithm for biological networks with sweep calculation);
  • supports incremental graph layout;
  • supports compartments;
  • layout preview;
  • possibility to reuse layout for similar diagrams.

Tools for visual modeling:

  • powerful diagram editor;
  • virtual experiment - variations of a diagram to simulate different experimental conditions, knock-outs, etc.;
  • automated generation of optimized Java code for model simulation from the corresponding pathway diagram;
  • different solvers for differential equations:
    • JVODE - a Java port of CVODE,
    • RADAU IIA - implicit Runge-Kutta method for stiff delay differential equations,
    • Imex - implicit Runge-Kutta method for stiff differential equations,
    • Dormand-Prince - explicit Runge-Kutta method,
    • Euler - for debugging complex models;
  • supports different model types:
    • ODE - ordinary differential equations,
    • DAE - differential algebraic equations,
    • ODE/DAE with delay,
    • 1D PDE (for blood flow simulation),
    • hybrid models (with events, states and transitions),
    • hierarchical models;
  • plots (using JFreeChart):
    • time series,
    • phase portrait.

Parameter fitting:

  • experimental data - time courses or steady states;
  • experimental data - exact or relative values of substance or concentrations;
  • multiexperiment fitting;
  • global and local parameters for multiexperiment fitting;
  • constraint support;
  • different optimization methods:
    • Adaptive Simulated Annealing,
    • Cellular genetic algorithm,
    • Evolution strategy (SRES),
    • GLBSOLVE,
    • Particle swarm optimization,
    • Quadratic Hill-climbing;
  • optimization and parallelization of computations;
  • JavaScript API for parameter fitting.

Genome browser:

  • uses AJAX and HTML5 <canvas> technologies (BioUML web edition);
  • interactive - dragging, semantic zoom;
  • DAS support (Distributed Annotation System);
  • tracks support:
    • Ensembl tracks,
    • DAS tracks,
    • user-loaded BED/GFF/Wiggle files.
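
To make the parameter fitting items listed above more concrete, the sketch below fits a single rate constant to time-course data by minimizing the sum of squared residuals. It is a minimal Python illustration under stated assumptions, not BioUML's fitting API: the model (first-order decay), the synthetic noise-free data and all names are hypothetical, and the optimizer is a plain coarse-to-fine grid search, which is not one of the optimization methods listed above.

```python
import math


def model(k, t):
    """Assumed model: first-order decay from an initial value of 1."""
    return math.exp(-k * t)


def sum_of_squares(k, data):
    """Objective: squared distance between the model and the time course."""
    return sum((model(k, t) - y) ** 2 for t, y in data)


def fit_rate_constant(data, lo=0.0, hi=5.0, rounds=6, points=51):
    """Coarse-to-fine grid search over k in [lo, hi]."""
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        candidates = [lo + i * step for i in range(points)]
        best = min(candidates, key=lambda k: sum_of_squares(k, data))
        # Zoom in on the interval around the best grid point.
        lo, hi = max(best - step, 0.0), best + step
    return best


# Synthetic, noise-free "experimental" time course generated for the example.
TRUE_K = 0.7
data = [(t, math.exp(-TRUE_K * t)) for t in (0.0, 0.5, 1.0, 2.0, 4.0)]
k_fit = fit_rate_constant(data)
```

Multiexperiment fitting follows the same pattern with the objective summed over several data sets, where global parameters are shared between experiments and local parameters are fitted per experiment.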


BioUML provides a wide variety of methods for data analysis:

  • biosequence analysis,
  • gene expression regulation modeling,
  • model optimization,
  • statistics,
  • execution of analyses from JavaScript,

and for microarray analysis:

  • normalization,
  • annotation,
  • identification of up- and down-regulated genes,
  • correlation analysis,
  • hypergeometric meta-analysis,
  • cluster analysis.

It also supports workflows and reproducible research:

  • actions journal,
  • analyses,
  • JavaScript,
  • SQL requests,
  • presentation of a set of actions as a research diagram,
  • building and execution of workflow documents,
and generates reports and templates:

  • different templates for representing data element info;
  • model reports:
    • Overview,
    • Reactions,
    • Parameters,
    • Variables,
    • ODE (model as a system of differential equations).