Difference between revisions of "Convert table (analysis)"

From BioUML platform
(Mistaken file names update reverted)
(Class name added)

Revision as of 11:14, 31 May 2013

Analysis title: Convert table
Provider: Institute of Systems Biology
Class: ru.biosoft.analysis.TableConverter
Plugin: ru.biosoft.analysis (Common methods of data analysis plug-in)

Convert table identifiers using BioHub(s)

This analysis changes the type of identifiers in a table and converts the rows accordingly using a chain of BioHubs. A BioHub is a converter that can convert between two or more identifier types (for example, "Genes: Ensembl" into "Proteins: Ensembl"). If direct conversion between the two given types is impossible, the analysis builds the optimal chain of several BioHubs and applies them in sequence.
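The chain-building step can be pictured as a path search over identifier types. The sketch below is only an illustration, not the BioUML implementation: it assumes that "optimal" means "fewest BioHubs" and uses a breadth-first search. The hub names (hub_a, hub_b), the registry layout and the extra "Genes: Entrez" type are hypothetical.

```python
from collections import deque

# Hypothetical registry: (source type, target type, hub name).
# "Genes: Ensembl" and "Proteins: Ensembl" come from the text above;
# the hub names and the "Genes: Entrez" type are invented for illustration.
HUBS = [
    ("Genes: Entrez", "Genes: Ensembl", "hub_a"),
    ("Genes: Ensembl", "Proteins: Ensembl", "hub_b"),
]


def find_chain(source_type, target_type, hubs=HUBS):
    """Return the shortest list of hub names from source to target, or None.

    Assumption: the "optimal" chain is the one with the fewest hubs.
    """
    queue = deque([(source_type, [])])
    seen = {source_type}
    while queue:
        current, chain = queue.popleft()
        if current == target_type:
            return chain
        for src, dst, name in hubs:
            # Follow every hub that starts at the current type.
            if src == current and dst not in seen:
                seen.add(dst)
                queue.append((dst, chain + [name]))
    return None  # no chain of hubs connects the two types
```

With this registry, a direct conversion yields a one-element chain, while converting "Genes: Entrez" into "Proteins: Ensembl" requires both hubs applied one after the other.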

Note that several non-trivial situations might occur during conversion:

  • A single source ID matches several target IDs. In this case the source row is copied several times, one copy per target ID.
  • A source ID does not match any target ID. In this case the source row is removed from the result.
  • Several source IDs match a single target ID. In this case two options are available:
    • You have specified a main column. Of all matching source rows, only one is selected for the result, based on the specified aggregator. For example, if you specified 'maximum' as the aggregator, the source row with the maximal value in the main column is selected.
    • You have not specified a main column. All the corresponding source rows are merged together using merging rules. Non-trivial columns like 'Graph' are removed from the result. Text columns have all their values joined into a sorted comma-separated list with duplicates removed. Numerical columns are merged based on the selected aggregator. For example, if you select 'average' as the aggregator, the mean value appears in the result. If a source column has an integral type, some aggregators may change it to float.
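The cases above, for the variant without a main column, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual TableConverter code: the table is modelled as a plain dictionary, the function name convert_table and its parameters are hypothetical, and non-trivial column types such as 'Graph' are not handled.

```python
from statistics import mean


def convert_table(rows, mapping, aggregate=mean):
    """Convert row IDs and merge rows that collapse onto one target ID.

    rows:    {source_id: {column: value}}   (hypothetical table model)
    mapping: {source_id: [target_id, ...]}  (result of the hub chain)
    Returns (converted, unmatched).
    """
    grouped = {}    # target_id -> list of source rows mapped to it
    unmatched = {}  # source rows with no target ID
    for source_id, row in rows.items():
        targets = mapping.get(source_id, [])
        if not targets:
            unmatched[source_id] = row
            continue
        # One-to-many: the row is copied once per target ID.
        for target_id in targets:
            grouped.setdefault(target_id, []).append(row)

    converted = {}
    for target_id, group in grouped.items():
        merged = {}
        for column in group[0]:
            values = [r[column] for r in group]
            if all(isinstance(v, (int, float)) for v in values):
                # Numerical column: apply the selected aggregator.
                merged[column] = aggregate(values)
            else:
                # Text column: sorted, de-duplicated, comma-separated.
                merged[column] = ",".join(sorted({str(v) for v in values}))
        converted[target_id] = merged
    return converted, unmatched
```

For example, if source rows with scores 1 and 2 both map to the same target ID, the merged score is 1.5, which also shows how an integer column can become a float column after averaging.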

Parameters:

  • Input table – Data set to be converted
  • Column with IDs (expert) – Column to be used as source ID. Select (none) to use row IDs
  • Input type – Type of references in input table
  • Output type – Select type of identifiers for the resulting table
  • Species – Select human, mouse or rat species
  • Numerical value treatment rule – Select one of the rules to treat values in the numerical columns of the table when several rows are merged into a single one.
    For "average", "average w/o 20% outliers" and "sum", the selected rule is applied to all numerical columns of the table. For "minimum", "maximum" and "extreme", a new option appears below asking the user to select a "Leading column". The chosen rule is then applied to the values in the selected Leading column (e.g. the maximum value in the Leading column is computed among all the merged rows). All other numerical values are taken from the row that holds the selected value in the Leading column.
  • Leading column – Select the column with numerical values to apply one of the rules described above
  • Unmatched rows (expert) – Path to store unmatched rows of the table
  • Output table – Path to store the resulting table in the tree
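
The Leading column rules can be illustrated with a short sketch. It is not taken from the BioUML source: the function name pick_by_leading_column and the list-of-dictionaries row model are hypothetical, and the page does not define "extreme", so the code assumes it means the value with the largest absolute magnitude.

```python
def pick_by_leading_column(rows, leading, rule):
    """Pick the single row that represents a group of merged rows.

    rows:    list of {column: value} dictionaries (hypothetical model)
    leading: name of the Leading column
    rule:    "minimum", "maximum" or "extreme"
    """
    if rule == "minimum":
        return min(rows, key=lambda r: r[leading])
    if rule == "maximum":
        return max(rows, key=lambda r: r[leading])
    if rule == "extreme":
        # Assumption: "extreme" = largest absolute value.
        return max(rows, key=lambda r: abs(r[leading]))
    raise ValueError(f"rule {rule!r} does not use a Leading column")
```

Because the whole row is returned, every other column takes its value from the row selected in the Leading column, as described for these rules.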