Difference between revisions of "Correlation Analysis"

From BioUML platform
m (Protected "Correlation Analysis": Autogenerated page (‎[edit=sysop] (indefinite)))
(Class name added)
 
(9 intermediate revisions by one user not shown)
Line 1: Line 1:
=== Correlation analysis ===
+
;Analysis title
 +
:[[File:Statistics-Correlation-Analysis-icon.png]] Correlation Analysis
 +
;Provider
 +
:[[Institute of Systems Biology]]
 +
;Class
 +
:{{Class|ru.biosoft.analysis.CorrelationAnalysis}}
 +
;Plugin
 +
:[[Ru.biosoft.analysis (plugin)|ru.biosoft.analysis (Common methods of data analysis plug-in)]]
  
 +
==== Description ====
 +
'''Correlation analysis''' uses standard statistical methods to calculate the correlation between two data sets, which measures the statistical dependency between them. The input of the analysis is two data sets. The output is a table containing all possible correlations between them and their P-values. For example, if one input data set contains n samples and the other contains m samples, the result will contain n x m correlation values and n x m P-values. Keep in mind that the analysis may produce a large amount of data and take a long time to run, especially if the '''Calculate FDR''' flag is set to "true".
 
==== Parameters: ====
 
  
Line 9: Line 18:
 
** '''Table''' - a table data collection with control data stored in BioUML repository.
 
 
** '''Columns''' - the columns selected from the table for further analysis.
 
 +
*:Please note that, despite their names, the experiment and control data sets have equal meaning, since correlation is a symmetric function.
 
* '''Data source''' - data source for correlations (rows or columns from tables).
 
 
* '''Result type''' - the type of result representation: correlation matrix or triplets (id1, id2, correlation).
 
Line 14: Line 24:
 
** '''Pearson''' - Pearson correlation.
 
 
** '''Spearman''' - Spearman rank correlation (non-parametric).
 
* '''''P''-value threshold''' - thresold for ''P''-value (only elements with lower ''P''-value will be included in the results).
+
* '''''P''-value threshold''' - threshold for ''P''-value (only elements with lower ''P''-value will be included in the results).
 
* '''Outline boundaries''' - lower and upper boundaries for values from the input table. Outliers will be ignored.
 
 
* '''Output table''' - the path in BioUML repository where the result table will be stored. If a table with the specified path already exists it will be replaced.
 
Line 20: Line 30:
 
[[Category:Analyses]]
 
 
[[Category:Statistics (analyses group)]]
 
 +
[[Category:ISB analyses]]
 
[[Category:Autogenerated pages]]
 

Latest revision as of 11:15, 31 May 2013

Analysis title
Correlation Analysis
Provider
Institute of Systems Biology
Class
CorrelationAnalysis
Plugin
ru.biosoft.analysis (Common methods of data analysis plug-in)

Description

Correlation analysis uses standard statistical methods to calculate the correlation between two data sets, which measures the statistical dependency between them. The input of the analysis is two data sets. The output is a table containing all possible correlations between them and their P-values. For example, if one input data set contains n samples and the other contains m samples, the result will contain n x m correlation values and n x m P-values. Keep in mind that the analysis may produce a large amount of data and take a long time to run, especially if the Calculate FDR flag is set to "true".
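
To illustrate the n x m size of the output, the following Python sketch computes Pearson correlations between every pair of samples taken from two small data sets. It is not BioUML code: the function names (pearson, correlation_matrix) and the toy data are hypothetical, and P-values and FDR are left out for brevity.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(first, second):
    """Correlate every sample of `first` with every sample of `second`."""
    return [[pearson(x, y) for y in second] for x in first]

# Hypothetical toy data: n = 2 samples in the first set, m = 3 in the second.
experiment = [[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]]
control = [[2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0], [1.0, 3.0, 2.0, 4.0]]

matrix = correlation_matrix(experiment, control)
print(len(matrix), "x", len(matrix[0]))  # number of rows x number of columns
```

Because every sample is paired with every other, the amount of work and output grows with the product n x m, which is why large inputs can take a long time.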

Parameters:

  • Experiment data - experimental data for analysis.
    • Table - a table data collection with experimental data stored in the BioUML repository.
    • Columns - the columns selected from the table for further analysis.
  • Control data - control data for analysis.
    • Table - a table data collection with control data stored in the BioUML repository.
    • Columns - the columns selected from the table for further analysis.
    Please note that, despite their names, the experiment and control data sets have equal meaning, since correlation is a symmetric function.
  • Data source - data source for correlations (rows or columns from tables).
  • Result type - the type of result representation: correlation matrix or triplets (id1, id2, correlation).
  • Correlation type - the correlation method.
    • Pearson - Pearson correlation.
    • Spearman - Spearman rank correlation (non-parametric).
  • P-value threshold - threshold for P-value (only elements with lower P-value will be included in the results).
  • Outline boundaries - lower and upper boundaries for values from the input table. Outliers will be ignored.
  • Output table - the path in the BioUML repository where the result table will be stored. If a table with the specified path already exists, it will be replaced.
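
The Correlation type and Result type options above can be sketched in Python as follows. This is an illustration under stated assumptions, not the BioUML implementation: Spearman correlation is computed here as Pearson correlation of average ranks, and the helper names (ranks, spearman, to_triplets) and identifiers such as "g1" are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranked values."""
    return pearson(ranks(x), ranks(y))

def to_triplets(ids1, ids2, matrix):
    """Flatten a correlation matrix into (id1, id2, correlation) triplets."""
    return [(a, b, matrix[i][j])
            for i, a in enumerate(ids1)
            for j, b in enumerate(ids2)]

# A monotonic but non-linear relationship with one extreme value.
x = [1, 2, 3, 4, 100]
y = [1, 4, 9, 16, 25]
print(pearson(x, y), spearman(x, y))

print(to_triplets(["g1"], ["c1", "c2"], [[0.5, -0.2]]))
```

On this data the rank-based coefficient is unaffected by the extreme value 100, while the Pearson coefficient is pulled below 1. The triplet form holds the same values as the matrix, one row per pair.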