Correlation Analysis
From BioUML platform
Revision as of 15:07, 16 April 2013 by BioUML wiki Bot
Correlation analysis uses standard statistical methods to calculate the correlation between two data sets, which measures the statistical dependency between them. The input of the analysis is two data sets. The output is a table containing all possible correlations between them and their P-values. For example, if one input data set contains n samples and the other contains m samples, the result will contain n x m correlation values and n x m P-values. Keep in mind that the analysis may produce a huge amount of data and take quite a long time to run, especially if the Calculate FDR flag is set to "true".
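The n x m size of the result can be illustrated with a minimal Python sketch. This is not the BioUML implementation: the data, the series names and the `pearson` helper are hypothetical, and P-values are left out for brevity.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical input: n = 2 series in the first set, m = 3 in the second.
experiment = {
    "e1": [1.0, 2.0, 3.0, 4.0],
    "e2": [4.0, 1.0, 3.0, 2.0],
}
control = {
    "c1": [2.0, 4.0, 6.0, 8.0],
    "c2": [8.0, 6.0, 4.0, 2.0],
    "c3": [1.0, 3.0, 2.0, 5.0],
}

# One correlation per (experiment, control) pair: n x m values in total.
correlations = {
    (e, c): pearson(experiment[e], control[c])
    for e, c in product(experiment, control)
}

for (e, c), r in correlations.items():
    print(f"{e} vs {c}: {r:+.3f}")
```

Because every pair is evaluated, the number of results grows as the product of the two input sizes, which is why large inputs can be slow to process.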
Parameters:
- Experiment data - experimental data for analysis.
- Table - a table data collection with experimental data stored in the BioUML repository.
- Columns - the columns selected from the table for further analysis.
- Control data - control data for analysis.
- Table - a table data collection with control data stored in the BioUML repository.
- Columns - the columns selected from the table for further analysis.
Please note that, despite their names, the experiment and control data sets play equivalent roles, since correlation is a symmetric function.
- Data source - data source for correlations (rows or columns from tables).
- Result type - the type of result representation: correlation matrix or triplets (id1, id2, correlation).
- Correlation type - the correlation method.
- Pearson - Pearson correlation.
- Spearman - Spearman rank correlation (non-parametric).
- P-value threshold - threshold for the P-value (only elements with a lower P-value will be included in the results).
- Outline boundaries - lower and upper boundaries for values from the input table. Values outside these boundaries are treated as outliers and ignored.
- Output table - the path in the BioUML repository where the result table will be stored. If a table with the specified path already exists, it will be replaced.
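Several of the parameters above can be illustrated with a short Python sketch. It is only an approximation under stated assumptions, not the BioUML code: Spearman correlation is computed as Pearson correlation of ranks, outliers are assumed to be dropped pairwise when either value falls outside the boundaries, and all names and data are hypothetical. P-value filtering is omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_outliers(x, y, lower, upper):
    """Keep pairs whose values both lie in [lower, upper] (assumed behavior)."""
    kept = [(a, b) for a, b in zip(x, y)
            if lower <= a <= upper and lower <= b <= upper]
    return [a for a, _ in kept], [b for _, b in kept]


# Hypothetical series; the last pair lies outside the chosen boundaries.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0, -50.0]

x_in, y_in = drop_outliers(x, y, lower=-100.0, upper=100.0)
r_pearson = pearson(x_in, y_in)    # below 1: monotonic but not linear
r_spearman = spearman(x_in, y_in)  # the ranks agree exactly

# The same result as a matrix and as (id1, id2, correlation) triplets.
matrix = {"x": {"y": r_spearman}}
triplets = [(row_id, col_id, value)
            for row_id, row in matrix.items()
            for col_id, value in row.items()]
```

The example shows why the choice of correlation type matters: for a monotonic but non-linear relation, the rank-based Spearman coefficient is higher than the Pearson coefficient.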