<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="https://wiki.biouml.org/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
		<id>https://wiki.biouml.org/index.php?action=history&amp;feed=atom&amp;title=Site_prediction</id>
		<title>Site prediction - Revision history</title>
		<link rel="self" type="application/atom+xml" href="https://wiki.biouml.org/index.php?action=history&amp;feed=atom&amp;title=Site_prediction"/>
		<link rel="alternate" type="text/html" href="https://wiki.biouml.org/index.php?title=Site_prediction&amp;action=history"/>
		<updated>2026-05-03T17:00:08Z</updated>
		<subtitle>Revision history for this page on the wiki</subtitle>
		<generator>MediaWiki 1.20.3</generator>

	<entry>
		<id>https://wiki.biouml.org/index.php?title=Site_prediction&amp;diff=8408&amp;oldid=prev</id>
		<title>Semyonk@dote.ru: Created page with &quot;Prediction of TF-binding sites of given TF in whole genome or in given chromosome fragment or in ChIP-Seq dataset. Sites are predicted by different Position Weight Matrix (PWM...&quot;</title>
		<link rel="alternate" type="text/html" href="https://wiki.biouml.org/index.php?title=Site_prediction&amp;diff=8408&amp;oldid=prev"/>
				<updated>2019-05-02T15:33:42Z</updated>
		
		<summary type="html">&lt;p&gt;Created page with &amp;quot;Prediction of TF-binding sites of given TF in whole genome or in given chromosome fragment or in ChIP-Seq dataset. Sites are predicted by different Position Weight Matrix (PWM...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;Prediction of binding sites of a given transcription factor (TF) in the whole genome, in a given chromosome fragment, or in a ChIP-Seq dataset. Sites are predicted by different Position Weight Matrix (PWM) methods.&lt;br /&gt;
== Description ==&lt;br /&gt;
The following six PWM methods (models) are available:&lt;br /&gt;
# Given HOCOMOCO site models (Kulakovskiy et al, 2016). These models are available in the HOCOMOCO database and are located at &amp;quot;databases/HOCOMOCO v11/Data/PWM_HUMAN_mono_pval=0.0001&amp;quot;;&lt;br /&gt;
# MATCH models (Kel et al, 2003);&lt;br /&gt;
# Additive IPS (Individual Probability Score) models, or briefly, IPS models (Volkova et al, 2018);&lt;br /&gt;
# Multiplicative IPS models. These models can be reduced to equivalent additive IPS models by taking logarithms of matrix elements;&lt;br /&gt;
# Common additive models;&lt;br /&gt;
# Common multiplicative models.&lt;br /&gt;
&lt;br /&gt;
To define the common additive and multiplicative models, let the matrix MAT = (m&amp;lt;sub&amp;gt;ij&amp;lt;/sub&amp;gt;), i ∈ {A,C,G,T}, j=1,...,l, denote the given frequency matrix, where l is the length of sites. For this analysis we used the HOCOMOCO frequency matrices available in the HOCOMOCO database (Kulakovskiy et al, 2016). These matrices are located at &amp;quot;databases/HOCOMOCO v11/Data/PCM_HUMAN_mono/&amp;quot;.&lt;br /&gt;
To test an arbitrary DNA fragment S=(s&amp;lt;sub&amp;gt;1&amp;lt;/sub&amp;gt;,...,s&amp;lt;sub&amp;gt;l&amp;lt;/sub&amp;gt;), the common additive score x is determined in the standard way:&lt;br /&gt;
&amp;lt;div class=&amp;quot;center&amp;quot; style=&amp;quot;width: auto; margin-left: auto; margin-right: auto;&amp;quot;&amp;gt;x = Σ&amp;lt;sub&amp;gt;j=1,...,l&amp;lt;/sub&amp;gt;   score(j),&amp;lt;/div&amp;gt;&lt;br /&gt;
where the score(j),  j=1,…,l, are determined as follows:&lt;br /&gt;
&amp;lt;div class=&amp;quot;center&amp;quot; style=&amp;quot;width: auto; margin-left: auto; margin-right: auto;&amp;quot;&amp;gt;score(j) = {m&amp;lt;sub&amp;gt;Aj&amp;lt;/sub&amp;gt;, if s&amp;lt;sub&amp;gt;j&amp;lt;/sub&amp;gt;=A;    m&amp;lt;sub&amp;gt;Cj&amp;lt;/sub&amp;gt;, if s&amp;lt;sub&amp;gt;j&amp;lt;/sub&amp;gt;=C;   m&amp;lt;sub&amp;gt;Gj&amp;lt;/sub&amp;gt;, if s&amp;lt;sub&amp;gt;j&amp;lt;/sub&amp;gt;=G; m&amp;lt;sub&amp;gt;Tj&amp;lt;/sub&amp;gt;,   if s&amp;lt;sub&amp;gt;j&amp;lt;/sub&amp;gt;=T;}, j=1,…,l&amp;lt;/div&amp;gt;&lt;br /&gt;
The common multiplicative score y is determined by the formula:&lt;br /&gt;
&amp;lt;div class=&amp;quot;center&amp;quot; style=&amp;quot;width: auto; margin-left: auto; margin-right: auto;&amp;quot;&amp;gt;y = ∏&amp;lt;sub&amp;gt;j=1,...,l&amp;lt;/sub&amp;gt;   score(j).&amp;lt;/div&amp;gt;&lt;br /&gt;
If the calculated score (x or y) exceeds the pre-specified threshold, then the tested DNA fragment S is declared a predicted site.&lt;br /&gt;
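As a rough illustration of the two scores and the threshold rule described above, the following Python sketch uses a small made-up frequency matrix. The matrix values, the function names and the threshold are hypothetical and are not taken from HOCOMOCO or from the BioUML implementation.&lt;br /&gt;

```python
import math
from operator import gt

# Hypothetical frequency matrix for sites of length l = 3
# (rows A, C, G, T; one column per site position).
# The values are made up for illustration only.
MAT = {
    "A": [10.0, 1.0, 2.0],
    "C": [2.0, 12.0, 1.0],
    "G": [1.0, 1.0, 14.0],
    "T": [3.0, 2.0, 1.0],
}


def position_scores(mat, fragment):
    """Return score(j) = m[s_j][j] for every position j of the fragment."""
    return [mat[base][j] for j, base in enumerate(fragment)]


def additive_score(mat, fragment):
    """Common additive score x: sum of score(j) over all positions."""
    return sum(position_scores(mat, fragment))


def multiplicative_score(mat, fragment):
    """Common multiplicative score y: product of score(j) over all positions."""
    return math.prod(position_scores(mat, fragment))


def is_predicted_site(score, threshold):
    """A fragment is declared a site when its score exceeds the threshold."""
    return gt(score, threshold)


x = additive_score(MAT, "ACG")        # 10 + 12 + 14
y = multiplicative_score(MAT, "ACG")  # 10 * 12 * 14
print(x, y, is_predicted_site(x, 20.0))
```

With this toy matrix the fragment ACG gets x = 36 and y = 1680, so it would be reported as a site for an additive threshold of 20, whereas TTT (x = 6) would not.&lt;br /&gt;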
It is important to note that the common multiplicative model can be converted to an equivalent additive model by taking logarithms of the matrix elements, i.e.&lt;br /&gt;
&amp;lt;div class=&amp;quot;center&amp;quot; style=&amp;quot;width: auto; margin-left: auto; margin-right: auto;&amp;quot;&amp;gt;ln(y) = Σ&amp;lt;sub&amp;gt;j=1,...,l&amp;lt;/sub&amp;gt;   score*(j),&amp;lt;/div&amp;gt;&lt;br /&gt;
where the values score*(j),  j=1,…,l, are determined as follows:&lt;br /&gt;
&amp;lt;div class=&amp;quot;center&amp;quot; style=&amp;quot;width: auto; margin-left: auto; margin-right: auto;&amp;quot;&amp;gt;score*(j) = {ln(m&amp;lt;sub&amp;gt;Aj&amp;lt;/sub&amp;gt;), if s&amp;lt;sub&amp;gt;j&amp;lt;/sub&amp;gt;=A;   ln(m&amp;lt;sub&amp;gt;Cj&amp;lt;/sub&amp;gt;), if s&amp;lt;sub&amp;gt;j&amp;lt;/sub&amp;gt;=C;   ln(m&amp;lt;sub&amp;gt;Gj&amp;lt;/sub&amp;gt;), if s&amp;lt;sub&amp;gt;j&amp;lt;/sub&amp;gt;=G;   ln(m&amp;lt;sub&amp;gt;Tj&amp;lt;/sub&amp;gt;), if s&amp;lt;sub&amp;gt;j&amp;lt;/sub&amp;gt;=T}.&amp;lt;/div&amp;gt;&lt;br /&gt;
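The equivalence can be checked numerically. The sketch below is only an illustration and assumes that all matrix elements are strictly positive, since the logarithm is undefined for zero counts. The matrix values and function names are hypothetical.&lt;br /&gt;

```python
import math

# Hypothetical frequency matrix with strictly positive entries
# (rows A, C, G, T; one column per site position).
MAT = {
    "A": [10.0, 1.0, 2.0],
    "C": [2.0, 12.0, 1.0],
    "G": [1.0, 1.0, 14.0],
    "T": [3.0, 2.0, 1.0],
}


def log_matrix(mat):
    """Replace every matrix element m by ln(m)."""
    return {base: [math.log(v) for v in row] for base, row in mat.items()}


def multiplicative_score(mat, fragment):
    """Product of the matrix elements selected by the fragment."""
    return math.prod(mat[base][j] for j, base in enumerate(fragment))


def additive_score(mat, fragment):
    """Sum of the matrix elements selected by the fragment."""
    return sum(mat[base][j] for j, base in enumerate(fragment))


fragment = "ACG"
y = multiplicative_score(MAT, fragment)
log_y = additive_score(log_matrix(MAT), fragment)

# The additive score on the log-matrix equals ln(y) up to rounding error.
print(math.isclose(log_y, math.log(y)))
```

Because the logarithm is strictly increasing, comparing y with a positive threshold t gives the same decision as comparing ln(y) with ln(t).&lt;br /&gt;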
&lt;br /&gt;
&lt;br /&gt;
=== References ===&lt;br /&gt;
&lt;br /&gt;
Kulakovskiy,I.V., Vorontsov,I.E., Yevshin,I.S., Soboleva,A.V., Kasianov,A.S., Ashoor,H., Ba-Alawi,W., Bajic,V.B., Medvedeva,Y.A., Kolpakov,F.A. et al. (2016) HOCOMOCO: expansion and enhancement of the collection of transcription factor binding sites models. Nucleic Acids Res., 44, D116–D125.&lt;br /&gt;
&lt;br /&gt;
Kel,A.E., Gößling,E., Reuter,I., Cheremushkin,E., Kel-Margoulis,O.V. and Wingender,E. (2003) MATCH™: a tool for searching transcription factor binding sites in DNA sequences. Nucleic Acids Res., 31, 3576–3579.&lt;br /&gt;
&lt;br /&gt;
Volkova,O.A., Kondrakhin,Y.V., Kashapov,T.A. and Sharipov,R.N. (2018) Comparative analysis of protein-coding and long non-coding transcripts based on RNA sequence features. J. Bioinform. Comput. Biol., 16(2), 1840013. doi: 10.1142/S0219720018400139.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Analysis Parameters: ==&lt;br /&gt;
* '''Sequence set type''' – Select type of sequences;&lt;br /&gt;
** '''Available sequence types''':  1) Whole genome 2) Chromosome fragment  3) ChIP-Seq peaks from given track&lt;br /&gt;
* '''Sequences collection''' – Select a source of nucleotide sequences &lt;br /&gt;
** '''Sequences source''' – Select database to get sequences from or 'Custom' to specify sequences location manually &lt;br /&gt;
** '''Sequence collection''' – Specify path to folder containing sequences if 'Custom' sequences source is selected&lt;br /&gt;
* If '''Sequence set type''' = Chromosome fragment &lt;br /&gt;
** '''Chromosome name''' – Select chromosome name&lt;br /&gt;
** '''Start position''' – Type start position of chromosome fragment&lt;br /&gt;
** '''Finish position''' – Type finish position of chromosome fragment&lt;br /&gt;
* If '''Sequence set type''' =  ChIP-Seq peaks from given track&lt;br /&gt;
** '''Path to track''' – Select the path to a track with a ChIP-Seq dataset. For example, a track from the GTRD database can be selected, e.g. Path to track = databases/GTRD/Data/peaks/gem/PEAKS033057&lt;br /&gt;
* '''Site name''' – Type the name of the predicted sites&lt;br /&gt;
* '''Prediction models''' – Define prediction models. The user can define several prediction models.&lt;br /&gt;
** '''modelName''' – Type model name&lt;br /&gt;
** '''siteType''' –  Select site type; &lt;br /&gt;
*** '''Available site types:''' 1) Given site model 2) IPS model 3) Multiplicative IPS model 4) Common additive model 5) Common multiplicative model 6) MATCH model&lt;br /&gt;
** If '''siteType''' = Given site model&lt;br /&gt;
*** '''modelPath''' – Input the path to the given site prediction model. In particular, the user can select a given site model from the HOCOMOCO database, such as &amp;quot;databases/HOCOMOCO v11/Data/PWM_HUMAN_mono_pval=0.0001/CEBPA_HUMAN.H11MO.0.A&amp;quot;&lt;br /&gt;
** If '''siteType'''  ≠  Given site model&lt;br /&gt;
*** '''matrixPath''' – Input the path to the given frequency matrix. In particular, the user can select a given frequency matrix from the HOCOMOCO database, such as &amp;quot;databases/HOCOMOCO v11/Data/PCM_HUMAN_mono/CEBPA_HUMAN.H11MO.0.A&amp;quot;&lt;br /&gt;
*** '''threshold''' – Type the threshold&lt;br /&gt;
* '''The output track name''' – Type the output track name&lt;br /&gt;
* '''Path to output folder''' – Path to the output folder&lt;/div&gt;</summary>
		<author><name>Semyonk@dote.ru</name></author>	</entry>

	</feed>