ModuleSearcher - Search for Combinations of Transcription Factor Binding Sites
After running both "MotifScanner" and "Statistics" to identify significant
TFBS, we now want to investigate which transcription factors tend to
occur together. The transcriptional regulation of a metazoan gene depends on the
cooperative action of multiple transcription factors that bind to cis-regulatory
modules (CRMs). ModuleSearcher (Aerts et al., 2003) is
designed to find CRMs in a set of coexpressed or coregulated genes.
Note that you should first run a program like MotifScanner
to display all relevant TF sites.
Then choose "Motifs", "ModuleSearcher". In the window that appears (see below),
select the "Feature Source", i.e. which of the results of previous analyses
(such as MotifScanner sites) ModuleSearcher should scan. You may leave the
other parameters unchanged, as they are well suited for a first approach.
Here are some brief remarks on the options that can be selected.
"Algorithm" selects the search algorithm to use. Generally,
the Genetic Algorithm is faster in larger search spaces (i.e. combinations of 5 elements or more),
and the A* algorithm is faster in smaller search spaces. Both should provide the same (or very similar) results.
"Nr elements" defines the number of matrices in the module.
"Size" defines the maximum length in base pairs that a module may occupy
(the maximum sequence range covered by the TFs of the module). The useful
range is between 50 bp (stringent) and 1000 bp (loose).
"Allow overlap" specifies whether two binding sites in a module instance
are allowed to overlap.
"penalisation" penalises instances that contain fewer than the requested number of binding sites.
At first, ModuleSearcher returns only the best (highest-scoring)
combinations of TFs. To retrieve additional (second-best, third-best)
modules, exclude (mask) the TFs found in the first run by entering their
accessions (like "M00272-V$P53_02") in the field
"Exclude matrices (comma separated)" of the ModuleSearcher input window.
Repeat this process as often as you want. Please note that you should also
perform analyses in which you exclude only ONE of the TFs found in the first
run, in order to catch additional modules involving the other TF(s).
The algorithm can also return more than one top-scoring module.
Please be patient when using this service, as the computations may take a while.
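The exclusion strategy described above can be planned ahead of time. The following Python sketch lists the values one might type into the "Exclude matrices (comma separated)" field for the follow-up runs: one run that masks all matrices of the best module, plus one run per single matrix. The function name `exclusion_fields` and the placeholder accessions are hypothetical, and joining with a bare comma (no space) is an assumption about the field format.

```python
from typing import List


def exclusion_fields(module_matrices: List[str]) -> List[str]:
    """Return the exclusion strings for the follow-up runs.

    The first entry masks every matrix of the module; the remaining
    entries mask one matrix each, so that modules involving the other
    matrices can still be found.
    """
    fields = [",".join(module_matrices)]
    if len(module_matrices) > 1:
        fields.extend(module_matrices)
    return fields


# Placeholder accessions; replace them with the ones shown in TOUCAN.
for field in exclusion_fields(["MATRIX_A", "MATRIX_B"]):
    print(field)
```

Each printed line corresponds to one additional ModuleSearcher run.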
If ModuleSearcher finds no hits in the first run, you can change the
following parameters: lower the number of elements (the number of TF sites
that have to occur together; default: 5), increase the size (the sequence
stretch that includes the elements), e.g. to 500 bp (default: 200 bp), or
change the overlap or the penalisation settings.
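One way to relax the parameters systematically is to write down the settings to try, from strict to loose, before starting. The Python sketch below does this for the number of elements and the size, starting from the defaults quoted above (5 elements, 200 bp). The function `relaxation_schedule`, the lower bound of 2 elements and the size steps 200, 500 and 1000 bp are assumptions chosen for illustration; TOUCAN itself does not provide such a schedule.

```python
from typing import Iterator, Sequence, Tuple


def relaxation_schedule(nr_elements: int = 5, size: int = 200,
                        min_elements: int = 2,
                        sizes: Sequence[int] = (200, 500, 1000)
                        ) -> Iterator[Tuple[int, int]]:
    """Yield (nr_elements, size) pairs from the strictest to the loosest.

    For each number of elements, all sizes at or above the starting size
    are tried before the number of elements is lowered by one.
    """
    for n in range(nr_elements, min_elements - 1, -1):
        for s in sizes:
            if s >= size:
                yield n, s


for n, s in relaxation_schedule():
    print(f"Nr elements = {n}, Size = {s} bp")
```

Stop at the first setting that returns a module; looser settings are more likely to produce hits but are also less stringent.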
The modules will be displayed in TOUCAN, with names starting with the
prefix "Mod" in the left frame ("Feature List"). Again, if you want to
display only the modules, highlight them and hit the "Enter" key.
The image below shows the result of ModuleSearcher when analyzing the
predicted TFBS of our cluster of 53 promoter sequences using the default
parameters. Two matrices (AP-2 and Sp-1) are predicted to form a module,
as marked by red circles.
If, as described above, we perform a second run in which we exclude these
two matrices (AP-2 and Sp-1) and extend the allowed size of the module
to 1000 bp, we obtain the result shown in the following figure. A module of
3 matrices is predicted, containing the TFs NF-kappaB, HFH3, and FREAC7.
The distances between the 3 factors are larger than in the example above.
Still, little is known about the relevance of distances between cooperating
transcription factors, and it remains a challenge to verify whether such a
combination of TFs indeed works in a functional context in vivo.
Please note that if we exclude only the AP-2 matrix, the "next best"
module would be Sp-1 plus FREAC7 (not shown).