Deciphering regulatory patterns from clusters of co-regulated genes:
A step-by-step tutorial using TOUCAN
Overview | Start at bench | Install TOUCAN | Get Sequences | Annotate | MotifScanner | Statistics | ModuleSearcher | MotifSampler | Return to bench | References
The TOUCAN program (Aerts et al., 2003) is a highly versatile package
for the analysis of regulatory elements in clusters of genes, which may
be derived from different species. TOUCAN provides access to a series of
tools and databases that are essential for this purpose. This tutorial
does not aim to describe all features comprehensively (as a program
manual would); instead, it provides a detailed step-by-step procedure
for addressing a very common biological question. As an example, we will
start "at the bench", where we performed microarray analyses to cluster
genes that are significantly upregulated in human endothelial cells by
the inflammatory mediator interleukin-1 over a time course of 6 hours.
We ask whether it is possible to delineate common regulatory elements in
clusters of genes that show a common behavior upon stimulation.
Several individual steps are needed for this purpose, starting with the
automated batch extraction of promoter sequences from genome databases
("Get Sequences"). We will also see how to generate an annotation table
for the genes of interest ("Annotate"). Then, we will outline the
strategy to predict transcription factor binding sites (TFBS) in this
set of promoters ("MotifScanner"), perform statistical analyses to
identify over-represented TFBS ("Statistics"), and screen for
significant combinations, or "modules", of TFBS ("ModuleSearcher").
Finally, we will ask whether it is possible to predict potential novel
regulatory elements via de novo screening for over-represented sequence
motifs ("MotifSampler"). At the end, we will return to the bench to
verify or support our hypotheses.
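
To make the idea of an "over-represented" TFBS in the Statistics step more concrete, the following minimal Python sketch computes a binomial tail probability: the chance of finding a predicted site in at least k of n promoters if each promoter carried the site independently at some background rate. The function name, the counts, and the background rate are hypothetical values chosen for illustration only; this is a generic over-representation test and is not claimed to be the exact calculation that TOUCAN performs.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Probability of at least k successes in n independent trials,
    each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers, for illustration only (not taken from the tutorial data):
n_promoters = 20        # promoters in the co-regulated cluster
n_with_site = 12        # promoters with at least one predicted site
background_rate = 0.25  # assumed fraction of background promoters with a site

p_value = binomial_tail(n_with_site, n_promoters, background_rate)
print(f"P(X >= {n_with_site} of {n_promoters}) = {p_value:.2e}")
```

A small tail probability suggests that the site occurs in the cluster more often than the assumed background rate would predict. When many TFBS are tested at once, the resulting p-values should additionally be corrected for multiple testing.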