   
ModuleSearcher - Search for Combinations of Transcription Factor Binding Sites

After running both "MotifScanner" and "Statistics" to identify significant TFBS, we now want to investigate which transcription factors tend to occur together. The transcriptional regulation of a metazoan gene depends on the cooperative action of multiple transcription factors that bind to cis-regulatory modules (CRMs). ModuleSearcher (Aerts et al., 2003) is designed to find CRMs in a set of coexpressed or coregulated genes. Note that you should first run a program like MotifScanner to display all relevant TF sites. Then choose "Motifs", "ModuleSearcher". In the window that appears (see below), select the "Feature Source", i.e. which results of previous analyses (such as MotifScanner sites) ModuleSearcher should scan. You may leave the other parameters unchanged, as the defaults are well suited for a first approach.

Here are some brief remarks on the available options. "Algorithm" selects the search algorithm to use. Generally, the Genetic Algorithm is faster in larger search spaces (i.e. combinations of 5 elements or more), and the A* algorithm is faster in smaller search spaces. Both should provide the same (or very similar) results. "Nr elements" defines the number of matrices in the module. "Size" defines the maximum length in base pairs that a module may occupy (the maximum sequence range covered by the TFs of the module). The useful range is between 50 bp (stringent) and 1000 bp (loose). "Allow overlap" specifies whether two binding sites in a module instance are allowed to overlap. "penalisation" 'punishes' instances that contain fewer than the requested number of binding sites.

At first, ModuleSearcher only returns the best (highest scoring) combination of TFs. To retrieve additional (second, third best) modules, exclude (mask) the TFs found in the first run by entering their accessions (like "M00272-V$P53_02") in the field "Exclude matrices (comma separated)" of the ModuleSearcher input window. Repeat this process as often as you want. Please note that you should also perform analyses in which you exclude only ONE of the TFs found in the first run, in order to catch additional modules that involve the other TF(s). The algorithm can also return more than one top-scoring module. Please be patient when using this service, as the computations may take a while.
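To make the "Size", "Allow overlap" and "penalisation" options more concrete, here is a minimal Python sketch of how such constraints could be checked for one candidate module instance. It is only an illustration under stated assumptions: the `Site` record, the function names, and the simple "sum of site scores minus a penalty per missing element" score are all hypothetical and are not taken from ModuleSearcher, whose actual scoring may differ.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Site:
    """A predicted binding site (hypothetical record, not a TOUCAN type)."""
    matrix: str   # matrix name or accession
    start: int    # start position in bp, inclusive
    end: int      # end position in bp, exclusive
    score: float  # site score from the scanning step


def overlaps(a: Site, b: Site) -> bool:
    """True if the two sites share at least one base pair."""
    return a.start < b.end and b.start < a.end


def is_valid_instance(sites, max_size: int, allow_overlap: bool) -> bool:
    """Check the 'Size' and 'Allow overlap' constraints for one instance."""
    span = max(s.end for s in sites) - min(s.start for s in sites)
    if span > max_size:
        return False
    if not allow_overlap:
        return not any(overlaps(a, b) for a, b in combinations(sites, 2))
    return True


def instance_score(sites, n_elements: int, penalty: float) -> float:
    """Toy score (an assumption): sum of site scores, minus a fixed
    penalty for every element missing from the requested number."""
    missing = max(0, n_elements - len(sites))
    return sum(s.score for s in sites) - penalty * missing


if __name__ == "__main__":
    a = Site("AP-2", 10, 20, 0.9)
    b = Site("Sp-1", 15, 25, 0.8)
    print(is_valid_instance([a, b], max_size=200, allow_overlap=True))
    print(is_valid_instance([a, b], max_size=200, allow_overlap=False))
    print(instance_score([a, b], n_elements=3, penalty=0.5))
```

In this sketch, a smaller `max_size` rejects more candidate instances (stringent), while a larger one accepts sites that lie further apart (loose), which mirrors the 50 bp to 1000 bp range described above.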

If ModuleSearcher finds no hits in the first run, you can change the following parameters: lower the number of elements (the number of TF sites that have to occur together; default: 5), increase the size (the sequence stretch that contains the elements), e.g. to 500 bp (default: 200 bp), or change the overlap or penalisation settings.
         
ModuleSearcher1
       
The "modules" will be displayed in TOUCAN, with names starting with the prefix "Mod" in the left frame ("Feature List"). Again, if you want to display only the modules, highlight them and press the "Enter" key. The image below shows the result of ModuleSearcher when analyzing the predicted TFBS of our cluster of 53 promoter sequences, using the default parameters. It can be seen that 2 matrices (AP-2 and Sp-1) are predicted to form a module, as marked by red circles.
                      
ModuleSearcher2
     
If, as described above, we perform a second run in which we exclude these two matrices (AP-2 and Sp-1) and extend the allowed size of the module to 1000 bp, we obtain the result shown in the following figure. We can see that a module of 3 matrices is predicted, containing the TFs NF-kappaB, HFH3, and FREAC7. Obviously, the distances between the 3 factors are larger than in the example above. Little is known so far about the relevance of distances between cooperating transcription factors. Naturally, it is a challenge to verify whether such a combination of TFs indeed works in a functional context in vivo. Please note that if we exclude only the matrix of AP-2, the "next best" module would be Sp-1 plus FREAC7 (not shown).
    
ModuleSearcher3
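The strategy of repeated runs with excluded matrices can be planned systematically. The following Python sketch lists the exclusion sets to try after a first run: all TFs found together, then each TF on its own. The helper names `exclusion_plans` and `to_field` are hypothetical, and the strings "AP-2" and "Sp-1" are placeholders for the real matrix accessions, which are not given here. The sketch only prepares the text for the "Exclude matrices (comma separated)" field; it does not call TOUCAN or ModuleSearcher.

```python
def exclusion_plans(found):
    """Return the exclusion lists for follow-up runs: first all matrices
    found in the first run, then each matrix individually."""
    plans = [list(found)]
    if len(found) > 1:
        plans.extend([matrix] for matrix in found)
    return plans


def to_field(plan):
    """Format one exclusion list for a comma-separated input field."""
    return ",".join(plan)


if __name__ == "__main__":
    # Placeholder names; replace with the actual matrix accessions.
    first_run = ["AP-2", "Sp-1"]
    for plan in exclusion_plans(first_run):
        print(to_field(plan))
```

Each printed line corresponds to one follow-up run; in the example above, the line containing only the AP-2 placeholder is the run that yields Sp-1 plus FREAC7 as the "next best" module.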
            
