You can use this web interface to cluster your gene expression data. It is intended to find groups of genes that have similar expression profiles. To find these groups we use Adaptive Quality-Based Clustering, an algorithm designed by Frank De Smet. To submit your data, save them as a tab-delimited ASCII text file. The required format is described below.
If you have any comments or questions concerning this web site, please feel free to contact Gert Thijs. For specific questions about the clustering algorithm you can contact Frank De Smet.
If you use our software, please cite:
Frank De Smet, Janick Mathys, Kathleen Marchal, Gert Thijs, Bart De Moor and Yves Moreau. 2002. Adaptive Quality-based clustering of gene expression profiles. Bioinformatics, 18(6), 735-746.
Additional information accompanying this paper can be found here.
Data Format
Before you start using this clustering web server, please check the required format of the data file.
- Your data file should be a tab-delimited ASCII text file.
- All fields should be tab separated.
- All lines starting with a '#' are discarded. Any line that does not start with a '#' and does not contain measurements will corrupt the input.
- To obtain the best results you might log-transform your data. It is not necessary to normalize the data; this is done within the core of the algorithm.
- You can identify the genes in your data file with one or two columns.
- First column: primary identifier of the gene of interest: accession number, ...
- Optionally, second column: e.g. gene name
- All the other columns contain the expression levels as numerical values. If some values are missing, you can leave them blank or replace them with NaN ('not a number').
- If you have any questions about the data format, take a look at the example or feel free to contact us.
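The format rules above can be checked locally before uploading a file. The following is a minimal sketch in Python, not the parser used by the web server: the function name `parse_expression_lines`, the `has_gene_name` flag, and the sample identifiers are all hypothetical and chosen for illustration only. It assumes that every non-comment line must carry at least one expression column, and it treats an empty field or the text NaN (in any letter case, which is an assumption) as a missing value.

```python
import math


def parse_expression_lines(lines, has_gene_name=True):
    """Parse tab-delimited expression rows into (identifier, name, values) tuples.

    Hypothetical helper for checking a file before upload; it is not part of
    the clustering software itself.
    """
    n_id = 2 if has_gene_name else 1  # number of leading identifier columns
    rows = []
    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if line.startswith("#"):
            continue  # comment lines are discarded
        fields = line.split("\t")
        if len(fields) <= n_id:
            # Covers blank lines and any other line without measurements.
            raise ValueError(f"line {line_no}: no expression values found")
        identifier = fields[0]
        name = fields[1] if has_gene_name else None
        values = []
        for col, field in enumerate(fields[n_id:], start=n_id + 1):
            text = field.strip()
            if text == "" or text.lower() == "nan":
                values.append(math.nan)  # missing value
                continue
            try:
                values.append(float(text))
            except ValueError:
                raise ValueError(
                    f"line {line_no}, column {col}: {text!r} is not numeric"
                ) from None
        rows.append((identifier, name, values))
    return rows


if __name__ == "__main__":
    # Made-up sample rows with two identifier columns and three measurements.
    sample = [
        "# identifier\tname\tt1\tt2\tt3\n",
        "gene_1\tnameA\t0.5\t\tNaN\n",
        "gene_2\tnameB\t1.0\t2.0\t3.0\n",
    ]
    for identifier, name, values in parse_expression_lines(sample):
        print(identifier, name, values)
```

To check a real file, pass an open file handle instead of the sample list, and set `has_gene_name=False` when the file has only the primary identifier column. The sketch raises an error on the first offending line rather than skipping it, since a line without measurements is described above as corrupting the input.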
Example
- Here you can find an example of the data file with the first two columns as gene identifiers. An example of the results page showing all the clusters found in this data set with the parameters, MIN_NR_GENES = 3 and S = 0.95, can be found through this link. You can of course download the data and try the clustering software yourself.
The expression data used in this example were originally generated by "Reymond et al. (2000) Differential Gene Expression in Response to Mechanical Wounding and Insect Feeding in Arabidopsis. Plant Cell 12:707-720".
This page is maintained by Gert Thijs. Last update 2005/09/22.
Email: gert.thijs@esat.kuleuven.be
Copyright © 2001-2005, KULeuven.