
Starting Date : 01/11/2007
Title Research : Innovative Methods for Drug Predictions.
Promoter(s) : Bart De Moor
Short Abstract :
As a computational chemist, I will research the viability of new drug ligands with the aid of Declarative Languages and Artificial Intelligence (DTAI) and the Center for Drug Discovery Design (CD3). I will primarily apply statistical models and methods such as Partial Least Squares and Principal Component Analysis, together with optimization-based methods such as Support Vector Machines. The innovative methods I derive and use for drug prediction will be guided by contemporary, groundbreaking research on HIV in academia (KULeuven, Rega division, Prof. Jan Balzarini and Prof. Annemie Vandamme) and in industry (for example Tibotec). I will also use novel, cutting-edge software to computationally derive compounds that can be used in realistic chemotherapeutic applications, and chemically relevant data sources such as PubChem, PDB/PDBBind and Stanford's hivdb, among many others, to assist the drug discovery process. Occasionally I will also build molecular models using molecular modeling tools. My goal throughout is to develop drug ligands using innovative methods.
Title Research : Disease Gene identification through collaborative knowledge management and candidate gene prioritization
Promoter(s) : Yves Moreau
Short Abstract :
Biotechnologies now allow the detection of chromosomal rearrangements across the whole human genome, which is a crucial step in identifying genes responsible for congenital diseases. However, given the size of such chromosomal events, researchers still face lists of hundreds of genes that may be causing the investigated disease. Accurate candidate gene prioritization is therefore of utmost importance to help researchers focus on the best candidates. Besides improving and optimizing prioritization, the process of identifying the best candidates relies on what is known about the genetics of the disease of interest. The efficiency of this process can thus be increased by developing a new kind of knowledge management system that makes it possible, at the same time, (i) to gather and collaboratively annotate state-of-the-art knowledge on the genes and chromosome regions associated with different subtypes of the disease of interest, (ii) to prioritize and annotate candidate genes and (iii) to represent this information in various contexts such as gene networks.
Starting Date : 01/02/2009
Title Research : Biological knowledge bases using Wikis
Promoter(s) : Bart De Moor
Starting Date : 26/11/2009
Title Research : Molecular bar codes for analysis of clinical samples by next-generation sequencing
Promoter(s) : Bart De Moor
Starting Date : 01/11/2008
Title Research : Statistical Analysis of Array CGH Data
Promoter(s) : Yves Moreau, Joris Vermeesch
Short Abstract :
Chromosomal aberrations exist in human tumor cells as well as in the blastomeres of early human embryos. They are key causes of irregularities in later human development. Array CGH is a popular technique for detecting copy number variation. However, few studies have addressed the analysis of array CGH data derived from single cells. Moreover, amplification of single-cell DNA introduces larger variation and bias in the microarray measurements than genomic DNA from multiple cells, which makes chromosome segmentation harder than in the classical setting. The goal of this study is to analyze microarray data from single cells using different arrays (e.g. Agilent array and SNP array). We will investigate suitable preprocessing methods and algorithms to detect chromosomal aberrations in single cells with fewer false positives.
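As a rough illustration of the kind of preprocessing and detection step mentioned above, the sketch below smooths per-probe log2 ratios with a running median and then calls gains and losses by thresholding. This is not the method developed in the project: the function names, the window size and the threshold of 0.3 are all assumptions chosen for the example, and the data are made up.

```python
from statistics import median

def running_median(values, window=3):
    """Smooth a list of log2 ratios with a centered running median."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

def call_aberrations(log_ratios, threshold=0.3, window=3):
    """Label each probe as gain, loss or normal after smoothing.

    The threshold and window are illustrative assumptions, not tuned values.
    """
    calls = []
    for value in running_median(log_ratios, window):
        if value > threshold:
            calls.append("gain")
        elif value < -threshold:
            calls.append("loss")
        else:
            calls.append("normal")
    return calls

# Made-up probe values: one isolated noisy spike (index 2) and one
# four-probe region with a consistently raised ratio (indices 5 to 8).
log_ratios = [0.02, -0.05, 0.9, 0.01, 0.03, 0.55, 0.6, 0.58, 0.62, -0.03, 0.04]
calls = call_aberrations(log_ratios)
```

The running median suppresses the isolated spike, which is the kind of single-probe noise that would otherwise produce a false positive, while the contiguous raised region is still called as a gain.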
Title Research : Probabilistic algorithm for finding motifs in sets of orthologous sequences
Promoter(s) : Kathleen Marchal
Short Abstract :
Regulatory motifs are characterized as conserved sites in a non-functional background. In motif detection methods, conservation is typically quantified by overrepresentation, or by an evolutionary relation to a common ancestor. We develop an algorithm that searches for motifs in both spaces of conservation simultaneously. The method makes it possible to study the evolution of motifs.
Starting Date : 06/10/08
Title Research : Algorithms for Bayesian network modeling of high dimensional biological data
Promoter(s) : Bart De Moor
Short Abstract:
The availability of extensive, heterogeneous sources of relevant clinical and biological information poses challenges for reliable biomedical decision making. Considering all data pertaining to specific cases becomes increasingly difficult when clinical data is enhanced by high-dimensional information such as microarray data, proteomics data and, in the near future, even full genome sequences of patients. Bayesian networks can offer support in diagnosis, prognosis and drug response prediction by combining the strength of intuitive graphical models of causal relationships with the benefits of computationally intensive, accurate estimation of outcome probabilities based on evidence. The aim of this thesis is to investigate algorithms for efficient learning of relationships in domain knowledge, updating of probabilities based on observed data and versatile querying of outcome probabilities.
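The idea of updating an outcome probability based on evidence can be illustrated with the smallest possible Bayesian network, a single disease node with one observed marker node. The sketch below applies Bayes' rule directly; the probabilities are invented for illustration and the function name is hypothetical, so this shows the principle rather than any algorithm from the thesis.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a binary cause and one observed binary effect."""
    evidence = prior * likelihood_if_true + (1.0 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence

# Illustrative numbers only (not taken from any study).
p_disease = 0.10               # prior probability of disease
p_marker_given_disease = 0.80  # marker positive when disease is present
p_marker_given_healthy = 0.20  # marker positive when disease is absent

p_disease_given_marker = posterior(
    p_disease, p_marker_given_disease, p_marker_given_healthy
)
```

Observing a positive marker raises the probability of disease above its prior. Networks with many variables need dedicated inference algorithms, because enumerating all configurations as done here quickly becomes infeasible.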
Title Research : Data Integration Techniques for Molecular Biology Research
Short Abstract :
Bert is investigating possibilities for information integration and informatics integration in bioinformatics. He is especially interested in working with ontologies and web service technology for conceptual or qualitative data integration, and in statistical techniques to be used in algorithmic or quantitative data integration.
Title Research : The design of clinical decision support systems for breast cancer based on clinical and molecular data
Promoter(s) : Bart De Moor and Dirk Timmerman
Short Abstract :
My main topic is the use of support vector machines and kernel methods for the integration of multiple data sources, such as clinical data, microarray data and proteomic data, among others, in oncology and/or gynecology. Considering information from multiple levels in the genome will improve diagnosis, prognosis and the prediction of therapy response in cancer patients. Furthermore, I am responsible for the study design and data analysis, using advanced methods from biostatistics, in several clinical studies.
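One common way to integrate several data sources with kernel methods is to compute one kernel matrix per source over the same patients and combine them as a weighted sum, which is again a valid kernel when the weights are non-negative. The sketch below assumes linear kernels, equal weights and tiny made-up data; the function names are hypothetical and this is not necessarily the scheme used in this research.

```python
def linear_kernel(X):
    """Gram matrix of inner products between the rows (patients) of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def combine_kernels(kernels, weights):
    """Weighted sum of kernel matrices defined over the same patients."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for K, w in zip(kernels, weights))
             for j in range(n)]
            for i in range(n)]

# Three patients described by two made-up data sources.
clinical = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
expression = [[2.0], [1.0], [0.0]]

K = combine_kernels(
    [linear_kernel(clinical), linear_kernel(expression)],
    [0.5, 0.5],  # equal weights, an assumption for this example
)
```

The combined matrix `K` can then be passed to any kernel-based classifier that accepts a precomputed kernel. In practice the sources would usually be normalized first so that no single source dominates the sum.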
Starting Date : 01/10/2006
Title Research : Ovarian cancer
Starting Date : 01/01/2007
Title Research : Query-based biclustering of microarray data
Promoter(s) : Kathleen Marchal (promoter), Bart De Moor (co-promoter)
Short Abstract :
We want to develop a query-based biclustering methodology for microarray data. Starting from existing biclustering algorithms, we will evaluate how simple and, eventually, more complex queries can be incorporated in existing frameworks. A central aspect of our research is the application of the query-based biclustering algorithm to two research questions: 1) differential transcription regulation within operons; 2) expression divergence between paralogs.
Title Research : Mining hierarchical and modular genetic network structure : algorithms for heterogeneous data integration
Promoter(s) : Bart De Moor
Short Abstract :
The project aims at the development of probabilistic algorithms for unraveling the hierarchical and modular structure of genetic regulatory networks (biclustering algorithms, graphical probabilistic models, MCMC methods, etc.). In addition, it will focus on structured data integration and on the comparison of candidate frameworks (kernel-based, probabilistic clustering-based) for such integration of heterogeneous data sources. The approach in this doctoral project is primarily theoretical and algorithmic, but an evolution towards biological-clinical applications or specific implementations and web tools cannot be excluded.
Title Research : Detection of transcription regulatory motifs in Bacillus subtilis
Promoter(s) : Kathleen Marchal, Jos Vanderleyden (co-promoter)
Short Abstract :
Regulation of gene transcription is one of the key ways in which a cell regulates its internal and, in some cases, external environment. Revealing the regulatory networks is thus of great importance for the control of different diseases. The availability of new bacterial genome sequences has made it possible to investigate these bacteria at the genomic level. To this end, we intend to discover transcription regulatory sites (motifs) using comparative genomics. The approach is called phylogenetic footprinting and is based on the theory that genetic regulatory elements are preferentially retained during evolution. The predicted motifs will then be used, together with gene expression profiles, in a more integrative approach towards the delineation of transcriptional modules.
Title Research : Modelling clinical, microarray and proteomics data with Bayesian networks to study cancer.
Promoter(s) : Bart De Moor
Short Abstract :
Modelling clinical, microarray and proteomics data with Bayesian networks to study ovarian masses. Research interests include the following. First, Bayesian networks and extensions of Bayesian networks for medical decision support systems based on clinical, microarray or proteomics data, or a combination of these, with applications in oncology or gynecologic oncology. This includes the development of methods to integrate the above-mentioned data sources. Secondly, the use of Least Squares Support Vector Machines (LS-SVM) to classify microarray data (e.g. breast cancer, ovarian cancer) and the reverse engineering of genetic networks using Bayesian networks and their extensions. This also includes research into the use of publicly available data sources as prior information.
Promoter(s) : Yves Moreau
Starting Date : 06/07/2009
Title Research : Health Support Decision Systems
Promoter(s) : Bart De Moor
Short Abstract :
Hospitals deal with very diverse information flows, e.g. clinical data, imaging data, Array-CGH data, … Each of these information flows is stored on a different hospital information subsystem. The purpose of this research is to create a Health Decision Support System that integrates these information flows to derive models using machine-learning techniques, and to make predictions for incoming patients based on the derived models. When sufficiently tested, the models these predictions are based on can be used as "Clinical Guidelines", in effect resulting in a Policy Decision Support System.
Starting Date: 21/09/07
Title Research : Unraveling the intricate sRNA mediated transcriptional network in bacteria.
Promoter(s) : Kathleen Marchal
Short Abstract :
Small non-coding RNAs (sRNAs) are recognized as important regulators in all kingdoms of life. In bacteria, sRNAs usually regulate gene expression either by pairing with mRNAs and affecting their stability and/or translation, or by binding to proteins and modifying their activity. In Escherichia coli more than 80 sRNA genes have been identified. sRNAs are thought to play a crucial role in bacterial adaptation and virulence. While numerous in silico approaches have been developed for the detection of short non-coding RNAs in the genome, there has been a relative shortage of approaches for identifying the potential target genes of these sRNAs and studying their regulation. Therefore, with this study, we aim to develop and apply in silico methods 1) to study the transcriptional regulation of sRNAs and 2) to predict targets of sRNAs. Although our methods will be generic, we will mainly study E. coli as a test case, as there we can rely on a large body of experimental information and public data. Once the methods are optimized, we will apply them to characterize the sRNA-dependent networks in Salmonella typhimurium. This in silico work will extend our knowledge of the mechanism and evolution of these intriguing networks.
Starting Date : 01/01/2009
Title Research : Role of structural properties in transcription regulation
Promoter(s) : Kathleen Marchal, Kristof Engelen
Starting Date : 30/03/2009
Title Research : The integration of multiple data matrices and development of a unified framework for graph analysis algorithms.
Promoter(s) : Bart De Moor
Short Abstract :
i. The primary goal of this thesis is the extension of graph matrix algorithms and the notation of information theory to multiple data matrices. PCA and CCA analysis are first performed for two or more matrices (A, B, C, …) and interpreted in a Bayesian context. Data sources B, C, … are considered as a priori or additional information. Restricted SVD is subsequently interpreted in the context of Fiedler retrieval for several data matrices. The problem of early, intermediate and late integration is formalized. In early integration, an appropriate ranking algorithm is applied to the concatenated matrix. Intermediate integration corresponds to the weighted sum of the covariance matrices of A and B. Late integration implies the ranking of matrix A and B separately before merging the rankings based on order statistics. We will furthermore investigate under which circumstances early, intermediate and late ranking are preferred. Finally, the information in the matrices is linked to clusters, in order to identify the most relevant matrix or matrices.
ii. The secondary goal is the development of a unified framework for graph matrix algorithms. Some recent research papers have shown that PCA and SVD, k-means clustering, spectral bi-clustering and non-negative matrix factorization show similarities and are connected. A relationship between bipartite spectral graph partitioning and the SVD has also been discovered, while a similar correspondence holds for LSA and Fiedler retrieval. This allows LSA-related algorithms to be interpreted in the framework of spectral graph partitioning. As PCA solves a relaxed version of k-means clustering, dimensionality reduction and clustering are intimately related. Finally, the relationship between spectral clustering and PCA has been explored, with an interpretation in terms of random walks. We will analyze the properties of non-negative matrix factorization and its relation to k-means clustering and spectral clustering. In our work, we will check whether all these relations hold for multi-clustering or graph multi-partitioning tasks and whether they all generalize to kernel methods when generalized correlation or similarity matrices are used.
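The difference between early and late integration described under goal i. can be sketched in a few lines. In this minimal example the "ranking algorithm" is simply a sum of features, and the rankings are merged by mean rank position rather than by order statistics; both are simplifying assumptions, and all function names are hypothetical. Intermediate integration (a weighted sum of covariance matrices) is left out to keep the sketch short.

```python
def rank(scores):
    """Return item indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])

def early_integration(A, B, score):
    """Concatenate each item's rows from A and B, then rank once."""
    return rank([score(a + b) for a, b in zip(A, B)])

def late_integration(A, B, score):
    """Rank each source separately, then merge by mean rank position.

    Mean rank is a stand-in here for an order-statistics-based merge.
    """
    rank_a = rank([score(a) for a in A])
    rank_b = rank([score(b) for b in B])
    pos_a = {item: p for p, item in enumerate(rank_a)}
    pos_b = {item: p for p, item in enumerate(rank_b)}
    merged = [(pos_a[i] + pos_b[i]) / 2 for i in range(len(A))]
    return sorted(range(len(A)), key=lambda i: merged[i])

# Three items described by two made-up data matrices (rows are items).
A = [[0.9, 0.8], [0.1, 0.2], [0.5, 0.4]]
B = [[0.7], [0.9], [0.1]]

early = early_integration(A, B, sum)
late = late_integration(A, B, sum)
```

Early integration is sensitive to the relative scale of the sources, because the features are mixed before scoring. Late integration only uses rank positions, so each source contributes equally regardless of scale.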
Title Research : Clustering of scientific fields by integrating text mining and bibliometrics
Promoter(s) : B. De Moor, K. Debackere
Short Abstract :
Increasing dissemination of scientific and technological publications via the Internet, and their availability in large-scale bibliographic databases, has led to tremendous opportunities to improve classification and bibliometric cartography of science and technology. This metascience benefits from the continuous rise of computing power and the development of new algorithms. Paramount challenges still remain, however.
Accuracy of clustering and classification of scientific fields is enhanced by incorporation of algorithms and techniques from text mining and bibliometrics. Both textual and bibliometric approaches have advantages and intricacies, and both provide different views on the same interlinked corpus of scientific publications or patents. In addition to textual information in such documents, citations between them also constitute huge networks that yield additional information. We incorporate both points of view and improve on existing text-based and bibliometric methods for the mapping of science.
Title Research : Statistical and machine learning methods for genotype-phenotype correlation and association studies of copy number variations
Promoter(s) : Y. Moreau
Short Abstract :
The methodology for genome-wide association (GWA) studies has been developed mainly for single-nucleotide polymorphism (SNP) platforms, and encompasses both family- and population-based designs. Recently, copy-number variants (CNVs) have been found to account for a substantial part of genomic variance (Redon et al. 2006), but no suitable methodology is currently available for their use in GWA studies. The aim of this thesis is to extend the existing methodology and develop novel approaches for CNV analysis in GWA settings. Data from Leuven University Hospital will be used as case study data.
Title Research : Network Models for Genomic Datasets
Promoter(s) : Yves Moreau and Bart De Moor
Short Abstract :
The recent rise in genome-scale datasets has created the need for both principled integration and visualization of these datasets. A successful trend has been to represent relationships graphically, where nodes in the graph represent proteins and edges between them represent functional relationships evident in individual or combined genomic datasets. The challenge lies in choosing the right method to combine datasets and in choosing the appropriate subset of the protein interaction network to present for interpretation. Probabilistic integration strategies prove useful to account for the intrinsic differences in scale and reliability of the individual datasets. Moreover, probabilistic methods offer a framework in which to highlight particularly confident or relevant subnetworks, often by dynamically incorporating additional context-specific data.
Title Research : Development of a method for the reconstruction of regulatory modules by using heterogeneous data
Promoter(s) : Bart De Moor, Kathleen Marchal
Short Abstract :
Nowadays, high-throughput data representing diverse cellular processes (transcriptome, proteome, metabolome, …) are being generated. By combining these heterogeneous data sources correctly, the regulatory network can be reconstructed so that we can unravel the biological regulatory pathways. We will therefore first create a compendium of all available “omics” data. An integrative analysis of these data sources will allow us to identify de novo (i.e. without prior knowledge) the regulatory network modules in those organisms for which the data is available. Subsequently, we will check how this framework can be used as a generator of prior information for probabilistic network inference.
Title Research : Ab initio gene prioritization
Promoter(s) : Yves Moreau
Short Abstract :
The goal of this project is to develop new methodologies and tools for ab initio gene prioritization by genomic data fusion. Genetic studies and high-throughput genome-wide screens can identify genes and proteins that are candidate members for a biological process of interest (disease, pathway). The disadvantage of such screens is that they identify tens or hundreds of candidate genes. Aerts et al. developed gene prioritization methods that rank candidate genes based on their similarity to genes already associated with a disease or process, using multiple data sources (sequence, expression, regulation, annotation, literature, etc.) (Aerts et al., 2006, PMID: 16680138). However, this method cannot prioritize candidates if no similar genes can be identified a priori. This prevents the method from tackling truly innovative discoveries (when little is known about a disease). Currently, there are no well-established gene prioritization strategies without a set of test genes. The new strategy we propose consists in checking not only the expression of a candidate but also that of its "partner" genes in a gene network derived from multiple sources. A strong candidate should have many partners that are differentially expressed, meaning that it belongs to a disrupted expression "module". The method will be applied to congenital heart defect studies ongoing at CME-UZ.
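The core idea of scoring a candidate by the expression of its network partners can be sketched as follows. The scoring rule used here (the fraction of a candidate's partners that are differentially expressed) is an assumption made for illustration, as are the function names and the toy network; the project's actual scoring scheme is not specified in this abstract.

```python
def partner_score(gene, network, diff_expressed):
    """Fraction of a gene's network partners that are differentially expressed."""
    partners = network.get(gene, set())
    if not partners:
        return 0.0
    return len(partners & diff_expressed) / len(partners)

def prioritize(candidates, network, diff_expressed):
    """Order candidates from strongest to weakest partner score."""
    return sorted(
        candidates,
        key=lambda g: partner_score(g, network, diff_expressed),
        reverse=True,
    )

# Toy gene network (gene -> set of partner genes) and a made-up set of
# differentially expressed genes; all identifiers are placeholders.
network = {
    "G1": {"P1", "P2", "P3"},
    "G2": {"P2", "P4"},
    "G3": {"P5"},
}
diff_expressed = {"P1", "P2", "P3"}

ranking = prioritize(["G3", "G2", "G1"], network, diff_expressed)
```

No set of known disease genes is needed here: the ranking depends only on the network and on the expression data, which is what distinguishes this setting from similarity-based prioritization.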
Title Research : Kernel Based methods for microarray and mass spectrometry data analysis
Promoter(s) : B. De Moor, J. Suykens
Short Abstract :
Kernel learning methods are advanced and powerful techniques that allow the construction of non-linear models for classification and regression problems.
Microarray and mass spectrometry data sources measure the activity and/or expression of thousands of genes and proteins, respectively, on a given set of biological samples. Analysis of the information contained in such samples has become a crucial activity in cancer research during the last decade. However, these biological data sources commonly suffer from a large number of variables compared to the number of examples, a low signal-to-noise ratio, irrelevant variables, and the presence of missing values and outliers. Additionally, current methodologies are not yet well established and results are not always reproducible.
The goal of the proposed research is mainly the application of existing kernel-based methods and their subsequent adaptation to the areas of microarray and mass spectrometry data analysis. Topics include, among others, preprocessing, prediction/classification models, variable selection (gene selection or biomarker identification) and novelty detection. Model selection will play a central role in the construction of reliable and reproducible algorithms.
Title Research : Inferring networks based on probabilistic relational models
Promoter(s) : Bart De Moor, Kathleen Marchal
Short Abstract :
The project aims at developing an algorithmic platform for the inference of regulatory networks based on heterogeneous data sources. The PhD research will involve extending current basic implementations for the inference of transcriptional networks with other proteomics data sources and elaborating on the aspect of model-based inference.
Title Research : Study of the regulatory network involved in secondary metabolism in the filamentous fungus Mycosphaerella fijiensis based on a motif detection approach
Promoter(s) : Kathleen Marchal
Short Abstract :
The general objective of the PhD project is the identification of pathogenicity-related factors and the mechanism of transcriptional regulation in Mycosphaerella fijiensis, a filamentous plant pathogenic fungus. This fungus is of urgent importance since it is the causal agent of the devastating leaf streak disease (commonly called Black Sigatoka) of bananas and plantains worldwide.
Starting Date : 01/10/2009
Title Research : High-throughput mutation analysis via genomic data integration, genomic DNA capture and ultra-high-throughput sequencing: towards the unraveling of mutation networks
Promoter(s) : Yves Moreau
Short Abstract :
During the past decade, high-throughput technologies have caused a true revolution in the fields of molecular biology and clinical genetics. The focus of research has shifted from the analysis of single genes, or limited groups of genes, to a full genomic view. Processing this kind of heterogeneous data to identify genes causal for specific phenotypes is far from trivial, especially when dealing with disorders involving multiple genes. The sheer amount and the low signal-to-noise ratio of the data demand automatic and statistical approaches to find genes that are relevant to the biological process or pathology of interest.
Title Research : Detection of regulatory motifs and modules in eukaryotic sequences based on co-expression and orthology information.
Promoter(s) : Kathleen Marchal
Short Abstract :
Unravelling regulatory motifs by detecting overrepresented motifs in sets of co-expressed genes is still prone to many false positives because of the low signal-to-noise ratio. Novel probabilistic motif detection methods combine the co-regulation space with the orthologous space by incorporating phylogenetic models. We will benchmark these advanced motif detection tools and use this knowledge to implement an improved motif detection tool based on Gibbs sampling (M. Claeys) and to improve the analysis of motifs in vertebrate datasets (in collaboration with Legendo).
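To make the notion of overrepresentation concrete, the sketch below counts all words of length k in a set of co-expressed sequences and compares their frequency with a background set. This is a deliberately simple word-counting stand-in, not the Gibbs sampling approach named above; the function names, the pseudocount and the toy sequences are assumptions for illustration only.

```python
from collections import Counter

def kmer_counts(sequences, k):
    """Count every overlapping word of length k across the sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

def overrepresented_kmers(foreground, background, k=5, pseudocount=1.0):
    """Rank foreground k-mers by frequency ratio relative to the background.

    The pseudocount keeps the ratio finite for words absent from the background.
    """
    fg = kmer_counts(foreground, k)
    bg = kmer_counts(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    ratios = {}
    for kmer, n in fg.items():
        fg_freq = (n + pseudocount) / (fg_total + pseudocount)
        bg_freq = (bg[kmer] + pseudocount) / (bg_total + pseudocount)
        ratios[kmer] = fg_freq / bg_freq
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)

# Made-up sequences: the word GTGAC is planted in every foreground sequence.
foreground = ["ACGTGACC", "TTGTGACA", "GTGACGGA"]
background = ["ACACACAC", "TTTTGGGG", "CAGTCAGT"]

ranked = overrepresented_kmers(foreground, background, k=5)
top_kmer = ranked[0][0]
```

On real promoter sets, words that occur by chance also score well, which is the false-positive problem described above and one motivation for adding orthology information.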
Title Research : ViTraM: Visualization of Transcriptional Modules
Promoter(s) : Bart De Moor, Kathleen Marchal
Short Abstract :
In my pre-doc period I worked on the development of a software tool for module network visualization. During my PhD I will further modify and finalize it. Thereafter I will work on extending our probabilistic framework for motif detection to the detection of regulatory modules (study of combinatorial transcription regulation).
Starting Date : 01/01/2008
Title Research : Genome-wide analysis of regulatory protein-DNA interactions in Salmonella Typhimurium
Starting Date : 11/06/2007
Title Research : Transcriptional regulation
Senior Scientist (scientific responsible, if different from promoter) : Yves Moreau, Bart De Moor, Frans Schuit
Short Abstract :
Based on a large series of high-density rat and mouse expression microarrays, we study the regulation of transcription underlying several cellular and physiological processes, with a focus on insulin-producing pancreatic beta cells in response to fluctuations in food intake. A novel series of highly conserved motifs that are overrepresented in the untranslated regions (UTRs) of transcripts encoding cell fate decision proteins (growth and proliferation, differentiation, apoptosis, senescence) is examined, assessing their potential relationships with non-translated RNAs such as microRNAs. Alu repeats, which make up 10% of the human genome and were previously thought to be "junk DNA", are examined for a potential, hitherto unknown role in transcription regulation. In addition, the genomic location and evolutionary conservation of genes are studied to elucidate interactions and fine-tuning of different pathways. These molecular mechanisms have broad relevance for human diseases such as cancer, neurodegenerative disorders and diabetes.
Title Research : Gene prioritization through genomic data fusion.
Promoter(s) : Yves Moreau, Bart De Moor
Short Abstract :
Genetic studies (linkage, linkage disequilibrium, association studies) and high-throughput genome-wide screens (expression microarrays, CGH microarrays, proteomics, metabolomics) can identify genes and proteins that are candidate members for a biological process of interest (disease, pathway). Unfortunately, such screens often identify tens or hundreds of candidates, and downstream validation of all those candidates is difficult or impossible because such validation is costly in both time and money. Often, biologists select a restricted subset of candidates based on expert knowledge and hunches. Gene prioritization by genomic data fusion aims at automating this process and prioritizing the best subset of candidates by integrating genomic information from numerous sources (sequence, expression, literature, protein-protein interaction, functional annotation). This prioritization is based on the similarity of the candidates to the profile of genes known to belong to the biological process (disease, pathway), established across the different data sources.
Title Research : Bayesian MCMC to train HMMs for cis-regulatory model discovery
Promoter(s) : Bart De Moor, Yves Moreau
Short Abstract :
Computational discovery of transcription factor binding sites in eukaryotic DNA is an unresolved issue. For this task, we develop a learning machine based on hidden Markov models that is trained by Bayesian Markov chain Monte Carlo methods. The performance will be validated in a case study on heart failure.
Title Research : Robust algorithms for inferring regulatory networks based on gene expression measurements and biological prior information
Promoter(s) : Bart De Moor, Kathleen Marchal
Short Abstract :
Gene expression measurements provide insufficient information to reverse engineer gene regulatory networks. Additional data sources, such as ChIP-chip data, sequence information and protein-protein interactions, provide partially complementary information and can be used to improve the reverse engineering of gene regulatory networks. Probabilistic Relational Models have recently been developed. These models extend Bayesian Networks to the relational domain and will be used to combine the heterogeneous data sources in a uniform probabilistic framework.
Title Research : In Silico Analysis and Algorithm Development for Imaging Mass Spectrometry in Proteomics
Promoter(s) : Bart De Moor, Etienne Waelkens (co-promoter)
Short Abstract :
Mass spectral imaging (MSI) or imaging mass spectrometry is a developing technology that combines spatial information with traditional mass spectrometry and biochemical characterization. It enables researchers to study the spatial distribution of biomolecules such as proteins, peptides, and metabolites throughout organic tissue sections. This PhD centers around developing multivariate analysis methods and algorithms capable of handling the massive amount of data produced by MSI in order to extract the biochemical trends underlying organic tissue and for use in differential studies for the elucidation of disease mechanisms.
Starting Date : 01/05/2008
Title Research : Blood glucose control in critically ill patients: Design of assessment procedures and a control system
Promoter(s) : Bart De Moor
Short Abstract :
Critically ill patients, typically admitted to the Intensive Care Unit (ICU), show hyperglycemia and insulin resistance associated with adverse outcomes. It has been demonstrated that strict blood glucose control (between 80 and 110 mg/dl) results in an important reduction in mortality and morbidity. Current therapy requires a manual and rigorous administration of insulin and could, therefore, be replaced by a semi- or fully-automatic blood glucose control system leading to a potential decrease of hypoglycemic events and workload of the medical staff.
The first objective is the design of a procedure to evaluate the reliability of glucose sensor devices with regard to a gold standard blood glucose sensor. The quality of blood glucose control depends on the reliability (accuracy) of the measurements, but current methods to assess this reliability level may mislead evaluations and/or lack statistical evidence. Therefore, the GLYCENSIT procedure is developed: http://www.esat.kuleuven.be/GLYCENSIT.
The second objective is the design of a procedure to appropriately assess the adequacy of blood glucose control algorithms used in the ICU. Based on clinical expert knowledge, the Glycemic Penalty Index (GPI) is introduced as a measure for the overall glycemic control behaviour in ICU patients as current evaluation measures have weaknesses that may mislead assessments.
The third objective is the design of a predictive control system that can potentially be used for (semi-)automatically normalizing the blood glucose in the critically ill. This blood glucose control system comprises a patient model and a controller.
Starting Date : 01/10/2004
Title Research : Systems biology: identification of regulatory regions and disease causing genes and mechanisms
Promoter(s) : Bart De Moor
Starting Date : 01/11/2007
Title Research : Kernel methods for genomic data fusion and gene prioritization
Promoter(s) : Y. Moreau, B. De Moor
Short Abstract :
Today, an enormous amount and variety of data, information and knowledge is available to molecular biologists. High-throughput techniques (such as sequencing projects, microarray experiments or proteomics screens) produce masses of data. Complex annotation databases (such as Gene Ontology or KEGG) collect the knowledge available about genes and proteins. MEDLINE provides abstracts for 15 million scientific articles. Moreover, this information is available for a wide range of organisms. There is a pressing need for automated methods that integrate the data from these different sources in a systematic and coherent way (which we call data fusion) to help steer the research of molecular biologists. One of the most important applications here is the prioritization of candidate genes to accelerate the identification of disease genes in human pathologies. Kernel methods offer an efficient and structured framework for this. The objectives of this doctorate are (1) the development of new kernel methods for genomic data fusion, (2) the development of kernel methods for the integration of data from multiple organisms and (3) the application of these methods to important biological questions to demonstrate their relevance.
Title Research : Knowledge mining strategies for molecular karyotyping : Coupled analysis of array CGH and biomedical reports
Promoter(s) : B. De Moor, Y. Moreau
Short Abstract :
Recent developments in molecular karyotyping make it possible to considerably refine the resolution at which the chromosomal characteristics of a tissue are determined. Deletions and duplications are detected by array CGH (Comparative Genomic Hybridization), in which genomic clones are spotted on microarrays. The growing amount of array and patient data poses challenges in identifying and localizing genome regions linked to congenital anomalies, and in automatically analyzing patient-related data.
Starting Date : 20/10/2008
Title Research : Synthetic biology: building a microbial pathway with characterized biomodules according to electrical circuit principles.
Promoter(s) : Kathleen Marchal, Georges Gielen
Short Abstract :
In this PhD project we will apply the principles of synthetic biology to design a generic bacterial microsensor that is able to sense external signals (such as a virus or a polluting component in waste water). The biosensor, which consists of a synthesized bacterium, transduces the signal into a desired outcome, i.e., a membrane-localized fluorescent or electrical signal that can subsequently be read out by a properly designed chip.
Starting Date : 01/10/2009
Title Research : Advanced datamining algorithms for mass spectral imaging data in a biomedical context.
Promoter(s) : Bart De Moor
Short Abstract :
This PhD concerns the application of a relatively new imaging technology, Mass Spectral Imaging (MSI), to samples from ovarian tumours and endometriosis. More specifically, we will focus on processing the complex data that result from such analyses.
Ovarian tumours and endometriosis have a high impact on general well-being and form a large burden on healthcare systems. There is a strong need for a better understanding of their underlying mechanisms and for effective tissue markers that allow a better differential diagnosis and a diagnosis of the early stages of disease. By analyzing diseased tissue using MSI, it is possible to find biomolecules that play a specific role in these diseases, for instance by comparing proteins located at the edge of a tumour with those located at the core or by comparing non-affected tissue with affected tissue regions.
The data that result from an MSI analysis are very high-dimensional and non-linear. The algorithms currently used for processing do not take these properties into account and are therefore unable to extract all the information the data hold. The aim of this PhD is the development of advanced datamining algorithms that can incorporate these aspects in order to exploit the full potential of these data.
Starting Date : 25/11/2009
Title Research : Integrative modeling in systems biology using statistical relational models
Promoter(s) : Kathleen Marchal
Starting Date : 01/11/2007
Title Research : Pattern analysis of high dimensional data in biomedical text mining
Promoter(s) : B. De Moor, Y. Moreau
Short Abstract :
This research will investigate how the results of text mining can be used in computational biology studies. Statistical machine learning methodologies will be evaluated using textual gene profiles for gene prioritization, prediction and genomic network construction. In addition, several text mining configurations will be compared and evaluated, which should yield ideas for improving text mining.
Title Research : Preprocessing and integration of high throughput data
Promoter(s) : Kathleen Marchal, Bart De Moor, Mieke Verstuyf, Jos Vanderleyden
Short Abstract :
Inferring comprehensive regulatory networks from high-throughput data is one of the foremost challenges of modern computational biology. As high-throughput expression profiling experiments have become common in many laboratories, different techniques have been proposed to infer transcriptional regulatory networks from them. Furthermore, with the advent of diverse types of high-throughput data, research in network inference has received a new impulse. The use of diverse types of data, together with the increasing tendency to build the inference on biologically plausible simplifications, allows a more reliable and more complete description of networks. Inferring networks requires:
I. Careful preprocessing of the heterogeneous data: part of the research will therefore be related to the preprocessing of microarray-related, proteomics and metabolomics data
literature overview, gaining expertise for application on real data, set up of analysis flow
improvement of existing methods if needed
application
II. The development of data integration techniques
Different techniques to infer networks from heterogeneous data will be tested and compared. Based on these results, an improved technique of our own will be developed.
Short Abstract :
As an evolutionary biologist and a bioinformatician, I'm interested in using bioinformatics approaches to study the evolution of organisms, genes and genomes. Regarding genome structure and evolution (which forms a major part of our research), I'm particularly interested in the study of gene and genome duplications as well as in the evolution of novel gene functions after duplication. Gene duplication events have been considered important mechanisms that facilitated the increasing complexity of organisms and also speciation, because they might have permitted functional diversification of genes, created complex gene families and generally increased genomic and phenotypic complexity. However, great controversy still exists about how and how fast duplicated genes evolve new functions. Another point of discussion is whether most gene duplications are the result of local (e.g., tandem) gene duplications or of large-scale gene or even entire genome duplication events. Although the amount of sequence data that can provide us with answers to the questions raised above increases at a fast rate, interpreting the data and mapping and interpreting (large-scale) gene duplication events often remain difficult. For example, developmental control genes typically belong to multigene families and, more often than not, the evolutionary relationships within these gene families in comparative developmental studies are unknown and potentially complex. The elucidation of the exact relationships between gene family members therefore also forms part of our research.
Short Abstract :
I have always been drawn to biology; I am not a computer freak. Nevertheless, as soon as it turned out that it would be possible to sequence DNA routinely, and as biological sequences began to accumulate in the early eighties, I became strongly convinced that a new way to investigate biology using computers was open, and I battled in my institute to convince people to invest in this field and get the tools for it. I had quite little success in the beginning, I must confess, with colleagues of mine telling me that 'I should better work than play with a keyboard' or 'why are you stealing the results of others instead of producing yours?'. Life circumstances and an open-minded boss (maybe not that unprejudiced?) offered me the great opportunity to devote all my time to playing with BioInformatics since 1994.
My deep interest is shared by many: from the fast accumulating raw genome data, sequences, microarrays and others, how to decipher information and build knowledge on the function of genes and the intermediary to end products of genes, on their interplay at the cell level, and finally on the biology of the organism living in its environment. Wide and unfocussed, especially for someone of your age, will you say! The focus comes from history and necessity: sequences were first, sequencing Arabidopsis began, and not many were interested in doing bioinformatics for plant genomes ten years ago, but my colleagues at the bench in Ghent and in France were (elsewhere too, as a matter of fact). Gene prediction and genome sequence annotation therefore became my first topic, and still are today, as so many new organisms are going to be sequenced and as there are still so many uncertainties in simply finding (all) their genes and modeling them correctly. My second interest came later, aiming at gathering functional attributes for the genes once they are found. Database (and literature) mining provides functional information, but essentially on the products of genes. To get information on the mechanisms governing their expression proper, one has to pay attention to promoters, and the coming of transcriptomics made this topic timely. I would like to know not only what the motifs and modules involved are, but how their interplay specifically controls the expression of a given gene, and to investigate ways to integrate experimental data and biological knowledge at the different molecular and cellular levels.
I am deeply convinced that, with bioinformatics, we are only making the first steps into something I would call 'dry biology', just the same way physics did more than a century ago, and that the landscape is wide open for creative and uninhibited young scientists.
Short Abstract :
microRNAs, non-coding RNAs, Systems biology, Computers & grid computing
Short Abstract :
Detection of plant miRNAs and miRNA targets : Recently, a totally new level of regulatory control was discovered in higher plants and animals. MicroRNAs, tiny genome-encoded regulators, were shown to have crucial roles in, for instance, diverse developmental pathways. In plants, the activity of miRNAs relies on the recognition of a miRNA binding site in the coding part of the target gene. Upon binding, the mRNA of the target genes is cleaved by RISC. A few hundred miRNA genes have been identified in the genome of higher plants such as Arabidopsis, poplar and rice (Bonnet et al. 2004; 2006). Although some miRNA genes are conserved throughout the plant kingdom, evidence grows on the presence of lineage or species-specific miRNAs in plants. Through a comparative approach, we aim to detect highly specific miRNAs in closely related plant species. As such, we attempt to identify miRNAs, which have a regulatory role in agronomically interesting processes, specific to certain plant lineages.
Short Abstract :
Systems biology: Modeling regulatory networks : Regulation of gene expression at the transcriptional level is mediated by transcription factors binding to the DNA. The activity level of these transcription factors depends on the expression level of their respective transcription factor genes, which is again controlled by other transcription factors, thus forming a complex transcription regulatory network. In such a network, nodes represent genes and directed edges represent regulatory interactions. Probabilistic graphical models or Bayesian networks provide a mathematical framework for reconstructing a regulatory network from genome-wide expression measurements. In our research group we have started up a project to infer regulatory networks from microarray data generated by other groups in the department.
Statistical physics of DNA : The computation of the thermal stability and statistical physics of nucleic acids is a classical problem going back to the 1960s, with recent results relating the physics of denaturation (DNA strand separation) to the biology of genomes. Other experimental developments, which can also be modeled accurately by statistical physics, have made it possible to manipulate single polymeric molecules directly and offer access to a whole new range of DNA properties.
Short Abstract :
Fundamental research
Feature selection in machine learning : The selection of a subset of relevant features from a potentially huge initial feature set is an important topic in machine learning. Sometimes the choice of the feature subset may be even more important than the learning model that is chosen to achieve the best results. My research focuses on selection methods that are able to deal with both (i) large feature sets, and (ii) feature dependencies. A more recent topic of investigation is the use of feature selection for clustering, a non-trivial and challenging topic that is gaining more and more attention from the scientific community.
Modelling gene networks using different sources of data : Modelling the interactions between genes remains a difficult research topic, as the starting data are often quite noisy and it is difficult to evaluate the obtained results. To minimise the amount of error in the results and arrive at more reliable models, different sources of data need to be combined (sequence data (motifs), expression data, interaction data,...). However, rigorous mathematical techniques to combine and reason with these different types of data are lacking, which presents a great opportunity for research.
Mathematical models for gene splicing : Gene splicing is a very intricate and tightly regulated process in the cell. However, computational models for recognizing splice sites are still far from perfect. A particularly difficult issue from a machine learning point of view is the large number of negative examples that occur in genomes. Therefore, additional submodels (e.g. a branch point model) should be designed and evaluated to increase overall performance. Another important issue in the context of splicing is the analysis of alternative splicing. Using machine learning techniques, we try to find common patterns that could increase our insight into the process of detecting alternative splice variants.
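To make the feature selection topic above more concrete, here is a minimal sketch of a greedy filter that accounts for feature dependencies: each step picks the feature with the highest relevance to the target minus its mean redundancy with the features already chosen. This is a simple relevance-minus-redundancy heuristic based on Pearson correlation (in the spirit of mRMR), chosen for illustration; it is not the selection method developed in this research, and the function name `select_features` is hypothetical.

```python
import numpy as np


def select_features(X: np.ndarray, y: np.ndarray, k: int) -> list[int]:
    """Greedily pick k feature indices by relevance minus redundancy.

    Relevance: |Pearson correlation| between a feature and the target.
    Redundancy: mean |Pearson correlation| with already selected features.
    Assumes no feature (and not the target) is constant, since the
    correlation is undefined in that case.
    """
    n_features = X.shape[1]
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    feature_corr = np.abs(np.corrcoef(X, rowvar=False))

    # Start from the single most relevant feature.
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, n_features):
        remaining = [j for j in range(n_features) if j not in selected]
        redundancy = feature_corr[np.ix_(remaining, selected)].mean(axis=1)
        scores = relevance[remaining] - redundancy
        selected.append(remaining[int(np.argmax(scores))])
    return selected


# Toy data: features 0 and 1 carry the same signal, feature 2 an independent one.
a = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
b = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
X = np.column_stack([a, 2.0 * a, b])
y = 2.0 * a + b
chosen = select_features(X, y, k=2)
```

In the toy data the duplicated feature is individually more relevant than feature 2, yet the redundancy term makes the procedure prefer feature 2 as its second pick, which is the behaviour a dependency-aware method is meant to show.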
Applied research
Feature selection for classification of nucleic acid sequences : The application of feature selection to different recognition problems related to gene recognition/genome annotation can provide new biological insights into how some processes work. In addition, looking for a core set of relevant features can improve model robustness and increase classification performance.
Feature selection for promoter prediction : The computational identification of promoter regions on a genomic scale is still in its infancy. To improve the models that are used to locate promoters, one should first have some knowledge about which characteristics differentiate promoter regions from other genomic regions. The application of feature selection techniques can aid in finding new features that are important for promoter modelling.
Gene and genome annotation : Our team is involved in the genome annotation of several organisms. To do this job properly, advanced modelling techniques are needed to find and combine the different signals in the gene. We are developing software for the recognition of the most important gene features (start/stop codon and splice sites), as well as methods to identify potential protein coding regions (coding potential prediction).
Hardware-based speed up of bioinformatics algorithms : As bioinformatics databases are growing at an exponential rate, there is a need for fast implementations of very common algorithms (such as alignment). In this research project, together with the PARIS research group of the Laboratory of Electronics and Information systems (ELIS), we are experimenting with the implementation of several common bioinformatics algorithms in parallel, using specialised hardware (FPGA).
Short Abstract :
Modeling Biological Systems : With the availability of fully sequenced genomes and the development of high-throughput functional genomics technologies, we now have the tools to look at the molecular biology of an organism from a systemic viewpoint. Systems biology is a dynamic and highly interdisciplinary field, requiring input from biology as well as engineering, physics and mathematics. My main interest is the development of methods to analyze functional genomics data and integrate them in network models that reflect the regulatory wiring and modularity of biological systems.
Impact of gen(om)e duplications on plant evolution : Expansion of gene families by duplication and subsequent functional diversification is considered to be of major importance for the development of biological novelties during evolution. However, we have only begun to elucidate the mechanisms underlying evolutionary innovation through gen(om)e duplication. Particularly, expansion and functional diversification of regulatory gene families is considered necessary to bring about an increase in morphological complexity. Recent studies in A. thaliana have indeed found that transcription factors, signal transducers and developmental genes have been retained in excess after genome duplications. More importantly, it seems that the majority of these genes could have been retained only because they were created through genome duplication, suggesting a key role for large-scale gene duplication events in plant evolution (see Maere et al. PNAS 2005, De Bodt et al. TREE 2005).
We are studying the impact of small- and large-scale gene duplications in the evolution of plant complexity and developmental processes (evo-devo) from a systems biology perspective. In recent years, it has become widely acknowledged that morphological evolution reflects the evolution of the underlying developmental networks, and that it is therefore necessary to study the evolution of genetic networks in order to comprehend the evolution of organisms. Such network-level analyses have recently become feasible thanks to the increased availability of relevant functional genomics data.
Short Abstract :
Leaf Development Modeling (image: http://www.psb.ugent.be/images/stories/psb/modeling/virtualleaf.png) : In collaboration with the Leaf Development Group (http://www.psb.ugent.be/leaf) we are developing cell-centered modeling approaches to leaf growth. We focus on the interplay between patterned cell division, growth factor channeling through the vasculature, and whole leaf growth. Our modeling efforts aim to reconcile known molecular mechanisms of auxin transport with microscopic observations of leaf growth and venation patterning. We start from the hypothesis that PIN1 (an auxin transporter) localizes near the neighboring cells with the highest auxin concentration, producing auxin accumulation points as in previous models (Jonsson et al., 2006; Smith et al., 2006; Barbier de Reuille et al., 2006). Our aim is to reproduce recently published PIN1 expression patterns and PIN1 cellular localizations in the leaf (Scarpella et al., 2006).
Lateral root initiation : Lateral roots originate from cells in the root basal meristem, a proliferating tissue region just above the root tip, forming a regular branching pattern with evenly spaced lateral roots. The crucial signal for initiating the lateral root is most likely the phytohormone auxin. Auxin levels oscillate with a period of around 15 hours, precisely coinciding with the rhythm by which new lateral roots appear. In collaboration with the Root Development group (http://www.psb.ugent.be/root-development/index.php), we are building computational models of the root basal meristem. We aim at unraveling the mechanisms behind these oscillating auxin flows, which may be driven by a dynamic interaction between auxin and the production and cellular localization of its transporter proteins, including PIN, AUX and LAX.
For this project we currently have a PhD position available. Please see http://www.psb.ugent.be/vacancies/vacancy-for-a-phd-student.html
Lignin Biosynthesis : Lignin makes up 20% of wood. It is a polymer formed in angiosperms primarily from the two monolignols coniferyl alcohol (G-subunit) and sinapyl alcohol (S-subunit), which bind non-enzymatically to form a huge variety of lignin molecules. In collaboration with the Bioenergy group (http://www.psb.ugent.be/tree-biotechnology-research/index.php) led by Wout Boerjan, we are building a bottom-up model with the aim of predicting lignin structure from low-level chemical kinetic factors, including subunit coupling probabilities and monolignol synthesis rates. We will use the model to explain the mechanism behind a range of controlling factors identified in experimental work, including a) the ratio of coniferyl vs. sinapyl monolignols, b) the monolignol supply rate, and c) the abundance of alternative monolignols present in lignin biosynthesis mutants and transgenics. Lignin composition, structure and its interaction with hemicellulose are important factors limiting the quality of lignocellulosic plant material as fodder and for conversion to bioethanol or to paper. Eventually the model will suggest new targets for controlled, improved lignin biosynthesis.
Previous research interests : My previous research interests are also related to biological morphogenesis, including coral growth simulation, modeling vasculogenesis and angiogenesis, and biological image analysis. See my personal webpage (http://www.roelandmerks.nl) for details.
Short Abstract :
Plant evolutionary genomics : Plants come in a wide range of forms and colors and their genomes exhibit a large degree of variation, even between different species from the same gene family. Apart from the diversity present in the construction and organization of DNA sequences in different species, molecular and evolutionary processes are continuously shaping nuclear genome structures. Although it has become clear that major genome size differences can be explained by differences in ploidy levels and dissimilar amounts of mobile and tandem repetitive elements, the mechanisms driving gene and genome evolution in higher plants, together with their implications for gene function and regulation, are largely unknown. Although it is easy to understand that changes in the place or time of gene expression can create new or alternative molecular interactions, little is known about the evolution of transcriptional regulation in plants. This knowledge, however, is essential, because each gene is flanked by regulatory sequences which, together with the expression and activity of other proteins, determine the amount, place, and timing of expression. Therefore, characterizing these motifs is required in order to understand the regulatory interactions between trans-acting proteins and the promoters of thousands of genes within a eukaryotic genome. This information is also essential when studying biological processes from a holistic point of view by incorporating and combining complementary functional data sets (systems biology).
Short Abstract :
My post-doctoral research is an interplay between comparative and evolutionary bioinformatics applied at the crossroads between prokaryotic taxonomy, ecology and population genetics. The research bridges the research at the Laboratory of Microbiology (http://lmg.ugent.be/) and the expertise in the Bioinformatics & Evolutionary Genomics group, both at Ghent University, Faculty of Sciences. Currently I'm staying in the Polz lab at MIT (http://web.mit.edu/polz/) to develop a different viewpoint on microbial evolution with inputs from environmental microbiology and population genetics. Specific topics of my research include:
Roadmap towards a prokaryotic genomic taxonomy: developing a bioinformatics toolbox for the detection of phylogenetic markers that can be used in multi-locus sequence schemes for high-throughput classification and identification, i.e. a genomic taxonomy. Applied to several model groups of bacteria representing different phylogenetic lineages: vibrios, lactic acid bacteria, burkholderias, ...
Population-level analysis of the bacterial species: in silico analysis of the sequence diversity and dynamics of a well-defined natural bacterial population, in order to increase our understanding of species and speciation in prokaryotes. Focus on the role of homologous recombination in the creation of genomic coherence.
Analysis of the Azorhizobium caulinodans genome with the aim of increasing our insight into invasion, colonisation and signalling between plant and bacteria (in collaboration with the plant-microbes group at PSB, http://www.psb.ugent.be/plant-microbes/index.php)
Short Abstract :
Systems biology and biological networks : In recent years, high-throughput methods have generated genome-wide datasets that enable us to study biological networks (biology at the systems level), rather than single genes, proteins or cells. As a bioengineer, I am intrigued by systems biology, since it is an interdisciplinary research field that requires knowledge of biology, mathematics, engineering, computer science and physics.
Differential gene expression is an important driving force in the development, function and pathology of multicellular organisms, plants as well as animals. Proper spatial and temporal gene expression is controlled most importantly at the initiation of transcription, by regulatory transcription factors that directly bind to their genomic DNA targets, resulting in activation or repression of target gene expression. In addition, the recently discovered miRNAs function as repressors at the posttranscriptional level. Both regulators, transcription factors and miRNAs, function in the context of intricate regulatory networks that describe gene expression as a function of inputs specified by physical and functional interactions between transcription factors, miRNAs and DNA. Deciphering these regulatory networks in eukaryotic organisms is the main challenge. How can we extract biologically relevant information from different genome-wide datasets (data integration)? How can reverse-engineering algorithms be improved to infer the regulatory network more correctly? Once we have the regulatory networks in hand, we are able to tackle specific biological questions. Which regulators and target genes control a specific biological process and how? This can be accomplished by studying network modules, groups of highly interconnected components in the network that together carry out particular biological functions. How is the regulatory specificity of transcription factors and miRNAs determined? Which mechanisms have shaped the specificity of these regulators during evolution?
Previously : Functional foods. Bioavailability of bioactive peptides. Gastrointestinal microbiota
Short Abstract :
Text mining, data integration, evolution and functional consequences of protein-interaction (especially dimerization) networks in transcription factors. Secondary metabolites in prokaryotes and fungi
Short Abstract :
As a bioinformatician with a molecular biologist's background, my first project was to set up from scratch what was to become the PlantCARE database, which led me to focus my interest on gene expression. I then worked on the annotation of the genomic sequences that contributed to the ESSA projects for the sequencing of Arabidopsis, where the first genome duplications in Arabidopsis were shown. This involved, next to some data mining, a lot of manual annotation of raw genomic sequences and genes, correcting faulty annotation done by automated systems that reported poor results at that time. Since then I have collaborated on, and still contribute to, the development and enhancement of the Eugene gene prediction platform, which now performs among the best. Over the years I have kept both interests. I'm still maintaining the PlantCARE database, which became part of the PlaNet project aiming at interconnecting databases on different aspects of plant genomics, while, on the other hand, I'm involved in the annotation of new upcoming plant genomes, which brought us to adapt the Eugene platform to plant genomes other than Arabidopsis thaliana. Now I try to combine both, as more plant genomes become available and comparative methods enable more reliable in silico promoter analyses. This means that, starting from raw genomic sequences, I follow the whole pipeline from genome annotation to the extraction of the data necessary to study promoter sequences and find clues to decipher potentially co-expressed genes in networks.
Title Research : The use of microarray technology to analyse the evolution and functional divergence of duplicated genes
Promoter(s) : Yves Van de Peer
Short Abstract :
It has been generally accepted that gene duplication increases the amount of genetic material as a necessary source for the origin of evolutionary novelties. Complete genomic analyses of different organisms now show that a major fraction of their genomes indeed consists of paralogous genes that arose through large-scale gene or entire genome duplication events. However, it is still unclear why so many duplicated genes are retained in genomes. Furthermore, also the (general) fate of initially redundant genes on a genome-wide scale is still unclear. Through large-scale analysis of functional data, I will study the role of genetic redundancy on the evolution of gene function in Arabidopsis thaliana.
Title Research : Modelling the covarion/heterotachy hypothesis
Promoter(s) : Yves Van de Peer
Short Abstract :
Modelling the covarion/heterotachy hypothesis : Phylogenetics plays an important role in many areas of biology. Placing model organisms, as well as the genes they house, in the appropriate phylogenetic context allows for a better understanding of both patterns and processes of evolution. Over the past years different evolutionary models have been designed to infer phylogenetic trees. A "covarion" model for nucleotide substitution which allows sites to turn "on" and "off" with time was proposed many years ago by Fitch and Markowitz and several approaches to model heterotachy have been developed recently. It has been argued recently that evidence supports such models over later, alternative models which postulate a static distribution of rates across sites. Advanced covarion/heterotachy models are needed to allow for the correct reconstruction of phylogenetic trees. Using these models it should also be possible to detect functional divergence.
Modelling context-dependent evolution : While many phylogenetic analyses in the past have assumed independent evolution of the sites in an alignment, the trend is shifting towards using more realistic assumptions when inferring phylogenetic trees. In the past, nucleotide models have been improved, and they continue to be improved, by codon models and models assuming a known secondary (or even tertiary) structure. In regions where there is little information concerning the structure, it can be assumed that dependencies upon other sites in the alignment can lead to improvements in explaining the data. It remains to be seen which assumptions will actually lead to different, and better supported, tree topologies and whether such complex models are actually preferred over independent models. To assess this, a reliable method is needed to compare (non-nested) models, keeping the error margins at a low
Title Research : Plant Genome Annotation and Evolution in the Genomic Era
Promoter(s) : Yves Van de Peer
Short Abstract :
The main interest of my PhD is gene prediction and genome annotation. A good (structural) gene annotation is one of the cornerstones of bioinformatics, and the correct structural prediction of genes (in a high-throughput manner) is only becoming more important as more and more genomes become available.
Another research topic of my PhD is, to me, a logical consequence of the previous one. An annotated whole genome sequence is the starting point of a wealth of other bioinformatics analyses, especially when genome sequences from more and more species are becoming available. That's why I'm also interested in understanding how evolution shaped and/or shapes genomes of different species. I try to achieve this by looking for large-scale duplication events in genome sequences and by analyzing the collinearity between different genomes. I believe that comparative genomics is an ideal way to start to tackle these important questions.
Short Abstract :
Many scientists believe that there were two large-scale gene duplications at the origin of the vertebrates (2R hypothesis) which made the success of this group possible. About 320 million years ago, the fishes underwent an additional large-scale gene duplication (3R). The goal of my study is to investigate the consequences of this fish-specific large-scale gene duplication. This study is meant to bring more clarity in vertebrate genome evolution and in the mechanisms that cause gene loss and the development of novel gene functions. Tine is working on large-scale duplication events in vertebrates and functional divergence.
Title Research : Comparative evolutionary analysis of the gene and genome organisation of the chromalveolates
Promoter(s) : Yves Van de Peer
Short Abstract :
The chromalveolate supergroup represents a large fraction of known eukaryotic diversity, ranging from tiny obligate intracellular parasites to free-living algae. Many of these protist species are of great medical (e.g. Plasmodium spp., causative agent of malaria), veterinary and agricultural importance because of their pathogenicity to man, cattle and crops. Others are of ecological interest such as the diatoms, which are responsible for approximately 40% of the marine primary production. The goal of my research is to characterize chromalveolate genome dynamics and to unravel the molecular mechanisms responsible for the unique features of these species using comparative genomics
Title Research : Plant evolutionary and comparative genomics
Promoter(s) : Yves Van de Peer
Short Abstract :
Arabidopsis thaliana is undoubtedly the most widely used plant model system in laboratories. However, fully sequenced genomes of close relatives were not available until recently, which made extensive comparative genomics with this interesting species impossible. This changes with the genome sequencing of its closest relative, Arabidopsis lyrata, and of Capsella rubella, a species from the most closely related genus.
My main interests are:
First, how fast do plant genomes evolve at different levels? Despite the short evolutionary distances, chromosome numbers, for instance, have already changed. What is the frequency of chromosomal rearrangements and which mechanisms are behind them?
Secondly, three whole genome duplications are shared by these species. What has happened to these duplicates in the different species since their divergence? Have they been retained in a more or less similar way, or did A. thaliana, for instance, with its smaller genome size, more frequently lose large parts of duplicated regions? Do the GO annotations of species-specific regions in some way explain the different lifestyles?
Third, why is the genome of A. thaliana so remarkably small compared to those of its close relatives? How did the supposed genome reduction in A. thaliana happen?
Fourth, A. thaliana is a highly successful species that nowadays occurs almost worldwide. Although part of this seems to be explained by the fact that A. thaliana is a self-fertilizer, not everything is clarified by this fact: C. rubella is also highly inbreeding, yet has a much more limited range than A. thaliana. Can we find additional explanations? Is genome size a factor? Are gene family expansions or contractions present? Searching for rapidly evolving genes or regions is also an important topic here. GO annotation of the families and genes involved can give us more insight into this matter.
Short Abstract :
Fungal Genome Annotation : We are involved in the EVOLTREE project (EVOlution of TREEs as drivers of terrestrial biodiversity, http://www.evoltree.eu/). The main goal of this project is to understand the evolutionary history of forest trees and their ecosystems. The major fungal species studied in this project are Laccaria bicolor, Melampsora larici-populina, Glomus intraradices and Tuber melanosporum, which interact with Populus trichocarpa in different ways. As these genomes are being sequenced, our task is to annotate them with our gene prediction platform, EuGene.
Fungal Genome Evolution : Fungi with different lifestyles are being sequenced in the EVOLTREE project. Laccaria bicolor and Tuber melanosporum form ectomycorrhizal symbioses with trees, whereas Glomus intraradices is an endomycorrhizal symbiont and Melampsora larici-populina causes leaf rust on trees. Comparing these fungal genomes can help us understand how these fungi evolved and how they interact with trees and the environment.
Title Research : Barely visible but highly unique: the Ostreococcus genome unveils its secrets
Promoter(s) : Yves Van de Peer
Short Abstract :
After finishing my master thesis on the comparison of genes involved in the cell cycle in different eukaryotic organisms, I switched my focus to a tiny green organism, Ostreococcus tauri. Ostreococcus tauri is a unicellular green alga that was discovered in the Mediterranean Thau lagoon (France) in 1994. With a size of less than 1 µm, comparable to that of a bacterium, it is the smallest eukaryotic organism described until now. Its cellular organisation is rather simple, with a relatively large nucleus with only one nuclear pore, a single chloroplast, one mitochondrion, one Golgi body and a very reduced cytoplasmic compartment. The presence of only one chloroplast and one mitochondrion makes it interesting not only for evolutionary studies, but also for experimental studies. Phylogenetic analyses placed Ostreococcus within the Prasinophyceae, an early diverging group of the Chlorophyta (green algae).
Genome Annotation : After its genome was sequenced (Laboratoire Arago, Banyuls, France), I was a member of the annotation group that performed the complete genome annotation. As a case study I annotated the core cell cycle genes, which led to two major conclusions: i) Ostreococcus harbors the first described cdc25 Dual-Specificity Phosphatase present in the green lineage (Khadaroo et al, 2004); ii) Ostreococcus shows the minimal yet complete set of core cell cycle genes described to date (Robbens et al, 2005). This knowledge was further used for the whole genome annotation of Ostreococcus tauri (Derelle et al, 2006). Besides the annotation of its nuclear genome, I was responsible for the annotation of both the chloroplast and the mitochondrial genome.
Gene Family Evolution : With the availability of completely sequenced genomes of different members of the green lineage, I'm interested in the evolution of gene families. The goal of this research is to characterize genome dynamics within the green lineage and to unravel the molecular mechanisms responsible for the unique or common features present in these species using comparative genomics.
Short Abstract :
Promoter Prediction : The main topic of my research is the accurate and fast prediction of gene promoters. The accurate prediction of gene promoters in whole genomes is still one of the most difficult problems in bioinformatics. Ab initio promoter detection in anonymous sequences is at best in its infancy; among eukaryotes, tools that can predict promoters to some extent exist only for human and Drosophila.
In my research I will focus on applying machine learning techniques for feature extraction, classification and evaluation to the problem of annotating promoters correctly. Hitherto most programs have used a functional description of the promoter, in which promoters are seen as a collection of motifs that follow each other. Instead of this rather sequential approach I will focus on the three-dimensional structure of a promoter. It has been shown before that promoters have distinct structural features, and it should be possible to use this distinct structure to predict promoters.
Promoter structure : While all promoter sequences have a distinct structure compared to genes or intergenic regions, it is very interesting to study the structure of different classes of promoters and how they have evolved.
Short Abstract :
Sequencing of the whole genome of various species has significantly advanced our understanding of the structure and evolution of genes and genomes. Annotation of the genome is an important aspect of a whole genome project because most subsequent analyses will heavily rely on that annotation. As an increasing number of genomes are being sequenced, a major challenge lies in providing an efficient and accurate annotation. Our group is/has been involved in the annotation, gene prediction in particular, of several genomes using a gene prediction software called EuGene. Apart from doing the annotations, I am also interested in studying gene/genome duplications, transposable elements, and various other topics related to comparative genomics and gene/genome evolution
The genomes I work on: Solanum lycopersicum (tomato), Physcomitrella patens, Vitis vinifera (grapevine), Arabidopsis lyrata and Capsella rubella.
Title Research : Alternative Splicing prediction
Promoter(s) : Yves Van de Peer
Short Abstract :
The main topic of my research consists of the prediction of alternative splicing events in Arabidopsis thaliana. While a lot of work has been done on the general prediction of constitutive splice sites, prediction of alternative splicing events remains a rather difficult domain to tackle with ab initio methods. My research will focus on combining motifs and features from both the primary and the secondary pre-mRNA structure. Machine learning techniques are hereby used for feature extraction and classification.
Short Abstract :
One of the goals of systems biology is inferring molecular interaction networks through integrative modeling of high-throughput transcriptional, protein and metabolite data. In this project we focus on the integration of diverse protein interaction data sources and linking the protein interaction network to the transcriptional network
Title Research : Computational inference of transcription regulatory networks in A. thaliana
Promoter(s) : Yves Van de Peer
Short Abstract :
Recent technological advances allow large numbers of genes or proteins to be analyzed simultaneously, and stimulate a systems biology approach where genes are no longer studied as single entities, but as parts of a complex interacting network. We developed an algorithm to build transcription regulatory networks from expression data. We validate this methodology on well studied systems like E. coli and B. subtilis, thanks to the large number of publicly available expression datasets. Currently I am working on improving the current predictions by adding other data sources such as motif information and protein-protein interaction data. Finally, we plan to build transcription regulatory networks for A. thaliana using the different data sources available.
Title Research : Integration of genomic data to study genome evolution and transcriptional regulation in plants
Promoter(s) : Yves Van de Peer and Klaas Vandepoele
Short Abstract :
There are still some uncertainties about whole genome duplications (WGD) in the green plant lineage. Now, with multiple plant species being sequenced, it is an excellent time to start using a comparative genomics approach to study how and when these events took place. Both the evolution after a WGD and its link with speciation will be studied. Novel genomic data also allow us to discover conserved non-coding elements (CNE); the evolution of these elements will be studied and methods for their further characterisation will be developed. For this study an integrated platform will be developed, containing genome sequence, annotation, gene family, ... together with tools to perform analyses and visualizations.
Short Abstract :
A thorough analysis of similarity measures for learning from heterogeneous data sources : In this project we will investigate the use of similarity measures for machine learning from heterogeneous data sources. A special class of similarity measures are the kernel methods, which will be thoroughly analysed. This project aims to stimulate the development of new information systems, allowing the richness of current data sources to be used more efficiently.
Short Abstract :
Detection of transcription factor binding sites integrating experimental and comparative data: Detection of transcription factor binding sites using computational methods like phylogenetic footprinting or motif over-representation has been successful in the past. The disadvantage, however, is the lack of information to assess whether the predicted motifs are indeed functional. Comparing predicted motifs with experimental ChIP-chip data will improve the prediction. Comparing those verified elements across several orthologous and paralogous genes can give better insight into the mechanism and the evolution of transcriptional regulation.
Title Research : Analysis of the Hox multigenic family with bioinformatics approaches
Promoter(s) : Jacques van Helden, Luc Leyns (Laboratory of Cell Genetics, VUB)
Short Abstract :
Hox transcription factors are known for their crucial role in the regulation of many developmental processes. A complex evolutionary history of duplications and losses of Hox genes accounts for the varying number of Hox genes found throughout the animal kingdom. The project involves the study of the Hox multigenic family with matrix-based techniques. An automated classification procedure, HoxPred, has been developed to assign Hox proteins to their homology groups. The method relies on discriminant analysis that classifies Hox proteins according to their scores for a combination of generalised protein profiles. A second aspect of the project is the prediction of putative binding sites of these factors in the genomes of higher eukaryotes with a matrix-based pattern-matching approach. One of the challenges is to evaluate and reduce the number of false positives among the predictions.
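As an illustration of matrix-based pattern matching, the sketch below scans a DNA sequence with a log-odds position-specific scoring matrix and reports windows above a score threshold. It is a minimal sketch, not the project's implementation: the count matrix (a TAAT-like core), the uniform background, the pseudocount, the threshold and the test sequence are all hypothetical values chosen for the example.

```python
import math

# Assumption: uniform background nucleotide frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_matrix(counts, pseudocount=1.0):
    """Convert per-position nucleotide counts into log2 odds scores."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(column)
        matrix.append({
            base: math.log2(((n + pseudocount) / total) / BACKGROUND[base])
            for base, n in column.items()
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


# Hypothetical count matrix for a TAAT-like core (one dict per position).
COUNTS = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
]
PSSM = log_odds_matrix(COUNTS)
HITS = scan("GGCTAATGGCCTAATC", PSSM, threshold=4.0)
```

Raising the threshold trades sensitivity for fewer false positives, which is the trade-off the abstract refers to.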
Title Research : Strategies for regulatory motifs discovery
Promoter(s) : Jacques van Helden
Short Abstract :
Extracting meaningful biological information from DNA sequence data is a challenging task.
During the last decade, several methods have been developed for the detection of cis-regulatory elements in DNA sequences. This work focuses on one category of such methods, regulatory motif discovery, which is useful when very little a priori information is available. Existing motif discovery strategies like word counting and Gibbs sampling can be enhanced to allow more accurate predictions. We propose to evaluate the biological relevance of these methods and to develop a hybrid approach that takes advantage of the best parts of each method.
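The word-counting idea can be sketched as follows: count every word of length k in a set of sequences and compare the observed count with the count expected under a simple background model. This is only an illustrative sketch; the independent-nucleotide (Bernoulli) background estimated from the input itself, the observed/expected ratio as a ranking score, and the toy sequences are assumptions made for the example, not the scoring used in this work.

```python
from collections import Counter
from math import prod  # available from Python 3.8


def count_words(sequences, k):
    """Count overlapping occurrences of every k-letter word."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def over_representation(sequences, k):
    """Rank words by observed / expected occurrences.

    Assumption: the expected count comes from a Bernoulli background
    whose nucleotide frequencies are estimated from the input sequences.
    """
    letters = Counter("".join(sequences))
    total = sum(letters.values())
    freq = {base: n / total for base, n in letters.items()}
    positions = sum(max(len(seq) - k + 1, 0) for seq in sequences)
    ratios = {}
    for word, observed in count_words(sequences, k).items():
        expected = positions * prod(freq[base] for base in word)
        ratios[word] = observed / expected
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy input: three short sequences sharing an ACGTG core.
SEQUENCES = ["TTACGTGA", "CCACGTGT", "GACGTGAA"]
RANKED = over_representation(SEQUENCES, 5)
```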
Title Research : Protein and Lipid Simulations
Promoter(s) : Michel Vandenbranden
Short Abstract :
The cationic lipid di-C14-amidine resembles the physiological lipid DMPC both in structure and in properties, but contains a small and positively charged head group that is not found in nature. Amidine fuses rapidly with the plasma membrane and is a good candidate for intracellular delivery of drugs. We investigate the molecular and physical properties of amidine membrane and its interaction with membrane-associated proteins by means of molecular dynamics simulations and other bioinformatics approaches.
Title Research : Analysis of transcriptional regulatory regions in higher organisms
Promoter(s) : Jacques van Helden
Short Abstract :
The prediction of regulatory elements in the regulatory regions of genomes has made a lot of progress since its infancy, but remains a serious challenge for higher organisms. The Regulatory Sequence Analysis Tools (RSAT), developed in the lab for several years, are very efficient when applied to bacterial and yeast sequences, but perform rather poorly when tested on mouse or human sequences, for instance. The first part of our work was to build a solid validation framework and to evaluate quantitatively the performance of the tools in various organisms. We are now aiming to improve the reliability of the tools on higher organisms.
To allow analysis of large amounts of datasets, we also developed a Web Service interface to the RSAT tools and are currently working on collaborative workflows between data resources and analysis tools.
Title Research : Towards in silico detection and classification of prokaryotic mobile genetic elements
Promoter(s) : Ariane Toussaint and Jacques van Helden
Short Abstract :
Bacteriophage genomes show a pervasive mosaicism, indicating the importance of horizontal gene exchange in their evolution. Being unique combinations of modules with different phylogenetic histories, these genomes call for a reticulate classification. Using a weighted graph, where nodes represent phages and edges represent phage-phage similarities in terms of shared gene/protein families, we perform a fuzzy classification, such that each phage is associated with a membership vector, which quantitatively characterizes the membership of the phage in the clusters. Clustering genes based on their phylogenetic profiles allows us to define evolutionarily cohesive modules, many of which mark the phage clusters. We now plan to apply the same methodology to the classification of other modular prokaryotic mobile genetic elements.
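The notion of a membership vector can be illustrated with a small sketch. Here the edge weight between two phages is taken to be the Jaccard index of their protein-family sets, and the membership of a phage in a cluster is its mean similarity to that cluster's members, normalised to sum to one. Both choices, as well as the family sets and the two clusters, are assumptions made for the example; the abstract does not specify the similarity measure or the clustering algorithm actually used.

```python
def jaccard(a, b):
    """Similarity of two family sets: shared families / all families."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def membership_vector(phage, families, clusters):
    """Normalised mean similarity of one phage to each cluster.

    Assumption: clusters are given; this sketch does not compute them.
    """
    raw = []
    for members in clusters:
        others = [m for m in members if m != phage]
        if not others:
            raw.append(0.0)
            continue
        similarity = sum(jaccard(families[phage], families[m]) for m in others)
        raw.append(similarity / len(others))
    total = sum(raw)
    return [value / total if total else 0.0 for value in raw]


# Hypothetical phages and protein families; "mosaic" shares modules with both clusters.
FAMILIES = {
    "phageA": {"f1", "f2", "f3"},
    "phageB": {"f1", "f2", "f3", "f4"},
    "phageC": {"f5", "f6"},
    "phageD": {"f5", "f6", "f7"},
    "mosaic": {"f1", "f2", "f5", "f6"},
}
CLUSTERS = [["phageA", "phageB"], ["phageC", "phageD"]]
VECTOR = membership_vector("mosaic", FAMILIES, CLUSTERS)
```

A mosaic genome ends up with non-zero membership in both clusters, which is what a strictly hierarchical classification cannot express.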
Title Research : Bioinformatics analysis of the transcriptional regulation network evolution
Promoter(s) : Jacques van Helden
Short Abstract :
The goal of my PhD project is to study the transcriptional regulation mechanisms underlying the formation of bristle patterns in insects. On the thorax of Drosophila melanogaster there is a stereotyped array of 22 large sensory bristles. Their development requires the activity of the achaete-scute complex (ASC) genes, whose products are bHLH-type transcription factors whose expression confers neural fate on cells. The bristle pattern is the result of ASC gene expression in clusters of cells which will give rise to bristles. The ASC genes share common cis-regulatory elements, bound by transcription factors that regulate the complex spatial and temporal expression patterns of these genes. We develop methodological approaches based on pattern matching to detect, genome-wide, the cis-regulatory elements involved in the regulation of the ASC genes as well as of their target genes. As the mechanism of bristle formation seems to be conserved, we use the 14 completely sequenced insect genomes to study the evolution of these elements by comparative genomics.
Title Research : Inferring metabolic pathways from clusters of co-expressed genes in yeast.
Promoter(s) : Jacques van Helden
Short Abstract :
The aim of my work is to infer relevant metabolic pathways from a metabolic network and sets of enzyme-coding genes. To reach this aim, I evaluate an algorithm based on k shortest paths and an algorithm based on random walks on a number of known metabolic pathways. The algorithms are compared on the basis of their pathway inference accuracy. For this, each algorithm is provided with some nodes of a known pathway and infers a pathway using these nodes as seeds. The accuracy is then calculated as a function of the overlap between the known and the inferred pathway. In addition, I attempt to optimize a number of parameters such as the construction and the weighting of the metabolic network. Pathway inference can be applied to micro-array data in order to predict metabolic pathways from sets of co-expressed genes.
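The evaluation loop described here can be sketched on a toy network. The example below substitutes a single unweighted shortest path (breadth-first search) for the k-shortest-paths and random-walk algorithms, and scores the result as the mean of sensitivity and positive predictive value; the abstract only says that accuracy is a function of the overlap, so that particular formula, the network and the pathway are assumptions for illustration.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Breadth-first search for one shortest path between two seed nodes."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def overlap_accuracy(known, inferred):
    """Mean of sensitivity and positive predictive value (assumed measure)."""
    known, inferred = set(known), set(inferred)
    true_positives = len(known & inferred)
    sensitivity = true_positives / len(known)
    ppv = true_positives / len(inferred)
    return (sensitivity + ppv) / 2


# Hypothetical network: the known pathway A-B-C-D is bypassed by a shortcut via X.
NETWORK = {
    "A": ["B", "X"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "X"],
    "X": ["A", "D"],
}
KNOWN_PATHWAY = ["A", "B", "C", "D"]
INFERRED = shortest_path(NETWORK, "A", "D")
ACCURACY = overlap_accuracy(KNOWN_PATHWAY, INFERRED)
```

In this toy case the unweighted search takes the shortcut through X and misses B and C, which shows why network construction and weighting are parameters worth optimizing.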
Starting Date : 01/10/2004
Biological knowledge bases using Wikis
Promoter(s) : Albert Goldbeter
Starting Date : 01/01/2008
Promoter(s) : Jacques Van Helden
Starting Date : 01/10/2009
Title Research : Modeling of the structures of biological macromolecules by machine learning
Promoter(s): Louis Wehenkel
Starting Date : 01/06/2007
Title Research : Bioinformatics applied to systems biology
Promoter(s) : Louis Wehenkel
Starting Date : 01/01/2008
Title Research : Model-Based Multifactor Dimensionality Reduction to detect epistasis for quantitative traits in the presence of error-free and noisy data
Promoter(s) : Kristel Van Steen
Starting Date : 01/10/2009
Title Research : Supervised inference of biological networks - application to the prediction of genetic interactions in yeast
Promoter(s) : Pierre Geurts
Short Abstract :
The general objective of this project is to develop new methods for supervised graph inference or improve existing ones and to apply them to different biological networks, starting with the genetic interaction network of the yeast. The expected contributions will be in the field of machine learning as well as in the field of biology.
Starting Date : 01/10/2007
Title Research : Identification of genes affecting complex traits by automatic learning
Promoter(s) : Louis Wehenkel
Starting Date : 01/02/2009
Title Research : Development of Multifactor Dimensionality Reduction (MDR) techniques to detect gene-gene and gene-environment interactions in complex diseases
Promoter(s) : Kristel Van Steen
Starting Date : 01/02/2009
Title Research : Statistical processing of images defined on nonlinear spaces and application to brain diffusion tensor imaging
Promoter(s) : Rodolphe Sepulchre
Starting Date : 01/10/2005
Title Research : Contributions to Stochastic Programming and Reinforcement Learning.
Promoter(s) : Louis Wehenkel, Rodolphe Sepulchre
Short Abstract :
The research aims at cross-fertilizing different approaches to Sequential Decision Making under Uncertainty, coming from the Operations Research and the Machine Learning communities. Multi-stage stochastic Programming is limited to small time horizons. It outputs decisions that are difficult to generalize to new scenarios, but can deal with many continuous decision variables under risk constraints. Reinforcement Learning is subject to the curse of dimensionality, but can deal with arbitrary state dynamics. It outputs policies that can be easily validated.
Starting Date : 01/10/2008
Title Research : Models of interacting dopaminergic neurons
Promoter(s) : Rodolphe Sepulchre
Title Research : Parameter estimation for biochemical reaction systems
Promoter(s) : Eric Bullinger
Short Abstract :
Dirk Fey's research concerns dynamic modelling of biological systems and the corresponding identification / parameter estimation problems. Currently he is working on two projects:
1. Parameter estimation methods: Observer based approaches for parameter estimation of kinetic reaction models. Aim: Exploit biology specific properties (form of reaction kinetics) to develop accurate estimation methods (by mathematical proof).
2. Modelling learning & memory: Modelling rat swimming behaviour in Morris water maze experiments by identification of autoregressive models. Aim: Better understanding of learning & memory by revealing the contributions of, and trade-offs between, randomness, different search strategies and learning protocols.
Starting Date: 01/10/2007
Title Research : Evaluation of performances and identification of informative variables in the context of the inference from clinical data of dynamic treatment regimes.
Promoter(s) : Damien Ernst, Louis Wehenkel
Short Abstract :
Nowadays, many diseases, such as HIV/AIDS, cancer, and inflammatory or neurological diseases, are seen by the medical community as chronic-like diseases, resulting in medical treatments that can last over very long periods. For treating such diseases, physicians often adopt explicit, operationalized series of decision rules specifying how drug types and treatment levels should vary over time, called Dynamic Treatment Regimes (DTRs). While DTRs are typically based on clinical judgment and medical insight, the biostatistics community has for a few years been investigating a new research field that specifically addresses the problem of inferring DTRs in a well principled way, directly from clinical data gathered from patients under treatment. This research project aims at studying two open problems related to this well principled way of designing DTRs: first, the prediction of the performance of DTRs using only clinical data, and second, the development of methods to select the most relevant clinical indicators in order to build convenient DTRs.
Starting Date : 01/10/2007
Title Research : Trimming the computational complexity of RPC without significant reliability decrease
Promoter(s) : Louis Wehenkel
Short Abstract :
In the state of the art in Ranking by Pairwise Comparison (RPC), a main problem which prevents RPC from dealing with a very large number of labels is the computational complexity of order N^2 with respect to the number N of labels. The purpose of this thesis is to find an algorithm that reduces the complexity of the method to order N by learning a good subset of N base classifiers while maximizing the ranking score.
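The quadratic cost comes from needing one base classifier per pair of labels, that is N(N-1)/2 of them. The sketch below makes this explicit with a simple voting scheme; the `prefers` callable stands in for the trained pairwise classifiers, and the labels and scores are hypothetical values used only to drive the example.

```python
from itertools import combinations


def rank_by_pairwise_comparison(labels, prefers):
    """Rank labels by counting the pairwise contests each one wins.

    `prefers(a, b)` is a stand-in for a trained base classifier that
    returns True when label a should be ranked above label b.
    """
    votes = {label: 0 for label in labels}
    comparisons = 0
    for a, b in combinations(labels, 2):
        winner = a if prefers(a, b) else b
        votes[winner] += 1
        comparisons += 1
    ranking = sorted(labels, key=lambda label: votes[label], reverse=True)
    return ranking, comparisons


# Hypothetical relevance scores for four labels.
SCORES = {"w": 0.1, "x": 0.9, "y": 0.4, "z": 0.7}
RANKING, COMPARISONS = rank_by_pairwise_comparison(
    list(SCORES), lambda a, b: SCORES[a] > SCORES[b]
)
```

With four labels this already takes six comparisons; keeping only about N of the pairwise classifiers, as the thesis proposes, would make the count grow linearly instead.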
Starting Date : 01/10/2007
Title Research : Computational analysis of genetic regulatory mechanisms related to tissue differentiation
Promoter(s) : Pierre Geurts, Louis Wehenkel
Short Abstract :
The objective of this thesis is to contribute to the analysis of regulatory mechanisms underlying gene expressions with bioinformatics and modeling approaches. More specifically, we will focus on the regulatory mechanisms involved in tissue differentiation, like angiogenesis, with a joint analysis of transcriptional and miRNA regulations.
Starting Date : 01/01/2005
Title Research : Development and application of machine learning and computer vision methods for large-scale, biomedical image datasets
Promoter(s) : Louis Wehenkel
Starting Date : 01/10/2007
Title Research : Data reduction algorithms for bioinformatics
Promoter(s) : Rodolphe Sepulchre
Title Research : Reinforcement Learning for Sequential Decision Problems
Promoter(s) : Damien Ernst, Louis Wehenkel
Starting Date : 01/10/2008
Title Research : Robustness and performance measures of biological models
Promoter(s) : Rodolphe Sepulchre
Short Abstract :
Robustness is a ubiquitously observed feature of biological systems. Broadly speaking, it is a property that allows a system to maintain its function despite external and internal perturbations. Starting from classical measures of performance and robustness in system theory, this project aims at proposing new performance and robustness measures that apply to particular nonlinear physiological models, such as "switches" (bistable models) and "clocks" (oscillator models). Special interest will be devoted to the role of the circuitry in comparing the robustness of models of increasing complexity.
Starting Date : 01/10/2008
Title Research : Probability density estimation in high dimensional spaces by mixtures of simple graphical probabilistic models.
Promoter(s) : Louis Wehenkel
Starting Date : 01/01/2007
Title Research : Application of machine learning and computer vision methods for the automatic analysis of Zebra fish images
Promoter(s) : Louis Wehenkel
Starting Date : 01/01/2007
Title Research : Development of the bioinformatics platform in the Alma-In-Silico project
Promoter(s) : Louis Wehenkel
Starting Date : 01/10/2009
Title Research : Robustness analysis of bistable models
Promoter(s) : Eric Bullinger
Short Abstract :
Bistability has been used to describe several physiological decision-making processes, including cell fate decisions via the MAP kinase cascade, the switch between survival and death in the apoptotic process, and differentiation in the Drosophila embryo. The goal of the project is to define performance and robustness criteria for bistable systems and apply them to specific biological models, with the aim of obtaining a deeper understanding of biological switches.
Short Abstract :
Farida Zehraoui defended her PhD thesis in 2004 on the combination of unsupervised learning (self-organizing maps) and case-based reasoning for sequence prediction. She is currently working on spectral methods applied to biological data and on the extension of biclustering algorithms to heterogeneous and complex data (sequences, graphs, etc.).
Short Abstract :
Trained as a statistician, Nicolas Brunel defended his PhD thesis on inference in Hidden Markov Models, with a view towards radar signal processing. For the past two years, he has focused on the estimation of dynamical systems for systems biology (state-space models, differential equations) and the use of (biological) prior constraints. He is still involved in computational statistics and learning theory.
Title Research : Analysis of the transcriptomic response of the yeast S. cerevisiae to gamma radiation: from responses identification to the reconstruction of pieces of regulatory networks
Promoter(s) : Génopole Evry / University of Evry / Institut Curie
Short Abstract :
Inference of gene regulatory pathways from large scale gene expression data is still a bottleneck when no prior knowledge is available. In order to identify the transcriptional response of yeast to gamma-radiation and its regulatory mechanisms, I consider the task of discovering gene regulatory pathways from gene expression kinetics measured across several perturbations (Institut Curie data).
Taking into account that a scientific discovery process is a matter of both induction and deduction, I conceived XRegPath (in collaboration with the AMIS Bio team), a general methodology based on automated deduction and statistical inference that helps the biologist to extract gene regulatory pathways involved in the cellular response of a given organism to some stress signal. XRegPath mines large datasets, extracts and filters information, deduces potential regulators, compares different sources of data and finally gathers various pieces of evidence about regulatory processes. I applied this methodology to the analysis of the yeast transcriptional response to gamma-radiation. I extracted typical responses as co-expressed gene groups with typical behaviour. I showed the value of this approach by drawing a global regulation scheme of the irradiation response. Finally, I generated new hypotheses about the yeast cellular response to radiation and its regulation, which were subsequently confirmed experimentally.
Title Research : Estimating parameters in models based on ODEs for biological networks inference
Promoter(s) : University of Evry
Short Abstract :
* Statistical machine learning and bioinformatics
* Learning of dynamical systems
* Parameter Estimation
* Biological networks
Title Research : Networks modularity in the transcriptional regulation : Independent subspaces and independent and mixture probabilistic
Promoter(s) : Génopole Evry / University of Evry
Short Abstract :
Our aim is to study the concept of independence of sub-spaces or sub-networks to define functional modules in biology. We try to extract modules from static and kinetics gene expression data and estimate a mixture of sub-networks models (we achieve the extraction of modules and the modelling of the network decomposition into sub-networks at the same time).
Title Research : Logical Modelling of the Eukaryotic Cell Cycle
Promoter(s) : Denis Thieffry and Andrea Ciliberto (IFOM, Milano)
Short Abstract :
Building on existing models for the regulatory networks controlling the cell cycle in eukaryotes, we are developing logical models for the core cycling engine as well as for various checkpoint modules (including the morphogenetic checkpoint and the MAPK pathway in the case of yeast, or the DNA damage checkpoint in the case of mammals). The next step consists in integrating these modules into more comprehensive models and assessing their dynamical properties for documented or novel perturbations. This involves the development of a modular modeling approach, progressively implemented in GINsim, our software dedicated to the qualitative modeling and analysis of biological regulatory networks.
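To give a flavour of logical modelling, the sketch below encodes a two-component Boolean network and enumerates its stable states. It is a deliberately simplified illustration: the components A and B and their mutual-inhibition rules are hypothetical, the update is synchronous and Boolean (not multilevel), and none of this uses or reproduces GINsim.

```python
from itertools import product

# Hypothetical logical rules: A and B repress each other.
RULES = {
    "A": lambda state: not state["B"],
    "B": lambda state: not state["A"],
}


def step(state):
    """Synchronous update: every component is recomputed from the old state."""
    return {gene: bool(rule(state)) for gene, rule in RULES.items()}


def stable_states():
    """Enumerate all states and keep those that map onto themselves."""
    genes = sorted(RULES)
    found = []
    for values in product([False, True], repeat=len(genes)):
        state = dict(zip(genes, values))
        if step(state) == state:
            found.append(state)
    return found


STABLE = stable_states()
```

The two stable states (A on with B off, and the reverse) correspond to the two outcomes of a switch; exhaustive enumeration like this only scales to small networks, which is one reason dedicated algorithms are needed for comprehensive models.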
PhD obtained in November 2009
Starting Date : October 2006
Title Research : Logical Modelling of T-helper cell differentiation
Promoter(s) : Claudine Chaouiya and Denis Thieffry
Short Abstract :
The immune response is tightly controlled by specialized cell types, which arise from the peripheral differentiation of "naive" T helper cells. This work aims at integrating and extending existing qualitative models of T cell activation and Th1/Th2 differentiation, in order to further cover alternative pathways leading to Treg and Th17 differentiation. This involves the development of novel computational methods to cope with the complexity of the corresponding networks. Building on a logical (multilevel) formalism, novel algorithms are systematically integrated into GINsim, a software package dedicated to the modeling and analysis of biological regulatory networks.
PhD obtained in July 2009
Starting Date : October 2007
Title Research : From functional genomic data to genetic network modelling
Promoter(s) : Denis Thieffry, Denis Puthier and Elisabeth Remy
Short Abstract :
This work aims at developing efficient algorithms for processing, integrating and analyzing transcriptomic data to validate and refine existing qualitative dynamical models. It encompassed the design and implementation of algorithms to extract transcriptional signatures from publicly available data (TranscriptomeBrowser software), and the comparison of the results of transcriptome meta-analyses with logical models for T cell activation, proliferation and differentiation in human and mouse (using and extending GINsim, a software package dedicated to the modeling and simulation of biological regulatory networks).
PhD obtained in July 2009
Starting Date : 01/09/2008
Title Research : Qualitative dynamical modelling of the core regulatory network controlling heart development in Drosophila melanogaster.
Promoter : Denis Thieffry
Starting Date : 01/10/2008
Title Research : Identification of cis-regulatory elements common to early co-expressed genes during Drosophila embryogenesis.
Promoters : Denis Thieffry and Jacques van Helden
Starting Date : 01/10/2008
Title Research: Mining microarray data for regulatory interactions with TranscriptomeBrowser.
Promoter(s) : Denis Puthier and Jean Imbert
Starting Date : 01/05/2008
Promoter(s) : Denis Thieffry and Elisabeth Remy
Title Research : Computational analysis of the dynamics of logical regulatory graphs.
Starting Date : 01/09/2009
Promoters : Denis Puthier and Christine Brun
Title Research: Functional genomic data integration for regulatory network inference and validation.
Promoter : Denis Thieffry
Title Research : Qualitative modelling and analysis of the dynamics of biological regulatory networks.
Habilitation à diriger des recherches (HDR) obtained in September 2007.