Network-based methods to identify mechanisms of action in disease and drug perturbation profiles using high-throughput genomic data
Abstract
In the past decade it has become increasingly clear that a biological response is rarely caused by a single gene or protein. Rather, it is the result of a myriad of biological factors, constituting a systematic network of biological variables that span multiple granularities of biology, from gene transcription to cell metabolism. Therefore, it has become a significant challenge in the field of bioinformatics to integrate different levels of biology and to think of biological problems from a network perspective. In my thesis, I will discuss three projects that address this challenge.
First, I will introduce two novel methods that integrate quantitative and qualitative biological data in a network approach. My aim in chapters two and three is to combine high-throughput data with biological databases to identify the causal mechanisms of action (MoA), in the form of canonical biological pathways, underlying the data for a given phenotype. In the second chapter, I will introduce an algorithm called Latent Pathway Identification Analysis (LPIA). This algorithm looks for statistically significant evidence of dysregulation in a network of pathways constructed in a manner that explicitly links pathways through their common function in the cell.
In chapter three, I will introduce a new method that focuses on the identification of perturbed pathways from high-throughput gene expression data, which we approach as a task in statistical modeling and inference. We develop a two-level statistical model, where (i) the first level captures the relationship between high-throughput gene expression and biological pathways, and (ii) the second level models the behavior within an underlying network of pathways induced by an unknown perturbation.
In the fourth chapter, I will focus on the integration of high-throughput data at two distinct levels of biology to elucidate associations and causal relationships amongst genotype, gene expression and glycemic traits relevant to Type 2 Diabetes. I use the Framingham Heart Study, as well as its extension, the SABRe initiative, to identify genes whose expression may be causally linked to fasting glucose.