Liis Kolberg: PhD thesis defense “Developing and applying bioinformatics tools for gene expression data interpretation”

Video production: Merlin Pastak, 21.06.2021, 165 views, Computer Science

Supervisor:
* Assoc. Prof. Hedi Peterson, Institute of Computer Science, University of Tartu.

Opponents:
* Dr. Martina Summer-Kutmon, Maastricht University (Netherlands);
* Assoc. Prof. Kerrin Small, King's College London (United Kingdom).
Modern technologies enable researchers to measure the expression levels of all genes simultaneously, under different conditions and in different groups of people. For example, gene expression is measured in cancerous and normal human tissues. The result is a high-dimensional data set containing the expression levels of tens of thousands of genes, which is then searched for genes with similar expression patterns that may be involved in the development of a particular type of cancer. Various data mining methods and statistical tests are used to detect groups of genes with similar expression. To better understand these groups, previously known information about their genes is gathered to identify common functions. In this way, new functions of less-studied genes, or new genes related to the disease under study, can be found. However, such analyses require applying several methods and performing numerous statistical tests. For this reason, bioinformaticians develop tools that perform these calculations. In this thesis, we developed two tools, g:Profiler and funcExplorer, that make it easy to interpret gene expression data. g:Profiler finds statistically significant overlaps between gene lists and known gene descriptions, while funcExplorer groups genes with similar expression profiles, taking into account the descriptions found with g:Profiler. Among other things, these tools present the results through interactive plots, making it possible to obtain a global overview of the data and to share the results with others.

In the second part of the thesis, we studied genetic variants that affect gene expression levels. To do this, we first used funcExplorer to detect groups of genes with similar expression. We then identified genetic variants that influence the expression of these genes. Finally, we used g:Profiler to interpret these groups and, through them, the genetic variants that affect them. As a result, we identified a novel association for which the timing and conditions of the expression measurement are essential, and confirmed several previously reported associations.

More information online: