UGent, Department of Mathematical Modelling, Statistics and Bio-informatics

Contact details
Traineeship proposition
Abstract traineeship advanced bachelor of bioinformatics 2017-2018: A container for the Hardy-Weinberg equilibrium based genomic imprinting detection pipeline

For nearly a century, the term “epigenetics” was unknown to researchers. Today, malfunctions in epigenetic mechanisms can be linked to many illnesses. Genomic imprinting is one of these important epigenetic processes.

The term imprinting applies when a parental epigenetic influence on the genome causes monoallelic expression. This phenomenon plays an important role in growth and development. Unfortunately, (epi)mutations or chromosomal aberrations may cause loss of imprinting of individual genes. Although imprinting is known to occur in only a small fraction (>0.1%) of all genes, errors in this epigenetic regulation have been found in several diseases and cancers. A well-known example is Prader-Willi syndrome, in which the active paternal copy on chromosome 15 is deactivated.

This is why an accessible method to detect these kinds of genes would be valuable. Older methods relied on genotyping combined with RNA sequencing. Their drawbacks include the cost of genotyping and the often restricted set of SNPs it can detect.

Recently, a new method was successfully developed at BioBix that detects imprinted loci in RNA-seq data without the need for genotyping. It uses the Hardy-Weinberg equilibrium to estimate the expected number of heterozygous samples in a population and compares this with the number of heterozygous samples observed in the RNA-seq data. The adjusted method provides a cheaper and less computationally intensive way to analyze these genes.
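
The comparison between expected and observed heterozygous samples can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the BioBix implementation: the input is assumed to be per-sample read counts for the two alleles of a single SNP, the allele frequency is estimated as the mean per-sample allele fraction, and a sample is called heterozygous when its minor allele fraction reaches a hypothetical threshold of 0.2. All function names and the threshold are illustrative.

```python
def allele_frequency(samples):
    """Estimate the reference allele frequency p as the mean
    per-sample fraction of reads carrying the reference allele.
    (Assumption: `samples` is a list of (ref_reads, alt_reads).)"""
    fractions = [ref / (ref + alt) for ref, alt in samples if ref + alt > 0]
    return sum(fractions) / len(fractions)


def expected_heterozygotes(samples):
    """Expected number of heterozygous samples under Hardy-Weinberg
    equilibrium: 2 * p * q * n, with q = 1 - p."""
    n = sum(1 for ref, alt in samples if ref + alt > 0)
    p = allele_frequency(samples)
    return 2 * p * (1 - p) * n


def observed_heterozygotes(samples, min_minor_fraction=0.2):
    """Count samples in which both alleles are visibly expressed.
    The 0.2 threshold is an illustrative assumption."""
    count = 0
    for ref, alt in samples:
        total = ref + alt
        if total > 0 and min(ref, alt) / total >= min_minor_fraction:
            count += 1
    return count


def heterozygote_deficit(samples, min_minor_fraction=0.2):
    """Relative shortfall of observed versus expected heterozygotes.
    Values near 1 are consistent with monoallelic expression;
    values near 0 are consistent with biallelic expression."""
    expected = expected_heterozygotes(samples)
    if expected == 0:
        return 0.0  # monomorphic SNP: no heterozygotes expected
    observed = observed_heterozygotes(samples, min_minor_fraction)
    return 1 - observed / expected


if __name__ == "__main__":
    # Toy data: every sample expresses only one of the two alleles.
    imprinted_like = [(100, 0)] * 5 + [(0, 100)] * 5
    print(expected_heterozygotes(imprinted_like))
    print(observed_heterozygotes(imprinted_like))
    print(heterozygote_deficit(imprinted_like))
```

For a fully imprinted locus, every sample expresses a single allele, so the observed count of heterozygous samples falls well below the Hardy-Weinberg expectation. A real analysis would also have to account for sequencing errors, coverage and statistical significance, which this sketch ignores.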

Because the new method could benefit society at large, it is worth publishing as a tool that is easily accessible to researchers all over the globe. The methodology was therefore put into a package with a handy manual, in which the different parameters can easily be changed depending on requirements.

The tool has been integrated into the widely accessible Galaxy server. The insights gained with this new package are expected to offer a new perspective on the relation between loss of imprinting and cancer (and disease in general).

Abstract final project 2013-2014: RNA-seq data-analysis pipeline integration in GALAXY
With Next Generation Sequencing (NGS) technology, the amount of data we are able to collect has increased dramatically. NGS also gave rise to software tools that are able to analyze this quantity of data. The analysis of this data consists of multiple steps (quality control, normalization, differential expression, ...). For every step, a handful of tools is available.
GALAXY is a web-based platform with an easy-to-use interface that doesn’t require much knowledge of informatics. GALAXY offers the user a collection of data-analysis tools by default, but also enables the user to adjust tools that are already present or to integrate additional tools as needed.
The aim of this work was to create an RNA-seq data-analysis pipeline in GALAXY. This pipeline, a collection of tools within GALAXY, enables the user to save time, choose the required settings and reproduce the exact workflow of an executed analysis. The tools used in this pipeline have been adjusted so that the pipeline can be used in multiple situations involving RNA-seq data.
The pipeline was tested using a dataset from research on Birt-Hogg-Dubé (BHD) syndrome. BHD syndrome is a rare condition that affects the skin and increases the chance of developing kidney and lung tumors. The same dataset was also analyzed with R packages in RStudio, and those results were compared with the pipeline results to verify that the pipeline works.


Sint-Pietersnieuwstraat 25
9000 Gent


Traineeship supervisor
Tim De Meyer