Abstract traineeship 2020-2021 (project 1): AUTOMATIZATION OF DDPCR ANALYSIS
The goal of this internship was to create an application that can analyse ddPCR data from a QX200 instrument. Before the internship, this analysis was done manually in Excel. To automate the process, a Shiny app was developed. Figure 1 shows how the app will be constructed. We started from an existing application, ddPCRQuant, which was developed at UGent. We extended this application with GLMM analysis, which was also developed at UGent. The results from ddPCRQuant were used as input for the GLMM analysis. After the GLMM analysis, some basic calculations were performed (e.g. to calculate the concentration of the target in the sample). ddPCRQuant combined with GLMM analysis is called the MVP (Minimum Viable Product) and forms the core of the application. Beyond the MVP, the application was expanded further. Three extra modules can be used on top of the MVP: a clinical trial module, a custom module and a scientist support module. Within the timeframe of this internship, there was no time to develop the custom module and the scientist support module. Within the MVP, all settings can be adjusted. This is the main difference with the clinical trial module, in which every setting needs to be predetermined and cannot be adjusted once the module is activated. In clinical trials, the analysis needs to be fixed to make results comparable between different.
To run the application, four files are needed: server.R, ui.R, global.R and glmmfunctions.R. The file server.R contains the server-side logic, ui.R defines the structure of the user interface, global.R holds the general information (which packages are needed) and glmmfunctions.R contains all the GLMM functions needed for the analysis.
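As an illustration of the kind of basic calculation mentioned in this abstract (the concentration of the target in the sample), the sketch below applies the standard Poisson correction used in ddPCR: with a fraction p of positive droplets, the mean number of copies per droplet is -ln(1 - p). This is not the code of the app itself. The function names are hypothetical, and the droplet volume of 0.85 nL is an assumed value that is commonly quoted for the QX200.

```python
import math

# Assumed droplet volume in microlitres (0.85 nL). This is an assumption,
# not a value taken from the application described above.
DROPLET_VOLUME_UL = 0.00085


def copies_per_droplet(positive: int, total: int) -> float:
    """Mean number of target copies per droplet (Poisson correction)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= positive < total:
        # With every droplet positive the estimate is undefined (log of 0).
        raise ValueError("positive must satisfy 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def concentration_copies_per_ul(
    positive: int, total: int, droplet_volume_ul: float = DROPLET_VOLUME_UL
) -> float:
    """Target concentration in copies per microlitre of reaction volume."""
    return copies_per_droplet(positive, total) / droplet_volume_ul


if __name__ == "__main__":
    # Hypothetical well: 5000 positive droplets out of 10000 accepted droplets.
    print(concentration_copies_per_ul(5000, 10000))
```

The Poisson correction accounts for droplets that contain more than one target copy, which is why the positive fraction is not used directly as the number of copies per droplet.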
Abstract traineeship 2020-2021 (project 2): SHINY APPLICATION FOR ANTISENSE OLIGONUCLEOTIDE OFF-TARGET SCREENING
Antisense oligonucleotides (ASOs) are a class of RNA-targeting medicines that typically display high target specificity. In the drug discovery pipeline, potent ASOs are selected based on their ability to reduce target expression. Despite sequence specificity for the target, non-specific regulation of off-target genes is frequently observed. Since these off-target effects may result in toxicity of the oligonucleotides and failure in clinical trials, it is important to assess the off-target profiles of potential ASO drugs and to use these to further prioritize candidate ASOs early in the drug development process. Biogazelle offers a unique high-throughput molecular characterization service for RNA-targeting drugs to study these off-target effects. This application, called HTTargetSeq, combines differential gene expression with in silico off-target predictions to identify off-target profiles. Currently, a PDF report with several data visualizations is provided to the customer to help interpret the vast amount of data generated in an HTTargetSeq off-target analysis and ultimately select the ASOs with a favorable off-target profile.
The goal of this traineeship was to develop an interactive tool for data visualization and exploration using the Shiny package in R. This Shiny application will complement the current static report and help facilitate data interpretation by the customer.
The newly developed tool presents HTTargetSeq results in two separate tabs, named “ASO off-target overview” and “ASO off-target profile”. The “ASO off-target overview” tab gives a summary of the off-target analysis for all tested ASOs. This summary consists of a bar plot visualizing the number of off-targets for all ASOs and a data table with the numeric information behind this plot, which can be exported as a CSV file. One can choose to display either predicted or validated off-targets. Further filtering options include cell line selection, selection of the type of rescaling used to obtain the rescaled normalized counts, and ASO subset selection. A second visualization in this tab is a heatmap representing the average gene expression of the ASO treatments. Optional clustering of the expression heatmaps enables appreciation of off-target profile similarity between ASOs.
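The summary behind the bar plot and the exported table can be pictured as a filtered count of off-targets per ASO. The sketch below is only an illustration in Python, not the Shiny code of the tool. The record layout (fields `aso`, `gene`, `cell_line`, `status`), the example values and the function names are all hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical records: one row per off-target call for an ASO.
records = [
    {"aso": "ASO_1", "gene": "GENE_A", "cell_line": "cell_line_1", "status": "predicted"},
    {"aso": "ASO_1", "gene": "GENE_B", "cell_line": "cell_line_1", "status": "predicted"},
    {"aso": "ASO_1", "gene": "GENE_A", "cell_line": "cell_line_1", "status": "validated"},
    {"aso": "ASO_2", "gene": "GENE_C", "cell_line": "cell_line_1", "status": "predicted"},
    {"aso": "ASO_2", "gene": "GENE_D", "cell_line": "cell_line_2", "status": "predicted"},
]


def count_off_targets(rows, status, cell_line=None, asos=None):
    """Count off-targets per ASO for one status, with optional filters."""
    counts = Counter()
    for row in rows:
        if row["status"] != status:
            continue
        if cell_line is not None and row["cell_line"] != cell_line:
            continue
        if asos is not None and row["aso"] not in asos:
            continue
        counts[row["aso"]] += 1
    return dict(counts)


def counts_to_csv(counts):
    """Serialize the per-ASO counts as CSV text, sorted by ASO name."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["aso", "n_off_targets"])
    for aso in sorted(counts):
        writer.writerow([aso, counts[aso]])
    return buffer.getvalue()


if __name__ == "__main__":
    print(counts_to_csv(count_off_targets(records, "predicted", cell_line="cell_line_1")))
```

The same counts feed both the plot and the table, so a filter change only has to be applied in one place.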
The “ASO off-target profile” tab visualizes off-target information of one selected ASO. A cumulative distribution plot of average expressions is given for off-targets and non-off-targets. The data is pre-processed based on the selected qualification as an off-target, with the option to choose between predicted off-targets (with a selected number of mismatches) and validated off-targets. In addition to the off-target distribution plot, a table with off-target information of the selected ASO is shown and can be exported as a CSV file.
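A cumulative distribution plot of this kind rests on the empirical cumulative distribution function (ECDF), computed separately for off-targets and non-off-targets. The following Python sketch illustrates that idea under stated assumptions. The gene names, expression values and function names are hypothetical and do not come from the tool.

```python
from bisect import bisect_right


def ecdf_at(values, x):
    """Fraction of values that are less than or equal to x."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    return bisect_right(ordered, x) / len(ordered)


def split_by_off_target(expression, off_targets):
    """Split a gene -> expression mapping into off-target and other values."""
    inside = [v for gene, v in expression.items() if gene in off_targets]
    outside = [v for gene, v in expression.items() if gene not in off_targets]
    return inside, outside


# Hypothetical average expression changes per gene for one ASO treatment.
expression = {
    "GENE_A": -1.2,
    "GENE_B": -0.8,
    "GENE_C": 0.1,
    "GENE_D": 0.3,
    "GENE_E": -0.1,
}
off_targets = {"GENE_A", "GENE_B"}

off_values, other_values = split_by_off_target(expression, off_targets)

if __name__ == "__main__":
    for threshold in (-1.0, -0.5, 0.0):
        print(threshold, ecdf_at(off_values, threshold), ecdf_at(other_values, threshold))
```

If the off-target curve rises earlier than the non-off-target curve, the off-target genes tend to have lower expression after treatment, which is the pattern such a plot is meant to reveal.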
Since the tool was developed with an in silico generated dataset, the next step would be to test it with a real-life dataset.
Future opportunities for expanding the tool include, but are not restricted to, adding two extra tabs (“ASO off-target comparison” and “Toxicity viewer”) and adding a dose-response curve (displaying the dose-response relationship of both the on-target and a selected off-target) to the off-target profile tab. Several visualizations could be added to the ASO off-target comparison tab to compare off-targets: a Venn/Euler diagram, a comparison of the cumulative distributions and a comparison of therapeutic indices. The Toxicity viewer tab can be adapted from the existing GSEA (Gene Set Enrichment Analysis) app.
Abstract traineeship 2018-2019: HTPathwaySeq: RShiny user interface prototype for data exploration of high-throughput pathway analysis
HTPathwaySeq is a novel application, developed by Biogazelle, for high-throughput molecular phenotyping based on RNA sequencing. It allows one to establish compound activity in early drug discovery stages in a more cost-effective manner. The data output of the pipeline consists of enriched gene sets, for up to 95 contrasts with varying conditions, in numerous text-based files. The objective of this traineeship was to develop an interactive and dynamic user interface with the R package Shiny. Users should be able to explore the data generated by HTPathwaySeq and interact with data analyses such as pathway activity, compound similarity and molecular toxicity. In short, the application makes it possible to: (i) get a general overview of the results obtained from HTPathwaySeq; (ii) explore the enriched gene sets for individual contrasts, together with informative descriptions and gene details; (iii) determine compound toxicity by visualizing enriched gene sets with known cellular responses to toxic agents; and (iv) assess molecular similarity among compounds by clustering contrasts based on gene set enrichment correlation. Furthermore, the application is designed to work dynamically for different experimental designs and offers extensive interactivity to improve exploration and visualization. To conclude, the HTPathwaySeq RShiny user interface allows for improved data exploration in order to gain more insight and increase research outcome.
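Point (iv), clustering contrasts based on gene set enrichment correlation, can be illustrated with a small Python sketch. It computes the Pearson correlation between the enrichment scores of two contrasts and groups contrasts by single-linkage clustering cut at a distance threshold, with distance defined as 1 - r. The abstract does not state which correlation measure or clustering method the application uses, so both are assumptions here, as are the contrast names, the scores and the function names.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long score vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def cluster_contrasts(scores, max_distance):
    """Single-linkage clusters: join contrasts whose 1 - r <= max_distance."""
    names = sorted(scores)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if 1.0 - pearson(scores[first], scores[second]) <= max_distance:
                parent[find(first)] = find(second)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


# Hypothetical enrichment scores of four gene sets for three contrasts.
scores = {
    "compound_A": [2.1, -1.0, 0.5, 1.8],
    "compound_B": [1.9, -0.8, 0.4, 2.0],
    "compound_C": [-1.5, 2.2, -0.3, -1.9],
}

if __name__ == "__main__":
    print(cluster_contrasts(scores, max_distance=0.2))
```

In this toy input, compound_A and compound_B have nearly identical enrichment patterns and end up in one cluster, while compound_C, whose pattern is reversed, stays on its own.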
Abstract traineeship advanced bachelor of bioinformatics 2017-2018: RNA Seq compendium using R/Shiny
Biogazelle is a transcriptomics service provider with extensive experience in RNA sequencing technologies. Besides sequencing projects for customers, the R&D unit has generated a large catalog of sequenced samples, covering a series of tissues, sample types, diseases, and many technical settings.
The purpose of the internship is to explore this internal collection of mRNA expression data to support our customers in finding an optimal experimental setting for their liquid biopsy experiment.
In the first phase, we collected sample information, gene annotation and gene expression data from projects executed at Biogazelle. Data from all projects were normalized and homogenized to generate large data matrices (in R). In the second phase, we built a simple (web) interface (using Shiny) to query these data matrices.
The web interface allows users to select a gene of interest and display the normalized gene expression values together with sample information. Furthermore, the application can visualize gene expression levels (R) between different datasets, across different tissue types and sample types, and in multiple diseases. It also supplies the user with sample-by-sample QC information to make data interpretation easier.
This gives users insight into the expression of their selected gene. They will be able to determine the best course of action for future experiments in terms of the selection of samples (sample types/tissue types), the type of disease they want to study, the expression levels of the chosen gene and more.
Gert Van Peer