UGent, Department of (Micro)biology
Internship topic 2009-2010: diversity of bacterial symbionts in the green alga Boergesenia
c) Amplification of the complete chloroplast (cp) genome by rolling circle amplification (RCA): isothermal amplification using bacteriophage Phi29 polymerase, which is capable of DNA synthesis over more than 70 kb. This technique has mainly been used for amplification of the human genome, and a commercial kit is available for that purpose.
To study the taxonomy and epidemiology of bacterial pathogens, it is important to obtain as much information as possible about the strains under study. As sequencing technology has become more advanced and cheaper, it has become customary to sequence whole genomes instead of only a few individual genes. However, for many bacterial species no reference genome is available yet. The aim of this project was to process raw whole-genome sequencing data into assembled and annotated genomes that can be published and used as a reference for future research.
Two different datasets were used: one with 52 Achromobacter strains and one with 19 Burkholderia strains. Both datasets consist of raw data in fastq.gz format generated on the Illumina HiSeq 4000 and Illumina NovaSeq 6000 platforms. A pipeline was followed to turn these reads into assembled and annotated genomes. The first step was to evaluate the quality of the raw data and to filter the reads on a minimum length of 50 bp and a quality score of Q30. This was done with FastQC and fastp. Next, we used Shovill, a tool that combines SPAdes, bwa, samtools and Pilon. By default, Shovill downsamples the reads to a subset with a sequencing depth of 150. SPAdes assembled the genomes using k-mers according to the de Bruijn graph principle, resulting in the construction of contigs, which were then filtered on a minimum length of 500 bp. The raw reads were mapped against these contigs with bwa, then sorted and indexed with samtools. Pilon then performed an alignment analysis and attempted to improve the assembly.
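The filtering and assembly steps above can be illustrated with a minimal Python sketch. This is not the actual fastp or SPAdes code, only a toy illustration of the two ideas involved: dropping reads below the 50 bp / Q30 thresholds, and decomposing reads into k-mers as a de Bruijn graph assembler does. All function names are ours.

```python
# Toy illustration of the read-filtering and k-mer steps of the pipeline.
# Thresholds mirror those used in the project (50 bp, Q30).

def mean_quality(qual_string, offset=33):
    """Mean Phred quality of a read from its ASCII quality string."""
    return sum(ord(c) - offset for c in qual_string) / len(qual_string)

def passes_filter(seq, qual, min_len=50, min_q=30):
    """Keep a read only if it is at least 50 bp with mean quality >= Q30."""
    return len(seq) >= min_len and mean_quality(qual) >= min_q

def kmers(seq, k):
    """All overlapping k-mers of a read; in a de Bruijn graph these
    k-mers become edges between (k-1)-mer nodes."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

# A 6 bp toy read decomposed into 4-mers:
print(kmers("ACGTAC", 4))  # ['ACGT', 'CGTA', 'GTAC']
```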
The assembled genomes were annotated with Prokka. To check the quality of the final assemblies, we mapped the raw reads against their assemblies and used Qualimap to provide more details about the resulting alignments. Important checks are the number of unmapped reads and the coverage. CheckM was used as an additional method to evaluate the quality of the annotated assemblies, by estimating genome completeness and possible contamination. In the second part of the project, housekeeping genes were extracted from the genomes and added to an internal database. Multi-Locus Sequence Typing (MLST) analysis was performed to determine the sequence type of the strains. Because the sequences of these housekeeping genes vary between strains, this kind of information is important for epidemiological studies.
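The MLST principle described above can be sketched in a few lines: each extracted housekeeping gene is matched against known alleles, and the combination of allele numbers defines the sequence type (ST). The allele and profile tables below are entirely invented for illustration; real MLST schemes use seven genes and curated public databases.

```python
# Hedged sketch of MLST: map gene sequences to allele numbers, then map
# the allele profile to a sequence type. Tables are hypothetical.

ALLELES = {
    "nusA": {"ATGGCT": 1, "ATGGCC": 2},
    "rpoB": {"GTTACA": 1, "GTTACG": 2},
}
ST_PROFILES = {(1, 1): "ST-5", (2, 1): "ST-12"}

def sequence_type(genes):
    """Assign an ST from extracted gene sequences via the allele profile."""
    profile = tuple(ALLELES[g][genes[g]] for g in ("nusA", "rpoB"))
    return ST_PROFILES.get(profile, "novel ST")

print(sequence_type({"nusA": "ATGGCT", "rpoB": "GTTACA"}))  # ST-5
```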
For both datasets, one housekeeping gene was used as a control to verify the authenticity of the samples. The sequence extracted from our genome was compared, via multiple sequence alignment, to the sequence of the same gene previously determined from the DNA sample by Sanger sequencing. For this, MUSCLE was used within the MEGA7 software. A phylogenetic tree was also constructed using the Neighbour-Joining method to visualize the evolutionary relationships. Figure 1 shows that the sequences obtained via Sanger sequencing and via whole-genome sequencing (indicated by “exseq WGS”) cluster together for each of the samples.
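Neighbour-Joining operates on a pairwise distance matrix; the simplest such distance for aligned sequences is the p-distance (the fraction of mismatching positions). A small sketch, with invented sequences, shows why identical Sanger and WGS sequences end up clustering together: their distance is zero.

```python
# Sketch of the distance input to Neighbour-Joining: p-distance between
# aligned sequences of equal length. Example sequences are invented.

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    assert len(a) == len(b), "sequences must be aligned to equal length"
    return sum(x != y for x, y in zip(a, b)) / len(a)

seqs = {
    "sampleA_sanger": "ACGTACGT",
    "sampleA_WGS":    "ACGTACGT",  # identical -> distance 0, clusters together
    "sampleB_sanger": "ACGAACGA",
}
for i in seqs:
    for j in seqs:
        if i < j:
            print(f"{i} vs {j}: {p_distance(seqs[i], seqs[j]):.3f}")
```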
Finally, bcgTree was used to build an extended phylogenetic tree based on maximum-likelihood analysis of 107 different genes from the strains of both datasets. The genomes that passed all quality controls were approved for publication. Both the raw reads and the annotated genomes were registered and uploaded to the European Nucleotide Archive. As a result, 41 Achromobacter and 15 Burkholderia strains are now publicly available through the accession numbers PRJEB37567 and PRJEB37806, respectively.
Background: The project I am working on is BioSoCr, a project of the research group PAE, part of the Biology Section. It investigates the “Greening of the Arctic”, which is most likely directly correlated with the warming climate. For this, samples were taken on Svalbard, an archipelago very close to the North Pole, at 12 different locations. At each location, samples were taken at three positions: Young (no vegetation), Intermediate (some vegetation) and Old (abundant vegetation), both from the top layer and from deeper soil (5-10 cm). Three biological replicates, called A-B-C, were taken per site. The aim of my traineeship is to perform the bioinformatics analysis on the first set of samples.
Method: I first started with the analysis of the raw samples, checking read quality with MultiQC. After that I merged the paired-end reads using the PEAR program, then performed quality filtering with USEARCH. Next I pooled the samples and kept only the unique sequences, which I checked for chimeras using USEARCH. Finally, the BLAST analysis was performed using the program MOTHUR, followed by a manual check of the data to detect contamination and to remove organisms that are not of interest to the project.
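The dereplication step (keeping only unique sequences) can be sketched simply: collapse identical reads and record their abundances, which downstream chimera detection relies on. This is only an illustration of the idea behind USEARCH's dereplication, not its implementation.

```python
# Sketch of dereplication: collapse identical sequences, keep counts,
# order by abundance (as chimera detection expects).

from collections import Counter

def dereplicate(reads):
    """Return unique sequences with their counts, most abundant first."""
    counts = Counter(reads)
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

reads = ["ACGT", "ACGT", "TTGA", "ACGT", "TTGA"]
print(dereplicate(reads))  # [('ACGT', 3), ('TTGA', 2)]
```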
Then I performed the further analysis in R. I first started with the analysis of the raw data: checking the library sizes, looking at the rarefaction curves, and checking how many reads were lost after filtering. Based on the analysis of the library sizes, the decision was made to normalize the eukaryotic libraries to 3500 reads and the prokaryotic libraries to 10000 reads. The reason for this difference is that many eukaryotic reads were lost after removing the Embryophyceae. The other parts of the R analysis used the normalized data.
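Normalizing a library to a fixed depth is usually done by rarefying: randomly subsampling reads without replacement down to the target count (3500 or 10000 in this project). A minimal Python sketch of the idea, with an invented OTU table:

```python
# Sketch of rarefying: subsample a library (OTU -> read count) without
# replacement to a fixed depth. OTU names and counts are invented.

import random

def rarefy(otu_counts, depth, seed=42):
    """Subsample a library to exactly `depth` reads."""
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("library smaller than target depth")
    sample = random.Random(seed).sample(pool, depth)
    out = {}
    for otu in sample:
        out[otu] = out.get(otu, 0) + 1
    return out

lib = {"OTU1": 6000, "OTU2": 3000, "OTU3": 1000}
norm = rarefy(lib, 3500)
print(sum(norm.values()))  # 3500
```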
The second part consisted of statistical analyses of the data: an NMDS analysis, a heatmap, and a CAP analysis. The CAP analysis includes metadata in the calculation; for our dataset, I checked for clustering by including both the metadata “Location” and the metadata “Deep/Top”.
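The statistic that CAP reports here, the CCR (correct class ratio, discussed in the Results), is simply the fraction of samples whose predicted group matches the true metadata group. A minimal sketch with invented labels:

```python
# Sketch of the CCR (correct class ratio): fraction of samples whose
# predicted class matches the true metadata class. Labels are invented.

def correct_class_ratio(true_labels, predicted_labels):
    """Fraction of correctly classified samples."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

true = ["Top", "Top", "Deep", "Deep", "Deep"]
pred = ["Top", "Deep", "Deep", "Deep", "Deep"]
print(correct_class_ratio(true, pred))  # 0.8
```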
In the last part I looked in more detail at the different taxonomic groups present in the samples. I identified the overall largest groups for the eukaryotes and the prokaryotes, and then checked the differences per location. A dendrogram was also made to identify the most diverse groups of organisms. Finally, the composition of the most interesting groups was examined in more detail.
Results: Of the statistical analyses, NMDS was the least informative; CAP and the heatmap gave better results. With NMDS I could not find clear clustering of the samples based on the metadata. The CAP analysis showed clear visual clustering for both sampling location and sampling depth, which was also statistically supported by the CCR (correct class ratio) values. The heatmap likewise showed similarity between samples; the similarity was greatest among the 16S samples, where the heatmap was calculated based on the presence/absence of the different OTUs.
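A common way to turn presence/absence OTU data into a similarity for a heatmap is the Jaccard index: shared OTUs over total distinct OTUs between two samples. The exact metric used in the R analysis is not stated, so this sketch, with invented OTU sets, only illustrates the presence/absence idea.

```python
# Sketch of a presence/absence similarity (Jaccard index) between the
# OTU sets of two samples. OTU sets are invented.

def jaccard(otus_a, otus_b):
    """Shared OTUs divided by total distinct OTUs across two samples."""
    a, b = set(otus_a), set(otus_b)
    return len(a & b) / len(a | b)

sample1 = {"OTU1", "OTU2", "OTU3"}
sample2 = {"OTU2", "OTU3", "OTU4"}
print(jaccard(sample1, sample2))  # 0.5
```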
For the taxonomic analysis I identified the major groups of prokaryotic and eukaryotic organisms: the largest prokaryotic groups were Proteobacteria, Actinobacteria and Acidobacteria, while the largest eukaryotic groups were Opisthokonta, Archaeplastida and Alveolata.
Conclusion: As a main conclusion, for the first part of the analysis, the processing of the raw data, I had to make some changes to the script, but overall it was very well documented. For the future, however, I would want to include an alternative pipeline in order to compare results.
As for the work in R, I first looked at the correlation between the samples and whether samples sharing metadata clustered together. For that I tried several options: MDS, NMDS, PCA and CAP analysis. In the end I kept only NMDS and CAP. The advantage of NMDS over MDS is that NMDS is more robust. PCA was left out because it does not cope well with data containing many zero counts. CAP came out as the best method, because it can include the metadata you want to correlate and is statistically substantiated: it reports the CCR (correct class ratio), which indicates the “correctness” of the clustering. Finally, I also included a heatmap, which shows the correlation between samples with a colour code and gives a good representation of how related the samples are. The last part in R was the taxonomic analysis. This was very important for looking in more detail at the different taxonomic groups: which ones are most abundant, and whether there are large differences between locations and between sampling depths. Of course, for this last part, further analyses are needed to loop back to the original set-up of the project: is what we find logical compared to what was expected?
KL Ledeganckstraat 35
Olivier De Clerck