Computational analysis of short read Illumina and nanopore sequencing libraries

Habulin, Dunja (2016) Computational analysis of short read Illumina and nanopore sequencing libraries. Diploma thesis, Faculty of Science > Department of Biology.

Language: Croatian

Abstract

Hybrid data, i.e. reads from two or more different sequencing libraries, are increasingly used for analysis and de novo genome assembly. Illumina and Oxford Nanopore sequencing complement each other well for this purpose: Illumina produces large amounts of accurate short-read data, while Oxford Nanopore has been shown to produce long reads. I analysed genomic DNA libraries of the sponge Ephydatia mülleri that were generated by hybrid sequencing. Sponges belong to the phylum Porifera and may represent a milestone in the early evolution of the kingdom Metazoa. It is therefore important to assemble as many genomes as possible from this group in order to clarify events in metazoan evolution and the relationships among other groups within the kingdom. Because sponges live in obligate symbiosis with many bacterial species and cannot be grown under sterile laboratory conditions, the prepared sequencing libraries also contain a certain level of DNA contamination. To analyse the abundance of contaminant reads, I clustered the reads from both sequencing technologies using machine learning methods. Reads within individual clusters were then used to build contigs. A comparison of the clustering results and of contaminant detection efficiency showed that reads from Oxford Nanopore sequencing gave better results. Regardless of the methodology applied, the identification of certain contaminants remains problematic, so the protocol needs to be further improved with additional computational methods.
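
The abstract does not state which features or clustering algorithm the thesis used. As an illustration only, the general idea of separating host reads from contaminant reads by unsupervised clustering can be sketched with trinucleotide frequency profiles and a plain k-means loop. Everything here is an assumption made for the example: the function names `kmer_profile` and `kmeans`, the choice of k-mer composition as the feature, the number of clusters, and the toy reads.

```python
from itertools import product


def kmer_profile(seq, k=3):
    """Return the normalized k-mer frequencies of a read as a fixed-order vector."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # words containing N or other symbols are skipped
            counts[word] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on very short reads
    return [counts[w] / total for w in kmers]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k=2, iterations=20):
    """Plain Lloyd's k-means. The first k vectors seed the centroids,
    which keeps the toy example deterministic (an assumption, not a
    recommended initialization for real data)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each read goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy reads: two AT-rich and two GC-rich sequences.
reads = [
    "ATATTAATATTTAAATAT",
    "GCGGCCGCGGGCCGCGCC",
    "TTATAAATTATATTAAAT",
    "CCGCGGGCGCCGGCGCGG",
]
labels = kmeans([kmer_profile(r) for r in reads], k=2)
print(labels)
```

With real libraries, reads in each resulting cluster would then be passed to an assembler and the contigs checked against reference databases. Composition-based features are generally less reliable on short reads than on long ones, so any real analysis would need a more careful feature and model choice than this sketch.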

Item Type: Thesis (Diploma thesis)
Keywords: next generation sequencing technologies, unsupervised machine learning, contig assembly
Supervisor: Vlahoviček, Kristian
Date: 2016
Number of Pages: 45
Subjects: NATURAL SCIENCES > Biology
Divisions: Faculty of Science > Department of Biology
Depositing User: Grozdana Sirotic
Date Deposited: 28 Nov 2016 09:36
Last Modified: 28 Nov 2016 09:36
URI: http://digre.pmf.unizg.hr/id/eprint/5348
