Prediction of protein-protein interactions from primary structure using a Random Forest classifier

Franke, Vedran (2010) Prediction of protein-protein interactions from primary structure using a Random Forest classifier. Diploma thesis, Faculty of Science > Department of Biology.

Language: English

Interactions between proteins are fundamental to a broad spectrum of biological functions, including regulation of metabolic pathways, immunological recognition, DNA replication, progression through the cell cycle, and protein synthesis. Because of the growing disparity between the amount of sequenced genomic content and the available functional data, there is a pressing need for tools and methods that enable prediction of phenotypic traits, at the molecular or organism level, from sequence alone. In this work we constructed a high-quality dataset of protein structures, which enabled us to use the Random Forest non-linear classifier to develop a method for predicting interacting residues from protein primary structure. Our results show that, although the Random Forest algorithm is capable of accurately classifying high-dimensional data, our knowledge of the structural factors that determine the specificity of protein-protein interactions is still incomplete, which puts an upper limit on the usefulness of the machine learning approach for predicting protein interactions at the level of single amino acids.

Item Type: Thesis (Diploma thesis)
Keywords: protein interactions, random forest, machine learning, prediction
Supervisor: Vlahoviček, Kristian
Date: 2010
Number of Pages: 32
Subjects: NATURAL SCIENCES > Biology
Divisions: Faculty of Science > Department of Biology
Depositing User: Silvana Šehić
Date Deposited: 05 Sep 2014 12:23
Last Modified: 05 Sep 2014 12:23
