Campus Units

Genetics, Development and Cell Biology, Bioinformatics and Computational Biology, Computer Science

Document Type

Article

Publication Version

Accepted Manuscript

Publication Date

2010

Journal or Book Title

International Journal of Data Mining and Bioinformatics

Volume

4

Issue

1

First Page

21

Last Page

43

DOI

10.1504/IJDMB.2010.030965

Abstract

We explore whether protein-RNA interfaces differ from non-interfaces in their structural features, and whether those features vary with the type of bound RNA (e.g., mRNA, siRNA), using a non-redundant dataset of 147 protein chains extracted from protein-RNA complexes in the Protein Data Bank. We also use machine learning algorithms to train classifiers that predict protein-RNA interfaces from sequence and structural features. We develop the Struct-NB classifier, which takes structural information into account. We compare the performance of Naïve Bayes and Gaussian Naïve Bayes with that of Struct-NB on the 147-chain protein-RNA dataset, using sequence and structural features, respectively, as input to the classifiers. The results of our experiments show that Struct-NB outperforms Naïve Bayes and Gaussian Naïve Bayes at predicting protein-RNA binding interfaces in a protein sequence, across a range of standard measures of classifier performance.
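
For readers unfamiliar with the Gaussian Naïve Bayes baseline named in the abstract, the following is a minimal sketch of that general technique applied to per-residue interface labelling. It is not the authors' implementation and it is not Struct-NB. The two numeric features, the toy values, the label encoding (1 = interface, 0 = non-interface), and the function names `fit_gaussian_nb` and `predict_gaussian_nb` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-6):
    """Estimate a log prior, per-feature means, and per-feature variances for each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance strictly positive for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one feature vector."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for v, m, var in zip(row, means, variances):
            # Log of the Gaussian density, summed under the independence assumption.
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: two made-up structural features per residue.
X = [[0.8, 0.6], [0.7, 0.7], [0.9, 0.5], [0.2, 0.1], [0.3, 0.2], [0.1, 0.3]]
y = [1, 1, 1, 0, 0, 0]  # 1 = interface residue, 0 = non-interface residue (assumed encoding)

model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [0.75, 0.6]))
print(predict_gaussian_nb(model, [0.2, 0.2]))
```

The sketch treats each feature as an independent Gaussian within a class, which is the assumption that distinguishes Gaussian Naïve Bayes from the discrete Naïve Bayes variant typically used on sequence windows.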

Comments

This is a manuscript of an article from International Journal of Data Mining and Bioinformatics 4 (2010): 21, doi: 10.1504/IJDMB.2010.030965. Posted with permission.

Copyright Owner

Inderscience

Language

en

File Format

application/pdf
