Genetics, Development and Cell Biology, Bioinformatics and Computational Biology, Computer Science
Journal or Book Title
International Journal of Data Mining and Bioinformatics
We explore whether protein-RNA interfaces differ from non-interfaces in their structural features, and whether those features vary with the type of bound RNA (e.g., mRNA or siRNA), using a non-redundant dataset of 147 protein chains extracted from protein-RNA complexes in the Protein Data Bank. We also use machine learning algorithms to train classifiers that predict protein-RNA interfaces from information derived from sequence and structural features. We develop Struct-NB, a classifier that takes structural information into account. We compare the performance of Naïve Bayes and Gaussian Naïve Bayes with that of Struct-NB on the 147-chain protein-RNA dataset, using sequence and structural features, respectively, as input to the classifiers. Our experiments show that Struct-NB outperforms Naïve Bayes and Gaussian Naïve Bayes at predicting the protein-RNA binding interfaces in a protein sequence, as judged by a range of standard measures of classifier performance.
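For readers unfamiliar with the Gaussian Naïve Bayes baseline named above, the following is a minimal sketch of that general technique, not the authors' Struct-NB method or their implementation. It assumes each residue is described by a small vector of numeric features and labelled 1 (interface) or 0 (non-interface); the two features, the toy values, and the function names `fit_gaussian_nb` and `predict` are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one feature vector."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        # Features are treated as conditionally independent given the class.
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(row, means, variances)
        )
        return log_prior + log_likelihood

    return max(model, key=log_posterior)


# Toy data (invented): two hypothetical numeric features per residue.
X = [[0.9, 0.8], [0.8, 0.7], [0.85, 0.9], [0.2, 0.1], [0.1, 0.3], [0.15, 0.2]]
y = [1, 1, 1, 0, 0, 0]  # 1 = interface residue, 0 = non-interface residue

model = fit_gaussian_nb(X, y)
print(predict(model, [0.8, 0.8]))  # prints 1
print(predict(model, [0.2, 0.2]))  # prints 0
```

Each feature is modelled with one normal distribution per class, which is what distinguishes the Gaussian variant from a Naïve Bayes classifier over discrete inputs.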
Towfic, Fadi; Caragea, Cornelia; Gemperline, David C.; Dobbs, Drena; and Honavar, Vasant, "Struct-NB: Predicting Protein-RNA Binding Sites Using Structural Features" (2010). Genetics, Development and Cell Biology Publications. 111.