Protein fold classification with Grow-and-Learn network
Abstract
Protein fold classification is an important problem in computational biology and a challenging task from a machine learning perspective. In this study, we propose a method for classifying protein folds that uses the Grow-and-Learn (GAL) neural network together with the one-versus-others (OvO) method. To classify the 27 most common protein folds, 125-dimensional feature vectors derived from the physicochemical properties of amino acids are used. The study is conducted on a database of 694 proteins: 311 are used for training and 383 for testing. Overall, the classification system achieves 81.2% fold recognition accuracy on the test set, where most of the proteins have less than 25% sequence identity with those used during training. To place the GAL network in context, it is also compared with several other approaches, and its accuracy is found to be higher than that of the existing methods it is compared against for protein fold classification.
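The abstract names the ingredients (an incrementally growing network, one binary classifier per fold, 125-dimensional inputs) but not the algorithmic details. The following is only a rough sketch under stated assumptions, not the authors' implementation: it assumes a simplified growth rule in which a training sample is stored as a new exemplar unit whenever the nearest stored unit has the wrong label, it omits any pruning step, and it reads "one-versus-others" as one binary model per fold against all remaining folds. The names `GrowingExemplarNet`, `fit_one_vs_others`, and `predict_one_vs_others` are hypothetical.

```python
import numpy as np


class GrowingExemplarNet:
    """Simplified growing classifier (assumption, not the paper's exact GAL).

    A sample becomes a new unit only when the nearest stored unit
    carries a different label, so the network grows on its own errors.
    """

    def __init__(self):
        self.units = []   # stored exemplar vectors
        self.labels = []  # binary label (1 = target fold, 0 = others) per unit

    def _distances(self, x):
        # Euclidean distance from x to every stored unit
        return np.linalg.norm(np.stack(self.units) - x, axis=1)

    def fit(self, X, y, epochs=3):
        X = np.asarray(X, dtype=float)
        for _ in range(epochs):
            grew = False
            for x, label in zip(X, y):
                wrong = True
                if self.units:
                    nearest = int(np.argmin(self._distances(x)))
                    wrong = self.labels[nearest] != label
                if wrong:
                    self.units.append(x)
                    self.labels.append(int(label))
                    grew = True
            if not grew:  # every sample already classified correctly
                break
        return self

    def score(self, x):
        """Nearest 'others' distance minus nearest 'target' distance."""
        d = self._distances(np.asarray(x, dtype=float))
        lab = np.array(self.labels)
        d_pos = d[lab == 1].min() if (lab == 1).any() else np.inf
        d_neg = d[lab == 0].min() if (lab == 0).any() else np.inf
        return d_neg - d_pos


def fit_one_vs_others(X, y):
    """Train one binary network per class: that class versus all others."""
    y = np.asarray(y)
    return {c: GrowingExemplarNet().fit(X, (y == c).astype(int))
            for c in np.unique(y)}


def predict_one_vs_others(models, x):
    """Return the class whose binary network gives the highest score."""
    return max(models, key=lambda c: models[c].score(x))
```

In this sketch, a protein would be represented by its 125-dimensional feature vector and `fit_one_vs_others` would build 27 binary networks, one per fold; the margin-based tie-breaking in `score` is a design choice made here for illustration, since the abstract does not say how the binary outputs are combined.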
Source
Turkish Journal of Electrical Engineering and Computer Sciences
Volume
25
Issue
2
URI
http://www.trdizin.gov.tr/publication/paper/detail/TWpRNE5EZzFOUT09
https://hdl.handle.net/20.500.12418/3480
Collections
- Article Collection (Makale Koleksiyonu) [3404]
- Orphan Publications Collection (Öksüz Yayınlar Koleksiyonu) - TRDizin [3395]