TY - GEN
T1 - Protein function prediction using kernel logistic regression with ROC curves
AU - Liu, Jingwei
AU - Qian, Minping
PY - 2011
Y1 - 2011
N2 - To avoid the "over-fitting" problem in protein function prediction based on protein-protein interactions (PPI), we propose a pattern recognition strategy in which the features of the PPI observation data are divided into three sets: a training set, a learning set, and a testing set. The classifiers are trained on the training set, the receiver operating characteristic (ROC) curve and the optimal operating point (OOP) are calculated on the learning set, and the accuracy rate is reported on the testing set at the OOP. Under this framework, we compare the performance of the logistic regression (LR) model with that of the kernel logistic regression (KLR) model on two feature sets derived from the PPI data, 1-order features and 2-order features. Experimental results on a standard PPI data set show that the KLR model performs better than the LR model on the training sets for both the 1-order and 2-order feature sets, and that with the KLR model the 2-order feature set outperforms the 1-order feature set on the training set. The predictive rates on the testing set for both 1-order and 2-order features, with either LR or KLR, can reach 95%.
KW - kernel logistic regression
KW - logistic regression
KW - optimal operating point
KW - protein-protein interaction
KW - receiver operating characteristic
UR - https://www.scopus.com/pages/publications/80052807961
U2 - 10.1007/978-3-642-24091-1_65
DO - 10.1007/978-3-642-24091-1_65
M3 - Conference contribution
AN - SCOPUS:80052807961
SN - 9783642240904
T3 - Communications in Computer and Information Science
SP - 491
EP - 502
BT - Computing and Intelligent Systems - International Conference, ICCIC 2011, Proceedings
T2 - 2011 International Conference on Computing, Information and Control, ICCIC 2011
Y2 - 17 September 2011 through 18 September 2011
ER -