电子科技大学学报
JOURNAL OF UNIVERSITY OF ELECTRONIC SCIENCE AND TECHNOLOGY OF CHINA
2015
Issue 3
pp. 467-470
4 pages
王明会%龚艺%王强%冯焕清%李骜
生物信息学%K近邻算法%蛋白质相互作用%亚细胞定位
bioinformatics%K-nearest neighbor algorithm%protein-protein interaction%subcellular localization
提出了一种基于序列和PPI特征的距离公式,可综合序列氨基酸组成和PPI对象、强弱等信息对两个蛋白质的相似性进行表征,并在此基础上提出了一种用于蛋白质亚细胞定位预测的K近邻算法。利用留一法对性能进行了评估,结果显示,在序列基础上加入PPI特征,可明显有助于亚细胞定位的预测;同时基于上述距离的K近邻算法也优于使用相同特征的SVM算法,表明该算法可以对蛋白质的亚细胞定位信息进行准确有效的预测。
Knowledge of protein subcellular localization is indispensable for studying protein function, because a protein can perform its function only after it has been correctly transported to a specific subcellular compartment. Accurate prediction of protein subcellular localization is therefore very important in biological studies. In contrast to sequence features (e.g., amino acid composition), which are widely used in subcellular localization prediction, features derived from protein-protein interaction (PPI) are largely ignored, although they reflect the co-localization of different proteins. In this study, we propose a novel distance formula based on both protein sequence and PPI features, which measures the similarity of two proteins by incorporating amino acid composition, PPI partners, and the corresponding interaction scores. Based on this distance formula, we further introduce a k-nearest neighbor (KNN) algorithm for predicting subcellular localization. The results of a leave-one-out test on a benchmark dataset show that adding PPI features to sequence features significantly improves the prediction of protein subcellular localization. Moreover, the KNN algorithm outperforms an SVM algorithm that uses the same features, suggesting that the proposed algorithm is effective for predicting protein subcellular localization.
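The abstract does not state the distance formula itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a weighted sum of two terms: the Euclidean distance between amino acid composition vectors, and one minus a score-weighted Jaccard overlap of PPI partners. The record fields `seq`, `ppi`, and `label`, the weight `alpha`, and all function names are hypothetical.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Fraction of each standard amino acid in the sequence."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def ppi_dissimilarity(p, q):
    """1 - weighted Jaccard overlap of two {partner: score} dicts (assumed form)."""
    partners = set(p) | set(q)
    shared = sum(min(p.get(k, 0.0), q.get(k, 0.0)) for k in partners)
    union = sum(max(p.get(k, 0.0), q.get(k, 0.0)) for k in partners)
    # With no interaction data at all, treat the pair as maximally dissimilar.
    return 1.0 - shared / union if union else 1.0


def combined_distance(a, b, alpha=0.5):
    """Assumed combination: alpha * sequence term + (1 - alpha) * PPI term."""
    seq_d = math.dist(aa_composition(a["seq"]), aa_composition(b["seq"]))
    return alpha * seq_d + (1 - alpha) * ppi_dissimilarity(a["ppi"], b["ppi"])


def knn_predict(query, train, k=3, alpha=0.5):
    """Majority vote over the k training proteins closest to the query."""
    nearest = sorted(train, key=lambda t: combined_distance(query, t, alpha))[:k]
    votes = Counter(t["label"] for t in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k=3, alpha=0.5):
    """Predict each protein from all the others and report the hit rate."""
    hits = sum(
        knn_predict(x, data[:i] + data[i + 1:], k, alpha) == x["label"]
        for i, x in enumerate(data)
    )
    return hits / len(data)
```

Each protein would be passed as a dict such as `{"seq": "MKT...", "ppi": {"partner_id": 0.9}, "label": "nucleus"}`. Setting `alpha=1.0` reduces the sketch to a sequence-only KNN, which gives one way to examine the effect of adding PPI features under the same leave-one-out protocol.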