上海第二工业大学学报
JOURNAL OF SHANGHAI SECOND POLYTECHNIC UNIVERSITY
2013, Issue 1, pp. 12-17 (6 pages)
quaternary structure of protein; homo-oligomers; classification; dimension reduction
An automated method is proposed for identifying the quaternary structure type of a query protein from its sequence. First, the pseudo position-specific score matrix (PsePSSM) is used to extract features from the protein sequence. These features retain as much of the sequence's original information as possible, such as sequence-order and evolutionary information. However, the resulting feature dimension is very high (the "curse of dimensionality"), which complicates the prediction system. To address this, a linear dimensionality reduction algorithm, maximum variance projections (MVP), is introduced to extract low-dimensional key features from the high-dimensional PsePSSM feature space. Finally, a classifier is applied to the reduced features to predict the quaternary structure of unknown proteins. Experimental results show that dimensionality reduction both simplifies the prediction system and improves classification performance.
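The three stages named in the abstract (feature extraction, linear dimensionality reduction, classification) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the row-wise standardization of the PSSM, the lag parameter `max_lag`, the use of plain PCA as a stand-in for MVP (which is a different algorithm), the 1-nearest-neighbour classifier, and the random toy data. The function names are hypothetical.

```python
import numpy as np

def pse_pssm(pssm, max_lag=2):
    """Turn an (L, 20) PSSM into a fixed-length vector of size 20 * (1 + max_lag).

    Assumed formulation: column means of the standardized PSSM, followed by
    mean squared differences between rows that are `lag` positions apart.
    """
    pssm = np.asarray(pssm, dtype=float)
    # Assumption: standardize each row across the 20 amino-acid columns.
    mean = pssm.mean(axis=1, keepdims=True)
    std = pssm.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid division by zero for constant rows
    p = (pssm - mean) / std
    parts = [p.mean(axis=0)]  # 20 composition-like terms
    for lag in range(1, max_lag + 1):
        diff = p[:-lag] - p[lag:]  # rows i and i + lag
        parts.append((diff ** 2).mean(axis=0))  # 20 sequence-order terms per lag
    return np.concatenate(parts)

def fit_pca(X, n_components):
    """PCA via SVD. Used here only as a stand-in for MVP, not as MVP itself."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:n_components]

def project(X, mu, components):
    """Map feature vectors into the reduced space."""
    return (X - mu) @ components.T

def predict_1nn(train_Z, train_y, test_Z):
    """Label each test point with the label of its nearest training point."""
    d = np.linalg.norm(test_Z[:, None, :] - train_Z[None, :, :], axis=2)
    return train_y[d.argmin(axis=1)]

# Toy data only: random "PSSMs" of varying length with arbitrary class labels.
rng = np.random.default_rng(0)
pssms = [rng.normal(size=(rng.integers(30, 60), 20)) for _ in range(12)]
labels = np.array([0, 1, 2] * 4)

X = np.stack([pse_pssm(m, max_lag=2) for m in pssms])  # 12 vectors of length 60
mu, comps = fit_pca(X[:9], n_components=5)             # fit on training part only
Z_train = project(X[:9], mu, comps)
Z_test = project(X[9:], mu, comps)
pred = predict_1nn(Z_train, labels[:9], Z_test)
```

The sketch shows why the reduction step matters structurally: the feature length grows as 20 * (1 + `max_lag`), while the classifier only ever sees the few projected dimensions. Because the inputs are random, the predicted labels carry no meaning.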