计算机应用研究
APPLICATION RESEARCH OF COMPUTERS
2009, Issue 12, pp. 4617-4620 (4 pages)
关联规则%蛋白质二级结构预测%KDD~*%合成金字塔模型%基于关联分类算法
association rule%protein secondary structure prediction%KDD~*%compound pyramid model%CBA algorithm
蛋白质二级结构预测问题,是生物信息学领域中最为重要的任务之一,历经三十多年的研究,已取得了一些进展,尤其是近来集成预测模型与混合预测模型的引入,为预测精度带来了一定程度的提高,然而其离从二级结构推导三级结构的目标,仍然存在很大差距.为了有效提高蛋白质二级结构预测精度,以KDTICM理论的扩展性研究与KDD~*模型为基础, 使用基于KDD~*模型的关联分析蛋白质二级结构预测方法KAAPRO,提出一种基于支持度与可信度的复杂距离度量的CBA(classification based on association)算法,并以该算法为核心构建逐步求精、多层递阶的合成金字塔模型,该模型整体贯穿领域知识,并采用因果细胞自动机选择有效物化属性.在对偏alpha、beta型蛋白质的预测实验中, 改进型CBA算法较好地完成了对结构特征不明显氨基酸的预测,获得了较优的预测效果.
Protein secondary structure prediction is one of the most important tasks in bioinformatics. More than 30 years of research have brought some progress; in particular, the recent introduction of ensemble and hybrid prediction models has improved prediction accuracy to some extent, but a large gap remains before tertiary structure can be derived from secondary structure. To improve prediction accuracy, and building on extension research into KDTICM theory and the KDD~* model, this paper uses KAAPRO, an association-analysis method for protein secondary structure prediction based on the KDD~* model, and proposes a CBA (classification based on association) algorithm with a complex distance measure based on support and confidence. With this algorithm as its core, the paper constructs a compound pyramid model that is refined step by step across multiple hierarchical layers. Domain knowledge runs through the whole model, and causal cellular automata are used to select effective physicochemical attributes. In prediction experiments on proteins dominated by alpha or beta structure, the improved CBA algorithm predicted amino acids with indistinct structural features well and achieved good prediction results.
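The abstract builds its CBA variant on the support and confidence of association rules. The sketch below only illustrates the standard definitions of those two quantities for a class association rule (antecedent -> class label). It is not the paper's algorithm, and it does not reproduce the paper's complex distance measure, which the abstract does not specify. The function name `rule_stats`, the feature encoding (for example `"A@0"` for alanine at the window centre), and the toy records are all hypothetical.

```python
def rule_stats(records, antecedent, label):
    """Return (support, confidence) for the rule: antecedent -> label.

    records: list of (items, class_label) pairs, where items is a set of features.
    support    = fraction of all records matching antecedent AND label
    confidence = fraction of antecedent-matching records that carry label
    """
    antecedent = frozenset(antecedent)
    n_total = len(records)
    # Records whose feature set contains every item of the antecedent.
    n_antecedent = sum(1 for items, _ in records if antecedent <= items)
    # Of those, the records that also carry the rule's class label.
    n_both = sum(
        1 for items, lab in records if antecedent <= items and lab == label
    )
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy data (hypothetical, not from the paper): residue-window features with a
# secondary-structure label, H = helix, E = strand, C = coil.
records = [
    ({"A@0", "L@+1"}, "H"),
    ({"A@0", "L@+1"}, "H"),
    ({"A@0", "G@+1"}, "C"),
    ({"V@0", "I@+1"}, "E"),
]

support, confidence = rule_stats(records, {"A@0"}, "H")
print(f"support={support:.3f} confidence={confidence:.3f}")
```

In a CBA-style classifier, rules that pass minimum support and confidence thresholds are typically ranked and then used to label new instances; how the paper combines the two values into its distance measure is not stated in the abstract.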