计算机研究与发展
Journal of Computer Research and Development
2015
Issue 8
pp. 1784-1793
10 pages in total
流模拟; 图聚类; 软聚类; 蛋白质互作用网络; 蛋白质复合体
flow simulation; graph clustering; soft clustering; protein-protein interaction network; protein complex
蛋白质互作用(protein‐protein interaction ,PPI)网络是广泛存在的一类复杂生物网络,其网络拓扑特征与功能模块分析密切相关。图聚类是对复杂网络进行分析和处理的一种重要计算方法。传统的PPI网络中蛋白质复合体检测算法通常对网络图中的对象进行硬划分,而寻找网络中的重叠簇的软聚类算法已成为当前研究热点之一。现有的软聚类算法较少关注寻找网络中具有重要生物意义的小规模非稠密簇。对此,基于网络中结点邻域给出了边关联强度的度量方法,并在此基础上提出了一种基于流模拟的PPI网络中复合体检测的图聚类(flow‐simulation graph clustering ,F‐GCL )算法,该算法可以在快速发现PPI网络中的重叠簇的同时找到小规模非稠密簇;同时,与MCODE(molecular complex detection), MCL(Markov clustering),RNSC(restricted neighborhood search clustering)和CPM(clique percolation method)算法在6个酿酒酵母PPI网络上进行比较,该算法在 F‐measure ,Accuracy ,Separation方面表现了较好的性能。
Protein-protein interaction (PPI) networks are a widespread class of complex biological networks, and their topological features are closely tied to the analysis of functional modules. Graph clustering is an important computational method for analyzing complex networks and has been applied successfully to detect protein complexes in PPI networks. Traditional complex-detection algorithms for PPI networks mainly perform hard clustering, whereas soft clustering algorithms that find overlapping clusters have become a focus of current research. Existing soft clustering algorithms pay little attention to small-scale non-dense clusters, even though such clusters often carry important biological meaning in PPI networks. To address this, a measure of the association strength of edges is developed based on node neighborhoods, and on that basis a soft clustering algorithm based on flow simulation, named flow-simulation graph clustering (F-GCL), is presented to detect complexes in a PPI network. Experiments show that F-GCL can find overlapping clusters and small-scale non-dense clusters at the same time without increasing the running time. Compared with the MCODE (molecular complex detection), MCL (Markov clustering), RNSC (restricted neighborhood search clustering) and CPM (clique percolation method) algorithms on six Saccharomyces cerevisiae PPI networks, F-GCL shows comparable or better performance on three evaluation measures: F-measure, Accuracy and Separation.
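The abstract does not give the formula for the neighborhood-based edge association strength, so the sketch below is only an illustration of the general idea, not the F-GCL definition. It assumes a Jaccard overlap of closed neighborhoods as a stand-in measure; the function names `build_adjacency`, `edge_strength` and `all_edge_strengths` are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def edge_strength(adj, u, v):
    """Assumed measure (NOT taken from the paper): Jaccard overlap of the
    closed neighborhoods of u and v, i.e. each node together with its neighbors."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / len(nu | nv)


def all_edge_strengths(adj):
    """Score every undirected edge once, keyed by the pair in sorted order."""
    return {
        (u, v): edge_strength(adj, u, v)
        for u in adj
        for v in adj[u]
        if u < v
    }


if __name__ == "__main__":
    # Toy network: a triangle A-B-C with a pendant node D attached to C.
    toy = build_adjacency([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
    for edge, score in sorted(all_edge_strengths(toy).items()):
        print(edge, round(score, 2))
```

On the toy network, the edge inside the triangle that does not touch the pendant node scores highest, while the pendant edge C-D scores lowest. A score of this kind could plausibly serve as an edge weight for a flow-based clustering step, but how F-GCL actually uses its measure is not stated in the abstract.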