中华医学图书情报杂志
CHINESE JOURNAL OF MEDICAL LIBRARY AND INFORMATION SCIENCE
2015
Issue 1
pp. 50-54, 60
6 pages in total
Weka%Cobweb%Cluster analysis%Leukemia%Gene%Data mining%Co-occurrence analysis%Visual analysis%Research hotspot
Objective: To mine the relationship between leukemia and genes using Weka. Methods: Papers on leukemia and genes were retrieved from PubMed. Major subject headings and subheadings were extracted with BICOMB to generate a high-frequency-term co-occurrence matrix and a term-paper matrix. The co-occurrence matrix was then clustered on the Weka platform with the Cobweb algorithm to identify research hotspots, and the results were verified against the literature. Results: Weka clustered the 42 high-frequency terms into 7 classes, representing 7 possible links between leukemia and genes. Classes 1, 2, 4 and 5 contained no high-frequency terms for leukemia or genes, so their clustering quality was poor; the remaining 3 classes clustered well. Conclusion: Cluster analysis showed that leukemia is related to the myc gene, abl gene, p53 gene, viral genes, immunoglobulin genes and the mdm gene.
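The relationship between the term-paper matrix and the co-occurrence matrix mentioned in the Methods can be illustrated with a minimal Python sketch. It is not BICOMB's implementation, and the subsequent Cobweb clustering in Weka is not reproduced here. The function names, the toy papers, and the subject headings below are hypothetical and chosen only for illustration; the sketch assumes each paper is represented by the set of subject headings that index it.

```python
def term_paper_matrix(papers, terms):
    """Rows are terms, columns are papers; 1 if the term indexes the paper."""
    return [[1 if term in paper else 0 for paper in papers] for term in terms]


def cooccurrence_matrix(tp):
    """Entry (i, j) counts the papers indexed by both term i and term j.

    The diagonal entry (i, i) is simply the frequency of term i.
    """
    n = len(tp)
    return [
        [sum(a * b for a, b in zip(tp[i], tp[j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: three papers, each a set of subject headings.
papers = [
    {"Leukemia", "Genes, myc"},
    {"Leukemia", "Genes, abl"},
    {"Leukemia", "Genes, myc", "Genes, p53"},
]
terms = ["Leukemia", "Genes, myc", "Genes, abl", "Genes, p53"]

tp = term_paper_matrix(papers, terms)
co = cooccurrence_matrix(tp)

for term, row in zip(terms, co):
    print(term, row)
```

In this toy example each row of `co` gives the co-occurrence profile of one term; a matrix of this kind is what a clustering algorithm would take as input.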