计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2009
Issue 31
pp. 118-121
4 pages
search result clustering%key noun phrase extraction%C-Value algorithm%Chameleon algorithm
Most existing search result clustering methods are document-based and cannot generate meaningful, readable cluster labels. To address this problem, this paper proposes a Chinese search result clustering method based on key noun phrase clustering. First, noun phrases are extracted from the search results and, together with related search terms, serve as candidate cluster labels. Second, the candidates are filtered using the C-Value algorithm and IDF values. Third, the selected labels are clustered with the Chameleon algorithm. Finally, each search result is assigned to its most relevant cluster. Experiments show that using key noun phrases and related search terms as cluster labels improves the descriptiveness of the labels and reduces the time complexity of the clustering algorithm.
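The label-filtering step can be illustrated with a minimal sketch of C-Value scoring and an IDF helper. This is not the paper's implementation: the abstract does not say how the C-Value and IDF scores are combined or thresholded, so the two are kept separate here. The function names (is_nested, c_value, idf) and the input format (phrases as tuples of already-segmented words mapped to frequencies) are assumptions made for illustration. The sketch uses the commonly cited C-Value form, log2(|a|) times the frequency of a, reduced by the average frequency of the longer candidates that contain a; under this form single-word candidates score 0.

```python
import math


def is_nested(short, long):
    """Return True if `short` occurs as a contiguous sub-sequence of `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def c_value(phrase_freq):
    """Score candidate phrases with C-Value.

    phrase_freq: dict mapping a phrase (tuple of words) to its frequency.
    Returns a dict mapping each phrase to its C-Value score.
    """
    scores = {}
    for a, f_a in phrase_freq.items():
        # Longer candidates that contain `a`.
        containers = [b for b in phrase_freq
                      if len(b) > len(a) and is_nested(a, b)]
        length_weight = math.log2(len(a))
        if not containers:
            scores[a] = length_weight * f_a
        else:
            # Discount occurrences that are explained by longer phrases.
            nested_freq = sum(phrase_freq[b] for b in containers)
            scores[a] = length_weight * (f_a - nested_freq / len(containers))
    return scores


def idf(n_docs, doc_freq):
    """Inverse document frequency of a phrase seen in `doc_freq` of `n_docs` results."""
    return math.log(n_docs / doc_freq)


if __name__ == "__main__":
    # Hypothetical candidate frequencies, not data from the paper.
    freqs = {
        ("search", "result", "clustering"): 3,
        ("result", "clustering"): 5,
        ("search", "result"): 4,
    }
    for phrase, score in sorted(c_value(freqs).items(), key=lambda kv: -kv[1]):
        print(" ".join(phrase), round(score, 3))
```

In this sketch a phrase that mostly appears inside a longer candidate is penalized, which is why a frequent but nested phrase can rank below a longer, less frequent one.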