福建工程学院学报
Journal of Fujian University of Technology
2015
Issue 4
pp. 372-375 (4 pages)
text clustering; part-of-speech tagging; natural language processing; cluster analysis
To address the susceptibility of traditional text clustering methods to noise, a text clustering algorithm based on part-of-speech tagging is proposed. The algorithm first uses part-of-speech tagging to identify and extract the keywords that best characterize each text, and then clusters the texts based on the extracted keywords. Experimental results show that, compared with traditional clustering algorithms, the proposed algorithm effectively improves the quality of the clustering results.
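The two-stage idea in the abstract (filter each text down to keywords by part of speech, then cluster on those keywords only) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, since the abstract does not specify it: the toy lexicon `TOY_LEXICON` stands in for a trained part-of-speech tagger, `KEEP_TAGS` (nouns and verbs) is a guessed choice of which tags count as keywords, and a simple single-pass "leader" clustering with cosine similarity is swapped in for whatever clustering algorithm the paper actually uses.

```python
import math
from collections import Counter

# Hypothetical toy lexicon standing in for a real POS tagger (assumption).
TOY_LEXICON = {
    "the": "DET", "a": "DET",
    "cat": "NOUN", "dog": "NOUN", "stock": "NOUN", "price": "NOUN", "market": "NOUN",
    "chases": "VERB", "rises": "VERB", "falls": "VERB",
    "quickly": "ADV", "today": "ADV",
}

# Assumption: only nouns and verbs are treated as keywords.
KEEP_TAGS = {"NOUN", "VERB"}


def extract_keywords(text):
    """Keep only the tokens whose (toy) POS tag is in KEEP_TAGS."""
    return [tok for tok in text.lower().split() if TOY_LEXICON.get(tok) in KEEP_TAGS]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(texts, threshold=0.3):
    """Single-pass leader clustering on keyword vectors.

    Each text joins the most similar existing cluster if the cosine
    similarity to that cluster's summed keyword vector reaches
    `threshold`; otherwise it starts a new cluster.
    """
    clusters = []
    for index, text in enumerate(texts):
        vec = Counter(extract_keywords(text))
        best, best_sim = None, threshold
        for candidate in clusters:
            sim = cosine(vec, candidate["centroid"])
            if sim >= best_sim:
                best, best_sim = candidate, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [index]})
        else:
            best["centroid"].update(vec)  # accumulate keyword counts
            best["members"].append(index)
    return [c["members"] for c in clusters]


docs = [
    "the cat chases the dog",
    "a dog chases a cat quickly",
    "the stock price rises today",
    "today the stock market falls",
]
print(cluster(docs))  # [[0, 1], [2, 3]]
```

In this toy input, function words such as "the" and "today" appear across both topics; dropping them before vectorization is the noise-reduction step the abstract describes. A realistic setup would replace the lexicon lookup with a trained tagger and would tune the tag set and threshold on real data.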