河北农业大学学报
JOURNAL OF AGRICULTURAL UNIVERSITY OF HEBEI
2009
Issue 4
pp. 100-102, 107
4 pages in total
Chinese word segmentation%reverse maximum matching (RMM) algorithm%single-character rate%term frequency
To improve the segmentation accuracy of the reverse maximum matching (RMM) algorithm, this study applies a term-frequency threshold, a single-character function, and related methods, and obtains good disambiguation results. The experimental results show that the segmentation algorithm follows the long-word-first principle while also identifying and eliminating covering ambiguity. The improved RMM keeps a large speed advantage and further increases segmentation accuracy, which gives it practical value for small and medium-sized search engines that rely on mechanical (dictionary-based) segmentation and need better segmentation precision.
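The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. `rmm_segment` is the standard reverse maximum matching baseline (scan right to left, longest dictionary word first). `resolve_covering` is a hypothetical illustration of how a term-frequency threshold might be used to re-split a rarely seen long word. Its rule, the frequency table, the threshold value, and the `max_len` parameter are all assumptions made for this example, and the paper's single-character function is not modeled here.

```python
def rmm_segment(text, dictionary, max_len=4):
    """Baseline reverse maximum matching: scan right to left, longest word first."""
    tokens = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end`, then shrink it from the left.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            # A single character is always accepted as the fallback.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                end -= size
                break
    tokens.reverse()
    return tokens


def resolve_covering(tokens, freq, threshold, max_len=4):
    """Hypothetical rule (an assumption, not taken from the paper):
    re-split a multi-character word whose frequency is below `threshold`
    when every resulting part reaches the threshold."""
    result = []
    for token in tokens:
        if len(token) > 1 and freq.get(token, 0) < threshold:
            # Re-segment the word with itself removed from the dictionary.
            shorter = {word for word in freq if word != token}
            parts = rmm_segment(token, shorter, max_len)
            if all(freq.get(part, 0) >= threshold for part in parts):
                result.extend(parts)
                continue
        result.append(token)
    return result


if __name__ == "__main__":
    # Illustrative dictionary and frequencies, invented for this example.
    words = {"研究", "研究生", "生命", "起源"}
    print(rmm_segment("研究生命起源", words))

    freq = {"他": 30, "将": 10, "来": 10, "将来": 2, "北京": 20}
    tokens = rmm_segment("他将来北京", set(freq))
    print(resolve_covering(tokens, freq, threshold=5))
```

In the first example the right-to-left scan matches 起源, then 生命, then 研究, so the long-word-first principle is kept without producing the stray single character that a left-to-right scan would leave after 研究生. The second example only shows where a frequency threshold could plug in; a practical rule would need real corpus statistics.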