现代情报
Journal of Modern Information
2014
Issue 4
pp. 132-136
Lucene; bibliographic retrieval; Chinese word segmentation; analyzer
This study examines how to choose the most suitable Lucene Chinese analyzer for a project that implements a Chinese bibliographic search system on Lucene. Through extensive experiments, the three analyzers bundled with Lucene and two actively developed third-party Chinese analyzers were compared on segmentation quality, indexing time and index size, retrieval time, recall, and average precision. Taking the experimental results together, the IK analyzer showed the best overall performance and is the recommended choice.
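The abstract names recall and average precision as retrieval measures but does not state the formulas used. The following is a minimal sketch under the common textbook definitions, which may differ from the ones applied in the paper; the record identifiers and relevance judgments are hypothetical and are not data from the study.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant records that appear among the retrieved ones."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(relevant.intersection(retrieved)) / len(relevant)


def average_precision(ranked, relevant):
    """Sum of precision@k at each rank k holding a relevant record,
    divided by the total number of relevant records."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for k, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / k  # precision at rank k
    return total / len(relevant)


# Hypothetical ranked result list and relevance judgments for one query.
ranked = ["b3", "b7", "b1", "b9"]
relevant = {"b3", "b1", "b5"}

print(recall(ranked, relevant))
print(average_precision(ranked, relevant))
```

Averaging `average_precision` over a set of test queries would give one figure per analyzer, which is one plausible way to compare analyzers on retrieval quality.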