计算机系统应用
APPLICATIONS OF THE COMPUTER SYSTEMS
2014
Issue 4
pp. 52-56
5 pages in total
尹芝芳%王鑫%蔡文正%李鹤%阮玲玲
Lucene.Net%LSA%Question-Answering system%mutual information
This paper presents a legal consultation system designed around the current state of the legal profession. It takes a Chinese question-answering system as its prototype and builds on the open-source retrieval project Lucene.Net, extending the types of data that can be stored. The Chinese word segmentation system developed by the Chinese Academy of Sciences (ICTCLAS) is integrated into the Lucene.Net platform to compensate for its weak Chinese word segmentation, and mutual information is used so that synonymous law-related terms are given priority in retrieval. Because answer extraction in Chinese question-answering systems often misses correct answers or selects wrong ones, the paper proposes a method based on Latent Semantic Analysis (LSA) for computing the similarity between a question and candidate answer sentences. Questions and sentences are represented in a vector space model; singular value decomposition then reduces the dimensionality to build a low-dimensional semantic space, which removes the correlation between words, and similarity is computed in that space. Experiments show that the system achieves relatively high query accuracy with relatively little computation time.
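The LSA similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that sentences are already segmented into tokens (for example by a segmenter such as ICTCLAS), that raw term counts are used as weights, and that NumPy is available. All function names and the toy English tokens are hypothetical.

```python
import numpy as np


def build_term_sentence_matrix(sentences):
    """Return (term -> row index, matrix) with one row per term, one column per sentence."""
    vocab = sorted({tok for sent in sentences for tok in sent})
    index = {tok: i for i, tok in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(sentences)))
    for j, sent in enumerate(sentences):
        for tok in sent:
            matrix[index[tok], j] += 1.0  # raw term count (an assumption; the paper may weight terms)
    return index, matrix


def fit_lsa(matrix, k):
    """Truncated SVD: keep only the k largest singular values and their vectors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


def fold_in(tokens, index, u_k, s_k):
    """Project a bag of words into the k-dimensional space: q_hat = q^T U_k S_k^{-1}."""
    q = np.zeros(len(index))
    for tok in tokens:
        if tok in index:  # terms unseen in the corpus are ignored
            q[index[tok]] += 1.0
    return (q @ u_k) / s_k


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def rank_sentences(question, sentences, k=2):
    """Return the LSA cosine similarity between the question and each sentence."""
    index, matrix = build_term_sentence_matrix(sentences)
    u_k, s_k, vt_k = fit_lsa(matrix, k)
    q_hat = fold_in(question, index, u_k, s_k)
    # Column j of vt_k holds the coordinates of sentence j in the semantic space.
    return [cosine(q_hat, vt_k[:, j]) for j in range(len(sentences))]


if __name__ == "__main__":
    candidate_sentences = [
        ["contract", "breach", "damages"],
        ["contract", "breach", "compensation"],
        ["divorce", "custody", "child"],
        ["divorce", "property", "child"],
    ]
    scores = rank_sentences(["breach", "compensation"], candidate_sentences, k=2)
    for sent, score in zip(candidate_sentences, scores):
        print(f"{score:.3f}", " ".join(sent))
```

In this toy corpus, keeping two dimensions should place the question close to both contract-related sentences, including the one that does not contain "compensation", and far from the family-law sentences. This is the kind of match on related wording that plain term overlap would rank lower. The choice of k and of term weighting is left open here.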