CAJ | Academic paper

Question answering systems for restricted domains that are built on a Frequently Asked Questions (FAQ) base have become a research focus in natural language processing because of their practicality, and computing the similarity between questions is the key technique in such systems. Existing question similarity techniques perform poorly on questions that include a description of the surrounding context. To address this, user questions are divided into two categories, Simple Mode Questions (SMQs) and Context Mode Questions (CMQs), and a rule-based question pattern classification algorithm is proposed. On this basis, a similarity measure for CMQs is presented that combines the similarity between the context information with the similarity between the questions themselves. Experimental results show that the question pattern classification algorithm achieves precision and recall above 90%, and that the CMQ similarity measure reaches an accuracy of 74.3% while keeping time complexity relatively low.
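The abstract does not give the classification rules or the formula for combining the two similarities, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a single hypothetical rule (an input containing both a question sentence and at least one non-question sentence counts as a CMQ), uses Jaccard word overlap as a stand-in similarity, and mixes the two scores with an assumed weight `alpha`. All function names are invented for illustration.

```python
import re

QUESTION_MARKS = ("?", "？")


def split_sentences(text):
    # Split into sentences on ASCII and full-width end punctuation.
    parts = re.findall(r"[^.?!。？！]+[.?!。？！]?", text)
    return [p.strip() for p in parts if p.strip()]


def split_parts(text):
    # Hypothetical rule: sentences ending in a question mark form the
    # question part; all other sentences form the context part.
    sentences = split_sentences(text)
    question = [s for s in sentences if s.endswith(QUESTION_MARKS)]
    context = [s for s in sentences if not s.endswith(QUESTION_MARKS)]
    return " ".join(context), " ".join(question)


def classify_question(text):
    # Assumed rule, not taken from the paper: context plus question => CMQ.
    context, question = split_parts(text)
    return "CMQ" if context and question else "SMQ"


def jaccard(a, b):
    # Jaccard word overlap, swapped in as a simple similarity measure.
    ta = set(re.findall(r"\w+", a.lower()))
    tb = set(re.findall(r"\w+", b.lower()))
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cmq_similarity(q1, q2, alpha=0.5):
    # Weighted mix of context similarity and question similarity.
    # alpha is an assumed tuning parameter.
    c1, s1 = split_parts(q1)
    c2, s2 = split_parts(q2)
    return alpha * jaccard(c1, c2) + (1 - alpha) * jaccard(s1, s2)
```

With these assumptions, `classify_question("How do I reset my password?")` yields `"SMQ"`, while an input such as `"My phone is locked. How do I reset it?"` yields `"CMQ"`. Scoring context and question separately lets two inputs with the same final question but different situations receive a lower score than a whole-text comparison would give.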