计算机研究与发展
JOURNAL OF COMPUTER RESEARCH AND DEVELOPMENT
2010
Issue 2
pp. 300-304
5 pages in total
information retrieval%query expansion%context%language model%pseudo feedback
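A key problem in information retrieval is that the terms used in a query may not match the terms used in the document collection, which degrades retrieval effectiveness. To address it, a context-based query expansion method is proposed. The method selects expansion terms according to the contextual information of the query, taking into account the relation of each expansion term to the whole query as well as its position relative to the query terms. Experiments on TREC information retrieval test collections show an improvement of 5% to 19% over the usual simple language model. Compared with the popular query expansion method based on pseudo feedback, the proposed method also achieves comparable average precision.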
The effectiveness of information retrieval (IR) systems is influenced by the degree of term overlap between user queries and relevant documents. Query-document term mismatch, whether partial or total, is a fact that IR systems must deal with. Query expansion (QE) is one method for dealing with term mismatch. Classical query expansion techniques such as local context analysis use term co-occurrence statistics to incorporate additional contextual terms and so enhance passage retrieval. However, relevant contextual terms do not always co-occur frequently with the query terms, and frequently co-occurring terms are not always relevant. Hence such methods often bring in noise, which reduces precision. Based on an analysis of how queries are produced, the authors propose a new query expansion method that draws on context and global information. Expansion terms are selected according to their relation to the whole query, and the positional information between terms is also considered. Experimental results on TREC data collections show that the proposed method outperforms the language model without expansion by 5% to 19%. Compared with pseudo feedback, the popular approach to query expansion, the method achieves competitive average precision.
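The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes tokenized documents, a distance-discounted co-occurrence weight of 1/distance inside a fixed window, and a smoothed product over all query terms so that a candidate must relate to the whole query rather than to a single term. The names `expansion_scores`, `expand`, `window`, and `delta` are hypothetical.

```python
import math
from collections import defaultdict


def expansion_scores(docs, query, window=5, delta=0.01):
    """Score candidate expansion terms against the whole query.

    docs:   list of token lists (assumed already tokenized)
    query:  list of query terms
    window: how many positions on each side of a query term to inspect
    delta:  smoothing constant so an unseen (candidate, query term) pair
            penalizes the score instead of making it undefined
    """
    query_set = set(query)
    # prox[c][q] accumulates distance-discounted co-occurrences of c with q.
    prox = defaultdict(lambda: defaultdict(float))
    for tokens in docs:
        for i, tok in enumerate(tokens):
            if tok not in query_set:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                cand = tokens[j]
                if j != i and cand not in query_set:
                    # Closer positions contribute more (positional information).
                    prox[cand][tok] += 1.0 / abs(i - j)

    scores = {}
    for cand, by_query_term in prox.items():
        # Sum of logs = log of a product over ALL query terms, so a candidate
        # tied to only one query term scores poorly (whole-query relation).
        scores[cand] = sum(
            math.log(delta + by_query_term.get(q, 0.0)) for q in query
        )
    return scores


def expand(docs, query, k=2, **kwargs):
    """Return the query followed by the k best-scoring candidate terms."""
    scores = expansion_scores(docs, query, **kwargs)
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return list(query) + ranked[:k]


if __name__ == "__main__":
    docs = [
        ["fast", "car", "engine", "repair"],
        ["car", "engine", "oil"],
        ["fast", "food", "menu"],
    ]
    print(expand(docs, ["fast", "car"], k=2))
```

In this toy collection, "food" sits right next to "fast" but never appears near "car", so the product over query terms ranks it below terms such as "engine" that appear close to both. A purely pairwise co-occurrence score would not make that distinction, which is the kind of noise the abstract attributes to classical methods.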