宁波大学学报(理工版)
JOURNAL OF NINGBO UNIVERSITY (NSEE)
2014
Issue 4
pp. 38-41
4 pages
产品评论挖掘%无监督学习%微摘要%web N-gram
product review mining%unsupervised approach%micro-abstract%web N-gram
提出一种新的无监督的方法,对网络上存在的大量中文产品评论信息进行处理,生成简洁的非结构化的可读性强且具有代表性、简洁性的理解式评论微摘要。用N-gram语言模型来衡量可读性,用改进的点间互信息函数来衡量代表性,用同义词词林来计算词语相似度;将这种产品评论微摘要问题归结为优化问题,试图寻找具有可读性和代表性的简洁、低冗余的词组,并提出了一个启发式算法来解决这个优化问题。
This paper presents a new unsupervised approach for generating micro-abstracts from the large volume of Chinese product reviews available online. The generated summaries are abstractive and unstructured, and are intended to be readable, representative, and concise. We model readability with an N-gram language model, measure representativeness with a modified pointwise mutual information function, and compute word similarity with Tongyici Cilin. We formulate micro-abstract generation as an optimization problem that seeks a set of concise, low-redundancy phrases that are readable and represent the key opinions in the text, and we propose a heuristic algorithm to solve this problem efficiently.
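The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, and everything specific in it is an assumption made for illustration: the function names, the add-one-smoothed bigram model standing in for the web N-gram model, plain (unmodified) pointwise mutual information over sentence co-occurrence, Jaccard word overlap standing in for Tongyici Cilin similarity, the unweighted sum used as the objective, and the greedy selection used as the heuristic. Input reviews are assumed to be already segmented into tokens.

```python
import math
from collections import Counter
from itertools import combinations


def collect_counts(sentences):
    """Token counts, adjacent-bigram counts, sentence frequencies, co-occurrence counts."""
    uni, bi, df, co = Counter(), Counter(), Counter(), Counter()
    for s in sentences:
        uni.update(s)
        bi.update(zip(s, s[1:]))
        words = sorted(set(s))
        df.update(words)
        co.update(combinations(words, 2))  # each pair is stored in sorted order
    return uni, bi, df, co


def readability(phrase, uni, bi):
    """Mean add-one-smoothed bigram log-probability (assumed stand-in for a web N-gram model)."""
    vocab = len(uni)
    logps = [math.log((bi[(a, b)] + 1) / (uni[a] + vocab))
             for a, b in zip(phrase, phrase[1:])]
    return sum(logps) / len(logps)


def representativeness(phrase, df, co, n_sent):
    """Mean plain PMI over the distinct word pairs of a phrase (the paper's modification is not reproduced)."""
    pairs = list(combinations(sorted(set(phrase)), 2))
    if not pairs:
        return 0.0
    pmi = [math.log2((co[p] + 1) * n_sent / (df[p[0]] * df[p[1]])) for p in pairs]
    return sum(pmi) / len(pmi)


def jaccard(p, q):
    """Word-overlap similarity (assumed stand-in for Tongyici Cilin similarity)."""
    a, b = set(p), set(q)
    return len(a & b) / len(a | b)


def micro_abstract(sentences, k=3, max_sim=0.3, n_values=(2, 3)):
    """Greedily pick up to k high-scoring phrases whose pairwise similarity stays <= max_sim."""
    uni, bi, df, co = collect_counts(sentences)
    # Candidates: every contiguous 2- and 3-word phrase seen in the reviews.
    candidates = {tuple(s[i:i + n]) for s in sentences for n in n_values
                  for i in range(len(s) - n + 1)}

    def score(p):
        return readability(p, uni, bi) + representativeness(p, df, co, len(sentences))

    chosen = []
    for p in sorted(candidates, key=lambda p: (-score(p), p)):
        if all(jaccard(p, q) <= max_sim for q in chosen):
            chosen.append(p)
        if len(chosen) == k:
            break
    return chosen


# Hypothetical, pre-segmented toy reviews.
reviews = [
    ["battery", "life", "is", "long"],
    ["battery", "life", "is", "great"],
    ["screen", "is", "very", "bright"],
    ["screen", "is", "bright"],
    ["price", "is", "too", "high"],
]
print(micro_abstract(reviews, k=3))
```

The greedy loop reflects only the general shape of the problem stated in the abstract: score candidate phrases for readability and representativeness, then reject candidates that are too similar to phrases already selected. The thresholds `k` and `max_sim` are illustrative parameters, not values taken from the paper.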