计算机辅助设计与图形学学报
JOURNAL OF COMPUTER-AIDED DESIGN & COMPUTER GRAPHICS
2015
Issue 5
pp. 771-782 (12 pages)
汤斯亮%程璐%邵健%吴飞%鲁伟明
probabilistic graphical model%topic modeling%visualization
With the development of information technology, traditional print news is shifting toward Internet-based "new media". Meanwhile, data mining and natural language processing have advanced considerably in recent years, making it possible to mine the rich semantics and topics contained in news articles. However, information overload makes topic visualization a new challenge: how best to present the topics contained in massive volumes of Internet text. Latent Dirichlet allocation (LDA), a topic modeling method that has emerged in recent years, is widely regarded as the mainstream topic modeling technique. In this paper, we first introduce LDA-based probabilistic topic modeling of text and its development, and discuss the characteristics of topic modeling for news. We then summarize and compare several methods for visualizing news topics, classify them, and analyze the applicability and limitations of each. Finally, we summarize the state of news topic visualization and discuss future directions.