计算机工程与应用
COMPUTER ENGINEERING AND APPLICATIONS
2015
Issue 16
pp. 121-129
9 pages
郭瑞强%郭阿为%韩忠明%周萌%张伟
建模%时间序列%热点话题%脉冲噪声
modeling%time series%hot topics%pulse noise
微博、论坛等交互式网站上的热点话题是网络舆情的源头与集散地,早期发现与预测网络热点话题是舆情控制的关键。针对交互式网络热点话题,Yasuko Matsubara等人对信息传播的模式进行了建模,提出了SpikeM模型,该模型可以较好地反映信息传播的模式。但是针对热点话题呈现多峰的情况,该模型则无法拟合。且该模型假设针对某一事件,每个网络用户只能发布一次消息,这与实际情况不符。从实际情况出发(针对同一话题,网络用户可以多次发布消息),提出了脉冲时序行为动力模型(PTSDM)。假设多次发布消息的用户数服从幂律分布,从用户行为的角度分析话题的特征,在模型中引入脉冲干扰,使模型更具随机性,更符合客观实际,从而可以拟合不同类型的热点话题。采用两个数据集作为测试样本,进行了实验,实验表明了所构建模型的有效性。
Hot topics on microblogs, forums, and other interactive websites are the source and distribution center of online public opinion, so detecting and predicting them early is key to managing public opinion. For hot topics on interactive networks, Yasuko Matsubara et al. modeled the pattern of information diffusion and proposed the SpikeM model, which describes that pattern well. However, SpikeM cannot fit hot topics whose activity shows multiple peaks, and its assumption that each user posts at most once about an event is inconsistent with actual behavior. Starting from the observation that users can post about the same topic repeatedly, the authors propose a pulse time-series behavior dynamics model (PTSDM). They assume that the number of users who post multiple times follows a power-law distribution and analyze the characteristics of topics from the perspective of user behavior. Introducing pulse noise makes the model more stochastic and closer to reality, so that it is capable of fitting different kinds of hot topics. Experiments on two datasets show the effectiveness of the proposed model.