计算机科学与探索
JOURNAL OF FRONTIERS OF COMPUTER SCIENCE & TECHNOLOGY
2014
Issue 9
pp. 1076-1084
9 pages in total
柴变芳%赵晓鹏%贾彩燕%于剑
general community detection%massive content networks%stochastic block model%sampling
Without any prior knowledge about a network, the PPSB-DC model (popularity and productivity stochastic block model and discriminative content model) models the network's generative process using both content and links, which enables it to detect general communities and to identify the link patterns between communities. However, the parameter estimation algorithm for this probabilistic model is time-consuming and sensitive to the initial values of the link pattern parameters, which limits the model's application. This paper improves the parameter estimation algorithm and proposes EPPSBDC (efficient PPSB-DC), an efficient algorithm for general community detection in content networks. EPPSBDC speeds up estimation through sampling and parallelization, and reduces sensitivity to the initial parameters by introducing a prior on the link probabilities. Comparisons with similar algorithms on content networks demonstrate the effectiveness of EPPSBDC.