软件学报
JOURNAL OF SOFTWARE
2014
Issue 12
pp. 2808-2823
16 pages
微博%社区发现%关注关系%重叠社区
micro-blog%community detection%following relationship%overlapping community
在微博市场营销、个性化推荐等应用中,发现兴趣和网络结构双内聚的用户社区起着至关重要的作用。现阶段,绝大多数的用户社区发现算法往往将用户联系与用户内容相隔离,从而导致其社区发现结果不够合理,而少数综合用户联系和内容的用户社区发现算法较为复杂;LCA 算法是重叠社区发现算法中算法效率较高且社区质量较好的算法,然而,其在聚类时未考虑边的真实兴趣体现。针对这些问题,构建了以关注关系为网络节点、以关注关系之间是否有共同用户为关注关系潜在的边、以关注关系所关联用户的兴趣集的交集为关注关系的兴趣特征,构建微博网络 R-C 模型,并探讨了其进行微博用户社区发现的方法,分析了该方法的复杂度。最后,以新浪微博数据集为实验,对照节点CNM算法和LCA算法,从兴趣内聚和网络结构内聚两方面进行分析,发现该方法能够发现更好的微博用户社区。
Detecting user communities that are cohesive in both interests and network structure plays a crucial role in applications such as micro-blog marketing and personalized recommendation. Most current community detection algorithms treat the relationships between users separately from user-generated content, which leads to unreasonable community structures, while the few methods that combine the two factors are relatively complex. The link community algorithm (LCA) is an efficient overlapping community detection method that produces good-quality communities; however, it does not take the real interests reflected by a link into account when clustering. To address these issues, this paper proposes an R-C model of the micro-blog network, which takes following relationships as the network nodes, treats a user shared by two following relationships as the underlying edge between them, and uses the intersection of the interest sets of the two users in a following relationship as that relationship's interest characteristics. A community detection method based on the R-C model is then discussed, and its complexity is analyzed. Finally, in experiments on a Sina Weibo dataset, the method is compared with the node-based CNM algorithm and LCA in terms of interest cohesion and network-structure cohesion, and it is found to detect better micro-blog user communities.
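The construction of the R-C model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_rc_model`, the input layout (a list of `(follower, followee)` pairs and a dict of interest sets), and the treatment of each directed pair as its own node are all assumptions made here for illustration. The clustering step is not shown, since the abstract does not specify it.

```python
from itertools import combinations


def build_rc_model(follows, interests):
    """Build a toy R-C network (hypothetical helper, not from the paper).

    follows:   iterable of (follower, followee) pairs
    interests: dict mapping each user to a set of interest tags
    """
    # Each following relationship becomes one node of the R-C network.
    nodes = sorted(set(follows))

    # A relationship's interest feature is the intersection of the
    # interest sets of the two users it connects.
    features = {
        (u, v): interests.get(u, set()) & interests.get(v, set())
        for (u, v) in nodes
    }

    # Two relationships are joined by an edge when they share a user.
    edges = [
        (r1, r2)
        for r1, r2 in combinations(nodes, 2)
        if set(r1) & set(r2)
    ]
    return nodes, features, edges


# Made-up example data, for illustration only.
follows = [("a", "b"), ("b", "c"), ("d", "e")]
interests = {
    "a": {"music", "sports"},
    "b": {"music", "film"},
    "c": {"film"},
    "d": {"food"},
    "e": {"food", "music"},
}
nodes, features, edges = build_rc_model(follows, interests)
```

In this toy input, the relationships `("a", "b")` and `("b", "c")` share user `b` and are therefore linked, while `("d", "e")` stays isolated. The pairwise edge scan is quadratic in the number of relationships, which is acceptable for a sketch but would need indexing by user on a real dataset.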