Computer Technology and Development (计算机技术与发展)
2015, Issue 10, pp. 1-6 (6 pages)
Keywords: content recommendation algorithm; Tongyici Cilin (同义词词林); hierarchical clustering; TextRank; graph model
In a content-based recommender system, the accuracy of the initial user profile strongly affects the precision of later recommendations. The profile must therefore be extracted accurately from the small amount of user information available when the system starts, introducing as little noise as possible. Otherwise, later profile updates drift and recommendations become inaccurate. To address this problem, the paper proposes a method for building the initial user profile based on the TextRank algorithm. The small set of texts the user is interested in is first preprocessed and the sense of each word is determined, after which clustering is performed. For each resulting cluster, a TextRank model is built with word senses as its units, and a similarity influence factor, a co-occurrence influence factor, and a class-weight influence factor are introduced to improve the model's transition probability matrix. After iteration, the most important senses in each cluster are selected and combined into the final initial user profile. Experimental results show that the initial user profile obtained this way is fairly accurate and yields good recommendation performance.
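The abstract does not give the exact form of the influence factors, so the following is only a minimal sketch of the general idea: rank the word senses of one cluster with a weighted TextRank whose transition probability matrix is derived from co-occurrence and similarity scores. The function names, the linear blend controlled by `alpha`, the damping value, and the toy matrices are all assumptions made for illustration and are not taken from the paper. The class-weight factor is omitted.

```python
def combined_weights(cooccurrence, similarity, alpha=0.5):
    """Blend co-occurrence and similarity into edge weights.

    Assumption: a simple linear mix; the paper's own factors may differ.
    Both inputs are square matrices with values in [0, 1].
    """
    n = len(cooccurrence)
    return [
        [
            0.0 if i == j
            else alpha * cooccurrence[i][j] + (1.0 - alpha) * similarity[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]


def textrank(weights, damping=0.85, tol=1e-8, max_iter=200):
    """Weighted TextRank by power iteration; returned scores sum to 1."""
    n = len(weights)
    # Row-normalise the weights into a transition probability matrix.
    # A node without outgoing weight jumps uniformly to every node.
    transition = []
    for row in weights:
        total = sum(row)
        if total > 0:
            transition.append([w / total for w in row])
        else:
            transition.append([1.0 / n] * n)

    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new_scores = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


def top_senses(senses, scores, k):
    """Return the k highest-scoring senses, best first."""
    ranked = sorted(zip(senses, scores), key=lambda pair: pair[1], reverse=True)
    return [sense for sense, _ in ranked[:k]]


# Toy cluster of four word senses (hypothetical data).
senses = ["football", "match", "goal", "weather"]
cooccurrence = [
    [0.0, 1.0, 0.6, 0.0],
    [1.0, 0.0, 0.4, 0.0],
    [0.6, 0.4, 0.0, 0.2],
    [0.0, 0.0, 0.2, 0.0],
]
similarity = [
    [0.0, 0.8, 0.6, 0.1],
    [0.8, 0.0, 0.5, 0.1],
    [0.6, 0.5, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]

scores = textrank(combined_weights(cooccurrence, similarity))
print(top_senses(senses, scores, k=2))
```

In a full pipeline, this ranking would be run once per cluster and the top senses of all clusters merged into the initial profile, as the abstract describes.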