Computer Technology and Development (计算机技术与发展)
2015, Issue 2, pp. 55-59 (5 pages)
王全民 (Wang Quanmin); 苗雨 (Miao Yu); 何明 (He Ming); 郑爽 (Zheng Shuang)
alternating least squares (ALS); collaborative filtering; Hadoop; iterative MapReduce
Collaborative filtering based on matrix factorization is a recommendation technique proposed in recent years. Each predicted rating, however, draws on a large volume of known ratings, and the computation must also hold large feature matrices in storage, so running the recommender on a single node hits bottlenecks in both computing time and computing resources. This paper studies in depth the principles and characteristics of an existing parallel implementation of ALS (alternating least squares) based collaborative filtering on Hadoop and identifies why traditional iterative algorithms run inefficiently on Hadoop. Following the idea of iterative MapReduce, it proposes loop-aware task scheduling, static data caching, job loop control, and detection of the iteration termination condition (fixed-point detection). Experiments on the Netflix data set show that iterative MapReduce improves the efficiency of the parallel computation of ALS-based collaborative filtering.
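The iterative structure the abstract refers to can be illustrated with a minimal single-process sketch of ALS. It is not the authors' Hadoop implementation: every function and parameter name below is hypothetical, plain L2 regularization is an assumption, and the pure-Python solver stands in for whatever linear algebra a real job would use. The sketch only shows three of the ideas named above in miniature: static rating data that is built once and reused by every iteration, a loop controlled by one driver, and termination when the objective stops changing.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve(a, b):
    """Solve the small dense system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def update_side(ratings_by_row, fixed, k, lam):
    """Least-squares update of one side's factors while the other side stays fixed."""
    out = {}
    for row, entries in ratings_by_row.items():
        a = [[lam if i == j else 0.0 for j in range(k)] for i in range(k)]
        b = [0.0] * k
        for col, r in entries:
            f = fixed[col]
            for i in range(k):
                b[i] += r * f[i]
                for j in range(k):
                    a[i][j] += f[i] * f[j]
        out[row] = solve(a, b)
    return out


def objective(ratings, users, items, lam):
    """Squared error on known ratings plus L2 penalty on all factors."""
    sq_err = sum((r - dot(users[u], items[i])) ** 2 for u, i, r in ratings)
    reg = sum(dot(f, f) for f in users.values()) + sum(dot(f, f) for f in items.values())
    return sq_err + lam * reg


def als(ratings, k=2, lam=0.1, max_iters=50, tol=1e-4, seed=0):
    rng = random.Random(seed)
    # Static data: indexed once, then reused unchanged by every iteration.
    by_user, by_item = {}, {}
    for u, i, r in ratings:
        by_user.setdefault(u, []).append((i, r))
        by_item.setdefault(i, []).append((u, r))
    items = {i: [rng.random() for _ in range(k)] for i in by_item}
    users, history, prev = {}, [], float("inf")
    for _ in range(max_iters):  # loop control lives in one driver
        users = update_side(by_user, items, k, lam)
        items = update_side(by_item, users, k, lam)
        cur = objective(ratings, users, items, lam)
        history.append(cur)
        if abs(prev - cur) < tol:  # termination check: objective has stopped changing
            break
        prev = cur
    return users, items, history


# Toy (user, item, rating) triples, made up for illustration only.
ratings = [("u1", "i1", 5.0), ("u1", "i2", 3.0), ("u2", "i1", 4.0),
           ("u2", "i3", 1.0), ("u3", "i2", 2.0), ("u3", "i3", 5.0)]
users, items, history = als(ratings)
print(len(history), "iterations, final objective", history[-1])
```

In a plain MapReduce setting each of the two `update_side` calls would typically be a separate job that reloads the ratings, which is the kind of repeated cost that caching static data across iterations is meant to avoid.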