山东大学学报(工学版)
Journal of Shandong University (Engineering Science)
2015
Issue 5
22-28
7 pages
何东之%张吉沣%赵鹏飞
MapReduce%云计算平台%二分网络%不确定性传播算法%分布式
MapReduce%cloud computing platform%bipartite network%probabilistic spreading algorithm%distributed
为了克服单机串行不确定性传播算法处理大规模数据集的局限,采用MapReduce编程模型对算法进行并行化实现。将单机算法按照算法流程进行拆分,每一步对应一个MapReduce程序。每一步的输入及输出数据都存储在Hadoop分布式文件系统上。用命中率对比并行化的不确定性传播算法与全局排名算法的性能。对比不同数据量、不同节点数时并行化的不确定性传播算法的加速比。试验结果表明,不确定性传播算法MapReduce并行化后部署在Hadoop集群上运行,命中率显著高于全局排名算法,且有着较好的并行性,扩大了单机算法所能处理的数据规模且提高了算法的运算速度。
To overcome the limitations of the serial probabilistic spreading algorithm on large-scale datasets, the algorithm was parallelized with the MapReduce programming model. The serial algorithm was split along its workflow into steps, each implemented as one MapReduce job, and the input and output data of every step were stored in the Hadoop Distributed File System. Hit ratio was used to compare the parallelized probabilistic spreading algorithm with the global ranking method, and speedup was measured for different data volumes and different numbers of nodes. Experimental results showed that the MapReduce-based probabilistic spreading algorithm, deployed on a Hadoop cluster, achieved a significantly higher hit ratio than the global ranking method and parallelized well. It extended the data scale beyond what the serial algorithm could handle and increased the computation speed of the algorithm.
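
For orientation, the following is a minimal single-machine sketch of two-step probabilistic spreading on a user-item bipartite network, together with one common definition of hit ratio. It is not the authors' MapReduce implementation: the function names, the dictionary-based input format, and the exact hit-ratio definition are assumptions made for illustration. Each commented step is roughly the kind of unit the abstract describes as a separate MapReduce job.

```python
from collections import defaultdict


def probs_scores(user_items, target):
    """Score items for `target` by two-step resource spreading.

    `user_items` maps each user to the set of items that user collected
    (hypothetical input format, chosen for this sketch).
    """
    # Build the item -> users index, the other side of the bipartite network.
    item_users = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)

    # Step 1 (items -> users): every item collected by the target holds one
    # unit of resource and splits it equally among the users who collected it.
    user_resource = defaultdict(float)
    for item in user_items[target]:
        share = 1.0 / len(item_users[item])
        for user in item_users[item]:
            user_resource[user] += share

    # Step 2 (users -> items): every user splits the resource it received
    # equally among the items it collected.
    item_score = defaultdict(float)
    for user, resource in user_resource.items():
        share = resource / len(user_items[user])
        for item in user_items[user]:
            item_score[item] += share
    return dict(item_score)


def recommend(user_items, target, n):
    """Return the top-n items the target has not collected yet."""
    scores = probs_scores(user_items, target)
    candidates = [(score, item) for item, score in scores.items()
                  if item not in user_items[target]]
    # Sort by descending score, breaking ties by item name for determinism.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in candidates[:n]]


def hit_ratio(recommended, held_out):
    """Fraction of held-out items that appear in the recommendation list."""
    held_out = set(held_out)
    if not held_out:
        return 0.0
    return len(set(recommended) & held_out) / len(held_out)
```

In this sketch the total resource is conserved across both steps, so the scores sum to the number of items the target user collected. In a distributed setting, steps 1 and 2 are both group-and-sum operations keyed by user and by item respectively, which is why they map naturally onto map and reduce phases.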