山东农业科学
SHANDONG AGRICULTURAL SCIENCES
2015
Issue 8
pp. 119-122, 126
5 pages in total
袁伟%罗丽琼%赵路%张军情%付思芮%鲁绍坤
Hadoop%Hbase%MapReduce%性能测试%农业大数据
Hadoop%Hbase%MapReduce%Performance test%Agricultural big data
随着农业大数据时代的来临,传统串行程序及关系数据库已经不能满足对大数据处理的需求,使用分布式平台对数据进行处理逐渐取代传统的数据处理技术。本文使用 Hadoop 分布式平台,结合非关系型数据库 Hbase 和并行编程模型 MapReduce,对香格里拉地区酿酒葡萄种植区的环境数据的存储和计算进行了设计,测试了 Hbase 对数据的存储性能以及 MapReduce 用于回归分析的性能,并将 MapReduce 并行计算程序与单机串行程序进行了性能对比。结果表明,通过对 Hbase 进行合适的配置,数据写入时间随着节点的增加而减少,存储性能具有良好的扩展性;MapReduce 在处理少量数据时效率低于串行程序,但随着数据量增加,其计算效率明显优于串行程序。
With the arrival of the agricultural big data era, traditional serial programs and relational databases can no longer meet the demands of big data processing, and distributed platforms are gradually replacing traditional data processing techniques. In this paper, the Hadoop distributed platform, combined with the non-relational database Hbase and the parallel programming model MapReduce, was used to design the storage and computation of environmental data from the wine grape growing area of the Shangri-La region. The performance of Hbase for data storage and of MapReduce for regression analysis was tested, and the MapReduce parallel program was compared with a single-machine serial program. The results showed that, with appropriate Hbase configuration, data writing time decreased as the number of nodes increased, indicating good scalability of storage performance; MapReduce was less efficient than the serial program on small amounts of data, but as the data volume increased, its computing efficiency became clearly superior to that of the serial program.
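The abstract does not say which regression model or job layout the authors used. As a minimal illustrative sketch, assuming a one-variable least-squares fit, the following plain Python (no Hadoop involved) shows why regression suits the MapReduce model: each data split is reduced to a few additive sums in the map step, and the reduce step only has to add those sums together before solving for the line. All function names here are hypothetical and are not taken from the paper.

```python
from functools import reduce


def map_partition(records):
    """Map step: summarize one data split as (n, sum_x, sum_y, sum_xx, sum_xy)."""
    n = sx = sy = sxx = sxy = 0.0
    for x, y in records:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    return (n, sx, sy, sxx, sxy)


def combine(a, b):
    """Reduce step: the partial sums are additive, so merge them element-wise."""
    return tuple(u + v for u, v in zip(a, b))


def fit_line(stats):
    """Solve the least-squares normal equations from the merged sums."""
    n, sx, sy, sxx, sxy = stats
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("x values are constant; slope is undefined")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


def regress(partitions):
    """Run the map step on every split, merge the results, and fit the line."""
    return fit_line(reduce(combine, map(map_partition, partitions)))


if __name__ == "__main__":
    # Two hypothetical splits of (x, y) readings that lie on y = 2x + 1.
    splits = [[(0, 1), (1, 3)], [(2, 5), (3, 7)]]
    print(regress(splits))
```

Because every split is processed independently and only five numbers per split reach the reduce step, the per-job overhead dominates on small inputs while large inputs benefit from parallel splits, which is consistent with the trade-off the abstract reports.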