计算机工程
COMPUTER ENGINEERING
2015
Issue 6
pp. 49-55
7 pages in total
high energy physics data%big data%HBase database%ROOT framework%BEAN framework%MapReduce framework
A high energy physics collider produces tens of billions of events over its lifetime, from which a physics analysis selects only a few thousand meaningful ones, making the analysis a typical big data processing and data mining application. It is therefore important to design efficient data structures and storage and access mechanisms so that meaningful events can be selected quickly. This paper introduces the data structure, storage, and processing techniques for events, analyzes the characteristics of high energy physics data, and proposes a new data storage and processing system for high energy physics based on HBase, ROOT, BEAN, and MapReduce. The system uses HBase to store data and MapReduce to implement parallel processing, and adopts ROOT and BEAN as the high energy physics analysis frameworks; the specific design and implementation are also described. Test results show that, compared with a traditional high energy physics data storage system, the system processes data faster and uses I/O and CPU resources more effectively when the pre-selection service takes effect.