浙江师范大学学报(自然科学版)
JOURNAL OF ZHEJIANG NORMAL UNIVERSITY (NATURAL SCIENCES)
2015
Issue 2
pp. 179-184 (6 pages)
张宇翔%赵建民%朱信忠%徐慧英
HDFS%small files%SequenceFile%merging of files%metadata storage%caching strategies
HDFS was originally designed only to handle large files well and was not optimized for massive numbers of small files. As a result, using HDFS to manage massive numbers of small fingerprint data files leads to problems such as excessive memory load on the NameNode and poor upload and query performance. This paper uses the SequenceFile serialization technique to merge small files and applies targeted optimizations to small-file merging, metadata storage, and caching strategies. Experimental results show that the proposed scheme effectively relieves the NameNode memory overload and improves upload and query performance for massive numbers of small fingerprint data files.
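The general idea of merging small files behind a compact index and a read cache can be illustrated with a minimal in-memory sketch. This is not the paper's implementation and does not use the Hadoop SequenceFile API or its on-disk format. The names `merge_small_files` and `CachedReader`, the `.fpt` file names, and the choice of least-recently-used (LRU) eviction are all assumptions made for illustration, since the abstract does not specify them.

```python
import io
from collections import OrderedDict


def merge_small_files(files):
    """Pack {name: bytes} into one blob and return (blob, index).

    The index maps each name to (offset, length) inside the blob, so one
    large object plus a compact index stands in for many per-file entries.
    """
    blob = io.BytesIO()
    index = {}
    for name, data in files.items():
        index[name] = (blob.tell(), len(data))
        blob.write(data)
    return blob.getvalue(), index


class CachedReader:
    """Reads small files back out of the merged blob through an LRU cache.

    LRU is an assumed policy, chosen only to make the sketch concrete.
    """

    def __init__(self, blob, index, capacity=2):
        self.blob = blob
        self.index = index
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = 0

    def read(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)  # mark as most recently used
            self.hits += 1
            return self.cache[name]
        offset, length = self.index[name]
        data = self.blob[offset:offset + length]
        self.cache[name] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return data
```

In this sketch a lookup costs one index access and one slice of the merged blob, and repeated reads of the same file are served from the cache. In a real HDFS deployment the blob would correspond to a merged container file and the index to separately stored metadata, which is what reduces the number of objects the NameNode has to track.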