电脑知识与技术
COMPUTER KNOWLEDGE AND TECHNOLOGY
2013
Issue 21
Pages 4918-4920, 4932
4 pages in total
storage system%duplicate data detection%Rabin fingerprint%content-defined chunking%Galois field
The Rabin fingerprint algorithm is computationally efficient and has good randomness, and it confines the effect of a data change on the sequence of consecutive fingerprints to a local region, so it is widely used in duplicate data detection. This paper analyzes the arithmetic of Rabin fingerprints over the Galois field GF(2^n) and derives a fast formula for computing the fingerprint of a fixed-length character sequence as the sliding window moves. The application of the Rabin fingerprint algorithm to duplicate data detection is described in pseudocode and implemented in VC++. Three types of files (Word documents, program source code, and BMP images) taken from an ordinary computer serve as the test data set, and the test results show that the algorithm is effective.
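
The sliding-window update mentioned in the abstract can be illustrated with a short sketch. This is not the paper's VC++ implementation or its exact formula; it is a minimal Python illustration of the general technique, under assumptions that are not in the source: the modulus x^64 + x^4 + x^3 + x + 1, a 48-byte window, an 11-bit cut mask, SHA-256 for comparing chunks, and all function names. Treating the window bytes as a polynomial A(x) over GF(2), the fingerprint is A(x) mod P(x), and moving the window by one byte costs one shift-and-reduce plus one table lookup instead of a full recomputation.

```python
import hashlib

DEGREE = 64
POLY = (1 << DEGREE) | 0x1B   # assumed modulus: x^64 + x^4 + x^3 + x + 1
WINDOW = 48                   # assumed sliding-window length in bytes
MASK = (1 << 11) - 1          # assumed cut test: roughly one cut per 2 KiB


def append_byte(f, b):
    """Return (f * x^8 + b) mod POLY; polynomials over GF(2) are stored as ints."""
    for _ in range(8):
        f <<= 1
        if f >> DEGREE:       # degree reached DEGREE: subtract (XOR) the modulus
            f ^= POLY
    return f ^ b


def fingerprint(data):
    """Fingerprint of a whole byte string, computed from scratch."""
    f = 0
    for b in data:
        f = append_byte(f, b)
    return f


# OUT[b] = (b * x^(8*WINDOW)) mod POLY: what the byte leaving the window
# contributes after the window has been shifted by one byte.
OUT = [fingerprint(bytes([b]) + bytes(WINDOW)) for b in range(256)]


def rolling_fingerprints(data):
    """Yield (end, fp), where fp is the fingerprint of data[end-WINDOW:end]."""
    f = 0
    for i, b in enumerate(data):
        f = append_byte(f, b)            # shift in the new byte
        if i >= WINDOW:
            f ^= OUT[data[i - WINDOW]]   # cancel the byte that left the window
        if i + 1 >= WINDOW:
            yield i + 1, f


def split_chunks(data):
    """Content-defined chunking: cut where the low fingerprint bits are all ones."""
    chunks, start = [], 0
    for end, f in rolling_fingerprints(data):
        if end - start >= WINDOW and (f & MASK) == MASK:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])      # trailing bytes form the last chunk
    return chunks


def count_duplicates(chunks):
    """Count chunks whose SHA-256 digest has already been seen."""
    seen, duplicates = set(), 0
    for chunk in chunks:
        digest = hashlib.sha256(chunk).digest()
        if digest in seen:
            duplicates += 1
        else:
            seen.add(digest)
    return duplicates
```

Because each cut depends only on the bytes inside the current window (plus the minimum chunk length), an insertion or deletion typically changes only the chunks near the edit. Later cut points tend to fall on the same content again, which is the locality property the abstract refers to.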