计算机工程与应用
計算機工程與應用
계산궤공정여응용
COMPUTER ENGINEERING AND APPLICATIONS
2014
Issue 23
198-202
, 5 pages in total
重复数据删除%集合元素查询%布鲁姆过滤器%MD5%假阳性误判率
重複數據刪除%集閤元素查詢%佈魯姆過濾器%MD5%假暘性誤判率
중복수거산제%집합원소사순%포로모과려기%MD5%가양성오판솔
data deduplication%set element query%Bloom filter%MD5%false positive rate
针对文件级单布鲁姆过滤器排重算法只能以文件为单位进行数据排重,数据块级单布鲁姆过滤器排重算法耗时过多的缺点,采用2个布鲁姆过滤器,创建文件级和数据块级2级数据排重的算法结构。实验结果表明,双布鲁姆过滤器排重算法可以以数据块为单位对数据排重,在保持低假阳性误判率的同时,相比数据块级单布鲁姆过滤器排重算法耗时缩短了43%~68%。
針對文件級單佈魯姆過濾器排重算法隻能以文件為單位進行數據排重,數據塊級單佈魯姆過濾器排重算法耗時過多的缺點,採用2箇佈魯姆過濾器,創建文件級和數據塊級2級數據排重的算法結構。實驗結果錶明,雙佈魯姆過濾器排重算法可以以數據塊為單位對數據排重,在保持低假暘性誤判率的同時,相比數據塊級單佈魯姆過濾器排重算法耗時縮短瞭43%~68%。
침대문건급단포로모과려기배중산법지능이문건위단위진행수거배중,수거괴급단포로모과려기배중산법모시과다적결점,채용2개포로모과려기,창건문건급화수거괴급2급수거배중적산법결구。실험결과표명,쌍포로모과려기배중산법가이이수거괴위단위대수거배중,재보지저가양성오판솔적동시,상비수거괴급단포로모과려기배중산법모시축단료43%~68%。
File-level single-Bloom-filter deduplication can remove duplicate data only at whole-file granularity, while block-level single-Bloom-filter deduplication is too time-consuming. To address both shortcomings, this paper uses two Bloom filters to build a two-level deduplication structure that operates at the file level and the block level. Experimental results show that the double-Bloom-filter algorithm deduplicates data at block granularity, keeps the false positive rate low, and takes 43% to 68% less time than the block-level single-Bloom-filter algorithm.
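
The two-level structure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the two filters interact, how MD5 is applied, or which parameters were used, so everything below is an assumption made for illustration: the class names `BloomFilter` and `DoubleBloomDeduplicator`, fixed-size blocks, the block size, the filter size, the number of hash functions, and the derivation of bit positions from one MD5 digest by double hashing. The sketch checks a whole-file fingerprint first and falls back to block-level checks only for files not seen before, which is one plausible way the reported time saving could arise.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter; bit positions are derived from an MD5 digest (assumed scheme)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: split the 16-byte MD5 digest into two 64-bit values.
        digest = hashlib.md5(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:], "big") | 1  # force an odd, non-zero step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class DoubleBloomDeduplicator:
    """Hypothetical two-level deduplicator: file-level filter first, then block-level."""

    def __init__(self, block_size=4096, num_bits=1 << 20, num_hashes=4):
        self.block_size = block_size
        self.file_filter = BloomFilter(num_bits, num_hashes)
        self.block_filter = BloomFilter(num_bits, num_hashes)

    def add_file(self, data):
        """Return the blocks judged new; an empty list means a duplicate (or empty) file."""
        file_key = hashlib.md5(data).digest()
        if file_key in self.file_filter:
            return []  # file probably seen before: skip all block-level work
        self.file_filter.add(file_key)

        new_blocks = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            block_key = hashlib.md5(block).digest()
            if block_key not in self.block_filter:
                self.block_filter.add(block_key)
                new_blocks.append(block)
        return new_blocks
```

In this sketch a repeated file costs one MD5 computation and one filter lookup, while a new file that shares some blocks with earlier files still has only its unseen blocks reported. Because Bloom filters admit false positives, a new file or block can occasionally be misjudged as a duplicate; a real system would need to size the filters for an acceptable false positive rate or confirm matches against stored fingerprints.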