A Multi-Stage Dynamic Optimization Algorithm for Bloom Filters Hash Functions Number
Cited by: 1
Abstract A standard Bloom filter must know the number of distinct elements in the data set before operation in order to determine the optimal number of hash functions, but the distribution of the data set is not easy to obtain in advance. This paper proposes Multi-stage Dynamic-optimization Bloom Filters (MDBF), which dynamically optimize the number of hash functions over multiple stages. MDBF divides the element insertion process into several stages. In each stage it analyzes the distribution of the inserted elements from the usage of the bit vector and dynamically adjusts the number of hash functions toward the optimum. Experiments show that MDBF adapts to complex cases with element multiplicity and skewed distributions, selects the optimal number of hash functions, and achieves a lower false positive probability.
Authors Zhang Wei (张伟), Wang Ruchuan (王汝传)
Source Acta Electronica Sinica (《电子学报》), 2011, No. 4, pp. 877-881 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
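Funding National Natural Science Foundation of China (No. 60973193, 61003039, 61003236); Natural Science Foundation of Jiangsu Province (No. BK2008451); Provincial Special Fund for Modern Service Industry Development (No. 0801019C); China Postdoctoral Science Foundation (No. 20090451241); Jiangsu Higher Education Science and Technology Innovation Program (No. CX09B-153Z, CX10B-260Z, CX10B-261Z, CX10B-262Z); Jiangsu Province "Six Talent Peaks" Project (No. 2008118); Fund of the Jiangsu Key Laboratory of Computer Information Processing Technology (2010)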
Keywords Bloom filters; hash function; skewed distribution; false positive probability
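About the authors Zhang Wei (张伟), male, born September 1973 in Taixing, Jiangsu. Ph.D., associate professor at Nanjing University of Posts and Telecommunications, where he is currently a postdoctoral researcher at the Communication and Information Engineering postdoctoral station. His main research interests are network anomaly detection and malicious code analysis. E-mail: zhangw@njupt.edu.cn. Wang Ruchuan (王汝传), male, born September 1943 in Hefei, Anhui. Professor and doctoral supervisor at Nanjing University of Posts and Telecommunications. His main research interests are computer software, computer networks and grid computing, peer-to-peer computing, information security, wireless sensor networks, and mobile agents. E-mail: wangle@njupt.edu.cn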
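
The dependence described in the abstract, where the best number of hash functions depends on the number of distinct elements, can be illustrated with the textbook Bloom filter formulas. The sketch below is not the MDBF algorithm from the paper; it only shows the standard optimum k = (m/n) ln 2, the usual false positive approximation, and a well-known way to estimate the distinct-element count from the fraction of set bits. The function names and the sample sizes (m = 8192 bits, n = 1000 elements) are assumptions chosen for illustration.

```python
import math


def optimal_k(m: int, n: int) -> int:
    """Textbook optimum k = (m / n) * ln 2, rounded and kept at least 1."""
    return max(1, round((m / n) * math.log(2)))


def false_positive_rate(m: int, n: int, k: int) -> float:
    """Standard approximation p = (1 - e^(-k*n/m))^k for m bits, n elements, k hashes."""
    return (1.0 - math.exp(-k * n / m)) ** k


def estimate_distinct(m: int, k: int, bits_set: float) -> float:
    """Estimate the number of distinct inserted elements from the set-bit count.

    Inverts the expectation E[bits_set] = m * (1 - e^(-k*n/m)).
    This is a generic estimator, not the per-stage analysis used by MDBF.
    """
    if bits_set >= m:
        raise ValueError("bit vector is saturated; the estimate is unbounded")
    return -(m / k) * math.log(1.0 - bits_set / m)


if __name__ == "__main__":
    m, n = 8192, 1000          # hypothetical filter size and element count
    k = optimal_k(m, n)
    print("k =", k)
    print("false positive rate =", false_positive_rate(m, n, k))
```

With these sample sizes `optimal_k(8192, 1000)` evaluates to 6. If the true n is unknown or the input is skewed (many repeated elements), a k chosen from a wrong guess of n moves away from this optimum, which is the problem the abstract sets out to address.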