Abstract
A new method is proposed for estimating the length distribution of the original, unsampled flows from sampled packet-flow data. First, the probability distribution of the original flow length that produces a sampled flow of a given fixed length is analyzed, and a very simple estimator for long flows is derived from it. Then a system of equations for short flows is constructed and solved using the heavy-tailed property of flow lengths and the least-squares method, which yields the short-flow estimate. Theoretical analysis shows that the method keeps time complexity under control, and experimental results show that the estimated distributions are accurate, with precision comparable to that of the expectation-maximization (EM) algorithm.
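The abstract does not state its formulas, so the following is only a minimal sketch of the kind of model it refers to, under assumptions that are not taken from the paper: packets are sampled independently with probability p, so a flow of n packets yields a sampled flow whose length is Binomial(n, p), and a long flow is estimated by the common inverse-scaling rule k / p. The function names (`thinning_pmf`, `expected_sampled_counts`, `scale_long_flow`) are hypothetical and chosen for illustration.

```python
from math import comb


def thinning_pmf(n, k, p):
    """P(sampled length = k | original length = n), assuming each packet
    is kept independently with probability p (an assumption of this sketch)."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_sampled_counts(orig_counts, p):
    """Forward model: map {original length: number of flows} to
    {sampled length k >= 1: expected number of sampled flows}.
    Flows with no sampled packet (k = 0) are unobservable and omitted."""
    expected = {}
    for n, flows in orig_counts.items():
        for k in range(1, n + 1):
            expected[k] = expected.get(k, 0.0) + flows * thinning_pmf(n, k, p)
    return expected


def scale_long_flow(k, p):
    """Inverse-scaling estimate of the original length of a long flow.
    This is a common simple estimator, assumed here for illustration;
    the paper's own long-flow estimator may differ."""
    return k / p


if __name__ == "__main__":
    # 100 original flows of 2 packets each, sampled with p = 0.5
    print(expected_sampled_counts({2: 100}, 0.5))
    # a sampled flow of 50 packets at p = 0.5
    print(scale_long_flow(50, 0.5))
```

Inverting the forward model for short flows amounts to solving a linear system in the unknown original counts, which is where a least-squares fit of the kind the abstract mentions would enter; that step is left out here because the abstract gives no detail on how the equations are set up.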
Source
Journal of Southeast University (Natural Science Edition) (《东南大学学报(自然科学版)》)
Indexed in: EI, CAS, CSCD, PKU Core Journals
2006, No. 3, pp. 467-471 (5 pages)
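Funding
National Key Basic Research and Development Program of China (973 Program) (2003CB314803)
Key Project of Science and Technology Research of the Ministry of Education (105084)
Jiangsu Provincial Key Laboratory of Network and Information Security (BM2003201)
Jiangsu Postdoctoral Research Funding Program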
Keywords
packet sampling
IP flows
probability
least-squares method
About the author
Liu Weijiang (born 1969), male, Ph.D., professor, wjliu@njnet.edu.cn.