Abstract
To protect the privacy of traffic data, this paper proposes a traffic classification method based on federated semi-supervised learning. It targets the setting in which the server holds a small amount of labeled traffic while the clients hold large amounts of unlabeled, non-independent and identically distributed (non-IID) traffic that they do not share with one another, which makes it difficult to aggregate a traffic classification model. A parameter decomposition strategy minimizes the mutual interference between the supervised and unsupervised learning tasks. Inter-client consistency regularization maximizes the consensus among clients in similar network segments, which addresses the problem of learning from variable, small-sample data. In addition, only a sparsified parameter difference matrix is transmitted when federated learning parameters are exchanged. Experimental results show that the method enables multiple parties to share and learn from unlabeled traffic while preserving the privacy of the participating clients' traffic data. The classification accuracy reaches 91.86%, and the communication cost is significantly reduced.
Authors
孙重鑫
陈博
卜佑军
张德升
王涵
SUN Chongxin; CHEN Bo; BU Youjun; ZHANG Desheng; WANG Han (Information Engineering University, Zhengzhou 450001, China; Purple Mountain Laboratory, Nanjing 211100, China)
Source
《信息工程大学学报》
2024, No. 1, pp. 92-99 (8 pages)
Journal of Information Engineering University
Funding
Supported by the National Natural Science Foundation of China (62176264).
Keywords
traffic classification
deep learning
federated learning
semi-supervised learning
About the author
SUN Chongxin (born 1997), male, master's student. His main research interests are network security and deep learning.