Abstract
In industrial fault classification, labeled samples are scarce and manual labeling is costly, which limits classifier accuracy, while the large pool of information-rich unlabeled samples goes largely unused. To address this, a semi-supervised fault classification model (AL-OPF) is proposed that combines active learning (AL) with the optimum-path forest (OPF) algorithm. The method first scores the value of each sample by combining the BvSB (best-versus-second-best) criterion with a cosine similarity criterion, selects the high-value samples in a ranked batch mode, and obtains their labels to expand the initial labeled sample set. It then performs semi-supervised label propagation by constructing an optimum-path forest. Finally, the model is validated on a pipeline fault dataset collected in the laboratory. The experimental results show that, with 10% of the samples labeled, the method reaches an overall recognition accuracy of 96.68%. Compared with active learning methods that sample one instance at a time and with semi-supervised methods that extract global structural information from the training samples using distance metrics, the proposed method achieves higher Recall and F1-score values.
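The sample-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the BvSB and cosine similarity scores are combined, so the mixing weight `alpha`, the choice to measure similarity against the already labeled samples, and all function names below are assumptions made for illustration only.

```python
import numpy as np


def bvsb_margin(proba):
    """Best-versus-second-best margin per sample; a small margin means high uncertainty."""
    ordered = np.sort(proba, axis=1)
    return ordered[:, -1] - ordered[:, -2]


def max_cosine_similarity(X_unlabeled, X_labeled):
    """Largest cosine similarity between each unlabeled sample and any labeled sample."""
    a = X_unlabeled / np.maximum(np.linalg.norm(X_unlabeled, axis=1, keepdims=True), 1e-12)
    b = X_labeled / np.maximum(np.linalg.norm(X_labeled, axis=1, keepdims=True), 1e-12)
    return (a @ b.T).max(axis=1)


def select_batch(proba, X_unlabeled, X_labeled, batch_size, alpha=0.5):
    """Rank unlabeled samples by a combined value score and return the top indices.

    alpha is a hypothetical mixing weight (assumption, not taken from the paper):
    uncertain samples (small BvSB margin) that are dissimilar to the labeled set
    receive the highest value.
    """
    uncertainty = 1.0 - bvsb_margin(proba)
    redundancy = max_cosine_similarity(X_unlabeled, X_labeled)
    value = alpha * uncertainty + (1.0 - alpha) * (1.0 - redundancy)
    # Sort in descending order of value and keep one batch.
    return np.argsort(-value, kind="stable")[:batch_size]
```

Here `proba` would be the class-probability output of whatever classifier is currently trained on the labeled set. The returned indices identify the samples to send for manual labeling before the label-propagation stage.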
Authors
Li Tian
Feng Zao
Zhu Xuefeng
Faculty of Information Engineering & Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China
Source
Journal of Electronic Measurement and Instrumentation (《电子测量与仪器学报》)
CSCD
Peking University Core Journal
2022, Issue 12, pp. 67-76 (10 pages)
Funding
Supported by the National Natural Science Foundation of China (61563024)
Keywords
active learning
optimum-path forest
semi-supervised
fault classification
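About the authors
Li Tian received a bachelor's degree from Zhoukou Normal University in 2020 and is currently a master's student at Kunming University of Science and Technology. Research interests: signal processing and machine learning algorithms. E-mail: 1364381330@qq.com; Corresponding author: Feng Zao received a master's degree from Newcastle University, UK, in 2009 and a Ph.D. in engineering from the University of Bradford, UK, in 2014, and is currently an associate professor at Kunming University of Science and Technology. Research interests: acoustic nondestructive testing, data mining, and machine learning algorithms. E-mail: 6483975@qq.com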