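Abstract: Deep active learning (DAL) has been applied successfully to the annotation of classification data, but selecting the samples that most improve model performance remains difficult. This paper proposes a semi-automatic annotation method for classification data based on disputes over weak labels, Dispute about Weak Label based Deep Active Learning (DWLDAL), which iteratively selects the samples that the model finds hard to distinguish and passes them to human annotators for accurate labeling. The method comprises a pseudo-label generator and weak-label generators. The pseudo-label generator is trained on the accurately labeled dataset and produces pseudo-labels for the unlabeled data; each weak-label generator is trained on a random subset of the pseudo-labeled data. A committee of weak-label generators decides which unlabeled samples are the most disputed, and those samples are sent for manual annotation. Targeting text classification, the method is validated experimentally on the public datasets IMDB (Internet Movie Database), 20NEWS (20 Newsgroups), and chnsenticorp (chnsenticorp_htl_all). Three voting decision schemes are evaluated from two perspectives: the accuracy of data annotation and the accuracy of the classification task. Compared with the existing method Snuba, DWLDAL raises the F1 score for data annotation by 30.22%, 14.07%, and 2.57%, respectively, and the F1 score for the classification task by 1.01%, 22.72%, and 4.83%, respectively.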
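The committee-based selection step summarized in this abstract can be illustrated with a minimal sketch. The abstract does not say which three voting schemes were evaluated, so the vote-entropy score used here is an assumption, and the names `vote_entropy`, `select_disputed`, and the toy labels are hypothetical. Each committee member stands in for one weak-label generator; the samples with the most divided votes are the ones that would go to a human annotator.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy (in nats) of the label votes cast by a committee for one sample."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_disputed(committee_votes, budget):
    """Return indices of the `budget` samples with the highest vote entropy.

    committee_votes[i] holds the weak labels that each committee member
    assigned to unlabeled sample i. Ties keep their original order.
    """
    scores = [vote_entropy(v) for v in committee_votes]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical weak labels from a three-member committee on four samples.
votes = [
    ["pos", "pos", "pos"],  # unanimous
    ["pos", "neg", "pos"],  # 2-1 split
    ["neg", "neg", "neg"],  # unanimous
    ["pos", "neg", "neu"],  # three-way split
]
disputed = select_disputed(votes, budget=2)
print(disputed)  # indices of the samples to hand to a human annotator
```

In a full loop, the newly annotated samples would be added to the accurately labeled set and the generators retrained before the next round; that part is omitted here.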
Funding: National Science Foundation of Zhejiang under Contract LY23E010001.
Abstract: Structural health monitoring is widely used in outdoor environments, often under harsh conditions that introduce noise into the monitoring system. An effective denoising strategy is therefore crucial for improving guided wave damage detection in noisy environments. This paper introduces a local temporal principal component analysis (PCA) reconstruction approach for denoising guided waves before unsupervised damage detection, which is carried out through a novel autoencoder-based reconstruction. Experimental results demonstrate that the proposed denoising method significantly enhances damage detection performance when guided waves are contaminated by noise, with SNR values ranging from 10 dB to -5 dB. With the proposed denoising approach, the AUC score rises from 0.65 to 0.96 for guided waves corrupted by noise at a level of -5 dB. The paper also provides guidance on selecting the number of components used in the denoising PCA reconstruction, which helps optimize damage detection in noisy conditions.
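As a rough illustration of denoising by PCA reconstruction over local temporal windows, the sketch below slides a window along a 1-D signal, keeps only the leading principal components of the window matrix, and averages the overlapping reconstructions. The abstract does not give the paper's windowing or reconstruction details, so everything here is an assumption: the function name `pca_denoise`, the parameters `window` and `n_components`, and the synthetic sine-plus-noise test signal are all hypothetical and are not taken from the paper.

```python
import numpy as np


def pca_denoise(signal, window, n_components):
    """Denoise a 1-D signal by PCA reconstruction over sliding temporal windows."""
    signal = np.asarray(signal, dtype=float)
    n_windows = signal.size - window + 1
    # Each row of `patches` is one local temporal window of the signal.
    idx = np.arange(window)[None, :] + np.arange(n_windows)[:, None]
    patches = signal[idx]
    mean = patches.mean(axis=0)
    # PCA via SVD of the centred window matrix.
    u, s, vt = np.linalg.svd(patches - mean, full_matrices=False)
    k = n_components
    recon = (u[:, :k] * s[:k]) @ vt[:k] + mean
    # Average the overlapping window reconstructions back into one signal.
    out = np.zeros_like(signal)
    weight = np.zeros_like(signal)
    np.add.at(out, idx, recon)
    np.add.at(weight, idx, 1.0)
    return out / weight


# Synthetic stand-in for a measured waveform (assumption, not paper data).
rng = np.random.default_rng(0)
t = np.arange(500)
clean = np.sin(2 * np.pi * t / 50)
noisy = clean + rng.normal(scale=0.5, size=t.size)
denoised = pca_denoise(noisy, window=40, n_components=2)

print("noisy MSE:   ", np.mean((noisy - clean) ** 2))
print("denoised MSE:", np.mean((denoised - clean) ** 2))
```

The number of retained components controls the trade-off: too few discards signal content, too many lets noise back in, which is the kind of choice the abstract says the paper gives guidance on.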