Recent deep neural network (DNN) based blind image quality assessment (BIQA) approaches take the mean opinion score (MOS) as the ground-truth label, which can lead to cross-dataset bias and limit the generalization ability of DNN-based BIQA models. This work validates the natural instability of MOS by investigating the neuropsychological characteristics of the human visual system during quality perception. By combining persistent homology analysis with electroencephalogram (EEG) recordings, physiologically meaningful features of the brain's responses to different distortion levels are extracted. These features indicate that even when volunteers view exactly the same image content, their EEG features vary considerably. Based on these physiological results, we advocate treating MOS as noisy labels and optimizing DNN-based BIQA models with early-stopping strategies. Experimental results in both intra-dataset and cross-dataset settings demonstrate the superiority of our optimization approach in terms of generalization ability.
Funding: Supported by the Medium and Long-term Science and Technology Plan for Radio, Television, and Online Audiovisuals (No. 2023AC0200) and the Public Welfare Technology Application Research Project of Zhejiang Province, China (No. LGF21F010001).