Funding: Supported by the cooperation between BIT and Ericsson; partially supported by the National Natural Science Foundation of China under Grant No. 62071039.
Abstract: This paper proposes a personalized head-related transfer function (HRTF) prediction method based on LightGBM using anthropometric data. To address the overfitting problems of current training-based prediction methods, we use LightGBM and a specific network structure to prevent overfitting and enhance prediction performance. By decomposing and combining the data to be predicted, we set up 90 LightGBM models to separately predict the 90 instants of the HRTF in the log domain. 10-fold cross-validation is used to score the accuracy of each model. For models scoring below 80 points, Bayesian optimization is used to tune the hyperparameters and obtain a better model structure. The results obtained by LightGBM are evaluated with spectral distortion (SD), which measures the fitting error between the prediction and the original data. The mean SD values of the two ears over the whole test set are 2.32 dB and 2.28 dB, respectively. Compared with the non-linear regression method and the latest method, the SD of the LightGBM-based method decreases, in relative terms, by 83.8% and 48.5%, respectively.
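The abstract names spectral distortion as its metric but does not give the formula. A minimal sketch, assuming the commonly used definition (the root-mean-square of the log-magnitude ratio in dB across frequency bins), might look like the following. The function name `spectral_distortion` and the toy magnitude values are hypothetical and are not taken from the paper.

```python
import math


def spectral_distortion(h_true, h_pred):
    """RMS log-magnitude error in dB between two magnitude spectra.

    Assumed definition (not confirmed by the abstract):
        SD = sqrt( mean_k ( 20 * log10(|H(k)| / |H_hat(k)|) )^2 )
    """
    if len(h_true) != len(h_pred) or not h_true:
        raise ValueError("spectra must be non-empty and of equal length")
    squared_db_errors = [
        (20.0 * math.log10(abs(a) / abs(b))) ** 2
        for a, b in zip(h_true, h_pred)
    ]
    return math.sqrt(sum(squared_db_errors) / len(squared_db_errors))


# Toy magnitudes, made up for illustration only.
measured = [1.0, 0.5, 0.25]
predicted = [1.0, 0.5, 0.25]
print(spectral_distortion(measured, predicted))  # identical spectra give 0.0
```

Under this definition a lower SD means the predicted magnitude response lies closer to the measured one, which is how the 2.32 dB and 2.28 dB figures would be read.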
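Abstract: This paper designs and implements a deep neural network model that reconstructs personalized head-related transfer functions (HRTFs) from anthropometric parameters and angle information; a single training run yields predicted HRTFs for all directions. The model consists of a deep neural network that takes the anthropometric parameters as input, an expansion layer that takes the angle information as input, and a deep neural network that takes the outputs of the first two as input. Finally, the overall performance of the proposed method is evaluated objectively.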
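The three-part structure described here can be sketched as an untrained forward pass. Everything concrete below is an assumption made for illustration: the layer widths, the 5 anthropometric inputs, the 16 output bins, the random weights, and in particular the sine/cosine encoding used for the expansion layer, whose actual operation the abstract does not specify.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def expand_angle(azimuth_deg, elevation_deg):
    """Hypothetical stand-in for the expansion layer: sin/cos features."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return [math.sin(az), math.cos(az), math.sin(el), math.cos(el)]


N_ANTHRO, N_HIDDEN, N_FREQ = 5, 8, 16  # assumed sizes
rng = random.Random(0)
params = {
    "anthro": init_layer(N_HIDDEN, N_ANTHRO, rng),   # anthropometric branch
    "angle": init_layer(N_HIDDEN, 4, rng),           # angle branch
    "merge": init_layer(N_FREQ, 2 * N_HIDDEN, rng),  # network on both outputs
}


def forward(anthro, azimuth_deg, elevation_deg):
    a = relu(dense(anthro, *params["anthro"]))
    g = relu(dense(expand_angle(azimuth_deg, elevation_deg), *params["angle"]))
    # Concatenate the two branch outputs and map them to one HRTF estimate.
    return dense(a + g, *params["merge"])


hrtf_estimate = forward([0.1, 0.2, 0.3, 0.4, 0.5], 30.0, 0.0)
print(len(hrtf_estimate))
```

Because the direction is an input rather than a fixed property of the model, a single trained network of this shape can be queried for every direction, which matches the abstract's claim that one training run covers all directions.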
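Abstract: To study how different anatomical features of an acoustic dummy head affect the head-related transfer function (HRTF) and the perceived quality of recordings, HRTFs were measured on dummy heads with different anatomical features and compared, yielding the influence of each feature on the HRTF. A further subjective evaluation experiment confirmed that different features affect sound source localization to different degrees. The pinna has a large influence on localization and is indispensable, whereas the presence or absence of finer features such as the nose and hair has a much weaker influence. The perceptual effect of these features also depends strongly on the type of sound source and the direction of incidence.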
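Abstract: The rise of virtual reality (VR) has further broadened the application of 3D audio technology. 3D audio playback in VR is generally binaural. The 3D audio techniques currently most used in VR are Ambisonics, which is based on physical sound-field reconstruction and spherical-harmonic decomposition; techniques based on natural binaural recording; and techniques based on head-related transfer function (HRTF) reconstruction. In addition, scenes that account for environmental reverberation require binaural room impulse response (BRIR) techniques. This paper introduces the existing 3D audio techniques in VR and the main applications on the market, covers the mainstream techniques across the whole VR audio chain from capture through coding and transmission to rendering and playback, and closes with an outlook on the development of VR 3D audio.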
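HRTF-based binaural playback can be illustrated by convolving a mono signal with a left and a right head-related impulse response (HRIR), the time-domain counterpart of the HRTF. The sketch below is a minimal illustration only: the function names are hypothetical, and the HRIR values are made-up toy numbers, not measured data.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono source with the HRIR pair of a single direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy data: a unit impulse as the source and invented HRIRs in which the
# right ear receives the sound one sample later and at a lower level.
mono = [1.0, 0.0, 0.0]
hrir_left = [0.9, 0.1]
hrir_right = [0.0, 0.6, 0.2]

left, right = render_binaural(mono, hrir_left, hrir_right)
print(left)
print(right)
```

Replacing the HRIR pair with a BRIR pair in the same convolution would add the room's reverberation to the rendered signal, which is the role the abstract assigns to BRIR techniques.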