Funding: Supported by the Ministry of Trade, Industry & Energy (MOTIE, Korea) under the Industrial Technology Innovation Program (No. 10063424, 'Development of distant speech recognition and multi-task dialog processing technologies for in-door conversational robots').
Abstract: Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) have driven substantial improvements over acoustic models based on the Gaussian Mixture Model (GMM). However, LSTM acoustic models built with the hybrid method require a forced-aligned Hidden Markov Model (HMM) state sequence obtained from a GMM-based acoustic model. Training therefore takes a long time, because both the GMM-based acoustic model and the deep learning-based acoustic model must be trained. To solve this problem, acoustic models using the Connectionist Temporal Classification (CTC) algorithm have been proposed. The CTC algorithm does not need the GMM-based acoustic model because it does not use the forced-aligned HMM state sequence. However, previous work on LSTM RNN-based acoustic models using CTC relied on small-scale training corpora. In this paper, an LSTM RNN-based acoustic model using CTC is trained on a large-scale training corpus and its performance is evaluated. The implemented acoustic model achieves a Word Error Rate (WER) of 6.18% on clean speech and 15.01% on noisy speech, which is similar to the performance of an acoustic model based on the hybrid method.
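To illustrate why CTC can do without a forced alignment, the sketch below shows the standard CTC collapsing rule: a frame-level label path is reduced to an output sequence by merging consecutive repeats and then dropping blanks, so many different frame-level paths map to the same transcript. This is a minimal illustration and not the paper's implementation; the blank symbol `_`, the function name `ctc_collapse`, and the character-level labels are assumptions made for the example.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol chosen for this example


def ctc_collapse(path, blank=BLANK):
    """Map a frame-level CTC path to its label sequence.

    Step 1: merge runs of identical consecutive labels.
    Step 2: remove blank symbols.
    """
    merged = [label for label, _ in groupby(path)]
    return [label for label in merged if label != blank]


# Two different frame-level paths that collapse to the same labels.
path_a = list("__hh_e_ll_ll_oo_")
path_b = list("h_eee_l_l___o___")

print("".join(ctc_collapse(path_a)))
print("".join(ctc_collapse(path_b)))
```

Because the training loss sums over all paths that collapse to the reference transcript, no single frame-to-state alignment has to be supplied in advance. Note that a blank between repeated labels is what allows a genuine double letter to survive the merge step.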
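Abstract: Rolling bearings are key components of mechanical equipment, and predicting their remaining useful life (RUL) is becoming increasingly important in industrial production. Although mainstream convolutional neural networks (CNNs) can automatically extract features from bearing vibration signals, they cannot assign different weights to features to increase the model's attention to the important ones, and they tend to lose important information over long time series. In addition, hyperparameters such as the number of hidden-layer neurons, the learning rate, and the regularization parameter still have to be set manually from experience. To address these problems, a bearing RUL prediction method is proposed in which the grey wolf optimizer (GWO) algorithm optimizes a model that combines a CNN, a bidirectional long short-term memory (BiLSTM) network, and an attention mechanism. First, time-domain, frequency-domain, and time-frequency-domain feature indicators are extracted from the raw vibration signals to build a candidate feature set. Then, a comprehensive evaluation index that accounts for feature correlation, robustness, and monotonicity is constructed, and the features scoring above a set threshold are selected as the bearing degradation-sensitive feature set, which serves as the input to the prediction model. Finally, the mean squared error between the predicted and true values is used as the fitness function of the GWO algorithm, which optimizes the prediction model to obtain the optimal number of hidden-layer neurons, learning rate, and regularization parameter; the optimized model is then used for RUL prediction and validated on a public dataset. The results show that the proposed method can obtain the optimal hyperparameter combination without empirical guidance, and that, compared with the unoptimized model, the optimized prediction model reduces the mean absolute error and the root mean squared error by 28.8% and 24.3%, respectively.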
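The hyperparameter search described above can be sketched with a plain grey wolf optimizer loop. This is a minimal sketch under stated assumptions, not the authors' code: the fitness function `toy_validation_mse` is a hypothetical stand-in for the mean squared error of a trained CNN-BiLSTM-Attention model, and the search bounds, population size, and iteration count are illustrative values that do not come from the abstract.

```python
import random


def gwo_minimize(fitness, bounds, n_wolves=12, n_iters=60, seed=0):
    """Minimize `fitness` over box `bounds` with a basic grey wolf optimizer."""
    rng = random.Random(seed)
    dim = len(bounds)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_fit = None, float("inf")

    for t in range(n_iters):
        # The three fittest wolves (alpha, beta, delta) lead the pack.
        ranked = sorted(wolves, key=fitness)
        leaders = [list(w) for w in ranked[:3]]
        alpha_fit = fitness(leaders[0])
        if alpha_fit < best_fit:
            best_pos, best_fit = leaders[0], alpha_fit

        a = 2.0 * (1 - t / n_iters)  # decreases linearly from 2 towards 0
        for wolf in wolves:
            for d in range(dim):
                candidates = []
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[d] - wolf[d])
                    candidates.append(leader[d] - A * D)
                lo, hi = bounds[d]
                # New position: mean of the three leader-guided moves, clamped.
                wolf[d] = min(max(sum(candidates) / 3, lo), hi)

    # Account for the positions produced by the last update.
    final = min(wolves, key=fitness)
    if fitness(final) < best_fit:
        best_pos, best_fit = list(final), fitness(final)
    return best_pos, best_fit


def toy_validation_mse(params):
    """Hypothetical surrogate for the model's validation MSE (assumption)."""
    hidden_units, log10_lr, log10_l2 = params
    return ((hidden_units - 64) / 64) ** 2 + (log10_lr + 3) ** 2 + (log10_l2 + 4) ** 2


# Illustrative bounds: hidden units, log10(learning rate), log10(L2 weight).
bounds = [(8, 256), (-5, -1), (-6, -2)]
best, score = gwo_minimize(toy_validation_mse, bounds)
print(best, score)
```

In a real setting, each fitness evaluation would train and validate the prediction model with the candidate hyperparameters, so the number of wolves and iterations directly determines the training budget.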
Abstract: In dense pedestrian tracking, frequent occlusions and small distances between objects make it difficult to estimate object trajectories accurately. In this study, a conditional random field tracking model is established using a visual long short-term memory network in three-dimensional (3D) space, with motion estimation performed jointly on object trajectory segments. Object visual field information is added to the long short-term memory network to improve the accuracy of motion-related object pair selection and motion estimation. To address the uncertainty in the length and interval of trajectory segments, a multimode long short-term memory network is proposed for object motion estimation. Tracking performance is evaluated on the PETS2009 dataset. The experimental results show that the proposed method outperforms tracking methods based on independent motion estimation.