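A Comparative Study of Phonemes in Speech-ABR under Quiet and Noisy Conditions (cited by: 7)
1
Authors: 王倩, 王燕, 刘志成. 《中华耳科学杂志》, CSCD, PKU Core, 2016, No. 5, pp. 634-638 (5 pages)
Objective: To compare changes in the initial consonant, final (vowel), and tone of a monosyllable in speech-ABR under quiet and noisy conditions, and to study the effect of noise on monosyllabic phonemes. Methods: Forty normal-hearing subjects (20 male, 20 female) whose native language was Mandarin Chinese were recruited. The speech-ABR stimulus was a 260 ms synthesized speech syllable /mi/ in the third tone, presented at 70 dB SPL. Speech-ABR waveforms were recorded from the right ear in quiet and in noise (signal-to-noise ratio, SNR = -10 dB). Latency changes were compared for the onset response (OR), the consonant-to-vowel transition response, and the frequency following response (FFR). The pitch-tracking correlation coefficient r was also compared between quiet and noise. Statistical analysis used SPSS 18.0; differences between the two conditions were tested with paired t-tests, with P<0.05 considered statistically significant. Results: The speech-evoked auditory brainstem response to the 260 ms /mi/ consisted mainly of an onset response with latency within 10 ms, a consonant-to-vowel transition response with latency of 10-80 ms, a frequency following response with latency of 80-220 ms, and a final offset response. The onset response was evoked by the consonant; the transition response was evoked by the consonant-to-vowel transition information; the FFR, evoked by the vowel of /mi/, comprised 15 waves. Paired t-tests comparing quiet and noise showed that the mean latency of the onset response peak (consonant portion) was prolonged by 0.85±0.17 ms (P=0.000), that of the transition response peak by 0.75±0.15 ms (P=0.000), and that of the FFR peaks by 0.38±0.10 ms (P=0.000); all differences were statistically significant. The mean pitch-tracking correlation coefficient r was 0.84±0.08 in quiet and 0.74±0.12 in noise, a statistically significant difference (P=0.000). Conclusion: In noise, the latencies of the waveforms corresponding to the consonant and vowel of the test syllable changed and the pitch-tracking coefficient decreased, suggesting that all three phoneme types are affected by noise. Compared with conventional subjective speech recognition tests and evoked potential tests, speech-ABR is an objective method for assessing how speech sounds are disrupted by noise.
Keywords: speech-ABR; speech noise; monosyllable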
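The abstract above relies on paired t-tests to compare latencies in quiet and in noise. As a minimal sketch of that statistic only, the snippet below computes t from per-subject differences. The function name `paired_t_statistic` and the latency values are hypothetical illustrations, not data or code from the study, which used SPSS 18.0.

```python
import math
import statistics


def paired_t_statistic(condition_a, condition_b):
    """Return (t, degrees_of_freedom) for a paired t-test.

    Each position holds one subject's measurement under the two conditions.
    Raises ZeroDivisionError if all paired differences are identical.
    """
    if len(condition_a) != len(condition_b) or len(condition_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(condition_a, condition_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical latencies in ms for four subjects (illustration only).
quiet = [1.0, 2.0, 3.0, 4.0]
noise = [2.0, 4.0, 5.0, 8.0]
t_value, dof = paired_t_statistic(quiet, noise)
```

The p-value would then be read from a t distribution with `dof` degrees of freedom, which this sketch leaves to a statistics package.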
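Application of the Speech Analysis Software Speech Analyzer and Praat in Studying the Merger of Nasalized Finals in the Shanghai Urban Dialect (cited by: 6)
2
Author: 顾钦. 《计算机应用与软件》, CSCD, PKU Core, 2006, No. 12, pp. 81-82, 108 (3 pages)
In current research on dialect sound change in China, phonemes are mostly determined by traditional methods, and phonemic analysis relies mainly on the researcher's personal experience. If the transcriber's own transcription skills are limited, the recorded results may deviate from the actual pronunciation of the dialect. Supplementing traditional transcription with speech analysis software would therefore make dialect phonetics research more precise and objective. Taking Speech Analyzer and Praat as examples, the paper examines the merger of nasalized finals in the Shanghai urban dialect, in which the contrast between front a and back ɑ in nasalized finals has disappeared completely and both are pronounced as central A. Speech Analyzer is used to locate the nasalized vowels and Praat to perform formant analysis, providing evidence for this sound change.
Keywords: speech analysis software; Speech Analyzer; Praat; Shanghai urban dialect; merger of nasalized finals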
A robust feature extraction approach based on an auditory model for classification of speech and expressiveness (cited by: 5)
3
Authors: 孙颖, V. Werner, 张雪英. 《Journal of Central South University》, SCIE, EI, CAS, 2012, No. 2, pp. 504-510 (7 pages)
Based on an auditory model, the zero-crossings with maximal Teager energy operator (ZCMT) feature extraction approach was described and then applied to speech and emotion recognition. Three kinds of experiments were carried out. The first kind consists of isolated word recognition experiments in neutral (non-emotional) speech. The results show that the ZCMT approach improves recognition accuracy by 3.47% on average compared with the Teager energy operator (TEO). Thus, the ZCMT feature can be considered a noise-robust feature for speech recognition. The second kind consists of mono-lingual emotion recognition experiments using the Taiyuan University of Technology (TYUT) and Berlin databases. As the average recognition rate of the ZCMT approach is 82.19%, the results indicate that the ZCMT features can characterize speech emotions effectively. The third kind consists of cross-lingual experiments with three languages. As the accuracy of the ZCMT approach is reduced by only 1.45%, the results indicate that the ZCMT features can characterize emotions in a language-independent way.
Keywords: speech recognition; emotion recognition; zero-crossings; Teager energy operator; speech database
Improved hidden Markov model for speech recognition and POS tagging (cited by: 4)
4
Author: 袁里驰. 《Journal of Central South University》, SCIE, EI, CAS, 2012, No. 2, pp. 511-516 (6 pages)
To overcome the defects of the classical hidden Markov model (HMM), a new statistical model, the Markov family model (MFM), was proposed. The MFM was applied to speech recognition and natural language processing. Speaker-independent continuous speech recognition experiments and part-of-speech tagging experiments show that the MFM outperforms the HMM. Precision rises from 94.642% to 96.214% in the part-of-speech tagging experiments, and the work rate is reduced by 11.9% in the speech recognition experiments relative to the HMM baseline system.
Keywords: hidden Markov model; Markov family model; speech recognition; part-of-speech tagging
Speech enhancement through voice activity detection using speech absence probability based on Teager energy (cited by: 2)
5
Authors: PARK Yun-sik, LEE Sang-min. 《Journal of Central South University》, SCIE, EI, CAS, 2013, No. 2, pp. 424-432 (9 pages)
In this work, a novel voice activity detection (VAD) algorithm that uses speech absence probability (SAP) based on Teager energy (TE) was proposed for speech enhancement. The proposed method employs local SAP (LSAP) based on the TE of noisy speech as a feature parameter for VAD in each frequency subband, rather than conventional LSAP. Results show that the TE operator can enhance the ability to discriminate speech and noise and further suppress noise components. Therefore, TE-based LSAP provides a better representation of LSAP, resulting in improved VAD for estimating noise power in a speech enhancement algorithm. In addition, the presented method utilizes TE-based global SAP (GSAP) derived in each frame as the weighting parameter for modifying the adopted TE operator and improving its performance. The proposed algorithm was evaluated by objective and subjective quality tests under various environments and was shown to produce better results than the conventional method.
Keywords: speech enhancement; Teager energy; speech absence probability; voice activity detection
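The Teager energy operator named in the abstract above has a standard discrete form, ψ[n] = x[n]² − x[n−1]·x[n+1]. The sketch below shows only that textbook operator applied to a time-domain sequence; it does not reproduce the paper's subband LSAP or GSAP computation, and the function name `teager_energy` and the test tone are hypothetical.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, one per interior sample
    (an empty list if x has fewer than 3 samples).
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test tone: amplitude 2.0, digital frequency 0.3 rad/sample.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(64)]
energy = teager_energy(tone)
```

For a pure sinusoid the operator returns the constant A²·sin²(Ω), so it tracks both amplitude and frequency. This sensitivity to frequency as well as amplitude is one reason it is often used to separate speech from noise.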
Improved speech absence probability estimation based on environmental noise classification (cited by: 2)
6
Authors: SON Young-ho, LEE Sang-min. 《Journal of Central South University》, SCIE, EI, CAS, 2012, No. 9, pp. 2548-2553 (6 pages)
An improved speech absence probability estimation was proposed using environmental noise classification for speech enhancement. A relevant noise estimation approach, known as the speech presence uncertainty tracking method, requires seeking the "a priori" probability of speech absence, which is derived by applying the microphone input signal and the noise signal based on the estimated value of the "a posteriori" signal-to-noise ratio (SNR). To overcome this problem, first, the optimal values in terms of the perceived speech quality of a variety of noise types are derived. Second, the estimated optimal values are assigned according to the determined noise type, which is classified by a real-time noise classification algorithm based on the Gaussian mixture model (GMM). The proposed algorithm estimates the speech absence probability using a GMM-based noise classification algorithm to apply the optimal parameter of each noise type, unlike the conventional approach, which uses a fixed threshold and smoothing parameter. The performance of the proposed method was evaluated by objective tests, such as the perceptual evaluation of speech quality (PESQ) and a composite measure. Performance was then evaluated by a subjective test, namely mean opinion scores (MOS), under various noise environments. The proposed method shows better results than existing methods.
Keywords: speech enhancement; soft decision; speech absence probability; Gaussian mixture model (GMM)
Child-directed Speech and Foreigner Talk
7
Author: 马小梅. 《陕西师范大学学报(哲学社会科学版)》, CSSCI, PKU Core, 2004, No. S2, pp. 433-437 (5 pages)
The article reviews child-directed speech and foreigner talk, both separately and in comparison. It compares the features and functions of the two registers and some of their similarities and differences. Both should be thought of as dynamic, changing with situational factors, rather than as static, fixed sets of features.
Keywords: child-directed speech; foreigner talk; children's education
Differences Between Male andFemale Sex in Speech Behaviour
8
作者 赵清丽 《陕西师范大学学报(哲学社会科学版)》 CSSCI 北大核心 2002年第S3期193-196,共4页
This paper describes male-female differences in speech behavior from the following aspects: their different attitudes towards public speaking and private speaking; their different attitudes towards public details and ... This paper describes male-female differences in speech behavior from the following aspects: their different attitudes towards public speaking and private speaking; their different attitudes towards public details and private details; their different purposes towards troubles; their different attitudes towards asking information. Then this paper presents explanations for male-female differences in speech behavior from social point of view and anthropological point of view. 展开更多
关键词 male-female DIFFERENCE speech behavior
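A Children's Read-Aloud Speech Evaluation Model Based on Deep Speech and Multi-Layer LSTM (cited by: 2)
9
Authors: 郑纯军, 贾宁. 《计算机科学》, CSCD, PKU Core, 2019, No. S11, pp. 108-111, 148 (5 pages)
Most people today overlook the importance of reading aloud, yet for children aged 5 to 12 it is both an essential learning skill and an effective means of cultivating character. The features of read-aloud speech signals are related nonlinearly to the evaluation criteria, and although recurrent neural networks suit time-series prediction, their performance over long time spans is limited. Based on the characteristics of children's read-aloud speech and its evaluation system, a model combining Deep Speech with a three-layer long short-term memory (LSTM) network is therefore designed. First, with an attention mechanism added, accuracy and fluency measures for read-aloud speech evaluation are proposed, with spectrograms as the input for feature extraction: accuracy is evaluated with an improved Deep Speech to raise phoneme recognition accuracy, and fluency is evaluated by feeding the spectrograms to the three-layer LSTM model to capture time-series effects. Next, the results are passed to the attention mechanism for weight adjustment. Finally, the computed overall evaluation result is used to score children's read-aloud speech. Experiments use the children's read-aloud corpus provided by the "出口成章" software and the TensorFlow platform. The results show that, compared with traditional models, this model can accurately judge both the correctness and the fluency of reading aloud, and the scores it produces are relatively accurate.
Keywords: spectrogram; long short-term memory network; attention mechanism; DeepSpeech; read-aloud speech evaluation model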
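Design and Implementation of a Voice-Controlled Application Based on the Speech SDK (cited by: 40)
10
Authors: 李禹材, 左友东, 郑秀清, 王玲. 《计算机应用》, CSCD, PKU Core, 2004, No. 6, pp. 114-116 (3 pages)
The structure and working principle of the Speech Application Programming Interface (SAPI) in Microsoft Speech SDK 5.1 are analyzed, and a design method for voice-controlled applications is proposed. Taking the design of the "speech recognition interface of the Z+Z intelligent teaching platform" as an example, the paper presents the main framework and key techniques of such systems.
Keywords: speech recognition; COM; SAPI; voice control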
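An Implementation Method for Chinese Voice-Interactive GIS Based on ArcGIS and the Speech SDK (cited by: 5)
11
Authors: 吴建华, 余梦娟, 刘强, 舒志刚. 《地理与地理信息科学》, CSCD, PKU Core, 2016, No. 5, pp. 76-80 (5 pages)
Combining natural language with GIS to realize voice GIS helps make geographic information services more intelligent, and speech recognition and spatial information extraction are the key technologies for doing so. This paper explores a method for implementing Chinese voice GIS using the Microsoft Speech SDK and grammar rule matching, and designs and implements a prototype voice-interactive GIS. First, the workflow for implementing voice-interactive GIS is designed. Second, the use of the Speech SDK interface, the voice trigger mechanism, feature word extraction and tagging, and grammar rule matching are studied. Finally, the prototype system is implemented on ArcGIS Engine 10.2 and Speech SDK 5.1; its main functions include voice-based map browsing, query by object name, road intersection query, buffer query, and shortest path query. Practical use shows that, compared with performing the same queries in existing ArcGIS software, the method requires fewer steps, is widely applicable, and is more intelligent.
Keywords: natural language; voice GIS; spatial relations; grammar rule matching; semantic parsing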
Adaptive bands filter bank optimized by genetic algorithm for robust speech recognition system (cited by: 5)
12
Authors: 黄丽霞, G. Evangelista, 张雪英. 《Journal of Central South University》, SCIE, EI, CAS, 2011, No. 5, pp. 1595-1601 (7 pages)
Perceptual auditory filter banks such as the Bark-scale filter bank are widely used as front-end processing in speech recognition systems. However, the problem of designing optimized filter banks that provide higher accuracy in recognition tasks is still open. Owing to spectral analysis in feature extraction, an adaptive bands filter bank (ABFB) is presented. The design adopts flexible bandwidths and center frequencies for the frequency responses of the filters and utilizes a genetic algorithm (GA) to optimize the design parameters. The optimization process is realized by combining the front-end filter bank with the back-end recognition network in the performance evaluation loop. The deployment of ABFB together with the zero-crossing peak amplitude (ZCPA) feature as a front-end process for a radial basis function (RBF) system shows significant improvement in robustness compared with the Bark-scale filter bank. In ABFB, several sub-bands are still more concentrated toward lower frequencies, but their exact locations are determined by performance rather than perceptual criteria. For ease of optimization, only symmetrical bands are considered here, which still provide satisfactory results.
Keywords: perceptual filter banks; Bark scale; speaker-independent speech recognition systems; zero-crossing peak amplitude; genetic algorithm
A speech enhancement algorithm to reduce noise and compensate for partial masking effect (cited by: 4)
13
Authors: JEON Yu-yong, LEE Sang-min. 《Journal of Central South University》, SCIE, EI, CAS, 2011, No. 4, pp. 1121-1127 (7 pages)
To enhance speech quality degraded by environmental noise, an algorithm was proposed to reduce the noise and reinforce the speech. The minima controlled recursive averaging (MCRA) algorithm was used to estimate the noise spectrum, and the partial masking effect, one of the psychoacoustic properties, was introduced to reinforce speech. Performance was evaluated by comparing the PESQ (perceptual evaluation of speech quality) and segSNR (segmental signal-to-noise ratio) of the proposed algorithm with those of the conventional algorithm. The average PESQ of the proposed algorithm was higher than that of the conventional noise reduction algorithm, and segSNR was higher by as much as 3.2 dB on average.
Keywords: speech enhancement; noise reduction; psychoacoustic property; human hearing property
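Segmental SNR, one of the two measures reported in the abstract above, is commonly computed as the mean of per-frame SNRs in dB between the clean and the enhanced signal. The sketch below follows that common definition. The frame length of 256 samples and the clamping range of -10 to 35 dB are assumptions chosen for illustration (the abstract does not state the paper's settings), and the function name `segmental_snr` is hypothetical.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean per-frame SNR in dB between a clean and an enhanced signal.

    Only whole, non-overlapping frames are used. Each frame value is
    clamped to [lo_db, hi_db] so that silent or perfectly reconstructed
    frames do not dominate the average (assumed limits, see lead-in).
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")
    frame_values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(v * v for v in s)
        error_energy = sum((a - b) ** 2 for a, b in zip(s, e))
        if error_energy == 0.0:
            snr_db = hi_db  # no error in this frame
        elif signal_energy == 0.0:
            snr_db = lo_db  # silent reference frame with residual error
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        frame_values.append(min(max(snr_db, lo_db), hi_db))
    if not frame_values:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_values) / len(frame_values)
```

Averaging in the dB domain weights quiet frames as heavily as loud ones, which is why segSNR can differ noticeably from a single global SNR on the same pair of signals.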
A continuous differentiable wavelet threshold function for speech enhancement (cited by: 3)
14
Authors: 贾海蓉, 张雪英, 白静. 《Journal of Central South University》, SCIE, EI, CAS, 2013, No. 8, pp. 2219-2225 (7 pages)
Speech enhanced with the traditional wavelet threshold function had auditory oscillation distortion and a low signal-to-noise ratio (SNR). To solve these problems, a new continuous differentiable threshold function for speech enhancement was presented. Firstly, the function adopted narrow threshold areas, preserved smaller speech signals, and improved the speech quality. Secondly, based on the properties of continuous differentiability and non-fixed deviation, each area function was obtained step by step through mathematical derivation; this ensured that the enhanced speech was continuous and smooth and removed the auditory oscillation distortion. Finally, combined with the Bark wavelet packets, the function further improved human auditory perception. Experimental results show that the segmental SNR and PESQ (perceptual evaluation of speech quality) of speech enhanced with this method increase effectively compared with existing speech enhancement algorithms based on wavelet thresholds.
Keywords: continuous differentiable wavelet threshold function; speech enhancement; Bark wavelet packet; non-fixed deviation; noise
Mapping methods for output-based objective speech quality assessment using data mining (cited by: 3)
15
Authors: 王晶, 赵胜辉, 谢湘, 匡镜明. 《Journal of Central South University》, SCIE, EI, CAS, 2014, No. 5, pp. 1919-1926 (8 pages)
Objective speech quality is difficult to measure without the input reference speech. Mapping methods using data mining are investigated and designed to improve the output-based speech quality assessment algorithm. The degraded speech is first separated into three classes (unvoiced, voiced and silence), and then the consistency measurement between the degraded speech signal and the pre-trained reference model for each class is calculated and mapped to an objective speech quality score using data mining. A fuzzy Gaussian mixture model (GMM) is used to generate the artificial reference model trained on perceptual linear predictive (PLP) features. The mean opinion score (MOS) mapping methods, including multivariate non-linear regression (MNLR), fuzzy neural network (FNN) and support vector regression (SVR), are designed and compared with the standard ITU-T P.563 method. Experimental results show that the assessment methods with data mining perform better than ITU-T P.563. Moreover, FNN and SVR are more efficient than MNLR, and FNN performs best, with a 14.50% increase in the correlation coefficient and a 32.76% decrease in the root-mean-square MOS error.
Keywords: objective speech quality; data mining; multivariate non-linear regression; fuzzy neural network; support vector regression
Variable step-size affine projection algorithm based on global speech absence probability for adaptive feedback cancellation (cited by: 3)
16
Authors: KIM Young-Sear, SONG Ji-hyun, KIM Sang-Kyun, LEE Sangmin. 《Journal of Central South University》, SCIE, EI, CAS, 2014, No. 2, pp. 646-650 (5 pages)
A novel approach is proposed for improving adaptive feedback cancellation using a variable step-size affine projection algorithm (VSS-APA) based on global speech absence probability (GSAP). The variable step-size of the proposed VSS-APA is adjusted according to the GSAP of the current frame. The weight vector of the adaptive filter is updated by the probability of speech absence. The performance of acoustic feedback cancellation is evaluated using normalized misalignment. Experimental results demonstrate that the proposed approach performs better than the normalized least mean square (NLMS) and the constant step-size affine projection algorithms.
Keywords: adaptive feedback cancellation; affine projection; global speech absence probability (GSAP)
Integrated search technique for parameter determination of SVM for speech recognition (cited by: 2)
17
Authors: Teena Mittal, R. K. Sharma. 《Journal of Central South University》, SCIE, EI, CAS, CSCD, 2016, No. 6, pp. 1390-1398 (9 pages)
Support vector machine (SVM) has good application prospects for speech recognition problems; still, optimum parameter selection is a vital issue for it. To improve the learning ability of SVM, a method for searching the optimal parameters based on the integration of predator prey optimization (PPO) and the Hooke-Jeeves method has been proposed. In the PPO technique, the population consists of prey and predator particles. The prey particles search for the optimum solution, and the predator always attacks the global best prey particle. The solution obtained by PPO is further improved by applying the Hooke-Jeeves method. The proposed method is applied to recognize isolated words in a Hindi speech database and also to recognize words in the benchmark database TI-20 in clean and noisy environments. A recognition rate of 81.5% for the Hindi database and 92.2% for the TI-20 database has been achieved using the proposed technique.
Keywords: support vector machine (SVM); predator prey optimization; speech recognition; Mel-frequency cepstral coefficients; wavelet packets; Hooke-Jeeves method
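The Lack of and Compensation for Love in the Film The King's Speech
18
Author: 庞红霞. 《电影文学》, PKU Core, 2012, No. 9, pp. 110-111 (2 pages)
The film The King's Speech presents not only well-rounded, fully drawn characters, down-to-earth figures close to ordinary people, and an uplifting spiritual experience, but also explores a cross-cultural idea shared by all humanity: love. Lacking his father's love and family affection, George grows timid and submissive as king; it is also love that finally lets him overcome his inner fear and deliver a rousing speech at a moment of danger.
Keywords: The King's Speech; lack of and compensation for love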
Probing the Linguistic and Rhetorical Features of English Speeches (cited by: 1)
19
Author: 李庆明. 《西安理工大学学报》, CAS, 2004, No. 3, pp. 327-331 (5 pages)
An English speech is a discourse delivered at an assembly or on a formal occasion. As a variety of the English language, the English speech has a unique presentation of its own. This paper analyzes and probes the linguistic and rhetorical features of famous English speeches with a view to improving the ability of Chinese learners of English to appreciate English speeches.
Keywords: English speeches; linguistics; rhetorical features; sentence patterns
Single-channel speech enhancement method based on masking properties and minimum statistics
20
Authors: Jiang Xiaoping, Yao Tianren, Fu Hua. 《Journal of Systems Engineering and Electronics》, SCIE, EI, CSCD, 2004, No. 2, pp. 217-224 (8 pages)
A single-channel method for enhancing noisy speech signals at very low signal-to-noise ratios is presented, which is based on the masking properties of the human auditory system and power spectral density estimation of nonstationary noise. It allows for automatic adaptation in time and frequency of the parametric enhancement system, and finds the best tradeoff among the amount of noise reduction, the speech distortion, and the level of musical residual noise based on a criterion correlated with perception and SNR. This leads to a significant reduction of the unnatural structure of the residual noise. The results with several noise types show that the enhanced speech is more pleasant to a human listener.
Keywords: auditory property; masking; varying SNR estimation; speech enhancement; minimum statistics