Abstract: The use of hidden conditional random fields (HCRFs) for tone modeling is explored. HCRFs improve tone recognition performance by exploiting intra-syllable dynamic, inter-syllable dynamic, and duration features. For integrating the tone model into continuous speech recognition, discriminative model weight training (DMWT) is proposed: acoustic and tone scores are scaled by model weights that are discriminatively trained under the minimum phone error (MPE) criterion. Two weight training schemes are evaluated, and a smoothing technique is used to make training robust to overtraining. Experiments show that the HCRF-based tone model improves the accuracy of both tone recognition and large vocabulary continuous speech recognition (LVCSR). Compared with the global weight scheme, the discriminatively trained weight combinations further improve continuous speech recognition.
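As a rough illustration of the score scaling described in this abstract, the sketch below combines an acoustic log-score and a tone log-score with two model weights and picks the best-scoring hypothesis. The hypothesis names, scores, and weight values are made up for illustration; in the paper the weights are trained under the MPE criterion, which is not shown here.

```python
def combined_score(acoustic, tone, w_acoustic, w_tone):
    """Weighted sum of log-domain acoustic and tone scores."""
    return w_acoustic * acoustic + w_tone * tone


def best_hypothesis(hypotheses, w_acoustic, w_tone):
    """Return the hypothesis with the highest combined score."""
    return max(
        hypotheses,
        key=lambda h: combined_score(*hypotheses[h], w_acoustic, w_tone),
    )


# Hypothetical (acoustic, tone) log-scores for two tonal syllable candidates.
hypotheses = {"ma1": (-10.0, -2.0), "ma3": (-9.5, -4.0)}

acoustic_only = best_hypothesis(hypotheses, w_acoustic=1.0, w_tone=0.0)
with_tone = best_hypothesis(hypotheses, w_acoustic=1.0, w_tone=1.0)
print(acoustic_only, with_tone)
```

With the tone weight set to zero the acoustically stronger candidate wins; giving the tone score weight flips the decision, which is the effect the weight training is meant to tune.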
Funding: Supported by the Science and Technology Project of State Grid Corporation (Research and Application of Intelligent Energy Meter Quality Analysis and Evaluation Technology Based on Full Chain Data).
Abstract: With the application of artificial intelligence technology in the power industry, the knowledge graph is expected to play a key role in power grid dispatch processes, intelligent maintenance, and customer service response. Knowledge graphs are usually constructed on the basis of entity recognition. Specifically, once entity attributes and relationships have been mined, domain knowledge graphs can be constructed through knowledge fusion. In this work, the entities and characteristics of power entity recognition are analyzed, the mechanism of entity recognition is clarified, and entity recognition techniques are analyzed in the context of the power domain. Power entity recognition based on conditional random field (CRF) and bidirectional long short-term memory (BLSTM) models is investigated, and the two methods are compared. The results indicate that the CRF model, with an accuracy of 83%, identifies power entities better than the BLSTM. The CRF approach can thus be applied to entity extraction for knowledge graph construction in the power field.
Funding: Supported by the China Scholarship Council under Grant No. 2007104897 and the UESTC Youth Foundation under Grant No. JX05007.
Abstract: Identifying gene names is an attractive research area in computational biology. However, accurate extraction of gene names is challenging because there are no conventions for describing them. We devise a systematic architecture and apply a model based on conditional random fields (CRFs) to extract gene names from Medline. To improve performance, biomedical ontology features are added to the model, and post-processing, including boundary adjustment and a word filter, is presented to resolve overlapping names and remove false-positive single words. A pure string matching method, baseline CRFs, and CRFs with our methods are applied to the extraction of human gene names and HIV gene names, respectively, from 1100 Medline abstracts, and their performance is compared. Results show that CRFs are robust for unseen gene names. Furthermore, CRFs with our methods outperform the other methods, with a precision of 0.818 and a recall of 0.812.
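The abstract reports precision and recall but no combined score. Assuming the usual balanced F-measure (an assumption; the abstract does not state one), the two figures can be combined as in this small sketch.

```python
def f1_score(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for CRFs with the proposed methods.
f1 = f1_score(0.818, 0.812)
print(round(f1, 3))
```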
Funding: Supported by the Science and Technology Innovation Plan of Beijing Institute of Technology (2013).
Abstract: A fast method for phrase structure grammar analysis is proposed based on conditional random fields (CRF). The method trains several CRF classifiers to recognize the phrase nodes at different levels, and uses a bottom-up strategy to connect the recognized phrase nodes into a syntactic tree. Using the Beijing Forest Studio Chinese tagged corpus, two experiments are designed to select the training parameters and verify the validity of the method. The results show that the method takes 78.98 ms to train on and 4.63 ms to test a Chinese sentence with an average length of 17.9 words. The method offers a new way to parse phrase structure grammar for Chinese, with good generalization ability and high speed.
Abstract: In dense pedestrian tracking, frequent object occlusions and close distances between objects make it difficult to estimate object trajectories accurately. In this study, a conditional random field tracking model is established by using a visual long short-term memory network in three-dimensional (3D) space, with motion estimation performed jointly on object trajectory segments. Object visual field information is added to the long short-term memory network to improve the accuracy of motion-related object pair selection and motion estimation. To address the uncertainty in the length and interval of trajectory segments, a multimode long short-term memory network is proposed for object motion estimation. The tracking performance is evaluated using the PETS2009 dataset. The experimental results show that the proposed method achieves better performance than tracking methods based on independent motion estimation.
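Abstract: Map matching is the foundation of many location-based services and trajectory mining applications. As positioning technology and location-based service applications have developed, map matching research has evolved continuously: from early real-time matching based on high-sampling-rate GPS (Global Positioning System), to more recent offline matching based on low-sampling-rate GPS trajectories, and on to the current matching of non-GPS positioning data or high-precision maps. Many map matching algorithms have been proposed to date, but few studies have summarized them comprehensively. To this end, the map matching algorithms proposed over the past decade are surveyed, and a unified framework for map matching algorithms and their commonly used spatio-temporal features is derived. Classifying the algorithms by model or implementation technique shows that most existing algorithms adopt the HMM (Hidden Markov Model), followed by the maximum weight model; deep learning has recently begun to be applied to map matching and will be the trend in future high-precision map matching research.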
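The survey finds that most existing algorithms use an HMM. As a minimal sketch of how HMM decoding can pick road segments (not any specific algorithm from the survey), the code below runs Viterbi decoding over the candidate segments for each GPS fix. The segment names, log-scores, and transition penalties are invented for illustration.

```python
def viterbi(emissions, transition):
    """Most likely candidate sequence under log-domain scores.

    emissions: list (one entry per GPS fix) of {segment: log emission score}.
    transition: function (prev_segment, segment) -> log transition score.
    """
    best = dict(emissions[0])
    backpointers = []
    for scores in emissions[1:]:
        new_best, pointers = {}, {}
        for seg, emit in scores.items():
            prev = max(best, key=lambda p: best[p] + transition(p, seg))
            new_best[seg] = best[prev] + transition(prev, seg) + emit
            pointers[seg] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final segment.
    seg = max(best, key=best.get)
    path = [seg]
    for pointers in reversed(backpointers):
        seg = pointers[seg]
        path.append(seg)
    return path[::-1]


# Hypothetical road network: A-B and B-C are connected, A-C is not.
ADJACENT = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")}


def transition(prev, seg):
    """Log transition score: stay, move to a neighbor, or jump (penalized)."""
    if prev == seg:
        return 0.0
    return -0.5 if (prev, seg) in ADJACENT else -10.0


# One dict per GPS fix: candidate segment -> log emission score.
emissions = [
    {"A": -0.1, "C": -3.0},
    {"B": -0.5, "C": -0.4},
    {"C": -0.2, "A": -2.5},
]
matched = viterbi(emissions, transition)
print(matched)
```

Picking the best-scoring segment for each fix independently would choose C at the second fix, which is not connected to A in this toy network; decoding the whole sequence avoids that jump.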
Funding: Supported by the National Social Science Foundation of China (Grant No. 12CTQ032).
Abstract: Purpose: To annotate the semantic information and extract the research level information of research papers, we seek a method for developing an information extraction system. Design/methodology/approach: A semantic dictionary and a conditional random field model (CRFM) were used to annotate the semantic information of research papers. Based on the annotation results, the research level information was extracted through regular expressions. All functions were implemented on the Sybase platform. Findings: In our experiment on carbon nanotube research, the precision and recall rates reached 65.13% and 57.75%, respectively, after the semantic properties of word classes had been labeled, and the F-measure increased dramatically from less than 50% to 60.18% when semantic features were added. Our experiment also showed that the information extraction system for research level (IESRL) can extract performance indicators from research papers rapidly and effectively. Research limitations: Some text information, such as formatting and charts, might have been lost when the text was converted from PDF to TXT files. Semantic labeling of sentences could be insufficient due to the rich meanings of lexicons in the semantic dictionary. Research implications: The established system can help researchers rapidly compare the level of different research papers and find their implicit innovation value. It could also be used as an auxiliary tool for analyzing the research levels of various research institutions. Originality/value: In this work, we have established an information extraction system for research papers using a revised semantic annotation method based on CRFM and the semantic dictionary. Our system can analyze the information extraction problem at two levels, i.e., the sentence level and the noun (phrase) level of research papers. Compared with extraction methods based on knowledge engineering and those based on machine learning, our system shows the advantages of both.
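The regular-expression extraction step mentioned in this abstract could look roughly like the sketch below. The pattern, the indicator names, the units, and the example sentence are all hypothetical and are not taken from the paper; the original system also relies on CRFM annotations, which this sketch omits.

```python
import re

# Hypothetical pattern: an indicator name, then the first number and unit after it.
INDICATOR_PATTERN = re.compile(
    r"(?P<indicator>tensile strength|electrical conductivity)"
    r"\D*?(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>GPa|S/cm)",
    re.IGNORECASE,
)


def extract_indicators(sentence):
    """Return (indicator, value, unit) tuples found in one sentence."""
    return [
        (m.group("indicator").lower(), float(m.group("value")), m.group("unit"))
        for m in INDICATOR_PATTERN.finditer(sentence)
    ]


# Made-up example sentence, not taken from any paper.
sentence = "The tensile strength of the fibers reached 1.5 GPa."
results = extract_indicators(sentence)
print(results)
```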
Funding: Supported by the National Natural Science Foundation of China (61471225) and the Scientific Research Foundation of Shandong University of Science and Technology for Recruited Talents (2014RCJJ055).
Abstract: An approach to tracking multiple objects in crowded scenes with long-term partial occlusions is proposed. Tracking-by-detection is a successful strategy for tracking multiple objects in unconstrained scenarios, but an obvious shortcoming of this method is that most of the information available in image sequences is simply ignored, because weak detection responses are thresholded and non-maximum suppression is applied. This paper proposes a multi-label conditional random field (CRF) model which integrates superpixel information and detection responses into a unified energy optimization framework for tracking multiple targets. A key characteristic of the model is that the pairwise potential is constructed to enforce collision avoidance between objects, which improves tracking performance in crowded scenes. Experiments on standard benchmark databases demonstrate that the proposed algorithm significantly outperforms state-of-the-art tracking-by-detection methods.
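For orientation, a multi-label CRF energy is commonly written as a sum of unary and pairwise terms. The generic form below is a sketch under that assumption, not the paper's exact formulation; the symbols are introduced here for illustration only.

```latex
E(\mathbf{x}) = \sum_{i \in \mathcal{V}} \psi_i(x_i)
              + \sum_{(i,j) \in \mathcal{E}} \psi_{ij}(x_i, x_j)
```

Here $\mathcal{V}$ would be the set of nodes (for example superpixels and detections), $x_i$ the label assigned to node $i$ (a target identity or background), $\psi_i$ a unary cost, and $\psi_{ij}$ a pairwise cost over connected nodes $\mathcal{E}$. In the abstract's terms, the pairwise cost is where a collision-avoidance penalty between objects would enter.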