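Abstract
In recent years, extracting and processing temporal information in Chinese has become increasingly important. However, few researchers have studied how to identify and analyze the temporal information associated with events in Chinese text. This paper aims to determine the mapping between temporal information and event information in text. Unlike traditional rule-based methods, it uses a machine learning method, transformation-based error-driven learning, to determine the temporal expression that corresponds to each event; this learning algorithm can acquire and refine rules automatically. With the learned set of transformation rules, the system's time-event mapping error rate fell by 9.74%.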
In recent years, temporal information extraction and processing have received increasing attention. Nevertheless, only a few researchers have investigated how to recognize the temporal expression that corresponds to an event in Chinese text. This paper investigates both temporal information extraction and the determination of the mapping between an event and its temporal expression. In contrast to many other techniques, we use a machine learning method, the transformation-based error-driven learning algorithm, to determine the time-event mapping. The method acquires its analytical rules automatically. The system first builds an initial time-event tagger. Then, through machine learning, it obtains a patch rule set that improves the performance of the initial tagger. Using the patch rule set, the system achieves a 6.5% decrease in the error rate of time-event mapping determination. The experiment indicates that transformation-based error-driven learning is a good patch for rule-based methods.
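The two-stage procedure described in the abstract (an initial tagger, then learned patch rules) can be illustrated with a minimal sketch of greedy transformation-based error-driven learning. Everything specific here is an assumption made for illustration and is not taken from the paper: the labels `PREV` and `NEXT` (the event maps to the preceding or the following time expression), the feature names `time_in_clause` and `time_follows_event`, the single rule template "if feature = value and label = A, change the label to B", and the toy data.

```python
def apply_rule(rule, feats, label):
    """Return the label after applying one transformation rule."""
    feat, value, from_label, to_label = rule
    if label == from_label and feats.get(feat) == value:
        return to_label
    return label


def learn_rules(samples, gold, initial_label, min_gain=1):
    """Greedily learn patch rules that reduce the initial tagger's errors."""
    # Hypothetical initial tagger: every event gets the same default label.
    current = [initial_label] * len(samples)
    learned = []
    while True:
        # Instantiate candidate rules only from samples that are currently wrong.
        candidates = set()
        for feats, cur, ref in zip(samples, current, gold):
            if cur != ref:
                for feat, value in feats.items():
                    candidates.add((feat, value, cur, ref))
        # Score each candidate by net error reduction (fixes minus new errors).
        best_rule, best_gain = None, 0
        for rule in sorted(candidates, key=repr):
            gain = 0
            for feats, cur, ref in zip(samples, current, gold):
                new = apply_rule(rule, feats, cur)
                gain += (cur != ref) - (new != ref)
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None or best_gain < min_gain:
            return learned
        # Commit the best rule and continue from the patched labels.
        current = [apply_rule(best_rule, f, c) for f, c in zip(samples, current)]
        learned.append(best_rule)


def tag(samples, initial_label, rules):
    """Tag with the initial label, then apply the learned rules in order."""
    out = []
    for feats in samples:
        label = initial_label
        for rule in rules:
            label = apply_rule(rule, feats, label)
        out.append(label)
    return out


# Toy data (invented): one feature dictionary per event.
samples = [
    {"time_in_clause": "yes", "time_follows_event": "no"},
    {"time_in_clause": "no", "time_follows_event": "yes"},
    {"time_in_clause": "no", "time_follows_event": "yes"},
    {"time_in_clause": "no", "time_follows_event": "no"},
    {"time_in_clause": "yes", "time_follows_event": "no"},
]
gold = ["PREV", "NEXT", "NEXT", "PREV", "PREV"]

rules = learn_rules(samples, gold, initial_label="PREV")
print(rules)
print(tag(samples, "PREV", rules))
```

The learned rules are kept in the order they were acquired, because each rule is scored against labels that earlier rules have already patched; applying them in a different order at tagging time could give different results.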
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2004, Issue 4, pp. 23-30 (8 pages)
Journal of Chinese Information Processing
Funding
Natural Science Foundation project (69975008)
863 Program project (2001AA114210)
Keywords
computer application
Chinese information processing
temporal information processing
transformation-based error-driven learning
information extraction