Abstract
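The Web has become the world's largest digital library. The volume of information online is still growing rapidly, the share of useless information keeps rising, and network transmission is slow. Because the storage structure of web pages directly affects query quality and query speed, the way web information is stored urgently needs improvement. Targeting the inherent characteristics of web information, this paper proposes a new document storage structure that improves search engine performance. The main techniques involved include automatic classification of information, computation of web page relevance, and filtering of spam and duplicate information.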
The Internet has already become the largest digital library. Still growing at an exponential rate, it provides massive amounts of valuable information; however, valueless information is also growing fast. How to store web information and achieve rapid, appropriate information access has therefore become an important issue. In this paper we propose an efficient method based on an intelligent agent. In addressing this issue, automatic classification, information filtering and text analysis t...
Source
《计算机工程》
CAS
CSCD
PKU Core (北大核心)
2000, Issue S1, pp. 716-720 (5 pages)
Computer Engineering
Funding
Supported by the National 863 Program (863-306-ZD03-04-1)
Keywords
Information storage
Intelligent agent
Information filtering
Text analysis