
A Fast Algorithm for Mining User Frequent Paths from Web Logs (Web日志中用户频繁路径快速挖掘算法)

Cited by: 12
Abstract: Web access logs contain a large amount of user browsing information, and mining user frequent paths from them effectively is a prerequisite for building adaptive web sites. Building on the Apriori algorithm and a directed-graph storage structure, this paper introduces the concepts of the session matrix and the trace (traversal) matrix and designs a fast algorithm for mining user frequent paths. First, the session matrix is used to filter out the frequent 1-itemsets that satisfy a given threshold, which avoids generating a large number of intermediate items. Next, within groups of similar customers, pages are clustered quickly to obtain related pages (the authors' "relative pages"). Finally, the related pages are merged into paths according to the trace matrix, which yields the frequent paths. Experiments demonstrate the accuracy and speed of the algorithm.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal (北大核心), 2005, No. 22, pp. 164-167 (4 pages)
Keywords: session matrix, trace matrix, relative pages, user frequent paths, fast mining algorithm
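About the authors: Du Jiaqiang (杜家强, 1979-), male, master's student; research interests: computer networks, data mining. E-mail: djqluck@peopledaily.com.cn. Han Qirui (韩其睿, 1957-), male, professor and master's supervisor; research interests: computer graphics processing, software. Wang Ke (王科, 1978-), master's student; research interests: network communication, pattern recognition. Du Jiaxing (杜家兴, 1976-), male, network engineer; research interest: network security.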
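
The three steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record gives no formal definitions, so the sketch assumes that the session matrix reduces to per-page session counts, that the trace matrix counts direct page-to-page transitions per session, and it replaces the clustering step with a simple frequent-edge filter. The function name `mine_frequent_paths` and the parameter `min_support` are hypothetical.

```python
from collections import defaultdict


def mine_frequent_paths(sessions, min_support):
    """Toy three-step pipeline; every definition below is an assumption."""
    # Step 1 (assumed session matrix): the support of a page is the number
    # of sessions that contain it at least once. Pages below the threshold
    # are dropped before any path is built.
    support = defaultdict(int)
    for session in sessions:
        for page in set(session):
            support[page] += 1
    frequent_pages = {p for p, n in support.items() if n >= min_support}

    # Step 2 (assumed trace matrix, stands in for the clustering step):
    # count each direct transition a -> b between frequent pages once per
    # session, then keep the transitions that meet the threshold.
    traversal = defaultdict(int)
    for session in sessions:
        seen = set()
        for a, b in zip(session, session[1:]):
            if a != b and a in frequent_pages and b in frequent_pages:
                seen.add((a, b))
        for edge in seen:
            traversal[edge] += 1
    frequent_edges = {e for e, n in traversal.items() if n >= min_support}

    # Step 3: merge frequent edges into maximal simple paths, starting from
    # pages that have no frequent incoming edge. A pure cycle therefore
    # yields no path, and a merged path is not re-checked for support.
    successors = defaultdict(list)
    has_incoming = set()
    for a, b in sorted(frequent_edges):
        successors[a].append(b)
        has_incoming.add(b)
    starts = sorted(p for p in successors if p not in has_incoming)

    paths = []

    def extend(path):
        nexts = [n for n in successors.get(path[-1], []) if n not in path]
        if not nexts:
            paths.append(path)
        for n in nexts:
            extend(path + [n])

    for start in starts:
        extend([start])
    return paths


sessions = [
    ["A", "B", "C"],
    ["A", "B", "C", "D"],
    ["A", "B", "E"],
    ["B", "C"],
]
print(mine_frequent_paths(sessions, min_support=2))  # [['A', 'B', 'C']]
```

Filtering pages first keeps the transition counts small, which is the effect the abstract attributes to the session matrix; whether the paper's matrices are defined this way is not stated in this record.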

References (8)

  • 1. Song Qinbao, Shen Junyi. An efficient and multi-purpose algorithm for mining Web logs [J]. Journal of Computer Research and Development, 2001, 38(3): 328-333. Cited by: 115
  • 2. Anand S S, Patrick A R, Hughes J G. A data mining methodology for cross-sales [J]. Knowledge Based Systems Journal, 1998, 10(7): 449-461.
  • 3. Mobasher B, Srivastava J. Data preparation for mining World Wide Web browsing patterns [J]. Knowledge and Information Systems, 1999, 1(1): 5-32.
  • 4. Srikant R, Agrawal R. Mining generalized association rules [C]. In: Proceedings of the 21st International Conference on Very Large Data Bases, Switzerland, 1995: 407-419.
  • 5. Karuna P Joshi, Anupam Joshi, Yelena Yesha. On Using a Warehouse to Analyze Web Logs [J]. Distributed and Parallel Databases, 2003, 13: 161-180.
  • 6. Qiang Yang, Joshua Zhexue Huang, Michael Ng. A Data Cube Model for Prediction-Based Web Prefetching [J]. Journal of Intelligent Information Systems, 2003, 20(1): 11-30.
  • 7. Xing Dongshan, Shen Junyi, Song Qinbao. Mining user preferred browsing paths from Web logs [J]. Chinese Journal of Computers, 2003, 26(11): 1518-1523. Cited by: 87
  • 8. Jiawei Han, Micheline Kamber. Data Mining: Concepts and Techniques [M]. Beijing: China Machine Press, 2003-09.

Secondary references (13)

  • 1. Zaiane O R. Proc Advances Digital Libraries Conf, 1998, p. 19
  • 2. Chen M S. Proc of the 16th Int Conf Distributed Computing Systems, 1996, p. 385
  • 3. Mobasher B. Tech Rep: TR96, 1996
  • 4. Anand S S, Patrick A R, Hughes J G. A data mining methodology for cross-sales. Knowledge Based Systems Journal, 1998, 10(7): 449-461
  • 5. Park J S, Chen M S, Yu P S. Using a hash-based method with transaction trimming for mining association rules. IEEE Transactions on Knowledge and Data Eng., 1997, 9(5): 813-825
  • 6. Büchner A G, Baumgarten M, Anand S S. Navigation pattern discovery from internet data. In: Proceedings of the 5th ACM International Conference on Knowledge Discovery and Data Mining (WEBKDD'99 Workshop) (SIGKDD'99), New York, 1999. 25-30
  • 7. Srikant R, Agrawal R. Mining generalized association rules. In: Proceedings of the 21st International Conference on Very Large Data Bases, Switzerland, 1995. 407-419
  • 8. Srikant R, Agrawal R. Mining quantitative association rules in large relational tables. In: Proceedings of the ACM SIGMOD, Canada, 1996. 1-12
  • 9. Yang D L, Yang S H, Hong M C. An efficient web mining for session path patterns. In: Proceedings of International Computer Symposium 2000, Workshop on Software Eng. and Database Systems, Taiwan, 2000. 107-113
  • 10. Brin S, Motwani R, Silverstein C. Beyond market baskets: Generalizing association rules to correlations. In: Proceedings of the ACM SIGMOD, Canada, 1996. 255-276

Co-citing literature (192)

Co-cited literature (72)

Citing literature (12)

Secondary citing literature (38)
