A method for Web user session identification based on URL semantic analysis
Abstract: Classical session identification methods based on timeout-oriented and referrer-based heuristics are of limited use for discovering complex patterns in Web usage mining, so a new user session identification method based on URL semantic analysis is proposed. With the aid of a Web directory service, the method assigns semantic information to every URL record in the Web log and defines several measures to evaluate the semantic similarity between URLs. For static and streaming (dynamic) Web logs, two semantic outlier detection methods, SOA_s and SOA_d respectively, are given to segment and identify user sessions. Finally, the proposed method is compared experimentally with existing classical methods, and the results show that the precision and recall of session identification are improved.
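Author: Zhu Zhiguo (朱志国)
Source: Journal of Dalian University of Technology (《大连理工大学学报》), indexed in EI, CAS, CSCD and the Peking University Core Journals list; 2011, No. 3, pp. 440-446 (7 pages)
Funding: National Natural Science Foundation of China (70671016)
Keywords: data mining; Web usage mining; data preprocessing; user session identification
About the author: Zhu Zhiguo (b. 1977), male, Ph.D., associate professor. E-mail: zhuzg0628@126.com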
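
The abstract describes the general pipeline (attach Web-directory semantics to each URL, measure semantic similarity between URLs, cut a session where a semantically outlying URL appears) but does not give the definitions of SOA_s or SOA_d. The following Python sketch is therefore only an illustration of that general idea, not the author's algorithm. The category table `CATEGORY_OF`, the prefix-based `path_similarity` measure, the `threshold` value, and the exact-match definitions of precision and recall are all assumptions introduced here for the example.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical URL -> Web-directory category path (assumed data, not from the paper).
CATEGORY_OF: Dict[str, Tuple[str, ...]] = {
    "http://example.com/nba/scores": ("Sports", "Basketball", "NBA"),
    "http://example.com/nba/teams": ("Sports", "Basketball", "NBA"),
    "http://example.com/soccer/news": ("Sports", "Soccer"),
    "http://example.org/python/tutorial": ("Computers", "Programming", "Python"),
    "http://example.org/python/docs": ("Computers", "Programming", "Python"),
}


def path_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed measure: shared-prefix length divided by the longer path length."""
    if not a or not b:
        return 0.0
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / max(len(a), len(b))


def segment_sessions(urls: Sequence[str], threshold: float = 0.3) -> List[List[str]]:
    """Start a new session when a URL is a semantic outlier for the current one.

    A URL counts as an outlier here if its mean similarity to the URLs already
    in the current session falls below `threshold` (an assumed rule).
    """
    sessions: List[List[str]] = []
    current: List[str] = []
    for url in urls:
        cat = CATEGORY_OF.get(url, ())
        if current:
            sims = [path_similarity(cat, CATEGORY_OF.get(u, ())) for u in current]
            if sum(sims) / len(sims) < threshold:
                sessions.append(current)
                current = []
        current.append(url)
    if current:
        sessions.append(current)
    return sessions


def precision_recall(found: List[List[str]], truth: List[List[str]]) -> Tuple[float, float]:
    """Assumed definition: a reconstructed session is correct only on exact match."""
    found_set = {tuple(s) for s in found}
    truth_set = {tuple(s) for s in truth}
    hits = len(found_set & truth_set)
    precision = hits / len(found_set) if found_set else 0.0
    recall = hits / len(truth_set) if truth_set else 0.0
    return precision, recall


if __name__ == "__main__":
    log = list(CATEGORY_OF)  # one user's requests, in the order listed above
    for session in segment_sessions(log):
        print(session)
```

In this sketch, a stricter `threshold` produces more, shorter sessions and a looser one produces fewer, longer sessions. The timeout- and referrer-based heuristics mentioned in the abstract would instead cut on request timestamps or referrer fields, which this example does not model.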