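Abstract: Tree pattern queries are the core operation of XML (Extensible Markup Language) data querying; they have broad application prospects, and research on them is of considerable significance. For the extended tree pattern GTP++ (generalized tree pattern), this paper proposes a tree pattern description language, XTPL (XML tree pattern language), and gives its complete semantics in the denotational style. This allows formal methods to be used to analyze the behavior of tree pattern queries, which helps verify the correctness of XML queries and improve the reliability and robustness of query processing methods. Taking path expressions as an example, the paper also gives, in denotational-semantics form, an algorithm for extracting tree patterns from path expressions.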
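To make the idea of extracting a tree pattern from a path expression concrete, the following is a minimal sketch under stated assumptions. It is not the paper's denotational-semantics algorithm and does not model GTP++ or XTPL; it handles only a small hypothetical subset of path syntax (`/`, `//`, element names, `*`, and nested `[...]` predicates made of relative paths). The names `PatternNode`, `extract_tree_pattern`, and `to_tuple` are invented for this illustration.

```python
import re
from dataclasses import dataclass, field

@dataclass
class PatternNode:
    label: str                 # element name, "*" or the virtual "(root)"
    axis: str                  # "child" for '/', "descendant" for '//'
    output: bool = False       # True for the node the path returns
    children: list = field(default_factory=list)

# Assumed token set for the illustrative path subset.
TOKEN = re.compile(r"//|/|\[|\]|\*|[A-Za-z_][\w.-]*")
DELIMS = ("/", "//", "[", "]")

def tokenize(path):
    tokens = TOKEN.findall(path)
    if "".join(tokens) != path:
        raise ValueError(f"unsupported syntax in {path!r}")
    return tokens

def parse_path(tokens, pos, parent):
    """Attach a chain of steps under parent; return (next position, last node)."""
    node, first = parent, True
    while pos < len(tokens) and tokens[pos] != "]":
        if tokens[pos] in ("/", "//"):
            axis = "descendant" if tokens[pos] == "//" else "child"
            pos += 1
        elif first:
            axis = "child"     # a relative path inside a predicate
        else:
            raise ValueError("expected '/' or '//'")
        first = False
        if pos >= len(tokens) or tokens[pos] in DELIMS:
            raise ValueError("expected a name test")
        child = PatternNode(tokens[pos], axis)
        pos += 1
        node.children.append(child)
        node = child
        # Each predicate becomes an extra branch below the current step.
        while pos < len(tokens) and tokens[pos] == "[":
            pos, _ = parse_path(tokens, pos + 1, node)
            if pos >= len(tokens) or tokens[pos] != "]":
                raise ValueError("unclosed predicate")
            pos += 1
    return pos, node

def extract_tree_pattern(path):
    tokens = tokenize(path)
    root = PatternNode("(root)", "child")
    pos, last = parse_path(tokens, 0, root)
    if pos != len(tokens) or last is root:
        raise ValueError(f"malformed path {path!r}")
    last.output = True         # the final main-path step is the result node
    return root

def to_tuple(node):
    """Nested-tuple view of a pattern, convenient for inspection."""
    return (node.axis, node.label, node.output,
            [to_tuple(c) for c in node.children])
```

For example, `extract_tree_pattern("/book[author//name]//title")` yields a pattern in which `book` has two branches: a child `author` with a descendant `name`, and a descendant `title` marked as the output node.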
Abstract: Frequent pattern mining plays an essential role in data mining. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there are prolific patterns and/or long patterns. In this study, we introduce a novel frequent pattern growth (FP-growth) method, which is efficient and scalable for mining both long and short frequent patterns without candidate generation. Building on it, we develop a new project frequent pattern growth (PFP-tree) algorithm, which not only inherits all the advantages of the FP-growth method but also avoids its bottleneck, namely its dependence on database size, and thereby improves the algorithm's scalability.
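The pattern-growth idea described above (mining frequent itemsets from a prefix tree instead of generating and testing candidates) can be sketched as follows. This is a minimal, textbook-style FP-growth illustration, not the authors' implementation, and it does not cover the PFP-tree variant, whose details the abstract does not give. The names `FPNode`, `build_tree`, `fp_growth`, and `mine` are assumptions made for this example.

```python
from collections import defaultdict

class FPNode:
    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}

def build_tree(weighted_transactions, min_support):
    """Build an FP-tree from (items, count) pairs.

    Returns the header table (item -> list of tree nodes) and the
    support of every frequent item.
    """
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}

    root = FPNode(None, None)
    header = defaultdict(list)
    for items, count in weighted_transactions:
        # Keep frequent items only, most frequent first, so that
        # transactions share prefixes in the tree.
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent

def fp_growth(weighted_transactions, min_support, suffix=frozenset()):
    """Return {frozenset(pattern): support} without candidate generation."""
    header, frequent = build_tree(weighted_transactions, min_support)
    patterns = {}
    for item, sup in frequent.items():
        pattern = suffix | {item}
        patterns[pattern] = sup
        # Conditional pattern base: the prefix path of every node for item.
        conditional = []
        for node in header[item]:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Recurse on the conditional database to grow the pattern.
        patterns.update(fp_growth(conditional, min_support, pattern))
    return patterns

def mine(transactions, min_support):
    return fp_growth([(t, 1) for t in transactions], min_support)
```

Calling `mine` on a list of transactions (each a list of hashable items) with an absolute support threshold returns a dictionary from frequent itemsets to their support counts. Note that this sketch rebuilds a conditional tree at each recursion step, which is the simplest formulation rather than the most memory-efficient one.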