Abstract
In databases, identifying equivalent queries can reduce repeated computation. Existing methods usually verify query equivalence through semantic equivalence relations. However, semantic equivalence is a sufficient but not necessary condition for result equivalence, so judging equivalence by semantics alone misses queries that are semantically different yet return the same results. To address this problem, this paper proposes a Non-Semantic Equivalence Relation Model (NSERM) for conditional queries in databases. A query lattice is constructed with the containment relation between query filter conditions as the partial order, and the lattice is partitioned into equivalence classes by equality of query results. Each equivalence class is a convex set: any query that contains the upper bound of the class and is contained by its lower bound belongs to that class. Using this property, sets of conditional queries that are semantically non-equivalent but return the same results can be identified or answered directly. The model is implemented in the open-source database PostgreSQL. Experimental results on the TPC-H benchmark show that NSERM identifies equivalent queries that are not semantically equivalent and also improves database performance.
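The convexity property described in the abstract can be illustrated with a minimal sketch. This is not the authors' PostgreSQL implementation. It assumes conjunctive equality filters represented as sets of (column, value) pairs, orders queries by inclusion of those condition sets, and models an equivalence class as a single interval between one lower and one upper bound. The table contents and the names `run`, `make_class`, and `answer_from_class` are hypothetical.

```python
# Toy table; the rows and column names are illustrative assumptions.
ROWS = [
    {"id": 1, "city": "Kunming", "dept": "R&D", "level": "senior"},
    {"id": 2, "city": "Kunming", "dept": "R&D", "level": "senior"},
    {"id": 3, "city": "Beijing", "dept": "Sales", "level": "junior"},
    {"id": 4, "city": "Beijing", "dept": "R&D", "level": "junior"},
]


def run(query):
    """Evaluate a conjunctive query (frozenset of (column, value) pairs)
    by scanning the table; returns the set of matching row ids."""
    return frozenset(r["id"] for r in ROWS if all(r[c] == v for c, v in query))


def make_class(lower, upper):
    """Return (lower, upper, result) if lower's conditions are a subset of
    upper's and both queries return the same rows; otherwise None."""
    if not lower <= upper:
        return None
    res_lower, res_upper = run(lower), run(upper)
    return (lower, upper, res_lower) if res_lower == res_upper else None


def answer_from_class(query, classes):
    """Return the stored result if the query's condition set lies between
    the bounds of a known class, else None (the query must be executed)."""
    for lower, upper, result in classes:
        # Adding conditions can only shrink a result, so
        # result(upper) <= result(query) <= result(lower).
        # Equal bound results therefore force the same result for query.
        if lower <= query <= upper:
            return result
    return None


lower = frozenset({("city", "Kunming")})
upper = frozenset({("city", "Kunming"), ("dept", "R&D"), ("level", "senior")})
classes = [make_class(lower, upper)]

# Semantically different from both bounds, yet answered without a scan.
q = frozenset({("city", "Kunming"), ("dept", "R&D")})
cached = answer_from_class(q, classes)

# Outside the interval: no answer can be inferred.
miss = answer_from_class(frozenset({("dept", "R&D")}), classes)
```

Here `q` is not semantically equivalent to either bound, but because its conditions sit between them in the lattice and the bounds return identical rows, its result is known without execution. A real equivalence class may have several minimal or maximal elements, which this single-interval sketch does not model.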
Authors
HE Peilei (何培蕾)
YOU Jinguo (游进国)
WANG Yuxuan (王宇轩)
DING Jiaman (丁家满)
Affiliations: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
Indexed in the Peking University Core Journals list (北大核心)
2025, Issue 6, pp. 1523-1529 (7 pages)
Funding
Supported by the National Natural Science Foundation of China (Grants 62062046 and 61462050).
Keywords
database
conditional query
non-semantic equivalence
query lattice
equivalence class
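About the authors
HE Peilei, female, born in 1999, master's student, CCF student member. Research interests: databases, data warehousing, and data mining.
YOU Jinguo (corresponding author), male, born in 1977, Ph.D., professor, CCF distinguished member. Research interests include big data analytics, data warehousing, and AI4DB. E-mail: jgyou@126.com
WANG Yuxuan, male, born in 1998, master's student. Research interests: data warehousing and data mining.
DING Jiaman, male, born in 1974, M.S., professor, CCF professional member. Research interests include data mining, software engineering, and big data analytics.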