Abstract
Complex data objects such as images and text are usually represented as high-dimensional feature vectors. KNN-Gist, the existing nearest neighbor search method in PostgreSQL, is built on a tree-structured index and cannot efficiently support nearest neighbor search over high-dimensional data. This paper introduces AKNN-Qalsh, a PostgreSQL extension for approximate nearest neighbor search in high-dimensional spaces. It is based on a locality-sensitive hashing scheme and supports approximate nearest neighbor search over large-scale, high-dimensional data objects. Extensive experiments on five real data sets demonstrate the effectiveness of the extension.
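The locality-sensitive hashing idea named in the abstract can be illustrated with a minimal sketch. This is not the AKNN-Qalsh code and makes no claim about how the extension is implemented inside PostgreSQL. It assumes a query-centred random-projection scheme: each point is projected onto a few random Gaussian directions, a point "collides" with the query when the two projections differ by at most half a bucket width, and points that collide often enough are ranked by true Euclidean distance. The names `build_index` and `approx_knn` and the parameters `width` and `min_collisions` are hypothetical, chosen only for this illustration.

```python
import math
import random


def _dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def build_index(points, num_hashes, seed=0):
    """Project every point onto num_hashes random Gaussian directions.

    Returns (directions, projections), where projections[j][i] is the
    projection of points[i] onto directions[j].
    """
    rng = random.Random(seed)
    dim = len(points[0])
    directions = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(num_hashes)]
    projections = [[_dot(d, p) for p in points] for d in directions]
    return directions, projections


def approx_knn(points, index, query, k, width, min_collisions):
    """Return indices of up to k approximate nearest neighbours of query.

    Assumption for this sketch: a point collides with the query under one
    hash function when their projections differ by at most width / 2.
    Only points with at least min_collisions collisions are compared to
    the query by exact Euclidean distance.
    """
    directions, projections = index
    counts = [0] * len(points)
    for d, proj in zip(directions, projections):
        hq = _dot(d, query)  # bucket is centred on the query's projection
        for i, ho in enumerate(proj):
            if abs(ho - hq) <= width / 2.0:
                counts[i] += 1
    candidates = [i for i, c in enumerate(counts) if c >= min_collisions]
    candidates.sort(key=lambda i: math.dist(points[i], query))
    return candidates[:k]
```

A possible call is `approx_knn(points, build_index(points, 8), query, k=3, width=1.0, min_collisions=4)`. The sketch scans every projection linearly to stay short. A practical index would keep each projection list sorted, for example in a B+-tree, so that only the range around the query's projection is read.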
Authors
ZHANG Chuhan (张楚涵)
ZHANG Jiaqiao (张家侨)
FENG Jianlin (冯剑琳)
School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China
Source
《中山大学学报(自然科学版)》 (Journal of Sun Yat-sen University, Natural Science Edition)
CAS
CSCD
PKU Core (北大核心)
2019, No. 3, pp. 79-85 (7 pages)
Acta Scientiarum Naturalium Universitatis Sunyatseni
Funding
National Natural Science Foundation of China (61772563)
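About the authors
ZHANG Chuhan (born 1995), female; research interests: databases, data mining; E-mail: zhangchh7@mail2.sysu.edu.cn. Corresponding author: FENG Jianlin (born 1970), male; research interests: databases, data mining; E-mail: fengjlin@mail.sysu.edu.cn.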