Funding: Project (61533021) supported by the National Natural Science Foundation of China
Abstract: Fingerprints of two varieties of rice and their mixtures were investigated using a nonlinear chemical reaction system consisting of rice components, sodium bromate, manganese sulfate, sulfuric acid and acetone. The variety of rice was identified from the visual characteristics of the fingerprint and by system similarity pattern recognition, and the content of each variety in a mixture was determined from the quantitative information in the fingerprint. The results show that nonlinear chemical analysis can be used to accurately identify the variety of pure rice and to accurately determine the content of each variety in a mixture, indicating that the method is simple and convenient.
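Abstract: Most existing data deduplication techniques are based on variable-length chunking (content defined chunking, CDC) algorithms and do not take the content characteristics of different file types into account. This approach determines chunk boundaries in a random manner and applies the same procedure to all file types; it has been shown to work well for text and simple content but not for compound files composed of unstructured data. This paper analyzes the properties of compound files under the OpenXML standard, gives a basic method for object extraction, and proposes an algorithm that determines deduplication granularity based on object distribution and object structure. The aim is to detect, for compound files composed of unstructured data, identical objects across different files and at different positions within the same file, so that deduplication remains effective even when the physical layout of a file changes. Simulation experiments on typical unstructured data sets show that, overall, object-based deduplication improves the deduplication ratio for unstructured data by about 10% compared with the CDC method.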
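The object-level idea in the abstract above can be illustrated with a minimal Python sketch. It assumes only that an OpenXML document is a ZIP package whose parts can be read individually. The function names (`extract_objects`, `deduplicate`, `make_package`) are hypothetical, SHA-256 is an arbitrary choice of fingerprint, and the paper's granularity-determination algorithm is not reproduced here. The sketch only shows why hashing whole objects is insensitive to where an object sits in the package.

```python
import hashlib
import io
import zipfile


def extract_objects(package_bytes):
    """Yield (part_name, content) for every part of a ZIP-based package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for info in package.infolist():
            if not info.is_dir():
                yield info.filename, package.read(info)


def deduplicate(packages):
    """Store each distinct object once, keyed by its SHA-256 digest.

    packages maps a file name to the raw bytes of that package.
    Returns (store, total_bytes, unique_bytes).
    """
    store = {}
    total_bytes = 0
    for data in packages.values():
        for _part_name, content in extract_objects(data):
            total_bytes += len(content)
            digest = hashlib.sha256(content).hexdigest()
            # Identical content maps to the same key regardless of its
            # part name or position inside the package.
            store.setdefault(digest, content)
    unique_bytes = sum(len(content) for content in store.values())
    return store, total_bytes, unique_bytes


def make_package(parts):
    """Build an in-memory ZIP package from {part_name: content} (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for name, content in parts.items():
            package.writestr(name, content)
    return buffer.getvalue()
```

In this sketch, two packages that embed the same image under different part names and in a different order still share a single stored copy, because the fingerprint is computed over the extracted object rather than over the byte layout of the container.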
Funding: Supported by the National Basic Research Program of China (973 Program) under Grant No. 2003CB317003; the Strategy Grant of City University of Hong Kong of China under Grant Nos. 70017097001777