Many classical clustering algorithms perform well when their prerequisites are met but do not scale well to very large data sets (VLDS). This work proposes a novel division and partition clustering method (DP) to address the problem. DP cuts the source data set into data blocks and extracts the eigenvector of each data block to form the local feature set. The local feature set is then used in a second round of feature aggregation over the source data to find the global eigenvector. Finally, the data set is assigned according to the global eigenvector by the minimum-distance criterion. Experimental results show that DP is more robust than conventional clustering methods. Because it is insensitive to data dimensionality, data distribution, and the number of natural clusters, it has a wide range of applications in clustering VLDS.
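The abstract does not say how the per-block features are extracted or aggregated, so the following is only a minimal sketch of the divide, aggregate, and assign pattern it describes, not the authors' implementation. It assumes that plain k-means centroids stand in for the "eigenvector" of each block and for the global one; the names `kmeans`, `dp_cluster`, `block_size`, and `local_k` are hypothetical.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for feature extraction)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every point to every center: shape (n, k).
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return centers


def dp_cluster(data, k, block_size, local_k=None):
    """Divide into blocks, cluster locally, aggregate, then assign."""
    data = np.asarray(data, dtype=float)
    local_k = local_k or k
    # Division: cut the source data set into consecutive blocks.
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # Local feature set: a few representative vectors per block.
    local = np.vstack([kmeans(b, min(local_k, len(b))) for b in blocks])
    # Second round: aggregate the local features into k global vectors.
    global_centers = kmeans(local, k)
    # Assignment by minimum distance, one block at a time to bound memory.
    labels = np.concatenate([
        ((b[:, None, :] - global_centers[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        for b in blocks
    ])
    return global_centers, labels
```

In this sketch only one block and the small local feature set are ever clustered at once, which is the property that would let such a scheme handle data too large for a single pass of a conventional algorithm.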
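This study uses bibliometric methods to summarize the methods and indicators currently used to select minimum data sets (MDS) in soil quality research, and quantitatively analyzes and identifies the hot spots and frontiers of MDS in soil quality assessment, providing a scientific reference for soil quality assessment and green agricultural development in China. Relevant literature from 1991 to 2022 was retrieved from CNKI and Web of Science, and 310 minimum data sets were collected from it for screening. CiteSpace and VOSviewer were used for co-occurrence analysis of annual publication output, countries/regions, institutions, and journals, and for burst-term and cluster analysis of keywords. Over these 31 years, the number of publications in the field increased steadily and is still in a stage of rapid development. China is the country with the most publications, and the journals carrying the most papers are 《土壤通报》 (Chinese Journal of Soil Science), 《生态学报》 (Acta Ecologica Sinica), and Ecological Indicators. The main research hot spots are "the effects of agricultural management on soil quality, soil degradation and remediation, the response and adaptation of soil quality to climate change, and MDS screening methods and model construction". Early MDS for soil quality assessment mostly selected physical and chemical indicators, but with the development of soil health, the use of biological indicators has gradually increased. For some time to come, MDS publication output will remain in a stage of rapid growth, and developing countries act as important nodes globally. The core MDS indicators are soil organic matter/carbon (SOM/SOC), pH, total nitrogen, available phosphorus, and bulk density. Future research should focus on building, on big-data platforms, a soil health and quality assessment framework that combines static evaluation with dynamic monitoring at different scales and comprehensively reflects soil functions; on exploring the MDS and indicator systems that correspond to soil quality changes under climate change; and on constructing evaluation models and optimal minimum data sets that accurately reflect the patterns of soil quality change.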
Funding: Supported by the National Natural Science Foundation of China (60675039), the National High Technology Research and Development Program of China (863 Program) (2006AA04Z217), and the Hundred Talents Program of the Chinese Academy of Sciences.
Funding: Projects (60903082, 60975042) supported by the National Natural Science Foundation of China; Project (20070217043) supported by the Research Fund for the Doctoral Program of Higher Education of China.