Funding: Projects (60903082, 60975042) supported by the National Natural Science Foundation of China; Project (20070217043) supported by the Research Fund for the Doctoral Program of Higher Education of China
Abstract: Many classical clustering algorithms perform well under their own prerequisites but do not scale well to very large data sets (VLDS). In this work, a novel division and partition clustering method (DP) is proposed to solve the problem. DP cuts the source data set into data blocks and extracts the eigenvector of each data block to form the local feature set. The local feature set is then used in a second round of feature aggregation over the source data to find the global eigenvector. Finally, the data set is assigned according to the global eigenvector by the minimum-distance criterion. The experimental results show that DP is more robust than conventional clustering methods. Because it is insensitive to data dimensionality, data distribution, and the number of natural clusters, it has a wide range of applications in clustering VLDS.
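The divide, summarize, aggregate, and assign steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the per-block "eigenvector" is extracted, so the sketch assumes that plain k-means centroids stand in for both the local features and the global feature vectors. The names `kmeans`, `dp_cluster`, `n_blocks`, `k_local`, and `k_global` are hypothetical.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids (assumed feature extractor)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points (fancy indexing returns a copy).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def dp_cluster(data, n_blocks, k_local, k_global):
    """Two-round clustering: per-block features, then global features."""
    # Round 1: cut the data into blocks and summarize each block.
    blocks = np.array_split(data, n_blocks)
    local_features = np.vstack(
        [kmeans(block, k_local, seed=i) for i, block in enumerate(blocks)]
    )
    # Round 2: aggregate the local feature set into global feature vectors.
    global_centers = kmeans(local_features, k_global)
    # Final step: assign every point to its nearest global feature vector.
    dists = np.linalg.norm(data[:, None, :] - global_centers[None, :, :], axis=2)
    return global_centers, dists.argmin(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    demo = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(50, 1, (200, 2))])
    rng.shuffle(demo)
    centers, labels = dp_cluster(demo, n_blocks=4, k_local=3, k_global=2)
    print(centers.shape, np.bincount(labels))
```

Only the small local feature set is clustered in the second round, which is where a scheme of this shape would save work on large inputs. The final assignment here still builds a full distance matrix, so a real VLDS run would need to process it block by block.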
Funding: National Key Research and Development Program of China (2016YFE0122600).
Abstract: As the number of published scientific achievements grows, literature-based knowledge discovery and data mining become particularly important. Floods, among the most destructive natural disasters, have been the subject of numerous scientific publications. On January 1, 2018, we collected and processed literature data on flood research and categorized the retrieved paper records into a Whole SCI Dataset (WS) and a High-Citation SCI Dataset (HCS). These data sets can serve as basic data for bibliometric analysis to identify the status of global flood research during 1990-2017. Our study shows that the Chinese Academy of Sciences was the most productive institution during this period, while the United States was the most productive country. In addition, our keyword analysis reveals potentially popular issues and future trends in flood research.
Funding: Supported by proposal No. OSD/BCUD/392/197, Board of Colleges and University Development, Savitribai Phule Pune University, Pune
Abstract: Rapid developments in telecommunication, sensor data, financial applications, data stream analysis, and related fields have increased the rate of data arrival, which makes data mining a vital process. Data analysis consists of several tasks, among which data stream classification faces more challenges than the other commonly used techniques. Because classification of a stream is a continuous process, it requires a design that can adapt the classification model to concept change or to changes in the boundaries between classes. Hence, we design a novel fuzzy classifier, THRFuzzy, to classify new incoming data streams. Rough set theory together with a tangential holoentropy function is used to design the dynamic classification model. The classification approach uses kernel fuzzy c-means (FCM) clustering to generate the rules and the tangential holoentropy function to update the membership function. The performance of the proposed THRFuzzy method is verified on three datasets (skin segmentation, localization, and breast cancer) using two metrics, accuracy and time, and is compared with the HRFuzzy and adaptive k-NN classifiers. The experimental results show that the THRFuzzy classifier gives better classification results than the existing classifiers, reaching the maximum accuracy in the minimum time.
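For readers unfamiliar with fuzzy c-means, the sketch below shows the standard alternating update of cluster centers and membership degrees that this family of methods builds on. It is a simplified illustration only: it uses plain Euclidean FCM and leaves out both the kernel and the tangential holoentropy membership update, neither of which is specified in the abstract. The function name `fcm` and its parameters are hypothetical.

```python
import numpy as np


def fcm(points, c, m=2.0, iters=50, seed=0, eps=1e-9):
    """Standard (non-kernel) fuzzy c-means.

    Returns the c cluster centers and an (n, c) membership matrix whose
    rows sum to 1. `m` > 1 is the usual fuzzifier exponent.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized per point.
    u = rng.random((len(points), c))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(iters):
        # Centers are means of the points weighted by membership**m.
        w = u ** m
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        # Memberships are proportional to distance**(-2 / (m - 1)).
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2) + eps
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers, u
```

Calling `centers, u = fcm(points, c=2)` on an `(n, d)` array and taking `u.argmax(axis=1)` gives a hard label per point. In a rule-based fuzzy classifier, the soft memberships in `u` would be the quantities that the rules consume.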
Funding: Projects (41877551, 41842020) supported by the National Natural Science Foundation of China
Abstract: Vegetation encroachment occurred in a bauxite residue disposal area (BRDA) following natural weathering processes, yet the typical indicators of soil formation are still uncertain. Residue samples were collected from the BRDA in Central China, and the related physical, chemical, and biological indicators of bauxite residue with different storage years were determined. The indicators of soil formation in bauxite residue were selected using principal component analysis, factor analysis, and comprehensive evaluation to establish a soil quality diagnostic index model for disposal areas. Following natural weathering processes, the texture of bauxite residue changed from silty loam to sandy loam. The pH and EC decreased, whilst porosity, nutrient element content, and microbial biomass increased. The identified minimum data set (MDS) included available phosphorus (AP), moisture content (MC), C/N, sand content, total nitrogen (TN), microbial biomass carbon (MBC), and pH. The soil quality index of bauxite residue increased, and the relative soil quality index decreased from 1.89 to 0.15, which indicated that natural weathering had a significant effect on improving the quality of bauxite residue and forming a new soil-like matrix. The diagnostic model of bauxite residue was established to provide data support for regeneration on disposal areas.
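One common way to derive a minimum data set from principal component analysis is to keep the components whose eigenvalue is at least 1 and pick the indicator with the largest absolute loading on each. The sketch below illustrates that convention only. The abstract does not state the selection rule the authors used, so the eigenvalue threshold, the one-indicator-per-component rule, and the name `select_mds` are all assumptions.

```python
import numpy as np


def select_mds(values, names, min_eigenvalue=1.0):
    """Pick one indicator per retained principal component.

    `values` is an (n_samples, n_indicators) array and `names` lists the
    indicator labels in column order. PCA is done on the correlation
    matrix, so indicators are implicitly standardized.
    """
    corr = np.corrcoef(values, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
    selected = []
    for idx in order:
        if eigvals[idx] < min_eigenvalue:
            break  # assumed retention rule: eigenvalue >= 1
        # Indicator with the largest absolute loading on this component.
        top = int(np.abs(eigvecs[:, idx]).argmax())
        if names[top] not in selected:
            selected.append(names[top])
    return selected
```

With measured indicator columns such as AP, MC, TN, MBC, and pH passed in as `values`, the returned labels would form a candidate MDS. A soil quality index would then be computed by scoring and weighting those indicators, a step this sketch does not cover.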