Funding: National Key Research and Development Program of China (2016YFE0122600).
Abstract: With an increasing number of scientific achievements being published, it is particularly important to conduct literature-based knowledge discovery and data mining. Floods, as one of the most destructive natural disasters, have been the subject of numerous scientific publications. On January 1, 2018, we collected and processed literature data on flood research and categorized the retrieved paper records into a Whole SCI Dataset (WS) and a High-Citation SCI Dataset (HCS). These datasets can serve as basic data for bibliometric analysis to identify the status of global flood research during 1990-2017. Our study shows that while the Chinese Academy of Sciences was the most productive institution during this period, the United States was the most productive country. In addition, our keyword analysis reveals potentially popular issues and future trends in flood research.
Abstract: A novel binary particle swarm optimization algorithm for mining frequent item sets from high-dimensional datasets (BPSO-HD) is proposed, combining two improvements. First, dimensionality reduction of the initial particles is designed to ensure reasonable initial fitness; second, dynamic dimensionality cutting of the dataset is built to decrease the search space. On four high-dimensional datasets, BPSO-HD was compared with Apriori to test its reliability, and with ordinary BPSO and quantum swarm evolutionary (QSE) to prove its advantages. The experiments show that the results given by BPSO-HD are reliable and better than those generated by BPSO and QSE.
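To make the idea of a binary particle concrete, here is a minimal sketch, assuming that a particle is a bit vector over items and that fitness is the support of the encoded item set. The function names `support` and `prune_infrequent_items` are hypothetical, and the pruning rule (dropping items whose single-item support is below a threshold) is only one plausible way to cut dimensions; the abstract does not specify the rule BPSO-HD actually uses.

```python
from typing import List, Sequence, Set


def support(particle: Sequence[int], transactions: List[Set[int]]) -> float:
    """Fraction of transactions that contain every item whose bit is 1."""
    itemset = {i for i, bit in enumerate(particle) if bit}
    if not itemset or not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def prune_infrequent_items(transactions: List[Set[int]],
                           min_support: float) -> List[int]:
    """Return the items whose single-item support reaches min_support.

    Assumption: an infrequent item cannot belong to any frequent item set
    (the Apriori property), so its dimension can be dropped before searching.
    """
    n = len(transactions)
    if n == 0:
        return []
    counts = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1
    return sorted(i for i, c in counts.items() if c / n >= min_support)


# Hypothetical toy data: four transactions over items 0..3.
transactions = [{0, 1, 2}, {0, 1}, {1, 2}, {0, 1, 3}]
particle = [1, 1, 0, 0]  # encodes the item set {0, 1}
print(support(particle, transactions))
print(prune_infrequent_items(transactions, 0.5))
```

Shrinking the item universe this way shortens every particle, which is the sense in which a dimension cut reduces the search space.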
Funding: Supported by the National Natural Science Foundation of China (60675039), the National High Technology Research and Development Program of China (863 Program) (2006AA04Z217), and the Hundred Talents Program of the Chinese Academy of Sciences.
Funding: Supported by proposal No. OSD/BCUD/392/197, Board of Colleges and University Development, Savitribai Phule Pune University, Pune.
Abstract: Rapid developments in telecommunications, sensor data, financial applications, data stream analysis, and related fields have increased the rate of data arrival, which makes data mining a vital process. Data analysis consists of different tasks, among which data stream classification faces more challenges than other commonly used techniques. Because classification is a continuous process, it requires a design that can adapt the classification model to concept change or to changes in the boundary between classes. Hence, we design a novel fuzzy classifier, THRFuzzy, to classify newly incoming data streams. Rough set theory together with a tangential holoentropy function helps in designing the dynamic classification model. The approach uses kernel fuzzy c-means (FCM) clustering to generate the rules and the tangential holoentropy function to update the membership function. The performance of THRFuzzy is verified on three datasets (skin segmentation, localization, and breast cancer) using accuracy and time as metrics, and is compared with the HRFuzzy and adaptive k-NN classifiers. The experimental results show that the THRFuzzy classifier gives better classification results, achieving the highest accuracy in the least time among the compared classifiers.
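For readers unfamiliar with fuzzy memberships, the sketch below shows the membership formula of standard fuzzy c-means, u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)), with Euclidean distances. This is a stand-in for illustration only: the abstract describes a kernel FCM variant and a tangential holoentropy update, neither of which is reproduced here, and the function name `fcm_memberships` is hypothetical.

```python
import math
from typing import List, Sequence


def fcm_memberships(point: Sequence[float],
                    centers: List[Sequence[float]],
                    m: float = 2.0) -> List[float]:
    """Standard FCM membership of one point in each cluster (requires m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    zero = [i for i, d in enumerate(dists) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0
                for i in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** exponent for dk in dists) for di in dists]


# Hypothetical example: one point between two cluster centers.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
print(u)  # memberships sum to 1; the nearer center gets the larger share
```

The memberships of a point always sum to 1, so a classifier built on them can express how ambiguous a stream sample is between classes rather than forcing a hard label.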
Abstract: A rough set probabilistic data association (RS-PDA) algorithm is proposed to reduce the complexity and time consumption of data association and to enhance the accuracy of tracking results in multi-target tracking applications. In this new algorithm, the measurements lying in the intersection of two or more validation regions are allocated to the corresponding targets through rough set theory, and the multi-target tracking problem is transformed into single-target tracking after the classification of the measurements lying in the intersection region. Several typical multi-target tracking applications are given. The simulation results show that the algorithm not only reduces the complexity and time consumption but also enhances the accuracy and stability of the tracking results.
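The first step described above, finding the measurements that fall into more than one validation region, can be sketched as follows. The sketch assumes circular gates of a fixed Euclidean radius around each predicted target position; practical PDA filters usually use ellipsoidal gates derived from the innovation covariance. The rough set allocation itself is not reproduced, and all names are hypothetical.

```python
import math
from typing import List, Sequence


def gate_membership(measurements: List[Sequence[float]],
                    predictions: List[Sequence[float]],
                    gate_radius: float) -> List[List[int]]:
    """For each measurement, list the targets whose gate contains it."""
    return [
        [t for t, p in enumerate(predictions)
         if math.dist(z, p) <= gate_radius]
        for z in measurements
    ]


def shared_measurements(membership: List[List[int]]) -> List[int]:
    """Indices of measurements lying in two or more validation regions."""
    return [i for i, targets in enumerate(membership) if len(targets) > 1]


# Hypothetical scene: two predicted target positions and four measurements.
predictions = [(0.0, 0.0), (3.0, 0.0)]
measurements = [(0.5, 0.0), (1.5, 0.0), (4.0, 0.0), (10.0, 10.0)]
membership = gate_membership(measurements, predictions, gate_radius=2.0)
print(membership)
print(shared_measurements(membership))
```

Only the shared measurements need an allocation decision; once each is assigned to a single target, every target can be tracked with its own single-target filter.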
Funding: Supported by the National Natural Science Foundation of China (61472192) and the Scientific and Technological Support Project (Society) of Jiangsu Province (BE2016776).
Abstract: The Internet is now a large-scale platform with big data. Finding truth in huge datasets has attracted extensive attention, since it can maintain the quality of data collected by users and provide users with accurate and efficient data. However, current truth finder algorithms are unsatisfactory because of their low accuracy and complexity. This paper proposes a truth finder algorithm based on entity attributes (TFAEA). Building on the iterative computation of source reliability and fact accuracy, TFAEA considers the degree of interaction among facts and the degree of dependence among sources in order to simplify typical truth finder algorithms. To improve accuracy, TFAEA combines one-way text similarity and factual conflict to calculate the degree of mutual support among facts. Furthermore, TFAEA uses the symmetric saturation of data sources to calculate the degree of dependence among sources. The experimental results show that TFAEA is not only more stable but also more accurate than typical truth finder algorithms.
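The iterative computation of source reliability and fact accuracy mentioned above can be illustrated with a generic, simplified truth-finding loop. This is not TFAEA: it ignores text similarity, factual conflict, and source dependence, and it assumes sources are independent. The update rules (a fact is wrong only if every source asserting it is wrong; a source's trust is the mean confidence of its facts) and the name `truth_finder` are assumptions made for the sketch.

```python
from typing import Dict, Set, Tuple


def truth_finder(claims: Dict[str, Set[str]],
                 iterations: int = 20,
                 initial_trust: float = 0.8
                 ) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Alternate between fact confidence and source trust for a fixed
    number of rounds. `claims` maps each source to the facts it asserts."""
    trust = {s: initial_trust for s in claims}
    facts = set()
    for asserted in claims.values():
        facts |= asserted
    confidence: Dict[str, float] = {}
    for _ in range(iterations):
        # A fact is false only if all of its (assumed independent) sources err.
        for f in facts:
            all_wrong = 1.0
            for s, asserted in claims.items():
                if f in asserted:
                    all_wrong *= 1.0 - trust[s]
            confidence[f] = 1.0 - all_wrong
        # A source is as trustworthy as the facts it provides, on average.
        for s, asserted in claims.items():
            if asserted:
                trust[s] = sum(confidence[f] for f in asserted) / len(asserted)
    return trust, confidence


# Hypothetical claims: two sources agree on "x", one source asserts "y".
trust, confidence = truth_finder({"A": {"x"}, "B": {"x"}, "C": {"y"}})
print(confidence["x"] > confidence["y"])  # corroborated fact scores higher
```

Facts backed by several sources gain confidence, which in turn raises the trust of the sources that assert them; the refinements listed in the abstract adjust these two quantities for support among facts and dependence among sources.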