Funding: Supported by the Key Discipline Construction Program of Beijing Municipal Commission of Education (XK10008043).
Abstract: Most real-world processes are complex nonlinear systems with incomplete information, so it is difficult to estimate a model under the assumption that a single global model governs the data set. Moreover, data collected from real processes usually contain missing values. To overcome the shortcomings of global modeling and to handle missing values, a new modeling method is proposed. First, the incomplete data set is partitioned into several clusters by a K-means with soft constraints (KSC) algorithm, which incorporates soft constraints to enable clustering with missing values. Then a local model is built for each cluster using a support vector regression (SVR) algorithm, which adopts a missing value insensitive (MVI) kernel to address the missing value estimation problem. The valid region of each local model is also obtained. Simulation results demonstrate the effectiveness of the proposed local model and estimation algorithm.
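The clustering step in this abstract can be illustrated with a minimal sketch. The abstract does not give the soft-constraint formulation of KSC, so the code below is not KSC: it swaps in a simpler, well-known partial-distance K-means, which measures distance over the observed features only and rescales by the fraction observed. The function name `partial_distance_kmeans`, its parameters, and the NaN encoding of missing values are assumptions made for illustration. The SVR stage with the MVI kernel is not shown, since the kernel is not defined here.

```python
import numpy as np

def partial_distance_kmeans(X, k, init=None, n_iter=100, seed=0):
    """K-means on data with NaN-marked missing values (hypothetical helper).

    Distances use observed features only, scaled by d / (number observed).
    This is a stand-in for illustration, NOT the KSC algorithm of the abstract.
    Assumes each row has at least one observed value and no column is all NaN.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    observed = ~np.isnan(X)
    X_filled = np.where(observed, X, 0.0)          # zeros are masked out later
    n_obs = np.maximum(observed.sum(axis=1), 1)    # guard against division by zero

    if init is None:
        # Start from k random rows, filling their gaps with column means.
        rng = np.random.default_rng(seed)
        col_means = np.nanmean(X, axis=0)
        rows = rng.choice(n, size=k, replace=False)
        centers = np.where(observed[rows], X[rows], col_means)
    else:
        centers = np.array(init, dtype=float)

    labels = None
    for _ in range(n_iter):
        # Squared partial distance from every point to every centre.
        diff = (X_filled[:, None, :] - centers[None, :, :]) * observed[:, None, :]
        dist = (diff ** 2).sum(axis=2) * d / n_obs[:, None]
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                                   # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = labels == c
            counts = observed[members].sum(axis=0)  # observed values per feature
            sums = X_filled[members].sum(axis=0)
            # Update each coordinate from observed values; keep the old one if none.
            centers[c] = np.where(counts > 0, sums / np.maximum(counts, 1), centers[c])
    return labels, centers
```

Each resulting cluster could then be passed to a separate regressor, which is the role the abstract assigns to the local SVR models.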
Abstract: Time series analysis is a key technology for medical diagnosis, weather forecasting and financial prediction systems. However, missing data frequently occur during data recording, posing a great challenge to data mining tasks. In this study, we propose a novel denoising autoencoder (DAE) based on time series data representation for the reconstruction of missing values. Two data representation methods, namely, recurrence plot (RP) and Gramian angular field (GAF), are used to transform the raw time series into a 2D matrix, which captures the temporal correlations between different time intervals and exposes the structural patterns of the time series. An improved DAE is then proposed to reconstruct the missing values from the 2D representation of the time series. A comprehensive comparison is conducted among the different representations on standard datasets. Results show that the 2D representations yield a lower reconstruction error than the raw time series, and that the RP representation provides the best outcome. This work provides useful insights into the reconstruction of missing values in time series analysis and can considerably improve the reliability of time-varying systems.
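The two 2D representations named in this abstract can be sketched with their standard textbook definitions. The abstract does not say which variants were used, so the choices below are assumptions: the recurrence plot is thresholded and built without time-delay embedding, and the Gramian angular field is the summation form (GASF). The function names and the `eps` parameter are hypothetical, and the denoising autoencoder itself is not shown.

```python
import numpy as np

def recurrence_plot(x, eps):
    """Binary recurrence plot: R[i, j] = 1 where |x[i] - x[j]| <= eps.

    Assumes embedding dimension 1 (no time-delay embedding).
    """
    x = np.asarray(x, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])   # pairwise distances between time points
    return (dist <= eps).astype(np.uint8)

def gramian_angular_summation_field(x):
    """GASF: rescale x to [-1, 1], set phi = arccos(x), return cos(phi_i + phi_j).

    Assumes the series is not constant (max > min).
    """
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    scaled = np.clip(scaled, -1.0, 1.0)      # guard against rounding outside [-1, 1]
    phi = np.arccos(scaled)                  # polar-angle encoding of each value
    return np.cos(phi[:, None] + phi[None, :])

series = np.sin(np.linspace(0.0, 4.0 * np.pi, 64))
rp = recurrence_plot(series, eps=0.1)
gasf = gramian_angular_summation_field(series)
```

Both functions map a series of length n to an n-by-n matrix, which is the kind of image-like input the abstract describes feeding to the autoencoder.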