Abstract
This paper examines several cases in which the results of statistical modeling for industrial control contradict theoretical knowledge because of the characteristics of the data, with the aim of helping practitioners understand statistical modeling results fully and judge them soundly. Statistical tools such as SAS were used to simulate industrial field data and generate a large volume of experimental data, which were then analyzed with statistical data mining methods. Differences in the characteristics of industrial field data are the main cause of the inconsistency between statistical modeling results and theoretical knowledge. Preprocessing industrial process data changes the data characteristics, which can bias model parameters estimated by least squares and make a linear relationship appear nonlinear. Because the intrinsic relationships among variables are complex, judging linear relationships simply from correlation coefficients can yield conclusions that contradict theoretical knowledge. The results provide a theoretical reference for building Baosteel's precise sample selection model and its dynamic control model for strip steel properties.
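The two effects described in the abstract can be reproduced with a small simulation. The sketch below is only an illustration in Python with NumPy, not the authors' SAS experiments; the function names, the filtering threshold of 1.5, the coefficients, and the noise levels are all assumptions chosen for the example. Case 1 filters samples by the response value, as an outlier-removal step might, and refits the least-squares line. Case 2 generates a response that depends positively on x but also on a hidden variable z that is strongly correlated with x.

```python
import numpy as np


def ols_slope(x, y):
    """Least-squares slope of y regressed on x (intercept included)."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope


def simulate(n=20000, seed=0):
    rng = np.random.default_rng(seed)

    # Case 1: preprocessing that filters on the response changes the data
    # distribution and biases the least-squares slope.
    x = rng.normal(0.0, 1.0, n)
    y = 2.0 * x + rng.normal(0.0, 1.0, n)  # assumed true slope: 2
    keep = np.abs(y) < 1.5                 # hypothetical "outlier removal"
    slope_full = ols_slope(x, y)
    slope_filtered = ols_slope(x[keep], y[keep])

    # Case 2: y2 depends on x2 with coefficient +1, but a correlated hidden
    # variable z pulls the other way, so the simple correlation is negative.
    z = rng.normal(0.0, 1.0, n)
    x2 = z + 0.3 * rng.normal(0.0, 1.0, n)
    y2 = 1.0 * x2 - 2.0 * z + 0.5 * rng.normal(0.0, 1.0, n)
    corr_xy = np.corrcoef(x2, y2)[0, 1]

    # Regressing on both x2 and z recovers the direct effect of x2.
    design = np.column_stack([x2, z, np.ones(n)])
    coef, *_ = np.linalg.lstsq(design, y2, rcond=None)

    return {
        "slope_full": slope_full,
        "slope_filtered": slope_filtered,
        "corr_xy": corr_xy,
        "coef_x": coef[0],
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the slope fitted on the filtered data falls well below the true value of 2, and the correlation between x2 and y2 is negative even though the direct coefficient of x2 is +1. Both outcomes would look like a contradiction of theory if the data characteristics were ignored.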
Source
《控制工程》
CSCD
Peking University Core Journal (北大核心)
2009, Issue S2, pp. 192-195 (4 pages)
Control Engineering of China
Keywords
data characteristics
data mining
least squares
optimal estimation
correlation coefficient