Abstract
Power grid companies record large volumes of Chinese-language fault and defect texts during maintenance and inspection, and these texts contain rich information about equipment health. So far, however, few studies have addressed text mining techniques for the power domain. Taking lifecycle condition assessment of circuit breakers (CBs) as the application background, this paper explores a method for mining Chinese texts in power grids. First, based on a review of research on CB condition assessment, the key issues in building text mining and lifecycle condition assessment models are identified. Then, a lifecycle condition assessment model that incorporates text-mined information is built; it presents the CB lifecycle health index through hidden Markov model (HMM)-based text preprocessing and vectorization, text classification with a self-interval-searching k-nearest neighbor (KNN) algorithm, and a proportional health-index fusion model (PHFM). Finally, a case study is built from real defect texts collected from a power grid company. The case study shows that the text mining technique learns correlations among similar defects, including those recorded for other assets, and that the PHFM presents the historical course of the health condition assessment more comprehensively and faithfully.
Source
Automation of Electric Power Systems (《电力系统自动化》)
EI
CSCD
PKU Core Journals
2016, No. 6, pp. 107-112, 118 (7 pages in total)
Funding
Science and Technology Project of State Grid Corporation of China
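About the authors
Qiu Jian (邱剑, 1990-), male, PhD candidate. Main research interests: smart grid data mining, failure rate prediction, and condition-based maintenance. E-mail: jianqiu@zju.edu.cn
Wang Huifang (王慧芳, 1974-), female, corresponding author, PhD, associate professor. Main research interests: condition-based maintenance of power grids, and relay protection and control. E-mail: huifangwang@zju.edu.cn
Ying Gaoliang (应高亮, 1957-), male, senior engineer. Main research interest: applied live-line detection technology.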