Abstract
This study proposes a systematic method for identifying financial fraud and improving identification efficiency, built on two elements: diversified data mining and an improved machine learning ensemble approach. On the data side, it reconstructs traditional financial factors, introduces corporate governance factors, and constructs language factors through text analysis. On the method side, it uses nine machine learning algorithms with different characteristics as base learners and applies a meta-learning framework to systematically identify financial fraud in listed companies. The main findings are: 1) the meta-learning framework significantly improves recall and precision on fraud samples and the overall predictive performance of the learners, and it is effective in most industries; 2) under a rolling prediction setup that approximates real-world conditions, the meta-learning framework still significantly improves the base learners' ability to identify financial fraud; 3) corporate governance factors and language factors contribute to some extent to financial fraud identification.
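The rolling prediction setup mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give window lengths, features, or model details, so everything below is an assumption made for illustration: the expanding-window split, the `min_train_years` parameter, the toy one-feature samples, and the hypothetical `fit_threshold` stand-in, which is not one of the paper's nine base learners and not its meta-learner. The sketch only shows the evaluation pattern: train on past years, predict the next year, and report precision and recall on the fraud class.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

# (fiscal year, feature vector, label) where label 1 = fraud, 0 = non-fraud
Sample = Tuple[int, List[float], int]
Predictor = Callable[[List[float]], int]


def rolling_splits(years: Sequence[int], min_train_years: int = 2) -> Iterator[Tuple[List[int], int]]:
    """Yield (train_years, test_year): train on all earlier years, test on the next one."""
    ordered = sorted(set(years))
    for i in range(min_train_years, len(ordered)):
        yield ordered[:i], ordered[i]


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for the fraud class; 0.0 when the ratio is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def fit_threshold(train: List[Sample]) -> Predictor:
    """Hypothetical stand-in model: flag a firm when its first feature exceeds
    the midpoint between the fraud and non-fraud class means."""
    fraud = [x[0] for _, x, y in train if y == 1]
    clean = [x[0] for _, x, y in train if y == 0]
    threshold = (sum(fraud) / len(fraud) + sum(clean) / len(clean)) / 2
    return lambda x: 1 if x[0] > threshold else 0


def rolling_evaluate(
    samples: List[Sample],
    fit: Callable[[List[Sample]], Predictor],
    min_train_years: int = 2,
) -> Dict[int, Tuple[float, float]]:
    """Refit on past years only, then score the following year."""
    results = {}
    for train_years, test_year in rolling_splits([s[0] for s in samples], min_train_years):
        train = [s for s in samples if s[0] in train_years]
        test = [s for s in samples if s[0] == test_year]
        predict = fit(train)
        y_true = [y for _, _, y in test]
        y_pred = [predict(x) for _, x, _ in test]
        results[test_year] = precision_recall(y_true, y_pred)
    return results


# Toy data, invented purely for illustration.
toy_samples: List[Sample] = [
    (2018, [0.10], 0), (2018, [0.90], 1),
    (2019, [0.20], 0), (2019, [0.80], 1),
    (2020, [0.30], 0), (2020, [0.70], 1),
    (2021, [0.60], 0), (2021, [0.95], 1),
]
results = rolling_evaluate(toy_samples, fit_threshold)
for year, (precision, recall) in results.items():
    print(year, round(precision, 2), round(recall, 2))
```

Because each model sees only data from years before the test year, no future information leaks into training, which is what makes this kind of evaluation closer to a real deployment than a random train/test split.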
Authors
ZHANG Xue-yong (张学勇)
SHI Yi (施懿)
ZHANG Xue-yong; SHI Yi (School of Finance, Central University of Finance and Economics, Beijing 100081, China; Webank, Shenzhen 518063, China)
Source
《管理科学学报》
CSSCI
CSCD
Peking University Core Journals (北大核心)
2023, No. 10, pp. 95-113 (19 pages)
Journal of Management Sciences in China
Funding
Major Project of the National Philosophy and Social Science Foundation of China (19ZDA098)
Keywords
financial fraud
meta-learning
machine learning
text analysis
About the author
ZHANG Xue-yong (张学勇), born 1978, male, from Lujiang, Anhui; Ph.D., professor, doctoral supervisor. Email: zhangxueyong@cufe.edu.cn