
Logistic Regression with Regularization Based on Network Structure (cited by: 6)
Abstract: Logistic regression is a widely used classification model. However, as high-dimensional classification tasks become increasingly common in practice, classification models face a serious challenge. Regularization is an effective way to address it. Many existing regularized logistic regression models use the L1-norm penalty directly as the regularization term, without considering the complex relationships among features. Other work designs penalty terms from group information about the features, but assumes that this group information is given in advance. This paper mines the latent patterns in feature data from a network perspective and, on that basis, proposes a regularized logistic regression model based on network structure. First, the feature data are described in the form of a network, yielding a feature network. Second, the feature network is observed and analyzed from the perspective of network science, and a penalty function is designed from these observations. Third, a network-structured Lasso logistic regression model is proposed, with this penalty function as the regularization term. Finally, the solution procedure for the model is derived by combining Nesterov's accelerated proximal gradient method with the Moreau-Yosida regularization method. Experiments on real datasets show that the proposed network-structured Lasso logistic regression performs very well, which indicates that observing and analyzing feature data from a network perspective is a promising direction for research on regularized models.
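The abstract names Nesterov's accelerated proximal gradient method as the solver. As a rough illustration only, the sketch below applies that kind of method (in its common FISTA form) to logistic regression with a plain L1 penalty, the baseline the abstract mentions. The network-structured penalty and the Moreau-Yosida step are not specified in this record, so they are not reproduced here. The function names, the parameter `lam`, and the fixed step size are all assumptions made for the example, not the authors' implementation.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: elementwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def logistic_loss_grad(w, X, y):
    """Mean logistic loss and its gradient, for labels y in {0, 1}."""
    z = X @ w
    loss = np.mean(np.logaddexp(0.0, z) - y * z)
    p = 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable sigmoid
    grad = X.T @ (p - y) / len(y)
    return loss, grad


def objective(w, X, y, lam):
    """Penalized objective: mean logistic loss + lam * ||w||_1."""
    loss, _ = logistic_loss_grad(w, X, y)
    return loss + lam * np.sum(np.abs(w))


def fista_l1_logreg(X, y, lam=0.05, n_iter=500):
    """Accelerated proximal gradient (FISTA) for L1-penalized logistic regression.

    Assumption: a plain L1 penalty stands in for the paper's network-based
    penalty, whose form is not given in the abstract.
    """
    n, d = X.shape
    # The gradient of the mean logistic loss is Lipschitz with constant
    # sigma_max(X)^2 / (4 n), so 1 / L is a safe fixed step size.
    lipschitz = np.linalg.norm(X, 2) ** 2 / (4.0 * n)
    step = 1.0 / lipschitz

    w = np.zeros(d)   # current iterate
    v = w.copy()      # extrapolated (momentum) point
    t = 1.0           # Nesterov momentum parameter
    for _ in range(n_iter):
        _, grad = logistic_loss_grad(v, X, y)
        # Gradient step on the smooth loss, then prox step on the penalty.
        w_next = soft_threshold(v - step * grad, step * lam)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        v = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w
```

Calling `fista_l1_logreg(X, y)` on a feature matrix `X` of shape `(n, d)` and a 0/1 label vector `y` returns a coefficient vector in which some entries may be exactly zero. Replacing `soft_threshold` with the proximal operator of a different penalty is the natural extension point, provided that operator can be computed.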
Authors: HU Yan-mei (胡艳梅); YANG Bo (杨波); DUO Bin (多滨) (College of Computer Science and Cyber Security, Chengdu University of Technology, Chengdu 610059, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China)
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2021, Issue 7, pp. 281-291 (11 pages)
Funding: National Natural Science Foundation of China (61802034, 61977013); National Key Research and Development Program of China (2019YFC1509602); Key Research and Development Project of the Sichuan Science and Technology Program (2021YFG0333).
Keywords: regularized penalty term; logistic regression; network structure; feature selection; proximal gradient method
About the author: Corresponding author HU Yan-mei (胡艳梅), born in 1984, Ph.D., associate professor and master's supervisor, is a member of the China Computer Federation. Her main research interests include data mining, social and information network analysis, machine learning, and evolutionary computation. Email: huyanmei@cdut.edu.cn
