1. A Rebalancing Framework for Classification of Imbalanced Medical Appointment No-show Data (cited by: 1)
Authors: Ulagapriya Krishnan, Pushpa Sangar. Journal of Data and Information Science (CSCD), 2021, No. 1, pp. 178-192 (15 pages).
Purpose: This paper aims to improve classification performance on imbalanced data by applying different sampling techniques available in Machine Learning.
Design/methodology/approach: The medical appointment no-show dataset is imbalanced, and when classification algorithms are applied directly to the dataset, the result is biased towards the majority class, ignoring the minority class. To avoid this issue, multiple sampling techniques such as Random Over Sampling (ROS), Random Under Sampling (RUS), Synthetic Minority Oversampling TEchnique (SMOTE), ADAptive SYNthetic Sampling (ADASYN), Edited Nearest Neighbor (ENN), and Condensed Nearest Neighbor (CNN) are applied in order to balance the dataset. Performance is assessed using the Decision Tree classifier with each of the listed sampling techniques, and the best-performing technique is identified.
Findings: This study compares the performance metrics of various widely used sampling methods. It is revealed that, compared to other techniques, the Recall is high when ENN is applied; CNN and ADASYN have performed equally well on the imbalanced data.
Research limitations: The testing was carried out with a limited dataset and needs to be repeated with a larger dataset.
Practical implications: This framework will be useful whenever the data is imbalanced in real-world scenarios, which ultimately improves the performance.
Originality/value: This paper uses the rebalancing framework on the medical appointment no-show dataset to predict no-shows and removes the bias against the minority class.
Keywords: imbalanced data; sampling methods; machine learning; classification
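As a rough illustration of two of the sampling techniques named in the abstract above, the sketch below implements Random Over Sampling (ROS) and Random Under Sampling (RUS) in plain Python, together with the Recall metric the study reports. It is not the authors' code: the function names, the toy data, and the label encoding (0 for show, 1 for no-show) are all assumptions made for this example.

```python
import random
from collections import Counter


def random_over_sample(X, y, seed=0):
    """ROS: duplicate random samples of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def random_under_sample(X, y, seed=0):
    """RUS: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(pool, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy data: 8 "show" records (0) and 2 "no-show" records (1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_ros, y_ros = random_over_sample(X, y)
X_rus, y_rus = random_under_sample(X, y)
```

After resampling, both classes have the same number of records. In practice the resampling would be applied to the training split only, so that the Recall measured on the test split reflects the original class distribution.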
2. Fault Diagnosis Method Based on Xgboost and LR Fusion Model under Data Imbalance
Authors: Liling Ma, Tianyi Wang, Xiaoran Liu, Junzheng Wang, Wei Shen. Journal of Beijing Institute of Technology (EI, CAS), 2022, No. 4, pp. 401-412 (12 pages).
Diagnosis methods based on machine learning and deep learning are widely used in the field of motor fault diagnosis. However, the data imbalance caused by the high cost of obtaining fault data leads to insufficient generalization performance of these diagnosis methods. In response to this problem, a motor fault monitoring system is proposed. It includes a fault diagnosis method (Xgb_LR) based on a fusion model of the optimized gradient boosting decision tree (Xgboost) and logistic regression (LR), and a data augmentation method named data simulation neighborhood interpolation (DSNI). The Xgb_LR method combines the advantages of the two models and adapts well to imbalanced data. The DSNI method serves as an auxiliary to the diagnosis method, reducing the impact of data imbalance by expanding the original data (signal). Simulation experiments verify the effectiveness of the proposed methods.
Keywords: imbalanced data; fault diagnosis; data augmentation method
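The abstract above does not describe how DSNI works internally, so the following is not the authors' method. It is only a generic sketch of neighborhood interpolation in the SMOTE style, included to show how interpolating between nearby minority samples can expand scarce fault data. The function name, parameters, and toy feature vectors are hypothetical.

```python
import math
import random


def interpolate_minority(samples, n_new, k=2, seed=0):
    """Create n_new synthetic samples, each lying on the segment between a
    randomly chosen sample and one of its k nearest neighbours.

    Generic SMOTE-style interpolation; an assumption for illustration only.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other samples by Euclidean distance to the base sample.
        others = sorted(
            (s for s in samples if s is not base),
            key=lambda s: math.dist(base, s),
        )
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical fault-class feature vectors (e.g. two features per signal window).
fault_samples = [[0.0, 1.0], [0.2, 1.1], [0.1, 0.9]]
new_samples = interpolate_minority(fault_samples, n_new=5)
```

Each synthetic vector stays inside the region spanned by existing fault samples, which is why this family of methods is usually preferred over plain duplication when the minority class is very small.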