Abstract
In recent years, as big data analysis technology has developed, user profiling has become increasingly mature, and mining the hidden information in data has become a popular research topic. Drawing on user profiling technology, this paper proposes a student profiling modeling method based on the C4.5 decision tree algorithm. The study takes undergraduate students of the School of Computer and Electronic Information at Guangxi University as the research sample and collects their behavioral data to build a dataset. The data are preprocessed with mean imputation, and the cleaned data are then used to train the model and tune its parameters, yielding the final model used for testing. The model classifies students according to a three-level labeling system with an accuracy of 62.3%. Finally, the paper analyzes and summarizes the results and identifies directions for future work.
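The two technical steps named in the abstract, mean imputation and the C4.5 split criterion, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names (`mean_fill`, `entropy`, `gain_ratio`) and the toy inputs are hypothetical, and the sketch covers only categorical features, whereas C4.5 also handles continuous attributes and pruning.

```python
import math
from collections import Counter


def mean_fill(column):
    """Replace missing entries (None) with the mean of the observed values.

    Assumes the column contains at least one observed value.
    """
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(feature, labels):
    """C4.5 split criterion for one categorical feature:
    information gain divided by split information."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Split information is the entropy of the feature's own value distribution.
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature cannot split the data
    return (entropy(labels) - remainder) / split_info
```

When building a tree, C4.5 evaluates this ratio for each candidate feature at a node and splits on the feature with the highest value; dividing by split information penalizes features with many distinct values, which plain information gain tends to favor.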
Authors
代洪伟
梁文栊
苏森
陈剑炜
DAI Hongwei; LIANG Wenlong; SU Sen; CHEN Jianwei (School of Computer and Electronic Information, Guangxi University, Nanning 530000, China)
Source
《长江信息通信》
2024, Issue 9, pp. 68-70, 98 (4 pages in total)
Changjiang Information & Communications
Keywords
big data analytics
user profiling
student profiling
C4.5 decision tree algorithm
labeling system
student classification
Author biographies
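DAI Hongwei (b. 2003), male, from Bozhou, Anhui; undergraduate student; research interests: big data analytics and machine learning. LIANG Wenlong (b. 2002), male, from Guigang, Guangxi; undergraduate student; research interests: machine learning and data mining. SU Sen (b. 2003), male, from Dalian, Liaoning; undergraduate student; research interests: graph neural networks and machine learning. CHEN Jianwei (b. 2003), male, from Yulin, Guangxi; undergraduate student; research interests: big data analytics and machine vision.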