Abstract
Driver behavior is a critical factor in road safety, highlighting the need for advanced methods in Distracted Driving Classification (DDC). In this study, we introduce DDC-Chat, a novel classification method based on a visual large language model (VLM). DDC-Chat is an interactive multimodal system built upon LLaVA-Plus and fine-tuned specifically for distracted driving detection. It uses logical reasoning chains to activate visual skills, including segmentation and pose detection, through end-to-end training. Furthermore, instruction tuning allows DDC-Chat to continuously incorporate new visual skills, enhancing its ability to classify distracted driving behavior. Our extensive experiments demonstrate that DDC-Chat achieves state-of-the-art performance on public DDC datasets, surpassing previously reported results. In evaluations on the 100-Driver dataset, the model exhibits superior results in both zero-shot and few-shot learning contexts, establishing it as a valuable tool for improving driving safety by accurately identifying driver distraction. Because inference is computationally intensive, DDC-Chat is optimized for deployment on remote servers, with data streamed from in-vehicle monitoring systems for real-time analysis.
Funding
Supported by the National Natural Science Foundation of China (62173253, 52272374);
the Research and Practice Project of New Engineering in Ordinary Undergraduate Universities in Guangxi Zhuang Autonomous Region (XGK202310);
educational reform projects (JGT202302, JGKQ202309);
and the 2024 Guangxi Collegiate Innovation and Entrepreneurship Training Project "Eye-Smart Driving-Fatigue Driving Monitoring and Warning System Based on Computer Vision" (Project No. S202410595158).
About the author
Corresponding author: Kuoyi Lin. E-mail address: 21005005@guet.edu.cn.