Abstract
Objective: To address the poor model robustness, weak generalization, and low loading rate of existing three-dimensional (3D) packing algorithms, an online packing algorithm with an unsupervised integration mechanism is designed. Methods: Taking full account of the real-time requirement that cargo be stacked as it arrives, and with container space utilization as the optimization objective, the stacking process of online 3D packing was formulated as a Markov decision process within an end-to-end learning framework based on an unsupervised deep fusion pointer network. The reinforcement learning elements were designed accordingly, and the agent's decision actions were trained mainly with a deep reinforcement learning algorithm, augmented by Monte Carlo tree search, to produce an online 3D packing model with good learning ability. Results: The model was tested on a dataset of randomly generated cargo covering 125 different sizes and orientations, under 7 constraint conditions. The experiments showed that the average container utilization reached 84.6%. Conclusion: The algorithm generalizes well, and its loading rate is far higher than that of the currently best-performing heuristic algorithms and deep learning methods, providing a theoretical basis and reference for the online packing of cargo.
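The Markov decision process formulation described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' implementation: the class `OnlineBin`, the functions `lowest_top_policy` and `pack_online`, the integer height-map state, and the reward defined as the per-step gain in utilization are all hypothetical. The sketch checks only containment and non-overlap (no stability or other constraints), and a simple greedy rule stands in for the learned pointer-network policy with Monte Carlo tree search.

```python
from itertools import permutations


class OnlineBin:
    """Hypothetical container; the MDP state is an integer height map."""

    def __init__(self, length, width, height):
        self.length, self.width, self.height = length, width, height
        self.heights = [[0] * width for _ in range(length)]
        self.packed_volume = 0

    def utilization(self):
        # Fraction of the container volume occupied by packed boxes.
        return self.packed_volume / (self.length * self.width * self.height)

    def base(self, x, y, l, w):
        # A box rests on the highest cell under its footprint
        # (simplification: support and stability are not checked).
        return max(self.heights[i][j]
                   for i in range(x, x + l) for j in range(y, y + w))

    def feasible_actions(self, box):
        # An action is (x, y, orientation); all axis-aligned rotations are tried.
        actions = []
        for l, w, h in sorted(set(permutations(box))):
            for x in range(self.length - l + 1):
                for y in range(self.width - w + 1):
                    if self.base(x, y, l, w) + h <= self.height:
                        actions.append((x, y, (l, w, h)))
        return actions

    def step(self, action):
        # Apply the placement and return the reward (assumed here to be
        # the increase in utilization caused by this box).
        x, y, (l, w, h) = action
        top = self.base(x, y, l, w) + h
        for i in range(x, x + l):
            for j in range(y, y + w):
                self.heights[i][j] = top
        before = self.utilization()
        self.packed_volume += l * w * h
        return self.utilization() - before


def lowest_top_policy(container, actions):
    # Greedy stand-in for a learned policy: keep the stack as low as possible.
    return min(actions, key=lambda a: (
        container.base(a[0], a[1], a[2][0], a[2][1]) + a[2][2], a[0], a[1]))


def pack_online(container, boxes, policy):
    # Boxes arrive one at a time; the episode ends when a box cannot be placed.
    for box in boxes:
        actions = container.feasible_actions(box)
        if not actions:
            break
        container.step(policy(container, actions))
    return container.utilization()


container = OnlineBin(2, 2, 2)
print(pack_online(container, [(2, 2, 1), (2, 2, 1)], lowest_top_policy))
```

In a learned setting, `lowest_top_policy` would be replaced by a trained agent that scores the feasible actions, and the final utilization would serve as the quantity being optimized.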
Authors
ZHANG Changyong; YAO Kaichao; WANG Tong (College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China)
Source
Packaging Engineering
CAS
PKU Core (北大核心)
2024, No. 11, pp. 153-162 (10 pages)
Funding
Central Universities High-Level Cultivation Project (3122023PY04).
Keywords
online 3D packing
unsupervised integration mechanism
Markovian decision
pointer network
Monte Carlo tree search