Abstract
Cranes are key heavy-duty material handling equipment, widely used in workshops, warehouses, ports, and other industrial settings. Crane scheduling significantly affects transportation efficiency and the achievement of production goals. To address the crane scheduling problem with time windows (CSP-TW), a mixed-integer linear programming model based on spatio-temporal discretization is developed. Based on the characteristics of this model, a hierarchical reinforcement learning (HRL) decision-making framework is designed. The high-level decision network assigns transportation tasks to appropriate cranes, while the low-level network plans the path each crane follows to complete its assigned task. During learning, action tabu rules are introduced to rule out ineffective actions and guide the decision networks toward the dominant policy space. The decision networks are then trained using an external experience pool and the dueling double deep Q-network strategy. Experiments were conducted on the logistics simulation platform of a steel plant. Ablation experiments show that introducing action tabu rules improves learning efficiency. Training comparisons indicate that HRL converges better than the end-to-end framework. Comparative experiments demonstrate that HRL outperforms several methods, including multi-rule combinations, meta-heuristic algorithms, the end-to-end framework, and the deep Q-network, while meeting the second-level (on the order of seconds) response-time requirements of practical applications.
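The abstract does not give the tabu rules or the network architectures, so the following is only a minimal sketch of how action tabu masking could combine with a dueling double deep Q-network target. All function names, the boolean mask representation, and the toy numbers are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()


def masked_greedy(q_values, tabu_mask):
    """Return the best action among those not flagged as tabu.

    tabu_mask[i] is True when action i is forbidden (hypothetical encoding).
    """
    tabu_mask = np.asarray(tabu_mask, dtype=bool)
    if tabu_mask.all():
        raise ValueError("every action is tabu in this state")
    masked = np.where(tabu_mask, -np.inf, np.asarray(q_values, dtype=float))
    return int(np.argmax(masked))


def double_dqn_target(reward, gamma, q_online_next, q_target_next,
                      tabu_mask_next, done):
    """Double DQN target: the online network selects the next action
    (restricted to non-tabu actions), the target network evaluates it."""
    if done:
        return float(reward)
    best = masked_greedy(q_online_next, tabu_mask_next)
    return float(reward + gamma * q_target_next[best])


# Toy usage with made-up numbers: action 2 is tabu in the next state.
q_next_online = dueling_q(1.0, [1.0, 2.0, 3.0])
target = double_dqn_target(
    reward=1.0,
    gamma=0.9,
    q_online_next=q_next_online,
    q_target_next=np.array([4.0, 3.0, 100.0]),
    tabu_mask_next=[False, False, True],
    done=False,
)
```

In this sketch the mask is applied at action-selection time, so a tabu action is never chosen even if its estimated Q-value is the largest. In a hierarchical setup the same helpers could in principle be reused at both levels, with one mask over cranes for task assignment and another over moves for path planning, but that mapping is an assumption as well.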
| Translated title of the contribution | Hierarchical reinforcement learning-based optimization method for crane scheduling |
|---|---|
| Original language | Traditional Chinese |
| Pages (from-to) | 2261-2273 |
| Number of pages | 13 |
| Journal | Kongzhi Lilun Yu Yingyong/Control Theory and Applications |
| Volume | 42 |
| Issue | 11 |
| DOI | |
| Publication status | Published - 2025 |
Keywords
- action tabu
- crane scheduling
- hierarchical reinforcement learning
- path planning
- task assignment