RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding across Different Environments and Tasks

Yimin Tang*, Xiao Xiong*, Jingyi Xi, Jiaoyang Li, Erdem Bıyık, Sven Koenig.

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025.
A short version appeared at the Symposium on Combinatorial Search (SoCS), pages 273-274, 2025.


Multi-Agent Path Finding (MAPF), which focuses on finding collision-free paths for multiple robots, is crucial for applications ranging from aerial swarms to warehouse automation. Solving MAPF is NP-hard, so learning-based approaches to MAPF have gained attention, particularly those leveraging deep neural networks. Nonetheless, despite the community’s continued efforts, all learning-based MAPF planners still rely on decentralized planning due to variability in the number of agents and map sizes. We have developed RAILGUN, the first centralized learning-based policy for the MAPF problem. RAILGUN is a map-based policy rather than an agent-based one. By leveraging a CNN-based architecture, RAILGUN can generalize across different maps and handle any number of agents. We collect trajectories from rule-based methods to train our model in a supervised manner. In experiments, RAILGUN outperforms most baseline methods and demonstrates strong zero-shot generalization on tasks, maps, and agent numbers that were not seen in the training dataset.
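
To illustrate why a map-based convolutional policy does not depend on the number of agents or the map size, here is a minimal sketch in plain Python. It is not the RAILGUN architecture: the four input channels, the five-action set, the single 3x3 convolution, and the random untrained weights are all assumptions made for illustration, and every function name is hypothetical. The sketch only shows the interface: the policy maps a grid of features to a grid of actions, and each agent reads the action at its own cell. It does not resolve collisions.

```python
import random

ACTIONS = ["stay", "up", "down", "left", "right"]  # assumed action set


def sign(v):
    return (v > 0) - (v < 0)


def encode(grid, agents, goals):
    """Build a 4 x H x W input (assumed channels): obstacle, agent present,
    sign of the row offset to the goal, sign of the column offset to the goal."""
    h, w = len(grid), len(grid[0])
    x = [[[0.0] * w for _ in range(h)] for _ in range(4)]
    for r in range(h):
        for c in range(w):
            x[0][r][c] = float(grid[r][c])
    for (r, c), (gr, gc) in zip(agents, goals):
        x[1][r][c] = 1.0
        x[2][r][c] = float(sign(gr - r))
        x[3][r][c] = float(sign(gc - c))
    return x


def conv3x3(x, weights, bias):
    """Zero-padded 3x3 convolution: C_in x H x W -> C_out x H x W."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    out = []
    for o in range(len(weights)):
        plane = [[bias[o]] * w for _ in range(h)]
        for i in range(c_in):
            for r in range(h):
                for c in range(w):
                    acc = 0.0
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            rr, cc = r + dr, c + dc
                            if 0 <= rr < h and 0 <= cc < w:
                                acc += weights[o][i][dr + 1][dc + 1] * x[i][rr][cc]
                    plane[r][c] += acc
        out.append(plane)
    return out


def make_weights(c_in, c_out, seed=0):
    """Random, untrained weights; a stand-in for a trained network."""
    rng = random.Random(seed)
    weights = [[[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
                for _ in range(c_in)] for _ in range(c_out)]
    return weights, [0.0] * c_out


def act(grid, agents, goals, weights, bias):
    """Return a per-cell action map and the action read off at each agent's cell."""
    logits = conv3x3(encode(grid, agents, goals), weights, bias)
    h, w = len(grid), len(grid[0])
    action_map = [[max(range(len(ACTIONS)), key=lambda a: logits[a][r][c])
                   for c in range(w)] for r in range(h)]
    return action_map, [ACTIONS[action_map[r][c]] for r, c in agents]


weights, bias = make_weights(4, len(ACTIONS))

# Same weights, two different map sizes and agent counts.
small = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
amap, acts = act(small, [(0, 0), (2, 2)], [(2, 2), (0, 0)], weights, bias)

large = [[0] * 6 for _ in range(4)]
amap2, acts2 = act(large, [(0, 0), (1, 3), (3, 5)],
                   [(3, 5), (0, 0), (1, 1)], weights, bias)
```

Because the convolution's weights are shared across all cells, the same parameters apply unchanged to the 3x3 map with two agents and the 4x6 map with three agents. In this sketch, adding agents changes only the input channels, not the model.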

@inproceedings{ TangIROS25railgun,
  author    = "Yimin Tang and Xiao Xiong and Jingyi Xi and Jiaoyang Li and Erdem Bıyık and Sven Koenig",
  title     = "RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding across Different Environments and Tasks",
  booktitle = "Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)",
  pages     = "",
  year      = "2025",
  doi       = "",
}
