TY - GEN
T1 - Effective Crash Recovery of Robot Software Programs in ROS
AU - Zou, Yong Hao
AU - Bai, Jia Ju
N1 - Publisher Copyright: © 2021 IEEE
PY - 2021
Y1 - 2021
AB - Modern robot systems use various software programs to autonomously perform different kinds of tasks. However, because faults and errors are always possible, a robotic software program will inevitably crash in some cases, causing the robot system to fail to perform its current task. Thus, for robustness, the crashed program should be correctly recovered so that it can continue the failed task. For this purpose, ROS provides a default restart method that automatically restarts crashed programs. However, our case studies of typical ROS programs show that the restart method can perform incorrect crash recovery and can even cause the robot to behave dangerously, because it loses important program data that was stored before the crash and is needed after recovery. To solve this problem, we develop a practical approach named RORY to perform effective crash recovery of robot software programs in ROS. RORY uses a hybrid checkpoint-replay method, and it is generic across different ROS programs because it takes ROS properties into account. We evaluate RORY on 6 common ROS programs and show that it performs correct crash recovery in both virtual and realistic environments with modest overhead. The comparison experiments indicate that RORY outperforms the restart, checkpoint-alone, and replay-alone methods.
UR - https://www.scopus.com/pages/publications/85125463810
U2 - 10.1109/ICRA48506.2021.9560876
DO - 10.1109/ICRA48506.2021.9560876
M3 - Conference contribution
AN - SCOPUS:85125463810
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 9498
EP - 9504
BT - 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
Y2 - 30 May 2021 through 5 June 2021
ER -