
DRLMutation: A Comprehensive Framework for Mutation Testing in Deep Reinforcement Learning Systems

Jiapeng Li, Zheng Zheng*, Xiaoting Du, Haoyu Wang, Yanwen Liu

*Corresponding author for this work

  • Beihang University
  • Beijing University of Posts and Telecommunications

Research output: Contribution to journal › Article › peer-review

Abstract

Deep reinforcement learning systems are increasingly applied in various domains. Testing them, however, remains a major open research problem. Mutation testing is a popular test suite evaluation technique that measures the extent to which test suites detect injected faults. It has been widely studied for both traditional software and deep learning. However, deep reinforcement learning systems differ fundamentally from both traditional software and deep learning systems in aspects such as environment interaction, network decision-making, and data efficiency, so previous mutation testing techniques cannot be applied to them directly. In this article, we propose DRLMutation, a comprehensive mutation testing framework designed specifically for deep reinforcement learning systems, to fill this gap. We first consider the characteristics of deep reinforcement learning and, based on both the training process and the model of the trained agent, examine combinations along three dimensions: objects, operation methods, and injection methods. This approach leads to a more comprehensive design methodology for deep reinforcement learning mutation operators. After filtering, we identify a total of 107 applicable deep reinforcement learning mutation operators. For evaluation, we formulate a set of metrics tailored to assessing test suites. Finally, we validate the stealthiness and effectiveness of the proposed mutation operators in the Cart Pole, Mountain Car Continuous, Lunar Lander, Breakout, and CARLA environments. The results show that the majority of the designed mutation operators can undermine the decision-making capabilities of the agent without affecting normal training. The varying degrees of disruption caused by these mutation operators can be used to assess the quality of different test suites.
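To make the general idea of mutation testing concrete, the following is a minimal Python sketch, not the DRLMutation operators or metrics themselves, which the abstract does not specify. It assumes a toy linear policy, three hypothetical model-level mutation operators (`negate_weights`, `zero_weights`, `scale_weights`), and a simple deterministic kill criterion (the mutant picks a different action on at least one test state). All names are illustrative. A realistic criterion for stochastic agents would likely compare outcomes over repeated episodes instead.

```python
from typing import Callable, List, Sequence

State = Sequence[float]
Policy = Callable[[State], int]


def make_policy(weights: Sequence[float]) -> Policy:
    """Toy linear policy: action 1 if the weighted sum of the state is positive, else 0."""
    def policy(state: State) -> int:
        score = sum(w * s for w, s in zip(weights, state))
        return 1 if score > 0 else 0
    return policy


# Hypothetical mutation operators acting on the policy's weights.
def negate_weights(weights: Sequence[float]) -> List[float]:
    return [-w for w in weights]


def zero_weights(weights: Sequence[float]) -> List[float]:
    return [0.0 for _ in weights]


def scale_weights(weights: Sequence[float]) -> List[float]:
    # Scaling by a positive constant never changes the chosen action,
    # so this mutant is behaviorally equivalent to the original.
    return [2.0 * w for w in weights]


def is_killed(original: Policy, mutant: Policy, test_states: Sequence[State]) -> bool:
    """Assumed kill criterion: the mutant disagrees with the original on some test state."""
    return any(original(s) != mutant(s) for s in test_states)


def mutation_score(original: Policy, mutants: Sequence[Policy],
                   test_states: Sequence[State]) -> float:
    """Fraction of mutants that the test suite kills."""
    killed = sum(is_killed(original, m, test_states) for m in mutants)
    return killed / len(mutants)


base_weights = [1.0, -0.5]
original = make_policy(base_weights)
mutants = [make_policy(op(base_weights))
           for op in (negate_weights, zero_weights, scale_weights)]

strong_suite = [(1.0, 0.0), (0.0, 1.0)]  # exercises both actions
weak_suite = [(0.0, 1.0)]                # exercises only action 0

strong_score = mutation_score(original, mutants, strong_suite)  # 2 of 3 mutants killed
weak_score = mutation_score(original, mutants, weak_suite)      # 1 of 3 mutants killed
print(round(strong_score, 3), round(weak_score, 3))
```

The two suites receive different scores against the same mutants, which is the sense in which mutation results can rank test suites. The equivalent `scale_weights` mutant is never killed, so the achievable score here is capped below 1.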

Original language: English
Article number: 220
Journal: ACM Transactions on Software Engineering and Methodology
Volume: 34
Issue number: 8
State: Published - 6 Oct 2025

Keywords

  • Deep Reinforcement Learning
  • Mutation Operators
  • Mutation Testing
  • Software Testing
