Explanations for Unrealizability of Infinite-State Safety Shields

Andoni Rodríguez, Irfansha Shaik, Davide Corsi, Roy Fox, César Sánchez

November 2025

Abstract

Safe Reinforcement Learning focuses on developing optimal policies while ensuring safety. A popular method to address this task is shielding, in which a correct-by-construction safety component is synthesized from logical specifications. Recently, shield synthesis has been extended to infinite-state domains, such as continuous environments, which makes shielding more applicable to realistic scenarios. However, shields can often be unrealizable because the specification is inconsistent (e.g., contradictory). To address this gap, we present a method, based on temporal formula unrolling, to obtain simple unconditional and conditional explanations that witness unrealizability. In this paper, we show different variants of the technique and its applicability.

Type

Conference paper

Publication

Proc. of the 22nd Int’l Conf. on Principles of Knowledge Representation and Reasoning, KR in Planning and Scheduling (KR'25), pp. 858-868, 2025

César Sánchez

Professor

My research focuses on formal methods, in particular logic, automata and game theory. Temporal logics for Hyperproperties. Reactive Synthesis Modulo Theories. Applications to Blockchain.