**Recorded video of the entire program**

### Goal of the workshop


Efficient artificial-intelligence techniques are increasingly applied to problems that are (theoretically) hard or even undecidable, usually without any formal correctness guarantees. This shortcoming raises the following question: Can AI borrow from verification techniques, where correctness is of paramount importance, and thus provide correctness guarantees? On the other hand, verification algorithms provide correctness but often suffer from scalability issues. This raises a complementary question: Can we leverage machine learning techniques to implement better, more efficient verification tools?

The iVerif workshop aims at bringing together researchers working on different aspects of AI for verification, the verification of AI-enabled systems, and safe AI, amongst others.

### Invited speakers

- Dana Fisman, Ben-Gurion University, Israel
- Jan Křetínský, Technical University of Munich, Germany
- Pavithra Prabhakar, Kansas State University, USA
- Ashutosh Trivedi, University of Colorado Boulder, USA

### Short student talks

- Debraj Chakraborty, Université Libre de Bruxelles, Belgium
- Florent Delgrange, Vrije Universiteit Brussel, Belgium
- Priyanka Golia, Indian Institute of Technology Kanpur, India & National University of Singapore, Singapore
- Dolav Nitay, Ben-Gurion University, Israel
- Ritam Raha, University of Antwerp, Belgium & LaBRI, France
- Thiago D. Simão, Delft University of Technology, Netherlands

### Panel discussion

The panel discussion will be centred around the topics of the talks. A central question in this regard is whether we can use machine learning techniques to build efficient and scalable verification tools. More specific questions include: Is it possible to retain guarantees, and what kind of guarantees can we expect to have? Can we use synthesis algorithms on models that are learnt? Which gives better guarantees in practice: model-free RL or model-based RL? Is scalability a problem for learning real-world systems? Can we verify complex neural networks, what are the challenges, and how well does it scale? How do we verify real-world cyber-physical systems with AI components?

##### Panelists

- Supratik Chakraborty (Chair), Indian Institute of Technology Bombay, India
- Nils Jansen, Radboud University Nijmegen, Netherlands
- Jan Křetínský, Technical University of Munich, Germany
- Pavithra Prabhakar, Kansas State University, USA
- Ashutosh Trivedi, University of Colorado Boulder, USA

### Programme

The workshop will consist of invited talks by eminent researchers, short invited presentations by students, and a panel discussion on relevant topics.

| IST | CET | Speaker | Title |
| --- | --- | --- | --- |
| 1320 - 1330 | 0850 - 0900 | Guillermo A. Pérez | Opening remarks |
| **Session 1** | Chair: Guillermo A. Pérez | | |
| 1330 - 1415 | 0900 - 0945 | Dana Fisman | Can one infer an implementation of a black-box reactive system just by observing examples? |
| **Session 2** | Chair: Paritosh Pandya | | |
| 1415 - 1500 | 0945 - 1030 | Jan Křetínský | Abstracting neurons empirically: ML helps verifying ML |
| 1500 - 1515 | 1030 - 1045 | Short break | |
| **Session 3: Short talks** | Chair: C Aiswarya | | |
| 1515 - 1530 | 1045 - 1100 | Thiago D. Simão | Safety Abstractions in Constrained Reinforcement Learning |
| 1530 - 1545 | 1100 - 1115 | Debraj Chakraborty | Safe Learning for Near Optimal Scheduling |
| 1545 - 1600 | 1115 - 1130 | Priyanka Golia | Manthan: A Data-Driven Approach for Boolean Functional Synthesis |
| 1600 - 1615 | 1130 - 1145 | Short break | |
| **Session 4: Short talks** | Chair: Sebastian Junges | | |
| 1615 - 1630 | 1145 - 1200 | Dolav Nitay | Learning of Structurally Unambiguous Probabilistic Grammars |
| 1630 - 1645 | 1200 - 1215 | Ritam Raha | Scalable Anytime Algorithms for Learning Formulas in Linear Temporal Logic |
| 1645 - 1700 | 1215 - 1230 | Florent Delgrange | Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes |
| 1700 - 1815 | 1230 - 1345 | Long break / Poster session | |
| **Session 5** | Chair: Deepak D'Souza | | |
| 1815 - 1900 | 1345 - 1430 | Pavithra Prabhakar | Network reduction techniques for Scalable Verification of AI-Controlled Cyber-Physical Systems |
| **Session 6** | Chair: Shibashis Guha | | |
| 1900 - 1945 | 1430 - 1515 | Ashutosh Trivedi | Reinforcement Learning and Formal Requirements |
| 1945 - 2015 | 1515 - 1545 | Tea break + breakout rooms | |
| **Session 7** | Chair: Supratik Chakraborty | | |
| 2015 - 2100 | 1545 - 1630 | Supratik Chakraborty (chair), Nils Jansen, Jan Křetínský, Pavithra Prabhakar, Ashutosh Trivedi | Panel discussion |
| 2100 - 2110 | 1630 - 1640 | Shibashis Guha | Closing remarks |

### Talk titles and abstracts

##### Safe Learning for Near Optimal Scheduling (Debraj Chakraborty)

We investigate the combination of synthesis, learning, and online sampling techniques to obtain safe and near-optimal schedulers for a preemptible task scheduling problem. We provide probably approximately correct (PAC) guarantees for learning the model as a Markov decision process (MDP). Our algorithms can handle MDPs larger than those that state-of-the-art probabilistic model checkers can handle. Additionally, we extend Monte Carlo tree search with symbolic advice, computed using safety games or obtained using the earliest-deadline-first scheduler, to safely explore the learned model online. Finally, we compare our algorithms empirically against shielded deep Q-learning on large task systems.

This talk is based on the following papers:

- Safe Learning for Near-Optimal Scheduling. D. Busatto-Gaston, D. Chakraborty, S. Guha, G. A. Pérez, J.-F. Raskin. QEST 2021
- Monte Carlo Tree Search Guided by Symbolic Advice for MDPs. D. Busatto-Gaston, D. Chakraborty, J.-F. Raskin. CONCUR 2020
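As a toy illustration of the PAC flavour of the model-learning step described above, the sketch below estimates the successor distribution of a single state-action pair of an invented two-state MDP from simulator samples, using a Hoeffding-style sample bound. All names and the toy MDP are made up for illustration; this is not code from the paper.

```python
import math
import random
from collections import Counter

# Toy MDP: in state "s", action "a" moves to "t" with prob. 0.7, stays with prob. 0.3.
TRUE_P = {("s", "a"): {"t": 0.7, "s": 0.3}}

def sample_successor(state, action, rng):
    """Simulator access: draw one successor state (the learner never reads TRUE_P)."""
    p = TRUE_P[(state, action)]
    return rng.choices(list(p), weights=list(p.values()))[0]

def pac_sample_size(eps, delta, n_succ):
    # Hoeffding + union bound: this many samples make every estimated
    # successor probability eps-accurate with probability >= 1 - delta.
    return math.ceil(math.log(2 * n_succ / delta) / (2 * eps ** 2))

def estimate(state, action, eps=0.05, delta=0.01, seed=0):
    rng = random.Random(seed)
    n = pac_sample_size(eps, delta, len(TRUE_P[(state, action)]))
    counts = Counter(sample_successor(state, action, rng) for _ in range(n))
    return {s2: c / n for s2, c in counts.items()}

est = estimate("s", "a")
print(est)  # each estimated probability is eps-close to the true one, w.h.p.
```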

##### Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes (Florent Delgrange)

We consider the challenge of policy simplification and verification in the context of policies learned through reinforcement learning (RL) in continuous environments. In well-behaved settings, RL algorithms have convergence guarantees in the limit. While these guarantees are valuable, they are insufficient for safety-critical applications. Furthermore, they are lost when applying advanced techniques such as deep RL. To recover guarantees when applying advanced RL algorithms to more complex environments with (i) reachability, (ii) safety-constrained reachability, or (iii) discounted-reward objectives, we build upon the DeepMDP framework introduced by Gelada et al. to derive new bisimulation bounds between the unknown environment and a learned discrete latent model of it. Our bisimulation bounds enable the application of formal methods for Markov decision processes. Finally, we show how one can use a policy obtained via state-of-the-art RL to efficiently train a variational autoencoder that yields a discrete latent model with provably approximately correct bisimulation guarantees. Additionally, we obtain a distilled version of the policy for the latent model.

##### Can one infer an implementation of a black-box reactive system just by observing examples? (Dana Fisman)

Reactive systems are modeled by languages of infinite words (omega-languages). If the black-box system implements a regular language of finite words, it can be inferred using an active learning algorithm that runs in polynomial time and asks a number of queries polynomial in the size of the returned DFA (deterministic finite automaton). While some automata accepting infinite words have the same structure as a DFA (in particular DBAs, deterministic Büchi automata), it can be shown that DBAs cannot be polynomially learned. In this talk we will get acquainted with the ideas behind learning algorithms for regular languages, the obstacles in devising learning algorithms for regular omega-languages, and state-of-the-art results on the learnability of acceptors of regular omega-languages.

A good place to get familiar with the subject is:

- Inferring regular languages and ω-languages. Dana Fisman. J. Log. Algebraic Methods Program. 98: 27-49 (2018)

##### Manthan: A Data-Driven Approach for Boolean Functional Synthesis (Priyanka Golia)

Synthesis is a fundamental problem in which, given a set of inputs, a set of outputs, and a relational specification between them, the objective is to synthesize the outputs in terms of the inputs such that the specification is met.

In this talk, I will discuss a data-driven approach called Manthan for Boolean functional synthesis. Manthan views functional synthesis as a classification problem, relying on advances in constrained sampling for data generation, and advances in automated reasoning for a novel proof-guided refinement and provable verification. On an extensive and rigorous evaluation over 609 benchmarks, we demonstrate that Manthan significantly improves upon the current state of the art, solving 509 benchmarks compared to 280, which is the most solved by a state-of-the-art technique. The significant performance improvements call for interesting future work at the intersection of machine learning, constrained sampling, and automated reasoning.

This talk is based on the following papers:

- Manthan: A Data-Driven Approach for Boolean Function Synthesis. P. Golia, S. Roy, K. Meel. CAV 2020
- Engineering an Efficient Boolean Functional Synthesis Engine. P. Golia, F. Slivovsky, S. Roy, K. Meel. ICCAD 2021
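The pipeline described above (sample satisfying assignments, learn a candidate function, then verify) can be illustrated on a toy specification. The lookup-table "classifier" below is a deliberately simplified stand-in for Manthan's decision-tree learning, and exhaustive enumeration stands in for its automated-reasoning-based verification; none of the names come from the tool itself.

```python
import itertools
import random

def spec(x1, x2, y):
    # Relational specification: y must equal x1 XOR x2.
    return y == (x1 ^ x2)

# 1. Data generation: sample satisfying assignments
#    (a toy stand-in for constrained sampling).
rng = random.Random(1)
data = []
for _ in range(50):
    x1, x2 = rng.randint(0, 1), rng.randint(0, 1)
    y = next(v for v in (0, 1) if spec(x1, x2, v))
    data.append(((x1, x2), y))

# 2. "Learning": fit a lookup table mapping inputs to the sampled output.
candidate = {xs: y for xs, y in data}

# 3. Verification: exhaustively check the candidate function against the spec.
cex = [xs for xs in itertools.product((0, 1), repeat=2)
       if xs not in candidate or not spec(*xs, candidate[xs])]
print("verified" if not cex else f"counterexamples: {cex}")
```

In Manthan, a counterexample would trigger proof-guided refinement of the candidate rather than a restart.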

##### Abstracting neurons empirically: ML helps verifying ML (Jan Křetínský)

While abstraction is a classic tool for scaling up verification, its use for verifying neural networks is only very recent. We introduce an abstraction framework based on the semantics rather than the syntax of the network layers. For the particular case of ReLU, we also provide error bounds incurred by the abstraction. We show how the abstraction reduces the size of the network while preserving its accuracy, and how verification results on the abstract network can be transferred back to the original network. The talk is based on "DeepAbstract" (ATVA'20) and ongoing work.

A good place to get familiar with the subject is:

- DeepAbstract: Neural Network Abstraction for Accelerating Verification. P. Ashok, V. Hashemi, J. Křetínský, S. Mohr. ATVA 2020
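A minimal sketch of the semantic idea, on an invented two-layer ReLU network (this is not DeepAbstract's implementation): two hidden neurons with identical activation behaviour are merged by summing their outgoing weights, which here preserves the network's outputs exactly. DeepAbstract merges merely *similar* neurons and bounds the resulting error.

```python
def relu(v):
    return max(0.0, v)

# Hidden layer weights (one row per neuron) and output weights.
W1 = [[1.0, 0.0],   # neuron 0
      [1.0, 0.0],   # neuron 1: semantically identical to neuron 0
      [0.0, 1.0]]   # neuron 2
W2 = [0.5, 0.5, 2.0]

def forward(W1, W2, x):
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return sum(w * h for w, h in zip(W2, hidden))

# Semantic abstraction: neurons 0 and 1 have equal activations on every input,
# so merge them into one neuron and sum their outgoing weights (0.5 + 0.5).
W1_abs = [W1[0], W1[2]]
W2_abs = [W2[0] + W2[1], W2[2]]

for x in [(1.0, 2.0), (-3.0, 0.5), (2.0, -1.0)]:
    assert abs(forward(W1, W2, x) - forward(W1_abs, W2_abs, x)) < 1e-9
print("merged network agrees with the original on all test inputs")
```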

##### Learning of Structurally Unambiguous Probabilistic Grammars (Dolav Nitay)

The problem of identifying a probabilistic context-free grammar has two aspects: the first is determining the grammar's topology (the rules of the grammar) and the second is estimating probabilistic weights for each rule. Given the hardness results for learning context-free grammars in general, and probabilistic grammars in particular, most of the literature has concentrated on the second problem. In this work we address the first problem. We restrict attention to structurally unambiguous weighted context-free grammars (SUWCFG) and provide a query learning algorithm for structurally unambiguous probabilistic context-free grammars (SUPCFG). We show that SUWCFGs can be represented using co-linear multiplicity tree automata (CMTA), and provide a polynomial learning algorithm that learns CMTAs. We show that the learned CMTA can be converted into a probabilistic grammar, thus providing a complete algorithm for learning a structurally unambiguous probabilistic context-free grammar (both the grammar topology and the probabilistic weights) using structured membership queries and structured equivalence queries.

This talk is based on the following paper:

- Learning of Structurally Unambiguous Probabilistic Grammars. D. Nitay, D. Fisman, M. Ziv-Ukelson. AAAI 2021

##### Network reduction techniques for Scalable Verification of AI-Controlled Cyber-Physical Systems (Pavithra Prabhakar)

Traditional feedback controllers in cyber-physical systems (CPS) are increasingly being replaced by learning-based components such as artificial neural networks (NN). Given the safety criticality of AI-controlled CPS such as autonomous vehicles, rigorous analysis techniques such as formal verification have become imperative. While there is a rich body of work on formal verification of traditional CPS in the realm of hybrid systems, the area of rigorous testing and analysis of CPS controlled by AI components such as neural networks is still in its nascent stages.

One of the main challenges with formal verification of neural network controlled CPS is the scalability of the analysis methods to large networks and complex dynamics. We present our recent results on network reduction techniques that reduce the size of neural networks while providing theoretical guarantees. First, we present the foundations of network reduction, by exploring the notions of bisimulation and approximate bisimulation for neural networks that relate "nearby" neurons in terms of their effect on the output reachability. These results could be used toward network reduction with bounded error guarantees.

Next, we present a novel abstraction technique for reducing the size of a neural network for output range analysis; it provides soundness guarantees for safety analysis and indicates a promising direction for scalable analysis of closed-loop systems. Our abstraction constructs a simpler neural network with fewer neurons, albeit with interval weights, called an interval neural network (INN), which over-approximates the output range of the given neural network. We reduce the output range analysis problem on INNs to solving a mixed-integer linear programming problem. Our experimental results highlight the trade-off between computation time and the precision of the computed output range.

The talk is based on the following papers:

- Bisimulations for Neural Network Reduction. P. Prabhakar. VMCAI 2022
- Abstraction Based Output Range Analysis for Neural Networks. P. Prabhakar, Z. Afzal. NeurIPS 2019
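Interval arithmetic, a basic ingredient behind such over-approximations, can be sketched as follows. This toy code pushes input intervals through the fixed point weights of an invented two-neuron ReLU layer; it is only a simplified cousin of the INN construction, which puts the intervals on the weights and solves a MILP for the output range.

```python
def imul(iv, w):
    # Interval times scalar weight (the sign of w may flip the endpoints).
    lo, hi = iv
    return (min(lo * w, hi * w), max(lo * w, hi * w))

def iadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def irelu(iv):
    return (max(0.0, iv[0]), max(0.0, iv[1]))

def layer_range(W, input_ivs):
    # Interval of each neuron's output: relu(sum_i w_i * x_i), x_i in input_ivs[i].
    out = []
    for row in W:
        acc = (0.0, 0.0)
        for w, iv in zip(row, input_ivs):
            acc = iadd(acc, imul(iv, w))
        out.append(irelu(acc))
    return out

# x1 in [0, 1], x2 in [-1, 1]; one hidden ReLU layer, then a linear output.
hidden = layer_range([[2.0, -1.0], [1.0, 1.0]], [(0.0, 1.0), (-1.0, 1.0)])
out = iadd(imul(hidden[0], 1.0), imul(hidden[1], -3.0))
print(out)  # (-6.0, 3.0): a sound over-approximation of the output range
```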

##### Scalable Anytime Algorithms for Learning Formulas in Linear Temporal Logic (Ritam Raha)

In this talk, I will introduce a novel algorithm for learning an explainable model, in the form of a linear temporal logic (LTL) formula, from a set of positive and negative finite system behaviours, a.k.a. traces. LTL is a prominent logic for specifying temporal properties, widely used in artificial intelligence, program verification, motion planning in robotics, process mining, and many other areas. Despite growing interest from the research community, existing solutions suffer from two limitations: they do not scale beyond small formulas, and they may exhaust computational resources without returning any result. Our algorithm addresses both issues: it can construct formulas an order of magnitude larger than previous methods, and it is anytime, meaning that in most cases it successfully outputs a formula, albeit possibly not of minimal size. We have implemented our algorithm in a prototype named SCARLET, and I will present an evaluation of its performance on publicly available benchmarks.

This talk is based on the following work:

- Scalable Anytime Algorithms for Learning Formulas in Linear Temporal Logic. R. Raha, R. Roy, N. Fijalkow, and D. Neider
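To make the learning problem concrete, here is a tiny evaluator, invented for illustration, for a small fragment of LTL over finite traces (atoms, F "eventually", G "globally"), together with a candidate formula that separates a toy positive/negative sample, which is exactly the kind of consistency check an LTL learner performs on each candidate.

```python
def holds(formula, trace, i=0):
    """Check a formula (nested tuples) on a finite trace (list of sets of atoms)."""
    op = formula[0]
    if op == "ap":
        return formula[1] in trace[i]
    if op == "F":  # eventually: true at some position j >= i
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "G":  # globally: true at every position j >= i
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    raise ValueError(f"unknown operator: {op}")

positives = [[{"p"}, {"p", "q"}], [{"q"}, {"p"}]]
negatives = [[{"q"}, {"q"}], [set(), {"q"}]]

phi = ("F", ("ap", "p"))  # candidate formula: "eventually p"
assert all(holds(phi, t) for t in positives)
assert not any(holds(phi, t) for t in negatives)
print("F p separates the sample")
```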

##### Safety Abstractions in Constrained Reinforcement Learning (Thiago D. Simão)

Constrained reinforcement learning (CRL) agents can trade off performance and safety autonomously based on the constrained Markov decision process (CMDP) framework, since it decouples safety from reward. Unfortunately, most CRL agents only guarantee safety after the learning phase. In this work, we investigate settings where a concise abstract model of the safety aspects is given, a reasonable assumption since a thorough understanding of safety-related matters is a prerequisite for deploying CRL in typical applications. We propose a CRL algorithm that uses this abstract model to learn policies for CMDPs safely, that is, without violating the constraints during the learning phase.

This talk is based on the following work:

- AlwaysSafe: Reinforcement Learning Without Safety Constraint Violations During Training. T. D. Simão, N. Jansen and M. T. J. Spaan. AAMAS 2021.
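The general idea of exploiting a given safety model during learning can be sketched on a toy chain MDP; all details below are invented for illustration and this is not the AlwaysSafe algorithm itself. An action mask derived from the abstract model keeps tabular Q-learning from ever entering the unsafe state while it optimises for reward.

```python
import random

# States 0..4 on a line; actions move -1 or +1. State 4 is the goal,
# state 0 is unsafe. The abstract safety model is known before learning.
ACTIONS = (-1, 1)

def safe_actions(state):
    # Abstract model: never take an action that steps onto the unsafe state 0.
    return [a for a in ACTIONS if state + a != 0]

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(5) for a in ACTIONS}
    for _ in range(episodes):
        s = 2
        for _ in range(20):
            acts = safe_actions(s)  # mask: explore only actions deemed safe
            if rng.random() < 0.2:
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda act: Q[(s, act)])
            s2 = max(0, min(4, s + a))
            r = 1.0 if s2 == 4 else 0.0
            target = r + 0.9 * max(Q[(s2, b)] for b in safe_actions(s2))
            Q[(s, a)] += 0.5 * (target - Q[(s, a)])
            s = s2
            if s == 4:
                break
    return Q

Q = train()
# The greedy safe policy from the start state moves right, towards the goal.
assert Q[(2, 1)] > Q[(2, -1)]
```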

##### Reinforcement Learning and Formal Requirements (Ashutosh Trivedi)

Reinforcement learning is an approach to controller synthesis in which agents rely on reward signals to choose actions that satisfy the requirements implicit in those signals. Often, non-experts have to come up with the requirements and their translation to rewards under significant time pressure, even though manual translation is time-consuming and error-prone. For safety-critical applications of reinforcement learning, a rigorous design methodology is needed and, in particular, a principled approach to requirement specification and to the translation of objectives into the form required by reinforcement learning algorithms.

Formal logic provides a foundation for the rigorous and unambiguous requirement specification of learning objectives. However, reinforcement learning algorithms require requirements to be expressed as scalar reward signals. We discuss a recent technique, called limit-reachability, that bridges this gap by faithfully translating logic-based requirements into the scalar reward form needed in model-free reinforcement learning. This technique enables the synthesis of controllers that maximize the probability to satisfy given logical requirements using off-the-shelf, model-free reinforcement learning algorithms.

This talk is based on the following papers:

- Reinforcement Learning and Formal Requirements. Fabio Somenzi, Ashutosh Trivedi. NSV 2019
- Omega-Regular Objectives in Model-Free Reinforcement Learning. Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak. TACAS 2019
- Model-Free Reinforcement Learning for Stochastic Parity Games. Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, Dominik Wojtczak. CONCUR 2020
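The translation from a logical requirement to a scalar reward can be illustrated with a toy reachability requirement; this is a simplified stand-in for the limit-reachability construction, not the construction itself. A two-state monitor automaton for "eventually g" runs in product with the environment and pays reward exactly once, when it first accepts.

```python
def aut_step(q, label):
    """Monitor automaton for 'eventually g': state 1 is accepting and absorbing."""
    return 1 if q == 1 or label == "g" else 0

def reward(q, q2):
    # Scalar reward on the product: paid once, on the transition into acceptance.
    return 1.0 if q == 0 and q2 == 1 else 0.0

labels = {0: "", 1: "", 2: "g"}  # environment state 2 carries the label g

# Walk a run 0 -> 1 -> 2 through the product and collect reward.
q, total = 0, 0.0
for s in [0, 1, 2]:
    q2 = aut_step(q, labels[s])
    total += reward(q, q2)
    q = q2
print(total)  # 1.0: the run satisfies "eventually g" and earns reward once
```

Maximising expected reward in the product then coincides with maximising the probability of satisfying the requirement, which is what lets off-the-shelf model-free RL algorithms be used.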

### Organizers

- Shibashis Guha, Tata Institute of Fundamental Research, India
- Guillermo A. Pérez, University of Antwerp, Belgium

### Registration

Participation in the workshop is open to everyone and will be limited only by logistic constraints. Registration is free but mandatory. Please register using the FSTTCS registration page.