A Platform for Characterizing Artificial Intelligence Safety and
Robustness developped at CEA List
Trustworthiness of AI-based systems
The trustworthiness of an AI-based software is a wide-reaching term covering several key aspects. As an example:
intelligibility, e.g., through explainability and interpretability of research,
security, e.g., through the resistance to, the ability to recover from, or to remain in a degraded but acceptable state during malicious attacks
fairness, e.g., through the absence of bias according to social or biological attributes
CAISAR is an open-source platform under active development that focuses on another paramount aspect of trustworthiness: the Characterization of AI systems' Robustness and Safety. CAISAR provides a unified entry point for defining verification problems by using WhyML, the mature and expressive language of the Why3 verification platform.
While the current frenzy of AI trustworthiness is mostly focused neural networks, our industrial partners can also use other types of AI (e.g. SVM, XGBoost) in their products. This is why CAISAR targets a wider range of AI systems.
Modular and extensible
Written in the functional language OCaml, adding a verification, analysis or testing software to CAISAR's toolsuit is made easier through a unified interface, and an instantiation guided by data-types.
Maintainable
Functional programming provides easy mathematical reading, lowering the entry barrier for understanding the inner workings of CAISAR. Strong typing also minimizes errors with informative messages.
Standard oriented
By relying on AI standards (ONNX, NNnet) and formal methods standards (SMT, CP), CAISAR maximizes the potential for inclusiveness. Any tool that supports these standards is a potential addition to the CAISAR platform.
Interoperability
An internal representation for property language and AI representation helps reinforce the synergy between the different tools in CAISAR, where one analyser can rely on, and complement, the partial output of another.
Open-source
Our vision of CAISAR is collaborative in essence. To encourage cooperation and build a community of trust-minded entities, an easy access to the platform, supplied with documentations and tutorials, is our priority.
An expressive specification language
CAISAR provides a powerful specification language tailored for the
specification of machine learning programs. Building upon WhyML, CAISAR add
operations on vectors, composability of machine learning programs, dataset
specification helpers while keeping the soundness of Why3's logic.
Want to know more? See the language reference for more details.
Supported tools
A wide range of tools
CAISAR orchestrates and composes state-of-the-art machine learning verification tools which, individually, are not able to efficiently handle all problems but, collectively, can cover a growing number of properties. Our aim is to assist, on the one hand, the V&V process by reducing the burden of choosing the methodology tailored to a given verification problem, and on the other hand the tools developers by factorizing useful features-visualization, report generation, property description-in one platform.
AIMOS
AIMOS (AI Metamorphism Observing Software) is a tool which goal
is to provide the means to test metamorphic properties on a dataset
for a given AI model. The approach is black-box as the intrinsic
characteristics of the AI model is not used in any way,
as the model is treated as a black-box oracle with
only its inference function.
Its approach improves the usual approaches which only...
Colibri & Co
The Colibri suite of constraint solvers is designed to assist with formal method tasks by providing the reasoning capabilities necessary to verify software and provide reasoning capabilities within critical systems.
In other words:
Colibri is a set of tools that can be used to prove that software and systems meet certain requirements. It does this by modeling the software or ...
Marabou
Marabou is an SMT-based tool that answers queries about
neural networks and their properties.
Queries are transformed into constraint satisfaction problems.
Marabou takes as input networks with various piece-wise
linear activation functions and with fully connected topology.
Marabou first applies multiple pre-processing steps to infer bounds
for each node in the network.
Next it appl...
PyRAT
The Python Reachability Analysis Tool (PyRAT) is a
neural network verification tool developed internally at
CEA. It is based on the technique called abstract interpretation.
PyRAT aims to
determine whether a certain state can be reached in a given neural network.
To do so, it propagates different types of abstract domains along
the different layers of the neural network up until the ...
SAVer
SAVer (SVM Abstract Verifier) is an abstract interpretation based tool for proving properties of Support Vector Machines.
It can handle all sorts of reachability properties.
CAISAR uses SAVer to compute properties on Support Vector
Machine components.
Publications and presentations
Here is a list of publications and presentation featuring CAISAR.
CAISAR: A platform for Characterizing Artificial Intelligence Safety and Robustness
Julien Girard-Satabin, Michele Alberti, François Bobot, Zakaria Chihani, Augustin Lemesle
AI Safety, Vienna, 2022
Caractériser des propriétés de confiance d’IA avec Why3
Julien Girard-Satabin, Michele Alberti, François Bobot, Zakaria Chihani
JFLA, Praz-sur-Arly, 2023
Multiple provers, multiple answers: formulating and solving verification problems for diverse artificial intelligence components using CAISAR
Julien Girard-Satabin, Michele Alberti, François Bobot, Zakaria Chihani
Workshop on Safe and Robust Machine Learning, Edinburgh, 2023
Formuler et prouver des propriétés pour des composants basés IA avec différents prouveurs
Julien Girard-Satabin, Michele Alberti, François Bobot, Zakaria Chihani
Journées GDR GPL LVP-MTV2, Saclay, 2023
Neural Network Verification is a Programming Language Challenge
Lucas C. Cordeiro, Matthew L. Daggitt, Julien Girard-Satabin, Omri Isac, Taylor T. Johnson, Guy Katz, Ekaterina Komendantskaya, Augustin Lemesle, Edoardo Manino, Artjoms Šinkarovs, Haoze Wu
ESOP, 2025
Composer une salade CAISAR avec des réseaux de neurones et de l’édition de graphe
Michele Alberti, François Bobot, Julien Girard-Satabin, Alban Grastien, Aymeric Varasse, Zakaria Chihani