A Platform for Characterizing Artificial Intelligence Safety and
Robustness developed at CEA List
Trustworthiness of AI-based systems
The trustworthiness of AI-based software is a wide-reaching term covering several key aspects, for example:
intelligibility, e.g., through explainability and interpretability,
security, e.g., through the ability to resist malicious attacks, to recover from them, or to remain in a degraded but acceptable state while under attack,
fairness, e.g., through the absence of bias with respect to social or biological attributes.
CAISAR is an open-source platform under active development that focuses on another paramount aspect of trustworthiness: the Characterization of AI systems' Robustness and Safety. CAISAR provides a unified entry point for defining verification problems by using WhyML, the mature and expressive language of the Why3 verification platform.
While current interest in AI trustworthiness is mostly focused on neural networks, our industrial partners also use other types of AI models (e.g. SVM, XGBoost) in their products. This is why CAISAR targets a wider range of AI systems.
Modular and extensible
CAISAR is written in the functional language OCaml. Adding a verification, analysis or testing tool to CAISAR's tool suite is made easier by a unified interface and an instantiation guided by data types.
Maintainable
Functional programming reads close to its mathematical formulation, lowering the entry barrier for understanding the inner workings of CAISAR. Strong typing also helps catch errors early, with informative messages.
Standard oriented
By relying on AI standards (ONNX, NNet) and formal methods standards (SMT, CP), CAISAR stays as inclusive as possible: any tool that supports these standards is a potential addition to the CAISAR platform.
Interoperability
A common internal representation for both properties and AI models reinforces the synergy between the different tools in CAISAR: one analyser can rely on, and complement, the partial output of another.
Open-source
Our vision of CAISAR is collaborative in essence. To encourage cooperation and build a community of trust-minded entities, easy access to the platform, together with documentation and tutorials, is our priority.
An expressive specification language
CAISAR provides a powerful specification language tailored to machine learning programs. Building upon WhyML, CAISAR adds operations on vectors, composability of machine learning programs, and dataset specification helpers, while keeping the soundness of Why3's logic.
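To give a flavour of such specifications, here is a minimal, purely illustrative sketch of a local robustness property written in WhyML. All the types and symbols it declares (nn, classify, linf_distance) are hypothetical placeholders standing in for CAISAR's actual standard library; the real modules and helpers are described in the language reference.

(* Illustrative sketch only: every declaration below is a hypothetical
   placeholder for CAISAR's actual built-in theories. *)
theory LocalRobustness
  use int.Int
  use real.RealInfix

  type input                                   (* an input vector *)
  type nn                                      (* a neural network *)

  constant model : nn                          (* in CAISAR, bound to e.g. an ONNX file *)
  function classify (n: nn) (x: input) : int   (* index of the predicted class *)
  function linf_distance (x y: input) : real   (* l-infinity distance between inputs *)

  constant eps : real = 0.01

  (* Local robustness around an input x: every input x' within an
     l-infinity ball of radius eps is assigned the same class as x. *)
  goal robust_around:
    forall x x': input.
      linf_distance x x' <=. eps ->
      classify model x' = classify model x
end

In an actual CAISAR specification, the model and the dataset would be provided through the platform's helpers rather than declared by hand, and such a goal could then be dispatched to any of the supported provers.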
Want to know more? See the language reference for more details.
Supported tools
A wide range of tools
CAISAR orchestrates and composes state-of-the-art machine learning verification tools which, individually, are not able to efficiently handle all problems but, collectively, can cover a growing number of properties. Our aim is to assist, on the one hand, the V&V process, by reducing the burden of choosing the methodology tailored to a given verification problem, and, on the other hand, tool developers, by factorizing useful features (visualization, report generation, property description) in one platform.
AIMOS
AIMOS (AI Metamorphism Observing Software) is a tool whose goal is to provide the means to test metamorphic properties on a dataset for a given AI model. The approach is black-box: the intrinsic characteristics of the AI model are not used in any way, as the model is treated as a black-box oracle through its inference function only.
Its approach improves on the usual approaches, which only...
Colibri & Co
The Colibri suite of constraint solvers is designed to assist with formal methods tasks by providing the reasoning capabilities necessary to verify software and critical systems.
In other words:
Colibri is a set of tools that can be used to prove that software and systems meet certain requirements. It does this by modeling the software or ...
Marabou
Marabou is an SMT-based tool that answers queries about
neural networks and their properties.
Queries are transformed into constraint satisfaction problems.
Marabou takes as input networks with various piece-wise
linear activation functions and with fully connected topology.
Marabou first applies multiple pre-processing steps to infer bounds
for each node in the network.
Next it appl...
PyRAT
The Python Reachability Analysis Tool (PyRAT) is a
neural network verification tool developed internally at
CEA. It is based on the technique called abstract interpretation.
PyRAT aims to
determine whether a certain state can be reached in a given neural network.
To do so, it propagates different types of abstract domains along
the different layers of the neural network up until the ...
SAVer
SAVer (SVM Abstract Verifier) is an abstract interpretation based tool for proving properties of Support Vector Machines.
It can handle all sorts of reachability properties.
CAISAR uses SAVer to compute properties on Support Vector
Machine components.
Publications and presentations
Here is a list of publications and presentations featuring CAISAR.
CAISAR: A platform for Characterizing Artificial Intelligence Safety and Robustness
Julien Girard-Satabin, Michele Alberti, François Bobot, Zakaria Chihani, Augustin Lemesle
AI Safety, Vienna, 2022
Caractériser des propriétés de confiance d’IA avec Why3
Julien Girard-Satabin, Michele Alberti, François Bobot, Zakaria Chihani
JFLA, Praz-sur-Arly, 2023
Multiple provers, multiple answers: formulating and solving verification problems for diverse artificial intelligence components using CAISAR
Julien Girard-Satabin, Michele Alberti, François Bobot, Zakaria Chihani
Workshop on Safe and Robust Machine Learning, Edinburgh, 2023
Formuler et prouver des propriétés pour des composants basés IA avec différents prouveurs
Julien Girard-Satabin, Michele Alberti, François Bobot, Zakaria Chihani