4.4. Confidence-based properties on COMPAS

Formal verification of machine learning programs is usually limited to local robustness-like properties [Casadio2022]. As we have seen in (Local) Robustness of MNIST dataset, when specifying local robustness the specification only holds on a given point and its \(\epsilon\)-bounded neighborhood. Such specifications fall short when one's objective is to verify the program's behaviour on a whole input domain.

One main issue is that formulating a global robustness property is non-trivial. Consider the following formulation of a hypothetical “global robustness”:

\(\forall x,x' \in X. \ \lVert x - x' \rVert_\infty \leq \epsilon \Rightarrow C(x) = C(x')\).

This property is always false for any non-constant classification model. Indeed, it is always possible to find a pair of points that are arbitrarily close to each other but lie on opposite sides of the decision boundary of the program, which results in different classifications. Thus, in order for this property to be effective, it needs to exclude the neighborhood of the decision boundary.
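
For instance, consider a one-dimensional binary classifier \(C\) over \(\mathbb{R}\) that outputs class \(1\) whenever \(x \geq t\), for some fixed threshold \(t\), and class \(0\) otherwise. For any \(\epsilon > 0\), taking \(x = t - \epsilon/2\) and \(x' = t + \epsilon/2\) gives \(\lVert x - x' \rVert_\infty \leq \epsilon\) while \(C(x) \neq C(x')\), so the formulation above is falsified no matter how small \(\epsilon\) is.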

The notion of confidence-based safety properties was introduced in [Athavale2024]. In such a property, the confidence refers to the confidence level of the prediction made by the neural network. By adding this notion of confidence to a global robustness specification, it is possible to specify the behaviour away from the decision boundary, thus resolving the aforementioned issue.

CAISAR provides the possibility for a user to verify the confidence-based global robustness of a model. In the following, we describe in detail what a confidence-based global property is, and how to use CAISAR to verify such a property on a neural network, using the COMPAS use case. For more details on the rationale and underlying formalism CAISAR uses, see [Ardouin2026].

4.4.1. Use case presentation

COMPAS is a neural network that, based on a selection of features such as gender and race, assesses the probability of criminal defendants committing future crimes [1].

The model used is a feedforward neural network with 6 inputs, 3 outputs, and 2 hidden layers of 5 neurons each (available at examples/confidence_properties/nnet/compas.nnet).

4.4.2. Properties

The idea of confidence-based safety properties is to add a constraint to the original safety property of interest: the confidence score \(\text{conf}\) of the network's prediction on the input considered in the property must be above a given threshold \(\kappa\). This notion is also applicable to global safety properties such as global robustness or global fairness. A neural network \(f:\mathbb{R}^n\rightarrow\mathbb{R}^m\) is globally safe for confidence \(\kappa\in]0,1]\) if and only if \(\forall \mathbf{x},\mathbf{x}'.P(\mathbf{x},\mathbf{x}')\land\text{conf}(f(\mathbf{x}))>\kappa\Rightarrow Q(\mathbf{x},\mathbf{x}')\), where \(P\) and \(Q\) are the respective precondition and postcondition of the global safety property.
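
For instance, for a global fairness property, \(P(\mathbf{x},\mathbf{x}')\) would typically state that \(\mathbf{x}\) and \(\mathbf{x}'\) differ only in a protected attribute (such as gender or race), and \(Q(\mathbf{x},\mathbf{x}')\) that \(\text{class}(f(\mathbf{x})) = \text{class}(f(\mathbf{x}'))\).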

In CAISAR, it is possible to verify the confidence-based global robustness of a model. A neural network \(f:\mathbb{R}^n\rightarrow\mathbb{R}^m\) is globally \(\varepsilon\)-robust for confidence \(\kappa\in]0,1]\) if and only if \(\forall \mathbf{x},\mathbf{x}'.||\mathbf{x}-\mathbf{x}'||\leq\varepsilon\land\text{conf}(f(\mathbf{x}))>\kappa\Rightarrow\text{class}(f(\mathbf{x})) = \text{class}(f(\mathbf{x}'))\), where \(\varepsilon>0\) and \(||\cdot||\) is the distance metric over the input space of the neural network \(f\).

The concept of confidence-based global robustness is illustrated here:

Confidence-based global robustness illustration

In this figure, \(||\mathbf{c}-\mathbf{c}'||\leq\varepsilon\), \(||\mathbf{a}-\mathbf{a}'||\leq\varepsilon\) and \(||\mathbf{b}-\mathbf{b}'||\leq\varepsilon\). Since \(\text{conf}(f(\mathbf{c}))\leq0.51\) and \(\text{conf}(f(\mathbf{c}'))\leq 0.51\), \((\mathbf{c},\mathbf{c}')\) is not a counterexample for global robustness with confidence, even though \(\text{class}(f(\mathbf{c}))\neq\text{class}(f(\mathbf{c}'))\). Similarly, \((\mathbf{a},\mathbf{a}')\) is not a counterexample, since \(\text{class}(f(\mathbf{a}))=\text{class}(f(\mathbf{a}'))\). However, \(\text{conf}(f(\mathbf{b}))>0.51\) and \(\text{class}(f(\mathbf{b}))\neq\text{class}(f(\mathbf{b}'))\): \((\mathbf{b},\mathbf{b}')\) is a counterexample, and therefore \(f\) is not globally robust with confidence \(\kappa=0.51\) and tolerance \(\varepsilon\).

The confidence metric can be defined and chosen by the user. For now, CAISAR provides a single confidence metric, based on applying a softmax layer to the last layer of the neural network.
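
Concretely, this softmax-based confidence is typically the largest softmax score over the output classes: for output logits \(f(\mathbf{x}) = (z_1,\dots,z_m)\), \(\text{conf}(f(\mathbf{x})) = \max_{1\leq i\leq m} \frac{e^{z_i}}{\sum_{j=1}^{m} e^{z_j}}\). For a 3-class network such as COMPAS, this value always lies in \([\frac{1}{3}, 1]\).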

4.4.3. Modelling the problem using WhyML

As described in the examples on Functional properties of ACAS-Xu and (Local) Robustness of MNIST dataset, we first need to write a specification file containing a WhyML theory that describes the verification problem.

The CAISAR standard library provides theories for the specification of confidence-based properties (caisar.confidence.ClassGlobalRobustConf and caisar.confidence.ConfidenceMeasure). Most predicates are taken from these theories.

We start by importing all necessary theories.

use ieee_float.Float64
use caisar.model.Model
use caisar.types.IntWithBounds as Label
use caisar.types.Vector
use caisar.types.VectorFloat64 as FeatureVector
use caisar.confidence.ClassGlobalRobustConf
use caisar.confidence.ConfidenceMeasure

We then introduce constants that will be defined by the user on the command line when executing CAISAR.

constant model_filename: string
constant eps: Float64.t
constant threshold: Float64.t
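
These constants are left undefined in the theory; they are instantiated through the --define option when calling CAISAR, for instance (as in the verification commands below):

--define eps:1.0 --define threshold:0.5 --define model_filename:nnet/compas.nnet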

As COMPAS has three output classes, we specify the bounds of the label as follows.

constant label_bounds: Label.bounds =
  Label.{ lower = 0; upper = 2 }

The input variables of this neural network have different bounds. To take them into account, the user has to specify the bounds of each input component. To do so, the user can define a record type bounds as follows.

type bounds = {
  b0: Float64.t;
  b1: Float64.t;
  b2: Float64.t;
  b3: Float64.t;
  b4: Float64.t;
  b5: Float64.t
}

Then, the user can specify the input bounds of the model, using this type.

constant lbounds: bounds = {
  b0 = (0.0:t);
  b1 = (0.0:t);
  b2 = (0.0:t);
  b3 = (0.0:t);
  b4 = (1.0:t);
  b5 = (0.0:t)
}
constant ubounds: bounds = {
  b0 = (1.0:t);
  b1 = (8.0:t);
  b2 = (1.0:t);
  b3 = (6.0:t);
  b4 = (4.0:t);
  b5 = (2.0:t)
}

Using these constants, the user can define a predicate for checking the validity of input bounds.

predicate valid_input_bounds (x: FeatureVector.t) =
  (x[0] .>= lbounds.b0 /\ x[0] .<= ubounds.b0) /\
  (x[1] .>= lbounds.b1 /\ x[1] .<= ubounds.b1) /\
  (x[2] .>= lbounds.b2 /\ x[2] .<= ubounds.b2) /\
  (x[3] .>= lbounds.b3 /\ x[3] .<= ubounds.b3) /\
  (x[4] .>= lbounds.b4 /\ x[4] .<= ubounds.b4) /\
  (x[5] .>= lbounds.b5 /\ x[5] .<= ubounds.b5)

The user can now formalize the verification goal using this predicate and the ones defined in the confidence-based theories.

goal conf_global_robustness:
  let nn = read_model model_filename in
  let len = 6 in
  let bounds = (fun x -> valid_input_bounds x) in
  let conf_measure = (fun m x -> softmax_confidence m x) in
  confidence_globally_robust bounds
                             label_bounds
                             lbounds
                             ubounds
                             len
                             nn
                             eps
                             threshold
                             conf_measure

confidence_globally_robust is a predicate defined in caisar.confidence.ClassGlobalRobustConf and implements the property of confidence-based global robustness.

The confidence measure is given by conf_measure and is selected from the caisar.confidence.ConfidenceMeasure theory. Currently, only softmax_confidence, the confidence measure based on the softmax layer, is available.

4.4.4. Verifying the property with CAISAR

For now, only two provers are able to verify confidence-based global robustness in CAISAR: PyRAT and Maraboupy. This can be done as follows:

$ caisar verify -p PyRAT --prover-altern confidence -m:16GB -t:60s --define eps:1.0 --define threshold:0.5 --define model_filename:nnet/compas.nnet examples/confidence_properties/compas.why
[caisar] Goal conf_global_robustness: Invalid

Maraboupy has a small internal issue that is fixed by specifically asking CAISAR not to use \(\texttt{Sub}\) ONNX nodes in the reformulation of an operator. This is currently done by setting the environment variable CAISAR_CLIP_ADD_ONLY:

$ export CAISAR_CLIP_ADD_ONLY=TRUE && caisar verify -p maraboupy -t:60s --define eps:1.0 --define threshold:0.5 --define model_filename:nnet/compas.nnet examples/confidence_properties/compas.why
[caisar] [WARNING] Clip rewriting: all Sub operators are expressed using Add operators
[caisar] Goal conf_global_robustness: Unknown ()

The PyRAT run above reports the goal as invalid: CAISAR states that there exists a classification change for the given confidence threshold of \(0.5\) and neighborhood radius of \(1.0\). This is not surprising, given that a softmax output of \(0.5\) is rather low for 3 classes, and that the allowed distance between the two inputs is very large. Indeed, as defined previously, the first coordinate of the input lies between \(0\) and \(1\). By specifying such a high value for \(\epsilon\), we cover an input neighborhood that may well cross the decision boundary.
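
To put the threshold in perspective, a maximally uncertain 3-class prediction has softmax scores \((\frac{1}{3},\frac{1}{3},\frac{1}{3})\), i.e. a confidence of roughly \(0.33\); a threshold of \(0.5\) therefore only excludes inputs on which the network is barely more confident than chance.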

By shrinking the input neighborhood and raising the confidence threshold, we can prove the property valid.

$ export CAISAR_CLIP_ADD_ONLY=TRUE && caisar verify -p maraboupy -t:60s --define eps:0.001000000047497451305389404296875 --define threshold:0.89999997615814208984375 --define model_filename:nnet/compas.nnet examples/confidence_properties/compas.why
[caisar] [WARNING] Clip rewriting: all Sub operators are expressed using Add operators
[caisar] Goal conf_global_robustness: Valid

Here, we are able to prove that, on a sufficiently small neighborhood and with a sufficiently high confidence threshold, the classification does not change. Note that the long decimal literals passed to --define are exactly representable binary floating-point values close to \(0.001\) and \(0.9\).

[Athavale2024]

Anagha Athavale, Ezio Bartocci, Maria Christakis, Matteo Maffei, Dejan Nickovic, Georg Weissenbacher, Verifying Global Two-Safety Properties in Neural Networks with Confidence, Computer Aided Verification - 36th International Conference, CAV 2024, Montreal, QC, Canada, July 24-27, 2024, Proceedings, Part II, volume 14682 of Lecture Notes in Computer Science, pages 329–351. Springer, 2024.

[Casadio2022]

Marco Casadio, Ekaterina Komendantskaya, Matthew L. Daggitt, Wen Kokke, Guy Katz, Guy Amir, Idan Refaeli, Neural Network Robustness as a Verification Property: A Principled Case Study, Computer Aided Verification - 34th International Conference, CAV 2022, Proceedings, Lecture Notes in Computer Science. Springer, 2022.