Symbolic program analysis as Satisfiability Modulo Theories Nikolaj Bjrner Microsoft Research Based on joint work with Krytof Hoder, Ken McMillan, Leonardo de Moura, Andrey Rybalchenko Background: Z3 - Efficient SMT Solver Many custom solvers: Free functions Linear Arithmetic Bit-vectors Algebraic data-types

Arrays Polynomials Quantifiers Several Applications: Analysis, Testing, Leonardo de Moura, B, Christoph Wintersteiger from http://rise4fun.com/z3 Tools using Z3 features Engine SLAyer

Arrray s Simplifie r BitVector s Cores Arith meti c

Quanti fier Inst Quantifierelim API SAGE Models Proofs Isabelle HOL4

Tools using Z3 for fixedpointsSLAyer Methodology Sep. Logic Simulation Relation Summarie s Abstractio n

Refinemen t Houdini GateKeeper SAGE Fixed-Point Abstract Interpretatio n Logic Programmin g

Predicate Based MC BDD MC Datalog Havoc Poirot Corral Interpolatin g MC Engines for Recursive Predicates Contract Points-to

analysis Checking Symbolic Software Checking Z3 Property Directed Datalog + Relational domains Reachability solver Services for other solvers

(Quantifier elimination, Fold-unfold simplification) Engines for Recursive Some anecdotal Predicates Recursive predicates: experience Expressed as Horn clauses + query Points-to Z: Portfolio of solvers and fixed-points:

analysis GateKeeper (sparse hashtables) services Magnus for Madsen Bottom-up Datalog Engine Contract KOP2 database

(using magic sets) DKAL - FiniteContract Tables (e.g., Hash-tables, B-Trees)[Hoder, Logic) Logic)Bjrner, Checking - Symbolic Tables (e.g., BDDs) Moura] Bebopde

benchmarks - Composition of Relations: (evaluate PDR - AbstractSymbolic interpretation domains generalized to PDA) - Reduced products Software Checking Symbolic Engine Modulo Theories (encoding

CAV Primal 2011 Infon (encoding Primal Infon Corral samples SAT 2012 (evaluate PDR Bjrner] Modulo[Hoder, Arithmetic

) - Generalized Property Directed Reachability Motivation: Recursive Procedures mc(x) = x-10 if x > 100 mc(x) = mc(mc(x+11)) if x 100 assert (mc(x) 91) Motivation: Recursive Procedures Formulate as Horn clauses.

mc() mc() mc() mc() mc() Solve for mc Motivation: Recursive Procedures Formulate as Predicate Transformer: Check: Motivation: Recursive Procedures

Instead of computing then checking Suffices to find post-fixed point satisfying: Program Verification as SMT Program Verification (Safety) as Solving least fixed-points as Satisfiability of Horn clauses [Bjrner, McMillan, Rybalchenko, SMT workshop 2012] Hilbert Sausage Factory: [Grebenshchikov, Lopes, Popeea, Rybalchenko et.al. PLDI 2012 ]

Old but New Should really not be a surprise: - 90s Program Analyses using Datalog - Existential Fixedpoint Logic for Hoare Logic Gurevich] [Blass, Induction-less induction, Under-appreciated: - Many language-specific tools using custom analysis - .. but there has to be a catch [FOL < FOL+Transitivity] - A flurry of recent progress on Modern Symbolic Model checking tools/algorithms.

Claim: they are all strategies for Horn Clause satisfiability. - The Quest: Horn Clause Satisfiability Verification Tool Workflow HAVOC Program annotated with inductive invariants Verification condition

Dafny Verification Tool Workflow Houdini HAVOC Program partially annotated with inductive invariants Verification condition

Slicing Corral Inductive variable selection Dafny Envisioned: Verification Tool Workflow Verification Condition Generators can already produce Horn Clauses

Corral HAVOC Program partially annotated with inductive invariants Why, LLVM Horn Clauses Dualit Kind Leon HSF

y Aligato r Synerg UFO MCMT IC3 y SAFARI Dafny Procedures Horn Formulas

Summary as commands Verifying procedure calls Modular Concurrency Horn Clauses [Predicate Abstraction and Refinement for Verifying Multi-Threaded Programs Gupta, Popeea, Rybalchenko, POPL 2011]

Clauses Horn { : | ( )} { :| ( , )} { :| ()} { : | ( , )} Extract sufficient Horn Conditions Generalized Horn Formulas In a nutshell, solving partial correctness amounts to checking truth value of formulas of the form: E.g., satisfiability of:

Generalized Horn Formulas Handling background axioms: Remark: Abductive Logic Programming amounts to symbolic simulation: - is consistent eg. solve for negation of above formula: A New PDR Engine for Fixedpoints PDR (aka. IC3) Property Directed Reachability algorithm Breakthrough in Symbolic Model Checking of Hardware [Aaron Bradley, VMCAI 2011]

Original Algorithm Description main in code. Transition Decomposes steps Tough to digest. Rule +queue strategy description could help System priority deconstruct the steps. Formulation

Original Algorithm Applies to vs. Hardware (Finite State Automata). Procedures Regular Push Down systems Software has procedure calls. Beyond Linearis Real Arithmetic Original Algorithm

for Finite State Systems Propositional - Timed Automata Decision Procedure Open question what it meant to incorporate Logic - Interpolants from models Infinite State systems (= theories)

[Hoder & Bjrner, SAT 2012] PDR as a Transition System Objective is to solve for R such that Elements of PDR encoded as transitions: Over-approximate reachable states Search for counter-examples to PDR as a Transition System

Objective is to solve for R such that Initialize: F ( ) Main invariant:

+ F ( ) PDR a visual overview Is Search for over-approximations of states valid? PDR Is

Initially: N = 0, start with valid? PDR Is valid? Unfold to the next level if PDR Is

valid? Main Invariant is established for N = 1 PDR Is valid? Model PDR Is

valid? C, PDR Is valid? Unfold to the next level if PDR Is

Etc. valid? PDR Is Etc. valid? PDR Is

valid? implies Valid Formula is valid if is a post-fixed point PDR Is valid? Induction w

PDR Is valid? Induction w PDR Is valid? Induction w

PDR Is valid? Induction w PDR Is valid? Monotonicity of F

Induction w PDR Is valid? Induction w PDR Is valid?

Induction w PDR Is valid? Decide PDR Is valid?

Decide Non-linear fixed-points Recall: Is feasible? Start with summary feasible? Yes, e.g., Is

reachable? (in Non-linear transformers M(87) = M(M(98)) = M(M(M(109))) = M(M(99)) = M(M(M(110))) = M(M(100)) = M(M(M(111))) = M(M(101)) = M(91) = M(M(102))

= M(92) = M(M(103)) = M(93) R=90 Benchmarks from the SLAM Research toolkit Checking against controls depth, but potentially wide tree. Our approach: build DAG by sharing states. Sharing is cheap, even no sharing works on Bebop

Arithmetic R(0,0,0,0). Initial states Mutual Exclusion T(L,M,Y1,Y2,L,M,Y1,Y2)R(L,M,Y1,Y2) R(L,M,Y1,Y2) Reachable states R(2,2,Y1,Y2) false Is unsafe state reachable? Clauses have model Step(L,L,Y1,Y2,Y1) T(L,M,Y1,Y2,L,M,Y1,Y2) P1 takes a step Step(M,M,Y2,Y1,Y2) T(L,M,Y1,Y2,L,M,Y1,Y2)

P2 takes a step Step(0,1,Y1,Y2,Y2+1) (Y1 Y2 Y2 = 0) Step(1,2,Y1,Y2,Y1) Step(2,3,Y1,Y2,Y1) Step(3,0,Y1,Y2,0) Search: Mile-high perspective F()

2 F ( )Conflict B ( ) Conflict Conflict ResolutionPropagationPropagation

PDR(T): Conflict Resolution Conflict Resolution + 0 Conflict 0 Resolution Get Generalization from Farkas Lemma Eg., resolve away blue internal variables

PDR(T): Conflict Resolution = = = Conflict Resolution Conflict Propagation Conflict Propagation PDR(T): Generalization from Tlemmas Can we satisfy? Initial states Reachable states Unsafe state is unreachable is unsatisfiable

E.g., there is unsat core of: Unsat proof uses T-lemmas PDR(T): Generalization from Tlemmas Can we satisfy? Initial states Reachable states Unsafe state is unreachable Unsat proof uses T-lemmas PDR(LRA): Timed automata Observation: PDR + Model refinement using Farkas

strengthening is a decision procedure for timed push-down systems Justification: Every lemma produced is a sum of differences from the input ~ Acyclic path in difference graph. Finite set of Farkas lemmas possible. N+1 degrees of separation Objective: synthesize inductive invariant proving property. Reaching objective with interpolants:

Synthesize interpolants, use for proving invariants. Be admired. Synthesize interpolants, evaluate on random formulas. Admire them. Write papers about interpolants. Admire the theorems. Review papers about generating interpolants. Watch Kevin Bacon. Reaching objective with PDR: . Nevertheless, interpolants sneak in. What is a Craig Interpolant?

Suppose A Craig Interpolant is formula Horn version. Establish satisfiability of: and find solution for PDR(T): Interpolants as a side-effect Intermediary solutions: Observation: Farkas strengthening computes a DAG interpolant for LRA i.e., solves for non-recursive Horn clauses

Summary The question is: Quantified Horn Clause Satisfiability Modulo Theories PDR Generalized: - as an abstract Transition System - for Horn Clause Satisfiability over Theory of Arithmetic - Using Farkas to generalize failed counter-example traces Difference Logic a Model Checking algorithm for Timed Automata Interpolants from Model refinements - Propagate also properties for predicates (so far inefficient)

PDR as a Transition System Bottom-up Datalog: Engine Restarts Compilation Relational Algebra

Abstract Machine Bottom-up Datalog: Relations x 0 y 0 1 z

1 Bounds Intervals Pentagons = + =+