z3
To Solve Logic PuzzlesRecently, I learned about point-free programming. The day after I learned about it, I was idly wishing for a puzzle book on converting program code to point-free form. A point-free programming puzzle book would be a good exercise on how well I can compose functions. Sadly, I'm not aware of such a puzzle book, nor am I qualified to write one.
However, thinking of puzzles again reminded me of how I liked logic puzzle books when I was younger (anyone remember PennyPress?). A few years ago, I solved a logic puzzle using an SMT solver out of curiosity. I was motivated to do this after seeing Claire Wolf doing the same to solve a similar "guess the combination to the lock given hints" puzzle. I'd intended to write a blog post about how I mapped the word problem of a logic puzzle to an SMT query, but I never followed up on it.
I know my last post was in late 2019. 2020 through 2023 has not been kind to the world and our collective social/mental bandwidth. Mine included. So while the thought of fun logic puzzles from my youth were fresh in my head yesterday (2023-06-30), I forced myself remember how I solved a logic puzzle with an SMT solver 5 years ago. And at last I'm going to document it here.
This post assumes a background in Boolean logic
and/or one programming language (along with how to use its Boolean operators).
Familiarity with S-expressions is
also a plus, because the input to the z3
program we're using uses S-expressions. However,
I will be explaining all the input we send to z3
line-by-line. So feel free to
learn about S-expressions at your own pace while or after reading this post.
When referencing concepts common in other programming languages, I refer to Rust documentation, because at this writing (2023-07-01), it's easiest for me to find the analogous concepts in the Rust language's documentation.
And lastly, you should know what a logic puzzle is. Enjoying solving logic puzzles is a nice bonus :).
Before I explain what an SMT solver is, and how to use them to solve puzzles, I need to explain what SAT and SAT solvers are as prerequisites.
Suppose you have some Boolean expressions- a bunch of Boolean variables chained together-
with Boolean operators- &&
, ||
, !
, etc. Here is an example query:
A || (B && !C)
B && C
SAT, or SATisfiability, is the process of determining whether a set of
Boolean expressions like above can all be made true
(1
) for one or more
combination of the values for the given Boolean variables.
Given n
Boolean variables, where each variable has 2 possible values- true
(1
)
and false
(0
)- you can determine whether there's an assignment that satisfies
all the Boolean expressions by "substituting all combinations of values for the
variables in the expressions" (brute-forcing).
For instance, here's how we can brute-force the above query in (up to) 8 (2^3
)
easy steps:
A=0,B=0,C=0
: UNSAT
0 || (0 && !0) = 0
0 && 0 = 0
A=1,B=0,C=0
: UNSAT
1 || (0 && !0) = 1
0 && 0 = 0
A=0,B=1,C=0
: UNSAT
0 || (1 && !0) = 1
1 && 0 = 0
A=1,B=1,C=0
: UNSAT
1 || (1 && !0) = 1
1 && 0 = 0
A=0,B=0,C=1
: UNSAT
0 || (0 && !1) = 0
0 && 0 = 0
A=1,B=0,C=1
: UNSAT
1 || (0 && !1) = 1
0 && 1 = 0
A=0,B=1,C=1
: UNSAT
0 || (1 && !1) = 0
1 && 1 = 1
A=1,B=1,C=1
: SAT
1 || (1 && !1) = 1
1 && 1 = 1
Indeed, our query is SAT
isfiable when all three Boolean variables
are set to 1
.
When checking for satisfiability, if just one combination of variable values
makes all the expressions in your query equal to 1
, you're done- the query is
SAT
. Unfortunately, this means you have to try 2^n
combinations of
values for n
Boolean variables if you have a query that is UNSAT
isfiable. And
even if the query is SAT
, you may have to try many combinations of
variable values before you find a combination that returns SAT
, such as the
above example.
To make matters worse, real-world SAT problems can have thousands of expressions
and millions of variables. Unlike my toy example above, it is completely
impractical to try to brute-force 2^1000000
different combinations of Boolean
variables. It is believed
that for some SAT problems, there is no better strategy than brute forcing all 2^n
combinations. However, real-world SAT problems are better behaved and can be solved
efficiently using, for instance, the Davis–Putnam–Logemann–Loveland
(DPLL) algorithm or Conflict-Driven Clause Learning
(CDCL) algorithm.
SAT solvers are a type of computer program optimized for solving Boolean
satisfiability problems. They take advantage of these algorithms and hueristics
common to real-world SAT problems to efficiently find whether a query is SAT
or UNSAT
1
.
Satisfiability Modulo Theory (SMT) problems and solvers are what you get when you're no longer limited to queries of just Boolean variables. Queries can be taken over Booleans, reals, integers, arrays of Booleans, functions, enumerations, and many more! You can even create queries of combinations of these data types! As might be expected, by relying on queries over data types besides Booleans can result in dramatic speedups, as well as dramatic increases in complexity :).
While some data types map nicely2
to Boolean expressions such as
bitvectors,
modern SMT solver use theories to map more complex data types back to Boolean
expressions. Just like how Booleans are governed by the theory of Boolean algebra, each data
type an SMT solver supports has a theory that explains "what you can and cannot
do" with the data type. For instance, consider the following expressions, comparing equality
of integers I
, J
, and K
:
(I == 1) && (J == I + 1)
(I == 2) || (K == J + 3)
(I == 2) && (J == K + 2)
The expressions I == 1
, J == I + 1
, I == 2
, K == J + 3
, J ==K + 2
and
all yield Boolean values, and can therefore be converted to and from
Boolean values A
, B
, C
, D
, and E
respectively:
A && B
C || D
C && E
While Boolean expressions can be solved for using the above algorithms for SAT, solving for equality is governed by the congruence closure, and the theory of integers is governed by, well, number theory. The DPLL/CDCL algorithms can be modified to handle conversion to and from Boolean expressions, as well as handle congruence closure and other theories. This is known as the DPLL(T) framework3 .
z3
And SMT-LIBOur SMT solver of choice will be Microsoft Research's z3
. Other SMT
solvers exist, such as yices
, and thanks to
each using different algorithms, there is not a single SMT solver that solves
all practical SMT problems better than the others. Speedups and slowdowns can be
dramatic depending on which SMT solver you use4
. However, a query to solve
a logic puzzle will not tax any SMT solver, so seeing that I'm familiar with its
API, I chose z3
.
For our purposes, z3
is invoked as the following:
z3 -smt2 model.txt
model.txt
is a text file we will create in SMT-LIB format.
SMT-LIB is an S-expression-based text
input/output format widely-supported by SMT solvers, z3
included. It contains
commands to construct input queries for many different theories, run the solver
on your query via check-sat
, and examine the solver's output. In particular,
after running check-sat
, you can use SMT-LIB to:
get-model
, where the values are written in terms of define-fun
).UNSAT
(get-proof
).get-value
)!As is custom for an S-expression format, SMT-LIB has a large amount of parantheses to represent nested data structures. While I show a generic example of how to use all our needed SMT-LIB commands, I will be skimming over the meaning of the parentheses nesting for this example application.
Like SAT solvers, SMT solvers can handle Boolean expressions just fine, as long as they're in the SMT-LIB format. Using the SAT example earlier in the post, the equivalent SMT-LIB query would look something like this:
(set-option :produce-models true)
(set-logic QF_UF)
(declare-const a Bool)
(declare-const b Bool)
(declare-const c Bool)
(assert (or a (and b (not c)))) ; A || (B && C)
(assert (and b c)) ; B && C
(check-sat)
(get-model)
There are a few things to note comparing the SAT example I created above:
The first line set-option :produce-models true
enables the get-model
command. The second line set-logic
restricts the data types and theories
that the solver will use. The SMT-LIB website provides
a catalog of logics that can be used with set-logic
, and QF_UF
is a simple
sub-logic suitable for a Boolean-only query.
Although z3
tolerates the first two lines missing, its good practice to
add those lines for other solvers like yices
and for optimizing. Assuming that
some data types and theories can't occur in a given query sometimes allows
the solver to speed up proofs.
Variables are not implicit in SMT-LIB, and need to be declared with
declare-const
. declare-const
takes a variable name, and a type- Bool
in
this case, but an SMT solver understands other types. During check-sat
, the
solver will fill in a value for you for each variable. The value of a variable
is the same everywhere that the variable is used, hence the const
part.
As is traditional with S-expressions, operators such as and
, or
, etc
are written in prefix notation
rather than infix notation.
Parentheses are required around operators and their arguments to actually
evaluate the expression/do
the calculation5
.
The Boolean expressions you want to check for satisfiability aren't
implicit either; you must tell the solver about your query using one more
more assert
s. The solver must find a way to make all
assert
s true
simultaneously for your query to return SAT
.
A query without any assert
s returns true
.
Comments are preceded with ;
.
check-sat
checks whether your query is SAT
or UNSAT
, and nothing else.
In contrast, get-model
returns the value of the variables the solver used to satisfy your query
if your query was SAT
. In this case, the output of the solver, including
the variable values chosen, matches the SAT example:
sat
(
(define-fun b () Bool
true)
(define-fun a () Bool
true)
(define-fun c () Bool
true)
)
Additionally, there is also get-value
. As will become clear later, get-value
is sometimes more useful than
get-model
for presenting information about a model which satisfies
your query. get-value
is especially when you're interested in the return values of
uninterpreted functions.
Although not represented in the above example, there are a few additional
operators available to the Bool
theory in SMT-LIB that are not available in
a SAT solver due to being polymorphic:
=
, distinct
, and ite
.
=
checks whether two generic types T
have equal value. If T
is
Bool
, this is equivalent to XNOR
. Unlike many programming languages,
SMT-LIB uses single-equals (=
), not double-equals (==
) for comparison.
distinct
checks whether two generic types T
are not of equal value.
If T
is Bool
, this is equivalent to XOR
.
ite
corresponds to if
-then
-else
expressions in your favorite
programming language. The if
predicate requires Bool
, while the
then
and else
branches of the expression require a T
as input.
The entire expression returns a T
as output. It returns the value of
the then
branch when the if
predicate evaluates to true
, and
returns the value of the else
branch when the if
predicate evaluates
to false
. When T
is Bool
, ite
has the following truth table:
I | T | E | O |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 0 |
0 | 1 | 1 | 1 |
1 | 0 | 0 | 0 |
1 | 0 | 1 | 0 |
1 | 1 | 0 | 1 |
1 | 1 | 1 | 1 |
While the above example doesn't really use an SMT solver's true power, it does illustrate the fundamentals of how to interface with one. We will discuss how to use other datatypes and theories with SMT-LIB as we turn a logic puzzle into an SMT query.
With the whirlwind introduction to SMT out of the way, my goal with this post is
to show you how to map the word problem presented by a logic puzzle to a
SAT
isfiable SMT query. I will introduce an example logic puzzle first, and
then show how to use various SMT-LIB commands to create some datatypes
and constraints to represent the puzzle.
At a high level, if you visualize a logic puzzle grid, such as the one below, the conversion process looks like this:
assert
s.Each set of 3 row and column headers belong to the same category:
To solve the puzzle, you are intended to mark each table entry a ✓ if the row and column are associated with each other, or ✗ if the row and column are not associated with each other, based on clues given to you (not represented in the above image).
A table entry whose row and column are of the same category does not need to be marked. The pictured table represents this by completely omitting such entries, making the table similar to an upper triangular matrix.
The bottom part of the puzzle consists of a table with the columns queen, tom, activity, and number of kittens. The column queen is filled in for you with Ruby, Spot, and Starbuck. You fill the remaining rows of this table based on which row and columns you marked with a ✓ in the above table.
I got the logic puzzle template (before filling in the headers) from Daydream Puzzles.
For this post, I'll be using a puzzle from Aha! Puzzles as an example. I'm not going to copy the puzzle description, since a person presumably spent time to create the puzzle (among others), and I don't want to steal their work. Instead I will paraphrase the puzzle and grid, which describes a set of cats.
Given the above information, and the following clues, figure out each queen's partner, favorite activity, and the number of kittens they have.
The puzzle is intended to have a unique answer. However, using only the description above, there are multiple correct answers. The clues below are intended to limit the multiple correct answers down to a single unique answer6 :
Now that we have a puzzle, we can map the textual description to a query and a mathematical model that the SMT solver will return. Let's create the query piece by piece, starting with initial setup code:
(set-option :produce-models true)
(set-logic ALL)
The first two lines we've already seen, but we have a new logic ALL
. set-logic ALL
is a bit of a cop-out; SMT-LIB allows ALL
for applications which generate
queries on-the-fly. ALL
is not intended for handwritten queries. Unfortunately, z3
doesn't
really
have that many logics that support datatypes. And none of the logics that do
support datatypes seem to support both integers and uninterpreted functions.7
So, without any better options for now, I'm just making explicit that z3
should use
"whatever means necessary" to solve the logic puzzle8
.
declare-datatypes
is part of the most recent version of SMT-LIB, version 2.6.
It is a very general function that allows you to create complex data types,
including recursive
data types. In our usage we will only be using it to create the equivalent of enum
s without data
in other programming languages9
.
In general, you create an enum
Foo
whose variants A
, B
, C
(and more, if wish) have
data using the following syntax10
:
(declare-datatypes ((Foo 0)) (((A) (B) (C))))
Anywhere in your query where a value of type Foo
is used, the SMT solver
has no choice but to make the value either A
, B
, or C
. Additionally, the
following properties hold11
:
A = A
, B = B
, and C = C
(reflexivity)A != B
, B != C
, and A != C
(antisymmetry).This can be confirmed with the following query, which returns SAT
:
(declare-datatypes ((Foo 0)) (((A) (B) (C))))
(assert (and (= A A) (= B B) (= C C)))
(assert (and (distinct A B) (distinct A C) (distinct B C)))
(check-sat)
With the above syntax and properties in mind, the names of the cats and their
activities correspond very nicely to enum
s whose variants don't have data:
(declare-datatypes ((Tom 0)) (((Batman) (Jake) (Dibii))))
(declare-datatypes ((Queen 0)) (((Ruby) (Spot) (Starbuck))))
(declare-datatypes ((Activities 0)) (((Lazer) (Sleep) (Ball))))
Functions form the backbone of programming languages and mathematics, and SMT
formulas are no exception. We can create pure functions
to abstract away common parts of our query using define-fun
. Here is an
example add
function that "adds one to its input Int
":
(define-fun add ((x Int)) Int (+ x 1))
(check-sat)
(get-value ((add 1) (add 2) (add 10)))
The above query returns SAT
in response to check-sat
. After check-sat
, your model has an add
function
that the solver can call.
We don't have any assert
s in the above query, so get-model
returns
nothing. However, we can display the result of calls to our add
function using get-value
.
get-value
takes a single argument, but this single argument can
take an arbitrary number of terms. Each of the three terms to the get-value
call above will make the solver read, evaluate, and print
the results of calling your add
function on various integers- exciting!
sat
(((add 1) 2)
((add 2) 3)
((add 10) 11))
Not only can we define functions, we can also declare functions, without
bothering to define/fill in the function body! We can request the solver to fill
in the function body for us by using declare-fun
:
(declare-fun add_two (Int) Int)
(check-sat)
(get-model)
(get-value ((add_two 1)))
The above query returns sat
in response to check-sat
. get-model
will then return the definition of add_two
that the solver created:
sat
(
(define-fun add_two ((x!0 Int)) Int
0)
)
(((add_two 1) 0))
In this case add_two
returns the literal 0
. Not particularly interesting.
What if we add an assert
, to force add_two
to return 3
when given the
input 1
?
(declare-fun add_two (Int) Int)
(assert (= (add_two 1) 3))
(check-sat)
(get-model)
(get-value ((add_two 1)))
sat
(
(define-fun add_two ((x!0 Int)) Int
3)
)
(((add_two 1) 3))
It's a function that returns 3
for all inputs. Now the solver is being cute.
"Technically correct" is the best type of correct. The solver won't create a
more interesting function body unless forced to via proper constraints on the
input arguments and return values12
.
We will constrain our uninterpreted functions shortly, but for now, let's
declare 4 uninterpreted functions with declare-fun
. These functions will in
fact return the answers for our logic puzzle, once the solver fills them in:
(declare-fun QueenPartner (Queen) Tom)
(declare-fun QueenActivity (Queen) Activities)
(declare-fun QueenKittens (Queen) Int)
(declare-fun TomPartner (Tom) Queen)
To be specific:
QueenPartner
returns the Tom
partner of the given Queen
.QueenActivity
returns the given queen's favorite activity.QueenKittens
returns the number of kittens a given Queen
has as an Int
.TomPartner
returns the Queen
partner of the given Tom
.By their very nature, logic puzzles contain some usually-implied rules that we must follow when solving them if we don't want incoherent, nonsensical answers:
A logic puzzle grid such as above visually makes it easy
to follow these rules. Unfortunately, an SMT solver knows nothing about logic
puzzles without our help, so we must explain those implicit constraints in our
query. We use assert
s to force an SMT solver to reject nonsensical-but-valid
answers to a query of variables, where in the case of a logic puzzle, our
variables are our uninterpreted functions.
Our enum
s we created via declare-datatype
already encode these
uniqueness/finiteness constraints mentioned above. We can also
represent the number of kittens each cat pair has as a NumKitten
s enum
,
but for convenience that will become clear shortly, I chose to use an Int
.
While Int
s have a uniqueness constraint (1
is different than 2
), in SMT-LIB, Int
s
do not have a finiteness constraint. Thus without additional information, the solver
free to set NumKitten
s for each cat pair to 1
, 10
, or even 1,234,567
, among
other fine Int
s, if the solver so chooses.
The puzzle's description implies outright says that the number of kittens
each cat pair must be between 1
and 3
inclusive. We can constrain the values
for number of kittens the solver will try using an assert
:
; Integer constraint
(assert (and
(>= (QueenKittens Ruby) 1) (<= (QueenKittens Ruby) 3)
(>= (QueenKittens Spot) 1) (<= (QueenKittens Spot) 3)
(>= (QueenKittens Starbuck) 1) (<= (QueenKittens Starbuck) 3)))
While the uniquess constraints of enum
s and Int
s gets us far, it's not
quite enough. As quoted above:
As is customary for this type of problem, it is implied that each queen has a different partner, favorite activity, and number of kittens.
Although the solver recognizes that each tom is a different cat, without additional information, nothing prevents the solver from giving an answer which assigns, for instance, two queens to the same tomcat partner.
We can further constrain the solver from trying to assign two queens the same
partner, number of kittens, and activity by constraining the uninterpreted functions
we defined earlier via more assert
s. For instance, the statement
(assert (not (= (QueenPartner Ruby) (QueenPartner Spot)))
tells the SMT solver
"However you decide to fill the function body, I need the function QueenPartner
to not return the same value for Ruby
and Spot
". To properly constrain our
query, you need one assert
for each possible pair of values in each category,
which can be combined into a single large conjuction13
:
; "Only one dot in each row/col" consistency check.
(assert (and
(not (= (QueenPartner Ruby) (QueenPartner Spot)))
(not (= (QueenPartner Ruby) (QueenPartner Starbuck)))
(not (= (QueenPartner Spot) (QueenPartner Starbuck)))
(not (= (QueenKittens Ruby) (QueenKittens Spot)))
(not (= (QueenKittens Ruby) (QueenKittens Starbuck)))
(not (= (QueenKittens Spot) (QueenKittens Starbuck)))
(not (= (QueenActivity Ruby) (QueenActivity Spot)))
(not (= (QueenActivity Ruby) (QueenActivity Starbuck)))
(not (= (QueenActivity Spot) (QueenActivity Starbuck)))))
The above check is morally equivalent to a human making sure a logic
grid has only one ✓ per row and column for each combination
of values in each category. Because of the uniqueness constraints for each
enum
and Int
, the above is sufficient to make sure the solver knows that
each queen has a mutually exclusive partner, number of kittens, and activity
and that such a solution exists14
.
Lastly, each cat pair shares a favorite activity and the number kittens they have. So far, we've only written constraints for each queen in the cat pair. However, we don't need to rewrite the constraints for each tom! There is another consistency check we need, and a side-effect, we can represent each tom's favorite activity and number of kittens for free in terms of this constraint.
The TomPartner
uninterpreted function returns the Queen
who is the partner
of the given Tom
. Likewise, the QueenPartner
function returns the Tom
whose partner is the given Queen
. Without extra constraints, the SMT solver
doesn't know "obvious" facts like "'the partner of the partner' of a cat is
the same cat you started with"15
. We can represent this fact with
another assert
:
; All other constraints for Tom are implied if the partner of the partner is the same.
(assert (and
(= (QueenPartner (TomPartner Batman)) Batman)
(= (QueenPartner (TomPartner Jake)) Jake)
(= (QueenPartner (TomPartner Dibii)) Dibii)))
After this constraint, TomPartner
unambiguously maps
a tom back to queen. Since a cat pair has the same favorite activity and number
of kittens, the solver now has enough information to solve for each tom's
favorite activity and number of kittens in terms of queens.
The implicit constraits are necessary but not sufficient to constrain the logic puzzle to a unique answer. If an SMT solver is presented with only the above lines, but there will be multiple ways to fill in our uninterpreted functions which satisfy that query. The solver will choose to present to you16 one interpretation for each function out of all valid ones, and chances are, it won't be the intended answer to the logic puzzle based on the clues.
If you incorporate the clues, however, the SMT solver should17
be limited to only one valid answer to your query. In this sense, the clues given
above form the explicit, outright-stated constraints unique to a
specific logic puzzle. Explicit constraints are handled identically to the
implicit constraints in the previous section, by
mapping a textual description of the clues described earlier to asserts
:
Batman's partner likes to chase a ball. She was not Starbuck.
; 1. Batman chose the female who liked to chase a ball, but she was not Starbuck.
(assert (= (QueenActivity (TomPartner Batman)) Ball))
(assert (not (= (TomPartner Batman) Starbuck)))
Dibii's partner liked the lazer.
; 2. Dibii's companion liked to chase the laser light.
(assert (= (QueenActivity (TomPartner Dibii)) Lazer))
Ruby was a cat who liked to sleep.
; 3. Ruby loved to cuddle up to her male for a long afternoon nap in the sun.
(assert (= (QueenActivity Ruby) Sleep))
Starbuck has two more kittens than whoever Batman's companion is.
This clue is the reason I represented the number of kittens as an Int
. While
I would have to reinvent addition if I used a NumKittens
enum
, the SMT
solver already knows the theory of Int
s and can add them using +
18
.
; 4. Starbuck had two more kittens than Batman's companion.
(assert (= (QueenKittens Starbuck)
(+ (QueenKittens (TomPartner Batman)) 2)))
Once, we add check-sat
, our query text file is finished, and we can feed it
to z3
using the invocation described to check whether our
query is satisfiable. You can download the completed query here
and run it online
(as of 2023-08-13) if you wish.
Many times, knowing SAT
or UNSAT
is "good enough". But for a logic puzzle,
we actually want the variable values. get-model
works, but this shows
the definition of the uninterpreted functions you asked the solver to fill in.
While seeing the function bodies are useful information, for solving a logic puzzle,
it's probably better to use get-value
to see individual outputs of each
now-interpreted function for "interesting" inputs.
To compare and contrast, I use all of check-sat
, get-model
, and get-value
:
(check-sat)
(get-model)
(echo "
Spot:")
(get-value ((QueenPartner Spot)
(QueenKittens Spot)
(QueenActivity Spot)))
(echo "
Ruby:")
(get-value ((QueenPartner Ruby)
(QueenKittens Ruby)
(QueenActivity Ruby)))
(echo "
Starbuck:")
(get-value ((QueenPartner Starbuck)
(QueenKittens Starbuck)
(QueenActivity Starbuck)))
Running the entire query through my local copy of z3
gives the following
output:
sat
(
(define-fun QueenKittens ((x!0 Queen)) Int
(ite (= x!0 Spot) 1
(ite (= x!0 Starbuck) 3
2)))
(define-fun QueenPartner ((x!0 Queen)) Tom
(ite (= x!0 Spot) Batman
(ite (= x!0 Starbuck) Dibii
Jake)))
(define-fun QueenActivity ((x!0 Queen)) Activities
(ite (= x!0 Starbuck) Lazer
(ite (= x!0 Ruby) Sleep
Ball)))
(define-fun TomPartner ((x!0 Tom)) Queen
(ite (= x!0 Jake) Ruby
(ite (= x!0 Dibii) Starbuck
Spot)))
)
Spot:
(((QueenPartner Spot) Batman)
((QueenKittens Spot) 1)
((QueenActivity Spot) Ball))
Ruby:
(((QueenPartner Ruby) Jake)
((QueenKittens Ruby) 2)
((QueenActivity Ruby) Sleep))
Starbuck:
(((QueenPartner Starbuck) Dibii)
((QueenKittens Starbuck) 3)
((QueenActivity Starbuck) Lazer))
First off, the query is sat
, which is a good start! The model returned by
get-model
isn't hard for me to follow either. I could fairly quickly figure
out each answer to the logic puzzle using get-model
alone, and I'm sure anyone
else reading this post could too. But using get-value
to evaluate the
functions the solver filled in for us gives us the logic puzzle answers directly,
ready to be written down! So let's save ourselves the trouble and populate the
puzzle's answers thanks to get-value
!
Was the solver correct? Well, as of this writing (2023-07-23), Aha! Puzzles lets you solve logic puzzles interactively online. I'll just let the below screenshot speak for itself:
z3
was correct.
Although the logic puzzle grid and answer table is obscured by the popup, there
is just enough of the grid and answer table shown to verify that I
filled in the puzzle with the answers returned by z3
.I still find logic puzzles fun to work through manually. To me, asking a machine to solve every logic puzzle for you defeats the satisfaction of working your brain to solve the puzzle yourself. So while I hope the above example provides some insight as to how to map word problems to an SMT query/model, I wouldn't suggest that you start solving every logic puzzle with an SMT solver.
That said, I found that trying to map a logic puzzle to an SMT solver query was also intellectually stimulating. It is a low-stakes problem that gave me the opportunity to exercise my skills at creating SMT queries, as opposed to my typical usage where I have a program generate queries and models for me. I was excited after I got my query working, and I wanted to document the result.
Speaking of generating queries and models, it's possible to take this logic puzzle-solving use case further by creating a Domain-Specific Language (DSL). Like how the previously-linked SymbiYosys converts Verilog to SMT-LIB queries, this DSL would convert a more convenient textual representation of a logic puzzle into SMT-LIB queries. It's a bit more up-front work than writing SMT-LIB queries manually, and probably not worth the time spent other than for learning. However, since this project is meant to be low-stakes and for fun, I may look into a DSL in the future for a follow-up post. It's about time I wrote a(nother) compiler anyway.
I would like to thank my friends Screwtape and Alyssa Rosenzweig for their general feedback and a few last-minute corrections on the final draft of this post.
1 I will not be discussing how to write a SAT solver; for those interested I suggest this tutorial for building a SAT solver in Python. The tutorial is based on Donald Knuth's "SAT0W" solver.
Although I have not personally done so yet, I would probably also look at Knuth's "SAT10" solver, which is based on the Davis–Putnam–Logemann–Loveland (DPLL) algorithm.
2 To be honest, I'm going by what Wikipedia says here. There's no external reference:
Early attempts for solving SMT instances involved translating them to Boolean SAT instances (e.g., a 32-bit integer variable would be encoded by 32 single-bit variables with appropriate weights and word-level operations such as 'plus' would be replaced by lower-level logic operations on the bits) and passing this formula to a Boolean SAT solver.
3 Just like for SAT solvers, I will not be discussing how to write an SMT solver. While in the process of writing this post, I have been learning about the DPLL(T) framework myself. I've found the following set of slides useful:
4 In my own experience, yices
is significantly faster at many problems involving
formal verification of HDL. However, it is not universal. This is why the
Symbiyosys front end for HDL
verification will run the same query against multiple SMT solvers in parallel
if you have them installed.
5 If you don't include parentheses to actually evaluate operators, weird things
you probably didn't intend will happen. For instance, try comparing the result
of the query (assert (= or true false true))
to
(assert (= (or true false) true))
.
6 More succinctly, the clues "constrain the solution space".
7 The logic that's closest to what I need, Quantifier-Free Finite Domain (QF_FD
),
has some additional restrictions which cause my logic puzzle query to return
UNKNOWN
. I'm not actually certain why it returns UNKNOWN
,
but I suspect I'm running afoul of the restrictions by using datatypes
with uninterpreted functions, and not just equality (=
) comparisons.
8 I belive `ALL` is
z3
's default logic anyway.
9 declare-datatypes
is not required, and the logic puzzle can
be solved without them. In fact, I have constructed a working
query without datatypes for this logic puzzle-
twice! I wanted to solve the puzzle with
yices
as well as z3 as an exercise; while yices
does
support enum
s and the like, they are not compatible with SMT-LIB.
As shown in the linked yices
examples, declare-sort
,
declare-const
, and declare-fun
, plus some assert
constraints can emulate declare-datatypes
and finite integers reasonably. However, for
a handwritten example, I wanted to focus on creating constraints appropriate
to a logic puzzle rather than reinventing enum
s.
10 The integer argument to declare-datatypes
is used for
parametric polymorphism. It represents, informally, "the number of types to
be filled in later".
11 What A
, B
, and C
actually are in
terms of "bits in memory somewhere" doesn't matter as long as the solver
manipulates these values of type Foo
according to the above rules.
12 To make the example more interesting, I tried to use assert
s to
get the solver to generate a function body which uses +
(integer addition),
but got bored and gave up. This post is getting long anyway.
13 Readers familiar with using z3
may notice that I can more-succinctly
represent the exclusivity constraints using a forall
quantifier.
Indeed, this works, and was how I originally solved
the example logic puzzle. Except for an extra
pair
datatype, the quantifier version of our query is
very similar to the quantifier-free query constructed
in this blog post; just compare the comments.
However, I decided not not to discuss quantifiers in this post for a few reasons:
SAT
or UNSAT
, some queries with quanitifers
are undecidable.
As of this writing (2023-07-02) I don't think I'm qualified to discuss
undecidability other than in passing.
yices
).Therefore, while the quantifiers are likely harmless here, I'd rather them when they are actually required, not just as a means to make queries take fewer lines.
14 If two enum
variants could be equal each other in value (i.e. not distinct
),
it would be impossible for our uninterpreted functions to return three distinct
enum
variants to three different values of input enum
variants.
To show this, play with this commented query online:
(declare-sort Foo 0) ; declare-sort declares a new type, like `Int`, `Bool`. ; It has fewer assumptions baked-in than `declare-datatypes`. ; Create some named values for our types. ; These are like our `enum` variants. (declare-const foo_one Foo) (declare-const foo_two Foo) (declare-const foo_three Foo) (declare-sort Bar 0) (declare-const bar_one Bar) (declare-const bar_two Bar) (declare-const bar_three Bar) ; Our uninterpreted function. Takes a `Foo`, returns a `Bar`. (declare-fun my-fun (Foo) Bar) ; Ensure each constant has a unique value. This is implied ; `declare-datatypes`. ; ; Change _any_ of the 6 asserts from `distinct` to `=`. ; The solver will return `unsat`. (assert (distinct foo_one foo_two)) (assert (distinct foo_one foo_three)) (assert (distinct foo_two foo_three)) (assert (distinct bar_one bar_two)) (assert (distinct bar_one bar_three)) (assert (distinct bar_two bar_three)) ; Prevent the solver from trying to create more values for our types ; from thin air. This is implied in `declare-datatypes`. (assert (and (or (= (my-fun foo_one) bar_one) (= (my-fun foo_one) bar_two) (= (my-fun foo_one) bar_three)) (or (= (my-fun foo_two) bar_one) (= (my-fun foo_two) bar_two) (= (my-fun foo_two) bar_three)) (or (= (my-fun foo_three) bar_one) (= (my-fun foo_three) bar_two) (= (my-fun foo_three) bar_three)))) ; Each input to my-fun should return a unique output. ; This is impossible unless foo_{one, two, three} have ; different values according to `=`. (assert (distinct (my-fun foo_one) (my-fun foo_two))) (assert (distinct (my-fun foo_one) (my-fun foo_three))) (assert (distinct (my-fun foo_two) (my-fun foo_three))) (check-sat) (get-model)
15 While I praise the SMT solver for being open-minded, polyamory is not within the scope of this logic puzzle.
16 By default an SMT solver will give you one answer. It is possible to get all possible answers to an SMT query. Since a logic puzzle is intended to have a single answer, ensuring that the number of unique answers is 1 would be another good consistency check at the cost of execution time. Unfortunately, I don't remember how to do this.
17 The puzzle should have a unique answer if it was designed properly.
18 I chose to write the assert
in terms of Int
and +
as an example. When numbers you can choose from in a category aren't sequential,
and/or the clues are more complicated, I think it's useful to be able to model
clues with other theories the SMT solver provides.
In this case, because we know that NumKittens
is constrained
from 1
to 3
inclusive, you could just as easily
derive the constraint as:
(assert (= (QueenKittens Starbuck) 3)) (assert (= (QueenKittens (TomPartner Batman)) 1))
Last Updated: 2023-08-15