Unification
Complete the associated in-class exercises.
1 How Does Prolog’s Proof Procedure Handle Variables?
We explored how Prolog automatically derives a proof for a query back in propositional logic, but we never really figured out how full Prolog manages the same thing. What’s going on?
For this part, we’ll again consider a fact (like p.) to be a rule with an empty body (like p :- .).
1.1 What Happens Intuitively
Intuitively, when Prolog runs across a goal like append([1,3|Tail], [2,4], Result), it looks for clauses that are about append and, for each clause, it tries to match up the head of that clause with the three terms inside the goal: [1,3|Tail], [2,4], and Result. It tries the first matching clause it finds. If that fails, it returns to that choice and tries the next matching clause instead, until it runs out of matching clauses.
What does that look like more formally? Let’s go back to propositional logic and then extend it to full Prolog.
1.2 Propositional Logic Proof Procedure
For propositional logic, Prolog performs a backtracking search for a proof. To solve a query like ?- q1, ..., qk:
- Let the answer clause A be yes :- q1, ..., qk.
- While the body of A has goals inside:
  - Let the leftmost goal in A be a1.
  - Choose a clause C in the KB with a1 as its head.
  - Replace a1 in the body of A with the body of C.
That choose step is non-deterministic: Prolog tries a choice and if the choice fails, it backtracks to that choice and tries the next option. Specifically:
- If Prolog hits the choose step and there is no matching clause (or no matching clause left), it fails.
- When Prolog hits the choose step and there is at least one matching clause, it tries the first matching clause in the KB and proceeds from there. But, it remembers which one it chose. On failure, it goes back to its most recent choice and tries the next matching clause instead, or fails again if there are none.
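As a rough illustration of the procedure above, here is a minimal Python sketch (Python rather than Prolog, so that the search itself is visible). The encoding is an assumption of this sketch: each clause is a (head, body) pair whose body is a list of atom names, and the small KB is hypothetical. The for loop plays the role of the choose step, and coming back to it after a failed recursive call is the backtracking.

```python
def prove(goals, kb):
    """Return True if every goal in `goals` can be proved from `kb`.

    `goals` is the body of the answer clause; `kb` is a list of
    (head, body) pairs, tried in order like clauses in a Prolog file.
    """
    if not goals:                      # empty body: the proof is complete
        return True
    a1, rest = goals[0], goals[1:]     # leftmost goal and the remaining goals
    for head, body in kb:              # the "choose" step, in KB order
        if head == a1 and prove(body + rest, kb):
            return True                # this choice led to a proof
        # otherwise fall through: backtrack and try the next clause
    return False                       # no matching clause (left): fail


# Hypothetical KB:  a :- b, c.   b :- d.   b :- e.   c.   e.
kb = [("a", ["b", "c"]), ("b", ["d"]), ("b", ["e"]), ("c", []), ("e", [])]

print(prove(["a"], kb))  # True: b :- d fails, so the search retries with b :- e
print(prove(["d"], kb))  # False: no clause has d as its head
```

Like the procedure it mirrors, this sketch does not guard against clauses such as a :- a, which would make it recurse without end.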
1.3 Proof Procedure with Variables
In propositional logic, a goal a1 matches the head of a clause in the KB if they are literally the same simple atoms.
In full Prolog, a1 and the head of the clause may have arguments, including variables. The two atoms match if we can “make them look the same”: they unify. (We get incredibly lucky because any two items that unify do so in one most-general way; we don’t have to return to the unification over and over again and try different ways to unify!)
Our procedure winds up being another backtracking search for a proof. To solve a query like ?- q1, ..., qk, where goals q1, ..., qk may have terms inside them including variables[1], and all the variables anywhere inside these goals are V1, ..., Vj:
- Let the generalized answer clause A be yes(V1, ..., Vj) :- q1, ..., qk.
- While the body of A has goals inside:
  - Let the leftmost goal in A be a1, and let a1’s predicate symbol be p1.[2]
  - Choose a clause C in the KB whose head has predicate symbol p1.
  - Rename all the variables in C (so they don’t accidentally overlap with the ones in A), and let h be the head of the renamed C.
  - It must be possible to unify h and a1. (If not, this choice fails.)
  - Let the substitution that unifies h and a1 be θ.
  - Replace a1 in the body of A with the body of C and apply the substitution θ to A (its head, C’s body, and A’s remaining goals).
Again, if the choose step finds no options (either because no heads with the appropriate predicate symbol are left or because the head doesn’t unify with the goal), Prolog fails and rewinds to its most recent choose where it had more options.
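One pass through the body of that loop can be sketched in Python as follows. Everything about the encoding is an assumption of this sketch rather than anything Prolog prescribes: variables are strings starting with an uppercase letter, compound terms are tuples (name, arg1, ..., argn), a clause is a (head, body) pair, and renaming appends _n to each variable. The unify helper follows the algorithm given in Section 3.1 and, like it, omits the occurs check. The example clauses are taken from the KB in Section 4.

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def apply_subst(sigma, t):
    """Apply substitution sigma (a dict) to term t in one simultaneous pass."""
    if is_var(t):
        return sigma.get(t, t)
    if isinstance(t, tuple):           # compound term: (name, arg1, ..., argn)
        return (t[0],) + tuple(apply_subst(sigma, a) for a in t[1:])
    return t                           # constant

def unify(t1, t2):
    """Return a unifier of t1 and t2 as a dict, or None. No occurs check."""
    todo, sigma = [(t1, t2)], {}
    while todo:
        x, y = todo.pop(0)
        if x == y:
            continue
        if not is_var(x) and is_var(y):
            x, y = y, x
        if is_var(x):
            step = {x: y}
            todo = [(apply_subst(step, a), apply_subst(step, b)) for a, b in todo]
            sigma = {v: apply_subst(step, t) for v, t in sigma.items()}
            sigma[x] = y
        elif (isinstance(x, tuple) and isinstance(y, tuple)
              and x[0] == y[0] and len(x) == len(y)):
            todo = list(zip(x[1:], y[1:])) + todo
        else:
            return None
    return sigma

def rename(t, n):
    """Rename every variable in term t by appending _n (assumed to be fresh)."""
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return (t[0],) + tuple(rename(a, n) for a in t[1:])
    return t

def resolve_step(answer, clause, n):
    """Resolve the leftmost goal of `answer` against `clause`.

    Returns the new answer clause, or None if the heads do not unify.
    """
    (a_head, a_body), (c_head, c_body) = answer, clause
    a1, rest = a_body[0], a_body[1:]
    theta = unify(rename(c_head, n), a1)          # unify h and a1
    if theta is None:
        return None                               # this choice fails
    new_body = [rename(g, n) for g in c_body] + rest
    return apply_subst(theta, a_head), [apply_subst(theta, g) for g in new_body]


answer = (("yes", "A"), [("live", "A")])          # yes(A) :- live(A).
rule = (("live", "Y"), [("connected_to", "Y", "Z"), ("live", "Z")])
fact = (("connected_to", "w6", "w5"), [])

step1 = resolve_step(answer, rule, 1)
print(step1)  # (('yes', 'A'), [('connected_to', 'A', 'Z_1'), ('live', 'Z_1')])
step2 = resolve_step(step1, fact, 2)
print(step2)  # (('yes', 'w6'), [('live', 'w5')])
```

Trying fact against the original answer clause returns None, because connected_to and live are different predicate symbols. That is the "this choice fails" case.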
To really make sense of that, however, we need to know what a substitution is in more detail and how the algorithm to unify two atoms/terms works.
2 Substitutions
We need a few definitions:
- A substitution is a (finite) list of mappings of variables to terms. We write it like: {V1/t1, …, Vn/tn}, where each Vi is a different variable, and each ti is the term we want to use to replace the variable.
- The application of a substitution to an atom or clause is what we get when we replace every variable in the atom/clause that appears in the substitution with its corresponding term. (So, for any variable Vi that is mapped to ti in the substitution, we find all occurrences of Vi in the atom/clause and replace each one with ti.)[3]
- An instance of an atom/clause is the result of applying some substitution to the atom/clause.
If σ is a substitution and c is an atom or clause, then we write cσ to mean the instance we get from applying σ to c.
For example, consider these substitutions:
- σ1 = {X/A, Y/b, Z/C, D/e}
- σ2 = {A/X, Y/b, C/Z, D/e}
- σ3 = {A/V, X/V, Y/b, C/W, Z/W, D/e}
What is the result of each of the following substitution applications? (The first is complete as an example.)
- p(A, b, C, D)σ1 = p(A, b, C, e). (σ1 only has a replacement for D of the variables in p(A,b,C,D). We’ve replaced it.)
- p(X, Y, Z, e)σ1
- p(A, b, C, D)σ2
- p(X, Y, Z, e)σ2
- p(A, b, C, D)σ3
- p(X, Y, Z, e)σ3
(Two Exercises. We’ll do the first together in class.)
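Applying a substitution is mechanical enough to sketch in a few lines of Python. The encoding here is an assumption of this sketch: variables are strings starting with an uppercase letter, constants are other strings or numbers, compound terms are tuples (name, arg1, ..., argn), and a substitution is a dict. The function makes a single simultaneous pass, which is why a substitution like {X/Y, Y/X} swaps two variables instead of looping.

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def apply_subst(sigma, t):
    """Return the instance of term t under sigma (one simultaneous pass)."""
    if is_var(t):
        return sigma.get(t, t)         # mapped variables are replaced once
    if isinstance(t, tuple):           # compound term: keep the name, map the args
        return (t[0],) + tuple(apply_subst(sigma, a) for a in t[1:])
    return t                           # constants are left alone


# The worked example from the list above: p(A, b, C, D) under σ1.
sigma1 = {"X": "A", "Y": "b", "Z": "C", "D": "e"}
print(apply_subst(sigma1, ("p", "A", "b", "C", "D")))
# ('p', 'A', 'b', 'C', 'e')

# A single pass swaps the two variables; the result is not substituted again.
swap = {"X": "Y", "Y": "X"}
print(apply_subst(swap, ("f", "X", "Y")))
# ('f', 'Y', 'X')
```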
3 Unification
Unifying two atoms or terms means making them look the same. Specifically:
- A substitution σ is a unifier of atoms/terms e1 and e2 if e1σ = e2σ. That is, the instance we get from applying σ to e1 is the same one we get from applying σ to e2. They match!
- A substitution σ is a most general unifier or mgu of e1 and e2 if:
- σ is a unifier of e1 and e2, and
- if substitution σ′ is also a unifier of e1 and e2, then eσ′ is an instance of eσ for all atoms/terms e.
- If two Prolog atoms/terms have a unifier, then they have an mgu.[4]
- If there are multiple mgu’s, then they differ only in the names of the variables chosen.
Let’s try an example. Consider:
- e1 = append(cons(1,cons(3,Tail)), cons(2,cons(4,empty)), Result),
- e2 = append(empty, X, X), and
- e3 = append(cons(X,Xs), Ys, cons(X,Zs)).[5]
For these:
- e2 has no unifier with either e1 or e3. That’s because there’s no substitution that can make the terms cons(1,cons(3,Tail)) and empty look alike. (No matter what we do to the one variable Tail, the rest of the terms won’t match!) Similarly, no substitution makes cons(X,Xs) and empty look alike.
- e1 unifies with e3 with various unifiers. For example:
  Consider the substitution: σ1 = {X/1, Xs/cons(3,empty), Tail/empty, Ys/cons(2,cons(4,empty)), Result/cons(1,Zs), Unnecessary/Irrelevant}.
  Let’s apply that:
  append(cons(1,cons(3,Tail)), cons(2,cons(4,empty)), Result)σ1 = append(cons(1,cons(3,empty)), cons(2,cons(4,empty)), cons(1,Zs))
  append(cons(X,Xs), Ys, cons(X,Zs))σ1 = append(cons(1,cons(3,empty)), cons(2,cons(4,empty)), cons(1,Zs))
  And, those are the same term! So, σ1 is indeed a unifier for them.
  But, σ1 includes a totally unnecessary mapping at the end (for Unnecessary) and is more specific than it needs to be (mapping Tail to empty).
  Consider the substitution: σ2 = {X/1, Xs/cons(3,Tail), Ys/cons(2,cons(4,empty)), Result/cons(1,Zs)}.
  Let’s apply that:
  append(cons(1,cons(3,Tail)), cons(2,cons(4,empty)), Result)σ2 = append(cons(1,cons(3,Tail)), cons(2,cons(4,empty)), cons(1,Zs))
  append(cons(X,Xs), Ys, cons(X,Zs))σ2 = append(cons(1,cons(3,Tail)), cons(2,cons(4,empty)), cons(1,Zs))
  Again, σ2 unifies these. σ2 is more general than σ1, however. (If we apply σ1 to something, we can get the same effect by applying σ2 and then using two more mappings: {Tail/empty, Unnecessary/Irrelevant}.)
In this case, σ2 is the mgu. In general, there may be many mgu’s, but they differ only in how they rename variables.
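The claims in this example can be checked mechanically. The Python sketch below assumes an encoding that is not part of Prolog: variables are capitalized strings, compound terms are tuples (name, arg1, ..., argn), and substitutions are dicts. It confirms that the example’s σ1 and σ2 are both unifiers of e1 and e3, and that applying σ2 followed by the two extra mappings gives the same instances as σ1.

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def apply_subst(sigma, t):
    """Return the instance of term t under sigma (one simultaneous pass)."""
    if is_var(t):
        return sigma.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(sigma, a) for a in t[1:])
    return t

def is_unifier(sigma, a, b):
    """sigma unifies a and b if both have the same instance under sigma."""
    return apply_subst(sigma, a) == apply_subst(sigma, b)


two_four = ("cons", 2, ("cons", 4, "empty"))
e1 = ("append", ("cons", 1, ("cons", 3, "Tail")), two_four, "Result")
e3 = ("append", ("cons", "X", "Xs"), "Ys", ("cons", "X", "Zs"))

sigma1 = {"X": 1, "Xs": ("cons", 3, "empty"), "Tail": "empty", "Ys": two_four,
          "Result": ("cons", 1, "Zs"), "Unnecessary": "Irrelevant"}
sigma2 = {"X": 1, "Xs": ("cons", 3, "Tail"), "Ys": two_four,
          "Result": ("cons", 1, "Zs")}
extra = {"Tail": "empty", "Unnecessary": "Irrelevant"}

print(is_unifier(sigma1, e1, e3))  # True
print(is_unifier(sigma2, e1, e3))  # True

# σ2 followed by the extra mappings reproduces σ1 on both terms.
same = all(apply_subst(extra, apply_subst(sigma2, e)) == apply_subst(sigma1, e)
           for e in (e1, e3))
print(same)  # True
```

This only checks the two terms at hand. It does not prove that σ2 is most general for every term e, which is what the definition of an mgu requires.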
(Exercise.)
3.1 Unification Algorithm
Intuitively, we can unify two atoms/terms if:
- They’re already identical, or else
- One is a variable, in which case we map it to the other atom/term[6], or else
- They are both compound terms with the same name and same number of arguments, and we can unify each of the pairs of arguments, in turn.
What does this look like as an algorithm?
Algorithm unify(t1,t2) either fails (if t1 and t2 cannot be unified) or returns a substitution σ:
- Let T = {t1 = t2}. (This is our “todo list” of pairs of atoms/terms we need to unify.)
- Let σ = {}. (This is our substitution, which we build up bit by bit as the algorithm proceeds.)
- While T ≠ {}:
  - Select and remove x = y from T.[7]
  - If x is identical to y, there’s no update needed.[8]
  - Otherwise, if x is a variable:
    - Replace x with y wherever it appears in T and σ.
    - Add x/y to σ. (The new σ value is σ ∪ {x/y}.)
  - Otherwise, if y is a variable:
    - Replace y with x wherever it appears in T and σ.
    - Add y/x to σ. (The new σ value is σ ∪ {y/x}.)
  - Otherwise, if x is a compound term p(x1,…,xn) and y is a compound term p(y1,…,yn) (where the name p must match and the number of arguments n must match):
    - Add x1 = y1, …, xn = yn to the todo list T. (The new T value is T ∪ {x1 = y1, …, xn = yn}.)
  - Otherwise, fail.
- Return σ
Notice that the algorithm maintains a single substitution throughout. The result is that Prolog gets pattern-matching even more powerful than Haskell’s: in Prolog, the same variable can appear in many different places in a pattern.
Let’s try some examples:
- unify p(A, b, C, D) and p(X, Y, Z, e)
- unify p(A, b, A, D) and p(X, X, Z, Z) (left as an exercise!)
- unify p(A, b, A, d) and p(X, X, Z, Z)
- unify n([sam, likes, prolog], L2, I, C1, C2) and n([P|R], R, P, [person(P)|C], C)
(Exercise.)
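The algorithm in this section translates fairly directly into Python. This is a sketch under assumptions of its own: variables are capitalized strings, compound terms are tuples (name, arg1, ..., argn), σ is a dict, and the todo list is a Python list processed left to right (any order would do). It returns None for failure and, like the algorithm as written so far, performs no occurs check.

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def apply_subst(sigma, t):
    """Return the instance of term t under sigma (one simultaneous pass)."""
    if is_var(t):
        return sigma.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(sigma, a) for a in t[1:])
    return t

def unify(t1, t2):
    """Return a unifier of t1 and t2 as a dict, or None if there is none."""
    todo = [(t1, t2)]                  # T: pairs still to be unified
    sigma = {}                         # the substitution built so far
    while todo:
        x, y = todo.pop(0)             # select and remove x = y from T
        if x == y:
            continue                   # identical: no update needed
        if not is_var(x) and is_var(y):
            x, y = y, x                # treat "y is a variable" like "x is"
        if is_var(x):
            step = {x: y}              # replace x with y in T and sigma ...
            todo = [(apply_subst(step, a), apply_subst(step, b))
                    for a, b in todo]
            sigma = {v: apply_subst(step, t) for v, t in sigma.items()}
            sigma[x] = y               # ... then add x/y to sigma
        elif (isinstance(x, tuple) and isinstance(y, tuple)
              and x[0] == y[0] and len(x) == len(y)):
            todo = list(zip(x[1:], y[1:])) + todo   # unify the arguments
        else:
            return None                # different names or arities: fail
    return sigma


# First example above: p(A, b, C, D) and p(X, Y, Z, e).
print(unify(("p", "A", "b", "C", "D"), ("p", "X", "Y", "Z", "e")))
# {'A': 'X', 'Y': 'b', 'C': 'Z', 'D': 'e'}

# Third example above: A, X, and Z all end up bound to b, so d = b fails.
print(unify(("p", "A", "b", "A", "d"), ("p", "X", "X", "Z", "Z")))
# None
```

Because each new mapping is substituted into the mappings already in σ, the result can be applied in a single pass.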
3.2 The Occurs Check
There is one last issue we have not addressed.
Consider a knowledge base consisting of one fact: nest(X, inner(X)).
What should happen with the following query: ?- nest(Y, Y).
What does happen, in Prolog?
Now try adding this rule unnest(inner(Z)) :- unnest(Z).
and running the query ?- nest(Y, Y), unnest(Y).
(You can find these rules in the file occurs_check.pl.)
The problem is that we allow a substitution to be cyclical: a variable can be inside the replacement for itself.
The solution is the occurs check: before accepting a new mapping into the substitution, ensure that it is not recursive itself and that it won’t introduce recursion into any of the other mappings.[9]
Prolog does not perform the occurs check by default, for efficiency.
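To see the difference the check makes, here is a Python sketch of the Section 3.1 algorithm with an optional occurs check. The encoding (capitalized strings as variables, tuples as compound terms, a dict as σ) and the occurs_check parameter are assumptions of this sketch. The example unifies the fact nest(X, inner(X)), with X renamed to X1, against the goal nest(Y, Y). For comparison, standard Prolog offers the built-in unify_with_occurs_check/2 for when the check is wanted.

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def apply_subst(sigma, t):
    """Return the instance of term t under sigma (one simultaneous pass)."""
    if is_var(t):
        return sigma.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(sigma, a) for a in t[1:])
    return t

def occurs(var, t):
    """True if variable var appears anywhere inside term t."""
    if t == var:
        return True
    return isinstance(t, tuple) and any(occurs(var, a) for a in t[1:])

def unify(t1, t2, occurs_check):
    """Unify t1 and t2; with occurs_check=True, reject cyclic mappings."""
    todo, sigma = [(t1, t2)], {}
    while todo:
        x, y = todo.pop(0)
        if x == y:
            continue
        if not is_var(x) and is_var(y):
            x, y = y, x
        if is_var(x):
            if occurs_check and occurs(x, y):
                return None            # x would appear inside its own replacement
            step = {x: y}
            todo = [(apply_subst(step, a), apply_subst(step, b)) for a, b in todo]
            sigma = {v: apply_subst(step, t) for v, t in sigma.items()}
            sigma[x] = y
        elif (isinstance(x, tuple) and isinstance(y, tuple)
              and x[0] == y[0] and len(x) == len(y)):
            todo = list(zip(x[1:], y[1:])) + todo
        else:
            return None
    return sigma


head = ("nest", "X1", ("inner", "X1"))   # the fact, with X renamed to X1
goal = ("nest", "Y", "Y")

print(unify(head, goal, occurs_check=True))
# None
print(unify(head, goal, occurs_check=False))
# {'X1': ('inner', 'Y'), 'Y': ('inner', 'Y')}
```

Without the check, Y is mapped to a term that contains Y itself, so no finite term can be an instance of it.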
4 Full Proof Examples
Let’s do some full examples of proofs.
Given the KB:
live(Y) :- connected_to(Y, Z), live(Z).
live(outside).
connected_to(w6, w5).
connected_to(w5, outside).
Here is a proof for the query live(A):
?- live(A).
yes(A) :- live(A). % A is an argument
yes(A) :- connected_to(A, Z1), live(Z1). % we rename Y and Z.
yes(w6) :- live(w5). % A = w6, Z1 = w5.
yes(w6) :- connected_to(w5, Z2), live(Z2).
yes(w6) :- live(outside).
yes(w6) :- .
So, the answer is A = w6.
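The whole proof procedure can be sketched as a small Python generator and run on this KB. As before, the encoding is an assumption of the sketch (capitalized strings as variables, tuples as compound terms, (head, body) pairs as clauses, renaming by appending _n), and unify omits the occurs check. Each value the generator yields is the head of a finished answer clause, and asking for further values corresponds to backtracking for more answers.

```python
import itertools

def is_var(t):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def apply_subst(sigma, t):
    """Return the instance of term t under sigma (one simultaneous pass)."""
    if is_var(t):
        return sigma.get(t, t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(sigma, a) for a in t[1:])
    return t

def unify(t1, t2):
    """Return a unifier of t1 and t2 as a dict, or None. No occurs check."""
    todo, sigma = [(t1, t2)], {}
    while todo:
        x, y = todo.pop(0)
        if x == y:
            continue
        if not is_var(x) and is_var(y):
            x, y = y, x
        if is_var(x):
            step = {x: y}
            todo = [(apply_subst(step, a), apply_subst(step, b)) for a, b in todo]
            sigma = {v: apply_subst(step, t) for v, t in sigma.items()}
            sigma[x] = y
        elif (isinstance(x, tuple) and isinstance(y, tuple)
              and x[0] == y[0] and len(x) == len(y)):
            todo = list(zip(x[1:], y[1:])) + todo
        else:
            return None
    return sigma

def rename(t, n):
    """Rename every variable in term t by appending _n (assumed to be fresh)."""
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return (t[0],) + tuple(rename(a, n) for a in t[1:])
    return t

def solve(answer, kb, counter):
    """Yield the head of every answer clause whose body can be emptied."""
    head, body = answer
    if not body:
        yield head                                 # proof complete
        return
    a1, rest = body[0], body[1:]                   # leftmost goal
    for c_head, c_body in kb:                      # choose, in KB order
        n = next(counter)
        theta = unify(rename(c_head, n), a1)
        if theta is None:
            continue                               # this choice fails
        new_body = [rename(g, n) for g in c_body] + rest
        new_answer = (apply_subst(theta, head),
                      [apply_subst(theta, g) for g in new_body])
        yield from solve(new_answer, kb, counter)  # later clauses = backtracking


kb = [
    (("live", "Y"), [("connected_to", "Y", "Z"), ("live", "Z")]),
    (("live", "outside"), []),
    (("connected_to", "w6", "w5"), []),
    (("connected_to", "w5", "outside"), []),
]
query = (("yes", "A"), [("live", "A")])            # yes(A) :- live(A).

answers = list(solve(query, kb, itertools.count(1)))
print(answers[0])  # ('yes', 'w6')
print(answers)     # [('yes', 'w6'), ('yes', 'w5'), ('yes', 'outside')]
```

The first answer matches the proof above. The other two are what the search finds when it returns to earlier choose steps.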
Try these.
Given the KB:
append([], L, L).
append([H | T], A, [H | R]) :- append(T, A, R).
Give a full proof for the query:
?- append([a, b, c], [1, 2, 3], L).
Given the KB:
elem(E, set(E,_,_)).
elem(V, set(E,LT,_)) :- V < E, elem(V,LT).
elem(V, set(E,_,RT)) :- E < V, elem(V,RT).
Give a full proof for the query:
?- elem(3, S), elem(8, S).
(Five Exercises. We’ll do the first and fourth together.)
[1] So, for example, q1 may actually be something like complex_atom(term1, compound_term2(X, Y), Z).
[2] So, a1 actually looks like p1 or p1(...) with various terms inside.
[3] In our assignment, we defined walking a substitution over an expression to be essentially repeatedly substituting until we stopped changing the expression (“reached a fixpoint”, as that is sometimes called). Here, we are instead doing just a single (simultaneous) pass of the substitution. That means, for example, that a substitution like {X/Y, Y/X} can swap the names of two variables. However, we’re going to carefully construct our substitutions so that never happens. Instead, with the exception of when we violate the “occurs check” (which we’ll define a little later): no variable that is on the left of any mapping will ever appear on the right of any mapping in a substitution produced by our unification algorithm.
[4] We’re asserting this and the next bullet point, not proving them true. However, you can imagine an inductive proof that follows the structure of the algorithm we give below. It disassembles atoms/compound terms into their parts and shows that at each stage, we stay as general as possible.
[5] We’re avoiding the special syntax for lists because it just confuses the issue by hiding the real compound terms being used. However, the process still works the same for our custom lists and Prolog’s built-in lists.
[6] We’re skipping something here. It will cause us trouble, and we’ll come back to it!
[7] This is “don’t-care non-determinism”. Handling the todo items in any order works. However, we’ll generally handle them in left-to-right order of their appearance in the original expressions.
[8] In an implementation, we usually do more like what we did in assignment 3. We check if these are simple terms (like constants, numbers, or strings) that are identical to each other. If they’re compound terms that are identical, then the later compound term step will discover that already.
[9] Specifically, in our algorithm, we maintain an invariant that no variable that appears on the left of a mapping in σ may also appear on the right of any mapping in σ. When we introduce a new mapping, we already know: the newly mapped variable does not appear on the left of any mapping in σ, and none of the variables on the left of mappings so far in σ can appear on the right of the new mapping. (Both of those are because we substitute out any newly added variable from T and σ prior to adding it to σ, alongside our next constraint.) We further insist that the new variable also cannot appear on the right of its mapping. If it does, we simply fail.