Binary Search Trees in Prolog

Lecture #10
Complete the associated in-class exercises.

1 Preface: BSTs (Nothing New for CPSC 312)
- 1.1 What is a BST?
2 BSTs in Prolog

1 Preface: BSTs (Nothing New for CPSC 312)

Note: We will likely skip past this section in class and use the example below to remind ourselves what a Binary Search Tree is.

Binary Search Trees (BSTs) are beautiful and profound… and not the point of CPSC 312. There’s absolutely nothing new from a 312 perspective in this short lecture. We’re just playing around with what full Prolog enables us to do, using compound terms as our data structures.

CPSC 107 and CPSC 110 discuss BSTs, and those of you who have taken CPSC 221 have certainly seen plenty of them. You’ve even already seen a Datalog-based binary tree representation in Assignment #3! (script/4 represents a binary tree of a different kind from a BST.)

We’re going to very briefly introduce BSTs and then focus on exploring some predicates that operate on BSTs.

1.1 What is a BST?

A BST is a data structure that associates keys (which must be ordered: comparable for <, equality, and >) with values (which can be anything). Depending on the flavour of BST you’re working with, they support efficient:

- lookup (given a key: is it present, and what is its value?),
- insertion (given a key and value, place it in the BST),
- deletion (given a key, remove it from the BST), and
- a variety of order-related queries (smallest, largest, all in a given range, etc.).

A BST is either:

- An empty tree (with nothing inside and no further structure), or
- A node. Each node has a key, a value, and two BSTs as its left and right children. Crucially:
  - The keys in the left subtree of a node are all less than the node’s key.
  - The keys in the right subtree of a node are all greater than the node’s key. (We assume no duplicate keys.)

We call the initial node of a non-empty BST its root.

As a result, when we’re searching for a key at a given node, only three cases can occur:

- The key matches the node’s key, and this is the key’s location in the tree.
- The key is less than the node’s key, and the key’s location is in the left subtree.
- The key is greater than the node’s key, and the key’s location is in the right subtree.
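The three cases above translate almost directly into three clauses. Here is a minimal sketch, assuming the representation that Section 2 introduces (empty and node(Key, Value, LeftSubtree, RightSubtree)), integer keys, and a Key that is already bound when the predicate is called. The name bst_lookup/3 is ours for illustration, not something from the course files.

```prolog
% bst_lookup(Tree, Key, Value) is true if Key is associated with Value
% in BST Tree. Assumes integer keys and that Key is bound at call time
% (</2 and >/2 raise an error on unbound arguments).

% Case 1: the key matches the node's key.
bst_lookup(node(Key, Value, _, _), Key, Value).

% Case 2: the key is smaller, so it can only be in the left subtree.
bst_lookup(node(NodeKey, _, Left, _), Key, Value) :-
    Key < NodeKey,
    bst_lookup(Left, Key, Value).

% Case 3: the key is larger, so it can only be in the right subtree.
bst_lookup(node(NodeKey, _, _, Right), Key, Value) :-
    Key > NodeKey,
    bst_lookup(Right, Key, Value).

% There is deliberately no clause for empty: nothing can be found there,
% so the lookup simply fails.
```

For example, the query ?- bst_lookup(node(8, a, node(3, b, empty, empty), empty), 3, V). should answer V = b, and looking up a key that is not in the tree fails.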

Also as a result, as long as each node has about as many nodes in its left subtree as in its right subtree, the number of nodes on the path from the root to any other node is at most logarithmic in the size of the tree.

And… that’s it! We have an awesome data structure for efficiently associating a key with a value. What are the keys and values? Could be anything, like:

- A Prolog variable name and its mapping under a substitution.
- A Unix process’s scheduling priority and the process identifier.
- A museum item’s catalog number and the item’s storage/display location.

2 BSTs in Prolog

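Let’s start with an example BST to work with. Since our algorithms never “touch” the values and just pass them along unchanged, we often ignore them when we draw a tree, like this:

[Figure: a BST drawn with keys only. The root is 8, with children 3 and 10. 3 has children 1 and 6; 6 has children 4 and 7. 10 has only a right child, 14, which has only a left child, 13.]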

How should Prolog represent this? Let’s use compound terms:

- An empty tree is the atom empty.
- A node is the compound term node(Key, Value, LeftSubtree, RightSubtree).
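With this representation, the ordering rules from Section 1.1 can themselves be written as a predicate. The following is a sketch only: is_bst/1 and its helpers are hypothetical names, keys are assumed to be numbers, and the atom none is our own convention for "no bound on this side".

```prolog
% is_bst(Tree) is true if Tree is built only from empty and node/4 and
% every key respects the BST ordering (no duplicates, numeric keys).
is_bst(Tree) :- is_bst(Tree, none, none).

% is_bst(Tree, Low, High): every key in Tree is strictly greater than
% Low and strictly less than High; none means "unbounded".
is_bst(empty, _, _).
is_bst(node(Key, _Value, Left, Right), Low, High) :-
    above(Key, Low),
    below(Key, High),
    is_bst(Left, Low, Key),    % left keys must also be below Key
    is_bst(Right, Key, High).  % right keys must also be above Key

above(_, none).
above(Key, Low) :- number(Low), Key > Low.

below(_, none).
below(Key, High) :- number(High), Key < High.
```

Passing the bounds down matters: checking each node only against its own children would wrongly accept a tree such as node(8, a, node(3, b, empty, node(9, c, empty, empty)), empty), where 9 sits in the left subtree of 8.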

2.1 Example BST in Prolog

The root of our example tree has a key of 8. Values weren’t specified; let’s have the nodes’ values be a, b, c, … reading across from left to right in each row, from the top row to the bottom. So, the root’s value will be a.

Our root is therefore node(8, a, RootLeft, RootRight).

This is Prolog; so, we can actually keep going in that way. Here’s the whole left path down our tree:

Tree = node(8, a, RootLeft, RootRight),
RootLeft = node(3, b, ThreeLeft, ThreeRight),
ThreeLeft = node(1, d, empty, empty).

We could also directly write everything as a single nested structure:

Tree = node(8, a, 
         node(3, b,
           node(1, d, empty, empty),
           node(6, e, 
             node(4, g, empty, empty),
             node(7, h, empty, empty))),
         node(10, c,
           empty,
           node(14, f,
             node(13, i, empty, empty),
             empty))).
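One way to sanity-check a hand-written tree like this is to list its keys with an in-order traversal (left subtree, then the node, then the right subtree); for a correct BST the list comes out sorted. A minimal sketch, where bst_keys/2 is a name we made up for illustration:

```prolog
:- use_module(library(lists)).  % for append/3

% bst_keys(Tree, Keys) is true if Keys is the list of keys in Tree in
% in-order (left subtree, node, right subtree) order.
bst_keys(empty, []).
bst_keys(node(Key, _Value, Left, Right), Keys) :-
    bst_keys(Left, LeftKeys),
    bst_keys(Right, RightKeys),
    append(LeftKeys, [Key|RightKeys], Keys).
```

With Tree bound to the nested term above, the query bst_keys(Tree, Keys) should answer Keys = [1, 3, 4, 6, 7, 8, 10, 13, 14].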

2.2 A Prolog Aside

It might be nice if our code above defined a constant Tree that we could use over and over. That’s what a similar-looking definition does in Haskell and many other languages.

In Prolog, however, every variable is local to its own clause. If we use Tree anywhere else, it’s simply a new universally quantified variable in that clause.

Instead, we can sort of define constants by wrapping the term in a fact, like this:

example_tree1(
  node(8, a, 
    node(3, b,
      node(1, d, empty, empty),
      node(6, e, 
        node(4, g, empty, empty),
        node(7, h, empty, empty))),
    node(10, c,
      empty,
      node(14, f,
        node(13, i, empty, empty),
        empty)))
).

% Now I can use that:
example_root_key1(RKey) :- example_tree1(node(RKey, _, _, _)).

% Or maybe better:

% root_key(Tree, Key) is true if Key is the key at the root of
% BST Tree.
root_key(node(Key, _, _, _), Key).

example_root_key1_v2(RKey) :- 
  example_tree1(Tree), root_key(Tree, RKey).
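The same pattern works for the order-related queries mentioned in Section 1.1. As a sketch (bst_min_key/2 is a hypothetical name, not part of the starter code), the smallest key is found by following left children until there are none left:

```prolog
% bst_min_key(Tree, MinKey) is true if MinKey is the smallest key in the
% non-empty BST Tree. It fails on the empty tree, which has no keys.

% No left child: this node holds the smallest key.
bst_min_key(node(Key, _, empty, _), Key).

% Otherwise, the smallest key is the smallest key of the left subtree.
bst_min_key(node(_, _, node(K, V, L, R), _), MinKey) :-
    bst_min_key(node(K, V, L, R), MinKey).
```

Once example_tree1/1 is loaded, the query ?- example_tree1(T), bst_min_key(T, K). should answer K = 1, since the leftmost path in our example runs 8, 3, 1.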

2.3 Live-Coding BSTs

Let’s work together on building some functionality for a BST. We’ll start with bst_starter.pl and put our working code in bst.pl.

(Three Exercises.)