Prolog Learnings: Recursion, Unification and Backtracking

Unification makes Prolog the Tenet of programming languages.

list_length/2 counts the number of elements in a list, where the first argument is a list, and the second argument is the number of elements in the list.

It is good Prolog practice to try and encode the intent, order and type of arguments in the functor name. So "list_length" defines what the predicate does and that the first argument should be a list and the second the length of the list.

Using a single code path, it should be possible to;

  1. Return the number of elements in a list, using the magic of recursion.

  2. Validate that the number of elements in the list equals a specified C, using the magic of Unification.

  3. Return a new list containing the number of elements specified in C, using the magic of backtracking.

  4. If a list and count are unspecified, then return values for list and count where list_length evaluates to true, using all of the magic fairy dust.

This is called multiway reasoning[1] and allows the same predicate to be used in many ways.

list_length/2 example in Prolog
1
2
3
4
5
6
7
8
9
%% list_length/2 counts the elements in a list.

%% The base case terminating recursion.
list_length([], 0).

%% Recurse through the list, counting each element.
list_length([_|Rest], C) :-
    list_length(Rest, Count),
    C is Count + 1.

Some explanation of the code above is probably needed.

  • The base case must always come before the recursive call because Prolog (and Elixir) evaluate procedures in order of appearance in the source file. So on each call to list_length/2 Prolog will first attempt to pattern match to the base case, and if this fails, call the recusive case.

  • The notation [First|Rest] pattern matches on a list, assigning the first element in a list to First and the remaining list elements to Rest. Here, "_" is effectively used to throw away the first item of the list. If you are used to Elixir then the Prolog list notation and the use of "\_" to ignore a parameter will be very familiar.

  • There is a difference between = and is in Prolog.

    • is assigns the result of a calculation,

    • = is used for Unification. As we will see later on in length/2 implemented using "=", list_length/2 can be implemented using = instead of is.

  • Due to the immutability of variables, Prolog does not support a common notation such as C is C + 1, or count++, which is why the intermediate Count variable is needed. Because once a variable is assigned a value, that value must remain unchanged throughout the variable scope. So in the case of C is C + 1, Prolog cannot assign the C on the left of the is a different value to whatever C is already assigned on the right side of the is. This does take some getting used to.

list_length/2 Use Cases

We can see all our defined use cases in just the base case.

list_length/2 base case
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
%% The base case terminating recursion.
list_length([], 0).

?- list_length([], C). % (1)
C = 0

?- list_length([], 0). % (2)
true.

?- list_length(L, 0). % (3)
L = [].

?- list_length(L, C). % (4)
L = [],
C = 0.
1 An empty list [] matches the first argument of the functor definition, so the second argument C is returned with a value of 0.
2 An empty list [] and a list count 0 matches the first and second arguments, so the predicate evaulates to true. Any other values and the predicate will evaluate to false.
3 A list count 0 matches the second argument, so an empty list [] is returned.
4 Prolog can only evaluate the base case to true with an empty list [] and a list count of 0. Anything else will evaluate to false.

Now that the base case is defined to handle all possbile use cases, we can define the main recursive procedure.

list_length/2 recusive procedure
1
2
3
4
%% Recurse through the list, counting each element.
list_length([_|Rest], C) :-
    list_length(Rest, Count),
    C is Count + 1.

Use Case 1: Return the number of elements in a list

For the first use case of returning the number of elements in a list, list_length works as follows;

list_length/2 stack trace
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
?- list_length([1,2,3], C).   | list_length([1,2,3], C).   <--------->   C = 3.
                              |
% Stack frame #1              |
list_length([_|Rest], C) :-   | list_length(_|[2,3], _4012) <------->    list_length(_|[2,3], 3)
    list_length(Rest, Count), |   list_length([2,3], _4486),               list_length([2,3], 2),
    C is Count + 1.           |   _4012 is _4486 + 1.                      3 is 2 + 1.
                              |
% Stack frame #2              |
list_length([_|Rest], C) :-   |   list_length(_|[3], _4804)  <----->   list_length(_|[3], 2)
    list_length(Rest, Count), |     list_length([3], _4530),             list_length([3], 1),
    C is Count + 1.           |     _4804 is _4530 + 1.                  2 is 1 + 1.
                              |
% Stack frame #3              |
list_length([_|Rest], C) :-   |     list_length(_|[], _4666)  <--->  list_length(_|[], 1)
    list_length(Rest, Count), |       list_length([], _4574),          list_length([], 0),
    C is Count + 1.           |       _4666 is _4574 + 1.              1 is 0 + 1.
                              |
% Stack frame #4              |
list_length([], 0).           |       list_length([], 0).      <->  list_length([], 0).
  1. Throw away the first element of the list, then

  2. Call list_length with the remaining list, passing in a new variable Count.

  3. Return to (1) until the list is empty, matching [] to the base case.

  4. Return 0 from the base case, because this is the number of elements in [], the empty list.

  5. Returning from list_count, Count will have a value 0,

  6. Add 1 to Count, and assign this to C.

  7. Return the value of C,

  8. Returning from list_count, Count now has the value of C,

  9. Add 1 to Count, and assign this to C.

  10. Return to (7), and repeat until we have climbed back up the stack.

  11. C will now have the total number of elements of the list.

Use Case 2: Validate that elements in a list equal specified count

The second use case validates that the number of elements in the list matches the specified count. The important take-away here is that Unification allows the same code and code path to handle both use cases 1 and 2.

list_length/2 stack trace
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
?- list_length([1,2,3], 3).   | list_length([1,2,3], 3).   <--------->   true.
                              |
% Stack frame #1              |
list_length([_|Rest], C) :-   | list_length(_|[2,3], 3)     <------->    list_length(_|[2,3], 3)  % (1)
    list_length(Rest, Count), |   list_length([2,3], _4486),               list_length([2,3], 2),
    C is Count + 1.           |   3 is _4486 + 1.                          3 is 2 + 1.            % (2)
                              |
% Stack frame #2              |
list_length([_|Rest], C) :-   |   list_length(_|[3], _4804)  <----->   list_length(_|[3], 2)
    list_length(Rest, Count), |     list_length([3], _4530),             list_length([3], 1),
    C is Count + 1.           |     _4804 is _4530 + 1.                  2 is 1 + 1.
                              |
% Stack frame #3              |
list_length([_|Rest], C) :-   |     list_length(_|[], _4666)  <--->  list_length(_|[], 1)
    list_length(Rest, Count), |       list_length([], _4574),          list_length([], 0),
    C is Count + 1.           |       _4666 is _4574 + 1.              1 is 0 + 1.
                              |
% Stack frame #4              |
list_length([], 0).           |       list_length([], 0).      <->  list_length([], 0).
1 3 is passed as the second argument so the anonymous variable _4012 is not created.
2 Prolog unifies the clause 3 is 2 + 1 to true.

Prolog uses unification in <2> above (instead of simple assignment) to attempt to make the left side of the is the same as the right side. In use case 1, prolog unifies _4012 is 2 + 1 by assigning 3 to _4012. In use case 2, Prolog attempts to unify 3 is 2 + 1, which is valid Prolog syntax. In this case 3 is the same as 2 + 1 and the clause evaluates to true. If 5 was passed in as the second argument, then Prolog would attempt to unify 5 is 2+1, which it cannot and the clause would evaluate to false.

Use Case 3: Return a new list with the number of elements in count - TODO

The third use case returns a new list wth the number of elements specified in count. The important take-away here is that Backtracking allows the same code and code path can handle use cases 1, 2 and 3.

list_length/2 stack trace
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
?- list_length(L, 3).         | list_length(L, 3).   <--------->   L = [_2352, _3425, _4523].
                              |
% Stack frame #1              |
list_length([_|Rest], C) :-   | list_length(_|[2,3], _4012) <------->    list_length(_|[2,3], 3)
    list_length(Rest, Count), |   list_length([2,3], _4486),               list_length([2,3], 2),
    C is Count + 1.           |   _4012 is _4486 + 1.                      3 is 2 + 1.
                              |
% Stack frame #2              |
list_length([_|Rest], C) :-   |   list_length(_|[3], _4804)  <----->   list_length(_|[3], 2)
    list_length(Rest, Count), |     list_length([3], _4530),             list_length([3], 1),
    C is Count + 1.           |     _4804 is _4530 + 1.                  2 is 1 + 1.
                              |
% Stack frame #3              |
list_length([_|Rest], C) :-   |     list_length(_|[], _4666)  <--->  list_length(_|[], 1)
    list_length(Rest, Count), |       list_length([], _4574),          list_length([], 0),
    C is Count + 1.           |       _4666 is _4574 + 1.              1 is 0 + 1.
                              |
% Stack frame #4              |
list_length([], 0).           |       list_length([], 0).      <->  list_length([], 0).

length/2 implemented using "="

Ivan Bratco in his book[2] presents an alternate solution to length/2 using = instead of is.

Lets start by only replacing is with =.

list_length_1/2 using "="
1
2
3
4
list_length_1([], 0).
list_length_1([_|Rest], C) :-
    list_length_1(Rest, Count)
    C = Count + 1.

Which when evaluated returns…​

list_length_1/2 using "=", evaluated
1
2
?- list_length_1([1,2,3], C).
C = 0+1+1+1,

OK, what’s going on here? Remember that = does not evaluate any of its terms, so each recursive call ads another +1 to C. So 0, then 0+1, then 0+1+1, and finally 0+1+1+1. To return the length as an integer, we need to evaluate the result of C using is, like so…​.

list_length_1/2 using "=" then "is"
1
2
?- list_length_1([1,2,3], C), Length is C.
Length = 3.

Remember that unlike is where all arguments must already have been instantiated to numbers, = does not evaluate its terms. So list_length_1/2 can be rewritten with C = Count + 1, appearing before list_length_1(Rest,Count)..

list_length_1/2 rewritten
1
2
3
list_length_1([_|Rest], C) :-
    C = Count + 1,
    list_length_1(Rest, Count).

Yet another Prolog idiosyncrasy — postponed variable bindings; when a variable is assigned a value all instances of that variable, within scope, assume that value, even those instances that occur in the code prior to the assignment. So Count is unassigned in the clause C = Count + 1. But because = does not evaluate its arguments, Count acts as a placeholder and binding is postponed until such time as a value is assigned. Prolog is like the Tenet[3] of programming languages.

As = is implicit in predicate arguments (remember that Prolog does not evaluate arguments), list_length_1/2 can be refactored to remove the explicit = as follows.

list_length_1/2 refactored
1
2
list_length_1([_|Rest], 1 + L) :-
    list_length_1(Rest, L).

Which results in the correct answer.

list_length_1/2 refactored and evaluated
1
2
?- list_length_1([1,2,3], C), Length is C.
Length = 3.

Using is or = in list_length_1/2 is, I suppose, a matter of taste. Although I personally think using is makes the intent explicit in the code. From a "Big O" point of view, using = results in the creation of a new list which equates to O(n) in additional memory needed.