Tabling The Truth

2011-07-04 19:38 +0000

Boolean logic is a programmer's daily bread. In fact, it might be one of the few concepts that indeed every programmer has to master no matter what programming paradigms or languages she prefers. And I think it's safe to assume that with a little practice everyone feels confident about constructing boolean expressions correctly. The primary use case of boolean expressions is, of course, controlling program flow. Consider this simple JavaScript if statement:

if (energy == 0 && credits == 0) {
  gameOver();
} else {
  retryLevel();
}

No big surprises there, it almost reads like natural language: if energy is equal to zero and credits are also¹ equal to zero, the game is over, else retry.

But is it really correct? Shouldn't the game just continue instead of retrying the level when energy is greater than zero? Let's fix this:

if (energy == 0 && credits == 0) {
  gameOver();
} else if (energy == 0) {
  retryLevel();
}

Ugh, that's getting ugly. In it's actually a pretty simple case. Imagine more variables and relations involved:

if ((a > 10 && b < 50) || c == 20) {
  do a
} else if (a > 10) {
  do b
} else if (b < 50 && c == 20) {
  do c
} else {
  do d
}

Pretty hard to track what's going on here. Can you tell when the second else if branch is actually executed? I'm sure you have come across conditionals like this yourself before. Now imagine being tasked to change something, like introduce a new variable to the mix. You will probably not get it right the first time around. Hopefully you have some unit tests around that cover all possible cases - and even then you can't be sure!

Needless to say, smart people from over 50 years ago already found a nice solution to handle this kind of complexity. Enter the world of decision table languages! Decision tables are basically Wittgenstein's truth tables employed for control flow and they are very easy to grasp compared to long if/else chains. And apparently this has been a major programming language paradigm during the 60s and 70s of the 20th century.² For some reason it fell into obscurity in the following time until it was rediscovered and brushed up around 2004 by Jonathan Edwards under the banner of the subtext programming language. If you haven't already, watch this excellent demonstration of it.

Now let's explore this idea a little closer. The above example expressed as a decision table would look something like this:

a > 10 b < 50 c == 20 action
true true true do a
true true false do a
false true true do a
true false true do a
true false false do b
false true true do c
false false false do d
false true false do d
false false true do d

Granted, this is a bit more verbose but it's very easy to tell what action is taken under which conditions. For example, it's much clearer now when the else branch is reached as all conditions are explicitly listed. Another thing that is now apparent is that action c will never be executed as its branch can't be reached because the first condition overlaps with it. So that's a plus: the decision table helps finding mistakes.

Parenthesize

Now, with classic decision table languages being unavailable and with Subtext living in its graphical programming environment³ I couldn't help but try porting that idea to Scheme. One simple approach would be to use matchable like this:

(match (list (> a 10) (< b 50) (= c 20))
  ((#t #t #t) (do-a))
  ((#t #t #f) (do-a))
  ((#f #t #t) (do-a))
  ((#t #f #t) (do-a))
  ((#t #f #f) (do-b))
  ((#f #t #t) (do-c))
  ((#f #f #f) (do-d))
  ((#f #t #f) (do-d))
  ((#f #f #t) (do-d)))

That's not so bad. But note that we still have the overlap there. Of course, we can now spot it much more easily but why stop there? Why not define a macro that checks our decision table for overlaps at expansion time? First of all, let's write down what we'd like the macro to be used like. This is generally a good first step when creating macros since basically you are creating new syntax here (thus define-syntax instead of define-macro in recent Schemes), and syntax is best judged from the user's point of view. I went with the name decide:

(decide ((> a 10) (< b 50) (= c 20))
  (#t #t _  (do-a))
  (#f #t #t (do-a))
  (#t #f #t (do-a))
  (#t #f #f (do-b))
  (#f #t #t (do-c))
  (#f _  #f (do-d))
  (#f #f #t (do-d)))

I use the underscore operator in there to mean "true or false" also known as the "don't care" operator. This reduces repetition a little without hampering readability too much I think. The repetition of common actions is something I'd like to get rid of, too, but I couldn't come up with a good solution for that so far. It's probably a good idea to use named functions for actions anyway, possibly by wrapping the decide in a let.

Let's start with a simple expansion into a cond so as not to depend on match. Readability is not really an issue for macro expanded code so we don't mind losing the tabular code structure. Here's an implementation using Chicken Scheme's recently added implicit renaming macros:

(begin-for-syntax
 (import chicken)
 (use srfi-1))

(define-for-syntax (expand-conds cmp vars conds)
  (map (lambda (b)
         (if (cdr b)
             (car b)
             `(not ,(car b))))
       (remove (lambda (b) (cmp '_ (cdr b)))
               (map cons vars conds))))

(define-syntax decide
  (ir-macro-transformer
   (lambda (x i c)
     (let* ((exprs (second x))
            (cols (length exprs))
            (conds (map (lambda (row)
                          (take row cols))
                        (cddr x)))
            (actions (map (lambda (row)
                            (drop row cols))
                          (cddr x)))
            (vars (list-tabulate 
                   cols
                   (lambda (i) 
                     (string->symbol (format "e~A" i))))))

       `(let ,(zip vars exprs)
          (cond . ,(map (lambda (conds action)
                          `((and . ,(expand-conds c vars conds))
                            . ,action))
                        conds actions)))))))

So the example above would expand into this code:

(let ((e0 (> a 10))
      (e1 (< b 50))
      (e2 (= c 20)))
  (cond
   ((and e0 e1) (do-a))
   ((and (not e0) e1 e2) (do-a))
   ((and e0 (not e1) e2) (do-a))
   ((and e0 (not e1) (not e2)) (do-b))
   ((and (not e0) e1 e2) (do-c))
   ((and (not e0) (not e2)) (do-d))
   ((and (not e0) (not e1) e2) (do-d))))

Cool!

Turning A Table Into A Tree

Next up is checking the decision table for overlaps. One way this can be done is by turning the table into a tree with each row becoming a branch of the tree. An overlap then means a collision of leaf nodes. So let's define an auxiliary function that takes the decision table as its argument and raises an error in case an overlap is found. We also need to pass in the comparison function to be able to handle the underscore operator:

(define-for-syntax (check-for-overlaps! cmp table)
  ;; traverse rows, eventually returning the corresponding tree
  (fold (lambda (row tree)
          ;; traverse the current row's columns,
          ;; eventually returning the updated tree
          (let loop ((tree tree) (r row))
            (cond ;; end of the row, just return the tree
                  ((null? r) tree)
                  ;; turn underscore into both #t and #f nodes
                  ;; and recur on both cases
                  ((cmp '_ (car r))
                   (loop (loop tree (cons #t (cdr r)))
                         (cons #f (cdr r))))
                  ;; if a node for this case exists already
                  ((alist-ref (car r) tree) =>
                   (lambda (node)
                     ;; and it is a leaf node
                     (if (null? node)
                         ;; we've found an overlap
                         (error 'decide 
                                "overlapping row"
                                (strip-syntax row))
                         ;; otherwise, recur
                         (alist-update! (car r)
                                        (loop node (cdr r))
                                        tree))))
                  ;; otherwise this branch does not exist, yet,
                  ;; so just add the node and recur
                  (else (alist-cons (car r)
                                    (loop '() (cdr r))
                                    tree)))))
        '()
        table))

We would call this function in the let* body right before returning the macro expansion like this:

(check-for-overlaps! c conds)

Executing the example expression now would signal an error:

Error: (decide) during expansion of (decide ...) - 
overlapping row: (#f #t #t)

Note that the overlap check is run at macro expansion time. This means that if you compile the example code the check will only be run once at compile time and refuse to compile in case an overlap is detected.

We Can Do Even Better

Detecting overlaps is nice and can save us a lot of head scratching already. However, there is another error we might run into. Consider this example:

(decide ((equal? answer "yes") (> input 10))
  (#t #t (handle input))
  (#t #f (retry 'input-too-low))
  (#f #f (retry 'must-confirm)))

Here we have a gap: the case (#f #t) is not covered. It could be trivially fxied by changing the last row's conditions to (#f _), of course, but it still might slip our attention. Luckily, gaps can also easily be detected.

So far we have constructed the tree representation of the decision table merely for the side-effect of checking for overlaps. However, we can also use it to check for gaps. To understand how, just take a look at the tree for the above example:

((#t 
  (#t)
  (#f))
 (#f 
  (#f)))

As is quite apparent, a gap is nothing but a branch with less than two leaf nodes. check-for-gaps! accepts the (expected) depth of the decision tree and the tree itself as arguments:

(define-for-syntax (check-for-gaps! depth tree)
  (let loop ((tree tree) (branch '()))
    (case (length tree)
      ;; if there are no nodes we are at a leaf node
      ;; but if we are not at the expected
      ;; depth we have found a gap
      ((0) (unless (= depth (length branch))
             ;; so raise an error giving the two missing
             ;; conditions as a hint
             (error 'decide 
                    "gaps"
                    (append (cons #t branch)
                            (cons #f branch)))))
      ;; a tree with just one node must have a gap
      ;; so raise an error giving the opposite of the existing
      ;; node as a hint
      ((1) (error 'decide
                  "gap"
                  (cons (not (caar tree)) branch)))
      ;; otherwise, recur on each node
      ((2) (for-each (lambda (b)
                       (loop (cdr b) (cons (car b) branch)))
                     tree)))))
                     

Now we can extend our check call like this (the number of columns corresponds to the expected depth of the tree):

(check-for-gaps! cols (check-for-overlaps! c conds))

Executing the example we should now get:

Error: during expansion of (decide ...) - gap: (#t #f)

Excellent, this is already somewhat usable. But I don't really like the code anymore, the check procedures stick out with them being all side-effecty. Also, we traverse the whole expression quite often now that we first turn it into a tree, then check for gaps and finally transform the table into the cond. There must be a way to simplify that. How about expanding into something more tree-like so that we can construct the macro expansion at the same time as we check the tree for gaps? Nested if you say? Good idea! How about this:

(define-for-syntax (table->tree cmp depth rows)
  (fold (lambda (row tree)
          (let loop ((tree tree) (r row) (d depth))
            (cond ((zero? d) r)
                  ((cmp '_ (car r))
                   (loop (loop tree (cons #t (cdr r)) d)
                         (cons #f (cdr r))
                         d))
                  ((alist-ref (car r) tree) =>
                   (lambda (node)
                     (if (= d 1)
                         (error 'decide 
                                "overlapping row"
                                (strip-syntax row))
                         (alist-update! (car r)
                                        (loop node
                                              (cdr r) 
                                              (- d 1))
                                        tree))))
                  (else (alist-cons (car r) 
                                    (loop '() (cdr r) (- d 1))
                                    tree)))))
        '()
        rows))

(define-syntax decide
  (ir-macro-transformer
   (lambda (x i c)
     (let* ((exprs (second x))
            (cols (length exprs))
            (rows (cddr x))
            (vars (list-tabulate 
                   cols
                   (lambda (i) 
                     (string->symbol (format "e~A" i))))))

       `(let ,(zip vars exprs)
          ,(let loop ((tree (table->tree c cols rows))
                      (vars vars)
                      (branch '()))
             (cond ((not tree)
                    (error 'decide "gap" branch))
                   ((null? vars)
                    `(begin . ,tree))
                   (else 
                    `(if ,(car vars)
                         . ,(map (lambda (b)
                                   (loop (alist-ref b tree)
                                         (cdr vars)
                                         (cons b branch)))
                                 '(#t #f)))))))))))

This is arguably more elegant. The former check-for-overlaps! is now called table->tree, emphasizing the transformation and relegating the overlap check to being a mere exceptional case. Also, it simplifies the gap check code a bit.

Decide For Yourself

Phew, that was quite a ride. I bet there are a number of things that could be improved or added. How about an optional else clause that fills all gaps? Feel free to hack it as you please, the code can be forked at gitorious.

I hope you enjoyed this trip to the earlier days of the computing age and the evolution of the decide macro!


1
I recently learned that in VB.NET the short-circuiting version of And is actually called AndAlso.
2
Unfortunately I was unable to find any code examples of those old languages. If you have some or even are proficient in some language of that era, drop me a line!
3
I think Edwards' appeal for the programming world to move into the second dimension makes a lot of sense. However, it just isn't in me, I will probably remain a textual person, manipulating characters of strings--although I'd argue that with paredit it's actually a lot more than that.