Vectors & Symbols

In the previous post I described how to implement a small macro called select. The macro implements a tiny DSL for selecting items from a list of key/value pairs.

I also wrote that while select in and of itself is not very useful, the overall approach for making a DSL is quite powerful. So let's use the same technique to do something more interesting.

My vector library (cl-veq) has multiple utilities for doing (among other things) vector mathematics. A core component of cl-veq is a macro called vv.

I outlined the idea and motivation behind vv in an earlier post. You probably want to look at both these earlier posts before reading on. In any case, all these posts are written so it should be possible to follow even if you have no experience with Common Lisp (CL).

Mechanical plotter drawing, 1/1 d9559cd, 2017.

What are we Doing this Time?

We will extend the approach used in the select macro to implement a working version of a part of vv from cl-veq. Specifically, we will make this syntax work for doing vector operations with arbitrary functions on packs of values:

(2!@+ (2!@* 1.0 2.0 3.0 4.0)
      (2!@/ 5.0 6.0 7.0 8.0))

Which can be written like this in vanilla CL:

(values (+ (* 1.0 3.0) (/ 5.0 7.0))  ;; (values 3.7142859 8.75)
        (+ (* 2.0 4.0) (/ 6.0 8.0)))

It should work for vectors of dimension 1-9. And as you see we put the dimension before the trigger in the symbol, and the function name after the trigger. So e.g. this should work too:

(3!@- 1 2 3 4 5 6) ;; (values -3 -3 -3)

We will see that we don't require much more code than we needed for select. And the result is pretty easy to extend with other triggers (!@). The result can be seen in this gist.

Mechanical plotter drawing, 1/1 8f41fd6, 2018.

Values, values, values

First we need some utilities to handle values. cl-veq has a lot of sugar coating to make it more convenient to handle value packs. I will only introduce a few here, but enough to give you an idea of how it can be done.

The first one makes sure you get all the values from one or more forms:

(defmacro ~ (&rest rest)
  `(multiple-value-call #'values ,@rest))

(~ (values 1 2))          ;; (~ 1 2)
(~ (~ 1 2))               ;; (~ 1 2) ; surprise!
(~ (~ 1 2 (~ 3)) (~ 4 5)) ;; (~ 1 2 3 4 5) ; and so on

Note that I am using ~ to mean a value pack, as well as a syntax that will coerce one or more values into a single values.
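The reason ~ is built on multiple-value-call is that an ordinary function call only keeps the first value of each argument form. A small illustration in plain CL, without any of the macros from this post:

```lisp
; an ordinary call keeps only the first value of each argument:
(list (values 1 2) (values 3 4))                        ;; (1 3)

; multiple-value-call passes every value along:
(multiple-value-call #'list (values 1 2) (values 3 4))  ;; (1 2 3 4)
```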

Sometimes you want all values in a list instead. In which case we have the following macro:

(defmacro lst (&body body)
  `(multiple-value-call #'list (~ ,@body)))

(lst (~ 1 2) (~ 3 4 (~ 5 6))) ; (1 2 3 4 5 6)

cl-veq contains a good deal more sugar coating. But to keep this relatively simple we will only introduce one more macro, which is handy for debugging. You can wrap it around any combination of values to print them, while still returning the same values.

(defmacro vpr (&body body)
  (let ((res (gensym)))
    `(let ((,res (lst ,@body)))
       (format t "~&;> ~{~a~^ | ~}~&;; ~{~a~^ | ~}~&"
               ',body ,res)
       (apply #'values ,res))))

(vpr (~ 1 2) (~ 3 4)) ;; (~ 1 2 3 4)

We see that functionally it behaves just like ~, but it also prints the following two lines:

;> (~ 1 2) | (~ 3 4)
;; 1 | 2 | 3 | 4

Since the notation we are implementing has an explicit dimension, it would make sense to also have a specific way to ensure that something is exactly n values, e.g. 3~ or similar. There are several ways to achieve this. We won't implement it here, but it can be done by extending the approach we are about to implement.
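For reference, here is a minimal sketch of one simple way to do such a check. Note the assumptions: ensure-dim is a hypothetical name, it takes the dimension as an explicit argument rather than parsing it from a symbol like 3~, and it is not how cl-veq does it.

```lisp
; hypothetical helper, not part of cl-veq:
; return all values of body, but warn unless there are exactly n of them
(defmacro ensure-dim (n &body body)
  (let ((res (gensym "RES")))
    `(let ((,res (multiple-value-call #'list ,@body)))
       (unless (= (length ,res) ,n)
         (warn "expected ~a values. got: ~a" ,n (length ,res)))
       (values-list ,res))))

(ensure-dim 3 (values 1 2) 3) ;; (values 1 2 3)
(ensure-dim 2 (values 1 2) 3) ;; (values 1 2 3), and a warning
```

The sketch uses multiple-value-call directly, so it does not depend on the other macros in this post.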

Mechanical plotter drawing, 1/1 0ba65d8, 2018.

Parsing the new Syntax

We already have startswith? as we defined in the previous post. However, this time we will be using something slightly more general:

(defun match-substr (sub str)
  (loop with lc = (length sub)
        for i from 0 repeat (1+ (- (length str) lc))
        if (string= sub str :start2 i :end2 (+ i lc))
        do (return-from match-substr i)))

match-substr will return the first index where sub matches str, otherwise it returns nil. That means we can find our triggers like this:

(match-substr "!@" "2!@fx")  ;; 1
(match-substr "!@" "abc!@+") ;; 3
(match-substr "!@" "abc!")   ;; nil

This time we need to extract the dimension (prefix) from the trigger symbols in addition to the function name (postfix). Assuming that sym complies with our syntax, the following is sufficient:

(defun split-vv-trigger* (sym trig)
  (values (digit-char-p (char sym 0))
          (symb (subseq sym (+ (length trig)
                               (match-substr trig sym))))))

(split-vv-trigger* "3!@+" "!@") ;; (~ 3 +)

digit-char-p returns the digit (as a number) if the character is a digit, and nil otherwise. So if the prefix is not a digit, we will get an error later, when we need the dimension to be a number.

Debugging macros can be really tricky, so we will check whether the dimension is actually a number instead:

(defun split-vv-trigger (sym trig)
  (let ((d (digit-char-p (char sym 0))))
    (unless d (warn "~a wants digit prefix. got: ~a" trig sym))
    (values d (symb (subseq sym (+ (length trig)
                                   (match-substr trig sym)))))))

You can probably spot several other things we could improve in this code, but this should give you an idea of how you can get better error messages in your own code. Let's try this out:

(split-vv-trigger* "3!@+" "!@") ;; (~ 3 +) ; as before
(split-vv-trigger  "a!@+" "!@") ;; (~ nil +)
; WARN: !@ wants digit prefix. got: a!@+

Note that warn won't stop execution (by default). So you might still get another error later, but now you have an indication of what went wrong. I won't go into error handling further here. But know that you can also use error, if you actually want to interrupt execution.
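As a rough illustration of the difference, here is a stripped-down, hypothetical variant that only parses the dimension and uses error instead of warn. The name parse-dim is made up for this example:

```lisp
; hypothetical variant: signal an error instead of a warning
(defun parse-dim (sym trig)
  (let ((d (digit-char-p (char sym 0))))
    (unless d (error "~a wants digit prefix. got: ~a" trig sym))
    d))

(parse-dim "3!@+" "!@") ;; 3
(parse-dim "a!@+" "!@") ;; signals an error; nothing is returned
```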

Just like in select, we need to know whether a given s-expression contains our trigger. The only difference this time around is that we explicitly check whether the first object in the s-expression is an actual symbol. To see why, try removing the call to symbolp. Here is the function:

(defun has-vv-trigger? (body trig)
  (and (listp body)
       (symbolp (car body))
       (match-substr trig (mkstr (car body)))))

(has-vv-trigger? '((2!@+ 1 2)) "!@") ;; nil
(has-vv-trigger? '(2!@+ 1 2)   "!@") ;; 1 (the match index, i.e. true)
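If you don't want to run the experiment yourself, the problem can be shown with standard functions only. In this sketch princ-to-string stands in for mkstr and search for match-substr; that substitution is my assumption, made to keep the example self-contained:

```lisp
; a symbol that contains the trigger:
(search "!@" (princ-to-string '2!@+))        ;; 1

; a whole list prints as "(2!@+ 1 2)", which also contains the trigger:
(search "!@" (princ-to-string '(2!@+ 1 2)))  ;; 2, a false positive
```

So without the symbolp check, a list whose first element is itself a triggered expression would be mistaken for a trigger one level too early.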

Mechanical plotter drawing, 1/1 fd148f1, 2018.

Putting it Together

Now that we have all the pieces, we can define the functions to traverse code and compile it. It is very similar to do-trigger and rec that we used for the select macro. But obviously the code we generate is quite different:

; create a list with n new symbols
(defun nsym (n name)
  (loop repeat n collect (gensym name)))

; compile !@ triggers
(defun vv-do-trigger (body)
  (multiple-value-bind (dim fx) (split-vv-trigger
                                  (mkstr (car body)) "!@")
      (let ((args (nsym (* 2 dim) (mkstr "VAR" fx))))
        `(multiple-value-bind ,args
           (~ ,@(vv-rec (cdr body)))
           (values
             ; make the actual function calls:
             ,@(loop for a in args              ; first dim symbs
                     for b in (subseq args dim) ; last dim symbs
                     collect `(,fx ,a ,b)))))))

; recursively process code with !@ triggers:
(defun vv-rec (body)
  (cond ((atom body) body)
        ((has-vv-trigger? body "!@")
           (vv-do-trigger body))
        (t `(,(vv-rec (car body))
             ,@(vv-rec (cdr body))))))

Again, the actual macro is almost disappointingly trivial. But this time we use progn, which is another special operator (like quote). It is often used exactly the way we use it here: to accept and evaluate any number of forms, then return the result of the last one. We have already used implicit progn in multiple places. One example is the body of let, which behaves exactly the same way. Here is the final macro:

(defmacro vv (&body body)
  `(progn ,@(vv-rec body)))
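It is worth noting that progn passes along all the values of its last form, not just the first one. That is what lets vv return a value pack. A quick check in plain CL:

```lisp
(progn 1 2 3)                                ;; 3
(progn 1 (values 2 3))                       ;; (values 2 3)
(multiple-value-list (progn 1 (values 2 3))) ;; (2 3)
```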

So let's test it with the example syntax we started with:

(vv (2!@+ (2!@* 1.0 2.0 3.0 4.0)   ;; (~ 3.7142859 8.75)
          (2!@/ 5.0 6.0 7.0 8.0)))

Which is the result we wanted. The expanded code looks like this:

(macroexpand-1
  '(vv (2!@+ (2!@* 1.0 2.0 3.0 4.0)
             (2!@/ 5.0 6.0 7.0 8.0))))
;; (PROGN
;;   (MULTIPLE-VALUE-BIND (#:VAR+93 #:VAR+94 #:VAR+95 #:VAR+96)
;;     (~ (MULTIPLE-VALUE-BIND (#:VAR*97 #:VAR*98 #:VAR*99 #:VAR*100)
;;          (~ 1.0 2.0 3.0 4.0)
;;          (VALUES (* #:VAR*97 #:VAR*99)
;;                  (* #:VAR*98 #:VAR*100)))
;;        (MULTIPLE-VALUE-BIND (#:VAR/101 #:VAR/102 #:VAR/103 #:VAR/104)
;;           (~ 5.0 6.0 7.0 8.0)
;;           (VALUES (/ #:VAR/101 #:VAR/103)
;;                   (/ #:VAR/102 #:VAR/104))))
;;    (VALUES (+ #:VAR+93 #:VAR+95)
;;            (+ #:VAR+94 #:VAR+96))))

We see that the expansion is a great deal more complicated than the expansions for select. This is basically why we want a DSL like this in the first place; nesting expressions with value packs quickly gets verbose, but the DSL hides this away quite efficiently.

If you want to extend this implementation of vv, you can pretty easily insert checks for more triggers into the cond in vv-rec.

This isn't necessarily the most efficient approach, but that frequently does not matter much for macros, as they are expanded at compile time (once), rather than at runtime. If you do run into issues with slow compilation you can always go back and optimize macros later.

Mechanical plotter drawing, 1/1 4c80cca, 2017.

Conclusion

I think it took me about two months to implement the full version of vv. It has several more modes/triggers and also deals with arrays of point vectors. You can read about how it works in the documentation for cl-veq.

Part of the reason for the development time is that this tends to be how I work on larger macros: I have an idea of what I want, but I also need to use the DSL in practice over time to figure out whether my initial ideas were reasonable, both in terms of syntax and readability and in terms of functionality.

Your experience may vary, but this is the approach I would recommend for developing these kinds of tools. It is also where CL really shines, as it allows you to mold the language you are using while you are using it. Giving you an additional dimension to approach your problem from.

We have not said anything about the #' ("sharp quote") notation. It's a bit confusing, but there are actually two namespaces in CL. One for variables and one for functions, roughly speaking. Sometimes you need to refer specifically to a symbol in the function namespace (e.g. list). In which case the sharp quote reader macro comes in handy. This is also why you will sometimes see the #' in front of lambda. Opinions vary on what is the "best" convention. I tend to only use it when I have to. And I make macros to avoid it even then. In fact, the full implementation of vv has m@ and f@ for this specific issue. They translate to multiple-value-call with and without #' respectively.
The condition handling in CL is actually pretty sophisticated. And I don't know it very well, so I won't be commenting on it further here.
format is an interesting DSL, you can read more about it here.