2012-09-28

I Write Funny-Lookin' Racket Code: An Alternate Style for Delimiters and Indentation

A lot of people are quite surprised when they see the Racket code I write. Let's say I needed a function to render hash table values as expressions that would produce the same value. A "normal" Racketeer might write something like the following.

(define (hash->expr ht)
  (cons 'hash
        (for/fold ([args '()])
                  ([(k v) (in-hash ht)])
          (cons (list 'quote k)
                (cons (list 'quote v)
                      args)))))
There might be a few variations in style, especially depending on whether one has Racket or Emacs set up to indent for/fold specially. Almost no one, however, would come up with the code I write.
(define (hash->expr ht)
  (cons 'hash
    (for/fold
        {[args '()]}
        {[{k v} (in-hash ht)]}
      (cons (list 'quote k)
        (cons (list 'quote v)
          args)))))
The biggest reaction I get is from the presence of {curly braces}, but those are largely incidental as far as I'm concerned. It's all about the indentation to me.
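
Incidentally, Racket's default reader treats [brackets] and {braces} exactly like (parentheses), so the two versions above are the same program. The sketch below checks that and round-trips a small table through hash->expr. The example table and the use of eval with make-base-namespace are assumptions added for illustration, not part of the original function.

```racket
#lang racket

;; The brace-style definition from above, unchanged.
(define (hash->expr ht)
  (cons 'hash
    (for/fold
        {[args '()]}
        {[{k v} (in-hash ht)]}
      (cons (list 'quote k)
        (cons (list 'quote v)
          args)))))

;; Quoted data does not record which delimiter shape was used.
(equal? '{a [b]} '(a (b)))                                 ; #t

;; A one-entry table has only one possible argument order.
(hash->expr (hash 'a 1))                                   ; '(hash 'a '1)

;; Evaluating the generated expression rebuilds an equal table.
;; (Illustrative table; iteration order does not affect the result.)
(define ht (hash 'a 1 "b" '(2 3)))
(equal? (eval (hash->expr ht) (make-base-namespace)) ht)   ; #t
```

With more than one entry, the order of the generated arguments depends on the table's iteration order, which is why the last check compares the rebuilt table rather than the expression itself.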



A while back I found that my .emacs file was growing in proportion to all the Racket code I had ever written. Every new macro in my code or in the latest Racket version needed a line like:
(put 'for/fold 'scheme-indent-function 2)
This would tell Emacs more or less how I wanted it to indent the given form, so long as I followed the usage patterns Emacs could cope with. For instance, with for/fold, Emacs could cope with both of the "special" arguments on the same line as the macro name, or both on separate lines. Any other arrangement got weird results.

Also, function arguments would lead to significant rightward creep in my indentation. Adding up the lengths of a list of strings, for instance, might look like this:
(foldl +
       0
       (map string-length
            (list "There"
                  "are"
                  "thirty-four"
                  "characters"
                  "here.")))
This wastes a lot of space on the left, and to me it doesn't do enough for readability to justify it. I don't need my eyes drawn to 0 and + nearly that much.

I discovered a new style of indentation in the {Little, Seasoned, Reasoned} Schemer series of books by Dan Friedman and his many cohorts. These books always start a new indentation level at a fixed distance in from the previous one, regardless of the cause for the indentation. Arguments on the same line as the function or macro name are ignored; they do not "push" indentation over to the right at all.

This indentation style has a lot of appeal to me for a number of reasons. One, it wastes no space on the left. Two, it never needs to "know" what a given macro means. It doesn't matter if you're applying + or lambda or for/fold, all lines beyond the first move two (or however many) characters to the right. I saw a light at the end of the tunnel: no more .emacs customization for every new form!

This style leaves two issues. One, how to indent cond? The Little books treat cond differently, indenting each clause only as far as the keyword cond, while other forms' arguments are pushed in slightly farther than the function or macro name. Two, how to "fix" forms like for/fold where a few lines really ought to be indented differently? A straight-up interpretation of this style would generate code like this:
(for/fold
  ([x 0])
  ([str (in-list '("12" "characters"))])
  (define n (string-length str))
  (+ x n))
Now we can't tell visually where the for/fold iteration clauses leave off and the loop body definitions and expressions begin.

The cond issue is easy enough to resolve. In Racket, unlike in vanilla Scheme, we use [brackets] around cond clauses. The same goes for a number of other repeated clauses, in fact: let, match, syntax-parse, and so forth. So I decided my new, custom indentation style would indent [brackets] differently from (parentheses). Parens indent one farther than brackets. That way,
(let ([x 1]
      [y 2])
  (+ x y))
doesn't become
(let ([x 1]
       [y 2])
  (+ x y))
Since I already use [brackets] every time I have a repeated, non-expression clause, this rule does exactly what I need it to do.

Once I had differentiated [] from (), resolving the for/fold issue was obvious. I needed a new indentation rule and a new lexical marker: {braces}. Now every time I have a fixed number of special non-expression forms in a macro, I wrap them in braces. Anything in braces is indented slightly farther (four spaces rather than two) than ordinary sub-expressions. So my for/fold example comes out like this.
(for/fold
    {[x 0]}
    {[str (in-list '("12" "characters"))]}
  (define n (string-length str))
  (+ x n))
Suddenly it's quite clear which parts are "special" in the for/fold macro.

So now I write code using (parentheses) for definitions, expressions, and anything else resembling a nestable, expandable term (e.g. match patterns, syntax templates), [brackets] for repeated, non-expandable clauses (e.g. cond clauses, let bindings), and {braces} for non-repeated, non-expandable forms (e.g. lambda formals, groups of let bindings). And I don't bother to align function arguments; I tend to treat the most significant argument as an "accumulator", and put everything else on one line if I can.
(foldl + 0
  (map string-length
    (list
      "There"
      "are"
      "thirty-four"
      "characters"
      "here.")))
The way I read this code, the first line tells us we are performing a summation; the second line tells us we want the length of each string; the third line tells us we have a list coming; and the rest give its contents. The result "accumulates" from a list to its lengths to their sum as the indentation cascades out and up from the inside.
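
To make that reading concrete, here is a small sketch that names each stage of the cascade. The identifiers words and lengths are introduced only for illustration; the one-expression version above computes the same sum.

```racket
#lang racket

;; Innermost stage: the list itself.
(define words
  (list "There" "are" "thirty-four" "characters" "here."))

;; Middle stage: the length of each string.
(define lengths (map string-length words))
lengths                ; '(5 3 11 10 5)

;; Outermost stage: the sum of the lengths.
(foldl + 0 lengths)    ; 34
```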

With these three rules, I now write my Racket code without bothering to customize my .emacs file as I go. I just use delimiters judiciously to tell Emacs how I want everything indented, and everything comes out pretty much how I want it.
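
As a rough sketch of how the three rules assign columns, the function below maps the delimiter that opens a continuation line to its offset from the enclosing form's open delimiter. The name child-column is hypothetical, and the offsets (one column for brackets, two for parentheses and atoms, four for braces) are inferred from the examples in this post. This is not the code in the Emacs mode, which may well handle more cases.

```racket
#lang racket

;; Hypothetical helper: open-col is the column of the enclosing form's open
;; delimiter; delim is the first character of the line being indented.
(define (child-column open-col delim)
  (+ open-col
    (case delim
     [(#\[) 1]     ; repeated clauses sit just inside the enclosing form
     [(#\{) 4]     ; fixed "special" forms are pushed in farther
     [else 2])))   ; parenthesized terms and atoms get the ordinary step

(child-column 0 #\{)   ; 4, as for {[x 0]} under (for/fold
(child-column 0 #\()   ; 2, as for (+ x n) under (for/fold
(child-column 5 #\[)   ; 6, as for [y 2] in the let example
```

The point of the sketch is that the column depends only on the delimiters, never on the name of the form being indented.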

For anyone interested in installing this indentation mode or looking at its source code, I've put the file up on GitHub at:
https://github.com/carl-eastlund/simple-sexp
To use it, just put it somewhere your Emacs knows to look for Elisp code and add (require 'simple-sexp) to your .emacs file.

Addendum: Oh, and there's some structured s-expression editing code in that file as well. It preserves matched parens, brackets, braces, and quotes (for strings). It's probably a much inferior implementation of things like paredit; this code represents the flailings of an Elisp novice. Use at your own peril.