Recently at RacketCon,
I gave a talk about the new Generics library that ships
with Racket 5.3. In this blog post, I’ll offer a slightly expanded commentary
about the new library. For the ten-minute video explanation, see the
Youtube video. The
accompanying slides are available
here.
Introduction
Probably the first question that comes to your mind is: what are generics and
what do we mean by generic programming? Let me illustrate with
some example code:
> (vector-ref #(1 2 3) 2) |
3 |
> (list-ref '(1 2 3) 2) |
3 |
Both of the lines above operate on sequence-like datatypes, but for each of the
datatypes we use different functions:
vector-ref and
list-ref. This is also the case with other datatypes like
dictionary-like datatypes:
hash-ref,
assoc, etc. Or all the
different kinds of equality:
string=?,
boolean=?, and
=. These specialized operations may seem redundant. Ideally, we’d have
generic functions that don’t care about the specific datatypes that we
operate over.
Thankfully, Racket does provide these. For dictionaries, we have functions like
dict-ref and
dict-set that operate over any kind of
dictionary-like type. For sequences,
sequence-ref,
sequence-append, and so on. These generic interfaces are all built-in
to the standard library. You might wonder, however, if you can define your own
generic functions.
As of version 5.3, you just need to
(require racket/generic) to get all
the tools you need to define your own generic interface. In the rest of the
article I’ll show you how to define your own generic interface and how to use
it. If you’ve seen Rust’s
traits or
Clojure’s
protocols, our design
will feel familiar.
Examples
The running example will be the implementation of a simple queue
interface. Our queue will contain five operations:
queue-enqueue,
queue-dequeue,
queue-head,
queue-empty?, and
queue-length.
The first thing to do is to require the library:
> (require racket/generic) |
|
Then we use
define-generics to define a generic interface:
> (define-generics queue |
[queue-enqueue queue elem] |
[queue-dequeue queue] |
[queue-head queue] |
[queue-empty? queue] |
[queue-length queue]) |
|
|
The method headers above define the methods that concrete implementations of
the generic interface need to provide. Each header needs to contain at least
one argument that matches the name of the interface; this argument will be used
for dispatch. The form defines several identifiers:
queue?,
gen:queue,
queue/c, and each of the generic functions. The
first is a predicate that checks whether a given value implements the
interface. The identifier prefixed with
gen: is a binding that’s used
to supply methods for the interface. The
queue/c specifies a contract
combinator for the interface, which I’ll describe later in the article.
We now have the generic functions, but they’re not very useful without concrete
implementations. To implement a generic interface, you first need to define a
structure type. We’ll define a simple functional
queue (
Okasaki 1998, p.42) that uses two lists:
> (struct simple-queue (front back) |
#:methods gen:queue |
[; helper function to balance lists |
(define (check-front queue) |
(match queue |
[(simple-queue '() back) |
(simple-queue (reverse back) '())] |
[_ queue])) |
; enqueue an element |
(define (queue-enqueue queue elem) |
(match queue |
[(simple-queue front back) |
(check-front (simple-queue front (cons elem back)))])) |
; dequeue an element |
(define (queue-dequeue queue) |
(match queue |
[(simple-queue (cons x xs) back) |
(check-front (simple-queue xs back))])) |
; get the head of the queue |
(define (queue-head queue) |
(match queue |
[(simple-queue (cons x xs) back) x])) |
; check if the queue is empty |
(define (queue-empty? queue) |
(empty? (simple-queue-front queue))) |
; get the queue's length |
(define (queue-length queue) |
(+ (length (simple-queue-front queue)) |
(length (simple-queue-back queue))))]) |
|
|
> (define empty-queue |
(simple-queue '() '())) |
|
|
Using the
#:methods keyword and the
gen:queue binding, we can
specify the methods that a
simple-queue implements. Note that a
#:methods block may also contain helper functions (like
check-front) and definitions that are used to define the methods.
Each method has the same method header as the corresponding headers in the
interface definition.
We can check that our new queue actually works with the generic functions:
> (queue-head (queue-enqueue empty-queue 5)) |
5 |
> (queue-empty? empty-queue) |
#t |
> (queue-length (queue-enqueue (queue-enqueue empty-queue 7) 5)) |
2 |
It works! For any structure type, we can define methods in the same way. For
example, we can define an efficient persistent queue (
Okasaki 1998, p.64)
that implements the same methods. This time, the implementation will use
lazy evaluation with streams:
> (struct persistent-queue (front-len front back-len back) |
#:methods gen:queue |
[; helper function to balance lists |
(define (check queue) |
(match queue |
[(persistent-queue front-len front back-len back) |
(if (<= back-len front-len) |
queue |
(persistent-queue |
(+ front-len back-len) |
(stream-append front (stream-reverse back)) |
0 stream-null))])) |
; enqueue an element |
(define (queue-enqueue queue elem) |
(match queue |
[(persistent-queue front-len front back-len back) |
(check (persistent-queue |
front-len front |
(+ 1 back-len) (stream-cons elem back)))])) |
; dequeue an element |
(define (queue-dequeue queue) |
(match queue |
[(persistent-queue front-len front back-len back) |
(check (persistent-queue |
(- front-len 1) (stream-rest front) |
back-len back))])) |
; get the head of the queue |
(define (queue-head queue) |
(match queue |
[(persistent-queue front-len front back-len back) |
(stream-first front)])) |
; check if the queue is empty |
(define (queue-empty? queue) |
(= 0 (persistent-queue-front-len queue))) |
; get the queue's length |
(define (queue-length queue) |
(+ (persistent-queue-front-len queue) |
(persistent-queue-back-len queue)))]) |
|
|
> (define empty-persistent-queue |
(persistent-queue 0 stream-null 0 stream-null)) |
|
|
Our operations from before work as expected:
> (queue-head (queue-enqueue empty-persistent-queue 5)) |
5 |
> (queue-empty? empty-persistent-queue) |
#t |
> (queue-length (queue-enqueue (queue-enqueue empty-persistent-queue 7) 5)) |
2 |
Contracts
Earlier, I mentioned that the generic interface also comes with a contract
form that’s automatically defined. You can use these to attach dynamic checks
to your implementations.
For example, we can write a contract that restricts our queues to only accept
integers as data elements:
> (define int-queue/c |
(recursive-contract |
(queue/c [queue-enqueue (-> int-queue/c integer? int-queue/c)] |
[queue-dequeue (-> int-queue/c int-queue/c)] |
[queue-head (-> int-queue/c integer?)] |
[queue-empty? (-> int-queue/c boolean?)] |
[queue-length (-> int-queue/c integer?)]))) |
|
|
For the queue interface, the automatically defined
queue/c combinator
allows us to specify contracts on each of the methods in the interface. We also
use a recursive contract here just so that we can reference the
int-queue/c contract within itself.
We can apply the contract to a particular queue:
> (define/contract checked-queue |
int-queue/c |
empty-queue) |
|
|
> (queue-enqueue checked-queue 42) |
#<simple-queue> |
> (queue-enqueue checked-queue "not an integer") |
checked-queue: contract violation |
expected: integer? |
given: "not an integer" |
in: the 2nd argument of |
the queue-enqueue method of |
(recursive-contract |
(queue/c |
(queue-enqueue |
(-> int-queue/c integer? int-queue/c)) |
(queue-dequeue |
(-> int-queue/c int-queue/c)) |
(queue-head (-> int-queue/c integer?)) |
(queue-empty? (-> int-queue/c boolean?)) |
(queue-length (-> int-queue/c integer?)))) |
contract from: (definition checked-queue) |
blaming: top-level |
at: eval:15.0 |
The second use of
queue-enqueue causes a contract error as expected,
since we can’t add a string to an integer queue. You can also provide a
constructor for your integer queue that’s contracted to produce
int-queue/cs. Any queues created with that constructor will be checked
for integers.
Also, you might have noticed that the queues we wrote above don’t protect
against dequeueing or taking the head of empty queues. To prevent this, we can
write contracts that ensure these operations raise contract errors on empty
queues. Since we want to enforce this for
all queues instead of just
some of them, we apply contracts to the generic functions:
> (define non-empty-queue/c |
(flat-named-contract |
'non-empty-queue |
(λ (q) (and (queue? q) (not (queue-empty? q)))))) |
|
|
> (define/contract (checked-dequeue queue) |
(-> non-empty-queue/c queue?) |
(queue-dequeue queue)) |
|
|
> (define/contract (checked-head queue) |
(-> non-empty-queue/c any/c) |
(queue-head queue)) |
|
|
> (checked-head empty-persistent-queue) |
checked-head: contract violation |
expected: non-empty-queue |
given: #<persistent-queue> |
in: the 1st argument of |
(-> non-empty-queue any/c) |
contract from: (function checked-head) |
blaming: top-level |
at: eval:20.0 |
The
checked-head function raises a contract error as expected instead
of an exception from a stream function. In a real implementation, you would
just export the original generic functions with contracts attached using
contract-out instead of defining checked versions like we did here.
Summary
Racket 5.3 has made the process of defining and using generic interfaces much
easier. The new library is still under active development and we plan to
experiment with additional features and performance improvements. The full code
from this article can be found in the following gist:
https://gist.github.com/3995200
Bibliography
Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, 1998. |
1 comments:
Interesting stuff. These days I find data + interfaces all I really need to in terms of "object orientation".
Given all this, is it possible to define new sequence methods in a similar way?
Post a Comment