Reader Macros

I’m working on a Lisp program for a friend right now that will import old blog entries from a giant XML file. The file is about 60MB, mostly containing images in base64 format. I’m using CL-XMLS to parse the XML file and for some reason it takes a *long* time (15+ minutes) to parse a large file. It could be that the library was never meant to parse files that big. So working with this data file is a real pain.

Disclaimer: I’m going to have to say at this point that I’ve only been coding Lisp for a year, maybe a little more, so I may get some of this wrong. Corrections are welcome!

Edit: See Levi’s comments and corrections below as I am a bit mixed up on terminology.

Common Lisp has a handy feature that makes working with large amounts of external data easier: the #. reader macro, which performs read-time evaluation. It lets you embed the result of a Lisp expression directly in your code. The expression can do anything, like loading an image file or, in my case, parsing a very large XML file. This moves the cost of parsing the XML from run time to the moment the source file is read for compilation. Once I’ve compiled the source file, the data is stored in binary form and loads quickly. As long as neither the XML data nor the source file changes, I never have to parse it again.

Here are the details. I have a file called blog.lisp that contains the code to parse the XML file and the #. form that embeds the result into the code:

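(in-package :erik-blog)

;; Define PARSE-BLOG at compile time as well as load time, so the
;; #. form below can call it while this file is being compiled.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun parse-blog ()
    ;; The XML path is resolved against whatever directory
    ;; UTIL:SOURCE-FILE-DIRECTORY returns.
    (with-open-file (in (merge-pathnames #p"erik-blog.xml" (util:source-file-directory)))
      (xmls:parse in))))

;; #. calls PARSE-BLOG at read time; the parsed tree becomes a quoted literal.
(defvar *blog-data* '#.(parse-blog))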

In Common Lisp the syntax for read-time evaluation is #.(some lisp expression). The last line above embeds the result of the PARSE-BLOG function into the code, which then becomes the initial value of *BLOG-DATA*. PARSE-BLOG is wrapped in EVAL-WHEN so that it is already defined when the reader reaches the #. form during compilation. When blog.lisp is compiled, it produces a blog.fasl file, which is like an object file in C except that it can be loaded dynamically. The blog.fasl file is large because it contains the entire parsed blog data in binary form. It takes about a second to load the FASL file into my running Lisp image, but that is nothing compared to the 15+ minutes it would take to parse the XML from scratch.
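
To see the mechanism without a 60MB XML file, here is a minimal sketch of #. on its own. The variable names *SQUARES* and *SUM* are made up for illustration, and the example assumes *READ-EVAL* has its default value of T.

```lisp
;; #. evaluates the form that follows it while the reader is reading,
;; and the resulting object takes the form's place in the source.
;; By the time the compiler sees this, it is a quoted list of five numbers.
(defparameter *squares*
  '#.(loop for i from 1 to 5 collect (* i i)))

;; The same thing seen through READ-FROM-STRING: the reader hands back
;; the value of (+ 1 2), not the list (+ 1 2).
(defparameter *sum* (read-from-string "#.(+ 1 2)"))
```

Since the LOOP runs when the file is read, the compiled file only has to store the finished list. That is the same trade the blog data makes, just on a much smaller scale.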

This technique can be used to embed any type of data asset into an application. I read somewhere online about a game company (Naughty Dog) that compiled graphics, sounds and other assets into their Lisp game. Very cool stuff!
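
As a rough sketch of what embedding a binary asset could look like, the helper below reads a whole file into a vector of bytes. READ-FILE-BYTES, *LOGO-PNG* and the logo.png path are all hypothetical names chosen for this example; they are not part of my blog code.

```lisp
;; Hypothetical helper: read an entire file into a vector of octets.
;; Wrapped in EVAL-WHEN so a #. form later in the same file can call it
;; while the file is being compiled.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun read-file-bytes (path)
    (with-open-file (in path :element-type '(unsigned-byte 8))
      (let ((bytes (make-array (file-length in)
                               :element-type '(unsigned-byte 8))))
        (read-sequence bytes in)
        bytes))))

;; Usage, left commented out because logo.png is a made-up asset path:
;; (defvar *logo-png* '#.(read-file-bytes #p"logo.png"))
```

One thing to keep in mind is that data embedded this way is a literal object, so it should be treated as read-only; copy it first if the program needs to modify it.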

5 Responses to “Reader Macros”

  1. びっくり Says:

    This sounds cool. Not just for technical reasons, but also because I am looking forward to recovering the data from the original blog. If you can figure out the encoding on my Japanese text maybe that will make another interesting post. Thanks again for the help.

  2. Levi Says:

    Actually, what you’re describing here is not defining reader macros. It’s known as read-time evaluation, though #. itself is a reader macro. One of the fun things about Lisp is that you can control exactly when different parts of your code are evaluated, whether it be compile-toplevel, load-toplevel, or execute time. The eval-when special form gives you even more control over evaluation time than the #. reader macro does.

    Reader macros themselves are bits of code that allow you to change the syntax that the reader understands. They work by modifying the readtable of the lisp reader. You can dispatch your own read-time functions based on a character or pair of characters. This has been used to create mini-languages with non-sexp syntax embedded within Lisp programs.

  3. anthonyf Says:

    Levi,

    Thanks for your corrections. I’m still learning this stuff and I’m finding that there is a lot of power in Lisp that I have not yet discovered. Having control over when evaluation is done is something that I have not seen in any other language. Very cool.

    Anthony

  4. donlindsay Says:

    That’s a really interesting blog post, glad I came across it. Thanks, and keep it up!

  5. Lauri Says:

    Maybe you should use CXML package instead of XMLS – it should work considerably faster. See http://article.gmane.org/gmane.lisp.lispworks.general/5878 for more details.
