August 30, 2006
I’ve been crazy busy at work, so I’ve been slacking on my posts. Lots of cool things to write about but not enough time! On a positive note, I’ve been doing more utility code in Lisp at work. At work I call Lisp !C (not C) to avoid any political issues that could arise. When asked “What’s !C?”, I answer “It’s like C, only better! It’s all the rage! Everyone is doing it!”.
This past weekend I worked on a program for work that fragments the hell out of a hard disk. Why, you ask? It’s a tool to test the performance of our product under extreme conditions. It fragments the disk by creating files of random size until it reaches a free space threshold, then deletes random files which it previously created. After a few deletes, it creates more randomly sized files. Rinse, repeat. If you let the program run for a while on a disk, it produces files with 200+ fragments.
The first working version of the code was horribly slow. It took an *extremely* long time to fill up a disk with files before it got to the delete stage. The file writes were taking way too long to complete. I got to talking to my coworker about this problem and he suggested an old hack which I had never heard of. Instead of writing data to the files, use fseek() to seek to the position where the desired EOF should be and then write a single byte there. This technique can be used to create huge files in a matter of milliseconds. He explained that this hack can be used to “see” the data in the free space on a disk. How evil is that?
Changing my code to use the new technique made it orders of magnitude faster. But it wasn’t long before we discovered a bug in the new system. I was able to allocate huge files on the system (1GB+), but doing so did not decrease the amount of free space on the disk. After throwing out a few wild and crazy theories about it, my coworker and I finally discovered what was going on. The file system was not reserving space for blocks that were never written to, so for a 1GB file only one block was getting reserved. The rest of the file must stay in the “free block pool” until data is written to it.
To fix the problem I changed the code to write out a single byte per block, thus allocating the entire block without the overhead of filling the entire block with data. This solution is not quite as fast, but it is still way more efficient than the original one, and it works!
The program was written in CLISP, which is a very cool Lisp for doing shell scripting and quick one-off utilities. Here’s the code for the function that eats the disk space:
(defun eat-space (file-name size)
  "Create a bogus file that chews up disk space. Size is in bytes."
  (let ((file (linux:fopen file-name "w")))
    ;; write a byte every 4k so a block gets allocated
    (loop for x from 0 to (1- size) by 4096
          do (progn (linux:fseek file x 0)
                    (linux:fputc 1 file)))
    (linux:fclose file)))
One other thing to note for any Lispers who may be reading this: the code above uses the LINUX package in CLISP, not the standard CL file IO functions. The reason I cannot use the standard functions is that FILE-POSITION throws an error when attempting to seek past the end of a file. So in other words, the seek hack does not work in standard Common Lisp (or at least CLISP’s version of it).
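A quick way to see exactly which offsets the loop in EAT-SPACE touches is to collect them instead of seeking to them. This is only a sketch: TOUCH-OFFSETS and RESULTING-LENGTH are made-up helper names, and the 4096 default is the same assumed block size used above.

```lisp
;; Hypothetical helpers for illustration; not part of the original program.
(defun touch-offsets (size &optional (block-size 4096))
  "Return the offsets the EAT-SPACE loop would seek to for SIZE bytes."
  (loop for x from 0 to (1- size) by block-size
        collect x))

(defun resulting-length (size &optional (block-size 4096))
  "Return the file length after the last one-byte write (0 if none)."
  (let ((offsets (touch-offsets size block-size)))
    (if offsets
        (1+ (first (last offsets)))
        0)))

(touch-offsets 10000)    ; => (0 4096 8192)
(resulting-length 10000) ; => 8193
```

So the file ends one byte past the last offset written, which can come out a bit shorter than the requested size. For a tool that only cares about how many blocks get allocated, that probably doesn't matter.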
Posted by anthonyf
August 8, 2006
I’m working on a Lisp program for a friend right now that will import old blog entries from a giant XML file. The file is about 60MB, mostly containing images in base64 format. I’m using CL-XMLS to parse the XML file and for some reason it takes a *long* time (15+ minutes) to parse a large file. It could be that the library was never meant to parse files that big. So working with this data file is a real pain.
Disclaimer: I’m going to have to say at this point that I’ve only been coding Lisp for a year, maybe a little more, so I may get some of this wrong. Corrections are welcome!
Edit: See Levi’s comments and corrections below as I am a bit mixed up on terminology.
Common Lisp has a handy feature that makes working with large amounts of external data easier: read-time evaluation, written with the #. reader macro. It lets you embed the result of a Lisp expression into your code while the file is being read. The expression can be anything, like loading an image file or, in my case, a very large XML file. This pushes the time it takes to load the XML file to compile time, rather than run time. Once I’ve compiled the source file, everything is fast binary data at that point. As long as the XML data and the source file don’t change, I never have to parse the XML again.
Here are the details. I have a file called blog.lisp that contains the code to load the data from the XML file and the reader macro to embed the results into the code:
(in-package :erik-blog)
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun parse-blog ()
    (with-open-file (in (merge-pathnames #p"erik-blog.xml" (util:source-file-directory)))
      (xmls:parse in))))
(defvar *blog-data* '#.(parse-blog))
In Common Lisp, #.(some lisp expression) is the reader syntax for read-time evaluation: the reader evaluates the expression and puts its result in place of the #. form. The last line above embeds the result of the PARSE-BLOG function into the code, which then gets assigned to *BLOG-DATA*. When blog.lisp is compiled, it produces a blog.fasl file, which is like a .obj file in C except it can be dynamically loaded. The blog.fasl file that is produced is large because it contains the entire binary representation of the blog data. It takes about a second to load the FASL file into my running Lisp image, but that is nothing compared to the 15+ minutes it would take to load it from scratch.
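To try the #. syntax on its own, without a 60MB XML file, here is a tiny sketch. BUILD-SQUARES, *SQUARES* and READ-SAFELY are names made up for illustration. The one thing worth knowing is that #. only works while *READ-EVAL* is true (the default); when it is NIL the reader signals a READER-ERROR instead.

```lisp
;; Illustrative names only; nothing here comes from blog.lisp.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun build-squares ()
    "Stand-in for an expensive computation such as parsing a big file."
    (loop for i from 0 below 5 collect (* i i))))

;; The reader calls BUILD-SQUARES and splices the resulting list in here,
;; so a compiled file holds the literal data rather than the call.
(defvar *squares* '#.(build-squares))   ; *squares* => (0 1 4 9 16)

(defun read-safely (string)
  "Read one form from STRING with read-time evaluation turned off."
  (let ((*read-eval* nil))
    (handler-case (read-from-string string)
      (reader-error () :read-eval-disabled))))

(read-safely "(1 2 3)")     ; => (1 2 3)
(read-safely "#.(+ 1 2)")   ; => :READ-EVAL-DISABLED
```

Binding *READ-EVAL* to NIL like this is the usual precaution when reading data you didn't write yourself, since #. will happily run whatever expression it is given.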
This technique can be used to embed any type of data asset into an application. I read somewhere on-line about a game company (Naughty Dog) that compiled graphics, sounds and other assets into their Lisp game. Very cool stuff!
August 1, 2006
I was accused of cheating today. I’m all riled up so I can’t possibly sleep, which is what I should be doing right now. Grr!
I’m taking a web publishing class as part of my degree program at a local college. The teacher mentioned to us on several occasions that we were not allowed to use WYSIWYG editors for the class. He wants us to hand type all the HTML. That’s fine with me, in fact, I hate WYSIWYG editors. I figured I would not be violating this rule if I were to use Emacs (my editor of choice) which is text based and provides absolutely no WYSIWYG functionality at all. I also assumed it would be OK to write the HTML using compact lisp symbolic expressions to save my poor hands from RSI. In case you’ve never seen HTML written this way, here’s an example:
(:html
  (:head (:title "Title of the web page"))
  (:body (:h1 "Hello World!")))
Writing HTML this way saves a lot of typing and avoids the “angle bracket tax”. After writing the HTML using s-exprs I run it through a translator that spits out normal HTML, which is what I turn in to the teacher.
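The translator itself isn't shown here, but a minimal sketch of the idea fits in a few lines. RENDER-HTML and HTML-STRING are hypothetical names, and this version assumes every list has the shape (:tag child ...) with no attributes, and it does no escaping of < or & in the text.

```lisp
;; A sketch of an s-expr to HTML translator, not the one used for the class.
(defun render-html (form &optional (stream *standard-output*))
  "Write FORM as HTML to STREAM. A string is text; a list is (:tag child ...)."
  (cond ((stringp form)
         (write-string form stream))
        ((consp form)
         (let ((tag (string-downcase (symbol-name (first form)))))
           (format stream "<~a>" tag)
           (dolist (child (rest form))
             (render-html child stream))
           (format stream "</~a>" tag)))
        (t (error "Don't know how to render ~s" form))))

(defun html-string (form)
  "Return the HTML for FORM as a string."
  (with-output-to-string (out)
    (render-html form out)))

(html-string '(:html
               (:head (:title "Title of the web page"))
               (:body (:h1 "Hello World!"))))
;; => "<html><head><title>Title of the web page</title></head><body><h1>Hello World!</h1></body></html>"
```

A real translator would also want attributes, escaping and some indentation in the output, but the recursive shape stays the same.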
Well, today I got 2 emails from my teacher. The first one said I did an excellent job with my assignment and he even gave me some extra credit points. In the next email, which came a few hours later, he retracted my grade, stating that I must have used a WYSIWYG editor to do the assignment. My HTML was too advanced and well formatted to be hand written. He then said he would give me a break this one time and let me redo my assignment by hand like the rest of the students in the class.
His response to my assignment made me very mad. I spent a lot of time working on that assignment and to be called a cheater without first allowing me to explain myself really ticks me off. So after cooling down a little bit I sent him a nice email explaining exactly how I did the assignment and included the s-expr source file. I also explained why I did it that way, mentioning RSI and the AB Tax. Hopefully the issue will be resolved in the morning and I can get my excellent grade with extra credit back!
Alright, I’m done venting, maybe I can get some sleep now!