I’ve been crazy busy at work, so I’ve been slacking on my posts. Lots of cool things to write about but not enough time! On a positive note, I’ve been doing more utility code in Lisp at work. At work I call Lisp !C (not C) to avoid any political issues that could arise. When asked “What’s !C?”, I answer “It’s like C, only better! It’s all the rage! Everyone is doing it!”.
This past weekend I worked on a program for work that fragments the hell out of a hard disk. Why, you ask? It’s a tool to test the performance of our product under extreme conditions. It fragments the disk by creating files of random size until it reaches a free space threshold, then deletes random files which it previously created. After a few deletes, it creates more randomly sized files. Rinse, repeat. If you let the program run for a while on a disk, it produces files with 200+ fragments.
The first working version of the code was horribly slow. It took an *extremely* long time to fill up a disk with files before it got to the delete stage. The file writes were taking way too long to complete. I got to talking to my coworker about this problem and he suggested an old hack which I had never heard of. Instead of writing data to the files, use fseek() to seek to the position where the desired EOF should be and then write a single byte to it. This technique can be used to create huge files in a matter of milliseconds. He explained that this hack can be used to “see” the data in the free space on a disk. How evil is that?
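For readers who have not seen the trick either, here is a minimal C sketch of it. This is not my coworker's code: the function name `make_file_by_seek` is made up for illustration, and it assumes the target size fits in a `long`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the original tool): create a file whose
 * logical length is `size` bytes by seeking to the offset of the last
 * byte and writing only that byte.
 * Returns 0 on success, -1 on failure. */
int make_file_by_seek(const char *path, long size)
{
    assert(path != NULL);
    if (size <= 0)
        return -1;

    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    /* Jump straight to where the final byte should live, then write it.
     * Nothing is written to the bytes that were skipped over. */
    if (fseek(f, size - 1, SEEK_SET) != 0 || fputc(0, f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

Calling `make_file_by_seek("big.bin", 1024L * 1024L * 1024L)` should give a file that reports a length of 1GB, even though only one byte was actually written.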
Changing my code to use the new technique made it orders of magnitude faster. But it wasn’t long before we discovered a bug in the new system. I was able to allocate huge files on the system (1GB+) but it would not decrease the amount of free space on the disk. After throwing out a few wild and crazy theories about it, my coworker and I finally discovered what was going on. The file system was not reserving space for blocks that were not written to. So for a 1GB file only one block was getting reserved. The rest of the file must be in the “free block pool” until data is written to it.
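One way to see this for yourself is to compare a file's reported length with the space actually allocated to it. The sketch below is an illustration, not part of the original tool: `file_usage` is a made-up name, and it assumes Linux, where `st_blocks` from `stat()` is counted in 512-byte units.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: report a file's logical length and the number of
 * bytes the file system has actually allocated for it.
 * Assumption: st_blocks is in 512-byte units, as it is on Linux.
 * Returns 0 on success, -1 if the file cannot be stat'ed. */
int file_usage(const char *path, long long *logical, long long *allocated)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;

    *logical = (long long)st.st_size;            /* length the file claims */
    *allocated = (long long)st.st_blocks * 512;  /* space actually reserved */
    return 0;
}
```

If `allocated` comes back far smaller than `logical` for one of the seek-created files, the file system has left most of the file unallocated, which matches what we saw with the free space count.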
To fix the problem I changed the code to write out a single byte per block, thus allocating the entire block without the overhead of filling the entire block with data. This solution is not quite as fast but it is still way more efficient than the original one, and it works!
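In C terms, the fix looks roughly like the sketch below. It is only an illustration of the idea, not my coworker's actual C app: the name `eat_space_c` and the `block_size` parameter are made up here, and the caller is assumed to know the file system's block size (4096 is a common value).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: create a file of `size` bytes, writing one byte at
 * the start of every `block_size`-byte block so that each block is
 * touched, plus the final byte so the length is exactly `size`.
 * Returns 0 on success, -1 on failure. */
int eat_space_c(const char *path, long size, long block_size)
{
    assert(path != NULL);
    if (size <= 0 || block_size <= 0)
        return -1;

    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    /* Touch the first byte of each block. */
    for (long off = 0; ; off += block_size) {
        if (fseek(f, off, SEEK_SET) != 0 || fputc(1, f) == EOF) {
            fclose(f);
            return -1;
        }
        if (size - off <= block_size)  /* that was the last block */
            break;
    }

    /* Write the last byte so the file ends exactly at `size`. */
    if (fseek(f, size - 1, SEEK_SET) != 0 || fputc(1, f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

Writing one byte per block costs one small write for every block instead of one write per file, which is why it is slower than the pure seek hack but still far cheaper than filling the file with data.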
The program was written in CLISP, which is a very cool Lisp for doing shell scripting and quick one-off utilities. Here’s the code for the function that eats the disk space:
(defun eat-space (file-name size)
  "Create a bogus file that chews up disk space. Size is in bytes."
  (let ((file (linux:fopen file-name "w")))
    ;; write a byte every 4k so a block gets allocated
    (loop for x from 0 to (1- size) by 4096
          ;; the final 0 is the whence argument: seek from the start of the file
          do (progn (linux:fseek file x 0)
                    (linux:fputc 1 file)))
    (linux:fclose file)))
One other thing to note for any Lispers that may be reading this: the code above uses the LINUX package in CLISP, not the standard CL file IO functions. The reason I cannot use the standard functions is that FILE-POSITION throws an error when attempting to seek past the end of a file. So in other words, the seek hack does not work in standard Common Lisp (or at least CLISP’s version of it).
You rule. I worship you.
quote:
…I got to talking to my coworker about this problem and he suggested an old hack which I have never heard of. Instead of writing data to the files, use fseek() to seek to the position where the desired EOF should be and then write a single byte to i…
/quote
This probably won’t work any more, your file system will just create a sparse file, it won’t actually allocate all the empty space. That’s why the lisp version writes a byte every 4k.
I’m sorry that this comment does not have anything to do with the content of Your post, just with the form of it:
How do You insert those nice formatted and colored code excerpts? I have a blog on Java development, and currently I’m using a rather fair Java2HTML tool. Quite nice, but could be better. Do You have a hint for me?
Madoc:
I use Emacs as my code editor, and it has a package called “htmlize”. I select my already syntax highlighted code and type “M-x htmlize-region”, which creates an HTML page with all of the appropriate font tags for highlighting the code. I then just copy the HTML into WordPress.
I made a simple option tweak for htmlize to make it not use external styles. This makes the html code it generates work on its own. To do it I put this in my .emacs file:
;; htmlize stuff
;; Use font mode so we can embed generated
;; html into foreign html documents
(setq htmlize-output-type 'font)
Emacs will syntax highlight Java code so this solution would work for you.
Good luck!
Cool! Finally something that is not over my head yet, as always, time-saving and useful. I like your ‘bang C’ comment as well.
back on August 30th, sjf said:
quote:
This probably won’t work any more, your file system will just create a sparse file, it won’t actually allocate all the empty space. That’s why the lisp version writes a byte every 4k.
/quote
It depends on the OS… I’ve used this trick for years (decades actually) and it still works fine on some embedded OSs. But as time’s gone on, most larger scale OSs have gotten smarter about where the data actually is.
We (I’m the co-worker) had some fun playing with this when I first suggested it, and in the end, I modified my original C app to drop a byte every {blocksize} bytes. Anthony then adapted his !C flavor to do the same. In the end, it’s still many times faster than actually writing gobs of data from front-to-back.
i love u
Refreshing blog.