Chipz
Hi,
I have a question that might be related to how LispWorks compiles a
LABELS construct.
In the open source Chipz library ( http://method-combination.net/lisp/chipz/
), yet another excellent library by Nathan Froyd doing decompression
according to a number of algorithms, I have a performance problem that
seems to occur only on LispWorks.
Once you ASDF install the library, this helper function will gunzip
from one file to another:
(defun gunzip-file (from to)
  (with-open-file (in from)
    (with-open-file (out to :direction :output :if-exists :supersede)
      (chipz:decompress out 'chipz:gzip in))))
Using the standard Unix tools to create a test file:
$ echo Hello Worls from Common Lisp | gzip > /tmp/test.gzip
We can then use Chipz to decompress the file as follows:
> (gunzip-file "/tmp/test.gzip" "/tmp/test.txt")
#<STREAM::LATIN-1-FILE-STREAM /tmp/test.txt>
And it works:
$ cat /tmp/test.txt
Hello Worls from Common Lisp
The problem is performance related. Let's take a bigger file:
$ curl http://www.gutenberg.org/files/345/345.txt > /tmp/dracula.txt
$ cat /tmp/dracula.txt | gzip > /tmp/dracula.gzip
Decompressing this file shows the problem:
> (gunzip-file "/tmp/dracula.gzip" "/tmp/foo.txt")
Stack overflow (stack size 20475).
1 (continue) Extend stack by 50%.
2 Extend stack by 300%.
3 (abort) Return to level 0.
4 Restart top-level loop.
Type :b for backtrace, :c <option number> to proceed, or :? for other options
1 > :bq
RUNTIME:BAD-ARGS-OR-STACK
<- (SUBFUNCTION CHIPZ::FROB-BY-COPYING-FROM (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE))
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
[... many more deleted ...]
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- (SUBFUNCTION (LABELS CHIPZ::COPY-MATCH) CHIPZ::%INFLATE-STATE-MACHINE)
<- CHIPZ::%INFLATE-STATE-MACHINE <- CHIPZ::%INFLATE
<- CHIPZ::%DECOMPRESS/STREAM-STREAM <- CHIPZ::%DECOMPRESS
<- GUNZIP-FILE <- EVAL
1 > :c 2
#<STREAM::LATIN-1-FILE-STREAM /tmp/foo.txt>
Decompression itself worked:
$ diff /tmp/dracula.txt /tmp/foo.txt
COPY-MATCH is a LABELS sub function inside %INFLATE-STATE-MACHINE.
The (very big) LABELS construct there is used to implement a state
machine.
According to Nathan, no such recursion should be on the stack.
As far as I can see, it should at least be tail optimized.
Maybe LispWorks is doing something wrong compiling the code?
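
To make the idiom concrete, here is a minimal sketch of a LABELS-based
state machine. It is not Chipz code: COUNT-A-RUNS and the states OUTSIDE
and INSIDE are hypothetical names made up for illustration. Each state
finishes by calling the next state in tail position, so a compiler that
merges tail calls can run it without growing the stack. ANSI Common Lisp
does not require that merging, so whether it happens depends on the
implementation and the optimize settings; the (debug 0) declaration below
is an assumption about what encourages it, not a documented guarantee.

```lisp
;; Hypothetical toy example, not taken from Chipz.
(defun count-a-runs (text)
  "Count the maximal runs of #\\a in TEXT with a two-state machine."
  (declare (type string text)
           (optimize (debug 0)))       ; assumption: low debug favours tail merging
  (let ((i 0)
        (runs 0)
        (n (length text)))
    (labels ((outside ()
               ;; State: the previous character was not #\a.
               (cond ((>= i n) runs)
                     ((char= (char text i) #\a)
                      (incf runs)
                      (incf i)
                      (inside))         ; tail call into the other state
                     (t (incf i)
                        (outside))))    ; tail call back into this state
             (inside ()
               ;; State: we are inside a run of #\a.
               (cond ((>= i n) runs)
                     ((char= (char text i) #\a)
                      (incf i)
                      (inside))
                     (t (incf i)
                        (outside)))))
      (outside))))
```

If the state calls are not merged, the stack depth grows with the length
of the input, which would look much like the repeated COPY-MATCH frames
in the backtrace above. Anything that must run after the call returns
(a special binding, UNWIND-PROTECT, or similar around the call) also
takes it out of tail position.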
There are also some important timing differences between LispWorks and
SBCL, probably because most declarations are more geared towards SBCL,
but the cause could be related:
LWM:
> (time (gunzip-file "/tmp/dracula.gzip" "/tmp/foo.txt"))
Timing the evaluation of (GUNZIP-FILE "/tmp/dracula.gzip" "/tmp/foo.txt")
User time = 0.707
System time = 0.005
Elapsed time = 0.726
Allocation = 182096 bytes
0 Page faults
#<STREAM::LATIN-1-FILE-STREAM /tmp/foo.txt>
SBCL:
* (time (gunzip-file "/tmp/dracula.gzip" "/tmp/foo.txt"))
Evaluation took:
0.081 seconds of real time
0.074685 seconds of user run time
0.004396 seconds of system run time
0 calls to %EVAL
0 page faults and
62,624 bytes consed.
#<SB-SYS:FD-STREAM for "file /tmp/foo.txt" {11825EA9}>
Does anyone have any ideas or care to have a look?
It would be nice if we could help Chipz to run great on LispWorks as
well.
Sven
PS: The above benchmark was run using a patched version of chipz 0.7.4;
the patch is a newer CRC32 implementation. This is not related to the
LABELS issue, but it will influence consing a lot.
PPS: On SBCL the gunzip-file function has to be implemented using
binary IO, something that LispWorks does transparently most of the time.
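
For anyone trying this on SBCL, a minimal sketch of the binary IO
variant follows. CALL-WITH-BINARY-FILES is a hypothetical helper name
introduced here; it only opens both files with an element type of
(unsigned-byte 8) and hands the streams to a function. The commented
form shows how gunzip-file could be built on top of it, assuming Chipz
is loaded and using the same chipz:decompress call as above.

```lisp
;; Hypothetical helper, not part of Chipz.
(defun call-with-binary-files (from to function)
  "Open FROM for input and TO for output as octet streams, then call
FUNCTION with the input stream and the output stream."
  (with-open-file (in from :element-type '(unsigned-byte 8))
    (with-open-file (out to :direction :output
                            :if-exists :supersede
                            :element-type '(unsigned-byte 8))
      (funcall function in out))))

;; With Chipz loaded, gunzip-file could then be written as:
;;
;; (defun gunzip-file (from to)
;;   (call-with-binary-files from to
;;     (lambda (in out)
;;       (chipz:decompress out 'chipz:gzip in))))
```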