Re: cl-base64
On 03/03/16 22:33, Erik Ronström wrote:
> Hi List,
>
> This is a message that I would have sent to the maintainer of cl-base64, had I known any address to him/her. But I spent so much time tracking down a bug that I just want to post it somewhere for reference, if anyone else should happen to encounter the same problem!
>
> Encoding a binary array to base64 and back resulted in different data, which looked like a bug in cl-base64:
>
> CL-USER> (base64:base64-string-to-usb8-array (base64:usb8-array-to-base64-string #(1 2 3 4 5)))
> #(4 0 0 0 8)
>
> However if the vector is of the correct type, it works:
>
> CL-USER> (base64:base64-string-to-usb8-array (base64:usb8-array-to-base64-string (coerce #(1 2 3 4 5) '(vector (unsigned-byte 8)))))
> #(1 2 3 4 5)
Indeed! The bug (or "feature") is in encode.lisp, line 54:
    (declare ,@(case input-type
                 (:string
                  '((string input)))
                 (:usb8-array
                  '((type (array (unsigned-byte 8) (*)) input))))
             (fixnum columns)
             (optimize (speed 3) (safety 0) (space 0)))
As you can see, the problem is that the encoding function (generated by
this def-*-to-base64-* macro) *declares* the type of its argument.
Combined with (safety 0), this lets implementations skip any check of
the actual type and simply assume that the data is of the declared type.
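In defense of Common Lisp, the call:
    (base64:usb8-array-to-base64-string #(1 2 3 4 5))
is not conforming: the literal #(1 2 3 4 5) is a general simple-vector,
not an (array (unsigned-byte 8) (*)), so it violates the declaration.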
However, I think we can all agree that a public library function should
NEVER declare the type of its arguments, but instead should use
CHECK-TYPE to check them!
If you need an optimized function *inside* a library, with the types of
the arguments declared internally, then call it *after* you've checked
the argument types with check-type.
If you want to publish such an internal and unsafe function, then at the
very least, it should be *strongly stressed* that it's an unsafe
function in the documentation, and it should be named with a *%*!
    (base64:%unsafe-usb8-array-to-base64-string #(1 2 3 4 5))
or even unexported:
    (base64::%usb8-array-to-base64-string #(1 2 3 4 5))
would clearly hint that something's wrong here.
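To make the advice above concrete, here is a minimal sketch of the pattern: a public function that checks its argument with CHECK-TYPE, then calls a %-prefixed internal function that declares the type and is compiled unsafely. The names OCTET-VECTOR, SUM-OCTETS and %SUM-OCTETS are hypothetical stand-ins for illustration; they are not part of cl-base64.

```lisp
;; Hypothetical names throughout; nothing here is cl-base64's API.

(deftype octet-vector ()
  '(vector (unsigned-byte 8)))

(defun %sum-octets (octets)
  "Internal and UNSAFE: trusts that OCTETS is an OCTET-VECTOR."
  (declare (type octet-vector octets)
           (optimize (speed 3) (safety 0)))
  (let ((sum 0))
    (declare (type fixnum sum))
    (dotimes (i (length octets) sum)
      (incf sum (aref octets i)))))

(defun sum-octets (octets)
  "Public and safe: signals a correctable TYPE-ERROR unless OCTETS
is an OCTET-VECTOR, then delegates to the unsafe internal function."
  (check-type octets octet-vector)
  (%sum-octets octets))

;; Usage with a properly typed vector:
(sum-octets (make-array 5 :element-type '(unsigned-byte 8)
                          :initial-contents '(1 2 3 4 5)))
```

In an implementation that specializes (unsigned-byte 8) arrays, which is the usual case, (sum-octets #(1 2 3 4 5)) would stop in CHECK-TYPE with a TYPE-ERROR instead of silently computing on misinterpreted data.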
This same macro is also defective in that it generates the same useless
documentation string for all the functions it generates, with no
indication of the unsafety of the generated code:
# Function |USB8-ARRAY-TO-BASE64-STRING| (input &key (uri nil) (columns 0))
Encode a string array to base64. If columns is > 0, designates maximum
number of columns in a line and the string will be terminated with a
#\Newline.
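As a rough illustration of how a function-generating macro could give each function its own documentation string, the sketch below builds the docstring at macroexpansion time from the input type. DEFINE-LENGTH-FUNCTION and the two functions it defines are hypothetical; this is not the macro in cl-base64.

```lisp
;; Hypothetical macro for illustration; not cl-base64's def-*-to-base64-*.
(defmacro define-length-function (name input-type)
  "Define NAME as a checked function whose docstring names INPUT-TYPE."
  `(defun ,name (input)
     ;; FORMAT runs at macroexpansion time, so every generated function
     ;; gets a docstring that mentions its own argument type.
     ,(format nil "Return the length of INPUT, which must be of type ~S."
              input-type)
     (check-type input ,input-type)
     (length input)))

(define-length-function string-length string)
(define-length-function octets-length (vector (unsigned-byte 8)))

;; Each generated function now documents its own input type:
(documentation 'string-length 'function)
(documentation 'octets-length 'function)
```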
One email address of the author, dating from 2007, is:
/Kevin M. Rosenberg/ <kevin@rosenberg.net>
An older one from 2003 was:
/Kevin M. Rosenberg/ <kmr@debian.org>
http://www.rosenberg.net/
http://blog.kpe.io/
His code really looks as if written by a type-declaration fetishist...
Perhaps he wanted to use C++, not Common Lisp?
quicklisp's project list on github says where cl-base64 comes from:
https://github.com/quicklisp/quicklisp-projects/blob/master/projects/cl-base64/source.txt
contains:
kmr-git cl-base64
apparently from a personal git repository.
https://github.com/quicklisp/quicklisp-controller
says in
https://github.com/quicklisp/quicklisp-controller/blob/master/upstream-misc.lisp#L47
that kmr-git is http://git.kpe.io
It's still maintained:
http://git.kpe.io/?p=cl-base64.git;a=shortlog
Otherwise, this is the repository you would clone if you wanted to patch it:
http://git.kpe.io/cl-base64.git
Also, you could instead use the AGPL3
com.informatimago.common-lisp.rfc3548.rfc3548 package:
cl-user> (com.informatimago.common-lisp.rfc3548.rfc3548:base64-encode-bytes #(1 2 3 4))
"AQIDBA=="
cl-user> (com.informatimago.common-lisp.rfc3548.rfc3548:base64-decode-bytes
          (com.informatimago.common-lisp.rfc3548.rfc3548:base64-encode-bytes #(1 2 3 4)))
#(1 2 3 4)
cl-user> (type-of
          (com.informatimago.common-lisp.rfc3548.rfc3548:base64-decode-bytes
           (com.informatimago.common-lisp.rfc3548.rfc3548:base64-encode-bytes #(1 2 3 4))))
(vector (unsigned-byte 8) 1024)
cl-user> (documentation
          'com.informatimago.common-lisp.rfc3548.rfc3548:base64-encode-bytes
          'function)
"
DO: Encode the BYTES in BASE64 text.
RETURN: An encoded string.
BYTES: A vector of (unsigned-byte 8).
LINE-WIDTH: NIL or an integer indicating the line width.
the string new-line will be inserted after that
many characters have been written on a given line.
NEW-LINE: A string contaiing the new-line character or characters.
the default +new-line+ is (format nil \"~%\").
"
cl-user>
Notice that a "vector of (unsigned-byte 8)" is not the same as a
"(vector (unsigned-byte 8))".
Perhaps I should make it clearer by writing "vector containing
(unsigned-byte 8) values".
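The distinction can be sketched in code: one predicate looks at the *contents* of a vector, while TYPEP with (vector (unsigned-byte 8)) looks at the array's *element type*. OCTETS-P and ENSURE-OCTET-VECTOR are hypothetical helper names used only for this illustration.

```lisp
;; Hypothetical helpers; not part of cl-base64 or the rfc3548 package.

(defun octets-p (vector)
  "True when VECTOR is a vector whose elements are all integers in
[0, 255], whatever the element type of the array itself."
  (and (vectorp vector)
       (every (lambda (x) (typep x '(unsigned-byte 8))) vector)))

(defun ensure-octet-vector (vector)
  "Return VECTOR if it is already a (vector (unsigned-byte 8));
otherwise return a fresh copy of that type, or signal an error."
  (cond ((typep vector '(vector (unsigned-byte 8))) vector)
        ((octets-p vector) (coerce vector '(vector (unsigned-byte 8))))
        (t (error "Not a vector containing only (unsigned-byte 8) values: ~S"
                  vector))))

;; A general vector that merely contains octets:
(let ((general (vector 1 2 3 4 5)))
  (list (octets-p general)
        (typep general '(vector (unsigned-byte 8)))
        (typep (ensure-octet-vector general) '(vector (unsigned-byte 8)))))
```

The first and third elements of the resulting list are true. In implementations that specialize (unsigned-byte 8) arrays, the second is NIL, which is exactly the case where a function that blindly trusts its declaration goes wrong.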
--
__Pascal J. Bourguignon__
http://www.informatimago.com/
_______________________________________________
Lisp Hug - the mailing list for LispWorks users
lisp-hug@lispworks.com
http://www.lispworks.com/support/lisp-hug.html