Lisp HUG Maillist Archive

cl-base64

Hi List,

This is a message that I would have sent to the maintainer of cl-base64, had I known an address for him/her. But I spent so much time tracking down a bug that I want to post it somewhere for reference, in case anyone else runs into the same problem!

Encoding a binary array to base64 and back resulted in different data, which looked like a bug in cl-base64:

CL-USER> (base64:base64-string-to-usb8-array (base64:usb8-array-to-base64-string #(1 2 3 4 5)))
#(4 0 0 0 8)

However if the vector is of the correct type, it works:

CL-USER> (base64:base64-string-to-usb8-array (base64:usb8-array-to-base64-string (coerce #(1 2 3 4 5) '(vector (unsigned-byte 8)))))
#(1 2 3 4 5)


Erik



_______________________________________________
Lisp Hug - the mailing list for LispWorks users
lisp-hug@lispworks.com
http://www.lispworks.com/support/lisp-hug.html


Re: cl-base64



On 03/03/16 22:33, Erik Ronström wrote:
> Hi List,
>
> This is a message that I would have sent to the maintainer of cl-base64, had I known an address for him/her. But I spent so much time tracking down a bug that I want to post it somewhere for reference, in case anyone else runs into the same problem!
>
> Encoding a binary array to base64 and back resulted in different data, which looked like a bug in cl-base64:
>
> CL-USER> (base64:base64-string-to-usb8-array (base64:usb8-array-to-base64-string #(1 2 3 4 5)))
> #(4 0 0 0 8)
>
> However if the vector is of the correct type, it works:
>
> CL-USER> (base64:base64-string-to-usb8-array (base64:usb8-array-to-base64-string (coerce #(1 2 3 4 5) '(vector (unsigned-byte 8)))))
> #(1 2 3 4 5)


Indeed!  The bug (or "feature") is in encode.lisp, line 54:

    (declare ,@(case input-type
                 (:string
                  '((string input)))
                 (:usb8-array
                  '((type (array (unsigned-byte 8) (*)) input))))
             (fixnum columns)
             (optimize (speed 3) (safety 0) (space 0)))

As you can see, the problem is that the encoding function (generated by
this def-*-to-base64-* macro) *declares* the type of its argument.
With (safety 0), implementations are free to skip any check of the
actual type and simply assume that the data has the declared type.
That is presumably why the result above looks like raw storage bytes
rather than the original elements.

In defense of Common Lisp, the call:

    (base64:usb8-array-to-base64-string #(1 2 3 4 5))

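is not conforming: the literal #(1 2 3 4 5) is a simple-vector with
element type T, not an (array (unsigned-byte 8) (*)), so it violates
the declaration.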
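
Until the library checks its arguments itself, a caller can guard against this by converting the input before the call. A minimal sketch, assuming a helper named ensure-octet-vector (a hypothetical name, not part of cl-base64):

```lisp
;; Hypothetical caller-side helper; not part of cl-base64.
(defun ensure-octet-vector (sequence)
  "Return SEQUENCE as a (vector (unsigned-byte 8)), copying only when
SEQUENCE is not already of that type.  Signals a TYPE-ERROR if an
element is not an octet."
  (if (typep sequence '(vector (unsigned-byte 8)))
      sequence
      (map '(vector (unsigned-byte 8))
           (lambda (element)
             ;; Reject anything that does not fit in 8 bits.
             (check-type element (unsigned-byte 8))
             element)
           sequence)))

;; Intended use, with cl-base64 loaded:
;; (base64:usb8-array-to-base64-string (ensure-octet-vector #(1 2 3 4 5)))
```

The helper accepts lists as well as general vectors, since MAP works on any sequence.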

However, I think we can all agree that a public library function should 
NEVER declare the type of its arguments, but instead should use 
CHECK-TYPE to check them!

If you need an optimized function *inside* a library, with the types of
the arguments declared internally, then call it *after* you've checked
the arguments' types with check-type.
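
The pattern can be sketched on a toy function. The names %sum-octets and sum-octets are made up for illustration; this is not cl-base64's code, only an assumption about how such a split might look:

```lisp
;; Hypothetical internal function: fast, but trusts its caller completely.
(defun %sum-octets (octets)
  "UNSAFE: OCTETS must already be a (simple-array (unsigned-byte 8) (*))."
  (declare (type (simple-array (unsigned-byte 8) (*)) octets)
           (optimize (speed 3) (safety 0)))
  (let ((sum 0))
    (declare (type fixnum sum))
    (dotimes (i (length octets) sum)
      (incf sum (aref octets i)))))

;; Hypothetical public entry point: check first, then call the fast path.
(defun sum-octets (octets)
  "Return the sum of the elements of OCTETS.
Signals a TYPE-ERROR unless OCTETS is a (simple-array (unsigned-byte 8) (*))."
  (check-type octets (simple-array (unsigned-byte 8) (*)))
  (%sum-octets octets))
```

With this split, (sum-octets #(1 2 3)) signals a correctable type error instead of silently reading garbage, and the check costs one type test per call rather than one per element.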

If you want to publish such an internal and unsafe function, then at the 
very least, it should be *strongly stressed* that it's an unsafe 
function in the documentation, and it should be named with a *%*!

    (base64:%unsafe-usb8-array-to-base64-string #(1 2 3 4 5))

or even unexported:

    (base64::%usb8-array-to-base64-string #(1 2 3 4 5))


would clearly hint that something's wrong here.


This same macro is also defective in that it generates the same useless
documentation string for all the functions it generates, with no
indication of the unsafety of the generated code:

# Function |USB8-ARRAY-TO-BASE64-STRING| (input &key (uri nil) (columns 0))
Encode a string array to base64. If columns is > 0, designates maximum 
number of columns in a line and the string will be terminated with a 
#Newline.
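
A function-defining macro can just as easily compute a distinct docstring for each function it generates. A minimal sketch, in which the macro define-checked-counter and the two functions it defines are invented for illustration and have nothing to do with cl-base64's internals:

```lisp
;; Hypothetical macro: each generated function gets its own docstring,
;; built at macroexpansion time, and checks its argument type.
(defmacro define-checked-counter (name description type-specifier)
  `(defun ,name (input)
     ,(format nil "Return the number of elements in INPUT, ~A. ~
                   Signals a TYPE-ERROR for any other argument."
              description)
     (check-type input ,type-specifier)
     (length input)))

(define-checked-counter count-characters
  "a string" string)

(define-checked-counter count-octets
  "a vector specialized to (unsigned-byte 8)" (vector (unsigned-byte 8)))
```

Here (documentation 'count-octets 'function) mentions the specialized vector, while the docstring of count-characters mentions a string, so the generated documentation matches what each function actually accepts.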



One email address for the author, dating from 2007, is:
Kevin M. Rosenberg <kevin@rosenberg.net>
An older one, from 2003, was:
Kevin M. Rosenberg <kmr@debian.org>

http://www.rosenberg.net/
http://blog.kpe.io/

His code really looks as if written by a type-declaration fetishist...  
Perhaps he wanted to use C++, not Common Lisp?

Quicklisp's project metadata on GitHub says where cl-base64 comes from:
https://github.com/quicklisp/quicklisp-projects/blob/master/projects/cl-base64/source.txt
contains:
kmr-git cl-base64
which is apparently a personal git repository.

https://github.com/quicklisp/quicklisp-controller
says in 
https://github.com/quicklisp/quicklisp-controller/blob/master/upstream-misc.lisp#L47
that kmr-git is http://git.kpe.io

It's still maintained:
http://git.kpe.io/?p=cl-base64.git;a=shortlog

Otherwise, this is the repository you would clone if you wanted to patch it:
http://git.kpe.io/cl-base64.git




Alternatively, you could use the AGPL3-licensed
com.informatimago.common-lisp.rfc3548.rfc3548 package:

cl-user> (com.informatimago.common-lisp.rfc3548.rfc3548:base64-encode-bytes #(1 2 3 4))
"AQIDBA=="
cl-user> (com.informatimago.common-lisp.rfc3548.rfc3548:base64-decode-bytes
          (com.informatimago.common-lisp.rfc3548.rfc3548:base64-encode-bytes #(1 2 3 4)))
#(1 2 3 4)
cl-user> (type-of
          (com.informatimago.common-lisp.rfc3548.rfc3548:base64-decode-bytes
           (com.informatimago.common-lisp.rfc3548.rfc3548:base64-encode-bytes #(1 2 3 4))))
(vector (unsigned-byte 8) 1024)

cl-user> (documentation 'com.informatimago.common-lisp.rfc3548.rfc3548:base64-encode-bytes 'function)
"
DO:         Encode the BYTES in BASE64 text.

RETURN:     An encoded string.

BYTES:      A vector of (unsigned-byte 8).

LINE-WIDTH: NIL or an integer indicating the line width.
             the string new-line will be inserted after that
             many characters have been written on a given line.

NEW-LINE:   A string containing the new-line character or characters.
             the default +new-line+ is (format nil \"~%\").
"
cl-user>

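Notice that a "vector of (unsigned-byte 8)" is not the same as a
"(vector (unsigned-byte 8))": the former is any vector whose elements
happen to be octets, the latter is a vector specialized to that element
type.
Perhaps I should make it clearer by writing "vector containing
(unsigned-byte 8) values".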
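
The distinction can be checked at the REPL. A small sketch (the variable and function names are made up for this example; the remark about TYPEP assumes an implementation that has specialized octet arrays, as the mainstream ones do):

```lisp
;; A general vector (element type T) that merely contains octets.
(defparameter *general* (vector 1 2 3 4 5))

;; A vector specialized to hold (unsigned-byte 8) elements.
(defparameter *specialized*
  (make-array 5 :element-type '(unsigned-byte 8)
                :initial-contents '(1 2 3 4 5)))

(defun octet-p (object)
  "True if OBJECT is an integer in the range 0..255."
  (typep object '(unsigned-byte 8)))

;; Both are "vectors of (unsigned-byte 8)" in the loose sense:
(every #'octet-p *general*)      ; true
(every #'octet-p *specialized*)  ; true

;; Only the second is a (vector (unsigned-byte 8)):
(array-element-type *general*)                     ; T
(typep *specialized* '(vector (unsigned-byte 8)))  ; true
;; False wherever octet vectors have their own representation
;; (assumed here; the standard leaves array upgrading to the implementation):
(typep *general* '(vector (unsigned-byte 8)))
```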


-- 
__Pascal J. Bourguignon__
http://www.informatimago.com/




Updated at: 2020-12-10 08:32 UTC