Lisp HUG Maillist Archive

Converting a file to a bit-vector?

I have a disk file and I'd like to load it into a bit vector of a
particular length, which might be less than (* file-octet-count 8), and
I might want to start from a non-zero offset (though aligning to octets
is a restriction with which I can live). I know how to do this in a
portable, leisurely way. Is there a supported, LispWorks-specific way to
do this quickly?

Thanks,
Zach

_______________________________________________
Lisp Hug - the mailing list for LispWorks users
lisp-hug@lispworks.com
http://www.lispworks.com/support/lisp-hug.html


Re: Converting a file to a bit-vector?

I've not come across a LispWorks-specific way in the manuals. The closest thing I've ever used has been HCL:FILE-STRING, which just slurps a file into a string with a provided external format.

I'd say, either use this and then convert the string to a bit-vector, or:

  1. Compute the file size.
  2. Preallocate a bit-vector of that size.
  3. Read the file, filling the vector up.

I'm sure you've done or thought of doing the latter already.
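
For reference, the three steps above can be sketched portably as follows. This is only a minimal sketch: the function name FILE-TO-BIT-VECTOR and its keyword arguments are made up for illustration, it assumes the offset lies within the file, and it assumes the least significant bit of each octet comes first.

```lisp
(defun file-to-bit-vector (path &key (octet-offset 0) bit-count)
  "Return a bit-vector holding BIT-COUNT bits of the file PATH, starting
OCTET-OFFSET octets in.  BIT-COUNT defaults to everything after the offset.
Assumed bit order: least significant bit of each octet first."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let* (;; Step 1: compute how many bits are available after the offset.
           (available (* 8 (max 0 (- (file-length in) octet-offset))))
           (nbits (if bit-count (min bit-count available) available))
           ;; Step 2: preallocate the octet buffer and the bit-vector.
           (octets (make-array (ceiling nbits 8)
                               :element-type '(unsigned-byte 8)))
           (bits (make-array nbits :element-type 'bit)))
      ;; Step 3: one bulk read of octets, then unpack them bit by bit.
      (file-position in octet-offset)
      (read-sequence octets in)
      (dotimes (i nbits bits)
        (setf (sbit bits i)
              (ldb (byte 1 (logand i 7)) (aref octets (ash i -3))))))))
```

A call such as (file-to-bit-vector "data.bin" :octet-offset 1 :bit-count 10) would then return a 10-element bit-vector; "data.bin" is a placeholder name.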

Cheers,

Robert Smith


On Mon, May 6, 2013 at 11:41 AM, Zach Beane <xach@xach.com> wrote:

> I have a disk file and I'd like to load it into a bit vector of a
> particular length, which might be less than (* file-octet-count 8), and
> I might want to start from a non-zero offset (though aligning to octets
> is a restriction with which I can live). I know how to do this in a
> portable, leisurely way. Is there a supported, LispWorks-specific way to
> do this quickly?
>
> Thanks,
> Zach



Re: Converting a file to a bit-vector?

Robert Smith <quad@symbo1ics.com> writes:

> I've not come across a LispWorks' specific way in the manuals. The closest
> thing I've ever used has been HCL:FILE-STRING which just slurps a file into
> a string with a provided external format.
>
> I'd say, either use this and then convert the string to a bit-vector, or:
>
>   1. Compute the file size.
>   2. Preallocate a bit-vector of that size.
>   3. Read the file, filling the vector up.
>
> I'm sure you've done or thought of doing the latter already.

A quick experiment suggests that LispWorks does not support
read-sequence for bit vectors.

Zach



Re: Converting a file to a bit-vector?

Tim Bradshaw <tfb@cley.com> writes:

> On 6 May 2013, at 22:09, Zach Beane wrote:
>> 
>> A quick experiment suggests that LispWorks does not support
>> read-sequence for bit vectors.
>
> It looks to me as if it probably does but it doesn't support reading
> BITs from files, which makes the point moot really (ie when I tried
> this I get an error from READ-SEQUENCE when it tries to store
> something bigger than a BIT into the first element of the array).
>
> I am not sure what (open ... :element-type 'bit) is meant to do: for
> both the implementations I tried (LW and CCL) it seems to create a
> stream which reads 8-bit bytes, which strikes me as wrong (I'd want an
> error or a bit) but probably I am misreading the spec, since they both
> do this.

Well, the point is that it's implementation specific how binary elements
are stored into binary files (and therefore how they're read back).

What the standard says is that:

(let ((word (1+ (random 100)))                           ; no limit really
      (bits #(0 1 1 1 0 1)))
  (when (nth-value 1 (subtypep `(unsigned-byte ,word) `integer))
    (with-open-file (out "TEST.BIN" 
                         :element-type `(unsigned-byte ,word)
                         :direction :output
                         :if-does-not-exist :create
                         :if-exists :supersede)
      (write-sequence bits  out))
    (with-open-file (inp "TEST.BIN" 
                         :element-type `(unsigned-byte ,word)
                         :direction :input)
      (let ((v (make-array (file-length inp) :element-type `(unsigned-byte ,word))))
        (read-sequence v inp)
        (assert (= (file-length inp) (length bits))
                () "file length=~S expected=~S" (file-length inp) (length bits))
        (assert (equalp bits v)
                () "bits=~S read=~S" bits v)))))


For BITs, some implementations indeed write #(0 1 1 1 0 1) as (od -t x1):

0000000 00 01 01 01 00 01
0000006

while clisp packs bits (and bytes). Except when the byte size is the
same as the underlying file system byte size, this requires a header
specifying the file size (actually, just the number of bytes in the
last word):

0000000 06 00 00 00 2e
0000005
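
Read against that second dump, the layout appears to be a 4-octet little-endian element count followed by the bits packed eight per octet, least significant bit first. That reading is an assumption inferred from these five octets alone, not a description of any documented file format, and UNPACK-COUNTED-BITS is a made-up name. Under that assumption the dump decodes like this:

```lisp
(defun unpack-counted-bits (octets)
  "OCTETS is a vector: a 4-octet little-endian element count, then the
elements packed eight per octet, least significant bit first.
This layout is assumed from the hex dump above."
  (let* ((count (loop :for i :below 4
                      :sum (ash (aref octets i) (* 8 i))))
         (bits (make-array count :element-type 'bit)))
    (dotimes (i count bits)
      (setf (sbit bits i)
            (ldb (byte 1 (logand i 7))
                 (aref octets (+ 4 (ash i -3))))))))

(unpack-counted-bits #(#x06 #x00 #x00 #x00 #x2e))
;; => #*011101
```

The result has the same elements as the #(0 1 1 1 0 1) that was written.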




> Related to this is the question of bit-order: if you *could* read bits
> from a file what order do you want the bits in a byte in?

It's a concern for the implementation, if you use the BIT type.



Now, it is expected that (unsigned-byte 8) will map octet-by-octet the
contents of the file (on POSIX systems) to the octet vector.  You can
then do whatever bit decoding you want, in whatever bit order or bytesex
you want.  Of course, it wouldn't work to read 9-track magnetic tapes
on a 36-bit host.
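
As a small illustration of "whatever bit order you want": once the file is in an octet vector, the order is just a choice of which bit position to extract first. OCTET-BITS and its :MSB-FIRST keyword are hypothetical names used only for this sketch.

```lisp
(defun octet-bits (octet &key (msb-first nil))
  "Return the 8 bits of OCTET as a bit-vector, least significant bit
first unless MSB-FIRST is true."
  (let ((bits (make-array 8 :element-type 'bit)))
    (dotimes (i 8 bits)
      (setf (sbit bits i)
            (ldb (byte 1 (if msb-first (- 7 i) i)) octet)))))

(octet-bits 97)               ; => #*10000110
(octet-bits 97 :msb-first t)  ; => #*01100001
```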



-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.

You can take the lisper out of the lisp job, but you can't take the lisp out
of the lisper (; -- antifuchs



Re: Converting a file to a bit-vector?

Tim Bradshaw <tfb@cley.com> writes:

> On 7 May 2013, at 15:29, Pascal J. Bourguignon wrote:
>
>> Well, the point is that it's implementation specific how binary elements
>> are stored into binary files (and therefore how they're read back).
>
> Yes, I think that's the point: I had been assuming that reading BITs
> meant decoding "bytes"[1] into bits on read, but that's wrong.  That
> means that it's unlikely that read-sequence can ever be useful for
> filling a bit-vector from a file in any useful way, simply because I
> assume that the intent would be to use a file with more than one bit
> per "byte".

Maybe I'm working at the wrong level -- I would be happy with a fast,
supported routine to (possibly destructively?) bash an octet vector into
a bit vector, and vice versa

Zach



Re: Converting a file to a bit-vector?

Zach Beane <xach@xach.com> writes:

> Tim Bradshaw <tfb@cley.com> writes:
>
>> On 7 May 2013, at 15:29, Pascal J. Bourguignon wrote:
>>
>>> Well, the point is that it's implementation specific how binary elements
>>> are stored into binary files (and therefore how they're read back).
>>
>> Yes, I think that's the point: I had been assuming that reading BITs
>> meant decoding "bytes"[1] into bits on read, but that's wrong.  That
>> means that it's unlikely that read-sequence can ever be useful for
>> filling a bit-vector from a file in any useful way, simply because I
>> assume that the intent would be to use a file with more than one bit
>> per "byte".
>
> Maybe I'm working at the wrong level -- I would be happy with a fast,
> supported routine to (possibly destructively?) bash an octet vector into
> a bit vector, and vice versa

It's fast (O(n), where n is the number of bits or bytes) and it's
supported by any conforming implementation (it is conforming code):


(defun octets-from-bits (bits &key (start 0) (end (length bits)))
  (bytes-from-bits 8 bits :start start :end end))

(defun bytes-from-bits (width bits &key (start 0) (end (length bits)))
  (let* ((size  (ceiling (- end start) width))
         (bytes (make-array size :element-type `(unsigned-byte ,width))))
    (loop
      :for d :below size
      :for bstart :from start :by width
      :do (setf (aref bytes d)
                (loop
                  ;; Stop at END rather than at the end of BITS, so that
                  ;; an :end smaller than (length bits) is honored.
                  :repeat (min width (- end bstart))
                  :for p = 1 :then (* 2 p)
                  :for i :from bstart
                  :unless (zerop (aref bits i))
                  :sum p)))
    bytes))

(defun octets-to-bits (bytes &key (start 0) (end (length bytes)))
  (bytes-to-bits 8 bytes :start start :end end))

(defun bytes-to-bits (width bytes &key (start 0) (end (length bytes)))
  (let* ((size (* width (- end start)))
         (bits (make-array size :element-type 'bit)))
    (loop
      :with d = -1
      :for s :from start :below end
      :for byte = (aref bytes s)
      :do (loop
            :repeat width
            :do (setf (aref bits (incf d)) (logand 1 byte)
                      byte (ash byte -1))))
    bits))

(assert (equalp (octets-from-bits #*100001101111111100000000100000000000000110101010010101011100)
                #(97 255 0 1 128 85 170 3)))

(assert (equalp (octets-to-bits #(97 255 0 1 128 85 170 3))
                #*1000011011111111000000001000000000000001101010100101010111000000))



-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.

You can take the lisper out of the lisp job, but you can't take the lisp out
of the lisper (; -- antifuchs



Re: Converting a file to a bit-vector?

"Pascal J. Bourguignon" <pjb@informatimago.com> writes:

> Zach Beane <xach@xach.com> writes:
>
>> Tim Bradshaw <tfb@cley.com> writes:
>>
>>> On 7 May 2013, at 15:29, Pascal J. Bourguignon wrote:
>>>
>>>> Well, the point is that it's implementation specific how binary elements
>>>> are stored into binary files (and therefore how they're read back).
>>>
>>> Yes, I think that's the point: I had been assuming that reading BITs
>>> meant decoding "bytes"[1] into bits on read, but that's wrong.  That
>>> means that it's unlikely that read-sequence can ever be useful for
>>> filling a bit-vector from a file in any useful way, simply because I
>>> assume that the intent would be to use a file with more than one bit
>>> per "byte".
>>
>> Maybe I'm working at the wrong level -- I would be happy with a fast,
>> supported routine to (possibly destructively?) bash an octet vector into
>> a bit vector, and vice versa
>
> It's fast (O(n) n=number of bits or number of bytes) and it's supported
> by any conforming implementation (conforming code):

I am aware of the portable techniques, and I am looking for non-portable
but supported mechanisms with smaller constant factors.

Zach



Re: Converting a file to a bit-vector?

I'm sure this has been considered, but is there a specific reason to want the bytes in a bit-vector format as opposed to just having a couple functions for accessing the bits in the byte vector?

(defun get-bit (bytes i)
  (ldb (byte 1 (logand i 7)) (elt bytes (ash i -3))))

(defun set-bit (bytes i n)
  (let ((pos (ash i -3)))
    (setf (elt bytes pos) (dpb n (byte 1 (logand i 7)) (elt bytes pos)))))

?

Jeff M.


On Mon, May 6, 2013 at 1:41 PM, Zach Beane <xach@xach.com> wrote:

> I have a disk file and I'd like to load it into a bit vector of a
> particular length, which might be less than (* file-octet-count 8), and
> I might want to start from a non-zero offset (though aligning to octets
> is a restriction with which I can live). I know how to do this in a
> portable, leisurely way. Is there a supported, LispWorks-specific way to
> do this quickly?
>
> Thanks,
> Zach



Re: Converting a file to a bit-vector?

Jeff Massung <massung@gmail.com> writes:

> I'm sure this has been considered, but is there a specific reason to want
> the bytes in a bit-vector format as opposed to just having a couple
> functions for accessing the bits in the byte vector?
>
> (defun get-bit (bytes i)
>   (ldb (byte 1 (logand i 7)) (elt bytes (ash i -3))))
>
> (defun set-bit (bytes i n)
>   (let ((pos (ash i -3)))
>     (setf (elt bytes pos) (dpb n (byte 1 (logand i 7)) (elt bytes pos)))))

That's a good point, and may be suitable in my current case after
all. In another case, I really do need bit vectors because bit-and,
bit-or, etc can be way, way, way faster than doing the logical bit
operators on octet vectors.
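
For concreteness, this is the kind of whole-vector operation I mean. It is only a sketch: INTERSECT-IN-PLACE is a made-up name, and how much faster it is than a per-octet loop depends on the implementation.

```lisp
(defun intersect-in-place (a b)
  "Destructively AND bit-vector B into A and return the number of 1 bits
left in A.  A and B must have the same length."
  ;; Passing T as the third argument stores the result in A
  ;; instead of allocating a new bit-vector.
  (bit-and a b t)
  (count 1 a))

(let ((a (make-array 16 :element-type 'bit :initial-element 1))
      (b (make-array 16 :element-type 'bit :initial-element 0)))
  (setf (sbit b 3) 1
        (sbit b 9) 1)
  (intersect-in-place a b))
;; => 2
```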

Zach




Updated at: 2020-12-10 08:35 UTC