Lisp HUG Maillist Archive

Character writing question

I have a trivial form to dump some html data read via drakma, currently in sbcl:

(setf *drakma-default-external-format* :utf8)

    (with-open-file
        (s (pathname (concatenate 'string "/tmp/tm-" name ".html"))
           :direction :output :if-exists :supersede)
      (write-sequence page s))

And I get this error in WRITE-SEQUENCE:

Error: #\' is not of type BASE-CHAR

where the ' is (I believe) the Windows single quote character.

OTOH, if I add :element-type '(unsigned-byte 8) to the OPEN args I get this:

"..." contains non-integer element #\Newline.

also in WRITE-SEQUENCE.

I'm sure this is a simple fix.  Thanks in advance for your help and patience.

Jonathon McKitrick
--
My other computer is your Windows box.
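
[Editor's sketch, not part of the original message.] The second error above ("contains non-integer element") is what one would expect when a string is written to a stream opened with :element-type '(unsigned-byte 8): such a stream accepts only integers, so the characters have to be encoded to octets first. The following is a minimal sketch under that assumption. The names UTF-8-OCTETS and WRITE-PAGE-AS-OCTETS are hypothetical, the encoder is hand-rolled to keep the example self-contained, and it does not treat surrogate code points specially. Libraries such as flexi-streams provide a STRING-TO-OCTETS function that could be used instead.

```lisp
;; Hypothetical helper: encode STRING as a vector of UTF-8 octets.
(defun utf-8-octets (string)
  (let ((out (make-array (length string)
                         :element-type '(unsigned-byte 8)
                         :adjustable t
                         :fill-pointer 0)))
    (flet ((emit (octet) (vector-push-extend octet out)))
      (loop for ch across string
            for code = (char-code ch)
            do (cond ((< code #x80)            ; 1 octet (ASCII)
                      (emit code))
                     ((< code #x800)           ; 2 octets
                      (emit (logior #xC0 (ash code -6)))
                      (emit (logior #x80 (logand code #x3F))))
                     ((< code #x10000)         ; 3 octets (curly quotes land here)
                      (emit (logior #xE0 (ash code -12)))
                      (emit (logior #x80 (logand (ash code -6) #x3F)))
                      (emit (logior #x80 (logand code #x3F))))
                     (t                        ; 4 octets
                      (emit (logior #xF0 (ash code -18)))
                      (emit (logior #x80 (logand (ash code -12) #x3F)))
                      (emit (logior #x80 (logand (ash code -6) #x3F)))
                      (emit (logior #x80 (logand code #x3F)))))))
    out))

;; Hypothetical helper: write PAGE (a string) to PATH through a binary stream.
(defun write-page-as-octets (page path)
  (with-open-file (s path
                     :direction :output
                     :if-exists :supersede
                     :element-type '(unsigned-byte 8))
    (write-sequence (utf-8-octets page) s))
  path)
```

With this approach the stream never sees a character, so neither the BASE-CHAR check nor the "non-integer element" check should be triggered.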


Re: Character writing question

On Wed, 5 Dec 2007 02:31:38 +0000, Jonathon McKitrick <jcm@sdf.lonestar.org> wrote:

> Error: #\' is not of type BASE-CHAR
>
> where the ' is (I believe) the Windows single quote character.

No, I don't think so.

  CL-USER 1 > (typep #\' 'base-char)
  t

> I'm sure this is a simple fix.

Try to open the file with :ELEMENT-TYPE 'LW:SIMPLE-CHAR.

  http://www.lispworks.com/documentation/lw50/LWRM/html/lwref-346.htm

Edi.


Re: Character writing question

On Wed, 5 Dec 2007 21:29:35 +0000, Jonathon McKitrick <jcm@sdf.lonestar.org> wrote:

> It's not a *true* ' but it's one of those characters that looks like
> it was pasted from Word.  I've seen similar characters for quotation
> marks as well that seem to be Win32 specific.

Probably a curly quote.  That's not Win32-specific, you'll find it in
almost any book.

  http://en.wikipedia.org/wiki/Smart_quotes

It's not an ASCII character, though.

> Unfortunately I got the same error

How about a backtrace?


Re: Character writing question

On Wed, 5 Dec 2007 22:56:11 +0000, Jonathon McKitrick <jcm@sdf.lonestar.org> wrote:

> Though that's not very helpful without any expansion.

:bb

> Is it useful to mention that the stream object is
> LATIN-1-FILE-STREAM?

Ugh, so maybe your character is not in Latin-1?  Have you tried other
external formats like UTF-8?
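
[Editor's sketch, not part of the original message.] The suggestion above can be illustrated roughly as follows: open the file with an explicit external format so that a character outside Latin-1, such as U+2019, can be encoded. This is only a sketch under stated assumptions. WRITE-PAGE and READ-PAGE are hypothetical names, and the :UTF-8 designator for :EXTERNAL-FORMAT is implementation-defined rather than specified by ANSI CL, though common implementations accept it. On LispWorks the element type may need to be LW:SIMPLE-CHAR, as suggested earlier in the thread, rather than CHARACTER.

```lisp
;; Hypothetical helper: write the string PAGE to PATH, encoding it as UTF-8.
(defun write-page (page path)
  (with-open-file (s path
                     :direction :output
                     :if-exists :supersede
                     :element-type 'character
                     :external-format :utf-8) ; implementation-defined designator
    (write-sequence page s))
  path)

;; Hypothetical helper: read the whole file at PATH back as a string.
(defun read-page (path)
  (with-open-file (s path
                     :direction :input
                     :element-type 'character
                     :external-format :utf-8)
    ;; FILE-LENGTH may count octets rather than characters, so the buffer
    ;; can be larger than needed; trim it to what was actually read.
    (let* ((buffer (make-string (file-length s)))
           (end (read-sequence buffer s)))
      (subseq buffer 0 end))))

;; Usage: round-trip a string containing a right curly quote (U+2019).
;; (read-page (write-page (format nil "It~Cs" (code-char 8217))
;;                        "/tmp/tm-test.html"))
```

If the round trip returns a string equal to the input, the stream is no longer restricted to Latin-1.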


Re: Character writing question

On Sun, 9 Dec 2007 20:04:34 +0000, Jonathon McKitrick <jcm@sdf.lonestar.org> wrote:

> I was going to call regex-replace-all, but the non-ASCII curly quote
> seems to be a problem.  I didn't see anything in the docs on this in
> cl-ppcre, but the replace works on sbcl and not LW.

Works for me (LWW 5.0.2):

  CL-USER 1 > (code-char 8216)
  #\‘

  CL-USER 2 > (char-name *)
  "U+2018"

  CL-USER 3 > (defparameter *regex* (format nil "[~C~C]" (code-char 8216) (code-char 8217)))
  *REGEX*

  CL-USER 4 > *regex*
  "[‘’]"

  CL-USER 5 > (defparameter *target* (format nil "~CCurly quotes inside~C" (code-char 8216) (code-char 8217)))
  *TARGET*

  CL-USER 6 > *target*
  "‘Curly quotes inside’"

  CL-USER 7 > (ppcre:regex-replace-all *regex* *target* "'")
  "'Curly quotes inside'"
  T

  CL-USER 8 > (map 'list #'char-code *)
  (39 67 117 114 108 121 32 113 117 111 116 101 115 32 105 110 115 105 100 101 39)

  CL-USER 9 > (defun foo (target) (ppcre:regex-replace-all #.*regex* target "'"))
  FOO

  CL-USER 10 > (compile *)
  FOO
  NIL
  NIL

  CL-USER 11 > (foo *target*)
  "'Curly quotes inside'"
  T

  CL-USER 12 > (map 'list #'char-code *)
  (39 67 117 114 108 121 32 113 117 111 116 101 115 32 105 110 115 105 100 101 39)

> Is there something I need to do under LW to make it work?

What exactly doesn't work for you?

> My other computer is your Windows box.

I'm beginning to seriously doubt that...

Edi.


Re: Character writing question

On Mon, 10 Dec 2007 00:24:50 +0000, Jonathon McKitrick <jcm@sdf.lonestar.org> wrote:

> On Sun, Dec 09, 2007 at 09:35:34PM +0100, Edi Weitz wrote:
>
> : What exactly doesn't work for you?
>
> Hmmm.  I'm not sure why a simple regex on an html page with a curly
> quote works under sbcl but not under LW.  So I really don't know
> where to look next.  Perhaps I need to force latin-1 encoding when
> fetching the page?  I'm new to most of these encoding issues as it
> is.
>
> : > My other computer is your Windows box.
> : 
> : I'm beginning to seriously doubt that...
>
> It's just a joke.
> I'm beginning to seriously doubt that subtle humor translates to the
> web. ;-)

Beam me up, Scotty...


Updated at: 2020-12-10 08:44 UTC