(Lisp-HUG) Hanging Sockets...
Greetings to the List from the Rocky Mountains -
Once a day, in our soft real-time embedded application, I am
connecting to a remote time server and asking it what day and time
etc it thinks it is.
Everything is fine, except that occasionally things hang for about a
minute - in fact, *everything* hangs - all the threads, CAPI, et. al.
hang. I have narrowed the hang down to the code that reads data from
the time server.
A separate thread - one of many (10?) - makes the actual connection,
reads and deciphers the data, etc.
I open the stream to the remote server, somewhere on the other side
of this planet, and read the anticipated 51-bytes of date/time
information with:
----------------------------------------------------------------------------------------------------------------------------------------
(with-open-stream
(time-server ; Attempt Communication...
(comm:open-tcp-stream
; ...with Remote Server...
url
13 ; Port 13 on Remote Server...
:direction
:input ; Read Only...
:element-type
'base-char ; Gonna Read ASCII Characters...
:errorp
nil ; Don't Signal an Error
on Error...
:read-timeout
3 ; Three-Seconds until Timeout...
:write-timeout
3 ; You Never Know...
:timeout
3)) ; Three-Seconds -
Connection Timeout...
(if (null
time-server) ; Did
Server Connection Fail?
(throw 'server-error
nil) ; Yes - Exit...
(let ((received-string (make-string
51)) ; Receive Buffer...
(position
-1) ; Current
Position in Buffer...
(eor))
; Set True on Valid End of Record...
(while (and (< position
51) ; Loop and Gather Received Characters...
(not
eor)) ; Keep on until
Overflow or EOR...
(if (listen
time-server) ; Anything There Yet?
(let ((character (read-char time-server nil :eor)))
(cond ((eql character
:eor) ; End of Transmission?
(if (not (eql position
50)) ; Yes - Correct String Length?
(throw 'server-error
nil) ; No - Weird Received String...
(setq eor
t))) ; Yes - Set Loop Sentinal...
;
Got another Good Character...
(t (setf (elt received-string (incf position))
character))))
; ...Insert into Received String...
(mp:process-allow-scheduling)))
; No Joy. Nothing There Yet: Timeslice...
--------------------------------------------------------------------------------------------------------------------------------------------------
I have excised declarations etc.
Thus, after opening the stream to the time server, I repeatedly
(listen stream) and if a character is there, I read
and buffer it - otherwise, I give up the CPU and let other threads
have a go at whatever they think they are doing.
Obviously, I repeatedly do the (listen) so as not to hang whilst
waiting for the network - but, I,and all the threads, *do* hang (occasionally).
Clearly, network latency will force numerous delays - but things on
my side should not hang(?!) as I check for Is There A Character Ready?
"Common Lisp: The Language" mumbles, "The predicate listen is true if
there is a character immediately available from input-stream and is
false if not."
I was using the more obvious to use Read-Char-No-Hang rather than
Listen/Read-Char, but it seemed to hang even more (probably wrong).
Question: Does anyone know if Listen is "guaranteed" to not hang
buried in the bowels of the TCP/IP stack if an incoming character
does not exist in the socket buffer???
I am running LispWorks 5.1 under Windows XP.
Thoughts?
Regards,
Jack Harper
Secure Outcomes Inc.
Evergreen, Colorado USA
Re: (Lisp-HUG) Hanging Sockets...
At 03:30 PM 8/18/2010, Mike Watters wrote:
>Hello,
>
>I'm not sure if your example is meant to be a
>demonstration/generalization of an issue you're experiencing
>elsewhere. If not, I noticed a few things:
>
>1. At only 50 bytes, the client running the example should have the
>entire daytime response buffered (by the OS) after receiving the
>first (and only) payload packet from the server, unless the
>connection is severed immediately after establishment, or the server
>does something like send one byte per packet, or the TCP receive
>window is set to an unusually-low value, or the server just doesn't
>respond, etc.
>
>2. If you're using a separate worker thread, why not use blocking
>reads with a timeout?
>
>3. In your specific example as-written, character should never be
>:eor (if the listen call returns non-nil, read-char shouldn't return
>eof), so the first cond branch is never taken, eor is never set, and
>the loop continues forever if less than the expected number of bytes
>are returned by the server.
Thank You Mike for your observations -
(1) Yes, I agree that the entire 51-byte message from the time server
should arrive all at once, but my understanding is that there are no
firm guarantees on that. It should and *always* does, but then, maybe
not in some pathological case.
(2) I am running under LispWorks 5.1 where there are no *real*
threads in a preemptive multitasking sense. If a logical thread
blocks by going down the WIN32 rabbit hole - all of the other threads
will block also. This appears to be what is happening with my code,
even though it should not with the (listen stream) call.
(3) Drat - I just now introduced that bug when I went from the
original read-char-no-hang to the current listen/read-char. I
appreciate you pointing that thing out.
Regards,
Jack Harper
Secure Outcomes Inc.