Subtle problem in MP code: Are we doing this right?
Dear lisp-huggers, Consider the following piece of code: (defun invoke-with-timeout (timeout bodyfn timeoutfn) (block timeout (let* ((process mp:*current-process*) (timer (mp:make-timer #'(lambda () (mp:process-interrupt process #'(lambda () (return-from timeout (funcall timeoutfn)))))))) (mp:schedule-timer-relative timer timeout) (unwind-protect (funcall bodyfn) (mp:unschedule-timer timer))))) This code is from the acl-compat library distributed with portable allegroserve. (Thanks Jochen!) We have, in *very* rare circumstances, seen the following occur: Condition: Uncaught throw of <timeoutfn return value> to (NIL). Call to (HARLEQUIN-COMMON-LISP:SUBFUNCTION 1 ACL-COMPAT-MP::INVOKE-WITH-TIMEOUT) Call to MP::PROCESS-OUTSTANDING-INTERRUPTS Call to MP::RESUME-PROCESS Call to MP:PROCESS-WAIT Call to PING-READY-MONITORS Call to (HARLEQUIN-COMMON-LISP:SUBFUNCTION 1 MP::INITIALIZE-PROCESS-STACK) We believe that the following occurs: a) Timer is scheduled b) (funcall bodyfn) in unwind protect form runs c) Timer begins running, mp:process-interrupt is called, but process is not yet interrupted (in running non-safety-0 function?) d) Process normally completes invoke-with-timeout e) Process enters a wait-function f) At this point, the interrupt is able to run, but the context in which it was called has disappeared. g) A condition is thrown containing the return value of timeoutfn. [Because it's trying to do a RETURN-FROM to a block which is now gone] Claudio and I propose the following correction: (defun invoke-with-timeout (timeout bodyfn timeoutfn) (block timeout (let* ((process mp:*current-process*) (unsheduled? nil) (timer (mp:make-timer #'(lambda () (mp:process-interrupt process #'(lambda () (unless unsheduled? (return-from timeout (funcall timeoutfn))))))))) (mp:schedule-timer-relative timer timeout) (unwind-protect (funcall bodyfn) (without-interrupts (mp:unschedule-timer timer) (setf unsheduled? t)))))) Does anyone know if this theory of ours is correct, and if our above fix will work? -- Alain Picard Memetrics