Bizarre deadlock with only one lock - (LW6.01/32/Win32/Quad-core Intel/Vista64)
I tried to distill the problem into a single code fragment (below). If the problem doesn't reproduce, try changing "workers-available", or put the code in a function and compile it. It reproduced faster on a beefy quad-core than on a slower dual-core XP machine, but it did reproduce on both. I think it's a multi-CPU thing. The "wl" macrolet merely expands to mp:with-lock on the single lock created in the code fragment.

Example failure point:

    ........
    Making thread #:G1590
    Error: Trying to lock 134220981 : Deadlock {simple} : waiting for another stack which waits for the current thread.
    Other stack: #<MP:PROCESS Name "Worker #:G1588" Priority 0 State "PROCESS-LOCK waiting for Worker thread lock">
    Waits for: #<MP:LOCK "Worker thread lock" Locked once by "CAPI Execution Listener 1" 2071ABCB>

According to a stack trace, the failure occurs while the listener process is still inside the "process-run-function" call, with the lock still held. Notice, though, from the dump: the thread it was making above was G1590, but the "other stack" is listed as G1588.

Is it legal to start a thread from within a thread that happens to hold a lock? That's all I can see that could be wrong in the code, since otherwise there is only one lock. . . .

Should I be doing this differently, or should I write this up as a potential LW6.01 issue?

Anyway, the code is listed below:

    (let ((lock (mp:make-lock :name "Worker thread lock"))
          (workers-available 16))
      (macrolet ((wl (&body body)
                   `(mp:with-lock (lock)
                      (prog1 (let nil ,@body)))))
        (labels ((worker-thread-run ()
                   (wl (incf workers-available)))
                 (give-work ()
                   (wl (unless (zerop workers-available)
                         (decf workers-available)
                         (let ((id (gensym)))
                           (format t "Making thread ~S" id)
                           (terpri)
                           (mp:process-run-function (format nil "Worker ~S" id)
                                                    nil
                                                    #'worker-thread-run))))))
          (loop while t do
            (give-work)
            (sleep 0)))))
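For comparison, here is a rough sketch of a variant that only updates the counter under the lock and calls mp:process-run-function after the lock has been released. It is not a confirmed fix: whether it avoids the reported error is untested and purely an assumption. The names *worker-lock*, *workers-available* and claim-worker-slot are hypothetical, introduced only for this illustration; the mp: calls are the same ones used in the fragment above.

```lisp
;; Sketch only: assumes LispWorks' MP package, as in the original fragment.
;; *worker-lock*, *workers-available* and claim-worker-slot are
;; hypothetical names used for illustration.
(defvar *worker-lock* (mp:make-lock :name "Worker thread lock"))
(defvar *workers-available* 16)

(defun worker-thread-run ()
  ;; A finished worker returns its slot to the pool.
  (mp:with-lock (*worker-lock*)
    (incf *workers-available*)))

(defun claim-worker-slot ()
  ;; Returns T if a slot was claimed, NIL if none was available.
  ;; Only the counter update happens while the lock is held.
  (mp:with-lock (*worker-lock*)
    (unless (zerop *workers-available*)
      (decf *workers-available*)
      t)))

(defun give-work ()
  ;; By the time the new process is created, the lock has been released.
  (when (claim-worker-slot)
    (let ((id (gensym)))
      (format t "Making thread ~S~%" id)
      (mp:process-run-function (format nil "Worker ~S" id)
                               nil
                               #'worker-thread-run))))
```

Driving it with the same loop as before, (loop (give-work) (sleep 0)), would show whether the error still appears when no process is created while the lock is held.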