ENSURE-XX-AFTER-XX
LW/64 for SMP contains 4 SYS primitives:
- SYS:ENSURE-LOADS-AFTER-LOADS
- SYS:ENSURE-MEMORY-AFTER-STORE
- SYS:ENSURE-STORES-AFTER-MEMORY
- SYS:ENSURE-STORES-AFTER-STORES
The LW documentation mentions these briefly. On X86, however, all but ENSURE-MEMORY-AFTER-STORE are simply do-nothing functions. The remaining one, ENSURE-MEMORY-AFTER-STORE, executes an MFENCE instruction.
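As a rough illustration of where the two do-nothing-on-X86 barriers would sit, here is a minimal publish/consume sketch. It assumes the primitives mean what their names suggest. The MAILBOX structure and its two functions are hypothetical names made up for this example, not part of LispWorks. The #+lispworks guards are only there so the sketch also loads in other Lisps for a single-threaded trial, where no barrier is issued at all.

```lisp
;; Hypothetical one-slot mailbox; none of these names come from LispWorks.
(defstruct mailbox
  (payload nil)
  (ready nil))

(defun mailbox-publish (box value)
  "Writer side: store the payload first, then raise the READY flag."
  (setf (mailbox-payload box) value)
  ;; Assumed role: keep the payload store ahead of the flag store.
  ;; A do-nothing call on X86 per the text above; other SMP targets may differ.
  #+lispworks (sys:ensure-stores-after-stores)
  (setf (mailbox-ready box) t)
  value)

(defun mailbox-consume (box)
  "Reader side: return (values payload t) once published, else (values nil nil)."
  (if (mailbox-ready box)
      (progn
        ;; Assumed role: keep the flag load ahead of the payload load.
        #+lispworks (sys:ensure-loads-after-loads)
        (values (mailbox-payload box) t))
      (values nil nil)))
```

A reader would poll MAILBOX-CONSUME until its second value is true. Neither function needs ENSURE-MEMORY-AFTER-STORE, because no load has to wait for an earlier store here.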
Here is a simple snippet of code that demonstrates the behavior of ENSURE-MEMORY-AFTER-STORE:
(defun tst ()
  (let ((begin-sem1 (mp:make-semaphore :count 0))
        (begin-sem2 (mp:make-semaphore :count 0))
        (sem-r1 (mp:make-semaphore :count 0))
        (sem-r2 (mp:make-semaphore :count 0))
        (det 0)
        (x 0)
        (y 0)
        (r1 1)
        (r2 1))
    (flet ((thr1 ()
             (loop do
                   (mp:semaphore-acquire begin-sem1)
                   (loop until (zerop (lw:mt-random 8)))
                   (setf x 1)
                   ;; at least 1 thread with an MFENCE prevents reorders overall
                   ;; (sys:ensure-memory-after-store)
                   (setf r1 y)
                   (mp:semaphore-release sem-r1)))
           (thr2 ()
             (loop do
                   (mp:semaphore-acquire begin-sem2)
                   (loop until (zerop (lw:mt-random 8)))
                   (setf y 1)
                   ;; (sys:ensure-memory-after-store) ;; 2nd one not needed
                   (setf r2 x)
                   (mp:semaphore-release sem-r2))))
      (let ((proc1 (mp:process-run-function :thread1 () #'thr1))
            (proc2 (mp:process-run-function :thread2 () #'thr2)))
        (unwind-protect
            (loop for iter from 0 do
                  (setf x 0
                        y 0)
                  (mp:semaphore-release begin-sem1)
                  (mp:semaphore-release begin-sem2)
                  (mp:semaphore-acquire sem-r1)
                  (mp:semaphore-acquire sem-r2)
                  (when (and (zerop r1)
                             (zerop r2))
                    (incf det)
                    (format t "~D reorders detected after ~D iterations~%" det iter)))
          (mp:process-terminate proc1)
          (mp:process-terminate proc2))))))
With both ENSURE-MEMORY-AFTER-STORE calls commented out, as shown, a run of this code produces output like this:
CL-USER 98 > (tst)
1 reorders detected after 33123 iterations
2 reorders detected after 80515 iterations
3 reorders detected after 402985 iterations
4 reorders detected after 920288 iterations
5 reorders detected after 1137734 iterations
6 reorders detected after 1152909 iterations
7 reorders detected after 1344047 iterations
8 reorders detected after 2496919 iterations
...
Kill the function with Break from the Listener window.
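With at least one of the ENSURE-MEMORY-AFTER-STORE calls uncommented, a re-run of the code produces quiet output. Strictly speaking, the X86 memory model only rules out the r1 = r2 = 0 result when both threads execute the fence; a single fence makes the reordering much harder to catch rather than impossible.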
The above code certainly is not a demonstration of how one should program shared mutable access. But it is a possible component for lock-free primitives.
What it demonstrates is that the Intel X86 will let a load complete ahead of an earlier store to a different address, as needed to keep the CPU as busy as possible. These are not instruction reorderings, but rather just the re-scheduling of memory transfers: the store can wait in the core's store buffer while the later load is satisfied, and the timing varies with conditions such as cache state and main memory access times.
A single thread can never detect its own orderings. It takes SMP to be able to show the effect.
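To make the store-then-load effect concrete without SMP hardware, here is a small sketch in portable Common Lisp that enumerates every interleaving of the two-thread test under a toy store-buffer model. The model is an assumption made for illustration, not a description of what LispWorks or the CPU actually does: each thread's store first lands in a private buffer, drains to shared memory at some later point, and a fence simply makes the following load wait for that drain. LITMUS-OUTCOMES and its state layout are made-up names.

```lisp
(defun litmus-outcomes (fence-0 fence-1)
  "Return the sorted list of reachable (r1 r2) pairs in a toy store-buffer model.
FENCE-0 and FENCE-1 say whether thread 0 / thread 1 fences between store and load."
  ;; State layout: #(x y pc0 pc1 buffered0 buffered1 r1 r2).
  ;; Thread 0 stores x and loads y into r1; thread 1 stores y and loads x into r2.
  (let ((fences (vector fence-0 fence-1))
        (seen '()))
    (labels ((branch (state &rest index-value-pairs)
               ;; Copy the state, apply the updates, and keep exploring.
               (let ((next (copy-seq state)))
                 (loop for (index value) on index-value-pairs by #'cddr
                       do (setf (aref next index) value))
                 (explore next)))
             (explore (state)
               ;; Record the registers once both threads have done their load.
               (when (and (= (aref state 2) 2) (= (aref state 3) 2))
                 (pushnew (list (aref state 6) (aref state 7)) seen :test #'equal))
               (dotimes (i 2)
                 (let ((pc (aref state (+ 2 i)))
                       (buffered (aref state (+ 4 i))))
                   ;; The store first lands in the thread's private buffer.
                   (when (= pc 0)
                     (branch state (+ 2 i) 1 (+ 4 i) t))
                   ;; A buffered store may drain to shared memory at any time.
                   (when buffered
                     (branch state i 1 (+ 4 i) nil))
                   ;; The load reads shared memory; a fence makes it wait for the drain.
                   (when (and (= pc 1)
                              (not (and (aref fences i) buffered)))
                     (branch state (+ 2 i) 2 (+ 6 i) (aref state (- 1 i))))))))
      (explore (vector 0 0 0 0 nil nil 1 1))
      (sort seen #'< :key (lambda (pair) (+ (* 2 (first pair)) (second pair)))))))
```

In this model (LITMUS-OUTCOMES NIL NIL) includes the pair (0 0), which is the "reorder" the test counts, and (LITMUS-OUTCOMES T T) does not. With a fence in only one thread the pair (0 0) is still reachable in the model, so quiet output from a single fence is best read as an empirical observation rather than a guarantee.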
- DM