Committing James Knight's patch from
https://bugs.launchpad.net/sbcl/+bug/589293
Tested on amd64 and x86 with same failures before and after.
Quotes:
It seems as though the ALLOC-TLS-INDEX-IN-* code is not thread-safe. I
cannot figure out how it's not thread-safe, because it looks correct
to me, so far as I can see.
But this test case shows that it's not.
sbcl --eval '(load (compile-file "thread-test.lisp"))' --eval "(run-test)" --eval "(quit)"
It will fail with something like:
The assertion (BOUNDP '*VAR50*) failed.
The variable number will vary randomly, of course.
SBCL 1.0.37.44. Debian Linux 2.6.32-amd64, x86-64.
%%
Okay, turns out to be an utterly trivial problem. This is the
disassembly of SB-VM::ALLOC-TLS-INDEX-IN-RAX. See instructions
0x20000f42 and 0x20000f47. You can't use %rax as the new value and the
compare-to value for cmpxchg! So, the lock was never actually
taken. The variants for allocating in everything else worked
fine. Just RAX is broken.
0x20000f38: mov %rbp,0xb8(%r12)
0x20000f40: push %rcx
0x20000f41: push %rax
0x20000f42: mov $0x1,%eax
0x20000f47: xor %eax,%eax
0x20000f49: lock cmpxchg %rax,0x20100b88
0x20000f53: jne 0x20000f42
0x20000f55: pop %rcx
0x20000f56: mov 0x21(%rcx),%rax
0x20000f5a: or %rax,%rax
0x20000f5d: jne 0x20000f8d
0x20000f5f: mov 0x20100b48,%rax
0x20000f67: cmp $0x8000,%rax
0x20000f6d: jl 0x20000f80
0x20000f6f: movq $0x0,0xb8(%r12)
0x20000f7b: jmpq 0x20001874
0x20000f80: addq $0x8,0x20100b48
0x20000f89: mov %rax,0x21(%rcx)
0x20000f8d: xor %ecx,%ecx
0x20000f8f: xchg %rcx,0x20100b88
0x20000f97: pop %rcx
0x20000f98: xor %rbp,0xb8(%r12)
0x20000fa0: je 0x20000fa4
0x20000fa2: int3
0x20000fa3: 0x09
0x20000fa4: ret
(inst push other)
(inst push target)
(emit-label get-tls-index-lock)
- (inst mov target 1)
- (zeroize rax-tn)
- (inst cmpxchg (make-ea-for-symbol-value *tls-index-lock*)
- target :lock)
- (inst jmp :ne get-tls-index-lock)
+ (let ((not-rax ,(if (eql 'rax reg) 'other 'target)))
+ (inst mov not-rax 1)
+ (zeroize rax-tn)
+ (inst cmpxchg (make-ea-for-symbol-value *tls-index-lock*)
+ not-rax :lock)
+ (inst jmp :ne get-tls-index-lock))
;; The symbol is now in OTHER.
(inst pop other)
;; Now with the lock held, see if the symbol's tls index has been
(inst push other)
(inst push target)
(emit-label get-tls-index-lock)
- (inst mov target 1)
- (inst xor eax-tn eax-tn)
- (inst cmpxchg (make-ea-for-symbol-value *tls-index-lock*)
- target :lock)
- (inst jmp :ne get-tls-index-lock)
+ (let ((not-eax ,(if (eql 'eax reg) 'other 'target)))
+ (inst mov not-eax 1)
+ (inst xor eax-tn eax-tn)
+ (inst cmpxchg (make-ea-for-symbol-value *tls-index-lock*)
+ not-eax :lock)
+ (inst jmp :ne get-tls-index-lock))
;; The symbol is now in OTHER.
(inst pop other)
;; Now with the lock held, see if the symbol's tls index has been
;;; checkins which aren't released. (And occasionally for internal
;;; versions, especially for internal versions off the main CVS
;;; branch, it gets hairier, e.g. "0.pre7.14.flaky4.13".)
-"1.0.39.15"
+"1.0.39.16"