This is the likely implementation of Interlocked.Exchange() in the CLR on single-core systems, copied from the SSCLI20 source. Note that UP in the function name means UniProcessor. The cmpxchg below carries no lock prefix, so this routine is not atomic on SMP / multi-core systems, and the CLR will only use it on single-core machines:
    FASTCALL_FUNC ExchangeUP,8
            _ASSERT_ALIGNED_4_X86 ecx
            mov     eax, [ecx]      ; attempted comparand
    retry:
            cmpxchg [ecx], edx
            jne     retry1          ; predicted NOT taken
            retn
    retry1:
            jmp     retry
    FASTCALL_ENDFUNC ExchangeUP
It is superior to using XCHG because this code works without taking a bus lock. xchg with a memory operand has an implicit lock prefix, so unlike with xadd or cmpxchg, the lock simply can't be omitted. On a single-core system the lock is unnecessary: an unlocked cmpxchg still does the operation in one instruction, which makes it atomic with respect to interrupts (and thus with respect to other threads on a uniprocessor).
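For readers who do not read x86 assembly, the retry loop above can be sketched in C as an exchange built from compare-and-swap. This is only an illustration of the loop structure, not the CLR's code: the name exchange_via_cas is hypothetical, and C11 atomics will normally compile to a locked cmpxchg on x86, so the sketch does not reproduce the lock-free uniprocessor trick itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the CLR or SSCLI): store `value`
 * into *target and return the previous contents, using a
 * compare-and-swap retry loop like the one in ExchangeUP. */
static int32_t exchange_via_cas(_Atomic int32_t *target, int32_t value)
{
    assert(target != NULL);

    /* Corresponds to "mov eax, [ecx]": the first guess at the comparand. */
    int32_t expected = atomic_load(target);

    /* Corresponds to "cmpxchg [ecx], edx" plus the jump back to retry.
     * On failure, `expected` is refreshed with the current contents,
     * just as a failed cmpxchg reloads eax. */
    while (!atomic_compare_exchange_weak(target, &expected, value)) {
        /* retry with the refreshed comparand */
    }

    /* On success, `expected` holds the value that was replaced. */
    return expected;
}
```

Calling exchange_via_cas(&cell, 2) on a cell that holds 1 returns 1 and leaves 2 in the cell. The loop normally succeeds on the first pass, which is why the assembly treats the retry branch as the unlikely path.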
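The odd-looking jump to retry1 and back is an optimization for the case where no branch prediction data is available: a forward conditional branch is then typically predicted as not taken, which matches the common case where cmpxchg succeeds on the first try, whereas a direct backward jne to retry would typically be predicted as taken. Needless to say, perhaps, but trying to do a better job than what has been mulled over for many years by very good software engineers, with generous help from the chip manufacturers, is a tall task.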