Buffer Sequence Points
======================
David Lovemore, Ravenbrook Limited, 2012-09-24


Introduction
------------
This is an analysis of the kind of memory barriers that might be required for safety by the MPS "buffered allocation" design and the "allocation point" protocol on modern multi-core CPUs.

It was a response to [this request](https://info.ravenbrook.com/mail/2012/09/14/22-13-58/0/) by Richard Brooksby:

> //info.ravenbrook.com/project/mps/master/code/buffer.c#14 line 791 says:
>
>     /* .improve.memory-barrier: Memory barrier here on the DEC Alpha */
>     /* (and other relaxed memory order architectures). */
>
> Well, I suspect everything's pretty relaxed these days.
>
> Could I ask you to assess the situation for I3 and I6? Do we need to
> insert some processor sequence points somewhere?


Analysis
--------
Memory barriers are needed at the compiler level and the CPU level.

The memory barrier for the compiler stops the compiler reordering loads and stores across it. On Windows these barriers come in three flavours: the `_ReadBarrier`, `_WriteBarrier` and `_ReadWriteBarrier` intrinsics. Alternatively we can mark the relevant accesses as volatile. My guess is that all mutator accesses are relevant, because the objects need to be written before they are committed, so I don't think that is going to work for us.

A compiler barrier does not stop the CPU reordering reads and writes. The portable way to have a memory barrier on Windows is to use the `MemoryBarrier` macro, which *both* inserts a memory barrier instruction sequence *and* prevents the compiler reordering memory accesses across it.

Currently we stop all threads before we do the flip, which will force all reads and writes to be in a consistent state between threads when we do the synchronization, so I don't think we *currently* need a full CPU synchronization.

So I don't think we need a memory barrier in `BufferFlip` where it is marked "Memory Barrier here?", because the mutator is stopped at this point and these fields aren't changing:

    buffer->initAtFlip = buffer->ap_s.init;
    /* Memory Barrier here? @@@@ */
    buffer->ap_s.limit = (Addr)0;

However, in the mutator we ought to have a `_ReadWriteBarrier` just before testing the limit:

    buffer->ap_s.init = buffer->ap_s.alloc;

    /* .improve.memory-barrier: Memory barrier here on the DEC Alpha */
    /* (and other relaxed memory order architectures). */
    /* .commit.after: If a flip occurs at this point, the pool will */
    /* see "initAtFlip" above the object, which is valid, so it will */
    /* be collected. The commit must succeed when trip is called. */
    /* The pointer "p" will have been fixed up. (@@@@ Will it?) */
    /* .commit.trip: Trip the buffer if a flip has occurred. */

    if (buffer->ap_s.limit == 0)
      return BufferTrip(buffer, p, size);

It seems that a compiler might be tempted to load the limit early here, before the store to `init`, to avoid having to wait for the load when it wants to test it. Again, I don't think we need a full CPU memory barrier here, as the underlying memory will only get changed when threads are stopped.

Note that we also need to update the macros in mps.h.

I guess that we have been getting away without the memory barriers so far.


Document History
----------------
- 2012-10-01 RB Edited from email into document because it contains an important analysis that we need to reference publicly.

---
$Id: //info.ravenbrook.com/project/mps/doc/2012-09-24/buffer-sequence-points/index.txt#2 $