5. Allocation buffers and allocation points
5.1. Introduction
.scope: This is the design of allocation buffers and allocation points.
.purpose: The purpose of this document is to record design decisions made concerning allocation buffers and allocation points and justify those decisions in terms of requirements.
.readership: The document is intended for reading by any MPS developer.
5.2. Glossary
trapped
.def.trapped: The buffer is in a state such that the MPS gets to know about the next use of that buffer.
5.3. Source
.source.mail: Much of the juicy stuff about buffers is only floating around in mail discussions. You might like to try searching the archives if you can’t find what you want here.
Note
Mail archives are only accessible to Ravenbrook staff. RHSK 2006-06-09.
.source.synchronize: For a discussion of the synchronization issues, see mail.richard.1995-05-19.17-10, mail.ptw.1995-05-19.19-15, and mail.richard.1995-05-24.10-18.
Note
I believe that the sequence for flip in PTW’s message is incorrect. The operations should be in the other order. DRJ.
.source.interface: For a description of the buffer interface in C prototypes, see mail.richard.1997-04-28.09-25.
.source.qa: Discussions with QA were useful in pinning down the semantics and understanding of some obscure but important boundary cases. See the thread with subject “notes on our allocation points discussion” and messages mail.richard.tucker.1997-05-12.09-45, mail.ptw.1997-05-12.12-46, mail.richard.1997-05-12.13-15, mail.richard.1997-05-12.13-28, mail.ptw.1997-05-13.15-15, mail.sheep.1997-05-14.11-52, mail.rit.1997-05-15.09-19, mail.ptw.1997-05-15.21-22, mail.ptw.1997-05-15.21-35, mail.rit.1997-05-16.08-02, mail.rit.1997-05-16.08-42, mail.ptw.1997-05-16.12-36, mail.ptw.1997-05-16.12-47, mail.richard.1997-05-19.15-46, mail.richard.1997-05-19.15-56, and mail.ptw.1997-05-20.20-47.
5.4. Requirements
.req.fast: Allocation must be very fast.
.req.thread-safe: Must run safely in a multi-threaded environment.
.req.no-synch: Must avoid the use of thread-synchronization. (.req.fast)
.req.manual: Support manual memory management.
.req.exact: Support exact collectors.
.req.ambig: Support ambiguous collectors.
.req.count: Must record (approximately) the amount of allocation (in bytes).
Note
Actually not a requirement any more, but once was put forward as a Dylan requirement. Bits of the code still reflect this requirement. See request.dylan.170554.
5.5. Classes
.class.hierarchy: The Buffer data structure is designed to be
subclassable (see design.mps.protocol).
.class.hierarchy.buffer: The basic buffer class (BufferClass)
supports basic allocation-point buffering, and is appropriate for
those manual pools which don’t use segments (.req.manual). The
Buffer class doesn’t support reference ranks (that is, the buffers
have RankSetEMPTY). Clients may use BufferClass directly, or
create their own subclasses (see .subclassing).
.class.hierarchy.segbuf: Class SegBufClass is also provided for
the use of pools which additionally need to associate buffers with
segments. SegBufClass is a subclass of BufferClass. Manual
pools may find it convenient to use SegBufClass, but it is
primarily intended for automatic pools (.req.exact, .req.ambig).
An instance of SegBufClass may be attached to a region of memory
that lies within a single segment. The segment is associated with the
buffer, and may be accessed with the BufferSeg() function.
SegBufClass also supports references at any rank set. Hence this
class or one of its subclasses should be used by all automatic pools
(with the possible exception of leaf pools). The rank sets of buffers
and the segments they are attached to must match. Clients may use
SegBufClass directly, or create their own subclasses (see
.subclassing).
.class.hierarchy.rankbuf: Class RankBufClass is also provided
as a subclass of SegBufClass. The only way in which this differs
from its superclass is that the rankset of a RankBufClass is set
during initialization to the singleton rank passed as an additional
parameter to BufferCreate(). Instances of RankBufClass are of
the same type as instances of SegBufClass, that is, SegBuf.
Clients may use RankBufClass directly, or create their own
subclasses (see .subclassing).
.class.create: The buffer creation functions (BufferCreate()
and BufferCreateV()) take a class parameter, which determines
the class of buffer to be created.
.class.choice: Pools which support buffered allocation should
specify a default class for buffers. This class will be used when a
buffer is created in the normal fashion by MPS clients (for example by
a call to mps_ap_create()). Pools specify the default class by
means of the bufferClass field in the pool class object. This
should be a pointer to a function of type PoolBufferClassMethod.
The normal class “Ensure” function (for example
EnsureBufferClass()) has the appropriate type.
.subclassing: Pools may create their own subclasses of the standard
buffer classes. This is sometimes useful if the pool needs to add an
extra field to the buffer. The convenience macro
DEFINE_BUFFER_CLASS() may be used to define subclasses of buffer
classes. See design.mps.protocol.int.define-special.
.replay: To work with the allocation replayer (see design.mps.telemetry.replayer), the buffer class has to emit an event for each call to an external interface, containing all the parameters passed by the user. If a new event type is required to carry this information, the replayer (impl.c.eventrep) must then be extended to recreate the call.
.replay.pool-buffer: The replayer must also be updated if the association of buffer class to pool or the buffer class hierarchy is changed.
.class.method: Buffer classes provide the following methods (these should not be confused with the pool class methods related to the buffer protocol, described in .method.create and following sections):
.class.method.init: init() is a class-specific initialization
method called from BufferInit(). It receives the keyword arguments
passed to BufferInit(). Client-defined methods must call their
superclass method (via a next-method call) before performing any
class-specific behaviour. .replay.init: The init() method
should emit a BufferInit<foo> event (if there aren’t any extra
parameters, <foo> = "").
void (*BufferFinishMethod)(Buffer buffer)
.class.method.finish: finish() is a class-specific finish
method called from BufferFinish(). Client-defined methods must
call their superclass method (via a next-method call) after performing
any class-specific behaviour.
.class.method.attach: attach() is a class-specific method
called whenever a buffer is attached to memory, via
BufferAttach(). Client-defined methods must call their superclass
method (via a next-method call) before performing any class-specific
behaviour.
void (*BufferDetachMethod)(Buffer buffer)
.class.method.detach: detach() is a class-specific method
called whenever a buffer is detached from memory, via
BufferDetach(). Client-defined methods must call their superclass
method (via a next-method call) after performing any class-specific
behaviour.
.class.method.seg: seg() is a class-specific accessor method
which returns the segment attached to a buffer (or NULL if there
isn’t one). It is called from BufferSeg(). Clients should not need
to define their own methods for this.
.class.method.rankSet: rankSet() is a class-specific accessor
method which returns the rank set of a buffer. It is called from
BufferRankSet(). Clients should not need to define their own
methods for this.
.class.method.setRankSet: setRankSet() is a class-specific
setter method which sets the rank set of a buffer. It is called from
BufferSetRankSet(). Clients should not need to define their own
methods for this.
Res (*BufferDescribeMethod)(Buffer buffer, mps_lib_FILE *stream, Count depth)
.class.method.describe: describe() is a class-specific method
called to describe a buffer, via BufferDescribe(). Client-defined
methods must call their superclass method (via a next-method call)
before describing any class-specific state.
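The ordering rules above (superclass first for init(), superclass last for finish()) can be illustrated with a toy pair of classes. This is only a sketch: ToyClassBuf and the toy_* functions are hypothetical names, and plain function calls stand in for the next-method mechanism of design.mps.protocol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; the MPS itself uses the class protocol
   described in design.mps.protocol. */
enum { BASE_INIT = 1, SUB_INIT, SUB_FINISH, BASE_FINISH };

typedef struct ToyClassBuf {
  int baseReady;     /* state owned by the base class */
  int extraReady;    /* extra field added by the subclass */
  int trace[4];      /* order in which the methods ran */
  size_t traceLen;
} ToyClassBuf;

static void toy_record(ToyClassBuf *b, int step)
{
  assert(b->traceLen < sizeof b->trace / sizeof b->trace[0]);
  b->trace[b->traceLen++] = step;
}

static void toy_base_init(ToyClassBuf *b)
{
  b->baseReady = 1;
  toy_record(b, BASE_INIT);
}

static void toy_base_finish(ToyClassBuf *b)
{
  b->baseReady = 0;
  toy_record(b, BASE_FINISH);
}

/* Subclass init: next-method call first, then class-specific work. */
static void toy_sub_init(ToyClassBuf *b)
{
  toy_base_init(b);
  assert(b->baseReady);      /* subclass may rely on base state */
  b->extraReady = 1;
  toy_record(b, SUB_INIT);
}

/* Subclass finish: class-specific work first, then the next-method call. */
static void toy_sub_finish(ToyClassBuf *b)
{
  assert(b->baseReady);      /* base state is still valid here */
  b->extraReady = 0;
  toy_record(b, SUB_FINISH);
  toy_base_finish(b);
}
```

The point of the ordering is that subclass code only ever runs while the superclass part of the structure is initialized.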
5.6. Logging
.logging.control: Buffers have a separate control for whether they
are logged or not, because they are particularly high volume.
This is a Boolean flag (bufferLogging) in the ArenaStruct.
5.7. Measurement
.count: Counting the allocation volume is done by maintaining two fields in the buffer struct:
.count.fields: fillSize, emptySize.
.count.monotonic: both of these fields are monotonically increasing.
.count.fillsize: fillSize is an accumulated total of the size
of all the fills (as a result of calling the PoolClass
BufferFill() method) that happen on the buffer.
.count.emptysize: emptySize is an accumulated total of the size of
all the empties that happen on the buffer (which are notified to the
pool using the PoolClass BufferEmpty() method).
.count.generic: These fields are maintained by the generic buffer
code in BufferAttach() and BufferDetach().
.count.other: Similar count fields are maintained in the arena, separately for internal buffers (those used internally by the MPS) and external buffers (those used for mutator allocation points). These fields are also updated by the buffer code. The five arena fields are:
- fillMutatorSize
- fillInternalSize
- emptyMutatorSize
- emptyInternalSize
- allocMutatorSize
.count.alloc.how: The amount of allocation in the buffer just
after an empty is fillSize - emptySize. At other times this
computation will include space that the buffer has the use of (between
base and init) but which may not get allocated in (because the
remaining space may be too large for the next reserve so some or all
of it may get emptied). The arena field allocMutatorSize is
incremented by the allocated size (between base and init)
whenever a buffer is detached. Symmetrically this field is decremented
by the pre-allocated size (between base and init) whenever
a buffer is attached. The overall count is asymptotically correct.
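The bookkeeping described in .count.alloc.how can be sketched as follows. This is a toy model, not the MPS code: CountBuf and the count_* functions are hypothetical, offsets stand in for addresses, the arena field is folded into the same structure for brevity, and the size of a fill or an empty is assumed to be the space between init and limit at that moment.

```c
#include <assert.h>
#include <stddef.h>

typedef struct CountBuf {
  double fillSize;          /* accumulated size of all fills */
  double emptySize;         /* accumulated size of all empties */
  double allocMutatorSize;  /* arena-wide in the design; kept here for brevity */
  size_t base, init, limit; /* offsets standing in for addresses */
} CountBuf;

/* Attach: count the fill (assumed to be init..limit) and subtract the
   pre-allocated part (base..init) from the allocation total. */
static void count_attach(CountBuf *c, size_t base, size_t init, size_t limit)
{
  assert(base <= init && init <= limit);
  c->base = base;
  c->init = init;
  c->limit = limit;
  c->fillSize += (double)(limit - init);
  c->allocMutatorSize -= (double)(init - base);
}

/* Stand-in for a successful reserve/commit pair of the given size. */
static void count_allocate(CountBuf *c, size_t size)
{
  assert(c->limit - c->init >= size);
  c->init += size;
}

/* Detach: count the empty (assumed to be init..limit) and add the
   allocated part (base..init) to the allocation total. */
static void count_detach(CountBuf *c)
{
  c->emptySize += (double)(c->limit - c->init);
  c->allocMutatorSize += (double)(c->init - c->base);
  c->base = c->init = c->limit = 0;
}
```

Under these assumptions, fillSize - emptySize measured just after a detach equals the bytes allocated through the buffer, and the pre-allocated space subtracted at attach cancels against the amount added at detach.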
.count.type: All the count fields are type double.
.count.type.justify: This is because double is the type most likely to give us enough precision. Because of the lack of genuine requirements the type isn’t so important. It’s nice to have it more precise than long, which double usually is.
5.8. Notes from the whiteboard
Requirements
- atomic update of words 
- guarantee order of reads and writes to certain memory locations. 
Flip
- limit:=0 
- record init for scanner 
Commit
- init:=alloc 
- if(limit = 0) … 
- L written only by MM 
- A written only by client (except during synchronized MM op) 
- I ditto 
- I read by MM during flip 
States
- busy 
- ready 
- trapped 
- reset 
Note
There are many more states. DRJ.
Misc
- During buffer ops all field values can change. Might trash perfectly good (“valid”?) object if pool isn’t careful. 
5.9. Synchronization
Buffers provide a loose form of synchronization between the mutator and the collector.
The crucial synchronization issues are between the operation the pool performs on flip and the mutator’s commit operation.
Commit
- read init 
- write init 
- Memory Barrier 
- read limit 
Flip
- write limit 
- Memory Barrier 
- read init 
Commit consists of two parts. The first is the update to init. This is a declaration that the new object just before init is now correctly formatted and can be scanned. The second is a check to see if the buffer has been “tripped”. The ordering of the two parts is crucial.
Note that the declaration that the object is correctly formatted is independent of whether the buffer has been tripped or not. In particular a pool can scan up to the init pointer (including the newly declared object) whether or not the pool will cause the commit to fail. In the case where the pool scans the object, but then causes the commit to fail (and presumably the allocation to occur somewhere else), the pool will have scanned a “dead” object, but this is just another example of conservatism in the general sense.
Note that the read of init in the Flip sequence can in fact be arbitrarily delayed (as long as it is read before a buffered segment is scanned).
On processors with Relaxed Memory Order (such as the DEC Alpha), Memory Barriers will need to be placed at the points indicated.
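The two sequences above can be sketched with C11 atomics. This is an illustration under stated assumptions rather than the MPS implementation: SyncAP, sync_commit(), and sync_flip() are hypothetical names, integer values stand in for addresses, and a sequentially consistent fence stands in for the memory barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct SyncAP {
  _Atomic uintptr_t init;   /* written by the client, read by the collector */
  uintptr_t alloc;          /* touched only by the client in this sketch */
  _Atomic uintptr_t limit;  /* written by the collector, read by the client */
} SyncAP;

/* Client side. Returns true if the buffer was not tripped. */
static bool sync_commit(SyncAP *ap)
{
  uintptr_t old = atomic_load_explicit(&ap->init, memory_order_relaxed);
  assert(old <= ap->alloc);                      /* read init */
  (void)old;
  atomic_store_explicit(&ap->init, ap->alloc,
                        memory_order_relaxed);   /* write init */
  atomic_thread_fence(memory_order_seq_cst);     /* memory barrier */
  return atomic_load_explicit(&ap->limit,
                              memory_order_relaxed) != 0;  /* read limit */
}

/* Collector side. Traps the buffer and returns the init value that it
   is safe to scan up to. */
static uintptr_t sync_flip(SyncAP *ap)
{
  atomic_store_explicit(&ap->limit, 0,
                        memory_order_relaxed);   /* write limit */
  atomic_thread_fence(memory_order_seq_cst);     /* memory barrier */
  return atomic_load_explicit(&ap->init,
                              memory_order_relaxed);  /* read init */
}
```

With a fence between the write and the read on both sides, at least one side observes the other's write: either the commit sees limit == 0, or the flip sees the new init, or both. That is the property the ordering of the two parts of commit is meant to secure.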
* DESIGN
*
* An allocation buffer is an interface to a pool which provides
* very fast allocation, and defers the need for synchronization in
* a multi-threaded environment.
*
* Pools which contain formatted objects must be synchronized so
* that the pool can know when an object is valid.  Allocation from
* such pools is done in two stages: reserve and commit.  The client
* first reserves memory, then initializes it, then commits.
* Committing the memory declares that it contains a valid formatted
* object.  Under certain conditions, some pools may cause the
* commit operation to fail.  (See the documentation for the pool.)
* Failure to commit indicates that the whole allocation failed and
* must be restarted.  When using a pool which introduces the
* possibility of commit failing, the allocation sequence could look
* something like this:
*
* do {
*   res = BufferReserve(&p, buffer, size);
*   if(res != ResOK) return res;       // allocation fails, reason res
*   initialize(p);                     // p now points at valid object
* } while(!BufferCommit(buffer, p, size));
*
* Pools which do not contain formatted objects can use a one-step
* allocation as usual.  Effectively any random rubbish counts as a
* "valid object" to such pools.
*
* An allocation buffer is an area of memory which is pre-allocated
* from a pool, plus a buffer descriptor, which contains, inter
* alia, four pointers: base, init, alloc, and limit.  Base points
* to the base address of the area, limit to the last address plus
* one.  Init points to the first uninitialized address in the
* buffer, and alloc points to the first unallocated address.
*
*    L . - - - - - .         ^
*      |           |     Higher addresses -'
*      |   junk    |
*      |           |       the "busy" state, after Reserve
*    A |-----------|
*      |  uninit   |
*    I |-----------|
*      |   init    |
*      |           |     Lower addresses  -.
*    B `-----------'         v
*
*    L . - - - - - .         ^
*      |           |     Higher addresses -'
*      |   junk    |
*      |           |       the "ready" state, after Commit
*  A=I |-----------|
*      |           |
*      |           |
*      |   init    |
*      |           |     Lower addresses  -.
*    B `-----------'         v
*
* Access to these pointers is restricted in order to allow
* synchronization between the pool and the client.  The client may
* only write to init and alloc, but in a restricted and atomic way
* detailed below.  The pool may read the contents of the buffer
* descriptor at _any_ time.  During calls to the fill and trip
* methods, the pool may update any or all of the fields
* in the buffer descriptor.  The pool may update the limit at _any_
* time.
*
* Access to buffers by these methods is not synchronized.  If a buffer
* is to be used by more than one thread then it is the client's
* responsibility to ensure exclusive access.  It is recommended that
* a buffer be used by only a single thread.
*
* [Only one thread may use a buffer at once, unless the client
* places a mutual exclusion around the buffer access in the usual
* way.  In such cases it is usually better to create one buffer for
* each thread.]
*
* Here are pseudo-code descriptions of the reserve and commit
* operations.  These may be implemented in-line by the client.
* Note that the client is responsible for ensuring that the size
* (and therefore the alloc and init pointers) are aligned according
* to the buffer's alignment.
*
* Reserve(buf, size)                   ; size must be aligned to pool
*   if buf->limit - buf->alloc >= size then
*     buf->alloc +=size                ; must be atomic update
*     p = buf->init
*   else
*     res = BufferFill(&p, buf, size)  ; buf contents may change
*
* Commit(buf, p, size)
*   buf->init = buf->alloc             ; must be atomic update
*   if buf->limit == 0 then
*     res = BufferTrip(buf, p, size)   ; buf contents may change
*   else
*     res = True
* (returns True on successful commit)
*
* The pool must allocate the buffer descriptor and initialize it by
* calling BufferInit.  The descriptor this creates will fall
* through to the fill method on the first allocation.  In general,
* pools should not assign resources to the buffer until the first
* allocation, since the buffer may never be used.
*
* The pool may update the base, init, alloc, and limit fields when
* the fallback methods are called.  In addition, the pool may set
* the limit to zero at any time.  The effect of this is either:
*
*   1. cause the _next_ allocation in the buffer to fall through to
*      the buffer fill method, and allow the buffer to be flushed
*      and relocated;
*
*   2. cause the buffer trip method to be called if the client was
*      between reserve and commit.
*
* A buffer may not be relocated under other circumstances because
* there is a race between updating the descriptor and the client
* allocation sequence.
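The reserve and commit pseudo-code above can be turned into a small single-threaded C model. Everything here is an assumption made for illustration: SketchBuf and the sketch_* functions are hypothetical, offsets stand in for addresses, the "pool" is a bump counter, and the trip handler always fails the commit and resets the buffer. The room test is written as alloc + size <= limit so that, with unsigned arithmetic, a zero limit always falls through to the fill path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SketchBuf {
  size_t base, init, alloc, limit;  /* offsets standing in for addresses */
  size_t nextFree;                  /* hypothetical pool state */
  size_t chunk;                     /* hypothetical refill size */
} SketchBuf;

/* Fill fallback: attach a fresh region and reserve from it. */
static bool sketch_fill(size_t *pReturn, SketchBuf *buf, size_t size)
{
  if (size > buf->chunk)
    return false;
  buf->base = buf->init = buf->nextFree;
  buf->limit = buf->base + buf->chunk;
  buf->nextFree = buf->limit;
  buf->alloc = buf->init + size;
  *pReturn = buf->init;
  return true;
}

static bool sketch_reserve(size_t *pReturn, SketchBuf *buf, size_t size)
{
  assert(size > 0);
  assert(buf->init == buf->alloc);  /* reserve/commit pairs do not nest */
  if (buf->alloc + size > buf->alloc && buf->alloc + size <= buf->limit) {
    *pReturn = buf->init;
    buf->alloc += size;
    return true;
  }
  return sketch_fill(pReturn, buf, size);
}

/* Trip fallback: this toy pool always fails the commit and resets. */
static bool sketch_trip(SketchBuf *buf, size_t p, size_t size)
{
  assert(buf->limit == 0);
  assert(buf->init == buf->alloc);
  assert(p + size == buf->alloc);
  buf->base = buf->init = buf->alloc = 0;
  return false;
}

static bool sketch_commit(SketchBuf *buf, size_t p, size_t size)
{
  buf->init = buf->alloc;           /* declare the object valid */
  if (buf->limit == 0)
    return sketch_trip(buf, p, size);
  return true;
}
```

A client would wrap sketch_reserve() and sketch_commit() in the same do/while retry loop shown earlier, re-initializing the object at the new address whenever the commit fails.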
5.10. Interface
.method.create: Create an allocation buffer in a pool. The buffer is created in the “ready” state.
A buffer structure is allocated from the space control pool and
partially initialized (in particular, neither the signature nor the
serial field is initialized). The pool class’s bufferCreate()
method is then called. This method can update (some undefined subset
of) the fields of the structure; it should return with the buffer in
the “ready” state (or fail). The remainder of the initialization then
occurs.
If and only if successful then a valid buffer is returned.
void BufferDestroy(Buffer buffer)
.method.destroy: Free a buffer descriptor. The buffer must be in
the “ready” state, that is, not between a Reserve and Commit.
Allocation in the area of memory to which the descriptor refers must
cease after BufferDestroy() is called.
Destroying an allocation buffer does not affect objects which have been allocated; it just frees resources associated with the buffer itself.
The pool class’s bufferDestroy() method is called and then the
buffer structure is uninitialized and freed.
BufferCheck(Buffer buffer)
.method.check: The check method is straightforward. The non-trivial dependencies checked are:
- The ordering constraints between base, init, alloc, and limit. 
- The alignment constraints on base, init, alloc, and limit. 
- That the buffer’s rank is identical to the segment’s rank. 
.method.attach: Set the base, init, alloc, and limit fields so that
the buffer is ready to start allocating in an area of memory. The alloc
field is set to init + size.
.method.attach.unbusy: BufferAttach() must only be applied to
buffers that are not busy.
void BufferDetach(Buffer buffer, Pool pool)
.method.detach: Set the seg, base, init, alloc, and limit fields to zero, so that the next reserve request will call the fill method.
.method.detach.unbusy: BufferDetach() must only be applied to
buffers that are not busy.
.method.isreset: Returns TRUE if and only if the buffer is in the
reset state, that is, with base, init, alloc, and limit all set to
zero.
.method.isready: Returns TRUE if and only if the buffer is not
between a reserve and commit. The result is only reliable if the
client is not currently using the buffer, since it may update the
alloc and init pointers asynchronously.
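The attach, detach, reset, and ready descriptions above fit together as in the following sketch. StateBuf and the state_* functions are hypothetical names, offsets stand in for addresses, and "ready" is modelled as init == alloc, which is an assumption drawn from the state diagrams rather than a statement of how the MPS tests it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct StateBuf {
  size_t base, init, alloc, limit;  /* offsets standing in for addresses */
} StateBuf;

/* Reset: all four fields are zero. */
static bool state_is_reset(const StateBuf *b)
{
  return b->base == 0 && b->init == 0 && b->alloc == 0 && b->limit == 0;
}

/* Ready: not between a reserve and a commit (assumed: init == alloc). */
static bool state_is_ready(const StateBuf *b)
{
  return b->init == b->alloc;
}

/* Attach to [base, limit), with [base, init) already allocated and
   size bytes reserved immediately, so alloc becomes init + size. */
static void state_attach(StateBuf *b, size_t base, size_t limit,
                         size_t init, size_t size)
{
  assert(state_is_ready(b));        /* must not be busy */
  assert(base <= init && init + size <= limit);
  b->base = base;
  b->init = init;
  b->alloc = init + size;
  b->limit = limit;
}

/* Detach: zero the fields so the next reserve calls the fill method. */
static void state_detach(StateBuf *b)
{
  assert(state_is_ready(b));        /* must not be busy */
  b->base = b->init = b->alloc = b->limit = 0;
}
```

Note that a buffer attached with a non-zero size is busy until the corresponding commit, so it cannot be detached straight away.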
Returns the APStruct substructure of a buffer.
.method.ofap: Return the buffer which owns an APStruct.
.method.ofap.thread-safe: BufferOfAP() must be thread safe (see
impl.c.mpsi.thread-safety). This is achieved because the
underlying operation is simply a subtraction.
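The subtraction mentioned in .method.ofap.thread-safe is the usual way of recovering an enclosing structure from a pointer to one of its members. The sketch below uses hypothetical structures (DemoAPStruct, DemoBufferStruct) whose layout is invented for illustration and does not reflect the real BufferStruct.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoAPStruct {
  void *init, *alloc, *limit;
} DemoAPStruct;

typedef struct DemoBufferStruct {
  unsigned long serial;     /* some field preceding the AP */
  DemoAPStruct apStruct;    /* embedded allocation point */
} DemoBufferStruct;

/* Returns the AP substructure of a buffer. */
static DemoAPStruct *demo_buffer_ap(DemoBufferStruct *buffer)
{
  assert(buffer != NULL);
  return &buffer->apStruct;
}

/* Returns the buffer that owns an AP by subtracting the member offset.
   No shared state is read or written, only pointer arithmetic. */
static DemoBufferStruct *demo_buffer_of_ap(DemoAPStruct *ap)
{
  assert(ap != NULL);
  return (DemoBufferStruct *)((char *)ap
                              - offsetof(DemoBufferStruct, apStruct));
}
```

Because the result depends only on the argument and a compile-time constant, the operation needs no locking.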
.method.arena: Returns the arena which owns a buffer.
.method.arena.thread-safe: BufferArena() must be thread safe
(see impl.c.mpsi.thread-safety). This is achieved simply because the
underlying operation is a read of shared-non-mutable data (see
design.mps.thread-safety).
Pool BufferPool(Buffer buffer)
Returns the pool to which a buffer is attached.
.method.reserve: Reserves memory from an allocation buffer.
This is a provided version of the reserve procedure described above.
The size must be aligned according to the buffer alignment. If
successful, ResOK is returned and *pReturn is updated with a
pointer to the reserved memory. Otherwise *pReturn is not touched.
The reserved memory is not guaranteed to have any particular contents.
The memory must be initialized with a valid object (according to the
pool to which the buffer belongs) and then passed to the
BufferCommit() method (see below). BufferReserve() may not be
applied twice to a buffer without a BufferCommit() in between. In
other words, Reserve/Commit pairs do not nest.
.method.fill: Refills an empty buffer. If there is not enough space
in a buffer to allocate in-line, BufferFill() must be called to
“refill” the buffer.
.method.commit: Commit memory previously reserved.
BufferCommit() notifies the pool that memory which has been
previously reserved (see above) has been initialized with a valid
object (according to the pool to which the buffer belongs). The
pointer p must be the same as that returned by
BufferReserve(), and the size must match the size passed to
BufferReserve().
BufferCommit() may not be applied twice to a buffer without a
reserve in between. In other words, objects must be reserved,
initialized, then committed only once.
Commit returns TRUE if successful, FALSE otherwise. If commit
fails and returns FALSE, the client may try to allocate again by
going back to the reserve stage, and may not use the memory at p
again for any purpose.
Some classes of pool may cause commit to fail under rare circumstances.
.method.trip: Act on a tripped buffer. The pool which owns a buffer
may asynchronously set the buffer limit to zero in order to get
control over the buffer. If this occurs after a BufferReserve()
(but before the corresponding commit), then the BufferCommit()
method calls BufferTrip() and the BufferCommit() method
returns with the return value of BufferTrip().
.method.trip.precondition: At the time trip is called, from
BufferCommit(), the following are true:
- .method.trip.precondition.limit: limit == 0
- .method.trip.precondition.init: init == alloc
- .method.trip.precondition.p: p + size == alloc
5.11. Diagrams
Here are a number of diagrams showing how buffers behave. In general, the horizontal axis corresponds to mutator action (reserve, commit) and the vertical axis corresponds to collector action. I’m not sure which of the diagrams are the same as each other, and which are best or most complete when they are different, but they all attempt to show essentially the same information. It’s very difficult to get all the details in. These diagrams were drawn by Richard Brooksby, Richard Tucker, Gavin Matthews, and others in April 1997. In general, the later diagrams are, I suspect, more correct, complete and useful than the earlier ones. I have put them all here for the record. Richard Tucker, 1998-02-09.
Buffer Diagram: Buffer States
[Missing diagrams: Buffer States (3-column), Buffer States (4-column), Buffer States (gavinised), Buffer States (interleaved), Buffer States (richardized).]
