Allocation buffers and allocation points

author	Richard Brooksby
copyright	See Copyright and License.
date	1996-09-02
index terms	pair: buffers; design
revision	//info.ravenbrook.com/project/mps/master/design/buffer.txt#18
status	incomplete design
tag	design.mps.buffer

Introduction

.scope: This is the design of allocation buffers and allocation points.

.purpose: The purpose of this document is to record design decisions made concerning allocation buffers and allocation points and justify those decisions in terms of requirements.

.readership: The document is intended for reading by any MPS developer.

Glossary

trapped

.def.trapped: The buffer is in a state such that the MPS gets to know about the next use of that buffer.

Source

.source.mail: Much of the juicy stuff about buffers is only floating around in mail discussions. You might like to try searching the archives if you can't find what you want here.

Note

Mail archives are only accessible to Ravenbrook staff. RHSK 2006-06-09.

.source.synchronize: For a discussion of the synchronization issues, see mail.richard.1995-05-19.17-10, mail.ptw.1995-05-19.19-15, and mail.richard.1995-05-24.10-18.

Note

I believe that the sequence for flip in PTW's message is incorrect. The operations should be in the other order. DRJ.

.source.interface: For a description of the buffer interface in C prototypes, see mail.richard.1997-04-28.09-25.

.source.qa: Discussions with QA were useful in pinning down the semantics and understanding of some obscure but important boundary cases. See the thread with subject "notes on our allocation points discussion" and messages mail.richard.tucker.1997-05-12.09-45, mail.ptw.1997-05-12.12-46, mail.richard.1997-05-12.13-15, mail.richard.1997-05-12.13-28, mail.ptw.1997-05-13.15-15, mail.sheep.1997-05-14.11-52, mail.rit.1997-05-15.09-19, mail.ptw.1997-05-15.21-22, mail.ptw.1997-05-15.21-35, mail.rit.1997-05-16.08-02, mail.rit.1997-05-16.08-42, mail.ptw.1997-05-16.12-36, mail.ptw.1997-05-16.12-47, mail.richard.1997-05-19.15-46, mail.richard.1997-05-19.15-56, and mail.ptw.1997-05-20.20-47.

Requirements

.req.fast: Allocation must be very fast.

.req.thread-safe: Must run safely in a multi-threaded environment.

.req.no-synch: Must avoid the use of thread-synchronization. (.req.fast)

.req.manual: Support manual memory management.

.req.exact: Support exact collectors.

.req.ambig: Support ambiguous collectors.

.req.count: Must record (approximately) the amount of allocation (in bytes).

Note

Actually not a requirement any more, but once was put forward as a Dylan requirement. Bits of the code still reflect this requirement. See request.dylan.170554.

Classes

.class.hierarchy: The Buffer data structure is designed to be subclassable (see design.mps.protocol).

.class.hierarchy.buffer: The basic buffer class (BufferClass) supports basic allocation-point buffering, and is appropriate for those manual pools which don't use segments (.req.manual). The Buffer class doesn't support reference ranks (that is, the buffers have RankSetEMPTY). Clients may use BufferClass directly, or create their own subclasses (see .subclassing).

.class.hierarchy.segbuf: Class SegBufClass is also provided for the use of pools which additionally need to associate buffers with segments. SegBufClass is a subclass of BufferClass. Manual pools may find it convenient to use SegBufClass, but it is primarily intended for automatic pools (.req.exact, .req.ambig). An instance of SegBufClass may be attached to a region of memory that lies within a single segment. The segment is associated with the buffer, and may be accessed with the BufferSeg() function. SegBufClass also supports references at any rank set. Hence this class or one of its subclasses should be used by all automatic pools (with the possible exception of leaf pools). The rank sets of buffers and the segments they are attached to must match. Clients may use SegBufClass directly, or create their own subclasses (see .subclassing).

.class.hierarchy.rankbuf: Class RankBufClass is also provided as a subclass of SegBufClass. The only way in which this differs from its superclass is that the rankset of a RankBufClass is set during initialization to the singleton rank passed as an additional parameter to BufferCreate(). Instances of RankBufClass are of the same type as instances of SegBufClass, that is, SegBuf. Clients may use RankBufClass directly, or create their own subclasses (see .subclassing).

.class.create: The buffer creation functions (BufferCreate() and BufferCreateV()) take a class parameter, which determines the class of buffer to be created.

.class.choice: Pools which support buffered allocation should specify a default class for buffers. This class will be used when a buffer is created in the normal fashion by MPS clients (for example by a call to mps_ap_create()). Pools specify the default class by means of the bufferClass field in the pool class object. This should be a pointer to a function of type PoolBufferClassMethod. The normal class "Ensure" function (for example EnsureBufferClass()) has the appropriate type.

.subclassing: Pools may create their own subclasses of the standard buffer classes. This is sometimes useful if the pool needs to add an extra field to the buffer. The convenience macro DEFINE_BUFFER_CLASS() may be used to define subclasses of buffer classes. See design.mps.protocol.int.define-special.

.replay: To work with the allocation replayer (see design.mps.telemetry.replayer), the buffer class has to emit an event for each call to an external interface, containing all the parameters passed by the user. If a new event type is required to carry this information, the replayer (impl.c.eventrep) must then be extended to recreate the call.

.replay.pool-buffer: The replayer must also be updated if the association of buffer class to pool or the buffer class hierarchy is changed.

.class.method: Buffer classes provide the following methods (these should not be confused with the pool class methods related to the buffer protocol, described in .method.create and following sections):

typedef Res (*BufferInitMethod)(Buffer buffer, Pool pool, ArgList args)

.class.method.init: init() is a class-specific initialization method called from BufferInit(). It receives the keyword arguments passed to BufferInit(). Client-defined methods must call their superclass method (via a next-method call) before performing any class-specific behaviour. .replay.init: The init() method should emit a BufferInit<foo> event (if there aren't any extra parameters, <foo> = "").

typedef void (*BufferAttachMethod)(Buffer buffer, Addr base, Addr limit, Addr init, Size size)

.class.method.attach: attach() is a class-specific method called whenever a buffer is attached to memory, via BufferAttach(). Client-defined methods must call their superclass method (via a next-method call) before performing any class-specific behaviour.

typedef void (*BufferDetachMethod)(Buffer buffer)

.class.method.detach: detach() is a class-specific method called whenever a buffer is detached from memory, via BufferDetach(). Client-defined methods must call their superclass method (via a next-method call) after performing any class-specific behaviour.

typedef Seg (*BufferSegMethod)(Buffer buffer)

.class.method.seg: seg() is a class-specific accessor method which returns the segment attached to a buffer (or NULL if there isn't one). It is called from BufferSeg(). Clients should not need to define their own methods for this.

typedef RankSet (*BufferRankSetMethod)(Buffer buffer)

.class.method.rankSet: rankSet() is a class-specific accessor method which returns the rank set of a buffer. It is called from BufferRankSet(). Clients should not need to define their own methods for this.

typedef void (*BufferSetRankSetMethod)(Buffer buffer, RankSet rankSet)

.class.method.setRankSet: setRankSet() is a class-specific setter method which sets the rank set of a buffer. It is called from BufferSetRankSet(). Clients should not need to define their own methods for this.
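
The ordering rule for next-method calls (superclass first in init() and attach(), superclass last in detach()) can be illustrated with a toy model. This is a sketch only: the BaseBuf and DerivedBuf types, their fields, and all function names are hypothetical, and direct function calls stand in for the next-method mechanism of design.mps.protocol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the basic buffer class. */
typedef struct BaseBuf {
  int attached;            /* set by the superclass attach method */
} BaseBuf;

/* Hypothetical subclass that adds one extra field. */
typedef struct DerivedBuf {
  BaseBuf super;           /* superclass part comes first */
  const char *seg;         /* class-specific state */
} DerivedBuf;

/* Superclass methods. */
static void baseBufAttach(BaseBuf *buffer) { buffer->attached = 1; }
static void baseBufDetach(BaseBuf *buffer) { buffer->attached = 0; }

/* Subclass attach: call the superclass method first, then do the
 * class-specific work, which may rely on the superclass state. */
static void derivedBufAttach(DerivedBuf *buffer, const char *seg)
{
  baseBufAttach(&buffer->super);
  assert(buffer->super.attached);
  buffer->seg = seg;
}

/* Subclass detach: do the class-specific work first, while the
 * superclass state is still valid, then call the superclass method. */
static void derivedBufDetach(DerivedBuf *buffer)
{
  assert(buffer->super.attached);
  buffer->seg = NULL;
  baseBufDetach(&buffer->super);
}
```

The mirrored ordering means that class-specific code always runs while the superclass part of the buffer is in its attached state.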

Logging

.logging.control: Buffers have a separate control for whether they are logged, because they are particularly high volume. This is a Boolean flag (bufferLogging) in the ArenaStruct.

Measurement

.count: Counting the allocation volume is done by maintaining two fields in the buffer struct:

.count.fields: fillSize, emptySize.

.count.monotonic: Both of these fields are monotonically increasing.

.count.fillsize: fillSize is an accumulated total of the size of all the fills (as a result of calling the PoolClass BufferFill() method) that happen on the buffer.

.count.emptysize: emptySize is an accumulated total of the size of all the empties that happen on the buffer (which are notified to the pool using the PoolClass BufferEmpty() method).

.count.generic: These fields are maintained by the generic buffer code in BufferAttach() and BufferDetach().

.count.other: Similar count fields are maintained in the arena. They are maintained on an internal (buffers used internally by the MPS) and external (buffers used for mutator allocation points) basis. The fields are also updated by the buffer code. The fields are:

in the arena, fillMutatorSize, fillInternalSize, emptyMutatorSize, emptyInternalSize, and allocMutatorSize (5 fields).

.count.alloc.how: The amount of allocation in the buffer just after an empty is fillSize - emptySize. At other times this computation will include space that the buffer has the use of (between base and init) but which may not get allocated in (because the remaining space may be too large for the next reserve so some or all of it may get emptied). The arena field allocMutatorSize is incremented by the allocated size (between base and init) whenever a buffer is detached. Symmetrically this field is decremented by the pre-allocated size (between base and init) whenever a buffer is attached. The overall count is asymptotically correct.

.count.type: All the count fields are type double.

.count.type.justify: This is because double is the type most likely to give us enough precision. Because of the lack of genuine requirements the type isn't so important. It's nice to have it more precise than long, which double usually is.
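
The bookkeeping described in .count.alloc.how can be sketched as a small model. The ToyCounts type and its functions are hypothetical. The sketch also assumes that a fill is counted as the space from init to limit at attach time and an empty as the space from init to limit at detach time; the text above does not state this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the counters; all fields are double (.count.type). */
typedef struct ToyCounts {
  double fillSize;          /* accumulated size of fills on the buffer */
  double emptySize;         /* accumulated size of empties on the buffer */
  double allocMutatorSize;  /* arena-wide estimate of mutator allocation */
} ToyCounts;

/* On attach: count the fill (assumed here to be limit - init) and
 * subtract the pre-allocated size (init - base) from the arena total. */
static void toyCountAttach(ToyCounts *c, size_t base, size_t init, size_t limit)
{
  c->fillSize += (double)(limit - init);
  c->allocMutatorSize -= (double)(init - base);
}

/* On detach: count the empty (assumed here to be limit - init) and
 * add the allocated size (init - base) to the arena total. */
static void toyCountDetach(ToyCounts *c, size_t base, size_t init, size_t limit)
{
  c->emptySize += (double)(limit - init);
  c->allocMutatorSize += (double)(init - base);
}

/* Allocation through the buffer; only exact just after an empty. */
static double toyCountAllocated(const ToyCounts *c)
{
  return c->fillSize - c->emptySize;
}
```

For example, attaching with 64 bytes pre-allocated and detaching after init has advanced by a further 200 bytes leaves both toyCountAllocated() and allocMutatorSize at 200 in this model.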

Notes from the whiteboard

Requirements

atomic update of words
guarantee order of reads and writes to certain memory locations.

Flip

limit:=0
record init for scanner

Commit

init:=alloc
if(limit = 0) ...
L written only by MM
A written only by client (except during synchronized MM op)
I ditto
I read by MM during flip

States

busy
ready
trapped
reset

Note

There are many more states. DRJ.

Misc

During buffer ops all field values can change. Might trash perfectly good ("valid"?) object if pool isn't careful.

Synchronization

Buffers provide a loose form of synchronization between the mutator and the collector.

The crucial synchronization issues are between the operation the pool performs on flip and the mutator's commit operation.

Commit

read init
write init
Memory Barrier
read limit

Flip

write limit
Memory Barrier
read init

Commit consists of two parts. The first is the update to init. This is a declaration that the new object just before init is now correctly formatted and can be scanned. The second is a check to see if the buffer has been "tripped". The ordering of the two parts is crucial.

Note that the declaration that the object is correctly formatted is independent of whether the buffer has been tripped or not. In particular a pool can scan up to the init pointer (including the newly declared object) whether or not the pool will cause the commit to fail. In the case where the pool scans the object, but then causes the commit to fail (and presumably the allocation to occur somewhere else), the pool will have scanned a "dead" object, but this is just another example of conservatism in the general sense.

Note that the read of init in the Flip sequence can in fact be arbitrarily delayed (as long as it is read before a buffered segment is scanned).

On processors with Relaxed Memory Order (such as the DEC Alpha), Memory Barriers will need to be placed at the points indicated.
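
The Commit and Flip sequences above can be sketched with C11 atomics. This is an illustration under stated assumptions, not code from the MPS sources: the FencedAP type and the fencedCommit() and fencedFlip() functions are hypothetical, and atomic_thread_fence() from <stdatomic.h> stands in for the "Memory Barrier" steps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical allocation point with the three fields involved. */
typedef struct FencedAP {
  _Atomic uintptr_t init;   /* written by the client, read by the MM on flip */
  uintptr_t alloc;          /* written only by the client */
  _Atomic uintptr_t limit;  /* written only by the MM */
} FencedAP;

/* Mutator side: write init, barrier, read limit.
 * Returns 1 if the buffer was not tripped, 0 if the slow path
 * (the trip method) would have to be taken. */
static int fencedCommit(FencedAP *ap)
{
  atomic_store_explicit(&ap->init, ap->alloc, memory_order_relaxed);
  atomic_thread_fence(memory_order_seq_cst);
  return atomic_load_explicit(&ap->limit, memory_order_relaxed) != 0;
}

/* Collector side: write limit, barrier, read init.
 * Returns the init value that was read; in this model objects below
 * it have been declared correctly formatted. */
static uintptr_t fencedFlip(FencedAP *ap)
{
  atomic_store_explicit(&ap->limit, (uintptr_t)0, memory_order_relaxed);
  atomic_thread_fence(memory_order_seq_cst);
  return atomic_load_explicit(&ap->init, memory_order_relaxed);
}
```

With a sequentially consistent fence between the write and the read on both sides, at least one side observes the other's write: either the flip sees the new init, or the commit sees the zeroed limit and takes the slow path.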

* DESIGN
*
* An allocation buffer is an interface to a pool which provides
* very fast allocation, and defers the need for synchronization in
* a multi-threaded environment.
*
* Pools which contain formatted objects must be synchronized so
* that the pool can know when an object is valid.  Allocation from
* such pools is done in two stages: reserve and commit.  The client
* first reserves memory, then initializes it, then commits.
* Committing the memory declares that it contains a valid formatted
* object.  Under certain conditions, some pools may cause the
* commit operation to fail.  (See the documentation for the pool.)
* Failure to commit indicates that the whole allocation failed and
* must be restarted.  When using a pool which introduces the
* possibility of commit failing, the allocation sequence could look
* something like this:
*
* do {
*   res = BufferReserve(&p, buffer, size);
*   if(res != ResOK) return res;       // allocation fails, reason res
*   initialize(p);                     // p now points at valid object
* } while(!BufferCommit(buffer, p, size));
*
* Pools which do not contain formatted objects can use a one-step
* allocation as usual.  Effectively any random rubbish counts as a
* "valid object" to such pools.
*
* An allocation buffer is an area of memory which is pre-allocated
* from a pool, plus a buffer descriptor, which contains, inter
* alia, four pointers: base, init, alloc, and limit.  Base points
* to the base address of the area, limit to the last address plus
* one.  Init points to the first uninitialized address in the
* buffer, and alloc points to the first unallocated address.
*
*    L . - - - - - .         ^
*      |           |     Higher addresses -'
*      |   junk    |
*      |           |       the "busy" state, after Reserve
*    A |-----------|
*      |  uninit   |
*    I |-----------|
*      |   init    |
*      |           |     Lower addresses  -.
*    B `-----------'         v
*
*    L . - - - - - .         ^
*      |           |     Higher addresses -'
*      |   junk    |
*      |           |       the "ready" state, after Commit
*  A=I |-----------|
*      |           |
*      |           |
*      |   init    |
*      |           |     Lower addresses  -.
*    B `-----------'         v
*
* Access to these pointers is restricted in order to allow
* synchronization between the pool and the client.  The client may
* only write to init and alloc, but in a restricted and atomic way
* detailed below.  The pool may read the contents of the buffer
* descriptor at _any_ time.  During calls to the fill and trip
* methods, the pool may update any or all of the fields
* in the buffer descriptor.  The pool may update the limit at _any_
* time.
*
* Access to buffers by these methods is not synchronized.  If a buffer
* is to be used by more than one thread then it is the client's
* responsibility to ensure exclusive access.  It is recommended that
* a buffer be used by only a single thread.
*
* [Only one thread may use a buffer at once, unless the client
* places a mutual exclusion around the buffer access in the usual
* way.  In such cases it is usually better to create one buffer for
* each thread.]
*
* Here are pseudo-code descriptions of the reserve and commit
* operations.  These may be implemented in-line by the client.
* Note that the client is responsible for ensuring that the size
* (and therefore the alloc and init pointers) are aligned according
* to the buffer's alignment.
*
* Reserve(buf, size)                   ; size must be aligned to pool
*   if buf->limit - buf->alloc >= size then
*     buf->alloc +=size                ; must be atomic update
*     p = buf->init
*   else
*     res = BufferFill(&p, buf, size)  ; buf contents may change
*
* Commit(buf, p, size)
*   buf->init = buf->alloc             ; must be atomic update
*   if buf->limit == 0 then
*     res = BufferTrip(buf, p, size)   ; buf contents may change
*   else
*     res = True
* (returns True on successful commit)
*
* The pool must allocate the buffer descriptor and initialize it by
* calling BufferInit.  The descriptor this creates will fall
* through to the fill method on the first allocation.  In general,
* pools should not assign resources to the buffer until the first
* allocation, since the buffer may never be used.
*
* The pool may update the base, init, alloc, and limit fields when
* the fallback methods are called.  In addition, the pool may set
* the limit to zero at any time.  The effect of this is either:
*
*   1. cause the _next_ allocation in the buffer to fall through to
*      the buffer fill method, and allow the buffer to be flushed
*      and relocated;
*
*   2. cause the buffer trip method to be called if the client was
*      between reserve and commit.
*
* A buffer may not be relocated under other circumstances because
* there is a race between updating the descriptor and the client
* allocation sequence.
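
The reserve and commit pseudo-code above can be turned into a small single-threaded model. This is a sketch under stated assumptions, not the MPS implementation: the ToyBuf type, the toy pool that hands out fixed 64-byte chunks, and all function names are hypothetical; addresses are modelled as offsets; and the toy trip method always fails the commit and resets the buffer. The fast-path test is written as alloc + size <= limit so that a zero limit can never pass it with unsigned arithmetic.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_ARENA_SIZE = 256, TOY_CHUNK = 64 };

/* Hypothetical buffer descriptor; offset 0 means "no memory attached". */
typedef struct ToyBuf {
  size_t base, init, alloc, limit;
  size_t nextFree;   /* toy pool state: next unused offset */
} ToyBuf;

/* Fallback: attach a fresh chunk and reserve `size` bytes in it. */
static int toyFill(size_t *pReturn, ToyBuf *buf, size_t size)
{
  if (size > TOY_CHUNK || buf->nextFree + TOY_CHUNK > TOY_ARENA_SIZE)
    return 0;                        /* toy pool cannot satisfy the request */
  buf->base = buf->init = buf->nextFree;
  buf->limit = buf->base + TOY_CHUNK;
  buf->nextFree += TOY_CHUNK;
  buf->alloc = buf->init + size;
  *pReturn = buf->init;
  return 1;
}

/* Reserve: bump alloc on the fast path, otherwise fall through to fill. */
static int toyReserve(size_t *pReturn, ToyBuf *buf, size_t size)
{
  assert(buf->init == buf->alloc);   /* reserve/commit pairs do not nest */
  if (buf->alloc + size <= buf->limit) {
    buf->alloc += size;
    *pReturn = buf->init;
    return 1;
  }
  return toyFill(pReturn, buf, size);
}

/* Toy trip: check the preconditions, reset the buffer, fail the commit. */
static int toyTrip(ToyBuf *buf, size_t p, size_t size)
{
  assert(buf->limit == 0);
  assert(buf->init == buf->alloc);
  assert(p + size == buf->alloc);
  buf->base = buf->init = buf->alloc = 0;
  return 0;
}

/* Commit: publish the object, then check whether the buffer was tripped. */
static int toyCommit(ToyBuf *buf, size_t p, size_t size)
{
  buf->init = buf->alloc;
  if (buf->limit == 0)
    return toyTrip(buf, p, size);
  return 1;
}
```

In this model a freshly zeroed descriptor falls through to toyFill() on its first reserve, and setting limit to zero between a reserve and its commit makes the commit fail, after which the next reserve refills the buffer from a new chunk.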

Interface

Res BufferCreate(Buffer *bufferReturn, BufferClass class, Pool pool, Bool isMutator, ArgList args)

.method.create: Create an allocation buffer in a pool. The buffer is created in the "ready" state.

A buffer structure is allocated from the space control pool and partially initialized (in particular, neither the signature nor the serial field is initialized). The pool class's bufferCreate() method is then called. This method can update (some undefined subset of) the fields of the structure; it should return with the buffer in the "ready" state (or fail). The remainder of the initialization then occurs.

If and only if successful then a valid buffer is returned.

void BufferDestroy(Buffer buffer)

.method.destroy: Free a buffer descriptor. The buffer must be in the "ready" state, that is, not between a Reserve and Commit. Allocation in the area of memory to which the descriptor refers must cease after BufferDestroy() is called.

Destroying an allocation buffer does not affect objects which have been allocated, it just frees resources associated with the buffer itself.

The pool class's bufferDestroy() method is called and then the buffer structure is uninitialized and freed.

Bool BufferCheck(Buffer buffer)

.method.check: The check method is straightforward. The non-trivial dependencies checked are:

The ordering constraints between base, init, alloc, and limit.
The alignment constraints on base, init, alloc, and limit.
That the buffer's rank is identical to the segment's rank.

void BufferAttach(Buffer buffer, Addr base, Addr limit, Addr init, Size size)

.method.attach: Set the base, init, alloc, and limit fields so that the buffer is ready to start allocating in an area of memory. The alloc field is set to init + size.

.method.attach.unbusy: BufferAttach() must only be applied to buffers that are not busy.

void BufferDetach(Buffer buffer, Pool pool)

.method.detach: Set the seg, base, init, alloc, and limit fields to zero, so that the next reserve request will call the fill method.

.method.detach.unbusy: BufferDetach() must only be applied to buffers that are not busy.

Bool BufferIsReset(Buffer buffer)

.method.isreset: Returns TRUE if and only if the buffer is in the reset state, that is, with base, init, alloc, and limit all set to zero.

Bool BufferIsReady(Buffer buffer)

.method.isready: Returns TRUE if and only if the buffer is not between a reserve and commit. The result is only reliable if the client is not currently using the buffer, because a client that is using the buffer may update the alloc and init pointers asynchronously.
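
The attach, detach, reset, and ready descriptions above can be summarized in a toy model. The StateBuf type and its functions are hypothetical, addresses are modelled as plain integers, and the ready test is assumed to be init == alloc, which is what "not between a reserve and commit" amounts to in the diagrams earlier in this document.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor holding only the four pointers. */
typedef struct StateBuf {
  size_t base, init, alloc, limit;
} StateBuf;

/* Reset state: all four fields are zero. */
static int stateBufIsReset(const StateBuf *b)
{
  return b->base == 0 && b->init == 0 && b->alloc == 0 && b->limit == 0;
}

/* Ready state: nothing is reserved but uncommitted (assumed init == alloc). */
static int stateBufIsReady(const StateBuf *b)
{
  return b->init == b->alloc;
}

/* Attach to [base, limit), with `size` bytes already reserved at init. */
static void stateBufAttach(StateBuf *b, size_t base, size_t limit,
                           size_t init, size_t size)
{
  assert(stateBufIsReady(b));            /* must not be busy */
  assert(base <= init && init + size <= limit);
  b->base = base;
  b->init = init;
  b->alloc = init + size;                /* alloc is set to init + size */
  b->limit = limit;
}

/* Detach: zero the fields so the next reserve falls through to fill. */
static void stateBufDetach(StateBuf *b)
{
  assert(stateBufIsReady(b));            /* must not be busy */
  b->base = b->init = b->alloc = b->limit = 0;
}
```

Note that attaching with a non-zero size leaves the model buffer busy until the reserved bytes are committed, since alloc is then ahead of init.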

mps_ap_t BufferAP(Buffer buffer)

Returns the APStruct substructure of a buffer.

Buffer BufferOfAP(mps_ap_t ap)

.method.ofap: Return the buffer which owns an APStruct.

.method.ofap.thread-safe: BufferOfAP() must be thread safe (see impl.c.mpsi.thread-safety). This is achieved because the underlying operation is simply a subtraction.
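
The subtraction referred to above is the usual way of recovering a containing structure from a pointer to one of its members. A minimal sketch, assuming hypothetical ToyAPStruct and ToyBufferStruct layouts (the real structure definitions are not given in this document):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical allocation point substructure. */
typedef struct ToyAPStruct {
  void *init, *alloc, *limit;
} ToyAPStruct;

/* Hypothetical buffer structure that embeds the allocation point. */
typedef struct ToyBufferStruct {
  unsigned long serial;    /* a leading field, so the offset is non-zero */
  ToyAPStruct ap_s;
} ToyBufferStruct;

/* Return the allocation point embedded in a buffer. */
static ToyAPStruct *toyBufferAP(ToyBufferStruct *buffer)
{
  return &buffer->ap_s;
}

/* Return the buffer that owns an allocation point, by subtracting the
 * offset of the embedded member from the member's address. */
static ToyBufferStruct *toyBufferOfAP(ToyAPStruct *ap)
{
  return (ToyBufferStruct *)((char *)ap - offsetof(ToyBufferStruct, ap_s));
}
```

Because the conversion reads no fields of either structure, it does not race with any concurrent update to the buffer.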

Arena BufferArena(Buffer buffer)

.method.arena: Returns the arena which owns a buffer.

.method.arena.thread-safe: BufferArena() must be thread safe (see impl.c.mpsi.thread-safety). This is achieved simply because the underlying operation is a read of shared-non-mutable data (see design.mps.thread-safety).

Pool BufferPool(Buffer buffer)

Returns the pool to which a buffer is attached.

Res BufferReserve(Addr *pReturn, Buffer buffer, Size size)

.method.reserve: Reserves memory from an allocation buffer.

This is a provided version of the reserve procedure described above. The size must be aligned according to the buffer alignment. If successful, ResOK is returned and *pReturn is updated with a pointer to the reserved memory. Otherwise *pReturn is not touched. The reserved memory is not guaranteed to have any particular contents. The memory must be initialized with a valid object (according to the pool to which the buffer belongs) and then passed to the BufferCommit() method (see below). BufferReserve() may not be applied twice to a buffer without a BufferCommit() in between. In other words, Reserve/Commit pairs do not nest.

Res BufferFill(Addr *pReturn, Buffer buffer, Size size)

.method.fill: Refills an empty buffer. If there is not enough space in a buffer to allocate in-line, BufferFill() must be called to "refill" the buffer.

Bool BufferCommit(Buffer buffer, Addr p, Size size)

.method.commit: Commit memory previously reserved.

BufferCommit() notifies the pool that memory which has been previously reserved (see above) has been initialized with a valid object (according to the pool to which the buffer belongs). The pointer p must be the same as that returned by BufferReserve(), and the size must match the size passed to BufferReserve().

BufferCommit() may not be applied twice to a buffer without a reserve in between. In other words, objects must be reserved, initialized, then committed only once.

Commit returns TRUE if successful, FALSE otherwise. If commit fails and returns FALSE, the client may try to allocate again by going back to the reserve stage, but must not use the memory at p again for any purpose.

Some classes of pool may cause commit to fail under rare circumstances.

Bool BufferTrip(Buffer buffer, Addr p, Size size)

.method.trip: Act on a tripped buffer. The pool which owns a buffer may asynchronously set the buffer limit to zero in order to get control over the buffer. If this occurs after a BufferReserve() (but before the corresponding commit), then the BufferCommit() method calls BufferTrip() and the BufferCommit() method returns with the return value of BufferTrip().

.method.trip.precondition: At the time trip is called, from BufferCommit(), the following are true:

.method.trip.precondition.limit: limit == 0
.method.trip.precondition.init: init == alloc
.method.trip.precondition.p: p + size == alloc

Diagrams

Here are a number of diagrams showing how buffers behave. In general, the horizontal axis corresponds to mutator action (reserve, commit) and the vertical axis corresponds to collector action. I'm not sure which of the diagrams are the same as each other, and which are best or most complete when they are different, but they all attempt to show essentially the same information. It's very difficult to get all the details in. These diagrams were drawn by Richard Brooksby, Richard Tucker, Gavin Matthews, and others in April 1997. In general, the later diagrams are, I suspect, more correct, complete and useful than the earlier ones. I have put them all here for the record. Richard Tucker, 1998-02-09.

Buffer Diagram: Buffer States

[Missing diagrams: Buffer States (3-column); Buffer States (4-column); Buffer States (gavinised); Buffer States (interleaved); Buffer States (richardized).]

Document History

1996-09-02 RB incomplete design
2002-06-07 RB Converted from MMInfo database design document.
2007-03-22 RHSK Created Guide.
2013-05-24 GDR Converted to reStructuredText; some tidying and modernizing (BufferInit() takes keyword arguments; BufferSpace(), BufferSet() and BufferReset() are now BufferArena(), BufferAttach() and BufferDetach() respectively; BufferExpose() and BufferCover() have been moved to the Shield interface; see design.mps.shield).

Copyright and License

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.