Ravenbrook / Projects / Memory Pool System / Master Product Sources / Design Documents

Memory Pool System Project


                ALLOCATION BUFFERS AND ALLOCATION POINTS
                           design.mps.buffer
                           incomplete design
                           richard 1996-09-02

INTRODUCTION

.scope: This document describes the design of allocation buffers and allocation 
points.

.purpose: The purpose of this document is to record design decisions made 
concerning allocation buffers and allocation points and justify those decisions 
in terms of requirements.

.readership: The document is intended for reading by any Memory Management 
Group developer.


HISTORY

.history.0-1: The history for versions 0-1 is lost pending possible 
reconstruction.

.history.2: Added class hierarchy and subclassing information.


SOURCE

.source.mail: Much of the juicy stuff about buffers is only floating around in 
mail discussions.  You might like to try searching the archives if you can't 
find what you want here.

.source.synchronize: For a discussion of the syncronization issues: 
mail.richard.1995-05-24.10-18, mail.ptw.1995-05-19.19-15, 
mail.richard.1995-05-19.17-10
[drj - I believe that the sequence for flip in ptw's message is incorrect.  The 
operations should be in the other order]

.source.interface: For a description of the buffer interface in C prototypes: 
mail.richard.1997-04-28.09-25(0)

.source.qa: Discussions with QA were useful in pinning down the semantics and 
understanding of some obscure but important boundary cases.  See 
mail.richard.tucker.1997-05-12.09-45(0) et seq (mail subject: "notes on our 
allocation points discussion").


REQUIREMENTS

.req.fast: Allocation must be very fast
.req.thread-safe: Must run safely in a multi-threaded environment
.req.no-synch: Must avoid the use of thread-synchronization (.req.fast)
.req.manual: Support manual memory management
.req.exact: Support exact collectors
.req.ambig: Support ambiguous collectors
.req.count: Must record (approximately) the amount of allocation (in bytes).  
Actually not a requirement any more, but once was put forward as a Dylan 
requirement.  Bits of the code still reflect this requirement.  See 
request.dylan.170554.


CLASSES

.class.hierarchy: The Buffer datastructure is designed to be subclassable (see 
design.mps.protocol). 

.class.hierarchy.buffer: The basic buffer class (BufferClass) supports basic 
allocation-point buffering, and is appropriate for those manual pools which 
don't use segments (.req.manual).  The Buffer class doesn't support reference 
ranks (i.e. the buffers have RankSetEMPTY).  Clients may use BufferClass 
directly, or create their own subclasses (see .subclassing). 

.class.hierarchy.segbuf: Class SegBufClass is also provided for the use of 
pools which additionally need to associate buffers with segments.  SegBufClass 
is a subclass of BufferClass.  Manual pools may find it convenient to use 
SegBufClass, but it is primarily intended for automatic pools (.req.exact, 
.req.ambig).  An instance of SegBufClass may be attached to a region of memory 
that lies within a single segment.  The segment is associated with the buffer, 
and may be accessed with the BufferSeg function.  SegBufClass also supports 
references at any rank set. Hence this class or one of its subclasses should be 
used by all automatic pools (with the possible exception of leaf pools).  The 
rank sets of buffers and the segments they are attached to must match.  Clients 
may use SegBufClass directly, or create their own subclasses (see 
.subclassing).

.class.hierarchy.rankbuf: Class RankBufClass is also provided as a subclass of 
SegBufClass.  The only way in which this differs from its superclass is that 
the rankset of a RankBufClass is set during initialization to the singleton 
rank passed as an additional parameter to BufferCreate.  Instances of 
RankBufClass are of the same type as instances of SegBufClass, i.e., SegBuf.  
Clients may use RankBufClass directly, or create their own subclasses (see 
.subclassing).

.class.create: The buffer creation functions (BufferCreate and BufferCreateV) 
take a class parameter, which determines the class of buffer to be created.

.class.choice: Pools which support buffered allocation should specify a default 
class for buffers.  This class will be used when a buffer is created in the 
normal fashion by MPS clients (for example by a call to mps_ap_create).  Pools 
specify the default class by means of the bufferClass field in the pool class 
object.  This should be a pointer to a function of type PoolBufferClassMethod.  
The normal class "Ensure" function (e.g. Ensure*Buffer*Class) has the 
appropriate type.

.subclassing: Pools may create their own subclasses of the standard buffer 
classes.  This is sometimes useful if the pool needs to add an extra field to 
the buffer.  The convenience macro DEFINE_BUFFER_CLASS may be used to define 
subclasses of buffer classes.  See design.mps.protocol.int.define-special.  
.replay: To work with the allocation replayer (see 
design.mps.telemetry.replayer), the buffer class has to emit an event for each 
call to an external interface, containing all the parameters passed by the 
user.  If a new event type is required to carry this information, the replayer 
(impl.c.eventrep) must then be extended to recreate the call.  
.replay.pool-buffer: The replayer must also be updated if the association of 
buffer class to pool or the buffer class hierarchy is changed.

.class.method: Buffer classes provide the following methods (these should not 
be confused with the pool class methods related to the buffer protocol, 
described in .method.*):

.class.method.init: "init" is a class-specific initialization method called 
from BufferInitV.  It receives the optional (vararg) parameters passed to 
BufferInitV.  Client-defined methods must call their superclass method (via a 
next-method call) before performing any class-specific behaviour.  .replay.init
: The init method should emit a BufferInit<foo> event (if there aren't any 
extra parameters, <foo> = "").

.class.method.finish: "finish" is a class-specific finish method called from 
BufferFinish.  Client-defined methods must call their superclass method (via a 
next-method call) after performing any class-specific behaviour.

.class.method.attach: "attach" is a class-specific method called whenever a 
buffer is attached to memory, via BufferAttach.   Client-defined methods must 
call their superclass method (via a next-method call) before performing any 
class-specific behaviour.

.class.method.detach: "detach" is a class-specific method called whenever a 
buffer is detached from memory, via BufferDetach.  Client-defined methods must 
call their superclass method (via a next-method call) after performing any 
class-specific behaviour.

.class.method.seg: "seg" is a class-specific accessor method which returns the 
segment attached to a buffer (or NULL if there isn't one).  It is called from 
BufferSeg.  Clients should not need to define their own methods for this.

.class.method.rankSet: "rankSet" is a class-specific accessor method which 
returns the rank set of a buffer.  It is called from BufferRankSet.  Clients 
should not need to define their own methods for this.

.class.method.setRankSet: "setRankSet" is a class-specific setter method which 
sets the rank set of a buffer.  It is called from BufferSetRankSet.  Clients 
should not need to define their own methods for this.

.class.method.describe: "describe" is a class-specific method called to 
describe a buffer, via BufferDescribe.  Client-defined methods must call their 
superclass method (via a next-method call) before describing any class-specific 
state.


NOTES

.logging.control: Buffers have a separate control for whether they are logged 
or not, this is because they are particularly high volume.  This is a boolean 
flags (bufferLogging) in the ArenaStruct.

.count: Counting the allocation volume is done by maintaining two fields in the 
buffer struct:  .count.fields: fillSize, emptySize.  .count.monotonic: both of 
these fields are monotonically increasing.  .count.fillsize: fillSize is an 
accumulated total of the size of all the fills (as a result of calling the 
PoolClass BufferFill method) that happen on the buffer.  .count.emptysize: 
emptySize is an accumulated total of the size of all the empties than happen on 
the buffer (which are notified to the pool using the PoolClass BufferEmpty 
method).  .count.generic: These fields are maintained by the generic buffer 
code (in BufferAttach and BufferDetach). 

.count.other: Similar count fields are maintained in the pool and the arena.  
They are maintained on an internal (buffers used internally by MPS) and 
external (buffers used for mutator APs) basis.  The fields are also updated by 
the buffer code.  The fields are: in the pool, 
{fill|empty}{Mutator|Internal}Size (4 fields); in the arena, 
{fill|empty}{Mutator|Internal}Size allocMutatorSize (5 fields). 

 .count.alloc.how: The amount of allocation in the buffer just after an empty 
is (fillSize - emptySize).  At other times this computation will include space 
that the buffer has the use of (between base and init) but which may not get 
allocated in (because the remaining space may be too large for the next reserve 
so some or all of it may get emptied).  The arena field allocMutatorSize is 
incremented by the allocated size (between base and init) whenever a buffer is 
detached. Symmetrically this field is decremented by by the pre-allocated size 
(between base and init) whenever a buffer is attached. The overall count is 
asymptotically correct.

.count.type: All the count fields are type double.  .count.type.justify: This 
is because double is the type most likely to give us enough precision.  Because 
of the lack of genuine requirements the type isn't so important.  It's nice to 
have it more precise than long.  Which double usually is.



From the whiteboard:

REQ
atomic update of words
guarantee order of reads and write to certain memory locations.

FLIP
limit:=0
record init for scanner

COMMIT
init:=alloc
if(limit = 0) ...

L written only by MM
A \ written only by client (except during synchronized MM op)
I /
I read by MM during flip


States

BUSY
READY
TRAPPED
RESET
[drj: there are many more states]


Misc

.misc: During buffer ops all field values can change.  Might trash perfectly 
good ("valid"?) object if pool isn't careful.


Not from the whiteboard.


SYNCHRONIZATION

Buffers provide a loose form of synchronization between the mutator and the 
collector. 

The crucial synchronization issues are between the operation the pool performs 
on flip and the mutator's commit operation.

Commit

read init
write init
Memory Barrier
read limit

Flip

write limit
Memory Barrier
read init

Commit consists of two parts.  The first is the update to init.  This is a 
declaration that the new object just before init is now correctly formatted and 
can be scanned.  The second is a check to see if the buffer has been 
"tripped".  The ordering of the two parts is crucial.

Note that the declaration that the object is correctly formatted is independent 
of whether the buffer has been tripped or not.  In particular a pool can scan 
up to the init pointer (including the newly declared object) whether or not the 
pool will cause the commit to fail.  In the case where the pool scans the 
object, but then causes the commit to fail (and presumably the allocation to 
occur somewhere else), the pool will have scanned a "dead" object, but this is 
just another example of conservatism in the general sense.

Not that the read of init in the Flip sequence can in fact be arbitrarily 
delayed (as long as it is read before a buffered segment is scanned).

On processors with Relaxed Memory Order (such as the DEC Alpha), Memory 
Barriers will need to be placed at the points indicated.

 * DESIGN
 *
 * design.mps.buffer.
 *
 * An allocation buffer is an interface to a pool which provides
 * very fast allocation, and defers the need for synchronization in
 * a multi-threaded environment.
 *
 * Pools which contain formatted objects must be synchronized so
 * that the pool can know when an object is valid.  Allocation from
 * such pools is done in two stages: reserve and commit.  The client
 * first reserves memory, then initializes it, then commits.
 * Committing the memory declares that it contains a valid formatted
 * object.  Under certain conditions, some pools may cause the
 * commit operation to fail.  (See the documentation for the pool.)
 * Failure to commit indicates that the whole allocation failed and
 * must be restarted.  When using a pool which introduces the
 * possibility of commit failing, the allocation sequence could look
 * something like this:
 *
 * do {
 *   res = BufferReserve(&p, buffer, size);
 *   if(res != ResOK) return res;       // allocation fails, reason res
 *   initialize(p);                     // p now points at valid object
 * } while(!BufferCommit(buffer, p, size));
 *
 * Pools which do not contain formatted objects can use a one-step
 * allocation as usual.  Effectively any random rubbish counts as a
 * "valid object" to such pools.
 *
 * An allocation buffer is an area of memory which is pre-allocated
 * from a pool, plus a buffer descriptor, which contains, inter
 * alia, four pointers: base, init, alloc, and limit.  Base points
 * to the base address of the area, limit to the last address plus
 * one.  Init points to the first uninitialized address in the
 * buffer, and alloc points to the first unallocated address.
 *
 *    L . - - - - - .         ^
 *      |           |     Higher addresses -'
 *      |   junk    |
 *      |           |       the "busy" state, after Reserve
 *    A |-----------|
 *      |  uninit   |
 *    I |-----------|
 *      |   init    |
 *      |           |     Lower addresses  -.
 *    B `-----------'         v
 *
 *    L . - - - - - .         ^
 *      |           |     Higher addresses -'
 *      |   junk    |
 *      |           |       the "ready" state, after Commit
 *  A=I |-----------|
 *      |           |
 *      |           |
 *      |   init    |
 *      |           |     Lower addresses  -.
 *    B `-----------'         v
 *
 * Access to these pointers is restricted in order to allow
 * synchronization between the pool and the client.  The client may
 * only write to init and alloc, but in a restricted and atomic way
 * detailed below.  The pool may read the contents of the buffer
 * descriptor at _any_ time.  During calls to the fill and trip
 * methods, the pool may update any or all of the fields
 * in the buffer descriptor.  The pool may update the limit at _any_
 * time.
 *
 * Access to buffers by these methods is not synchronized.  If a buffer
 * is to be used by more than one thread then it is the client's
 * responsibility to ensure exclusive access.  It is recommended that
 * a buffer be used by only a single thread.
 *
 * [Only one thread may use a buffer at once, unless the client
 * places a mutual exclusion around the buffer access in the usual
 * way.  In such cases it is usually better to create one buffer for
 * each thread.]
 *
 * Here are pseudo-code descriptions of the reserve and commit
 * operations.  These may be implemented in-line by the client.
 * Note that the client is responsible for ensuring that the size
 * (and therefore the alloc and init pointers) are aligned according
 * to the buffer's alignment.
 *
 * Reserve(buf, size)                   ; size must be aligned to pool
 *   if buf->limit - buf->alloc >= size then
 *     buf->alloc +=size                ; must be atomic update
 *     p = buf->init
 *   else
 *     res = BufferFill(&p, buf, size)  ; buf contents may change
 *
 * Commit(buf, p, size)
 *   buf->init = buf->alloc             ; must be atomic update
 *   if buf->limit == 0 then
 *     res = BufferTrip(buf, p, size)   ; buf contents may change
 *   else
 *     res = True
 * (returns True on successful commit)
 *
 * The pool must allocate the buffer descriptor and initialize it by
 * calling BufferInit.  The descriptor this creates will fall
 * through to the fill method on the first allocation.  In general,
 * pools should not assign resources to the buffer until the first
 * allocation, since the buffer may never be used.
 *
 * The pool may update the base, init, alloc, and limit fields when
 * the fallback methods are called.  In addition, the pool may set
 * the limit to zero at any time.  The effect of this is either:
 *
 *   1. cause the _next_ allocation in the buffer to fall through to
 *      the buffer fill method, and allow the buffer to be flushed
 *      and relocated;
 *
 *   2. cause the buffer trip method to be called if the client was
 *      between reserve and commit.
 *
 * A buffer may not be relocated under other circumstances because
 * there is a race between updating the descriptor and the client
 * allocation sequence.


.method.create:

BufferCreate

Create an allocation buffer in a pool.

The buffer is created in the "ready" state.

A buffer structure is allocated from the space control pool and partially 
initialized (in particularly neither the signature nor the serial field are 
initialized).  The pool class's bufferCreate method is then called.  This 
method can update (some undefined subset of) the fields of the structure; it 
should return with the buffer in the "ready" state (or fail).  The remainder of 
the initialization then occurs.

If and only if successful then a valid buffer is returned.


.method.destroy:

BufferDestroy

Destroy frees a buffer descriptor.  The buffer must be in the "ready" state, 
i.e. not between a Reserve and Commit.  Allocation in the area of memory to 
which the descriptor refers must cease after Destroy is called.

Destroying an allocation buffer does not affect objects which have been 
allocated, it just frees resources associated with the buffer itself.

The pool class's bufferDestroy method is called and then the buffer structure 
is uninitialized and freed.


.method.check:

BufferCheck

The check method is straightforward, the non-trivial dependencies checked are:
  The ordering constraints between base, init, alloc, and limit.
  The alignment constraints on base, init, alloc, and limit.
  That the buffer's rank is identical to the segment's rank.
  
.method.set-reset:

/* BufferSet/Reset -- set/reset a buffer
 *
 * Set sets the buffer base, init, alloc, and limit fields so that
 * the buffer is ready to start allocating in area of memory.  The
 * alloc field is a copy of the init field.
 *
 * Reset sets the seg, base, init, alloc, and limit fields to
 * zero, so that the next reserve request will call the fill
 * method.
 */

.method.set.unbusy: BufferSet must only be applied to buffers that are not busy.
.method.reset.unbusy: BufferReset must only be applied to buffers that are not 
busy.


.method.accessors:

/* Buffer Information
 *
 * BufferIsReset returns TRUE if and only if the buffer is in the
 * reset state, i.e.  with base, init, alloc, and limit set to zero.
 *
 * BufferIsReady returns TRUE iff the buffer is not between a
 * reserve and commit.  The result is only reliable if the client is
 * not currently using the buffer, since it may update the alloc and
 * init pointers asynchronously.
 *
 * BufferAP returns the APStruct substructure of a buffer.
 *
 * BufferOfAP is a thread-safe (impl.c.mpsi.thread-safety) method of
 * getting the buffer which owns an APStruct.
 *
 * BufferSpace is a thread-safe (impl.c.mpsi.thread-safety) method of
 * getting the space which owns a buffer.
 *
 * BufferPool returns the pool to which a buffer is attached.
 */

.method.ofap:

.method.ofap.thread-safe:
BufferOfAP must be thread safe (see impl.c.mpsi.thread-safety).  This is 
achieved simply because the underlying operation involved is simply a 
subtraction.

.method.space:
.method.space.thread-safe:
BufferSpace must be thread safe (see impl.c.mpsi.thread-safety).  This is 
achieved simple because the underlying operation is a read of 
shared-non-mutable data (see design.mps.thread-safety).

.method.reserve:

/* BufferReserve -- reserve memory from an allocation buffer
 *
 * This is a provided version of the reserve procedure described
 * above.  The size must be aligned according to the buffer
 * alignment.  Iff successful, ResOK is returned and
 * *pReturn updated with a pointer to the reserved memory.
 * Otherwise *pReturn is not touched.  The reserved memory is not
 * guaranteed to have any particular contents.  The memory must be
 * initialized with a valid object (according to the pool to which
 * the buffer belongs) and then passed to the Commit method (see
 * below).  Reserve may not be applied twice to a buffer without a
 * Commit in-between.  In other words, Reserve/Commit pairs do not
 * nest.
 */

Res BufferReserve(Addr *pReturn, Buffer buffer, Size size)
{
  Addr next;

  AVER(pReturn != NULL);
  AVERT(Buffer, buffer);
  AVER(size > 0);
  AVER(SizeIsAligned(size, BufferPool(buffer)->alignment));
  AVER(BufferIsReady(buffer));

  /* Is there enough room in the unallocated portion of the buffer to */
  /* satisfy the request?  If so, just increase the alloc marker and */
  /* return a pointer to the area below it. */

  next = AddrAdd(buffer->ap.alloc, size);
  if(next > buffer->ap.alloc && next <= buffer->ap.limit)
  {
    buffer->ap.alloc = next;
    *pReturn = buffer->ap.init;
    return ResOK;
  }

  /* If the buffer can't accommodate the request, fall through to the */
  /* pool-specific allocation method. */

  return BufferFill(pReturn, buffer, size);
}

.method.fill:

/* BufferFill -- refill an empty buffer
 *
 * If there is not enough space in a buffer to allocate in-line,
 * BufferFill must be called to "refill" the buffer.  (See the
 * description of the in-line Reserve method in the leader comment.)
 */

Res BufferFill(Addr *pReturn, Buffer buffer, Size size)
{
  Res res;
  Pool pool;

  AVER(pReturn != NULL);
  AVERT(Buffer, buffer);
  AVER(size > 0);
  AVER(SizeIsAligned(size, BufferPool(buffer)->alignment));
  AVER(BufferIsReady(buffer));

  pool = BufferPool(buffer);
  res = (*pool->class->bufferFill)(pReturn, pool, buffer, size);

  AVERT(Buffer, buffer);

  return res;
}


.method.commit:

/* BufferCommit -- commit memory previously reserved
 *
 * Commit notifies the pool that memory which has been previously
 * reserved (see above) has been initialized with a valid object
 * (according to the pool to which the buffer belongs).  The pointer
 * p must be the same as that returned by Reserve, and the size must
 * match the size passed to Reserve.
 *
 * Commit may not be applied twice to a buffer without a reserve
 * in-between.  In other words, objects must be reserved,
 * initialized, then committed only once.
 *
 * Commit returns TRUE iff successful.  If commit fails and returns
 * FALSE, the client may try to allocate again by going back to the
 * reserve stage, and may not use the memory at p again for any
 * purpose.
 *
 * Some classes of pool may cause commit to fail under rare
 * circumstances.
 */

Bool BufferCommit(Buffer buffer, Addr p, Size size)
{
  AVERT(Buffer, buffer);
  AVER(size > 0);
  AVER(SizeIsAligned(size, BufferPool(buffer)->alignment));
  /* Buffer is "busy" */
  AVER(!BufferIsReady(buffer));

  /* See design.mps.collection.flip.
   * If a flip occurs before this point, when the pool reads
   * buffer->init it will point below the object, so it will be trashed
   * and the commit must fail when trip is called.  The pool will also
   * read p (during the call to trip) which points to the invalid
   * object at init.
   */

  AVER(p == buffer->ap.init);
  AVER(AddrAdd(buffer->ap.init, size) == buffer->ap.alloc);

  /* Atomically update the init pointer to declare that the object */
  /* is initialized (though it may be invalid if a flip occurred). */

  buffer->ap.init = buffer->ap.alloc;

  /* .improve.memory-barrier: Memory barrier here on the DEC Alpha
   * (and other relaxed memory order architectures). */

  /* If a flip occurs at this point, the pool will see init */
  /* above the object, which is valid, so it will be collected. */
  /* The commit must succeed when trip is called.  The pointer */
  /* p will have been fixed up. */

  /* Trip the buffer if a flip has occurred. */

  if(buffer->ap.limit == 0)
    return BufferTrip(buffer, p, size);

  /* No flip occurred, so succeed. */

  return TRUE;
}


.method.trip:

BufferTrip -- act on a tripped buffer

The pool which owns a buffer may asynchronously set the buffer limit to zero in 
order to get control over the buffer.  If this occurs after a Reserve (but 
before the corresponding commit), then the Commit method calls BufferTrip and 
the Commit method returns with BufferTrip's return value.  (See the description 
of Commit.)

.method.trip.precondition:
At the time trip is called (see Commit), the following are true:
.method.trip.precondition.limit: limit == 0
.method.trip.precondition.init: init == alloc
.method.trip.precondition.p: p+size == alloc



.method.expose-cover:

Expose / Cover

BufferExpose/Cover are used by collectors that want to allocate in a forwarding 
buffer.  Since the forwarding buffer may be Shielded the potential problem of 
handling a recursive fault appears (mutator causes a page fault, collector 
fixes some objects causing allocation, the allocation takes place is a 
protected area of memory which cause a page fault).  BufferExpose guarantees 
that allocation can take place in the buffer without causing a page fault, 
BufferCover removes this guarantee.

[The following paragraph is reverse-constructed conjecture]
BufferExpose puts the buffer in an "exposed" state, BufferCover puts the buffer 
in a "covered" state.  BufferExpose can only be called if the buffer is 
"covered".  BufferCover can only be called if the buffer is "exposed".
[Is this part of the "Protection/Suspension Protocol"? (see mail with that 
subject)]







-------------------------------

Here are a number of diagrams showing how buffers behave. In general, the 
horizontal axis corresponds to mutator action (reserve, commit) and the 
vertical axis corresponds to collector action. I'm not sure which of the 
diagrams are the same as each other, and which are best or most complete when 
they are different, but they all attempt to show essentially the same 
information. It's very difficult to get all the details in. These diagrams were 
drawn by richard, rit, gavinm, &c, &c in April 1997. In general, the later 
diagrams are, I suspect, more correct, complete and useful than the earlier 
ones. I have put them all here for the record. rit 1998-02-09

Buffer Diagram:
Buffer States

Buffer States (3-column)
Buffer States (4-column)
Buffer States (gavinised)
Buffer States (interleaved)
Buffer States (richardized)


A. References

B. Document History

2002-06-07 RB Converted from MMInfo database design document.