This wiki article contains incomplete and informal notes about the MPS, the precursor to more formal documentation. Not confidential. Readership: MPS users and developers.
These notes on Allocation Points were written by RHSK between 2006-06-07 and 2006-06-22, following research and discussion with RB and NB. Warning: the text in this User's Guide is preliminary, but believed to be 'conservatively correct'. In other words, I think if you follow these guidelines, your code will be correct, and will not violate the current or future definitions of the MPS ap protocol. But this is not (yet) an accurate statement of the MPS ap protocol. RHSK 2006-06-13.
[Note: some constraints may be mentioned only here, and not yet in other places they should be mentioned, such as the Reference Manual. Notably, the client's obligation to ensure there are no exact references to a failed new-object, before it calls mps_ap_trip, is suspected to be new. RHSK 2006-06-13]
Allocation points are an MPS protocol that the client uses to allocate memory with low overhead, and low synchronization cost (with an asynchronous collector and with other threads).
The allocation point protocol is designed to work with incremental collections and multi-threaded clients. (And even if your particular client is single-threaded and non-incremental, for the purposes of using allocation points it's easiest to assume you are coding for this case).
The assumption is that, between any two client instructions, the MPS can interrupt you, move objects around (if they are in a moving pool) and collect garbage. To cope with this, allocating is a two-step process.
Important: .assume.ambig-workspace: this User's Guide assumes that you have declared your stack and registers to be a root that is ambiguously scanned, using mps_root_create_reg and passing the mps_stack_scan_ambig function to it. This is the simplest way to write a client. Other scenarios are possible, but their implications for correct Allocation Point use are not yet documented here.
The rest of this Allocation Points User's Guide contains the following sections:
To create an allocation point in a pool,
call mps_ap_create
.
This may require additional arguments, depending on the pool
class. See pool class documentation.
An allocation point MUST NOT be used by more than one thread. (Each thread must have its own allocation point or points).
Destroy an allocation point with mps_ap_destroy
.
When the client is building (creating and formatting) a new object, you can think of it as being 'in a race' with the MPS. The object is 'under construction', and the MPS cannot manage it in the normal way. So the client should build the object quickly, and then commit it to MPS management. Rarely, the MPS has to move other objects around right in the middle of this build phase: that's a (small) price you pay for having an asynchronous collector. If this happens, the MPS will tell the client that it has 'lost the race'. Objects have moved around, and the new object is invalid. The client must start building it again from scratch.
The client starts building the new object with
mps_reserve
, and
marks it complete by calling mps_commit
.
Almost always, mps_commit
succeeds. But if the
client did not complete the object in time, then
mps_commit
fails (returns 0).
This is how the client should build a new object:
mps_reserve
some memory,mps_commit
the new object to MPS management;If commit succeeds, the object is complete, and immediately becomes just a normal allocated object. The client may write a reference to the new object into some older object (thereby connecting the new object into the client's graph of objects).
If commit fails, the new object no longer exists: the data has gone and any references that used to refer to it are now dangling pointers. The client should simply try to build the object again.
In pseudo-code, the standard allocation point idiom is:
do mps_reserve initialize new object make an ambiguous reference to new object while (! mps_commit) link new object into my object graph
(Do not worry about getting stuck in this loop: commit usually fails at most once per collection, so it is very rare for commit to fail even once, let alone twice).
In C, this typically looks like this:
int make_object(mps_ap_t ap, object *parent) { void *p; object *neo = NULL; do { if (mps_reserve(&p, ap, SIZE_OBJECT) != MPS_RES_OK) { goto fail_make_object; } /* Build the new object */ neo = p; neo->formatcode = FORMAT_CLIENT; /* (not fwd or pad) */ neo->type = TYPE_OBJECT; neo->size = SIZE_OBJECT; neo->parent = parent; neo->tribe = parent->tribe; neo->child = NULL; /* neo (ambiguous reference) preserves the new object */ } while (! mps_commit(ap, p, SIZE_OBJECT)); /* Success: link the new object into my object graph */ parent->child = neo; return TRUE; fail_make_object: return FALSE; /* out of memory, etc */ }
Note that, throughout this User's Guide, we assume that
the stack and registers are declared as ambiguous roots
(.assume.ambig-workspace)
which means that the neo
pointer
keeps the new object alive for us.
The rest of this User's Guide goes through these steps in more detail.
The MPS is a moving garbage collector: it supports preserve-by-copying pools, whose objects are 'mobile'. Whenever the MPS moves an object, it will ensure that all managed references are updated to point to the new location -- and this happens instantaneously as far as the client sees it.
The client should assume that, between any pair of instructions, the MPS may 'shake' this graph, moving all the mobile objects, and updating all the managed references.
Any parts of the graph that are no longer connected (no longer reachable from declared roots) may be collected, and the memory that those objects occupied may be unmapped, or re-used for different objects.
The client usually takes care to ensure that all the references it holds are managed. To be managed, the reference must be in a declared root (such as a scanned stack or a global variable), or in a formatted object that is reachable from a root.
It is okay for a careful client to hold unmanaged references, but:
Call mps_reserve
, passing the size of the
new object you wish to create.
The size must be aligned to the pool alignment.
This is in contrast to mps_alloc, which (for some pools) allows
unaligned sizes.
[Normally, use mps_reserve
(the lower-case C macro).
But if you are using a weak compiler that does not detect common
subexpressions, you may find that using
MPS_RESERVE_BLOCK
(functionally identical) generates
faster code.
Or it may generate slower code. It depends on your compiler, and
you will have to conduct tests to find out.]
mps_reserve
returns a reference to a piece
of new memory for the client to build a new object in.
During this build, the MPS pins the piece of memory, and
treats it as raw data.
"Pinned" means: it will not move, be collected, be unmapped, or anything like that. You may keep an unmanaged reference to it at this time.
"Raw data" means two things:
Firstly, "raw data" means that any references stored IN the new object are unmanaged. This means:
Secondly, "raw data" means that any references TO the new object are treated like other references to unmanaged memory:
The client will typically do all these things:
However, during the build, there are a couple of restrictions:
Once the client has stored a reference IN the new object, it MUST NOT read it out again — any reference stored in the new object is unmanaged, and may have become stale.
(Actually, the restriction is: the moment a reference to an existing mobile object is written into the new object, that reference (in the new object) may become stale. And you'd better not use (dereference) a stale reference. And you'd better not write it into any exactly-scanned cell (such as in an existing object). Reading it into an ambiguously-scanned cell (such as an ambiguously scanned register or stack cell) is okay as long as you don't dereference it. Writing it back into another part of the new object is okay too. Just don't trust it to be a valid reference.)
The client MUST NOT store a reference TO the new object in any exactly-scanned place.
[Note: this is in fact possible, but the protocol for doing it is more complex, and beyond the scope of this guide. RHSK 2006-06-22]
This means the client should NOT connect the new object into the graph of managed objects during the build.
Before the end of the build phase:
Optionally, for improved robustness to bugs, consider initialising all parts of the new object, including parts that are not yet being used to store useful data (such as a string buffer). You might want to make this compile-time switchable, for debugging.
If you leave these unused parts uninitialised, they may contain data that looks like a valid object -- this is called a "spoof object". (This might be the 'ghost' of a previous object, or just random junk that happens to look like a valid object).
This is completely legal: spoof objects do not cause a problem for the MPS.
However, this might leave you with less safety margin than you want, especially when developing a new client. If there were to be a bug in your code (or indeed in the MPS) that resulted in a bogus exact reference to this spoof, it might go undetected, and arbitrary corruption might occur before the bug came to light. So, consider filling these as-yet unused parts with specially chosen dummy values, at least as an option for debugging. Choose dummy values that your format code will recognise as not permitted at the start of a valid formatted object. You will then detect bogus exact references more promptly.
[RHSK 2006-06-15: In poolamc, these ghosts will be forwarding pointers, and they will usually get unmapped (though unless we use zeroed / secure / etc VM they may get mapped-in again intact). But if the tract is nailed they won't even get unmapped. And ghost forwarding pointers are just as bad news as any other spoof. There's currently no format method "destroy". If there was, we could call it in the reclaim phase, to allow format code to safely mark these ghosts as dead. Actually, perhaps that's a valid use of the 'pad' method? ]
When you call mps_commit
, it will either fail or succeed.
Almost always, mps_commit
succeeds.
If it succeeds, that means:
Occasionally but rarely, mps_commit
fails.
This means:
mps_commit
returns;If commit fails, the client usually tries making the object again (although this is not required: it is allowed to just give up!). This is why the standard allocation point idiom has a do...while loop.
Here are some examples of mistakes to avoid:
The example below is INCORRECT.typedef struct object_s { int formatcode; /* FORMAT_CLIENT, _FWD, or _PAD */ int type; size_t size; struct object_s *tribe; struct object_s *parent; struct object_s *child; } object; int make_object(mps_ap_t ap, object *parent) { void *p; object *neo = NULL; do { if (mps_reserve(&p, ap, SIZE_OBJECT) != MPS_RES_OK) { goto fail_make_object; } /* Build the new object */ neo = p; neo->formatcode = FORMAT_CLIENT; neo->type = TYPE_OBJECT; neo->size = SIZE_OBJECT; neo->parent = parent; neo->tribe = neo->parent->tribe; /*--- incorrect-1 ---*/ parent->child = neo; /*--- incorrect-2 ---*/ /* neo (ambiguous reference) preserves the new object */ } while (! mps_commit(ap, p, SIZE_OBJECT)); neo->child = NULL; /*--- incorrect-3 ---*/ return TRUE; fail_make_object: return FALSE; /* out of memory, etc */ } The example above is INCORRECT.
Incorrect-1: do not read references from the new object.
Dereferencing neo->parent
is illegal.
(The code should use parent->tribe
).
Incorrect-2: making an exact reference to the new object is illegal. (The code should only do this after a successful commit).
Incorrect-3: the child
slot (in this example) is
exactly scanned, and it MUST be initialised before the call to
commit.
(The code shown is initialising it too late).
Although this User's Guide explains the protocol in terms of the
pre-packaged macros
mps_reserve
and mps_commit
,
that is a simplification.
The MPS allocation point protocol is designed as a
binary protocol, defined at the level of atomic machine operations.
The precise specification of the binary protocol is beyond the
scope of this document.
For further discussion of Allocation Points, see Allocation Points -- Internals.
(As part of GC article) 2006-06-09 RHSK Allocation points: how it's supposed to be, from RB 2006-06-09. 2006-06-09 RHSK Allocation points: must make at least one ambiguous ref and no exact refs to new object before calling commit. 2006-06-12 RHSK Allocation points: minor edits for clarity. 2006-06-12 RHSK Allocation point internals: Scenario; A mutator with ambiguous references only (to the new object); How bad is a bogus exact reference?; quick notes on: with exact refs; flipping at other times. 2006-06-13 RHSK Allocation point user's guide: Tidy; clarify-- not yet complete. Distinguish it from section on AP internals. 2006-06-14 RHSK Allocation point user's guide: standard reserve-commit idiom: corrections, and now assume ambiguous reg+stack: transitioning to this assumption. Add .talk.RB.2006-06-13 about standard r-c idiom Add .talk.RB.2006-06-14 about Modes of use of MPS, and Promises to not flip (single-threaded). Add .think.RHSK.2006-06-14 about how we could support unscanned reg+stack, even with multi-threaded, moving collector. 2006-06-15 RHSK Allocation Points: clarifications and further notes into User Guide and Internals, including initializing as-yet unused parts of new objects to reduce spoofing. 2006-06-22 RHSK Move rough notes .talk.RB.2006-06-13 etc elsewhere. Make AP User Guide consistently .assume.ambig-workspace, moving notes to APInternals; add Intro and Mistakes; check example code. (This article) 2006-06-22 RHSK Created from GC article. 2006-06-23 RHSK Move AP Internals discussion to its own article.
This document is copyright © 2006 Ravenbrook Limited. All rights reserved. This is an open source license. Contact Ravenbrook for commercial licensing options.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement, are disclaimed. In no event shall the copyright holders and contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.