18. Lands

18.1. Introduction

.intro: This is the design of the land abstract data type, which represents a collection of contiguous address ranges.

.readership: This document is intended for any MPS developer.

.source: design.mps.cbs, design.mps.freelist.

.overview: Collections of address ranges are used in several places in the MPS: the arena stores a set of mapped address ranges; pools store sets of address ranges which have been acquired from the arena and sets of address ranges that are available for allocation. The land abstract data type makes it easy to try out different implementations with different performance characteristics and other attributes.

.name: The name is inspired by rangeland meaning group of ranges (where ranges is used in the sense grazing areas).

18.2. Definitions

.def.range: A (contiguous) range of addresses is a semi-open interval on address space.

.def.isolated: A contiguous range is isolated with respect to some property it has, if adjacent elements do not have that property.

18.3. Requirements

.req.set: Must maintain a set of addresses.

.req.add: Must be able to add address ranges to the set.

.req.remove: Must be able to remove address ranges from the set.

.req.size: Must report concisely to the client when isolated contiguous ranges of at least a certain size appear and disappear.

.req.iterate: Must support the iteration of all isolated contiguous ranges.

.req.protocol: Must detect protocol violations.

.req.debug: Must support debugging of client code.

.req.align: Must support an alignment (the alignment of all addresses specifying ranges) of down to sizeof(void*) without losing memory.

18.4. Interface

18.4.1. Types

LandStruct *Land

.type.land: The type of a generic land instance.

Bool (*LandVisitor)(Land land, Range range, void *closure)

.type.visitor: Type LandVisitor is a callback function that may be passed to LandIterate(). It is called for every isolated contiguous range in address order. The function must return a Bool indicating whether to continue with the iteration.

Bool (*LandDeleteVisitor)(Bool *deleteReturn, Land land, Range range, void *closure)

.type.deletevisitor: Type LandDeleteVisitor is a callback function that may be passed to LandIterateAndDelete(). It is called for every isolated contiguous range in address order. The function must return a Bool indicating whether to continue with the iteration. It may additionally update *deleteReturn to TRUE if the range must be deleted from the land, or FALSE if the range must be kept. (The default is to keep the range.)

18.4.2. Generic functions

Res LandInit(Land land, LandClass class, Arena arena, Align alignment, void *owner, ArgList args)

.function.init: LandInit() initializes the land structure for the given class. The land will perform allocation (if necessary – not all land classes need to allocate) in the supplied arena. The alignment parameter is the alignment of the address ranges that will be stored and retrieved from the land. The parameter owner is output as a parameter to the LandInit event. The newly initialized land contains no ranges.

Res LandCreate(Land *landReturn, Arena arena, LandClass class, Align alignment, void *owner, ArgList args)

.function.create: LandCreate() allocates memory for a land structure of the given class in arena, and then passes all parameters to LandInit().

void LandFinish(Land land)

.function.finish: LandFinish() finishes the land structure and discards any other resources associated with the land.

void LandSize(Land land)

.function.size: LandSize() returns the total size of the ranges stored in the land.

Res LandInsert(Range rangeReturn, Land land, Range range)

.function.insert: If any part of range is already in the land, then leave it unchanged and return ResFAIL. Otherwise, attempt to insert range into the land. If the insertion succeeds, then update rangeReturn to describe the contiguous isolated range containing the inserted range (this may differ from range if there was coalescence on either side) and return ResOK. If the insertion fails, return a result code indicating allocation failure.

.function.insert.fail: Insertion of a valid range (that is, one that does not overlap with any range in the land) can only fail if the new range is isolated and the allocation of the necessary data structure to represent it failed.

.function.insert.alias: It is acceptable for rangeReturn and range to share storage.

Res LandDelete(Range rangeReturn, Land land, Range range)

.function.delete: If any part of the range is not in the land, then leave the land unchanged and return ResFAIL. Otherwise, update rangeReturn to describe the contiguous isolated range that contains range (this may differ from range if there are fragments on either side) and attempt to delete the range from the land. If the deletion succeeds, return ResOK. If the deletion fails, return a result code indicating allocation failure.

.function.delete.fail: Deletion of a valid range (that is, one that is wholly contained in the land) can only fail if there are fragments on both sides and the allocation of the necessary data structures to represent them fails.

.function.delete.return: LandDelete() returns the contiguous isolated range that contains range even if the deletion fails. This is so that the caller can try deleting the whole block (which is guaranteed to succeed) and managing the fragments using a fallback strategy.

.function.delete.alias: It is acceptable for rangeReturn and range to share storage.

Bool LandIterate(Land land, LandVisitor visitor, void *closure)

.function.iterate: LandIterate() is the function used to iterate all isolated contiguous ranges in a land. It receives a visitor function to invoke on every range, and a closure pointer to pass on to the visitor function. If the visitor function returns FALSE, then iteration is terminated and LandIterate() returns FALSE. If all iterator method calls return TRUE, then LandIterate() returns TRUE

Bool LandIterateAndDelete(Land land, LandDeleteVisitor visitor, void *closure)

.function.iterate.and.delete: As LandIterate(), but the visitor function additionally returns a Boolean indicating whether the range should be deleted from the land.

.function.iterate.and.delete.justify: The reason for having both LandIterate() and LandIterateAndDelete() is that it may be possible to use a more efficient algorithm, or to preserve more properties of the data structure, when it is known that the land willl not be modified during the iteration. For example, in the CBS implementation, LandIterate() uses TreeTraverse() which preserves the tree structure, whereas LandIterateAndDelete() uses TreeTraverseAndDelete() which flattens the tree structure, losing information about recently accessed nodes.

Bool LandFindFirst(Range rangeReturn, Range oldRangeReturn, Land land, Size size, FindDelete findDelete)

.function.find.first: Locate the first block (in address order) within the land of at least the specified size, update rangeReturn to describe that range, and return TRUE. If there is no such block, it returns FALSE.

In addition, optionally delete the top, bottom, or all of the found range, depending on the findDelete argument. This saves a separate call to LandDelete(), and uses the knowledge of exactly where we found the range. The value of findDelete must come from this enumeration:

enum {
    FindDeleteNONE,    /* don't delete after finding */
    FindDeleteLOW,     /* delete size bytes from low end of block */
    FindDeleteHIGH,    /* delete size bytes from high end of block */
    FindDeleteENTIRE   /* delete entire range */
};

The original contiguous isolated range in which the range was found is returned via the oldRangeReturn argument. (If findDelete is FindDeleteNONE or FindDeleteENTIRE, then this will be identical to the range returned via the rangeReturn argument.)

Bool LandFindLast(Range rangeReturn, Range oldRangeReturn, Land land, Size size, FindDelete findDelete)

.function.find.last: Like LandFindFirst(), except that it finds the last block in address order.

Bool LandFindLargest(Range rangeReturn, Range oldRangeReturn, Land land, Size size, FindDelete findDelete)

.function.find.largest: Locate the largest block within the land, and if that block is at least as big as size, return its range via the rangeReturn argument, and return TRUE. If there are no blocks in the land at least as large as size, return FALSE. Pass 0 for size if you want the largest block unconditionally.

Like LandFindFirst(), optionally delete the range (specifying FindDeleteLOW or FindDeleteHIGH has the same effect as FindDeleteENTIRE), and return the original contiguous isolated range in which the range was found via the oldRangeReturn argument.

Res LandFindInZones(Bool *foundReturn, Range rangeReturn, Range oldRangeReturn, Land land, Size size, ZoneSet zoneSet, Bool high)

.function.find.zones: Locate a block at least as big as size that lies entirely within the zoneSet, return its range via the rangeReturn argument, set *foundReturn to TRUE, and return ResOK. (The first such block, if high is FALSE, or the last, if high is TRUE.) If there is no such block, set *foundReturn to FALSE, and return ResOK.

Delete the range as for LandFindFirst() and LastFindLast() (with the effect of FindDeleteLOW if high is FALSE and the effect of FindDeleteHIGH if high is TRUE), and return the original contiguous isolated range in which the range was found via the oldRangeReturn argument.

.function.find.zones.fail: It’s possible that the range can’t be deleted from the land because that would require allocation, in which case the result code indicates the cause of the failure.

Res LandDescribe(Land land, mps_lib_FILE *stream)

.function.describe: LandDescribe() prints a textual representation of the land to the given stream. It is provided for debugging purposes only.

void LandFlush(Land dest, Land src)

.function.flush: Delete ranges of addresses from src and insert them into dest, so long as LandInsert() remains successful.

18.5. Implementations

There are three land implementations:

  1. CBS (Coalescing Block Structure) stores ranges in a splay tree. It has fast (logarithmic in the number of ranges) insertion, deletion and searching, but has substantial space overhead. See design.mps.cbs.

  2. Freelist stores ranges in an address-ordered free list, as in traditional malloc() implementations. Insertion, deletion, and searching are slow (proportional to the number of ranges) but it does not need to allocate. See design.mps.freelist.

  3. Failover combines two lands, using one (the primary) until it fails, and then falls back to the other (the secondary). See design.mps.failover.

18.6. Testing

.test: There is a stress test for implementations of this interface in impl.c.landtest. This allocates a large block of memory and then simulates the allocation and deallocation of ranges within this block using both a Land and a BT. It makes both valid and invalid requests, and compares the Land response to the correct behaviour as determined by the BT. It iterates the ranges in the Land, comparing them to the BT. It invokes the LandDescribe() generic function, but makes no automatic test of the resulting output.