.. mode: -*- rst -*- Coalescing block structure ========================== :Tag: design.mps.cbs :Author: Gavin Matthews :Date: 1998-05-01 :Status: complete design :Revision: $Id: //info.ravenbrook.com/project/mps/version/1.113/design/cbs.txt#1 $ :Copyright: See section `Copyright and License`_. Introduction ------------ _`.intro`: This is the design for impl.c.cbs, which implements a data structure for the management of non-intersecting memory ranges, with eager coalescence. _`.readership`: This document is intended for any MM developer. _`.source`: design.mps.poolmv2, design.mps.poolmvff. _`.overview`: The "coalescing block structure" is a set of addresses (or a subset of address space), with provision for efficient management of contiguous ranges, including insertion and deletion, high level communication with the client about the size of contiguous ranges, and detection of protocol violations. Definitions ----------- _`.def.range`: A (contiguous) *range* of addresses is a semi-open interval on address space. _`.def.isolated`: A contiguous range is *isolated* with respect to some property it has, if adjacent elements do not have that property. Requirements ------------ _`.req.set`: Must maintain a set of addresses. _`.req.fast`: Common operations must have a low amortized cost. _`.req.add`: Must be able to add address ranges to the set. _`.req.remove`: Must be able to remove address ranges from the set. _`.req.size`: Must report concisely to the client when isolated contiguous ranges of at least a certain size appear and disappear. _`.req.iterate`: Must support the iteration of all isolated contiguous ranges. This will not be a common operation. _`.req.protocol`: Must detect protocol violations. _`.req.debug`: Must support debugging of client code. _`.req.small`: Must have a small space overhead for the storage of typical subsets of address space and not have abysmal overhead for the storage of any subset of address space. _`.req.align`: Must support an alignment (the alignment of all addresses specifying ranges) of down to ``sizeof(void *)`` without losing memory. Interface --------- _`.header`: CBS is used through impl.h.cbs. External types .............. ``typedef struct CBSStruct *CBS`` _`.type.cbs`: ``CBS`` is the main data structure for manipulating a CBS. It is intended that a ``CBSStruct`` be embedded in another structure. No convenience functions are provided for the allocation or deallocation of the CBS. ``typedef Bool (*CBSIterateMethod)(CBS cbs, Range range, void *closureP, Size closureS)`` _`.type.cbs.iterate.method`: Type ``CBSIterateMethod`` is a callback function that may be passed to ``CBSIterate()``. It is called for every isolated contiguous range in address order. The function must returns a ``Bool`` indicating whether to continue with the iteration. External functions .................. ``Res CBSInit(Arena arena, CBS cbs, void *owner, Align alignment, Bool fastFind, ArgList args)`` _`.function.cbs.init`: ``CBSInit()`` is the function that initialises the CBS structure. It performs allocation in the supplied arena. The parameter ``owner`` is passed to ``MeterInit()``, an ``alignment`` indicates the alignment of ranges to be maintained. An initialised CBS contains no ranges. ``fastFind``, if set, causes the CBS to maintain, for each subtree, the size of the largest block in that subtree. This must be true if any of the ``CBSFindFirst()``, ``CBSFindLast()``, or ``CBSFindLargest()`` functions are going to be used on the CBS. ``CBSInit()`` may take one keyword argument: * ``MPS_KEY_CBS_EXTEND_BY`` (type ``Size``; default 4096) is the size of segment that the CBS will request from the arena in which to allocate its ``CBSBlock`` structures. ``void CBSFinish(CBS cbs)`` _`.function.cbs.finish`: ``CBSFinish()`` is the function that finishes the CBS structure and discards any other resources associated with the CBS. ``Res CBSInsert(Range rangeReturn, CBS cbs, Range range)`` _`.function.cbs.insert`: If any part of ``range`` is already in the CBS, then leave it unchanged and return ``ResFAIL``. Otherwise, attempt to insert ``range`` into the CBS. If the insertion succeeds, then update ``rangeReturn`` to describe the contiguous isolated range containing the inserted range (this may differ from ``range`` if there was coalescence on either side) and return ``ResOK``. If the insertion fails, return a result code indicating allocation failure. _`.function.cbs.insert.fail`: Insertion of a valid range (that is, one that does not overlap with any range in the CBS) can only fail if the new range is isolated and the allocation of the necessary data structure to represent it failed. ``Res CBSDelete(Range rangeReturn, CBS cbs, Range range)`` _`.function.cbs.delete`: If any part of the range is not in the CBS, then leave the CBS unchanged and return ``ResFAIL``. Otherwise, update ``rangeReturn`` to describe the contiguous isolated range that contains ``range`` (this may differ from ``range`` if there are fragments on either side) and attempt to delete the range from the CBS. If the deletion succeeds, return ``ResOK``. If the deletion fails, return a result code indicating allocation failure. _`.function.cbs.delete.fail`: Deletion of a valid range (that is, one that is wholly contained in the CBS) can only fail if there are fragments on both sides and the allocation of the necessary data structures to represent them fails. _`.function.cbs.delete.return`: ``CBSDelete()`` returns the contiguous isolated range that contains ``range`` even if the deletion fails. This is so that the caller can try deleting the whole block (which is guaranteed to succeed) and managing the fragments using a fallback strategy. ``void CBSIterate(CBS cbs, CBSIterateMethod iterate, void *closureP, Size closureS)`` _`.function.cbs.iterate`: ``CBSIterate()`` is the function used to iterate all isolated contiguous ranges in a CBS. It receives a pointer, ``Size`` closure pair to pass on to the iterator method, and an iterator method to invoke on every range in address order. If the iterator method returns ``FALSE``, then the iteration is terminated. ``Res CBSDescribe(CBS cbs, mps_lib_FILE *stream)`` _`.function.cbs.describe`: ``CBSDescribe()`` is a function that prints a textual representation of the CBS to the given stream, indicating the contiguous ranges in order, as well as the structure of the underlying splay tree implementation. It is provided for debugging purposes only. ``Bool CBSFindFirst(Range rangeReturn, Range oldRangeReturn, CBS cbs, Size size, FindDelete findDelete)`` _`.function.cbs.find.first`: Locate the first block (in address order) within the CBS of at least the specified size, update ``rangeReturn`` to describe that range, and return ``TRUE``. If there is no such block, it returns ``FALSE``. In addition, optionally delete the top, bottom, or all of the found range, depending on the ``findDelete`` argument. This saves a separate call to ``CBSDelete()``, and uses the knowledge of exactly where we found the range. The value of ``findDelete`` must come from this enumeration:: enum { FindDeleteNONE, /* don't delete after finding */ FindDeleteLOW, /* delete size bytes from low end of block */ FindDeleteHIGH, /* delete size bytes from high end of block */ FindDeleteENTIRE /* delete entire range */ }; The original contiguous isolated range in which the range was found is returned via the ``oldRangeReturn`` argument. (If ``findDelete`` is ``FindDeleteNONE`` or ``FindDeleteENTIRE``, then this will be identical to the range returned via the ``rangeReturn`` argument.) ``CBSFindFirst()`` requires that ``fastFind`` was true when ``CBSInit()`` was called. ``Bool CBSFindLast(Range rangeReturn, Range oldRangeReturn, CBS cbs, Size size, FindDelete findDelete)`` _`.function.cbs.find.last`: Like ``CBSFindFirst()``, except that it finds the last block in address order. ``Bool CBSFindLargest(Range rangeReturn, Range oldRangeReturn, CBS cbs, Size size, FindDelete findDelete)`` _`.function.cbs.find.largest`: Locate the largest block within the CBS, and if that block is at least as big as ``size``, return its range via the ``rangeReturn`` argument, and return ``TRUE``. If there are no blocks in the CBS at least as large as ``size``, return ``FALSE``. Pass 0 for ``size`` if you want the largest block unconditionally. Like ``CBSFindFirst()``, optionally delete the range (specifying ``FindDeleteLOW`` or ``FindDeleteHIGH`` has the same effect as ``FindDeleteENTIRE``). This feature requires that ``fastFind`` was true when ``CBSInit()`` was called. Implementation -------------- _`.impl`: This section is concerned with describing various aspects of the implementation. It does not form part of the interface definition. Splay tree .......... _`.impl.splay`: The CBS is principally implemented using a splay tree (see design.mps.splay_). Each splay tree node is embedded in a ``CBSBlock`` that represents a semi-open address range. The key passed for comparison is the base of another range. .. _design.mps.splay: splay _`.impl.splay.fast-find`: ``CBSFindFirst()`` and ``CBSFindLast()`` use the update/refresh facility of splay trees to store, in each ``CBSBlock``, an accurate summary of the maximum block size in the tree rooted at the corresponding splay node. This allows rapid location of the first or last suitable block, and very rapid failure if there is no suitable block. _`.impl.find-largest`: ``CBSFindLargest()`` simply finds out the size of the largest block in the CBS from the root of the tree, using ``SplayRoot()``, and does ``SplayFindFirst()`` for a block of that size. This takes time proportional to the logarithm of the size of the free list, so it's about the best you can do without maintaining a separate priority queue, just to do ``CBSFindLargest()``. Low memory behaviour .................... _`.impl.low-mem`: When the CBS tries to allocate a new ``CBSBlock`` structure for a new isolated range as a result of either ``CBSInsert()`` or ``CBSDelete()``, and there is insufficient memory to allocation the ``CBSBlock`` structure, then the range is not added to the CBS or deleted from it, and the call to ``CBSInsert()`` or ``CBSDelete()`` returns ``ResMEMORY``. The CBS block ............. _`.impl.cbs.block`: The block contains a base-limit pair and a splay tree node. _`.impl.cbs.block.special`: The base and limit may be equal if the block is halfway through being deleted. _`.impl.cbs.block.special.just`: This conflates values and status, but is justified because block size is very important. Testing ------- _`.test`: The following testing will be performed on this module: _`.test.cbstest`: There is a stress test for this module in impl.c.cbstest. This allocates a large block of memory and then simulates the allocation and deallocation of ranges within this block using both a ``CBS`` and a ``BT``. It makes both valid and invalid requests, and compares the ``CBS`` response to the correct behaviour as determined by the ``BT``. It also iterates the ranges in the ``CBS``, comparing them to the ``BT``. It also invokes the ``CBSDescribe()`` method, but makes no automatic test of the resulting output. It does not currently test the callbacks. _`.test.pool`: Several pools (currently MVT_ and MVFF_) are implemented on top of a CBS. These pool are subject to testing in development, QA, and are/will be heavily exercised by customers. .. _MVT: poolmvt .. _MVFF: poolmvff Notes for future development ---------------------------- _`.future.not-splay`: The initial implementation of CBSs is based on splay trees. It could be revised to use any other data structure that meets the requirements (especially `.req.fast`_). _`.future.hybrid`: It would be possible to attenuate the problem of `.risk.overhead`_ (below) by using a single word bit set to represent the membership in a (possibly aligned) word-width of grains. This might be used for block sizes less than a word-width of grains, converting them when they reach all free in the bit set. Note that this would make coalescence slightly less eager, by up to ``(word-width - 1)``. Risks ----- _`.risk.overhead`: Clients should note that the current implementation of CBSs has a space overhead proportional to the number of isolated contiguous ranges. [Four words per range.] If the CBS contains every other grain in an area, then the overhead will be large compared to the size of that area. [Four words per two grains.] The CBS structure is thus suitable only for managing large enough ranges. Document History ---------------- - 1998-05-01 Gavin Matthews. This document was derived from the outline in design.mps.poolmv2(2). - 1998-07-22 Gavin Matthews. Updated in response to approval comments in change.epcore.anchovy.160040. There is too much fragmentation in trapping memory. - Gavin Matthews. Updated (as part of change.epcore.brisling.160158: MVFF cannot be instantiated with 4-byte alignment) to document new alignment restrictions. - 2002-06-07 RB_ Converted from MMInfo database design document. - 2013-04-14 GDR_ Converted to reStructuredText. - 2013-05-19 GDR_ Removed the "emergency" free list allocator, the design notes on an unimplemented "future hybrid" scheme, the callbacks, block interface, and minimum size interface. Updated the arguments for ``CBSIterateMethod``, ``CBSInit()``, ``CBSInsert()``, and ``CBSDelete()``. - 2013-05-23 RB_ Removed references to "contingency methods" that were talking about the deleted "emergency" free list allocator. Documented ``fastFind`` argument to ``CBSInit()``. .. _RB: http://www.ravenbrook.com/consultants/rb/ .. _GDR: http://www.ravenbrook.com/consultants/gdr/ Copyright and License --------------------- Copyright © 1998-2014 Ravenbrook Limited. All rights reserved. . This is an open source license. Contact Ravenbrook for commercial licensing options. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: #. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. #. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. #. Redistributions in any form must be accompanied by information on how to obtain complete source code for this software and any accompanying software that uses this software. The source code must either be included in the distribution or be available for no more than the cost of distribution plus a nominal fee, and must be freely redistributable under reasonable conditions. For an executable file, complete source code means the source code for all modules it contains. It does not include source code for modules or files that typically accompany the major components of the operating system on which the executable file runs. **This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement, are disclaimed. In no event shall the copyright holders and contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.**