.. mode: -*- rst -*- Stack and register scanning =========================== :Tag: design.mps.stack-scan :Author: Gareth Rees :Date: 2014-10-22 :Status: complete design :Revision: $Id: //info.ravenbrook.com/project/mps/master/design/stack-scan.txt#14 $ :Copyright: See `Copyright and License`_. :Index terms: pair: stack and register scanning; design Introduction ------------ _`.intro`: This is the design of the stack and register scanning module. _`.readership`: Any MPS developer; anyone porting the MPS to a new platform. _`.overview`: This module locates and scans references in the control stack and registers of the *current* thread (the one that has called in to the MPS). _`.other`: The thread manager module is responsible for scanning the control stack and registers of *other* threads. See design.mps.thread-manager.if.scan_. .. _design.mps.thread-manager.if.scan: thread-manager#.if.scan _`.origin`: This design was originally proposed in mail.richard.2012-08-03.14-36_. Calling conventions for supported platforms are documented in [Fog]_ and [x86_64_registers]_. .. _mail.richard.2012-08-03.14-36: https://info.ravenbrook.com/mail/2012/08/03/14-36-35/0/ Requirements ------------ _`.req.stack.hot`: Must locate the hot end of the mutator's stack. (This is needed for conservative garbage collection of uncooperative code, where references might be stored by the mutator on its stack.) _`.req.stack.cold.not`: There is no requirement to locate the cold end of the stack. (The mutator supplies this as an argument to ``mps_root_create_thread()``.) _`.req.stack.platform`: Must support the platform's stack conventions. _`.req.stack.platform.full-empty`: The implementation must take into account whether the stack is *full* (the stack pointer points to the last full location) or *empty* (the stack pointer points to the first empty location). _`.req.stack.platform.desc-asc`: The implementation must take into account whether the stack is *descending* (the hot end of the stack is at a lower address than the cold end) or *ascending* (the hot end of the stack is at a higher address than the cold end). _`.req.registers`: Must locate and scan all references in the mutator's *root registers*, the subset of registers which might contain references that do not also appear on the stack. (This is needed for conservative garbage collection of uncooperative code, where references might appear in registers.) _`.req.entry`: Should save the mutator's context (stack and registers) at the point where it enters the MPS. (This avoids scanning registers and stack that belong to the MPS rather than the mutator, leading to unnecessary pinning and zone pollution; see job003525_.) .. _job003525: https://www.ravenbrook.com/project/mps/issue/job003525/ _`.req.setjmp`: The implementation must follow the C Standard in its use of the ``setjmp()`` macro. (So that it is reliable and portable.) _`.req.assembly.not`: The implementation should not use assembly language. (So that it can be developed in tools like Microsoft Visual Studio that don't support this.) Design ------ _`.sol.entry-points`: To meet `.req.entry`_, the mutator's registers and stack must be recorded when the mutator enters the MPS, if there is a possibility that the MPS might need to know the mutator context. _`.sol.entry-points.fragile`: The analysis of which entry points might need to save the context (see `.analysis.entry-points`_ below) is fragile. It might be incorrect now, or become incomplete if we refactor the internals of tracing and polling. As a defence against errors of this form, ``StackScan()`` asserts that the context was saved, but if the client program continues from the assertion, it saves the context anyway and continues. _`.sol.registers`: Implementations spill the root registers onto the stack so that they can be scanned there. _`.sol.registers.root`: The *root registers* are the subset of the callee-save registers that may contain pointers. _`.sol.registers.root.justify`: The caller-save registers will have been spilled onto the stack by the time the MPS is entered, so will be scanned by the stack scan. _`.sol.setjmp`: The values in callee-save registers can be found by invoking ``setjmp()``. This forces any of the caller's callee-save registers into either the ``jmp_buf`` or the current stack frame. _`.sol.setjmp.scan`: Although we might be able to decode the jump buffer in a platform-dependent way, it's hard to guarantee that an uncooperative compiler won't temporarily store a reference in any register or stack location. We must conservatively scan the whole of both. _`.sol.setjmp.justify`: The [C1990]_ standard specifies that ``jmp_buf``: is an array type suitable for holding the information needed to restore a calling environment. The environment of a call to the ``setjmp()`` macro consists of information sufficient for a call to the ``longjmp()`` function to return execution to the correct block and invocation of that block, were it called recursively. We believe that any reasonable implementation of ``setjmp()`` must copy the callee-save registers either into the jump buffer or into the stack frame that invokes it in order to work as described. Otherwise, once the callee-save registers have been overwritten by other function calls, a ``longjmp()`` would result in the callee-save registers having the wrong values. A ``longjmp()`` can come from anywhere, and so the function using ``setjmp()`` can't rely on callee-save registers being saved by callees. _`.sol.stack.hot`: We could decode the frame of the function that invokes ``setjmp()`` from the jump buffer in a platform-specific way, but we can do something simpler (if more hacky) by calling the stub function ``StackHot()`` which takes the address of its argument. So long as this stub function is not inlined into the caller, then on all supported platforms this yields a pointer that is pretty much at the hot end of the frame. _`.sol.stack.hot.noinline`: The reason that ``StackHot()`` must not be inlined is that after inlining, the compiler might place ``stackOut`` at a colder stack address than the ``StackContextStruct``, causing the latter not to be scanned. See `mail.gdr.2018-07-11.09-48`_. .. _mail.gdr.2018-07-11.09-48: https://info.ravenbrook.com/mail/2018/07/11/09-48-49/0/ _`.sol.stack.nest`: We can take care of scanning the jump buffer itself by storing it in the same stack frame. That way a scan from the hot end determined by `.sol.stack.hot`_ to the cold end will contain all of the roots. _`.sol.stack.platform`: As of version 1.115, all supported platforms are *full* and *descending* so the implementation in ``StackScan()`` assumes this. New platforms must check this assumption. _`.sol.xc.alternative`: On macOS, we could use ``getcontext()`` from libunwind (see here_), but that produces deprecation warnings and introduces a dependency on that library. .. _here: https://stackoverflow.com/questions/3592914/ Analysis -------- _`.analysis.setjmp`: The [C1990]_ standard says: An invocation of the ``setjmp`` macro shall appear only in one of the following contexts: - the entire controlling expression of a selection or iteration statement; - one operand of a relational or equality operator with the other operand an integral constant expression, with the resulting expression being the entire controlling expression of a selection or iteration statement; - the operand of a unary ``!`` operator with the resulting expression being the entire controlling expression of a selection or iteration statement; or - the entire expression of an expression statement (possibly cast to ``void``). And the [C1999]_ standard adds: If the invocation appears in any other context, the behavior is undefined. _`.analysis.entry-points`: Here's a reverse call graph (in the master sources at changelevel 189652) showing which entry points might call ``StackScan()`` and so need to record the stack context:: StackScan └ThreadScan └RootScan ├traceScanRootRes │ └traceScanRoot │ └rootFlip │ └traceFlip │ └TraceStart │ ├PolicyStartTrace │ │ └TracePoll │ │ ├ArenaStep │ │ │ └mps_arena_step │ │ └ArenaPoll │ │ ├mps_alloc │ │ ├mps_ap_fill │ │ ├mps_ap_alloc_pattern_end │ │ ├mps_ap_alloc_pattern_reset │ │ └ArenaRelease │ │ ├mps_arena_release │ │ └ArenaStartCollect │ │ ├mps_arena_start_collect │ │ └ArenaCollect │ │ └mps_arena_collect │ └TraceStartCollectAll │ ├ArenaStep [see above] │ ├ArenaStartCollect [see above] │ └PolicyStartTrace [see above] └rootsWalk └ArenaRootsWalk └mps_arena_roots_walk So the entry points that need to save the stack context are ``mps_arena_step()``, ``mps_alloc()``, ``mps_ap_fill()``, ``mps_ap_alloc_pattern_end()``, ``mps_ap_alloc_pattern_reset()``, ``mps_arena_release()``, ``mps_arena_start_collect()``, ``mps_arena_collect()``, and ``mps_arena_roots_walk()``. Interface --------- ``typedef StackContextStruct *StackContext`` _`.if.sc`: A structure encapsulating the mutator context. ``Res StackScan(ScanState ss, void *stackCold, mps_area_scan_t scan_area, void *closure)`` _`.if.scan`: Scan the stack of the current thread, between ``stackCold`` and the hot end of the mutator's stack that was recorded by ``STACK_CONTEXT_SAVE()`` when the arena was entered. This will include any roots which were in the mutator's callee-save registers on entry to the MPS (see `.sol.setjmp`_ and `.sol.stack.nest`_). Return ``ResOK`` if successful, or another result code if not. _`.if.scan.begin-end`: This function must be called between ``STACK_CONTEXT_BEGIN()`` and ``STACK_CONTEXT_END()``. ``STACK_CONTEXT_SAVE(sc)`` _`.if.save`: Store the mutator context in the structure ``sc``. _`.if.save.macro`: This must be implemented as a macro because it needs to run in the stack frame of the entry point (if it runs in some other function it does not necessarily get the mutator's registers). This necessity to have the definition in scope in ``mpsi.c``, while also having different definitions on different platforms, requires a violation of design.mps.config.no-spaghetti_ in ss.h. .. _design.mps.config.no-spaghetti: config#.no-spaghetti ``STACK_CONTEXT_BEGIN(arena)`` _`.if.begin`: Start an MPS operation that may need to know the mutator context (see `.sol.entry-points`_). This macro must be used like this:: Res res; ArenaEnter(arena); STACK_CONTEXT_BEGIN(arena) { res = ArenaStartCollect(...); } STACK_CONTEXT_END(arena); ArenaLeave(arena); return res; That is, it must be paired with ``STACK_CONTEXT_END()``, and there must be no ``return`` between the two macro invocations. This macro stores the mutator context in a ``StackContext`` structure allocated on the stack, and sets ``arena->stackWarm`` to the hot end of the current frame (using `.sol.stack.hot`_). ``STACK_CONTEXT_END(arena)`` _`.if.end`: Finish the MPS operation that was started by ``STACK_CONTEXT_BEGIN()``. This macro sets ``arena->stackWarm`` to ``NULL``. Implementations --------------- _`.impl`: Generic implementation of ``StackScan()`` in ``ss.c`` scans the whole area between ``arena->stackWarm`` and the cold end of the mutator's stack, implementing `.sol.stack.nest`_ and also the backup strategy in `.sol.entry-points.fragile`_. .. figure:: stack-scan-areas.svg :align: center :alt: Diagram: scanned areas of the stack. References ---------- .. [C1990] International Standard ISO/IEC 9899:1990. "Programming languages — C". .. [C1999] International Standard ISO/IEC 9899:1999. "`Programming languages — C `_". .. [Fog] Agner Fog; "`Calling conventions for different C++ compilers and operating systems `_"; Copenhagen University College of Engineering; 2014-08-07. .. [x86_64_registers] Microsoft Corporation; "`Caller/Callee Saved Registers `_". Document History ---------------- - 2014-10-22 GDR_ Initial draft. - 2016-03-03 RB_ Reorganised based mostly on `.sol.stack.hot`_ and `.sol.stack.nest`_. .. _GDR: https://www.ravenbrook.com/consultants/gdr/ .. _RB: https://www.ravenbrook.com/consultants/rb/ Copyright and License --------------------- Copyright © 2014–2020 `Ravenbrook Limited `_. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.