.. highlight:: none


.. index::
   pair: thread manager; design

.. _design-thread-manager:


Thread manager
==============

.. mps:prefix:: design.mps.thread-manager


Introduction
------------

:mps:tag:`intro` This is the design of the thread manager module.

:mps:tag:`readership` Any MPS developer; anyone porting the MPS to a new
platform.

:mps:tag:`overview` The thread manager implements two features that allow
the MPS to work in a multi-threaded environment: exclusive access to
memory, and scanning of roots in a thread's registers and control
stack.


Requirements
------------

:mps:tag:`req.exclusive` The thread manager must provide the MPS with
exclusive access to the memory it manages in critical sections of the
code. (This is necessary for the MPS to be able to flip atomically
from the point of view of the mutator.)

:mps:tag:`req.scan` The thread manager must be able to locate references in
the registers and control stack of the current thread, or of a
suspended thread. (This is necessary in order to implement
conservative collection, in environments where the registers and
control stack contain ambiguous roots. Scanning of roots is carried
out during the flip, hence while other threads are suspended.)

:mps:tag:`req.register.multi` It must be possible to register the same
thread multiple times. (This is needed to support the situation where
a program that does not use the MPS is calling into MPS-using code
from multiple threads. On entry to the MPS-using code, the thread can
be registered, but it may not be possible to ensure that the thread is
deregistered on exit, because control may be transferred by some
non-local mechanism such as an exception or :c:func:`longjmp()`. We don't
want to insist that the client program keep a table of threads it has
registered, because maintaining the table might require allocation,
which might provoke a collection. See request.dylan.160252_.)

.. _request.dylan.160252: https://info.ravenbrook.com/project/mps/import/2001-11-05/mmprevol/request/dylan/160252/

:mps:tag:`req.thread.die` It would be nice if the MPS coped with threads
that die while registered. (This makes it easier for a client program
to interface with foreign code that terminates threads without the
client program being given an opportunity to deregister them. See
request.dylan.160022_ and request.mps.160093_.)

.. _request.dylan.160022: https://info.ravenbrook.com/project/mps/import/2001-11-05/mmprevol/request/dylan/160022
.. _request.mps.160093: https://info.ravenbrook.com/project/mps/import/2001-11-05/mmprevol/request/mps/160093/

:mps:tag:`req.thread.intr` It would be nice if on POSIX systems the MPS does
not cause system calls in the mutator to fail with EINTR due to the
MPS thread-management signals being delivered while the mutator is
blocked in a system call. (See `GitHub issue #9`_.)

.. _GitHub issue #9: https://github.com/ravenbrook/mps/issues/9

:mps:tag:`req.thread.errno` It would be nice if on POSIX systems the MPS
does not cause system calls in the mutator to update ``errno`` due to
the MPS thread-management signals being delivered while the mutator is
blocked in a system call, and the MPS signal handlers updating
``errno``. (See `GitHub issue #10`_.)

.. _GitHub issue #10: https://github.com/ravenbrook/mps/issues/10

:mps:tag:`req.thread.lasterror` It would be nice if on Windows systems the
MPS does not change the value that the mutator gets from
:c:func:`GetLastError()` when the MPS exception handler is called due to
a fault. Otherwise, handling the fault may destroy the previous value
there. (See `GitHub issue #61`_.)

.. _GitHub issue #61: https://github.com/Ravenbrook/mps/issues/61

Design
------

:mps:tag:`sol.exclusive` In order to meet :mps:ref:`.req.exclusive`, the arena
maintains a ring of threads (in ``arena->threadRing``) that have been
registered by the client program. When the MPS needs exclusive access
to memory, it suspends all the threads in the ring except for the
currently running thread. When the MPS no longer needs exclusive
access to memory, it resumes all threads in the ring.
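
As an illustration only, the suspend and resume passes over the ring can
be sketched as below. This is not the MPS implementation: the
``ThreadSketch`` type, its fields, and both function names are
hypothetical, and suspension is simulated with a flag where a real port
would call the operating system.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical stand-in for a registered thread (not the MPS Thread). */
    typedef struct ThreadSketch {
      struct ThreadSketch *next;   /* next thread in the circular ring */
      int id;                      /* stand-in for an OS thread identifier */
      int suspended;               /* nonzero while simulated as suspended */
    } ThreadSketch;

    /* Suspend every thread in the ring except the one with id currentId. */
    static void ringSuspendSketch(ThreadSketch *ring, int currentId)
    {
      ThreadSketch *t = ring;
      assert(ring != NULL);
      do {
        if (t->id != currentId)
          t->suspended = 1;        /* a real port would call the OS here */
        t = t->next;
      } while (t != ring);
    }

    /* Resume every thread in the ring. */
    static void ringResumeSketch(ThreadSketch *ring)
    {
      ThreadSketch *t = ring;
      assert(ring != NULL);
      do {
        t->suspended = 0;
        t = t->next;
      } while (t != ring);
    }

The current thread is skipped on suspension because it is the thread
running the MPS code that needs exclusive access.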

:mps:tag:`sol.exclusive.assumption` This relies on the assumption that any
thread that might refer to, read from, or write to memory in
automatically managed pool classes is registered with the MPS. This is
documented in the manual under :c:func:`mps_thread_reg()`.

:mps:tag:`sol.thread.term` The thread manager cannot reliably detect that a
thread has terminated. The reason is that threading systems do not
guarantee behaviour in this case. For example, POSIX_ says, "A
conforming implementation is free to reuse a thread ID after its
lifetime has ended. If an application attempts to use a thread ID
whose lifetime has ended, the behavior is undefined." For this reason,
the documentation for :c:func:`mps_thread_reg()` specifies that it is an
error if a thread dies while registered.

.. _POSIX: https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_02

:mps:tag:`sol.thread.term.attempt` Nonetheless, the thread manager makes a
"best effort" to continue running after detecting a terminated thread,
by moving the thread to a ring of dead threads, and avoiding scanning
it. This might allow a malfunctioning client program to limp along.
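
The idea of setting dead threads aside can be sketched as follows. This
is a simplified model under stated assumptions, not the MPS code: it
uses ``NULL``-terminated lists rather than rings, the names are
hypothetical, and a thread is treated as terminated when the simulated
suspend operation fails.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical thread record; "alive" simulates the OS's view. */
    typedef struct ThreadSketch {
      struct ThreadSketch *next;
      int alive;                   /* 0 simulates a terminated thread */
      int suspended;
    } ThreadSketch;

    /* Simulated OS suspend: returns 0 if the thread has terminated. */
    static int suspendSketch(ThreadSketch *t)
    {
      if (!t->alive)
        return 0;
      t->suspended = 1;
      return 1;
    }

    /* Suspend every thread on *liveList. Threads that cannot be suspended
       are unlinked and pushed onto *deadList, so that later passes skip
       them. Returns the number of threads moved. */
    static size_t suspendAllSketch(ThreadSketch **liveList,
                                   ThreadSketch **deadList)
    {
      ThreadSketch **link = liveList;
      size_t moved = 0;
      assert(liveList != NULL && deadList != NULL);
      while (*link != NULL) {
        ThreadSketch *t = *link;
        if (suspendSketch(t)) {
          link = &t->next;         /* keep it and advance */
        } else {
          *link = t->next;         /* unlink from the live list */
          t->next = *deadList;     /* push onto the dead list */
          *deadList = t;
          ++moved;
        }
      }
      return moved;
    }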

:mps:tag:`sol.thread.intr` The POSIX specification for sigaction_ says that
if the :c:macro:`SA_RESTART` flag is set, and if "a function specified as
interruptible is interrupted by this signal, the function shall
restart and shall not fail with :c:macro:`EINTR` unless otherwise specified."

.. |sigaction| replace:: :c:func:`sigaction()`
.. _sigaction: https://pubs.opengroup.org/onlinepubs/9699919799/functions/sigaction.html
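
For illustration, a handler can be installed with ``SA_RESTART`` as in
the minimal sketch below. The handler and the function name are
hypothetical, and the sketch does not claim to reproduce how the MPS
installs its own handlers or which signals it uses.

.. code-block:: c

    #define _POSIX_C_SOURCE 200809L
    #include <signal.h>
    #include <string.h>

    /* Hypothetical handler that does nothing. */
    static void handlerSketch(int sig)
    {
      (void)sig;
    }

    /* Install handlerSketch for sig, asking for interrupted system calls
       to be restarted. Returns 0 on success and -1 on failure, like
       sigaction(). */
    static int installRestartingHandler(int sig)
    {
      struct sigaction sa;
      memset(&sa, 0, sizeof sa);
      sa.sa_handler = handlerSketch;
      sa.sa_flags = SA_RESTART;
      if (sigemptyset(&sa.sa_mask) != 0)
        return -1;
      return sigaction(sig, &sa, NULL);
    }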

:mps:tag:`sol.thread.intr.linux` Linux does not fully implement the POSIX
specification, so that some system calls are "never restarted after
being interrupted by a signal handler, regardless of the use of
SA_RESTART; they always fail with the error EINTR when interrupted by
a signal handler". The exceptional calls are listed in the |signal|_
manual. There is nothing that the MPS can do about this except to warn
users in the reference manual.

.. |signal| replace:: signal(7)
.. _signal: https://man7.org/linux/man-pages/man7/signal.7.html

:mps:tag:`sol.thread.errno` The POSIX specification for sigaction_ says,
"Note in particular that even the "safe" functions may modify
``errno``; the signal-catching function, if not executing as an
independent thread, should save and restore its value." All MPS
signal handlers therefore save and restore ``errno`` using the macros
:c:macro:`ERRNO_SAVE` and :c:macro:`ERRNO_RESTORE`.
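
The save-and-restore pattern can be sketched as below. The handler is
hypothetical and writes the pattern out by hand; the actual definitions
of the MPS macros may differ.

.. code-block:: c

    #include <errno.h>

    /* Hypothetical signal handler showing the errno save/restore pattern. */
    static void handlerSketch(int sig)
    {
      int savedErrno = errno;      /* save the mutator's value on entry */
      (void)sig;
      errno = EINVAL;              /* stands in for work that sets errno */
      errno = savedErrno;          /* restore it before returning */
    }

Because the handler restores ``errno`` before it returns, code that was
interrupted between a failing system call and its check of ``errno``
still sees the value set by that call.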

:mps:tag:`sol.thread.lasterror` The documentation for ``AddVectoredExceptionHandler``
does not mention :c:func:`GetLastError()` at all, but testing_ the behaviour
reveals that the value returned by :c:func:`GetLastError()` is not preserved.
Therefore, this value is saved and restored using
:c:macro:`LAST_ERROR_SAVE` and :c:macro:`LAST_ERROR_RESTORE`.

.. _testing: https://github.com/Ravenbrook/mps/issues/61

Interface
---------

.. c:type:: struct mps_thr_s *Thread

:mps:tag:`if.thread` The type of threads. It is a pointer to an opaque
structure, which must be defined by the implementation.

.. c:function:: Bool ThreadCheck(Thread thread)

:mps:tag:`if.check` The check function for threads. See design.mps.check_.

.. _design.mps.check: check.html

.. c:function:: Bool ThreadCheckSimple(Thread thread)

:mps:tag:`if.check.simple` A thread-safe check function for threads, for use
by :c:func:`mps_thread_dereg()`. It can't use ``AVER(TESTT(Thread,
thread))``, as recommended by design.mps.sig.check.arg.unlocked_,
since :c:type:`Thread` is an opaque type.

.. _design.mps.sig.check.arg.unlocked: sig.html#design.mps.sig.check.arg.unlocked

.. c:function:: Arena ThreadArena(Thread thread)

:mps:tag:`if.arena` Return the arena that the thread is registered with.
Must be thread-safe as it needs to be called by :c:func:`mps_thread_dereg()`
before taking the arena lock.

.. c:function:: Res ThreadRegister(Thread *threadReturn, Arena arena)

:mps:tag:`if.register` Register the current thread with the arena,
allocating a new :c:type:`Thread` object. If successful, update
``*threadReturn`` to point to the new thread and return ``ResOK``.
Otherwise, return a result code indicating the cause of the error.

.. c:function:: void ThreadDeregister(Thread thread, Arena arena)

:mps:tag:`if.deregister` Remove ``thread`` from the list of threads managed
by the arena and free it.

.. c:function:: void ThreadRingSuspend(Ring threadRing, Ring deadRing)

:mps:tag:`if.ring.suspend` Suspend all the threads on ``threadRing``, except
for the current thread. If any threads are discovered to have
terminated, move them to ``deadRing``.

.. c:function:: void ThreadRingResume(Ring threadRing, Ring deadRing)

:mps:tag:`if.ring.resume` Resume all the threads on ``threadRing``. If any
threads are discovered to have terminated, move them to ``deadRing``.

.. c:function:: Thread ThreadRingThread(Ring threadRing)

:mps:tag:`if.ring.thread` Return the thread that owns the given element of
the thread ring.

.. c:function:: Res ThreadScan(ScanState ss, Thread thread, Word *stackCold, mps_area_scan_t scan_area, void *closure)

:mps:tag:`if.scan` Scan the stacks and root registers of ``thread``, using
``ss`` and ``scan_area``. ``stackCold`` points to the cold end of the
thread's stack---this is the value that was supplied by the client
program when it called :c:func:`mps_root_create_thread()`. In the common
case, where the stack grows downwards, ``stackCold`` is the highest
stack address. Return ``ResOK`` if successful, another result code
otherwise.
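
To make the role of ``stackCold`` concrete, here is a minimal sketch of
scanning a downward-growing stack conservatively. All names are
hypothetical, and the scan merely counts words that look like addresses
in a given range, which is far simpler than what a real
``mps_area_scan_t`` function does.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    typedef uintptr_t WordSketch;

    /* Count the words in [base, limit) whose value lies in [lo, hi).
       Each such word is an ambiguous reference: it might be a pointer. */
    static size_t scanAreaSketch(const WordSketch *base,
                                 const WordSketch *limit,
                                 WordSketch lo, WordSketch hi)
    {
      size_t found = 0;
      const WordSketch *p;
      for (p = base; p < limit; ++p)
        if (*p >= lo && *p < hi)
          ++found;
      return found;
    }

    /* On a downward-growing stack the hot end (the stack pointer) is the
       lowest address and the cold end is the highest, so the area to scan
       is [stackHot, stackCold). */
    static size_t stackScanSketch(const WordSketch *stackHot,
                                  const WordSketch *stackCold,
                                  WordSketch lo, WordSketch hi)
    {
      assert(stackHot <= stackCold);
      return scanAreaSketch(stackHot, stackCold, lo, hi);
    }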


Implementations
---------------

Generic implementation
......................

:mps:tag:`impl.an` In ``than.c``.

:mps:tag:`impl.an.single` Supports a single thread. (This cannot be enforced
because of :mps:ref:`.req.register.multi`.)

:mps:tag:`impl.an.register.multi` There is no need for any special treatment
of multiple threads, because :c:func:`ThreadRingSuspend()` and
:c:func:`ThreadRingResume()` do nothing.

:mps:tag:`impl.an.suspend` :c:func:`ThreadRingSuspend()` does nothing because
there are no other threads.

:mps:tag:`impl.an.resume` :c:func:`ThreadRingResume()` does nothing because no
threads are ever suspended.

:mps:tag:`impl.an.scan` Just calls :c:func:`StackScan()` since there are no
suspended threads.


POSIX threads implementation
............................

:mps:tag:`impl.ix` In ``thix.c`` and ``pthrdext.c``. See
design.mps.pthreadext_.

.. _design.mps.pthreadext: pthreadext.html

:mps:tag:`impl.ix.multi` Supports multiple threads.

:mps:tag:`impl.ix.register` :c:func:`ThreadRegister()` records the thread id
of the current thread by calling |pthread_self|_.

.. |pthread_self| replace:: :c:func:`pthread_self()`
.. _pthread_self: https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_self.html
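
A recorded thread id can later be used to tell whether a registered
thread is the current one. The sketch below shows one way to do this
with ``pthread_equal()``; the structure and function names are
hypothetical and are not taken from ``thix.c``.

.. code-block:: c

    #include <pthread.h>

    /* Hypothetical record of a registered thread. */
    typedef struct ThreadSketch {
      pthread_t id;                /* id of the thread that registered */
    } ThreadSketch;

    /* Record the calling thread's id at registration time. */
    static void registerSketch(ThreadSketch *t)
    {
      t->id = pthread_self();
    }

    /* Return nonzero if t was registered by the calling thread. Thread ids
       are opaque, so they are compared with pthread_equal(), not ==. */
    static int isCurrentSketch(const ThreadSketch *t)
    {
      return pthread_equal(t->id, pthread_self()) != 0;
    }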

:mps:tag:`impl.ix.register.multi` Multiply-registered threads are handled
specially by the POSIX thread extensions. See
design.mps.pthreadext.req.suspend.multiple_ and
design.mps.pthreadext.req.resume.multiple_.

.. _design.mps.pthreadext.req.suspend.multiple: pthreadext.html#design.mps.pthreadext.req.suspend.multiple
.. _design.mps.pthreadext.req.resume.multiple: pthreadext.html#design.mps.pthreadext.req.resume.multiple

:mps:tag:`impl.ix.suspend` :c:func:`ThreadRingSuspend()` calls
:c:func:`PThreadextSuspend()`. See design.mps.pthreadext.if.suspend_.

.. _design.mps.pthreadext.if.suspend: pthreadext.html#design.mps.pthreadext.if.suspend

:mps:tag:`impl.ix.resume` :c:func:`ThreadRingResume()` calls
:c:func:`PThreadextResume()`. See design.mps.pthreadext.if.resume_.

.. _design.mps.pthreadext.if.resume: pthreadext.html#design.mps.pthreadext.if.resume

:mps:tag:`impl.ix.scan.current` :c:func:`ThreadScan()` calls :c:func:`StackScan()` if
the thread is current.

:mps:tag:`impl.ix.scan.suspended` :c:func:`PThreadextSuspend()` records the
context of each suspended thread, and :c:func:`ThreadRingSuspend()` stores
this in the :c:type:`Thread` structure, so that it is available by the time
:c:func:`ThreadScan()` is called.


Windows implementation
......................

:mps:tag:`impl.w3` In ``thw3.c``.

:mps:tag:`impl.w3.multi` Supports multiple threads.

:mps:tag:`impl.w3.register` :c:func:`ThreadRegister()` records the following
information for the current thread:

  - A :c:macro:`HANDLE` to the thread, with access flags
    :c:macro:`THREAD_SUSPEND_RESUME` and :c:macro:`THREAD_GET_CONTEXT`. This handle
    is needed as a parameter to |SuspendThread|_ and
    |ResumeThread|_.

  - The result of |GetCurrentThreadId|_, so that the current thread
    may be identified in the ring of threads.

.. |SuspendThread| replace:: :c:func:`SuspendThread()`
.. _SuspendThread: https://docs.microsoft.com/en-gb/windows/desktop/api/processthreadsapi/nf-processthreadsapi-suspendthread
.. |ResumeThread| replace:: :c:func:`ResumeThread()`
.. _ResumeThread: https://docs.microsoft.com/en-gb/windows/desktop/api/processthreadsapi/nf-processthreadsapi-resumethread
.. |GetCurrentThreadId| replace:: :c:func:`GetCurrentThreadId()`
.. _GetCurrentThreadId: https://docs.microsoft.com/en-gb/windows/desktop/api/processthreadsapi/nf-processthreadsapi-getcurrentthreadid

:mps:tag:`impl.w3.register.multi` There is no need for any special treatment
of multiple threads, because Windows maintains a suspend count that is
incremented on |SuspendThread|_ and decremented on
|ResumeThread|_.
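
The effect of the suspend count on a multiply-registered thread can be
modelled as follows. This is a portable simulation with hypothetical
names, not Windows code: it only imitates the counting behaviour
described above.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    /* Simulated OS thread with a suspend count. */
    typedef struct OsThreadSketch {
      unsigned suspendCount;       /* the thread runs only when this is 0 */
    } OsThreadSketch;

    /* One registration; several may refer to the same OS thread. */
    typedef struct RegistrationSketch {
      OsThreadSketch *osThread;
    } RegistrationSketch;

    static void osSuspendSketch(OsThreadSketch *t)
    {
      ++t->suspendCount;
    }

    static void osResumeSketch(OsThreadSketch *t)
    {
      assert(t->suspendCount > 0);
      --t->suspendCount;
    }

    static int osIsRunningSketch(const OsThreadSketch *t)
    {
      return t->suspendCount == 0;
    }

    /* Suspend once per registration, without checking for duplicates. */
    static void suspendRegistrations(RegistrationSketch *regs, size_t n)
    {
      size_t i;
      for (i = 0; i < n; ++i)
        osSuspendSketch(regs[i].osThread);
    }

    /* Resume once per registration; the counts balance out. */
    static void resumeRegistrations(RegistrationSketch *regs, size_t n)
    {
      size_t i;
      for (i = 0; i < n; ++i)
        osResumeSketch(regs[i].osThread);
    }

A thread registered twice is suspended twice and resumed twice, so it
starts running again only after the second resume.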

:mps:tag:`impl.w3.suspend` :c:func:`ThreadRingSuspend()` calls |SuspendThread|_.

:mps:tag:`impl.w3.resume` :c:func:`ThreadRingResume()` calls |ResumeThread|_.

:mps:tag:`impl.w3.scan.current` :c:func:`ThreadScan()` calls :c:func:`StackScan()` if
the thread is current. This is because |GetThreadContext|_ doesn't
work on the current thread: the context would not necessarily have the
values which were in the saved registers on entry to the MPS.

.. |GetThreadContext| replace:: :c:func:`GetThreadContext()`
.. _GetThreadContext: https://docs.microsoft.com/en-us/windows/desktop/api/processthreadsapi/nf-processthreadsapi-getthreadcontext

:mps:tag:`impl.w3.scan.suspended` Otherwise, :c:func:`ThreadScan()` calls
|GetThreadContext|_ to get the root registers and the stack
pointer.


macOS implementation
....................

:mps:tag:`impl.xc` In ``thxc.c``.

:mps:tag:`impl.xc.multi` Supports multiple threads.

:mps:tag:`impl.xc.register` :c:func:`ThreadRegister()` records the Mach port of
the current thread by calling |mach_thread_self|_.

.. |mach_thread_self| replace:: :c:func:`mach_thread_self()`
.. _mach_thread_self: https://www.gnu.org/software/hurd/gnumach-doc/Thread-Information.html

:mps:tag:`impl.xc.register.multi` There is no need for any special treatment
of multiple threads, because Mach maintains a suspend count that is
incremented on |thread_suspend|_ and decremented on
|thread_resume|_.

.. |thread_suspend| replace:: :c:func:`thread_suspend()`
.. _thread_suspend: https://www.gnu.org/software/hurd/gnumach-doc/Thread-Execution.html
.. |thread_resume| replace:: :c:func:`thread_resume()`
.. _thread_resume: https://www.gnu.org/software/hurd/gnumach-doc/Thread-Execution.html

:mps:tag:`impl.xc.suspend` :c:func:`ThreadRingSuspend()` calls
|thread_suspend|_.

:mps:tag:`impl.xc.resume` :c:func:`ThreadRingResume()` calls |thread_resume|_.

:mps:tag:`impl.xc.scan.current` :c:func:`ThreadScan()` calls :c:func:`StackScan()` if
the thread is current.

:mps:tag:`impl.xc.scan.suspended` Otherwise, :c:func:`ThreadScan()` calls
|thread_get_state|_ to get the root registers and the stack pointer.

.. |thread_get_state| replace:: :c:func:`thread_get_state()`
.. _thread_get_state: https://www.gnu.org/software/hurd/gnumach-doc/Thread-Execution.html