.. mode: -*- rst -*-

Linux implementation of protection module
=========================================

:Tag: design.mps.protli
:Author: Tony Mann
:Date: 2000-02-03
:Status: incomplete document
:Revision: $Id: //info.ravenbrook.com/project/mps/version/1.116/design/protli.txt#1 $
:Copyright: See `Copyright and License`_.
:Index terms:
   pair: Linux; protection interface design
   pair: Linux protection interface; design


Introduction
------------

_`.readership`: Any MPS developer.

_`.intro`: This is the design of the Linux implementation of the
protection module. It makes use of various services provided by Linux.
It is intended to work with LinuxThreads.


Requirements
------------

_`.req.general`: The module is required to implement the general
protection interface defined in design.mps.prot.if_.

.. _design.mps.prot.if: prot#if


Data structures
---------------

_`.data.signext`: The variable ``sigNext`` is static, because static
storage is the only communication channel available to signal
handlers.

.. note::

    Write a little more here.


Functions
---------

_`.fun.setup`: ``ProtSetup()`` installs a signal handler for the
signal ``SIGSEGV`` to catch and handle protection faults (this handler
is the function ``sigHandle()``). The previous handler is recorded (in
the variable ``sigNext``, see `.data.signext`_) so that it can be
reached from ``sigHandle()`` if it fails to handle the fault.
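
A minimal sketch of this arrangement, assuming ``sigaction()`` is used
to install the handler. The names ``protSetupSketch()``,
``sigHandleSketch()`` and ``faultIsOursSketch()`` are hypothetical
stand-ins, not the MPS functions; the fault test here always declines,
so the sketch only shows how the previous handler is recorded and
reached.

.. code-block:: c

    #ifndef _DEFAULT_SOURCE
    #define _DEFAULT_SOURCE 1   /* expose POSIX declarations under strict -std= modes */
    #endif
    #include <assert.h>
    #include <signal.h>
    #include <stddef.h>
    #include <string.h>

    /* Previous SIGSEGV disposition. Static, so the handler can reach it. */
    static struct sigaction sigNext;

    /* Hypothetical stand-in for the MPS fault test; always declines here. */
    static int faultIsOursSketch(void *addr)
    {
        (void)addr;
        return 0;
    }

    static void sigHandleSketch(int sig, siginfo_t *info, void *context)
    {
        if (faultIsOursSketch(info->si_addr))
            return;                       /* handled: the access is retried */

        /* Not ours: reach the previous handler through sigNext. */
        if (sigNext.sa_flags & SA_SIGINFO) {
            sigNext.sa_sigaction(sig, info, context);
        } else if (sigNext.sa_handler == SIG_DFL
                   || sigNext.sa_handler == SIG_IGN) {
            /* No function to call: reinstate the old disposition instead. */
            (void)sigaction(sig, &sigNext, NULL);
        } else {
            sigNext.sa_handler(sig);
        }
    }

    static void protSetupSketch(void)
    {
        struct sigaction sa;
        int res;

        memset(&sa, 0, sizeof sa);
        sa.sa_sigaction = sigHandleSketch;
        sigemptyset(&sa.sa_mask);         /* mask no additional signals */
        sa.sa_flags = SA_SIGINFO;
        res = sigaction(SIGSEGV, &sa, &sigNext);
        assert(res == 0);
        (void)res;                        /* quiet NDEBUG builds */
    }

Calling the previous handler directly, as this sketch does, ignores the
mask and flags that were registered with it.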

_`.fun.setup.problem`: The problem with this approach is that we can't
honour the wishes of the ``sigvec(2)`` entry for the previous handler
(in terms of masks in particular).

_`.improve.sigvec`: A possible improvement: when we want to pass on
the signal, instead of calling the previous handler directly, we could
call ``sigvec()`` with the old entry, use ``kill()`` to send the
signal to ourselves, and then restore our handler using ``sigvec()``
again.

.. note::

    Need more detail and analysis here.
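
The idea in `.improve.sigvec`_ might be sketched as follows, using
``sigaction()`` (the POSIX successor to ``sigvec()``). The function
name ``passOnSignalSketch()`` and its parameters are hypothetical. The
sketch assumes a single-threaded process in which the signal is not
blocked when the function is called. Inside a handler installed
without ``SA_NODEFER`` the signal is blocked, so it would stay pending
until after our handler had been restored; that case would need
further analysis.

.. code-block:: c

    #ifndef _DEFAULT_SOURCE
    #define _DEFAULT_SOURCE 1   /* expose POSIX declarations under strict -std= modes */
    #endif
    #include <assert.h>
    #include <signal.h>
    #include <stddef.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Pass signal `sig` on to the previous disposition `prev`, then put
     * our own disposition `ours` back. Returns 0 on success, -1 if any
     * step failed. */
    static int passOnSignalSketch(int sig,
                                  const struct sigaction *prev,
                                  const struct sigaction *ours)
    {
        int res = 0;

        assert(prev != NULL && ours != NULL);
        if (sigaction(sig, prev, NULL) != 0)
            return -1;
        if (kill(getpid(), sig) != 0)     /* previous disposition acts here */
            res = -1;
        if (sigaction(sig, ours, NULL) != 0)
            res = -1;
        return res;
    }

Because the kernel delivers the signal, the mask and flags registered
with the previous entry are honoured, which is the point of the
proposal.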

_`.fun.set`: ``ProtSet()`` uses ``mprotect()`` to adjust the
protection for pages.

_`.fun.set.convert`: The requested protection (which is expressed in
the ``mode`` parameter, see design.mps.prot.if.set_) is translated into
an operating system protection. If read accesses are to be forbidden
then all accesses are forbidden; this is done by setting the
protection of the page to ``PROT_NONE``. If write accesses are to be
forbidden (and not read accesses) then write accesses are forbidden
and read accesses are allowed; this is done by setting the protection
of the page to ``PROT_READ|PROT_EXEC``. Otherwise (all accesses are
okay), the protection is set to ``PROT_READ|PROT_WRITE|PROT_EXEC``.

.. _design.mps.prot.if.set: prot#if.set
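
The translation in `.fun.set.convert`_ can be sketched as a small
function. The mode bits ``AccessReadSketch`` and ``AccessWriteSketch``
and the function name are assumptions made for illustration; the real
values are defined by the protection interface. A set bit means that
the corresponding kind of access is to be forbidden.

.. code-block:: c

    #include <assert.h>
    #include <sys/mman.h>

    /* Hypothetical mode bits: a set bit forbids that kind of access. */
    enum { AccessReadSketch = 1, AccessWriteSketch = 2 };

    static int protModeToOsSketch(unsigned mode)
    {
        /* Only the two known bits may be set. */
        assert((mode & ~(unsigned)(AccessReadSketch | AccessWriteSketch)) == 0);

        if (mode & AccessReadSketch)
            return PROT_NONE;             /* forbid reads: forbid everything */
        if (mode & AccessWriteSketch)
            return PROT_READ | PROT_EXEC; /* forbid writes only */
        return PROT_READ | PROT_WRITE | PROT_EXEC;  /* forbid nothing */
    }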

_`.fun.set.assume.mprotect`: We assume that the call to ``mprotect()``
always succeeds. This is because we should always call the function
with valid arguments (aligned, references to mapped pages, and with an
access that is compatible with the access of the underlying object).
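
One way to make this assumption explicit is to assert both the
preconditions and the result. The following is a sketch only;
``protSetCheckedSketch()`` is a hypothetical helper, and the checks
shown cover alignment but not whether the pages are mapped.

.. code-block:: c

    #ifndef _DEFAULT_SOURCE
    #define _DEFAULT_SOURCE 1   /* expose POSIX and Linux declarations under strict -std= modes */
    #endif
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Set protection `prot` on the pages in [base, base + size). */
    static void protSetCheckedSketch(void *base, size_t size, int prot)
    {
        size_t pageSize = (size_t)sysconf(_SC_PAGESIZE);
        int res;

        /* The arguments we rely on for success: a non-empty, page-aligned range. */
        assert(size > 0);
        assert((uintptr_t)base % pageSize == 0);
        assert(size % pageSize == 0);

        res = mprotect(base, size, prot);
        assert(res == 0);                 /* assumed always to succeed */
        (void)res;                        /* quiet NDEBUG builds */
    }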

_`.fun.sync`: ``ProtSync()`` does nothing in this implementation as
``ProtSet()`` sets the protection without any delay.


Threads
-------

_`.threads`: The design must operate in a multi-threaded environment
(with LinuxThreads) and cooperate with the Linux support for locks
(see design.mps.lock_) and the thread suspension mechanism (see
design.mps.pthreadext_).

.. _design.mps.pthreadext: pthreadext
.. _design.mps.lock: lock

_`.threads.suspend`: The ``SIGSEGV`` signal handler does not mask out
any signals, so a thread may be suspended while the handler is active,
as required by the design (see
design.mps.pthreadext.req.suspend.protection_). The signal handlers
simply nest at the top of the stack.

.. _design.mps.pthreadext.req.suspend.protection: pthreadext#req.suspend.protection

_`.threads.async`: POSIX (and hence Linux) imposes some restrictions
on signal handler functions (see
design.mps.pthreadext.anal.signal.safety_). Basically the rules say the
behaviour of almost all POSIX functions inside a signal handler is
undefined, except for a handful of functions which are known to be
"async-signal safe". However, if it's known that the signal didn't
happen inside a POSIX function, then it is safe to call arbitrary
POSIX functions inside a handler.

.. _design.mps.pthreadext.anal.signal.safety: pthreadext#anal.signal.safety

_`.threads.async.protection`: If the signal handler is invoked because
of an MPS access, then we know the access must have been caused by
client code, because the client is not allowed to permit access to
protectable memory to arbitrary foreign code. In these circumstances,
it's OK to call arbitrary POSIX functions inside the handler.

.. note::

    Need a reference for "the client is not allowed to permit access
    to protectable memory to arbitrary foreign code".

_`.threads.async.other`: If the signal handler is invoked for some
other reason (that is, one we are not prepared to handle) then there
is less we can say about what might have caused the segmentation
fault. In general it is not safe to call arbitrary POSIX functions
inside the handler in this case.

_`.threads.async.choice`: The signal handler calls ``ArenaAccess()``
to determine whether the segmentation fault was the result of an MPS
access. ``ArenaAccess()`` will claim various MPS locks (that is, the
arena ring lock and some arena locks). The code calls no other POSIX
functions in the case where the segmentation fault is not an MPS
access. The locks are implemented as mutexes and are claimed by
calling ``pthread_mutex_lock()``, which is not defined to be
async-signal safe.
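
The shape of such a test can be sketched as a range check made under a
mutex. This is an illustration only: ``arenaAccessSketch()``,
``arenaRegisterSketch()`` and the single address range are
hypothetical, and the real ``ArenaAccess()`` claims more than one
lock. The point is that the check cannot be made without calling
``pthread_mutex_lock()``.

.. code-block:: c

    #include <assert.h>
    #include <pthread.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical arena state: one address range guarded by one lock. */
    static pthread_mutex_t arenaLockSketch = PTHREAD_MUTEX_INITIALIZER;
    static uintptr_t arenaBaseSketch, arenaLimitSketch;

    static void arenaRegisterSketch(void *base, size_t size)
    {
        int res = pthread_mutex_lock(&arenaLockSketch);
        assert(res == 0);
        arenaBaseSketch = (uintptr_t)base;
        arenaLimitSketch = (uintptr_t)base + size;
        res = pthread_mutex_unlock(&arenaLockSketch);
        assert(res == 0);
        (void)res;                        /* quiet NDEBUG builds */
    }

    /* Returns true if `addr` lies in the registered range. Claims the
     * lock, so it is only safe in a handler under the assumptions
     * discussed in this section. */
    static bool arenaAccessSketch(void *addr)
    {
        uintptr_t a = (uintptr_t)addr;
        bool ours;
        int res = pthread_mutex_lock(&arenaLockSketch);
        assert(res == 0);
        ours = arenaBaseSketch <= a && a < arenaLimitSketch;
        res = pthread_mutex_unlock(&arenaLockSketch);
        assert(res == 0);
        (void)res;
        return ours;
    }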

_`.threads.async.choice.ok`: However, despite the fact that PThreads
documentation doesn't define the behaviour of ``pthread_mutex_lock()``
in these circumstances, we expect the LinuxThreads implementation will
be well-behaved unless the segmentation fault occurs while in the
process of locking or unlocking one of the MPS locks (see
`.threads.async.linux-mutex`_). But we can assume that a segmentation
fault will not happen then (because we use the locks correctly, and
generally must assume that they work). Hence we conclude that it is OK
to call ``ArenaAccess()`` directly from the signal handler.

_`.threads.async.linux-mutex`: A study of the LinuxThreads source code
reveals that mutex lock and unlock functions are implemented as a
spinlock (using a locked compare-and-exchange instruction) with a
backup suspension mechanism using ``sigsuspend()``. On locking, the
spinlock code performs a loop which examines the state of the lock,
and then atomically tests that the state is unchanged while attempting
to modify it. This part of the code is reentrant (and hence
async-signal safe). Eventually, when locking, the spinlock code may
need to block, in which case it calls ``sigsuspend()``, waiting for
the manager thread to unblock it. The unlocking code is similar,
except that this code may need to release another thread, in which
case it calls ``kill()``. The functions ``sigsuspend()`` and
``kill()`` are both defined to be async-signal safe by POSIX. In
summary, the mutex locking functions use primitives which are entirely
async-signal safe. They perform side-effects which modify the fields
of the lock structure only. This code may be safely invoked inside a
signal handler unless the interrupted function is in the process of
manipulating the fields of that lock structure.

_`.threads.async.improve`: In future it would be preferable to not
have to assume reentrant mutex locking and unlocking functions. By
making the assumption we also assume that the implementation of
mutexes in LinuxThreads will not be completely re-designed in future
(which is not wise for the long term). An alternative approach would
be necessary anyway when supporting another platform which doesn't
offer reentrant locks (if such a platform does exist).

_`.threads.async.improve.how`: We could avoid the assumption if we had
a means of testing whether an address lies within an arena chunk
without the need to claim any locks. Such a test might actually be
possible. For example, arenas could update a global data structure
describing the ranges of all chunks, using atomic updates rather than
locks; the handler code would be allowed to read this without locking.
However, this is somewhat tricky; a particular consideration is that
it's not clear when it's safe to deallocate stale portions of the
data structure.
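
A minimal sketch of such a structure, using C11 atomics. It sidesteps
the deallocation problem rather than solving it: the table has a fixed
size, entries are only ever appended, and nothing is removed. All names
are hypothetical, and writers are assumed to be serialised by some
other means (for example an existing arena lock).

.. code-block:: c

    #include <assert.h>
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define CHUNK_TABLE_MAX_SKETCH 64

    static struct {
        uintptr_t base, limit;
    } chunkTableSketch[CHUNK_TABLE_MAX_SKETCH];
    static atomic_size_t chunkCountSketch;    /* number of published entries */

    /* Writer side. Assumes callers are serialised and never remove entries. */
    static bool chunkPublishSketch(void *base, size_t size)
    {
        size_t n = atomic_load_explicit(&chunkCountSketch, memory_order_relaxed);

        assert(size > 0);
        if (n == CHUNK_TABLE_MAX_SKETCH)
            return false;                 /* table full */
        chunkTableSketch[n].base = (uintptr_t)base;
        chunkTableSketch[n].limit = (uintptr_t)base + size;
        /* Release: the entry is fully written before the count exposes it. */
        atomic_store_explicit(&chunkCountSketch, n + 1, memory_order_release);
        return true;
    }

    /* Reader side: claims no locks, so a handler could call it. */
    static bool addrInChunkSketch(void *addr)
    {
        uintptr_t a = (uintptr_t)addr;
        size_t n = atomic_load_explicit(&chunkCountSketch, memory_order_acquire);
        size_t i;

        for (i = 0; i < n; ++i)
            if (chunkTableSketch[i].base <= a && a < chunkTableSketch[i].limit)
                return true;
        return false;
    }

A reader that interrupts a writer part-way through publishing sees the
old count, and so never examines a partly written entry.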

_`.threads.sig-stack`: We do not handle signals on a separate signal
stack. Separate signal stacks apparently don't work properly with
Pthreads.


Document History
----------------

- 2000-02-03 Tony Mann. Incomplete document.

- 2002-06-07 RB_ Converted from MMInfo database design document.

- 2013-05-23 GDR_ Converted to reStructuredText.

.. _RB: http://www.ravenbrook.com/consultants/rb/
.. _GDR: http://www.ravenbrook.com/consultants/gdr/


Copyright and License
---------------------

Copyright © 2013-2014 Ravenbrook Limited <http://www.ravenbrook.com/>.
All rights reserved. This is an open source license. Contact
Ravenbrook for commercial licensing options.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

#. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.

#. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.

#. Redistributions in any form must be accompanied by information on how
   to obtain complete source code for this software and any
   accompanying software that uses this software.  The source code must
   either be included in the distribution or be available for no more than
   the cost of distribution plus a nominal fee, and must be freely
   redistributable under reasonable conditions.  For an executable file,
   complete source code means the source code for all modules it contains.
   It does not include source code for modules or files that typically
   accompany the major components of the operating system on which the
   executable file runs.

**This software is provided by the copyright holders and contributors
"as is" and any express or implied warranties, including, but not
limited to, the implied warranties of merchantability, fitness for a
particular purpose, or non-infringement, are disclaimed.  In no event
shall the copyright holders and contributors be liable for any direct,
indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or
services; loss of use, data, or profits; or business interruption)
however caused and on any theory of liability, whether in contract,
strict liability, or tort (including negligence or otherwise) arising in
any way out of the use of this software, even if advised of the
possibility of such damage.**