LINUX IMPLEMENTATION OF PROTECTION MODULE
design.mps.protli
incomplete doc
tony 2000-02-03
INTRODUCTION
.readership: Any MPS developer
.intro: This is the design of the Linux implementation of the protection
module. It makes use of various services provided by Linux. It is intended to
work with LinuxThreads.
REQUIREMENTS
.req.general: Required to implement the general protection interface defined in
design.mps.prot.if.*.
MISC
.improve.sigvec: Note 1 of ProtSetup notes that we can't honour the sigvec(2)
entries of the next handler in the chain. One possibility: when we want to pass
on the signal, instead of calling the next handler directly, we call sigvec
with the old entry, use kill to send the signal to ourselves, and then restore
our handler using sigvec again. [need more detail and analysis here].
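The sequence proposed in .improve.sigvec might look roughly like the sketch below. It is an untested idea, not MPS code, and it rests on several assumptions: it uses sigaction(2) and raise(3) in place of sigvec(2) and kill(2); the names sketchForwardSignal, sketchNextHandler and sketchNextCalls are hypothetical; and it unblocks the signal around the raise, because a signal is normally blocked while its own handler runs and would otherwise be delivered only after our handler had been restored.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Demonstration only: a hypothetical "next" handler that counts calls. */
static volatile sig_atomic_t sketchNextCalls = 0;
static void sketchNextHandler(int sig)
{
  (void)sig;
  ++sketchNextCalls;
}

/* Pass sig on to the previous disposition `next`, then reinstate `ours`.
 * Returns 0 on success and -1 on failure. */
int sketchForwardSignal(int sig, const struct sigaction *next,
                        const struct sigaction *ours)
{
  sigset_t unblock, saved;
  int res;

  assert(next != NULL && ours != NULL);

  /* 1. Reinstate the old entry so the system applies its mask and flags. */
  if (sigaction(sig, next, NULL) != 0)
    return -1;

  /* 2. Send the signal to ourselves.  Inside a handler sig is normally
   * blocked, so unblock it for the duration of the raise.  A threaded
   * program would use pthread_sigmask here instead of sigprocmask. */
  sigemptyset(&unblock);
  sigaddset(&unblock, sig);
  res = sigprocmask(SIG_UNBLOCK, &unblock, &saved);
  if (res == 0) {
    res = raise(sig);
    (void)sigprocmask(SIG_SETMASK, &saved, NULL);
  }

  /* 3. Restore our own handler, even if step 2 failed. */
  if (sigaction(sig, ours, NULL) != 0)
    return -1;
  return res == 0 ? 0 : -1;
}
```

Whether this is acceptable for SIGSEGV in particular still needs the analysis called for above: between steps 1 and 3 a genuine protection fault on another thread would reach the old handler rather than ours.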
DATASTRUCTURES
.data.signext: The variable sigNext is static, because static data is the only
communications channel available to signal handlers. [write a little more here]
FUNCTIONS
.fun.setup: ProtSetup installs a signal handler for the signal SIGSEGV to catch
and handle protection faults (this handler is the function sigHandle, see
.fun.sighandle). The previous handler is recorded (in the variable sigNext, see
.data.signext) so that it can be reached from sigHandle if it fails to handle
the fault.
.fun.setup.problem: The problem with this approach is that we can't honour the
wishes of the sigvec(2) entry for the previous handler (in terms of masks in
particular).
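As a minimal sketch of .fun.setup, and not the MPS source, the following installs a SIGSEGV handler and records the previous one. It assumes sigaction(2), the POSIX successor to sigvec(2). The names sketchProtSetup and sketchFaultHandled are hypothetical; sigHandle and sigNext follow the names used above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Previous SIGSEGV disposition (see .data.signext). */
static struct sigaction sigNext;

/* Hypothetical stand-in for the MPS access test: returns nonzero if the
 * fault at addr was handled.  This stub never claims a fault. */
static int sketchFaultHandled(void *addr)
{
  (void)addr;
  return 0;
}

static void sigHandle(int sig, siginfo_t *info, void *context)
{
  if (sketchFaultHandled(info->si_addr))
    return;                       /* handled: retry the faulting access */

  /* Not ours: reach the previous handler through sigNext.  Calling it
   * directly ignores its sa_mask and sa_flags (.fun.setup.problem). */
  if (sigNext.sa_flags & SA_SIGINFO) {
    sigNext.sa_sigaction(sig, info, context);
  } else if (sigNext.sa_handler != SIG_DFL && sigNext.sa_handler != SIG_IGN) {
    sigNext.sa_handler(sig);
  } else {
    /* No function to call: reinstate the old disposition so that the
     * retried access faults again and takes the old action. */
    (void)sigaction(sig, &sigNext, NULL);
  }
}

int sketchProtSetup(void)
{
  static int installed = 0;
  struct sigaction sa;
  int res;

  /* A second call would record sigHandle itself in sigNext. */
  assert(!installed);

  sa.sa_sigaction = sigHandle;
  sigemptyset(&sa.sa_mask);       /* mask no additional signals */
  sa.sa_flags = SA_SIGINFO;
  res = sigaction(SIGSEGV, &sa, &sigNext);
  if (res == 0)
    installed = 1;
  return res;
}
```

The direct call to the previous handler is what makes the sketch simple, and it is also the source of the limitation noted in .fun.setup.problem.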
.fun.set: ProtSet uses mprotect to adjust the protection for pages.
void ProtSet(Addr base, Addr limit, AccessSet mode);
.fun.set.convert: The requested protection (which is expressed in the mode
parameter, see design.mps.prot.if.set) is translated into an OS protection. If
read accesses are to be forbidden then all accesses are forbidden; this is done
by setting the protection of the page to PROT_NONE. If write accesses are to
be forbidden (and not read accesses) then write accesses are forbidden and read
accesses are allowed; this is done by setting the protection of the page to
PROT_READ|PROT_EXEC. Otherwise (all accesses are permitted), the protection is
set to PROT_READ|PROT_WRITE|PROT_EXEC.
.fun.set.assume.mprotect: We assume that the call to mprotect always succeeds.
This is because we should always call the function with valid arguments
(aligned, referring to mapped pages, and with an access that is compatible with
the access of the underlying object).
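The translation in .fun.set.convert and the assumption in .fun.set.assume.mprotect can be sketched as follows. This is an illustration under stated assumptions, not the MPS source: the SketchAccessREAD and SketchAccessWRITE bits are hypothetical stand-ins for the members of AccessSet, and the functions take plain pointers rather than Addr.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical access bits: a set bit means that kind of access is to be
 * forbidden. */
enum {
  SketchAccessREAD = 1,
  SketchAccessWRITE = 2
};

/* Translate a set of forbidden accesses into an mprotect protection. */
int sketchProtFlags(unsigned mode)
{
  if (mode & SketchAccessREAD)
    return PROT_NONE;                       /* forbid all accesses */
  if (mode & SketchAccessWRITE)
    return PROT_READ | PROT_EXEC;           /* forbid writes only */
  return PROT_READ | PROT_WRITE | PROT_EXEC;  /* forbid nothing */
}

/* Apply the protection to the pages in [base, limit). */
void sketchProtSet(void *base, void *limit, unsigned mode)
{
  int res;

  assert((char *)base < (char *)limit);
  res = mprotect(base, (size_t)((char *)limit - (char *)base),
                 sketchProtFlags(mode));
  /* Valid arguments are assumed, so failure indicates a caller bug. */
  assert(res == 0);
  (void)res;
}
```

Checking the result with an assertion, rather than returning it, reflects the stated assumption that mprotect cannot fail when the arguments are valid.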
.fun.sync: ProtSync does nothing in this implementation as ProtSet sets the
protection without any delay.
void ProtSync(Space space);
.fun.tramp: The protection trampoline is trivial under Linux, as there is
nothing that needs to be done in the dynamic context of the mutator in order
to catch faults. (Contrast this with Win32 Structured Exception Handling.)
void ProtTramp(void **resultReturn, void *(*f)(void *, size_t), void *p,
               size_t s);
THREADS
.threads: The design must operate in a multi-threaded environment (with
LinuxThreads) and cooperate with the Linux support for locks (see
design.mps.lock) and the thread suspension mechanism (see
design.mps.pthreadext).
.threads.suspend: The SIGSEGV signal handler does not mask out any signals, so
a thread may be suspended while the handler is active, as required by the
design (see design.mps.pthreadext.req.suspend.protection). The signal handlers
simply nest at the top of the stack.
.threads.async: POSIX (and hence Linux) imposes some restrictions on signal
handler functions (see design.mps.pthreadext.anal.signal.safety). Basically the
rules say the behaviour of almost all POSIX functions inside a signal handler
is undefined, except for a handful of functions which are known to be
"async-signal safe". However, if it's known that the signal didn't happen
inside a POSIX function, then it is safe to call arbitrary POSIX functions
inside a handler.
.threads.async.protection: If the signal handler is invoked because of an MPS
access, then we know the access must have been caused by client code (because
the client is not allowed to permit access to protectable memory to arbitrary
foreign code [need a reference for this]). In these circumstances, it's OK to
call arbitrary POSIX functions inside the handler.
.threads.async.other: If the signal handler is invoked for some other reason
(i.e. one we are not prepared to handle) then there is less we can say about
what might have caused the SEGV. In general it is not safe to call arbitrary
POSIX functions inside the handler in this case.
.threads.async.choice: The signal handler calls ArenaAccess to determine
whether the SEGV was the result of an MPS access. ArenaAccess will claim
various MPS locks (i.e. the arena ring lock and some arena locks). The code
calls no other POSIX functions in the case where the SEGV is not an MPS access.
The locks are implemented as mutexes and are claimed by calling
pthread_mutex_lock, which is not defined to be async-signal safe.
.threads.async.choice.ok: However, despite the fact that PThreads documentation
doesn't define the behaviour of pthread_mutex_lock in these circumstances, we
expect the LinuxThreads implementation will be well-behaved unless the SEGV
occurs while in the process of locking or unlocking one of the MPS locks
(see .threads.async.linux-mutex). But we can assume that a SEGV will not happen
then (because we use the locks correctly, and generally must assume that they
work). Hence we conclude that it is OK to call ArenaAccess directly from the
signal handler.
.threads.async.linux-mutex: A study of the LinuxThreads source code reveals
that mutex lock and unlock functions are implemented as a spinlock (using a
locked compare-and-exchange instruction) with a backup suspension mechanism
using sigsuspend. On locking, the spinlock code performs a loop which examines
the state of the lock, and then atomically tests that the state is unchanged
while attempting to modify it. This part of the code is reentrant (and hence
async-signal safe). Eventually, when locking, the spinlock code may need to
block, in which case it calls sigsuspend waiting for the manager thread to
unblock it. The unlocking code is similar, except that this code may need to
release another thread, in which case it calls kill. sigsuspend and kill are
both defined to be async-signal safe by POSIX. In summary, the mutex locking
functions use primitives which are entirely async-signal safe. They perform
side-effects which modify the fields of the lock structure only. This code may
be safely invoked inside a signal handler unless the interrupted function is in
the process of manipulating the fields of that lock structure.
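The examine-then-compare-and-exchange loop described in .threads.async.linux-mutex can be illustrated with C11 atomics. This is a sketch of the general technique only, not the LinuxThreads source: the SketchLock type and both function names are hypothetical, and the blocking path (sigsuspend and kill) is left out, so the claim function reports failure at the point where a real implementation would suspend the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock: state is 0 when free and 1 when held. */
typedef struct SketchLock {
  atomic_int state;
} SketchLock;

/* Try to claim the lock, spinning at most maxSpins times.  Returns false
 * where a real implementation would go on to block. */
bool sketchLockClaim(SketchLock *lock, unsigned maxSpins)
{
  unsigned i;

  for (i = 0; i < maxSpins; ++i) {
    int seen = atomic_load(&lock->state);   /* examine the state */
    /* Atomically check that the state is still `seen` while changing it. */
    if (seen == 0
        && atomic_compare_exchange_strong(&lock->state, &seen, 1))
      return true;
  }
  return false;
}

void sketchLockRelease(SketchLock *lock)
{
  int was = atomic_exchange(&lock->state, 0);

  assert(was == 1);   /* releasing a lock that was not held is a bug */
  (void)was;
}
```

The only side effect is on the lock's own field, which is the property the argument above relies on: a handler that interrupts unrelated code can run this loop without disturbing the interrupted computation.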
.threads.async.improve: In future it would be preferable to not have to assume
reentrant mutex locking and unlocking functions. By making the assumption we
also assume that the implementation of mutexes in LinuxThreads will not be
completely re-designed in future (which is not wise for the long term). An
alternative approach would be necessary anyway when supporting another platform
which doesn't offer reentrant locks (if such a platform does exist).
.threads.async.improve.how: We could avoid the assumption if we had a means of
testing whether an address lies within an arena chunk without the need to claim
any locks. Such a test might actually be possible. For example, arenas could
update a global datastructure describing the ranges of all chunks, using atomic
updates rather than locks; the handler code would be allowed to read this
without locking. However, this is somewhat tricky; a particular consideration
is that it's not clear when it's safe to deallocate stale portions of the
datastructure.
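One possible shape for the lock-free test suggested in .threads.async.improve.how is sketched below, using C11 atomics. Everything here is an assumption made for illustration: the names are hypothetical, the table has a fixed capacity, writers are assumed to be serialized by some existing lock, and entries are never removed. The last point sidesteps, rather than solves, the deallocation problem noted above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RANGE_MAX 64

typedef struct SketchRange {
  uintptr_t base;    /* first address in the chunk */
  uintptr_t limit;   /* one past the last address in the chunk */
} SketchRange;

static SketchRange sketchRanges[SKETCH_RANGE_MAX];
static atomic_size_t sketchRangeCount;   /* static, so initially zero */

/* Writer side: publish a new range.  Callers are assumed to be
 * serialized.  Returns false if the table is full. */
bool sketchRangeAdd(uintptr_t base, uintptr_t limit)
{
  size_t n = atomic_load_explicit(&sketchRangeCount, memory_order_relaxed);

  assert(base < limit);
  if (n == SKETCH_RANGE_MAX)
    return false;
  sketchRanges[n].base = base;
  sketchRanges[n].limit = limit;
  /* The release store makes the filled slot visible before the count. */
  atomic_store_explicit(&sketchRangeCount, n + 1, memory_order_release);
  return true;
}

/* Reader side: callable without locks, for example from a handler. */
bool sketchRangeContains(uintptr_t addr)
{
  size_t n = atomic_load_explicit(&sketchRangeCount, memory_order_acquire);
  size_t i;

  for (i = 0; i < n; ++i) {
    if (sketchRanges[i].base <= addr && addr < sketchRanges[i].limit)
      return true;
  }
  return false;
}
```

A reader only ever looks at slots below the published count, and those slots are never modified again, so a handler that interrupts sketchRangeAdd partway through sees either the old table or the new one. This depends on atomic_size_t being lock-free on the platform, which would need to be checked.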
.threads.sig-stack: We do not handle signals on a separate signal stack.
Separate signal stacks apparently don't work properly with Pthreads.
A. References
B. Document History
| 2002-06-07 | RB | Converted from MMInfo database design document. |
C. Copyright and License
This document is copyright © 1995-2002 Ravenbrook Limited. All rights reserved. This is an open source license. Contact Ravenbrook for commercial licensing options.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Redistributions in any form must be accompanied by information on how to obtain complete source code for this software and any accompanying software that uses this software. The source code must either be included in the distribution or be available for no more than the cost of distribution plus a nominal fee, and must be freely redistributable under reasonable conditions. For an executable file, complete source code means the source code for all modules it contains. It does not include source code for modules or files that typically accompany the major components of the operating system on which the executable file runs.
This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement, are disclaimed. In no event shall the copyright holders and contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.