MPS issue job003970

Title: The MPS can't scan in parallel with the mutator
Assigned user: Richard Brooksby
Description: The MPS stops mutator threads whenever it scans grey objects, because it can't allow the mutator to see pointers to white objects, and it has no other way to get exclusive access to the memory containing the grey objects. As a consequence, the MPS can't effectively scan using one processor core while the mutator is running on another, which limits its utilization of multicore processors.
Analysis: The MPS architecture does not fundamentally have this limitation. It is imposed by the fact that the threads in a process share the same protections (because they share the same page table), which means that ShieldExpose must suspend threads. The candidate solutions I know of are:
1. Run a separate MPS *process* sharing memory with the main process. Each process has its own page tables and so the MPS worker can scan memory that is protected from the mutator. The worker sends protection change messages to the main thread. Barrier hits in the mutator process send urgent scanning requests to the worker. The worker could potentially run multiple scanning threads. See // for a Posix test program. See // for a Posix prototype program.
2. Map the same physical memory at two virtual addresses, so that the MPS has an unprotected back door into the memory. The MPS will have to cope with references as an offset. It's not clear which OSs support this. OS X used to on PowerPC. Linux does not. See // for a Posix test program. Windows *may* allow it, but it's unclear <>.
3. Remap the memory rather than protecting it. Mutator accesses will result in a fault, of course, but the MPS can scan the memory at its remapped location before moving it back into position. No other mutator thread will see it until then, so they do not need to be suspended. See the Linux mremap system call. On Windows, a similar effect might be achievable with CreateFileMapping <>. OS X has no mremap.
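One reading of "cope with references as an offset" in solution 2 is that a reference stored in the memory is an address in the mutator's view, so a scanner reading through a second mapping has to translate it before following it. The sketch below shows only that arithmetic. The class name `Views`, its fields, and the addresses are hypothetical and are not part of the MPS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Views:
    """Two views of the same memory (hypothetical helper, not MPS API)."""
    mutator_base: int  # address at which the mutator sees the memory
    scanner_base: int  # address of the scanner's second mapping
    size: int          # length of the shared region in bytes

    def to_scanner(self, ref: int) -> int:
        """Translate a mutator-view reference to the scanner's view."""
        offset = ref - self.mutator_base
        if not 0 <= offset < self.size:
            raise ValueError("reference is outside the shared region")
        return self.scanner_base + offset

    def to_mutator(self, addr: int) -> int:
        """Translate a scanner-view address back to the mutator's view."""
        offset = addr - self.scanner_base
        if not 0 <= offset < self.size:
            raise ValueError("address is outside the shared region")
        return self.mutator_base + offset
```

The same translation would presumably be needed in solution 1 if the worker process maps the shared memory at a different address from the main process.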
The two-process solution looks like the most portable, although it's possibly the most clunky, requiring process management and interprocess communication.
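A minimal sketch of the property the two-process solution relies on, assuming a POSIX system with fork: the same memory can be read-only in one process and writable in another. A temporary file stands in for the shared memory, the "scan" is a placeholder that stamps a marker, and the names `scan_in_worker`, `demo` and `MARK` are hypothetical. Python rejects the mutator's write with an exception, whereas in the MPS a barrier hit would arrive as a protection fault. The messaging between the processes is omitted.

```python
import mmap
import os
import tempfile

MARK = b"scanned"  # placeholder for the result of a real scan


def scan_in_worker(fd, length):
    """Fork a worker that maps the memory read-write and 'scans' it."""
    pid = os.fork()
    if pid == 0:
        # Worker process: it has its own page tables, so its mapping can
        # be writable while the mutator's mapping stays read-only.
        status = 1
        try:
            with mmap.mmap(fd, length, access=mmap.ACCESS_WRITE) as view:
                view[:len(MARK)] = MARK
                view.flush()
            status = 0
        finally:
            os._exit(status)
    # Mutator process: wait for the worker and report its exit status.
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) if os.WIFEXITED(wait_status) else -1


def demo():
    length = mmap.PAGESIZE
    with tempfile.TemporaryFile() as f:
        f.write(b"\x00" * length)
        f.flush()
        # The mutator's view of the memory is read-only ("protected").
        mutator_view = mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ)
        try:
            try:
                mutator_view[0] = 1
                mutator_can_write = True
            except TypeError:
                mutator_can_write = False
            worker_status = scan_in_worker(f.fileno(), length)
            # The worker's writes are visible through the read-only view.
            data = mutator_view[:len(MARK)]
        finally:
            mutator_view.close()
    return mutator_can_write, worker_status, data
```

Calling `demo()` runs the worker to completion while the mutator's view stays read-only throughout. A real design would keep the worker running and exchange protection-change and urgent-scan messages with it, which is where the process management and interprocess communication cost comes from.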
How found: unknown
Evidence: This is a longstanding problem often discussed by RB.
Recent conversations on #clasp.
Hacker News comment <>
Created by: Richard Brooksby
Created on: 2016-03-06 22:00:44
Last modified by: Gareth Rees
Last modified on: 2016-09-13 10:36:22
History: 2016-03-06 RB Created so that we have a job to refer to.


Change  Effect  Date                 User              Description
179495  open    2012-09-14 22:28:56  Richard Brooksby  Adding comment I was prompted to write at <> to the code at ShieldExpose.