MPS issue job000441

Title: PoolAMC sometimes fails !arena->insideShield check
Status: closed
Priority: critical
Assigned user: Nick Barnes
Organization: Ravenbrook
Description: When running large AMC programs with a checking library, a client
 sometimes sees a check fail (!arena->insideShield). A test program
 can generate this failure within a few minutes. It is not wholly
 deterministic.
Analysis:
Related jobs:
  job001706 -- further work to fix the defect in the right way

Debugging the test program after such a failure showed that an object being forwarded (by a format->move method) was behind a barrier. This should be prevented by the ShieldExpose/ShieldCover protocol of the shield module, but AMCFix and AMCHeaderFix were modified last year (changelist 21548, an attempt to speed up Fix) in a way that breaks the shield module's abstraction, with code like this:

 /* .access.read.header: Make sure seg isn't behind a read barrier. */
 shieldUp = FALSE;
 if (SegPM(seg) & AccessREAD) {
   ShieldExpose(arena, seg);
   shieldUp = TRUE;
 }

The problem is that SegPM(seg) is liable to change between this code and the point at which we need the segment exposed. In particular, we may have a barrier on this segment (the from-segment), but the barrier may be down temporarily (because we called ShieldExpose/ShieldCover or because we haven't bothered to erect the barrier yet). In that case the test on SegPM(seg) fails, so the code above skips ShieldExpose, and the from-segment stays in the shield cache (a set of segments to be protected later). Then, when we expose and cover the to-segment, the from-segment may be evicted from the cache to make room for the to-segment, and is therefore protected by the time the object is moved.

The fix is to maintain the shield abstraction: AMCFix and AMCHeaderFix should just call ShieldExpose and ShieldCover. If we need them to go faster, we can write macro versions of these functions.
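
The sequence above can be sketched with a toy model. This is not MPS code: ToySeg, ToyShield, toyExpose, toyCover and toyForward are hypothetical names, the cache is assumed to hold a single segment, and eviction is assumed to protect a segment only if it has no outstanding exposes. The ordering of the to-segment expose/cover relative to the move is also simplified. It is only meant to illustrate why testing the protection state first can leave the from-segment protected, while an unconditional expose/cover pair cannot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the MPS shield API. */
typedef struct ToySeg {
  bool wantBarrier;   /* a barrier is required on this segment */
  bool protectedNow;  /* the barrier is actually erected right now */
  int exposeDepth;    /* number of outstanding exposes */
} ToySeg;

typedef struct ToyShield {
  ToySeg *cache;      /* one-slot cache: segment to be protected later */
} ToyShield;

/* Evict the cached segment, erecting its barrier unless it is exposed. */
void toyFlush(ToyShield *sh) {
  ToySeg *s = sh->cache;
  if (s != NULL && s->exposeDepth == 0 && s->wantBarrier)
    s->protectedNow = true;
  sh->cache = NULL;
}

/* Expose: record the expose and make sure the barrier is down. */
void toyExpose(ToyShield *sh, ToySeg *seg) {
  (void)sh;
  seg->exposeDepth++;
  seg->protectedNow = false;
}

/* Cover: drop the expose; re-protection is deferred via the cache. */
void toyCover(ToyShield *sh, ToySeg *seg) {
  assert(seg->exposeDepth > 0);
  seg->exposeDepth--;
  if (seg->exposeDepth == 0 && sh->cache != seg) {
    toyFlush(sh);     /* may protect whatever was cached before */
    sh->cache = seg;
  }
}

/* Returns true if the from-segment is accessible at the point where the
 * object would be moved.  checkProtectionFirst selects the style of the
 * quoted code (test the protection state, then maybe expose) instead of
 * an unconditional expose/cover pair. */
bool toyForward(bool checkProtectionFirst) {
  ToySeg from = { true, false, 0 };  /* barrier wanted but currently down */
  ToySeg to = { true, true, 0 };
  ToyShield sh = { &from };          /* from-segment is waiting in the cache */
  bool exposedFrom = false;
  bool accessible;

  if (!checkProtectionFirst || from.protectedNow) {
    toyExpose(&sh, &from);
    exposedFrom = true;
  }

  /* Working on the to-segment evicts the from-segment from the cache. */
  toyExpose(&sh, &to);
  toyCover(&sh, &to);

  accessible = !from.protectedNow;   /* the move would happen here */

  if (exposedFrom)
    toyCover(&sh, &from);
  return accessible;
}
```

In this model, toyForward(true) reports the from-segment as inaccessible at the move, and toyForward(false) reports it as accessible, because the outstanding expose stops the eviction from erecting the barrier.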

RHSK 2007-09-12
This fix was not correct (but seems to work anyway). See job001706
for further work to fix the defect in the right way.
How found: customer
Evidence: A string of emails, starting with
<http://info.ravenbrook.com/mail/2001/11/23/12-23-02/0.txt>
and culminating with this one, which includes a complete test case:
<http://info.ravenbrook.com/mail/2001/12/05/13-47-22/0.txt>
Created by: Nick Barnes
Created on: 2001-12-17 15:04:35
Last modified by: Gareth Rees
Last modified on: 2010-10-07 11:15:36
History:
2001-12-17 NB Created.
2007-09-12 RHSK Ref job001706 -- further work to fix this defect

Fixes

Change  Effect  Date                 User         Description
25379   closed  2001-12-19 14:42:33  Nick Barnes  Make AMC obey shield invariants.
25310   closed  2001-12-17 15:22:01  Nick Barnes  Maintain shield abstraction. See job000441.