MPS issue job003494

Title: Re-entrancy failure in LDReset
Status: closed
Priority: essential
Assigned user: Gareth Rees
Organization: Ravenbrook
Description: MMQA test case function/6 [1] fails with the assertion:

     lockix.c:125: MPS ASSERTION FAILED: res == 0
Analysis: In gdb:

    Program received signal SIGABRT, Aborted.
    0x00007fff8c1b9d46 in __kill ()
    (gdb) bt
    #0 0x00007fff8c1b9d46 in __kill ()
    #1 0x00007fff85d16df0 in abort ()
    #2 0x000000010011cb12 in mps_lib_assert_fail_default (file=0x100244eb4 "code/lockix.c", line=125, condition=0x100244f15 "res == 0") at mpsliban.c:76
    #3 0x00000001000046d2 in mps_lib_assert_fail (file=0x100244eb4 "code/lockix.c", line=125, condition=0x100244f15 "res == 0") at mpsliban.c:85
    #4 0x000000010006a773 in LockClaim (lock=0x13fffe110) at lockix.c:125
    #5 0x0000000100069ee7 in arenaEnterLock (arena=0x10035e000, recursive=0) at global.c:514
    #6 0x000000010000467a in ArenaEnter (arena=0x10035e000) at global.c:494
    #7 0x000000010006bb3a in ArenaAccess (addr=0x13ea32ce0, mode=3, context=0x0) at global.c:607
    #8 0x000000010011c51e in sigHandle (sig=10, info=0x7fff5fbff6a0, context=0x7fff5fbff708) at protsgix.c:97
    #9 <signal handler called>
    #10 0x0000000100027926 in LDReset (ld=0x13ea32ce0, arena=0x10035e000) at ld.c:71
    #11 0x0000000100027676 in mps_ld_reset (ld=0x13ea32ce0, arena=0x10035e000) at mpsi.c:1418
    #12 0x00000001000013cc in test () at function/6.c:71
    #13 0x00000001000026e6 in call_f (p=0x13ea32ce0, s=4298498048) at test/testlib/testlib.c:298
    #14 0x00000001000262c4 in ProtTramp (resultReturn=0x7fff5fbff910, f=0x1000026e0 <call_f>, p=0x7fff5fbff918, s=0) at protix.c:132
    #15 0x0000000100025f6b in mps_tramp (r_o=0x7fff5fbff910, f=0x1000026e0 <call_f>, p=0x7fff5fbff918, s=0) at mpsi.c:1378
    #16 0x0000000100001f12 in easy_tramp2 [inlined] () at test/test/testlib/testlib.c:316
    #17 0x0000000100001f12 in easy_tramp (f=0x7fff5fbff910) at test/testlib/testlib.c:337
    #18 0x00000001000011e2 in main () at function/6.c:98
    (gdb) frame 10
    #10 0x0000000100027926 in LDReset (ld=0x13ea32ce0, arena=0x10035e000) at ld.c:71
    71        ld->_epoch = arena->epoch;
    (gdb) list
    66        AVERT(Arena, arena);
    67
    68        b = SegOfAddr(&seg, arena, (Addr)ld);
    69        if (b)
    70          ShieldExpose(arena, seg);  /* .ld.access */
    71        ld->_epoch = arena->epoch;
    72        ld->_rs = RefSetEMPTY;
    73        if (b)
    74          ShieldCover(arena, seg);
    75      }
    (gdb) p ld
    $1 = (mps_ld_t) 0x13ea32ce0
    (gdb) p b
    $2 = 0

SegOfAddr(ld) at line 68 returned 0, meaning that the arena does not think the memory belongs to any segment, so ShieldExpose() was not called. Writing through ld on line 71 then triggered a barrier hit, and the signal handler re-entered the MPS (frames #5 to #8) while mps_ld_reset was still running. The code in [1] suggests that ld has just been allocated from an AMC pool, but inspection shows that the test code is bogus:

    mycell *p;
    mps_ld_t ld;
    ...
    p = allocdumb(ap, sizeof(mps_ld_s));
    ld = (mps_ld_t) getdata(p);

p has been allocated with the wrong size, so the getdata call returns a pointer off the end of the object. Most of the time this points into unused space at the end of the buffer, so the test gets away with it, but eventually it points off the end of the buffer into unmapped space.
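
One way a sizing bug of this kind can arise is sketched below. It assumes a hypothetical layout in which a header precedes the payload and the requested size covers only the payload. The types cell_header and payload and the function bytes_past_end are illustrative; the real mycell layout and allocdumb contract in the MMQA test library may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object header; not the MMQA mycell definition. */
typedef struct cell_header {
  size_t tag;
  size_t size;
} cell_header;

/* Hypothetical stand-in for mps_ld_s. */
typedef struct payload {
  size_t epoch;
  size_t rs;
} payload;

/* Offset that a getdata-style accessor adds to the object pointer
   under the assumed layout. */
static size_t payload_offset(void)
{
  return sizeof(cell_header);
}

/* Number of payload bytes that lie outside an object for which
   alloc_size bytes were requested. */
static size_t bytes_past_end(size_t alloc_size)
{
  size_t payload_end = payload_offset() + sizeof(payload);
  assert(alloc_size > 0);
  return payload_end > alloc_size ? payload_end - alloc_size : 0;
}
```

In this sketch, requesting sizeof(payload) bytes leaves the whole payload outside the object, while requesting sizeof(cell_header) + sizeof(payload) bytes keeps it inside.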
How found: automated_test
Evidence: [1] <http://www.ravenbrook.com/project/mps/master/test/function/6.c>
Observed in: 1.111.0
Created by: Gareth Rees
Created on: 2013-05-25 16:37:44
Last modified by: Gareth Rees
Last modified on: 2014-04-07 20:52:42
History: 2013-05-25 GDR Created.

Fixes

Change  Effect  Date                 User         Description
185233  closed  2014-04-04 18:22:13  Gareth Rees  Allocate with the right size.