Title: MPS spends too much time making system calls
Assigned user: Gareth Rees
Description: In the test case shown below, the toy Scheme interpreter is spending nearly 70% of the time in system calls. This leads to unacceptable performance.

$ time ./scheme-advanced test-leaf.scm
All tests pass.

real 1m10.526s
user 0m22.468s
sys 0m47.605s

See [1].

The test case in question is scheme-advanced.c with the following changes:

- { 150, 0.85 },
- { 170, 0.45 }
+ { 6400, 0.85 },
+ { 6400, 0.45 }
- (size_t)(32 * 1024 * 1024));
+ (size_t)(1024ul * 1024 * 1024));

(I haven't checked in these changes because for pedagogic reasons we want the Scheme interpreter to do frequent collections.)
Analysis: Looking at the system calls with dtruss [2], I can see that the vast majority of them (> 97%) are mprotect:

$ sudo dtruss ./scheme-advanced test-leaf.scm 2> dtruss.log
All tests pass.
$ wc -l dtruss.log
9540925 dtruss.log
$ <dtruss.log sed -n 's/^\([A-Za-z][A-Za-z_0-9]*\)(.*$/\1/gp' | sort | uniq -c | sort -rn | head -n 4
9258063 mprotect
 244500 mmap
  21837 getrusage
  14935 sigreturn

In particular I can see long sequences of hundreds of mprotect calls each protecting a 4K page. Also I can see evidence of pointlessly turning barriers off and then on again (or vice versa) before returning from a signal.

Possible causes include:

1. The MPS taking too small a time-slice after a barrier hit. The amount of work has to be large enough to justify the overhead from the context switch. [Need to investigate: how big are the time slices? How long does a context switch take?]

2. The shield cache is too small, so it gets flushed prematurely, resulting in page protections that would have been found to be unnecessary with a larger cache. [Need some numbers: how many ShieldExpose/ShieldCover calls are there in the course of a typical quantum of work?]

3. There's no optimization of mprotect calls: even if a range of pages needing protection is contiguous, an mprotect call is issued for each one. [This might be better handled by allocating bigger segments than trying to coalesce mprotect calls.]

4. There are a lot of mmap calls: perhaps the arena is being too aggressive at returning memory to the operating system. [Is this the function of the "spare commit limit"?]

5. The MPS using a very small granularity for its segments: at the moment these are only 4096 bytes. [3]
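
For reference, the coalescing mentioned under cause 3 could look roughly like the sketch below. This is not MPS code: page_run, coalesce_pages and protect_runs are hypothetical names, and the sketch assumes the caller already holds a sorted, duplicate-free list of page addresses that all need the same protection. As noted under cause 3, allocating bigger segments may turn out to be the better fix; this only illustrates the alternative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical type: a run of contiguous pages. Not an MPS structure. */
typedef struct {
    uintptr_t base;   /* address of the first page in the run */
    size_t    size;   /* length of the run in bytes */
} page_run;

/* Merge a sorted array of page base addresses into contiguous runs.
 * out must have room for n entries. Returns the number of runs written. */
static size_t coalesce_pages(const uintptr_t *pages, size_t n,
                             size_t page_size, page_run *out)
{
    size_t runs = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Precondition: input is strictly increasing. */
        assert(i == 0 || pages[i] > pages[i - 1]);
        if (runs > 0 && out[runs - 1].base + out[runs - 1].size == pages[i]) {
            out[runs - 1].size += page_size;   /* page extends the previous run */
        } else {
            out[runs].base = pages[i];         /* page starts a new run */
            out[runs].size = page_size;
            ++runs;
        }
    }
    return runs;
}

/* Issue one mprotect call per run instead of one per page.
 * Returns 0 on success, -1 on the first failing call. */
static int protect_runs(const page_run *runs, size_t n, int prot)
{
    for (size_t i = 0; i < n; ++i) {
        if (mprotect((void *)runs[i].base, runs[i].size, prot) != 0)
            return -1;
    }
    return 0;
}
```

With this approach a sequence of several hundred adjacent 4K pages would cost a single mprotect call, at the price of sorting or otherwise ordering the pending protection changes before they are applied.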

GDR 2014-02-03 For the mmap problem, see also job003674.
History: 2012-11-07 GDR Created.
2014-02-03 GDR Mention job003674.


