GC Messages

Design of MPS GC messages:

  • mps_message_type_gc_start();
  • mps_message_type_gc().

They are called "trace start" and "trace end" messages in this document and in most MPS source code.

Readership: any MPS developer. Not confidential.

(For a guide to the MPS message system in general, see <design/message>.)

Introduction

The MPS posts a trace start message (_gc_start) near the start of every trace (but after calculating the condemned set, so we can report how large it is).

The MPS posts a trace end message (_gc) near the end of every trace.

These messages are extremely flexible: they can hold arbitrary additional data simply by writing new accessor functions. If there is more data to report at either of these two events, then there is a good argument for adding it into these existing messages.

(Note: in previous versions of this design document, there was a partial unimplemented design for a "_gc_generation" message. I would argue that this would not have been a good design, because managing and collating multiple messages is much more complex for both MPS and client than using a single message. RHSK 2008-12-19.)

Purpose

The purpose of these messages is to allow the client program to be aware of GC activity, in order to:

  • adjust its own behaviour programmatically;
  • show or report GC activity in some appropriate 'custom' way, such as an in-client display, in a log file, etc. The main message content should be intelligible and helpful to client-developers (with help from MPS staff if necessary). There may be extra content that is only meaningful to MPS staff, to help us diagnose client problems.

While there is some overlap with the Diagnostic Feedback system, the main contrasts are that these GC messages are present in release builds, are stable from release to release, and are designed to be parsed by the client program.

Names and Parts

Here's a helpful list of the names used in the GC message system:

Implementation is mostly in the source file "traceanc.c" (trace ancillary).

The "_gc_start" message is implemented internally by the "trace start" message. The principal names are:

Internal names:
  trace start
  TraceStartMessage, tsMessage (struct *)
  arena->tsMessage[ti]
  MessageTypeGCSTART (enum)

External names:
  mps_message_type_gc_start (enum macro)
  MPS_MESSAGE_TYPE_GC_START (enum)

The "_gc" message is implemented internally by the "trace end" message. The principal names are:

Internal names:
  trace end
  TraceMessage, tMessage (struct *)
  arena->tMessage[ti]
  MessageTypeGC (enum)

External names:
  mps_message_type_gc (enum macro)
  MPS_MESSAGE_TYPE_GC (enum)

Note: the names of these messages are unconventional; they should properly be called "gc (or trace) begin" and "gc (or trace) end". But it's much too late to change them now. RHSK 2008-12-15

Collectively, the trace-start and trace-end messages are called the "trace id messages", and they are managed by the following functions:

Internal names:
  TraceIdMessagesCheck
  TraceIdMessagesCreate
  TraceIdMessagesDestroy

The currently supported message-field accessor methods are:

External names:
  mps_message_gc_start_why
  mps_message_gc_live_size
  mps_message_gc_condemned_size
  mps_message_gc_not_condemned_size

These are documented in the Reference Manual.

Lifecycle

Overview: for each trace id, pre-allocate a pair of start/end messages (with ControlAlloc). Then, when a trace runs using that trace id, fill-in and post these messages. As soon as the trace has posted both messages, immediately pre-allocate a new pair of messages, which wait in readiness for the next trace to use that trace id.

Requirements

.no-start-alloc: Should avoid attempting to allocate memory at trace start time.

.no-start-alloc.why: There may be no free memory at trace start time. Client would still like to hear about collections in those circumstances.

.queue: Must support a client that enables, but does not promptly get, GC messages. Ungot messages must remain queued, and the client must be able to get them later without loss. It is not acceptable to stop issuing GC messages for subsequent collections merely because messages from previous collections have not yet been got.

.queue.why: This is because there is simply no reasonable way for a client to guarantee that it always promptly collects GC messages.

.match: Start and end messages should always match up: never post one of the messages but fail to post the matching one.

.match.why: This makes client code much simpler -- it does not have to handle mismatching messages.

.errors-not-direct: Errors (such as a ControlAlloc failure) cannot be reported directly to the client, because collections often happen automatically, without an explicit client call to the MPS interface.

.multi-trace: Up to TraceLIMIT traces may be running, and emitting start/end messages, simultaneously.

.early: Nice to tell client as much as possible about the collection in the start message, if we can.

.similar: Start and end messages are conceptually similar -- it is quite okay, and may be helpful to the client, for the same datum (for example: the reason why the collection occurred) to be present in both the start and end message.

Storage

For each trace-id (.multi-trace) a pair (.match) of start/end messages is dynamically allocated (.queue) in advance (.no-start-alloc). Messages are allocated in the control pool using ControlAlloc.

(Previous implementations of the trace start message used static allocation. This does not satisfy .queue. See also job001570. RHSK 2008-12-15.)

Pointers to these messages are stored in tsMessage[ti] and tMessage[ti] arrays in the ArenaStruct.

(We must not keep the pre-allocated messages, or pointers to them, in the TraceStructs: the memory for TraceStructs is statically allocated, but the values in them are re-initialised by TraceCreate each time the trace id is used, so the TraceStructs are invalid (that is: to be treated as random uninitialised memory) when not being used by a trace. See also job001989. RHSK 2008-12-15.)

Creating and Posting

At ArenaCreate time we TRACE_SET_ITER to initialise the tsMessage[ti] and tMessage[ti] pointers to NULL, and then (when the control pool is ready) TRACE_SET_ITER calling TraceIdMessagesCreate. This performs the initial pre-allocation of the trace start/end messages for each trace id. Allocation failure is not tolerated here: it makes ArenaCreate fail with an error code, because the arena is deemed to be unreasonably small.

When a trace is running using trace id ti, it finds a pre-allocated message via tsMessage[ti] or tMessage[ti] in the ArenaStruct, fills-in and posts the message, and nulls-out the pointer. (If the pointer was null, no message is sent; see below). The message is now reachable only from the arena message queue (but the control pool also knows about it).

When the trace completes, it calls TraceIdMessagesCreate for its trace id. This performs the ongoing pre-allocation of the trace start/end messages for the next use of this trace id. The expectation is that, after a trace has completed, some memory will have been reclaimed, and the ControlAllocs will succeed.

But allocation failure here is permitted: if it happens, both the start and the end messages are freed (if present). This means that, for the next collection using this trace id, neither a start nor an end message will be sent (.match). There is no direct way to report this failure to the client (.errors-not-direct), so we just increment the "droppedMessages" counter in the ArenaStruct. Currently this counter is never reported to the client (except in diagnostic varieties).

Getting and Discarding

If the client has not enabled that message type, the message is discarded immediately when posted, calling ControlFree and reclaiming the memory.

If the client has enabled but never gets the message, it remains on the message queue until ArenaDestroy. Theoretically these messages could accumulate forever until they exhaust memory. This is intentional: the client should not enable a message type and then never get it!

Otherwise, when the client gets a message, it is dropped from the arena message queue: now only the client (and the control pool) hold references to it. The client must call mps_message_discard once it has finished using the message. This calls ControlFree and reclaims the memory.

If the client simply drops its reference, the memory will not be reclaimed until ArenaDestroy. This is intentional: the control pool is not garbage-collected.

Final Clearup

Final clearup is performed at ArenaDestroy, as follows:

Unused and unsent pre-allocated messages (one per trace id):
Freed with TRACE_SET_ITER calling TraceIdMessagesDestroy which calls the message Delete functions (and thereby ControlFree) on anything left in tsMessage[ti] and tMessage[ti].
Ungot messages:
Freed by emptying the arena message queue with MessageEmpty.
Got but undiscarded messages:
Freed by destroying the control pool.

Testing

The main test is "zmess.c". See notes there.

Various other tests, including "amcss.c", also collect and report mps_message_type_gc and/or mps_message_type_gc_start.

Coverage

Current tests do not check:

  • The less common why-codes (reasons why a trace starts). These should be added to zmess.c.

A. References

B. Document History

2003-02-17 DRJ Created.
2006-12-07 RHSK Remove mps_message_type_gc_generation (not implemented).
2007-02-08 RHSK Move historically interesting requirements for mps_message_type_gc_generation (not implemented) to the end of the doc.
2008-12-19 RHSK Complete re-write, for new GC message lifecycle. See job001989. (Previous version was //info.ravenbrook.com/project/mps/master/design/message-gc/index.html#2).