rts: Non-concurrent mark and sweep
author Ömer Sinan Ağacan <omer@well-typed.com>
Tue, 5 Feb 2019 05:18:44 +0000 (00:18 -0500)
committer Ben Gamari <ben@smart-cactus.org>
Wed, 19 Jun 2019 18:22:09 +0000 (14:22 -0400)
commit 3f9bd4596191eb5f3aa69172e8f7546db68f2b07
tree 312812b4f6bdd493197fbc23165f22a074375a80
parent 8d190c8342c878293f2bb2272a86862dd3fb2254

This implements the core heap structure and a serial mark/sweep
collector which can be used to manage the oldest-generation heap.
This is the first step towards a concurrent mark-and-sweep collector
aimed at low-latency applications.

The full design of the collector implemented here is described in detail
in a technical note

    B. Gamari. "A Concurrent Garbage Collector For the Glasgow Haskell
    Compiler" (2018)

The basic heap structure used in this design is heavily inspired by

    K. Ueno & A. Ohori. "A fully concurrent garbage collector for
    functional programs on multicore processors." /ACM SIGPLAN Notices/
    Vol. 51. No. 9 (presented at ICFP 2016)

This design is intended to allow both marking and sweeping to run
concurrently with a multi-core mutator. Unlike the Ueno design,
which requires no global synchronization pauses, the collector
introduced here requires a stop-the-world pause at the beginning and end
of the mark phase.

To avoid heap fragmentation, the allocator consists of a number of
fixed-size /sub-allocators/. Each of these sub-allocators allocates into
its own set of /segments/, themselves allocated from the block
allocator. Each segment is broken into a set of fixed-size allocation
blocks (which back allocations) in addition to a bitmap (used to track
the liveness of blocks) and some additional metadata (also used to
track liveness).
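
The segment layout described above can be sketched roughly as follows.
This is a minimal illustration, not the code in `rts/sm/NonMoving.h`:
the names `struct segment`, `segment_init`, `segment_alloc`,
`segment_block_index` and `size_class`, and all of the constants, are
hypothetical, and the power-of-two size classes are an assumption made
only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen only for illustration. */
#define SEG_BYTES  4096u                      /* payload bytes per segment */
#define MIN_BLOCK  16u                        /* smallest block size       */
#define MAX_BLOCKS (SEG_BYTES / MIN_BLOCK)    /* most blocks per segment   */

struct segment {
    uint16_t block_size;          /* the one block size this segment serves */
    uint16_t n_blocks;            /* SEG_BYTES / block_size                 */
    uint16_t next_free;           /* index at which the next search starts  */
    uint8_t  bitmap[MAX_BLOCKS];  /* nonzero = block is in use              */
    uint8_t  data[SEG_BYTES];     /* the allocation blocks themselves       */
};

/* Assumption: size classes are powers of two starting at MIN_BLOCK.
 * Returns the index of the sub-allocator that would serve `bytes`. */
unsigned size_class(size_t bytes)
{
    unsigned cls = 0;
    size_t sz = MIN_BLOCK;
    while (sz < bytes) {
        sz <<= 1;
        cls++;
    }
    return cls;
}

void segment_init(struct segment *seg, uint16_t block_size)
{
    assert(block_size >= MIN_BLOCK && block_size <= SEG_BYTES);
    memset(seg, 0, sizeof *seg);
    seg->block_size = block_size;
    seg->n_blocks = (uint16_t)(SEG_BYTES / block_size);
}

/* Returns a pointer to a free block, or NULL if the segment is full. */
void *segment_alloc(struct segment *seg)
{
    for (uint16_t i = seg->next_free; i < seg->n_blocks; i++) {
        if (seg->bitmap[i] == 0) {
            seg->bitmap[i] = 1;
            seg->next_free = (uint16_t)(i + 1);
            return &seg->data[(size_t)i * seg->block_size];
        }
    }
    seg->next_free = seg->n_blocks;
    return NULL;
}

/* Maps a pointer into the segment back to its block (and bitmap) index. */
uint16_t segment_block_index(const struct segment *seg, const void *p)
{
    size_t off = (size_t)((const uint8_t *)p - seg->data);
    return (uint16_t)(off / seg->block_size);
}
```

Because every block in a segment has the same size, a freed block can
always be reused by a later allocation of the same size class, which is
what keeps fragmentation in check without moving objects.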

This heap structure enables collection via mark-and-sweep, which can be
performed concurrently via a snapshot-at-the-beginning scheme (although
concurrent collection is not implemented in this patch).
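
As a rough illustration of how a per-segment bitmap supports a serial
mark and sweep, the sketch below stamps each live block with the current
collection's epoch during marking and then frees every block that lacks
that stamp. The epoch encoding, the names `struct seg`, `mark_block` and
`sweep_segment`, and the three-way segment classification are
assumptions made for the example; the patch itself may track liveness
differently.

```c
#include <assert.h>
#include <stdint.h>

#define N_BLOCKS 8u   /* hypothetical number of blocks per segment */

enum seg_state { SEG_FREE, SEG_PARTIAL, SEG_FULL };

struct seg {
    /* 0 = free; otherwise the epoch in which the block was last marked. */
    uint8_t bitmap[N_BLOCKS];
};

/* Mark phase: record that block `i` was reached during `epoch`. */
void mark_block(struct seg *s, unsigned i, uint8_t epoch)
{
    assert(i < N_BLOCKS && epoch != 0);
    s->bitmap[i] = epoch;
}

/* Sweep phase: any block not stamped with `epoch` is unreachable, so its
 * bitmap entry is cleared and the block becomes available again. The
 * result tells the caller which list the segment belongs on. */
enum seg_state sweep_segment(struct seg *s, uint8_t epoch)
{
    unsigned live = 0;
    for (unsigned i = 0; i < N_BLOCKS; i++) {
        if (s->bitmap[i] == epoch) {
            live++;
        } else {
            s->bitmap[i] = 0;
        }
    }
    if (live == 0) {
        return SEG_FREE;
    }
    return live == N_BLOCKS ? SEG_FULL : SEG_PARTIAL;
}
```

Sweeping touches only the bitmap, never the objects, so a segment with
no live blocks can be handed back without inspecting its contents.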

The mark queue is a fairly straightforward chunked-array structure.
The representation is a bit more verbose than a typical mark queue to
accommodate a combination of two features:

 * a mark FIFO, which improves the locality of marking, reducing one of
   the major overheads seen in mark/sweep allocators (see [1] for
   details)

 * the selector optimization and indirection shortcutting, which
   requires that we track where we found each reference to an object
   in case we need to update the reference at a later point (e.g. when
   we find that it is an indirection). See Note [Origin references in
   the nonmoving collector] (in `NonMovingMark.h`) for details.

Beyond this, the mark/sweep implementation is fairly run-of-the-mill.
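
A chunked-array mark queue whose entries carry an origin reference might
look roughly like the sketch below. It is not the structure defined in
`NonMovingMark.h`: the names `struct entry`, `struct chunk`,
`struct mark_queue`, `mq_push`, `mq_pop` and `update_origin`, and the
chunk size, are hypothetical, and the prefetching mark FIFO is left out
to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_ENTRIES 4u  /* hypothetical; kept tiny so chunking is exercised */

struct entry {
    void  *obj;     /* object still to be marked                      */
    void **origin;  /* slot in which the reference was found, or NULL */
};

struct chunk {
    struct chunk *prev;                  /* previously filled chunk  */
    size_t        head;                  /* number of entries in use */
    struct entry  entries[CHUNK_ENTRIES];
};

struct mark_queue {
    struct chunk *top;                   /* chunk currently being filled */
};

/* Pushes an entry, starting a new chunk when the current one is full.
 * Returns 0 on success, -1 if memory could not be allocated. */
int mq_push(struct mark_queue *q, void *obj, void **origin)
{
    if (q->top == NULL || q->top->head == CHUNK_ENTRIES) {
        struct chunk *c = malloc(sizeof *c);
        if (c == NULL) {
            return -1;
        }
        c->prev = q->top;
        c->head = 0;
        q->top = c;
    }
    q->top->entries[q->top->head++] = (struct entry){ obj, origin };
    return 0;
}

/* Pops the most recent entry into *out, releasing exhausted chunks.
 * Returns 1 if an entry was produced, 0 if the queue is empty. */
int mq_pop(struct mark_queue *q, struct entry *out)
{
    while (q->top != NULL && q->top->head == 0) {
        struct chunk *c = q->top;
        q->top = c->prev;
        free(c);
    }
    if (q->top == NULL) {
        return 0;
    }
    *out = q->top->entries[--q->top->head];
    return 1;
}

/* If the entry recorded where its reference came from, redirect that
 * slot to `target`, e.g. to skip over an indirection. */
void update_origin(struct entry e, void *target)
{
    if (e.origin != NULL) {
        *e.origin = target;
    }
}
```

Storing the origin slot next to each object pointer is what makes the
entries larger than in a plain mark stack; in exchange, the marker can
rewrite the referring field once it learns what the object really is.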

[1] R. Garner, S.M. Blackburn, D. Frampton. "Effective Prefetch for
    Mark-Sweep Garbage Collection." ISMM 2007.

Co-Authored-By: Ben Gamari <ben@well-typed.com>
24 files changed:
includes/rts/storage/Block.h
rts/Capability.c
rts/Capability.h
rts/RtsStartup.c
rts/Weak.c
rts/rts.cabal.in
rts/sm/Evac.c
rts/sm/GC.c
rts/sm/GC.h
rts/sm/GCAux.c
rts/sm/GCThread.h
rts/sm/NonMoving.c [new file with mode: 0644]
rts/sm/NonMoving.h [new file with mode: 0644]
rts/sm/NonMovingMark.c [new file with mode: 0644]
rts/sm/NonMovingMark.h [new file with mode: 0644]
rts/sm/NonMovingScav.c [new file with mode: 0644]
rts/sm/NonMovingScav.h [new file with mode: 0644]
rts/sm/NonMovingSweep.c [new file with mode: 0644]
rts/sm/NonMovingSweep.h [new file with mode: 0644]
rts/sm/Sanity.c
rts/sm/Sanity.h
rts/sm/Scav.c
rts/sm/Storage.c
rts/sm/Storage.h