rts: Non-concurrent mark and sweep
author Ömer Sinan Ağacan <omer@well-typed.com>
Tue, 5 Feb 2019 05:18:44 +0000 (00:18 -0500)
committer Ben Gamari <ben@smart-cactus.org>
Fri, 17 May 2019 17:00:12 +0000 (13:00 -0400)
commit 242f9ff8081a546ec36874bbd14038be8d7f915a
tree f8731b0d8cd2a8f292a3109efd8a8c3ba889e895
parent fb1513a7492f3e8b04334b9c90769e7df00984fb

This implements the core heap structure and a serial mark/sweep
collector which can be used to manage the oldest-generation heap.
This is the first step towards a concurrent mark-and-sweep collector
aimed at low-latency applications.

The full design of the collector implemented here is described in detail
in a technical note

    B. Gamari. "A Concurrent Garbage Collector For the Glasgow Haskell
    Compiler" (2018)

The basic heap structure used in this design is heavily inspired by

    K. Ueno & A. Ohori. "A fully concurrent garbage collector for
    functional programs on multicore processors." /ACM SIGPLAN Notices/
    Vol. 51. No. 9 (presented at ICFP 2016)

This design is intended to allow both marking and sweeping to run
concurrently with the execution of a multi-core mutator. Unlike the Ueno
design, which requires no global synchronization pauses, the collector
introduced here requires a stop-the-world pause at the beginning and end
of the mark phase.

To avoid heap fragmentation, the allocator consists of a number of
fixed-size /sub-allocators/. Each of these sub-allocators allocates into
its own set of /segments/, themselves allocated from the block
allocator. Each segment is broken into a set of fixed-size allocation
blocks (which back allocations) in addition to a bitmap (used to track
the liveness of blocks) and some additional metadata (also used to
track liveness).
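
As a rough illustration of the segment layout described above, here is a
minimal sketch in C. It is not the layout used in `NonMoving.h`: the
names (`struct segment`, `seg_alloc`, ...), the segment size and the
one-byte-per-block bitmap are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen only for illustration. */
#define SEG_BYTES  4096                     /* payload bytes per segment */
#define MIN_BLOCK  8                        /* smallest block size served */
#define MAX_BLOCKS (SEG_BYTES / MIN_BLOCK)  /* bitmap capacity */

/* One segment belonging to the sub-allocator for `block_size` bytes. */
struct segment {
    size_t  block_size;          /* every block in this segment has this size */
    size_t  next_free;           /* index at which the free-block search starts */
    uint8_t bitmap[MAX_BLOCKS];  /* nonzero entry = block is in use / live */
    uint8_t data[SEG_BYTES];     /* the fixed-size blocks themselves */
};

/* Number of blocks that fit in the segment for its block size. */
static size_t seg_block_count(const struct segment *seg)
{
    return SEG_BYTES / seg->block_size;
}

static void seg_init(struct segment *seg, size_t block_size)
{
    assert(block_size >= MIN_BLOCK && block_size <= SEG_BYTES);
    seg->block_size = block_size;
    seg->next_free = 0;
    memset(seg->bitmap, 0, sizeof seg->bitmap);
}

/* Hand out the next block whose bitmap entry is clear, or NULL if the
 * segment is full (a real allocator would then move to a new segment). */
static void *seg_alloc(struct segment *seg)
{
    size_t n = seg_block_count(seg);
    for (size_t i = seg->next_free; i < n; i++) {
        if (seg->bitmap[i] == 0) {
            seg->bitmap[i] = 1;
            seg->next_free = i + 1;
            return seg->data + i * seg->block_size;
        }
    }
    return NULL;
}

/* Map a pointer into the segment back to its block index, which is how
 * a collector would find the bitmap entry for an object. */
static size_t seg_block_index(const struct segment *seg, const void *p)
{
    size_t off = (size_t)((const uint8_t *)p - seg->data);
    assert(off < SEG_BYTES);
    return off / seg->block_size;
}
```

Because every block in a segment has the same size, a freed block can
always be reused by a later allocation of that size class, which is what
keeps fragmentation bounded in this kind of design.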

This heap structure enables collection via mark-and-sweep, which can be
performed concurrently via a snapshot-at-the-beginning scheme (although
concurrent collection is not implemented in this patch).
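
For readers unfamiliar with the technique, the following is a minimal
sketch of a serial mark/sweep pass over a toy heap. It only illustrates
the general idea: the object representation, the names (`struct obj`,
`mark`, `sweep`) and the fixed heap size are assumptions for the
example and do not correspond to the code in `NonMovingMark.c` or
`NonMovingSweep.c`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_OBJS 8      /* hypothetical toy heap size */
#define NO_REF (-1)   /* marks an empty reference field */

/* A toy object with up to two outgoing references (heap indices). */
struct obj {
    bool allocated;
    int  left, right;
};

static struct obj heap[N_OBJS];
static bool mark_bits[N_OBJS];   /* one liveness entry per object */

/* Mark everything reachable from the roots using an explicit stack.
 * An object is marked when pushed, so it is pushed at most once and
 * the stack never needs more than N_OBJS slots. */
static void mark(const int *roots, size_t n_roots)
{
    int stack[N_OBJS];
    size_t sp = 0;

    for (size_t i = 0; i < N_OBJS; i++)
        mark_bits[i] = false;

    for (size_t i = 0; i < n_roots; i++) {
        int r = roots[i];
        if (r != NO_REF && !mark_bits[r]) {
            mark_bits[r] = true;
            stack[sp++] = r;
        }
    }

    while (sp > 0) {
        int o = stack[--sp];
        int kids[2] = { heap[o].left, heap[o].right };
        for (int k = 0; k < 2; k++) {
            int c = kids[k];
            if (c != NO_REF && !mark_bits[c]) {
                mark_bits[c] = true;
                stack[sp++] = c;
            }
        }
    }
}

/* Free every allocated object that was not marked; return the count. */
static size_t sweep(void)
{
    size_t freed = 0;
    for (size_t i = 0; i < N_OBJS; i++) {
        if (heap[i].allocated && !mark_bits[i]) {
            heap[i].allocated = false;
            freed++;
        }
    }
    return freed;
}
```

Objects never move in this scheme: sweeping only flips liveness state,
so the mutator's pointers stay valid, which is the property a concurrent
variant relies on.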

The mark queue is a fairly straightforward chunked-array structure.
The representation is a bit more verbose than a typical mark queue to
accommodate a combination of two features:

 * a mark FIFO, which improves the locality of marking, reducing one of
   the major overheads seen in mark/sweep allocators (see [1] for
   details)

 * the selector optimization and indirection shortcutting, which
   requires that we track where we found each reference to an object
   in case we need to update the reference at a later point (e.g. when
   we find that it is an indirection). See Note [Origin references in
   the nonmoving collector] (in `NonMovingMark.h`) for details.
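
To make the shape of such a queue concrete, here is a minimal sketch of
a chunked mark queue whose entries carry an origin reference. It is an
illustration under assumptions, not the structure defined in
`NonMovingMark.h`: the names (`struct mark_entry`, `mq_push`, ...) and
the chunk size are hypothetical, and the prefetching mark FIFO is left
out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_ENTRIES 4   /* hypothetical; a real chunk would be far larger */

/* A queue entry: the object to mark plus the field it was reached from. */
struct mark_entry {
    void  *obj;
    void **origin;   /* where the reference was found, or NULL for roots */
};

/* Chunks are fixed-size arrays linked into a stack. */
struct mark_chunk {
    struct mark_chunk *prev;
    size_t used;
    struct mark_entry entries[CHUNK_ENTRIES];
};

struct mark_queue {
    struct mark_chunk *top;
};

static void mq_init(struct mark_queue *q)
{
    q->top = NULL;
}

/* Push an entry, allocating a fresh chunk when the top one is full. */
static void mq_push(struct mark_queue *q, void *obj, void **origin)
{
    assert(obj != NULL);
    if (q->top == NULL || q->top->used == CHUNK_ENTRIES) {
        struct mark_chunk *c = malloc(sizeof *c);
        if (c == NULL)
            abort();
        c->prev = q->top;
        c->used = 0;
        q->top = c;
    }
    q->top->entries[q->top->used++] = (struct mark_entry){ obj, origin };
}

/* Pop the most recently pushed entry; returns false when the queue is
 * empty. Exhausted chunks are freed as they are encountered. */
static bool mq_pop(struct mark_queue *q, struct mark_entry *out)
{
    while (q->top != NULL && q->top->used == 0) {
        struct mark_chunk *c = q->top;
        q->top = c->prev;
        free(c);
    }
    if (q->top == NULL)
        return false;
    *out = q->top->entries[--q->top->used];
    return true;
}

/* Overwrite the originating field, e.g. to shortcut an indirection by
 * pointing the field directly at the indirection's target. */
static void mq_update_origin(const struct mark_entry *e, void *new_target)
{
    if (e->origin != NULL)
        *e->origin = new_target;
}
```

Carrying the origin in every entry is what makes the entries larger than
in a typical mark queue: without it, the marker would know which object
to visit but not which field to rewrite when a shortcut is possible.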

Beyond this, the mark/sweep implementation is fairly run-of-the-mill.

[1] R. Garner, S.M. Blackburn, D. Frampton. "Effective Prefetch for
    Mark-Sweep Garbage Collection." ISMM 2007.

Co-Authored-By: Ben Gamari <ben@well-typed.com>
22 files changed:
includes/rts/storage/Block.h
rts/Capability.c
rts/Capability.h
rts/RtsStartup.c
rts/Weak.c
rts/sm/Evac.c
rts/sm/GC.c
rts/sm/GC.h
rts/sm/GCAux.c
rts/sm/GCThread.h
rts/sm/NonMoving.c [new file with mode: 0644]
rts/sm/NonMoving.h [new file with mode: 0644]
rts/sm/NonMovingMark.c [new file with mode: 0644]
rts/sm/NonMovingMark.h [new file with mode: 0644]
rts/sm/NonMovingScav.c [new file with mode: 0644]
rts/sm/NonMovingScav.h [new file with mode: 0644]
rts/sm/NonMovingSweep.c [new file with mode: 0644]
rts/sm/NonMovingSweep.h [new file with mode: 0644]
rts/sm/Sanity.c
rts/sm/Scav.c
rts/sm/Storage.c
rts/sm/Storage.h