Make profiling work with multiple capabilities (+RTS -N)
author Simon Marlow <marlowsd@gmail.com>
Mon, 28 Nov 2011 16:48:43 +0000 (16:48 +0000)
committer Simon Marlow <marlowsd@gmail.com>
Tue, 29 Nov 2011 12:21:18 +0000 (12:21 +0000)
commit 50de6034343abc93a7b01daccff34121042c0e7c
tree 24496a5fc6bc39c6baaa574608e53c5d76c169f6
parent 1c2b838131134d44004dfdff18c302131478390d

This means that both time and heap profiling work for parallel
programs.  Main internal changes:

  - CCCS is no longer a global variable; it is now another
    pseudo-register in the StgRegTable struct.  Thus every
    Capability has its own CCCS.

  - There is a new built-in CCS called "IDLE", which records ticks for
    Capabilities in the idle state.  If you profile a single-threaded
    program with +RTS -N2, you'll see about 50% of time in "IDLE".

  - There is appropriate locking in rts/Profiling.c to protect the
    shared cost-centre-stack data structures.
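
As a rough illustration of the first two points, here is a minimal sketch in C. It is not the RTS code: the struct layouts, the `idle` flag, `handle_tick` and `idle_percent` are all hypothetical names invented for this example, and the real StgRegTable and cost-centre-stack structs carry many more fields. It only models the idea that each Capability has its own current CCS and that a timer tick arriving on an idle Capability is charged to "IDLE".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CAPS 16  /* arbitrary limit for this sketch */

/* Simplified stand-in for a cost-centre stack (hypothetical layout). */
typedef struct {
    const char   *label;
    unsigned long time_ticks;
} CostCentreStack;

/* Simplified stand-in for StgRegTable: the current CCS lives here,
   so every capability carries its own copy instead of sharing a global. */
typedef struct {
    CostCentreStack *cccs;
} RegTable;

typedef struct {
    RegTable r;
    int      idle;  /* hypothetical flag: non-zero when there is no work */
} Capability;

/* One profiling timer tick: charge each capability's own current CCS,
   or the IDLE CCS when that capability is idle. */
void handle_tick(Capability *caps, size_t n_caps, CostCentreStack *idle_ccs)
{
    assert(caps != NULL && idle_ccs != NULL);
    for (size_t i = 0; i < n_caps; i++) {
        CostCentreStack *target = caps[i].idle ? idle_ccs : caps[i].r.cccs;
        target->time_ticks++;
    }
}

/* Model a single-threaded program on n_caps capabilities: capability 0
   runs MAIN, the rest stay idle. Returns the percentage of ticks in IDLE. */
unsigned long idle_percent(size_t n_caps, unsigned long ticks)
{
    assert(n_caps >= 1 && n_caps <= MAX_CAPS && ticks > 0);
    CostCentreStack main_ccs = { "MAIN", 0 };
    CostCentreStack idle_ccs = { "IDLE", 0 };
    Capability caps[MAX_CAPS];

    for (size_t i = 0; i < n_caps; i++) {
        caps[i].r.cccs = &main_ccs;
        caps[i].idle   = (i != 0);
    }
    for (unsigned long t = 0; t < ticks; t++) {
        handle_tick(caps, n_caps, &idle_ccs);
    }
    return 100 * idle_ccs.time_ticks
               / (idle_ccs.time_ticks + main_ccs.time_ticks);
}
```

In this idealised model, `idle_percent(2, 1000)` returns 50, matching the "about 50%" figure for a single-threaded program run with +RTS -N2. A real profile will only be approximately that, since the busy Capability is not guaranteed to be running Haskell code on every tick.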

This patch does enough to get it working, but I have cut one big corner:
the cost-centre-stack data structure is still shared amongst all
Capabilities, which means that multiple Capabilities will race when
updating the "allocations" and "entries" fields of a CCS.  Not only
does this give unpredictable results, but it runs very slowly due to
cache line bouncing.
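
To make the race concrete, here is a small single-threaded model in C of how two unsynchronised increments of a shared counter can lose an update. It is only a sketch: `SharedCCS`, `load_entries`, `store_entries` and the two driver functions are hypothetical names for this example, and the interleaving is written out by hand instead of using real threads so that the outcome is reproducible. It does not model the cache-line bouncing cost.

```c
#include <assert.h>

/* Stand-in for a shared cost-centre stack with an "entries" field
   (hypothetical, heavily simplified). */
typedef struct {
    unsigned long entries;
} SharedCCS;

/* An unsynchronised `entries++` is really a load followed by a store. */
static unsigned long load_entries(const SharedCCS *ccs)
{
    return ccs->entries;
}

static void store_entries(SharedCCS *ccs, unsigned long v)
{
    ccs->entries = v;
}

/* Two capabilities each enter the cost centre once, with their
   load/store steps interleaved. */
unsigned long interleaved_entries(void)
{
    SharedCCS ccs = { 0 };
    unsigned long a = load_entries(&ccs);  /* capability A reads 0 */
    unsigned long b = load_entries(&ccs);  /* capability B reads 0 before A writes */
    store_entries(&ccs, a + 1);            /* A writes 1 */
    store_entries(&ccs, b + 1);            /* B also writes 1, overwriting A's update */
    return ccs.entries;
}

/* The same two entries with no overlap between the capabilities. */
unsigned long sequential_entries(void)
{
    SharedCCS ccs = { 0 };
    store_entries(&ccs, load_entries(&ccs) + 1);
    store_entries(&ccs, load_entries(&ccs) + 1);
    assert(ccs.entries == 2);
    return ccs.entries;
}
```

Here `interleaved_entries()` returns 1 while `sequential_entries()` returns 2. On real hardware the interleaving varies from run to run, which is why the recorded counts are unpredictable.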

It is strongly recommended that you use -fno-prof-count-entries to
disable the "entries" count when profiling parallel programs.  (I shall
add a note to this effect to the docs.)
34 files changed:
compiler/cmm/CmmExpr.hs
compiler/cmm/CmmLex.x
compiler/cmm/CmmParse.y
compiler/cmm/PprCmmExpr.hs
compiler/codeGen/CgCase.lhs
compiler/codeGen/CgClosure.lhs
compiler/codeGen/CgForeignCall.hs
compiler/codeGen/CgProf.hs
compiler/codeGen/CgUtils.hs
compiler/codeGen/StgCmmForeign.hs
compiler/codeGen/StgCmmProf.hs
compiler/codeGen/StgCmmUtils.hs
includes/Cmm.h
includes/RtsAPI.h
includes/mkDerivedConstants.c
includes/rts/prof/CCS.h
includes/stg/MiscClosures.h
includes/stg/Regs.h
rts/Apply.cmm
rts/AutoApply.h
rts/Capability.c
rts/Exception.cmm
rts/Interpreter.c
rts/PrimOps.cmm
rts/Profiling.c
rts/Proftimer.c
rts/RetainerProfile.h
rts/RtsFlags.c
rts/Schedule.c
rts/StgMiscClosures.cmm
rts/StgStdThunks.cmm
rts/sm/GC.c
rts/sm/Storage.c
utils/genapply/GenApply.hs