ghc.git
11 years agoReorganisation to fix problems related to the gct register variable
Simon Marlow [Wed, 16 Apr 2008 23:22:32 +0000 (23:22 +0000)] 
Reorganisation to fix problems related to the gct register variable
  - GCAux.c contains code not compiled with the gct register enabled,
    it is callable from outside the GC
  - marking functions are moved to their relevant subsystems, outside
    the GC
  - mark_root needs to save the gct register, as it is called from
    outside the GC

11 years agofaster block allocator, by dividing the free list into buckets
Simon Marlow [Wed, 16 Apr 2008 22:45:41 +0000 (22:45 +0000)] 
faster block allocator, by dividing the free list into buckets
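
The commit does not say how the buckets are chosen. As a minimal sketch, assuming buckets are indexed by the floor of log2 of the group size in blocks, a bucketed free list might look like the following. The names `Group`, `bucket_for`, `free_group`, `alloc_group` and `NUM_BUCKETS` are hypothetical and are not taken from the RTS sources; splitting and coalescing of groups are left out.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BUCKETS 8            /* hypothetical number of size classes */

/* Hypothetical free block group: a run of n_blocks contiguous blocks. */
typedef struct Group_ {
    size_t n_blocks;
    struct Group_ *next;
} Group;

/* One singly linked free list per size class. */
static Group *free_list[NUM_BUCKETS];

/* Bucket b holds groups with 2^b <= n_blocks < 2^(b+1);
 * the last bucket also holds everything larger. */
static size_t bucket_for(size_t n_blocks)
{
    size_t b = 0;
    assert(n_blocks > 0);
    while (n_blocks > 1 && b < NUM_BUCKETS - 1) {
        n_blocks >>= 1;
        b++;
    }
    return b;
}

/* Return a group to the free list of its size class. */
static void free_group(Group *g)
{
    size_t b = bucket_for(g->n_blocks);
    g->next = free_list[b];
    free_list[b] = g;
}

/* Find a group of at least n blocks. Buckets below bucket_for(n)
 * cannot contain a large enough group, so they are never visited. */
static Group *alloc_group(size_t n)
{
    for (size_t b = bucket_for(n); b < NUM_BUCKETS; b++) {
        for (Group **pp = &free_list[b]; *pp != NULL; pp = &(*pp)->next) {
            if ((*pp)->n_blocks >= n) {
                Group *g = *pp;
                *pp = g->next;   /* unlink from the bucket */
                g->next = NULL;
                return g;
            }
        }
    }
    return NULL;
}
```

The speedup in this sketch comes from skipping every bucket whose groups are too small, rather than walking one long list.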

11 years agoallocate more blocks in one go, to reduce contention for the block allocator
Simon Marlow [Wed, 16 Apr 2008 22:38:24 +0000 (22:38 +0000)] 
allocate more blocks in one go, to reduce contention for the block allocator

11 years agomeasure GC(0/1) times and work imbalance
Simon Marlow [Wed, 16 Apr 2008 22:25:39 +0000 (22:25 +0000)] 
measure GC(0/1) times and work imbalance

11 years agoremove outdated comment
Simon Marlow [Wed, 16 Apr 2008 22:23:19 +0000 (22:23 +0000)] 
remove outdated comment

11 years agocalculate and report slop (wasted space at the end of blocks)
Simon Marlow [Wed, 16 Apr 2008 22:15:16 +0000 (22:15 +0000)] 
calculate and report slop (wasted space at the end of blocks)

11 years agofree empty blocks at the end of GC
Simon Marlow [Wed, 16 Apr 2008 22:13:56 +0000 (22:13 +0000)] 
free empty blocks at the end of GC

11 years agomove the scan block pointer into the gct structure
Simon Marlow [Wed, 16 Apr 2008 22:13:31 +0000 (22:13 +0000)] 
move the scan block pointer into the gct structure

11 years agoimprovements to +RTS -s output
Simon Marlow [Wed, 16 Apr 2008 22:12:24 +0000 (22:12 +0000)] 
improvements to +RTS -s output
- count and report number of parallel collections
- calculate bytes scanned in addition to bytes copied per thread
- calculate "work balance factor"
- tidy up the formatting a bit

11 years agowait for threads to start up properly
Simon Marlow [Wed, 16 Apr 2008 22:10:02 +0000 (22:10 +0000)] 
wait for threads to start up properly

11 years agodebug output tweaks
Simon Marlow [Wed, 16 Apr 2008 22:08:07 +0000 (22:08 +0000)] 
debug output tweaks

11 years agoKeep track of an accurate count of live words in each step
Simon Marlow [Wed, 16 Apr 2008 22:06:20 +0000 (22:06 +0000)] 
Keep track of an accurate count of live words in each step
This means we can calculate slop easily, and also improve
predictability of GC.
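
As a rough illustration of why an accurate live-word count makes slop easy to compute: slop is whatever the step's blocks can hold minus what is actually live. The struct, the function name and the `BLOCK_SIZE_W` value below are assumptions made for this sketch, not the RTS definitions.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE_W 512         /* hypothetical block size in words */

/* Hypothetical per-step counters. */
typedef struct {
    size_t n_blocks;             /* blocks owned by the step */
    size_t n_words;              /* live words stored in those blocks */
} StepCounts;

/* Slop = capacity of the step's blocks minus the live words in them. */
static size_t step_slop_words(const StepCounts *s)
{
    size_t capacity = s->n_blocks * BLOCK_SIZE_W;
    assert(s->n_words <= capacity);
    return capacity - s->n_words;
}
```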

11 years agoAllow work units smaller than a block to improve load balancing
Simon Marlow [Wed, 16 Apr 2008 22:03:47 +0000 (22:03 +0000)] 
Allow work units smaller than a block to improve load balancing

11 years agoin scavenge_block1(), we can use the lock-free recordMutableGen()
Simon Marlow [Wed, 16 Apr 2008 22:01:04 +0000 (22:01 +0000)] 
in scavenge_block1(), we can use the lock-free recordMutableGen()

11 years agoupdate the debug counters following changes to scav_find_work()
Simon Marlow [Wed, 16 Apr 2008 21:59:45 +0000 (21:59 +0000)] 
update the debug counters following changes to scav_find_work()

11 years agochange the find-work strategy: use oldest-first consistently
Simon Marlow [Wed, 16 Apr 2008 21:58:15 +0000 (21:58 +0000)] 
change the find-work strategy: use oldest-first consistently

11 years agoper-thread debug output when using multiple threads, not just major gc
Simon Marlow [Wed, 16 Apr 2008 21:57:41 +0000 (21:57 +0000)] 
per-thread debug output when using multiple threads, not just major gc

11 years agosmall debug output improvements
Simon Marlow [Wed, 16 Apr 2008 21:56:49 +0000 (21:56 +0000)] 
small debug output improvements

11 years agoallow parallel minor collections too
Simon Marlow [Wed, 16 Apr 2008 21:55:03 +0000 (21:55 +0000)] 
allow parallel minor collections too

11 years agoSpecialise evac/scav for single-threaded, not minor, GC
Simon Marlow [Wed, 16 Apr 2008 21:54:05 +0000 (21:54 +0000)] 
Specialise evac/scav for single-threaded, not minor, GC
So we can parallelise minor collections too.  Sometimes it's worth it.

11 years agomove usleep(1) to gc_thread_work() from any_work()
Simon Marlow [Wed, 16 Apr 2008 21:53:25 +0000 (21:53 +0000)] 
move usleep(1) to gc_thread_work() from any_work()

11 years agouse RTS_VAR()
Simon Marlow [Wed, 16 Apr 2008 21:52:45 +0000 (21:52 +0000)] 
use RTS_VAR()

11 years agotreat the global work list as a queue rather than a stack
Simon Marlow [Wed, 16 Apr 2008 21:51:09 +0000 (21:51 +0000)] 
treat the global work list as a queue rather than a stack

11 years agoGC: move static object processing into thread-local storage
Simon Marlow [Wed, 16 Apr 2008 21:48:25 +0000 (21:48 +0000)] 
GC: move static object processing into thread-local storage

11 years agotmp: usleep(1) during anyWork() if no work
Simon Marlow [Wed, 16 Apr 2008 21:40:23 +0000 (21:40 +0000)] 
tmp: usleep(1) during anyWork() if no work

11 years agoanyWork(): count the number of times we don't find any work
Simon Marlow [Wed, 16 Apr 2008 21:39:45 +0000 (21:39 +0000)] 
anyWork(): count the number of times we don't find any work

11 years agostats fixes
Simon Marlow [Wed, 16 Apr 2008 21:35:32 +0000 (21:35 +0000)] 
stats fixes

11 years agoAdd +RTS -vg flag for requesting some GC trace messages, outside DEBUG
Simon Marlow [Wed, 16 Apr 2008 21:35:04 +0000 (21:35 +0000)] 
Add +RTS -vg flag for requesting some GC trace messages, outside DEBUG
DEBUG imposes a significant performance hit in the GC, yet we often
want some of the debugging output, so -vg gives us the cheap trace
messages without the sanity checking of DEBUG, just like -vs for the
scheduler.

11 years agoGC: rearrange storage to reduce memory accesses in the inner loop
Simon Marlow [Wed, 16 Apr 2008 21:34:36 +0000 (21:34 +0000)] 
GC: rearrange storage to reduce memory accesses in the inner loop

11 years agoAdd profiling of spinlocks
Simon Marlow [Wed, 16 Apr 2008 21:33:58 +0000 (21:33 +0000)] 
Add profiling of spinlocks

11 years agorename StgSync to SpinLock
Simon Marlow [Wed, 16 Apr 2008 21:11:52 +0000 (21:11 +0000)] 
rename StgSync to SpinLock

11 years agoRelease some of the memory allocated to a stack when it shrinks (#2090)
simonmar@microsoft.com [Thu, 28 Feb 2008 15:31:29 +0000 (15:31 +0000)] 
Release some of the memory allocated to a stack when it shrinks (#2090)
When a stack is occupying less than 1/4 of the memory it owns, and is
larger than a megablock, we release half of it.  Shrinking is O(1), it
doesn't need to copy the stack.
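
The shrinking rule described above can be sketched as follows. This is a simplified model under stated assumptions: `StackInfo`, `words_to_release`, `shrink_stack` and the `MBLOCK_SIZE_W` value are hypothetical, and the real RTS works on TSO and block structures rather than two counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical megablock size in words; the real value is set by the RTS. */
#define MBLOCK_SIZE_W ((size_t)1 << 17)

/* Hypothetical, simplified bookkeeping for one thread's stack. */
typedef struct {
    size_t owned_w;              /* words of memory the stack owns */
    size_t used_w;               /* words the stack currently occupies */
} StackInfo;

/* Words to give back: half of the owned memory when the stack is larger
 * than a megablock and less than a quarter full, otherwise nothing. */
static size_t words_to_release(const StackInfo *s)
{
    if (s->owned_w > MBLOCK_SIZE_W && s->used_w < s->owned_w / 4) {
        return s->owned_w / 2;
    }
    return 0;
}

/* Only the recorded size changes and nothing is copied, so this is O(1). */
static void shrink_stack(StackInfo *s)
{
    size_t release = words_to_release(s);
    assert(s->used_w <= s->owned_w - release);
    s->owned_w -= release;
}
```

Because the stack is under a quarter full when half is released, the occupied part always still fits after shrinking.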

11 years agoscavengeTSO might encounter a ThreadRelocated; cope
simonmar@microsoft.com [Thu, 28 Feb 2008 15:24:03 +0000 (15:24 +0000)] 
scavengeTSO might encounter a ThreadRelocated; cope

11 years agoUpdating a thunk in raiseAsync might encounter an IND; cope
simonmar@microsoft.com [Thu, 28 Feb 2008 15:23:32 +0000 (15:23 +0000)] 
Updating a thunk in raiseAsync might encounter an IND; cope
There was already a check to avoid updating an IND, but it was
originally there to avoid a bug which doesn't exist now.  Furthermore
the test and update are not atomic, so another thread could be
updating this thunk while we are.  We have to just go ahead and update
anyway - it might waste a little work, but this is a very rare case.

11 years agoadd GC(0) and GC(1) time
Simon Marlow [Fri, 22 Feb 2008 14:20:08 +0000 (14:20 +0000)] 
add GC(0) and GC(1) time

11 years agoround_to_mblocks: should use StgWord not nat
Simon Marlow [Wed, 20 Feb 2008 13:01:39 +0000 (13:01 +0000)] 
round_to_mblocks: should use StgWord not nat

11 years agodebugging code
Simon Marlow [Tue, 19 Feb 2008 10:26:51 +0000 (10:26 +0000)] 
debugging code

11 years agorefactoring
simonmar@microsoft.com [Mon, 18 Feb 2008 13:54:58 +0000 (13:54 +0000)] 
refactoring

11 years agofix off-by-one
simonmar@microsoft.com [Fri, 15 Feb 2008 13:40:17 +0000 (13:40 +0000)] 
fix off-by-one

11 years agomeasure mut_elapsed_time
simonmar@microsoft.com [Fri, 15 Feb 2008 13:38:50 +0000 (13:38 +0000)] 
measure mut_elapsed_time

11 years agofix build with 6.8
simonmar@microsoft.com [Fri, 15 Feb 2008 13:38:36 +0000 (13:38 +0000)] 
fix build with 6.8

11 years agoadd ROUNDUP_BYTES_TO_WDS
simonmar@microsoft.com [Fri, 15 Feb 2008 13:30:40 +0000 (13:30 +0000)] 
add ROUNDUP_BYTES_TO_WDS

11 years agoAllow +RTS -H0 as a way to override a previous -H<size>
simonmar@microsoft.com [Thu, 31 Jan 2008 15:36:45 +0000 (15:36 +0000)] 
Allow +RTS -H0 as a way to override a previous -H<size>

11 years agocomment out a bogus assertion
simonmar@microsoft.com [Wed, 30 Jan 2008 15:09:34 +0000 (15:09 +0000)] 
comment out a bogus assertion

11 years agomemInventory: optionally dump the memory inventory
simonmar@microsoft.com [Wed, 30 Jan 2008 15:09:21 +0000 (15:09 +0000)] 
memInventory: optionally dump the memory inventory
in addition to checking for leaks

11 years agocalcNeeded: fix the calculation, we weren't counting G0 step 1
simonmar@microsoft.com [Wed, 30 Jan 2008 15:07:30 +0000 (15:07 +0000)] 
calcNeeded: fix the calculation, we weren't counting G0 step 1

11 years agocalcNeeded: add in the large blocks too
simonmar@microsoft.com [Wed, 30 Jan 2008 13:54:18 +0000 (13:54 +0000)] 
calcNeeded: add in the large blocks too

11 years agoupdate a comment
Simon Marlow [Wed, 30 Jan 2008 10:15:04 +0000 (10:15 +0000)] 
update a comment

11 years agotell Emacs these files are C
simonmar@microsoft.com [Wed, 30 Jan 2008 10:00:47 +0000 (10:00 +0000)] 
tell Emacs these files are C

11 years agofix an assertion
Simon Marlow [Fri, 18 Jan 2008 16:09:10 +0000 (16:09 +0000)] 
fix an assertion

11 years agocut-and-pasto
Simon Marlow [Wed, 16 Jan 2008 10:37:51 +0000 (10:37 +0000)] 
cut-and-pasto

11 years agosmall rearrangement
simonmar@microsoft.com [Tue, 15 Jan 2008 09:57:36 +0000 (09:57 +0000)] 
small rearrangement

11 years agorecordMutableGen_GC: we must call the spinlocked version of allocBlock()
Simon Marlow [Fri, 11 Jan 2008 13:54:53 +0000 (13:54 +0000)] 
recordMutableGen_GC: we must call the spinlocked version of allocBlock()

11 years agoremove unused declaration
simonmar@microsoft.com [Fri, 11 Jan 2008 10:58:21 +0000 (10:58 +0000)] 
remove unused declaration

11 years agomore fixes for THUNK_SELECTORs
Simon Marlow [Thu, 10 Jan 2008 12:28:20 +0000 (12:28 +0000)] 
more fixes for THUNK_SELECTORs

11 years agoFix bug in eval_thunk_selector()
simonmar@microsoft.com [Thu, 10 Jan 2008 10:56:28 +0000 (10:56 +0000)] 
Fix bug in eval_thunk_selector()

11 years agomove markSparkQueue into GC.c, as it needs the register variable defined
Simon Marlow [Wed, 9 Jan 2008 16:28:28 +0000 (16:28 +0000)] 
move markSparkQueue into GC.c, as it needs the register variable defined

11 years agoWindows fix
Simon Marlow [Wed, 9 Jan 2008 16:27:32 +0000 (16:27 +0000)] 
Windows fix

11 years agoFix bug: eval_thunk_selector was calling the unlocked evacuate()
Simon Marlow [Wed, 9 Jan 2008 14:49:37 +0000 (14:49 +0000)] 
Fix bug: eval_thunk_selector was calling the unlocked evacuate()

11 years agoadd GC elapsed time
simonmar@microsoft.com [Mon, 7 Jan 2008 13:48:38 +0000 (13:48 +0000)] 
add GC elapsed time

11 years agoupdate to match Mb -> MB change in -s output
simonmar@microsoft.com [Thu, 20 Dec 2007 14:58:55 +0000 (14:58 +0000)] 
update to match Mb -> MB change in -s output

11 years agouse "MB" rather than "Mb" for abbreviating megabytes
simonmar@microsoft.com [Tue, 18 Dec 2007 14:51:35 +0000 (14:51 +0000)] 
use "MB" rather than "Mb" for abbreviating megabytes

11 years agofindSlop: useful function for tracking down excessive slop in gdb
simonmar@microsoft.com [Fri, 14 Dec 2007 13:59:09 +0000 (13:59 +0000)] 
findSlop: useful function for tracking down excessive slop in gdb

11 years agocalculate wastage due to unused memory at the end of each block
simonmar@microsoft.com [Fri, 14 Dec 2007 13:58:42 +0000 (13:58 +0000)] 
calculate wastage due to unused memory at the end of each block

11 years agobugfix: check for NULL before testing isPartiallyFull(stp->blocks)
simonmar@microsoft.com [Fri, 14 Dec 2007 10:32:23 +0000 (10:32 +0000)] 
bugfix: check for NULL before testing isPartiallyFull(stp->blocks)

11 years agohave each GC thread call GetRoots()
simonmar@microsoft.com [Thu, 13 Dec 2007 16:50:13 +0000 (16:50 +0000)] 
have each GC thread call GetRoots()

11 years agouse synchronised version of freeChain() in scavenge_mutable_list()
simonmar@microsoft.com [Thu, 13 Dec 2007 16:45:25 +0000 (16:45 +0000)] 
use synchronised version of freeChain() in scavenge_mutable_list()

11 years agoremove declarations for variables that no longer exist
simonmar@microsoft.com [Thu, 13 Dec 2007 15:09:46 +0000 (15:09 +0000)] 
remove declarations for variables that no longer exist

11 years agoremove old comment
simonmar@microsoft.com [Wed, 12 Dec 2007 16:33:29 +0000 (16:33 +0000)] 
remove old comment

11 years agoGC: small improvement to parallelism
simonmar@microsoft.com [Thu, 29 Nov 2007 15:49:27 +0000 (15:49 +0000)] 
GC: small improvement to parallelism
don't cache a work block locally if the global queue is empty

11 years agoEVACUATED: target is definitely HEAP_ALLOCED(), no need to check
simonmar@microsoft.com [Thu, 29 Nov 2007 12:00:21 +0000 (12:00 +0000)] 
EVACUATED: target is definitely HEAP_ALLOCED(), no need to check

11 years agoin scavenge_block(), keep going if we're scanning the todo block
simonmar@microsoft.com [Tue, 27 Nov 2007 16:07:47 +0000 (16:07 +0000)] 
in scavenge_block(), keep going if we're scanning the todo block

11 years agocount the number of todo blocks, and add a trace
simonmar@microsoft.com [Tue, 27 Nov 2007 16:07:17 +0000 (16:07 +0000)] 
count the number of todo blocks, and add a trace

11 years agooops, restore accidentally disabled hash-consing for Char
simonmar@microsoft.com [Fri, 23 Nov 2007 16:25:22 +0000 (16:25 +0000)] 
oops, restore accidentally disabled hash-consing for Char

11 years agokill the PAR/GRAN debug flags
simonmar@microsoft.com [Thu, 22 Nov 2007 12:23:27 +0000 (12:23 +0000)] 
kill the PAR/GRAN debug flags

11 years agostats: print elapsed time for GC in each generation
simonmar@microsoft.com [Thu, 22 Nov 2007 10:50:24 +0000 (10:50 +0000)] 
stats: print elapsed time for GC in each generation

11 years agoassertion fix
simonmar@microsoft.com [Wed, 21 Nov 2007 16:47:36 +0000 (16:47 +0000)] 
assertion fix

11 years agocache bd->todo_bd->free and the limit in the workspace
Simon Marlow [Wed, 21 Nov 2007 15:58:51 +0000 (15:58 +0000)] 
cache bd->todo_bd->free and the limit in the workspace
avoids cache contention: bd->todo_bd->free may clash with any cache
line, so we localise it.

11 years agowarning fix
simonmar@microsoft.com [Wed, 21 Nov 2007 16:47:47 +0000 (16:47 +0000)] 
warning fix

11 years agofix boundary bugs in a couple of for-loops
simonmar@microsoft.com [Tue, 20 Nov 2007 13:38:35 +0000 (13:38 +0000)] 
fix boundary bugs in a couple of for-loops

11 years agoimprovements to PAPI support
simonmar@microsoft.com [Tue, 20 Nov 2007 13:36:35 +0000 (13:36 +0000)] 
improvements to PAPI support
- major (multithreaded) GC is measured separately from minor GC
- events to measure can now be specified on the command line, e.g.
     prog +RTS -a+PAPI_TOT_CYC

11 years agouse SRC_CC_OPTS rather than SRC_HC_OPTS for C options
simonmar@microsoft.com [Mon, 19 Nov 2007 11:16:30 +0000 (11:16 +0000)] 
use SRC_CC_OPTS rather than SRC_HC_OPTS for C options

12 years agoallow PAPI to be installed somewhere non-standard
Simon Marlow [Thu, 1 Nov 2007 15:03:25 +0000 (15:03 +0000)] 
allow PAPI to be installed somewhere non-standard

12 years agofix warnings
Simon Marlow [Thu, 1 Nov 2007 15:02:58 +0000 (15:02 +0000)] 
fix warnings

12 years agofix a warning
Simon Marlow [Thu, 1 Nov 2007 15:02:28 +0000 (15:02 +0000)] 
fix a warning

12 years agofix a warning
Simon Marlow [Thu, 1 Nov 2007 15:02:00 +0000 (15:02 +0000)] 
fix a warning

12 years agorename n_threads to n_gc_threads
Simon Marlow [Wed, 31 Oct 2007 16:31:47 +0000 (16:31 +0000)] 
rename n_threads to n_gc_threads

12 years agoRefactor PAPI support, and add profiling of multithreaded GC
Simon Marlow [Wed, 31 Oct 2007 16:30:15 +0000 (16:30 +0000)] 
Refactor PAPI support, and add profiling of multithreaded GC

12 years agofix merge errors
Simon Marlow [Wed, 31 Oct 2007 15:38:39 +0000 (15:38 +0000)] 
fix merge errors

12 years agorefactoring of eager_promotion in scavenge_block()
Simon Marlow [Wed, 31 Oct 2007 15:34:17 +0000 (15:34 +0000)] 
refactoring of eager_promotion in scavenge_block()

12 years agocompile special minor GC versions of evacuate() and scavenge_block()
Simon Marlow [Wed, 31 Oct 2007 15:33:39 +0000 (15:33 +0000)] 
compile special minor GC versions of evacuate() and scavenge_block()

This is for two reasons: minor GCs don't need to do per-object locking
for parallel GC, which is fairly expensive, and secondly minor GCs
don't need to follow SRTs.

12 years agofixes for eval_thunk_selector() in parallel GC
Simon Marlow [Wed, 31 Oct 2007 15:32:52 +0000 (15:32 +0000)] 
fixes for eval_thunk_selector() in parallel GC

12 years agoRemove the optimisation of avoiding scavenging for certain objects
Simon Marlow [Wed, 31 Oct 2007 14:45:42 +0000 (14:45 +0000)] 
Remove the optimisation of avoiding scavenging for certain objects

Some objects don't need to be scavenged, in particular if they have no
pointers.  This seems like an obvious optimisation, but in fact it
only accounts for about 1% of objects (in GHC, for example), and the
extra complication means it probably isn't worth doing.

12 years agoGC refactoring: change evac_gen to evac_step
Simon Marlow [Wed, 31 Oct 2007 14:42:30 +0000 (14:42 +0000)] 
GC refactoring: change evac_gen to evac_step

By establishing an ordering on step pointers, we can simplify the test
  (stp->gen_no < evac_gen)
to
  (stp < evac_step)
which is common in evacuate().
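
A small sketch of the idea, assuming all steps live in one array sorted by generation so that pointer order matches generation order. The `step` layout, the array and the helper names here are illustrative only. In this model the two tests agree when `evac_step` points at the first step of generation `evac_gen`.

```c
#include <assert.h>
#include <stddef.h>

#define N_GENS  2                /* hypothetical sizes for the sketch */
#define N_STEPS 2

typedef struct step_ {
    unsigned gen_no;             /* generation this step belongs to */
    unsigned no;                 /* index of the step within its generation */
} step;

/* All steps in a single array, ordered by (generation, step), so
 * comparing two step pointers compares their position in that order. */
static step all_steps[N_GENS * N_STEPS];

static void init_steps(void)
{
    for (unsigned g = 0; g < N_GENS; g++) {
        for (unsigned s = 0; s < N_STEPS; s++) {
            step *stp = &all_steps[g * N_STEPS + s];
            stp->gen_no = g;
            stp->no = s;
        }
    }
}

/* Old test: load gen_no through the step pointer, then compare. */
static int too_young_old(const step *stp, unsigned evac_gen)
{
    return stp->gen_no < evac_gen;
}

/* New test: compare the pointers directly, no memory load needed. */
static int too_young_new(const step *stp, const step *evac_step)
{
    return stp < evac_step;
}
```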

12 years agoGC refactoring: make evacuate() take an StgClosure**
Simon Marlow [Wed, 31 Oct 2007 14:36:34 +0000 (14:36 +0000)] 
GC refactoring: make evacuate() take an StgClosure**

Change the type of evacuate() from
  StgClosure *evacuate(StgClosure *);
to
  void evacuate(StgClosure **);

So evacuate() itself writes the source pointer, rather than the
caller.  This is slightly cleaner, and avoids a few memory writes:
sometimes evacuate() doesn't move the object, and in these cases the
source pointer doesn't need to be written.  It doesn't have a
measurable impact on performance, though.
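
To illustrate the difference in calling convention, here is a toy model. The `StgClosure` below is a hypothetical stand-in with a `forward` field that says where the object would be moved (NULL meaning it stays put); it is not the RTS definition, and the real evacuate() does far more work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a heap object in this sketch. */
typedef struct StgClosure_ {
    struct StgClosure_ *forward; /* destination, or NULL if not moved */
    int payload;
} StgClosure;

/* Old style: return the (possibly unchanged) address. The caller then
 * writes it back unconditionally, as in: *field = evacuate_old(*field); */
static StgClosure *evacuate_old(StgClosure *q)
{
    return q->forward != NULL ? q->forward : q;
}

/* New style: evacuate() receives the address of the source pointer and
 * updates it itself, and only when the object actually moved. */
static void evacuate_new(StgClosure **p)
{
    StgClosure *q = *p;
    if (q->forward != NULL) {
        *p = q->forward;
    }
    /* otherwise *p is left untouched, saving a memory write */
}
```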

12 years agotiny optimisation in evacuate()
Simon Marlow [Wed, 31 Oct 2007 13:09:35 +0000 (13:09 +0000)] 
tiny optimisation in evacuate()

12 years agoInitial parallel GC support
Simon Marlow [Wed, 31 Oct 2007 13:07:18 +0000 (13:07 +0000)] 
Initial parallel GC support

e.g. use +RTS -g2 -RTS for 2 threads.  Only major GCs are parallelised,
minor GCs are still sequential. Don't use more threads than you
have CPUs.

It works most of the time, although you won't see much speedup yet.
Tuning and more work on stability still required.

12 years agoRefactoring of the GC in preparation for parallel GC
Simon Marlow [Wed, 31 Oct 2007 12:51:36 +0000 (12:51 +0000)] 
Refactoring of the GC in preparation for parallel GC

This patch localises the state of the GC into a gc_thread structure,
and reorganises the inner loop of the GC to scavenge one block at a
time from global work lists in each "step".  The gc_thread structure
has a "workspace" for each step, in which it collects evacuated
objects until it has a full block to push out to the step's global
list.  Details of the algorithm will be on the wiki in due course.

At the moment, THREADED_RTS does not compile, but the single-threaded
GC works (and is 10-20% slower than before).
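
The workspace idea described above can be sketched as follows: each GC thread fills a private block and only touches the step's shared list once that block is full. All types, names and sizes here are hypothetical and heavily simplified (fixed arrays, no locking, a tiny block size); the wiki write-up mentioned in the commit is the place to look for the actual algorithm.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_WORDS 4            /* tiny hypothetical block size */
#define MAX_BLOCKS  8            /* capacity of the toy global list */

typedef struct {
    size_t words[BLOCK_WORDS];
    size_t used;                 /* words filled so far */
} Block;

/* Hypothetical global work list of one step, shared by all GC threads. */
typedef struct {
    Block full[MAX_BLOCKS];
    size_t n_full;
} StepWorkList;

/* Hypothetical per-thread, per-step workspace. */
typedef struct {
    Block todo;                  /* block currently being filled */
    StepWorkList *global;        /* where full blocks are pushed */
} Workspace;

static void ws_init(Workspace *ws, StepWorkList *global)
{
    global->n_full = 0;
    ws->todo.used = 0;
    ws->global = global;
}

/* Push the full private block out to the step's global list. */
static void ws_push_out(Workspace *ws)
{
    assert(ws->global->n_full < MAX_BLOCKS);
    ws->global->full[ws->global->n_full++] = ws->todo;
    ws->todo.used = 0;
}

/* Record one evacuated word; shared state is touched only on overflow. */
static void ws_add(Workspace *ws, size_t word)
{
    ws->todo.words[ws->todo.used++] = word;
    if (ws->todo.used == BLOCK_WORDS) {
        ws_push_out(ws);
    }
}
```

In a multi-threaded collector the push in `ws_push_out` is the only step that would need synchronisation, which is the point of batching work into whole blocks.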

12 years agoalso count total dispatch stalls in +RTS -as
Simon Marlow [Tue, 30 Oct 2007 14:45:09 +0000 (14:45 +0000)] 
also count total dispatch stalls in +RTS -as

12 years agomove GetRoots() to GC.c
Simon Marlow [Tue, 30 Oct 2007 13:00:52 +0000 (13:00 +0000)] 
move GetRoots() to GC.c