Another improvement to SetLevels
author Simon Peyton Jones <simonpj@microsoft.com>
Fri, 23 Dec 2016 14:17:42 +0000 (14:17 +0000)
committer Simon Peyton Jones <simonpj@microsoft.com>
Tue, 7 Feb 2017 14:00:46 +0000 (14:00 +0000)
commit b8f58d79ee3e34840beeea2fab846a9f47bff21a
tree c08e4d82f325e9f32c9e71c9a5b783c1e74aaa19
parent f77e99bbaac1b9ef7c47ed7fd750f0105f7fc28b

In my recent commit
   commit 432f952ef64641be9f32152a0fbf2b8496d8fe9c
   Float unboxed expressions by boxing
I changed how float_me in lvlMFE worked.  That was right, but
it exposed another bug: an error expression wasn't getting floated
as it should from a case alternative.  And that led to a collection
of minor improvements:

* I found a much better way to cast it, by using lvlFloatRhs for
  top-level bindings as well as nested ones, which is
    (a) more consistent and
    (b) works correctly.

  See Note [Floating from a RHS]

* I also found some delicacy in the "floating to the top" stuff, so I
  greatly elaborated the Note [Floating to the top].

* I simplified the "bottoming-float" stuff; the change is in the treatment
  of bottoming lambdas (\x y. error blah), where we now float the
  (error blah) part instead of the whole lambda (which risks just making
  duplicate lambdas).  See Note [Bottoming floats], esp (2).
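
To make the bottoming-lambda change concrete, here is a hand-written,
source-level sketch of the shape described in the last bullet.  It is not
actual SetLevels output: the names applyOr, applyOr', lvl and outcome, and
the error message, are all hypothetical and chosen only for illustration.
The first definition has a bottoming lambda in a case alternative; the
second shows roughly what floating only the (error blah) part looks like,
with the lambda left in place.

```haskell
module Main where

import Control.Exception (ErrorCall (..), evaluate, try)

-- Before floating: the Nothing alternative is a bottoming lambda.
-- (applyOr and its error message are hypothetical, for illustration only.)
applyOr :: Maybe (Int -> Int -> Int) -> Int -> Int -> Int
applyOr mf = case mf of
  Just f  -> f
  Nothing -> \_x _y -> error "applyOr: no function"

-- After floating, written out by hand: only the (error ...) call becomes a
-- top-level binding; the lambda itself stays where it was.
lvl :: Int
lvl = error "applyOr: no function"
{-# NOINLINE lvl #-}

applyOr' :: Maybe (Int -> Int -> Int) -> Int -> Int -> Int
applyOr' mf = case mf of
  Just f  -> f
  Nothing -> \_x _y -> lvl

-- Evaluate an Int and report either its value or the ErrorCall message.
outcome :: Int -> IO (Either String Int)
outcome n = do
  r <- try (evaluate n)
  pure (either (\(ErrorCall msg) -> Left msg) Right r)

main :: IO ()
main = do
  -- Error path: both versions should fail with the same message.
  a <- outcome (applyOr  Nothing 1 2)
  b <- outcome (applyOr' Nothing 1 2)
  print (a == b)
  -- Normal path: both versions should compute the same value.
  print (applyOr (Just (+)) 1 2 == applyOr' (Just (+)) 1 2)
```

The two versions are meant to be observably equivalent; the difference is
only in what gets shared at top level.  Floating the whole lambda instead
would create a second top-level function that merely wraps the error call,
which is the duplication the bullet warns about.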

Perf effects are minor.

* perf/compiler/T13056 improved slightly (about 2%) in compiler
  allocations. Also T9233 improved by 1%.  I'm not sure why.

* Some small nofib changes:
    - Generally some very small reductions in run-time
      allocation, except k-nucleotide, whose allocation halves for
      some reason.  (I did try to look but it's a big complicated
      function and it was far from obvious.  Had it been a loss
      I would have looked harder!)

NB: there's a nearby patch "Do not inline bottoming things" that could
also be responsible for either or both of these perf changes.  I didn't
think it was worth more testing to distinguish.

--------------------------------------------------------------------------------
        Program           Size    Allocs   Runtime   Elapsed  TotalMem
--------------------------------------------------------------------------------
           grep          +0.1%     -0.2%      0.00      0.00     +0.0%
         mandel          -0.1%     -1.4%      0.13      0.13     +0.0%
   k-nucleotide          +0.1%    -51.6%     -1.0%     -1.0%     +0.0%
--------------------------------------------------------------------------------
            Min          -0.3%    -51.6%     -9.4%     -9.1%     -4.0%
            Max          +0.2%     +0.0%    +31.8%    +32.7%     +0.0%
 Geometric Mean          -0.0%     -0.8%     +1.4%     +1.4%     -0.1%
compiler/coreSyn/CoreUtils.hs
compiler/simplCore/SetLevels.hs
compiler/simplCore/Simplify.hs
testsuite/tests/perf/compiler/all.T