Simplify callSiteInline a little
authorSimon Peyton Jones <simonpj@microsoft.com>
Fri, 4 May 2018 14:47:31 +0000 (15:47 +0100)
committerBen Gamari <ben@smart-cactus.org>
Tue, 21 Aug 2018 22:52:42 +0000 (18:52 -0400)
This patch has virtually no effect on anything (according to a
nofib run).  But it simplifies the definition of interesting_call
by being a bit less gung-ho about inlining nested function
bindings.  See Note [Nested functions]

-----------------------------------------------------------------------
        Program           Size    Allocs   Runtime   Elapsed  TotalMem
-----------------------------------------------------------------------
           anna          +0.2%     -0.0%     0.163     0.163      0.0%
   binary-trees          +0.1%     +0.0%     -4.5%     -4.5%      0.0%
      cacheprof          -0.1%     +0.1%     -4.7%     -4.8%     +2.7%
          fasta          +0.2%      0.0%     +2.6%     +3.0%      0.0%
          fluid          -0.0%     -0.6%     0.011     0.011      0.0%
         gamteb          -0.1%     -0.0%     0.069     0.070      0.0%
            hpg          +0.1%     +0.0%     +0.7%     +0.7%      0.0%
          infer          +0.3%     +0.2%     0.097     0.098      0.0%
         lambda          -0.1%     -0.0%     +2.0%     +2.0%      0.0%
         n-body          +0.1%     -0.1%     -0.1%     -0.1%      0.0%
         simple          -0.2%     -0.2%     +0.6%     +0.6%      0.0%
  spectral-norm          +0.1%     -0.0%     -0.1%     -0.1%      0.0%
            tak          -0.0%     -0.1%     0.024     0.024      0.0%
--------------------------------------------------------------------------------
            Min          -0.4%     -0.6%     -5.3%     -5.3%      0.0%
            Max          +0.3%     +0.2%     +3.3%     +3.3%    +15.0%
 Geometric Mean          -0.0%     -0.0%     -0.3%     -0.3%     +0.2%

(cherry picked from commit 33de71fa06d03e6da396a7c0a314fea3b492ab91)

(This reverts the previous reversion in commit
9dbf66d74e65309d02c9d700094e363f59c94096)

compiler/coreSyn/CoreUnfold.hs

index 7bd512d..68e7290 100644 (file)
@@ -1153,11 +1153,11 @@ callSiteInline dflags id active_unfolding lone_variable arg_infos cont_info
       -- idUnfolding checks for loop-breakers, returning NoUnfolding
       -- Things with an INLINE pragma may have an unfolding *and*
       -- be a loop breaker  (maybe the knot is not yet untied)
-        CoreUnfolding { uf_tmpl = unf_template, uf_is_top = is_top
+        CoreUnfolding { uf_tmpl = unf_template
                       , uf_is_work_free = is_wf
                       , uf_guidance = guidance, uf_expandable = is_exp }
           | active_unfolding -> tryUnfolding dflags id lone_variable
-                                    arg_infos cont_info unf_template is_top
+                                    arg_infos cont_info unf_template
                                     is_wf is_exp guidance
           | otherwise -> traceInline dflags id "Inactive unfolding:" (ppr id) Nothing
         NoUnfolding      -> Nothing
@@ -1177,10 +1177,10 @@ traceInline dflags inline_id str doc result
  = result
 
 tryUnfolding :: DynFlags -> Id -> Bool -> [ArgSummary] -> CallCtxt
-             -> CoreExpr -> Bool -> Bool -> Bool -> UnfoldingGuidance
+             -> CoreExpr -> Bool -> Bool -> UnfoldingGuidance
              -> Maybe CoreExpr
 tryUnfolding dflags id lone_variable
-             arg_infos cont_info unf_template is_top
+             arg_infos cont_info unf_template
              is_wf is_exp guidance
  = case guidance of
      UnfNever -> traceInline dflags id str (text "UnfNever") Nothing
@@ -1252,10 +1252,10 @@ tryUnfolding dflags id lone_variable
               CaseCtxt   -> not (lone_variable && is_exp)  -- Note [Lone variables]
               ValAppCtxt -> True                           -- Note [Cast then apply]
               RuleArgCtxt -> uf_arity > 0  -- See Note [Unfold info lazy contexts]
-              DiscArgCtxt -> uf_arity > 0  --
+              DiscArgCtxt -> uf_arity > 0  -- Note [Inlining in ArgCtxt]
               RhsCtxt     -> uf_arity > 0  --
-              _           -> not is_top && uf_arity > 0   -- Note [Nested functions]
-                                                      -- Note [Inlining in ArgCtxt]
+              _other      -> False         -- See Note [Nested functions]
+
 
 {-
 Note [Unfold into lazy contexts], Note [RHS of lets]
@@ -1325,18 +1325,17 @@ However for worker/wrapper it may be worth inlining even if the
 arity is not satisfied (as we do in the CoreUnfolding case) so we don't
 require saturation.
 
-
 Note [Nested functions]
 ~~~~~~~~~~~~~~~~~~~~~~~
-If a function has a nested defn we also record some-benefit, on the
-grounds that we are often able to eliminate the binding, and hence the
-allocation, for the function altogether; this is good for join points.
-But this only makes sense for *functions*; inlining a constructor
-doesn't help allocation unless the result is scrutinised.  UNLESS the
-constructor occurs just once, albeit possibly in multiple case
-branches.  Then inlining it doesn't increase allocation, but it does
-increase the chance that the constructor won't be allocated at all in
-the branches that don't use it.
+At one time we treated a call of a non-top-level function as
+"interesting" (regardless of how boring the context) in the hope
+that inlining it would eliminate the binding, and its allocation.
+Specifically, in the default case of interesting_call we had
+   _other -> not is_top && uf_arity > 0
+
+But actually postInlineUnconditionally does some of this and overall
+it makes virtually no difference to nofib.  So I simplified away this
+special case
 
 Note [Cast then apply]
 ~~~~~~~~~~~~~~~~~~~~~~