1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
4 \section{SetLevels}
6 ***************************
7 Overview
8 ***************************
10 1. We attach binding levels to Core bindings, in preparation for floating
11 outwards (@FloatOut@).
13 2. We also let-ify many expressions (notably case scrutinees), so they
14 will have a fighting chance of being floated sensible.
16 3. Note [Need for cloning during float-out]
17 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
18 We clone the binders of any floatable let-binding, so that when it is
19 floated out it will be unique. Example
20 (let x=2 in x) + (let x=3 in x)
21 we must clone before floating so we get
22 let x1=2 in
23 let x2=3 in
24 x1+x2
26 NOTE: this can't be done using the uniqAway idea, because the variable
27 must be unique in the whole program, not just its current scope,
28 because two variables in different scopes may float out to the
29 same top level place
31 NOTE: Very tiresomely, we must apply this substitution to
32 the rules stored inside a variable too.
34 We do *not* clone top-level bindings, because some of them must not change,
35 but we *do* clone bindings that are heading for the top level
37 4. Note [Binder-swap during float-out]
38 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39 In the expression
40 case x of wild { p -> ...wild... }
41 we substitute x for wild in the RHS of the case alternatives:
42 case x of wild { p -> ...x... }
43 This means that a sub-expression involving x is not "trapped" inside the RHS.
44 And it's not inconvenient because we already have a substitution.
46 Note that this is EXACTLY BACKWARDS from the what the simplifier does.
47 The simplifier tries to get rid of occurrences of x, in favour of wild,
48 in the hope that there will only be one remaining occurrence of x, namely
49 the scrutinee of the case, and we can inline it.
50 -}
52 {-# LANGUAGE CPP, MultiWayIf #-}
53 module SetLevels (
54 setLevels,
56 Level(..), LevelType(..), tOP_LEVEL, isJoinCeilLvl, asJoinCeilLvl,
57 LevelledBind, LevelledExpr, LevelledBndr,
58 FloatSpec(..), floatSpecLevel,
60 incMinorLvl, ltMajLvl, ltLvl, isTopLvl
61 ) where
63 #include "HsVersions.h"
65 import CoreSyn
66 import CoreMonad ( FloatOutSwitches(..) )
67 import CoreUtils ( exprType, exprIsHNF
68 , exprOkForSpeculation
69 , exprIsTopLevelBindable
70 , isExprLevPoly
71 , collectMakeStaticArgs
72 )
73 import CoreArity ( exprBotStrictness_maybe )
74 import CoreFVs -- all of it
75 import CoreSubst
76 import MkCore ( sortQuantVars )
78 import Id
79 import IdInfo
80 import Var
81 import VarSet
82 import VarEnv
83 import Literal ( litIsTrivial )
84 import Demand ( StrictSig, isStrictDmd, splitStrictSig, increaseStrictSigArity )
85 import Name ( getOccName, mkSystemVarName )
86 import OccName ( occNameString )
87 import Type ( isUnliftedType, Type, mkLamTypes, splitTyConApp_maybe )
88 import BasicTypes ( Arity, RecFlag(..), isRec )
89 import DataCon ( dataConOrigResTy )
90 import TysWiredIn
91 import UniqSupply
92 import Util
93 import Outputable
94 import FastString
95 import UniqDFM
96 import FV
97 import Data.Maybe
98 import Control.Monad ( zipWithM )
100 {-
101 ************************************************************************
102 * *
103 \subsection{Level numbers}
104 * *
105 ************************************************************************
106 -}
108 type LevelledExpr = TaggedExpr FloatSpec
109 type LevelledBind = TaggedBind FloatSpec
110 type LevelledBndr = TaggedBndr FloatSpec
112 data Level = Level Int -- Level number of enclosing lambdas
113 Int -- Number of big-lambda and/or case expressions and/or
114 -- context boundaries between
115 -- here and the nearest enclosing lambda
116 LevelType -- Binder or join ceiling?
117 data LevelType = BndrLvl | JoinCeilLvl deriving (Eq)
119 data FloatSpec
120 = FloatMe Level -- Float to just inside the binding
121 -- tagged with this level
122 | StayPut Level -- Stay where it is; binding is
123 -- tagged with tihs level
125 floatSpecLevel :: FloatSpec -> Level
126 floatSpecLevel (FloatMe l) = l
127 floatSpecLevel (StayPut l) = l
129 {-
130 The {\em level number} on a (type-)lambda-bound variable is the
131 nesting depth of the (type-)lambda which binds it. The outermost lambda
132 has level 1, so (Level 0 0) means that the variable is bound outside any lambda.
134 On an expression, it's the maximum level number of its free
135 (type-)variables. On a let(rec)-bound variable, it's the level of its
136 RHS. On a case-bound variable, it's the number of enclosing lambdas.
138 Top-level variables: level~0. Those bound on the RHS of a top-level
139 definition but before'' a lambda; e.g., the \tr{x} in (levels shown
140 as subscripts'')...
141 \begin{verbatim}
142 a_0 = let b_? = ... in
143 x_1 = ... b ... in ...
144 \end{verbatim}
146 The main function @lvlExpr@ carries a context level'' (@le_ctxt_lvl@).
147 That's meant to be the level number of the enclosing binder in the
148 final (floated) program. If the level number of a sub-expression is
149 less than that of the context, then it might be worth let-binding the
150 sub-expression so that it will indeed float.
152 If you can float to level @Level 0 0@ worth doing so because then your
154 context @Level 0 0@.
157 Note [FloatOut inside INLINE]
158 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
159 @InlineCtxt@ very similar to @Level 0 0@, but is used for one purpose:
160 to say "don't float anything out of here". That's exactly what we
161 want for the body of an INLINE, where we don't want to float anything
162 out at all. See notes with lvlMFE below.
164 But, check this out:
166 -- At one time I tried the effect of not float anything out of an InlineMe,
167 -- but it sometimes works badly. For example, consider PrelArr.done. It
168 -- has the form __inline (\d. e)
169 -- where e doesn't mention d. If we float this to
170 -- __inline (let x = e in \d. x)
171 -- things are bad. The inliner doesn't even inline it because it doesn't look
172 -- like a head-normal form. So it seems a lesser evil to let things float.
173 -- In SetLevels we do set the context to (Level 0 0) when we get to an InlineMe
174 -- which discourages floating out.
176 So the conclusion is: don't do any floating at all inside an InlineMe.
177 (In the above example, don't float the {x=e} out of the \d.)
179 One particular case is that of workers: we don't want to float the
180 call to the worker outside the wrapper, otherwise the worker might get
181 inlined into the floated expression, and an importing module won't see
182 the worker at all.
184 Note [Join ceiling]
185 ~~~~~~~~~~~~~~~~~~~
186 Join points can't float very far; too far, and they can't remain join points
187 So, suppose we have:
189 f x = (joinrec j y = ... x ... in jump j x) + 1
191 One may be tempted to float j out to the top of f's RHS, but then the jump
192 would not be a tail call. Thus we keep track of a level called the *join
193 ceiling* past which join points are not allowed to float.
195 The troublesome thing is that, unlike most levels to which something might
196 float, there is not necessarily an identifier to which the join ceiling is
197 attached. Fortunately, if something is to be floated to a join ceiling, it must
198 be dropped at the *nearest* join ceiling. Thus each level is marked as to
199 whether it is a join ceiling, so that FloatOut can tell which binders are being
200 floated to the nearest join ceiling and which to a particular binder (or set of
201 binders).
202 -}
204 instance Outputable FloatSpec where
205 ppr (FloatMe l) = char 'F' <> ppr l
206 ppr (StayPut l) = ppr l
208 tOP_LEVEL :: Level
209 tOP_LEVEL = Level 0 0 BndrLvl
211 incMajorLvl :: Level -> Level
212 incMajorLvl (Level major _ _) = Level (major + 1) 0 BndrLvl
214 incMinorLvl :: Level -> Level
215 incMinorLvl (Level major minor _) = Level major (minor+1) BndrLvl
217 asJoinCeilLvl :: Level -> Level
218 asJoinCeilLvl (Level major minor _) = Level major minor JoinCeilLvl
220 maxLvl :: Level -> Level -> Level
221 maxLvl l1@(Level maj1 min1 _) l2@(Level maj2 min2 _)
222 | (maj1 > maj2) || (maj1 == maj2 && min1 > min2) = l1
223 | otherwise = l2
225 ltLvl :: Level -> Level -> Bool
226 ltLvl (Level maj1 min1 _) (Level maj2 min2 _)
227 = (maj1 < maj2) || (maj1 == maj2 && min1 < min2)
229 ltMajLvl :: Level -> Level -> Bool
230 -- Tells if one level belongs to a difft *lambda* level to another
231 ltMajLvl (Level maj1 _ _) (Level maj2 _ _) = maj1 < maj2
233 isTopLvl :: Level -> Bool
234 isTopLvl (Level 0 0 _) = True
235 isTopLvl _ = False
237 isJoinCeilLvl :: Level -> Bool
238 isJoinCeilLvl (Level _ _ t) = t == JoinCeilLvl
240 instance Outputable Level where
241 ppr (Level maj min typ)
242 = hcat [ char '<', int maj, char ',', int min, char '>'
243 , ppWhen (typ == JoinCeilLvl) (char 'C') ]
245 instance Eq Level where
246 (Level maj1 min1 _) == (Level maj2 min2 _) = maj1 == maj2 && min1 == min2
248 {-
249 ************************************************************************
250 * *
251 \subsection{Main level-setting code}
252 * *
253 ************************************************************************
254 -}
256 setLevels :: FloatOutSwitches
257 -> CoreProgram
258 -> UniqSupply
259 -> [LevelledBind]
261 setLevels float_lams binds us
262 = initLvl us (do_them init_env binds)
263 where
264 init_env = initialEnv float_lams
266 do_them :: LevelEnv -> [CoreBind] -> LvlM [LevelledBind]
267 do_them _ [] = return []
268 do_them env (b:bs)
269 = do { (lvld_bind, env') <- lvlTopBind env b
270 ; lvld_binds <- do_them env' bs
271 ; return (lvld_bind : lvld_binds) }
273 lvlTopBind :: LevelEnv -> Bind Id -> LvlM (LevelledBind, LevelEnv)
274 lvlTopBind env (NonRec bndr rhs)
275 = do { rhs' <- lvl_top env NonRecursive bndr rhs
276 ; let (env', [bndr']) = substAndLvlBndrs NonRecursive env tOP_LEVEL [bndr]
277 ; return (NonRec bndr' rhs', env') }
279 lvlTopBind env (Rec pairs)
280 = do { let (env', bndrs') = substAndLvlBndrs Recursive env tOP_LEVEL
281 (map fst pairs)
282 ; rhss' <- mapM (\(b,r) -> lvl_top env' Recursive b r) pairs
283 ; return (Rec (bndrs' zip rhss'), env') }
285 lvl_top :: LevelEnv -> RecFlag -> Id -> CoreExpr -> LvlM LevelledExpr
286 lvl_top env is_rec bndr rhs
287 = lvlRhs env is_rec
288 (isBottomingId bndr)
289 Nothing -- Not a join point
290 (freeVars rhs)
292 {-
293 ************************************************************************
294 * *
295 \subsection{Setting expression levels}
296 * *
297 ************************************************************************
299 Note [Floating over-saturated applications]
300 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
301 If we see (f x y), and (f x) is a redex (ie f's arity is 1),
302 we call (f x) an "over-saturated application"
304 Should we float out an over-sat app, if can escape a value lambda?
305 It is sometimes very beneficial (-7% runtime -4% alloc over nofib -O2).
306 But we don't want to do it for class selectors, because the work saved
307 is minimal, and the extra local thunks allocated cost money.
309 Arguably we could float even class-op applications if they were going to
310 top level -- but then they must be applied to a constant dictionary and
311 will almost certainly be optimised away anyway.
312 -}
314 lvlExpr :: LevelEnv -- Context
315 -> CoreExprWithFVs -- Input expression
316 -> LvlM LevelledExpr -- Result expression
318 {-
319 The @le_ctxt_lvl@ is, roughly, the level of the innermost enclosing
320 binder. Here's an example
322 v = \x -> ...\y -> let r = case (..x..) of
323 ..x..
324 in ..
326 When looking at the rhs of @r@, @le_ctxt_lvl@ will be 1 because that's
327 the level of @r@, even though it's inside a level-2 @\y@. It's
328 important that @le_ctxt_lvl@ is 1 and not 2 in @r@'s rhs, because we
329 don't want @lvlExpr@ to turn the scrutinee of the @case@ into an MFE
330 --- because it isn't a *maximal* free expression.
332 If there were another lambda in @r@'s rhs, it would get level-2 as well.
333 -}
335 lvlExpr env (_, AnnType ty) = return (Type (CoreSubst.substTy (le_subst env) ty))
336 lvlExpr env (_, AnnCoercion co) = return (Coercion (substCo (le_subst env) co))
337 lvlExpr env (_, AnnVar v) = return (lookupVar env v)
338 lvlExpr _ (_, AnnLit lit) = return (Lit lit)
340 lvlExpr env (_, AnnCast expr (_, co)) = do
341 expr' <- lvlNonTailExpr env expr
342 return (Cast expr' (substCo (le_subst env) co))
344 lvlExpr env (_, AnnTick tickish expr) = do
345 expr' <- lvlNonTailExpr env expr
346 let tickish' = substTickish (le_subst env) tickish
347 return (Tick tickish' expr')
349 lvlExpr env expr@(_, AnnApp _ _) = lvlApp env expr (collectAnnArgs expr)
351 -- We don't split adjacent lambdas. That is, given
352 -- \x y -> (x+1,y)
353 -- we don't float to give
354 -- \x -> let v = x+1 in \y -> (v,y)
355 -- Why not? Because partial applications are fairly rare, and splitting
356 -- lambdas makes them more expensive.
358 lvlExpr env expr@(_, AnnLam {})
359 = do { new_body <- lvlNonTailMFE new_env True body
360 ; return (mkLams new_bndrs new_body) }
361 where
362 (bndrs, body) = collectAnnBndrs expr
363 (env1, bndrs1) = substBndrsSL NonRecursive env bndrs
364 (new_env, new_bndrs) = lvlLamBndrs env1 (le_ctxt_lvl env) bndrs1
365 -- At one time we called a special verion of collectBinders,
366 -- which ignored coercions, because we don't want to split
367 -- a lambda like this (\x -> coerce t (\s -> ...))
368 -- This used to happen quite a bit in state-transformer programs,
369 -- but not nearly so much now non-recursive newtypes are transparent.
370 -- [See SetLevels rev 1.50 for a version with this approach.]
372 lvlExpr env (_, AnnLet bind body)
373 = do { (bind', new_env) <- lvlBind env bind
374 ; body' <- lvlExpr new_env body
375 -- No point in going via lvlMFE here. If the binding is alive
376 -- (mentioned in body), and the whole let-expression doesn't
377 -- float, then neither will the body
378 ; return (Let bind' body') }
380 lvlExpr env (_, AnnCase scrut case_bndr ty alts)
381 = do { scrut' <- lvlNonTailMFE env True scrut
382 ; lvlCase env (freeVarsOf scrut) scrut' case_bndr ty alts }
384 lvlNonTailExpr :: LevelEnv -- Context
385 -> CoreExprWithFVs -- Input expression
386 -> LvlM LevelledExpr -- Result expression
387 lvlNonTailExpr env expr
388 = lvlExpr (placeJoinCeiling env) expr
390 -------------------------------------------
391 lvlApp :: LevelEnv
392 -> CoreExprWithFVs
393 -> (CoreExprWithFVs, [CoreExprWithFVs]) -- Input application
394 -> LvlM LevelledExpr -- Result expression
395 lvlApp env orig_expr ((_,AnnVar fn), args)
396 | floatOverSat env -- See Note [Floating over-saturated applications]
397 , arity > 0
398 , arity < n_val_args
399 , Nothing <- isClassOpId_maybe fn
400 = do { rargs' <- mapM (lvlNonTailMFE env False) rargs
401 ; lapp' <- lvlNonTailMFE env False lapp
402 ; return (foldl App lapp' rargs') }
404 | otherwise
405 = do { args' <- zipWithM (lvlMFE env) stricts args
406 -- Take account of argument strictness; see
407 -- Note [Floating to the top]
408 ; return (foldl App (lookupVar env fn) args') }
409 where
410 n_val_args = count (isValArg . deAnnotate) args
411 arity = idArity fn
413 stricts :: [Bool] -- True for strict argument
414 stricts = case splitStrictSig (idStrictness fn) of
415 (arg_ds, _) | not (arg_ds lengthExceeds n_val_args)
416 -> map isStrictDmd arg_ds ++ repeat False
417 | otherwise
418 -> repeat False
420 -- Separate out the PAP that we are floating from the extra
421 -- arguments, by traversing the spine until we have collected
422 -- (n_val_args - arity) value arguments.
423 (lapp, rargs) = left (n_val_args - arity) orig_expr []
425 left 0 e rargs = (e, rargs)
426 left n (_, AnnApp f a) rargs
427 | isValArg (deAnnotate a) = left (n-1) f (a:rargs)
428 | otherwise = left n f (a:rargs)
429 left _ _ _ = panic "SetLevels.lvlExpr.left"
431 lvlApp env _ (fun, args)
432 = -- No PAPs that we can float: just carry on with the
433 -- arguments and the function.
434 do { args' <- mapM (lvlNonTailMFE env False) args
435 ; fun' <- lvlNonTailExpr env fun
436 ; return (foldl App fun' args') }
438 -------------------------------------------
439 lvlCase :: LevelEnv -- Level of in-scope names/tyvars
440 -> DVarSet -- Free vars of input scrutinee
441 -> LevelledExpr -- Processed scrutinee
442 -> Id -> Type -- Case binder and result type
443 -> [CoreAltWithFVs] -- Input alternatives
444 -> LvlM LevelledExpr -- Result expression
445 lvlCase env scrut_fvs scrut' case_bndr ty alts
446 | [(con@(DataAlt {}), bs, body)] <- alts
447 , exprOkForSpeculation scrut' -- See Note [Check the output scrutinee for okForSpec]
448 , not (isTopLvl dest_lvl) -- Can't have top-level cases
449 , not (floatTopLvlOnly env) -- Can float anywhere
450 = -- See Note [Floating cases]
451 -- Always float the case if possible
452 -- Unlike lets we don't insist that it escapes a value lambda
453 do { (env1, (case_bndr' : bs')) <- cloneCaseBndrs env dest_lvl (case_bndr : bs)
454 ; let rhs_env = extendCaseBndrEnv env1 case_bndr scrut'
455 ; body' <- lvlMFE rhs_env True body
456 ; let alt' = (con, map (stayPut dest_lvl) bs', body')
457 ; return (Case scrut' (TB case_bndr' (FloatMe dest_lvl)) ty' [alt']) }
459 | otherwise -- Stays put
460 = do { let (alts_env1, [case_bndr']) = substAndLvlBndrs NonRecursive env incd_lvl [case_bndr]
461 alts_env = extendCaseBndrEnv alts_env1 case_bndr scrut'
462 ; alts' <- mapM (lvl_alt alts_env) alts
463 ; return (Case scrut' case_bndr' ty' alts') }
464 where
465 ty' = substTy (le_subst env) ty
467 incd_lvl = incMinorLvl (le_ctxt_lvl env)
468 dest_lvl = maxFvLevel (const True) env scrut_fvs
469 -- Don't abstract over type variables, hence const True
471 lvl_alt alts_env (con, bs, rhs)
472 = do { rhs' <- lvlMFE new_env True rhs
473 ; return (con, bs', rhs') }
474 where
475 (new_env, bs') = substAndLvlBndrs NonRecursive alts_env incd_lvl bs
477 {-
478 Note [Floating cases]
479 ~~~~~~~~~~~~~~~~~~~~~
480 Consider this:
481 data T a = MkT !a
482 f :: T Int -> blah
483 f x vs = case x of { MkT y ->
484 let f vs = ...(case y of I# w -> e)...f..
485 in f vs
486 Here we can float the (case y ...) out, because y is sure
487 to be evaluated, to give
488 f x vs = case x of { MkT y ->
489 caes y of I# w ->
490 let f vs = ...(e)...f..
491 in f vs
493 That saves unboxing it every time round the loop. It's important in
494 some DPH stuff where we really want to avoid that repeated unboxing in
495 the inner loop.
497 Things to note
498 * We can't float a case to top level
499 * It's worth doing this float even if we don't float
500 the case outside a value lambda. Example
501 case x of {
502 MkT y -> (case y of I# w2 -> ..., case y of I# w2 -> ...)
503 If we floated the cases out we could eliminate one of them.
504 * We only do this with a single-alternative case
506 Note [Check the output scrutinee for okForSpec]
507 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
508 Consider this:
509 case x of y {
510 A -> ....(case y of alts)....
511 }
512 Because of the binder-swap, the inner case will get substituted to
513 (case x of ..). So when testing whether the scrutinee is
514 okForSpeculation we must be careful to test the *result* scrutinee ('x'
515 in this case), not the *input* one 'y'. The latter *is* ok for
516 speculation here, but the former is not -- and indeed we can't float
517 the inner case out, at least not unless x is also evaluated at its
518 binding site.
520 That's why we apply exprOkForSpeculation to scrut' and not to scrut.
521 -}
523 lvlNonTailMFE :: LevelEnv -- Level of in-scope names/tyvars
524 -> Bool -- True <=> strict context [body of case
525 -- or let]
526 -> CoreExprWithFVs -- input expression
527 -> LvlM LevelledExpr -- Result expression
528 lvlNonTailMFE env strict_ctxt ann_expr
529 = lvlMFE (placeJoinCeiling env) strict_ctxt ann_expr
531 lvlMFE :: LevelEnv -- Level of in-scope names/tyvars
532 -> Bool -- True <=> strict context [body of case or let]
533 -> CoreExprWithFVs -- input expression
534 -> LvlM LevelledExpr -- Result expression
535 -- lvlMFE is just like lvlExpr, except that it might let-bind
536 -- the expression, so that it can itself be floated.
538 lvlMFE env _ (_, AnnType ty)
539 = return (Type (CoreSubst.substTy (le_subst env) ty))
541 -- No point in floating out an expression wrapped in a coercion or note
542 -- If we do we'll transform lvl = e |> co
543 -- to lvl' = e; lvl = lvl' |> co
544 -- and then inline lvl. Better just to float out the payload.
545 lvlMFE env strict_ctxt (_, AnnTick t e)
546 = do { e' <- lvlMFE env strict_ctxt e
547 ; return (Tick t e') }
549 lvlMFE env strict_ctxt (_, AnnCast e (_, co))
550 = do { e' <- lvlMFE env strict_ctxt e
551 ; return (Cast e' (substCo (le_subst env) co)) }
553 lvlMFE env strict_ctxt e@(_, AnnCase {})
554 | strict_ctxt -- Don't share cases in a strict context
555 = lvlExpr env e -- See Note [Case MFEs]
557 lvlMFE env strict_ctxt ann_expr
558 | floatTopLvlOnly env && not (isTopLvl dest_lvl)
559 -- Only floating to the top level is allowed.
560 || anyDVarSet isJoinId fvs -- If there is a free join, don't float
561 -- See Note [Free join points]
562 || isExprLevPoly expr
563 -- We can't let-bind levity polymorphic expressions
564 -- See Note [Levity polymorphism invariants] in CoreSyn
565 || notWorthFloating expr abs_vars
566 || not float_me
567 = -- Don't float it out
568 lvlExpr env ann_expr
570 | float_is_new_lam || exprIsTopLevelBindable expr expr_ty
571 -- No wrapping needed if the type is lifted, or is a literal string
572 -- or if we are wrapping it in one or more value lambdas
573 = do { expr1 <- lvlFloatRhs abs_vars dest_lvl rhs_env NonRecursive
574 (isJust mb_bot_str)
575 join_arity_maybe
576 ann_expr
577 -- Treat the expr just like a right-hand side
578 ; var <- newLvlVar expr1 join_arity_maybe is_mk_static
579 ; let var2 = annotateBotStr var float_n_lams mb_bot_str
580 ; return (Let (NonRec (TB var2 (FloatMe dest_lvl)) expr1)
581 (mkVarApps (Var var2) abs_vars)) }
583 -- OK, so the float has an unlifted type (not top-level bindable)
584 -- and no new value lambdas (float_is_new_lam is False)
585 -- Try for the boxing strategy
586 -- See Note [Floating MFEs of unlifted type]
587 | escapes_value_lam
588 , not expr_ok_for_spec -- Boxing/unboxing isn't worth it for cheap expressions
589 -- See Note [Test cheapness with exprOkForSpeculation]
590 , Just (tc, _) <- splitTyConApp_maybe expr_ty
591 , Just dc <- boxingDataCon_maybe tc
592 , let dc_res_ty = dataConOrigResTy dc -- No free type variables
593 [bx_bndr, ubx_bndr] = mkTemplateLocals [dc_res_ty, expr_ty]
594 = do { expr1 <- lvlExpr rhs_env ann_expr
595 ; let l1r = incMinorLvlFrom rhs_env
596 float_rhs = mkLams abs_vars_w_lvls $597 Case expr1 (stayPut l1r ubx_bndr) dc_res_ty 598 [(DEFAULT, [], mkConApp dc [Var ubx_bndr])] 600 ; var <- newLvlVar float_rhs Nothing is_mk_static 601 ; let l1u = incMinorLvlFrom env 602 use_expr = Case (mkVarApps (Var var) abs_vars) 603 (stayPut l1u bx_bndr) expr_ty 604 [(DataAlt dc, [stayPut l1u ubx_bndr], Var ubx_bndr)] 605 ; return (Let (NonRec (TB var (FloatMe dest_lvl)) float_rhs) 606 use_expr) } 608 | otherwise -- e.g. do not float unboxed tuples 609 = lvlExpr env ann_expr 611 where 612 expr = deAnnotate ann_expr 613 expr_ty = exprType expr 614 fvs = freeVarsOf ann_expr 615 is_bot = isBottomThunk mb_bot_str 616 is_function = isFunction ann_expr 617 mb_bot_str = exprBotStrictness_maybe expr 618 -- See Note [Bottoming floats] 619 -- esp Bottoming floats (2) 620 expr_ok_for_spec = exprOkForSpeculation expr 621 dest_lvl = destLevel env fvs is_function is_bot False 622 abs_vars = abstractVars dest_lvl env fvs 624 -- float_is_new_lam: the floated thing will be a new value lambda 625 -- replacing, say (g (x+4)) by (lvl x). No work is saved, nor is 626 -- allocation saved. The benefit is to get it to the top level 627 -- and hence out of the body of this function altogether, making 628 -- it smaller and more inlinable 629 float_is_new_lam = float_n_lams > 0 630 float_n_lams = count isId abs_vars 632 (rhs_env, abs_vars_w_lvls) = lvlLamBndrs env dest_lvl abs_vars 634 join_arity_maybe = Nothing 636 is_mk_static = isJust (collectMakeStaticArgs expr) 637 -- Yuk: See Note [Grand plan for static forms] in main/StaticPtrTable 639 -- A decision to float entails let-binding this thing, and we only do 640 -- that if we'll escape a value lambda, or will go to the top level. 641 float_me = saves_work || saves_alloc || is_mk_static 643 -- We can save work if we can move a redex outside a value lambda 644 -- But if float_is_new_lam is True, then the redex is wrapped in a 645 -- a new lambda, so no work is saved 646 saves_work = escapes_value_lam && not float_is_new_lam 648 escapes_value_lam = dest_lvl ltMajLvl (le_ctxt_lvl env) 649 -- See Note [Escaping a value lambda] 651 -- See Note [Floating to the top] 652 saves_alloc = isTopLvl dest_lvl 653 && floatConsts env 654 && (not strict_ctxt || is_bot || exprIsHNF expr) 656 isBottomThunk :: Maybe (Arity, s) -> Bool 657 -- See Note [Bottoming floats] (2) 658 isBottomThunk (Just (0, _)) = True -- Zero arity 659 isBottomThunk _ = False 661 {- Note [Floating to the top] 662 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 663 We are keen to float something to the top level, even if it does not 664 escape a value lambda (and hence save work), for two reasons: 666 * Doing so makes the function smaller, by floating out 667 bottoming expressions, or integer or string literals. That in 668 turn makes it easier to inline, with less duplication. 670 * (Minor) Doing so may turn a dynamic allocation (done by machine 671 instructions) into a static one. Minor because we are assuming 672 we are not escaping a value lambda. 674 But do not so if: 675 - the context is a strict, and 676 - the expression is not a HNF, and 677 - the expression is not bottoming 679 Exammples: 681 * Bottoming 682 f x = case x of 683 0 -> error <big thing> 684 _ -> x+1 685 Here we want to float (error <big thing>) to top level, abstracting 686 over 'x', so as to make f's RHS smaller. 688 * HNF 689 f = case y of 690 True -> p:q 691 False -> blah 692 We may as well float the (p:q) so it becomes a static data structure. 694 * Case scrutinee 695 f = case g True of .... 696 Don't float (g True) to top level; then we have the admin of a 697 top-level thunk to worry about, with zero gain. 699 * Case alternative 700 h = case y of 701 True -> g True 702 False -> False 703 Don't float (g True) to the top level 705 * Arguments 706 t = f (g True) 707 If f is lazy, we /do/ float (g True) because then we can allocate 708 the thunk statically rather than dynamically. But if f is strict 709 we don't (see the use of idStrictness in lvlApp). It's not clear 710 if this test is worth the bother: it's only about CAFs! 712 It's controlled by a flag (floatConsts), because doing this too 713 early loses opportunities for RULES which (needless to say) are 714 important in some nofib programs (gcd is an example). [SPJ note: 715 I think this is obselete; the flag seems always on.] 717 Note [Floating join point bindings] 718 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 719 Mostly we only float a join point if it can /stay/ a join point. But 720 there is one exception: if it can go to the top level (Trac #13286). 721 Consider 722 f x = joinrec j y n = <...j y' n'...> 723 in jump j x 0 725 Here we may just as well produce 726 j y n = <....j y' n'...> 727 f x = j x 0 729 and now there is a chance that 'f' will be inlined at its call sites. 730 It shouldn't make a lot of difference, but thes tests 731 perf/should_run/MethSharing 732 simplCore/should_compile/spec-inline 733 and one nofib program, all improve if you do float to top, because 734 of the resulting inlining of f. So ok, let's do it. 736 Note [Free join points] 737 ~~~~~~~~~~~~~~~~~~~~~~~ 738 We never float a MFE that has a free join-point variable. You mght think 739 this can never occur. After all, consider 740 join j x = ... 741 in ....(jump j x).... 742 How might we ever want to float that (jump j x)? 743 * If it would escape a value lambda, thus 744 join j x = ... in (\y. ...(jump j x)... ) 745 then 'j' isn't a valid join point in the first place. 747 But consider 748 join j x = .... in 749 joinrec j2 y = ...(jump j x)...(a+b).... 751 Since j2 is recursive, it /is/ worth floating (a+b) out of the joinrec. 752 But it is emphatically /not/ good to float the (jump j x) out: 753 (a) 'j' will stop being a join point 754 (b) In any case, jumping to 'j' must be an exit of the j2 loop, so no 755 work would be saved by floating it out of the \y. 757 Even if we floated 'j' to top level, (b) would still hold. 759 Bottom line: never float a MFE that has a free JoinId. 761 Note [Floating MFEs of unlifted type] 762 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 763 Suppose we have 764 case f x of (r::Int#) -> blah 765 we'd like to float (f x). But it's not trivial because it has type 766 Int#, and we don't want to evaluate it too early. But we can instead 767 float a boxed version 768 y = case f x of r -> I# r 769 and replace the original (f x) with 770 case (case y of I# r -> r) of r -> blah 772 Being able to float unboxed expressions is sometimes important; see 773 Trac #12603. I'm not sure how /often/ it is important, but it's 774 not hard to achieve. 776 We only do it for a fixed collection of types for which we have a 777 convenient boxing constructor (see boxingDataCon_maybe). In 778 particular we /don't/ do it for unboxed tuples; it's better to float 779 the components of the tuple individually. 781 I did experiment with a form of boxing that works for any type, namely 782 wrapping in a function. In our example 784 let y = case f x of r -> \v. f x 785 in case y void of r -> blah 787 It works fine, but it's 50% slower (based on some crude benchmarking). 788 I suppose we could do it for types not covered by boxingDataCon_maybe, 789 but it's more code and I'll wait to see if anyone wants it. 791 Note [Test cheapness with exprOkForSpeculation] 792 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 793 We don't want to float very cheap expressions by boxing and unboxing. 794 But we use exprOkForSpeculation for the test, not exprIsCheap. 795 Why? Because it's important /not/ to transform 796 f (a /# 3) 797 to 798 f (case bx of I# a -> a /# 3) 799 and float bx = I# (a /# 3), because the application of f no 800 longer obeys the let/app invariant. But (a /# 3) is ok-for-spec 801 due to a special hack that says division operators can't fail 802 when the denominator is definitely non-zero. And yet that 803 same expression says False to exprIsCheap. Simplest way to 804 guarantee the let/app invariant is to use the same function! 806 If an expression is okay for speculation, we could also float it out 807 *without* boxing and unboxing, since evaluating it early is okay. 808 However, it turned out to usually be better not to float such expressions, 809 since they tend to be extremely cheap things like (x +# 1#). Even the 810 cost of spilling the let-bound variable to the stack across a call may 811 exceed the cost of recomputing such an expression. (And we can't float 812 unlifted bindings to top-level.) 814 We could try to do something smarter here, and float out expensive yet 815 okay-for-speculation things, such as division by non-zero constants. 816 But I suspect it's a narrow target. 818 Note [Bottoming floats] 819 ~~~~~~~~~~~~~~~~~~~~~~~ 820 If we see 821 f = \x. g (error "urk") 822 we'd like to float the call to error, to get 823 lvl = error "urk" 824 f = \x. g lvl 826 But, as ever, we need to be careful: 828 (1) We want to float a bottoming 829 expression even if it has free variables: 830 f = \x. g (let v = h x in error ("urk" ++ v)) 831 Then we'd like to abstract over 'x' can float the whole arg of g: 832 lvl = \x. let v = h x in error ("urk" ++ v) 833 f = \x. g (lvl x) 834 To achieve this we pass is_bot to destLevel 836 (2) We do not do this for lambdas that return 837 bottom. Instead we treat the /body/ of such a function specially, 838 via point (1). For example: 839 f = \x. ....(\y z. if x then error y else error z).... 840 ===> 841 lvl = \x z y. if b then error y else error z 842 f = \x. ...(\y z. lvl x z y)... 843 (There is no guarantee that we'll choose the perfect argument order.) 845 (3) If we have a /binding/ that returns bottom, we want to float it to top 846 level, even if it has free vars (point (1)), and even it has lambdas. 847 Example: 848 ... let { v = \y. error (show x ++ show y) } in ... 849 We want to abstract over x and float the whole thing to top: 850 lvl = \xy. errror (show x ++ show y) 851 ...let {v = lvl x} in ... 853 Then of course we don't want to separately float the body (error ...) 854 as /another/ MFE, so we tell lvlFloatRhs not to do that, via the is_bot 855 argument. 857 See Maessen's paper 1999 "Bottom extraction: factoring error handling out 858 of functional programs" (unpublished I think). 860 When we do this, we set the strictness and arity of the new bottoming 861 Id, *immediately*, for three reasons: 863 * To prevent the abstracted thing being immediately inlined back in again 864 via preInlineUnconditionally. The latter has a test for bottoming Ids 865 to stop inlining them, so we'd better make sure it *is* a bottoming Id! 867 * So that it's properly exposed as such in the interface file, even if 868 this is all happening after strictness analysis. 870 * In case we do CSE with the same expression that *is* marked bottom 871 lvl = error "urk" 872 x{str=bot) = error "urk" 873 Here we don't want to replace 'x' with 'lvl', else we may get Lint 874 errors, e.g. via a case with empty alternatives: (case x of {}) 875 Lint complains unless the scrutinee of such a case is clearly bottom. 877 This was reported in Trac #11290. But since the whole bottoming-float 878 thing is based on the cheap-and-cheerful exprIsBottom, I'm not sure 879 that it'll nail all such cases. 881 Note [Bottoming floats: eta expansion] c.f Note [Bottoming floats] 882 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 883 Tiresomely, though, the simplifier has an invariant that the manifest 884 arity of the RHS should be the same as the arity; but we can't call 885 etaExpand during SetLevels because it works over a decorated form of 886 CoreExpr. So we do the eta expansion later, in FloatOut. 888 Note [Case MFEs] 889 ~~~~~~~~~~~~~~~~ 890 We don't float a case expression as an MFE from a strict context. Why not? 891 Because in doing so we share a tiny bit of computation (the switch) but 892 in exchange we build a thunk, which is bad. This case reduces allocation 893 by 7% in spectral/puzzle (a rather strange benchmark) and 1.2% in real/fem. 894 Doesn't change any other allocation at all. 896 We will make a separate decision for the scrutinees and alternatives. 897 -} 899 annotateBotStr :: Id -> Arity -> Maybe (Arity, StrictSig) -> Id 900 -- See Note [Bottoming floats] for why we want to add 901 -- bottoming information right now 902 -- 903 -- n_extra are the number of extra value arguments added during floating 904 annotateBotStr id n_extra mb_str 905 = case mb_str of 906 Nothing -> id 907 Just (arity, sig) -> id setIdArity (arity + n_extra) 908 setIdStrictness (increaseStrictSigArity n_extra sig) 910 notWorthFloating :: CoreExpr -> [Var] -> Bool 911 -- Returns True if the expression would be replaced by 912 -- something bigger than it is now. For example: 913 -- abs_vars = tvars only: return True if e is trivial, 914 -- but False for anything bigger 915 -- abs_vars = [x] (an Id): return True for trivial, or an application (f x) 916 -- but False for (f x x) 917 -- 918 -- One big goal is that floating should be idempotent. Eg if 919 -- we replace e with (lvl79 x y) and then run FloatOut again, don't want 920 -- to replace (lvl79 x y) with (lvl83 x y)! 922 notWorthFloating e abs_vars 923 = go e (count isId abs_vars) 924 where 925 go (Var {}) n = n >= 0 926 go (Lit lit) n = ASSERT( n==0 ) 927 litIsTrivial lit -- Note [Floating literals] 928 go (Tick t e) n = not (tickishIsCode t) && go e n 929 go (Cast e _) n = go e n 930 go (App e arg) n 931 | Type {} <- arg = go e n 932 | Coercion {} <- arg = go e n 933 | n==0 = False 934 | is_triv arg = go e (n-1) 935 | otherwise = False 936 go _ _ = False 938 is_triv (Lit {}) = True -- Treat all literals as trivial 939 is_triv (Var {}) = True -- (ie not worth floating) 940 is_triv (Cast e _) = is_triv e 941 is_triv (App e (Type {})) = is_triv e 942 is_triv (App e (Coercion {})) = is_triv e 943 is_triv (Tick t e) = not (tickishIsCode t) && is_triv e 944 is_triv _ = False 946 {- 947 Note [Floating literals] 948 ~~~~~~~~~~~~~~~~~~~~~~~~ 949 It's important to float Integer literals, so that they get shared, 950 rather than being allocated every time round the loop. 951 Hence the litIsTrivial. 953 Ditto literal strings (MachStr), which we'd like to float to top 954 level, which is now possible. 957 Note [Escaping a value lambda] 958 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 959 We want to float even cheap expressions out of value lambdas, 960 because that saves allocation. Consider 961 f = \x. .. (\y.e) ... 962 Then we'd like to avoid allocating the (\y.e) every time we call f, 963 (assuming e does not mention x). An example where this really makes a 964 difference is simplrun009. 966 Another reason it's good is because it makes SpecContr fire on functions. 967 Consider 968 f = \x. ....(f (\y.e)).... 969 After floating we get 970 lvl = \y.e 971 f = \x. ....(f lvl)... 972 and that is much easier for SpecConstr to generate a robust 973 specialisation for. 975 However, if we are wrapping the thing in extra value lambdas (in 976 abs_vars), then nothing is saved. E.g. 977 f = \xyz. ...(e1[y],e2).... 978 If we float 979 lvl = \y. (e1[y],e2) 980 f = \xyz. ...(lvl y)... 981 we have saved nothing: one pair will still be allocated for each 982 call of 'f'. Hence the (not float_is_lam) in float_me. 985 ************************************************************************ 986 * * 987 \subsection{Bindings} 988 * * 989 ************************************************************************ 991 The binding stuff works for top level too. 992 -} 994 lvlBind :: LevelEnv 995 -> CoreBindWithFVs 996 -> LvlM (LevelledBind, LevelEnv) 998 lvlBind env (AnnNonRec bndr rhs) 999 | isTyVar bndr -- Don't do anything for TyVar binders 1000 -- (simplifier gets rid of them pronto) 1001 || isCoVar bndr -- Difficult to fix up CoVar occurrences (see extendPolyLvlEnv) 1002 -- so we will ignore this case for now 1003 || not (profitableFloat env dest_lvl) 1004 || (isTopLvl dest_lvl && isUnliftedType (idType bndr)) 1005 -- We can't float an unlifted binding to top level, so we don't 1006 -- float it at all. It's a bit brutal, but unlifted bindings 1007 -- aren't expensive either 1009 = -- No float 1010 do { rhs' <- lvlRhs env NonRecursive is_bot mb_join_arity rhs 1011 ; let bind_lvl = incMinorLvl (le_ctxt_lvl env) 1012 (env', [bndr']) = substAndLvlBndrs NonRecursive env bind_lvl [bndr] 1013 ; return (NonRec bndr' rhs', env') } 1015 -- Otherwise we are going to float 1016 | null abs_vars 1017 = do { -- No type abstraction; clone existing binder 1018 rhs' <- lvlFloatRhs [] dest_lvl env NonRecursive 1019 is_bot mb_join_arity rhs 1020 ; (env', [bndr']) <- cloneLetVars NonRecursive env dest_lvl [bndr] 1021 ; let bndr2 = annotateBotStr bndr' 0 mb_bot_str 1022 ; return (NonRec (TB bndr2 (FloatMe dest_lvl)) rhs', env') } 1024 | otherwise 1025 = do { -- Yes, type abstraction; create a new binder, extend substitution, etc 1026 rhs' <- lvlFloatRhs abs_vars dest_lvl env NonRecursive 1027 is_bot mb_join_arity rhs 1028 ; (env', [bndr']) <- newPolyBndrs dest_lvl env abs_vars [bndr] 1029 ; let bndr2 = annotateBotStr bndr' n_extra mb_bot_str 1030 ; return (NonRec (TB bndr2 (FloatMe dest_lvl)) rhs', env') } 1032 where 1033 rhs_fvs = freeVarsOf rhs 1034 bind_fvs = rhs_fvs unionDVarSet dIdFreeVars bndr 1035 abs_vars = abstractVars dest_lvl env bind_fvs 1036 dest_lvl = destLevel env bind_fvs (isFunction rhs) is_bot is_join 1038 mb_bot_str = exprBotStrictness_maybe (deAnnotate rhs) 1039 is_bot = isJust mb_bot_str 1040 -- NB: not isBottomThunk! See Note [Bottoming floats] point (3) 1042 n_extra = count isId abs_vars 1043 mb_join_arity = isJoinId_maybe bndr 1044 is_join = isJust mb_join_arity 1046 lvlBind env (AnnRec pairs) 1047 | floatTopLvlOnly env && not (isTopLvl dest_lvl) 1048 -- Only floating to the top level is allowed. 1049 || not (profitableFloat env dest_lvl) 1050 = do { let bind_lvl = incMinorLvl (le_ctxt_lvl env) 1051 (env', bndrs') = substAndLvlBndrs Recursive env bind_lvl bndrs 1052 lvl_rhs (b,r) = lvlRhs env' Recursive is_bot (isJoinId_maybe b) r 1053 ; rhss' <- mapM lvl_rhs pairs 1054 ; return (Rec (bndrs' zip rhss'), env') } 1056 | null abs_vars 1057 = do { (new_env, new_bndrs) <- cloneLetVars Recursive env dest_lvl bndrs 1058 ; new_rhss <- mapM (do_rhs new_env) pairs 1059 ; return ( Rec ([TB b (FloatMe dest_lvl) | b <- new_bndrs] zip new_rhss) 1060 , new_env) } 1062 -- ToDo: when enabling the floatLambda stuff, 1063 -- I think we want to stop doing this 1064 | [(bndr,rhs)] <- pairs 1065 , count isId abs_vars > 1 1066 = do -- Special case for self recursion where there are 1067 -- several variables carried around: build a local loop: 1068 -- poly_f = \abs_vars. \lam_vars . letrec f = \lam_vars. rhs in f lam_vars 1069 -- This just makes the closures a bit smaller. If we don't do 1070 -- this, allocation rises significantly on some programs 1071 -- 1072 -- We could elaborate it for the case where there are several 1073 -- mutually functions, but it's quite a bit more complicated 1074 -- 1075 -- This all seems a bit ad hoc -- sigh 1076 let (rhs_env, abs_vars_w_lvls) = lvlLamBndrs env dest_lvl abs_vars 1077 rhs_lvl = le_ctxt_lvl rhs_env 1079 (rhs_env', [new_bndr]) <- cloneLetVars Recursive rhs_env rhs_lvl [bndr] 1080 let 1081 (lam_bndrs, rhs_body) = collectAnnBndrs rhs 1082 (body_env1, lam_bndrs1) = substBndrsSL NonRecursive rhs_env' lam_bndrs 1083 (body_env2, lam_bndrs2) = lvlLamBndrs body_env1 rhs_lvl lam_bndrs1 1084 new_rhs_body <- lvlRhs body_env2 Recursive is_bot (get_join bndr) rhs_body 1085 (poly_env, [poly_bndr]) <- newPolyBndrs dest_lvl env abs_vars [bndr] 1086 return (Rec [(TB poly_bndr (FloatMe dest_lvl) 1087 , mkLams abs_vars_w_lvls$
1088 mkLams lam_bndrs2 $1089 Let (Rec [( TB new_bndr (StayPut rhs_lvl) 1090 , mkLams lam_bndrs2 new_rhs_body)]) 1091 (mkVarApps (Var new_bndr) lam_bndrs1))] 1092 , poly_env) 1094 | otherwise -- Non-null abs_vars 1095 = do { (new_env, new_bndrs) <- newPolyBndrs dest_lvl env abs_vars bndrs 1096 ; new_rhss <- mapM (do_rhs new_env) pairs 1097 ; return ( Rec ([TB b (FloatMe dest_lvl) | b <- new_bndrs] zip new_rhss) 1098 , new_env) } 1100 where 1101 (bndrs,rhss) = unzip pairs 1102 is_join = isJoinId (head bndrs) 1103 -- bndrs is always non-empty and if one is a join they all are 1104 -- Both are checked by Lint 1105 is_fun = all isFunction rhss 1106 is_bot = False -- It's odd to have an unconditionally divergent 1107 -- funtion in a Rec, and we don't much care what 1108 -- happens to it. False is simple! 1110 do_rhs env (bndr,rhs) = lvlFloatRhs abs_vars dest_lvl env Recursive 1111 is_bot (get_join bndr) 1112 rhs 1114 get_join bndr | need_zap = Nothing 1115 | otherwise = isJoinId_maybe bndr 1116 need_zap = dest_lvl ltLvl joinCeilingLevel env 1118 -- Finding the free vars of the binding group is annoying 1119 bind_fvs = ((unionDVarSets [ freeVarsOf rhs | (_, rhs) <- pairs]) 1120 unionDVarSet 1121 (fvDVarSet$ unionsFV [ idFVs bndr
1122 | (bndr, (_,_)) <- pairs]))
1123 delDVarSetList
1124 bndrs
1126 dest_lvl = destLevel env bind_fvs is_fun is_bot is_join
1127 abs_vars = abstractVars dest_lvl env bind_fvs
1129 profitableFloat :: LevelEnv -> Level -> Bool
1130 profitableFloat env dest_lvl
1131 = (dest_lvl ltMajLvl le_ctxt_lvl env) -- Escapes a value lambda
1132 || isTopLvl dest_lvl -- Going all the way to top level
1135 ----------------------------------------------------
1136 -- Three help functions for the type-abstraction case
1138 lvlRhs :: LevelEnv
1139 -> RecFlag
1140 -> Bool -- Is this a bottoming function
1141 -> Maybe JoinArity
1142 -> CoreExprWithFVs
1143 -> LvlM LevelledExpr
1144 lvlRhs env rec_flag is_bot mb_join_arity expr
1145 = lvlFloatRhs [] (le_ctxt_lvl env) env
1146 rec_flag is_bot mb_join_arity expr
1148 lvlFloatRhs :: [OutVar] -> Level -> LevelEnv -> RecFlag
1149 -> Bool -- Binding is for a bottoming function
1150 -> Maybe JoinArity
1151 -> CoreExprWithFVs
1152 -> LvlM (Expr LevelledBndr)
1153 -- Ignores the le_ctxt_lvl in env; treats dest_lvl as the baseline
1154 lvlFloatRhs abs_vars dest_lvl env rec is_bot mb_join_arity rhs
1155 = do { body' <- if not is_bot -- See Note [Floating from a RHS]
1156 && any isId bndrs
1157 then lvlMFE body_env True body
1158 else lvlExpr body_env body
1159 ; return (mkLams bndrs' body') }
1160 where
1161 (bndrs, body) | Just join_arity <- mb_join_arity
1162 = collectNAnnBndrs join_arity rhs
1163 | otherwise
1164 = collectAnnBndrs rhs
1165 (env1, bndrs1) = substBndrsSL NonRecursive env bndrs
1166 all_bndrs = abs_vars ++ bndrs1
1167 (body_env, bndrs') | Just _ <- mb_join_arity
1168 = lvlJoinBndrs env1 dest_lvl rec all_bndrs
1169 | otherwise
1170 = lvlLamBndrs (placeJoinCeiling env1) dest_lvl all_bndrs
1171 -- The important thing here is that we call lvlLamBndrs on
1172 -- all these binders at once (abs_vars and bndrs), so they
1173 -- all get the same major level. Otherwise we create stupid
1174 -- let-bindings inside, joyfully thinking they can float; but
1175 -- in the end they don't because we never float bindings in
1176 -- between lambdas
1178 {- Note [Floating from a RHS]
1179 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1180 When float the RHS of a let-binding, we don't always want to apply
1181 lvlMFE to the body of a lambda, as we usually do, because the entire
1182 binding body is already going to the right place (dest_lvl).
1184 A particular example is the top level. Consider
1185 concat = /\ a -> foldr ..a.. (++) []
1186 We don't want to float the body of the lambda to get
1187 lvl = /\ a -> foldr ..a.. (++) []
1188 concat = /\ a -> lvl a
1189 That would be stupid.
1191 Previously this was avoided in a much nastier way, by testing strict_ctxt
1192 in float_me in lvlMFE. But that wasn't even right because it would fail
1193 to float out the error sub-expression in
1194 f = \x. case x of
1195 True -> error ("blah" ++ show x)
1196 False -> ...
1198 But we must be careful:
1201 f = \x -> factorial 20
1202 we /would/ want to float that (factorial 20) out! Functions are treated
1203 differently: see the use of isFunction in the calls to destLevel. If
1204 there are only type lambdas, then destLevel will say "go to top, and
1205 abstract over the free tyvars" and we don't want that here.
1207 * But if we had
1208 f = \x -> error (...x....)
1209 we would NOT want to float the bottoming expression out to give
1210 lvl = \x -> error (...x...)
1211 f = \x -> lvl x
1213 Conclusion: use lvlMFE if there are
1214 * any value lambdas in the original function, and
1215 * this is not a bottoming function (the is_bot argument)
1216 Use lvlExpr otherwise. A little subtle, and I got it wrong at least twice
1217 (e.g. Trac #13369).
1218 -}
1220 {-
1221 ************************************************************************
1222 * *
1223 \subsection{Deciding floatability}
1224 * *
1225 ************************************************************************
1226 -}
1228 substAndLvlBndrs :: RecFlag -> LevelEnv -> Level -> [InVar] -> (LevelEnv, [LevelledBndr])
1229 substAndLvlBndrs is_rec env lvl bndrs
1230 = lvlBndrs subst_env lvl subst_bndrs
1231 where
1232 (subst_env, subst_bndrs) = substBndrsSL is_rec env bndrs
1234 substBndrsSL :: RecFlag -> LevelEnv -> [InVar] -> (LevelEnv, [OutVar])
1235 -- So named only to avoid the name clash with CoreSubst.substBndrs
1236 substBndrsSL is_rec env@(LE { le_subst = subst, le_env = id_env }) bndrs
1237 = ( env { le_subst = subst'
1238 , le_env = foldl add_id id_env (bndrs zip bndrs') }
1239 , bndrs')
1240 where
1241 (subst', bndrs') = case is_rec of
1242 NonRecursive -> substBndrs subst bndrs
1243 Recursive -> substRecBndrs subst bndrs
1245 lvlLamBndrs :: LevelEnv -> Level -> [OutVar] -> (LevelEnv, [LevelledBndr])
1246 -- Compute the levels for the binders of a lambda group
1247 lvlLamBndrs env lvl bndrs
1248 = lvlBndrs env new_lvl bndrs
1249 where
1250 new_lvl | any is_major bndrs = incMajorLvl lvl
1251 | otherwise = incMinorLvl lvl
1253 is_major bndr = isId bndr && not (isProbablyOneShotLambda bndr)
1254 -- The "probably" part says "don't float things out of a
1255 -- probable one-shot lambda"
1256 -- See Note [Computing one-shot info] in Demand.hs
1258 lvlJoinBndrs :: LevelEnv -> Level -> RecFlag -> [OutVar]
1259 -> (LevelEnv, [LevelledBndr])
1260 lvlJoinBndrs env lvl rec bndrs
1261 = lvlBndrs env new_lvl bndrs
1262 where
1263 new_lvl | isRec rec = incMajorLvl lvl
1264 | otherwise = incMinorLvl lvl
1265 -- Non-recursive join points are one-shot; recursive ones are not
1267 lvlBndrs :: LevelEnv -> Level -> [CoreBndr] -> (LevelEnv, [LevelledBndr])
1268 -- The binders returned are exactly the same as the ones passed,
1269 -- apart from applying the substitution, but they are now paired
1270 -- with a (StayPut level)
1271 --
1272 -- The returned envt has le_ctxt_lvl updated to the new_lvl
1273 --
1274 -- All the new binders get the same level, because
1275 -- any floating binding is either going to float past
1276 -- all or none. We never separate binders.
1277 lvlBndrs env@(LE { le_lvl_env = lvl_env }) new_lvl bndrs
1278 = ( env { le_ctxt_lvl = new_lvl
1279 , le_join_ceil = new_lvl
1280 , le_lvl_env = addLvls new_lvl lvl_env bndrs }
1281 , map (stayPut new_lvl) bndrs)
1283 stayPut :: Level -> OutVar -> LevelledBndr
1284 stayPut new_lvl bndr = TB bndr (StayPut new_lvl)
1286 -- Destination level is the max Id level of the expression
1287 -- (We'll abstract the type variables, if any.)
1288 destLevel :: LevelEnv -> DVarSet
1289 -> Bool -- True <=> is function
1290 -> Bool -- True <=> is bottom
1291 -> Bool -- True <=> is a join point
1292 -> Level
1293 -- INVARIANT: if is_join=True then result >= join_ceiling
1294 destLevel env fvs is_function is_bot is_join
1295 | isTopLvl max_fv_id_level -- Float even joins if they get to top level
1296 -- See Note [Floating join point bindings]
1297 = tOP_LEVEL
1299 | is_join -- Never float a join point past the join ceiling
1300 -- See Note [Join points] in FloatOut
1301 = if max_fv_id_level ltLvl join_ceiling
1302 then join_ceiling
1303 else max_fv_id_level
1305 | is_bot -- Send bottoming bindings to the top
1306 = tOP_LEVEL -- regardless; see Note [Bottoming floats]
1307 -- Esp Bottoming floats (1)
1309 | Just n_args <- floatLams env
1310 , n_args > 0 -- n=0 case handled uniformly by the 'otherwise' case
1311 , is_function
1312 , countFreeIds fvs <= n_args
1313 = tOP_LEVEL -- Send functions to top level; see
1314 -- the comments with isFunction
1316 | otherwise = max_fv_id_level
1317 where
1318 max_fv_id_level = maxFvLevel isId env fvs -- Max over Ids only; the tyvars
1319 -- will be abstracted
1320 join_ceiling = joinCeilingLevel env
1322 isFunction :: CoreExprWithFVs -> Bool
1323 -- The idea here is that we want to float *functions* to
1324 -- the top level. This saves no work, but
1325 -- (a) it can make the host function body a lot smaller,
1326 -- and hence inlinable.
1327 -- (b) it can also save allocation when the function is recursive:
1328 -- h = \x -> letrec f = \y -> ...f...y...x...
1329 -- in f x
1330 -- becomes
1331 -- f = \x y -> ...(f x)...y...x...
1332 -- h = \x -> f x x
1333 -- No allocation for f now.
1334 -- We may only want to do this if there are sufficiently few free
1335 -- variables. We certainly only want to do it for values, and not for
1336 -- constructors. So the simple thing is just to look for lambdas
1337 isFunction (_, AnnLam b e) | isId b = True
1338 | otherwise = isFunction e
1339 -- isFunction (_, AnnTick _ e) = isFunction e -- dubious
1340 isFunction _ = False
1342 countFreeIds :: DVarSet -> Int
1343 countFreeIds = nonDetFoldUDFM add 0
1344 -- It's OK to use nonDetFoldUDFM here because we're just counting things.
1345 where
1346 add :: Var -> Int -> Int
1347 add v n | isId v = n+1
1348 | otherwise = n
1350 {-
1351 ************************************************************************
1352 * *
1354 * *
1355 ************************************************************************
1356 -}
1358 data LevelEnv
1359 = LE { le_switches :: FloatOutSwitches
1360 , le_ctxt_lvl :: Level -- The current level
1361 , le_lvl_env :: VarEnv Level -- Domain is *post-cloned* TyVars and Ids
1362 , le_join_ceil:: Level -- Highest level to which joins float
1363 -- Invariant: always >= le_ctxt_lvl
1365 -- See Note [le_subst and le_env]
1366 , le_subst :: Subst -- Domain is pre-cloned TyVars and Ids
1367 -- The Id -> CoreExpr in the Subst is ignored
1368 -- (since we want to substitute a LevelledExpr for
1369 -- an Id via le_env) but we do use the Co/TyVar substs
1370 , le_env :: IdEnv ([OutVar], LevelledExpr) -- Domain is pre-cloned Ids
1371 }
1373 {- Note [le_subst and le_env]
1374 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1375 We clone let- and case-bound variables so that they are still distinct
1376 when floated out; hence the le_subst/le_env. (see point 3 of the
1377 module overview comment). We also use these envs when making a
1378 variable polymorphic because we want to float it out past a big
1379 lambda.
1381 The le_subst and le_env always implement the same mapping,
1382 in_x :-> out_x a b
1383 where out_x is an OutVar, and a,b are its arguments (when
1384 we perform abstraction at the same time as floating).
1386 le_subst maps to CoreExpr
1387 le_env maps to LevelledExpr
1389 Since the range is always a variable or application, there is never
1390 any difference between the two, but sadly the types differ. The
1391 le_subst is used when substituting in a variable's IdInfo; the le_env
1392 when we find a Var.
1394 In addition the le_env records a [OutVar] of variables free in the
1395 OutExpr/LevelledExpr, just so we don't have to call freeVars
1396 repeatedly. This list is always non-empty, and the first element is
1397 out_x
1399 The domain of the both envs is *pre-cloned* Ids, though
1401 The domain of the le_lvl_env is the *post-cloned* Ids
1402 -}
1404 initialEnv :: FloatOutSwitches -> LevelEnv
1405 initialEnv float_lams
1406 = LE { le_switches = float_lams
1407 , le_ctxt_lvl = tOP_LEVEL
1408 , le_join_ceil = panic "initialEnv"
1409 , le_lvl_env = emptyVarEnv
1410 , le_subst = emptySubst
1411 , le_env = emptyVarEnv }
1413 addLvl :: Level -> VarEnv Level -> OutVar -> VarEnv Level
1414 addLvl dest_lvl env v' = extendVarEnv env v' dest_lvl
1416 addLvls :: Level -> VarEnv Level -> [OutVar] -> VarEnv Level
1419 floatLams :: LevelEnv -> Maybe Int
1420 floatLams le = floatOutLambdas (le_switches le)
1422 floatConsts :: LevelEnv -> Bool
1423 floatConsts le = floatOutConstants (le_switches le)
1425 floatOverSat :: LevelEnv -> Bool
1426 floatOverSat le = floatOutOverSatApps (le_switches le)
1428 floatTopLvlOnly :: LevelEnv -> Bool
1429 floatTopLvlOnly le = floatToTopLevelOnly (le_switches le)
1431 incMinorLvlFrom :: LevelEnv -> Level
1432 incMinorLvlFrom env = incMinorLvl (le_ctxt_lvl env)
1434 -- extendCaseBndrEnv adds the mapping case-bndr->scrut-var if it can
1435 -- See Note [Binder-swap during float-out]
1436 extendCaseBndrEnv :: LevelEnv
1437 -> Id -- Pre-cloned case binder
1438 -> Expr LevelledBndr -- Post-cloned scrutinee
1439 -> LevelEnv
1440 extendCaseBndrEnv le@(LE { le_subst = subst, le_env = id_env })
1441 case_bndr (Var scrut_var)
1442 = le { le_subst = extendSubstWithVar subst case_bndr scrut_var
1443 , le_env = add_id id_env (case_bndr, scrut_var) }
1444 extendCaseBndrEnv env _ _ = env
1446 -- See Note [Join ceiling]
1447 placeJoinCeiling :: LevelEnv -> LevelEnv
1448 placeJoinCeiling le@(LE { le_ctxt_lvl = lvl })
1449 = le { le_ctxt_lvl = lvl', le_join_ceil = lvl' }
1450 where
1451 lvl' = asJoinCeilLvl (incMinorLvl lvl)
1453 maxFvLevel :: (Var -> Bool) -> LevelEnv -> DVarSet -> Level
1454 maxFvLevel max_me (LE { le_lvl_env = lvl_env, le_env = id_env }) var_set
1455 = foldDVarSet max_in tOP_LEVEL var_set
1456 where
1457 max_in in_var lvl
1458 = foldr max_out lvl (case lookupVarEnv id_env in_var of
1459 Just (abs_vars, _) -> abs_vars
1460 Nothing -> [in_var])
1462 max_out out_var lvl
1463 | max_me out_var = case lookupVarEnv lvl_env out_var of
1464 Just lvl' -> maxLvl lvl' lvl
1465 Nothing -> lvl
1466 | otherwise = lvl -- Ignore some vars depending on max_me
1468 lookupVar :: LevelEnv -> Id -> LevelledExpr
1469 lookupVar le v = case lookupVarEnv (le_env le) v of
1470 Just (_, expr) -> expr
1471 _ -> Var v
1473 -- Level to which join points are allowed to float (boundary of current tail
1474 -- context). See Note [Join ceiling]
1475 joinCeilingLevel :: LevelEnv -> Level
1476 joinCeilingLevel = le_join_ceil
1478 abstractVars :: Level -> LevelEnv -> DVarSet -> [OutVar]
1479 -- Find the variables in fvs, free vars of the target expression,
1480 -- whose level is greater than the destination level
1481 -- These are the ones we are going to abstract out
1482 --
1483 -- Note that to get reproducible builds, the variables need to be
1484 -- abstracted in deterministic order, not dependent on the values of
1485 -- Uniques. This is achieved by using DVarSets, deterministic free
1486 -- variable computation and deterministic sort.
1487 -- See Note [Unique Determinism] in Unique for explanation of why
1488 -- Uniques are not deterministic.
1489 abstractVars dest_lvl (LE { le_subst = subst, le_lvl_env = lvl_env }) in_fvs
1490 = -- NB: sortQuantVars might not put duplicates next to each other
1491 map zap $sortQuantVars$ uniq
1492 [out_var | out_fv <- dVarSetElems (substDVarSet subst in_fvs)
1493 , out_var <- dVarSetElems (close out_fv)
1494 , abstract_me out_var ]
1495 -- NB: it's important to call abstract_me only on the OutIds the
1496 -- come from substDVarSet (not on fv, which is an InId)
1497 where
1498 uniq :: [Var] -> [Var]
1499 -- Remove duplicates, preserving order
1500 uniq = dVarSetElems . mkDVarSet
1502 abstract_me v = case lookupVarEnv lvl_env v of
1503 Just lvl -> dest_lvl ltLvl lvl
1504 Nothing -> False
1506 -- We are going to lambda-abstract, so nuke any IdInfo,
1507 -- and add the tyvars of the Id (if necessary)
1508 zap v | isId v = WARN( isStableUnfolding (idUnfolding v) ||
1509 not (isEmptyRuleInfo (idSpecialisation v)),
1510 text "absVarsOf: discarding info on" <+> ppr v )
1511 setIdInfo v vanillaIdInfo
1512 | otherwise = v
1514 close :: Var -> DVarSet -- Close over variables free in the type
1515 -- Result includes the input variable itself
1516 close v = foldDVarSet (unionDVarSet . close)
1517 (unitDVarSet v)
1518 (fvDVarSet $varTypeTyCoFVs v) 1520 type LvlM result = UniqSM result 1522 initLvl :: UniqSupply -> UniqSM a -> a 1523 initLvl = initUs_ 1525 newPolyBndrs :: Level -> LevelEnv -> [OutVar] -> [InId] 1526 -> LvlM (LevelEnv, [OutId]) 1527 -- The envt is extended to bind the new bndrs to dest_lvl, but 1528 -- the le_ctxt_lvl is unaffected 1529 newPolyBndrs dest_lvl 1530 env@(LE { le_lvl_env = lvl_env, le_subst = subst, le_env = id_env }) 1531 abs_vars bndrs 1532 = ASSERT( all (not . isCoVar) bndrs ) -- What would we add to the CoSubst in this case. No easy answer. 1533 do { uniqs <- getUniquesM 1534 ; let new_bndrs = zipWith mk_poly_bndr bndrs uniqs 1535 bndr_prs = bndrs zip new_bndrs 1536 env' = env { le_lvl_env = addLvls dest_lvl lvl_env new_bndrs 1537 , le_subst = foldl add_subst subst bndr_prs 1538 , le_env = foldl add_id id_env bndr_prs } 1539 ; return (env', new_bndrs) } 1540 where 1541 add_subst env (v, v') = extendIdSubst env v (mkVarApps (Var v') abs_vars) 1542 add_id env (v, v') = extendVarEnv env v ((v':abs_vars), mkVarApps (Var v') abs_vars) 1544 mk_poly_bndr bndr uniq = transferPolyIdInfo bndr abs_vars$ -- Note [transferPolyIdInfo] in Id.hs
1545 transfer_join_info bndr \$
1546 mkSysLocalOrCoVar (mkFastString str) uniq poly_ty
1547 where
1548 str = "poly_" ++ occNameString (getOccName bndr)
1549 poly_ty = mkLamTypes abs_vars (CoreSubst.substTy subst (idType bndr))
1551 -- If we are floating a join point to top level, it stops being
1552 -- a join point. Otherwise it continues to be a join point,
1553 -- but we may need to adjust its arity
1554 dest_is_top = isTopLvl dest_lvl
1555 transfer_join_info bndr new_bndr
1556 | Just join_arity <- isJoinId_maybe bndr
1557 , not dest_is_top
1558 = new_bndr asJoinId join_arity + length abs_vars
1559 | otherwise
1560 = new_bndr
1562 newLvlVar :: LevelledExpr -- The RHS of the new binding
1563 -> Maybe JoinArity -- Its join arity, if it is a join point
1564 -> Bool -- True <=> the RHS looks like (makeStatic ...)
1565 -> LvlM Id
1566 newLvlVar lvld_rhs join_arity_maybe is_mk_static
1567 = do { uniq <- getUniqueM
1568 ; return (add_join_info (mk_id uniq rhs_ty))
1569 }
1570 where
1571 add_join_info var = var asJoinId_maybe join_arity_maybe
1572 de_tagged_rhs = deTagExpr lvld_rhs
1573 rhs_ty = exprType de_tagged_rhs
1575 mk_id uniq rhs_ty
1576 -- See Note [Grand plan for static forms] in StaticPtrTable.
1577 | is_mk_static
1578 = mkExportedVanillaId (mkSystemVarName uniq (mkFastString "static_ptr"))
1579 rhs_ty
1580 | otherwise
1581 = mkLocalIdOrCoVar (mkSystemVarName uniq (mkFastString "lvl")) rhs_ty
1583 cloneCaseBndrs :: LevelEnv -> Level -> [Var] -> LvlM (LevelEnv, [Var])
1584 cloneCaseBndrs env@(LE { le_subst = subst, le_lvl_env = lvl_env, le_env = id_env })
1585 new_lvl vs
1586 = do { us <- getUniqueSupplyM
1587 ; let (subst', vs') = cloneBndrs subst us vs
1588 env' = env { le_ctxt_lvl = new_lvl
1589 , le_join_ceil = new_lvl
1590 , le_lvl_env = addLvls new_lvl lvl_env vs'
1591 , le_subst = subst'
1592 , le_env = foldl add_id id_env (vs zip vs') }
1594 ; return (env', vs') }
1596 cloneLetVars :: RecFlag -> LevelEnv -> Level -> [InVar]
1597 -> LvlM (LevelEnv, [OutVar])
1598 -- See Note [Need for cloning during float-out]
1599 -- Works for Ids bound by let(rec)
1600 -- The dest_lvl is attributed to the binders in the new env,
1601 -- but cloneVars doesn't affect the le_ctxt_lvl of the incoming env
1602 cloneLetVars is_rec
1603 env@(LE { le_subst = subst, le_lvl_env = lvl_env, le_env = id_env })
1604 dest_lvl vs
1605 = do { us <- getUniqueSupplyM
1606 ; let vs1 = map zap vs
1607 -- See Note [Zapping the demand info]
1608 (subst', vs2) = case is_rec of
1609 NonRecursive -> cloneBndrs subst us vs1
1610 Recursive -> cloneRecIdBndrs subst us vs1
1611 prs = vs zip vs2
1612 env' = env { le_lvl_env = addLvls dest_lvl lvl_env vs2
1613 , le_subst = subst'
1614 , le_env = foldl add_id id_env prs }
1616 ; return (env', vs2) }
1617 where
1618 zap :: Var -> Var
1619 zap v | isId v = zap_join (zapIdDemandInfo v)
1620 | otherwise = v
1622 zap_join | isTopLvl dest_lvl = zapJoinId
1623 | otherwise = \v -> v
1625 add_id :: IdEnv ([Var], LevelledExpr) -> (Var, Var) -> IdEnv ([Var], LevelledExpr)
1627 | isTyVar v = delVarEnv id_env v
1628 | otherwise = extendVarEnv id_env v ([v1], ASSERT(not (isCoVar v1)) Var v1)
1630 {-
1631 Note [Zapping the demand info]
1632 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1633 VERY IMPORTANT: we must zap the demand info if the thing is going to
1634 float out, because it may be less demanded than at its original
1635 binding site. Eg
1636 f :: Int -> Int
1637 f x = let v = 3*4 in v+x
1638 Here v is strict; but if we float v to top level, it isn't any more.
1640 Similarly, if we're floating a join point, it won't be one anymore, so we zap
1641 join point information as well.
1642 -}