83f5ee6a3b06fa0fb347e7ae3ee96da57bb33248
1 {-
2 (c) The AQUA Project, Glasgow University, 1993-1998
4 \section{Common subexpression}
5 -}
7 {-# LANGUAGE CPP #-}
9 module CSE (cseProgram, cseOneExpr) where
11 #include "HsVersions.h"
13 import CoreSubst
14 import Var ( Var )
15 import VarEnv ( elemInScopeSet )
16 import Id ( Id, idType, idInlineActivation, isDeadBinder
17 , zapIdOccInfo, zapIdUsageInfo, idInlinePragma
18 , isJoinId )
19 import CoreUtils ( mkAltExpr, eqExpr
20 , exprIsLiteralString
21 , stripTicksE, stripTicksT, mkTicks )
22 import Type ( tyConAppArgs )
23 import CoreSyn
24 import Outputable
25 import BasicTypes ( TopLevelFlag(..), isTopLevel
26 , isAlwaysActive, isAnyInlinePragma )
27 import TrieMap
28 import Util ( filterOut )
29 import Data.List ( mapAccumL )
31 {-
32 Simple common sub-expression
33 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
34 When we see
35 x1 = C a b
36 x2 = C x1 b
37 we build up a reverse mapping: C a b -> x1
38 C x1 b -> x2
39 and apply that to the rest of the program.
41 When we then see
42 y1 = C a b
43 y2 = C y1 b
44 we replace the C a b with x1. But then we *dont* want to
45 add x1 -> y1 to the mapping. Rather, we want the reverse, y1 -> x1
46 so that a subsequent binding
47 y2 = C y1 b
48 will get transformed to C x1 b, and then to x2.
50 So we carry an extra var->var substitution which we apply *before* looking up in the
51 reverse mapping.
55 ~~~~~~~~~~~~~~~~
57 For example, consider
58 f = \x -> let y = x+x in
59 h = \x -> x+x
60 in ...
62 Here we must *not* do CSE on the inner x+x! The simplifier used to guarantee no
63 shadowing, but it doesn't any more (it proved too hard), so we clone as we go.
67 Note [CSE for bindings]
68 ~~~~~~~~~~~~~~~~~~~~~~~
69 Let-bindings have two cases, implemented by addBinding.
71 * SUBSTITUTE: applies when the RHS is a variable
73 let x = y in ...(h x)....
75 Here we want to extend the /substitution/ with x -> y, so that the
76 (h x) in the body might CSE with an enclosing (let v = h y in ...).
77 NB: the substitution maps InIds, so we extend the substitution with
78 a binding for the original InId 'x'
80 How can we have a variable on the RHS? Doesn't the simplifier inline them?
82 - First, the original RHS might have been (g z) which has CSE'd
83 with an enclosing (let y = g z in ...). This is super-important.
84 See Trac #5996:
85 x1 = C a b
86 x2 = C x1 b
87 y1 = C a b
88 y2 = C y1 b
89 Here we CSE y1's rhs to 'x1', and then we must add (y1->x1) to
90 the substitution so that we can CSE the binding for y2.
92 - Second, we use addBinding for case expression scrutinees too;
93 see Note [CSE for case expressions]
95 * EXTEND THE REVERSE MAPPING: applies in all other cases
97 let x = h y in ...(h y)...
99 Here we want to extend the /reverse mapping (cs_map)/ so that
100 we CSE the (h y) call to x.
102 Note that we use EXTEND even for a trivial expression, provided it
103 is not a variable or literal. In particular this /includes/ type
104 applications. This can be important (Trac #13156); e.g.
105 case f @ Int of { r1 ->
106 case f @ Int of { r2 -> ...
107 Here we want to common-up the two uses of (f @ Int) so we can
108 remove one of the case expressions.
111 reason not to use SUBSTITUTE for all trivial expressions.
113 Notice that
114 - The SUBSTITUTE situation extends the substitution (cs_subst)
115 - The EXTEND situation extends the reverse mapping (cs_map)
117 Notice also that in the SUBSTITUTE case we leave behind a binding
118 x = y
119 even though we /also/ carry a substitution x -> y. Can we just drop
120 the binding instead? Well, not at top level! See SimplUtils
121 Note [Top level and postInlineUnconditionally]; and in any case CSE
122 applies only to the /bindings/ of the program, and we leave it to the
123 simplifier to propate effects to the RULES. Finally, it doesn't seem
124 worth the effort to discard the nested bindings because the simplifier
125 will do it next.
127 Note [CSE for case expressions]
128 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
129 Consider
130 case scrut_expr of x { ...alts... }
131 This is very like a strict let-binding
132 let !x = scrut_expr in ...
133 So we use (addBinding x scrut_expr) to process scrut_expr and x, and as a
134 result all the stuff under Note [CSE for bindings] applies directly.
136 For example:
138 * Trivial scrutinee
139 f = \x -> case x of wild {
140 (a:as) -> case a of wild1 {
141 (p,q) -> ...(wild1:as)...
143 Here, (wild1:as) is morally the same as (a:as) and hence equal to
144 wild. But that's not quite obvious. In the rest of the compiler we
145 want to keep it as (wild1:as), but for CSE purpose that's a bad
146 idea.
148 By using addBinding we add the binding (wild1 -> a) to the substitution,
149 which does exactly the right thing.
151 (Notice this is exactly backwards to what the simplifier does, which
152 is to try to replaces uses of 'a' with uses of 'wild1'.)
154 This is the main reason that addBinding is called with a trivial rhs.
156 * Non-trivial scrutinee
157 case (f x) of y { pat -> ...let z = f x in ... }
159 By using addBinding we'll add (f x :-> y) to the cs_map, and
160 thereby CSE the inner (f x) to y.
162 Note [CSE for INLINE and NOINLINE]
163 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
164 There are some subtle interactions of CSE with functions that the user
165 has marked as INLINE or NOINLINE. (Examples from Roman Leshchinskiy.)
166 Consider
168 yes :: Int {-# NOINLINE yes #-}
169 yes = undefined
171 no :: Int {-# NOINLINE no #-}
172 no = undefined
174 foo :: Int -> Int -> Int {-# NOINLINE foo #-}
175 foo m n = n
177 {-# RULES "foo/no" foo no = id #-}
179 bar :: Int -> Int
180 bar = foo yes
182 We do not expect the rule to fire. But if we do CSE, then we risk
183 getting yes=no, and the rule does fire. Actually, it won't because
184 NOINLINE means that 'yes' will never be inlined, not even if we have
185 yes=no. So that's fine (now; perhaps in the olden days, yes=no would
186 have substituted even if 'yes' was NOINLINE).
188 But we do need to take care. Consider
190 {-# NOINLINE bar #-}
191 bar = <rhs> -- Same rhs as foo
193 foo = <rhs>
195 If CSE produces
196 foo = bar
197 then foo will never be inlined to <rhs> (when it should be, if <rhs>
198 is small). The conclusion here is this:
201 <rhs> :-> bar
202 to the CSEnv if 'bar' has any constraints on when it can inline;
203 that is, if its 'activation' not always active. Otherwise we
204 might replace <rhs> by 'bar', and then later be unable to see that it
205 really was <rhs>.
207 Note that we do not (currently) do CSE on the unfolding stored inside
208 an Id, even if is a 'stable' unfolding. That means that when an
209 unfolding happens, it is always faithful to what the stable unfolding
210 originally was.
212 Note [CSE for stable unfoldings]
213 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
214 Consider
215 {-# Unf = Stable (\pq. build blah) #-}
216 foo = x
218 Here 'foo' has a stable unfolding, but its (optimised) RHS is trivial.
219 (Turns out that this actually happens for the enumFromTo method of
220 the Integer instance of Enum in GHC.Enum.) Suppose moreover that foo's
221 stable unfolding originates from an INLINE or INLINEABLE pragma on foo.
222 Then we obviously do NOT want to extend the substitution with (foo->x),
223 because we promised to inline foo as what the user wrote. See similar
224 SimplUtils Note [Stable unfoldings and postInlineUnconditionally].
226 Nor do we want to change the reverse mapping. Suppose we have
228 {-# Unf = Stable (\pq. build blah) #-}
229 foo = <expr>
230 bar = <expr>
232 There could conceivably be merit in rewriting the RHS of bar:
233 bar = foo
234 but now bar's inlining behaviour will change, and importing
235 modules might see that. So it seems dodgy and we don't do it.
237 Stable unfoldings are also created during worker/wrapper when we decide
238 that a function's definition is so small that it should always inline.
239 In this case we still want to do CSE (#13340). Hence the use of
240 isAnyInlinePragma rather than isStableUnfolding.
242 Note [Corner case for case expressions]
243 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
244 Here is another reason that we do not use SUBSTITUTE for
245 all trivial expressions. Consider
246 case x |> co of (y::Array# Int) { ... }
248 We do not want to extend the substitution with (y -> x |> co); since y
249 is of unlifted type, this would destroy the let/app invariant if (x |>
250 co) was not ok-for-speculation.
252 But surely (x |> co) is ok-for-speculation, becasue it's a trivial
253 expression, and x's type is also unlifted, presumably. Well, maybe
254 not if you are using unsafe casts. I actually found a case where we
256 (x :: HValue) |> (UnsafeCo :: HValue ~ Array# Int)
258 Note [CSE for join points?]
259 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
260 We must not be naive about join points in CSE:
261 join j = e in
262 if b then jump j else 1 + e
263 The expression (1 + jump j) is not good (see Note [Invariants on join points] in
264 CoreSyn). This seems to come up quite seldom, but it happens (first seen
267 We could try and be careful by tracking which join points are still valid at
268 each subexpression, but since join points aren't allocated or shared, there's
269 less to gain by trying to CSE them.
271 Note [CSE for recursive bindings]
272 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
273 Consider
274 f = \x ... f....
275 g = \y ... g ...
276 where the "..." are identical. Could we CSE them? In full generality
277 with mutual recursion it's quite hard; but for self-recursive bindings
278 (which are very common) it's rather easy:
280 * Maintain a separate cs_rec_map, that maps
281 (\f. (\x. ...f...) ) -> f
282 Note the \f in the domain of the mapping!
284 * When we come across the binding for 'g', look up (\g. (\y. ...g...))
285 Bingo we get a hit. So we can replace the 'g' binding with
286 g = f
288 We can't use cs_map for this, because the key isn't an expression of
289 the program; it's a kind of synthetic key for recursive bindings.
292 ************************************************************************
293 * *
294 \section{Common subexpression}
295 * *
296 ************************************************************************
297 -}
299 cseProgram :: CoreProgram -> CoreProgram
300 cseProgram binds = snd (mapAccumL (cseBind TopLevel) emptyCSEnv binds)
302 cseBind :: TopLevelFlag -> CSEnv -> CoreBind -> (CSEnv, CoreBind)
303 cseBind toplevel env (NonRec b e)
304 = (env2, NonRec b2 e2)
305 where
306 (env1, b1) = addBinder env b
307 (env2, (b2, e2)) = cse_bind toplevel env1 (b,e) b1
309 cseBind _ env (Rec [(in_id, rhs)])
310 | noCSE in_id
311 = (env1, Rec [(out_id, rhs')])
313 -- See Note [CSE for recursive bindings]
314 | Just previous <- lookupCSRecEnv env out_id rhs''
315 , let previous' = mkTicks ticks previous
316 = (extendCSSubst env1 in_id previous', NonRec out_id previous')
318 | otherwise
319 = (extendCSRecEnv env1 out_id rhs'' id_expr', Rec [(zapped_id, rhs')])
321 where
322 (env1, [out_id]) = addRecBinders env [in_id]
323 rhs' = cseExpr env1 rhs
324 rhs'' = stripTicksE tickishFloatable rhs'
325 ticks = stripTicksT tickishFloatable rhs'
326 id_expr' = varToCoreExpr out_id
327 zapped_id = zapIdUsageInfo out_id
329 cseBind toplevel env (Rec pairs)
330 = (env2, Rec pairs')
331 where
332 (env1, bndrs1) = addRecBinders env (map fst pairs)
333 (env2, pairs') = mapAccumL do_one env1 (zip pairs bndrs1)
335 do_one env (pr, b1) = cse_bind toplevel env pr b1
337 -- | Given a binding of @in_id@ to @in_rhs@, and a fresh name to refer
339 -- first try to CSE @in_rhs@, and then add the resulting (possibly CSE'd)
340 -- binding to the 'CSEnv', so that we attempt to CSE any expressions
341 -- which are equal to @out_rhs@.
342 cse_bind :: TopLevelFlag -> CSEnv -> (InId, InExpr) -> OutId -> (CSEnv, (OutId, OutExpr))
343 cse_bind toplevel env (in_id, in_rhs) out_id
344 | isTopLevel toplevel, exprIsLiteralString in_rhs
345 -- See Note [Take care with literal strings]
346 = (env', (out_id, in_rhs))
348 | otherwise
349 = (env', (out_id', out_rhs))
350 where
351 out_rhs = tryForCSE env in_rhs
352 (env', out_id') = addBinding env in_id out_id out_rhs
354 addBinding :: CSEnv -- Includes InId->OutId cloning
355 -> InVar -- Could be a let-bound type
356 -> OutId -> OutExpr -- Processed binding
357 -> (CSEnv, OutId) -- Final env, final bndr
358 -- Extend the CSE env with a mapping [rhs -> out-id]
359 -- unless we can instead just substitute [in-id -> rhs]
360 --
361 -- It's possible for the binder to be a type variable (see
362 -- Note [Type-let] in CoreSyn), in which case we can just substitute.
363 addBinding env in_id out_id rhs'
364 | not (isId in_id) = (extendCSSubst env in_id rhs', out_id)
365 | noCSE in_id = (env, out_id)
366 | use_subst = (extendCSSubst env in_id rhs', out_id)
367 | otherwise = (extendCSEnv env rhs' id_expr', zapped_id)
368 where
369 id_expr' = varToCoreExpr out_id
370 zapped_id = zapIdUsageInfo out_id
371 -- Putting the Id into the cs_map makes it possible that
372 -- it'll become shared more than it is now, which would
373 -- invalidate (the usage part of) its demand info.
374 -- This caused Trac #100218.
375 -- Easiest thing is to zap the usage info; subsequently
376 -- performing late demand-analysis will restore it. Don't zap
377 -- the strictness info; it's not necessary to do so, and losing
378 -- it is bad for performance if you don't do late demand
379 -- analysis
381 -- Should we use SUBSTITUTE or EXTEND?
382 -- See Note [CSE for bindings]
383 use_subst = case rhs' of
384 Var {} -> True
385 _ -> False
387 noCSE :: InId -> Bool
388 noCSE id = not (isAlwaysActive (idInlineActivation id))
389 -- See Note [CSE for INLINE and NOINLINE]
390 || isAnyInlinePragma (idInlinePragma id)
391 -- See Note [CSE for stable unfoldings]
392 || isJoinId id
393 -- See Note [CSE for join points?]
396 {- Note [Take care with literal strings]
397 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
398 Consider this example:
400 x = "foo"#
401 y = "foo"#
402 ...x...y...x...y....
404 We would normally turn this into:
406 x = "foo"#
407 y = x
408 ...x...x...x...x....
410 But this breaks an invariant of Core, namely that the RHS of a top-level binding
411 of type Addr# must be a string literal, not another variable. See Note
412 [CoreSyn top-level string literals] in CoreSyn.
414 For this reason, we special case top-level bindings to literal strings and leave
415 the original RHS unmodified. This produces:
417 x = "foo"#
418 y = "foo"#
419 ...x...x...x...x....
421 Now 'y' will be discarded as dead code, and we are done.
423 The net effect is that for the y-binding we want to
424 - Use SUBSTITUTE, by extending the substitution with y :-> x
425 - but leave the original binding for y undisturbed
427 This is done by cse_bind. I got it wrong the first time (Trac #13367).
428 -}
430 tryForCSE :: CSEnv -> InExpr -> OutExpr
431 tryForCSE env expr
432 | Just e <- lookupCSEnv env expr'' = mkTicks ticks e
433 | otherwise = expr'
434 -- The varToCoreExpr is needed if we have
435 -- case e of xco { ...case e of yco { ... } ... }
436 -- Then CSE will substitute yco -> xco;
437 -- but these are /coercion/ variables
438 where
439 expr' = cseExpr env expr
440 expr'' = stripTicksE tickishFloatable expr'
441 ticks = stripTicksT tickishFloatable expr'
442 -- We don't want to lose the source notes when a common sub
443 -- expression gets eliminated. Hence we push all (!) of them on
444 -- top of the replaced sub-expression. This is probably not too
445 -- useful in practice, but upholds our semantics.
447 cseOneExpr :: InExpr -> OutExpr
448 cseOneExpr = cseExpr emptyCSEnv
450 cseExpr :: CSEnv -> InExpr -> OutExpr
451 cseExpr env (Type t) = Type (substTy (csEnvSubst env) t)
452 cseExpr env (Coercion c) = Coercion (substCo (csEnvSubst env) c)
453 cseExpr _ (Lit lit) = Lit lit
454 cseExpr env (Var v) = lookupSubst env v
455 cseExpr env (App f a) = App (cseExpr env f) (tryForCSE env a)
456 cseExpr env (Tick t e) = Tick t (cseExpr env e)
457 cseExpr env (Cast e co) = Cast (cseExpr env e) (substCo (csEnvSubst env) co)
458 cseExpr env (Lam b e) = let (env', b') = addBinder env b
459 in Lam b' (cseExpr env' e)
460 cseExpr env (Let bind e) = let (env', bind') = cseBind NotTopLevel env bind
461 in Let bind' (cseExpr env' e)
462 cseExpr env (Case e bndr ty alts) = cseCase env e bndr ty alts
464 cseCase :: CSEnv -> InExpr -> InId -> InType -> [InAlt] -> OutExpr
465 cseCase env scrut bndr ty alts
466 = Case scrut1 bndr3 ty' \$
467 combineAlts alt_env (map cse_alt alts)
468 where
469 ty' = substTy (csEnvSubst env) ty
470 scrut1 = tryForCSE env scrut
472 bndr1 = zapIdOccInfo bndr
473 -- Zapping the OccInfo is needed because the extendCSEnv
474 -- in cse_alt may mean that a dead case binder
475 -- becomes alive, and Lint rejects that
476 (env1, bndr2) = addBinder env bndr1
477 (alt_env, bndr3) = addBinding env1 bndr bndr2 scrut1
478 -- addBinding: see Note [CSE for case expressions]
480 con_target :: OutExpr
481 con_target = lookupSubst alt_env bndr
483 arg_tys :: [OutType]
484 arg_tys = tyConAppArgs (idType bndr3)
486 -- Given case x of { K y z -> ...K y z... }
487 -- CSE K y z into x...
488 cse_alt (DataAlt con, args, rhs)
489 | not (null args)
490 -- ... but don't try CSE if there are no args; it just increases the number
491 -- of live vars. E.g.
492 -- case x of { True -> ....True.... }
493 -- Don't replace True by x!
494 -- Hence the 'null args', which also deal with literals and DEFAULT
495 = (DataAlt con, args', tryForCSE new_env rhs)
496 where
497 (env', args') = addBinders alt_env args
498 new_env = extendCSEnv env' con_expr con_target
499 con_expr = mkAltExpr (DataAlt con) args' arg_tys
501 cse_alt (con, args, rhs)
502 = (con, args', tryForCSE env' rhs)
503 where
504 (env', args') = addBinders alt_env args
506 combineAlts :: CSEnv -> [InAlt] -> [InAlt]
507 -- See Note [Combine case alternatives]
508 combineAlts env ((_,bndrs1,rhs1) : rest_alts)
510 = (DEFAULT, [], rhs1) : filtered_alts
511 where
512 in_scope = substInScope (csEnvSubst env)
513 filtered_alts = filterOut identical rest_alts
514 identical (_con, bndrs, rhs) = all ok bndrs && eqExpr in_scope rhs1 rhs
515 ok bndr = isDeadBinder bndr || not (bndr `elemInScopeSet` in_scope)
517 combineAlts _ alts = alts -- Default case
519 {- Note [Combine case alternatives]
520 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
521 combineAlts is just a more heavyweight version of the use of
522 combineIdenticalAlts in SimplUtils.prepareAlts. The basic idea is
523 to transform
525 DEFAULT -> e1
526 K x -> e1
527 W y z -> e2
528 ===>
529 DEFAULT -> e1
530 W y z -> e2
532 In the simplifier we use cheapEqExpr, because it is called a lot.
533 But here in CSE we use the full eqExpr. After all, two alterantives usually
534 differ near the root, so it probably isn't expensive to compare the full
535 alternative. It seems like the the same kind of thing that CSE is supposed
536 to be doing, which is why I put it here.
538 I acutally saw some examples in the wild, where some inlining made e1 too
539 big for cheapEqExpr to catch it.
542 ************************************************************************
543 * *
544 \section{The CSE envt}
545 * *
546 ************************************************************************
547 -}
549 data CSEnv
550 = CS { cs_subst :: Subst -- Maps InBndrs to OutExprs
551 -- The substitution variables to
552 -- /trivial/ OutExprs, not arbitrary expressions
554 , cs_map :: CoreMap OutExpr -- The reverse mapping
555 -- Maps a OutExpr to a /trivial/ OutExpr
556 -- The key of cs_map is stripped of all Ticks
558 , cs_rec_map :: CoreMap OutExpr
559 -- See Note [CSE for recursive bindings]
560 }
562 emptyCSEnv :: CSEnv
563 emptyCSEnv = CS { cs_map = emptyCoreMap, cs_rec_map = emptyCoreMap
564 , cs_subst = emptySubst }
566 lookupCSEnv :: CSEnv -> OutExpr -> Maybe OutExpr
567 lookupCSEnv (CS { cs_map = csmap }) expr
568 = lookupCoreMap csmap expr
570 extendCSEnv :: CSEnv -> OutExpr -> OutExpr -> CSEnv
571 extendCSEnv cse expr triv_expr
572 = cse { cs_map = extendCoreMap (cs_map cse) sexpr triv_expr }
573 where
574 sexpr = stripTicksE tickishFloatable expr
576 extendCSRecEnv :: CSEnv -> OutId -> OutExpr -> OutExpr -> CSEnv
577 -- See Note [CSE for recursive bindings]
578 extendCSRecEnv cse bndr expr triv_expr
579 = cse { cs_rec_map = extendCoreMap (cs_rec_map cse) (Lam bndr expr) triv_expr }
581 lookupCSRecEnv :: CSEnv -> OutId -> OutExpr -> Maybe OutExpr
582 -- See Note [CSE for recursive bindings]
583 lookupCSRecEnv (CS { cs_rec_map = csmap }) bndr expr
584 = lookupCoreMap csmap (Lam bndr expr)
586 csEnvSubst :: CSEnv -> Subst
587 csEnvSubst = cs_subst
589 lookupSubst :: CSEnv -> Id -> OutExpr
590 lookupSubst (CS { cs_subst = sub}) x = lookupIdSubst (text "CSE.lookupSubst") sub x
592 extendCSSubst :: CSEnv -> Id -> CoreExpr -> CSEnv
593 extendCSSubst cse x rhs = cse { cs_subst = extendSubst (cs_subst cse) x rhs }
595 -- | Add clones to the substitution to deal with shadowing. See
596 -- Note [Shadowing] for more details. You should call this whenever
597 -- you go under a binder.
598 addBinder :: CSEnv -> Var -> (CSEnv, Var)
599 addBinder cse v = (cse { cs_subst = sub' }, v')
600 where
601 (sub', v') = substBndr (cs_subst cse) v
603 addBinders :: CSEnv -> [Var] -> (CSEnv, [Var])
604 addBinders cse vs = (cse { cs_subst = sub' }, vs')
605 where
606 (sub', vs') = substBndrs (cs_subst cse) vs
608 addRecBinders :: CSEnv -> [Id] -> (CSEnv, [Id])
609 addRecBinders cse vs = (cse { cs_subst = sub' }, vs')
610 where
611 (sub', vs') = substRecBndrs (cs_subst cse) vs