1 {-

2 (c) The GRASP/AQUA Project, Glasgow University, 1992-1998

4 ************************************************************************

5 * *

6 \section[OccurAnal]{Occurrence analysis pass}

7 * *

8 ************************************************************************

10 The occurrence analyser re-typechecks a core expression, returning a new

11 core expression with (hopefully) improved usage information.

12 -}

14 {-# LANGUAGE CPP, BangPatterns #-}

22 import CoreSyn

23 import CoreFVs

26 import Id

28 import BasicTypes

30 import Coercion

32 import VarSet

33 import VarEnv

34 import Var

38 import Unique

39 import UniqFM

40 import Util

41 import Outputable

42 import FastString

46 {-

47 ************************************************************************

48 * *

49 \subsection[OccurAnal-main]{Counting occurrences: main function}

50 * *

51 ************************************************************************

53 Here's the externally-callable interface:

54 -}

60 occurAnalysePgm this_mod active_rule imp_rules vects vectVars binds

61 | isEmptyVarEnv final_usage

62 = occ_anald_binds

67 occ_anald_glommed_binds

68 where

69 init_env = initOccEnv active_rule

73 initial_uds

74 -- It's crucial to re-analyse the glommed-together bindings

75 -- so that we establish the right loop breakers. Otherwise

76 -- we can easily create an infinite loop (Trac #9583 is an example)

78 initial_uds = addIdOccs emptyDetails

79 (rulesFreeVars imp_rules `unionVarSet`

80 vectsFreeVars vects `unionVarSet`

81 vectVars)

82 -- The RULES and VECTORISE declarations keep things alive! (For VECTORISE declarations,

83 -- we only get them *until* the vectoriser runs. Afterwards, these dependencies are

84 -- reflected in 'vectors' — see Note [Vectorisation declarations and occurrences].)

86 -- Note [Preventing loops due to imported functions rules]

89 | imp_rule <- imp_rules

91 `delVarSetList` ru_bndrs imp_rule

95 go _ []

99 where

104 -- Do occurrence analysis, and discard occurrence info returned

111 occurAnalyseExpr' enable_binder_swap expr

113 where

115 -- To be conservative, we say that all inlines and rules are active

118 {-

119 ************************************************************************

120 * *

121 \subsection[OccurAnal-main]{Counting occurrences: main function}

122 * *

123 ************************************************************************

125 Bindings

126 ~~~~~~~~

127 -}

131 noImpRuleEdges :: ImpRuleEdges

132 noImpRuleEdges = emptyVarEnv

135 -> ImpRuleEdges

136 -> CoreBind

142 = occAnalNonRecBind env top_env binder rhs body_usage

144 = occAnalRecBind env top_env pairs body_usage

146 -----------------

149 occAnalNonRecBind env imp_rule_edges binder rhs body_usage

150 | isTyVar binder -- A type let; we don't gather usage info

158 where

164 -- See Note [Rules are extra RHSs] and Note [Rule dependency info]

167 lookupVarEnv imp_rule_edges binder

168 -- See Note [Preventing loops due to imported functions rules]

170 -----------------

173 occAnalRecBind env imp_rule_edges pairs body_usage

175 -- For a recursive group, we

176 -- * occ-analyse all the RHSs

177 -- * compute strongly-connected components

178 -- * feed those components to occAnalRec

179 where

188 {-

189 Note [Dead code]

190 ~~~~~~~~~~~~~~~~

191 Dropping dead code for a cyclic Strongly Connected Component is done

192 in a very simple way:

194 the entire SCC is dropped if none of its binders are mentioned

195 in the body; otherwise the whole thing is kept.

197 The key observation is that dead code elimination happens after

198 dependency analysis: so 'occAnalBind' processes SCCs instead of the

199 original term's binding groups.

201 Thus 'occAnalBind' does indeed drop 'f' in an example like

203 letrec f = ...g...

204 g = ...(...g...)...

205 in

206 ...g...

208 when 'g' no longer uses 'f' at all (eg 'f' does not occur in a RULE in

209 'g'). 'occAnalBind' first consumes 'CyclicSCC g' and then it consumes

210 'AcyclicSCC f', where 'body_usage' won't contain 'f'.

212 ------------------------------------------------------------

213 Note [Forming Rec groups]

214 ~~~~~~~~~~~~~~~~~~~~~~~~~

215 We put bindings {f = ef; g = eg } in a Rec group if "f uses g"

216 and "g uses f", no matter how indirectly. We do a SCC analysis

217 with an edge f -> g if "f uses g".

219 More precisely, "f uses g" iff g should be in scope wherever f is.

220 That is, g is free in:

221 a) the rhs 'ef'

222 b) or the RHS of a rule for f (Note [Rules are extra RHSs])

223 c) or the LHS or a rule for f (Note [Rule dependency info])

225 These conditions apply regardless of the activation of the RULE (eg it might be

226 inactive in this phase but become active later). Once a Rec is broken up

227 it can never be put back together, so we must be conservative.

229 The principle is that, regardless of rule firings, every variable is

230 always in scope.

232 * Note [Rules are extra RHSs]

233 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

234 A RULE for 'f' is like an extra RHS for 'f'. That way the "parent"

235 keeps the specialised "children" alive. If the parent dies

236 (because it isn't referenced any more), then the children will die

237 too (unless they are already referenced directly).

239 To that end, we build a Rec group for each cyclic strongly

240 connected component,

241 *treating f's rules as extra RHSs for 'f'*.

242 More concretely, the SCC analysis runs on a graph with an edge

243 from f -> g iff g is mentioned in

244 (a) f's rhs

245 (b) f's RULES

246 These are rec_edges.

248 Under (b) we include variables free in *either* LHS *or* RHS of

249 the rule. The former might seems silly, but see Note [Rule

250 dependency info]. So in Example [eftInt], eftInt and eftIntFB

251 will be put in the same Rec, even though their 'main' RHSs are

252 both non-recursive.

254 * Note [Rule dependency info]

255 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

256 The VarSet in a RuleInfo is used for dependency analysis in the

257 occurrence analyser. We must track free vars in *both* lhs and rhs.

258 Hence use of idRuleVars, rather than idRuleRhsVars in occAnalBind.

259 Why both? Consider

260 x = y

261 RULE f x = v+4

262 Then if we substitute y for x, we'd better do so in the

263 rule's LHS too, so we'd better ensure the RULE appears to mention 'x'

264 as well as 'v'

266 * Note [Rules are visible in their own rec group]

267 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

268 We want the rules for 'f' to be visible in f's right-hand side.

269 And we'd like them to be visible in other functions in f's Rec

270 group. E.g. in Note [Specialisation rules] we want f' rule

271 to be visible in both f's RHS, and fs's RHS.

273 This means that we must simplify the RULEs first, before looking

274 at any of the definitions. This is done by Simplify.simplRecBind,

275 when it calls addLetIdInfo.

277 ------------------------------------------------------------

278 Note [Choosing loop breakers]

279 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

280 Loop breaking is surprisingly subtle. First read the section 4 of

281 "Secrets of the GHC inliner". This describes our basic plan.

282 We avoid infinite inlinings by choosing loop breakers, and

283 ensuring that a loop breaker cuts each loop.

285 Fundamentally, we do SCC analysis on a graph. For each recursive

286 group we choose a loop breaker, delete all edges to that node,

287 re-analyse the SCC, and iterate.

289 But what is the graph? NOT the same graph as was used for Note

290 [Forming Rec groups]! In particular, a RULE is like an equation for

291 'f' that is *always* inlined if it is applicable. We do *not* disable

292 rules for loop-breakers. It's up to whoever makes the rules to make

293 sure that the rules themselves always terminate. See Note [Rules for

294 recursive functions] in Simplify.hs

296 Hence, if

297 f's RHS (or its INLINE template if it has one) mentions g, and

298 g has a RULE that mentions h, and

299 h has a RULE that mentions f

301 then we *must* choose f to be a loop breaker. Example: see Note

302 [Specialisation rules].

304 In general, take the free variables of f's RHS, and augment it with

305 all the variables reachable by RULES from those starting points. That

306 is the whole reason for computing rule_fv_env in occAnalBind. (Of

307 course we only consider free vars that are also binders in this Rec

308 group.) See also Note [Finding rule RHS free vars]

310 Note that when we compute this rule_fv_env, we only consider variables

311 free in the *RHS* of the rule, in contrast to the way we build the

312 Rec group in the first place (Note [Rule dependency info])

314 Note that if 'g' has RHS that mentions 'w', we should add w to

315 g's loop-breaker edges. More concretely there is an edge from f -> g

316 iff

317 (a) g is mentioned in f's RHS `xor` f's INLINE rhs

318 (see Note [Inline rules])

319 (b) or h is mentioned in f's RHS, and

320 g appears in the RHS of an active RULE of h

321 or a transitive sequence of active rules starting with h

323 Why "active rules"? See Note [Finding rule RHS free vars]

325 Note that in Example [eftInt], *neither* eftInt *nor* eftIntFB is

326 chosen as a loop breaker, because their RHSs don't mention each other.

327 And indeed both can be inlined safely.

329 Note again that the edges of the graph we use for computing loop breakers

330 are not the same as the edges we use for computing the Rec blocks.

331 That's why we compute

333 - rec_edges for the Rec block analysis

334 - loop_breaker_edges for the loop breaker analysis

336 * Note [Finding rule RHS free vars]

337 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

338 Consider this real example from Data Parallel Haskell

339 tagZero :: Array Int -> Array Tag

340 {-# INLINE [1] tagZeroes #-}

343 {-# RULES "tagZero" [~1] forall xs n.

344 pmap fromBool <blah blah> = tagZero xs #-}

349 To make this work, we look for the RHS free vars only for

354 ~~~~~~~~~~~~~~~~~~~~~~~~~

358 RULE f [] = g

360 h = h_rhs

361 g = h

363 }

372 RULE!

396 Inline postInlineUnconditionally

397 strong IAmLoopBreaker False no no

398 weak IAmLoopBreaker True yes no

399 other yes yes

407 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

408 Consider this

411 Note that

414 Now we

415 can get

417 --> 1 + f Int Firing RULE

422 f's definition had been

427 f.

434 RULE f x = f x

438 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

439 Consider:

442 {-# RULES "filterList" forall p. foldr (filterFB (:) p) [] = filter p #-}

444 filterFB c p = ...

458 {-# LANGUAGE RankNTypes #-}

464 {-# INLINABLE filter #-}

469 {-# NOINLINE [0] filterFB #-}

471 filterFB c p x r | p x = x `c` r

474 {-# RULES

475 "filter" [~1] forall p xs. filter p xs = build (\c n -> foldr

476 (filterFB c p) n xs)

477 "filterList" [1] forall p. foldr (filterFB (:) p) [] = filter p

478 #-}

488 filter p xs

500 dies with "ticks exhausted"

510 local functions because the standard occurrence analysis stuff is

516 Ids.

519 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

524 we get

531 To avoid this,

533 give a NOINLINE pragma to the specialised function

536 ~~~~~~~~~~~~~~~

537 RULES for imported Ids can make something at the top refer to something at the bottom:

548 NOTICE that this cannot happen for rules whose head is a

553 Solution:

567 ------------------------------------------------------------

569 ~~~~~~~~~~~~~~~~~~~

590 {-# INLINE [1] foo #-}

594 {-# INLINE [1] bar #-}

597 {-# RULES "foo" [~1] forall x. foo x = bar x #-}

602 a) We may get

609 b)

613 ~~~~~~~~~~~~~~~

619 {-# INLINE [0] eftIntFB #-}

623 {-# RULES

624 "eftInt" [~1] forall x y. eftInt x y = build (\ c n -> eftIntFB c n x y)

625 "eftIntList" [1] eftIntFB (:) [] = eftInt

626 #-}

629 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

634 {-# RULE f (C a) = fs a #-}

644 -}

647 -- which is gotten from the Id.

648 data Details

653 -- ignoring phase (ie assuming all are active)

654 -- See Note [Forming Rec groups]

657 -- the stable unfolding (if present and active)

658 -- or the RHS (if not)

659 -- but excluding any RULES

660 -- This is the IdSet that may be used if the Id is inlined

663 -- but are *not* in nd_inl. These are the ones whose

664 -- dependencies might not be respected by loop_breaker_edges

665 -- See Note [Weak loop breakers]

668 }

677 ])

682 where

690 -- Constructing the edges for the main Rec computation

691 -- See Note [Forming Rec groups]

694 -- Note [Rule dependency info]

696 Just unf_fvs -> addIdOccs rhs_usage2 unf_fvs

697 Nothing -> rhs_usage2

698 node_fvs = udFreeVars bndr_set rhs_usage3

700 -- Finding the free variables of the rules

705 -- See Note [Preventing loops due to imported functions rules]

707 | rule <- rules

709 `delVarSetList` ru_bndrs rule

711 all_rule_fvs = rule_lhs_fvs `unionVarSet` rule_rhs_fvs

714 `delVarSetList` ru_bndrs ru) rules

717 -- Finding the free variables of the INLINE pragma (if any)

719 mb_unf_fvs = stableUnfoldingVars unf

721 -- Find the "nd_inl" free vars; for the loop-breaker phase

724 Just unf_fvs -> unf_fvs

725 -- We could check for an *active* INLINE (returning

726 -- emptyVarSet for an inactive one), but is_active

727 -- isn't the right thing (it tells about

728 -- RULE activation), so we'd need more plumbing

730 -----------------------------

735 -- The NonRec case is just like a Let (NonRec ...) above

744 where

747 -- The Rec case is the interesting one

748 -- See Note [Loop breaking]

755 -- [ text "tagged nodes" <+> ppr tagged_nodes

756 -- , text "lb edges" <+> ppr loop_breaker_edges])

759 where

761 bndr_set = mkVarSet bndrs

763 ----------------------------

764 -- Tag the binders with their occurrence info

767 final_uds = total_uds `minusVarEnv` bndr_set

775 ---------------------------

776 -- Now reconstruct the cycle

780 -- If weak_fvs is empty, the loop_breaker_edges will include all

781 -- the edges in tagged_nodes, so there isn't any point in doing

782 -- a fresh SCC computation that will yield a single CyclicSCC result.

784 weak_fvs :: VarSet

787 -- See Note [Choosing loop breakers] for loop_breaker_edges

792 ------------------------------------

793 rule_fv_env :: IdEnv IdSet

794 -- Maps a variable f to the variables from this group

795 -- mentioned in RHS of active rules for f

796 -- Domain is *subset* of bound vars (others have no rule fvs)

798 init_rule_fvs -- See Note [Finding rule RHS free vars]

804 {-

805 @loopBreakSCC@ is applied to the list of (binder,rhs) pairs for a cyclic

806 strongly connected component (there's guaranteed to be a cycle). It returns the

807 same pairs, but

808 a) in a better order,

809 b) with some of the Ids having a IAmALoopBreaker pragma

811 The "loop-breaker" Ids are sufficient to break all cycles in the SCC. This means

812 that the simplifier can guarantee not to loop provided it never records an inlining

813 for these no-inline guys.

815 Furthermore, the order of the binds is such that if we neglect dependencies

816 on the no-inline Ids then the binds are topologically sorted. This means

817 that the simplifier will generally do a good job if it works from top bottom,

818 recording inlinings for any Ids which aren't marked as "no-inline" as it goes.

819 -}

828 -- See Note [Weak loop breakers]

834 -- Find the subset of bndrs that are mentioned in uds

840 -- See Note [Weak loop breakers]

844 -- Return the bindings sorted into a plausible order, and marked with loop breakers.

845 loopBreakNodes depth bndr_set weak_fvs nodes binds

847 where

851 loop_break_scc scc binds

855 CyclicSCC nodes -> reOrderNodes depth bndr_set weak_fvs nodes binds

858 -- Choose a loop breaker, mark it no-inline,

859 -- do SCC analysis on the rest, and recursively sort them out

863 -- text "chosen" <+> ppr chosen_nodes) $

864 loopBreakNodes new_depth bndr_set weak_fvs unchosen $

866 where

872 -- After two iterations (d=0, d=1) give up

873 -- and approximate, returning to d=0

880 -- This loop looks for the bind with the lowest score

881 -- to pick as the loop breaker. The rest accumulate in

882 choose_loop_breaker _ loop_nodes acc []

885 -- If approximate_loop_breaker is True, we pick *all*

886 -- nodes with lowest score, else just one

887 -- See Note [Complexity of loop breaking]

897 where

898 sc = score node

905 -- Note [DFuns should not be loop breakers]

910 -- Data structures are more important than INLINE pragmas

911 -- so that dictionary/method recursion unravels

912 -- Note that this case hits all stable unfoldings, so we

913 -- never look at 'rhs' for stable unfoldings. That's right, because

914 -- 'rhs' is irrelevant for inlining things with a stable unfolding

919 -- Used to have also: && not (isExportedId bndr)

920 -- But I found this sometimes cost an extra iteration when we have

921 -- rec { d = (a,b); a = ...df...; b = ...df...; df = d }

922 -- where df is the exported dictionary. Then df makes a really

923 -- bad choice for loop breaker

926 -- If an Id is marked "never inline" then it makes a great loop breaker

927 -- The only reason for not checking that here is that it is rare

928 -- and I've never seen a situation where it makes a difference,

929 -- so it probably isn't worth the time to test on every binder

930 -- | isNeverActive (idInlinePragma bndr) = -10

935 -- The Id has some kind of unfolding

936 -- Ignore loop-breaker-ness here because that is what we are setting!

940 -- Checking for a constructor application

941 -- Cheap and cheerful; the simplifer moves casts out of the way

942 -- The lambda case is important to spot x = /\a. C (f a)

943 -- which comes up when C is a dictionary constructor and

944 -- f is a default method.

945 -- Example: the instance for Show (ST s a) in GHC.ST

946 --

947 -- However we *also* treat (\x. C p q) as a con-app-like thing,

948 -- Note [Closure conversion]

955 {-

956 Note [Complexity of loop breaking]

957 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

958 The loop-breaking algorithm knocks out one binder at a time, and

959 performs a new SCC analysis on the remaining binders. That can

960 behave very badly in tightly-coupled groups of bindings; in the

961 worst case it can be (N**2)*log N, because it does a full SCC

962 on N, then N-1, then N-2 and so on.

964 To avoid this, we switch plans after 2 (or whatever) attempts:

965 Plan A: pick one binder with the lowest score, make it

966 a loop breaker, and try again

967 Plan B: pick *all* binders with the lowest score, make them

968 all loop breakers, and try again

969 Since there are only a small finite number of scores, this will

970 terminate in a constant number of iterations, rather than O(N)

971 iterations.

973 You might thing that it's very unlikely, but RULES make it much

974 more likely. Here's a real example from Trac #1969:

975 Rec { $dm = \d.\x. op d

976 {-# RULES forall d. $dm Int d = $s$dm1

977 forall d. $dm Bool d = $s$dm2 #-}

981 opInt = $dm dInt

982 opBool = $dm dBool

992 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1005 {-# INLINE f #-}

1012 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1016 We give DFuns a higher score than ordinary CONLIKE things because

1022 {-# DFUN #-}

1024 }

1030 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1042 Inlining dictionaries is really essential to unravelling

1046 ~~~~~~~~~~~~~~~~~~~~~~~~~

1049 which generated code like this:

1069 by an INLINE pragma. For these we record that anything which occurs

1077 -}

1081 -- Returned usage details covers only the RHS,

1082 -- and *not* the RULE or INLINE template for the Id

1085 occAnalNonRecRhs :: OccEnv

1087 -- Binder is already tagged with occurrence info

1089 -- Returned usage details covers only the RHS,

1090 -- and *not* the RULE or INLINE template for the Id

1091 occAnalNonRecRhs env bndr rhs

1092 = occAnal rhs_env rhs

1093 where

1094 -- See Note [Use one-shot info]

1097 -- See Note [Cascading inlines]

1098 rhs_env | certainly_inline = env1

1101 certainly_inline -- See Note [Cascading inlines]

1106 dmd = idDemandInfo bndr

1111 addIdOccs usage id_set = foldVarSet addIdOcc usage id_set

1114 addIdOcc v u | isId v = addOneOcc u v NoOccInfo

1116 -- Give a non-committal binder info (i.e NoOccInfo) because

1117 -- a) Many copies of the specialised thing can appear

1118 -- b) We don't want to substitute a BIG expression inside a RULE

1119 -- even if that's the only occurrence of the thing

1120 -- (Same goes for INLINE.)

1122 {-

1123 Note [Cascading inlines]

1124 ~~~~~~~~~~~~~~~~~~~~~~~~

1125 By default we use an rhsCtxt for the RHS of a binding. This tells the

1126 occ anal n that it's looking at an RHS, which has an effect in

1127 occAnalApp. In particular, for constructor applications, it makes

1128 the arguments appear to have NoOccInfo, so that we don't inline into

1129 them. Thus x = f y

1130 k = Just x

1131 we do not want to inline x.

1133 But there's a problem. Consider

1134 x1 = a0 : []

1135 x2 = a1 : x1

1136 x3 = a2 : x2

1137 g = f x3

1138 First time round, it looks as if x1 and x2 occur as an arg of a

1139 let-bound constructor ==> give them a many-occurrence.

1140 But then x3 is inlined (unconditionally as it happens) and

1141 next time round, x2 will be, and the next time round x1 will be

1142 Result: multiple simplifier iterations. Sigh.

1144 So, when analysing the RHS of x3 we notice that x3 will itself

1145 definitely inline the next time round, and so we analyse x3's rhs in

1146 an ordinary context, not rhsCtxt. Hence the "certainly_inline" stuff.

1148 Annoyingly, we have to approximate SimplUtils.preInlineUnconditionally.

1149 If we say "yes" when preInlineUnconditionally says "no" the simplifier iterates

1150 indefinitely:

1151 x = f y

1152 k = Just x

1153 inline ==>

1154 k = Just (f y)

1155 float ==>

1156 x1 = f y

1157 k = Just x1

1159 This is worse than the slow cascade, so we only want to say "certainly_inline"

1160 if it really is certain. Look at the note with preInlineUnconditionally

1161 for the various clauses.

1163 Expressions

1164 ~~~~~~~~~~~

1165 -}

1167 occAnal :: OccEnv

1168 -> CoreExpr

1170 CoreExpr)

1175 -- At one stage, I gathered the idRuleVars for v here too,

1176 -- which in a way is the right thing to do.

1177 -- But that went wrong right after specialisation, when

1178 -- the *occurrences* of the overloaded function didn't have any

1179 -- rules in them, so the *specialised* versions looked as if they

1180 -- weren't used at all.

1184 -- See Note [Gather occurrences of coercion variables]

1186 {-

1187 Note [Gather occurrences of coercion variables]

1188 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1189 We need to gather info about what coercion variables appear, so that

1190 we can sort them into the right place when doing dependency analysis.

1191 -}

1194 | tickish `tickishScopesLike` SoftScope

1197 | Breakpoint _ ids <- tickish

1199 -- never substitute for any of the Ids in a Breakpoint

1201 | otherwise

1203 where

1205 -- for a non-soft tick scope, we can inline lambdas only

1206 usage_lam = mapVarEnv markInsideLam usage

1212 -- See Note [Gather occurrences of coercion variables]

1214 -- If we see let x = y `cast` co

1215 -- then mark y as 'Many' so that we don't

1216 -- immediately inline y again.

1217 }

1222 -- Ignore type variables altogether

1223 -- (a) occurrences inside type lambdas only not marked as InsideLam

1224 -- (b) type variables not in environment

1229 }

1231 -- For value lambdas we do a special hack. Consider

1232 -- (\x. \y. ...x...)

1233 -- If we did nothing, x is used inside the \y, so would be marked

1234 -- as dangerous to dup. But in the common case where the abstraction

1235 -- is applied to two arguments this is over-pessimistic.

1236 -- So instead, we just mark each binder with its occurrence

1237 -- info in the *body* of the multiple lambda.

1238 -- Then, the simplifier is careful when partially applying lambdas.

1242 let

1244 -- Use binders' to put one-shot info on the lambdas

1246 really_final_usage

1249 in

1251 where

1258 let

1262 in

1264 where

1265 -- Note [Case binder usage]

1266 -- ~~~~~~~~~~~~~~~~~~~~~~~~

1267 -- The case binder gets a usage of either "many" or "dead", never "one".

1268 -- Reason: we like to inline single occurrences, to eliminate a binding,

1269 -- but inlining a case binder *doesn't* eliminate a binding.

1270 -- We *don't* want to transform

1271 -- case x of w { (p,q) -> f w }

1272 -- into

1273 -- case x of w { (p,q) -> f (p,q) }

1274 tag_case_bndr usage bndr

1279 alt_env = mkAltEnv env scrut bndr

1280 occ_anal_alt = occAnalAlt alt_env

1285 -- in an interesting context; the case has

1286 -- at least one non-default alternative

1288 | t `tickishScopesLike` SoftScope

1289 -- No reason to not look through all ticks here, but only

1290 -- for soft-scoped ticks we can do so without having to

1291 -- update returned occurance info (see occAnal)

1294 occ_anal_scrut scrut _alts

1303 occAnalArgs _ [] _

1307 | isTypeArg arg

1311 | otherwise

1317 {-

1318 Applications are dealt with specially because we want

1319 the "build hack" to work.

1321 Note [Arguments of let-bound constructors]

1322 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1323 Consider

1324 f x = let y = expensive x in

1325 let z = (True,y) in

1326 (case z of {(p,q)->q}, case z of {(p,q)->q})

1327 We feel free to duplicate the WHNF (True,y), but that means

1328 that y may be duplicated thereby.

1330 If we aren't careful we duplicate the (expensive x) call!

1331 Constructors are rather like lambdas in this way.

1332 -}

1334 occAnalApp :: OccEnv

1340 where

1345 -- We mark the free vars of the argument of a constructor or PAP

1346 -- as "many", if it is the RHS of a let(rec).

1347 -- This means that nothing gets inlined into a constructor argument

1348 -- position, which is what we want. Typically those constructor

1349 -- arguments are just variables, or trivial expressions.

1350 --

1351 -- This is the *whole point* of the isRhsEnv predicate

1352 -- See Note [Arguments of let-bound constructors]

1354 n_val_args = valArgCount args

1356 is_exp = isExpandableApp fun n_val_args

1357 -- See Note [CONLIKE pragma] in BasicTypes

1358 -- The definition of is_exp should match that in

1359 -- Simplify.prepareRhs

1362 -- See Note [Use one-shot info]

1366 where

1368 -- The addAppCtxt is a bit cunning. One iteration of the simplifier

1369 -- often leaves behind beta redexs like

1370 -- (\x y -> e) a1 a2

1371 -- Here we would like to mark x,y as one-shot, and treat the whole

1372 -- thing much like a let. We do this by pushing some True items

1373 -- onto the context stack.

1378 -> UsageDetails

1382 {-

1383 Note [Use one-shot information]

1384 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1385 The occurrrence analyser propagates one-shot-lambda information in two situation

1386 * Applications: eg build (\cn -> blah)

1387 Propagate one-shot info from the strictness signature of 'build' to

1388 the \cn

1390 * Let-bindings: eg let f = \c. let ... in \n -> blah

1391 in (build f, build f)

1392 Propagate one-shot info from the demanand-info on 'f' to the

1393 lambdas in its RHS (which may not be syntactically at the top)

1395 Some of this is done by the demand analyser, but this way it happens

1396 much earlier, taking advantage of the strictness signature of

1397 imported functions.

1399 Note [Binders in case alternatives]

1400 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1401 Consider

1402 case x of y { (a,b) -> f y }

1403 We treat 'a', 'b' as dead, because they don't physically occur in the

1404 case alternative. (Indeed, a variable is dead iff it doesn't occur in

1405 its scope in the output of OccAnal.) It really helps to know when

1406 binders are unused. See esp the call to isDeadBinder in

1407 Simplify.mkDupableAlt

1409 In this example, though, the Simplifier will bring 'a' and 'b' back to

1410 life, beause it binds 'y' to (a,b) (imagine got inlined and

1411 scrutinised y).

1412 -}

1415 -> CoreAlt

1419 let

1421 -- See Note [Binders in case alternatives]

1423 wrapAltRHS env scrut_bind alt_usg tagged_bndrs rhs1

1424 in

1427 wrapAltRHS :: OccEnv

1434 | occ_binder_swap env

1436 -- handles condition (a) in Note [Binder swap]

1440 where

1442 -- The rhs of the let may include coercion variables

1443 -- if the scrutinee was a cast, so we must gather their

1444 -- usage. See Note [Gather occurrences of coercion variables]

1448 wrapAltRHS _ _ alt_usg _ alt_rhs

1451 {-

1452 ************************************************************************

1453 * *

1454 OccEnv

1455 * *

1456 ************************************************************************

1457 -}

1459 data OccEnv

1464 -- See Note [Finding rule RHS free vars]

1466 -- See CorePrep Note [Dead code in CorePrep]

1467 }

1471 -----------------------------

1472 -- OccEncl is used to control whether to inline into constructor arguments

1473 -- For example:

1474 -- x = (p,q) -- Don't inline p or q

1475 -- y = /\a -> (p a, q a) -- Still don't inline p or q

1476 -- z = f (p,q) -- Do inline p,q; it may make a rule fire

1477 -- So OccEncl tells enought about the context to know what to do when

1478 -- we encounter a contructor application or PAP.

1480 data OccEncl

1482 -- Don't inline into constructor args here

1483 | OccVanilla -- Argument of function, body of lambda, scruintee of case etc.

1484 -- Do inline into constructor args here

1491 -- [] No info

1492 --

1493 -- one_shot_info:ctxt Analysing a function-valued expression that

1494 -- will be applied as described by one_shot_info

1497 initOccEnv active_rule

1511 argCtxt env []

1521 -> ( OccEnv

1523 -- The result binders have one-shot-ness set that they might not have had originally.

1524 -- This happens in (build (\cn -> e)). Here the occurrence analyser

1525 -- linearity context knows that c,n are one-shot, and it records that fact in

1526 -- the binder. This is useful to guide subsequent float-in/float-out tranformations

1530 where

1531 go ctxt [] rev_bndrs

1535 go [] bndrs rev_bndrs

1540 | isId bndr

1545 where

1546 bndr' = updOneShotInfo bndr one_shot

1547 | otherwise

1555 -- If (f,g), (g,h) are in the input, then (f,h) is in the output

1556 -- as well as (f,g), (g,h)

1557 transClosureFV env

1558 | no_change = env

1560 where

1565 where

1568 -------------

1573 -- (extendFVs env s) returns

1574 -- (s `union` env(s), env(s) `subset` s)

1575 extendFvs env s

1576 | isNullUFM env

1578 | otherwise

1580 where

1585 {-

1586 ************************************************************************

1587 * *

1588 Binder swap

1589 * *

1590 ************************************************************************

1592 Note [Binder swap]

1593 ~~~~~~~~~~~~~~~~~~

1594 We do these two transformations right here:

1596 (1) case x of b { pi -> ri }

1597 ==>

1598 case x of b { pi -> let x=b in ri }

1600 (2) case (x |> co) of b { pi -> ri }

1601 ==>

1602 case (x |> co) of b { pi -> let x = b |> sym co in ri }

1604 Why (2)? See Note [Case of cast]

1606 In both cases, in a particular alternative (pi -> ri), we only

1607 add the binding if

1608 (a) x occurs free in (pi -> ri)

1609 (ie it occurs in ri, but is not bound in pi)

1610 (b) the pi does not bind b (or the free vars of co)

1611 We need (a) and (b) for the inserted binding to be correct.

1613 For the alternatives where we inject the binding, we can transfer

1614 all x's OccInfo to b. And that is the point.

1616 Notice that

1617 * The deliberate shadowing of 'x'.

1618 * That (a) rapidly becomes false, so no bindings are injected.

1620 The reason for doing these transformations here is because it allows

1621 us to adjust the OccInfo for 'x' and 'b' as we go.

1623 * Suppose the only occurrences of 'x' are the scrutinee and in the

1624 ri; then this transformation makes it occur just once, and hence

1625 get inlined right away.

1627 * If we do this in the Simplifier, we don't know whether 'x' is used

1628 in ri, so we are forced to pessimistically zap b's OccInfo even

1629 though it is typically dead (ie neither it nor x appear in the

1630 ri). There's nothing actually wrong with zapping it, except that

1631 it's kind of nice to know which variables are dead. My nose

1632 tells me to keep this information as robustly as possible.

1634 The Maybe (Id,CoreExpr) passed to occAnalAlt is the extra let-binding

1635 {x=b}; it's Nothing if the binder-swap doesn't happen.

1637 There is a danger though. Consider

1638 let v = x +# y

1639 in case (f v) of w -> ...v...v...

1640 And suppose that (f v) expands to just v. Then we'd like to

1641 use 'w' instead of 'v' in the alternative. But it may be too

1642 late; we may have substituted the (cheap) x+#y for v in the

1643 same simplifier pass that reduced (f v) to v.

1645 I think this is just too bad. CSE will recover some of it.

1647 Note [Case of cast]

1648 ~~~~~~~~~~~~~~~~~~~

1649 Consider case (x `cast` co) of b { I# ->

1650 ... (case (x `cast` co) of {...}) ...

1651 We'd like to eliminate the inner case. That is the motivation for

1652 equation (2) in Note [Binder swap]. When we get to the inner case, we

1653 inline x, cancel the casts, and away we go.

1655 Note [Binder swap on GlobalId scrutinees]

1656 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1657 When the scrutinee is a GlobalId we must take care in two ways

1659 i) In order to *know* whether 'x' occurs free in the RHS, we need its

1660 occurrence info. BUT, we don't gather occurrence info for

1661 GlobalIds. That's the reason for the (small) occ_gbl_scrut env in

1662 OccEnv is for: it says "gather occurrence info for these".

1664 ii) We must call localiseId on 'x' first, in case it's a GlobalId, or

1665 has an External Name. See, for example, SimplEnv Note [Global Ids in

1666 the substitution].

1668 Note [Zap case binders in proxy bindings]

1669 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1670 From the original

1671 case x of cb(dead) { p -> ...x... }

1672 we will get

1673 case x of cb(live) { p -> let x = cb in ...x... }

1675 Core Lint never expects to find an *occurrence* of an Id marked

1676 as Dead, so we must zap the OccInfo on cb before making the

1677 binding x = cb. See Trac #5028.

1679 Historical note [no-case-of-case]

1680 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1681 We *used* to suppress the binder-swap in case expressions when

1682 -fno-case-of-case is on. Old remarks:

1683 "This happens in the first simplifier pass,

1684 and enhances full laziness. Here's the bad case:

1685 f = \ y -> ...(case x of I# v -> ...(case x of ...) ... )

1686 If we eliminate the inner case, we trap it inside the I# v -> arm,

1687 which might prevent some full laziness happening. I've seen this

1688 in action in spectral/cichelli/Prog.hs:

1689 [(m,n) | m <- [1..max], n <- [1..max]]

1690 Hence the check for NoCaseOfCase."

1691 However, now the full-laziness pass itself reverses the binder-swap, so this

1692 check is no longer necessary.

1694 Historical note [Suppressing the case binder-swap]

1695 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1696 This old note describes a problem that is also fixed by doing the

1697 binder-swap in OccAnal:

1699 There is another situation when it might make sense to suppress the

1700 case-expression binde-swap. If we have

1702 case x of w1 { DEFAULT -> case x of w2 { A -> e1; B -> e2 }

1703 ...other cases .... }

1705 We'll perform the binder-swap for the outer case, giving

1707 case x of w1 { DEFAULT -> case w1 of w2 { A -> e1; B -> e2 }

1708 ...other cases .... }

1710 But there is no point in doing it for the inner case, because w1 can't

1711 be inlined anyway. Furthermore, doing the case-swapping involves

1712 zapping w2's occurrence info (see paragraphs that follow), and that

1713 forces us to bind w2 when doing case merging. So we get

1715 case x of w1 { A -> let w2 = w1 in e1

1716 B -> let w2 = w1 in e2

1717 ...other cases .... }

1719 This is plain silly in the common case where w2 is dead.

1721 Even so, I can't see a good way to implement this idea. I tried

1722 not doing the binder-swap if the scrutinee was already evaluated

1723 but that failed big-time:

1725 data T = MkT !Int

1727 case v of w { MkT x ->

1728 case x of x1 { I# y1 ->

1729 case x of x2 { I# y2 -> ...

1731 Notice that because MkT is strict, x is marked "evaluated". But to

1732 eliminate the last case, we must either make sure that x (as well as

1733 x1) has unfolding MkT y1. The straightforward thing to do is to do

1734 the binder-swap. So this whole note is a no-op.

1736 It's fixed by doing the binder-swap in OccAnal because we can do the

1737 binder-swap unconditionally and still get occurrence analysis

1738 information right.

1739 -}

1742 -- Does two things: a) makes the occ_one_shots = OccVanilla

1743 -- b) extends the GlobalScruts if possible

1744 -- c) returns a proxy mapping, binding the scrutinee

1745 -- to the case binder, if possible

1750 -- See Note [Case of cast]

1753 where

1759 -- Localise the scrut_var before shadowing it; we're making a

1760 -- new binding for it, and it might have an External Name, or

1761 -- even be a GlobalId; Note [Binder swap on GlobalId scrutinees]

1762 -- Also we don't want any INLINE or NOINLINE pragmas!

1764 {-

1765 ************************************************************************

1766 * *

1767 \subsection[OccurAnal-types]{OccEnv}

1768 * *

1769 ************************************************************************

1770 -}

1773 -- INVARIANT: never IAmDead

1774 -- (Deadness is signalled by not being in the map at all)

1776 (+++), combineAltsUsageDetails

1779 (+++) usage1 usage2

1780 = plusVarEnv_C addOccInfo usage1 usage2

1782 combineAltsUsageDetails usage1 usage2

1783 = plusVarEnv_C orOccInfo usage1 usage2

1786 addOneOcc usage id info

1788 -- ToDo: make this more efficient

1790 emptyDetails :: UsageDetails

1794 v `usedIn` details = isExportedId v || v `elemVarEnv` details

1802 -- Used for lambda and case binders

1803 -- It copes with the fact that lambda bindings can have a

1804 -- stable unfolding, used for join points

1806 where

1809 where

1810 usage1 = usage `delVarEnv` bndr

1819 tagBinder usage binder

1821 usage' = usage `delVarEnv` binder

1822 binder' = setBinderOcc usage binder

1823 in

1827 setBinderOcc usage bndr

1828 | isTyVar bndr = bndr

1830 NoOccInfo -> bndr

1831 _ -> setIdOccInfo bndr NoOccInfo

1832 -- Don't use local usage info for visible-elsewhere things

1833 -- BUT *do* erase any IAmALoopBreaker annotation, because we're

1834 -- about to re-generate it and it shouldn't be "sticky"

1837 where

1838 occ_info = lookupVarEnv usage bndr `orElse` IAmDead

1840 {-

1841 ************************************************************************

1842 * *

1843 \subsection{Operations over OccInfo}

1844 * *

1845 ************************************************************************

1846 -}

1849 mkOneOcc env id int_cxt

1850 | isLocalId id

1853 | id `elemVarEnv` occ_gbl_scrut env

1856 | otherwise

1857 = emptyDetails

1861 markMany _ = NoOccInfo

1864 markInsideLam occ = occ

1869 NoOccInfo -- Both branches are at least One

1870 -- (Argument is never IAmDead)

1872 -- (orOccInfo orig new) is used

1873 -- when combining occurrence info from branches of a case

1881 NoOccInfo