[ghc.git] / compiler / simplCore / CallArity.hs
1 --
2 -- Copyright (c) 2014 Joachim Breitner
3 --
4
5 module CallArity
6 ( callArityAnalProgram
7 , callArityRHS -- for testing
8 ) where
9
10 import VarSet
11 import VarEnv
12 import DynFlags ( DynFlags )
13
14 import BasicTypes
15 import CoreSyn
16 import Id
17 import CoreArity ( typeArity )
18 import CoreUtils ( exprIsCheap, exprIsTrivial )
19 import UnVarGraph
20 import Demand
21
22 import Control.Arrow ( first, second )
23
24
25 {-
26 %************************************************************************
27 %* *
28 Call Arity Analysis
29 %* *
30 %************************************************************************
31
32 Note [Call Arity: The goal]
33 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
34
35 The goal of this analysis is to find out if we can eta-expand a local function,
36 based on how it is being called. The motivating example is this code,
37 which comes up when we implement foldl using foldr, and do list fusion:
38
39 let go = \x -> let d = case ... of
40 False -> go (x+1)
41 True -> id
42 in \z -> d (x + z)
43 in go 1 0
44
45 If we do not eta-expand `go` to have arity 2, we are going to allocate a lot of
46 partial function applications, which would be bad.
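
The shape of the motivating example can be reproduced in plain Haskell. The
sketch below is only an illustration (the names `sumViaFoldr` and `sumExpanded`
are made up here and do not appear in GHC): the first definition is a left fold
written via foldr, where each step returns a function that still waits for the
accumulator; the second is the arity-2 worker that eta-expansion is meant to
recover.

```haskell
-- Hypothetical illustration, not GHC code.

-- A left fold written with foldr: each element yields a function that
-- still waits for the accumulator, like `go` with one manifest lambda.
sumViaFoldr :: [Int] -> Int
sumViaFoldr xs = foldr step id xs 0
  where
    step x k = \acc -> k (acc + x)

-- Hand-written worker taking both arguments at once: the form that
-- eta-expanding the local function aims for.
sumExpanded :: [Int] -> Int
sumExpanded xs = go xs 0
  where
    go []     acc = acc
    go (y:ys) acc = go ys (acc + y)
```

Both definitions compute the same result; they differ only in whether the
intermediate function values are built explicitly.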
47
48 The function `go` has a type of arity two, but only one lambda is manifest.
49 Furthermore, an analysis that only looks at the RHS of go cannot be sufficient
50 to eta-expand go: If `go` is ever called with one argument (and the result used
51 multiple times), we would be doing the work in `...` multiple times.
52
53 So `callArityAnalProgram` looks at the whole let expression to figure out if
54 all calls are nice, i.e. have a high enough arity. It then stores the result in
55 the `calledArity` field of the `IdInfo` of `go`, which the next simplifier
56 phase will eta-expand.
57
58 The specification of the `calledArity` field is:
59
60 No work will be lost if you eta-expand me to the arity in `calledArity`.
61
62 What we want to know for a variable
63 -----------------------------------
64
65 For every let-bound variable we'd like to know:
66 1. A lower bound on the arity of all calls to the variable, and
67 2. whether the variable is being called at most once or possibly multiple
68 times.
69
70 It is always ok to lower the arity, or pretend that there are multiple calls.
71 In particular, "Minimum arity 0 and possibly called multiple times" is always
72 correct.
73
74
75 What we want to know from an expression
76 ---------------------------------------
77
78 In order to obtain that information for variables, we analyze expressions and
79 obtain two pieces of information:
80
81 I. The arity analysis:
82 For every variable, whether it is absent, or called,
83 and if called, with what arity.
84
85 II. The Co-Called analysis:
86 For every two variables, whether there is a possibility that both are being
87 called.
88 We obtain as a special case: For every variable, whether there is a
89 possibility that it is being called twice.
90
91 For efficiency reasons, we gather this information only for a set of
92 *interesting variables*, to avoid spending time on, e.g., variables from pattern matches.
93
94 The two analyses are not completely independent, as a higher arity can improve
95 the information about what variables are being called once or multiple times.
96
97 Note [Analysis I: The arity analysis]
98 ------------------------------------
99
100 The arity analysis is quite straightforward: The information about an
101 expression is a
102 VarEnv Arity
103 where absent variables are not bound (a lookup yields Nothing) and all others
104 are bound to a lower bound on their arity.
105
106 When we analyze an expression, we analyze it with a given context arity.
107 Lambdas decrease and applications increase the incoming arity. Analysing a
108 variable will put that arity in the environment. In lets or cases all the
109 results from the various subexpressions are lubed, which takes the point-wise
110 minimum (considering Nothing as infinity).
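
As a rough sketch of the lub described above, the following models the arity
environment with `Data.Map` instead of `VarEnv`. The `String` variable names
and the example environments `alt1` and `alt2` are assumptions made for
illustration only.

```haskell
import qualified Data.Map.Strict as Map

-- Toy stand-ins (assumptions): variables are Strings, arities are Ints.
type Arity = Int
type ArityEnv = Map.Map String Arity

-- Point-wise minimum. A variable missing from one side is absent there
-- (treated as infinity), so the bound from the other side is kept unchanged.
lubArityEnv :: ArityEnv -> ArityEnv -> ArityEnv
lubArityEnv = Map.unionWith min

-- Results of two case alternatives: one calls `f` with two arguments,
-- the other calls `f` with one argument and `g` with one argument.
alt1, alt2 :: ArityEnv
alt1 = Map.fromList [("f", 2)]
alt2 = Map.fromList [("f", 1), ("g", 1)]
```

Here `lubArityEnv alt1 alt2` binds `f` to the smaller of its two bounds and
keeps the bound for `g`, which only one alternative mentions.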
111
112
113 Note [Analysis II: The Co-Called analysis]
114 ------------------------------------------
115
116 The second part is more sophisticated. For reasons explained below, it is not
117 sufficient to simply know how often an expression evaluates a variable. Instead
118 we need to know which variables are possibly called together.
119
120 The data structure here is an undirected graph of variables, which is provided
121 by the abstract
122 UnVarGraph
123
124 It is safe to return a larger graph, i.e. one with more edges. The worst case
125 (i.e. the least useful and always correct result) is the complete graph on all
126 free variables, which means that anything can be called together with anything
127 (including itself).
128
129 Notation for the following:
130 C(e) is the co-called result for e.
131 G₁∪G₂ is the union of two graphs
132 fv is the set of free variables (conveniently the domain of the arity analysis result)
133 S₁×S₂ is the complete bipartite graph { {a,b} | a ∈ S₁, b ∈ S₂ }
134 S² is the complete graph on the set of variables S, S² = S×S
135 C'(e) is a variant for bound expressions:
136 If e is called at most once, or it is and stays a thunk (after the analysis),
137 it is simply C(e). Otherwise, the expression can be called multiple times
138 and we return (fv e)²
139
140 The interesting cases of the analysis:
141 * Var v:
142 No other variables are being called.
143 Return {} (the empty graph)
144 * Lambda v e, under arity 0:
145 This means that e can be evaluated many times and we cannot get
146 any useful co-call information.
147 Return (fv e)²
148 * Case alternatives alt₁,alt₂,...:
149 Only one can be executed, so
150 Return (alt₁ ∪ alt₂ ∪...)
151 * App e₁ e₂ (and analogously Case scrut alts), with non-trivial e₂:
152 We get the results from both sides, with the argument evaluated at most once.
153 Additionally, anything called by e₁ can possibly be called with anything
154 from e₂.
155 Return: C(e₁) ∪ C(e₂) ∪ (fv e₁) × (fv e₂)
156 * App e₁ x:
157 As this is already in A-normal form, CorePrep will not separately lambda
158 bind (and hence share) x. So we conservatively assume multiple calls to x here
159 Return: C(e₁) ∪ (fv e₁) × {x} ∪ {(x,x)}
160 * Let v = rhs in body:
161 In addition to the results from the subexpressions, add all co-calls from
162 everything that the body calls together with v to everything that is called
163 by v.
164 Return: C'(rhs) ∪ C(body) ∪ (fv rhs) × {v'| {v,v'} ∈ C(body)}
165 * Letrec v₁ = rhs₁ ... vₙ = rhsₙ in body
166 Tricky.
167 We assume that it is really mutually recursive, i.e. that every variable
168 calls one of the others, and that this is strongly connected (otherwise we
169 return an over-approximation, so that's ok), see Note [Recursion and fixpointing].
170
171 Let V = {v₁,...vₙ}.
172 Assume that the vs have been analysed with an incoming demand and
173 cardinality consistent with the final result (this is the fixed-pointing).
174 Again we can use the results from all subexpressions.
175 In addition, for every variable vᵢ, we need to find out what it is called
176 with (call this set Sᵢ). There are two cases:
177 * If vᵢ is a function, we need to go through all right-hand-sides and bodies,
178 and collect every variable that is called together with any variable from V:
179 Sᵢ = {v' | j ∈ {1,...,n}, {v',vⱼ} ∈ C'(rhs₁) ∪ ... ∪ C'(rhsₙ) ∪ C(body) }
180 * If vᵢ is a thunk, then its rhs is evaluated only once, so we need to
181 exclude it from this set:
182 Sᵢ = {v' | j ∈ {1,...,n}, j≠i, {v',vⱼ} ∈ C'(rhs₁) ∪ ... ∪ C'(rhsₙ) ∪ C(body) }
183 Finally, combine all this:
184 Return: C(body) ∪
185 C'(rhs₁) ∪ ... ∪ C'(rhsₙ) ∪
186 ((fv rhs₁) × S₁) ∪ ... ∪ ((fv rhsₙ) × Sₙ)
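
The Var, Case and App rules can be tried out on a toy model. The sketch below
is not the `UnVarGraph` implementation: it assumes a graph stored as a set of
ordered pairs over `String` variables, and the names `Res`, `var`, `alts`,
`app` and `calledOnce` are hypothetical helpers introduced for illustration.

```haskell
import qualified Data.Set as Set

-- Toy stand-in for UnVarGraph (an assumption): an undirected graph as a set
-- of edges, each edge stored with its endpoints in sorted order.
type Var = String
type Graph = Set.Set (Var, Var)

edge :: Var -> Var -> (Var, Var)
edge a b = (min a b, max a b)

-- S1 x S2: every variable of the first set co-called with every one of the second.
bipartite :: Set.Set Var -> Set.Set Var -> Graph
bipartite s1 s2 = Set.fromList [edge a b | a <- Set.toList s1, b <- Set.toList s2]

-- S^2: the complete graph on S, self-loops included.
complete :: Set.Set Var -> Graph
complete s = bipartite s s

-- A co-call graph paired with the free variables it talks about.
type Res = (Graph, Set.Set Var)

-- Var v: no other variables are called, so the graph is empty.
var :: Var -> Res
var v = (Set.empty, Set.singleton v)

-- Case alternatives: only one of them runs, so take the plain union.
alts :: Res -> Res -> Res
alts (g1, s1) (g2, s2) = (Set.union g1 g2, Set.union s1 s2)

-- App e1 e2 with non-trivial e2: C(e1) u C(e2) u (fv e1) x (fv e2).
app :: Res -> Res -> Res
app (g1, s1) (g2, s2) = (Set.unions [g1, g2, bipartite s1 s2], Set.union s1 s2)

-- A variable is called at most once iff it has no self-loop.
calledOnce :: Var -> Res -> Bool
calledOnce v (g, _) = not (Set.member (edge v v) g)
```

In this model, combining two uses of `n` with `app` produces the self-loop
{n,n}, while combining them with `alts` does not.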
187
188 Using the result: Eta-Expansion
189 -------------------------------
190
191 We use the result of these two analyses to decide whether we can eta-expand the
192 rhs of a let-bound variable.
193
194 If the variable is already a function (exprIsCheap), and all calls to the
195 variable have a higher arity than the current manifest arity (i.e. the number
196 of lambdas), expand.
197
198 If the variable is a thunk we must be careful: Eta-Expansion will prevent
199 sharing of work, so this is only safe if there is at most one call to the
200 function. Therefore, we check whether {v,v} ∈ G.
201
202 Example:
203
204 let n = case .. of .. -- A thunk!
205 in n 0 + n 1
206
207 vs.
208
209 let n = case .. of ..
210 in case .. of T -> n 0
211 F -> n 1
212
213 We are only allowed to eta-expand `n` if it is going to be called at most
214 once in the body of the outer let. So we need to know, for each variable
215 individually, that it is going to be called at most once.
216
217
218 Why the co-call graph?
219 ----------------------
220
221 Why is it not sufficient to simply remember which variables are called once and
222 which are called multiple times? It would be in the previous example, but consider
223
224 let n = case .. of ..
225 in case .. of
226 True -> let go = \y -> case .. of
227 True -> go (y + n 1)
228 False > n
229 in go 1
230 False -> n
231
232 vs.
233
234 let n = case .. of ..
235 in case .. of
236 True -> let go = \y -> case .. of
237 True -> go (y+1)
238 False -> n
239 in go 1
240 False -> n
241
242 In both cases, the body and the rhs of the inner let call n at most once.
243 But only in the second case that holds for the whole expression! The
244 crucial difference is that in the first case, the rhs of `go` can call
245 *both* `go` and `n`, and hence can call `n` multiple times as it recurses,
246 while in the second case we find out that `go` and `n` are not called together.
247
248
249 Why co-call information for functions?
250 --------------------------------------
251
252 Although for eta-expansion we need the information only for thunks, we still
253 need to know whether functions are being called once or multiple times, and
254 together with what other functions.
255
256 Example:
257
258 let n = case .. of ..
259 f x = n (x+1)
260 in f 1 + f 2
261
262 vs.
263
264 let n = case .. of ..
265 f x = n (x+1)
266 in case .. of T -> f 0
267 F -> f 1
268
269 Here, the body of f calls n exactly once, but f itself is being called
270 multiple times, so eta-expansion is not allowed.
271
272
273 Note [Analysis type signature]
274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
275
276 The workhorse of the analysis is the function `callArityAnal`, with the
277 following type:
278
279 type CallArityRes = (UnVarGraph, VarEnv Arity)
280 callArityAnal ::
281 Arity -> -- The arity this expression is called with
282 VarSet -> -- The set of interesting variables
283 CoreExpr -> -- The expression to analyse
284 (CallArityRes, CoreExpr)
285
286 and the following specification:
287
288 ((coCalls, callArityEnv), expr') = callArityAnal arity interestingIds expr
289
290 <=>
291
292 Assume the expression `expr` is being passed `arity` arguments. Then it holds that
293 * The domain of `callArityEnv` is a subset of `interestingIds`.
294 * Any variable from `interestingIds` that is not mentioned in the `callArityEnv`
295 is absent, i.e. not called at all.
296 * Every call from `expr` to a variable bound to n in `callArityEnv` has at
297 least n value arguments.
298 * If two interesting variables `v1` and `v2` are not adjacent in `coCalls`,
299 then in no execution of `expr` are both called.
300 Furthermore, expr' is expr with the callArity field of the `IdInfo` updated.
301
302
303 Note [Which variables are interesting]
304 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
305
306 The analysis would quickly become prohibitively expensive if we analysed all
307 variables; for most variables we simply do not care about how often they are
308 called, e.g. variables bound in a pattern match. So interesting are variables that are
309 * top-level or let bound
310 * and possibly functions (typeArity > 0)
311
312 Note [Taking boring variables into account]
313 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
314
315 If we decide that the variable bound in `let x = e1 in e2` is not interesting,
316 the analysis of `e2` will not report anything about `x`. To ensure that
317 `callArityBind` does still do the right thing we have to take that into account
318 every time we would be looking up `x` in the analysis result of `e2`.
319 * Instead of calling lookupCallArityRes, we return (0, False), indicating
320 that this variable might be called many times with no arguments.
321 * Instead of checking `calledWith x`, we assume that everything can be called
322 with it.
323 * In the recursive case, when calculating the `cross_calls`, if there is
324 any boring variable in the recursive group, we ignore all co-call-results
325 and directly go to a very conservative assumption.
326
327 The last point has the nice side effect that the relatively expensive
328 integration of co-call results in recursive groups is often skipped. This
329 helped to avoid the compile time blowup in some real-world code with large
330 recursive groups (#10293).
331
332 Note [Recursion and fixpointing]
333 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
334
335 For a mutually recursive let, we begin by
336 1. analysing the body, using the same incoming arity as for the whole expression.
337 2. Then we iterate, memoizing for each of the bound variables the last
338 analysis call, i.e. incoming arity, whether it is called once, and the CallArityRes.
339 3. We combine the analysis result from the body and the memoized results for
340 the arguments (if already present).
341 4. For each variable, we find out the incoming arity and whether it is called
342 once, based on the current analysis result. If this differs from the
343 memoized results, we re-analyse the rhs and update the memoized table.
344 5. If nothing had to be reanalyzed, we are done.
345 Otherwise, repeat from step 3.
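
The iteration in steps 3 to 5 can be sketched on a toy model. Everything below
is an assumption made for illustration: the two-function program, the
`rhsCalls` table, and the simplification that the result of analysing a
right-hand side does not depend on the incoming arity (in the real analysis it
does, and the memo table also records `called_once` and the `CallArityRes`).

```haskell
import qualified Data.Map.Strict as Map

type Var = String
type Arity = Int
type Calls = Map.Map Var Arity   -- lower bound on the call arity per variable

-- Toy program (assumption): the body calls f with 2 arguments;
-- the rhs of f calls g with 1 argument, the rhs of g calls f with 1 argument.
bodyCalls :: Calls
bodyCalls = Map.fromList [("f", 2)]

rhsCalls :: Var -> Calls
rhsCalls "f" = Map.fromList [("g", 1)]
rhsCalls "g" = Map.fromList [("f", 1)]
rhsCalls _   = Map.empty

-- Memo table: the incoming arity each rhs was last analysed with.
-- A variable that has not been called yet has no entry.
type Memo = Map.Map Var Arity

-- One round: combine the body with the already analysed right-hand sides and
-- report whether any incoming arity differs from the memoized one.
step :: Memo -> (Bool, Memo)
step memo = (memo' /= memo, memo')
  where
    memo' = Map.unionsWith min (bodyCalls : map rhsCalls (Map.keys memo))

-- Repeat until a round changes nothing.
fixpoint :: Memo -> Memo
fixpoint memo
  | changed   = fixpoint memo'
  | otherwise = memo'
  where
    (changed, memo') = step memo
```

Starting from the empty table, the first round only sees the call from the
body, later rounds pick up the calls made by the right-hand sides, and the
iteration stops once both functions are recorded with arity 1.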
346
347
348 Note [Thunks in recursive groups]
349 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
350
351 We never eta-expand a thunk in a recursive group, on the grounds that if it is
352 part of a recursive group, then it will be called multiple times.
353
354 This is not necessarily true, e.g. it would be safe to eta-expand t2 (but not
355 t1) in the following code:
356
357 let go x = t1
358 t1 = if ... then t2 else ...
359 t2 = if ... then go 1 else ...
360 in go 0
361
362 Detecting this would require finding out what variables are only ever called
363 from thunks. While this is certainly possible, we have yet to see this be
364 relevant in the wild.
365
366
367 Note [Analysing top-level binds]
368 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
369
370 We can eta-expand top-level-binds if they are not exported, as we see all calls
371 to them. The plan is as follows: Treat the top-level binds as nested lets around
372 a body representing “all external calls”, which returns a pessimistic
373 CallArityRes (the co-call graph is the complete graph, all arities 0).
374
375 Note [Trimming arity]
376 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
377
378 In the Call Arity papers, we are working on an untyped lambda calculus with no
379 other id annotations, where eta-expansion is always possible. But this is not
380 the case for Core!
381 1. We need to ensure the invariant
382 callArity e <= typeArity (exprType e)
383 for the same reasons that exprArity needs this invariant (see Note
384 [exprArity invariant] in CoreArity).
385
386 If we are not doing that, a too-high arity annotation will be stored with
387 the id, confusing the simplifier later on.
388
389 2. Eta-expanding a right hand side might invalidate existing annotations. In
390 particular, if an id has a strictness annotation of <...><...>b, then
391 passing two arguments to it will definitely bottom out, so the simplifier
392 will throw away additional parameters. This conflicts with Call Arity! So
393 we ensure that we never eta-expand such a value beyond the number of
394 arguments mentioned in the strictness signature.
395 See #10176 for a real-world-example.
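
The two limits can be summarised in a small stand-alone sketch. The record
`IdFacts` and the function `trimArityToy` are hypothetical and only model the
information that `trimArity` reads off an Id (its type arity and its
strictness signature) as plain data.

```haskell
-- Toy stand-in (assumption): the facts about an Id that matter for trimming.
data IdFacts = IdFacts
  { typeArityOf       :: Int   -- number of value arguments the type allows
  , strSigArgs        :: Int   -- arguments mentioned in the strictness signature
  , divergesAfterArgs :: Bool  -- does the signature end in bottom?
  }

-- Never exceed the type arity, and never go beyond the arguments of a
-- signature that is known to diverge.
trimArityToy :: IdFacts -> Int -> Int
trimArityToy facts a = minimum [a, typeArityOf facts, byStrSig]
  where
    byStrSig
      | divergesAfterArgs facts = strSigArgs facts
      | otherwise               = a
```

For example, an Id whose type allows three arguments but whose signature
diverges after two is trimmed from a requested arity of 3 down to 2.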
396
397 Note [What is a thunk]
398 ~~~~~~~~~~~~~~~~~~~~~~
399
400 Originally, everything that is not in WHNF (`exprIsWHNF`) is considered a
401 thunk, not eta-expanded, to avoid losing any sharing. This is also how the
402 published papers on Call Arity describe it.
403
404 In practice, there are thunks that do just a little work, such as
405 pattern-matching on a variable, and the benefits of eta-expansion likely
406 outweigh the cost of doing that repeatedly. Therefore, this implementation of
407 Call Arity considers everything that is not cheap (`exprIsCheap`) as a thunk.
408
409 Note [Call Arity and Join Points]
410 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
411
412 The Call Arity analysis does not care about join points, and treats them just
413 like normal functions. This is ok.
414
415 The analysis *could* make use of the fact that join points are always evaluated
416 in the same context as the join-binding they are defined in and are always
417 one-shot, and handle join points separately, as suggested in
418 https://ghc.haskell.org/trac/ghc/ticket/13479#comment:10.
419 This *might* be more efficient (for example, join points would not have to be
420 considered interesting variables), but it would also add redundant code. So for
421 now we do not do that.
422
423 The simplifier never eta-expands join points (it instead pushes extra arguments from
424 an eta-expanded context into the join point’s RHS), so the call arity
425 annotation on join points is not actually used. As it would be equally valid
426 (though less efficient) to eta-expand join points, this is the simplifier's
427 choice, and hence Call Arity sets the call arity for join points as well.
428 -}
429
430 -- Main entry point
431
432 callArityAnalProgram :: DynFlags -> CoreProgram -> CoreProgram
433 callArityAnalProgram _dflags binds = binds'
434 where
435 (_, binds') = callArityTopLvl [] emptyVarSet binds
436
437 -- See Note [Analysing top-level binds]
438 callArityTopLvl :: [Var] -> VarSet -> [CoreBind] -> (CallArityRes, [CoreBind])
439 callArityTopLvl exported _ []
440 = ( calledMultipleTimes $ (emptyUnVarGraph, mkVarEnv $ [(v, 0) | v <- exported])
441 , [] )
442 callArityTopLvl exported int1 (b:bs)
443 = (ae2, b':bs')
444 where
445 int2 = bindersOf b
446 exported' = filter isExportedId int2 ++ exported
447 int' = int1 `addInterestingBinds` b
448 (ae1, bs') = callArityTopLvl exported' int' bs
449 (ae2, b') = callArityBind (boringBinds b) ae1 int1 b
450
451
452 callArityRHS :: CoreExpr -> CoreExpr
453 callArityRHS = snd . callArityAnal 0 emptyVarSet
454
455 -- The main analysis function. See Note [Analysis type signature]
456 callArityAnal ::
457 Arity -> -- The arity this expression is called with
458 VarSet -> -- The set of interesting variables
459 CoreExpr -> -- The expression to analyse
460 (CallArityRes, CoreExpr)
461 -- How this expression uses its interesting variables
462 -- and the expression with IdInfo updated
463
464 -- The trivial base cases
465 callArityAnal _ _ e@(Lit _)
466 = (emptyArityRes, e)
467 callArityAnal _ _ e@(Type _)
468 = (emptyArityRes, e)
469 callArityAnal _ _ e@(Coercion _)
470 = (emptyArityRes, e)
471 -- The transparent cases
472 callArityAnal arity int (Tick t e)
473 = second (Tick t) $ callArityAnal arity int e
474 callArityAnal arity int (Cast e co)
475 = second (\e -> Cast e co) $ callArityAnal arity int e
476
477 -- The interesting cases: Variables, Lambdas, Lets, Applications, Cases
478 callArityAnal arity int e@(Var v)
479 | v `elemVarSet` int
480 = (unitArityRes v arity, e)
481 | otherwise
482 = (emptyArityRes, e)
483
484 -- Non-value lambdas are ignored
485 callArityAnal arity int (Lam v e) | not (isId v)
486 = second (Lam v) $ callArityAnal arity (int `delVarSet` v) e
487
488 -- We have a lambda that may be called multiple times, so its free variables
489 -- can all be co-called.
490 callArityAnal 0 int (Lam v e)
491 = (ae', Lam v e')
492 where
493 (ae, e') = callArityAnal 0 (int `delVarSet` v) e
494 ae' = calledMultipleTimes ae
495 -- We have a lambda that we are calling. Decrease arity.
496 callArityAnal arity int (Lam v e)
497 = (ae, Lam v e')
498 where
499 (ae, e') = callArityAnal (arity - 1) (int `delVarSet` v) e
500
501 -- Application. Increase arity for the called expression, nothing to know about
502 -- the second
503 callArityAnal arity int (App e (Type t))
504 = second (\e -> App e (Type t)) $ callArityAnal arity int e
505 callArityAnal arity int (App e1 e2)
506 = (final_ae, App e1' e2')
507 where
508 (ae1, e1') = callArityAnal (arity + 1) int e1
509 (ae2, e2') = callArityAnal 0 int e2
510 -- If the argument is trivial (e.g. a variable), then it will _not_ be
511 -- let-bound in the Core to STG transformation (CorePrep actually),
512 -- so no sharing will happen here, and we have to assume many calls.
513 ae2' | exprIsTrivial e2 = calledMultipleTimes ae2
514 | otherwise = ae2
515 final_ae = ae1 `both` ae2'
516
517 -- Case expression.
518 callArityAnal arity int (Case scrut bndr ty alts)
519 = -- pprTrace "callArityAnal:Case"
520 -- (vcat [ppr scrut, ppr final_ae])
521 (final_ae, Case scrut' bndr ty alts')
522 where
523 (alt_aes, alts') = unzip $ map go alts
524 go (dc, bndrs, e) = let (ae, e') = callArityAnal arity int e
525 in (ae, (dc, bndrs, e'))
526 alt_ae = lubRess alt_aes
527 (scrut_ae, scrut') = callArityAnal 0 int scrut
528 final_ae = scrut_ae `both` alt_ae
529
530 -- For lets, use callArityBind
531 callArityAnal arity int (Let bind e)
532 = -- pprTrace "callArityAnal:Let"
533 -- (vcat [ppr v, ppr arity, ppr n, ppr final_ae ])
534 (final_ae, Let bind' e')
535 where
536 int_body = int `addInterestingBinds` bind
537 (ae_body, e') = callArityAnal arity int_body e
538 (final_ae, bind') = callArityBind (boringBinds bind) ae_body int bind
539
540 -- Which bindings should we look at?
541 -- See Note [Which variables are interesting]
542 isInteresting :: Var -> Bool
543 isInteresting v = not $ null (typeArity (idType v))
544
545 interestingBinds :: CoreBind -> [Var]
546 interestingBinds = filter isInteresting . bindersOf
547
548 boringBinds :: CoreBind -> VarSet
549 boringBinds = mkVarSet . filter (not . isInteresting) . bindersOf
550
551 addInterestingBinds :: VarSet -> CoreBind -> VarSet
552 addInterestingBinds int bind
553 = int `delVarSetList` bindersOf bind -- Possible shadowing
554 `extendVarSetList` interestingBinds bind
555
556 -- Used for both local and top-level binds
557 -- Second argument is the demand from the body
558 callArityBind :: VarSet -> CallArityRes -> VarSet -> CoreBind -> (CallArityRes, CoreBind)
559 -- Non-recursive let
560 callArityBind boring_vars ae_body int (NonRec v rhs)
561 | otherwise
562 = -- pprTrace "callArityBind:NonRec"
563 -- (vcat [ppr v, ppr ae_body, ppr int, ppr ae_rhs, ppr safe_arity])
564 (final_ae, NonRec v' rhs')
565 where
566 is_thunk = not (exprIsCheap rhs) -- See Note [What is a thunk]
567 -- If v is boring, we will not find it in ae_body, but always assume (0, False)
568 boring = v `elemVarSet` boring_vars
569
570 (arity, called_once)
571 | boring = (0, False) -- See Note [Taking boring variables into account]
572 | otherwise = lookupCallArityRes ae_body v
573 safe_arity | called_once = arity
574 | is_thunk = 0 -- A thunk! Do not eta-expand
575 | otherwise = arity
576
577 -- See Note [Trimming arity]
578 trimmed_arity = trimArity v safe_arity
579
580 (ae_rhs, rhs') = callArityAnal trimmed_arity int rhs
581
582
583 ae_rhs'| called_once = ae_rhs
584 | safe_arity == 0 = ae_rhs -- If it is not a function, its body is evaluated only once
585 | otherwise = calledMultipleTimes ae_rhs
586
587 called_by_v = domRes ae_rhs'
588 called_with_v
589 | boring = domRes ae_body
590 | otherwise = calledWith ae_body v `delUnVarSet` v
591 final_ae = addCrossCoCalls called_by_v called_with_v $ ae_rhs' `lubRes` resDel v ae_body
592
593 v' = v `setIdCallArity` trimmed_arity
594
595
596 -- Recursive let. See Note [Recursion and fixpointing]
597 callArityBind boring_vars ae_body int b@(Rec binds)
598 = -- (if length binds > 300 then
599 -- pprTrace "callArityBind:Rec"
600 -- (vcat [ppr (Rec binds'), ppr ae_body, ppr int, ppr ae_rhs]) else id) $
601 (final_ae, Rec binds')
602 where
603 -- See Note [Taking boring variables into account]
604 any_boring = any (`elemVarSet` boring_vars) [ i | (i, _) <- binds]
605
606 int_body = int `addInterestingBinds` b
607 (ae_rhs, binds') = fix initial_binds
608 final_ae = bindersOf b `resDelList` ae_rhs
609
610 initial_binds = [(i,Nothing,e) | (i,e) <- binds]
611
612 fix :: [(Id, Maybe (Bool, Arity, CallArityRes), CoreExpr)] -> (CallArityRes, [(Id, CoreExpr)])
613 fix ann_binds
614 | -- pprTrace "callArityBind:fix" (vcat [ppr ann_binds, ppr any_change, ppr ae]) $
615 any_change
616 = fix ann_binds'
617 | otherwise
618 = (ae, map (\(i, _, e) -> (i, e)) ann_binds')
619 where
620 aes_old = [ (i,ae) | (i, Just (_,_,ae), _) <- ann_binds ]
621 ae = callArityRecEnv any_boring aes_old ae_body
622
623 rerun (i, mbLastRun, rhs)
624 | i `elemVarSet` int_body && not (i `elemUnVarSet` domRes ae)
625 -- No call to this yet, so do nothing
626 = (False, (i, Nothing, rhs))
627
628 | Just (old_called_once, old_arity, _) <- mbLastRun
629 , called_once == old_called_once
630 , new_arity == old_arity
631 -- No change, no need to re-analyze
632 = (False, (i, mbLastRun, rhs))
633
634 | otherwise
635 -- We previously analyzed this with a different arity (or not at all)
636 = let is_thunk = not (exprIsCheap rhs) -- See Note [What is a thunk]
637
638 safe_arity | is_thunk = 0 -- See Note [Thunks in recursive groups]
639 | otherwise = new_arity
640
641 -- See Note [Trimming arity]
642 trimmed_arity = trimArity i safe_arity
643
644 (ae_rhs, rhs') = callArityAnal trimmed_arity int_body rhs
645
646 ae_rhs' | called_once = ae_rhs
647 | safe_arity == 0 = ae_rhs -- If it is not a function, its body is evaluated only once
648 | otherwise = calledMultipleTimes ae_rhs
649
650 i' = i `setIdCallArity` trimmed_arity
651
652 in (True, (i', Just (called_once, new_arity, ae_rhs'), rhs'))
653 where
654 -- See Note [Taking boring variables into account]
655 (new_arity, called_once) | i `elemVarSet` boring_vars = (0, False)
656 | otherwise = lookupCallArityRes ae i
657
658 (changes, ann_binds') = unzip $ map rerun ann_binds
659 any_change = or changes
660
661 -- Combining the results from body and rhs, (mutually) recursive case
662 -- See Note [Analysis II: The Co-Called analysis]
663 callArityRecEnv :: Bool -> [(Var, CallArityRes)] -> CallArityRes -> CallArityRes
664 callArityRecEnv any_boring ae_rhss ae_body
665 = -- (if length ae_rhss > 300 then pprTrace "callArityRecEnv" (vcat [ppr ae_rhss, ppr ae_body, ppr ae_new]) else id) $
666 ae_new
667 where
668 vars = map fst ae_rhss
669
670 ae_combined = lubRess (map snd ae_rhss) `lubRes` ae_body
671
672 cross_calls
673 -- See Note [Taking boring variables into account]
674 | any_boring = completeGraph (domRes ae_combined)
675 -- Also, calculating cross_calls is expensive. Simply be conservative
676 -- if the mutually recursive group becomes too large.
677 | length ae_rhss > 25 = completeGraph (domRes ae_combined)
678 | otherwise = unionUnVarGraphs $ map cross_call ae_rhss
679 cross_call (v, ae_rhs) = completeBipartiteGraph called_by_v called_with_v
680 where
681 is_thunk = idCallArity v == 0
682 -- What rhs are relevant as happening before (or after) calling v?
683 -- If v is a thunk, everything from all the _other_ variables
684 -- If v is not a thunk, everything can happen.
685 ae_before_v | is_thunk = lubRess (map snd $ filter ((/= v) . fst) ae_rhss) `lubRes` ae_body
686 | otherwise = ae_combined
687 -- What do we want to know from these?
688 -- Which calls can happen next to any recursive call.
689 called_with_v
690 = unionUnVarSets $ map (calledWith ae_before_v) vars
691 called_by_v = domRes ae_rhs
692
693 ae_new = first (cross_calls `unionUnVarGraph`) ae_combined
694
695 -- See Note [Trimming arity]
696 trimArity :: Id -> Arity -> Arity
697 trimArity v a = minimum [a, max_arity_by_type, max_arity_by_strsig]
698 where
699 max_arity_by_type = length (typeArity (idType v))
700 max_arity_by_strsig
701 | isBotRes result_info = length demands
702 | otherwise = a
703
704 (demands, result_info) = splitStrictSig (idStrictness v)
705
706 ---------------------------------------
707 -- Functions related to CallArityRes --
708 ---------------------------------------
709
710 -- Result type for the two analyses.
711 -- See Note [Analysis I: The arity analysis]
712 -- and Note [Analysis II: The Co-Called analysis]
713 type CallArityRes = (UnVarGraph, VarEnv Arity)
714
715 emptyArityRes :: CallArityRes
716 emptyArityRes = (emptyUnVarGraph, emptyVarEnv)
717
718 unitArityRes :: Var -> Arity -> CallArityRes
719 unitArityRes v arity = (emptyUnVarGraph, unitVarEnv v arity)
720
721 resDelList :: [Var] -> CallArityRes -> CallArityRes
722 resDelList vs ae = foldr resDel ae vs
723
724 resDel :: Var -> CallArityRes -> CallArityRes
725 resDel v (g, ae) = (g `delNode` v, ae `delVarEnv` v)
726
727 domRes :: CallArityRes -> UnVarSet
728 domRes (_, ae) = varEnvDom ae
729
730 -- In the result, find out the minimum arity and whether the variable is called
731 -- at most once.
732 lookupCallArityRes :: CallArityRes -> Var -> (Arity, Bool)
733 lookupCallArityRes (g, ae) v
734 = case lookupVarEnv ae v of
735 Just a -> (a, not (v `elemUnVarSet` (neighbors g v)))
736 Nothing -> (0, False)
737
738 calledWith :: CallArityRes -> Var -> UnVarSet
739 calledWith (g, _) v = neighbors g v
740
741 addCrossCoCalls :: UnVarSet -> UnVarSet -> CallArityRes -> CallArityRes
742 addCrossCoCalls set1 set2 = first (completeBipartiteGraph set1 set2 `unionUnVarGraph`)
743
744 -- Replaces the co-call graph by a complete graph (i.e. no information)
745 calledMultipleTimes :: CallArityRes -> CallArityRes
746 calledMultipleTimes res = first (const (completeGraph (domRes res))) res
747
748 -- Used for application and cases
749 both :: CallArityRes -> CallArityRes -> CallArityRes
750 both r1 r2 = addCrossCoCalls (domRes r1) (domRes r2) $ r1 `lubRes` r2
751
752 -- Used when combining results from alternative cases; take the minimum
753 lubRes :: CallArityRes -> CallArityRes -> CallArityRes
754 lubRes (g1, ae1) (g2, ae2) = (g1 `unionUnVarGraph` g2, ae1 `lubArityEnv` ae2)
755
756 lubArityEnv :: VarEnv Arity -> VarEnv Arity -> VarEnv Arity
757 lubArityEnv = plusVarEnv_C min
758
759 lubRess :: [CallArityRes] -> CallArityRes
760 lubRess = foldl lubRes emptyArityRes