Get rid of some stuttering in comments and docs
[ghc.git] / compiler / simplCore / CallArity.hs
1 --
2 -- Copyright (c) 2014 Joachim Breitner
3 --
4
5 module CallArity
6 ( callArityAnalProgram
7 , callArityRHS -- for testing
8 ) where
9
10 import GhcPrelude
11
12 import VarSet
13 import VarEnv
14 import DynFlags ( DynFlags )
15
16 import BasicTypes
17 import CoreSyn
18 import Id
19 import CoreArity ( typeArity )
20 import CoreUtils ( exprIsCheap, exprIsTrivial )
21 import UnVarGraph
22 import Demand
23 import Util
24
25 import Control.Arrow ( first, second )
26
27
28 {-
29 %************************************************************************
30 %* *
31 Call Arity Analysis
32 %* *
33 %************************************************************************
34
35 Note [Call Arity: The goal]
36 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
37
38 The goal of this analysis is to find out if we can eta-expand a local function,
39 based on how it is being called. The motivating example is this code,
40 which comes up when we implement foldl using foldr, and do list fusion:
41
42 let go = \x -> let d = case ... of
43 False -> go (x+1)
44 True -> id
45 in \z -> d (x + z)
46 in go 1 0
47
48 If we do not eta-expand `go` to have arity 2, we are going to allocate a lot of
49 partial function applications, which would be bad.
50
51 The function `go` has a type of arity two, but only one lambda is manifest.
52 Furthermore, an analysis that only looks at the RHS of go cannot be sufficient
53 to eta-expand go: If `go` is ever called with one argument (and the result used
54 multiple times), we would be doing the work in `...` multiple times.
55
56 So `callArityAnalProgram` looks at the whole let expression to figure out if
57 all calls are nice, i.e. have a high enough arity. It then stores the result in
58 the `calledArity` field of the `IdInfo` of `go`, which the next simplifier
59 phase will eta-expand.
60
61 The specification of the `calledArity` field is:
62
63 No work will be lost if you eta-expand me to the arity in `calledArity`.
64
65 What we want to know for a variable
66 -----------------------------------
67
68 For every let-bound variable we'd like to know:
69 1. A lower bound on the arity of all calls to the variable, and
70 2. whether the variable is being called at most once or possible multiple
71 times.
72
73 It is always ok to lower the arity, or pretend that there are multiple calls.
74 In particular, "Minimum arity 0 and possible called multiple times" is always
75 correct.
76
77
78 What we want to know from an expression
79 ---------------------------------------
80
81 In order to obtain that information for variables, we analyze expression and
82 obtain bits of information:
83
84 I. The arity analysis:
85 For every variable, whether it is absent, or called,
86 and if called, which what arity.
87
88 II. The Co-Called analysis:
89 For every two variables, whether there is a possibility that both are being
90 called.
91 We obtain as a special case: For every variables, whether there is a
92 possibility that it is being called twice.
93
94 For efficiency reasons, we gather this information only for a set of
95 *interesting variables*, to avoid spending time on, e.g., variables from pattern matches.
96
97 The two analysis are not completely independent, as a higher arity can improve
98 the information about what variables are being called once or multiple times.
99
100 Note [Analysis I: The arity analysis]
101 ------------------------------------
102
103 The arity analysis is quite straight forward: The information about an
104 expression is an
105 VarEnv Arity
106 where absent variables are bound to Nothing and otherwise to a lower bound to
107 their arity.
108
109 When we analyze an expression, we analyze it with a given context arity.
110 Lambdas decrease and applications increase the incoming arity. Analysizing a
111 variable will put that arity in the environment. In lets or cases all the
112 results from the various subexpressions are lubed, which takes the point-wise
113 minimum (considering Nothing an infinity).
114
115
116 Note [Analysis II: The Co-Called analysis]
117 ------------------------------------------
118
119 The second part is more sophisticated. For reasons explained below, it is not
120 sufficient to simply know how often an expression evaluates a variable. Instead
121 we need to know which variables are possibly called together.
122
123 The data structure here is an undirected graph of variables, which is provided
124 by the abstract
125 UnVarGraph
126
127 It is safe to return a larger graph, i.e. one with more edges. The worst case
128 (i.e. the least useful and always correct result) is the complete graph on all
129 free variables, which means that anything can be called together with anything
130 (including itself).
131
132 Notation for the following:
133 C(e) is the co-called result for e.
134 G₁∪G₂ is the union of two graphs
135 fv is the set of free variables (conveniently the domain of the arity analysis result)
136 S₁×S₂ is the complete bipartite graph { {a,b} | a ∈ S₁, b ∈ S₂ }
137 S² is the complete graph on the set of variables S, S² = S×S
138 C'(e) is a variant for bound expression:
139 If e is called at most once, or it is and stays a thunk (after the analysis),
140 it is simply C(e). Otherwise, the expression can be called multiple times
141 and we return (fv e)²
142
143 The interesting cases of the analysis:
144 * Var v:
145 No other variables are being called.
146 Return {} (the empty graph)
147 * Lambda v e, under arity 0:
148 This means that e can be evaluated many times and we cannot get
149 any useful co-call information.
150 Return (fv e)²
151 * Case alternatives alt₁,alt₂,...:
152 Only one can be execuded, so
153 Return (alt₁ ∪ alt₂ ∪...)
154 * App e₁ e₂ (and analogously Case scrut alts), with non-trivial e₂:
155 We get the results from both sides, with the argument evaluated at most once.
156 Additionally, anything called by e₁ can possibly be called with anything
157 from e₂.
158 Return: C(e₁) ∪ C(e₂) ∪ (fv e₁) × (fv e₂)
159 * App e₁ x:
160 As this is already in A-normal form, CorePrep will not separately lambda
161 bind (and hence share) x. So we conservatively assume multiple calls to x here
162 Return: C(e₁) ∪ (fv e₁) × {x} ∪ {(x,x)}
163 * Let v = rhs in body:
164 In addition to the results from the subexpressions, add all co-calls from
165 everything that the body calls together with v to everthing that is called
166 by v.
167 Return: C'(rhs) ∪ C(body) ∪ (fv rhs) × {v'| {v,v'} ∈ C(body)}
168 * Letrec v₁ = rhs₁ ... vₙ = rhsₙ in body
169 Tricky.
170 We assume that it is really mutually recursive, i.e. that every variable
171 calls one of the others, and that this is strongly connected (otherwise we
172 return an over-approximation, so that's ok), see note [Recursion and fixpointing].
173
174 Let V = {v₁,...vₙ}.
175 Assume that the vs have been analysed with an incoming demand and
176 cardinality consistent with the final result (this is the fixed-pointing).
177 Again we can use the results from all subexpressions.
178 In addition, for every variable vᵢ, we need to find out what it is called
179 with (call this set Sᵢ). There are two cases:
180 * If vᵢ is a function, we need to go through all right-hand-sides and bodies,
181 and collect every variable that is called together with any variable from V:
182 Sᵢ = {v' | j ∈ {1,...,n}, {v',vⱼ} ∈ C'(rhs₁) ∪ ... ∪ C'(rhsₙ) ∪ C(body) }
183 * If vᵢ is a thunk, then its rhs is evaluated only once, so we need to
184 exclude it from this set:
185 Sᵢ = {v' | j ∈ {1,...,n}, j≠i, {v',vⱼ} ∈ C'(rhs₁) ∪ ... ∪ C'(rhsₙ) ∪ C(body) }
186 Finally, combine all this:
187 Return: C(body) ∪
188 C'(rhs₁) ∪ ... ∪ C'(rhsₙ) ∪
189 (fv rhs₁) × S₁) ∪ ... ∪ (fv rhsₙ) × Sₙ)
190
191 Using the result: Eta-Expansion
192 -------------------------------
193
194 We use the result of these two analyses to decide whether we can eta-expand the
195 rhs of a let-bound variable.
196
197 If the variable is already a function (exprIsCheap), and all calls to the
198 variables have a higher arity than the current manifest arity (i.e. the number
199 of lambdas), expand.
200
201 If the variable is a thunk we must be careful: Eta-Expansion will prevent
202 sharing of work, so this is only safe if there is at most one call to the
203 function. Therefore, we check whether {v,v} ∈ G.
204
205 Example:
206
207 let n = case .. of .. -- A thunk!
208 in n 0 + n 1
209
210 vs.
211
212 let n = case .. of ..
213 in case .. of T -> n 0
214 F -> n 1
215
216 We are only allowed to eta-expand `n` if it is going to be called at most
217 once in the body of the outer let. So we need to know, for each variable
218 individually, that it is going to be called at most once.
219
220
221 Why the co-call graph?
222 ----------------------
223
224 Why is it not sufficient to simply remember which variables are called once and
225 which are called multiple times? It would be in the previous example, but consider
226
227 let n = case .. of ..
228 in case .. of
229 True -> let go = \y -> case .. of
230 True -> go (y + n 1)
231 False > n
232 in go 1
233 False -> n
234
235 vs.
236
237 let n = case .. of ..
238 in case .. of
239 True -> let go = \y -> case .. of
240 True -> go (y+1)
241 False > n
242 in go 1
243 False -> n
244
245 In both cases, the body and the rhs of the inner let call n at most once.
246 But only in the second case that holds for the whole expression! The
247 crucial difference is that in the first case, the rhs of `go` can call
248 *both* `go` and `n`, and hence can call `n` multiple times as it recurses,
249 while in the second case find out that `go` and `n` are not called together.
250
251
252 Why co-call information for functions?
253 --------------------------------------
254
255 Although for eta-expansion we need the information only for thunks, we still
256 need to know whether functions are being called once or multiple times, and
257 together with what other functions.
258
259 Example:
260
261 let n = case .. of ..
262 f x = n (x+1)
263 in f 1 + f 2
264
265 vs.
266
267 let n = case .. of ..
268 f x = n (x+1)
269 in case .. of T -> f 0
270 F -> f 1
271
272 Here, the body of f calls n exactly once, but f itself is being called
273 multiple times, so eta-expansion is not allowed.
274
275
276 Note [Analysis type signature]
277 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
278
279 The work-hourse of the analysis is the function `callArityAnal`, with the
280 following type:
281
282 type CallArityRes = (UnVarGraph, VarEnv Arity)
283 callArityAnal ::
284 Arity -> -- The arity this expression is called with
285 VarSet -> -- The set of interesting variables
286 CoreExpr -> -- The expression to analyse
287 (CallArityRes, CoreExpr)
288
289 and the following specification:
290
291 ((coCalls, callArityEnv), expr') = callArityEnv arity interestingIds expr
292
293 <=>
294
295 Assume the expression `expr` is being passed `arity` arguments. Then it holds that
296 * The domain of `callArityEnv` is a subset of `interestingIds`.
297 * Any variable from `interestingIds` that is not mentioned in the `callArityEnv`
298 is absent, i.e. not called at all.
299 * Every call from `expr` to a variable bound to n in `callArityEnv` has at
300 least n value arguments.
301 * For two interesting variables `v1` and `v2`, they are not adjacent in `coCalls`,
302 then in no execution of `expr` both are being called.
303 Furthermore, expr' is expr with the callArity field of the `IdInfo` updated.
304
305
306 Note [Which variables are interesting]
307 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
308
309 The analysis would quickly become prohibitive expensive if we would analyse all
310 variables; for most variables we simply do not care about how often they are
311 called, i.e. variables bound in a pattern match. So interesting are variables that are
312 * top-level or let bound
313 * and possibly functions (typeArity > 0)
314
315 Note [Taking boring variables into account]
316 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
317
318 If we decide that the variable bound in `let x = e1 in e2` is not interesting,
319 the analysis of `e2` will not report anything about `x`. To ensure that
320 `callArityBind` does still do the right thing we have to take that into account
321 everytime we would be lookup up `x` in the analysis result of `e2`.
322 * Instead of calling lookupCallArityRes, we return (0, True), indicating
323 that this variable might be called many times with no arguments.
324 * Instead of checking `calledWith x`, we assume that everything can be called
325 with it.
326 * In the recursive case, when calclulating the `cross_calls`, if there is
327 any boring variable in the recursive group, we ignore all co-call-results
328 and directly go to a very conservative assumption.
329
330 The last point has the nice side effect that the relatively expensive
331 integration of co-call results in a recursive groups is often skipped. This
332 helped to avoid the compile time blowup in some real-world code with large
333 recursive groups (#10293).
334
335 Note [Recursion and fixpointing]
336 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
337
338 For a mutually recursive let, we begin by
339 1. analysing the body, using the same incoming arity as for the whole expression.
340 2. Then we iterate, memoizing for each of the bound variables the last
341 analysis call, i.e. incoming arity, whether it is called once, and the CallArityRes.
342 3. We combine the analysis result from the body and the memoized results for
343 the arguments (if already present).
344 4. For each variable, we find out the incoming arity and whether it is called
345 once, based on the current analysis result. If this differs from the
346 memoized results, we re-analyse the rhs and update the memoized table.
347 5. If nothing had to be reanalyzed, we are done.
348 Otherwise, repeat from step 3.
349
350
351 Note [Thunks in recursive groups]
352 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
353
354 We never eta-expand a thunk in a recursive group, on the grounds that if it is
355 part of a recursive group, then it will be called multiple times.
356
357 This is not necessarily true, e.g. it would be safe to eta-expand t2 (but not
358 t1) in the following code:
359
360 let go x = t1
361 t1 = if ... then t2 else ...
362 t2 = if ... then go 1 else ...
363 in go 0
364
365 Detecting this would require finding out what variables are only ever called
366 from thunks. While this is certainly possible, we yet have to see this to be
367 relevant in the wild.
368
369
370 Note [Analysing top-level binds]
371 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
372
373 We can eta-expand top-level-binds if they are not exported, as we see all calls
374 to them. The plan is as follows: Treat the top-level binds as nested lets around
375 a body representing “all external calls”, which returns a pessimistic
376 CallArityRes (the co-call graph is the complete graph, all arityies 0).
377
378 Note [Trimming arity]
379 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
380
381 In the Call Arity papers, we are working on an untyped lambda calculus with no
382 other id annotations, where eta-expansion is always possible. But this is not
383 the case for Core!
384 1. We need to ensure the invariant
385 callArity e <= typeArity (exprType e)
386 for the same reasons that exprArity needs this invariant (see Note
387 [exprArity invariant] in CoreArity).
388
389 If we are not doing that, a too-high arity annotation will be stored with
390 the id, confusing the simplifier later on.
391
392 2. Eta-expanding a right hand side might invalidate existing annotations. In
393 particular, if an id has a strictness annotation of <...><...>b, then
394 passing two arguments to it will definitely bottom out, so the simplifier
395 will throw away additional parameters. This conflicts with Call Arity! So
396 we ensure that we never eta-expand such a value beyond the number of
397 arguments mentioned in the strictness signature.
398 See #10176 for a real-world-example.
399
400 Note [What is a thunk]
401 ~~~~~~~~~~~~~~~~~~~~~~
402
403 Originally, everything that is not in WHNF (`exprIsWHNF`) is considered a
404 thunk, not eta-expanded, to avoid losing any sharing. This is also how the
405 published papers on Call Arity describe it.
406
407 In practice, there are thunks that do a just little work, such as
408 pattern-matching on a variable, and the benefits of eta-expansion likely
409 oughtweigh the cost of doing that repeatedly. Therefore, this implementation of
410 Call Arity considers everything that is not cheap (`exprIsCheap`) as a thunk.
411
412 Note [Call Arity and Join Points]
413 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
414
415 The Call Arity analysis does not care about join points, and treats them just
416 like normal functions. This is ok.
417
418 The analysis *could* make use of the fact that join points are always evaluated
419 in the same context as the join-binding they are defined in and are always
420 one-shot, and handle join points separately, as suggested in
421 https://ghc.haskell.org/trac/ghc/ticket/13479#comment:10.
422 This *might* be more efficient (for example, join points would not have to be
423 considered interesting variables), but it would also add redundant code. So for
424 now we do not do that.
425
426 The simplifier never eta-expands join points (it instead pushes extra arguments from
427 an eta-expanded context into the join point’s RHS), so the call arity
428 annotation on join points is not actually used. As it would be equally valid
429 (though less efficient) to eta-expand join points, this is the simplifier's
430 choice, and hence Call Arity sets the call arity for join points as well.
431 -}
432
433 -- Main entry point
434
435 callArityAnalProgram :: DynFlags -> CoreProgram -> CoreProgram
436 callArityAnalProgram _dflags binds = binds'
437 where
438 (_, binds') = callArityTopLvl [] emptyVarSet binds
439
440 -- See Note [Analysing top-level-binds]
441 callArityTopLvl :: [Var] -> VarSet -> [CoreBind] -> (CallArityRes, [CoreBind])
442 callArityTopLvl exported _ []
443 = ( calledMultipleTimes $ (emptyUnVarGraph, mkVarEnv $ [(v, 0) | v <- exported])
444 , [] )
445 callArityTopLvl exported int1 (b:bs)
446 = (ae2, b':bs')
447 where
448 int2 = bindersOf b
449 exported' = filter isExportedId int2 ++ exported
450 int' = int1 `addInterestingBinds` b
451 (ae1, bs') = callArityTopLvl exported' int' bs
452 (ae2, b') = callArityBind (boringBinds b) ae1 int1 b
453
454
455 callArityRHS :: CoreExpr -> CoreExpr
456 callArityRHS = snd . callArityAnal 0 emptyVarSet
457
458 -- The main analysis function. See Note [Analysis type signature]
459 callArityAnal ::
460 Arity -> -- The arity this expression is called with
461 VarSet -> -- The set of interesting variables
462 CoreExpr -> -- The expression to analyse
463 (CallArityRes, CoreExpr)
464 -- How this expression uses its interesting variables
465 -- and the expression with IdInfo updated
466
467 -- The trivial base cases
468 callArityAnal _ _ e@(Lit _)
469 = (emptyArityRes, e)
470 callArityAnal _ _ e@(Type _)
471 = (emptyArityRes, e)
472 callArityAnal _ _ e@(Coercion _)
473 = (emptyArityRes, e)
474 -- The transparent cases
475 callArityAnal arity int (Tick t e)
476 = second (Tick t) $ callArityAnal arity int e
477 callArityAnal arity int (Cast e co)
478 = second (\e -> Cast e co) $ callArityAnal arity int e
479
480 -- The interesting case: Variables, Lambdas, Lets, Applications, Cases
481 callArityAnal arity int e@(Var v)
482 | v `elemVarSet` int
483 = (unitArityRes v arity, e)
484 | otherwise
485 = (emptyArityRes, e)
486
487 -- Non-value lambdas are ignored
488 callArityAnal arity int (Lam v e) | not (isId v)
489 = second (Lam v) $ callArityAnal arity (int `delVarSet` v) e
490
491 -- We have a lambda that may be called multiple times, so its free variables
492 -- can all be co-called.
493 callArityAnal 0 int (Lam v e)
494 = (ae', Lam v e')
495 where
496 (ae, e') = callArityAnal 0 (int `delVarSet` v) e
497 ae' = calledMultipleTimes ae
498 -- We have a lambda that we are calling. decrease arity.
499 callArityAnal arity int (Lam v e)
500 = (ae, Lam v e')
501 where
502 (ae, e') = callArityAnal (arity - 1) (int `delVarSet` v) e
503
504 -- Application. Increase arity for the called expression, nothing to know about
505 -- the second
506 callArityAnal arity int (App e (Type t))
507 = second (\e -> App e (Type t)) $ callArityAnal arity int e
508 callArityAnal arity int (App e1 e2)
509 = (final_ae, App e1' e2')
510 where
511 (ae1, e1') = callArityAnal (arity + 1) int e1
512 (ae2, e2') = callArityAnal 0 int e2
513 -- If the argument is trivial (e.g. a variable), then it will _not_ be
514 -- let-bound in the Core to STG transformation (CorePrep actually),
515 -- so no sharing will happen here, and we have to assume many calls.
516 ae2' | exprIsTrivial e2 = calledMultipleTimes ae2
517 | otherwise = ae2
518 final_ae = ae1 `both` ae2'
519
520 -- Case expression.
521 callArityAnal arity int (Case scrut bndr ty alts)
522 = -- pprTrace "callArityAnal:Case"
523 -- (vcat [ppr scrut, ppr final_ae])
524 (final_ae, Case scrut' bndr ty alts')
525 where
526 (alt_aes, alts') = unzip $ map go alts
527 go (dc, bndrs, e) = let (ae, e') = callArityAnal arity int e
528 in (ae, (dc, bndrs, e'))
529 alt_ae = lubRess alt_aes
530 (scrut_ae, scrut') = callArityAnal 0 int scrut
531 final_ae = scrut_ae `both` alt_ae
532
533 -- For lets, use callArityBind
534 callArityAnal arity int (Let bind e)
535 = -- pprTrace "callArityAnal:Let"
536 -- (vcat [ppr v, ppr arity, ppr n, ppr final_ae ])
537 (final_ae, Let bind' e')
538 where
539 int_body = int `addInterestingBinds` bind
540 (ae_body, e') = callArityAnal arity int_body e
541 (final_ae, bind') = callArityBind (boringBinds bind) ae_body int bind
542
543 -- Which bindings should we look at?
544 -- See Note [Which variables are interesting]
545 isInteresting :: Var -> Bool
546 isInteresting v = not $ null (typeArity (idType v))
547
548 interestingBinds :: CoreBind -> [Var]
549 interestingBinds = filter isInteresting . bindersOf
550
551 boringBinds :: CoreBind -> VarSet
552 boringBinds = mkVarSet . filter (not . isInteresting) . bindersOf
553
554 addInterestingBinds :: VarSet -> CoreBind -> VarSet
555 addInterestingBinds int bind
556 = int `delVarSetList` bindersOf bind -- Possible shadowing
557 `extendVarSetList` interestingBinds bind
558
559 -- Used for both local and top-level binds
560 -- Second argument is the demand from the body
561 callArityBind :: VarSet -> CallArityRes -> VarSet -> CoreBind -> (CallArityRes, CoreBind)
562 -- Non-recursive let
563 callArityBind boring_vars ae_body int (NonRec v rhs)
564 | otherwise
565 = -- pprTrace "callArityBind:NonRec"
566 -- (vcat [ppr v, ppr ae_body, ppr int, ppr ae_rhs, ppr safe_arity])
567 (final_ae, NonRec v' rhs')
568 where
569 is_thunk = not (exprIsCheap rhs) -- see note [What is a thunk]
570 -- If v is boring, we will not find it in ae_body, but always assume (0, False)
571 boring = v `elemVarSet` boring_vars
572
573 (arity, called_once)
574 | boring = (0, False) -- See Note [Taking boring variables into account]
575 | otherwise = lookupCallArityRes ae_body v
576 safe_arity | called_once = arity
577 | is_thunk = 0 -- A thunk! Do not eta-expand
578 | otherwise = arity
579
580 -- See Note [Trimming arity]
581 trimmed_arity = trimArity v safe_arity
582
583 (ae_rhs, rhs') = callArityAnal trimmed_arity int rhs
584
585
586 ae_rhs'| called_once = ae_rhs
587 | safe_arity == 0 = ae_rhs -- If it is not a function, its body is evaluated only once
588 | otherwise = calledMultipleTimes ae_rhs
589
590 called_by_v = domRes ae_rhs'
591 called_with_v
592 | boring = domRes ae_body
593 | otherwise = calledWith ae_body v `delUnVarSet` v
594 final_ae = addCrossCoCalls called_by_v called_with_v $ ae_rhs' `lubRes` resDel v ae_body
595
596 v' = v `setIdCallArity` trimmed_arity
597
598
599 -- Recursive let. See Note [Recursion and fixpointing]
600 callArityBind boring_vars ae_body int b@(Rec binds)
601 = -- (if length binds > 300 then
602 -- pprTrace "callArityBind:Rec"
603 -- (vcat [ppr (Rec binds'), ppr ae_body, ppr int, ppr ae_rhs]) else id) $
604 (final_ae, Rec binds')
605 where
606 -- See Note [Taking boring variables into account]
607 any_boring = any (`elemVarSet` boring_vars) [ i | (i, _) <- binds]
608
609 int_body = int `addInterestingBinds` b
610 (ae_rhs, binds') = fix initial_binds
611 final_ae = bindersOf b `resDelList` ae_rhs
612
613 initial_binds = [(i,Nothing,e) | (i,e) <- binds]
614
615 fix :: [(Id, Maybe (Bool, Arity, CallArityRes), CoreExpr)] -> (CallArityRes, [(Id, CoreExpr)])
616 fix ann_binds
617 | -- pprTrace "callArityBind:fix" (vcat [ppr ann_binds, ppr any_change, ppr ae]) $
618 any_change
619 = fix ann_binds'
620 | otherwise
621 = (ae, map (\(i, _, e) -> (i, e)) ann_binds')
622 where
623 aes_old = [ (i,ae) | (i, Just (_,_,ae), _) <- ann_binds ]
624 ae = callArityRecEnv any_boring aes_old ae_body
625
626 rerun (i, mbLastRun, rhs)
627 | i `elemVarSet` int_body && not (i `elemUnVarSet` domRes ae)
628 -- No call to this yet, so do nothing
629 = (False, (i, Nothing, rhs))
630
631 | Just (old_called_once, old_arity, _) <- mbLastRun
632 , called_once == old_called_once
633 , new_arity == old_arity
634 -- No change, no need to re-analyze
635 = (False, (i, mbLastRun, rhs))
636
637 | otherwise
638 -- We previously analyzed this with a different arity (or not at all)
639 = let is_thunk = not (exprIsCheap rhs) -- see note [What is a thunk]
640
641 safe_arity | is_thunk = 0 -- See Note [Thunks in recursive groups]
642 | otherwise = new_arity
643
644 -- See Note [Trimming arity]
645 trimmed_arity = trimArity i safe_arity
646
647 (ae_rhs, rhs') = callArityAnal trimmed_arity int_body rhs
648
649 ae_rhs' | called_once = ae_rhs
650 | safe_arity == 0 = ae_rhs -- If it is not a function, its body is evaluated only once
651 | otherwise = calledMultipleTimes ae_rhs
652
653 i' = i `setIdCallArity` trimmed_arity
654
655 in (True, (i', Just (called_once, new_arity, ae_rhs'), rhs'))
656 where
657 -- See Note [Taking boring variables into account]
658 (new_arity, called_once) | i `elemVarSet` boring_vars = (0, False)
659 | otherwise = lookupCallArityRes ae i
660
661 (changes, ann_binds') = unzip $ map rerun ann_binds
662 any_change = or changes
663
664 -- Combining the results from body and rhs, (mutually) recursive case
665 -- See Note [Analysis II: The Co-Called analysis]
666 callArityRecEnv :: Bool -> [(Var, CallArityRes)] -> CallArityRes -> CallArityRes
667 callArityRecEnv any_boring ae_rhss ae_body
668 = -- (if length ae_rhss > 300 then pprTrace "callArityRecEnv" (vcat [ppr ae_rhss, ppr ae_body, ppr ae_new]) else id) $
669 ae_new
670 where
671 vars = map fst ae_rhss
672
673 ae_combined = lubRess (map snd ae_rhss) `lubRes` ae_body
674
675 cross_calls
676 -- See Note [Taking boring variables into account]
677 | any_boring = completeGraph (domRes ae_combined)
678 -- Also, calculating cross_calls is expensive. Simply be conservative
679 -- if the mutually recursive group becomes too large.
680 | lengthExceeds ae_rhss 25 = completeGraph (domRes ae_combined)
681 | otherwise = unionUnVarGraphs $ map cross_call ae_rhss
682 cross_call (v, ae_rhs) = completeBipartiteGraph called_by_v called_with_v
683 where
684 is_thunk = idCallArity v == 0
685 -- What rhs are relevant as happening before (or after) calling v?
686 -- If v is a thunk, everything from all the _other_ variables
687 -- If v is not a thunk, everything can happen.
688 ae_before_v | is_thunk = lubRess (map snd $ filter ((/= v) . fst) ae_rhss) `lubRes` ae_body
689 | otherwise = ae_combined
690 -- What do we want to know from these?
691 -- Which calls can happen next to any recursive call.
692 called_with_v
693 = unionUnVarSets $ map (calledWith ae_before_v) vars
694 called_by_v = domRes ae_rhs
695
696 ae_new = first (cross_calls `unionUnVarGraph`) ae_combined
697
698 -- See Note [Trimming arity]
699 trimArity :: Id -> Arity -> Arity
700 trimArity v a = minimum [a, max_arity_by_type, max_arity_by_strsig]
701 where
702 max_arity_by_type = length (typeArity (idType v))
703 max_arity_by_strsig
704 | isBotRes result_info = length demands
705 | otherwise = a
706
707 (demands, result_info) = splitStrictSig (idStrictness v)
708
709 ---------------------------------------
710 -- Functions related to CallArityRes --
711 ---------------------------------------
712
713 -- Result type for the two analyses.
714 -- See Note [Analysis I: The arity analysis]
715 -- and Note [Analysis II: The Co-Called analysis]
716 type CallArityRes = (UnVarGraph, VarEnv Arity)
717
718 emptyArityRes :: CallArityRes
719 emptyArityRes = (emptyUnVarGraph, emptyVarEnv)
720
721 unitArityRes :: Var -> Arity -> CallArityRes
722 unitArityRes v arity = (emptyUnVarGraph, unitVarEnv v arity)
723
724 resDelList :: [Var] -> CallArityRes -> CallArityRes
725 resDelList vs ae = foldr resDel ae vs
726
727 resDel :: Var -> CallArityRes -> CallArityRes
728 resDel v (g, ae) = (g `delNode` v, ae `delVarEnv` v)
729
730 domRes :: CallArityRes -> UnVarSet
731 domRes (_, ae) = varEnvDom ae
732
733 -- In the result, find out the minimum arity and whether the variable is called
734 -- at most once.
735 lookupCallArityRes :: CallArityRes -> Var -> (Arity, Bool)
736 lookupCallArityRes (g, ae) v
737 = case lookupVarEnv ae v of
738 Just a -> (a, not (v `elemUnVarSet` (neighbors g v)))
739 Nothing -> (0, False)
740
741 calledWith :: CallArityRes -> Var -> UnVarSet
742 calledWith (g, _) v = neighbors g v
743
744 addCrossCoCalls :: UnVarSet -> UnVarSet -> CallArityRes -> CallArityRes
745 addCrossCoCalls set1 set2 = first (completeBipartiteGraph set1 set2 `unionUnVarGraph`)
746
747 -- Replaces the co-call graph by a complete graph (i.e. no information)
748 calledMultipleTimes :: CallArityRes -> CallArityRes
749 calledMultipleTimes res = first (const (completeGraph (domRes res))) res
750
751 -- Used for application and cases
752 both :: CallArityRes -> CallArityRes -> CallArityRes
753 both r1 r2 = addCrossCoCalls (domRes r1) (domRes r2) $ r1 `lubRes` r2
754
755 -- Used when combining results from alternative cases; take the minimum
756 lubRes :: CallArityRes -> CallArityRes -> CallArityRes
757 lubRes (g1, ae1) (g2, ae2) = (g1 `unionUnVarGraph` g2, ae1 `lubArityEnv` ae2)
758
759 lubArityEnv :: VarEnv Arity -> VarEnv Arity -> VarEnv Arity
760 lubArityEnv = plusVarEnv_C min
761
762 lubRess :: [CallArityRes] -> CallArityRes
763 lubRess = foldl lubRes emptyArityRes