Call Arity refactoring: instance Outputable Count
[ghc.git] / compiler / simplCore / CallArity.hs
1 --
2 -- Copyright (c) 2014 Joachim Breitner
3 --
4
5 module CallArity
6 ( callArityAnalProgram
7 , callArityRHS -- for testing
8 ) where
9
10 import VarSet
11 import VarEnv
12 import DynFlags ( DynFlags )
13
14 import BasicTypes
15 import CoreSyn
16 import Id
17 import CoreArity ( exprArity, typeArity )
18 import CoreUtils ( exprIsHNF )
19 import Outputable
20
21 import Control.Arrow ( first, second )
22
23
24 {-
25 %************************************************************************
26 %* *
27 Call Arity Analyis
28 %* *
29 %************************************************************************
30
31 Note [Call Arity: The goal]
32 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
33
34 The goal of this analysis is to find out if we can eta-expand a local function,
35 based on how it is being called. The motivating example is code this this,
36 which comes up when we implement foldl using foldr, and do list fusion:
37
38 let go = \x -> let d = case ... of
39 False -> go (x+1)
40 True -> id
41 in \z -> d (x + z)
42 in go 1 0
43
44 If we do not eta-expand `go` to have arity 2, we are going to allocate a lot of
45 partial function applications, which would be bad.
46
47 The function `go` has a type of arity two, but only one lambda is manifest.
48 Further more, an analysis that only looks at the RHS of go cannot be sufficient
49 to eta-expand go: If `go` is ever called with one argument (and the result used
50 multiple times), we would be doing the work in `...` multiple times.
51
52 So `callArityAnalProgram` looks at the whole let expression to figure out if
53 all calls are nice, i.e. have a high enough arity. It then stores the result in
54 the `calledArity` field of the `IdInfo` of `go`, which the next simplifier
55 phase will eta-expand.
56
57 The specification of the `calledArity` field is:
58
59 No work will be lost if you eta-expand me to the arity in `calledArity`.
60
61 The specification of the analysis
62 ---------------------------------
63
64 The analysis only does a conservative approximation, there are plenty of
65 situations where eta-expansion would be ok, but we do not catch it. We are
66 content if all the code that foldl-via-foldr generates is being optimized
67 sufficiently.
68
69 The work-hourse of the analysis is the function `callArityAnal`, with the
70 following type:
71
72 data Count = Many | OnceAndOnly
73 type CallCount = (Count, Arity)
74 type CallArityEnv = VarEnv (CallCount, Arity)
75 callArityAnal ::
76 Arity -> -- The arity this expression is called with
77 VarSet -> -- The set of interesting variables
78 CoreExpr -> -- The expression to analyse
79 (CallArityEnv, CoreExpr)
80
81 and the following specification:
82
83 (callArityEnv, expr') = callArityEnv arity interestingIds expr
84
85 <=>
86
87 Assume the expression `expr` is being passed `arity` arguments. Then it calls
88 the functions mentioned in `interestingIds` according to `callArityEnv`:
89 * The domain of `callArityEnv` is a subset of `interestingIds`.
90 * Any variable from interestingIds that is not mentioned in the `callArityEnv`
91 is absent, i.e. not called at all.
92 * Of all the variables that are mapped to OnceAndOnly by the `callArityEnv`,
93 at most one is being called, at most once, with at least that many
94 arguments.
95 * Variables mapped to Many are called an unknown number of times, but if they
96 are called, then with at least that many arguments.
97 Furthermore, expr' is expr with the callArity field of the `IdInfo` updated.
98
99 The (pointwise) domain is a product domain:
100
101 Many 0
102 | × |
103 OneAndOnly 1
104 |
105 ...
106
107 The at-most-once is important for various reasons:
108
109 1. Consider:
110
111 let n = case .. of .. -- A thunk!
112 in n 0 + n 1
113
114 vs.
115
116 let n = case .. of ..
117 in case .. of T -> n 0
118 F -> n 1
119
120 We are only allowed to eta-expand `n` if it is going to be called at most
121 once in the body of the outer let. So we need to know, for each variable
122 individually, that it is going to be called at most once.
123
124 2. We need to know it for non-thunks as well, because they might call a thunk:
125
126 let n = case .. of ..
127 f x = n (x+1)
128 in f 1 + f 2
129
130 vs.
131
132 let n = case .. of ..
133 f x = n (x+1)
134 in case .. of T -> f 0
135 F -> f 1
136
137 Here, the body of f calls n exactly once, but f itself is being called
138 multiple times, so eta-expansion is not allowed.
139
140 3. We need to know that at most one of the interesting functions is being
141 called, because of recursion. Consider:
142
143 let n = case .. of ..
144 in case .. of
145 True -> let go = \y -> case .. of
146 True -> go (y + n 1)
147 False > n
148 in go 1
149 False -> n
150
151 vs.
152
153 let n = case .. of ..
154 in case .. of
155 True -> let go = \y -> case .. of
156 True -> go (y+1)
157 False > n
158 in go 1
159 False -> n
160
161 In both cases, the body and the rhs of the inner let call n at most once.
162 But only in the second case that holds for the whole expression! The
163 crucial difference is that in the first case, the rhs of `go` can call
164 *both* `go` and `n`, and hence can call `n` multiple times as it recurses,
165 while in the second case it calls `go` or `n`, but not both.
166
167 Note [Which variables are interesting]
168 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
169
170 Unfortunately, the set of interesting variables is not irrelevant for the
171 precision of the analysis. Consider this example (and ignore the pointlessnes
172 of `d` recursing into itself):
173
174 let n = ... :: Int
175 in let d = let d = case ... of
176 False -> d
177 True -> id
178 in \z -> d (x + z)
179 in d 0
180
181 Of course, `d` should be interesting. If we consider `n` as interesting as
182 well, then the body of the second let will return
183 { go |-> (Many, 1) , n |-> (OnceAndOnly, 0) }
184 or
185 { go |-> (OnceAndOnly, 1), n |-> (Many, 0)}.
186 Only the latter is useful, but it is hard to decide that locally.
187 (Returning OnceAndOnly for both would be wrong, as both are being called.)
188
189 So the heuristics is:
190
191 Variables are interesting if their RHS has a lower exprArity than
192 typeArity.
193
194 (which is precisely the those variables where this analysis can actually cause
195 some eta-expansion.)
196
197 But this is not uniformly a win. Consider:
198
199 let go = \x -> let d = case ... of
200 False -> go (x+1)
201 True -> id
202 n x = d (x+1)
203 in \z -> n (x + z)
204 in go n 0
205
206 Now `n` is not going to be considered interesting (its type is `Int -> Int`).
207 But this will prevent us from detecting how often the body of the let calls
208 `d`, and we will not find out anything.
209
210 It might be possible to be smarter here; this needs find-tuning as we find more
211 examples.
212
213
214 Note [Recursion and fixpointing]
215 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
216
217 For a recursive let, we begin by analysing the body, using the same incoming
218 arity as for the whole expression.
219 * We use the arity from the body on the variable as the incoming demand on the
220 rhs. Then we check if the rhs calls itself with the same arity.
221 - If so, we are done.
222 - If not, we re-analise the rhs with the reduced arity. We do that until
223 we are down to the exprArity, which then is certainly correct.
224 * If the rhs calls itself many times, we must (conservatively) pass the result
225 through forgetOnceCalls.
226 * Similarly, if the body calls the variable many times, we must pass the
227 result of the fixpointing through forgetOnceCalls.
228 * Then we can `lubEnv` the results from the body and the rhs: If all mentioned
229 calls are OnceAndOnly calls, then the body calls *either* the rhs *or* one
230 of the other mentioned variables. Similarly, the rhs calls *either* itself
231 again *or* one of the other mentioned variables. This precision is required!
232
233 We do not analyse mutually recursive functions. This can be done once we see it
234 in the wild.
235
236 Note [Case and App: Which side to take?]
237 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
238
239 Combining the case branches is easy, just `lubEnv` them – at most one branch is
240 taken.
241
242 But how to combine that with the information coming from the scrunitee? Very
243 similarly, how to combine the information from the callee and argument of an
244 `App`?
245
246 It would not be correct to just `lubEnv` then: `f n` obviously calls *both* `f`
247 and `n`. We need to forget about the cardinality of calls from one side using
248 `forgetOnceCalls`. But which one?
249
250 Both are correct, and sometimes one and sometimes the other is more precise
251 (also see example in [Which variables are interesting]).
252
253 So currently, we first check the scrunitee (resp. the callee) if the returned
254 value has any usesful information, and if so, we use that; otherwise we use the
255 information from the alternatives (resp. the argument).
256
257 It might be smarter to look for “more important” variables first, i.e. the
258 innermost recursive variable.
259
260 -}
261
262 callArityAnalProgram :: DynFlags -> CoreProgram -> CoreProgram
263 callArityAnalProgram _dflags = map callArityBind
264
265 callArityBind :: CoreBind -> CoreBind
266 callArityBind (NonRec id rhs) = NonRec id (callArityRHS rhs)
267 callArityBind (Rec binds) = Rec $ map (\(id,rhs) -> (id, callArityRHS rhs)) binds
268
269 callArityRHS :: CoreExpr -> CoreExpr
270 callArityRHS = snd . callArityAnal 0 emptyVarSet
271
272
273 data Count = Many | OnceAndOnly deriving (Eq, Ord)
274 type CallCount = (Count, Arity)
275
276 topCallCount :: CallCount
277 topCallCount = (Many, 0)
278
279 type CallArityEnv = VarEnv CallCount
280
281 callArityAnal ::
282 Arity -> -- The arity this expression is called with
283 VarSet -> -- The set of interesting variables
284 CoreExpr -> -- The expression to analyse
285 (CallArityEnv, CoreExpr)
286 -- How this expression uses its interesting variables
287 -- and the expression with IdInfo updated
288
289 -- The trivial base cases
290 callArityAnal _ _ e@(Lit _)
291 = (emptyVarEnv, e)
292 callArityAnal _ _ e@(Type _)
293 = (emptyVarEnv, e)
294 callArityAnal _ _ e@(Coercion _)
295 = (emptyVarEnv, e)
296 -- The transparent cases
297 callArityAnal arity int (Tick t e)
298 = second (Tick t) $ callArityAnal arity int e
299 callArityAnal arity int (Cast e co)
300 = second (\e -> Cast e co) $ callArityAnal arity int e
301
302 -- The interesting case: Variables, Lambdas, Lets, Applications, Cases
303 callArityAnal arity int e@(Var v)
304 | v `elemVarSet` int
305 = (unitVarEnv v (OnceAndOnly, arity), e)
306 | otherwise
307 = (emptyVarEnv, e)
308
309 -- We have a lambda that we are not sure to call. Tail calls therein
310 -- are no longer OneAndOnly calls
311 callArityAnal 0 int (Lam v e)
312 = (ae', Lam v e')
313 where
314 (ae, e') = callArityAnal 0 int e
315 ae' = forgetOnceCalls ae
316 -- We have a lambda that we are calling. decrease arity.
317 callArityAnal arity int (Lam v e)
318 = (ae, Lam v e')
319 where
320 (ae, e') = callArityAnal (arity - 1) int e
321
322 -- Boring non-recursive let, i.e. no eta expansion possible. do not be smart about this
323 -- See Note [Which variables are interesting]
324 callArityAnal arity int (Let (NonRec v rhs) e)
325 | exprArity rhs >= length (typeArity (idType v))
326 = (ae_final, Let (NonRec v rhs') e')
327 where
328 (ae_rhs, rhs') = callArityAnal 0 int rhs
329 (ae_body, e') = callArityAnal arity int e
330 ae_body' = ae_body `delVarEnv` v
331 ae_final = forgetOnceCalls ae_rhs `lubEnv` ae_body'
332
333 -- Non-recursive let. Find out how the body calls the rhs, analise that,
334 -- and combine the results, convervatively using both
335 callArityAnal arity int (Let (NonRec v rhs) e)
336 = -- pprTrace "callArityAnal:LetNonRec"
337 -- (vcat [ppr v, ppr arity, ppr n, ppr final_ae ])
338 (final_ae, Let (NonRec v' rhs') e')
339 where
340 int_body = int `extendVarSet` v
341 (ae_body, e') = callArityAnal arity int_body e
342 callcount = lookupWithDefaultVarEnv ae_body topCallCount v
343
344 (ae_rhs, safe_arity, rhs') = callArityBound callcount int rhs
345 final_ae = ae_rhs `lubEnv` (ae_body `delVarEnv` v)
346 v' = v `setIdCallArity` safe_arity
347
348 -- Boring recursive let, i.e. no eta expansion possible. do not be smart about this
349 callArityAnal arity int (Let (Rec [(v,rhs)]) e)
350 | exprArity rhs >= length (typeArity (idType v))
351 = (ae_final, Let (Rec [(v,rhs')]) e')
352 where
353 (ae_rhs, rhs') = callArityAnal 0 int rhs
354 (ae_body, e') = callArityAnal arity int e
355 ae_final = (forgetOnceCalls ae_rhs `lubEnv` ae_body) `delVarEnv` v
356
357 -- Recursive let.
358 -- See Note [Recursion and fixpointing]
359 callArityAnal arity int (Let (Rec [(v,rhs)]) e)
360 = -- pprTrace "callArityAnal:LetRec"
361 -- (vcat [ppr v, ppr arity, ppr safe_arity, ppr rhs_arity', ppr final_ae ])
362 (final_ae, Let (Rec [(v',rhs')]) e')
363 where
364 int_body = int `extendVarSet` v
365 (ae_body, e') = callArityAnal arity int_body e
366 callcount = lookupWithDefaultVarEnv ae_body topCallCount v
367
368 (ae_rhs, new_arity, rhs') = callArityFix callcount int_body v rhs
369 final_ae = (ae_rhs `lubEnv` ae_body) `delVarEnv` v
370 v' = v `setIdCallArity` new_arity
371
372
373
374 -- Mutual recursion. Do nothing serious here, for now
375 callArityAnal arity int (Let (Rec binds) e)
376 = (final_ae, Let (Rec binds') e')
377 where
378 (aes, binds') = unzip $ map go binds
379 go (i,e) = let (ae,e') = callArityAnal 0 int e
380 in (forgetOnceCalls ae, (i,e'))
381 (ae, e') = callArityAnal arity int e
382 final_ae = foldl lubEnv ae aes `delVarEnvList` map fst binds
383
384 -- Application. Increase arity for the called expresion, nothing to know about
385 -- the second
386 callArityAnal arity int (App e1 e2)
387 = (final_ae, App e1' e2')
388 where
389 (ae1, e1') = callArityAnal (arity + 1) int e1
390 (ae2, e2') = callArityAnal 0 int e2
391 -- See Note [Case and App: Which side to take?]
392 final_ae = ae1 `useBetterOf` ae2
393
394 -- Case expression. Here we decide whether
395 -- we want to look at calls from the scrunitee or the alternatives;
396 -- one of them we set to Nothing.
397 -- Naive idea: If there are interesting calls in the scrunitee,
398 -- zap the alternatives
399 callArityAnal arity int (Case scrut bndr ty alts)
400 = -- pprTrace "callArityAnal:Case"
401 -- (vcat [ppr scrut, ppr final_ae])
402 (final_ae, Case scrut' bndr ty alts')
403 where
404 (alt_aes, alts') = unzip $ map go alts
405 go (dc, bndrs, e) = let (ae, e') = callArityAnal arity int e
406 in (ae, (dc, bndrs, e'))
407 alt_ae = foldl lubEnv emptyVarEnv alt_aes
408 (scrut_ae, scrut') = callArityAnal 0 int scrut
409 -- See Note [Case and App: Which side to take?]
410 final_ae = scrut_ae `useBetterOf` alt_ae
411
412 callArityFix :: CallCount -> VarSet -> Id -> CoreExpr -> (CallArityEnv, Arity, CoreExpr)
413 callArityFix arity int v e
414
415 | arity `lteCallCount` min_arity
416 -- The incoming arity is already lower than the exprArity, so we can
417 -- ignore the arity coming from the RHS
418 = (ae `delVarEnv` v, 0, e')
419
420 | otherwise
421 = if new_arity `ltCallCount` arity
422 -- RHS puts a lower arity on itself, so try that
423 then callArityFix new_arity int v e
424
425 -- RHS calls itself with at least as many arguments as the body of the let: Great!
426 else (ae `delVarEnv` v, safe_arity, e')
427 where
428 (ae, safe_arity, e') = callArityBound arity int e
429 new_arity = lookupWithDefaultVarEnv ae topCallCount v
430 min_arity = (Many, exprArity e)
431
432 -- This is a variant of callArityAnal that takes a CallCount (i.e. an arity with a
433 -- cardinality) and adjust the resulting environment accordingly. It is to be used
434 -- on bound expressions that can possibly be shared.
435 -- It also returns the safe arity used: For a thunk that is called multiple
436 -- times, this will be 0!
437 callArityBound :: CallCount -> VarSet -> CoreExpr -> (CallArityEnv, Arity, CoreExpr)
438 callArityBound (count, arity) int e = (final_ae, safe_arity, e')
439 where
440 is_thunk = not (exprIsHNF e)
441
442 safe_arity | OnceAndOnly <- count = arity
443 | is_thunk = 0 -- A thunk! Do not eta-expand
444 | otherwise = arity
445
446 (ae, e') = callArityAnal safe_arity int e
447
448 final_ae | OnceAndOnly <- count = ae
449 | otherwise = forgetOnceCalls ae
450
451
452 anyGoodCalls :: CallArityEnv -> Bool
453 anyGoodCalls = foldVarEnv ((||) . isOnceCall) False
454
455 isOnceCall :: CallCount -> Bool
456 isOnceCall (OnceAndOnly, _) = True
457 isOnceCall (Many, _) = False
458
459 forgetOnceCalls :: CallArityEnv -> CallArityEnv
460 forgetOnceCalls = mapVarEnv (first (const Many))
461
462 -- See Note [Case and App: Which side to take?]
463 useBetterOf :: CallArityEnv -> CallArityEnv -> CallArityEnv
464 useBetterOf ae1 ae2 | anyGoodCalls ae1 = ae1 `lubEnv` forgetOnceCalls ae2
465 useBetterOf ae1 ae2 | otherwise = forgetOnceCalls ae1 `lubEnv` ae2
466
467 lubCallCount :: CallCount -> CallCount -> CallCount
468 lubCallCount (count1, arity1) (count2, arity2)
469 = (count1 `lubCount` count2, arity1 `min` arity2)
470
471 lubCount :: Count -> Count -> Count
472 lubCount OnceAndOnly OnceAndOnly = OnceAndOnly
473 lubCount _ _ = Many
474
475 lteCallCount :: CallCount -> CallCount -> Bool
476 lteCallCount (count1, arity1) (count2, arity2)
477 = count1 <= count2 && arity1 <= arity2
478
479 ltCallCount :: CallCount -> CallCount -> Bool
480 ltCallCount c1 c2 = c1 `lteCallCount` c2 && c1 /= c2
481
482 -- Used when combining results from alternative cases; take the minimum
483 lubEnv :: CallArityEnv -> CallArityEnv -> CallArityEnv
484 lubEnv = plusVarEnv_C lubCallCount
485
486 instance Outputable Count where
487 ppr Many = text "Many"
488 ppr OnceAndOnly = text "OnceAndOnly"