Call Arity refactoring: Factor out callArityBound
[ghc.git] / compiler / simplCore / CallArity.hs
1 --
2 -- Copyright (c) 2014 Joachim Breitner
3 --
4
5 module CallArity
6 ( callArityAnalProgram
7 , callArityRHS -- for testing
8 ) where
9
10 import VarSet
11 import VarEnv
12 import DynFlags ( DynFlags )
13
14 import BasicTypes
15 import CoreSyn
16 import Id
17 import CoreArity ( exprArity, typeArity )
18 import CoreUtils ( exprIsHNF )
19
20 import Control.Arrow ( first, second )
21
22
23 {-
24 %************************************************************************
25 %* *
26 Call Arity Analyis
27 %* *
28 %************************************************************************
29
30 Note [Call Arity: The goal]
31 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
32
33 The goal of this analysis is to find out if we can eta-expand a local function,
34 based on how it is being called. The motivating example is code this this,
35 which comes up when we implement foldl using foldr, and do list fusion:
36
37 let go = \x -> let d = case ... of
38 False -> go (x+1)
39 True -> id
40 in \z -> d (x + z)
41 in go 1 0
42
43 If we do not eta-expand `go` to have arity 2, we are going to allocate a lot of
44 partial function applications, which would be bad.
45
46 The function `go` has a type of arity two, but only one lambda is manifest.
47 Further more, an analysis that only looks at the RHS of go cannot be sufficient
48 to eta-expand go: If `go` is ever called with one argument (and the result used
49 multiple times), we would be doing the work in `...` multiple times.
50
51 So `callArityAnalProgram` looks at the whole let expression to figure out if
52 all calls are nice, i.e. have a high enough arity. It then stores the result in
53 the `calledArity` field of the `IdInfo` of `go`, which the next simplifier
54 phase will eta-expand.
55
56 The specification of the `calledArity` field is:
57
58 No work will be lost if you eta-expand me to the arity in `calledArity`.
59
60 The specification of the analysis
61 ---------------------------------
62
63 The analysis only does a conservative approximation, there are plenty of
64 situations where eta-expansion would be ok, but we do not catch it. We are
65 content if all the code that foldl-via-foldr generates is being optimized
66 sufficiently.
67
68 The work-hourse of the analysis is the function `callArityAnal`, with the
69 following type:
70
71 data Count = Many | OnceAndOnly
72 type CallCount = (Count, Arity)
73 type CallArityEnv = VarEnv (CallCount, Arity)
74 callArityAnal ::
75 Arity -> -- The arity this expression is called with
76 VarSet -> -- The set of interesting variables
77 CoreExpr -> -- The expression to analyse
78 (CallArityEnv, CoreExpr)
79
80 and the following specification:
81
82 (callArityEnv, expr') = callArityEnv arity interestingIds expr
83
84 <=>
85
86 Assume the expression `expr` is being passed `arity` arguments. Then it calls
87 the functions mentioned in `interestingIds` according to `callArityEnv`:
88 * The domain of `callArityEnv` is a subset of `interestingIds`.
89 * Any variable from interestingIds that is not mentioned in the `callArityEnv`
90 is absent, i.e. not called at all.
91 * Of all the variables that are mapped to OnceAndOnly by the `callArityEnv`,
92 at most one is being called, at most once, with at least that many
93 arguments.
94 * Variables mapped to Many are called an unknown number of times, but if they
95 are called, then with at least that many arguments.
96 Furthermore, expr' is expr with the callArity field of the `IdInfo` updated.
97
98 The (pointwise) domain is a product domain:
99
100 Many 0
101 | × |
102 OneAndOnly 1
103 |
104 ...
105
106 The at-most-once is important for various reasons:
107
108 1. Consider:
109
110 let n = case .. of .. -- A thunk!
111 in n 0 + n 1
112
113 vs.
114
115 let n = case .. of ..
116 in case .. of T -> n 0
117 F -> n 1
118
119 We are only allowed to eta-expand `n` if it is going to be called at most
120 once in the body of the outer let. So we need to know, for each variable
121 individually, that it is going to be called at most once.
122
123 2. We need to know it for non-thunks as well, because they might call a thunk:
124
125 let n = case .. of ..
126 f x = n (x+1)
127 in f 1 + f 2
128
129 vs.
130
131 let n = case .. of ..
132 f x = n (x+1)
133 in case .. of T -> f 0
134 F -> f 1
135
136 Here, the body of f calls n exactly once, but f itself is being called
137 multiple times, so eta-expansion is not allowed.
138
139 3. We need to know that at most one of the interesting functions is being
140 called, because of recursion. Consider:
141
142 let n = case .. of ..
143 in case .. of
144 True -> let go = \y -> case .. of
145 True -> go (y + n 1)
146 False > n
147 in go 1
148 False -> n
149
150 vs.
151
152 let n = case .. of ..
153 in case .. of
154 True -> let go = \y -> case .. of
155 True -> go (y+1)
156 False > n
157 in go 1
158 False -> n
159
160 In both cases, the body and the rhs of the inner let call n at most once.
161 But only in the second case that holds for the whole expression! The
162 crucial difference is that in the first case, the rhs of `go` can call
163 *both* `go` and `n`, and hence can call `n` multiple times as it recurses,
164 while in the second case it calls `go` or `n`, but not both.
165
166 Note [Which variables are interesting]
167 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
168
169 Unfortunately, the set of interesting variables is not irrelevant for the
170 precision of the analysis. Consider this example (and ignore the pointlessnes
171 of `d` recursing into itself):
172
173 let n = ... :: Int
174 in let d = let d = case ... of
175 False -> d
176 True -> id
177 in \z -> d (x + z)
178 in d 0
179
180 Of course, `d` should be interesting. If we consider `n` as interesting as
181 well, then the body of the second let will return
182 { go |-> (Many, 1) , n |-> (OnceAndOnly, 0) }
183 or
184 { go |-> (OnceAndOnly, 1), n |-> (Many, 0)}.
185 Only the latter is useful, but it is hard to decide that locally.
186 (Returning OnceAndOnly for both would be wrong, as both are being called.)
187
188 So the heuristics is:
189
190 Variables are interesting if their RHS has a lower exprArity than
191 typeArity.
192
193 (which is precisely the those variables where this analysis can actually cause
194 some eta-expansion.)
195
196 But this is not uniformly a win. Consider:
197
198 let go = \x -> let d = case ... of
199 False -> go (x+1)
200 True -> id
201 n x = d (x+1)
202 in \z -> n (x + z)
203 in go n 0
204
205 Now `n` is not going to be considered interesting (its type is `Int -> Int`).
206 But this will prevent us from detecting how often the body of the let calls
207 `d`, and we will not find out anything.
208
209 It might be possible to be smarter here; this needs find-tuning as we find more
210 examples.
211
212
213 Note [Recursion and fixpointing]
214 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
215
216 For a recursive let, we begin by analysing the body, using the same incoming
217 arity as for the whole expression.
218 * We use the arity from the body on the variable as the incoming demand on the
219 rhs. Then we check if the rhs calls itself with the same arity.
220 - If so, we are done.
221 - If not, we re-analise the rhs with the reduced arity. We do that until
222 we are down to the exprArity, which then is certainly correct.
223 * If the rhs calls itself many times, we must (conservatively) pass the result
224 through forgetOnceCalls.
225 * Similarly, if the body calls the variable many times, we must pass the
226 result of the fixpointing through forgetOnceCalls.
227 * Then we can `lubEnv` the results from the body and the rhs: If all mentioned
228 calls are OnceAndOnly calls, then the body calls *either* the rhs *or* one
229 of the other mentioned variables. Similarly, the rhs calls *either* itself
230 again *or* one of the other mentioned variables. This precision is required!
231
232 We do not analyse mutually recursive functions. This can be done once we see it
233 in the wild.
234
235 Note [Case and App: Which side to take?]
236 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
237
238 Combining the case branches is easy, just `lubEnv` them – at most one branch is
239 taken.
240
241 But how to combine that with the information coming from the scrunitee? Very
242 similarly, how to combine the information from the callee and argument of an
243 `App`?
244
245 It would not be correct to just `lubEnv` then: `f n` obviously calls *both* `f`
246 and `n`. We need to forget about the cardinality of calls from one side using
247 `forgetOnceCalls`. But which one?
248
249 Both are correct, and sometimes one and sometimes the other is more precise
250 (also see example in [Which variables are interesting]).
251
252 So currently, we first check the scrunitee (resp. the callee) if the returned
253 value has any usesful information, and if so, we use that; otherwise we use the
254 information from the alternatives (resp. the argument).
255
256 It might be smarter to look for “more important” variables first, i.e. the
257 innermost recursive variable.
258
259 -}
260
261 callArityAnalProgram :: DynFlags -> CoreProgram -> CoreProgram
262 callArityAnalProgram _dflags = map callArityBind
263
264 callArityBind :: CoreBind -> CoreBind
265 callArityBind (NonRec id rhs) = NonRec id (callArityRHS rhs)
266 callArityBind (Rec binds) = Rec $ map (\(id,rhs) -> (id, callArityRHS rhs)) binds
267
268 callArityRHS :: CoreExpr -> CoreExpr
269 callArityRHS = snd . callArityAnal 0 emptyVarSet
270
271
272 data Count = Many | OnceAndOnly deriving (Eq, Ord)
273 type CallCount = (Count, Arity)
274
275 topCallCount :: CallCount
276 topCallCount = (Many, 0)
277
278 type CallArityEnv = VarEnv CallCount
279
280 callArityAnal ::
281 Arity -> -- The arity this expression is called with
282 VarSet -> -- The set of interesting variables
283 CoreExpr -> -- The expression to analyse
284 (CallArityEnv, CoreExpr)
285 -- How this expression uses its interesting variables
286 -- and the expression with IdInfo updated
287
288 -- The trivial base cases
289 callArityAnal _ _ e@(Lit _)
290 = (emptyVarEnv, e)
291 callArityAnal _ _ e@(Type _)
292 = (emptyVarEnv, e)
293 callArityAnal _ _ e@(Coercion _)
294 = (emptyVarEnv, e)
295 -- The transparent cases
296 callArityAnal arity int (Tick t e)
297 = second (Tick t) $ callArityAnal arity int e
298 callArityAnal arity int (Cast e co)
299 = second (\e -> Cast e co) $ callArityAnal arity int e
300
301 -- The interesting case: Variables, Lambdas, Lets, Applications, Cases
302 callArityAnal arity int e@(Var v)
303 | v `elemVarSet` int
304 = (unitVarEnv v (OnceAndOnly, arity), e)
305 | otherwise
306 = (emptyVarEnv, e)
307
308 -- We have a lambda that we are not sure to call. Tail calls therein
309 -- are no longer OneAndOnly calls
310 callArityAnal 0 int (Lam v e)
311 = (ae', Lam v e')
312 where
313 (ae, e') = callArityAnal 0 int e
314 ae' = forgetOnceCalls ae
315 -- We have a lambda that we are calling. decrease arity.
316 callArityAnal arity int (Lam v e)
317 = (ae, Lam v e')
318 where
319 (ae, e') = callArityAnal (arity - 1) int e
320
321 -- Boring non-recursive let, i.e. no eta expansion possible. do not be smart about this
322 -- See Note [Which variables are interesting]
323 callArityAnal arity int (Let (NonRec v rhs) e)
324 | exprArity rhs >= length (typeArity (idType v))
325 = (ae_final, Let (NonRec v rhs') e')
326 where
327 (ae_rhs, rhs') = callArityAnal 0 int rhs
328 (ae_body, e') = callArityAnal arity int e
329 ae_body' = ae_body `delVarEnv` v
330 ae_final = forgetOnceCalls ae_rhs `lubEnv` ae_body'
331
332 -- Non-recursive let. Find out how the body calls the rhs, analise that,
333 -- and combine the results, convervatively using both
334 callArityAnal arity int (Let (NonRec v rhs) e)
335 = -- pprTrace "callArityAnal:LetNonRec"
336 -- (vcat [ppr v, ppr arity, ppr n, ppr final_ae ])
337 (final_ae, Let (NonRec v' rhs') e')
338 where
339 int_body = int `extendVarSet` v
340 (ae_body, e') = callArityAnal arity int_body e
341 callcount = lookupWithDefaultVarEnv ae_body topCallCount v
342
343 (ae_rhs, safe_arity, rhs') = callArityBound callcount int rhs
344 final_ae = ae_rhs `lubEnv` (ae_body `delVarEnv` v)
345 v' = v `setIdCallArity` safe_arity
346
347 -- Boring recursive let, i.e. no eta expansion possible. do not be smart about this
348 callArityAnal arity int (Let (Rec [(v,rhs)]) e)
349 | exprArity rhs >= length (typeArity (idType v))
350 = (ae_final, Let (Rec [(v,rhs')]) e')
351 where
352 (ae_rhs, rhs') = callArityAnal 0 int rhs
353 (ae_body, e') = callArityAnal arity int e
354 ae_final = (forgetOnceCalls ae_rhs `lubEnv` ae_body) `delVarEnv` v
355
356 -- Recursive let.
357 -- See Note [Recursion and fixpointing]
358 callArityAnal arity int (Let (Rec [(v,rhs)]) e)
359 = -- pprTrace "callArityAnal:LetRec"
360 -- (vcat [ppr v, ppr arity, ppr safe_arity, ppr rhs_arity', ppr final_ae ])
361 (final_ae, Let (Rec [(v',rhs')]) e')
362 where
363 int_body = int `extendVarSet` v
364 (ae_body, e') = callArityAnal arity int_body e
365 callcount = lookupWithDefaultVarEnv ae_body topCallCount v
366
367 (ae_rhs, new_arity, rhs') = callArityFix callcount int_body v rhs
368 final_ae = (ae_rhs `lubEnv` ae_body) `delVarEnv` v
369 v' = v `setIdCallArity` new_arity
370
371
372
373 -- Mutual recursion. Do nothing serious here, for now
374 callArityAnal arity int (Let (Rec binds) e)
375 = (final_ae, Let (Rec binds') e')
376 where
377 (aes, binds') = unzip $ map go binds
378 go (i,e) = let (ae,e') = callArityAnal 0 int e
379 in (forgetOnceCalls ae, (i,e'))
380 (ae, e') = callArityAnal arity int e
381 final_ae = foldl lubEnv ae aes `delVarEnvList` map fst binds
382
383 -- Application. Increase arity for the called expresion, nothing to know about
384 -- the second
385 callArityAnal arity int (App e1 e2)
386 = (final_ae, App e1' e2')
387 where
388 (ae1, e1') = callArityAnal (arity + 1) int e1
389 (ae2, e2') = callArityAnal 0 int e2
390 -- See Note [Case and App: Which side to take?]
391 final_ae = ae1 `useBetterOf` ae2
392
393 -- Case expression. Here we decide whether
394 -- we want to look at calls from the scrunitee or the alternatives;
395 -- one of them we set to Nothing.
396 -- Naive idea: If there are interesting calls in the scrunitee,
397 -- zap the alternatives
398 callArityAnal arity int (Case scrut bndr ty alts)
399 = -- pprTrace "callArityAnal:Case"
400 -- (vcat [ppr scrut, ppr final_ae])
401 (final_ae, Case scrut' bndr ty alts')
402 where
403 (alt_aes, alts') = unzip $ map go alts
404 go (dc, bndrs, e) = let (ae, e') = callArityAnal arity int e
405 in (ae, (dc, bndrs, e'))
406 alt_ae = foldl lubEnv emptyVarEnv alt_aes
407 (scrut_ae, scrut') = callArityAnal 0 int scrut
408 -- See Note [Case and App: Which side to take?]
409 final_ae = scrut_ae `useBetterOf` alt_ae
410
411 callArityFix :: CallCount -> VarSet -> Id -> CoreExpr -> (CallArityEnv, Arity, CoreExpr)
412 callArityFix arity int v e
413
414 | arity `lteCallCount` min_arity
415 -- The incoming arity is already lower than the exprArity, so we can
416 -- ignore the arity coming from the RHS
417 = (ae `delVarEnv` v, 0, e')
418
419 | otherwise
420 = if new_arity `ltCallCount` arity
421 -- RHS puts a lower arity on itself, so try that
422 then callArityFix new_arity int v e
423
424 -- RHS calls itself with at least as many arguments as the body of the let: Great!
425 else (ae `delVarEnv` v, safe_arity, e')
426 where
427 (ae, safe_arity, e') = callArityBound arity int e
428 new_arity = lookupWithDefaultVarEnv ae topCallCount v
429 min_arity = (Many, exprArity e)
430
431 -- This is a variant of callArityAnal that takes a CallCount (i.e. an arity with a
432 -- cardinality) and adjust the resulting environment accordingly. It is to be used
433 -- on bound expressions that can possibly be shared.
434 -- It also returns the safe arity used: For a thunk that is called multiple
435 -- times, this will be 0!
436 callArityBound :: CallCount -> VarSet -> CoreExpr -> (CallArityEnv, Arity, CoreExpr)
437 callArityBound (count, arity) int e = (final_ae, safe_arity, e')
438 where
439 is_thunk = not (exprIsHNF e)
440
441 safe_arity | OnceAndOnly <- count = arity
442 | is_thunk = 0 -- A thunk! Do not eta-expand
443 | otherwise = arity
444
445 (ae, e') = callArityAnal safe_arity int e
446
447 final_ae | OnceAndOnly <- count = ae
448 | otherwise = forgetOnceCalls ae
449
450
451 anyGoodCalls :: CallArityEnv -> Bool
452 anyGoodCalls = foldVarEnv ((||) . isOnceCall) False
453
454 isOnceCall :: CallCount -> Bool
455 isOnceCall (OnceAndOnly, _) = True
456 isOnceCall (Many, _) = False
457
458 forgetOnceCalls :: CallArityEnv -> CallArityEnv
459 forgetOnceCalls = mapVarEnv (first (const Many))
460
461 -- See Note [Case and App: Which side to take?]
462 useBetterOf :: CallArityEnv -> CallArityEnv -> CallArityEnv
463 useBetterOf ae1 ae2 | anyGoodCalls ae1 = ae1 `lubEnv` forgetOnceCalls ae2
464 useBetterOf ae1 ae2 | otherwise = forgetOnceCalls ae1 `lubEnv` ae2
465
466 lubCallCount :: CallCount -> CallCount -> CallCount
467 lubCallCount (count1, arity1) (count2, arity2)
468 = (count1 `lubCount` count2, arity1 `min` arity2)
469
470 lubCount :: Count -> Count -> Count
471 lubCount OnceAndOnly OnceAndOnly = OnceAndOnly
472 lubCount _ _ = Many
473
474 lteCallCount :: CallCount -> CallCount -> Bool
475 lteCallCount (count1, arity1) (count2, arity2)
476 = count1 <= count2 && arity1 <= arity2
477
478 ltCallCount :: CallCount -> CallCount -> Bool
479 ltCallCount c1 c2 = c1 `lteCallCount` c2 && c1 /= c2
480
481 -- Used when combining results from alternative cases; take the minimum
482 lubEnv :: CallArityEnv -> CallArityEnv -> CallArityEnv
483 lubEnv = plusVarEnv_C lubCallCount
484