1 --

2 -- Copyright (c) 2014 Joachim Breitner

3 --

5 module CallArity

6 ( callArityAnalProgram

10 import VarSet

11 import VarEnv

14 import BasicTypes

15 import CoreSyn

16 import Id

19 import Outputable

24 {-

25 %************************************************************************

26 %* *

27 Call Arity Analyis

28 %* *

29 %************************************************************************

31 Note [Call Arity: The goal]

32 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

34 The goal of this analysis is to find out if we can eta-expand a local function,

35 based on how it is being called. The motivating example is code this this,

36 which comes up when we implement foldl using foldr, and do list fusion:

38 let go = \x -> let d = case ... of

39 False -> go (x+1)

40 True -> id

41 in \z -> d (x + z)

42 in go 1 0

44 If we do not eta-expand `go` to have arity 2, we are going to allocate a lot of

45 partial function applications, which would be bad.

47 The function `go` has a type of arity two, but only one lambda is manifest.

48 Further more, an analysis that only looks at the RHS of go cannot be sufficient

49 to eta-expand go: If `go` is ever called with one argument (and the result used

50 multiple times), we would be doing the work in `...` multiple times.

52 So `callArityAnalProgram` looks at the whole let expression to figure out if

53 all calls are nice, i.e. have a high enough arity. It then stores the result in

54 the `calledArity` field of the `IdInfo` of `go`, which the next simplifier

55 phase will eta-expand.

57 The specification of the `calledArity` field is:

59 No work will be lost if you eta-expand me to the arity in `calledArity`.

61 The specification of the analysis

62 ---------------------------------

64 The analysis only does a conservative approximation, there are plenty of

65 situations where eta-expansion would be ok, but we do not catch it. We are

66 content if all the code that foldl-via-foldr generates is being optimized

67 sufficiently.

69 The work-hourse of the analysis is the function `callArityAnal`, with the

70 following type:

72 data Count = Many | OnceAndOnly

73 type CallCount = (Count, Arity)

74 type CallArityEnv = VarEnv (CallCount, Arity)

75 callArityAnal ::

76 Arity -> -- The arity this expression is called with

77 VarSet -> -- The set of interesting variables

78 CoreExpr -> -- The expression to analyse

79 (CallArityEnv, CoreExpr)

81 and the following specification:

83 (callArityEnv, expr') = callArityEnv arity interestingIds expr

85 <=>

87 Assume the expression `expr` is being passed `arity` arguments. Then it calls

88 the functions mentioned in `interestingIds` according to `callArityEnv`:

89 * The domain of `callArityEnv` is a subset of `interestingIds`.

90 * Any variable from interestingIds that is not mentioned in the `callArityEnv`

91 is absent, i.e. not called at all.

92 * Of all the variables that are mapped to OnceAndOnly by the `callArityEnv`,

93 at most one is being called, at most once, with at least that many

94 arguments.

95 * Variables mapped to Many are called an unknown number of times, but if they

96 are called, then with at least that many arguments.

97 Furthermore, expr' is expr with the callArity field of the `IdInfo` updated.

99 The (pointwise) domain is a product domain:

101 Many 0

102 | × |

103 OneAndOnly 1

104 |

105 ...

107 The at-most-once is important for various reasons:

109 1. Consider:

111 let n = case .. of .. -- A thunk!

112 in n 0 + n 1

114 vs.

116 let n = case .. of ..

117 in case .. of T -> n 0

118 F -> n 1

120 We are only allowed to eta-expand `n` if it is going to be called at most

121 once in the body of the outer let. So we need to know, for each variable

122 individually, that it is going to be called at most once.

124 2. We need to know it for non-thunks as well, because they might call a thunk:

126 let n = case .. of ..

127 f x = n (x+1)

128 in f 1 + f 2

130 vs.

132 let n = case .. of ..

133 f x = n (x+1)

134 in case .. of T -> f 0

135 F -> f 1

137 Here, the body of f calls n exactly once, but f itself is being called

138 multiple times, so eta-expansion is not allowed.

140 3. We need to know that at most one of the interesting functions is being

141 called, because of recursion. Consider:

143 let n = case .. of ..

144 in case .. of

145 True -> let go = \y -> case .. of

146 True -> go (y + n 1)

147 False > n

148 in go 1

149 False -> n

151 vs.

153 let n = case .. of ..

154 in case .. of

155 True -> let go = \y -> case .. of

156 True -> go (y+1)

157 False > n

158 in go 1

159 False -> n

161 In both cases, the body and the rhs of the inner let call n at most once.

162 But only in the second case that holds for the whole expression! The

163 crucial difference is that in the first case, the rhs of `go` can call

164 *both* `go` and `n`, and hence can call `n` multiple times as it recurses,

165 while in the second case it calls `go` or `n`, but not both.

167 Note [Which variables are interesting]

168 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

170 Unfortunately, the set of interesting variables is not irrelevant for the

171 precision of the analysis. Consider this example (and ignore the pointlessnes

172 of `d` recursing into itself):

174 let n = ... :: Int

175 in let d = let d = case ... of

176 False -> d

177 True -> id

178 in \z -> d (x + z)

179 in d 0

181 Of course, `d` should be interesting. If we consider `n` as interesting as

182 well, then the body of the second let will return

183 { go |-> (Many, 1) , n |-> (OnceAndOnly, 0) }

184 or

185 { go |-> (OnceAndOnly, 1), n |-> (Many, 0)}.

186 Only the latter is useful, but it is hard to decide that locally.

187 (Returning OnceAndOnly for both would be wrong, as both are being called.)

189 So the heuristics is:

191 Variables are interesting if their RHS has a lower exprArity than

192 typeArity.

194 (which is precisely the those variables where this analysis can actually cause

195 some eta-expansion.)

197 But this is not uniformly a win. Consider:

199 let go = \x -> let d = case ... of

200 False -> go (x+1)

201 True -> id

202 n x = d (x+1)

203 in \z -> n (x + z)

204 in go n 0

206 Now `n` is not going to be considered interesting (its type is `Int -> Int`).

207 But this will prevent us from detecting how often the body of the let calls

208 `d`, and we will not find out anything.

210 It might be possible to be smarter here; this needs find-tuning as we find more

211 examples.

214 Note [Recursion and fixpointing]

215 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

217 For a recursive let, we begin by analysing the body, using the same incoming

218 arity as for the whole expression.

219 * We use the arity from the body on the variable as the incoming demand on the

220 rhs. Then we check if the rhs calls itself with the same arity.

221 - If so, we are done.

222 - If not, we re-analise the rhs with the reduced arity. We do that until

223 we are down to the exprArity, which then is certainly correct.

224 * If the rhs calls itself many times, we must (conservatively) pass the result

225 through forgetOnceCalls.

226 * Similarly, if the body calls the variable many times, we must pass the

227 result of the fixpointing through forgetOnceCalls.

228 * Then we can `lubEnv` the results from the body and the rhs: If all mentioned

229 calls are OnceAndOnly calls, then the body calls *either* the rhs *or* one

230 of the other mentioned variables. Similarly, the rhs calls *either* itself

231 again *or* one of the other mentioned variables. This precision is required!

233 We do not analyse mutually recursive functions. This can be done once we see it

234 in the wild.

236 Note [Case and App: Which side to take?]

237 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

239 Combining the case branches is easy, just `lubEnv` them – at most one branch is

240 taken.

242 But how to combine that with the information coming from the scrunitee? Very

243 similarly, how to combine the information from the callee and argument of an

244 `App`?

246 It would not be correct to just `lubEnv` then: `f n` obviously calls *both* `f`

247 and `n`. We need to forget about the cardinality of calls from one side using

248 `forgetOnceCalls`. But which one?

250 Both are correct, and sometimes one and sometimes the other is more precise

251 (also see example in [Which variables are interesting]).

253 So currently, we first check the scrunitee (resp. the callee) if the returned

254 value has any usesful information, and if so, we use that; otherwise we use the

255 information from the alternatives (resp. the argument).

257 It might be smarter to look for “more important” variables first, i.e. the

258 innermost recursive variable.

260 -}

276 topCallCount :: CallCount

281 callArityAnal ::

286 -- How this expression uses its interesting variables

287 -- and the expression with IdInfo updated

289 -- The trivial base cases

296 -- The transparent cases

302 -- The interesting case: Variables, Lambdas, Lets, Applications, Cases

304 | v `elemVarSet` int

306 | otherwise

309 -- We have a lambda that we are not sure to call. Tail calls therein

310 -- are no longer OneAndOnly calls

313 where

315 ae' = forgetOnceCalls ae

316 -- We have a lambda that we are calling. decrease arity.

319 where

322 -- Boring non-recursive let, i.e. no eta expansion possible. do not be smart about this

323 -- See Note [Which variables are interesting]

327 where

330 ae_body' = ae_body `delVarEnv` v

333 -- Non-recursive let. Find out how the body calls the rhs, analise that,

334 -- and combine the results, convervatively using both

337 -- (vcat [ppr v, ppr arity, ppr n, ppr final_ae ])

339 where

340 int_body = int `extendVarSet` v

342 callcount = lookupWithDefaultVarEnv ae_body topCallCount v

346 v' = v `setIdCallArity` safe_arity

348 -- Boring recursive let, i.e. no eta expansion possible. do not be smart about this

352 where

357 -- Recursive let.

358 -- See Note [Recursion and fixpointing]

361 -- (vcat [ppr v, ppr arity, ppr safe_arity, ppr rhs_arity', ppr final_ae ])

363 where

364 int_body = int `extendVarSet` v

366 callcount = lookupWithDefaultVarEnv ae_body topCallCount v

370 v' = v `setIdCallArity` new_arity

374 -- Mutual recursion. Do nothing serious here, for now

377 where

384 -- Application. Increase arity for the called expresion, nothing to know about

385 -- the second

388 where

391 -- See Note [Case and App: Which side to take?]

392 final_ae = ae1 `useBetterOf` ae2

394 -- Case expression. Here we decide whether

395 -- we want to look at calls from the scrunitee or the alternatives;

396 -- one of them we set to Nothing.

397 -- Naive idea: If there are interesting calls in the scrunitee,

398 -- zap the alternatives

401 -- (vcat [ppr scrut, ppr final_ae])

403 where

409 -- See Note [Case and App: Which side to take?]

410 final_ae = scrut_ae `useBetterOf` alt_ae

413 callArityFix arity int v e

415 | arity `lteCallCount` min_arity

416 -- The incoming arity is already lower than the exprArity, so we can

417 -- ignore the arity coming from the RHS

420 | otherwise

422 -- RHS puts a lower arity on itself, so try that

423 then callArityFix new_arity int v e

425 -- RHS calls itself with at least as many arguments as the body of the let: Great!

427 where

429 new_arity = lookupWithDefaultVarEnv ae topCallCount v

432 -- This is a variant of callArityAnal that takes a CallCount (i.e. an arity with a

433 -- cardinality) and adjust the resulting environment accordingly. It is to be used

434 -- on bound expressions that can possibly be shared.

435 -- It also returns the safe arity used: For a thunk that is called multiple

436 -- times, this will be 0!

439 where

462 -- See Note [Case and App: Which side to take?]

464 useBetterOf ae1 ae2 | anyGoodCalls ae1 = ae1 `lubEnv` forgetOnceCalls ae2

472 lubCount OnceAndOnly OnceAndOnly = OnceAndOnly

473 lubCount _ _ = Many

482 -- Used when combining results from alternative cases; take the minimum

484 lubEnv = plusVarEnv_C lubCallCount