Typos in comments [ci skip]
[ghc.git] / compiler / stranal / DmdAnal.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4
5 -----------------
6 A demand analysis
7 -----------------
8 -}
9
10 {-# LANGUAGE CPP #-}
11
12 module DmdAnal ( dmdAnalProgram ) where
13
14 #include "HsVersions.h"
15
16 import DynFlags
17 import WwLib ( findTypeShape, deepSplitProductType_maybe )
18 import Demand -- All of it
19 import CoreSyn
20 import CoreSeq ( seqBinds )
21 import Outputable
22 import VarEnv
23 import BasicTypes
24 import Data.List
25 import DataCon
26 import Id
27 import CoreUtils ( exprIsHNF, exprType, exprIsTrivial )
28 import TyCon
29 import Type
30 import Coercion ( Coercion, coVarsOfCo )
31 import FamInstEnv
32 import Util
33 import Maybes ( isJust )
34 import TysWiredIn
35 import TysPrim ( realWorldStatePrimTy )
36 import ErrUtils ( dumpIfSet_dyn )
37 import Name ( getName, stableNameCmp )
38 import Data.Function ( on )
39 import UniqSet
40
41 {-
42 ************************************************************************
43 * *
44 \subsection{Top level stuff}
45 * *
46 ************************************************************************
47 -}
48
49 dmdAnalProgram :: DynFlags -> FamInstEnvs -> CoreProgram -> IO CoreProgram
50 dmdAnalProgram dflags fam_envs binds
51 = do {
52 let { binds_plus_dmds = do_prog binds } ;
53 dumpIfSet_dyn dflags Opt_D_dump_str_signatures
54 "Strictness signatures" $
55 dumpStrSig binds_plus_dmds ;
56 -- See Note [Stamp out space leaks in demand analysis]
57 seqBinds binds_plus_dmds `seq` return binds_plus_dmds
58 }
59 where
60 do_prog :: CoreProgram -> CoreProgram
61 do_prog binds = snd $ mapAccumL dmdAnalTopBind (emptyAnalEnv dflags fam_envs) binds
62
63 -- Analyse a (group of) top-level binding(s)
64 dmdAnalTopBind :: AnalEnv
65 -> CoreBind
66 -> (AnalEnv, CoreBind)
67 dmdAnalTopBind env (NonRec id rhs)
68 = (extendAnalEnv TopLevel env id2 (idStrictness id2), NonRec id2 rhs2)
69 where
70 ( _, _, rhs1) = dmdAnalRhsLetDown TopLevel Nothing env cleanEvalDmd id rhs
71 ( _, id2, rhs2) = dmdAnalRhsLetDown TopLevel Nothing (nonVirgin env) cleanEvalDmd id rhs1
72 -- Do two passes to improve CPR information
73 -- See Note [CPR for thunks]
74 -- See Note [Optimistic CPR in the "virgin" case]
75 -- See Note [Initial CPR for strict binders]
76
77 dmdAnalTopBind env (Rec pairs)
78 = (env', Rec pairs')
79 where
80 (env', _, pairs') = dmdFix TopLevel env cleanEvalDmd pairs
81 -- We get two iterations automatically
82 -- c.f. the NonRec case above
83
84 {- Note [Stamp out space leaks in demand analysis]
85 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
86 The demand analysis pass outputs a new copy of the Core program in
87 which binders have been annotated with demand and strictness
88 information. It's tiresome to ensure that this information is fully
89 evaluated everywhere that we produce it, so we just run a single
90 seqBinds over the output before returning it, to ensure that there are
91 no references holding on to the input Core program.
92
93 This makes a ~30% reduction in peak memory usage when compiling
94 DynFlags (cf Trac #9675 and #13426).
95
96 This is particularly important when we are doing late demand analysis,
97 since we don't do a seqBinds at any point thereafter. Hence code
98 generation would hold on to an extra copy of the Core program, via
99 unforced thunks in demand or strictness information; and it is the
100 most memory-intensive part of the compilation process, so this added
101 seqBinds makes a big difference in peak memory usage.
102 -}
103
104
105 {-
106 ************************************************************************
107 * *
108 \subsection{The analyser itself}
109 * *
110 ************************************************************************
111
112 Note [Ensure demand is strict]
113 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
114 It's important not to analyse e with a lazy demand because
115 a) When we encounter case s of (a,b) ->
116 we demand s with U(d1d2)... but if the overall demand is lazy
117 that is wrong, and we'd need to reduce the demand on s,
118 which is inconvenient
119 b) More important, consider
120 f (let x = R in x+x), where f is lazy
121 We still want to mark x as demanded, because it will be when we
122 enter the let. If we analyse f's arg with a Lazy demand, we'll
123 just mark x as Lazy
124 c) The application rule wouldn't be right either
125 Evaluating (f x) in a L demand does *not* cause
126 evaluation of f in a C(L) demand!
127 -}
128
129 -- If e is complicated enough to become a thunk, its contents will be evaluated
130 -- at most once, so oneify it.
131 dmdTransformThunkDmd :: CoreExpr -> Demand -> Demand
132 dmdTransformThunkDmd e
133 | exprIsTrivial e = id
134 | otherwise = oneifyDmd
135
136 -- Do not process absent demands
137 -- Otherwise act like in a normal demand analysis
138 -- See ↦* relation in the Cardinality Analysis paper
139 dmdAnalStar :: AnalEnv
140 -> Demand -- This one takes a *Demand*
141 -> CoreExpr -> (BothDmdArg, CoreExpr)
142 dmdAnalStar env dmd e
143 | (defer_and_use, cd) <- toCleanDmd dmd (exprType e)
144 , (dmd_ty, e') <- dmdAnal env cd e
145 = (postProcessDmdType defer_and_use dmd_ty, e')
146
147 -- Main Demand Analsysis machinery
148 dmdAnal, dmdAnal' :: AnalEnv
149 -> CleanDemand -- The main one takes a *CleanDemand*
150 -> CoreExpr -> (DmdType, CoreExpr)
151
152 -- The CleanDemand is always strict and not absent
153 -- See Note [Ensure demand is strict]
154
155 dmdAnal env d e = -- pprTrace "dmdAnal" (ppr d <+> ppr e) $
156 dmdAnal' env d e
157
158 dmdAnal' _ _ (Lit lit) = (nopDmdType, Lit lit)
159 dmdAnal' _ _ (Type ty) = (nopDmdType, Type ty) -- Doesn't happen, in fact
160 dmdAnal' _ _ (Coercion co)
161 = (unitDmdType (coercionDmdEnv co), Coercion co)
162
163 dmdAnal' env dmd (Var var)
164 = (dmdTransform env var dmd, Var var)
165
166 dmdAnal' env dmd (Cast e co)
167 = (dmd_ty `bothDmdType` mkBothDmdArg (coercionDmdEnv co), Cast e' co)
168 where
169 (dmd_ty, e') = dmdAnal env dmd e
170
171 {- ----- I don't get this, so commenting out -------
172 to_co = pSnd (coercionKind co)
173 dmd'
174 | Just tc <- tyConAppTyCon_maybe to_co
175 , isRecursiveTyCon tc = cleanEvalDmd
176 | otherwise = dmd
177 -- This coerce usually arises from a recursive
178 -- newtype, and we don't want to look inside them
179 -- for exactly the same reason that we don't look
180 -- inside recursive products -- we might not reach
181 -- a fixpoint. So revert to a vanilla Eval demand
182 -}
183
184 dmdAnal' env dmd (Tick t e)
185 = (dmd_ty, Tick t e')
186 where
187 (dmd_ty, e') = dmdAnal env dmd e
188
189 dmdAnal' env dmd (App fun (Type ty))
190 = (fun_ty, App fun' (Type ty))
191 where
192 (fun_ty, fun') = dmdAnal env dmd fun
193
194 -- Lots of the other code is there to make this
195 -- beautiful, compositional, application rule :-)
196 dmdAnal' env dmd (App fun arg)
197 = -- This case handles value arguments (type args handled above)
198 -- Crucially, coercions /are/ handled here, because they are
199 -- value arguments (Trac #10288)
200 let
201 call_dmd = mkCallDmd dmd
202 (fun_ty, fun') = dmdAnal env call_dmd fun
203 (arg_dmd, res_ty) = splitDmdTy fun_ty
204 (arg_ty, arg') = dmdAnalStar env (dmdTransformThunkDmd arg arg_dmd) arg
205 in
206 -- pprTrace "dmdAnal:app" (vcat
207 -- [ text "dmd =" <+> ppr dmd
208 -- , text "expr =" <+> ppr (App fun arg)
209 -- , text "fun dmd_ty =" <+> ppr fun_ty
210 -- , text "arg dmd =" <+> ppr arg_dmd
211 -- , text "arg dmd_ty =" <+> ppr arg_ty
212 -- , text "res dmd_ty =" <+> ppr res_ty
213 -- , text "overall res dmd_ty =" <+> ppr (res_ty `bothDmdType` arg_ty) ])
214 (res_ty `bothDmdType` arg_ty, App fun' arg')
215
216 -- this is an anonymous lambda, since @dmdAnalRhsLetDown@ uses @collectBinders@
217 dmdAnal' env dmd (Lam var body)
218 | isTyVar var
219 = let
220 (body_ty, body') = dmdAnal env dmd body
221 in
222 (body_ty, Lam var body')
223
224 | otherwise
225 = let (body_dmd, defer_and_use) = peelCallDmd dmd
226 -- body_dmd: a demand to analyze the body
227
228 env' = extendSigsWithLam env var
229 (body_ty, body') = dmdAnal env' body_dmd body
230 (lam_ty, var') = annotateLamIdBndr env notArgOfDfun body_ty var
231 in
232 (postProcessUnsat defer_and_use lam_ty, Lam var' body')
233
234 dmdAnal' env dmd (Case scrut case_bndr ty [(DataAlt dc, bndrs, rhs)])
235 -- Only one alternative with a product constructor
236 | let tycon = dataConTyCon dc
237 , isJust (isDataProductTyCon_maybe tycon)
238 , Just rec_tc' <- checkRecTc (ae_rec_tc env) tycon
239 = let
240 env_w_tc = env { ae_rec_tc = rec_tc' }
241 env_alt = extendEnvForProdAlt env_w_tc scrut case_bndr dc bndrs
242 (rhs_ty, rhs') = dmdAnal env_alt dmd rhs
243 (alt_ty1, dmds) = findBndrsDmds env rhs_ty bndrs
244 (alt_ty2, case_bndr_dmd) = findBndrDmd env False alt_ty1 case_bndr
245 id_dmds = addCaseBndrDmd case_bndr_dmd dmds
246 alt_ty3 | io_hack_reqd scrut dc bndrs = deferAfterIO alt_ty2
247 | otherwise = alt_ty2
248
249 -- Compute demand on the scrutinee
250 -- See Note [Demand on scrutinee of a product case]
251 scrut_dmd = mkProdDmd (addDataConStrictness dc id_dmds)
252 (scrut_ty, scrut') = dmdAnal env scrut_dmd scrut
253 res_ty = alt_ty3 `bothDmdType` toBothDmdArg scrut_ty
254 case_bndr' = setIdDemandInfo case_bndr case_bndr_dmd
255 bndrs' = setBndrsDemandInfo bndrs id_dmds
256 in
257 -- pprTrace "dmdAnal:Case1" (vcat [ text "scrut" <+> ppr scrut
258 -- , text "dmd" <+> ppr dmd
259 -- , text "case_bndr_dmd" <+> ppr (idDemandInfo case_bndr')
260 -- , text "scrut_dmd" <+> ppr scrut_dmd
261 -- , text "scrut_ty" <+> ppr scrut_ty
262 -- , text "alt_ty" <+> ppr alt_ty2
263 -- , text "res_ty" <+> ppr res_ty ]) $
264 (res_ty, Case scrut' case_bndr' ty [(DataAlt dc, bndrs', rhs')])
265
266 dmdAnal' env dmd (Case scrut case_bndr ty alts)
267 = let -- Case expression with multiple alternatives
268 (alt_tys, alts') = mapAndUnzip (dmdAnalAlt env dmd case_bndr) alts
269 (scrut_ty, scrut') = dmdAnal env cleanEvalDmd scrut
270 (alt_ty, case_bndr') = annotateBndr env (foldr lubDmdType botDmdType alt_tys) case_bndr
271 -- NB: Base case is botDmdType, for empty case alternatives
272 -- This is a unit for lubDmdType, and the right result
273 -- when there really are no alternatives
274 res_ty = alt_ty `bothDmdType` toBothDmdArg scrut_ty
275 in
276 -- pprTrace "dmdAnal:Case2" (vcat [ text "scrut" <+> ppr scrut
277 -- , text "scrut_ty" <+> ppr scrut_ty
278 -- , text "alt_tys" <+> ppr alt_tys
279 -- , text "alt_ty" <+> ppr alt_ty
280 -- , text "res_ty" <+> ppr res_ty ]) $
281 (res_ty, Case scrut' case_bndr' ty alts')
282
283 -- Let bindings can be processed in two ways:
284 -- Down (RHS before body) or Up (body before RHS).
285 -- The following case handle the up variant.
286 --
287 -- It is very simple. For let x = rhs in body
288 -- * Demand-analyse 'body' in the current environment
289 -- * Find the demand, 'rhs_dmd' placed on 'x' by 'body'
290 -- * Demand-analyse 'rhs' in 'rhs_dmd'
291 --
292 -- This is used for a non-recursive local let without manifest lambdas.
293 -- This is the LetUp rule in the paper “Higher-Order Cardinality Analysis”.
294 dmdAnal' env dmd (Let (NonRec id rhs) body)
295 | useLetUp id rhs
296 , Nothing <- unpackTrivial rhs
297 -- dmdAnalRhsLetDown treats trivial right hand sides specially
298 -- so if we have a trival right hand side, fall through to that.
299 = (final_ty, Let (NonRec id' rhs') body')
300 where
301 (body_ty, body') = dmdAnal env dmd body
302 (body_ty', id_dmd) = findBndrDmd env notArgOfDfun body_ty id
303 id' = setIdDemandInfo id id_dmd
304
305 (rhs_ty, rhs') = dmdAnalStar env (dmdTransformThunkDmd rhs id_dmd) rhs
306 final_ty = body_ty' `bothDmdType` rhs_ty
307
308 dmdAnal' env dmd (Let (NonRec id rhs) body)
309 = (body_ty2, Let (NonRec id2 rhs') body')
310 where
311 (lazy_fv, id1, rhs') = dmdAnalRhsLetDown NotTopLevel Nothing env dmd id rhs
312 env1 = extendAnalEnv NotTopLevel env id1 (idStrictness id1)
313 (body_ty, body') = dmdAnal env1 dmd body
314 (body_ty1, id2) = annotateBndr env body_ty id1
315 body_ty2 = addLazyFVs body_ty1 lazy_fv -- see Note [Lazy and unleashable free variables]
316
317 -- If the actual demand is better than the vanilla call
318 -- demand, you might think that we might do better to re-analyse
319 -- the RHS with the stronger demand.
320 -- But (a) That seldom happens, because it means that *every* path in
321 -- the body of the let has to use that stronger demand
322 -- (b) It often happens temporarily in when fixpointing, because
323 -- the recursive function at first seems to place a massive demand.
324 -- But we don't want to go to extra work when the function will
325 -- probably iterate to something less demanding.
326 -- In practice, all the times the actual demand on id2 is more than
327 -- the vanilla call demand seem to be due to (b). So we don't
328 -- bother to re-analyse the RHS.
329
330 dmdAnal' env dmd (Let (Rec pairs) body)
331 = let
332 (env', lazy_fv, pairs') = dmdFix NotTopLevel env dmd pairs
333 (body_ty, body') = dmdAnal env' dmd body
334 body_ty1 = deleteFVs body_ty (map fst pairs)
335 body_ty2 = addLazyFVs body_ty1 lazy_fv -- see Note [Lazy and unleashable free variables]
336 in
337 body_ty2 `seq`
338 (body_ty2, Let (Rec pairs') body')
339
340 io_hack_reqd :: CoreExpr -> DataCon -> [Var] -> Bool
341 -- See Note [IO hack in the demand analyser]
342 io_hack_reqd scrut con bndrs
343 | (bndr:_) <- bndrs
344 , con == tupleDataCon Unboxed 2
345 , idType bndr `eqType` realWorldStatePrimTy
346 , (fun, _) <- collectArgs scrut
347 = case fun of
348 Var f -> not (isPrimOpId f)
349 _ -> True
350 | otherwise
351 = False
352
353 dmdAnalAlt :: AnalEnv -> CleanDemand -> Id -> Alt Var -> (DmdType, Alt Var)
354 dmdAnalAlt env dmd case_bndr (con,bndrs,rhs)
355 | null bndrs -- Literals, DEFAULT, and nullary constructors
356 , (rhs_ty, rhs') <- dmdAnal env dmd rhs
357 = (rhs_ty, (con, [], rhs'))
358
359 | otherwise -- Non-nullary data constructors
360 , (rhs_ty, rhs') <- dmdAnal env dmd rhs
361 , (alt_ty, dmds) <- findBndrsDmds env rhs_ty bndrs
362 , let case_bndr_dmd = findIdDemand alt_ty case_bndr
363 id_dmds = addCaseBndrDmd case_bndr_dmd dmds
364 = (alt_ty, (con, setBndrsDemandInfo bndrs id_dmds, rhs'))
365
366
367 {- Note [IO hack in the demand analyser]
368 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
369 There's a hack here for I/O operations. Consider
370
371 case foo x s of { (# s', r #) -> y }
372
373 Is this strict in 'y'? Often not! If foo x s performs some observable action
374 (including raising an exception with raiseIO#, modifying a mutable variable, or
375 even ending the program normally), then we must not force 'y' (which may fail
376 to terminate) until we have performed foo x s.
377
378 Hackish solution: spot the IO-like situation and add a virtual branch,
379 as if we had
380 case foo x s of
381 (# s, r #) -> y
382 other -> return ()
383 So the 'y' isn't necessarily going to be evaluated
384
385 A more complete example (Trac #148, #1592) where this shows up is:
386 do { let len = <expensive> ;
387 ; when (...) (exitWith ExitSuccess)
388 ; print len }
389
390 However, consider
391 f x s = case getMaskingState# s of
392 (# s, r #) ->
393 case x of I# x2 -> ...
394
395 Here it is terribly sad to make 'f' lazy in 's'. After all,
396 getMaskingState# is not going to diverge or throw an exception! This
397 situation actually arises in GHC.IO.Handle.Internals.wantReadableHandle
398 (on an MVar not an Int), and made a material difference.
399
400 So if the scrutinee is a primop call, we *don't* apply the
401 state hack:
402 - If is a simple, terminating one like getMaskingState,
403 applying the hack is over-conservative.
404 - If the primop is raise# then it returns bottom, so
405 the case alternatives are already discarded.
406 - If the primop can raise a non-IO exception, like
407 divide by zero or seg-fault (eg writing an array
408 out of bounds) then we don't mind evaluating 'x' first.
409
410 Note [Demand on the scrutinee of a product case]
411 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
412 When figuring out the demand on the scrutinee of a product case,
413 we use the demands of the case alternative, i.e. id_dmds.
414 But note that these include the demand on the case binder;
415 see Note [Demand on case-alternative binders] in Demand.hs.
416 This is crucial. Example:
417 f x = case x of y { (a,b) -> k y a }
418 If we just take scrut_demand = U(L,A), then we won't pass x to the
419 worker, so the worker will rebuild
420 x = (a, absent-error)
421 and that'll crash.
422
423 Note [Aggregated demand for cardinality]
424 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
425 We use different strategies for strictness and usage/cardinality to
426 "unleash" demands captured on free variables by bindings. Let us
427 consider the example:
428
429 f1 y = let {-# NOINLINE h #-}
430 h = y
431 in (h, h)
432
433 We are interested in obtaining cardinality demand U1 on |y|, as it is
434 used only in a thunk, and, therefore, is not going to be updated any
435 more. Therefore, the demand on |y|, captured and unleashed by usage of
436 |h| is U1. However, if we unleash this demand every time |h| is used,
437 and then sum up the effects, the ultimate demand on |y| will be U1 +
438 U1 = U. In order to avoid it, we *first* collect the aggregate demand
439 on |h| in the body of let-expression, and only then apply the demand
440 transformer:
441
442 transf[x](U) = {y |-> U1}
443
444 so the resulting demand on |y| is U1.
445
446 The situation is, however, different for strictness, where this
447 aggregating approach exhibits worse results because of the nature of
448 |both| operation for strictness. Consider the example:
449
450 f y c =
451 let h x = y |seq| x
452 in case of
453 True -> h True
454 False -> y
455
456 It is clear that |f| is strict in |y|, however, the suggested analysis
457 will infer from the body of |let| that |h| is used lazily (as it is
458 used in one branch only), therefore lazy demand will be put on its
459 free variable |y|. Conversely, if the demand on |h| is unleashed right
460 on the spot, we will get the desired result, namely, that |f| is
461 strict in |y|.
462
463
464 ************************************************************************
465 * *
466 Demand transformer
467 * *
468 ************************************************************************
469 -}
470
471 dmdTransform :: AnalEnv -- The strictness environment
472 -> Id -- The function
473 -> CleanDemand -- The demand on the function
474 -> DmdType -- The demand type of the function in this context
475 -- Returned DmdEnv includes the demand on
476 -- this function plus demand on its free variables
477
478 dmdTransform env var dmd
479 | isDataConWorkId var -- Data constructor
480 = dmdTransformDataConSig (idArity var) (idStrictness var) dmd
481
482 | gopt Opt_DmdTxDictSel (ae_dflags env),
483 Just _ <- isClassOpId_maybe var -- Dictionary component selector
484 = dmdTransformDictSelSig (idStrictness var) dmd
485
486 | isGlobalId var -- Imported function
487 = let res = dmdTransformSig (idStrictness var) dmd in
488 -- pprTrace "dmdTransform" (vcat [ppr var, ppr (idStrictness var), ppr dmd, ppr res])
489 res
490
491 | Just (sig, top_lvl) <- lookupSigEnv env var -- Local letrec bound thing
492 , let fn_ty = dmdTransformSig sig dmd
493 = -- pprTrace "dmdTransform" (vcat [ppr var, ppr sig, ppr dmd, ppr fn_ty]) $
494 if isTopLevel top_lvl
495 then fn_ty -- Don't record top level things
496 else addVarDmd fn_ty var (mkOnceUsedDmd dmd)
497
498 | otherwise -- Local non-letrec-bound thing
499 = unitDmdType (unitVarEnv var (mkOnceUsedDmd dmd))
500
501 {-
502 ************************************************************************
503 * *
504 \subsection{Bindings}
505 * *
506 ************************************************************************
507 -}
508
509 -- Recursive bindings
510 dmdFix :: TopLevelFlag
511 -> AnalEnv -- Does not include bindings for this binding
512 -> CleanDemand
513 -> [(Id,CoreExpr)]
514 -> (AnalEnv, DmdEnv, [(Id,CoreExpr)]) -- Binders annotated with stricness info
515
516 dmdFix top_lvl env let_dmd orig_pairs
517 = loop 1 initial_pairs
518 where
519 bndrs = map fst orig_pairs
520
521 -- See Note [Initialising strictness]
522 initial_pairs | ae_virgin env = [(setIdStrictness id botSig, rhs) | (id, rhs) <- orig_pairs ]
523 | otherwise = orig_pairs
524
525 -- If fixed-point iteration does not yield a result we use this instead
526 -- See Note [Safe abortion in the fixed-point iteration]
527 abort :: (AnalEnv, DmdEnv, [(Id,CoreExpr)])
528 abort = (env, lazy_fv', zapped_pairs)
529 where (lazy_fv, pairs') = step True (zapIdStrictness orig_pairs)
530 -- Note [Lazy and unleashable free variables]
531 non_lazy_fvs = plusVarEnvList $ map (strictSigDmdEnv . idStrictness . fst) pairs'
532 lazy_fv' = lazy_fv `plusVarEnv` mapVarEnv (const topDmd) non_lazy_fvs
533 zapped_pairs = zapIdStrictness pairs'
534
535 -- The fixed-point varies the idStrictness field of the binders, and terminates if that
536 -- annotation does not change any more.
537 loop :: Int -> [(Id,CoreExpr)] -> (AnalEnv, DmdEnv, [(Id,CoreExpr)])
538 loop n pairs
539 | found_fixpoint = (final_anal_env, lazy_fv, pairs')
540 | n == 10 = abort
541 | otherwise = loop (n+1) pairs'
542 where
543 found_fixpoint = map (idStrictness . fst) pairs' == map (idStrictness . fst) pairs
544 first_round = n == 1
545 (lazy_fv, pairs') = step first_round pairs
546 final_anal_env = extendAnalEnvs top_lvl env (map fst pairs')
547
548 step :: Bool -> [(Id, CoreExpr)] -> (DmdEnv, [(Id, CoreExpr)])
549 step first_round pairs = (lazy_fv, pairs')
550 where
551 -- In all but the first iteration, delete the virgin flag
552 start_env | first_round = env
553 | otherwise = nonVirgin env
554
555 start = (extendAnalEnvs top_lvl start_env (map fst pairs), emptyDmdEnv)
556
557 ((_,lazy_fv), pairs') = mapAccumL my_downRhs start pairs
558 -- mapAccumL: Use the new signature to do the next pair
559 -- The occurrence analyser has arranged them in a good order
560 -- so this can significantly reduce the number of iterations needed
561
562 my_downRhs (env, lazy_fv) (id,rhs)
563 = ((env', lazy_fv'), (id', rhs'))
564 where
565 (lazy_fv1, id', rhs') = dmdAnalRhsLetDown top_lvl (Just bndrs) env let_dmd id rhs
566 lazy_fv' = plusVarEnv_C bothDmd lazy_fv lazy_fv1
567 env' = extendAnalEnv top_lvl env id (idStrictness id')
568
569
570 zapIdStrictness :: [(Id, CoreExpr)] -> [(Id, CoreExpr)]
571 zapIdStrictness pairs = [(setIdStrictness id nopSig, rhs) | (id, rhs) <- pairs ]
572
573 {-
574 Note [Safe abortion in the fixed-point iteration]
575 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
576
577 Fixed-point iteration may fail to terminate. But we cannot simply give up and
578 return the environment and code unchanged! We still need to do one additional
579 round, for two reasons:
580
581 * To get information on used free variables (both lazy and strict!)
582 (see Note [Lazy and unleashable free variables])
583 * To ensure that all expressions have been traversed at least once, and any left-over
584 strictness annotations have been updated.
585
586 This final iteration does not add the variables to the strictness signature
587 environment, which effectively assigns them 'nopSig' (see "getStrictness")
588
589 -}
590
591 -- Trivial RHS
592 -- See Note [Demand analysis for trivial right-hand sides]
593 dmdAnalTrivialRhs ::
594 AnalEnv -> Id -> CoreExpr -> Var ->
595 (DmdEnv, Id, CoreExpr)
596 dmdAnalTrivialRhs env id rhs fn
597 = (fn_fv, set_idStrictness env id fn_str, rhs)
598 where
599 fn_str = getStrictness env fn
600 fn_fv | isLocalId fn = unitVarEnv fn topDmd
601 | otherwise = emptyDmdEnv
602 -- Note [Remember to demand the function itself]
603 -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
604 -- fn_fv: don't forget to produce a demand for fn itself
605 -- Lacking this caused Trac #9128
606 -- The demand is very conservative (topDmd), but that doesn't
607 -- matter; trivial bindings are usually inlined, so it only
608 -- kicks in for top-level bindings and NOINLINE bindings
609
610 -- Let bindings can be processed in two ways:
611 -- Down (RHS before body) or Up (body before RHS).
612 -- dmdAnalRhsLetDown implements the Down variant:
613 -- * assuming a demand of <L,U>
614 -- * looking at the definition
615 -- * determining a strictness signature
616 --
617 -- It is used for toplevel definition, recursive definitions and local
618 -- non-recursive definitions that have manifest lambdas.
619 -- Local non-recursive definitions without a lambda are handled with LetUp.
620 --
621 -- This is the LetDown rule in the paper “Higher-Order Cardinality Analysis”.
622 dmdAnalRhsLetDown :: TopLevelFlag
623 -> Maybe [Id] -- Just bs <=> recursive, Nothing <=> non-recursive
624 -> AnalEnv -> CleanDemand
625 -> Id -> CoreExpr
626 -> (DmdEnv, Id, CoreExpr)
627 -- Process the RHS of the binding, add the strictness signature
628 -- to the Id, and augment the environment with the signature as well.
629 dmdAnalRhsLetDown top_lvl rec_flag env let_dmd id rhs
630 | Just fn <- unpackTrivial rhs -- See Note [Demand analysis for trivial right-hand sides]
631 = dmdAnalTrivialRhs env id rhs fn
632
633 | otherwise
634 = (lazy_fv, id', mkLams bndrs' body')
635 where
636 (bndrs, body, body_dmd)
637 = case isJoinId_maybe id of
638 Just join_arity -- See Note [Demand analysis for join points]
639 | (bndrs, body) <- collectNBinders join_arity rhs
640 -> (bndrs, body, let_dmd)
641
642 Nothing | (bndrs, body) <- collectBinders rhs
643 -> (bndrs, body, mkBodyDmd env body)
644
645 env_body = foldl extendSigsWithLam env bndrs
646 (body_ty, body') = dmdAnal env_body body_dmd body
647 body_ty' = removeDmdTyArgs body_ty -- zap possible deep CPR info
648 (DmdType rhs_fv rhs_dmds rhs_res, bndrs')
649 = annotateLamBndrs env (isDFunId id) body_ty' bndrs
650 sig_ty = mkStrictSig (mkDmdType sig_fv rhs_dmds rhs_res')
651 id' = set_idStrictness env id sig_ty
652 -- See Note [NOINLINE and strictness]
653
654
655 -- See Note [Aggregated demand for cardinality]
656 rhs_fv1 = case rec_flag of
657 Just bs -> reuseEnv (delVarEnvList rhs_fv bs)
658 Nothing -> rhs_fv
659
660 -- See Note [Lazy and unleashable free variables]
661 (lazy_fv, sig_fv) = splitFVs is_thunk rhs_fv1
662
663 rhs_res' = trimCPRInfo trim_all trim_sums rhs_res
664 trim_all = is_thunk && not_strict
665 trim_sums = not (isTopLevel top_lvl) -- See Note [CPR for sum types]
666
667 -- See Note [CPR for thunks]
668 is_thunk = not (exprIsHNF rhs) && not (isJoinId id)
669 not_strict
670 = isTopLevel top_lvl -- Top level and recursive things don't
671 || isJust rec_flag -- get their demandInfo set at all
672 || not (isStrictDmd (idDemandInfo id) || ae_virgin env)
673 -- See Note [Optimistic CPR in the "virgin" case]
674
675 mkBodyDmd :: AnalEnv -> CoreExpr -> CleanDemand
676 -- See Note [Product demands for function body]
677 mkBodyDmd env body
678 = case deepSplitProductType_maybe (ae_fam_envs env) (exprType body) of
679 Nothing -> cleanEvalDmd
680 Just (dc, _, _, _) -> cleanEvalProdDmd (dataConRepArity dc)
681
682 unpackTrivial :: CoreExpr -> Maybe Id
683 -- Returns (Just v) if the arg is really equal to v, modulo
684 -- casts, type applications etc
685 -- See Note [Demand analysis for trivial right-hand sides]
686 unpackTrivial (Var v) = Just v
687 unpackTrivial (Cast e _) = unpackTrivial e
688 unpackTrivial (Lam v e) | isTyVar v = unpackTrivial e
689 unpackTrivial (App e a) | isTypeArg a = unpackTrivial e
690 unpackTrivial _ = Nothing
691
692 -- | If given the RHS of a let-binding, this 'useLetUp' determines
693 -- whether we should process the binding up (body before rhs) or
694 -- down (rhs before body).
695 --
696 -- We use LetDown if there is a chance to get a useful strictness signature.
697 -- This is the case when there are manifest value lambdas or the binding is a
698 -- join point (hence always acts like a function, not a value).
699 useLetUp :: Var -> CoreExpr -> Bool
700 useLetUp f _ | isJoinId f = False
701 useLetUp f (Lam v e) | isTyVar v = useLetUp f e
702 useLetUp _ (Lam _ _) = False
703 useLetUp _ _ = True
704
705
706 {- Note [Demand analysis for join points]
707 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
708 Consider
709 g :: (Int,Int) -> Int
710 g (p,q) = p+q
711
712 f :: T -> Int -> Int
713 f x p = g (join j y = (p,y)
714 in case x of
715 A -> j 3
716 B -> j 4
717 C -> (p,7))
718
719 If j was a vanilla function definition, we'd analyse its body with
720 evalDmd, and think that it was lazy in p. But for join points we can
721 do better! We know that j's body will (if called at all) be evaluated
722 with the demand that consumes the entire join-binding, in this case
723 the argument demand from g. Whizzo! g evaluates both components of
724 its argument pair, so p will certainly be evaluated if j is called.
725
726 For f to be strict in p, we need /all/ paths to evaluate p; in this
727 case the C branch does so too, so we are fine. So, as usual, we need
728 to transport demands on free variables to the call site(s). Compare
729 Note [Lazy and unleashable free variables].
730
731 The implementation is easy. When analysing a join point, we can
732 analyse its body with the demand from the entire join-binding (written
733 let_dmd here).
734
735 Another win for join points! Trac #13543.
736
737 Note [Demand analysis for trivial right-hand sides]
738 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
739 Consider
740 foo = plusInt |> co
741 where plusInt is an arity-2 function with known strictness. Clearly
742 we want plusInt's strictness to propagate to foo! But because it has
743 no manifest lambdas, it won't do so automatically, and indeed 'co' might
744 have type (Int->Int->Int) ~ T, so we *can't* eta-expand. So we have a
745 special case for right-hand sides that are "trivial", namely variables,
746 casts, type applications, and the like.
747
748 Note that this can mean that 'foo' has an arity that is smaller than that
749 indicated by its demand info. e.g. if co :: (Int->Int->Int) ~ T, then
750 foo's arity will be zero (see Note [exprArity invariant] in CoreArity),
751 but its demand signature will be that of plusInt. A small example is the
752 test case of Trac #8963.
753
754
755 Note [Product demands for function body]
756 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
757 This example comes from shootout/binary_trees:
758
759 Main.check' = \ b z ds. case z of z' { I# ip ->
760 case ds_d13s of
761 Main.Nil -> z'
762 Main.Node s14k s14l s14m ->
763 Main.check' (not b)
764 (Main.check' b
765 (case b {
766 False -> I# (-# s14h s14k);
767 True -> I# (+# s14h s14k)
768 })
769 s14l)
770 s14m } } }
771
772 Here we *really* want to unbox z, even though it appears to be used boxed in
773 the Nil case. Partly the Nil case is not a hot path. But more specifically,
774 the whole function gets the CPR property if we do.
775
776 So for the demand on the body of a RHS we use a product demand if it's
777 a product type.
778
779 ************************************************************************
780 * *
781 \subsection{Strictness signatures and types}
782 * *
783 ************************************************************************
784 -}
785
786 unitDmdType :: DmdEnv -> DmdType
787 unitDmdType dmd_env = DmdType dmd_env [] topRes
788
789 coercionDmdEnv :: Coercion -> DmdEnv
790 coercionDmdEnv co = mapVarEnv (const topDmd) (getUniqSet $ coVarsOfCo co)
791 -- The VarSet from coVarsOfCo is really a VarEnv Var
792
793 addVarDmd :: DmdType -> Var -> Demand -> DmdType
794 addVarDmd (DmdType fv ds res) var dmd
795 = DmdType (extendVarEnv_C bothDmd fv var dmd) ds res
796
797 addLazyFVs :: DmdType -> DmdEnv -> DmdType
798 addLazyFVs dmd_ty lazy_fvs
799 = dmd_ty `bothDmdType` mkBothDmdArg lazy_fvs
800 -- Using bothDmdType (rather than just both'ing the envs)
801 -- is vital. Consider
802 -- let f = \x -> (x,y)
803 -- in error (f 3)
804 -- Here, y is treated as a lazy-fv of f, but we must `bothDmd` that L
805 -- demand with the bottom coming up from 'error'
806 --
807 -- I got a loop in the fixpointer without this, due to an interaction
808 -- with the lazy_fv filtering in dmdAnalRhsLetDown. Roughly, it was
809 -- letrec f n x
810 -- = letrec g y = x `fatbar`
811 -- letrec h z = z + ...g...
812 -- in h (f (n-1) x)
813 -- in ...
814 -- In the initial iteration for f, f=Bot
815 -- Suppose h is found to be strict in z, but the occurrence of g in its RHS
816 -- is lazy. Now consider the fixpoint iteration for g, esp the demands it
817 -- places on its free variables. Suppose it places none. Then the
818 -- x `fatbar` ...call to h...
819 -- will give a x->V demand for x. That turns into a L demand for x,
820 -- which floats out of the defn for h. Without the modifyEnv, that
821 -- L demand doesn't get both'd with the Bot coming up from the inner
822 -- call to f. So we just get an L demand for x for g.
823
824 {-
825 Note [Do not strictify the argument dictionaries of a dfun]
826 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
827 The typechecker can tie recursive knots involving dfuns, so we do the
828 conservative thing and refrain from strictifying a dfun's argument
829 dictionaries.
830 -}
831
832 setBndrsDemandInfo :: [Var] -> [Demand] -> [Var]
833 setBndrsDemandInfo (b:bs) (d:ds)
834 | isTyVar b = b : setBndrsDemandInfo bs (d:ds)
835 | otherwise = setIdDemandInfo b d : setBndrsDemandInfo bs ds
836 setBndrsDemandInfo [] ds = ASSERT( null ds ) []
837 setBndrsDemandInfo bs _ = pprPanic "setBndrsDemandInfo" (ppr bs)
838
839 annotateBndr :: AnalEnv -> DmdType -> Var -> (DmdType, Var)
840 -- The returned env has the var deleted
841 -- The returned var is annotated with demand info
842 -- according to the result demand of the provided demand type
843 -- No effect on the argument demands
844 annotateBndr env dmd_ty var
845 | isId var = (dmd_ty', setIdDemandInfo var dmd)
846 | otherwise = (dmd_ty, var)
847 where
848 (dmd_ty', dmd) = findBndrDmd env False dmd_ty var
849
850 annotateLamBndrs :: AnalEnv -> DFunFlag -> DmdType -> [Var] -> (DmdType, [Var])
851 annotateLamBndrs env args_of_dfun ty bndrs = mapAccumR annotate ty bndrs
852 where
853 annotate dmd_ty bndr
854 | isId bndr = annotateLamIdBndr env args_of_dfun dmd_ty bndr
855 | otherwise = (dmd_ty, bndr)
856
857 annotateLamIdBndr :: AnalEnv
858 -> DFunFlag -- is this lambda at the top of the RHS of a dfun?
859 -> DmdType -- Demand type of body
860 -> Id -- Lambda binder
861 -> (DmdType, -- Demand type of lambda
862 Id) -- and binder annotated with demand
863
864 annotateLamIdBndr env arg_of_dfun dmd_ty id
865 -- For lambdas we add the demand to the argument demands
866 -- Only called for Ids
867 = ASSERT( isId id )
868 -- pprTrace "annLamBndr" (vcat [ppr id, ppr _dmd_ty]) $
869 (final_ty, setIdDemandInfo id dmd)
870 where
871 -- Watch out! See note [Lambda-bound unfoldings]
872 final_ty = case maybeUnfoldingTemplate (idUnfolding id) of
873 Nothing -> main_ty
874 Just unf -> main_ty `bothDmdType` unf_ty
875 where
876 (unf_ty, _) = dmdAnalStar env dmd unf
877
878 main_ty = addDemand dmd dmd_ty'
879 (dmd_ty', dmd) = findBndrDmd env arg_of_dfun dmd_ty id
880
881 deleteFVs :: DmdType -> [Var] -> DmdType
882 deleteFVs (DmdType fvs dmds res) bndrs
883 = DmdType (delVarEnvList fvs bndrs) dmds res
884
885 {-
886 Note [CPR for sum types]
887 ~~~~~~~~~~~~~~~~~~~~~~~~
888 At the moment we do not do CPR for let-bindings that
889 * non-top level
890 * bind a sum type
891 Reason: I found that in some benchmarks we were losing let-no-escapes,
892 which messed it all up. Example
893 let j = \x. ....
894 in case y of
895 True -> j False
896 False -> j True
897 If we w/w this we get
898 let j' = \x. ....
899 in case y of
900 True -> case j' False of { (# a #) -> Just a }
901 False -> case j' True of { (# a #) -> Just a }
902 Notice that j' is not a let-no-escape any more.
903
904 However this means in turn that the *enclosing* function
905 may be CPR'd (via the returned Justs). But in the case of
906 sums, there may be Nothing alternatives; and that messes
907 up the sum-type CPR.
908
909 Conclusion: only do this for products. It's still not
910 guaranteed OK for products, but sums definitely lose sometimes.
911
912 Note [CPR for thunks]
913 ~~~~~~~~~~~~~~~~~~~~~
914 If the rhs is a thunk, we usually forget the CPR info, because
915 it is presumably shared (else it would have been inlined, and
916 so we'd lose sharing if w/w'd it into a function). E.g.
917
918 let r = case expensive of
919 (a,b) -> (b,a)
920 in ...
921
922 If we marked r as having the CPR property, then we'd w/w into
923
924 let $wr = \() -> case expensive of
925 (a,b) -> (# b, a #)
926 r = case $wr () of
927 (# b,a #) -> (b,a)
928 in ...
929
930 But now r is a thunk, which won't be inlined, so we are no further ahead.
931 But consider
932
933 f x = let r = case expensive of (a,b) -> (b,a)
934 in if foo r then r else (x,x)
935
936 Does f have the CPR property? Well, no.
937
938 However, if the strictness analyser has figured out (in a previous
939 iteration) that it's strict, then we DON'T need to forget the CPR info.
940 Instead we can retain the CPR info and do the thunk-splitting transform
941 (see WorkWrap.splitThunk).
942
943 This made a big difference to PrelBase.modInt, which had something like
944 modInt = \ x -> let r = ... -> I# v in
945 ...body strict in r...
946 r's RHS isn't a value yet; but modInt returns r in various branches, so
947 if r doesn't have the CPR property then neither does modInt
948 Another case I found in practice (in Complex.magnitude), looks like this:
949 let k = if ... then I# a else I# b
950 in ... body strict in k ....
951 (For this example, it doesn't matter whether k is returned as part of
952 the overall result; but it does matter that k's RHS has the CPR property.)
953 Left to itself, the simplifier will make a join point thus:
954 let $j k = ...body strict in k...
955 if ... then $j (I# a) else $j (I# b)
956 With thunk-splitting, we get instead
957 let $j x = let k = I#x in ...body strict in k...
958 in if ... then $j a else $j b
959 This is much better; there's a good chance the I# won't get allocated.
960
961 The difficulty with this is that we need the strictness type to
962 look at the body... but we now need the body to calculate the demand
963 on the variable, so we can decide whether its strictness type should
964 have a CPR in it or not. Simple solution:
965 a) use strictness info from the previous iteration
966 b) make sure we do at least 2 iterations, by doing a second
967 round for top-level non-recs. Top level recs will get at
968 least 2 iterations except for totally-bottom functions
969 which aren't very interesting anyway.
970
971 NB: strictly_demanded is never true of a top-level Id, or of a recursive Id.
972
973 Note [Optimistic CPR in the "virgin" case]
974 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
975 Demand and strictness info are initialized by top elements. However,
976 this prevents from inferring a CPR property in the first pass of the
977 analyser, so we keep an explicit flag ae_virgin in the AnalEnv
978 datatype.
979
980 We can't start with 'not-demanded' (i.e., top) because then consider
981 f x = let
982 t = ... I# x
983 in
984 if ... then t else I# y else f x'
985
986 In the first iteration we'd have no demand info for x, so assume
987 not-demanded; then we'd get TopRes for f's CPR info. Next iteration
988 we'd see that t was demanded, and so give it the CPR property, but by
989 now f has TopRes, so it will stay TopRes. Instead, by checking the
990 ae_virgin flag at the first time round, we say 'yes t is demanded' the
991 first time.
992
993 However, this does mean that for non-recursive bindings we must
994 iterate twice to be sure of not getting over-optimistic CPR info,
995 in the case where t turns out to be not-demanded. This is handled
996 by dmdAnalTopBind.
997
998
999 Note [NOINLINE and strictness]
1000 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1001 The strictness analyser used to have a HACK which ensured that NOINLNE
1002 things were not strictness-analysed. The reason was unsafePerformIO.
1003 Left to itself, the strictness analyser would discover this strictness
1004 for unsafePerformIO:
1005 unsafePerformIO: C(U(AV))
1006 But then consider this sub-expression
1007 unsafePerformIO (\s -> let r = f x in
1008 case writeIORef v r s of (# s1, _ #) ->
1009 (# s1, r #)
1010 The strictness analyser will now find that r is sure to be eval'd,
1011 and may then hoist it out. This makes tests/lib/should_run/memo002
1012 deadlock.
1013
1014 Solving this by making all NOINLINE things have no strictness info is overkill.
1015 In particular, it's overkill for runST, which is perfectly respectable.
1016 Consider
1017 f x = runST (return x)
1018 This should be strict in x.
1019
1020 So the new plan is to define unsafePerformIO using the 'lazy' combinator:
1021
1022 unsafePerformIO (IO m) = lazy (case m realWorld# of (# _, r #) -> r)
1023
1024 Remember, 'lazy' is a wired-in identity-function Id, of type a->a, which is
1025 magically NON-STRICT, and is inlined after strictness analysis. So
1026 unsafePerformIO will look non-strict, and that's what we want.
1027
1028 Now we don't need the hack in the strictness analyser. HOWEVER, this
1029 decision does mean that even a NOINLINE function is not entirely
1030 opaque: some aspect of its implementation leaks out, notably its
1031 strictness. For example, if you have a function implemented by an
1032 error stub, but which has RULES, you may want it not to be eliminated
1033 in favour of error!
1034
1035 Note [Lazy and unleashable free variables]
1036 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1037 We put the strict and once-used FVs in the DmdType of the Id, so
1038 that at its call sites we unleash demands on its strict fvs.
1039 An example is 'roll' in imaginary/wheel-sieve2
1040 Something like this:
1041 roll x = letrec
1042 go y = if ... then roll (x-1) else x+1
1043 in
1044 go ms
1045 We want to see that roll is strict in x, which is because
1046 go is called. So we put the DmdEnv for x in go's DmdType.
1047
1048 Another example:
1049
1050 f :: Int -> Int -> Int
1051 f x y = let t = x+1
1052 h z = if z==0 then t else
1053 if z==1 then x+1 else
1054 x + h (z-1)
1055 in h y
1056
1057 Calling h does indeed evaluate x, but we can only see
1058 that if we unleash a demand on x at the call site for t.
1059
1060 Incidentally, here's a place where lambda-lifting h would
1061 lose the cigar --- we couldn't see the joint strictness in t/x
1062
1063 ON THE OTHER HAND
1064
1065 We don't want to put *all* the fv's from the RHS into the
1066 DmdType. Because
1067
1068 * it makes the strictness signatures larger, and hence slows down fixpointing
1069
1070 and
1071
1072 * it is useless information at the call site anyways:
1073 For lazy, used-many times fv's we will never get any better result than
1074 that, no matter how good the actual demand on the function at the call site
1075 is (unless it is always absent, but then the whole binder is useless).
1076
1077 Therefore we exclude lazy multiple-used fv's from the environment in the
1078 DmdType.
1079
1080 But now the signature lies! (Missing variables are assumed to be absent.) To
1081 make up for this, the code that analyses the binding keeps the demand on those
1082 variable separate (usually called "lazy_fv") and adds it to the demand of the
1083 whole binding later.
1084
1085 What if we decide _not_ to store a strictness signature for a binding at all, as
1086 we do when aborting a fixed-point iteration? The we risk losing the information
1087 that the strict variables are being used. In that case, we take all free variables
1088 mentioned in the (unsound) strictness signature, conservatively approximate the
1089 demand put on them (topDmd), and add that to the "lazy_fv" returned by "dmdFix".
1090
1091
1092 Note [Lambda-bound unfoldings]
1093 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1094 We allow a lambda-bound variable to carry an unfolding, a facility that is used
1095 exclusively for join points; see Note [Case binders and join points]. If so,
1096 we must be careful to demand-analyse the RHS of the unfolding! Example
1097 \x. \y{=Just x}. <body>
1098 Then if <body> uses 'y', then transitively it uses 'x', and we must not
1099 forget that fact, otherwise we might make 'x' absent when it isn't.
1100
1101
1102 ************************************************************************
1103 * *
1104 \subsection{Strictness signatures}
1105 * *
1106 ************************************************************************
1107 -}
1108
1109 type DFunFlag = Bool -- indicates if the lambda being considered is in the
1110 -- sequence of lambdas at the top of the RHS of a dfun
1111 notArgOfDfun :: DFunFlag
1112 notArgOfDfun = False
1113
1114 data AnalEnv
1115 = AE { ae_dflags :: DynFlags
1116 , ae_sigs :: SigEnv
1117 , ae_virgin :: Bool -- True on first iteration only
1118 -- See Note [Initialising strictness]
1119 , ae_rec_tc :: RecTcChecker
1120 , ae_fam_envs :: FamInstEnvs
1121 }
1122
1123 -- We use the se_env to tell us whether to
1124 -- record info about a variable in the DmdEnv
1125 -- We do so if it's a LocalId, but not top-level
1126 --
1127 -- The DmdEnv gives the demand on the free vars of the function
1128 -- when it is given enough args to satisfy the strictness signature
1129
1130 type SigEnv = VarEnv (StrictSig, TopLevelFlag)
1131
1132 instance Outputable AnalEnv where
1133 ppr (AE { ae_sigs = env, ae_virgin = virgin })
1134 = text "AE" <+> braces (vcat
1135 [ text "ae_virgin =" <+> ppr virgin
1136 , text "ae_sigs =" <+> ppr env ])
1137
1138 emptyAnalEnv :: DynFlags -> FamInstEnvs -> AnalEnv
1139 emptyAnalEnv dflags fam_envs
1140 = AE { ae_dflags = dflags
1141 , ae_sigs = emptySigEnv
1142 , ae_virgin = True
1143 , ae_rec_tc = initRecTc
1144 , ae_fam_envs = fam_envs
1145 }
1146
1147 emptySigEnv :: SigEnv
1148 emptySigEnv = emptyVarEnv
1149
1150 -- | Extend an environment with the strictness IDs attached to the id
1151 extendAnalEnvs :: TopLevelFlag -> AnalEnv -> [Id] -> AnalEnv
1152 extendAnalEnvs top_lvl env vars
1153 = env { ae_sigs = extendSigEnvs top_lvl (ae_sigs env) vars }
1154
1155 extendSigEnvs :: TopLevelFlag -> SigEnv -> [Id] -> SigEnv
1156 extendSigEnvs top_lvl sigs vars
1157 = extendVarEnvList sigs [ (var, (idStrictness var, top_lvl)) | var <- vars]
1158
1159 extendAnalEnv :: TopLevelFlag -> AnalEnv -> Id -> StrictSig -> AnalEnv
1160 extendAnalEnv top_lvl env var sig
1161 = env { ae_sigs = extendSigEnv top_lvl (ae_sigs env) var sig }
1162
1163 extendSigEnv :: TopLevelFlag -> SigEnv -> Id -> StrictSig -> SigEnv
1164 extendSigEnv top_lvl sigs var sig = extendVarEnv sigs var (sig, top_lvl)
1165
1166 lookupSigEnv :: AnalEnv -> Id -> Maybe (StrictSig, TopLevelFlag)
1167 lookupSigEnv env id = lookupVarEnv (ae_sigs env) id
1168
1169 getStrictness :: AnalEnv -> Id -> StrictSig
1170 getStrictness env fn
1171 | isGlobalId fn = idStrictness fn
1172 | Just (sig, _) <- lookupSigEnv env fn = sig
1173 | otherwise = nopSig
1174
1175 nonVirgin :: AnalEnv -> AnalEnv
1176 nonVirgin env = env { ae_virgin = False }
1177
1178 extendSigsWithLam :: AnalEnv -> Id -> AnalEnv
1179 -- Extend the AnalEnv when we meet a lambda binder
1180 extendSigsWithLam env id
1181 | isId id
1182 , isStrictDmd (idDemandInfo id) || ae_virgin env
1183 -- See Note [Optimistic CPR in the "virgin" case]
1184 -- See Note [Initial CPR for strict binders]
1185 , Just (dc,_,_,_) <- deepSplitProductType_maybe (ae_fam_envs env) $ idType id
1186 = extendAnalEnv NotTopLevel env id (cprProdSig (dataConRepArity dc))
1187
1188 | otherwise
1189 = env
1190
1191 extendEnvForProdAlt :: AnalEnv -> CoreExpr -> Id -> DataCon -> [Var] -> AnalEnv
1192 -- See Note [CPR in a product case alternative]
1193 extendEnvForProdAlt env scrut case_bndr dc bndrs
1194 = foldl do_con_arg env1 ids_w_strs
1195 where
1196 env1 = extendAnalEnv NotTopLevel env case_bndr case_bndr_sig
1197
1198 ids_w_strs = filter isId bndrs `zip` dataConRepStrictness dc
1199 case_bndr_sig = cprProdSig (dataConRepArity dc)
1200 fam_envs = ae_fam_envs env
1201
1202 do_con_arg env (id, str)
1203 | let is_strict = isStrictDmd (idDemandInfo id) || isMarkedStrict str
1204 , ae_virgin env || (is_var_scrut && is_strict) -- See Note [CPR in a product case alternative]
1205 , Just (dc,_,_,_) <- deepSplitProductType_maybe fam_envs $ idType id
1206 = extendAnalEnv NotTopLevel env id (cprProdSig (dataConRepArity dc))
1207 | otherwise
1208 = env
1209
1210 is_var_scrut = is_var scrut
1211 is_var (Cast e _) = is_var e
1212 is_var (Var v) = isLocalId v
1213 is_var _ = False
1214
1215 addDataConStrictness :: DataCon -> [Demand] -> [Demand]
1216 -- See Note [Add demands for strict constructors]
1217 addDataConStrictness con ds
1218 = ASSERT2( equalLength strs ds, ppr con $$ ppr strs $$ ppr ds )
1219 zipWith add ds strs
1220 where
1221 strs = dataConRepStrictness con
1222 add dmd str | isMarkedStrict str
1223 , not (isAbsDmd dmd) = dmd `bothDmd` seqDmd
1224 | otherwise = dmd
1225
1226 findBndrsDmds :: AnalEnv -> DmdType -> [Var] -> (DmdType, [Demand])
1227 -- Return the demands on the Ids in the [Var]
1228 findBndrsDmds env dmd_ty bndrs
1229 = go dmd_ty bndrs
1230 where
1231 go dmd_ty [] = (dmd_ty, [])
1232 go dmd_ty (b:bs)
1233 | isId b = let (dmd_ty1, dmds) = go dmd_ty bs
1234 (dmd_ty2, dmd) = findBndrDmd env False dmd_ty1 b
1235 in (dmd_ty2, dmd : dmds)
1236 | otherwise = go dmd_ty bs
1237
1238 findBndrDmd :: AnalEnv -> Bool -> DmdType -> Id -> (DmdType, Demand)
1239 -- See Note [Trimming a demand to a type] in Demand.hs
1240 findBndrDmd env arg_of_dfun dmd_ty id
1241 = (dmd_ty', dmd')
1242 where
1243 dmd' = killUsageDemand (ae_dflags env) $
1244 strictify $
1245 trimToType starting_dmd (findTypeShape fam_envs id_ty)
1246
1247 (dmd_ty', starting_dmd) = peelFV dmd_ty id
1248
1249 id_ty = idType id
1250
1251 strictify dmd
1252 | gopt Opt_DictsStrict (ae_dflags env)
1253 -- We never want to strictify a recursive let. At the moment
1254 -- annotateBndr is only call for non-recursive lets; if that
1255 -- changes, we need a RecFlag parameter and another guard here.
1256 , not arg_of_dfun -- See Note [Do not strictify the argument dictionaries of a dfun]
1257 = strictifyDictDmd id_ty dmd
1258 | otherwise
1259 = dmd
1260
1261 fam_envs = ae_fam_envs env
1262
1263 set_idStrictness :: AnalEnv -> Id -> StrictSig -> Id
1264 set_idStrictness env id sig
1265 = setIdStrictness id (killUsageSig (ae_dflags env) sig)
1266
1267 dumpStrSig :: CoreProgram -> SDoc
1268 dumpStrSig binds = vcat (map printId ids)
1269 where
1270 ids = sortBy (stableNameCmp `on` getName) (concatMap getIds binds)
1271 getIds (NonRec i _) = [ i ]
1272 getIds (Rec bs) = map fst bs
1273 printId id | isExportedId id = ppr id <> colon <+> pprIfaceStrictSig (idStrictness id)
1274 | otherwise = empty
1275
1276 {- Note [CPR in a product case alternative]
1277 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1278 In a case alternative for a product type, we want to give some of the
1279 binders the CPR property. Specifically
1280
1281 * The case binder; inside the alternative, the case binder always has
1282 the CPR property, meaning that a case on it will successfully cancel.
1283 Example:
1284 f True x = case x of y { I# x' -> if x' ==# 3
1285 then y
1286 else I# 8 }
1287 f False x = I# 3
1288
1289 By giving 'y' the CPR property, we ensure that 'f' does too, so we get
1290 f b x = case fw b x of { r -> I# r }
1291 fw True x = case x of y { I# x' -> if x' ==# 3 then x' else 8 }
1292 fw False x = 3
1293
1294 Of course there is the usual risk of re-boxing: we have 'x' available
1295 boxed and unboxed, but we return the unboxed version for the wrapper to
1296 box. If the wrapper doesn't cancel with its caller, we'll end up
1297 re-boxing something that we did have available in boxed form.
1298
1299 * Any strict binders with product type, can use
1300 Note [Initial CPR for strict binders]. But we can go a little
1301 further. Consider
1302
1303 data T = MkT !Int Int
1304
1305 f2 (MkT x y) | y>0 = f2 (MkT x (y-1))
1306 | otherwise = x
1307
1308 For $wf2 we are going to unbox the MkT *and*, since it is strict, the
1309 first argument of the MkT; see Note [Add demands for strict constructors].
1310 But then we don't want box it up again when returning it! We want
1311 'f2' to have the CPR property, so we give 'x' the CPR property.
1312
1313 * It's a bit delicate because if this case is scrutinising something other
1314 than an argument the original function, we really don't have the unboxed
1315 version available. E.g
1316 g v = case foo v of
1317 MkT x y | y>0 -> ...
1318 | otherwise -> x
1319 Here we don't have the unboxed 'x' available. Hence the
1320 is_var_scrut test when making use of the strictness annotation.
1321 Slightly ad-hoc, because even if the scrutinee *is* a variable it
1322 might not be a onre of the arguments to the original function, or a
1323 sub-component thereof. But it's simple, and nothing terrible
1324 happens if we get it wrong. e.g. Trac #10694.
1325
1326 Note [Add demands for strict constructors]
1327 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1328 Consider this program (due to Roman):
1329
1330 data X a = X !a
1331
1332 foo :: X Int -> Int -> Int
1333 foo (X a) n = go 0
1334 where
1335 go i | i < n = a + go (i+1)
1336 | otherwise = 0
1337
1338 We want the worker for 'foo' too look like this:
1339
1340 $wfoo :: Int# -> Int# -> Int#
1341
1342 with the first argument unboxed, so that it is not eval'd each time
1343 around the 'go' loop (which would otherwise happen, since 'foo' is not
1344 strict in 'a'). It is sound for the wrapper to pass an unboxed arg
1345 because X is strict, so its argument must be evaluated. And if we
1346 *don't* pass an unboxed argument, we can't even repair it by adding a
1347 `seq` thus:
1348
1349 foo (X a) n = a `seq` go 0
1350
1351 because the seq is discarded (very early) since X is strict!
1352
1353 We achieve the effect using addDataConStrictness. It is called at a
1354 case expression, such as the pattern match on (X a) in the example
1355 above. After computing how 'a' is used in the alternatives, we add an
1356 extra 'seqDmd' to it. The case alternative isn't itself strict in the
1357 sub-components, but simply evaluating the scrutinee to HNF does force
1358 those sub-components.
1359
1360 If the argument is not used at all in the alternative (i.e. it is
1361 Absent), then *don't* add a 'seqDmd'. If we do, it makes it look used
1362 and hence it'll be passed to the worker when it doesn't need to be.
1363 Hence the isAbsDmd test in addDataConStrictness.
1364
1365 There is the usual danger of reboxing, which as usual we ignore. But
1366 if X is monomorphic, and has an UNPACK pragma, then this optimisation
1367 is even more important. We don't want the wrapper to rebox an unboxed
1368 argument, and pass an Int to $wfoo!
1369
1370
1371 Note [Initial CPR for strict binders]
1372 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1373 CPR is initialized for a lambda binder in an optimistic manner, i.e,
1374 if the binder is used strictly and at least some of its components as
1375 a product are used, which is checked by the value of the absence
1376 demand.
1377
1378 If the binder is marked demanded with a strict demand, then give it a
1379 CPR signature. Here's a concrete example ('f1' in test T10482a),
1380 assuming h is strict:
1381
1382 f1 :: Int -> Int
1383 f1 x = case h x of
1384 A -> x
1385 B -> f1 (x-1)
1386 C -> x+1
1387
1388 If we notice that 'x' is used strictly, we can give it the CPR
1389 property; and hence f1 gets the CPR property too. It's sound (doesn't
1390 change strictness) to give it the CPR property because by the time 'x'
1391 is returned (case A above), it'll have been evaluated (by the wrapper
1392 of 'h' in the example).
1393
1394 Moreover, if f itself is strict in x, then we'll pass x unboxed to
1395 f1, and so the boxed version *won't* be available; in that case it's
1396 very helpful to give 'x' the CPR property.
1397
1398 Note that
1399
1400 * We only want to do this for something that definitely
1401 has product type, else we may get over-optimistic CPR results
1402 (e.g. from \x -> x!).
1403
1404 * See Note [CPR examples]
1405
1406 Note [CPR examples]
1407 ~~~~~~~~~~~~~~~~~~~~
1408 Here are some examples (stranal/should_compile/T10482a) of the
1409 usefulness of Note [CPR in a product case alternative]. The main
1410 point: all of these functions can have the CPR property.
1411
1412 ------- f1 -----------
1413 -- x is used strictly by h, so it'll be available
1414 -- unboxed before it is returned in the True branch
1415
1416 f1 :: Int -> Int
1417 f1 x = case h x x of
1418 True -> x
1419 False -> f1 (x-1)
1420
1421
1422 ------- f2 -----------
1423 -- x is a strict field of MkT2, so we'll pass it unboxed
1424 -- to $wf2, so it's available unboxed. This depends on
1425 -- the case expression analysing (a subcomponent of) one
1426 -- of the original arguments to the function, so it's
1427 -- a bit more delicate.
1428
1429 data T2 = MkT2 !Int Int
1430
1431 f2 :: T2 -> Int
1432 f2 (MkT2 x y) | y>0 = f2 (MkT2 x (y-1))
1433 | otherwise = x
1434
1435
1436 ------- f3 -----------
1437 -- h is strict in x, so x will be unboxed before it
1438 -- is rerturned in the otherwise case.
1439
1440 data T3 = MkT3 Int Int
1441
1442 f1 :: T3 -> Int
1443 f1 (MkT3 x y) | h x y = f3 (MkT3 x (y-1))
1444 | otherwise = x
1445
1446
1447 ------- f4 -----------
1448 -- Just like f2, but MkT4 can't unbox its strict
1449 -- argument automatically, as f2 can
1450
1451 data family Foo a
1452 newtype instance Foo Int = Foo Int
1453
1454 data T4 a = MkT4 !(Foo a) Int
1455
1456 f4 :: T4 Int -> Int
1457 f4 (MkT4 x@(Foo v) y) | y>0 = f4 (MkT4 x (y-1))
1458 | otherwise = v
1459
1460
1461 Note [Initialising strictness]
1462 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1463 See section 9.2 (Finding fixpoints) of the paper.
1464
1465 Our basic plan is to initialise the strictness of each Id in a
1466 recursive group to "bottom", and find a fixpoint from there. However,
1467 this group B might be inside an *enclosing* recursive group A, in
1468 which case we'll do the entire fixpoint shebang on for each iteration
1469 of A. This can be illustrated by the following example:
1470
1471 Example:
1472
1473 f [] = []
1474 f (x:xs) = let g [] = f xs
1475 g (y:ys) = y+1 : g ys
1476 in g (h x)
1477
1478 At each iteration of the fixpoint for f, the analyser has to find a
1479 fixpoint for the enclosed function g. In the meantime, the demand
1480 values for g at each iteration for f are *greater* than those we
1481 encountered in the previous iteration for f. Therefore, we can begin
1482 the fixpoint for g not with the bottom value but rather with the
1483 result of the previous analysis. I.e., when beginning the fixpoint
1484 process for g, we can start from the demand signature computed for g
1485 previously and attached to the binding occurrence of g.
1486
1487 To speed things up, we initialise each iteration of A (the enclosing
1488 one) from the result of the last one, which is neatly recorded in each
1489 binder. That way we make use of earlier iterations of the fixpoint
1490 algorithm. (Cunning plan.)
1491
1492 But on the *first* iteration we want to *ignore* the current strictness
1493 of the Id, and start from "bottom". Nowadays the Id can have a current
1494 strictness, because interface files record strictness for nested bindings.
1495 To know when we are in the first iteration, we look at the ae_virgin
1496 field of the AnalEnv.
1497
1498
1499 Note [Final Demand Analyser run]
1500 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1501 Some of the information that the demand analyser determines is not always
1502 preserved by the simplifier. For example, the simplifier will happily rewrite
1503 \y [Demand=1*U] let x = y in x + x
1504 to
1505 \y [Demand=1*U] y + y
1506 which is quite a lie.
1507
1508 The once-used information is (currently) only used by the code
1509 generator, though. So:
1510
1511 * We zap the used-once info in the worker-wrapper;
1512 see Note [Zapping Used Once info in WorkWrap] in WorkWrap. If it's
1513 not reliable, it's better not to have it at all.
1514
1515 * Just before TidyCore, we add a pass of the demand analyser,
1516 but WITHOUT subsequent worker/wrapper and simplifier,
1517 right before TidyCore. See SimplCore.getCoreToDo.
1518
1519 This way, correct information finds its way into the module interface
1520 (strictness signatures!) and the code generator (single-entry thunks!)
1521
1522 Note that, in contrast, the single-call information (C1(..)) /can/ be
1523 relied upon, as the simplifier tends to be very careful about not
1524 duplicating actual function calls.
1525
1526 Also see #11731.
1527 -}