Add more pprTrace to SpecConstr (debug only)
[ghc.git] / compiler / specialise / SpecConstr.hs
1 {-
2 ToDo [Oct 2013]
3 ~~~~~~~~~~~~~~~
4 1. Nuke ForceSpecConstr for good (it is subsumed by GHC.Types.SPEC in ghc-prim)
5 2. Nuke NoSpecConstr
6
7
8 (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
9
10 \section[SpecConstr]{Specialise over constructors}
11 -}
12
13 {-# LANGUAGE CPP #-}
14
15 module SpecConstr(
16 specConstrProgram,
17 SpecConstrAnnotation(..)
18 ) where
19
20 #include "HsVersions.h"
21
22 import GhcPrelude
23
24 import CoreSyn
25 import CoreSubst
26 import CoreUtils
27 import CoreUnfold ( couldBeSmallEnoughToInline )
28 import CoreFVs ( exprsFreeVarsList )
29 import CoreMonad
30 import Literal ( litIsLifted )
31 import HscTypes ( ModGuts(..) )
32 import WwLib ( isWorkerSmallEnough, mkWorkerArgs )
33 import DataCon
34 import Coercion hiding( substCo )
35 import Rules
36 import Type hiding ( substTy )
37 import TyCon ( tyConName )
38 import Id
39 import PprCore ( pprParendExpr )
40 import MkCore ( mkImpossibleExpr )
41 import Var
42 import VarEnv
43 import VarSet
44 import Name
45 import BasicTypes
46 import DynFlags ( DynFlags(..), GeneralFlag( Opt_SpecConstrKeen )
47 , gopt, hasPprDebug )
48 import Maybes ( orElse, catMaybes, isJust, isNothing )
49 import Demand
50 import GHC.Serialized ( deserializeWithData )
51 import Util
52 import Pair
53 import UniqSupply
54 import Outputable
55 import FastString
56 import UniqFM
57 import MonadUtils
58 import Control.Monad ( zipWithM )
59 import Data.List
60 import PrelNames ( specTyConName )
61 import Module
62
63 -- See Note [Forcing specialisation]
64
65 import TyCon ( TyCon )
66 import GHC.Exts( SpecConstrAnnotation(..) )
67 import Data.Ord( comparing )
68
69 {-
70 -----------------------------------------------------
71 Game plan
72 -----------------------------------------------------
73
74 Consider
75 drop n [] = []
76 drop 0 xs = []
77 drop n (x:xs) = drop (n-1) xs
78
79 After the first time round, we could pass n unboxed. This happens in
80 numerical code too. Here's what it looks like in Core:
81
82 drop n xs = case xs of
83 [] -> []
84 (y:ys) -> case n of
85 I# n# -> case n# of
86 0 -> []
87 _ -> drop (I# (n# -# 1#)) xs
88
89 Notice that the recursive call has an explicit constructor as argument.
90 Noticing this, we can make a specialised version of drop
91
92 RULE: drop (I# n#) xs ==> drop' n# xs
93
94 drop' n# xs = let n = I# n# in ...orig RHS...
95
96 Now the simplifier will apply the specialisation in the rhs of drop', giving
97
98 drop' n# xs = case xs of
99 [] -> []
100 (y:ys) -> case n# of
101 0 -> []
102 _ -> drop' (n# -# 1#) xs
103
104 Much better!
105
106 We'd also like to catch cases where a parameter is carried along unchanged,
107 but evaluated each time round the loop:
108
109 f i n = if i>0 || i>n then i else f (i*2) n
110
111 Here f isn't strict in n, but we'd like to avoid evaluating it each iteration.
112 In Core, by the time we've w/wd (f is strict in i) we get
113
114 f i# n = case i# ># 0 of
115 False -> I# i#
116 True -> case n of { I# n# ->
117 case i# ># n# of
118 False -> I# i#
119 True -> f (i# *# 2#) n
120
121 At the call to f, we see that the argument, n is known to be (I# n#),
122 and n is evaluated elsewhere in the body of f, so we can play the same
123 trick as above.
124
125
126 Note [Reboxing]
127 ~~~~~~~~~~~~~~~
128 We must be careful not to allocate the same constructor twice. Consider
129 f p = (...(case p of (a,b) -> e)...p...,
130 ...let t = (r,s) in ...t...(f t)...)
131 At the recursive call to f, we can see that t is a pair. But we do NOT want
132 to make a specialised copy:
133 f' a b = let p = (a,b) in (..., ...)
134 because now t is allocated by the caller, then r and s are passed to the
135 recursive call, which allocates the (r,s) pair again.
136
137 This happens if
138 (a) the argument p is used in other than a case-scrutinisation way.
139 (b) the argument to the call is not a 'fresh' tuple; you have to
140 look into its unfolding to see that it's a tuple
141
142 Hence the "OR" part of Note [Good arguments] below.
143
144 ALTERNATIVE 2: pass both boxed and unboxed versions. This no longer saves
145 allocation, but does perhaps save evals. In the RULE we'd have
146 something like
147
148 f (I# x#) = f' (I# x#) x#
149
150 If at the call site the (I# x) was an unfolding, then we'd have to
151 rely on CSE to eliminate the duplicate allocation.... This alternative
152 doesn't look attractive enough to pursue.
153
154 ALTERNATIVE 3: ignore the reboxing problem. The trouble is that
155 the conservative reboxing story prevents many useful functions from being
156 specialised. Example:
157 foo :: Maybe Int -> Int -> Int
158 foo (Just m) 0 = 0
159 foo x@(Just m) n = foo x (n-m)
160 Here the use of 'x' will clearly not require boxing in the specialised function.
161
162 The strictness analyser has the same problem, in fact. Example:
163 f p@(a,b) = ...
164 If we pass just 'a' and 'b' to the worker, it might need to rebox the
165 pair to create (a,b). A more sophisticated analysis might figure out
166 precisely the cases in which this could happen, but the strictness
167 analyser does no such analysis; it just passes 'a' and 'b', and hopes
168 for the best.
169
170 So my current choice is to make SpecConstr similarly aggressive, and
171 ignore the bad potential of reboxing.
172
173
174 Note [Good arguments]
175 ~~~~~~~~~~~~~~~~~~~~~
176 So we look for
177
178 * A self-recursive function. Ignore mutual recursion for now,
179 because it's less common, and the code is simpler for self-recursion.
180
181 * EITHER
182
183 a) At a recursive call, one or more parameters is an explicit
184 constructor application
185 AND
186 That same parameter is scrutinised by a case somewhere in
187 the RHS of the function
188
189 OR
190
191 b) At a recursive call, one or more parameters has an unfolding
192 that is an explicit constructor application
193 AND
194 That same parameter is scrutinised by a case somewhere in
195 the RHS of the function
196 AND
197 Those are the only uses of the parameter (see Note [Reboxing])
198
199
200 What to abstract over
201 ~~~~~~~~~~~~~~~~~~~~~
202 There's a bit of a complication with type arguments. If the call
203 site looks like
204
205 f p = ...f ((:) [a] x xs)...
206
207 then our specialised function look like
208
209 f_spec x xs = let p = (:) [a] x xs in ....as before....
210
211 This only makes sense if either
212 a) the type variable 'a' is in scope at the top of f, or
213 b) the type variable 'a' is an argument to f (and hence fs)
214
215 Actually, (a) may hold for value arguments too, in which case
216 we may not want to pass them. Suppose 'x' is in scope at f's
217 defn, but xs is not. Then we'd like
218
219 f_spec xs = let p = (:) [a] x xs in ....as before....
220
221 Similarly (b) may hold too. If x is already an argument at the
222 call, no need to pass it again.
223
224 Finally, if 'a' is not in scope at the call site, we could abstract
225 it as we do the term variables:
226
227 f_spec a x xs = let p = (:) [a] x xs in ...as before...
228
229 So the grand plan is:
230
231 * abstract the call site to a constructor-only pattern
232 e.g. C x (D (f p) (g q)) ==> C s1 (D s2 s3)
233
234 * Find the free variables of the abstracted pattern
235
236 * Pass these variables, less any that are in scope at
237 the fn defn. But see Note [Shadowing] below.
238
239
240 NOTICE that we only abstract over variables that are not in scope,
241 so we're in no danger of shadowing variables used in "higher up"
242 in f_spec's RHS.
243
244
245 Note [Shadowing]
246 ~~~~~~~~~~~~~~~~
247 In this pass we gather up usage information that may mention variables
248 that are bound between the usage site and the definition site; or (more
249 seriously) may be bound to something different at the definition site.
250 For example:
251
252 f x = letrec g y v = let x = ...
253 in ...(g (a,b) x)...
254
255 Since 'x' is in scope at the call site, we may make a rewrite rule that
256 looks like
257 RULE forall a,b. g (a,b) x = ...
258 But this rule will never match, because it's really a different 'x' at
259 the call site -- and that difference will be manifest by the time the
260 simplifier gets to it. [A worry: the simplifier doesn't *guarantee*
261 no-shadowing, so perhaps it may not be distinct?]
262
263 Anyway, the rule isn't actually wrong, it's just not useful. One possibility
264 is to run deShadowBinds before running SpecConstr, but instead we run the
265 simplifier. That gives the simplest possible program for SpecConstr to
266 chew on; and it virtually guarantees no shadowing.
267
268 Note [Specialising for constant parameters]
269 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
270 This one is about specialising on a *constant* (but not necessarily
271 constructor) argument
272
273 foo :: Int -> (Int -> Int) -> Int
274 foo 0 f = 0
275 foo m f = foo (f m) (+1)
276
277 It produces
278
279 lvl_rmV :: GHC.Base.Int -> GHC.Base.Int
280 lvl_rmV =
281 \ (ds_dlk :: GHC.Base.Int) ->
282 case ds_dlk of wild_alH { GHC.Base.I# x_alG ->
283 GHC.Base.I# (GHC.Prim.+# x_alG 1)
284
285 T.$wfoo :: GHC.Prim.Int# -> (GHC.Base.Int -> GHC.Base.Int) ->
286 GHC.Prim.Int#
287 T.$wfoo =
288 \ (ww_sme :: GHC.Prim.Int#) (w_smg :: GHC.Base.Int -> GHC.Base.Int) ->
289 case ww_sme of ds_Xlw {
290 __DEFAULT ->
291 case w_smg (GHC.Base.I# ds_Xlw) of w1_Xmo { GHC.Base.I# ww1_Xmz ->
292 T.$wfoo ww1_Xmz lvl_rmV
293 };
294 0 -> 0
295 }
296
297 The recursive call has lvl_rmV as its argument, so we could create a specialised copy
298 with that argument baked in; that is, not passed at all. Now it can perhaps be inlined.
299
300 When is this worth it? Call the constant 'lvl'
301 - If 'lvl' has an unfolding that is a constructor, see if the corresponding
302 parameter is scrutinised anywhere in the body.
303
304 - If 'lvl' has an unfolding that is a inlinable function, see if the corresponding
305 parameter is applied (...to enough arguments...?)
306
307 Also do this is if the function has RULES?
308
309 Also
310
311 Note [Specialising for lambda parameters]
312 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
313 foo :: Int -> (Int -> Int) -> Int
314 foo 0 f = 0
315 foo m f = foo (f m) (\n -> n-m)
316
317 This is subtly different from the previous one in that we get an
318 explicit lambda as the argument:
319
320 T.$wfoo :: GHC.Prim.Int# -> (GHC.Base.Int -> GHC.Base.Int) ->
321 GHC.Prim.Int#
322 T.$wfoo =
323 \ (ww_sm8 :: GHC.Prim.Int#) (w_sma :: GHC.Base.Int -> GHC.Base.Int) ->
324 case ww_sm8 of ds_Xlr {
325 __DEFAULT ->
326 case w_sma (GHC.Base.I# ds_Xlr) of w1_Xmf { GHC.Base.I# ww1_Xmq ->
327 T.$wfoo
328 ww1_Xmq
329 (\ (n_ad3 :: GHC.Base.Int) ->
330 case n_ad3 of wild_alB { GHC.Base.I# x_alA ->
331 GHC.Base.I# (GHC.Prim.-# x_alA ds_Xlr)
332 })
333 };
334 0 -> 0
335 }
336
337 I wonder if SpecConstr couldn't be extended to handle this? After all,
338 lambda is a sort of constructor for functions and perhaps it already
339 has most of the necessary machinery?
340
341 Furthermore, there's an immediate win, because you don't need to allocate the lambda
342 at the call site; and if perchance it's called in the recursive call, then you
343 may avoid allocating it altogether. Just like for constructors.
344
345 Looks cool, but probably rare...but it might be easy to implement.
346
347
348 Note [SpecConstr for casts]
349 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
350 Consider
351 data family T a :: *
352 data instance T Int = T Int
353
354 foo n = ...
355 where
356 go (T 0) = 0
357 go (T n) = go (T (n-1))
358
359 The recursive call ends up looking like
360 go (T (I# ...) `cast` g)
361 So we want to spot the constructor application inside the cast.
362 That's why we have the Cast case in argToPat
363
364 Note [Local recursive groups]
365 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
366 For a *local* recursive group, we can see all the calls to the
367 function, so we seed the specialisation loop from the calls in the
368 body, not from the calls in the RHS. Consider:
369
370 bar m n = foo n (n,n) (n,n) (n,n) (n,n)
371 where
372 foo n p q r s
373 | n == 0 = m
374 | n > 3000 = case p of { (p1,p2) -> foo (n-1) (p2,p1) q r s }
375 | n > 2000 = case q of { (q1,q2) -> foo (n-1) p (q2,q1) r s }
376 | n > 1000 = case r of { (r1,r2) -> foo (n-1) p q (r2,r1) s }
377 | otherwise = case s of { (s1,s2) -> foo (n-1) p q r (s2,s1) }
378
379 If we start with the RHSs of 'foo', we get lots and lots of specialisations,
380 most of which are not needed. But if we start with the (single) call
381 in the rhs of 'bar' we get exactly one fully-specialised copy, and all
382 the recursive calls go to this fully-specialised copy. Indeed, the original
383 function is later collected as dead code. This is very important in
384 specialising the loops arising from stream fusion, for example in NDP where
385 we were getting literally hundreds of (mostly unused) specialisations of
386 a local function.
387
388 In a case like the above we end up never calling the original un-specialised
389 function. (Although we still leave its code around just in case.)
390
391 However, if we find any boring calls in the body, including *unsaturated*
392 ones, such as
393 letrec foo x y = ....foo...
394 in map foo xs
395 then we will end up calling the un-specialised function, so then we *should*
396 use the calls in the un-specialised RHS as seeds. We call these
397 "boring call patterns", and callsToPats reports if it finds any of these.
398
399 Note [Seeding top-level recursive groups]
400 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
401 This seeding is done in the binding for seed_calls in specRec.
402
403 1. If all the bindings in a top-level recursive group are local (not
404 exported), then all the calls are in the rest of the top-level
405 bindings. This means we can specialise with those call patterns
406 ONLY, and NOT with the RHSs of the recursive group (exactly like
407 Note [Local recursive groups])
408
409 2. But if any of the bindings are exported, the function may be called
410 with any old arguments, so (for lack of anything better) we specialise
411 based on
412 (a) the call patterns in the RHS
413 (b) the call patterns in the rest of the top-level bindings
414 NB: before Apr 15 we used (a) only, but Dimitrios had an example
415 where (b) was crucial, so I added that.
416 Adding (b) also improved nofib allocation results:
417 multiplier: 4% better
418 minimax: 2.8% better
419
420 Actually in case (2), instead of using the calls from the RHS, it
421 would be better to specialise in the importing module. We'd need to
422 add an INLINABLE pragma to the function, and then it can be
423 specialised in the importing scope, just as is done for type classes
424 in Specialise.specImports. This remains to be done (#10346).
425
426 Note [Top-level recursive groups]
427 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
428 To get the call usage information from "the rest of the top level
429 bindings" (c.f. Note [Seeding top-level recursive groups]), we work
430 backwards through the top-level bindings so we see the usage before we
431 get to the binding of the function. Before we can collect the usage
432 though, we go through all the bindings and add them to the
433 environment. This is necessary because usage is only tracked for
434 functions in the environment. These two passes are called
435 'go' and 'goEnv'
436 in specConstrProgram. (Looks a bit revolting to me.)
437
438 Note [Do not specialise diverging functions]
439 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
440 Specialising a function that just diverges is a waste of code.
441 Furthermore, it broke GHC (simpl014) thus:
442 {-# STR Sb #-}
443 f = \x. case x of (a,b) -> f x
444 If we specialise f we get
445 f = \x. case x of (a,b) -> fspec a b
446 But fspec doesn't have decent strictness info. As it happened,
447 (f x) :: IO t, so the state hack applied and we eta expanded fspec,
448 and hence f. But now f's strictness is less than its arity, which
449 breaks an invariant.
450
451
452 Note [Forcing specialisation]
453 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
454 With stream fusion and in other similar cases, we want to fully
455 specialise some (but not necessarily all!) loops regardless of their
456 size and the number of specialisations.
457
458 We allow a library to do this, in one of two ways (one which is
459 deprecated):
460
461 1) Add a parameter of type GHC.Types.SPEC (from ghc-prim) to the loop body.
462
463 2) (Deprecated) Annotate a type with ForceSpecConstr from GHC.Exts,
464 and then add *that* type as a parameter to the loop body
465
466 The reason #2 is deprecated is because it requires GHCi, which isn't
467 available for things like a cross compiler using stage1.
468
469 Here's a (simplified) example from the `vector` package. You may bring
470 the special 'force specialization' type into scope by saying:
471
472 import GHC.Types (SPEC(..))
473
474 or by defining your own type (again, deprecated):
475
476 data SPEC = SPEC | SPEC2
477 {-# ANN type SPEC ForceSpecConstr #-}
478
479 (Note this is the exact same definition of GHC.Types.SPEC, just
480 without the annotation.)
481
482 After that, you say:
483
484 foldl :: (a -> b -> a) -> a -> Stream b -> a
485 {-# INLINE foldl #-}
486 foldl f z (Stream step s _) = foldl_loop SPEC z s
487 where
488 foldl_loop !sPEC z s = case step s of
489 Yield x s' -> foldl_loop sPEC (f z x) s'
490 Skip -> foldl_loop sPEC z s'
491 Done -> z
492
493 SpecConstr will spot the SPEC parameter and always fully specialise
494 foldl_loop. Note that
495
496 * We have to prevent the SPEC argument from being removed by
497 w/w which is why (a) SPEC is a sum type, and (b) we have to seq on
498 the SPEC argument.
499
500 * And lastly, the SPEC argument is ultimately eliminated by
501 SpecConstr itself so there is no runtime overhead.
502
503 This is all quite ugly; we ought to come up with a better design.
504
505 ForceSpecConstr arguments are spotted in scExpr' and scTopBinds which then set
506 sc_force to True when calling specLoop. This flag does four things:
507 * Ignore specConstrThreshold, to specialise functions of arbitrary size
508 (see scTopBind)
509 * Ignore specConstrCount, to make arbitrary numbers of specialisations
510 (see specialise)
511 * Specialise even for arguments that are not scrutinised in the loop
512 (see argToPat; Trac #4488)
513 * Only specialise on recursive types a finite number of times
514 (see is_too_recursive; Trac #5550; Note [Limit recursive specialisation])
515
516 This flag is inherited for nested non-recursive bindings (which are likely to
517 be join points and hence should be fully specialised) but reset for nested
518 recursive bindings.
519
520 What alternatives did I consider? Annotating the loop itself doesn't
521 work because (a) it is local and (b) it will be w/w'ed and having
522 w/w propagating annotations somehow doesn't seem like a good idea. The
523 types of the loop arguments really seem to be the most persistent
524 thing.
525
526 Annotating the types that make up the loop state doesn't work,
527 either, because (a) it would prevent us from using types like Either
528 or tuples here, (b) we don't want to restrict the set of types that
529 can be used in Stream states and (c) some types are fixed by the user
530 (e.g., the accumulator here) but we still want to specialise as much
531 as possible.
532
533 Alternatives to ForceSpecConstr
534 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
535 Instead of giving the loop an extra argument of type SPEC, we
536 also considered *wrapping* arguments in SPEC, thus
537 data SPEC a = SPEC a | SPEC2
538
539 loop = \arg -> case arg of
540 SPEC state ->
541 case state of (x,y) -> ... loop (SPEC (x',y')) ...
542 S2 -> error ...
543 The idea is that a SPEC argument says "specialise this argument
544 regardless of whether the function case-analyses it". But this
545 doesn't work well:
546 * SPEC must still be a sum type, else the strictness analyser
547 eliminates it
548 * But that means that 'loop' won't be strict in its real payload
549 This loss of strictness in turn screws up specialisation, because
550 we may end up with calls like
551 loop (SPEC (case z of (p,q) -> (q,p)))
552 Without the SPEC, if 'loop' were strict, the case would move out
553 and we'd see loop applied to a pair. But if 'loop' isn't strict
554 this doesn't look like a specialisable call.
555
556 Note [Limit recursive specialisation]
557 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
558 It is possible for ForceSpecConstr to cause an infinite loop of specialisation.
559 Because there is no limit on the number of specialisations, a recursive call with
560 a recursive constructor as an argument (for example, list cons) will generate
561 a specialisation for that constructor. If the resulting specialisation also
562 contains a recursive call with the constructor, this could proceed indefinitely.
563
564 For example, if ForceSpecConstr is on:
565 loop :: [Int] -> [Int] -> [Int]
566 loop z [] = z
567 loop z (x:xs) = loop (x:z) xs
568 this example will create a specialisation for the pattern
569 loop (a:b) c = loop' a b c
570
571 loop' a b [] = (a:b)
572 loop' a b (x:xs) = loop (x:(a:b)) xs
573 and a new pattern is found:
574 loop (a:(b:c)) d = loop'' a b c d
575 which can continue indefinitely.
576
577 Roman's suggestion to fix this was to stop after a couple of times on recursive types,
578 but still specialising on non-recursive types as much as possible.
579
580 To implement this, we count the number of times we have gone round the
581 "specialise recursively" loop ('go' in 'specRec'). Once have gone round
582 more than N times (controlled by -fspec-constr-recursive=N) we check
583
584 - If sc_force is off, and sc_count is (Just max) then we don't
585 need to do anything: trim_pats will limit the number of specs
586
587 - Otherwise check if any function has now got more than (sc_count env)
588 specialisations. If sc_count is "no limit" then we arbitrarily
589 choose 10 as the limit (ugh).
590
591 See Trac #5550. Also Trac #13623, where this test had become over-aggressive,
592 and we lost a wonderful specialisation that we really wanted!
593
594 Note [NoSpecConstr]
595 ~~~~~~~~~~~~~~~~~~~
596 The ignoreDataCon stuff allows you to say
597 {-# ANN type T NoSpecConstr #-}
598 to mean "don't specialise on arguments of this type". It was added
599 before we had ForceSpecConstr. Lacking ForceSpecConstr we specialised
600 regardless of size; and then we needed a way to turn that *off*. Now
601 that we have ForceSpecConstr, this NoSpecConstr is probably redundant.
602 (Used only for PArray.)
603
604 -----------------------------------------------------
605 Stuff not yet handled
606 -----------------------------------------------------
607
608 Here are notes arising from Roman's work that I don't want to lose.
609
610 Example 1
611 ~~~~~~~~~
612 data T a = T !a
613
614 foo :: Int -> T Int -> Int
615 foo 0 t = 0
616 foo x t | even x = case t of { T n -> foo (x-n) t }
617 | otherwise = foo (x-1) t
618
619 SpecConstr does no specialisation, because the second recursive call
620 looks like a boxed use of the argument. A pity.
621
622 $wfoo_sFw :: GHC.Prim.Int# -> T.T GHC.Base.Int -> GHC.Prim.Int#
623 $wfoo_sFw =
624 \ (ww_sFo [Just L] :: GHC.Prim.Int#) (w_sFq [Just L] :: T.T GHC.Base.Int) ->
625 case ww_sFo of ds_Xw6 [Just L] {
626 __DEFAULT ->
627 case GHC.Prim.remInt# ds_Xw6 2 of wild1_aEF [Dead Just A] {
628 __DEFAULT -> $wfoo_sFw (GHC.Prim.-# ds_Xw6 1) w_sFq;
629 0 ->
630 case w_sFq of wild_Xy [Just L] { T.T n_ad5 [Just U(L)] ->
631 case n_ad5 of wild1_aET [Just A] { GHC.Base.I# y_aES [Just L] ->
632 $wfoo_sFw (GHC.Prim.-# ds_Xw6 y_aES) wild_Xy
633 } } };
634 0 -> 0
635
636 Example 2
637 ~~~~~~~~~
638 data a :*: b = !a :*: !b
639 data T a = T !a
640
641 foo :: (Int :*: T Int) -> Int
642 foo (0 :*: t) = 0
643 foo (x :*: t) | even x = case t of { T n -> foo ((x-n) :*: t) }
644 | otherwise = foo ((x-1) :*: t)
645
646 Very similar to the previous one, except that the parameters are now in
647 a strict tuple. Before SpecConstr, we have
648
649 $wfoo_sG3 :: GHC.Prim.Int# -> T.T GHC.Base.Int -> GHC.Prim.Int#
650 $wfoo_sG3 =
651 \ (ww_sFU [Just L] :: GHC.Prim.Int#) (ww_sFW [Just L] :: T.T
652 GHC.Base.Int) ->
653 case ww_sFU of ds_Xws [Just L] {
654 __DEFAULT ->
655 case GHC.Prim.remInt# ds_Xws 2 of wild1_aEZ [Dead Just A] {
656 __DEFAULT ->
657 case ww_sFW of tpl_B2 [Just L] { T.T a_sFo [Just A] ->
658 $wfoo_sG3 (GHC.Prim.-# ds_Xws 1) tpl_B2 -- $wfoo1
659 };
660 0 ->
661 case ww_sFW of wild_XB [Just A] { T.T n_ad7 [Just S(L)] ->
662 case n_ad7 of wild1_aFd [Just L] { GHC.Base.I# y_aFc [Just L] ->
663 $wfoo_sG3 (GHC.Prim.-# ds_Xws y_aFc) wild_XB -- $wfoo2
664 } } };
665 0 -> 0 }
666
667 We get two specialisations:
668 "SC:$wfoo1" [0] __forall {a_sFB :: GHC.Base.Int sc_sGC :: GHC.Prim.Int#}
669 Foo.$wfoo sc_sGC (Foo.T @ GHC.Base.Int a_sFB)
670 = Foo.$s$wfoo1 a_sFB sc_sGC ;
671 "SC:$wfoo2" [0] __forall {y_aFp :: GHC.Prim.Int# sc_sGC :: GHC.Prim.Int#}
672 Foo.$wfoo sc_sGC (Foo.T @ GHC.Base.Int (GHC.Base.I# y_aFp))
673 = Foo.$s$wfoo y_aFp sc_sGC ;
674
675 But perhaps the first one isn't good. After all, we know that tpl_B2 is
676 a T (I# x) really, because T is strict and Int has one constructor. (We can't
677 unbox the strict fields, because T is polymorphic!)
678
679 ************************************************************************
680 * *
681 \subsection{Top level wrapper stuff}
682 * *
683 ************************************************************************
684 -}
685
686 specConstrProgram :: ModGuts -> CoreM ModGuts
687 specConstrProgram guts
688 = do
689 dflags <- getDynFlags
690 us <- getUniqueSupplyM
691 annos <- getFirstAnnotations deserializeWithData guts
692 this_mod <- getModule
693 let binds' = reverse $ fst $ initUs us $ do
694 -- Note [Top-level recursive groups]
695 (env, binds) <- goEnv (initScEnv dflags this_mod annos)
696 (mg_binds guts)
697 -- binds is identical to (mg_binds guts), except that the
698 -- binders on the LHS have been replaced by extendBndr
699 -- (SPJ this seems like overkill; I don't think the binders
700 -- will change at all; and we don't substitute in the RHSs anyway!!)
701 go env nullUsage (reverse binds)
702
703 return (guts { mg_binds = binds' })
704 where
705 -- See Note [Top-level recursive groups]
706 goEnv env [] = return (env, [])
707 goEnv env (bind:binds) = do (env', bind') <- scTopBindEnv env bind
708 (env'', binds') <- goEnv env' binds
709 return (env'', bind' : binds')
710
711 -- Arg list of bindings is in reverse order
712 go _ _ [] = return []
713 go env usg (bind:binds) = do (usg', bind') <- scTopBind env usg bind
714 binds' <- go env usg' binds
715 return (bind' : binds')
716
717 {-
718 ************************************************************************
719 * *
720 \subsection{Environment: goes downwards}
721 * *
722 ************************************************************************
723
724 Note [Work-free values only in environment]
725 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
726 The sc_vals field keeps track of in-scope value bindings, so
727 that if we come across (case x of Just y ->...) we can reduce the
728 case from knowing that x is bound to a pair.
729
730 But only *work-free* values are ok here. For example if the envt had
731 x -> Just (expensive v)
732 then we do NOT want to expand to
733 let y = expensive v in ...
734 because the x-binding still exists and we've now duplicated (expensive v).
735
736 This seldom happens because let-bound constructor applications are
737 ANF-ised, but it can happen as a result of on-the-fly transformations in
738 SpecConstr itself. Here is Trac #7865:
739
740 let {
741 a'_shr =
742 case xs_af8 of _ {
743 [] -> acc_af6;
744 : ds_dgt [Dmd=<L,A>] ds_dgu [Dmd=<L,A>] ->
745 (expensive x_af7, x_af7
746 } } in
747 let {
748 ds_sht =
749 case a'_shr of _ { (p'_afd, q'_afe) ->
750 TSpecConstr_DoubleInline.recursive
751 (GHC.Types.: @ GHC.Types.Int x_af7 wild_X6) (q'_afe, p'_afd)
752 } } in
753
754 When processed knowing that xs_af8 was bound to a cons, we simplify to
755 a'_shr = (expensive x_af7, x_af7)
756 and we do NOT want to inline that at the occurrence of a'_shr in ds_sht.
757 (There are other occurrences of a'_shr.) No no no.
758
759 It would be possible to do some on-the-fly ANF-ising, so that a'_shr turned
760 into a work-free value again, thus
761 a1 = expensive x_af7
762 a'_shr = (a1, x_af7)
763 but that's more work, so until its shown to be important I'm going to
764 leave it for now.
765
766 Note [Making SpecConstr keener]
767 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
768 Consider this, in (perf/should_run/T9339)
769 last (filter odd [1..1000])
770
771 After optimisation, including SpecConstr, we get:
772 f :: Int# -> Int -> Int
773 f x y = case case remInt# x 2# of
774 __DEFAULT -> case x of
775 __DEFAULT -> f (+# wild_Xp 1#) (I# x)
776 1000000# -> ...
777 0# -> case x of
778 __DEFAULT -> f (+# wild_Xp 1#) y
779 1000000# -> y
780
781 Not good! We build an (I# x) box every time around the loop.
782 SpecConstr (as described in the paper) does not specialise f, despite
783 the call (f ... (I# x)) because 'y' is not scrutinied in the body.
784 But it is much better to specialise f for the case where the argument
785 is of form (I# x); then we build the box only when returning y, which
786 is on the cold path.
787
788 Another example:
789
790 f x = ...(g x)....
791
792 Here 'x' is not scrutinised in f's body; but if we did specialise 'f'
793 then the call (g x) might allow 'g' to be specialised in turn.
794
795 So sc_keen controls whether or not we take account of whether argument is
796 scrutinised in the body. True <=> ignore that, and speicalise whenever
797 the function is applied to a data constructor.
798 -}
799
800 data ScEnv = SCE { sc_dflags :: DynFlags,
801 sc_module :: !Module,
802 sc_size :: Maybe Int, -- Size threshold
803 -- Nothing => no limit
804
805 sc_count :: Maybe Int, -- Max # of specialisations for any one fn
806 -- Nothing => no limit
807 -- See Note [Avoiding exponential blowup]
808
809 sc_recursive :: Int, -- Max # of specialisations over recursive type.
810 -- Stops ForceSpecConstr from diverging.
811
812 sc_keen :: Bool, -- Specialise on arguments that are known
813 -- constructors, even if they are not
814 -- scrutinised in the body. See
815 -- Note [Making SpecConstr keener]
816
817 sc_force :: Bool, -- Force specialisation?
818 -- See Note [Forcing specialisation]
819
820 sc_subst :: Subst, -- Current substitution
821 -- Maps InIds to OutExprs
822
823 sc_how_bound :: HowBoundEnv,
824 -- Binds interesting non-top-level variables
825 -- Domain is OutVars (*after* applying the substitution)
826
827 sc_vals :: ValueEnv,
828 -- Domain is OutIds (*after* applying the substitution)
829 -- Used even for top-level bindings (but not imported ones)
830 -- The range of the ValueEnv is *work-free* values
831 -- such as (\x. blah), or (Just v)
832 -- but NOT (Just (expensive v))
833 -- See Note [Work-free values only in environment]
834
835 sc_annotations :: UniqFM SpecConstrAnnotation
836 }
837
838 ---------------------
839 type HowBoundEnv = VarEnv HowBound -- Domain is OutVars
840
841 ---------------------
842 type ValueEnv = IdEnv Value -- Domain is OutIds
843 data Value = ConVal AltCon [CoreArg] -- _Saturated_ constructors
844 -- The AltCon is never DEFAULT
845 | LambdaVal -- Inlinable lambdas or PAPs
846
847 instance Outputable Value where
848 ppr (ConVal con args) = ppr con <+> interpp'SP args
849 ppr LambdaVal = text "<Lambda>"
850
851 ---------------------
852 initScEnv :: DynFlags -> Module -> UniqFM SpecConstrAnnotation -> ScEnv
853 initScEnv dflags this_mod anns
854 = SCE { sc_dflags = dflags,
855 sc_module = this_mod,
856 sc_size = specConstrThreshold dflags,
857 sc_count = specConstrCount dflags,
858 sc_recursive = specConstrRecursive dflags,
859 sc_keen = gopt Opt_SpecConstrKeen dflags,
860 sc_force = False,
861 sc_subst = emptySubst,
862 sc_how_bound = emptyVarEnv,
863 sc_vals = emptyVarEnv,
864 sc_annotations = anns }
865
866 data HowBound = RecFun -- These are the recursive functions for which
867 -- we seek interesting call patterns
868
869 | RecArg -- These are those functions' arguments, or their sub-components;
870 -- we gather occurrence information for these
871
872 instance Outputable HowBound where
873 ppr RecFun = text "RecFun"
874 ppr RecArg = text "RecArg"
875
876 scForce :: ScEnv -> Bool -> ScEnv
877 scForce env b = env { sc_force = b }
878
879 lookupHowBound :: ScEnv -> Id -> Maybe HowBound
880 lookupHowBound env id = lookupVarEnv (sc_how_bound env) id
881
882 scSubstId :: ScEnv -> Id -> CoreExpr
883 scSubstId env v = lookupIdSubst (text "scSubstId") (sc_subst env) v
884
885 scSubstTy :: ScEnv -> Type -> Type
886 scSubstTy env ty = substTy (sc_subst env) ty
887
888 scSubstCo :: ScEnv -> Coercion -> Coercion
889 scSubstCo env co = substCo (sc_subst env) co
890
891 zapScSubst :: ScEnv -> ScEnv
892 zapScSubst env = env { sc_subst = zapSubstEnv (sc_subst env) }
893
894 extendScInScope :: ScEnv -> [Var] -> ScEnv
895 -- Bring the quantified variables into scope
896 extendScInScope env qvars = env { sc_subst = extendInScopeList (sc_subst env) qvars }
897
898 -- Extend the substitution
899 extendScSubst :: ScEnv -> Var -> OutExpr -> ScEnv
900 extendScSubst env var expr = env { sc_subst = extendSubst (sc_subst env) var expr }
901
902 extendScSubstList :: ScEnv -> [(Var,OutExpr)] -> ScEnv
903 extendScSubstList env prs = env { sc_subst = extendSubstList (sc_subst env) prs }
904
905 extendHowBound :: ScEnv -> [Var] -> HowBound -> ScEnv
906 extendHowBound env bndrs how_bound
907 = env { sc_how_bound = extendVarEnvList (sc_how_bound env)
908 [(bndr,how_bound) | bndr <- bndrs] }
909
910 extendBndrsWith :: HowBound -> ScEnv -> [Var] -> (ScEnv, [Var])
911 extendBndrsWith how_bound env bndrs
912 = (env { sc_subst = subst', sc_how_bound = hb_env' }, bndrs')
913 where
914 (subst', bndrs') = substBndrs (sc_subst env) bndrs
915 hb_env' = sc_how_bound env `extendVarEnvList`
916 [(bndr,how_bound) | bndr <- bndrs']
917
918 extendBndrWith :: HowBound -> ScEnv -> Var -> (ScEnv, Var)
919 extendBndrWith how_bound env bndr
920 = (env { sc_subst = subst', sc_how_bound = hb_env' }, bndr')
921 where
922 (subst', bndr') = substBndr (sc_subst env) bndr
923 hb_env' = extendVarEnv (sc_how_bound env) bndr' how_bound
924
925 extendRecBndrs :: ScEnv -> [Var] -> (ScEnv, [Var])
926 extendRecBndrs env bndrs = (env { sc_subst = subst' }, bndrs')
927 where
928 (subst', bndrs') = substRecBndrs (sc_subst env) bndrs
929
930 extendBndr :: ScEnv -> Var -> (ScEnv, Var)
931 extendBndr env bndr = (env { sc_subst = subst' }, bndr')
932 where
933 (subst', bndr') = substBndr (sc_subst env) bndr
934
935 extendValEnv :: ScEnv -> Id -> Maybe Value -> ScEnv
936 extendValEnv env _ Nothing = env
937 extendValEnv env id (Just cv)
938 | valueIsWorkFree cv -- Don't duplicate work!! Trac #7865
939 = env { sc_vals = extendVarEnv (sc_vals env) id cv }
940 extendValEnv env _ _ = env
941
942 extendCaseBndrs :: ScEnv -> OutExpr -> OutId -> AltCon -> [Var] -> (ScEnv, [Var])
943 -- When we encounter
944 -- case scrut of b
945 -- C x y -> ...
946 -- we want to bind b, to (C x y)
947 -- NB1: Extends only the sc_vals part of the envt
948 -- NB2: Kill the dead-ness info on the pattern binders x,y, since
949 -- they are potentially made alive by the [b -> C x y] binding
950 extendCaseBndrs env scrut case_bndr con alt_bndrs
951 = (env2, alt_bndrs')
952 where
953 live_case_bndr = not (isDeadBinder case_bndr)
954 env1 | Var v <- stripTicksTopE (const True) scrut
955 = extendValEnv env v cval
956 | otherwise = env -- See Note [Add scrutinee to ValueEnv too]
957 env2 | live_case_bndr = extendValEnv env1 case_bndr cval
958 | otherwise = env1
959
960 alt_bndrs' | case scrut of { Var {} -> True; _ -> live_case_bndr }
961 = map zap alt_bndrs
962 | otherwise
963 = alt_bndrs
964
965 cval = case con of
966 DEFAULT -> Nothing
967 LitAlt {} -> Just (ConVal con [])
968 DataAlt {} -> Just (ConVal con vanilla_args)
969 where
970 vanilla_args = map Type (tyConAppArgs (idType case_bndr)) ++
971 varsToCoreExprs alt_bndrs
972
973 zap v | isTyVar v = v -- See NB2 above
974 | otherwise = zapIdOccInfo v
975
976
977 decreaseSpecCount :: ScEnv -> Int -> ScEnv
978 -- See Note [Avoiding exponential blowup]
979 decreaseSpecCount env n_specs
980 = env { sc_count = case sc_count env of
981 Nothing -> Nothing
982 Just n -> Just (n `div` (n_specs + 1)) }
983 -- The "+1" takes account of the original function;
984 -- See Note [Avoiding exponential blowup]
985
986 ---------------------------------------------------
987 -- See Note [Forcing specialisation]
988 ignoreType :: ScEnv -> Type -> Bool
989 ignoreDataCon :: ScEnv -> DataCon -> Bool
990 forceSpecBndr :: ScEnv -> Var -> Bool
991
992 ignoreDataCon env dc = ignoreTyCon env (dataConTyCon dc)
993
994 ignoreType env ty
995 = case tyConAppTyCon_maybe ty of
996 Just tycon -> ignoreTyCon env tycon
997 _ -> False
998
999 ignoreTyCon :: ScEnv -> TyCon -> Bool
1000 ignoreTyCon env tycon
1001 = lookupUFM (sc_annotations env) tycon == Just NoSpecConstr
1002
1003 forceSpecBndr env var = forceSpecFunTy env . snd . splitForAllTys . varType $ var
1004
1005 forceSpecFunTy :: ScEnv -> Type -> Bool
1006 forceSpecFunTy env = any (forceSpecArgTy env) . fst . splitFunTys
1007
1008 forceSpecArgTy :: ScEnv -> Type -> Bool
1009 forceSpecArgTy env ty
1010 | Just ty' <- coreView ty = forceSpecArgTy env ty'
1011
1012 forceSpecArgTy env ty
1013 | Just (tycon, tys) <- splitTyConApp_maybe ty
1014 , tycon /= funTyCon
1015 = tyConName tycon == specTyConName
1016 || lookupUFM (sc_annotations env) tycon == Just ForceSpecConstr
1017 || any (forceSpecArgTy env) tys
1018
1019 forceSpecArgTy _ _ = False
1020
1021 {-
1022 Note [Add scrutinee to ValueEnv too]
1023 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1024 Consider this:
1025 case x of y
1026 (a,b) -> case b of c
1027 I# v -> ...(f y)...
1028 By the time we get to the call (f y), the ValueEnv
1029 will have a binding for y, and for c
1030 y -> (a,b)
1031 c -> I# v
1032 BUT that's not enough! Looking at the call (f y) we
1033 see that y is pair (a,b), but we also need to know what 'b' is.
1034 So in extendCaseBndrs we must *also* add the binding
1035 b -> I# v
1036 else we lose a useful specialisation for f. This is necessary even
1037 though the simplifier has systematically replaced uses of 'x' with 'y'
1038 and 'b' with 'c' in the code. The use of 'b' in the ValueEnv came
1039 from outside the case. See Trac #4908 for the live example.
1040
1041 Note [Avoiding exponential blowup]
1042 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1043 The sc_count field of the ScEnv says how many times we are prepared to
1044 duplicate a single function. But we must take care with recursive
1045 specialisations. Consider
1046
1047 let $j1 = let $j2 = let $j3 = ...
1048 in
1049 ...$j3...
1050 in
1051 ...$j2...
1052 in
1053 ...$j1...
1054
1055 If we specialise $j1 then in each specialisation (as well as the original)
1056 we can specialise $j2, and similarly $j3. Even if we make just *one*
1057 specialisation of each, because we also have the original we'll get 2^n
1058 copies of $j3, which is not good.
1059
1060 So when recursively specialising we divide the sc_count by the number of
1061 copies we are making at this level, including the original.
1062
1063
1064 ************************************************************************
1065 * *
1066 \subsection{Usage information: flows upwards}
1067 * *
1068 ************************************************************************
1069 -}
1070
1071 data ScUsage
1072 = SCU {
1073 scu_calls :: CallEnv, -- Calls
1074 -- The functions are a subset of the
1075 -- RecFuns in the ScEnv
1076
1077 scu_occs :: !(IdEnv ArgOcc) -- Information on argument occurrences
1078 } -- The domain is OutIds
1079
1080 type CallEnv = IdEnv [Call]
1081 data Call = Call Id [CoreArg] ValueEnv
1082 -- The arguments of the call, together with the
1083 -- env giving the constructor bindings at the call site
1084 -- We keep the function mainly for debug output
1085
1086 instance Outputable ScUsage where
1087 ppr (SCU { scu_calls = calls, scu_occs = occs })
1088 = text "SCU" <+> braces (sep [ ptext (sLit "calls =") <+> ppr calls
1089 , text "occs =" <+> ppr occs ])
1090
1091 instance Outputable Call where
1092 ppr (Call fn args _) = ppr fn <+> fsep (map pprParendExpr args)
1093
1094 nullUsage :: ScUsage
1095 nullUsage = SCU { scu_calls = emptyVarEnv, scu_occs = emptyVarEnv }
1096
1097 combineCalls :: CallEnv -> CallEnv -> CallEnv
1098 combineCalls = plusVarEnv_C (++)
1099 where
1100 -- plus cs ds | length res > 1
1101 -- = pprTrace "combineCalls" (vcat [ text "cs:" <+> ppr cs
1102 -- , text "ds:" <+> ppr ds])
1103 -- res
1104 -- | otherwise = res
1105 -- where
1106 -- res = cs ++ ds
1107
1108 combineUsage :: ScUsage -> ScUsage -> ScUsage
1109 combineUsage u1 u2 = SCU { scu_calls = combineCalls (scu_calls u1) (scu_calls u2),
1110 scu_occs = plusVarEnv_C combineOcc (scu_occs u1) (scu_occs u2) }
1111
1112 combineUsages :: [ScUsage] -> ScUsage
1113 combineUsages [] = nullUsage
1114 combineUsages us = foldr1 combineUsage us
1115
1116 lookupOccs :: ScUsage -> [OutVar] -> (ScUsage, [ArgOcc])
1117 lookupOccs (SCU { scu_calls = sc_calls, scu_occs = sc_occs }) bndrs
1118 = (SCU {scu_calls = sc_calls, scu_occs = delVarEnvList sc_occs bndrs},
1119 [lookupVarEnv sc_occs b `orElse` NoOcc | b <- bndrs])
1120
1121 data ArgOcc = NoOcc -- Doesn't occur at all; or a type argument
1122 | UnkOcc -- Used in some unknown way
1123
1124 | ScrutOcc -- See Note [ScrutOcc]
1125 (DataConEnv [ArgOcc]) -- How the sub-components are used
1126
1127 type DataConEnv a = UniqFM a -- Keyed by DataCon
1128
1129 {- Note [ScrutOcc]
1130 ~~~~~~~~~~~~~~~~~~~
1131 An occurrence of ScrutOcc indicates that the thing, or a `cast` version of the thing,
1132 is *only* taken apart or applied.
1133
1134 Functions, literal: ScrutOcc emptyUFM
1135 Data constructors: ScrutOcc subs,
1136
1137 where (subs :: UniqFM [ArgOcc]) gives usage of the *pattern-bound* components,
1138 The domain of the UniqFM is the Unique of the data constructor
1139
1140 The [ArgOcc] is the occurrences of the *pattern-bound* components
1141 of the data structure. E.g.
1142 data T a = forall b. MkT a b (b->a)
1143 A pattern binds b, x::a, y::b, z::b->a, but not 'a'!
1144
1145 -}
1146
1147 instance Outputable ArgOcc where
1148 ppr (ScrutOcc xs) = text "scrut-occ" <> ppr xs
1149 ppr UnkOcc = text "unk-occ"
1150 ppr NoOcc = text "no-occ"
1151
1152 evalScrutOcc :: ArgOcc
1153 evalScrutOcc = ScrutOcc emptyUFM
1154
1155 -- Experimentally, this vesion of combineOcc makes ScrutOcc "win", so
1156 -- that if the thing is scrutinised anywhere then we get to see that
1157 -- in the overall result, even if it's also used in a boxed way
1158 -- This might be too aggressive; see Note [Reboxing] Alternative 3
1159 combineOcc :: ArgOcc -> ArgOcc -> ArgOcc
1160 combineOcc NoOcc occ = occ
1161 combineOcc occ NoOcc = occ
1162 combineOcc (ScrutOcc xs) (ScrutOcc ys) = ScrutOcc (plusUFM_C combineOccs xs ys)
1163 combineOcc UnkOcc (ScrutOcc ys) = ScrutOcc ys
1164 combineOcc (ScrutOcc xs) UnkOcc = ScrutOcc xs
1165 combineOcc UnkOcc UnkOcc = UnkOcc
1166
1167 combineOccs :: [ArgOcc] -> [ArgOcc] -> [ArgOcc]
1168 combineOccs xs ys = zipWithEqual "combineOccs" combineOcc xs ys
1169
1170 setScrutOcc :: ScEnv -> ScUsage -> OutExpr -> ArgOcc -> ScUsage
1171 -- _Overwrite_ the occurrence info for the scrutinee, if the scrutinee
1172 -- is a variable, and an interesting variable
1173 setScrutOcc env usg (Cast e _) occ = setScrutOcc env usg e occ
1174 setScrutOcc env usg (Tick _ e) occ = setScrutOcc env usg e occ
1175 setScrutOcc env usg (Var v) occ
1176 | Just RecArg <- lookupHowBound env v = usg { scu_occs = extendVarEnv (scu_occs usg) v occ }
1177 | otherwise = usg
1178 setScrutOcc _env usg _other _occ -- Catch-all
1179 = usg
1180
1181 {-
1182 ************************************************************************
1183 * *
1184 \subsection{The main recursive function}
1185 * *
1186 ************************************************************************
1187
1188 The main recursive function gathers up usage information, and
1189 creates specialised versions of functions.
1190 -}
1191
1192 scExpr, scExpr' :: ScEnv -> CoreExpr -> UniqSM (ScUsage, CoreExpr)
1193 -- The unique supply is needed when we invent
1194 -- a new name for the specialised function and its args
1195
1196 scExpr env e = scExpr' env e
1197
1198 scExpr' env (Var v) = case scSubstId env v of
1199 Var v' -> return (mkVarUsage env v' [], Var v')
1200 e' -> scExpr (zapScSubst env) e'
1201
1202 scExpr' env (Type t) = return (nullUsage, Type (scSubstTy env t))
1203 scExpr' env (Coercion c) = return (nullUsage, Coercion (scSubstCo env c))
1204 scExpr' _ e@(Lit {}) = return (nullUsage, e)
1205 scExpr' env (Tick t e) = do (usg, e') <- scExpr env e
1206 return (usg, Tick t e')
1207 scExpr' env (Cast e co) = do (usg, e') <- scExpr env e
1208 return (usg, mkCast e' (scSubstCo env co))
1209 -- Important to use mkCast here
1210 -- See Note [SpecConstr call patterns]
1211 scExpr' env e@(App _ _) = scApp env (collectArgs e)
1212 scExpr' env (Lam b e) = do let (env', b') = extendBndr env b
1213 (usg, e') <- scExpr env' e
1214 return (usg, Lam b' e')
1215
1216 scExpr' env (Case scrut b ty alts)
1217 = do { (scrut_usg, scrut') <- scExpr env scrut
1218 ; case isValue (sc_vals env) scrut' of
1219 Just (ConVal con args) -> sc_con_app con args scrut'
1220 _other -> sc_vanilla scrut_usg scrut'
1221 }
1222 where
1223 sc_con_app con args scrut' -- Known constructor; simplify
1224 = do { let (_, bs, rhs) = findAlt con alts
1225 `orElse` (DEFAULT, [], mkImpossibleExpr ty)
1226 alt_env' = extendScSubstList env ((b,scrut') : bs `zip` trimConArgs con args)
1227 ; scExpr alt_env' rhs }
1228
1229 sc_vanilla scrut_usg scrut' -- Normal case
1230 = do { let (alt_env,b') = extendBndrWith RecArg env b
1231 -- Record RecArg for the components
1232
1233 ; (alt_usgs, alt_occs, alts')
1234 <- mapAndUnzip3M (sc_alt alt_env scrut' b') alts
1235
1236 ; let scrut_occ = foldr combineOcc NoOcc alt_occs
1237 scrut_usg' = setScrutOcc env scrut_usg scrut' scrut_occ
1238 -- The combined usage of the scrutinee is given
1239 -- by scrut_occ, which is passed to scScrut, which
1240 -- in turn treats a bare-variable scrutinee specially
1241
1242 ; return (foldr combineUsage scrut_usg' alt_usgs,
1243 Case scrut' b' (scSubstTy env ty) alts') }
1244
1245 sc_alt env scrut' b' (con,bs,rhs)
1246 = do { let (env1, bs1) = extendBndrsWith RecArg env bs
1247 (env2, bs2) = extendCaseBndrs env1 scrut' b' con bs1
1248 ; (usg, rhs') <- scExpr env2 rhs
1249 ; let (usg', b_occ:arg_occs) = lookupOccs usg (b':bs2)
1250 scrut_occ = case con of
1251 DataAlt dc -> ScrutOcc (unitUFM dc arg_occs)
1252 _ -> ScrutOcc emptyUFM
1253 ; return (usg', b_occ `combineOcc` scrut_occ, (con, bs2, rhs')) }
1254
1255 scExpr' env (Let (NonRec bndr rhs) body)
1256 | isTyVar bndr -- Type-lets may be created by doBeta
1257 = scExpr' (extendScSubst env bndr rhs) body
1258
1259 | otherwise
1260 = do { let (body_env, bndr') = extendBndr env bndr
1261 ; rhs_info <- scRecRhs env (bndr',rhs)
1262
1263 ; let body_env2 = extendHowBound body_env [bndr'] RecFun
1264 -- Note [Local let bindings]
1265 rhs' = ri_new_rhs rhs_info
1266 body_env3 = extendValEnv body_env2 bndr' (isValue (sc_vals env) rhs')
1267
1268 ; (body_usg, body') <- scExpr body_env3 body
1269
1270 -- NB: For non-recursive bindings we inherit sc_force flag from
1271 -- the parent function (see Note [Forcing specialisation])
1272 ; (spec_usg, specs) <- specNonRec env body_usg rhs_info
1273
1274 ; return (body_usg { scu_calls = scu_calls body_usg `delVarEnv` bndr' }
1275 `combineUsage` spec_usg, -- Note [spec_usg includes rhs_usg]
1276 mkLets [NonRec b r | (b,r) <- ruleInfoBinds rhs_info specs] body')
1277 }
1278
1279
1280 -- A *local* recursive group: see Note [Local recursive groups]
1281 scExpr' env (Let (Rec prs) body)
1282 = do { let (bndrs,rhss) = unzip prs
1283 (rhs_env1,bndrs') = extendRecBndrs env bndrs
1284 rhs_env2 = extendHowBound rhs_env1 bndrs' RecFun
1285 force_spec = any (forceSpecBndr env) bndrs'
1286 -- Note [Forcing specialisation]
1287
1288 ; rhs_infos <- mapM (scRecRhs rhs_env2) (bndrs' `zip` rhss)
1289 ; (body_usg, body') <- scExpr rhs_env2 body
1290
1291 -- NB: start specLoop from body_usg
1292 ; (spec_usg, specs) <- specRec NotTopLevel (scForce rhs_env2 force_spec)
1293 body_usg rhs_infos
1294 -- Do not unconditionally generate specialisations from rhs_usgs
1295 -- Instead use them only if we find an unspecialised call
1296 -- See Note [Local recursive groups]
1297
1298 ; let all_usg = spec_usg `combineUsage` body_usg -- Note [spec_usg includes rhs_usg]
1299 bind' = Rec (concat (zipWith ruleInfoBinds rhs_infos specs))
1300
1301 ; return (all_usg { scu_calls = scu_calls all_usg `delVarEnvList` bndrs' },
1302 Let bind' body') }
1303
1304 {-
1305 Note [Local let bindings]
1306 ~~~~~~~~~~~~~~~~~~~~~~~~~
1307 It is not uncommon to find this
1308
1309 let $j = \x. <blah> in ...$j True...$j True...
1310
1311 Here $j is an arbitrary let-bound function, but it often comes up for
1312 join points. We might like to specialise $j for its call patterns.
1313 Notice the difference from a letrec, where we look for call patterns
1314 in the *RHS* of the function. Here we look for call patterns in the
1315 *body* of the let.
1316
1317 At one point I predicated this on the RHS mentioning the outer
1318 recursive function, but that's not essential and might even be
1319 harmful. I'm not sure.
1320 -}
1321
1322 scApp :: ScEnv -> (InExpr, [InExpr]) -> UniqSM (ScUsage, CoreExpr)
1323
1324 scApp env (Var fn, args) -- Function is a variable
1325 = ASSERT( not (null args) )
1326 do { args_w_usgs <- mapM (scExpr env) args
1327 ; let (arg_usgs, args') = unzip args_w_usgs
1328 arg_usg = combineUsages arg_usgs
1329 ; case scSubstId env fn of
1330 fn'@(Lam {}) -> scExpr (zapScSubst env) (doBeta fn' args')
1331 -- Do beta-reduction and try again
1332
1333 Var fn' -> return (arg_usg `combineUsage` mkVarUsage env fn' args',
1334 mkApps (Var fn') args')
1335
1336 other_fn' -> return (arg_usg, mkApps other_fn' args') }
1337 -- NB: doing this ignores any usage info from the substituted
1338 -- function, but I don't think that matters. If it does
1339 -- we can fix it.
1340 where
1341 doBeta :: OutExpr -> [OutExpr] -> OutExpr
1342 -- ToDo: adjust for System IF
1343 doBeta (Lam bndr body) (arg : args) = Let (NonRec bndr arg) (doBeta body args)
1344 doBeta fn args = mkApps fn args
1345
1346 -- The function is almost always a variable, but not always.
1347 -- In particular, if this pass follows float-in,
1348 -- which it may, we can get
1349 -- (let f = ...f... in f) arg1 arg2
1350 scApp env (other_fn, args)
1351 = do { (fn_usg, fn') <- scExpr env other_fn
1352 ; (arg_usgs, args') <- mapAndUnzipM (scExpr env) args
1353 ; return (combineUsages arg_usgs `combineUsage` fn_usg, mkApps fn' args') }
1354
1355 ----------------------
1356 mkVarUsage :: ScEnv -> Id -> [CoreExpr] -> ScUsage
1357 mkVarUsage env fn args
1358 = case lookupHowBound env fn of
1359 Just RecFun -> SCU { scu_calls = unitVarEnv fn [Call fn args (sc_vals env)]
1360 , scu_occs = emptyVarEnv }
1361 Just RecArg -> SCU { scu_calls = emptyVarEnv
1362 , scu_occs = unitVarEnv fn arg_occ }
1363 Nothing -> nullUsage
1364 where
1365 -- I rather think we could use UnkOcc all the time
1366 arg_occ | null args = UnkOcc
1367 | otherwise = evalScrutOcc
1368
1369 ----------------------
1370 scTopBindEnv :: ScEnv -> CoreBind -> UniqSM (ScEnv, CoreBind)
1371 scTopBindEnv env (Rec prs)
1372 = do { let (rhs_env1,bndrs') = extendRecBndrs env bndrs
1373 rhs_env2 = extendHowBound rhs_env1 bndrs RecFun
1374
1375 prs' = zip bndrs' rhss
1376 ; return (rhs_env2, Rec prs') }
1377 where
1378 (bndrs,rhss) = unzip prs
1379
1380 scTopBindEnv env (NonRec bndr rhs)
1381 = do { let (env1, bndr') = extendBndr env bndr
1382 env2 = extendValEnv env1 bndr' (isValue (sc_vals env) rhs)
1383 ; return (env2, NonRec bndr' rhs) }
1384
1385 ----------------------
1386 scTopBind :: ScEnv -> ScUsage -> CoreBind -> UniqSM (ScUsage, CoreBind)
1387
1388 {-
1389 scTopBind _ usage _
1390 | pprTrace "scTopBind_usage" (ppr (scu_calls usage)) False
1391 = error "false"
1392 -}
1393
1394 scTopBind env body_usage (Rec prs)
1395 | Just threshold <- sc_size env
1396 , not force_spec
1397 , not (all (couldBeSmallEnoughToInline (sc_dflags env) threshold) rhss)
1398 -- No specialisation
1399 = -- pprTrace "scTopBind: nospec" (ppr bndrs) $
1400 do { (rhs_usgs, rhss') <- mapAndUnzipM (scExpr env) rhss
1401 ; return (body_usage `combineUsage` combineUsages rhs_usgs, Rec (bndrs `zip` rhss')) }
1402
1403 | otherwise -- Do specialisation
1404 = do { rhs_infos <- mapM (scRecRhs env) prs
1405
1406 ; (spec_usage, specs) <- specRec TopLevel (scForce env force_spec)
1407 body_usage rhs_infos
1408
1409 ; return (body_usage `combineUsage` spec_usage,
1410 Rec (concat (zipWith ruleInfoBinds rhs_infos specs))) }
1411 where
1412 (bndrs,rhss) = unzip prs
1413 force_spec = any (forceSpecBndr env) bndrs
1414 -- Note [Forcing specialisation]
1415
1416 scTopBind env usage (NonRec bndr rhs) -- Oddly, we don't seem to specialise top-level non-rec functions
1417 = do { (rhs_usg', rhs') <- scExpr env rhs
1418 ; return (usage `combineUsage` rhs_usg', NonRec bndr rhs') }
1419
1420 ----------------------
1421 scRecRhs :: ScEnv -> (OutId, InExpr) -> UniqSM RhsInfo
1422 scRecRhs env (bndr,rhs)
1423 = do { let (arg_bndrs,body) = collectBinders rhs
1424 (body_env, arg_bndrs') = extendBndrsWith RecArg env arg_bndrs
1425 ; (body_usg, body') <- scExpr body_env body
1426 ; let (rhs_usg, arg_occs) = lookupOccs body_usg arg_bndrs'
1427 ; return (RI { ri_rhs_usg = rhs_usg
1428 , ri_fn = bndr, ri_new_rhs = mkLams arg_bndrs' body'
1429 , ri_lam_bndrs = arg_bndrs, ri_lam_body = body
1430 , ri_arg_occs = arg_occs }) }
1431 -- The arg_occs says how the visible,
1432 -- lambda-bound binders of the RHS are used
1433 -- (including the TyVar binders)
1434 -- Two pats are the same if they match both ways
1435
1436 ----------------------
1437 ruleInfoBinds :: RhsInfo -> SpecInfo -> [(Id,CoreExpr)]
1438 ruleInfoBinds (RI { ri_fn = fn, ri_new_rhs = new_rhs })
1439 (SI { si_specs = specs })
1440 = [(id,rhs) | OS { os_id = id, os_rhs = rhs } <- specs] ++
1441 -- First the specialised bindings
1442
1443 [(fn `addIdSpecialisations` rules, new_rhs)]
1444 -- And now the original binding
1445 where
1446 rules = [r | OS { os_rule = r } <- specs]
1447
1448 {-
1449 ************************************************************************
1450 * *
1451 The specialiser itself
1452 * *
1453 ************************************************************************
1454 -}
1455
1456 data RhsInfo
1457 = RI { ri_fn :: OutId -- The binder
1458 , ri_new_rhs :: OutExpr -- The specialised RHS (in current envt)
1459 , ri_rhs_usg :: ScUsage -- Usage info from specialising RHS
1460
1461 , ri_lam_bndrs :: [InVar] -- The *original* RHS (\xs.body)
1462 , ri_lam_body :: InExpr -- Note [Specialise original body]
1463 , ri_arg_occs :: [ArgOcc] -- Info on how the xs occur in body
1464 }
1465
1466 data SpecInfo -- Info about specialisations for a particular Id
1467 = SI { si_specs :: [OneSpec] -- The specialisations we have generated
1468
1469 , si_n_specs :: Int -- Length of si_specs; used for numbering them
1470
1471 , si_mb_unspec :: Maybe ScUsage -- Just cs => we have not yet used calls in the
1472 } -- from calls in the *original* RHS as
1473 -- seeds for new specialisations;
1474 -- if you decide to do so, here is the
1475 -- RHS usage (which has not yet been
1476 -- unleashed)
1477 -- Nothing => we have
1478 -- See Note [Local recursive groups]
1479 -- See Note [spec_usg includes rhs_usg]
1480
1481 -- One specialisation: Rule plus definition
1482 data OneSpec =
1483 OS { os_pat :: CallPat -- Call pattern that generated this specialisation
1484 , os_rule :: CoreRule -- Rule connecting original id with the specialisation
1485 , os_id :: OutId -- Spec id
1486 , os_rhs :: OutExpr } -- Spec rhs
1487
1488 noSpecInfo :: SpecInfo
1489 noSpecInfo = SI { si_specs = [], si_n_specs = 0, si_mb_unspec = Nothing }
1490
1491 ----------------------
1492 specNonRec :: ScEnv
1493 -> ScUsage -- Body usage
1494 -> RhsInfo -- Structure info usage info for un-specialised RHS
1495 -> UniqSM (ScUsage, SpecInfo) -- Usage from RHSs (specialised and not)
1496 -- plus details of specialisations
1497
1498 specNonRec env body_usg rhs_info
1499 = specialise env (scu_calls body_usg) rhs_info
1500 (noSpecInfo { si_mb_unspec = Just (ri_rhs_usg rhs_info) })
1501
1502 ----------------------
1503 specRec :: TopLevelFlag -> ScEnv
1504 -> ScUsage -- Body usage
1505 -> [RhsInfo] -- Structure info and usage info for un-specialised RHSs
1506 -> UniqSM (ScUsage, [SpecInfo]) -- Usage from all RHSs (specialised and not)
1507 -- plus details of specialisations
1508
1509 specRec top_lvl env body_usg rhs_infos
1510 = go 1 seed_calls nullUsage init_spec_infos
1511 where
1512 (seed_calls, init_spec_infos) -- Note [Seeding top-level recursive groups]
1513 | isTopLevel top_lvl
1514 , any (isExportedId . ri_fn) rhs_infos -- Seed from body and RHSs
1515 = (all_calls, [noSpecInfo | _ <- rhs_infos])
1516 | otherwise -- Seed from body only
1517 = (calls_in_body, [noSpecInfo { si_mb_unspec = Just (ri_rhs_usg ri) }
1518 | ri <- rhs_infos])
1519
1520 calls_in_body = scu_calls body_usg
1521 calls_in_rhss = foldr (combineCalls . scu_calls . ri_rhs_usg) emptyVarEnv rhs_infos
1522 all_calls = calls_in_rhss `combineCalls` calls_in_body
1523
1524 -- Loop, specialising, until you get no new specialisations
1525 go :: Int -- Which iteration of the "until no new specialisations"
1526 -- loop we are on; first iteration is 1
1527 -> CallEnv -- Seed calls
1528 -- Two accumulating parameters:
1529 -> ScUsage -- Usage from earlier specialisations
1530 -> [SpecInfo] -- Details of specialisations so far
1531 -> UniqSM (ScUsage, [SpecInfo])
1532 go n_iter seed_calls usg_so_far spec_infos
1533 | isEmptyVarEnv seed_calls
1534 = -- pprTrace "specRec1" (vcat [ ppr (map ri_fn rhs_infos)
1535 -- , ppr seed_calls
1536 -- , ppr body_usg ]) $
1537 return (usg_so_far, spec_infos)
1538
1539 -- Limit recursive specialisation
1540 -- See Note [Limit recursive specialisation]
1541 | n_iter > sc_recursive env -- Too many iterations of the 'go' loop
1542 , sc_force env || isNothing (sc_count env)
1543 -- If both of these are false, the sc_count
1544 -- threshold will prevent non-termination
1545 , any ((> the_limit) . si_n_specs) spec_infos
1546 = -- pprTrace "specRec2" (ppr (map (map os_pat . si_specs) spec_infos)) $
1547 return (usg_so_far, spec_infos)
1548
1549 | otherwise
1550 = -- pprTrace "specRec3" (vcat [ text "bndrs" <+> ppr (map ri_fn rhs_infos)
1551 -- , text "iteration" <+> int n_iter
1552 -- , text "spec_infos" <+> ppr (map (map os_pat . si_specs) spec_infos)
1553 -- ]) $
1554 do { specs_w_usg <- zipWithM (specialise env seed_calls) rhs_infos spec_infos
1555 ; let (extra_usg_s, new_spec_infos) = unzip specs_w_usg
1556 extra_usg = combineUsages extra_usg_s
1557 all_usg = usg_so_far `combineUsage` extra_usg
1558 ; go (n_iter + 1) (scu_calls extra_usg) all_usg new_spec_infos }
1559
1560 -- See Note [Limit recursive specialisation]
1561 the_limit = case sc_count env of
1562 Nothing -> 10 -- Ugh!
1563 Just max -> max
1564
1565
1566 ----------------------
1567 specialise
1568 :: ScEnv
1569 -> CallEnv -- Info on newly-discovered calls to this function
1570 -> RhsInfo
1571 -> SpecInfo -- Original RHS plus patterns dealt with
1572 -> UniqSM (ScUsage, SpecInfo) -- New specialised versions and their usage
1573
1574 -- See Note [spec_usg includes rhs_usg]
1575
1576 -- Note: this only generates *specialised* bindings
1577 -- The original binding is added by ruleInfoBinds
1578 --
1579 -- Note: the rhs here is the optimised version of the original rhs
1580 -- So when we make a specialised copy of the RHS, we're starting
1581 -- from an RHS whose nested functions have been optimised already.
1582
1583 specialise env bind_calls (RI { ri_fn = fn, ri_lam_bndrs = arg_bndrs
1584 , ri_lam_body = body, ri_arg_occs = arg_occs })
1585 spec_info@(SI { si_specs = specs, si_n_specs = spec_count
1586 , si_mb_unspec = mb_unspec })
1587 | isBottomingId fn -- Note [Do not specialise diverging functions]
1588 -- and do not generate specialisation seeds from its RHS
1589 = -- pprTrace "specialise bot" (ppr fn) $
1590 return (nullUsage, spec_info)
1591
1592 | isNeverActive (idInlineActivation fn) -- See Note [Transfer activation]
1593 || null arg_bndrs -- Only specialise functions
1594 = -- pprTrace "specialise inactive" (ppr fn) $
1595 case mb_unspec of -- Behave as if there was a single, boring call
1596 Just rhs_usg -> return (rhs_usg, spec_info { si_mb_unspec = Nothing })
1597 -- See Note [spec_usg includes rhs_usg]
1598 Nothing -> return (nullUsage, spec_info)
1599
1600 | Just all_calls <- lookupVarEnv bind_calls fn
1601 = -- pprTrace "specialise entry {" (ppr fn <+> ppr all_calls) $
1602 do { (boring_call, new_pats) <- callsToNewPats env fn spec_info arg_occs all_calls
1603
1604 ; let n_pats = length new_pats
1605 -- ; if (not (null new_pats) || isJust mb_unspec) then
1606 -- pprTrace "specialise" (vcat [ ppr fn <+> text "with" <+> int n_pats <+> text "good patterns"
1607 -- , text "mb_unspec" <+> ppr (isJust mb_unspec)
1608 -- , text "arg_occs" <+> ppr arg_occs
1609 -- , text "good pats" <+> ppr new_pats]) $
1610 -- return ()
1611 -- else return ()
1612
1613 ; let spec_env = decreaseSpecCount env n_pats
1614 ; (spec_usgs, new_specs) <- mapAndUnzipM (spec_one spec_env fn arg_bndrs body)
1615 (new_pats `zip` [spec_count..])
1616 -- See Note [Specialise original body]
1617
1618 ; let spec_usg = combineUsages spec_usgs
1619
1620 -- If there were any boring calls among the seeds (= all_calls), then those
1621 -- calls will call the un-specialised function. So we should use the seeds
1622 -- from the _unspecialised_ function's RHS, which are in mb_unspec, by returning
1623 -- then in new_usg.
1624 (new_usg, mb_unspec')
1625 = case mb_unspec of
1626 Just rhs_usg | boring_call -> (spec_usg `combineUsage` rhs_usg, Nothing)
1627 _ -> (spec_usg, mb_unspec)
1628
1629 -- ; pprTrace "specialise return }"
1630 -- (vcat [ ppr fn
1631 -- , text "boring_call:" <+> ppr boring_call
1632 -- , text "new calls:" <+> ppr (scu_calls new_usg)]) $
1633 -- return ()
1634
1635 ; return (new_usg, SI { si_specs = new_specs ++ specs
1636 , si_n_specs = spec_count + n_pats
1637 , si_mb_unspec = mb_unspec' }) }
1638
1639 | otherwise -- No new seeds, so return nullUsage
1640 = return (nullUsage, spec_info)
1641
1642
1643
1644
1645 ---------------------
1646 spec_one :: ScEnv
1647 -> OutId -- Function
1648 -> [InVar] -- Lambda-binders of RHS; should match patterns
1649 -> InExpr -- Body of the original function
1650 -> (CallPat, Int)
1651 -> UniqSM (ScUsage, OneSpec) -- Rule and binding
1652
1653 -- spec_one creates a specialised copy of the function, together
1654 -- with a rule for using it. I'm very proud of how short this
1655 -- function is, considering what it does :-).
1656
1657 {-
1658 Example
1659
1660 In-scope: a, x::a
1661 f = /\b \y::[(a,b)] -> ....f (b,c) ((:) (a,(b,c)) (x,v) (h w))...
1662 [c::*, v::(b,c) are presumably bound by the (...) part]
1663 ==>
1664 f_spec = /\ b c \ v::(b,c) hw::[(a,(b,c))] ->
1665 (...entire body of f...) [b -> (b,c),
1666 y -> ((:) (a,(b,c)) (x,v) hw)]
1667
1668 RULE: forall b::* c::*, -- Note, *not* forall a, x
1669 v::(b,c),
1670 hw::[(a,(b,c))] .
1671
1672 f (b,c) ((:) (a,(b,c)) (x,v) hw) = f_spec b c v hw
1673 -}
1674
1675 spec_one env fn arg_bndrs body (call_pat@(qvars, pats), rule_number)
1676 = do { spec_uniq <- getUniqueM
1677 ; let spec_env = extendScSubstList (extendScInScope env qvars)
1678 (arg_bndrs `zip` pats)
1679 fn_name = idName fn
1680 fn_loc = nameSrcSpan fn_name
1681 fn_occ = nameOccName fn_name
1682 spec_occ = mkSpecOcc fn_occ
1683 -- We use fn_occ rather than fn in the rule_name string
1684 -- as we don't want the uniq to end up in the rule, and
1685 -- hence in the ABI, as that can cause spurious ABI
1686 -- changes (#4012).
1687 rule_name = mkFastString ("SC:" ++ occNameString fn_occ ++ show rule_number)
1688 spec_name = mkInternalName spec_uniq spec_occ fn_loc
1689 -- ; pprTrace "{spec_one" (ppr (sc_count env) <+> ppr fn
1690 -- <+> ppr pats <+> text "-->" <+> ppr spec_name) $
1691 -- return ()
1692
1693 -- Specialise the body
1694 ; (spec_usg, spec_body) <- scExpr spec_env body
1695
1696 -- ; pprTrace "done spec_one}" (ppr fn) $
1697 -- return ()
1698
1699 -- And build the results
1700 ; let (spec_lam_args, spec_call_args) = mkWorkerArgs (sc_dflags env)
1701 qvars body_ty
1702 -- Usual w/w hack to avoid generating
1703 -- a spec_rhs of unlifted type and no args
1704
1705 spec_lam_args_str = handOutStrictnessInformation (fst (splitStrictSig spec_str)) spec_lam_args
1706 -- Annotate the variables with the strictness information from
1707 -- the function (see Note [Strictness information in worker binders])
1708
1709 spec_join_arity | isJoinId fn = Just (length spec_lam_args)
1710 | otherwise = Nothing
1711 spec_id = mkLocalIdOrCoVar spec_name
1712 (mkLamTypes spec_lam_args body_ty)
1713 -- See Note [Transfer strictness]
1714 `setIdStrictness` spec_str
1715 `setIdArity` count isId spec_lam_args
1716 `asJoinId_maybe` spec_join_arity
1717 spec_str = calcSpecStrictness fn spec_lam_args pats
1718
1719
1720 -- Conditionally use result of new worker-wrapper transform
1721 spec_rhs = mkLams spec_lam_args_str spec_body
1722 body_ty = exprType spec_body
1723 rule_rhs = mkVarApps (Var spec_id) spec_call_args
1724 inline_act = idInlineActivation fn
1725 this_mod = sc_module spec_env
1726 rule = mkRule this_mod True {- Auto -} True {- Local -}
1727 rule_name inline_act fn_name qvars pats rule_rhs
1728 -- See Note [Transfer activation]
1729 ; return (spec_usg, OS { os_pat = call_pat, os_rule = rule
1730 , os_id = spec_id
1731 , os_rhs = spec_rhs }) }
1732
1733
1734 -- See Note [Strictness information in worker binders]
1735 handOutStrictnessInformation :: [Demand] -> [Var] -> [Var]
1736 handOutStrictnessInformation = go
1737 where
1738 go _ [] = []
1739 go [] vs = vs
1740 go (d:dmds) (v:vs) | isId v = setIdDemandInfo v d : go dmds vs
1741 go dmds (v:vs) = v : go dmds vs
1742
1743 calcSpecStrictness :: Id -- The original function
1744 -> [Var] -> [CoreExpr] -- Call pattern
1745 -> StrictSig -- Strictness of specialised thing
1746 -- See Note [Transfer strictness]
1747 calcSpecStrictness fn qvars pats
1748 = mkClosedStrictSig spec_dmds topRes
1749 where
1750 spec_dmds = [ lookupVarEnv dmd_env qv `orElse` topDmd | qv <- qvars, isId qv ]
1751 StrictSig (DmdType _ dmds _) = idStrictness fn
1752
1753 dmd_env = go emptyVarEnv dmds pats
1754
1755 go :: DmdEnv -> [Demand] -> [CoreExpr] -> DmdEnv
1756 go env ds (Type {} : pats) = go env ds pats
1757 go env ds (Coercion {} : pats) = go env ds pats
1758 go env (d:ds) (pat : pats) = go (go_one env d pat) ds pats
1759 go env _ _ = env
1760
1761 go_one :: DmdEnv -> Demand -> CoreExpr -> DmdEnv
1762 go_one env d (Var v) = extendVarEnv_C bothDmd env v d
1763 go_one env d e
1764 | Just ds <- splitProdDmd_maybe d -- NB: d does not have to be strict
1765 , (Var _, args) <- collectArgs e = go env ds args
1766 go_one env _ _ = env
1767
1768 {-
1769 Note [spec_usg includes rhs_usg]
1770 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1771 In calls to 'specialise', the returned ScUsage must include the rhs_usg in
1772 the passed-in SpecInfo, unless there are no calls at all to the function.
1773
1774 The caller can, indeed must, assume this. He should not combine in rhs_usg
1775 himself, or he'll get rhs_usg twice -- and that can lead to an exponential
1776 blowup of duplicates in the CallEnv. This is what gave rise to the massive
1777 performance loss in Trac #8852.
1778
1779 Note [Specialise original body]
1780 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1781 The RhsInfo for a binding keeps the *original* body of the binding. We
1782 must specialise that, *not* the result of applying specExpr to the RHS
1783 (which is also kept in RhsInfo). Otherwise we end up specialising a
1784 specialised RHS, and that can lead directly to exponential behaviour.
1785
1786 Note [Transfer activation]
1787 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1788 This note is for SpecConstr, but exactly the same thing
1789 happens in the overloading specialiser; see
1790 Note [Auto-specialisation and RULES] in Specialise.
1791
1792 In which phase should the specialise-constructor rules be active?
1793 Originally I made them always-active, but Manuel found that this
1794 defeated some clever user-written rules. Then I made them active only
1795 in Phase 0; after all, currently, the specConstr transformation is
1796 only run after the simplifier has reached Phase 0, but that meant
1797 that specialisations didn't fire inside wrappers; see test
1798 simplCore/should_compile/spec-inline.
1799
1800 So now I just use the inline-activation of the parent Id, as the
1801 activation for the specialiation RULE, just like the main specialiser;
1802
1803 This in turn means there is no point in specialising NOINLINE things,
1804 so we test for that.
1805
1806 Note [Transfer strictness]
1807 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1808 We must transfer strictness information from the original function to
1809 the specialised one. Suppose, for example
1810
1811 f has strictness SS
1812 and a RULE f (a:as) b = f_spec a as b
1813
1814 Now we want f_spec to have strictness LLS, otherwise we'll use call-by-need
1815 when calling f_spec instead of call-by-value. And that can result in
1816 unbounded worsening in space (cf the classic foldl vs foldl')
1817
1818 See Trac #3437 for a good example.
1819
1820 The function calcSpecStrictness performs the calculation.
1821
1822 Note [Strictness information in worker binders]
1823 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1824
1825 After having calculated the strictness annotation for the worker (see Note
1826 [Transfer strictness] above), we also want to have this information attached to
1827 the worker’s arguments, for the benefit of later passes. The function
1828 handOutStrictnessInformation decomposes the strictness annotation calculated by
1829 calcSpecStrictness and attaches them to the variables.
1830
1831 ************************************************************************
1832 * *
1833 \subsection{Argument analysis}
1834 * *
1835 ************************************************************************
1836
1837 This code deals with analysing call-site arguments to see whether
1838 they are constructor applications.
1839
1840 Note [Free type variables of the qvar types]
1841 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1842 In a call (f @a x True), that we want to specialise, what variables should
1843 we quantify over. Clearly over 'a' and 'x', but what about any type variables
1844 free in x's type? In fact we don't need to worry about them because (f @a)
1845 can only be a well-typed application if its type is compatible with x, so any
1846 variables free in x's type must be free in (f @a), and hence either be gathered
1847 via 'a' itself, or be in scope at f's defn. Hence we just take
1848 (exprsFreeVars pats).
1849
1850 BUT phantom type synonyms can mess this reasoning up,
1851 eg x::T b with type T b = Int
1852 So we apply expandTypeSynonyms to the bound Ids.
1853 See Trac # 5458. Yuk.
1854
1855 Note [SpecConstr call patterns]
1856 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1857 A "call patterns" that we collect is going to become the LHS of a RULE.
1858 It's important that it doesn't have
1859 e |> Refl
1860 or
1861 e |> g1 |> g2
1862 because both of these will be optimised by Simplify.simplRule. In the
1863 former case such optimisation benign, because the rule will match more
1864 terms; but in the latter we may lose a binding of 'g1' or 'g2', and
1865 end up with a rule LHS that doesn't bind the template variables
1866 (Trac #10602).
1867
1868 The simplifier eliminates such things, but SpecConstr itself constructs
1869 new terms by substituting. So the 'mkCast' in the Cast case of scExpr
1870 is very important!
1871
1872 Note [Choosing patterns]
1873 ~~~~~~~~~~~~~~~~~~~~~~~~
1874 If we get lots of patterns we may not want to make a specialisation
1875 for each of them (code bloat), so we choose as follows, implemented
1876 by trim_pats.
1877
1878 * The flag -fspec-constr-count-N sets the sc_count field
1879 of the ScEnv to (Just n). This limits the total number
1880 of specialisations for a given function to N.
1881
1882 * -fno-spec-constr-count sets the sc_count field to Nothing,
1883 which switches of the limit.
1884
1885 * The ghastly ForceSpecConstr trick also switches of the limit
1886 for a particular function
1887
1888 * Otherwise we sort the patterns to choose the most general
1889 ones first; more general => more widely applicable.
1890
1891 Note [SpecConstr and casts]
1892 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1893 Consider (Trac #14270) a call like
1894
1895 let f = e
1896 in ... f (K @(a |> co)) ...
1897
1898 where 'co' is a coercion variable not in scope at f's definition site.
1899 If we aren't caereful we'll get
1900
1901 let $sf a co = e (K @(a |> co))
1902 RULE "SC:f" forall a co. f (K @(a |> co)) = $sf a co
1903 f = e
1904 in ...
1905
1906 But alas, when we match the call we won't bind 'co', because type-matching
1907 (for good reasons) discards casts).
1908
1909 I don't know how to solve this, so for now I'm just discarding any
1910 call patterns that
1911 * Mentions a coercion variable
1912 * That is not in scope at the binding of the function
1913
1914 I think this is very rare.
1915
1916 Note that this /also/ discards the call pattern if we have a cast in a
1917 /term/, although in fact Rules.match does make a very flaky and
1918 fragile attempt to match coercions. e.g. a call like
1919 f (Maybe Age) (Nothing |> co) blah
1920 where co :: Maybe Int ~ Maybe Age
1921 will be discarded. It's extremely fragile to match on the form of a
1922 coercion, so I think it's better just not to try. A more complicated
1923 alternative would be to discard calls that mention coercion variables
1924 only in kind-casts, but I'm doing the simple thing for now.
1925 -}
1926
1927 type CallPat = ([Var], [CoreExpr]) -- Quantified variables and arguments
1928 -- See Note [SpecConstr call patterns]
1929
1930 callsToNewPats :: ScEnv -> Id
1931 -> SpecInfo
1932 -> [ArgOcc] -> [Call]
1933 -> UniqSM (Bool, [CallPat])
1934 -- Result has no duplicate patterns,
1935 -- nor ones mentioned in done_pats
1936 -- Bool indicates that there was at least one boring pattern
1937 callsToNewPats env fn spec_info@(SI { si_specs = done_specs }) bndr_occs calls
1938 = do { mb_pats <- mapM (callToPats env bndr_occs) calls
1939
1940 ; let have_boring_call = any isNothing mb_pats
1941
1942 good_pats :: [CallPat]
1943 good_pats = catMaybes mb_pats
1944
1945 -- Remove patterns we have already done
1946 new_pats = filterOut is_done good_pats
1947 is_done p = any (samePat p . os_pat) done_specs
1948
1949 -- Remove duplicates
1950 non_dups = nubBy samePat new_pats
1951
1952 -- Remove ones that have too many worker variables
1953 small_pats = filterOut too_big non_dups
1954 too_big (vars,_) = not (isWorkerSmallEnough (sc_dflags env) vars)
1955 -- We are about to construct w/w pair in 'spec_one'.
1956 -- Omit specialisation leading to high arity workers.
1957 -- See Note [Limit w/w arity] in WwLib
1958
1959 -- Discard specialisations if there are too many of them
1960 trimmed_pats = trim_pats env fn spec_info small_pats
1961
1962 -- ; pprTrace "callsToPats" (vcat [ text "calls to" <+> ppr fn <> colon <+> ppr calls
1963 -- , text "done_specs:" <+> ppr (map os_pat done_specs)
1964 -- , text "good_pats:" <+> ppr good_pats ]) $
1965 -- return ()
1966
1967 ; return (have_boring_call, trimmed_pats) }
1968
1969
1970 trim_pats :: ScEnv -> Id -> SpecInfo -> [CallPat] -> [CallPat]
1971 -- See Note [Choosing patterns]
1972 trim_pats env fn (SI { si_n_specs = done_spec_count }) pats
1973 | sc_force env
1974 || isNothing mb_scc
1975 || n_remaining >= n_pats
1976 = -- pprTrace "trim_pats: no-trim" (ppr (sc_force env) $$ ppr mb_scc $$ ppr n_remaining $$ ppr n_pats)
1977 pats -- No need to trim
1978
1979 | otherwise
1980 = emit_trace $ -- Need to trim, so keep the best ones
1981 take n_remaining sorted_pats
1982
1983 where
1984 n_pats = length pats
1985 spec_count' = n_pats + done_spec_count
1986 n_remaining = max_specs - done_spec_count
1987 mb_scc = sc_count env
1988 Just max_specs = mb_scc
1989
1990 sorted_pats = map fst $
1991 sortBy (comparing snd) $
1992 [(pat, pat_cons pat) | pat <- pats]
1993 -- Sort in order of increasing number of constructors
1994 -- (i.e. decreasing generality) and pick the initial
1995 -- segment of this list
1996
1997 pat_cons :: CallPat -> Int
1998 -- How many data constructors of literals are in
1999 -- the pattern. More data-cons => less general
2000 pat_cons (qs, ps) = foldr ((+) . n_cons) 0 ps
2001 where
2002 q_set = mkVarSet qs
2003 n_cons (Var v) | v `elemVarSet` q_set = 0
2004 | otherwise = 1
2005 n_cons (Cast e _) = n_cons e
2006 n_cons (App e1 e2) = n_cons e1 + n_cons e2
2007 n_cons (Lit {}) = 1
2008 n_cons _ = 0
2009
2010 emit_trace result
2011 | debugIsOn || hasPprDebug (sc_dflags env)
2012 -- Suppress this scary message for ordinary users! Trac #5125
2013 = pprTrace "SpecConstr" msg result
2014 | otherwise
2015 = result
2016 msg = vcat [ sep [ text "Function" <+> quotes (ppr fn)
2017 , nest 2 (text "has" <+>
2018 speakNOf spec_count' (text "call pattern") <> comma <+>
2019 text "but the limit is" <+> int max_specs) ]
2020 , text "Use -fspec-constr-count=n to set the bound"
2021 , text "done_spec_count =" <+> int done_spec_count
2022 , text "Keeping " <+> int n_remaining <> text ", out of" <+> int n_pats
2023 , text "Discarding:" <+> ppr (drop n_remaining sorted_pats) ]
2024
2025
2026 callToPats :: ScEnv -> [ArgOcc] -> Call -> UniqSM (Maybe CallPat)
2027 -- The [Var] is the variables to quantify over in the rule
2028 -- Type variables come first, since they may scope
2029 -- over the following term variables
2030 -- The [CoreExpr] are the argument patterns for the rule
2031 callToPats env bndr_occs call@(Call _ args con_env)
2032 | args `ltLength` bndr_occs -- Check saturated
2033 = return Nothing
2034 | otherwise
2035 = do { let in_scope = substInScope (sc_subst env)
2036 ; (interesting, pats) <- argsToPats env in_scope con_env args bndr_occs
2037 ; let pat_fvs = exprsFreeVarsList pats
2038 -- To get determinism we need the list of free variables in
2039 -- deterministic order. Otherwise we end up creating
2040 -- lambdas with different argument orders. See
2041 -- determinism/simplCore/should_compile/spec-inline-determ.hs
2042 -- for an example. For explanation of determinism
2043 -- considerations See Note [Unique Determinism] in Unique.
2044 in_scope_vars = getInScopeVars in_scope
2045 qvars = filterOut (`elemVarSet` in_scope_vars) pat_fvs
2046 -- Quantify over variables that are not in scope
2047 -- at the call site
2048 -- See Note [Free type variables of the qvar types]
2049 -- See Note [Shadowing] at the top
2050
2051 (ktvs, ids) = partition isTyVar qvars
2052 qvars' = toposortTyVars ktvs ++ map sanitise ids
2053 -- Order into kind variables, type variables, term variables
2054 -- The kind of a type variable may mention a kind variable
2055 -- and the type of a term variable may mention a type variable
2056
2057 sanitise id = id `setIdType` expandTypeSynonyms (idType id)
2058 -- See Note [Free type variables of the qvar types]
2059
2060 bad_covars = filter isCoVar ids
2061 -- See Note [SpecConstr and casts]
2062
2063 ; -- pprTrace "callToPats" (ppr args $$ ppr bndr_occs) $
2064 WARN( not (null bad_covars), text "SpecConstr: bad covars:" <+> ppr bad_covars
2065 $$ ppr call )
2066 if interesting && null bad_covars
2067 then return (Just (qvars', pats))
2068 else return Nothing }
2069
2070 -- argToPat takes an actual argument, and returns an abstracted
2071 -- version, consisting of just the "constructor skeleton" of the
2072 -- argument, with non-constructor sub-expression replaced by new
2073 -- placeholder variables. For example:
2074 -- C a (D (f x) (g y)) ==> C p1 (D p2 p3)
2075
2076 argToPat :: ScEnv
2077 -> InScopeSet -- What's in scope at the fn defn site
2078 -> ValueEnv -- ValueEnv at the call site
2079 -> CoreArg -- A call arg (or component thereof)
2080 -> ArgOcc
2081 -> UniqSM (Bool, CoreArg)
2082
2083 -- Returns (interesting, pat),
2084 -- where pat is the pattern derived from the argument
2085 -- interesting=True if the pattern is non-trivial (not a variable or type)
2086 -- E.g. x:xs --> (True, x:xs)
2087 -- f xs --> (False, w) where w is a fresh wildcard
2088 -- (f xs, 'c') --> (True, (w, 'c')) where w is a fresh wildcard
2089 -- \x. x+y --> (True, \x. x+y)
2090 -- lvl7 --> (True, lvl7) if lvl7 is bound
2091 -- somewhere further out
2092
2093 argToPat _env _in_scope _val_env arg@(Type {}) _arg_occ
2094 = return (False, arg)
2095
2096 argToPat env in_scope val_env (Tick _ arg) arg_occ
2097 = argToPat env in_scope val_env arg arg_occ
2098 -- Note [Notes in call patterns]
2099 -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2100 -- Ignore Notes. In particular, we want to ignore any InlineMe notes
2101 -- Perhaps we should not ignore profiling notes, but I'm going to
2102 -- ride roughshod over them all for now.
2103 --- See Note [Notes in RULE matching] in Rules
2104
2105 argToPat env in_scope val_env (Let _ arg) arg_occ
2106 = argToPat env in_scope val_env arg arg_occ
2107 -- See Note [Matching lets] in Rule.hs
2108 -- Look through let expressions
2109 -- e.g. f (let v = rhs in (v,w))
2110 -- Here we can specialise for f (v,w)
2111 -- because the rule-matcher will look through the let.
2112
2113 {- Disabled; see Note [Matching cases] in Rule.hs
2114 argToPat env in_scope val_env (Case scrut _ _ [(_, _, rhs)]) arg_occ
2115 | exprOkForSpeculation scrut -- See Note [Matching cases] in Rule.hhs
2116 = argToPat env in_scope val_env rhs arg_occ
2117 -}
2118
2119 argToPat env in_scope val_env (Cast arg co) arg_occ
2120 | not (ignoreType env ty2)
2121 = do { (interesting, arg') <- argToPat env in_scope val_env arg arg_occ
2122 ; if not interesting then
2123 wildCardPat ty2
2124 else do
2125 { -- Make a wild-card pattern for the coercion
2126 uniq <- getUniqueM
2127 ; let co_name = mkSysTvName uniq (fsLit "sg")
2128 co_var = mkCoVar co_name (mkCoercionType Representational ty1 ty2)
2129 ; return (interesting, Cast arg' (mkCoVarCo co_var)) } }
2130 where
2131 Pair ty1 ty2 = coercionKind co
2132
2133
2134
2135 {- Disabling lambda specialisation for now
2136 It's fragile, and the spec_loop can be infinite
2137 argToPat in_scope val_env arg arg_occ
2138 | is_value_lam arg
2139 = return (True, arg)
2140 where
2141 is_value_lam (Lam v e) -- Spot a value lambda, even if
2142 | isId v = True -- it is inside a type lambda
2143 | otherwise = is_value_lam e
2144 is_value_lam other = False
2145 -}
2146
2147 -- Check for a constructor application
2148 -- NB: this *precedes* the Var case, so that we catch nullary constrs
2149 argToPat env in_scope val_env arg arg_occ
2150 | Just (ConVal (DataAlt dc) args) <- isValue val_env arg
2151 , not (ignoreDataCon env dc) -- See Note [NoSpecConstr]
2152 , Just arg_occs <- mb_scrut dc
2153 = do { let (ty_args, rest_args) = splitAtList (dataConUnivTyVars dc) args
2154 ; (_, args') <- argsToPats env in_scope val_env rest_args arg_occs
2155 ; return (True,
2156 mkConApp dc (ty_args ++ args')) }
2157 where
2158 mb_scrut dc = case arg_occ of
2159 ScrutOcc bs | Just occs <- lookupUFM bs dc
2160 -> Just (occs) -- See Note [Reboxing]
2161 _other | sc_force env || sc_keen env
2162 -> Just (repeat UnkOcc)
2163 | otherwise
2164 -> Nothing
2165
2166 -- Check if the argument is a variable that
2167 -- (a) is used in an interesting way in the function body
2168 -- (b) we know what its value is
2169 -- In that case it counts as "interesting"
2170 argToPat env in_scope val_env (Var v) arg_occ
2171 | sc_force env || case arg_occ of { UnkOcc -> False; _other -> True }, -- (a)
2172 is_value, -- (b)
2173 -- Ignoring sc_keen here to avoid gratuitously incurring Note [Reboxing]
2174 -- So sc_keen focused just on f (I# x), where we have freshly-allocated
2175 -- box that we can eliminate in the caller
2176 not (ignoreType env (varType v))
2177 = return (True, Var v)
2178 where
2179 is_value
2180 | isLocalId v = v `elemInScopeSet` in_scope
2181 && isJust (lookupVarEnv val_env v)
2182 -- Local variables have values in val_env
2183 | otherwise = isValueUnfolding (idUnfolding v)
2184 -- Imports have unfoldings
2185
2186 -- I'm really not sure what this comment means
2187 -- And by not wild-carding we tend to get forall'd
2188 -- variables that are in scope, which in turn can
2189 -- expose the weakness in let-matching
2190 -- See Note [Matching lets] in Rules
2191
2192 -- Check for a variable bound inside the function.
2193 -- Don't make a wild-card, because we may usefully share
2194 -- e.g. f a = let x = ... in f (x,x)
2195 -- NB: this case follows the lambda and con-app cases!!
2196 -- argToPat _in_scope _val_env (Var v) _arg_occ
2197 -- = return (False, Var v)
2198 -- SLPJ : disabling this to avoid proliferation of versions
2199 -- also works badly when thinking about seeding the loop
2200 -- from the body of the let
2201 -- f x y = letrec g z = ... in g (x,y)
2202 -- We don't want to specialise for that *particular* x,y
2203
2204 -- The default case: make a wild-card
2205 -- We use this for coercions too
2206 argToPat _env _in_scope _val_env arg _arg_occ
2207 = wildCardPat (exprType arg)
2208
2209 wildCardPat :: Type -> UniqSM (Bool, CoreArg)
2210 wildCardPat ty
2211 = do { uniq <- getUniqueM
2212 ; let id = mkSysLocalOrCoVar (fsLit "sc") uniq ty
2213 ; return (False, varToCoreExpr id) }
2214
2215 argsToPats :: ScEnv -> InScopeSet -> ValueEnv
2216 -> [CoreArg] -> [ArgOcc] -- Should be same length
2217 -> UniqSM (Bool, [CoreArg])
2218 argsToPats env in_scope val_env args occs
2219 = do { stuff <- zipWithM (argToPat env in_scope val_env) args occs
2220 ; let (interesting_s, args') = unzip stuff
2221 ; return (or interesting_s, args') }
2222
2223 isValue :: ValueEnv -> CoreExpr -> Maybe Value
2224 isValue _env (Lit lit)
2225 | litIsLifted lit = Nothing
2226 | otherwise = Just (ConVal (LitAlt lit) [])
2227
2228 isValue env (Var v)
2229 | Just cval <- lookupVarEnv env v
2230 = Just cval -- You might think we could look in the idUnfolding here
2231 -- but that doesn't take account of which branch of a
2232 -- case we are in, which is the whole point
2233
2234 | not (isLocalId v) && isCheapUnfolding unf
2235 = isValue env (unfoldingTemplate unf)
2236 where
2237 unf = idUnfolding v
2238 -- However we do want to consult the unfolding
2239 -- as well, for let-bound constructors!
2240
2241 isValue env (Lam b e)
2242 | isTyVar b = case isValue env e of
2243 Just _ -> Just LambdaVal
2244 Nothing -> Nothing
2245 | otherwise = Just LambdaVal
2246
2247 isValue env (Tick t e)
2248 | not (tickishIsCode t)
2249 = isValue env e
2250
2251 isValue _env expr -- Maybe it's a constructor application
2252 | (Var fun, args, _) <- collectArgsTicks (not . tickishIsCode) expr
2253 = case isDataConWorkId_maybe fun of
2254
2255 Just con | args `lengthAtLeast` dataConRepArity con
2256 -- Check saturated; might be > because the
2257 -- arity excludes type args
2258 -> Just (ConVal (DataAlt con) args)
2259
2260 _other | valArgCount args < idArity fun
2261 -- Under-applied function
2262 -> Just LambdaVal -- Partial application
2263
2264 _other -> Nothing
2265
2266 isValue _env _expr = Nothing
2267
2268 valueIsWorkFree :: Value -> Bool
2269 valueIsWorkFree LambdaVal = True
2270 valueIsWorkFree (ConVal _ args) = all exprIsWorkFree args
2271
2272 samePat :: CallPat -> CallPat -> Bool
2273 samePat (vs1, as1) (vs2, as2)
2274 = all2 same as1 as2
2275 where
2276 same (Var v1) (Var v2)
2277 | v1 `elem` vs1 = v2 `elem` vs2
2278 | v2 `elem` vs2 = False
2279 | otherwise = v1 == v2
2280
2281 same (Lit l1) (Lit l2) = l1==l2
2282 same (App f1 a1) (App f2 a2) = same f1 f2 && same a1 a2
2283
2284 same (Type {}) (Type {}) = True -- Note [Ignore type differences]
2285 same (Coercion {}) (Coercion {}) = True
2286 same (Tick _ e1) e2 = same e1 e2 -- Ignore casts and notes
2287 same (Cast e1 _) e2 = same e1 e2
2288 same e1 (Tick _ e2) = same e1 e2
2289 same e1 (Cast e2 _) = same e1 e2
2290
2291 same e1 e2 = WARN( bad e1 || bad e2, ppr e1 $$ ppr e2)
2292 False -- Let, lambda, case should not occur
2293 bad (Case {}) = True
2294 bad (Let {}) = True
2295 bad (Lam {}) = True
2296 bad _other = False
2297
2298 {-
2299 Note [Ignore type differences]
2300 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2301 We do not want to generate specialisations where the call patterns
2302 differ only in their type arguments! Not only is it utterly useless,
2303 but it also means that (with polymorphic recursion) we can generate
2304 an infinite number of specialisations. Example is Data.Sequence.adjustTree,
2305 I think.
2306 -}