Revert "CoreTidy: Don't seq unfoldings"
[ghc.git] / compiler / specialise / SpecConstr.hs
1 {-
2 ToDo [Oct 2013]
3 ~~~~~~~~~~~~~~~
4 1. Nuke ForceSpecConstr for good (it is subsumed by GHC.Types.SPEC in ghc-prim)
5 2. Nuke NoSpecConstr
6
7
8 (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
9
10 \section[SpecConstr]{Specialise over constructors}
11 -}
12
13 {-# LANGUAGE CPP #-}
14
15 module SpecConstr(
16 specConstrProgram,
17 SpecConstrAnnotation(..)
18 ) where
19
20 #include "HsVersions.h"
21
22 import CoreSyn
23 import CoreSubst
24 import CoreUtils
25 import CoreUnfold ( couldBeSmallEnoughToInline )
26 import CoreFVs ( exprsFreeVarsList )
27 import CoreMonad
28 import Literal ( litIsLifted )
29 import HscTypes ( ModGuts(..) )
30 import WwLib ( isWorkerSmallEnough, mkWorkerArgs )
31 import DataCon
32 import Coercion hiding( substCo )
33 import Rules
34 import Type hiding ( substTy )
35 import TyCon ( tyConName )
36 import Id
37 import PprCore ( pprParendExpr )
38 import MkCore ( mkImpossibleExpr )
39 import Var
40 import VarEnv
41 import VarSet
42 import Name
43 import BasicTypes
44 import DynFlags ( DynFlags(..), GeneralFlag( Opt_SpecConstrKeen )
45 , gopt, hasPprDebug )
46 import Maybes ( orElse, catMaybes, isJust, isNothing )
47 import Demand
48 import GHC.Serialized ( deserializeWithData )
49 import Util
50 import Pair
51 import UniqSupply
52 import Outputable
53 import FastString
54 import UniqFM
55 import MonadUtils
56 import Control.Monad ( zipWithM )
57 import Data.List
58 import PrelNames ( specTyConName )
59 import Module
60
61 -- See Note [Forcing specialisation]
62
63 import TyCon ( TyCon )
64 import GHC.Exts( SpecConstrAnnotation(..) )
65 import Data.Ord( comparing )
66
67 {-
68 -----------------------------------------------------
69 Game plan
70 -----------------------------------------------------
71
72 Consider
73 drop n [] = []
74 drop 0 xs = []
75 drop n (x:xs) = drop (n-1) xs
76
77 After the first time round, we could pass n unboxed. This happens in
78 numerical code too. Here's what it looks like in Core:
79
80 drop n xs = case xs of
81 [] -> []
82 (y:ys) -> case n of
83 I# n# -> case n# of
84 0 -> []
85 _ -> drop (I# (n# -# 1#)) xs
86
87 Notice that the recursive call has an explicit constructor as argument.
88 Noticing this, we can make a specialised version of drop
89
90 RULE: drop (I# n#) xs ==> drop' n# xs
91
92 drop' n# xs = let n = I# n# in ...orig RHS...
93
94 Now the simplifier will apply the specialisation in the rhs of drop', giving
95
96 drop' n# xs = case xs of
97 [] -> []
98 (y:ys) -> case n# of
99 0 -> []
100 _ -> drop' (n# -# 1#) xs
101
102 Much better!
103
104 We'd also like to catch cases where a parameter is carried along unchanged,
105 but evaluated each time round the loop:
106
107 f i n = if i>0 || i>n then i else f (i*2) n
108
109 Here f isn't strict in n, but we'd like to avoid evaluating it each iteration.
110 In Core, by the time we've w/wd (f is strict in i) we get
111
112 f i# n = case i# ># 0 of
113 False -> I# i#
114 True -> case n of { I# n# ->
115 case i# ># n# of
116 False -> I# i#
117 True -> f (i# *# 2#) n
118
119 At the call to f, we see that the argument, n is known to be (I# n#),
120 and n is evaluated elsewhere in the body of f, so we can play the same
121 trick as above.
122
123
124 Note [Reboxing]
125 ~~~~~~~~~~~~~~~
126 We must be careful not to allocate the same constructor twice. Consider
127 f p = (...(case p of (a,b) -> e)...p...,
128 ...let t = (r,s) in ...t...(f t)...)
129 At the recursive call to f, we can see that t is a pair. But we do NOT want
130 to make a specialised copy:
131 f' a b = let p = (a,b) in (..., ...)
132 because now t is allocated by the caller, then r and s are passed to the
133 recursive call, which allocates the (r,s) pair again.
134
135 This happens if
136 (a) the argument p is used in other than a case-scrutinisation way.
137 (b) the argument to the call is not a 'fresh' tuple; you have to
138 look into its unfolding to see that it's a tuple
139
140 Hence the "OR" part of Note [Good arguments] below.
141
142 ALTERNATIVE 2: pass both boxed and unboxed versions. This no longer saves
143 allocation, but does perhaps save evals. In the RULE we'd have
144 something like
145
146 f (I# x#) = f' (I# x#) x#
147
148 If at the call site the (I# x) was an unfolding, then we'd have to
149 rely on CSE to eliminate the duplicate allocation.... This alternative
150 doesn't look attractive enough to pursue.
151
152 ALTERNATIVE 3: ignore the reboxing problem. The trouble is that
153 the conservative reboxing story prevents many useful functions from being
154 specialised. Example:
155 foo :: Maybe Int -> Int -> Int
156 foo (Just m) 0 = 0
157 foo x@(Just m) n = foo x (n-m)
158 Here the use of 'x' will clearly not require boxing in the specialised function.
159
160 The strictness analyser has the same problem, in fact. Example:
161 f p@(a,b) = ...
162 If we pass just 'a' and 'b' to the worker, it might need to rebox the
163 pair to create (a,b). A more sophisticated analysis might figure out
164 precisely the cases in which this could happen, but the strictness
165 analyser does no such analysis; it just passes 'a' and 'b', and hopes
166 for the best.
167
168 So my current choice is to make SpecConstr similarly aggressive, and
169 ignore the bad potential of reboxing.
170
171
172 Note [Good arguments]
173 ~~~~~~~~~~~~~~~~~~~~~
174 So we look for
175
176 * A self-recursive function. Ignore mutual recursion for now,
177 because it's less common, and the code is simpler for self-recursion.
178
179 * EITHER
180
181 a) At a recursive call, one or more parameters is an explicit
182 constructor application
183 AND
184 That same parameter is scrutinised by a case somewhere in
185 the RHS of the function
186
187 OR
188
189 b) At a recursive call, one or more parameters has an unfolding
190 that is an explicit constructor application
191 AND
192 That same parameter is scrutinised by a case somewhere in
193 the RHS of the function
194 AND
195 Those are the only uses of the parameter (see Note [Reboxing])
196
197
198 What to abstract over
199 ~~~~~~~~~~~~~~~~~~~~~
200 There's a bit of a complication with type arguments. If the call
201 site looks like
202
203 f p = ...f ((:) [a] x xs)...
204
205 then our specialised function look like
206
207 f_spec x xs = let p = (:) [a] x xs in ....as before....
208
209 This only makes sense if either
210 a) the type variable 'a' is in scope at the top of f, or
211 b) the type variable 'a' is an argument to f (and hence fs)
212
213 Actually, (a) may hold for value arguments too, in which case
214 we may not want to pass them. Suppose 'x' is in scope at f's
215 defn, but xs is not. Then we'd like
216
217 f_spec xs = let p = (:) [a] x xs in ....as before....
218
219 Similarly (b) may hold too. If x is already an argument at the
220 call, no need to pass it again.
221
222 Finally, if 'a' is not in scope at the call site, we could abstract
223 it as we do the term variables:
224
225 f_spec a x xs = let p = (:) [a] x xs in ...as before...
226
227 So the grand plan is:
228
229 * abstract the call site to a constructor-only pattern
230 e.g. C x (D (f p) (g q)) ==> C s1 (D s2 s3)
231
232 * Find the free variables of the abstracted pattern
233
234 * Pass these variables, less any that are in scope at
235 the fn defn. But see Note [Shadowing] below.
236
237
238 NOTICE that we only abstract over variables that are not in scope,
239 so we're in no danger of shadowing variables used in "higher up"
240 in f_spec's RHS.
241
242
243 Note [Shadowing]
244 ~~~~~~~~~~~~~~~~
245 In this pass we gather up usage information that may mention variables
246 that are bound between the usage site and the definition site; or (more
247 seriously) may be bound to something different at the definition site.
248 For example:
249
250 f x = letrec g y v = let x = ...
251 in ...(g (a,b) x)...
252
253 Since 'x' is in scope at the call site, we may make a rewrite rule that
254 looks like
255 RULE forall a,b. g (a,b) x = ...
256 But this rule will never match, because it's really a different 'x' at
257 the call site -- and that difference will be manifest by the time the
258 simplifier gets to it. [A worry: the simplifier doesn't *guarantee*
259 no-shadowing, so perhaps it may not be distinct?]
260
261 Anyway, the rule isn't actually wrong, it's just not useful. One possibility
262 is to run deShadowBinds before running SpecConstr, but instead we run the
263 simplifier. That gives the simplest possible program for SpecConstr to
264 chew on; and it virtually guarantees no shadowing.
265
266 Note [Specialising for constant parameters]
267 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
268 This one is about specialising on a *constant* (but not necessarily
269 constructor) argument
270
271 foo :: Int -> (Int -> Int) -> Int
272 foo 0 f = 0
273 foo m f = foo (f m) (+1)
274
275 It produces
276
277 lvl_rmV :: GHC.Base.Int -> GHC.Base.Int
278 lvl_rmV =
279 \ (ds_dlk :: GHC.Base.Int) ->
280 case ds_dlk of wild_alH { GHC.Base.I# x_alG ->
281 GHC.Base.I# (GHC.Prim.+# x_alG 1)
282
283 T.$wfoo :: GHC.Prim.Int# -> (GHC.Base.Int -> GHC.Base.Int) ->
284 GHC.Prim.Int#
285 T.$wfoo =
286 \ (ww_sme :: GHC.Prim.Int#) (w_smg :: GHC.Base.Int -> GHC.Base.Int) ->
287 case ww_sme of ds_Xlw {
288 __DEFAULT ->
289 case w_smg (GHC.Base.I# ds_Xlw) of w1_Xmo { GHC.Base.I# ww1_Xmz ->
290 T.$wfoo ww1_Xmz lvl_rmV
291 };
292 0 -> 0
293 }
294
295 The recursive call has lvl_rmV as its argument, so we could create a specialised copy
296 with that argument baked in; that is, not passed at all. Now it can perhaps be inlined.
297
298 When is this worth it? Call the constant 'lvl'
299 - If 'lvl' has an unfolding that is a constructor, see if the corresponding
300 parameter is scrutinised anywhere in the body.
301
302 - If 'lvl' has an unfolding that is a inlinable function, see if the corresponding
303 parameter is applied (...to enough arguments...?)
304
305 Also do this is if the function has RULES?
306
307 Also
308
309 Note [Specialising for lambda parameters]
310 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
311 foo :: Int -> (Int -> Int) -> Int
312 foo 0 f = 0
313 foo m f = foo (f m) (\n -> n-m)
314
315 This is subtly different from the previous one in that we get an
316 explicit lambda as the argument:
317
318 T.$wfoo :: GHC.Prim.Int# -> (GHC.Base.Int -> GHC.Base.Int) ->
319 GHC.Prim.Int#
320 T.$wfoo =
321 \ (ww_sm8 :: GHC.Prim.Int#) (w_sma :: GHC.Base.Int -> GHC.Base.Int) ->
322 case ww_sm8 of ds_Xlr {
323 __DEFAULT ->
324 case w_sma (GHC.Base.I# ds_Xlr) of w1_Xmf { GHC.Base.I# ww1_Xmq ->
325 T.$wfoo
326 ww1_Xmq
327 (\ (n_ad3 :: GHC.Base.Int) ->
328 case n_ad3 of wild_alB { GHC.Base.I# x_alA ->
329 GHC.Base.I# (GHC.Prim.-# x_alA ds_Xlr)
330 })
331 };
332 0 -> 0
333 }
334
335 I wonder if SpecConstr couldn't be extended to handle this? After all,
336 lambda is a sort of constructor for functions and perhaps it already
337 has most of the necessary machinery?
338
339 Furthermore, there's an immediate win, because you don't need to allocate the lambda
340 at the call site; and if perchance it's called in the recursive call, then you
341 may avoid allocating it altogether. Just like for constructors.
342
343 Looks cool, but probably rare...but it might be easy to implement.
344
345
346 Note [SpecConstr for casts]
347 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
348 Consider
349 data family T a :: *
350 data instance T Int = T Int
351
352 foo n = ...
353 where
354 go (T 0) = 0
355 go (T n) = go (T (n-1))
356
357 The recursive call ends up looking like
358 go (T (I# ...) `cast` g)
359 So we want to spot the constructor application inside the cast.
360 That's why we have the Cast case in argToPat
361
362 Note [Local recursive groups]
363 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
364 For a *local* recursive group, we can see all the calls to the
365 function, so we seed the specialisation loop from the calls in the
366 body, not from the calls in the RHS. Consider:
367
368 bar m n = foo n (n,n) (n,n) (n,n) (n,n)
369 where
370 foo n p q r s
371 | n == 0 = m
372 | n > 3000 = case p of { (p1,p2) -> foo (n-1) (p2,p1) q r s }
373 | n > 2000 = case q of { (q1,q2) -> foo (n-1) p (q2,q1) r s }
374 | n > 1000 = case r of { (r1,r2) -> foo (n-1) p q (r2,r1) s }
375 | otherwise = case s of { (s1,s2) -> foo (n-1) p q r (s2,s1) }
376
377 If we start with the RHSs of 'foo', we get lots and lots of specialisations,
378 most of which are not needed. But if we start with the (single) call
379 in the rhs of 'bar' we get exactly one fully-specialised copy, and all
380 the recursive calls go to this fully-specialised copy. Indeed, the original
381 function is later collected as dead code. This is very important in
382 specialising the loops arising from stream fusion, for example in NDP where
383 we were getting literally hundreds of (mostly unused) specialisations of
384 a local function.
385
386 In a case like the above we end up never calling the original un-specialised
387 function. (Although we still leave its code around just in case.)
388
389 However, if we find any boring calls in the body, including *unsaturated*
390 ones, such as
391 letrec foo x y = ....foo...
392 in map foo xs
393 then we will end up calling the un-specialised function, so then we *should*
394 use the calls in the un-specialised RHS as seeds. We call these
395 "boring call patterns", and callsToPats reports if it finds any of these.
396
397 Note [Seeding top-level recursive groups]
398 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
399 This seeding is done in the binding for seed_calls in specRec.
400
401 1. If all the bindings in a top-level recursive group are local (not
402 exported), then all the calls are in the rest of the top-level
403 bindings. This means we can specialise with those call patterns
404 ONLY, and NOT with the RHSs of the recursive group (exactly like
405 Note [Local recursive groups])
406
407 2. But if any of the bindings are exported, the function may be called
408 with any old arguments, so (for lack of anything better) we specialise
409 based on
410 (a) the call patterns in the RHS
411 (b) the call patterns in the rest of the top-level bindings
412 NB: before Apr 15 we used (a) only, but Dimitrios had an example
413 where (b) was crucial, so I added that.
414 Adding (b) also improved nofib allocation results:
415 multiplier: 4% better
416 minimax: 2.8% better
417
418 Actually in case (2), instead of using the calls from the RHS, it
419 would be better to specialise in the importing module. We'd need to
420 add an INLINABLE pragma to the function, and then it can be
421 specialised in the importing scope, just as is done for type classes
422 in Specialise.specImports. This remains to be done (#10346).
423
424 Note [Top-level recursive groups]
425 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
426 To get the call usage information from "the rest of the top level
427 bindings" (c.f. Note [Seeding top-level recursive groups]), we work
428 backwards through the top-level bindings so we see the usage before we
429 get to the binding of the function. Before we can collect the usage
430 though, we go through all the bindings and add them to the
431 environment. This is necessary because usage is only tracked for
432 functions in the environment. These two passes are called
433 'go' and 'goEnv'
434 in specConstrProgram. (Looks a bit revolting to me.)
435
436 Note [Do not specialise diverging functions]
437 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
438 Specialising a function that just diverges is a waste of code.
439 Furthermore, it broke GHC (simpl014) thus:
440 {-# STR Sb #-}
441 f = \x. case x of (a,b) -> f x
442 If we specialise f we get
443 f = \x. case x of (a,b) -> fspec a b
444 But fspec doesn't have decent strictness info. As it happened,
445 (f x) :: IO t, so the state hack applied and we eta expanded fspec,
446 and hence f. But now f's strictness is less than its arity, which
447 breaks an invariant.
448
449
450 Note [Forcing specialisation]
451 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
452 With stream fusion and in other similar cases, we want to fully
453 specialise some (but not necessarily all!) loops regardless of their
454 size and the number of specialisations.
455
456 We allow a library to do this, in one of two ways (one which is
457 deprecated):
458
459 1) Add a parameter of type GHC.Types.SPEC (from ghc-prim) to the loop body.
460
461 2) (Deprecated) Annotate a type with ForceSpecConstr from GHC.Exts,
462 and then add *that* type as a parameter to the loop body
463
464 The reason #2 is deprecated is because it requires GHCi, which isn't
465 available for things like a cross compiler using stage1.
466
467 Here's a (simplified) example from the `vector` package. You may bring
468 the special 'force specialization' type into scope by saying:
469
470 import GHC.Types (SPEC(..))
471
472 or by defining your own type (again, deprecated):
473
474 data SPEC = SPEC | SPEC2
475 {-# ANN type SPEC ForceSpecConstr #-}
476
477 (Note this is the exact same definition of GHC.Types.SPEC, just
478 without the annotation.)
479
480 After that, you say:
481
482 foldl :: (a -> b -> a) -> a -> Stream b -> a
483 {-# INLINE foldl #-}
484 foldl f z (Stream step s _) = foldl_loop SPEC z s
485 where
486 foldl_loop !sPEC z s = case step s of
487 Yield x s' -> foldl_loop sPEC (f z x) s'
488 Skip -> foldl_loop sPEC z s'
489 Done -> z
490
491 SpecConstr will spot the SPEC parameter and always fully specialise
492 foldl_loop. Note that
493
494 * We have to prevent the SPEC argument from being removed by
495 w/w which is why (a) SPEC is a sum type, and (b) we have to seq on
496 the SPEC argument.
497
498 * And lastly, the SPEC argument is ultimately eliminated by
499 SpecConstr itself so there is no runtime overhead.
500
501 This is all quite ugly; we ought to come up with a better design.
502
503 ForceSpecConstr arguments are spotted in scExpr' and scTopBinds which then set
504 sc_force to True when calling specLoop. This flag does four things:
505 * Ignore specConstrThreshold, to specialise functions of arbitrary size
506 (see scTopBind)
507 * Ignore specConstrCount, to make arbitrary numbers of specialisations
508 (see specialise)
509 * Specialise even for arguments that are not scrutinised in the loop
510 (see argToPat; Trac #4488)
511 * Only specialise on recursive types a finite number of times
512 (see is_too_recursive; Trac #5550; Note [Limit recursive specialisation])
513
514 This flag is inherited for nested non-recursive bindings (which are likely to
515 be join points and hence should be fully specialised) but reset for nested
516 recursive bindings.
517
518 What alternatives did I consider? Annotating the loop itself doesn't
519 work because (a) it is local and (b) it will be w/w'ed and having
520 w/w propagating annotations somehow doesn't seem like a good idea. The
521 types of the loop arguments really seem to be the most persistent
522 thing.
523
524 Annotating the types that make up the loop state doesn't work,
525 either, because (a) it would prevent us from using types like Either
526 or tuples here, (b) we don't want to restrict the set of types that
527 can be used in Stream states and (c) some types are fixed by the user
528 (e.g., the accumulator here) but we still want to specialise as much
529 as possible.
530
531 Alternatives to ForceSpecConstr
532 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
533 Instead of giving the loop an extra argument of type SPEC, we
534 also considered *wrapping* arguments in SPEC, thus
535 data SPEC a = SPEC a | SPEC2
536
537 loop = \arg -> case arg of
538 SPEC state ->
539 case state of (x,y) -> ... loop (SPEC (x',y')) ...
540 S2 -> error ...
541 The idea is that a SPEC argument says "specialise this argument
542 regardless of whether the function case-analyses it". But this
543 doesn't work well:
544 * SPEC must still be a sum type, else the strictness analyser
545 eliminates it
546 * But that means that 'loop' won't be strict in its real payload
547 This loss of strictness in turn screws up specialisation, because
548 we may end up with calls like
549 loop (SPEC (case z of (p,q) -> (q,p)))
550 Without the SPEC, if 'loop' were strict, the case would move out
551 and we'd see loop applied to a pair. But if 'loop' isn't strict
552 this doesn't look like a specialisable call.
553
554 Note [Limit recursive specialisation]
555 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
556 It is possible for ForceSpecConstr to cause an infinite loop of specialisation.
557 Because there is no limit on the number of specialisations, a recursive call with
558 a recursive constructor as an argument (for example, list cons) will generate
559 a specialisation for that constructor. If the resulting specialisation also
560 contains a recursive call with the constructor, this could proceed indefinitely.
561
562 For example, if ForceSpecConstr is on:
563 loop :: [Int] -> [Int] -> [Int]
564 loop z [] = z
565 loop z (x:xs) = loop (x:z) xs
566 this example will create a specialisation for the pattern
567 loop (a:b) c = loop' a b c
568
569 loop' a b [] = (a:b)
570 loop' a b (x:xs) = loop (x:(a:b)) xs
571 and a new pattern is found:
572 loop (a:(b:c)) d = loop'' a b c d
573 which can continue indefinitely.
574
575 Roman's suggestion to fix this was to stop after a couple of times on recursive types,
576 but still specialising on non-recursive types as much as possible.
577
578 To implement this, we count the number of times we have gone round the
579 "specialise recursively" loop ('go' in 'specRec'). Once have gone round
580 more than N times (controlled by -fspec-constr-recursive=N) we check
581
582 - If sc_force is off, and sc_count is (Just max) then we don't
583 need to do anything: trim_pats will limit the number of specs
584
585 - Otherwise check if any function has now got more than (sc_count env)
586 specialisations. If sc_count is "no limit" then we arbitrarily
587 choose 10 as the limit (ugh).
588
589 See Trac #5550. Also Trac #13623, where this test had become over-agressive,
590 and we lost a wonderful specialisation that we really wanted!
591
592 Note [NoSpecConstr]
593 ~~~~~~~~~~~~~~~~~~~
594 The ignoreDataCon stuff allows you to say
595 {-# ANN type T NoSpecConstr #-}
596 to mean "don't specialise on arguments of this type". It was added
597 before we had ForceSpecConstr. Lacking ForceSpecConstr we specialised
598 regardless of size; and then we needed a way to turn that *off*. Now
599 that we have ForceSpecConstr, this NoSpecConstr is probably redundant.
600 (Used only for PArray.)
601
602 -----------------------------------------------------
603 Stuff not yet handled
604 -----------------------------------------------------
605
606 Here are notes arising from Roman's work that I don't want to lose.
607
608 Example 1
609 ~~~~~~~~~
610 data T a = T !a
611
612 foo :: Int -> T Int -> Int
613 foo 0 t = 0
614 foo x t | even x = case t of { T n -> foo (x-n) t }
615 | otherwise = foo (x-1) t
616
617 SpecConstr does no specialisation, because the second recursive call
618 looks like a boxed use of the argument. A pity.
619
620 $wfoo_sFw :: GHC.Prim.Int# -> T.T GHC.Base.Int -> GHC.Prim.Int#
621 $wfoo_sFw =
622 \ (ww_sFo [Just L] :: GHC.Prim.Int#) (w_sFq [Just L] :: T.T GHC.Base.Int) ->
623 case ww_sFo of ds_Xw6 [Just L] {
624 __DEFAULT ->
625 case GHC.Prim.remInt# ds_Xw6 2 of wild1_aEF [Dead Just A] {
626 __DEFAULT -> $wfoo_sFw (GHC.Prim.-# ds_Xw6 1) w_sFq;
627 0 ->
628 case w_sFq of wild_Xy [Just L] { T.T n_ad5 [Just U(L)] ->
629 case n_ad5 of wild1_aET [Just A] { GHC.Base.I# y_aES [Just L] ->
630 $wfoo_sFw (GHC.Prim.-# ds_Xw6 y_aES) wild_Xy
631 } } };
632 0 -> 0
633
634 Example 2
635 ~~~~~~~~~
636 data a :*: b = !a :*: !b
637 data T a = T !a
638
639 foo :: (Int :*: T Int) -> Int
640 foo (0 :*: t) = 0
641 foo (x :*: t) | even x = case t of { T n -> foo ((x-n) :*: t) }
642 | otherwise = foo ((x-1) :*: t)
643
644 Very similar to the previous one, except that the parameters are now in
645 a strict tuple. Before SpecConstr, we have
646
647 $wfoo_sG3 :: GHC.Prim.Int# -> T.T GHC.Base.Int -> GHC.Prim.Int#
648 $wfoo_sG3 =
649 \ (ww_sFU [Just L] :: GHC.Prim.Int#) (ww_sFW [Just L] :: T.T
650 GHC.Base.Int) ->
651 case ww_sFU of ds_Xws [Just L] {
652 __DEFAULT ->
653 case GHC.Prim.remInt# ds_Xws 2 of wild1_aEZ [Dead Just A] {
654 __DEFAULT ->
655 case ww_sFW of tpl_B2 [Just L] { T.T a_sFo [Just A] ->
656 $wfoo_sG3 (GHC.Prim.-# ds_Xws 1) tpl_B2 -- $wfoo1
657 };
658 0 ->
659 case ww_sFW of wild_XB [Just A] { T.T n_ad7 [Just S(L)] ->
660 case n_ad7 of wild1_aFd [Just L] { GHC.Base.I# y_aFc [Just L] ->
661 $wfoo_sG3 (GHC.Prim.-# ds_Xws y_aFc) wild_XB -- $wfoo2
662 } } };
663 0 -> 0 }
664
665 We get two specialisations:
666 "SC:$wfoo1" [0] __forall {a_sFB :: GHC.Base.Int sc_sGC :: GHC.Prim.Int#}
667 Foo.$wfoo sc_sGC (Foo.T @ GHC.Base.Int a_sFB)
668 = Foo.$s$wfoo1 a_sFB sc_sGC ;
669 "SC:$wfoo2" [0] __forall {y_aFp :: GHC.Prim.Int# sc_sGC :: GHC.Prim.Int#}
670 Foo.$wfoo sc_sGC (Foo.T @ GHC.Base.Int (GHC.Base.I# y_aFp))
671 = Foo.$s$wfoo y_aFp sc_sGC ;
672
673 But perhaps the first one isn't good. After all, we know that tpl_B2 is
674 a T (I# x) really, because T is strict and Int has one constructor. (We can't
675 unbox the strict fields, because T is polymorphic!)
676
677 ************************************************************************
678 * *
679 \subsection{Top level wrapper stuff}
680 * *
681 ************************************************************************
682 -}
683
684 specConstrProgram :: ModGuts -> CoreM ModGuts
685 specConstrProgram guts
686 = do
687 dflags <- getDynFlags
688 us <- getUniqueSupplyM
689 annos <- getFirstAnnotations deserializeWithData guts
690 this_mod <- getModule
691 let binds' = reverse $ fst $ initUs us $ do
692 -- Note [Top-level recursive groups]
693 (env, binds) <- goEnv (initScEnv dflags this_mod annos)
694 (mg_binds guts)
695 -- binds is identical to (mg_binds guts), except that the
696 -- binders on the LHS have been replaced by extendBndr
697 -- (SPJ this seems like overkill; I don't think the binders
698 -- will change at all; and we don't substitute in the RHSs anyway!!)
699 go env nullUsage (reverse binds)
700
701 return (guts { mg_binds = binds' })
702 where
703 -- See Note [Top-level recursive groups]
704 goEnv env [] = return (env, [])
705 goEnv env (bind:binds) = do (env', bind') <- scTopBindEnv env bind
706 (env'', binds') <- goEnv env' binds
707 return (env'', bind' : binds')
708
709 -- Arg list of bindings is in reverse order
710 go _ _ [] = return []
711 go env usg (bind:binds) = do (usg', bind') <- scTopBind env usg bind
712 binds' <- go env usg' binds
713 return (bind' : binds')
714
715 {-
716 ************************************************************************
717 * *
718 \subsection{Environment: goes downwards}
719 * *
720 ************************************************************************
721
722 Note [Work-free values only in environment]
723 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
724 The sc_vals field keeps track of in-scope value bindings, so
725 that if we come across (case x of Just y ->...) we can reduce the
726 case from knowing that x is bound to a pair.
727
728 But only *work-free* values are ok here. For example if the envt had
729 x -> Just (expensive v)
730 then we do NOT want to expand to
731 let y = expensive v in ...
732 because the x-binding still exists and we've now duplicated (expensive v).
733
734 This seldom happens because let-bound constructor applications are
735 ANF-ised, but it can happen as a result of on-the-fly transformations in
736 SpecConstr itself. Here is Trac #7865:
737
738 let {
739 a'_shr =
740 case xs_af8 of _ {
741 [] -> acc_af6;
742 : ds_dgt [Dmd=<L,A>] ds_dgu [Dmd=<L,A>] ->
743 (expensive x_af7, x_af7
744 } } in
745 let {
746 ds_sht =
747 case a'_shr of _ { (p'_afd, q'_afe) ->
748 TSpecConstr_DoubleInline.recursive
749 (GHC.Types.: @ GHC.Types.Int x_af7 wild_X6) (q'_afe, p'_afd)
750 } } in
751
752 When processed knowing that xs_af8 was bound to a cons, we simplify to
753 a'_shr = (expensive x_af7, x_af7)
754 and we do NOT want to inline that at the occurrence of a'_shr in ds_sht.
755 (There are other occurrences of a'_shr.) No no no.
756
757 It would be possible to do some on-the-fly ANF-ising, so that a'_shr turned
758 into a work-free value again, thus
759 a1 = expensive x_af7
760 a'_shr = (a1, x_af7)
761 but that's more work, so until its shown to be important I'm going to
762 leave it for now.
763
764 Note [Making SpecConstr keener]
765 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
766 Consider this, in (perf/should_run/T9339)
767 last (filter odd [1..1000])
768
769 After optimisation, including SpecConstr, we get:
770 f :: Int# -> Int -> Int
771 f x y = case case remInt# x 2# of
772 __DEFAULT -> case x of
773 __DEFAULT -> f (+# wild_Xp 1#) (I# x)
774 1000000# -> ...
775 0# -> case x of
776 __DEFAULT -> f (+# wild_Xp 1#) y
777 1000000# -> y
778
779 Not good! We build an (I# x) box every time around the loop.
780 SpecConstr (as described in the paper) does not specialise f, despite
781 the call (f ... (I# x)) because 'y' is not scrutinied in the body.
782 But it is much better to specialise f for the case where the argument
783 is of form (I# x); then we build the box only when returning y, which
784 is on the cold path.
785
786 Another example:
787
788 f x = ...(g x)....
789
790 Here 'x' is not scrutinised in f's body; but if we did specialise 'f'
791 then the call (g x) might allow 'g' to be specialised in turn.
792
793 So sc_keen controls whether or not we take account of whether argument is
794 scrutinised in the body. True <=> ignore that, and speicalise whenever
795 the function is applied to a data constructor.
796 -}
797
798 data ScEnv = SCE { sc_dflags :: DynFlags,
799 sc_module :: !Module,
800 sc_size :: Maybe Int, -- Size threshold
801 -- Nothing => no limit
802
803 sc_count :: Maybe Int, -- Max # of specialisations for any one fn
804 -- Nothing => no limit
805 -- See Note [Avoiding exponential blowup]
806
807 sc_recursive :: Int, -- Max # of specialisations over recursive type.
808 -- Stops ForceSpecConstr from diverging.
809
810 sc_keen :: Bool, -- Specialise on arguments that are known
811 -- constructors, even if they are not
812 -- scrutinised in the body. See
813 -- Note [Making SpecConstr keener]
814
815 sc_force :: Bool, -- Force specialisation?
816 -- See Note [Forcing specialisation]
817
818 sc_subst :: Subst, -- Current substitution
819 -- Maps InIds to OutExprs
820
821 sc_how_bound :: HowBoundEnv,
822 -- Binds interesting non-top-level variables
823 -- Domain is OutVars (*after* applying the substitution)
824
825 sc_vals :: ValueEnv,
826 -- Domain is OutIds (*after* applying the substitution)
827 -- Used even for top-level bindings (but not imported ones)
828 -- The range of the ValueEnv is *work-free* values
829 -- such as (\x. blah), or (Just v)
830 -- but NOT (Just (expensive v))
831 -- See Note [Work-free values only in environment]
832
833 sc_annotations :: UniqFM SpecConstrAnnotation
834 }
835
836 ---------------------
837 type HowBoundEnv = VarEnv HowBound -- Domain is OutVars
838
839 ---------------------
840 type ValueEnv = IdEnv Value -- Domain is OutIds
841 data Value = ConVal AltCon [CoreArg] -- _Saturated_ constructors
842 -- The AltCon is never DEFAULT
843 | LambdaVal -- Inlinable lambdas or PAPs
844
845 instance Outputable Value where
846 ppr (ConVal con args) = ppr con <+> interpp'SP args
847 ppr LambdaVal = text "<Lambda>"
848
849 ---------------------
850 initScEnv :: DynFlags -> Module -> UniqFM SpecConstrAnnotation -> ScEnv
851 initScEnv dflags this_mod anns
852 = SCE { sc_dflags = dflags,
853 sc_module = this_mod,
854 sc_size = specConstrThreshold dflags,
855 sc_count = specConstrCount dflags,
856 sc_recursive = specConstrRecursive dflags,
857 sc_keen = gopt Opt_SpecConstrKeen dflags,
858 sc_force = False,
859 sc_subst = emptySubst,
860 sc_how_bound = emptyVarEnv,
861 sc_vals = emptyVarEnv,
862 sc_annotations = anns }
863
864 data HowBound = RecFun -- These are the recursive functions for which
865 -- we seek interesting call patterns
866
867 | RecArg -- These are those functions' arguments, or their sub-components;
868 -- we gather occurrence information for these
869
870 instance Outputable HowBound where
871 ppr RecFun = text "RecFun"
872 ppr RecArg = text "RecArg"
873
874 scForce :: ScEnv -> Bool -> ScEnv
875 scForce env b = env { sc_force = b }
876
877 lookupHowBound :: ScEnv -> Id -> Maybe HowBound
878 lookupHowBound env id = lookupVarEnv (sc_how_bound env) id
879
880 scSubstId :: ScEnv -> Id -> CoreExpr
881 scSubstId env v = lookupIdSubst (text "scSubstId") (sc_subst env) v
882
883 scSubstTy :: ScEnv -> Type -> Type
884 scSubstTy env ty = substTy (sc_subst env) ty
885
886 scSubstCo :: ScEnv -> Coercion -> Coercion
887 scSubstCo env co = substCo (sc_subst env) co
888
889 zapScSubst :: ScEnv -> ScEnv
890 zapScSubst env = env { sc_subst = zapSubstEnv (sc_subst env) }
891
892 extendScInScope :: ScEnv -> [Var] -> ScEnv
893 -- Bring the quantified variables into scope
894 extendScInScope env qvars = env { sc_subst = extendInScopeList (sc_subst env) qvars }
895
896 -- Extend the substitution
897 extendScSubst :: ScEnv -> Var -> OutExpr -> ScEnv
898 extendScSubst env var expr = env { sc_subst = extendSubst (sc_subst env) var expr }
899
900 extendScSubstList :: ScEnv -> [(Var,OutExpr)] -> ScEnv
901 extendScSubstList env prs = env { sc_subst = extendSubstList (sc_subst env) prs }
902
903 extendHowBound :: ScEnv -> [Var] -> HowBound -> ScEnv
904 extendHowBound env bndrs how_bound
905 = env { sc_how_bound = extendVarEnvList (sc_how_bound env)
906 [(bndr,how_bound) | bndr <- bndrs] }
907
908 extendBndrsWith :: HowBound -> ScEnv -> [Var] -> (ScEnv, [Var])
909 extendBndrsWith how_bound env bndrs
910 = (env { sc_subst = subst', sc_how_bound = hb_env' }, bndrs')
911 where
912 (subst', bndrs') = substBndrs (sc_subst env) bndrs
913 hb_env' = sc_how_bound env `extendVarEnvList`
914 [(bndr,how_bound) | bndr <- bndrs']
915
916 extendBndrWith :: HowBound -> ScEnv -> Var -> (ScEnv, Var)
917 extendBndrWith how_bound env bndr
918 = (env { sc_subst = subst', sc_how_bound = hb_env' }, bndr')
919 where
920 (subst', bndr') = substBndr (sc_subst env) bndr
921 hb_env' = extendVarEnv (sc_how_bound env) bndr' how_bound
922
923 extendRecBndrs :: ScEnv -> [Var] -> (ScEnv, [Var])
924 extendRecBndrs env bndrs = (env { sc_subst = subst' }, bndrs')
925 where
926 (subst', bndrs') = substRecBndrs (sc_subst env) bndrs
927
928 extendBndr :: ScEnv -> Var -> (ScEnv, Var)
929 extendBndr env bndr = (env { sc_subst = subst' }, bndr')
930 where
931 (subst', bndr') = substBndr (sc_subst env) bndr
932
933 extendValEnv :: ScEnv -> Id -> Maybe Value -> ScEnv
934 extendValEnv env _ Nothing = env
935 extendValEnv env id (Just cv)
936 | valueIsWorkFree cv -- Don't duplicate work!! Trac #7865
937 = env { sc_vals = extendVarEnv (sc_vals env) id cv }
938 extendValEnv env _ _ = env
939
940 extendCaseBndrs :: ScEnv -> OutExpr -> OutId -> AltCon -> [Var] -> (ScEnv, [Var])
941 -- When we encounter
942 -- case scrut of b
943 -- C x y -> ...
944 -- we want to bind b, to (C x y)
945 -- NB1: Extends only the sc_vals part of the envt
946 -- NB2: Kill the dead-ness info on the pattern binders x,y, since
947 -- they are potentially made alive by the [b -> C x y] binding
948 extendCaseBndrs env scrut case_bndr con alt_bndrs
949 = (env2, alt_bndrs')
950 where
951 live_case_bndr = not (isDeadBinder case_bndr)
952 env1 | Var v <- stripTicksTopE (const True) scrut
953 = extendValEnv env v cval
954 | otherwise = env -- See Note [Add scrutinee to ValueEnv too]
955 env2 | live_case_bndr = extendValEnv env1 case_bndr cval
956 | otherwise = env1
957
958 alt_bndrs' | case scrut of { Var {} -> True; _ -> live_case_bndr }
959 = map zap alt_bndrs
960 | otherwise
961 = alt_bndrs
962
963 cval = case con of
964 DEFAULT -> Nothing
965 LitAlt {} -> Just (ConVal con [])
966 DataAlt {} -> Just (ConVal con vanilla_args)
967 where
968 vanilla_args = map Type (tyConAppArgs (idType case_bndr)) ++
969 varsToCoreExprs alt_bndrs
970
971 zap v | isTyVar v = v -- See NB2 above
972 | otherwise = zapIdOccInfo v
973
974
975 decreaseSpecCount :: ScEnv -> Int -> ScEnv
976 -- See Note [Avoiding exponential blowup]
977 decreaseSpecCount env n_specs
978 = env { sc_count = case sc_count env of
979 Nothing -> Nothing
980 Just n -> Just (n `div` (n_specs + 1)) }
981 -- The "+1" takes account of the original function;
982 -- See Note [Avoiding exponential blowup]
983
984 ---------------------------------------------------
985 -- See Note [Forcing specialisation]
986 ignoreType :: ScEnv -> Type -> Bool
987 ignoreDataCon :: ScEnv -> DataCon -> Bool
988 forceSpecBndr :: ScEnv -> Var -> Bool
989
990 ignoreDataCon env dc = ignoreTyCon env (dataConTyCon dc)
991
992 ignoreType env ty
993 = case tyConAppTyCon_maybe ty of
994 Just tycon -> ignoreTyCon env tycon
995 _ -> False
996
997 ignoreTyCon :: ScEnv -> TyCon -> Bool
998 ignoreTyCon env tycon
999 = lookupUFM (sc_annotations env) tycon == Just NoSpecConstr
1000
1001 forceSpecBndr env var = forceSpecFunTy env . snd . splitForAllTys . varType $ var
1002
1003 forceSpecFunTy :: ScEnv -> Type -> Bool
1004 forceSpecFunTy env = any (forceSpecArgTy env) . fst . splitFunTys
1005
1006 forceSpecArgTy :: ScEnv -> Type -> Bool
1007 forceSpecArgTy env ty
1008 | Just ty' <- coreView ty = forceSpecArgTy env ty'
1009
1010 forceSpecArgTy env ty
1011 | Just (tycon, tys) <- splitTyConApp_maybe ty
1012 , tycon /= funTyCon
1013 = tyConName tycon == specTyConName
1014 || lookupUFM (sc_annotations env) tycon == Just ForceSpecConstr
1015 || any (forceSpecArgTy env) tys
1016
1017 forceSpecArgTy _ _ = False
1018
1019 {-
1020 Note [Add scrutinee to ValueEnv too]
1021 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1022 Consider this:
1023 case x of y
1024 (a,b) -> case b of c
1025 I# v -> ...(f y)...
1026 By the time we get to the call (f y), the ValueEnv
1027 will have a binding for y, and for c
1028 y -> (a,b)
1029 c -> I# v
1030 BUT that's not enough! Looking at the call (f y) we
1031 see that y is pair (a,b), but we also need to know what 'b' is.
1032 So in extendCaseBndrs we must *also* add the binding
1033 b -> I# v
1034 else we lose a useful specialisation for f. This is necessary even
1035 though the simplifier has systematically replaced uses of 'x' with 'y'
1036 and 'b' with 'c' in the code. The use of 'b' in the ValueEnv came
1037 from outside the case. See Trac #4908 for the live example.
1038
1039 Note [Avoiding exponential blowup]
1040 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1041 The sc_count field of the ScEnv says how many times we are prepared to
1042 duplicate a single function. But we must take care with recursive
1043 specialisations. Consider
1044
1045 let $j1 = let $j2 = let $j3 = ...
1046 in
1047 ...$j3...
1048 in
1049 ...$j2...
1050 in
1051 ...$j1...
1052
1053 If we specialise $j1 then in each specialisation (as well as the original)
1054 we can specialise $j2, and similarly $j3. Even if we make just *one*
1055 specialisation of each, because we also have the original we'll get 2^n
1056 copies of $j3, which is not good.
1057
1058 So when recursively specialising we divide the sc_count by the number of
1059 copies we are making at this level, including the original.
1060
1061
1062 ************************************************************************
1063 * *
1064 \subsection{Usage information: flows upwards}
1065 * *
1066 ************************************************************************
1067 -}
1068
1069 data ScUsage
1070 = SCU {
1071 scu_calls :: CallEnv, -- Calls
1072 -- The functions are a subset of the
1073 -- RecFuns in the ScEnv
1074
1075 scu_occs :: !(IdEnv ArgOcc) -- Information on argument occurrences
1076 } -- The domain is OutIds
1077
1078 type CallEnv = IdEnv [Call]
1079 data Call = Call Id [CoreArg] ValueEnv
1080 -- The arguments of the call, together with the
1081 -- env giving the constructor bindings at the call site
1082 -- We keep the function mainly for debug output
1083
1084 instance Outputable ScUsage where
1085 ppr (SCU { scu_calls = calls, scu_occs = occs })
1086 = text "SCU" <+> braces (sep [ ptext (sLit "calls =") <+> ppr calls
1087 , text "occs =" <+> ppr occs ])
1088
1089 instance Outputable Call where
1090 ppr (Call fn args _) = ppr fn <+> fsep (map pprParendExpr args)
1091
1092 nullUsage :: ScUsage
1093 nullUsage = SCU { scu_calls = emptyVarEnv, scu_occs = emptyVarEnv }
1094
1095 combineCalls :: CallEnv -> CallEnv -> CallEnv
1096 combineCalls = plusVarEnv_C (++)
1097 where
1098 -- plus cs ds | length res > 1
1099 -- = pprTrace "combineCalls" (vcat [ text "cs:" <+> ppr cs
1100 -- , text "ds:" <+> ppr ds])
1101 -- res
1102 -- | otherwise = res
1103 -- where
1104 -- res = cs ++ ds
1105
1106 combineUsage :: ScUsage -> ScUsage -> ScUsage
1107 combineUsage u1 u2 = SCU { scu_calls = combineCalls (scu_calls u1) (scu_calls u2),
1108 scu_occs = plusVarEnv_C combineOcc (scu_occs u1) (scu_occs u2) }
1109
1110 combineUsages :: [ScUsage] -> ScUsage
1111 combineUsages [] = nullUsage
1112 combineUsages us = foldr1 combineUsage us
1113
1114 lookupOccs :: ScUsage -> [OutVar] -> (ScUsage, [ArgOcc])
1115 lookupOccs (SCU { scu_calls = sc_calls, scu_occs = sc_occs }) bndrs
1116 = (SCU {scu_calls = sc_calls, scu_occs = delVarEnvList sc_occs bndrs},
1117 [lookupVarEnv sc_occs b `orElse` NoOcc | b <- bndrs])
1118
1119 data ArgOcc = NoOcc -- Doesn't occur at all; or a type argument
1120 | UnkOcc -- Used in some unknown way
1121
1122 | ScrutOcc -- See Note [ScrutOcc]
1123 (DataConEnv [ArgOcc]) -- How the sub-components are used
1124
1125 type DataConEnv a = UniqFM a -- Keyed by DataCon
1126
1127 {- Note [ScrutOcc]
1128 ~~~~~~~~~~~~~~~~~~~
1129 An occurrence of ScrutOcc indicates that the thing, or a `cast` version of the thing,
1130 is *only* taken apart or applied.
1131
1132 Functions, literal: ScrutOcc emptyUFM
1133 Data constructors: ScrutOcc subs,
1134
1135 where (subs :: UniqFM [ArgOcc]) gives usage of the *pattern-bound* components,
1136 The domain of the UniqFM is the Unique of the data constructor
1137
1138 The [ArgOcc] is the occurrences of the *pattern-bound* components
1139 of the data structure. E.g.
1140 data T a = forall b. MkT a b (b->a)
1141 A pattern binds b, x::a, y::b, z::b->a, but not 'a'!
1142
1143 -}
1144
1145 instance Outputable ArgOcc where
1146 ppr (ScrutOcc xs) = text "scrut-occ" <> ppr xs
1147 ppr UnkOcc = text "unk-occ"
1148 ppr NoOcc = text "no-occ"
1149
1150 evalScrutOcc :: ArgOcc
1151 evalScrutOcc = ScrutOcc emptyUFM
1152
1153 -- Experimentally, this vesion of combineOcc makes ScrutOcc "win", so
1154 -- that if the thing is scrutinised anywhere then we get to see that
1155 -- in the overall result, even if it's also used in a boxed way
1156 -- This might be too aggressive; see Note [Reboxing] Alternative 3
1157 combineOcc :: ArgOcc -> ArgOcc -> ArgOcc
1158 combineOcc NoOcc occ = occ
1159 combineOcc occ NoOcc = occ
1160 combineOcc (ScrutOcc xs) (ScrutOcc ys) = ScrutOcc (plusUFM_C combineOccs xs ys)
1161 combineOcc UnkOcc (ScrutOcc ys) = ScrutOcc ys
1162 combineOcc (ScrutOcc xs) UnkOcc = ScrutOcc xs
1163 combineOcc UnkOcc UnkOcc = UnkOcc
1164
1165 combineOccs :: [ArgOcc] -> [ArgOcc] -> [ArgOcc]
1166 combineOccs xs ys = zipWithEqual "combineOccs" combineOcc xs ys
1167
1168 setScrutOcc :: ScEnv -> ScUsage -> OutExpr -> ArgOcc -> ScUsage
1169 -- _Overwrite_ the occurrence info for the scrutinee, if the scrutinee
1170 -- is a variable, and an interesting variable
1171 setScrutOcc env usg (Cast e _) occ = setScrutOcc env usg e occ
1172 setScrutOcc env usg (Tick _ e) occ = setScrutOcc env usg e occ
1173 setScrutOcc env usg (Var v) occ
1174 | Just RecArg <- lookupHowBound env v = usg { scu_occs = extendVarEnv (scu_occs usg) v occ }
1175 | otherwise = usg
1176 setScrutOcc _env usg _other _occ -- Catch-all
1177 = usg
1178
1179 {-
1180 ************************************************************************
1181 * *
1182 \subsection{The main recursive function}
1183 * *
1184 ************************************************************************
1185
1186 The main recursive function gathers up usage information, and
1187 creates specialised versions of functions.
1188 -}
1189
1190 scExpr, scExpr' :: ScEnv -> CoreExpr -> UniqSM (ScUsage, CoreExpr)
1191 -- The unique supply is needed when we invent
1192 -- a new name for the specialised function and its args
1193
1194 scExpr env e = scExpr' env e
1195
1196 scExpr' env (Var v) = case scSubstId env v of
1197 Var v' -> return (mkVarUsage env v' [], Var v')
1198 e' -> scExpr (zapScSubst env) e'
1199
1200 scExpr' env (Type t) = return (nullUsage, Type (scSubstTy env t))
1201 scExpr' env (Coercion c) = return (nullUsage, Coercion (scSubstCo env c))
1202 scExpr' _ e@(Lit {}) = return (nullUsage, e)
1203 scExpr' env (Tick t e) = do (usg, e') <- scExpr env e
1204 return (usg, Tick t e')
1205 scExpr' env (Cast e co) = do (usg, e') <- scExpr env e
1206 return (usg, mkCast e' (scSubstCo env co))
1207 -- Important to use mkCast here
1208 -- See Note [SpecConstr call patterns]
1209 scExpr' env e@(App _ _) = scApp env (collectArgs e)
1210 scExpr' env (Lam b e) = do let (env', b') = extendBndr env b
1211 (usg, e') <- scExpr env' e
1212 return (usg, Lam b' e')
1213
1214 scExpr' env (Case scrut b ty alts)
1215 = do { (scrut_usg, scrut') <- scExpr env scrut
1216 ; case isValue (sc_vals env) scrut' of
1217 Just (ConVal con args) -> sc_con_app con args scrut'
1218 _other -> sc_vanilla scrut_usg scrut'
1219 }
1220 where
1221 sc_con_app con args scrut' -- Known constructor; simplify
1222 = do { let (_, bs, rhs) = findAlt con alts
1223 `orElse` (DEFAULT, [], mkImpossibleExpr ty)
1224 alt_env' = extendScSubstList env ((b,scrut') : bs `zip` trimConArgs con args)
1225 ; scExpr alt_env' rhs }
1226
1227 sc_vanilla scrut_usg scrut' -- Normal case
1228 = do { let (alt_env,b') = extendBndrWith RecArg env b
1229 -- Record RecArg for the components
1230
1231 ; (alt_usgs, alt_occs, alts')
1232 <- mapAndUnzip3M (sc_alt alt_env scrut' b') alts
1233
1234 ; let scrut_occ = foldr combineOcc NoOcc alt_occs
1235 scrut_usg' = setScrutOcc env scrut_usg scrut' scrut_occ
1236 -- The combined usage of the scrutinee is given
1237 -- by scrut_occ, which is passed to scScrut, which
1238 -- in turn treats a bare-variable scrutinee specially
1239
1240 ; return (foldr combineUsage scrut_usg' alt_usgs,
1241 Case scrut' b' (scSubstTy env ty) alts') }
1242
1243 sc_alt env scrut' b' (con,bs,rhs)
1244 = do { let (env1, bs1) = extendBndrsWith RecArg env bs
1245 (env2, bs2) = extendCaseBndrs env1 scrut' b' con bs1
1246 ; (usg, rhs') <- scExpr env2 rhs
1247 ; let (usg', b_occ:arg_occs) = lookupOccs usg (b':bs2)
1248 scrut_occ = case con of
1249 DataAlt dc -> ScrutOcc (unitUFM dc arg_occs)
1250 _ -> ScrutOcc emptyUFM
1251 ; return (usg', b_occ `combineOcc` scrut_occ, (con, bs2, rhs')) }
1252
1253 scExpr' env (Let (NonRec bndr rhs) body)
1254 | isTyVar bndr -- Type-lets may be created by doBeta
1255 = scExpr' (extendScSubst env bndr rhs) body
1256
1257 | otherwise
1258 = do { let (body_env, bndr') = extendBndr env bndr
1259 ; rhs_info <- scRecRhs env (bndr',rhs)
1260
1261 ; let body_env2 = extendHowBound body_env [bndr'] RecFun
1262 -- Note [Local let bindings]
1263 rhs' = ri_new_rhs rhs_info
1264 body_env3 = extendValEnv body_env2 bndr' (isValue (sc_vals env) rhs')
1265
1266 ; (body_usg, body') <- scExpr body_env3 body
1267
1268 -- NB: For non-recursive bindings we inherit sc_force flag from
1269 -- the parent function (see Note [Forcing specialisation])
1270 ; (spec_usg, specs) <- specNonRec env body_usg rhs_info
1271
1272 ; return (body_usg { scu_calls = scu_calls body_usg `delVarEnv` bndr' }
1273 `combineUsage` spec_usg, -- Note [spec_usg includes rhs_usg]
1274 mkLets [NonRec b r | (b,r) <- ruleInfoBinds rhs_info specs] body')
1275 }
1276
1277
1278 -- A *local* recursive group: see Note [Local recursive groups]
1279 scExpr' env (Let (Rec prs) body)
1280 = do { let (bndrs,rhss) = unzip prs
1281 (rhs_env1,bndrs') = extendRecBndrs env bndrs
1282 rhs_env2 = extendHowBound rhs_env1 bndrs' RecFun
1283 force_spec = any (forceSpecBndr env) bndrs'
1284 -- Note [Forcing specialisation]
1285
1286 ; rhs_infos <- mapM (scRecRhs rhs_env2) (bndrs' `zip` rhss)
1287 ; (body_usg, body') <- scExpr rhs_env2 body
1288
1289 -- NB: start specLoop from body_usg
1290 ; (spec_usg, specs) <- specRec NotTopLevel (scForce rhs_env2 force_spec)
1291 body_usg rhs_infos
1292 -- Do not unconditionally generate specialisations from rhs_usgs
1293 -- Instead use them only if we find an unspecialised call
1294 -- See Note [Local recursive groups]
1295
1296 ; let all_usg = spec_usg `combineUsage` body_usg -- Note [spec_usg includes rhs_usg]
1297 bind' = Rec (concat (zipWith ruleInfoBinds rhs_infos specs))
1298
1299 ; return (all_usg { scu_calls = scu_calls all_usg `delVarEnvList` bndrs' },
1300 Let bind' body') }
1301
1302 {-
1303 Note [Local let bindings]
1304 ~~~~~~~~~~~~~~~~~~~~~~~~~
1305 It is not uncommon to find this
1306
1307 let $j = \x. <blah> in ...$j True...$j True...
1308
1309 Here $j is an arbitrary let-bound function, but it often comes up for
1310 join points. We might like to specialise $j for its call patterns.
1311 Notice the difference from a letrec, where we look for call patterns
1312 in the *RHS* of the function. Here we look for call patterns in the
1313 *body* of the let.
1314
1315 At one point I predicated this on the RHS mentioning the outer
1316 recursive function, but that's not essential and might even be
1317 harmful. I'm not sure.
1318 -}
1319
1320 scApp :: ScEnv -> (InExpr, [InExpr]) -> UniqSM (ScUsage, CoreExpr)
1321
1322 scApp env (Var fn, args) -- Function is a variable
1323 = ASSERT( not (null args) )
1324 do { args_w_usgs <- mapM (scExpr env) args
1325 ; let (arg_usgs, args') = unzip args_w_usgs
1326 arg_usg = combineUsages arg_usgs
1327 ; case scSubstId env fn of
1328 fn'@(Lam {}) -> scExpr (zapScSubst env) (doBeta fn' args')
1329 -- Do beta-reduction and try again
1330
1331 Var fn' -> return (arg_usg `combineUsage` mkVarUsage env fn' args',
1332 mkApps (Var fn') args')
1333
1334 other_fn' -> return (arg_usg, mkApps other_fn' args') }
1335 -- NB: doing this ignores any usage info from the substituted
1336 -- function, but I don't think that matters. If it does
1337 -- we can fix it.
1338 where
1339 doBeta :: OutExpr -> [OutExpr] -> OutExpr
1340 -- ToDo: adjust for System IF
1341 doBeta (Lam bndr body) (arg : args) = Let (NonRec bndr arg) (doBeta body args)
1342 doBeta fn args = mkApps fn args
1343
1344 -- The function is almost always a variable, but not always.
1345 -- In particular, if this pass follows float-in,
1346 -- which it may, we can get
1347 -- (let f = ...f... in f) arg1 arg2
1348 scApp env (other_fn, args)
1349 = do { (fn_usg, fn') <- scExpr env other_fn
1350 ; (arg_usgs, args') <- mapAndUnzipM (scExpr env) args
1351 ; return (combineUsages arg_usgs `combineUsage` fn_usg, mkApps fn' args') }
1352
1353 ----------------------
1354 mkVarUsage :: ScEnv -> Id -> [CoreExpr] -> ScUsage
1355 mkVarUsage env fn args
1356 = case lookupHowBound env fn of
1357 Just RecFun -> SCU { scu_calls = unitVarEnv fn [Call fn args (sc_vals env)]
1358 , scu_occs = emptyVarEnv }
1359 Just RecArg -> SCU { scu_calls = emptyVarEnv
1360 , scu_occs = unitVarEnv fn arg_occ }
1361 Nothing -> nullUsage
1362 where
1363 -- I rather think we could use UnkOcc all the time
1364 arg_occ | null args = UnkOcc
1365 | otherwise = evalScrutOcc
1366
1367 ----------------------
1368 scTopBindEnv :: ScEnv -> CoreBind -> UniqSM (ScEnv, CoreBind)
1369 scTopBindEnv env (Rec prs)
1370 = do { let (rhs_env1,bndrs') = extendRecBndrs env bndrs
1371 rhs_env2 = extendHowBound rhs_env1 bndrs RecFun
1372
1373 prs' = zip bndrs' rhss
1374 ; return (rhs_env2, Rec prs') }
1375 where
1376 (bndrs,rhss) = unzip prs
1377
1378 scTopBindEnv env (NonRec bndr rhs)
1379 = do { let (env1, bndr') = extendBndr env bndr
1380 env2 = extendValEnv env1 bndr' (isValue (sc_vals env) rhs)
1381 ; return (env2, NonRec bndr' rhs) }
1382
1383 ----------------------
1384 scTopBind :: ScEnv -> ScUsage -> CoreBind -> UniqSM (ScUsage, CoreBind)
1385
1386 {-
1387 scTopBind _ usage _
1388 | pprTrace "scTopBind_usage" (ppr (scu_calls usage)) False
1389 = error "false"
1390 -}
1391
1392 scTopBind env body_usage (Rec prs)
1393 | Just threshold <- sc_size env
1394 , not force_spec
1395 , not (all (couldBeSmallEnoughToInline (sc_dflags env) threshold) rhss)
1396 -- No specialisation
1397 = -- pprTrace "scTopBind: nospec" (ppr bndrs) $
1398 do { (rhs_usgs, rhss') <- mapAndUnzipM (scExpr env) rhss
1399 ; return (body_usage `combineUsage` combineUsages rhs_usgs, Rec (bndrs `zip` rhss')) }
1400
1401 | otherwise -- Do specialisation
1402 = do { rhs_infos <- mapM (scRecRhs env) prs
1403
1404 ; (spec_usage, specs) <- specRec TopLevel (scForce env force_spec)
1405 body_usage rhs_infos
1406
1407 ; return (body_usage `combineUsage` spec_usage,
1408 Rec (concat (zipWith ruleInfoBinds rhs_infos specs))) }
1409 where
1410 (bndrs,rhss) = unzip prs
1411 force_spec = any (forceSpecBndr env) bndrs
1412 -- Note [Forcing specialisation]
1413
1414 scTopBind env usage (NonRec bndr rhs) -- Oddly, we don't seem to specialise top-level non-rec functions
1415 = do { (rhs_usg', rhs') <- scExpr env rhs
1416 ; return (usage `combineUsage` rhs_usg', NonRec bndr rhs') }
1417
1418 ----------------------
1419 scRecRhs :: ScEnv -> (OutId, InExpr) -> UniqSM RhsInfo
1420 scRecRhs env (bndr,rhs)
1421 = do { let (arg_bndrs,body) = collectBinders rhs
1422 (body_env, arg_bndrs') = extendBndrsWith RecArg env arg_bndrs
1423 ; (body_usg, body') <- scExpr body_env body
1424 ; let (rhs_usg, arg_occs) = lookupOccs body_usg arg_bndrs'
1425 ; return (RI { ri_rhs_usg = rhs_usg
1426 , ri_fn = bndr, ri_new_rhs = mkLams arg_bndrs' body'
1427 , ri_lam_bndrs = arg_bndrs, ri_lam_body = body
1428 , ri_arg_occs = arg_occs }) }
1429 -- The arg_occs says how the visible,
1430 -- lambda-bound binders of the RHS are used
1431 -- (including the TyVar binders)
1432 -- Two pats are the same if they match both ways
1433
1434 ----------------------
1435 ruleInfoBinds :: RhsInfo -> SpecInfo -> [(Id,CoreExpr)]
1436 ruleInfoBinds (RI { ri_fn = fn, ri_new_rhs = new_rhs })
1437 (SI { si_specs = specs })
1438 = [(id,rhs) | OS { os_id = id, os_rhs = rhs } <- specs] ++
1439 -- First the specialised bindings
1440
1441 [(fn `addIdSpecialisations` rules, new_rhs)]
1442 -- And now the original binding
1443 where
1444 rules = [r | OS { os_rule = r } <- specs]
1445
1446 {-
1447 ************************************************************************
1448 * *
1449 The specialiser itself
1450 * *
1451 ************************************************************************
1452 -}
1453
1454 data RhsInfo
1455 = RI { ri_fn :: OutId -- The binder
1456 , ri_new_rhs :: OutExpr -- The specialised RHS (in current envt)
1457 , ri_rhs_usg :: ScUsage -- Usage info from specialising RHS
1458
1459 , ri_lam_bndrs :: [InVar] -- The *original* RHS (\xs.body)
1460 , ri_lam_body :: InExpr -- Note [Specialise original body]
1461 , ri_arg_occs :: [ArgOcc] -- Info on how the xs occur in body
1462 }
1463
1464 data SpecInfo -- Info about specialisations for a particular Id
1465 = SI { si_specs :: [OneSpec] -- The specialisations we have generated
1466
1467 , si_n_specs :: Int -- Length of si_specs; used for numbering them
1468
1469 , si_mb_unspec :: Maybe ScUsage -- Just cs => we have not yet used calls in the
1470 } -- from calls in the *original* RHS as
1471 -- seeds for new specialisations;
1472 -- if you decide to do so, here is the
1473 -- RHS usage (which has not yet been
1474 -- unleashed)
1475 -- Nothing => we have
1476 -- See Note [Local recursive groups]
1477 -- See Note [spec_usg includes rhs_usg]
1478
1479 -- One specialisation: Rule plus definition
1480 data OneSpec =
1481 OS { os_pat :: CallPat -- Call pattern that generated this specialisation
1482 , os_rule :: CoreRule -- Rule connecting original id with the specialisation
1483 , os_id :: OutId -- Spec id
1484 , os_rhs :: OutExpr } -- Spec rhs
1485
1486 noSpecInfo :: SpecInfo
1487 noSpecInfo = SI { si_specs = [], si_n_specs = 0, si_mb_unspec = Nothing }
1488
1489 ----------------------
1490 specNonRec :: ScEnv
1491 -> ScUsage -- Body usage
1492 -> RhsInfo -- Structure info usage info for un-specialised RHS
1493 -> UniqSM (ScUsage, SpecInfo) -- Usage from RHSs (specialised and not)
1494 -- plus details of specialisations
1495
1496 specNonRec env body_usg rhs_info
1497 = specialise env (scu_calls body_usg) rhs_info
1498 (noSpecInfo { si_mb_unspec = Just (ri_rhs_usg rhs_info) })
1499
1500 ----------------------
1501 specRec :: TopLevelFlag -> ScEnv
1502 -> ScUsage -- Body usage
1503 -> [RhsInfo] -- Structure info and usage info for un-specialised RHSs
1504 -> UniqSM (ScUsage, [SpecInfo]) -- Usage from all RHSs (specialised and not)
1505 -- plus details of specialisations
1506
1507 specRec top_lvl env body_usg rhs_infos
1508 = go 1 seed_calls nullUsage init_spec_infos
1509 where
1510 (seed_calls, init_spec_infos) -- Note [Seeding top-level recursive groups]
1511 | isTopLevel top_lvl
1512 , any (isExportedId . ri_fn) rhs_infos -- Seed from body and RHSs
1513 = (all_calls, [noSpecInfo | _ <- rhs_infos])
1514 | otherwise -- Seed from body only
1515 = (calls_in_body, [noSpecInfo { si_mb_unspec = Just (ri_rhs_usg ri) }
1516 | ri <- rhs_infos])
1517
1518 calls_in_body = scu_calls body_usg
1519 calls_in_rhss = foldr (combineCalls . scu_calls . ri_rhs_usg) emptyVarEnv rhs_infos
1520 all_calls = calls_in_rhss `combineCalls` calls_in_body
1521
1522 -- Loop, specialising, until you get no new specialisations
1523 go :: Int -- Which iteration of the "until no new specialisations"
1524 -- loop we are on; first iteration is 1
1525 -> CallEnv -- Seed calls
1526 -- Two accumulating parameters:
1527 -> ScUsage -- Usage from earlier specialisations
1528 -> [SpecInfo] -- Details of specialisations so far
1529 -> UniqSM (ScUsage, [SpecInfo])
1530 go n_iter seed_calls usg_so_far spec_infos
1531 | isEmptyVarEnv seed_calls
1532 = -- pprTrace "specRec1" (vcat [ ppr (map ri_fn rhs_infos)
1533 -- , ppr seed_calls
1534 -- , ppr body_usg ]) $
1535 return (usg_so_far, spec_infos)
1536
1537 -- Limit recursive specialisation
1538 -- See Note [Limit recursive specialisation]
1539 | n_iter > sc_recursive env -- Too many iterations of the 'go' loop
1540 , sc_force env || isNothing (sc_count env)
1541 -- If both of these are false, the sc_count
1542 -- threshold will prevent non-termination
1543 , any ((> the_limit) . si_n_specs) spec_infos
1544 = -- pprTrace "specRec2" (ppr (map (map os_pat . si_specs) spec_infos)) $
1545 return (usg_so_far, spec_infos)
1546
1547 | otherwise
1548 = do { specs_w_usg <- zipWithM (specialise env seed_calls) rhs_infos spec_infos
1549 ; let (extra_usg_s, new_spec_infos) = unzip specs_w_usg
1550 extra_usg = combineUsages extra_usg_s
1551 all_usg = usg_so_far `combineUsage` extra_usg
1552 ; go (n_iter + 1) (scu_calls extra_usg) all_usg new_spec_infos }
1553
1554 -- See Note [Limit recursive specialisation]
1555 the_limit = case sc_count env of
1556 Nothing -> 10 -- Ugh!
1557 Just max -> max
1558
1559
1560 ----------------------
1561 specialise
1562 :: ScEnv
1563 -> CallEnv -- Info on newly-discovered calls to this function
1564 -> RhsInfo
1565 -> SpecInfo -- Original RHS plus patterns dealt with
1566 -> UniqSM (ScUsage, SpecInfo) -- New specialised versions and their usage
1567
1568 -- See Note [spec_usg includes rhs_usg]
1569
1570 -- Note: this only generates *specialised* bindings
1571 -- The original binding is added by ruleInfoBinds
1572 --
1573 -- Note: the rhs here is the optimised version of the original rhs
1574 -- So when we make a specialised copy of the RHS, we're starting
1575 -- from an RHS whose nested functions have been optimised already.
1576
1577 specialise env bind_calls (RI { ri_fn = fn, ri_lam_bndrs = arg_bndrs
1578 , ri_lam_body = body, ri_arg_occs = arg_occs })
1579 spec_info@(SI { si_specs = specs, si_n_specs = spec_count
1580 , si_mb_unspec = mb_unspec })
1581 | isBottomingId fn -- Note [Do not specialise diverging functions]
1582 -- and do not generate specialisation seeds from its RHS
1583 = -- pprTrace "specialise bot" (ppr fn) $
1584 return (nullUsage, spec_info)
1585
1586 | isNeverActive (idInlineActivation fn) -- See Note [Transfer activation]
1587 || null arg_bndrs -- Only specialise functions
1588 = -- pprTrace "specialise inactive" (ppr fn) $
1589 case mb_unspec of -- Behave as if there was a single, boring call
1590 Just rhs_usg -> return (rhs_usg, spec_info { si_mb_unspec = Nothing })
1591 -- See Note [spec_usg includes rhs_usg]
1592 Nothing -> return (nullUsage, spec_info)
1593
1594 | Just all_calls <- lookupVarEnv bind_calls fn
1595 = -- pprTrace "specialise entry {" (ppr fn <+> ppr all_calls) $
1596 do { (boring_call, new_pats) <- callsToNewPats env fn spec_info arg_occs all_calls
1597
1598 ; let n_pats = length new_pats
1599 -- ; if (not (null new_pats) || isJust mb_unspec) then
1600 -- pprTrace "specialise" (vcat [ ppr fn <+> text "with" <+> int n_pats <+> text "good patterns"
1601 -- , text "mb_unspec" <+> ppr (isJust mb_unspec)
1602 -- , text "arg_occs" <+> ppr arg_occs
1603 -- , text "good pats" <+> ppr new_pats]) $
1604 -- return ()
1605 -- else return ()
1606
1607 ; let spec_env = decreaseSpecCount env n_pats
1608 ; (spec_usgs, new_specs) <- mapAndUnzipM (spec_one spec_env fn arg_bndrs body)
1609 (new_pats `zip` [spec_count..])
1610 -- See Note [Specialise original body]
1611
1612 ; let spec_usg = combineUsages spec_usgs
1613
1614 -- If there were any boring calls among the seeds (= all_calls), then those
1615 -- calls will call the un-specialised function. So we should use the seeds
1616 -- from the _unspecialised_ function's RHS, which are in mb_unspec, by returning
1617 -- then in new_usg.
1618 (new_usg, mb_unspec')
1619 = case mb_unspec of
1620 Just rhs_usg | boring_call -> (spec_usg `combineUsage` rhs_usg, Nothing)
1621 _ -> (spec_usg, mb_unspec)
1622
1623 -- ; pprTrace "specialise return }"
1624 -- (vcat [ ppr fn
1625 -- , text "boring_call:" <+> ppr boring_call
1626 -- , text "new calls:" <+> ppr (scu_calls new_usg)]) $
1627 -- return ()
1628
1629 ; return (new_usg, SI { si_specs = new_specs ++ specs
1630 , si_n_specs = spec_count + n_pats
1631 , si_mb_unspec = mb_unspec' }) }
1632
1633 | otherwise -- No new seeds, so return nullUsage
1634 = return (nullUsage, spec_info)
1635
1636
1637
1638
1639 ---------------------
1640 spec_one :: ScEnv
1641 -> OutId -- Function
1642 -> [InVar] -- Lambda-binders of RHS; should match patterns
1643 -> InExpr -- Body of the original function
1644 -> (CallPat, Int)
1645 -> UniqSM (ScUsage, OneSpec) -- Rule and binding
1646
1647 -- spec_one creates a specialised copy of the function, together
1648 -- with a rule for using it. I'm very proud of how short this
1649 -- function is, considering what it does :-).
1650
1651 {-
1652 Example
1653
1654 In-scope: a, x::a
1655 f = /\b \y::[(a,b)] -> ....f (b,c) ((:) (a,(b,c)) (x,v) (h w))...
1656 [c::*, v::(b,c) are presumably bound by the (...) part]
1657 ==>
1658 f_spec = /\ b c \ v::(b,c) hw::[(a,(b,c))] ->
1659 (...entire body of f...) [b -> (b,c),
1660 y -> ((:) (a,(b,c)) (x,v) hw)]
1661
1662 RULE: forall b::* c::*, -- Note, *not* forall a, x
1663 v::(b,c),
1664 hw::[(a,(b,c))] .
1665
1666 f (b,c) ((:) (a,(b,c)) (x,v) hw) = f_spec b c v hw
1667 -}
1668
1669 spec_one env fn arg_bndrs body (call_pat@(qvars, pats), rule_number)
1670 = do { spec_uniq <- getUniqueM
1671 ; let spec_env = extendScSubstList (extendScInScope env qvars)
1672 (arg_bndrs `zip` pats)
1673 fn_name = idName fn
1674 fn_loc = nameSrcSpan fn_name
1675 fn_occ = nameOccName fn_name
1676 spec_occ = mkSpecOcc fn_occ
1677 -- We use fn_occ rather than fn in the rule_name string
1678 -- as we don't want the uniq to end up in the rule, and
1679 -- hence in the ABI, as that can cause spurious ABI
1680 -- changes (#4012).
1681 rule_name = mkFastString ("SC:" ++ occNameString fn_occ ++ show rule_number)
1682 spec_name = mkInternalName spec_uniq spec_occ fn_loc
1683 -- ; pprTrace "{spec_one" (ppr (sc_count env) <+> ppr fn
1684 -- <+> ppr pats <+> text "-->" <+> ppr spec_name) $
1685 -- return ()
1686
1687 -- Specialise the body
1688 ; (spec_usg, spec_body) <- scExpr spec_env body
1689
1690 -- ; pprTrace "done spec_one}" (ppr fn) $
1691 -- return ()
1692
1693 -- And build the results
1694 ; let (spec_lam_args, spec_call_args) = mkWorkerArgs (sc_dflags env)
1695 qvars body_ty
1696 -- Usual w/w hack to avoid generating
1697 -- a spec_rhs of unlifted type and no args
1698
1699 spec_lam_args_str = handOutStrictnessInformation (fst (splitStrictSig spec_str)) spec_lam_args
1700 -- Annotate the variables with the strictness information from
1701 -- the function (see Note [Strictness information in worker binders])
1702
1703 spec_join_arity | isJoinId fn = Just (length spec_lam_args)
1704 | otherwise = Nothing
1705 spec_id = mkLocalIdOrCoVar spec_name
1706 (mkLamTypes spec_lam_args body_ty)
1707 -- See Note [Transfer strictness]
1708 `setIdStrictness` spec_str
1709 `setIdArity` count isId spec_lam_args
1710 `asJoinId_maybe` spec_join_arity
1711 spec_str = calcSpecStrictness fn spec_lam_args pats
1712
1713
1714 -- Conditionally use result of new worker-wrapper transform
1715 spec_rhs = mkLams spec_lam_args_str spec_body
1716 body_ty = exprType spec_body
1717 rule_rhs = mkVarApps (Var spec_id) spec_call_args
1718 inline_act = idInlineActivation fn
1719 this_mod = sc_module spec_env
1720 rule = mkRule this_mod True {- Auto -} True {- Local -}
1721 rule_name inline_act fn_name qvars pats rule_rhs
1722 -- See Note [Transfer activation]
1723 ; return (spec_usg, OS { os_pat = call_pat, os_rule = rule
1724 , os_id = spec_id
1725 , os_rhs = spec_rhs }) }
1726
1727
1728 -- See Note [Strictness information in worker binders]
1729 handOutStrictnessInformation :: [Demand] -> [Var] -> [Var]
1730 handOutStrictnessInformation = go
1731 where
1732 go _ [] = []
1733 go [] vs = vs
1734 go (d:dmds) (v:vs) | isId v = setIdDemandInfo v d : go dmds vs
1735 go dmds (v:vs) = v : go dmds vs
1736
1737 calcSpecStrictness :: Id -- The original function
1738 -> [Var] -> [CoreExpr] -- Call pattern
1739 -> StrictSig -- Strictness of specialised thing
1740 -- See Note [Transfer strictness]
1741 calcSpecStrictness fn qvars pats
1742 = mkClosedStrictSig spec_dmds topRes
1743 where
1744 spec_dmds = [ lookupVarEnv dmd_env qv `orElse` topDmd | qv <- qvars, isId qv ]
1745 StrictSig (DmdType _ dmds _) = idStrictness fn
1746
1747 dmd_env = go emptyVarEnv dmds pats
1748
1749 go :: DmdEnv -> [Demand] -> [CoreExpr] -> DmdEnv
1750 go env ds (Type {} : pats) = go env ds pats
1751 go env ds (Coercion {} : pats) = go env ds pats
1752 go env (d:ds) (pat : pats) = go (go_one env d pat) ds pats
1753 go env _ _ = env
1754
1755 go_one :: DmdEnv -> Demand -> CoreExpr -> DmdEnv
1756 go_one env d (Var v) = extendVarEnv_C bothDmd env v d
1757 go_one env d e
1758 | Just ds <- splitProdDmd_maybe d -- NB: d does not have to be strict
1759 , (Var _, args) <- collectArgs e = go env ds args
1760 go_one env _ _ = env
1761
1762 {-
1763 Note [spec_usg includes rhs_usg]
1764 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1765 In calls to 'specialise', the returned ScUsage must include the rhs_usg in
1766 the passed-in SpecInfo, unless there are no calls at all to the function.
1767
1768 The caller can, indeed must, assume this. He should not combine in rhs_usg
1769 himself, or he'll get rhs_usg twice -- and that can lead to an exponential
1770 blowup of duplicates in the CallEnv. This is what gave rise to the massive
1771 performance loss in Trac #8852.
1772
1773 Note [Specialise original body]
1774 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1775 The RhsInfo for a binding keeps the *original* body of the binding. We
1776 must specialise that, *not* the result of applying specExpr to the RHS
1777 (which is also kept in RhsInfo). Otherwise we end up specialising a
1778 specialised RHS, and that can lead directly to exponential behaviour.
1779
1780 Note [Transfer activation]
1781 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1782 This note is for SpecConstr, but exactly the same thing
1783 happens in the overloading specialiser; see
1784 Note [Auto-specialisation and RULES] in Specialise.
1785
1786 In which phase should the specialise-constructor rules be active?
1787 Originally I made them always-active, but Manuel found that this
1788 defeated some clever user-written rules. Then I made them active only
1789 in Phase 0; after all, currently, the specConstr transformation is
1790 only run after the simplifier has reached Phase 0, but that meant
1791 that specialisations didn't fire inside wrappers; see test
1792 simplCore/should_compile/spec-inline.
1793
1794 So now I just use the inline-activation of the parent Id, as the
1795 activation for the specialiation RULE, just like the main specialiser;
1796
1797 This in turn means there is no point in specialising NOINLINE things,
1798 so we test for that.
1799
1800 Note [Transfer strictness]
1801 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1802 We must transfer strictness information from the original function to
1803 the specialised one. Suppose, for example
1804
1805 f has strictness SS
1806 and a RULE f (a:as) b = f_spec a as b
1807
1808 Now we want f_spec to have strictness LLS, otherwise we'll use call-by-need
1809 when calling f_spec instead of call-by-value. And that can result in
1810 unbounded worsening in space (cf the classic foldl vs foldl')
1811
1812 See Trac #3437 for a good example.
1813
1814 The function calcSpecStrictness performs the calculation.
1815
1816 Note [Strictness information in worker binders]
1817 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1818
1819 After having calculated the strictness annotation for the worker (see Note
1820 [Transfer strictness] above), we also want to have this information attached to
1821 the worker’s arguments, for the benefit of later passes. The function
1822 handOutStrictnessInformation decomposes the strictness annotation calculated by
1823 calcSpecStrictness and attaches them to the variables.
1824
1825 ************************************************************************
1826 * *
1827 \subsection{Argument analysis}
1828 * *
1829 ************************************************************************
1830
1831 This code deals with analysing call-site arguments to see whether
1832 they are constructor applications.
1833
1834 Note [Free type variables of the qvar types]
1835 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1836 In a call (f @a x True), that we want to specialise, what variables should
1837 we quantify over. Clearly over 'a' and 'x', but what about any type variables
1838 free in x's type? In fact we don't need to worry about them because (f @a)
1839 can only be a well-typed application if its type is compatible with x, so any
1840 variables free in x's type must be free in (f @a), and hence either be gathered
1841 via 'a' itself, or be in scope at f's defn. Hence we just take
1842 (exprsFreeVars pats).
1843
1844 BUT phantom type synonyms can mess this reasoning up,
1845 eg x::T b with type T b = Int
1846 So we apply expandTypeSynonyms to the bound Ids.
1847 See Trac # 5458. Yuk.
1848
1849 Note [SpecConstr call patterns]
1850 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1851 A "call patterns" that we collect is going to become the LHS of a RULE.
1852 It's important that it doesn't have
1853 e |> Refl
1854 or
1855 e |> g1 |> g2
1856 because both of these will be optimised by Simplify.simplRule. In the
1857 former case such optimisation benign, because the rule will match more
1858 terms; but in the latter we may lose a binding of 'g1' or 'g2', and
1859 end up with a rule LHS that doesn't bind the template variables
1860 (Trac #10602).
1861
1862 The simplifier eliminates such things, but SpecConstr itself constructs
1863 new terms by substituting. So the 'mkCast' in the Cast case of scExpr
1864 is very important!
1865
1866 Note [Choosing patterns]
1867 ~~~~~~~~~~~~~~~~~~~~~~~~
1868 If we get lots of patterns we may not want to make a specialisation
1869 for each of them (code bloat), so we choose as follows, implemented
1870 by trim_pats.
1871
1872 * The flag -fspec-constr-count-N sets the sc_count field
1873 of the ScEnv to (Just n). This limits the total number
1874 of specialisations for a given function to N.
1875
1876 * -fno-spec-constr-count sets the sc_count field to Nothing,
1877 which switches of the limit.
1878
1879 * The ghastly ForceSpecConstr trick also switches of the limit
1880 for a particular function
1881
1882 * Otherwise we sort the patterns to choose the most general
1883 ones first; more general => more widely applicable.
1884 -}
1885
1886 type CallPat = ([Var], [CoreExpr]) -- Quantified variables and arguments
1887 -- See Note [SpecConstr call patterns]
1888
1889 callsToNewPats :: ScEnv -> Id
1890 -> SpecInfo
1891 -> [ArgOcc] -> [Call]
1892 -> UniqSM (Bool, [CallPat])
1893 -- Result has no duplicate patterns,
1894 -- nor ones mentioned in done_pats
1895 -- Bool indicates that there was at least one boring pattern
1896 callsToNewPats env fn spec_info@(SI { si_specs = done_specs }) bndr_occs calls
1897 = do { mb_pats <- mapM (callToPats env bndr_occs) calls
1898
1899 ; let have_boring_call = any isNothing mb_pats
1900
1901 good_pats :: [CallPat]
1902 good_pats = catMaybes mb_pats
1903
1904 -- Remove patterns we have already done
1905 new_pats = filterOut is_done good_pats
1906 is_done p = any (samePat p . os_pat) done_specs
1907
1908 -- Remove duplicates
1909 non_dups = nubBy samePat new_pats
1910
1911 -- Remove ones that have too many worker variables
1912 small_pats = filterOut too_big non_dups
1913 too_big (vars,_) = not (isWorkerSmallEnough (sc_dflags env) vars)
1914 -- We are about to construct w/w pair in 'spec_one'.
1915 -- Omit specialisation leading to high arity workers.
1916 -- See Note [Limit w/w arity] in WwLib
1917
1918 -- Discard specialisations if there are too many of them
1919 trimmed_pats = trim_pats env fn spec_info small_pats
1920
1921 -- ; pprTrace "callsToPats" (vcat [ text "calls:" <+> ppr calls
1922 -- , text "good_pats:" <+> ppr good_pats ]) $
1923 -- return ()
1924
1925 ; return (have_boring_call, trimmed_pats) }
1926
1927
1928 trim_pats :: ScEnv -> Id -> SpecInfo -> [CallPat] -> [CallPat]
1929 -- See Note [Choosing patterns]
1930 trim_pats env fn (SI { si_n_specs = done_spec_count }) pats
1931 | sc_force env
1932 || isNothing mb_scc
1933 || n_remaining >= n_pats
1934 = pats -- No need to trim
1935
1936 | otherwise
1937 = emit_trace $ -- Need to trim, so keep the best ones
1938 take n_remaining sorted_pats
1939
1940 where
1941 n_pats = length pats
1942 spec_count' = n_pats + done_spec_count
1943 n_remaining = max_specs - done_spec_count
1944 mb_scc = sc_count env
1945 Just max_specs = mb_scc
1946
1947 sorted_pats = map fst $
1948 sortBy (comparing snd) $
1949 [(pat, pat_cons pat) | pat <- pats]
1950 -- Sort in order of increasing number of constructors
1951 -- (i.e. decreasing generality) and pick the initial
1952 -- segment of this list
1953
1954 pat_cons :: CallPat -> Int
1955 -- How many data consturorst of literals are in
1956 -- the patten. More data-cons => less general
1957 pat_cons (qs, ps) = foldr ((+) . n_cons) 0 ps
1958 where
1959 q_set = mkVarSet qs
1960 n_cons (Var v) | v `elemVarSet` q_set = 0
1961 | otherwise = 1
1962 n_cons (Cast e _) = n_cons e
1963 n_cons (App e1 e2) = n_cons e1 + n_cons e2
1964 n_cons (Lit {}) = 1
1965 n_cons _ = 0
1966
1967 emit_trace result
1968 | debugIsOn || hasPprDebug (sc_dflags env)
1969 -- Suppress this scary message for ordinary users! Trac #5125
1970 = pprTrace "SpecConstr" msg result
1971 | otherwise
1972 = result
1973 msg = vcat [ sep [ text "Function" <+> quotes (ppr fn)
1974 , nest 2 (text "has" <+>
1975 speakNOf spec_count' (text "call pattern") <> comma <+>
1976 text "but the limit is" <+> int max_specs) ]
1977 , text "Use -fspec-constr-count=n to set the bound"
1978 , text "Discarding:" <+> ppr (drop n_remaining sorted_pats) ]
1979
1980
1981 callToPats :: ScEnv -> [ArgOcc] -> Call -> UniqSM (Maybe CallPat)
1982 -- The [Var] is the variables to quantify over in the rule
1983 -- Type variables come first, since they may scope
1984 -- over the following term variables
1985 -- The [CoreExpr] are the argument patterns for the rule
1986 callToPats env bndr_occs (Call _ args con_env)
1987 | length args < length bndr_occs -- Check saturated
1988 = return Nothing
1989 | otherwise
1990 = do { let in_scope = substInScope (sc_subst env)
1991 ; (interesting, pats) <- argsToPats env in_scope con_env args bndr_occs
1992 ; let pat_fvs = exprsFreeVarsList pats
1993 -- To get determinism we need the list of free variables in
1994 -- deterministic order. Otherwise we end up creating
1995 -- lambdas with different argument orders. See
1996 -- determinism/simplCore/should_compile/spec-inline-determ.hs
1997 -- for an example. For explanation of determinism
1998 -- considerations See Note [Unique Determinism] in Unique.
1999 in_scope_vars = getInScopeVars in_scope
2000 qvars = filterOut (`elemVarSet` in_scope_vars) pat_fvs
2001 -- Quantify over variables that are not in scope
2002 -- at the call site
2003 -- See Note [Free type variables of the qvar types]
2004 -- See Note [Shadowing] at the top
2005
2006 (ktvs, ids) = partition isTyVar qvars
2007 qvars' = toposortTyVars ktvs ++ map sanitise ids
2008 -- Order into kind variables, type variables, term variables
2009 -- The kind of a type variable may mention a kind variable
2010 -- and the type of a term variable may mention a type variable
2011
2012 sanitise id = id `setIdType` expandTypeSynonyms (idType id)
2013 -- See Note [Free type variables of the qvar types]
2014
2015 ; -- pprTrace "callToPats" (ppr args $$ ppr bndr_occs) $
2016 if interesting
2017 then return (Just (qvars', pats))
2018 else return Nothing }
2019
2020 -- argToPat takes an actual argument, and returns an abstracted
2021 -- version, consisting of just the "constructor skeleton" of the
2022 -- argument, with non-constructor sub-expression replaced by new
2023 -- placeholder variables. For example:
2024 -- C a (D (f x) (g y)) ==> C p1 (D p2 p3)
2025
2026 argToPat :: ScEnv
2027 -> InScopeSet -- What's in scope at the fn defn site
2028 -> ValueEnv -- ValueEnv at the call site
2029 -> CoreArg -- A call arg (or component thereof)
2030 -> ArgOcc
2031 -> UniqSM (Bool, CoreArg)
2032
2033 -- Returns (interesting, pat),
2034 -- where pat is the pattern derived from the argument
2035 -- interesting=True if the pattern is non-trivial (not a variable or type)
2036 -- E.g. x:xs --> (True, x:xs)
2037 -- f xs --> (False, w) where w is a fresh wildcard
2038 -- (f xs, 'c') --> (True, (w, 'c')) where w is a fresh wildcard
2039 -- \x. x+y --> (True, \x. x+y)
2040 -- lvl7 --> (True, lvl7) if lvl7 is bound
2041 -- somewhere further out
2042
2043 argToPat _env _in_scope _val_env arg@(Type {}) _arg_occ
2044 = return (False, arg)
2045
2046 argToPat env in_scope val_env (Tick _ arg) arg_occ
2047 = argToPat env in_scope val_env arg arg_occ
2048 -- Note [Notes in call patterns]
2049 -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2050 -- Ignore Notes. In particular, we want to ignore any InlineMe notes
2051 -- Perhaps we should not ignore profiling notes, but I'm going to
2052 -- ride roughshod over them all for now.
2053 --- See Note [Notes in RULE matching] in Rules
2054
2055 argToPat env in_scope val_env (Let _ arg) arg_occ
2056 = argToPat env in_scope val_env arg arg_occ
2057 -- See Note [Matching lets] in Rule.hs
2058 -- Look through let expressions
2059 -- e.g. f (let v = rhs in (v,w))
2060 -- Here we can specialise for f (v,w)
2061 -- because the rule-matcher will look through the let.
2062
2063 {- Disabled; see Note [Matching cases] in Rule.hs
2064 argToPat env in_scope val_env (Case scrut _ _ [(_, _, rhs)]) arg_occ
2065 | exprOkForSpeculation scrut -- See Note [Matching cases] in Rule.hhs
2066 = argToPat env in_scope val_env rhs arg_occ
2067 -}
2068
2069 argToPat env in_scope val_env (Cast arg co) arg_occ
2070 | not (ignoreType env ty2)
2071 = do { (interesting, arg') <- argToPat env in_scope val_env arg arg_occ
2072 ; if not interesting then
2073 wildCardPat ty2
2074 else do
2075 { -- Make a wild-card pattern for the coercion
2076 uniq <- getUniqueM
2077 ; let co_name = mkSysTvName uniq (fsLit "sg")
2078 co_var = mkCoVar co_name (mkCoercionType Representational ty1 ty2)
2079 ; return (interesting, Cast arg' (mkCoVarCo co_var)) } }
2080 where
2081 Pair ty1 ty2 = coercionKind co
2082
2083
2084
2085 {- Disabling lambda specialisation for now
2086 It's fragile, and the spec_loop can be infinite
2087 argToPat in_scope val_env arg arg_occ
2088 | is_value_lam arg
2089 = return (True, arg)
2090 where
2091 is_value_lam (Lam v e) -- Spot a value lambda, even if
2092 | isId v = True -- it is inside a type lambda
2093 | otherwise = is_value_lam e
2094 is_value_lam other = False
2095 -}
2096
2097 -- Check for a constructor application
2098 -- NB: this *precedes* the Var case, so that we catch nullary constrs
2099 argToPat env in_scope val_env arg arg_occ
2100 | Just (ConVal (DataAlt dc) args) <- isValue val_env arg
2101 , not (ignoreDataCon env dc) -- See Note [NoSpecConstr]
2102 , Just arg_occs <- mb_scrut dc
2103 = do { let (ty_args, rest_args) = splitAtList (dataConUnivTyVars dc) args
2104 ; (_, args') <- argsToPats env in_scope val_env rest_args arg_occs
2105 ; return (True,
2106 mkConApp dc (ty_args ++ args')) }
2107 where
2108 mb_scrut dc = case arg_occ of
2109 ScrutOcc bs | Just occs <- lookupUFM bs dc
2110 -> Just (occs) -- See Note [Reboxing]
2111 _other | sc_force env || sc_keen env
2112 -> Just (repeat UnkOcc)
2113 | otherwise
2114 -> Nothing
2115
2116 -- Check if the argument is a variable that
2117 -- (a) is used in an interesting way in the function body
2118 -- (b) we know what its value is
2119 -- In that case it counts as "interesting"
2120 argToPat env in_scope val_env (Var v) arg_occ
2121 | sc_force env || case arg_occ of { UnkOcc -> False; _other -> True }, -- (a)
2122 is_value, -- (b)
2123 -- Ignoring sc_keen here to avoid gratuitously incurring Note [Reboxing]
2124 -- So sc_keen focused just on f (I# x), where we have freshly-allocated
2125 -- box that we can eliminate in the caller
2126 not (ignoreType env (varType v))
2127 = return (True, Var v)
2128 where
2129 is_value
2130 | isLocalId v = v `elemInScopeSet` in_scope
2131 && isJust (lookupVarEnv val_env v)
2132 -- Local variables have values in val_env
2133 | otherwise = isValueUnfolding (idUnfolding v)
2134 -- Imports have unfoldings
2135
2136 -- I'm really not sure what this comment means
2137 -- And by not wild-carding we tend to get forall'd
2138 -- variables that are in scope, which in turn can
2139 -- expose the weakness in let-matching
2140 -- See Note [Matching lets] in Rules
2141
2142 -- Check for a variable bound inside the function.
2143 -- Don't make a wild-card, because we may usefully share
2144 -- e.g. f a = let x = ... in f (x,x)
2145 -- NB: this case follows the lambda and con-app cases!!
2146 -- argToPat _in_scope _val_env (Var v) _arg_occ
2147 -- = return (False, Var v)
2148 -- SLPJ : disabling this to avoid proliferation of versions
2149 -- also works badly when thinking about seeding the loop
2150 -- from the body of the let
2151 -- f x y = letrec g z = ... in g (x,y)
2152 -- We don't want to specialise for that *particular* x,y
2153
2154 -- The default case: make a wild-card
2155 -- We use this for coercions too
2156 argToPat _env _in_scope _val_env arg _arg_occ
2157 = wildCardPat (exprType arg)
2158
2159 wildCardPat :: Type -> UniqSM (Bool, CoreArg)
2160 wildCardPat ty
2161 = do { uniq <- getUniqueM
2162 ; let id = mkSysLocalOrCoVar (fsLit "sc") uniq ty
2163 ; return (False, varToCoreExpr id) }
2164
2165 argsToPats :: ScEnv -> InScopeSet -> ValueEnv
2166 -> [CoreArg] -> [ArgOcc] -- Should be same length
2167 -> UniqSM (Bool, [CoreArg])
2168 argsToPats env in_scope val_env args occs
2169 = do { stuff <- zipWithM (argToPat env in_scope val_env) args occs
2170 ; let (interesting_s, args') = unzip stuff
2171 ; return (or interesting_s, args') }
2172
2173 isValue :: ValueEnv -> CoreExpr -> Maybe Value
2174 isValue _env (Lit lit)
2175 | litIsLifted lit = Nothing
2176 | otherwise = Just (ConVal (LitAlt lit) [])
2177
2178 isValue env (Var v)
2179 | Just cval <- lookupVarEnv env v
2180 = Just cval -- You might think we could look in the idUnfolding here
2181 -- but that doesn't take account of which branch of a
2182 -- case we are in, which is the whole point
2183
2184 | not (isLocalId v) && isCheapUnfolding unf
2185 = isValue env (unfoldingTemplate unf)
2186 where
2187 unf = idUnfolding v
2188 -- However we do want to consult the unfolding
2189 -- as well, for let-bound constructors!
2190
2191 isValue env (Lam b e)
2192 | isTyVar b = case isValue env e of
2193 Just _ -> Just LambdaVal
2194 Nothing -> Nothing
2195 | otherwise = Just LambdaVal
2196
2197 isValue env (Tick t e)
2198 | not (tickishIsCode t)
2199 = isValue env e
2200
2201 isValue _env expr -- Maybe it's a constructor application
2202 | (Var fun, args, _) <- collectArgsTicks (not . tickishIsCode) expr
2203 = case isDataConWorkId_maybe fun of
2204
2205 Just con | args `lengthAtLeast` dataConRepArity con
2206 -- Check saturated; might be > because the
2207 -- arity excludes type args
2208 -> Just (ConVal (DataAlt con) args)
2209
2210 _other | valArgCount args < idArity fun
2211 -- Under-applied function
2212 -> Just LambdaVal -- Partial application
2213
2214 _other -> Nothing
2215
2216 isValue _env _expr = Nothing
2217
2218 valueIsWorkFree :: Value -> Bool
2219 valueIsWorkFree LambdaVal = True
2220 valueIsWorkFree (ConVal _ args) = all exprIsWorkFree args
2221
2222 samePat :: CallPat -> CallPat -> Bool
2223 samePat (vs1, as1) (vs2, as2)
2224 = all2 same as1 as2
2225 where
2226 same (Var v1) (Var v2)
2227 | v1 `elem` vs1 = v2 `elem` vs2
2228 | v2 `elem` vs2 = False
2229 | otherwise = v1 == v2
2230
2231 same (Lit l1) (Lit l2) = l1==l2
2232 same (App f1 a1) (App f2 a2) = same f1 f2 && same a1 a2
2233
2234 same (Type {}) (Type {}) = True -- Note [Ignore type differences]
2235 same (Coercion {}) (Coercion {}) = True
2236 same (Tick _ e1) e2 = same e1 e2 -- Ignore casts and notes
2237 same (Cast e1 _) e2 = same e1 e2
2238 same e1 (Tick _ e2) = same e1 e2
2239 same e1 (Cast e2 _) = same e1 e2
2240
2241 same e1 e2 = WARN( bad e1 || bad e2, ppr e1 $$ ppr e2)
2242 False -- Let, lambda, case should not occur
2243 bad (Case {}) = True
2244 bad (Let {}) = True
2245 bad (Lam {}) = True
2246 bad _other = False
2247
2248 {-
2249 Note [Ignore type differences]
2250 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2251 We do not want to generate specialisations where the call patterns
2252 differ only in their type arguments! Not only is it utterly useless,
2253 but it also means that (with polymorphic recursion) we can generate
2254 an infinite number of specialisations. Example is Data.Sequence.adjustTree,
2255 I think.
2256 -}