6115a031ec9d1174571446897f2881072a6a0f37
[ghc.git] / compiler / specialise / SpecConstr.hs
1 {-
2 ToDo [Oct 2013]
3 ~~~~~~~~~~~~~~~
4 1. Nuke ForceSpecConstr for good (it is subsumed by GHC.Types.SPEC in ghc-prim)
5 2. Nuke NoSpecConstr
6
7
8 (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
9
10 \section[SpecConstr]{Specialise over constructors}
11 -}
12
13 {-# LANGUAGE CPP #-}
14
15 module SpecConstr(
16 specConstrProgram,
17 SpecConstrAnnotation(..)
18 ) where
19
20 #include "HsVersions.h"
21
22 import GhcPrelude
23
24 import CoreSyn
25 import CoreSubst
26 import CoreUtils
27 import CoreUnfold ( couldBeSmallEnoughToInline )
28 import CoreFVs ( exprsFreeVarsList )
29 import CoreMonad
30 import Literal ( litIsLifted )
31 import HscTypes ( ModGuts(..) )
32 import WwLib ( isWorkerSmallEnough, mkWorkerArgs )
33 import DataCon
34 import Coercion hiding( substCo )
35 import Rules
36 import Type hiding ( substTy )
37 import TyCon ( tyConName )
38 import Id
39 import PprCore ( pprParendExpr )
40 import MkCore ( mkImpossibleExpr )
41 import Var
42 import VarEnv
43 import VarSet
44 import Name
45 import BasicTypes
46 import DynFlags ( DynFlags(..), GeneralFlag( Opt_SpecConstrKeen )
47 , gopt, hasPprDebug )
48 import Maybes ( orElse, catMaybes, isJust, isNothing )
49 import Demand
50 import GHC.Serialized ( deserializeWithData )
51 import Util
52 import Pair
53 import UniqSupply
54 import Outputable
55 import FastString
56 import UniqFM
57 import MonadUtils
58 import Control.Monad ( zipWithM )
59 import Data.List
60 import PrelNames ( specTyConName )
61 import Module
62 import TyCon ( TyCon )
63 import GHC.Exts( SpecConstrAnnotation(..) )
64 import Data.Ord( comparing )
65
66 {-
67 -----------------------------------------------------
68 Game plan
69 -----------------------------------------------------
70
71 Consider
72 drop n [] = []
73 drop 0 xs = []
74 drop n (x:xs) = drop (n-1) xs
75
76 After the first time round, we could pass n unboxed. This happens in
77 numerical code too. Here's what it looks like in Core:
78
79 drop n xs = case xs of
80 [] -> []
81 (y:ys) -> case n of
82 I# n# -> case n# of
83 0 -> []
84 _ -> drop (I# (n# -# 1#)) xs
85
86 Notice that the recursive call has an explicit constructor as argument.
87 Noticing this, we can make a specialised version of drop
88
89 RULE: drop (I# n#) xs ==> drop' n# xs
90
91 drop' n# xs = let n = I# n# in ...orig RHS...
92
93 Now the simplifier will apply the specialisation in the rhs of drop', giving
94
95 drop' n# xs = case xs of
96 [] -> []
97 (y:ys) -> case n# of
98 0 -> []
99 _ -> drop' (n# -# 1#) xs
100
101 Much better!
102
103 We'd also like to catch cases where a parameter is carried along unchanged,
104 but evaluated each time round the loop:
105
106 f i n = if i>0 || i>n then i else f (i*2) n
107
108 Here f isn't strict in n, but we'd like to avoid evaluating it each iteration.
109 In Core, by the time we've w/wd (f is strict in i) we get
110
111 f i# n = case i# ># 0 of
112 False -> I# i#
113 True -> case n of { I# n# ->
114 case i# ># n# of
115 False -> I# i#
116 True -> f (i# *# 2#) n
117
118 At the call to f, we see that the argument, n is known to be (I# n#),
119 and n is evaluated elsewhere in the body of f, so we can play the same
120 trick as above.
121
122
123 Note [Reboxing]
124 ~~~~~~~~~~~~~~~
125 We must be careful not to allocate the same constructor twice. Consider
126 f p = (...(case p of (a,b) -> e)...p...,
127 ...let t = (r,s) in ...t...(f t)...)
128 At the recursive call to f, we can see that t is a pair. But we do NOT want
129 to make a specialised copy:
130 f' a b = let p = (a,b) in (..., ...)
131 because now t is allocated by the caller, then r and s are passed to the
132 recursive call, which allocates the (r,s) pair again.
133
134 This happens if
135 (a) the argument p is used in other than a case-scrutinisation way.
136 (b) the argument to the call is not a 'fresh' tuple; you have to
137 look into its unfolding to see that it's a tuple
138
139 Hence the "OR" part of Note [Good arguments] below.
140
141 ALTERNATIVE 2: pass both boxed and unboxed versions. This no longer saves
142 allocation, but does perhaps save evals. In the RULE we'd have
143 something like
144
145 f (I# x#) = f' (I# x#) x#
146
147 If at the call site the (I# x) was an unfolding, then we'd have to
148 rely on CSE to eliminate the duplicate allocation.... This alternative
149 doesn't look attractive enough to pursue.
150
151 ALTERNATIVE 3: ignore the reboxing problem. The trouble is that
152 the conservative reboxing story prevents many useful functions from being
153 specialised. Example:
154 foo :: Maybe Int -> Int -> Int
155 foo (Just m) 0 = 0
156 foo x@(Just m) n = foo x (n-m)
157 Here the use of 'x' will clearly not require boxing in the specialised function.
158
159 The strictness analyser has the same problem, in fact. Example:
160 f p@(a,b) = ...
161 If we pass just 'a' and 'b' to the worker, it might need to rebox the
162 pair to create (a,b). A more sophisticated analysis might figure out
163 precisely the cases in which this could happen, but the strictness
164 analyser does no such analysis; it just passes 'a' and 'b', and hopes
165 for the best.
166
167 So my current choice is to make SpecConstr similarly aggressive, and
168 ignore the bad potential of reboxing.
169
170
171 Note [Good arguments]
172 ~~~~~~~~~~~~~~~~~~~~~
173 So we look for
174
175 * A self-recursive function. Ignore mutual recursion for now,
176 because it's less common, and the code is simpler for self-recursion.
177
178 * EITHER
179
180 a) At a recursive call, one or more parameters is an explicit
181 constructor application
182 AND
183 That same parameter is scrutinised by a case somewhere in
184 the RHS of the function
185
186 OR
187
188 b) At a recursive call, one or more parameters has an unfolding
189 that is an explicit constructor application
190 AND
191 That same parameter is scrutinised by a case somewhere in
192 the RHS of the function
193 AND
194 Those are the only uses of the parameter (see Note [Reboxing])
195
196
197 What to abstract over
198 ~~~~~~~~~~~~~~~~~~~~~
199 There's a bit of a complication with type arguments. If the call
200 site looks like
201
202 f p = ...f ((:) [a] x xs)...
203
204 then our specialised function look like
205
206 f_spec x xs = let p = (:) [a] x xs in ....as before....
207
208 This only makes sense if either
209 a) the type variable 'a' is in scope at the top of f, or
210 b) the type variable 'a' is an argument to f (and hence fs)
211
212 Actually, (a) may hold for value arguments too, in which case
213 we may not want to pass them. Suppose 'x' is in scope at f's
214 defn, but xs is not. Then we'd like
215
216 f_spec xs = let p = (:) [a] x xs in ....as before....
217
218 Similarly (b) may hold too. If x is already an argument at the
219 call, no need to pass it again.
220
221 Finally, if 'a' is not in scope at the call site, we could abstract
222 it as we do the term variables:
223
224 f_spec a x xs = let p = (:) [a] x xs in ...as before...
225
226 So the grand plan is:
227
228 * abstract the call site to a constructor-only pattern
229 e.g. C x (D (f p) (g q)) ==> C s1 (D s2 s3)
230
231 * Find the free variables of the abstracted pattern
232
233 * Pass these variables, less any that are in scope at
234 the fn defn. But see Note [Shadowing] below.
235
236
237 NOTICE that we only abstract over variables that are not in scope,
238 so we're in no danger of shadowing variables used in "higher up"
239 in f_spec's RHS.
240
241
242 Note [Shadowing]
243 ~~~~~~~~~~~~~~~~
244 In this pass we gather up usage information that may mention variables
245 that are bound between the usage site and the definition site; or (more
246 seriously) may be bound to something different at the definition site.
247 For example:
248
249 f x = letrec g y v = let x = ...
250 in ...(g (a,b) x)...
251
252 Since 'x' is in scope at the call site, we may make a rewrite rule that
253 looks like
254 RULE forall a,b. g (a,b) x = ...
255 But this rule will never match, because it's really a different 'x' at
256 the call site -- and that difference will be manifest by the time the
257 simplifier gets to it. [A worry: the simplifier doesn't *guarantee*
258 no-shadowing, so perhaps it may not be distinct?]
259
260 Anyway, the rule isn't actually wrong, it's just not useful. One possibility
261 is to run deShadowBinds before running SpecConstr, but instead we run the
262 simplifier. That gives the simplest possible program for SpecConstr to
263 chew on; and it virtually guarantees no shadowing.
264
265 Note [Specialising for constant parameters]
266 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
267 This one is about specialising on a *constant* (but not necessarily
268 constructor) argument
269
270 foo :: Int -> (Int -> Int) -> Int
271 foo 0 f = 0
272 foo m f = foo (f m) (+1)
273
274 It produces
275
276 lvl_rmV :: GHC.Base.Int -> GHC.Base.Int
277 lvl_rmV =
278 \ (ds_dlk :: GHC.Base.Int) ->
279 case ds_dlk of wild_alH { GHC.Base.I# x_alG ->
280 GHC.Base.I# (GHC.Prim.+# x_alG 1)
281
282 T.$wfoo :: GHC.Prim.Int# -> (GHC.Base.Int -> GHC.Base.Int) ->
283 GHC.Prim.Int#
284 T.$wfoo =
285 \ (ww_sme :: GHC.Prim.Int#) (w_smg :: GHC.Base.Int -> GHC.Base.Int) ->
286 case ww_sme of ds_Xlw {
287 __DEFAULT ->
288 case w_smg (GHC.Base.I# ds_Xlw) of w1_Xmo { GHC.Base.I# ww1_Xmz ->
289 T.$wfoo ww1_Xmz lvl_rmV
290 };
291 0 -> 0
292 }
293
294 The recursive call has lvl_rmV as its argument, so we could create a specialised copy
295 with that argument baked in; that is, not passed at all. Now it can perhaps be inlined.
296
297 When is this worth it? Call the constant 'lvl'
298 - If 'lvl' has an unfolding that is a constructor, see if the corresponding
299 parameter is scrutinised anywhere in the body.
300
301 - If 'lvl' has an unfolding that is a inlinable function, see if the corresponding
302 parameter is applied (...to enough arguments...?)
303
304 Also do this is if the function has RULES?
305
306 Also
307
308 Note [Specialising for lambda parameters]
309 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
310 foo :: Int -> (Int -> Int) -> Int
311 foo 0 f = 0
312 foo m f = foo (f m) (\n -> n-m)
313
314 This is subtly different from the previous one in that we get an
315 explicit lambda as the argument:
316
317 T.$wfoo :: GHC.Prim.Int# -> (GHC.Base.Int -> GHC.Base.Int) ->
318 GHC.Prim.Int#
319 T.$wfoo =
320 \ (ww_sm8 :: GHC.Prim.Int#) (w_sma :: GHC.Base.Int -> GHC.Base.Int) ->
321 case ww_sm8 of ds_Xlr {
322 __DEFAULT ->
323 case w_sma (GHC.Base.I# ds_Xlr) of w1_Xmf { GHC.Base.I# ww1_Xmq ->
324 T.$wfoo
325 ww1_Xmq
326 (\ (n_ad3 :: GHC.Base.Int) ->
327 case n_ad3 of wild_alB { GHC.Base.I# x_alA ->
328 GHC.Base.I# (GHC.Prim.-# x_alA ds_Xlr)
329 })
330 };
331 0 -> 0
332 }
333
334 I wonder if SpecConstr couldn't be extended to handle this? After all,
335 lambda is a sort of constructor for functions and perhaps it already
336 has most of the necessary machinery?
337
338 Furthermore, there's an immediate win, because you don't need to allocate the lambda
339 at the call site; and if perchance it's called in the recursive call, then you
340 may avoid allocating it altogether. Just like for constructors.
341
342 Looks cool, but probably rare...but it might be easy to implement.
343
344
345 Note [SpecConstr for casts]
346 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
347 Consider
348 data family T a :: *
349 data instance T Int = T Int
350
351 foo n = ...
352 where
353 go (T 0) = 0
354 go (T n) = go (T (n-1))
355
356 The recursive call ends up looking like
357 go (T (I# ...) `cast` g)
358 So we want to spot the constructor application inside the cast.
359 That's why we have the Cast case in argToPat
360
361 Note [Local recursive groups]
362 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
363 For a *local* recursive group, we can see all the calls to the
364 function, so we seed the specialisation loop from the calls in the
365 body, not from the calls in the RHS. Consider:
366
367 bar m n = foo n (n,n) (n,n) (n,n) (n,n)
368 where
369 foo n p q r s
370 | n == 0 = m
371 | n > 3000 = case p of { (p1,p2) -> foo (n-1) (p2,p1) q r s }
372 | n > 2000 = case q of { (q1,q2) -> foo (n-1) p (q2,q1) r s }
373 | n > 1000 = case r of { (r1,r2) -> foo (n-1) p q (r2,r1) s }
374 | otherwise = case s of { (s1,s2) -> foo (n-1) p q r (s2,s1) }
375
376 If we start with the RHSs of 'foo', we get lots and lots of specialisations,
377 most of which are not needed. But if we start with the (single) call
378 in the rhs of 'bar' we get exactly one fully-specialised copy, and all
379 the recursive calls go to this fully-specialised copy. Indeed, the original
380 function is later collected as dead code. This is very important in
381 specialising the loops arising from stream fusion, for example in NDP where
382 we were getting literally hundreds of (mostly unused) specialisations of
383 a local function.
384
385 In a case like the above we end up never calling the original un-specialised
386 function. (Although we still leave its code around just in case.)
387
388 However, if we find any boring calls in the body, including *unsaturated*
389 ones, such as
390 letrec foo x y = ....foo...
391 in map foo xs
392 then we will end up calling the un-specialised function, so then we *should*
393 use the calls in the un-specialised RHS as seeds. We call these
394 "boring call patterns", and callsToPats reports if it finds any of these.
395
396 Note [Seeding top-level recursive groups]
397 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
398 This seeding is done in the binding for seed_calls in specRec.
399
400 1. If all the bindings in a top-level recursive group are local (not
401 exported), then all the calls are in the rest of the top-level
402 bindings. This means we can specialise with those call patterns
403 ONLY, and NOT with the RHSs of the recursive group (exactly like
404 Note [Local recursive groups])
405
406 2. But if any of the bindings are exported, the function may be called
407 with any old arguments, so (for lack of anything better) we specialise
408 based on
409 (a) the call patterns in the RHS
410 (b) the call patterns in the rest of the top-level bindings
411 NB: before Apr 15 we used (a) only, but Dimitrios had an example
412 where (b) was crucial, so I added that.
413 Adding (b) also improved nofib allocation results:
414 multiplier: 4% better
415 minimax: 2.8% better
416
417 Actually in case (2), instead of using the calls from the RHS, it
418 would be better to specialise in the importing module. We'd need to
419 add an INLINABLE pragma to the function, and then it can be
420 specialised in the importing scope, just as is done for type classes
421 in Specialise.specImports. This remains to be done (#10346).
422
423 Note [Top-level recursive groups]
424 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
425 To get the call usage information from "the rest of the top level
426 bindings" (c.f. Note [Seeding top-level recursive groups]), we work
427 backwards through the top-level bindings so we see the usage before we
428 get to the binding of the function. Before we can collect the usage
429 though, we go through all the bindings and add them to the
430 environment. This is necessary because usage is only tracked for
431 functions in the environment. These two passes are called
432 'go' and 'goEnv'
433 in specConstrProgram. (Looks a bit revolting to me.)
434
435 Note [Do not specialise diverging functions]
436 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
437 Specialising a function that just diverges is a waste of code.
438 Furthermore, it broke GHC (simpl014) thus:
439 {-# STR Sb #-}
440 f = \x. case x of (a,b) -> f x
441 If we specialise f we get
442 f = \x. case x of (a,b) -> fspec a b
443 But fspec doesn't have decent strictness info. As it happened,
444 (f x) :: IO t, so the state hack applied and we eta expanded fspec,
445 and hence f. But now f's strictness is less than its arity, which
446 breaks an invariant.
447
448
449 Note [Forcing specialisation]
450 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
451 With stream fusion and in other similar cases, we want to fully
452 specialise some (but not necessarily all!) loops regardless of their
453 size and the number of specialisations.
454
455 We allow a library to do this, in one of two ways (one which is
456 deprecated):
457
458 1) Add a parameter of type GHC.Types.SPEC (from ghc-prim) to the loop body.
459
460 2) (Deprecated) Annotate a type with ForceSpecConstr from GHC.Exts,
461 and then add *that* type as a parameter to the loop body
462
463 The reason #2 is deprecated is because it requires GHCi, which isn't
464 available for things like a cross compiler using stage1.
465
466 Here's a (simplified) example from the `vector` package. You may bring
467 the special 'force specialization' type into scope by saying:
468
469 import GHC.Types (SPEC(..))
470
471 or by defining your own type (again, deprecated):
472
473 data SPEC = SPEC | SPEC2
474 {-# ANN type SPEC ForceSpecConstr #-}
475
476 (Note this is the exact same definition of GHC.Types.SPEC, just
477 without the annotation.)
478
479 After that, you say:
480
481 foldl :: (a -> b -> a) -> a -> Stream b -> a
482 {-# INLINE foldl #-}
483 foldl f z (Stream step s _) = foldl_loop SPEC z s
484 where
485 foldl_loop !sPEC z s = case step s of
486 Yield x s' -> foldl_loop sPEC (f z x) s'
487 Skip -> foldl_loop sPEC z s'
488 Done -> z
489
490 SpecConstr will spot the SPEC parameter and always fully specialise
491 foldl_loop. Note that
492
493 * We have to prevent the SPEC argument from being removed by
494 w/w which is why (a) SPEC is a sum type, and (b) we have to seq on
495 the SPEC argument.
496
497 * And lastly, the SPEC argument is ultimately eliminated by
498 SpecConstr itself so there is no runtime overhead.
499
500 This is all quite ugly; we ought to come up with a better design.
501
502 ForceSpecConstr arguments are spotted in scExpr' and scTopBinds which then set
503 sc_force to True when calling specLoop. This flag does four things:
504
505 * Ignore specConstrThreshold, to specialise functions of arbitrary size
506 (see scTopBind)
507 * Ignore specConstrCount, to make arbitrary numbers of specialisations
508 (see specialise)
509 * Specialise even for arguments that are not scrutinised in the loop
510 (see argToPat; Trac #4488)
511 * Only specialise on recursive types a finite number of times
512 (see is_too_recursive; Trac #5550; Note [Limit recursive specialisation])
513
514 The flag holds only for specialising a single binding group, and NOT
515 for nested bindings. (So really it should be passed around explicitly
516 and not stored in ScEnv.) Trac #14379 turned out to be caused by
517 f SPEC x = let g1 x = ...
518 in ...
519 We force-specialise f (becuase of the SPEC), but that generates a specialised
520 copy of g1 (as well as the original). Alas g1 has a nested binding g2; and
521 in each copy of g1 we get an unspecialised and specialised copy of g2; and so
522 on. Result, exponential. So the force-spec flag now only applies to one
523 level of bindings at a time.
524
525 Mechanism for this one-level-only thing:
526
527 - Switch it on at the call to specRec, in scExpr and scTopBinds
528 - Switch it off when doing the RHSs;
529 this can be done very conveneniently in decreaseSpecCount
530
531 What alternatives did I consider?
532
533 * Annotating the loop itself doesn't work because (a) it is local and
534 (b) it will be w/w'ed and having w/w propagating annotations somehow
535 doesn't seem like a good idea. The types of the loop arguments
536 really seem to be the most persistent thing.
537
538 * Annotating the types that make up the loop state doesn't work,
539 either, because (a) it would prevent us from using types like Either
540 or tuples here, (b) we don't want to restrict the set of types that
541 can be used in Stream states and (c) some types are fixed by the
542 user (e.g., the accumulator here) but we still want to specialise as
543 much as possible.
544
545 Alternatives to ForceSpecConstr
546 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
547 Instead of giving the loop an extra argument of type SPEC, we
548 also considered *wrapping* arguments in SPEC, thus
549 data SPEC a = SPEC a | SPEC2
550
551 loop = \arg -> case arg of
552 SPEC state ->
553 case state of (x,y) -> ... loop (SPEC (x',y')) ...
554 S2 -> error ...
555 The idea is that a SPEC argument says "specialise this argument
556 regardless of whether the function case-analyses it". But this
557 doesn't work well:
558 * SPEC must still be a sum type, else the strictness analyser
559 eliminates it
560 * But that means that 'loop' won't be strict in its real payload
561 This loss of strictness in turn screws up specialisation, because
562 we may end up with calls like
563 loop (SPEC (case z of (p,q) -> (q,p)))
564 Without the SPEC, if 'loop' were strict, the case would move out
565 and we'd see loop applied to a pair. But if 'loop' isn't strict
566 this doesn't look like a specialisable call.
567
568 Note [Limit recursive specialisation]
569 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
570 It is possible for ForceSpecConstr to cause an infinite loop of specialisation.
571 Because there is no limit on the number of specialisations, a recursive call with
572 a recursive constructor as an argument (for example, list cons) will generate
573 a specialisation for that constructor. If the resulting specialisation also
574 contains a recursive call with the constructor, this could proceed indefinitely.
575
576 For example, if ForceSpecConstr is on:
577 loop :: [Int] -> [Int] -> [Int]
578 loop z [] = z
579 loop z (x:xs) = loop (x:z) xs
580 this example will create a specialisation for the pattern
581 loop (a:b) c = loop' a b c
582
583 loop' a b [] = (a:b)
584 loop' a b (x:xs) = loop (x:(a:b)) xs
585 and a new pattern is found:
586 loop (a:(b:c)) d = loop'' a b c d
587 which can continue indefinitely.
588
589 Roman's suggestion to fix this was to stop after a couple of times on recursive types,
590 but still specialising on non-recursive types as much as possible.
591
592 To implement this, we count the number of times we have gone round the
593 "specialise recursively" loop ('go' in 'specRec'). Once have gone round
594 more than N times (controlled by -fspec-constr-recursive=N) we check
595
596 - If sc_force is off, and sc_count is (Just max) then we don't
597 need to do anything: trim_pats will limit the number of specs
598
599 - Otherwise check if any function has now got more than (sc_count env)
600 specialisations. If sc_count is "no limit" then we arbitrarily
601 choose 10 as the limit (ugh).
602
603 See Trac #5550. Also Trac #13623, where this test had become over-aggressive,
604 and we lost a wonderful specialisation that we really wanted!
605
606 Note [NoSpecConstr]
607 ~~~~~~~~~~~~~~~~~~~
608 The ignoreDataCon stuff allows you to say
609 {-# ANN type T NoSpecConstr #-}
610 to mean "don't specialise on arguments of this type". It was added
611 before we had ForceSpecConstr. Lacking ForceSpecConstr we specialised
612 regardless of size; and then we needed a way to turn that *off*. Now
613 that we have ForceSpecConstr, this NoSpecConstr is probably redundant.
614 (Used only for PArray.)
615
616 -----------------------------------------------------
617 Stuff not yet handled
618 -----------------------------------------------------
619
620 Here are notes arising from Roman's work that I don't want to lose.
621
622 Example 1
623 ~~~~~~~~~
624 data T a = T !a
625
626 foo :: Int -> T Int -> Int
627 foo 0 t = 0
628 foo x t | even x = case t of { T n -> foo (x-n) t }
629 | otherwise = foo (x-1) t
630
631 SpecConstr does no specialisation, because the second recursive call
632 looks like a boxed use of the argument. A pity.
633
634 $wfoo_sFw :: GHC.Prim.Int# -> T.T GHC.Base.Int -> GHC.Prim.Int#
635 $wfoo_sFw =
636 \ (ww_sFo [Just L] :: GHC.Prim.Int#) (w_sFq [Just L] :: T.T GHC.Base.Int) ->
637 case ww_sFo of ds_Xw6 [Just L] {
638 __DEFAULT ->
639 case GHC.Prim.remInt# ds_Xw6 2 of wild1_aEF [Dead Just A] {
640 __DEFAULT -> $wfoo_sFw (GHC.Prim.-# ds_Xw6 1) w_sFq;
641 0 ->
642 case w_sFq of wild_Xy [Just L] { T.T n_ad5 [Just U(L)] ->
643 case n_ad5 of wild1_aET [Just A] { GHC.Base.I# y_aES [Just L] ->
644 $wfoo_sFw (GHC.Prim.-# ds_Xw6 y_aES) wild_Xy
645 } } };
646 0 -> 0
647
648 Example 2
649 ~~~~~~~~~
650 data a :*: b = !a :*: !b
651 data T a = T !a
652
653 foo :: (Int :*: T Int) -> Int
654 foo (0 :*: t) = 0
655 foo (x :*: t) | even x = case t of { T n -> foo ((x-n) :*: t) }
656 | otherwise = foo ((x-1) :*: t)
657
658 Very similar to the previous one, except that the parameters are now in
659 a strict tuple. Before SpecConstr, we have
660
661 $wfoo_sG3 :: GHC.Prim.Int# -> T.T GHC.Base.Int -> GHC.Prim.Int#
662 $wfoo_sG3 =
663 \ (ww_sFU [Just L] :: GHC.Prim.Int#) (ww_sFW [Just L] :: T.T
664 GHC.Base.Int) ->
665 case ww_sFU of ds_Xws [Just L] {
666 __DEFAULT ->
667 case GHC.Prim.remInt# ds_Xws 2 of wild1_aEZ [Dead Just A] {
668 __DEFAULT ->
669 case ww_sFW of tpl_B2 [Just L] { T.T a_sFo [Just A] ->
670 $wfoo_sG3 (GHC.Prim.-# ds_Xws 1) tpl_B2 -- $wfoo1
671 };
672 0 ->
673 case ww_sFW of wild_XB [Just A] { T.T n_ad7 [Just S(L)] ->
674 case n_ad7 of wild1_aFd [Just L] { GHC.Base.I# y_aFc [Just L] ->
675 $wfoo_sG3 (GHC.Prim.-# ds_Xws y_aFc) wild_XB -- $wfoo2
676 } } };
677 0 -> 0 }
678
679 We get two specialisations:
680 "SC:$wfoo1" [0] __forall {a_sFB :: GHC.Base.Int sc_sGC :: GHC.Prim.Int#}
681 Foo.$wfoo sc_sGC (Foo.T @ GHC.Base.Int a_sFB)
682 = Foo.$s$wfoo1 a_sFB sc_sGC ;
683 "SC:$wfoo2" [0] __forall {y_aFp :: GHC.Prim.Int# sc_sGC :: GHC.Prim.Int#}
684 Foo.$wfoo sc_sGC (Foo.T @ GHC.Base.Int (GHC.Base.I# y_aFp))
685 = Foo.$s$wfoo y_aFp sc_sGC ;
686
687 But perhaps the first one isn't good. After all, we know that tpl_B2 is
688 a T (I# x) really, because T is strict and Int has one constructor. (We can't
689 unbox the strict fields, because T is polymorphic!)
690
691 ************************************************************************
692 * *
693 \subsection{Top level wrapper stuff}
694 * *
695 ************************************************************************
696 -}
697
698 specConstrProgram :: ModGuts -> CoreM ModGuts
699 specConstrProgram guts
700 = do
701 dflags <- getDynFlags
702 us <- getUniqueSupplyM
703 annos <- getFirstAnnotations deserializeWithData guts
704 this_mod <- getModule
705 let binds' = reverse $ fst $ initUs us $ do
706 -- Note [Top-level recursive groups]
707 (env, binds) <- goEnv (initScEnv dflags this_mod annos)
708 (mg_binds guts)
709 -- binds is identical to (mg_binds guts), except that the
710 -- binders on the LHS have been replaced by extendBndr
711 -- (SPJ this seems like overkill; I don't think the binders
712 -- will change at all; and we don't substitute in the RHSs anyway!!)
713 go env nullUsage (reverse binds)
714
715 return (guts { mg_binds = binds' })
716 where
717 -- See Note [Top-level recursive groups]
718 goEnv env [] = return (env, [])
719 goEnv env (bind:binds) = do (env', bind') <- scTopBindEnv env bind
720 (env'', binds') <- goEnv env' binds
721 return (env'', bind' : binds')
722
723 -- Arg list of bindings is in reverse order
724 go _ _ [] = return []
725 go env usg (bind:binds) = do (usg', bind') <- scTopBind env usg bind
726 binds' <- go env usg' binds
727 return (bind' : binds')
728
729 {-
730 ************************************************************************
731 * *
732 \subsection{Environment: goes downwards}
733 * *
734 ************************************************************************
735
736 Note [Work-free values only in environment]
737 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
738 The sc_vals field keeps track of in-scope value bindings, so
739 that if we come across (case x of Just y ->...) we can reduce the
740 case from knowing that x is bound to a pair.
741
742 But only *work-free* values are ok here. For example if the envt had
743 x -> Just (expensive v)
744 then we do NOT want to expand to
745 let y = expensive v in ...
746 because the x-binding still exists and we've now duplicated (expensive v).
747
748 This seldom happens because let-bound constructor applications are
749 ANF-ised, but it can happen as a result of on-the-fly transformations in
750 SpecConstr itself. Here is Trac #7865:
751
752 let {
753 a'_shr =
754 case xs_af8 of _ {
755 [] -> acc_af6;
756 : ds_dgt [Dmd=<L,A>] ds_dgu [Dmd=<L,A>] ->
757 (expensive x_af7, x_af7
758 } } in
759 let {
760 ds_sht =
761 case a'_shr of _ { (p'_afd, q'_afe) ->
762 TSpecConstr_DoubleInline.recursive
763 (GHC.Types.: @ GHC.Types.Int x_af7 wild_X6) (q'_afe, p'_afd)
764 } } in
765
766 When processed knowing that xs_af8 was bound to a cons, we simplify to
767 a'_shr = (expensive x_af7, x_af7)
768 and we do NOT want to inline that at the occurrence of a'_shr in ds_sht.
769 (There are other occurrences of a'_shr.) No no no.
770
771 It would be possible to do some on-the-fly ANF-ising, so that a'_shr turned
772 into a work-free value again, thus
773 a1 = expensive x_af7
774 a'_shr = (a1, x_af7)
775 but that's more work, so until its shown to be important I'm going to
776 leave it for now.
777
778 Note [Making SpecConstr keener]
779 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
780 Consider this, in (perf/should_run/T9339)
781 last (filter odd [1..1000])
782
783 After optimisation, including SpecConstr, we get:
784 f :: Int# -> Int -> Int
785 f x y = case case remInt# x 2# of
786 __DEFAULT -> case x of
787 __DEFAULT -> f (+# wild_Xp 1#) (I# x)
788 1000000# -> ...
789 0# -> case x of
790 __DEFAULT -> f (+# wild_Xp 1#) y
791 1000000# -> y
792
793 Not good! We build an (I# x) box every time around the loop.
794 SpecConstr (as described in the paper) does not specialise f, despite
795 the call (f ... (I# x)) because 'y' is not scrutinied in the body.
796 But it is much better to specialise f for the case where the argument
797 is of form (I# x); then we build the box only when returning y, which
798 is on the cold path.
799
800 Another example:
801
802 f x = ...(g x)....
803
804 Here 'x' is not scrutinised in f's body; but if we did specialise 'f'
805 then the call (g x) might allow 'g' to be specialised in turn.
806
807 So sc_keen controls whether or not we take account of whether argument is
808 scrutinised in the body. True <=> ignore that, and speicalise whenever
809 the function is applied to a data constructor.
810 -}
811
812 data ScEnv = SCE { sc_dflags :: DynFlags,
813 sc_module :: !Module,
814 sc_size :: Maybe Int, -- Size threshold
815 -- Nothing => no limit
816
817 sc_count :: Maybe Int, -- Max # of specialisations for any one fn
818 -- Nothing => no limit
819 -- See Note [Avoiding exponential blowup]
820
821 sc_recursive :: Int, -- Max # of specialisations over recursive type.
822 -- Stops ForceSpecConstr from diverging.
823
824 sc_keen :: Bool, -- Specialise on arguments that are known
825 -- constructors, even if they are not
826 -- scrutinised in the body. See
827 -- Note [Making SpecConstr keener]
828
829 sc_force :: Bool, -- Force specialisation?
830 -- See Note [Forcing specialisation]
831
832 sc_subst :: Subst, -- Current substitution
833 -- Maps InIds to OutExprs
834
835 sc_how_bound :: HowBoundEnv,
836 -- Binds interesting non-top-level variables
837 -- Domain is OutVars (*after* applying the substitution)
838
839 sc_vals :: ValueEnv,
840 -- Domain is OutIds (*after* applying the substitution)
841 -- Used even for top-level bindings (but not imported ones)
842 -- The range of the ValueEnv is *work-free* values
843 -- such as (\x. blah), or (Just v)
844 -- but NOT (Just (expensive v))
845 -- See Note [Work-free values only in environment]
846
847 sc_annotations :: UniqFM SpecConstrAnnotation
848 }
849
850 ---------------------
851 type HowBoundEnv = VarEnv HowBound -- Domain is OutVars
852
853 ---------------------
854 type ValueEnv = IdEnv Value -- Domain is OutIds
855 data Value = ConVal AltCon [CoreArg] -- _Saturated_ constructors
856 -- The AltCon is never DEFAULT
857 | LambdaVal -- Inlinable lambdas or PAPs
858
859 instance Outputable Value where
860 ppr (ConVal con args) = ppr con <+> interpp'SP args
861 ppr LambdaVal = text "<Lambda>"
862
863 ---------------------
864 initScEnv :: DynFlags -> Module -> UniqFM SpecConstrAnnotation -> ScEnv
865 initScEnv dflags this_mod anns
866 = SCE { sc_dflags = dflags,
867 sc_module = this_mod,
868 sc_size = specConstrThreshold dflags,
869 sc_count = specConstrCount dflags,
870 sc_recursive = specConstrRecursive dflags,
871 sc_keen = gopt Opt_SpecConstrKeen dflags,
872 sc_force = False,
873 sc_subst = emptySubst,
874 sc_how_bound = emptyVarEnv,
875 sc_vals = emptyVarEnv,
876 sc_annotations = anns }
877
878 data HowBound = RecFun -- These are the recursive functions for which
879 -- we seek interesting call patterns
880
881 | RecArg -- These are those functions' arguments, or their sub-components;
882 -- we gather occurrence information for these
883
884 instance Outputable HowBound where
885 ppr RecFun = text "RecFun"
886 ppr RecArg = text "RecArg"
887
888 scForce :: ScEnv -> Bool -> ScEnv
889 scForce env b = env { sc_force = b }
890
891 lookupHowBound :: ScEnv -> Id -> Maybe HowBound
892 lookupHowBound env id = lookupVarEnv (sc_how_bound env) id
893
894 scSubstId :: ScEnv -> Id -> CoreExpr
895 scSubstId env v = lookupIdSubst (text "scSubstId") (sc_subst env) v
896
897 scSubstTy :: ScEnv -> Type -> Type
898 scSubstTy env ty = substTy (sc_subst env) ty
899
900 scSubstCo :: ScEnv -> Coercion -> Coercion
901 scSubstCo env co = substCo (sc_subst env) co
902
903 zapScSubst :: ScEnv -> ScEnv
904 zapScSubst env = env { sc_subst = zapSubstEnv (sc_subst env) }
905
906 extendScInScope :: ScEnv -> [Var] -> ScEnv
907 -- Bring the quantified variables into scope
908 extendScInScope env qvars = env { sc_subst = extendInScopeList (sc_subst env) qvars }
909
910 -- Extend the substitution
911 extendScSubst :: ScEnv -> Var -> OutExpr -> ScEnv
912 extendScSubst env var expr = env { sc_subst = extendSubst (sc_subst env) var expr }
913
914 extendScSubstList :: ScEnv -> [(Var,OutExpr)] -> ScEnv
915 extendScSubstList env prs = env { sc_subst = extendSubstList (sc_subst env) prs }
916
917 extendHowBound :: ScEnv -> [Var] -> HowBound -> ScEnv
918 extendHowBound env bndrs how_bound
919 = env { sc_how_bound = extendVarEnvList (sc_how_bound env)
920 [(bndr,how_bound) | bndr <- bndrs] }
921
922 extendBndrsWith :: HowBound -> ScEnv -> [Var] -> (ScEnv, [Var])
923 extendBndrsWith how_bound env bndrs
924 = (env { sc_subst = subst', sc_how_bound = hb_env' }, bndrs')
925 where
926 (subst', bndrs') = substBndrs (sc_subst env) bndrs
927 hb_env' = sc_how_bound env `extendVarEnvList`
928 [(bndr,how_bound) | bndr <- bndrs']
929
930 extendBndrWith :: HowBound -> ScEnv -> Var -> (ScEnv, Var)
931 extendBndrWith how_bound env bndr
932 = (env { sc_subst = subst', sc_how_bound = hb_env' }, bndr')
933 where
934 (subst', bndr') = substBndr (sc_subst env) bndr
935 hb_env' = extendVarEnv (sc_how_bound env) bndr' how_bound
936
937 extendRecBndrs :: ScEnv -> [Var] -> (ScEnv, [Var])
938 extendRecBndrs env bndrs = (env { sc_subst = subst' }, bndrs')
939 where
940 (subst', bndrs') = substRecBndrs (sc_subst env) bndrs
941
942 extendBndr :: ScEnv -> Var -> (ScEnv, Var)
943 extendBndr env bndr = (env { sc_subst = subst' }, bndr')
944 where
945 (subst', bndr') = substBndr (sc_subst env) bndr
946
947 extendValEnv :: ScEnv -> Id -> Maybe Value -> ScEnv
948 extendValEnv env _ Nothing = env
949 extendValEnv env id (Just cv)
950 | valueIsWorkFree cv -- Don't duplicate work!! Trac #7865
951 = env { sc_vals = extendVarEnv (sc_vals env) id cv }
952 extendValEnv env _ _ = env
953
954 extendCaseBndrs :: ScEnv -> OutExpr -> OutId -> AltCon -> [Var] -> (ScEnv, [Var])
955 -- When we encounter
956 -- case scrut of b
957 -- C x y -> ...
958 -- we want to bind b, to (C x y)
959 -- NB1: Extends only the sc_vals part of the envt
960 -- NB2: Kill the dead-ness info on the pattern binders x,y, since
961 -- they are potentially made alive by the [b -> C x y] binding
962 extendCaseBndrs env scrut case_bndr con alt_bndrs
963 = (env2, alt_bndrs')
964 where
965 live_case_bndr = not (isDeadBinder case_bndr)
966 env1 | Var v <- stripTicksTopE (const True) scrut
967 = extendValEnv env v cval
968 | otherwise = env -- See Note [Add scrutinee to ValueEnv too]
969 env2 | live_case_bndr = extendValEnv env1 case_bndr cval
970 | otherwise = env1
971
972 alt_bndrs' | case scrut of { Var {} -> True; _ -> live_case_bndr }
973 = map zap alt_bndrs
974 | otherwise
975 = alt_bndrs
976
977 cval = case con of
978 DEFAULT -> Nothing
979 LitAlt {} -> Just (ConVal con [])
980 DataAlt {} -> Just (ConVal con vanilla_args)
981 where
982 vanilla_args = map Type (tyConAppArgs (idType case_bndr)) ++
983 varsToCoreExprs alt_bndrs
984
985 zap v | isTyVar v = v -- See NB2 above
986 | otherwise = zapIdOccInfo v
987
988
989 decreaseSpecCount :: ScEnv -> Int -> ScEnv
990 -- See Note [Avoiding exponential blowup]
991 decreaseSpecCount env n_specs
992 = env { sc_force = False -- See Note [Forcing specialisation]
993 , sc_count = case sc_count env of
994 Nothing -> Nothing
995 Just n -> Just (n `div` (n_specs + 1)) }
996 -- The "+1" takes account of the original function;
997 -- See Note [Avoiding exponential blowup]
998
999 ---------------------------------------------------
1000 -- See Note [Forcing specialisation]
1001 ignoreType :: ScEnv -> Type -> Bool
1002 ignoreDataCon :: ScEnv -> DataCon -> Bool
1003 forceSpecBndr :: ScEnv -> Var -> Bool
1004
1005 ignoreDataCon env dc = ignoreTyCon env (dataConTyCon dc)
1006
1007 ignoreType env ty
1008 = case tyConAppTyCon_maybe ty of
1009 Just tycon -> ignoreTyCon env tycon
1010 _ -> False
1011
1012 ignoreTyCon :: ScEnv -> TyCon -> Bool
1013 ignoreTyCon env tycon
1014 = lookupUFM (sc_annotations env) tycon == Just NoSpecConstr
1015
1016 forceSpecBndr env var = forceSpecFunTy env . snd . splitForAllTys . varType $ var
1017
1018 forceSpecFunTy :: ScEnv -> Type -> Bool
1019 forceSpecFunTy env = any (forceSpecArgTy env) . fst . splitFunTys
1020
1021 forceSpecArgTy :: ScEnv -> Type -> Bool
1022 forceSpecArgTy env ty
1023 | Just ty' <- coreView ty = forceSpecArgTy env ty'
1024
1025 forceSpecArgTy env ty
1026 | Just (tycon, tys) <- splitTyConApp_maybe ty
1027 , tycon /= funTyCon
1028 = tyConName tycon == specTyConName
1029 || lookupUFM (sc_annotations env) tycon == Just ForceSpecConstr
1030 || any (forceSpecArgTy env) tys
1031
1032 forceSpecArgTy _ _ = False
1033
1034 {-
1035 Note [Add scrutinee to ValueEnv too]
1036 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1037 Consider this:
1038 case x of y
1039 (a,b) -> case b of c
1040 I# v -> ...(f y)...
1041 By the time we get to the call (f y), the ValueEnv
1042 will have a binding for y, and for c
1043 y -> (a,b)
1044 c -> I# v
1045 BUT that's not enough! Looking at the call (f y) we
1046 see that y is pair (a,b), but we also need to know what 'b' is.
1047 So in extendCaseBndrs we must *also* add the binding
1048 b -> I# v
1049 else we lose a useful specialisation for f. This is necessary even
1050 though the simplifier has systematically replaced uses of 'x' with 'y'
1051 and 'b' with 'c' in the code. The use of 'b' in the ValueEnv came
1052 from outside the case. See Trac #4908 for the live example.
1053
1054 Note [Avoiding exponential blowup]
1055 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1056 The sc_count field of the ScEnv says how many times we are prepared to
1057 duplicate a single function. But we must take care with recursive
1058 specialisations. Consider
1059
1060 let $j1 = let $j2 = let $j3 = ...
1061 in
1062 ...$j3...
1063 in
1064 ...$j2...
1065 in
1066 ...$j1...
1067
1068 If we specialise $j1 then in each specialisation (as well as the original)
1069 we can specialise $j2, and similarly $j3. Even if we make just *one*
1070 specialisation of each, because we also have the original we'll get 2^n
1071 copies of $j3, which is not good.
1072
1073 So when recursively specialising we divide the sc_count by the number of
1074 copies we are making at this level, including the original.
1075
1076
1077 ************************************************************************
1078 * *
1079 \subsection{Usage information: flows upwards}
1080 * *
1081 ************************************************************************
1082 -}
1083
1084 data ScUsage
1085 = SCU {
1086 scu_calls :: CallEnv, -- Calls
1087 -- The functions are a subset of the
1088 -- RecFuns in the ScEnv
1089
1090 scu_occs :: !(IdEnv ArgOcc) -- Information on argument occurrences
1091 } -- The domain is OutIds
1092
1093 type CallEnv = IdEnv [Call]
1094 data Call = Call Id [CoreArg] ValueEnv
1095 -- The arguments of the call, together with the
1096 -- env giving the constructor bindings at the call site
1097 -- We keep the function mainly for debug output
1098
1099 instance Outputable ScUsage where
1100 ppr (SCU { scu_calls = calls, scu_occs = occs })
1101 = text "SCU" <+> braces (sep [ ptext (sLit "calls =") <+> ppr calls
1102 , text "occs =" <+> ppr occs ])
1103
1104 instance Outputable Call where
1105 ppr (Call fn args _) = ppr fn <+> fsep (map pprParendExpr args)
1106
1107 nullUsage :: ScUsage
1108 nullUsage = SCU { scu_calls = emptyVarEnv, scu_occs = emptyVarEnv }
1109
1110 combineCalls :: CallEnv -> CallEnv -> CallEnv
1111 combineCalls = plusVarEnv_C (++)
1112 where
1113 -- plus cs ds | length res > 1
1114 -- = pprTrace "combineCalls" (vcat [ text "cs:" <+> ppr cs
1115 -- , text "ds:" <+> ppr ds])
1116 -- res
1117 -- | otherwise = res
1118 -- where
1119 -- res = cs ++ ds
1120
1121 combineUsage :: ScUsage -> ScUsage -> ScUsage
1122 combineUsage u1 u2 = SCU { scu_calls = combineCalls (scu_calls u1) (scu_calls u2),
1123 scu_occs = plusVarEnv_C combineOcc (scu_occs u1) (scu_occs u2) }
1124
1125 combineUsages :: [ScUsage] -> ScUsage
1126 combineUsages [] = nullUsage
1127 combineUsages us = foldr1 combineUsage us
1128
1129 lookupOccs :: ScUsage -> [OutVar] -> (ScUsage, [ArgOcc])
1130 lookupOccs (SCU { scu_calls = sc_calls, scu_occs = sc_occs }) bndrs
1131 = (SCU {scu_calls = sc_calls, scu_occs = delVarEnvList sc_occs bndrs},
1132 [lookupVarEnv sc_occs b `orElse` NoOcc | b <- bndrs])
1133
1134 data ArgOcc = NoOcc -- Doesn't occur at all; or a type argument
1135 | UnkOcc -- Used in some unknown way
1136
1137 | ScrutOcc -- See Note [ScrutOcc]
1138 (DataConEnv [ArgOcc]) -- How the sub-components are used
1139
1140 type DataConEnv a = UniqFM a -- Keyed by DataCon
1141
1142 {- Note [ScrutOcc]
1143 ~~~~~~~~~~~~~~~~~~~
1144 An occurrence of ScrutOcc indicates that the thing, or a `cast` version of the thing,
1145 is *only* taken apart or applied.
1146
1147 Functions, literal: ScrutOcc emptyUFM
1148 Data constructors: ScrutOcc subs,
1149
1150 where (subs :: UniqFM [ArgOcc]) gives usage of the *pattern-bound* components,
1151 The domain of the UniqFM is the Unique of the data constructor
1152
1153 The [ArgOcc] is the occurrences of the *pattern-bound* components
1154 of the data structure. E.g.
1155 data T a = forall b. MkT a b (b->a)
1156 A pattern binds b, x::a, y::b, z::b->a, but not 'a'!
1157
1158 -}
1159
1160 instance Outputable ArgOcc where
1161 ppr (ScrutOcc xs) = text "scrut-occ" <> ppr xs
1162 ppr UnkOcc = text "unk-occ"
1163 ppr NoOcc = text "no-occ"
1164
1165 evalScrutOcc :: ArgOcc
1166 evalScrutOcc = ScrutOcc emptyUFM
1167
1168 -- Experimentally, this vesion of combineOcc makes ScrutOcc "win", so
1169 -- that if the thing is scrutinised anywhere then we get to see that
1170 -- in the overall result, even if it's also used in a boxed way
1171 -- This might be too aggressive; see Note [Reboxing] Alternative 3
1172 combineOcc :: ArgOcc -> ArgOcc -> ArgOcc
1173 combineOcc NoOcc occ = occ
1174 combineOcc occ NoOcc = occ
1175 combineOcc (ScrutOcc xs) (ScrutOcc ys) = ScrutOcc (plusUFM_C combineOccs xs ys)
1176 combineOcc UnkOcc (ScrutOcc ys) = ScrutOcc ys
1177 combineOcc (ScrutOcc xs) UnkOcc = ScrutOcc xs
1178 combineOcc UnkOcc UnkOcc = UnkOcc
1179
1180 combineOccs :: [ArgOcc] -> [ArgOcc] -> [ArgOcc]
1181 combineOccs xs ys = zipWithEqual "combineOccs" combineOcc xs ys
1182
1183 setScrutOcc :: ScEnv -> ScUsage -> OutExpr -> ArgOcc -> ScUsage
1184 -- _Overwrite_ the occurrence info for the scrutinee, if the scrutinee
1185 -- is a variable, and an interesting variable
1186 setScrutOcc env usg (Cast e _) occ = setScrutOcc env usg e occ
1187 setScrutOcc env usg (Tick _ e) occ = setScrutOcc env usg e occ
1188 setScrutOcc env usg (Var v) occ
1189 | Just RecArg <- lookupHowBound env v = usg { scu_occs = extendVarEnv (scu_occs usg) v occ }
1190 | otherwise = usg
1191 setScrutOcc _env usg _other _occ -- Catch-all
1192 = usg
1193
1194 {-
1195 ************************************************************************
1196 * *
1197 \subsection{The main recursive function}
1198 * *
1199 ************************************************************************
1200
1201 The main recursive function gathers up usage information, and
1202 creates specialised versions of functions.
1203 -}
1204
1205 scExpr, scExpr' :: ScEnv -> CoreExpr -> UniqSM (ScUsage, CoreExpr)
1206 -- The unique supply is needed when we invent
1207 -- a new name for the specialised function and its args
1208
1209 scExpr env e = scExpr' env e
1210
1211 scExpr' env (Var v) = case scSubstId env v of
1212 Var v' -> return (mkVarUsage env v' [], Var v')
1213 e' -> scExpr (zapScSubst env) e'
1214
1215 scExpr' env (Type t) = return (nullUsage, Type (scSubstTy env t))
1216 scExpr' env (Coercion c) = return (nullUsage, Coercion (scSubstCo env c))
1217 scExpr' _ e@(Lit {}) = return (nullUsage, e)
1218 scExpr' env (Tick t e) = do (usg, e') <- scExpr env e
1219 return (usg, Tick t e')
1220 scExpr' env (Cast e co) = do (usg, e') <- scExpr env e
1221 return (usg, mkCast e' (scSubstCo env co))
1222 -- Important to use mkCast here
1223 -- See Note [SpecConstr call patterns]
1224 scExpr' env e@(App _ _) = scApp env (collectArgs e)
1225 scExpr' env (Lam b e) = do let (env', b') = extendBndr env b
1226 (usg, e') <- scExpr env' e
1227 return (usg, Lam b' e')
1228
1229 scExpr' env (Case scrut b ty alts)
1230 = do { (scrut_usg, scrut') <- scExpr env scrut
1231 ; case isValue (sc_vals env) scrut' of
1232 Just (ConVal con args) -> sc_con_app con args scrut'
1233 _other -> sc_vanilla scrut_usg scrut'
1234 }
1235 where
1236 sc_con_app con args scrut' -- Known constructor; simplify
1237 = do { let (_, bs, rhs) = findAlt con alts
1238 `orElse` (DEFAULT, [], mkImpossibleExpr ty)
1239 alt_env' = extendScSubstList env ((b,scrut') : bs `zip` trimConArgs con args)
1240 ; scExpr alt_env' rhs }
1241
1242 sc_vanilla scrut_usg scrut' -- Normal case
1243 = do { let (alt_env,b') = extendBndrWith RecArg env b
1244 -- Record RecArg for the components
1245
1246 ; (alt_usgs, alt_occs, alts')
1247 <- mapAndUnzip3M (sc_alt alt_env scrut' b') alts
1248
1249 ; let scrut_occ = foldr combineOcc NoOcc alt_occs
1250 scrut_usg' = setScrutOcc env scrut_usg scrut' scrut_occ
1251 -- The combined usage of the scrutinee is given
1252 -- by scrut_occ, which is passed to scScrut, which
1253 -- in turn treats a bare-variable scrutinee specially
1254
1255 ; return (foldr combineUsage scrut_usg' alt_usgs,
1256 Case scrut' b' (scSubstTy env ty) alts') }
1257
1258 sc_alt env scrut' b' (con,bs,rhs)
1259 = do { let (env1, bs1) = extendBndrsWith RecArg env bs
1260 (env2, bs2) = extendCaseBndrs env1 scrut' b' con bs1
1261 ; (usg, rhs') <- scExpr env2 rhs
1262 ; let (usg', b_occ:arg_occs) = lookupOccs usg (b':bs2)
1263 scrut_occ = case con of
1264 DataAlt dc -> ScrutOcc (unitUFM dc arg_occs)
1265 _ -> ScrutOcc emptyUFM
1266 ; return (usg', b_occ `combineOcc` scrut_occ, (con, bs2, rhs')) }
1267
1268 scExpr' env (Let (NonRec bndr rhs) body)
1269 | isTyVar bndr -- Type-lets may be created by doBeta
1270 = scExpr' (extendScSubst env bndr rhs) body
1271
1272 | otherwise
1273 = do { let (body_env, bndr') = extendBndr env bndr
1274 ; rhs_info <- scRecRhs env (bndr',rhs)
1275
1276 ; let body_env2 = extendHowBound body_env [bndr'] RecFun
1277 -- Note [Local let bindings]
1278 rhs' = ri_new_rhs rhs_info
1279 body_env3 = extendValEnv body_env2 bndr' (isValue (sc_vals env) rhs')
1280
1281 ; (body_usg, body') <- scExpr body_env3 body
1282
1283 -- NB: For non-recursive bindings we inherit sc_force flag from
1284 -- the parent function (see Note [Forcing specialisation])
1285 ; (spec_usg, specs) <- specNonRec env body_usg rhs_info
1286
1287 ; return (body_usg { scu_calls = scu_calls body_usg `delVarEnv` bndr' }
1288 `combineUsage` spec_usg, -- Note [spec_usg includes rhs_usg]
1289 mkLets [NonRec b r | (b,r) <- ruleInfoBinds rhs_info specs] body')
1290 }
1291
1292
1293 -- A *local* recursive group: see Note [Local recursive groups]
1294 scExpr' env (Let (Rec prs) body)
1295 = do { let (bndrs,rhss) = unzip prs
1296 (rhs_env1,bndrs') = extendRecBndrs env bndrs
1297 rhs_env2 = extendHowBound rhs_env1 bndrs' RecFun
1298 force_spec = any (forceSpecBndr env) bndrs'
1299 -- Note [Forcing specialisation]
1300
1301 ; rhs_infos <- mapM (scRecRhs rhs_env2) (bndrs' `zip` rhss)
1302 ; (body_usg, body') <- scExpr rhs_env2 body
1303
1304 -- NB: start specLoop from body_usg
1305 ; (spec_usg, specs) <- specRec NotTopLevel (scForce rhs_env2 force_spec)
1306 body_usg rhs_infos
1307 -- Do not unconditionally generate specialisations from rhs_usgs
1308 -- Instead use them only if we find an unspecialised call
1309 -- See Note [Local recursive groups]
1310
1311 ; let all_usg = spec_usg `combineUsage` body_usg -- Note [spec_usg includes rhs_usg]
1312 bind' = Rec (concat (zipWith ruleInfoBinds rhs_infos specs))
1313
1314 ; return (all_usg { scu_calls = scu_calls all_usg `delVarEnvList` bndrs' },
1315 Let bind' body') }
1316
1317 {-
1318 Note [Local let bindings]
1319 ~~~~~~~~~~~~~~~~~~~~~~~~~
1320 It is not uncommon to find this
1321
1322 let $j = \x. <blah> in ...$j True...$j True...
1323
1324 Here $j is an arbitrary let-bound function, but it often comes up for
1325 join points. We might like to specialise $j for its call patterns.
1326 Notice the difference from a letrec, where we look for call patterns
1327 in the *RHS* of the function. Here we look for call patterns in the
1328 *body* of the let.
1329
1330 At one point I predicated this on the RHS mentioning the outer
1331 recursive function, but that's not essential and might even be
1332 harmful. I'm not sure.
1333 -}
1334
1335 scApp :: ScEnv -> (InExpr, [InExpr]) -> UniqSM (ScUsage, CoreExpr)
1336
1337 scApp env (Var fn, args) -- Function is a variable
1338 = ASSERT( not (null args) )
1339 do { args_w_usgs <- mapM (scExpr env) args
1340 ; let (arg_usgs, args') = unzip args_w_usgs
1341 arg_usg = combineUsages arg_usgs
1342 ; case scSubstId env fn of
1343 fn'@(Lam {}) -> scExpr (zapScSubst env) (doBeta fn' args')
1344 -- Do beta-reduction and try again
1345
1346 Var fn' -> return (arg_usg `combineUsage` mkVarUsage env fn' args',
1347 mkApps (Var fn') args')
1348
1349 other_fn' -> return (arg_usg, mkApps other_fn' args') }
1350 -- NB: doing this ignores any usage info from the substituted
1351 -- function, but I don't think that matters. If it does
1352 -- we can fix it.
1353 where
1354 doBeta :: OutExpr -> [OutExpr] -> OutExpr
1355 -- ToDo: adjust for System IF
1356 doBeta (Lam bndr body) (arg : args) = Let (NonRec bndr arg) (doBeta body args)
1357 doBeta fn args = mkApps fn args
1358
1359 -- The function is almost always a variable, but not always.
1360 -- In particular, if this pass follows float-in,
1361 -- which it may, we can get
1362 -- (let f = ...f... in f) arg1 arg2
1363 scApp env (other_fn, args)
1364 = do { (fn_usg, fn') <- scExpr env other_fn
1365 ; (arg_usgs, args') <- mapAndUnzipM (scExpr env) args
1366 ; return (combineUsages arg_usgs `combineUsage` fn_usg, mkApps fn' args') }
1367
1368 ----------------------
1369 mkVarUsage :: ScEnv -> Id -> [CoreExpr] -> ScUsage
1370 mkVarUsage env fn args
1371 = case lookupHowBound env fn of
1372 Just RecFun -> SCU { scu_calls = unitVarEnv fn [Call fn args (sc_vals env)]
1373 , scu_occs = emptyVarEnv }
1374 Just RecArg -> SCU { scu_calls = emptyVarEnv
1375 , scu_occs = unitVarEnv fn arg_occ }
1376 Nothing -> nullUsage
1377 where
1378 -- I rather think we could use UnkOcc all the time
1379 arg_occ | null args = UnkOcc
1380 | otherwise = evalScrutOcc
1381
1382 ----------------------
1383 scTopBindEnv :: ScEnv -> CoreBind -> UniqSM (ScEnv, CoreBind)
1384 scTopBindEnv env (Rec prs)
1385 = do { let (rhs_env1,bndrs') = extendRecBndrs env bndrs
1386 rhs_env2 = extendHowBound rhs_env1 bndrs RecFun
1387
1388 prs' = zip bndrs' rhss
1389 ; return (rhs_env2, Rec prs') }
1390 where
1391 (bndrs,rhss) = unzip prs
1392
1393 scTopBindEnv env (NonRec bndr rhs)
1394 = do { let (env1, bndr') = extendBndr env bndr
1395 env2 = extendValEnv env1 bndr' (isValue (sc_vals env) rhs)
1396 ; return (env2, NonRec bndr' rhs) }
1397
1398 ----------------------
1399 scTopBind :: ScEnv -> ScUsage -> CoreBind -> UniqSM (ScUsage, CoreBind)
1400
1401 {-
1402 scTopBind _ usage _
1403 | pprTrace "scTopBind_usage" (ppr (scu_calls usage)) False
1404 = error "false"
1405 -}
1406
1407 scTopBind env body_usage (Rec prs)
1408 | Just threshold <- sc_size env
1409 , not force_spec
1410 , not (all (couldBeSmallEnoughToInline (sc_dflags env) threshold) rhss)
1411 -- No specialisation
1412 = -- pprTrace "scTopBind: nospec" (ppr bndrs) $
1413 do { (rhs_usgs, rhss') <- mapAndUnzipM (scExpr env) rhss
1414 ; return (body_usage `combineUsage` combineUsages rhs_usgs, Rec (bndrs `zip` rhss')) }
1415
1416 | otherwise -- Do specialisation
1417 = do { rhs_infos <- mapM (scRecRhs env) prs
1418
1419 ; (spec_usage, specs) <- specRec TopLevel (scForce env force_spec)
1420 body_usage rhs_infos
1421
1422 ; return (body_usage `combineUsage` spec_usage,
1423 Rec (concat (zipWith ruleInfoBinds rhs_infos specs))) }
1424 where
1425 (bndrs,rhss) = unzip prs
1426 force_spec = any (forceSpecBndr env) bndrs
1427 -- Note [Forcing specialisation]
1428
1429 scTopBind env usage (NonRec bndr rhs) -- Oddly, we don't seem to specialise top-level non-rec functions
1430 = do { (rhs_usg', rhs') <- scExpr env rhs
1431 ; return (usage `combineUsage` rhs_usg', NonRec bndr rhs') }
1432
1433 ----------------------
1434 scRecRhs :: ScEnv -> (OutId, InExpr) -> UniqSM RhsInfo
1435 scRecRhs env (bndr,rhs)
1436 = do { let (arg_bndrs,body) = collectBinders rhs
1437 (body_env, arg_bndrs') = extendBndrsWith RecArg env arg_bndrs
1438 ; (body_usg, body') <- scExpr body_env body
1439 ; let (rhs_usg, arg_occs) = lookupOccs body_usg arg_bndrs'
1440 ; return (RI { ri_rhs_usg = rhs_usg
1441 , ri_fn = bndr, ri_new_rhs = mkLams arg_bndrs' body'
1442 , ri_lam_bndrs = arg_bndrs, ri_lam_body = body
1443 , ri_arg_occs = arg_occs }) }
1444 -- The arg_occs says how the visible,
1445 -- lambda-bound binders of the RHS are used
1446 -- (including the TyVar binders)
1447 -- Two pats are the same if they match both ways
1448
1449 ----------------------
1450 ruleInfoBinds :: RhsInfo -> SpecInfo -> [(Id,CoreExpr)]
1451 ruleInfoBinds (RI { ri_fn = fn, ri_new_rhs = new_rhs })
1452 (SI { si_specs = specs })
1453 = [(id,rhs) | OS { os_id = id, os_rhs = rhs } <- specs] ++
1454 -- First the specialised bindings
1455
1456 [(fn `addIdSpecialisations` rules, new_rhs)]
1457 -- And now the original binding
1458 where
1459 rules = [r | OS { os_rule = r } <- specs]
1460
1461 {-
1462 ************************************************************************
1463 * *
1464 The specialiser itself
1465 * *
1466 ************************************************************************
1467 -}
1468
1469 data RhsInfo
1470 = RI { ri_fn :: OutId -- The binder
1471 , ri_new_rhs :: OutExpr -- The specialised RHS (in current envt)
1472 , ri_rhs_usg :: ScUsage -- Usage info from specialising RHS
1473
1474 , ri_lam_bndrs :: [InVar] -- The *original* RHS (\xs.body)
1475 , ri_lam_body :: InExpr -- Note [Specialise original body]
1476 , ri_arg_occs :: [ArgOcc] -- Info on how the xs occur in body
1477 }
1478
1479 data SpecInfo -- Info about specialisations for a particular Id
1480 = SI { si_specs :: [OneSpec] -- The specialisations we have generated
1481
1482 , si_n_specs :: Int -- Length of si_specs; used for numbering them
1483
1484 , si_mb_unspec :: Maybe ScUsage -- Just cs => we have not yet used calls in the
1485 } -- from calls in the *original* RHS as
1486 -- seeds for new specialisations;
1487 -- if you decide to do so, here is the
1488 -- RHS usage (which has not yet been
1489 -- unleashed)
1490 -- Nothing => we have
1491 -- See Note [Local recursive groups]
1492 -- See Note [spec_usg includes rhs_usg]
1493
1494 -- One specialisation: Rule plus definition
1495 data OneSpec =
1496 OS { os_pat :: CallPat -- Call pattern that generated this specialisation
1497 , os_rule :: CoreRule -- Rule connecting original id with the specialisation
1498 , os_id :: OutId -- Spec id
1499 , os_rhs :: OutExpr } -- Spec rhs
1500
1501 noSpecInfo :: SpecInfo
1502 noSpecInfo = SI { si_specs = [], si_n_specs = 0, si_mb_unspec = Nothing }
1503
1504 ----------------------
1505 specNonRec :: ScEnv
1506 -> ScUsage -- Body usage
1507 -> RhsInfo -- Structure info usage info for un-specialised RHS
1508 -> UniqSM (ScUsage, SpecInfo) -- Usage from RHSs (specialised and not)
1509 -- plus details of specialisations
1510
1511 specNonRec env body_usg rhs_info
1512 = specialise env (scu_calls body_usg) rhs_info
1513 (noSpecInfo { si_mb_unspec = Just (ri_rhs_usg rhs_info) })
1514
1515 ----------------------
1516 specRec :: TopLevelFlag -> ScEnv
1517 -> ScUsage -- Body usage
1518 -> [RhsInfo] -- Structure info and usage info for un-specialised RHSs
1519 -> UniqSM (ScUsage, [SpecInfo]) -- Usage from all RHSs (specialised and not)
1520 -- plus details of specialisations
1521
1522 specRec top_lvl env body_usg rhs_infos
1523 = go 1 seed_calls nullUsage init_spec_infos
1524 where
1525 (seed_calls, init_spec_infos) -- Note [Seeding top-level recursive groups]
1526 | isTopLevel top_lvl
1527 , any (isExportedId . ri_fn) rhs_infos -- Seed from body and RHSs
1528 = (all_calls, [noSpecInfo | _ <- rhs_infos])
1529 | otherwise -- Seed from body only
1530 = (calls_in_body, [noSpecInfo { si_mb_unspec = Just (ri_rhs_usg ri) }
1531 | ri <- rhs_infos])
1532
1533 calls_in_body = scu_calls body_usg
1534 calls_in_rhss = foldr (combineCalls . scu_calls . ri_rhs_usg) emptyVarEnv rhs_infos
1535 all_calls = calls_in_rhss `combineCalls` calls_in_body
1536
1537 -- Loop, specialising, until you get no new specialisations
1538 go :: Int -- Which iteration of the "until no new specialisations"
1539 -- loop we are on; first iteration is 1
1540 -> CallEnv -- Seed calls
1541 -- Two accumulating parameters:
1542 -> ScUsage -- Usage from earlier specialisations
1543 -> [SpecInfo] -- Details of specialisations so far
1544 -> UniqSM (ScUsage, [SpecInfo])
1545 go n_iter seed_calls usg_so_far spec_infos
1546 | isEmptyVarEnv seed_calls
1547 = -- pprTrace "specRec1" (vcat [ ppr (map ri_fn rhs_infos)
1548 -- , ppr seed_calls
1549 -- , ppr body_usg ]) $
1550 return (usg_so_far, spec_infos)
1551
1552 -- Limit recursive specialisation
1553 -- See Note [Limit recursive specialisation]
1554 | n_iter > sc_recursive env -- Too many iterations of the 'go' loop
1555 , sc_force env || isNothing (sc_count env)
1556 -- If both of these are false, the sc_count
1557 -- threshold will prevent non-termination
1558 , any ((> the_limit) . si_n_specs) spec_infos
1559 = -- pprTrace "specRec2" (ppr (map (map os_pat . si_specs) spec_infos)) $
1560 return (usg_so_far, spec_infos)
1561
1562 | otherwise
1563 = -- pprTrace "specRec3" (vcat [ text "bndrs" <+> ppr (map ri_fn rhs_infos)
1564 -- , text "iteration" <+> int n_iter
1565 -- , text "spec_infos" <+> ppr (map (map os_pat . si_specs) spec_infos)
1566 -- ]) $
1567 do { specs_w_usg <- zipWithM (specialise env seed_calls) rhs_infos spec_infos
1568 ; let (extra_usg_s, new_spec_infos) = unzip specs_w_usg
1569 extra_usg = combineUsages extra_usg_s
1570 all_usg = usg_so_far `combineUsage` extra_usg
1571 ; go (n_iter + 1) (scu_calls extra_usg) all_usg new_spec_infos }
1572
1573 -- See Note [Limit recursive specialisation]
1574 the_limit = case sc_count env of
1575 Nothing -> 10 -- Ugh!
1576 Just max -> max
1577
1578
1579 ----------------------
1580 specialise
1581 :: ScEnv
1582 -> CallEnv -- Info on newly-discovered calls to this function
1583 -> RhsInfo
1584 -> SpecInfo -- Original RHS plus patterns dealt with
1585 -> UniqSM (ScUsage, SpecInfo) -- New specialised versions and their usage
1586
1587 -- See Note [spec_usg includes rhs_usg]
1588
1589 -- Note: this only generates *specialised* bindings
1590 -- The original binding is added by ruleInfoBinds
1591 --
1592 -- Note: the rhs here is the optimised version of the original rhs
1593 -- So when we make a specialised copy of the RHS, we're starting
1594 -- from an RHS whose nested functions have been optimised already.
1595
1596 specialise env bind_calls (RI { ri_fn = fn, ri_lam_bndrs = arg_bndrs
1597 , ri_lam_body = body, ri_arg_occs = arg_occs })
1598 spec_info@(SI { si_specs = specs, si_n_specs = spec_count
1599 , si_mb_unspec = mb_unspec })
1600 | isBottomingId fn -- Note [Do not specialise diverging functions]
1601 -- and do not generate specialisation seeds from its RHS
1602 = -- pprTrace "specialise bot" (ppr fn) $
1603 return (nullUsage, spec_info)
1604
1605 | isNeverActive (idInlineActivation fn) -- See Note [Transfer activation]
1606 || null arg_bndrs -- Only specialise functions
1607 = -- pprTrace "specialise inactive" (ppr fn) $
1608 case mb_unspec of -- Behave as if there was a single, boring call
1609 Just rhs_usg -> return (rhs_usg, spec_info { si_mb_unspec = Nothing })
1610 -- See Note [spec_usg includes rhs_usg]
1611 Nothing -> return (nullUsage, spec_info)
1612
1613 | Just all_calls <- lookupVarEnv bind_calls fn
1614 = -- pprTrace "specialise entry {" (ppr fn <+> ppr all_calls) $
1615 do { (boring_call, new_pats) <- callsToNewPats env fn spec_info arg_occs all_calls
1616
1617 ; let n_pats = length new_pats
1618 -- ; if (not (null new_pats) || isJust mb_unspec) then
1619 -- pprTrace "specialise" (vcat [ ppr fn <+> text "with" <+> int n_pats <+> text "good patterns"
1620 -- , text "mb_unspec" <+> ppr (isJust mb_unspec)
1621 -- , text "arg_occs" <+> ppr arg_occs
1622 -- , text "good pats" <+> ppr new_pats]) $
1623 -- return ()
1624 -- else return ()
1625
1626 ; let spec_env = decreaseSpecCount env n_pats
1627 ; (spec_usgs, new_specs) <- mapAndUnzipM (spec_one spec_env fn arg_bndrs body)
1628 (new_pats `zip` [spec_count..])
1629 -- See Note [Specialise original body]
1630
1631 ; let spec_usg = combineUsages spec_usgs
1632
1633 -- If there were any boring calls among the seeds (= all_calls), then those
1634 -- calls will call the un-specialised function. So we should use the seeds
1635 -- from the _unspecialised_ function's RHS, which are in mb_unspec, by returning
1636 -- then in new_usg.
1637 (new_usg, mb_unspec')
1638 = case mb_unspec of
1639 Just rhs_usg | boring_call -> (spec_usg `combineUsage` rhs_usg, Nothing)
1640 _ -> (spec_usg, mb_unspec)
1641
1642 -- ; pprTrace "specialise return }"
1643 -- (vcat [ ppr fn
1644 -- , text "boring_call:" <+> ppr boring_call
1645 -- , text "new calls:" <+> ppr (scu_calls new_usg)]) $
1646 -- return ()
1647
1648 ; return (new_usg, SI { si_specs = new_specs ++ specs
1649 , si_n_specs = spec_count + n_pats
1650 , si_mb_unspec = mb_unspec' }) }
1651
1652 | otherwise -- No new seeds, so return nullUsage
1653 = return (nullUsage, spec_info)
1654
1655
1656
1657
1658 ---------------------
1659 spec_one :: ScEnv
1660 -> OutId -- Function
1661 -> [InVar] -- Lambda-binders of RHS; should match patterns
1662 -> InExpr -- Body of the original function
1663 -> (CallPat, Int)
1664 -> UniqSM (ScUsage, OneSpec) -- Rule and binding
1665
1666 -- spec_one creates a specialised copy of the function, together
1667 -- with a rule for using it. I'm very proud of how short this
1668 -- function is, considering what it does :-).
1669
1670 {-
1671 Example
1672
1673 In-scope: a, x::a
1674 f = /\b \y::[(a,b)] -> ....f (b,c) ((:) (a,(b,c)) (x,v) (h w))...
1675 [c::*, v::(b,c) are presumably bound by the (...) part]
1676 ==>
1677 f_spec = /\ b c \ v::(b,c) hw::[(a,(b,c))] ->
1678 (...entire body of f...) [b -> (b,c),
1679 y -> ((:) (a,(b,c)) (x,v) hw)]
1680
1681 RULE: forall b::* c::*, -- Note, *not* forall a, x
1682 v::(b,c),
1683 hw::[(a,(b,c))] .
1684
1685 f (b,c) ((:) (a,(b,c)) (x,v) hw) = f_spec b c v hw
1686 -}
1687
1688 spec_one env fn arg_bndrs body (call_pat@(qvars, pats), rule_number)
1689 = do { spec_uniq <- getUniqueM
1690 ; let spec_env = extendScSubstList (extendScInScope env qvars)
1691 (arg_bndrs `zip` pats)
1692 fn_name = idName fn
1693 fn_loc = nameSrcSpan fn_name
1694 fn_occ = nameOccName fn_name
1695 spec_occ = mkSpecOcc fn_occ
1696 -- We use fn_occ rather than fn in the rule_name string
1697 -- as we don't want the uniq to end up in the rule, and
1698 -- hence in the ABI, as that can cause spurious ABI
1699 -- changes (#4012).
1700 rule_name = mkFastString ("SC:" ++ occNameString fn_occ ++ show rule_number)
1701 spec_name = mkInternalName spec_uniq spec_occ fn_loc
1702 -- ; pprTrace "{spec_one" (ppr (sc_count env) <+> ppr fn
1703 -- <+> ppr pats <+> text "-->" <+> ppr spec_name) $
1704 -- return ()
1705
1706 -- Specialise the body
1707 ; (spec_usg, spec_body) <- scExpr spec_env body
1708
1709 -- ; pprTrace "done spec_one}" (ppr fn) $
1710 -- return ()
1711
1712 -- And build the results
1713 ; let (spec_lam_args, spec_call_args) = mkWorkerArgs (sc_dflags env)
1714 qvars body_ty
1715 -- Usual w/w hack to avoid generating
1716 -- a spec_rhs of unlifted type and no args
1717
1718 spec_lam_args_str = handOutStrictnessInformation (fst (splitStrictSig spec_str)) spec_lam_args
1719 -- Annotate the variables with the strictness information from
1720 -- the function (see Note [Strictness information in worker binders])
1721
1722 spec_join_arity | isJoinId fn = Just (length spec_lam_args)
1723 | otherwise = Nothing
1724 spec_id = mkLocalIdOrCoVar spec_name
1725 (mkLamTypes spec_lam_args body_ty)
1726 -- See Note [Transfer strictness]
1727 `setIdStrictness` spec_str
1728 `setIdArity` count isId spec_lam_args
1729 `asJoinId_maybe` spec_join_arity
1730 spec_str = calcSpecStrictness fn spec_lam_args pats
1731
1732
1733 -- Conditionally use result of new worker-wrapper transform
1734 spec_rhs = mkLams spec_lam_args_str spec_body
1735 body_ty = exprType spec_body
1736 rule_rhs = mkVarApps (Var spec_id) spec_call_args
1737 inline_act = idInlineActivation fn
1738 this_mod = sc_module spec_env
1739 rule = mkRule this_mod True {- Auto -} True {- Local -}
1740 rule_name inline_act fn_name qvars pats rule_rhs
1741 -- See Note [Transfer activation]
1742 ; return (spec_usg, OS { os_pat = call_pat, os_rule = rule
1743 , os_id = spec_id
1744 , os_rhs = spec_rhs }) }
1745
1746
1747 -- See Note [Strictness information in worker binders]
1748 handOutStrictnessInformation :: [Demand] -> [Var] -> [Var]
1749 handOutStrictnessInformation = go
1750 where
1751 go _ [] = []
1752 go [] vs = vs
1753 go (d:dmds) (v:vs) | isId v = setIdDemandInfo v d : go dmds vs
1754 go dmds (v:vs) = v : go dmds vs
1755
1756 calcSpecStrictness :: Id -- The original function
1757 -> [Var] -> [CoreExpr] -- Call pattern
1758 -> StrictSig -- Strictness of specialised thing
1759 -- See Note [Transfer strictness]
1760 calcSpecStrictness fn qvars pats
1761 = mkClosedStrictSig spec_dmds topRes
1762 where
1763 spec_dmds = [ lookupVarEnv dmd_env qv `orElse` topDmd | qv <- qvars, isId qv ]
1764 StrictSig (DmdType _ dmds _) = idStrictness fn
1765
1766 dmd_env = go emptyVarEnv dmds pats
1767
1768 go :: DmdEnv -> [Demand] -> [CoreExpr] -> DmdEnv
1769 go env ds (Type {} : pats) = go env ds pats
1770 go env ds (Coercion {} : pats) = go env ds pats
1771 go env (d:ds) (pat : pats) = go (go_one env d pat) ds pats
1772 go env _ _ = env
1773
1774 go_one :: DmdEnv -> Demand -> CoreExpr -> DmdEnv
1775 go_one env d (Var v) = extendVarEnv_C bothDmd env v d
1776 go_one env d e
1777 | Just ds <- splitProdDmd_maybe d -- NB: d does not have to be strict
1778 , (Var _, args) <- collectArgs e = go env ds args
1779 go_one env _ _ = env
1780
1781 {-
1782 Note [spec_usg includes rhs_usg]
1783 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1784 In calls to 'specialise', the returned ScUsage must include the rhs_usg in
1785 the passed-in SpecInfo, unless there are no calls at all to the function.
1786
1787 The caller can, indeed must, assume this. He should not combine in rhs_usg
1788 himself, or he'll get rhs_usg twice -- and that can lead to an exponential
1789 blowup of duplicates in the CallEnv. This is what gave rise to the massive
1790 performance loss in Trac #8852.
1791
1792 Note [Specialise original body]
1793 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1794 The RhsInfo for a binding keeps the *original* body of the binding. We
1795 must specialise that, *not* the result of applying specExpr to the RHS
1796 (which is also kept in RhsInfo). Otherwise we end up specialising a
1797 specialised RHS, and that can lead directly to exponential behaviour.
1798
1799 Note [Transfer activation]
1800 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1801 This note is for SpecConstr, but exactly the same thing
1802 happens in the overloading specialiser; see
1803 Note [Auto-specialisation and RULES] in Specialise.
1804
1805 In which phase should the specialise-constructor rules be active?
1806 Originally I made them always-active, but Manuel found that this
1807 defeated some clever user-written rules. Then I made them active only
1808 in Phase 0; after all, currently, the specConstr transformation is
1809 only run after the simplifier has reached Phase 0, but that meant
1810 that specialisations didn't fire inside wrappers; see test
1811 simplCore/should_compile/spec-inline.
1812
1813 So now I just use the inline-activation of the parent Id, as the
1814 activation for the specialisation RULE, just like the main specialiser;
1815
1816 This in turn means there is no point in specialising NOINLINE things,
1817 so we test for that.
1818
1819 Note [Transfer strictness]
1820 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1821 We must transfer strictness information from the original function to
1822 the specialised one. Suppose, for example
1823
1824 f has strictness SS
1825 and a RULE f (a:as) b = f_spec a as b
1826
1827 Now we want f_spec to have strictness LLS, otherwise we'll use call-by-need
1828 when calling f_spec instead of call-by-value. And that can result in
1829 unbounded worsening in space (cf the classic foldl vs foldl')
1830
1831 See Trac #3437 for a good example.
1832
1833 The function calcSpecStrictness performs the calculation.
1834
1835 Note [Strictness information in worker binders]
1836 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1837
1838 After having calculated the strictness annotation for the worker (see Note
1839 [Transfer strictness] above), we also want to have this information attached to
1840 the worker’s arguments, for the benefit of later passes. The function
1841 handOutStrictnessInformation decomposes the strictness annotation calculated by
1842 calcSpecStrictness and attaches them to the variables.
1843
1844 ************************************************************************
1845 * *
1846 \subsection{Argument analysis}
1847 * *
1848 ************************************************************************
1849
1850 This code deals with analysing call-site arguments to see whether
1851 they are constructor applications.
1852
1853 Note [Free type variables of the qvar types]
1854 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1855 In a call (f @a x True), that we want to specialise, what variables should
1856 we quantify over. Clearly over 'a' and 'x', but what about any type variables
1857 free in x's type? In fact we don't need to worry about them because (f @a)
1858 can only be a well-typed application if its type is compatible with x, so any
1859 variables free in x's type must be free in (f @a), and hence either be gathered
1860 via 'a' itself, or be in scope at f's defn. Hence we just take
1861 (exprsFreeVars pats).
1862
1863 BUT phantom type synonyms can mess this reasoning up,
1864 eg x::T b with type T b = Int
1865 So we apply expandTypeSynonyms to the bound Ids.
1866 See Trac # 5458. Yuk.
1867
1868 Note [SpecConstr call patterns]
1869 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1870 A "call patterns" that we collect is going to become the LHS of a RULE.
1871 It's important that it doesn't have
1872 e |> Refl
1873 or
1874 e |> g1 |> g2
1875 because both of these will be optimised by Simplify.simplRule. In the
1876 former case such optimisation benign, because the rule will match more
1877 terms; but in the latter we may lose a binding of 'g1' or 'g2', and
1878 end up with a rule LHS that doesn't bind the template variables
1879 (Trac #10602).
1880
1881 The simplifier eliminates such things, but SpecConstr itself constructs
1882 new terms by substituting. So the 'mkCast' in the Cast case of scExpr
1883 is very important!
1884
1885 Note [Choosing patterns]
1886 ~~~~~~~~~~~~~~~~~~~~~~~~
1887 If we get lots of patterns we may not want to make a specialisation
1888 for each of them (code bloat), so we choose as follows, implemented
1889 by trim_pats.
1890
1891 * The flag -fspec-constr-count-N sets the sc_count field
1892 of the ScEnv to (Just n). This limits the total number
1893 of specialisations for a given function to N.
1894
1895 * -fno-spec-constr-count sets the sc_count field to Nothing,
1896 which switches of the limit.
1897
1898 * The ghastly ForceSpecConstr trick also switches of the limit
1899 for a particular function
1900
1901 * Otherwise we sort the patterns to choose the most general
1902 ones first; more general => more widely applicable.
1903
1904 Note [SpecConstr and casts]
1905 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1906 Consider (Trac #14270) a call like
1907
1908 let f = e
1909 in ... f (K @(a |> co)) ...
1910
1911 where 'co' is a coercion variable not in scope at f's definition site.
1912 If we aren't caereful we'll get
1913
1914 let $sf a co = e (K @(a |> co))
1915 RULE "SC:f" forall a co. f (K @(a |> co)) = $sf a co
1916 f = e
1917 in ...
1918
1919 But alas, when we match the call we won't bind 'co', because type-matching
1920 (for good reasons) discards casts).
1921
1922 I don't know how to solve this, so for now I'm just discarding any
1923 call patterns that
1924 * Mentions a coercion variable
1925 * That is not in scope at the binding of the function
1926
1927 I think this is very rare.
1928
1929 Note that this /also/ discards the call pattern if we have a cast in a
1930 /term/, although in fact Rules.match does make a very flaky and
1931 fragile attempt to match coercions. e.g. a call like
1932 f (Maybe Age) (Nothing |> co) blah
1933 where co :: Maybe Int ~ Maybe Age
1934 will be discarded. It's extremely fragile to match on the form of a
1935 coercion, so I think it's better just not to try. A more complicated
1936 alternative would be to discard calls that mention coercion variables
1937 only in kind-casts, but I'm doing the simple thing for now.
1938 -}
1939
1940 type CallPat = ([Var], [CoreExpr]) -- Quantified variables and arguments
1941 -- See Note [SpecConstr call patterns]
1942
1943 callsToNewPats :: ScEnv -> Id
1944 -> SpecInfo
1945 -> [ArgOcc] -> [Call]
1946 -> UniqSM (Bool, [CallPat])
1947 -- Result has no duplicate patterns,
1948 -- nor ones mentioned in done_pats
1949 -- Bool indicates that there was at least one boring pattern
1950 callsToNewPats env fn spec_info@(SI { si_specs = done_specs }) bndr_occs calls
1951 = do { mb_pats <- mapM (callToPats env bndr_occs) calls
1952
1953 ; let have_boring_call = any isNothing mb_pats
1954
1955 good_pats :: [CallPat]
1956 good_pats = catMaybes mb_pats
1957
1958 -- Remove patterns we have already done
1959 new_pats = filterOut is_done good_pats
1960 is_done p = any (samePat p . os_pat) done_specs
1961
1962 -- Remove duplicates
1963 non_dups = nubBy samePat new_pats
1964
1965 -- Remove ones that have too many worker variables
1966 small_pats = filterOut too_big non_dups
1967 too_big (vars,_) = not (isWorkerSmallEnough (sc_dflags env) vars)
1968 -- We are about to construct w/w pair in 'spec_one'.
1969 -- Omit specialisation leading to high arity workers.
1970 -- See Note [Limit w/w arity] in WwLib
1971
1972 -- Discard specialisations if there are too many of them
1973 trimmed_pats = trim_pats env fn spec_info small_pats
1974
1975 -- ; pprTrace "callsToPats" (vcat [ text "calls to" <+> ppr fn <> colon <+> ppr calls
1976 -- , text "done_specs:" <+> ppr (map os_pat done_specs)
1977 -- , text "good_pats:" <+> ppr good_pats ]) $
1978 -- return ()
1979
1980 ; return (have_boring_call, trimmed_pats) }
1981
1982
1983 trim_pats :: ScEnv -> Id -> SpecInfo -> [CallPat] -> [CallPat]
1984 -- See Note [Choosing patterns]
1985 trim_pats env fn (SI { si_n_specs = done_spec_count }) pats
1986 | sc_force env
1987 || isNothing mb_scc
1988 || n_remaining >= n_pats
1989 = -- pprTrace "trim_pats: no-trim" (ppr (sc_force env) $$ ppr mb_scc $$ ppr n_remaining $$ ppr n_pats)
1990 pats -- No need to trim
1991
1992 | otherwise
1993 = emit_trace $ -- Need to trim, so keep the best ones
1994 take n_remaining sorted_pats
1995
1996 where
1997 n_pats = length pats
1998 spec_count' = n_pats + done_spec_count
1999 n_remaining = max_specs - done_spec_count
2000 mb_scc = sc_count env
2001 Just max_specs = mb_scc
2002
2003 sorted_pats = map fst $
2004 sortBy (comparing snd) $
2005 [(pat, pat_cons pat) | pat <- pats]
2006 -- Sort in order of increasing number of constructors
2007 -- (i.e. decreasing generality) and pick the initial
2008 -- segment of this list
2009
2010 pat_cons :: CallPat -> Int
2011 -- How many data constructors of literals are in
2012 -- the pattern. More data-cons => less general
2013 pat_cons (qs, ps) = foldr ((+) . n_cons) 0 ps
2014 where
2015 q_set = mkVarSet qs
2016 n_cons (Var v) | v `elemVarSet` q_set = 0
2017 | otherwise = 1
2018 n_cons (Cast e _) = n_cons e
2019 n_cons (App e1 e2) = n_cons e1 + n_cons e2
2020 n_cons (Lit {}) = 1
2021 n_cons _ = 0
2022
2023 emit_trace result
2024 | debugIsOn || hasPprDebug (sc_dflags env)
2025 -- Suppress this scary message for ordinary users! Trac #5125
2026 = pprTrace "SpecConstr" msg result
2027 | otherwise
2028 = result
2029 msg = vcat [ sep [ text "Function" <+> quotes (ppr fn)
2030 , nest 2 (text "has" <+>
2031 speakNOf spec_count' (text "call pattern") <> comma <+>
2032 text "but the limit is" <+> int max_specs) ]
2033 , text "Use -fspec-constr-count=n to set the bound"
2034 , text "done_spec_count =" <+> int done_spec_count
2035 , text "Keeping " <+> int n_remaining <> text ", out of" <+> int n_pats
2036 , text "Discarding:" <+> ppr (drop n_remaining sorted_pats) ]
2037
2038
2039 callToPats :: ScEnv -> [ArgOcc] -> Call -> UniqSM (Maybe CallPat)
2040 -- The [Var] is the variables to quantify over in the rule
2041 -- Type variables come first, since they may scope
2042 -- over the following term variables
2043 -- The [CoreExpr] are the argument patterns for the rule
2044 callToPats env bndr_occs call@(Call _ args con_env)
2045 | args `ltLength` bndr_occs -- Check saturated
2046 = return Nothing
2047 | otherwise
2048 = do { let in_scope = substInScope (sc_subst env)
2049 ; (interesting, pats) <- argsToPats env in_scope con_env args bndr_occs
2050 ; let pat_fvs = exprsFreeVarsList pats
2051 -- To get determinism we need the list of free variables in
2052 -- deterministic order. Otherwise we end up creating
2053 -- lambdas with different argument orders. See
2054 -- determinism/simplCore/should_compile/spec-inline-determ.hs
2055 -- for an example. For explanation of determinism
2056 -- considerations See Note [Unique Determinism] in Unique.
2057 in_scope_vars = getInScopeVars in_scope
2058 qvars = filterOut (`elemVarSet` in_scope_vars) pat_fvs
2059 -- Quantify over variables that are not in scope
2060 -- at the call site
2061 -- See Note [Free type variables of the qvar types]
2062 -- See Note [Shadowing] at the top
2063
2064 (ktvs, ids) = partition isTyVar qvars
2065 qvars' = toposortTyVars ktvs ++ map sanitise ids
2066 -- Order into kind variables, type variables, term variables
2067 -- The kind of a type variable may mention a kind variable
2068 -- and the type of a term variable may mention a type variable
2069
2070 sanitise id = id `setIdType` expandTypeSynonyms (idType id)
2071 -- See Note [Free type variables of the qvar types]
2072
2073 bad_covars = filter isCoVar ids
2074 -- See Note [SpecConstr and casts]
2075
2076 ; -- pprTrace "callToPats" (ppr args $$ ppr bndr_occs) $
2077 WARN( not (null bad_covars), text "SpecConstr: bad covars:" <+> ppr bad_covars
2078 $$ ppr call )
2079 if interesting && null bad_covars
2080 then return (Just (qvars', pats))
2081 else return Nothing }
2082
2083 -- argToPat takes an actual argument, and returns an abstracted
2084 -- version, consisting of just the "constructor skeleton" of the
2085 -- argument, with non-constructor sub-expression replaced by new
2086 -- placeholder variables. For example:
2087 -- C a (D (f x) (g y)) ==> C p1 (D p2 p3)
2088
2089 argToPat :: ScEnv
2090 -> InScopeSet -- What's in scope at the fn defn site
2091 -> ValueEnv -- ValueEnv at the call site
2092 -> CoreArg -- A call arg (or component thereof)
2093 -> ArgOcc
2094 -> UniqSM (Bool, CoreArg)
2095
2096 -- Returns (interesting, pat),
2097 -- where pat is the pattern derived from the argument
2098 -- interesting=True if the pattern is non-trivial (not a variable or type)
2099 -- E.g. x:xs --> (True, x:xs)
2100 -- f xs --> (False, w) where w is a fresh wildcard
2101 -- (f xs, 'c') --> (True, (w, 'c')) where w is a fresh wildcard
2102 -- \x. x+y --> (True, \x. x+y)
2103 -- lvl7 --> (True, lvl7) if lvl7 is bound
2104 -- somewhere further out
2105
2106 argToPat _env _in_scope _val_env arg@(Type {}) _arg_occ
2107 = return (False, arg)
2108
2109 argToPat env in_scope val_env (Tick _ arg) arg_occ
2110 = argToPat env in_scope val_env arg arg_occ
2111 -- Note [Notes in call patterns]
2112 -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2113 -- Ignore Notes. In particular, we want to ignore any InlineMe notes
2114 -- Perhaps we should not ignore profiling notes, but I'm going to
2115 -- ride roughshod over them all for now.
2116 --- See Note [Notes in RULE matching] in Rules
2117
2118 argToPat env in_scope val_env (Let _ arg) arg_occ
2119 = argToPat env in_scope val_env arg arg_occ
2120 -- See Note [Matching lets] in Rule.hs
2121 -- Look through let expressions
2122 -- e.g. f (let v = rhs in (v,w))
2123 -- Here we can specialise for f (v,w)
2124 -- because the rule-matcher will look through the let.
2125
2126 {- Disabled; see Note [Matching cases] in Rule.hs
2127 argToPat env in_scope val_env (Case scrut _ _ [(_, _, rhs)]) arg_occ
2128 | exprOkForSpeculation scrut -- See Note [Matching cases] in Rule.hhs
2129 = argToPat env in_scope val_env rhs arg_occ
2130 -}
2131
2132 argToPat env in_scope val_env (Cast arg co) arg_occ
2133 | not (ignoreType env ty2)
2134 = do { (interesting, arg') <- argToPat env in_scope val_env arg arg_occ
2135 ; if not interesting then
2136 wildCardPat ty2
2137 else do
2138 { -- Make a wild-card pattern for the coercion
2139 uniq <- getUniqueM
2140 ; let co_name = mkSysTvName uniq (fsLit "sg")
2141 co_var = mkCoVar co_name (mkCoercionType Representational ty1 ty2)
2142 ; return (interesting, Cast arg' (mkCoVarCo co_var)) } }
2143 where
2144 Pair ty1 ty2 = coercionKind co
2145
2146
2147
2148 {- Disabling lambda specialisation for now
2149 It's fragile, and the spec_loop can be infinite
2150 argToPat in_scope val_env arg arg_occ
2151 | is_value_lam arg
2152 = return (True, arg)
2153 where
2154 is_value_lam (Lam v e) -- Spot a value lambda, even if
2155 | isId v = True -- it is inside a type lambda
2156 | otherwise = is_value_lam e
2157 is_value_lam other = False
2158 -}
2159
2160 -- Check for a constructor application
2161 -- NB: this *precedes* the Var case, so that we catch nullary constrs
2162 argToPat env in_scope val_env arg arg_occ
2163 | Just (ConVal (DataAlt dc) args) <- isValue val_env arg
2164 , not (ignoreDataCon env dc) -- See Note [NoSpecConstr]
2165 , Just arg_occs <- mb_scrut dc
2166 = do { let (ty_args, rest_args) = splitAtList (dataConUnivTyVars dc) args
2167 ; (_, args') <- argsToPats env in_scope val_env rest_args arg_occs
2168 ; return (True,
2169 mkConApp dc (ty_args ++ args')) }
2170 where
2171 mb_scrut dc = case arg_occ of
2172 ScrutOcc bs | Just occs <- lookupUFM bs dc
2173 -> Just (occs) -- See Note [Reboxing]
2174 _other | sc_force env || sc_keen env
2175 -> Just (repeat UnkOcc)
2176 | otherwise
2177 -> Nothing
2178
2179 -- Check if the argument is a variable that
2180 -- (a) is used in an interesting way in the function body
2181 -- (b) we know what its value is
2182 -- In that case it counts as "interesting"
2183 argToPat env in_scope val_env (Var v) arg_occ
2184 | sc_force env || case arg_occ of { UnkOcc -> False; _other -> True }, -- (a)
2185 is_value, -- (b)
2186 -- Ignoring sc_keen here to avoid gratuitously incurring Note [Reboxing]
2187 -- So sc_keen focused just on f (I# x), where we have freshly-allocated
2188 -- box that we can eliminate in the caller
2189 not (ignoreType env (varType v))
2190 = return (True, Var v)
2191 where
2192 is_value
2193 | isLocalId v = v `elemInScopeSet` in_scope
2194 && isJust (lookupVarEnv val_env v)
2195 -- Local variables have values in val_env
2196 | otherwise = isValueUnfolding (idUnfolding v)
2197 -- Imports have unfoldings
2198
2199 -- I'm really not sure what this comment means
2200 -- And by not wild-carding we tend to get forall'd
2201 -- variables that are in scope, which in turn can
2202 -- expose the weakness in let-matching
2203 -- See Note [Matching lets] in Rules
2204
2205 -- Check for a variable bound inside the function.
2206 -- Don't make a wild-card, because we may usefully share
2207 -- e.g. f a = let x = ... in f (x,x)
2208 -- NB: this case follows the lambda and con-app cases!!
2209 -- argToPat _in_scope _val_env (Var v) _arg_occ
2210 -- = return (False, Var v)
2211 -- SLPJ : disabling this to avoid proliferation of versions
2212 -- also works badly when thinking about seeding the loop
2213 -- from the body of the let
2214 -- f x y = letrec g z = ... in g (x,y)
2215 -- We don't want to specialise for that *particular* x,y
2216
2217 -- The default case: make a wild-card
2218 -- We use this for coercions too
2219 argToPat _env _in_scope _val_env arg _arg_occ
2220 = wildCardPat (exprType arg)
2221
2222 wildCardPat :: Type -> UniqSM (Bool, CoreArg)
2223 wildCardPat ty
2224 = do { uniq <- getUniqueM
2225 ; let id = mkSysLocalOrCoVar (fsLit "sc") uniq ty
2226 ; return (False, varToCoreExpr id) }
2227
2228 argsToPats :: ScEnv -> InScopeSet -> ValueEnv
2229 -> [CoreArg] -> [ArgOcc] -- Should be same length
2230 -> UniqSM (Bool, [CoreArg])
2231 argsToPats env in_scope val_env args occs
2232 = do { stuff <- zipWithM (argToPat env in_scope val_env) args occs
2233 ; let (interesting_s, args') = unzip stuff
2234 ; return (or interesting_s, args') }
2235
2236 isValue :: ValueEnv -> CoreExpr -> Maybe Value
2237 isValue _env (Lit lit)
2238 | litIsLifted lit = Nothing
2239 | otherwise = Just (ConVal (LitAlt lit) [])
2240
2241 isValue env (Var v)
2242 | Just cval <- lookupVarEnv env v
2243 = Just cval -- You might think we could look in the idUnfolding here
2244 -- but that doesn't take account of which branch of a
2245 -- case we are in, which is the whole point
2246
2247 | not (isLocalId v) && isCheapUnfolding unf
2248 = isValue env (unfoldingTemplate unf)
2249 where
2250 unf = idUnfolding v
2251 -- However we do want to consult the unfolding
2252 -- as well, for let-bound constructors!
2253
2254 isValue env (Lam b e)
2255 | isTyVar b = case isValue env e of
2256 Just _ -> Just LambdaVal
2257 Nothing -> Nothing
2258 | otherwise = Just LambdaVal
2259
2260 isValue env (Tick t e)
2261 | not (tickishIsCode t)
2262 = isValue env e
2263
2264 isValue _env expr -- Maybe it's a constructor application
2265 | (Var fun, args, _) <- collectArgsTicks (not . tickishIsCode) expr
2266 = case isDataConWorkId_maybe fun of
2267
2268 Just con | args `lengthAtLeast` dataConRepArity con
2269 -- Check saturated; might be > because the
2270 -- arity excludes type args
2271 -> Just (ConVal (DataAlt con) args)
2272
2273 _other | valArgCount args < idArity fun
2274 -- Under-applied function
2275 -> Just LambdaVal -- Partial application
2276
2277 _other -> Nothing
2278
2279 isValue _env _expr = Nothing
2280
2281 valueIsWorkFree :: Value -> Bool
2282 valueIsWorkFree LambdaVal = True
2283 valueIsWorkFree (ConVal _ args) = all exprIsWorkFree args
2284
2285 samePat :: CallPat -> CallPat -> Bool
2286 samePat (vs1, as1) (vs2, as2)
2287 = all2 same as1 as2
2288 where
2289 same (Var v1) (Var v2)
2290 | v1 `elem` vs1 = v2 `elem` vs2
2291 | v2 `elem` vs2 = False
2292 | otherwise = v1 == v2
2293
2294 same (Lit l1) (Lit l2) = l1==l2
2295 same (App f1 a1) (App f2 a2) = same f1 f2 && same a1 a2
2296
2297 same (Type {}) (Type {}) = True -- Note [Ignore type differences]
2298 same (Coercion {}) (Coercion {}) = True
2299 same (Tick _ e1) e2 = same e1 e2 -- Ignore casts and notes
2300 same (Cast e1 _) e2 = same e1 e2
2301 same e1 (Tick _ e2) = same e1 e2
2302 same e1 (Cast e2 _) = same e1 e2
2303
2304 same e1 e2 = WARN( bad e1 || bad e2, ppr e1 $$ ppr e2)
2305 False -- Let, lambda, case should not occur
2306 bad (Case {}) = True
2307 bad (Let {}) = True
2308 bad (Lam {}) = True
2309 bad _other = False
2310
2311 {-
2312 Note [Ignore type differences]
2313 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2314 We do not want to generate specialisations where the call patterns
2315 differ only in their type arguments! Not only is it utterly useless,
2316 but it also means that (with polymorphic recursion) we can generate
2317 an infinite number of specialisations. Example is Data.Sequence.adjustTree,
2318 I think.
2319 -}