Let the specialiser work on dicts under lambdas
[ghc.git] / compiler / specialise / Specialise.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5 -}
6
7 {-# LANGUAGE CPP #-}
8 {-# LANGUAGE ViewPatterns #-}
9 module Specialise ( specProgram, specUnfolding ) where
10
11 #include "HsVersions.h"
12
13 import GhcPrelude
14
15 import Id
16 import TcType hiding( substTy )
17 import Type hiding( substTy, extendTvSubstList )
18 import Module( Module, HasModule(..) )
19 import Coercion( Coercion )
20 import CoreMonad
21 import qualified CoreSubst
22 import CoreUnfold
23 import Var ( isLocalVar )
24 import VarSet
25 import VarEnv
26 import CoreSyn
27 import Rules
28 import CoreOpt ( collectBindersPushingCo )
29 import CoreUtils ( exprIsTrivial, mkCast, exprType )
30 import CoreFVs
31 import CoreArity ( etaExpandToJoinPointRule )
32 import UniqSupply
33 import Name
34 import MkId ( voidArgId, voidPrimId )
35 import Maybes ( mapMaybe, isJust )
36 import MonadUtils ( foldlM )
37 import BasicTypes
38 import HscTypes
39 import Bag
40 import DynFlags
41 import Util
42 import Outputable
43 import FastString
44 import State
45 import UniqDFM
46 import TyCoRep (TyCoBinder (..))
47
48 import Control.Monad
49 import qualified Control.Monad.Fail as MonadFail
50
51 {-
52 ************************************************************************
53 * *
54 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
55 * *
56 ************************************************************************
57
58 These notes describe how we implement specialisation to eliminate
59 overloading.
60
61 The specialisation pass works on Core
62 syntax, complete with all the explicit dictionary application,
63 abstraction and construction as added by the type checker. The
64 existing type checker remains largely as it is.
65
66 One important thought: the {\em types} passed to an overloaded
67 function, and the {\em dictionaries} passed are mutually redundant.
68 If the same function is applied to the same type(s) then it is sure to
69 be applied to the same dictionary(s)---or rather to the same {\em
70 values}. (The arguments might look different but they will evaluate
71 to the same value.)
72
73 Second important thought: we know that we can make progress by
74 treating dictionary arguments as static and worth specialising on. So
75 we can do without binding-time analysis, and instead specialise on
76 dictionary arguments and no others.
77
78 The basic idea
79 ~~~~~~~~~~~~~~
80 Suppose we have
81
82 let f = <f_rhs>
83 in <body>
84
85 and suppose f is overloaded.
86
87 STEP 1: CALL-INSTANCE COLLECTION
88
89 We traverse <body>, accumulating all applications of f to types and
90 dictionaries.
91
92 (Might there be partial applications, to just some of its types and
93 dictionaries? In principle yes, but in practice the type checker only
94 builds applications of f to all its types and dictionaries, so partial
95 applications could only arise as a result of transformation, and even
96 then I think it's unlikely. In any case, we simply don't accumulate such
97 partial applications.)
98
99
100 STEP 2: EQUIVALENCES
101
102 So now we have a collection of calls to f:
103 f t1 t2 d1 d2
104 f t3 t4 d3 d4
105 ...
106 Notice that f may take several type arguments. To avoid ambiguity, we
107 say that f is called at type t1/t2 and t3/t4.
108
109 We take equivalence classes using equality of the *types* (ignoring
110 the dictionary args, which as mentioned previously are redundant).
111
112 STEP 3: SPECIALISATION
113
114 For each equivalence class, choose a representative (f t1 t2 d1 d2),
115 and create a local instance of f, defined thus:
116
117 f@t1/t2 = <f_rhs> t1 t2 d1 d2
118
119 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
120 of simplification will now result. However we don't actually *do* that
121 simplification. Rather, we leave it for the simplifier to do. If we
122 *did* do it, though, we'd get more call instances from the specialised
123 RHS. We can work out what they are by instantiating the call-instance
124 set from f's RHS with the types t1, t2.
125
126 Add this new id to f's IdInfo, to record that f has a specialised version.
127
128 Before doing any of this, check that f's IdInfo doesn't already
129 tell us about an existing instance of f at the required type/s.
130 (This might happen if specialisation was applied more than once, or
131 it might arise from user SPECIALIZE pragmas.)
132
133 Recursion
134 ~~~~~~~~~
135 Wait a minute! What if f is recursive? Then we can't just plug in
136 its right-hand side, can we?
137
138 But it's ok. The type checker *always* creates non-recursive definitions
139 for overloaded recursive functions. For example:
140
141 f x = f (x+x) -- Yes I know its silly
142
143 becomes
144
145 f a (d::Num a) = let p = +.sel a d
146 in
147 letrec fl (y::a) = fl (p y y)
148 in
149 fl
150
151 We still have recursion for non-overloaded functions which we
152 specialise, but the recursive call should get specialised to the
153 same recursive version.
154
155
156 Polymorphism 1
157 ~~~~~~~~~~~~~~
158
159 All this is crystal clear when the function is applied to *constant
160 types*; that is, types which have no type variables inside. But what if
161 it is applied to non-constant types? Suppose we find a call of f at type
162 t1/t2. There are two possibilities:
163
164 (a) The free type variables of t1, t2 are in scope at the definition point
165 of f. In this case there's no problem, we proceed just as before. A common
166 example is as follows. Here's the Haskell:
167
168 g y = let f x = x+x
169 in f y + f y
170
171 After typechecking we have
172
173 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
174 in +.sel a d (f a d y) (f a d y)
175
176 Notice that the call to f is at type type "a"; a non-constant type.
177 Both calls to f are at the same type, so we can specialise to give:
178
179 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
180 in +.sel a d (f@a y) (f@a y)
181
182
183 (b) The other case is when the type variables in the instance types
184 are *not* in scope at the definition point of f. The example we are
185 working with above is a good case. There are two instances of (+.sel a d),
186 but "a" is not in scope at the definition of +.sel. Can we do anything?
187 Yes, we can "common them up", a sort of limited common sub-expression deal.
188 This would give:
189
190 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
191 f@a (x::a) = +.sel@a x x
192 in +.sel@a (f@a y) (f@a y)
193
194 This can save work, and can't be spotted by the type checker, because
195 the two instances of +.sel weren't originally at the same type.
196
197 Further notes on (b)
198
199 * There are quite a few variations here. For example, the defn of
200 +.sel could be floated ouside the \y, to attempt to gain laziness.
201 It certainly mustn't be floated outside the \d because the d has to
202 be in scope too.
203
204 * We don't want to inline f_rhs in this case, because
205 that will duplicate code. Just commoning up the call is the point.
206
207 * Nothing gets added to +.sel's IdInfo.
208
209 * Don't bother unless the equivalence class has more than one item!
210
211 Not clear whether this is all worth it. It is of course OK to
212 simply discard call-instances when passing a big lambda.
213
214 Polymorphism 2 -- Overloading
215 ~~~~~~~~~~~~~~
216 Consider a function whose most general type is
217
218 f :: forall a b. Ord a => [a] -> b -> b
219
220 There is really no point in making a version of g at Int/Int and another
221 at Int/Bool, because it's only instantiating the type variable "a" which
222 buys us any efficiency. Since g is completely polymorphic in b there
223 ain't much point in making separate versions of g for the different
224 b types.
225
226 That suggests that we should identify which of g's type variables
227 are constrained (like "a") and which are unconstrained (like "b").
228 Then when taking equivalence classes in STEP 2, we ignore the type args
229 corresponding to unconstrained type variable. In STEP 3 we make
230 polymorphic versions. Thus:
231
232 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
233
234 We do this.
235
236
237 Dictionary floating
238 ~~~~~~~~~~~~~~~~~~~
239 Consider this
240
241 f a (d::Num a) = let g = ...
242 in
243 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
244
245 Here, g is only called at one type, but the dictionary isn't in scope at the
246 definition point for g. Usually the type checker would build a
247 definition for d1 which enclosed g, but the transformation system
248 might have moved d1's defn inward. Solution: float dictionary bindings
249 outwards along with call instances.
250
251 Consider
252
253 f x = let g p q = p==q
254 h r s = (r+s, g r s)
255 in
256 h x x
257
258
259 Before specialisation, leaving out type abstractions we have
260
261 f df x = let g :: Eq a => a -> a -> Bool
262 g dg p q = == dg p q
263 h :: Num a => a -> a -> (a, Bool)
264 h dh r s = let deq = eqFromNum dh
265 in (+ dh r s, g deq r s)
266 in
267 h df x x
268
269 After specialising h we get a specialised version of h, like this:
270
271 h' r s = let deq = eqFromNum df
272 in (+ df r s, g deq r s)
273
274 But we can't naively make an instance for g from this, because deq is not in scope
275 at the defn of g. Instead, we have to float out the (new) defn of deq
276 to widen its scope. Notice that this floating can't be done in advance -- it only
277 shows up when specialisation is done.
278
279 User SPECIALIZE pragmas
280 ~~~~~~~~~~~~~~~~~~~~~~~
281 Specialisation pragmas can be digested by the type checker, and implemented
282 by adding extra definitions along with that of f, in the same way as before
283
284 f@t1/t2 = <f_rhs> t1 t2 d1 d2
285
286 Indeed the pragmas *have* to be dealt with by the type checker, because
287 only it knows how to build the dictionaries d1 and d2! For example
288
289 g :: Ord a => [a] -> [a]
290 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
291
292 Here, the specialised version of g is an application of g's rhs to the
293 Ord dictionary for (Tree Int), which only the type checker can conjure
294 up. There might not even *be* one, if (Tree Int) is not an instance of
295 Ord! (All the other specialision has suitable dictionaries to hand
296 from actual calls.)
297
298 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
299 it is buried in a complex (as-yet-un-desugared) binding group.
300 Maybe we should say
301
302 f@t1/t2 = f* t1 t2 d1 d2
303
304 where f* is the Id f with an IdInfo which says "inline me regardless!".
305 Indeed all the specialisation could be done in this way.
306 That in turn means that the simplifier has to be prepared to inline absolutely
307 any in-scope let-bound thing.
308
309
310 Again, the pragma should permit polymorphism in unconstrained variables:
311
312 h :: Ord a => [a] -> b -> b
313 {-# SPECIALIZE h :: [Int] -> b -> b #-}
314
315 We *insist* that all overloaded type variables are specialised to ground types,
316 (and hence there can be no context inside a SPECIALIZE pragma).
317 We *permit* unconstrained type variables to be specialised to
318 - a ground type
319 - or left as a polymorphic type variable
320 but nothing in between. So
321
322 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
323
324 is *illegal*. (It can be handled, but it adds complication, and gains the
325 programmer nothing.)
326
327
328 SPECIALISING INSTANCE DECLARATIONS
329 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
330 Consider
331
332 instance Foo a => Foo [a] where
333 ...
334 {-# SPECIALIZE instance Foo [Int] #-}
335
336 The original instance decl creates a dictionary-function
337 definition:
338
339 dfun.Foo.List :: forall a. Foo a -> Foo [a]
340
341 The SPECIALIZE pragma just makes a specialised copy, just as for
342 ordinary function definitions:
343
344 dfun.Foo.List@Int :: Foo [Int]
345 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
346
347 The information about what instance of the dfun exist gets added to
348 the dfun's IdInfo in the same way as a user-defined function too.
349
350
351 Automatic instance decl specialisation?
352 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
353 Can instance decls be specialised automatically? It's tricky.
354 We could collect call-instance information for each dfun, but
355 then when we specialised their bodies we'd get new call-instances
356 for ordinary functions; and when we specialised their bodies, we might get
357 new call-instances of the dfuns, and so on. This all arises because of
358 the unrestricted mutual recursion between instance decls and value decls.
359
360 Still, there's no actual problem; it just means that we may not do all
361 the specialisation we could theoretically do.
362
363 Furthermore, instance decls are usually exported and used non-locally,
364 so we'll want to compile enough to get those specialisations done.
365
366 Lastly, there's no such thing as a local instance decl, so we can
367 survive solely by spitting out *usage* information, and then reading that
368 back in as a pragma when next compiling the file. So for now,
369 we only specialise instance decls in response to pragmas.
370
371
372 SPITTING OUT USAGE INFORMATION
373 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
374
375 To spit out usage information we need to traverse the code collecting
376 call-instance information for all imported (non-prelude?) functions
377 and data types. Then we equivalence-class it and spit it out.
378
379 This is done at the top-level when all the call instances which escape
380 must be for imported functions and data types.
381
382 *** Not currently done ***
383
384
385 Partial specialisation by pragmas
386 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
387 What about partial specialisation:
388
389 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
390 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
391
392 or even
393
394 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
395
396 Seems quite reasonable. Similar things could be done with instance decls:
397
398 instance (Foo a, Foo b) => Foo (a,b) where
399 ...
400 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
401 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
402
403 Ho hum. Things are complex enough without this. I pass.
404
405
406 Requirements for the simplifier
407 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
408 The simplifier has to be able to take advantage of the specialisation.
409
410 * When the simplifier finds an application of a polymorphic f, it looks in
411 f's IdInfo in case there is a suitable instance to call instead. This converts
412
413 f t1 t2 d1 d2 ===> f_t1_t2
414
415 Note that the dictionaries get eaten up too!
416
417 * Dictionary selection operations on constant dictionaries must be
418 short-circuited:
419
420 +.sel Int d ===> +Int
421
422 The obvious way to do this is in the same way as other specialised
423 calls: +.sel has inside it some IdInfo which tells that if it's applied
424 to the type Int then it should eat a dictionary and transform to +Int.
425
426 In short, dictionary selectors need IdInfo inside them for constant
427 methods.
428
429 * Exactly the same applies if a superclass dictionary is being
430 extracted:
431
432 Eq.sel Int d ===> dEqInt
433
434 * Something similar applies to dictionary construction too. Suppose
435 dfun.Eq.List is the function taking a dictionary for (Eq a) to
436 one for (Eq [a]). Then we want
437
438 dfun.Eq.List Int d ===> dEq.List_Int
439
440 Where does the Eq [Int] dictionary come from? It is built in
441 response to a SPECIALIZE pragma on the Eq [a] instance decl.
442
443 In short, dfun Ids need IdInfo with a specialisation for each
444 constant instance of their instance declaration.
445
446 All this uses a single mechanism: the SpecEnv inside an Id
447
448
449 What does the specialisation IdInfo look like?
450 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
451
452 The SpecEnv of an Id maps a list of types (the template) to an expression
453
454 [Type] |-> Expr
455
456 For example, if f has this RuleInfo:
457
458 [Int, a] -> \d:Ord Int. f' a
459
460 it means that we can replace the call
461
462 f Int t ===> (\d. f' t)
463
464 This chucks one dictionary away and proceeds with the
465 specialised version of f, namely f'.
466
467
468 What can't be done this way?
469 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
470 There is no way, post-typechecker, to get a dictionary for (say)
471 Eq a from a dictionary for Eq [a]. So if we find
472
473 ==.sel [t] d
474
475 we can't transform to
476
477 eqList (==.sel t d')
478
479 where
480 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
481
482 Of course, we currently have no way to automatically derive
483 eqList, nor to connect it to the Eq [a] instance decl, but you
484 can imagine that it might somehow be possible. Taking advantage
485 of this is permanently ruled out.
486
487 Still, this is no great hardship, because we intend to eliminate
488 overloading altogether anyway!
489
490 A note about non-tyvar dictionaries
491 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
492 Some Ids have types like
493
494 forall a,b,c. Eq a -> Ord [a] -> tau
495
496 This seems curious at first, because we usually only have dictionary
497 args whose types are of the form (C a) where a is a type variable.
498 But this doesn't hold for the functions arising from instance decls,
499 which sometimes get arguments with types of form (C (T a)) for some
500 type constructor T.
501
502 Should we specialise wrt this compound-type dictionary? We used to say
503 "no", saying:
504 "This is a heuristic judgement, as indeed is the fact that we
505 specialise wrt only dictionaries. We choose *not* to specialise
506 wrt compound dictionaries because at the moment the only place
507 they show up is in instance decls, where they are simply plugged
508 into a returned dictionary. So nothing is gained by specialising
509 wrt them."
510
511 But it is simpler and more uniform to specialise wrt these dicts too;
512 and in future GHC is likely to support full fledged type signatures
513 like
514 f :: Eq [(a,b)] => ...
515
516
517 ************************************************************************
518 * *
519 \subsubsection{The new specialiser}
520 * *
521 ************************************************************************
522
523 Our basic game plan is this. For let(rec) bound function
524 f :: (C a, D c) => (a,b,c,d) -> Bool
525
526 * Find any specialised calls of f, (f ts ds), where
527 ts are the type arguments t1 .. t4, and
528 ds are the dictionary arguments d1 .. d2.
529
530 * Add a new definition for f1 (say):
531
532 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
533
534 Note that we abstract over the unconstrained type arguments.
535
536 * Add the mapping
537
538 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
539
540 to the specialisations of f. This will be used by the
541 simplifier to replace calls
542 (f t1 t2 t3 t4) da db
543 by
544 (\d1 d1 -> f1 t2 t4) da db
545
546 All the stuff about how many dictionaries to discard, and what types
547 to apply the specialised function to, are handled by the fact that the
548 SpecEnv contains a template for the result of the specialisation.
549
550 We don't build *partial* specialisations for f. For example:
551
552 f :: Eq a => a -> a -> Bool
553 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
554
555 Here, little is gained by making a specialised copy of f.
556 There's a distinct danger that the specialised version would
557 first build a dictionary for (Eq b, Eq c), and then select the (==)
558 method from it! Even if it didn't, not a great deal is saved.
559
560 We do, however, generate polymorphic, but not overloaded, specialisations:
561
562 f :: Eq a => [a] -> b -> b -> b
563 ... SPECIALISE f :: [Int] -> b -> b -> b ...
564
565 Hence, the invariant is this:
566
567 *** no specialised version is overloaded ***
568
569
570 ************************************************************************
571 * *
572 \subsubsection{The exported function}
573 * *
574 ************************************************************************
575 -}
576
577 -- | Specialise calls to type-class overloaded functions occuring in a program.
578 specProgram :: ModGuts -> CoreM ModGuts
579 specProgram guts@(ModGuts { mg_module = this_mod
580 , mg_rules = local_rules
581 , mg_binds = binds })
582 = do { dflags <- getDynFlags
583
584 -- Specialise the bindings of this module
585 ; (binds', uds) <- runSpecM dflags this_mod (go binds)
586
587 -- Specialise imported functions
588 ; hpt_rules <- getRuleBase
589 ; let rule_base = extendRuleBaseList hpt_rules local_rules
590 ; (new_rules, spec_binds) <- specImports dflags this_mod top_env emptyVarSet
591 [] rule_base uds
592
593 ; let final_binds
594 | null spec_binds = binds'
595 | otherwise = Rec (flattenBinds spec_binds) : binds'
596 -- Note [Glom the bindings if imported functions are specialised]
597
598 ; return (guts { mg_binds = final_binds
599 , mg_rules = new_rules ++ local_rules }) }
600 where
601 -- We need to start with a Subst that knows all the things
602 -- that are in scope, so that the substitution engine doesn't
603 -- accidentally re-use a unique that's already in use
604 -- Easiest thing is to do it all at once, as if all the top-level
605 -- decls were mutually recursive
606 top_env = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
607 bindersOfBinds binds
608 , se_interesting = emptyVarSet }
609
610 go [] = return ([], emptyUDs)
611 go (bind:binds) = do (binds', uds) <- go binds
612 (bind', uds') <- specBind top_env bind uds
613 return (bind' ++ binds', uds')
614
615 {-
616 Note [Wrap bindings returned by specImports]
617 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
618 'specImports' returns a set of specialized bindings. However, these are lacking
619 necessary floated dictionary bindings, which are returned by
620 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
621 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
622 for instance, the 'specImports' call in 'specProgram'.
623
624
625 Note [Disabling cross-module specialisation]
626 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
627 Since GHC 7.10 we have performed specialisation of INLINABLE bindings living
628 in modules outside of the current module. This can sometimes uncover user code
629 which explodes in size when aggressively optimized. The
630 -fno-cross-module-specialise option was introduced to allow users to being
631 bitten by such instances to revert to the pre-7.10 behavior.
632
633 See #10491
634 -}
635
636 -- | An argument that we might want to specialise.
637 -- See Note [Specialising Calls] for the nitty gritty details.
638 data SpecArg
639 =
640 -- | Type arguments that should be specialised, due to appearing
641 -- free in the type of a 'SpecDict'.
642 SpecType Type
643 -- | Type arguments that should remain polymorphic.
644 | UnspecType
645 -- | Dictionaries that should be specialised.
646 | SpecDict DictExpr
647 -- | Value arguments that should not be specialised.
648 | UnspecArg
649
650 instance Outputable SpecArg where
651 ppr (SpecType t) = text "SpecType" <+> ppr t
652 ppr UnspecType = text "UnspecType"
653 ppr (SpecDict d) = text "SpecDict" <+> ppr d
654 ppr UnspecArg = text "UnspecArg"
655
656 getSpecDicts :: [SpecArg] -> [DictExpr]
657 getSpecDicts = mapMaybe go
658 where
659 go (SpecDict d) = Just d
660 go _ = Nothing
661
662 getSpecTypes :: [SpecArg] -> [Type]
663 getSpecTypes = mapMaybe go
664 where
665 go (SpecType t) = Just t
666 go _ = Nothing
667
668 isUnspecArg :: SpecArg -> Bool
669 isUnspecArg UnspecArg = True
670 isUnspecArg UnspecType = True
671 isUnspecArg _ = False
672
673 isValueArg :: SpecArg -> Bool
674 isValueArg UnspecArg = True
675 isValueArg (SpecDict _) = True
676 isValueArg _ = False
677
678 -- | Given binders from an original function 'f', and the 'SpecArg's
679 -- corresponding to its usage, compute everything necessary to build
680 -- a specialisation.
681 --
682 -- We will use a running example. Consider the function
683 --
684 -- foo :: forall a b. Eq a => Int -> blah
685 -- foo @a @b dEqA i = blah
686 --
687 -- which is called with the 'CallInfo'
688 --
689 -- [SpecType T1, UnspecType, SpecDict dEqT1, UnspecArg]
690 --
691 -- We'd eventually like to build the RULE
692 --
693 -- RULE "SPEC foo @T1 _"
694 -- forall @a @b (dEqA' :: Eq a).
695 -- foo @T1 @b dEqA' = $sfoo @b
696 --
697 -- and the specialisation '$sfoo'
698 --
699 -- $sfoo :: forall b. Int -> blah
700 -- $sfoo @b = \i -> SUBST[a->T1, dEqA->dEqA'] blah
701 --
702 -- The cases for 'specHeader' below are presented in the same order as this
703 -- running example. The result of 'specHeader' for this example is as follows:
704 --
705 -- ( -- Returned arguments
706 -- env + [a -> T1, deqA -> dEqA']
707 -- , []
708 --
709 -- -- RULE helpers
710 -- , [b, dx', i]
711 -- , [T1, b, dx', i]
712 --
713 -- -- Specialised function helpers
714 -- , [b, i]
715 -- , [dx]
716 -- , [T1, b, dx_spec, i]
717 -- )
718 specHeader
719 :: SpecEnv
720 -> [CoreBndr] -- The binders from the original function 'f'
721 -> [SpecArg] -- From the CallInfo
722 -> SpecM ( -- Returned arguments
723 SpecEnv -- Substitution to apply to the body of 'f'
724 , [CoreBndr] -- All the remaining unspecialised args from the original function 'f'
725
726 -- RULE helpers
727 , [CoreBndr] -- Binders for the RULE
728 , [CoreArg] -- Args for the LHS of the rule
729
730 -- Specialised function helpers
731 , [CoreBndr] -- Binders for $sf
732 , [DictBind] -- Auxiliary dictionary bindings
733 , [CoreExpr] -- Specialised arguments for unfolding
734 )
735
736 -- We want to specialise on type 'T1', and so we must construct a substitution
737 -- 'a->T1', as well as a LHS argument for the resulting RULE and unfolding
738 -- details.
739 specHeader env (bndr : bndrs) (SpecType t : args)
740 = do { let env' = extendTvSubstList env [(bndr, t)]
741 ; (env'', unused_bndrs, rule_bs, rule_es, bs', dx, spec_args)
742 <- specHeader env' bndrs args
743 ; pure ( env''
744 , unused_bndrs
745 , rule_bs
746 , Type t : rule_es
747 , bs'
748 , dx
749 , Type t : spec_args
750 )
751 }
752
753 -- Next we have a type that we don't want to specialise. We need to perform
754 -- a substitution on it (in case the type refers to 'a'). Additionally, we need
755 -- to produce a binder, LHS argument and RHS argument for the resulting rule,
756 -- /and/ a binder for the specialised body.
757 specHeader env (bndr : bndrs) (UnspecType : args)
758 = do { let (env', bndr') = substBndr env bndr
759 ; (env'', unused_bndrs, rule_bs, rule_es, bs', dx, spec_args)
760 <- specHeader env' bndrs args
761 ; pure ( env''
762 , unused_bndrs
763 , bndr' : rule_bs
764 , varToCoreExpr bndr' : rule_es
765 , bndr' : bs'
766 , dx
767 , varToCoreExpr bndr' : spec_args
768 )
769 }
770
771 -- Next we want to specialise the 'Eq a' dict away. We need to construct
772 -- a wildcard binder to match the dictionary (See Note [Specialising Calls] for
773 -- the nitty-gritty), as a LHS rule and unfolding details.
774 specHeader env (bndr : bndrs) (SpecDict d : args)
775 = do { inst_dict_id <- newDictBndr env bndr
776 ; let (rhs_env2, dx_binds, spec_dict_args')
777 = bindAuxiliaryDicts env [bndr] [d] [inst_dict_id]
778 ; (env', unused_bndrs, rule_bs, rule_es, bs', dx, spec_args)
779 <- specHeader rhs_env2 bndrs args
780 ; pure ( env'
781 , unused_bndrs
782 -- See Note [Evidence foralls]
783 , exprFreeIdsList (varToCoreExpr inst_dict_id) ++ rule_bs
784 , varToCoreExpr inst_dict_id : rule_es
785 , bs'
786 , dx_binds ++ dx
787 , spec_dict_args' ++ spec_args
788 )
789 }
790
791 -- Finally, we have the unspecialised argument 'i'. We need to produce
792 -- a binder, LHS and RHS argument for the RULE, and a binder for the
793 -- specialised body.
794 --
795 -- NB: Calls to 'specHeader' will trim off any trailing 'UnspecArg's, which is
796 -- why 'i' doesn't appear in our RULE above. But we have no guarantee that
797 -- there aren't 'UnspecArg's which come /before/ all of the dictionaries, so
798 -- this case must be here.
799 specHeader env (bndr : bndrs) (UnspecArg : args)
800 = do { let (env', bndr') = substBndr env bndr
801 ; (env'', unused_bndrs, rule_bs, rule_es, bs', dx, spec_args)
802 <- specHeader env' bndrs args
803 ; pure ( env''
804 , unused_bndrs
805 , bndr' : rule_bs
806 , varToCoreExpr bndr' : rule_es
807 , bndr' : bs'
808 , dx
809 , varToCoreExpr bndr' : spec_args
810 )
811 }
812
813 -- Return all remaining binders from the original function. These have the
814 -- invariant that they should all correspond to unspecialised arguments, so
815 -- it's safe to stop processing at this point.
816 specHeader env bndrs [] = pure (env, bndrs, [], [], [], [], [])
817 specHeader env [] _ = pure (env, [], [], [], [], [], [])
818
819
820 -- | Specialise a set of calls to imported bindings
821 specImports :: DynFlags
822 -> Module
823 -> SpecEnv -- Passed in so that all top-level Ids are in scope
824 -> VarSet -- Don't specialise these ones
825 -- See Note [Avoiding recursive specialisation]
826 -> [Id] -- Stack of imported functions being specialised
827 -> RuleBase -- Rules from this module and the home package
828 -- (but not external packages, which can change)
829 -> UsageDetails -- Calls for imported things, and floating bindings
830 -> CoreM ( [CoreRule] -- New rules
831 , [CoreBind] ) -- Specialised bindings
832 -- See Note [Wrapping bindings returned by specImports]
833 specImports dflags this_mod top_env done callers rule_base
834 (MkUD { ud_binds = dict_binds, ud_calls = calls })
835 -- See Note [Disabling cross-module specialisation]
836 | not $ gopt Opt_CrossModuleSpecialise dflags
837 = return ([], [])
838
839 | otherwise
840 = do { let import_calls = dVarEnvElts calls
841 ; (rules, spec_binds) <- go rule_base import_calls
842
843 -- Don't forget to wrap the specialized bindings with
844 -- bindings for the needed dictionaries.
845 -- See Note [Wrap bindings returned by specImports]
846 ; let spec_binds' = wrapDictBinds dict_binds spec_binds
847
848 ; return (rules, spec_binds') }
849 where
850 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
851 go _ [] = return ([], [])
852 go rb (cis@(CIS fn _) : other_calls)
853 = do { let ok_calls = filterCalls cis dict_binds
854 -- Drop calls that (directly or indirectly) refer to fn
855 -- See Note [Avoiding loops]
856 -- ; debugTraceMsg (text "specImport" <+> vcat [ ppr fn
857 -- , text "calls" <+> ppr cis
858 -- , text "ud_binds =" <+> ppr dict_binds
859 -- , text "dump set =" <+> ppr dump_set
860 -- , text "filtered calls =" <+> ppr ok_calls ])
861 ; (rules1, spec_binds1) <- specImport dflags this_mod top_env
862 done callers rb fn ok_calls
863
864 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
865 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
866
867 specImport :: DynFlags
868 -> Module
869 -> SpecEnv -- Passed in so that all top-level Ids are in scope
870 -> VarSet -- Don't specialise these
871 -- See Note [Avoiding recursive specialisation]
872 -> [Id] -- Stack of imported functions being specialised
873 -> RuleBase -- Rules from this module
874 -> Id -> [CallInfo] -- Imported function and calls for it
875 -> CoreM ( [CoreRule] -- New rules
876 , [CoreBind] ) -- Specialised bindings
877 specImport dflags this_mod top_env done callers rb fn calls_for_fn
878 | fn `elemVarSet` done
879 = return ([], []) -- No warning. This actually happens all the time
880 -- when specialising a recursive function, because
881 -- the RHS of the specialised function contains a recursive
882 -- call to the original function
883
884 | null calls_for_fn -- We filtered out all the calls in deleteCallsMentioning
885 = return ([], [])
886
887 | wantSpecImport dflags unfolding
888 , Just rhs <- maybeUnfoldingTemplate unfolding
889 = do { -- Get rules from the external package state
890 -- We keep doing this in case we "page-fault in"
891 -- more rules as we go along
892 ; hsc_env <- getHscEnv
893 ; eps <- liftIO $ hscEPS hsc_env
894 ; vis_orphs <- getVisibleOrphanMods
895 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
896 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
897
898 ; (rules1, spec_pairs, uds)
899 <- -- pprTrace "specImport1" (vcat [ppr fn, ppr calls_for_fn, ppr rhs]) $
900 runSpecM dflags this_mod $
901 specCalls (Just this_mod) top_env rules_for_fn calls_for_fn fn rhs
902 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
903 -- After the rules kick in we may get recursion, but
904 -- we rely on a global GlomBinds to sort that out later
905 -- See Note [Glom the bindings if imported functions are specialised]
906
907 -- Now specialise any cascaded calls
908 ; (rules2, spec_binds2) <- -- pprTrace "specImport 2" (ppr fn $$ ppr rules1 $$ ppr spec_binds1) $
909 specImports dflags this_mod top_env
910 (extendVarSet done fn)
911 (fn:callers)
912 (extendRuleBaseList rb rules1)
913 uds
914
915 ; let final_binds = spec_binds2 ++ spec_binds1
916
917 ; return (rules2 ++ rules1, final_binds) }
918
919 | otherwise = do { tryWarnMissingSpecs dflags callers fn calls_for_fn
920 ; return ([], [])}
921
922 where
923 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
924
925 -- | Returns whether or not to show a missed-spec warning.
926 -- If -Wall-missed-specializations is on, show the warning.
927 -- Otherwise, if -Wmissed-specializations is on, only show a warning
928 -- if there is at least one imported function being specialized,
929 -- and if all imported functions are marked with an inline pragma
930 -- Use the most specific warning as the reason.
931 tryWarnMissingSpecs :: DynFlags -> [Id] -> Id -> [CallInfo] -> CoreM ()
932 -- See Note [Warning about missed specialisations]
933 tryWarnMissingSpecs dflags callers fn calls_for_fn
934 | wopt Opt_WarnMissedSpecs dflags
935 && not (null callers)
936 && allCallersInlined = doWarn $ Reason Opt_WarnMissedSpecs
937 | wopt Opt_WarnAllMissedSpecs dflags = doWarn $ Reason Opt_WarnAllMissedSpecs
938 | otherwise = return ()
939 where
940 allCallersInlined = all (isAnyInlinePragma . idInlinePragma) callers
941 doWarn reason =
942 warnMsg reason
943 (vcat [ hang (text ("Could not specialise imported function") <+> quotes (ppr fn))
944 2 (vcat [ text "when specialising" <+> quotes (ppr caller)
945 | caller <- callers])
946 , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
947 , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ])
948
949 wantSpecImport :: DynFlags -> Unfolding -> Bool
950 -- See Note [Specialise imported INLINABLE things]
951 wantSpecImport dflags unf
952 = case unf of
953 NoUnfolding -> False
954 BootUnfolding -> False
955 OtherCon {} -> False
956 DFunUnfolding {} -> True
957 CoreUnfolding { uf_src = src, uf_guidance = _guidance }
958 | gopt Opt_SpecialiseAggressively dflags -> True
959 | isStableSource src -> True
960 -- Specialise even INLINE things; it hasn't inlined yet,
961 -- so perhaps it never will. Moreover it may have calls
962 -- inside it that we want to specialise
963 | otherwise -> False -- Stable, not INLINE, hence INLINABLE
964
965 {- Note [Warning about missed specialisations]
966 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
967 Suppose
968 * In module Lib, you carefully mark a function 'foo' INLINABLE
969 * Import Lib(foo) into another module M
970 * Call 'foo' at some specialised type in M
971 Then you jolly well expect it to be specialised in M. But what if
972 'foo' calls another function 'Lib.bar'. Then you'd like 'bar' to be
973 specialised too. But if 'bar' is not marked INLINABLE it may well
974 not be specialised. The warning Opt_WarnMissedSpecs warns about this.
975
976 It's more noisy to warning about a missed specialisation opportunity
977 for /every/ overloaded imported function, but sometimes useful. That
978 is what Opt_WarnAllMissedSpecs does.
979
980 ToDo: warn about missed opportunities for local functions.
981
982 Note [Specialise imported INLINABLE things]
983 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
984 What imported functions do we specialise? The basic set is
985 * DFuns and things with INLINABLE pragmas.
986 but with -fspecialise-aggressively we add
987 * Anything with an unfolding template
988
989 #8874 has a good example of why we want to auto-specialise DFuns.
990
991 We have the -fspecialise-aggressively flag (usually off), because we
992 risk lots of orphan modules from over-vigorous specialisation.
993 However it's not a big deal: anything non-recursive with an
994 unfolding-template will probably have been inlined already.
995
996 Note [Glom the bindings if imported functions are specialised]
997 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
998 Suppose we have an imported, *recursive*, INLINABLE function
999 f :: Eq a => a -> a
1000 f = /\a \d x. ...(f a d)...
1001 In the module being compiled we have
1002 g x = f (x::Int)
1003 Now we'll make a specialised function
1004 f_spec :: Int -> Int
1005 f_spec = \x -> ...(f Int dInt)...
1006 {-# RULE f Int _ = f_spec #-}
1007 g = \x. f Int dInt x
1008 Note that f_spec doesn't look recursive
1009 After rewriting with the RULE, we get
1010 f_spec = \x -> ...(f_spec)...
1011 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
1012 The occurrence analyser never turns a NonRec into a Rec. So we must
1013 make sure that f_spec is recursive. Easiest thing is to make all
1014 the specialisations for imported bindings recursive.
1015
1016
1017 Note [Avoiding recursive specialisation]
1018 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1019 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
1020 'f's RHS. So we want to specialise g,h. But we don't want to
1021 specialise f any more! It's possible that f's RHS might have a
1022 recursive yet-more-specialised call, so we'd diverge in that case.
1023 And if the call is to the same type, one specialisation is enough.
1024 Avoiding this recursive specialisation loop is the reason for the
1025 'done' VarSet passed to specImports and specImport.
1026
1027 ************************************************************************
1028 * *
1029 \subsubsection{@specExpr@: the main function}
1030 * *
1031 ************************************************************************
1032 -}
1033
1034 data SpecEnv
1035 = SE { se_subst :: CoreSubst.Subst
1036 -- We carry a substitution down:
1037 -- a) we must clone any binding that might float outwards,
1038 -- to avoid name clashes
1039 -- b) we carry a type substitution to use when analysing
1040 -- the RHS of specialised bindings (no type-let!)
1041
1042
1043 , se_interesting :: VarSet
1044 -- Dict Ids that we know something about
1045 -- and hence may be worth specialising against
1046 -- See Note [Interesting dictionary arguments]
1047 }
1048
1049 specVar :: SpecEnv -> Id -> CoreExpr
1050 specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v
1051
1052 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
1053
1054 ---------------- First the easy cases --------------------
1055 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
1056 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
1057 specExpr env (Var v) = return (specVar env v, emptyUDs)
1058 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
1059 specExpr env (Cast e co)
1060 = do { (e', uds) <- specExpr env e
1061 ; return ((mkCast e' (substCo env co)), uds) }
1062 specExpr env (Tick tickish body)
1063 = do { (body', uds) <- specExpr env body
1064 ; return (Tick (specTickish env tickish) body', uds) }
1065
1066 ---------------- Applications might generate a call instance --------------------
1067 specExpr env expr@(App {})
1068 = go expr []
1069 where
1070 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
1071 (fun', uds_app) <- go fun (arg':args)
1072 return (App fun' arg', uds_arg `plusUDs` uds_app)
1073
1074 go (Var f) args = case specVar env f of
1075 Var f' -> return (Var f', mkCallUDs env f' args)
1076 e' -> return (e', emptyUDs) -- I don't expect this!
1077 go other _ = specExpr env other
1078
1079 ---------------- Lambda/case require dumping of usage details --------------------
1080 specExpr env e@(Lam _ _) = do
1081 (body', uds) <- specExpr env' body
1082 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
1083 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
1084 where
1085 (bndrs, body) = collectBinders e
1086 (env', bndrs') = substBndrs env bndrs
1087 -- More efficient to collect a group of binders together all at once
1088 -- and we don't want to split a lambda group with dumped bindings
1089
1090 specExpr env (Case scrut case_bndr ty alts)
1091 = do { (scrut', scrut_uds) <- specExpr env scrut
1092 ; (scrut'', case_bndr', alts', alts_uds)
1093 <- specCase env scrut' case_bndr alts
1094 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
1095 , scrut_uds `plusUDs` alts_uds) }
1096
1097 ---------------- Finally, let is the interesting case --------------------
1098 specExpr env (Let bind body)
1099 = do { -- Clone binders
1100 (rhs_env, body_env, bind') <- cloneBindSM env bind
1101
1102 -- Deal with the body
1103 ; (body', body_uds) <- specExpr body_env body
1104
1105 -- Deal with the bindings
1106 ; (binds', uds) <- specBind rhs_env bind' body_uds
1107
1108 -- All done
1109 ; return (foldr Let body' binds', uds) }
1110
1111 specTickish :: SpecEnv -> Tickish Id -> Tickish Id
1112 specTickish env (Breakpoint ix ids)
1113 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
1114 -- drop vars from the list if they have a non-variable substitution.
1115 -- should never happen, but it's harmless to drop them anyway.
1116 specTickish _ other_tickish = other_tickish
1117
1118 specCase :: SpecEnv
1119 -> CoreExpr -- Scrutinee, already done
1120 -> Id -> [CoreAlt]
1121 -> SpecM ( CoreExpr -- New scrutinee
1122 , Id
1123 , [CoreAlt]
1124 , UsageDetails)
1125 specCase env scrut' case_bndr [(con, args, rhs)]
1126 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
1127 , interestingDict env scrut'
1128 , not (isDeadBinder case_bndr && null sc_args')
1129 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
1130
1131 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
1132 [(con, args', Var sc_arg')]
1133 | sc_arg' <- sc_args' ]
1134
1135 -- Extend the substitution for RHS to map the *original* binders
1136 -- to their floated versions.
1137 mb_sc_flts :: [Maybe DictId]
1138 mb_sc_flts = map (lookupVarEnv clone_env) args'
1139 clone_env = zipVarEnv sc_args' sc_args_flt
1140 subst_prs = (case_bndr, Var case_bndr_flt)
1141 : [ (arg, Var sc_flt)
1142 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
1143 env_rhs' = env_rhs { se_subst = CoreSubst.extendIdSubstList (se_subst env_rhs) subst_prs
1144 , se_interesting = se_interesting env_rhs `extendVarSetList`
1145 (case_bndr_flt : sc_args_flt) }
1146
1147 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
1148 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
1149 case_bndr_set = unitVarSet case_bndr_flt
1150 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
1151 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
1152 flt_binds = scrut_bind : sc_binds
1153 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
1154 all_uds = flt_binds `addDictBinds` free_uds
1155 alt' = (con, args', wrapDictBindsE dumped_dbs rhs')
1156 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
1157 where
1158 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
1159 sc_args' = filter is_flt_sc_arg args'
1160
1161 clone_me bndr = do { uniq <- getUniqueM
1162 ; return (mkUserLocalOrCoVar occ uniq ty loc) }
1163 where
1164 name = idName bndr
1165 ty = idType bndr
1166 occ = nameOccName name
1167 loc = getSrcSpan name
1168
1169 arg_set = mkVarSet args'
1170 is_flt_sc_arg var = isId var
1171 && not (isDeadBinder var)
1172 && isDictTy var_ty
1173 && not (tyCoVarsOfType var_ty `intersectsVarSet` arg_set)
1174 where
1175 var_ty = idType var
1176
1177
1178 specCase env scrut case_bndr alts
1179 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
1180 ; return (scrut, case_bndr', alts', uds_alts) }
1181 where
1182 (env_alt, case_bndr') = substBndr env case_bndr
1183 spec_alt (con, args, rhs) = do
1184 (rhs', uds) <- specExpr env_rhs rhs
1185 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
1186 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
1187 where
1188 (env_rhs, args') = substBndrs env_alt args
1189
1190 {-
1191 Note [Floating dictionaries out of cases]
1192 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1193 Consider
1194 g = \d. case d of { MkD sc ... -> ...(f sc)... }
1195 Naively we can't float d2's binding out of the case expression,
1196 because 'sc' is bound by the case, and that in turn means we can't
1197 specialise f, which seems a pity.
1198
1199 So we invert the case, by floating out a binding
1200 for 'sc_flt' thus:
1201 sc_flt = case d of { MkD sc ... -> sc }
1202 Now we can float the call instance for 'f'. Indeed this is just
1203 what'll happen if 'sc' was originally bound with a let binding,
1204 but case is more efficient, and necessary with equalities. So it's
1205 good to work with both.
1206
1207 You might think that this won't make any difference, because the
1208 call instance will only get nuked by the \d. BUT if 'g' itself is
1209 specialised, then transitively we should be able to specialise f.
1210
1211 In general, given
1212 case e of cb { MkD sc ... -> ...(f sc)... }
1213 we transform to
1214 let cb_flt = e
1215 sc_flt = case cb_flt of { MkD sc ... -> sc }
1216 in
1217 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1218
1219 The "_flt" things are the floated binds; we use the current substitution
1220 to substitute sc -> sc_flt in the RHS
1221
1222 ************************************************************************
1223 * *
1224 Dealing with a binding
1225 * *
1226 ************************************************************************
1227 -}
1228
1229 specBind :: SpecEnv -- Use this for RHSs
1230 -> CoreBind -- Binders are already cloned by cloneBindSM,
1231 -- but RHSs are un-processed
1232 -> UsageDetails -- Info on how the scope of the binding
1233 -> SpecM ([CoreBind], -- New bindings
1234 UsageDetails) -- And info to pass upstream
1235
1236 -- Returned UsageDetails:
1237 -- No calls for binders of this bind
1238 specBind rhs_env (NonRec fn rhs) body_uds
1239 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1240 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds fn rhs
1241
1242 ; let pairs = spec_defns ++ [(fn', rhs')]
1243 -- fn' mentions the spec_defns in its rules,
1244 -- so put the latter first
1245
1246 combined_uds = body_uds1 `plusUDs` rhs_uds
1247
1248 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1249
1250 final_binds :: [DictBind]
1251 -- See Note [From non-recursive to recursive]
1252 final_binds
1253 | not (isEmptyBag dump_dbs)
1254 , not (null spec_defns)
1255 = [recWithDumpedDicts pairs dump_dbs]
1256 | otherwise
1257 = [mkDB $ NonRec b r | (b,r) <- pairs]
1258 ++ bagToList dump_dbs
1259
1260 ; if float_all then
1261 -- Rather than discard the calls mentioning the bound variables
1262 -- we float this (dictionary) binding along with the others
1263 return ([], free_uds `snocDictBinds` final_binds)
1264 else
1265 -- No call in final_uds mentions bound variables,
1266 -- so we can just leave the binding here
1267 return (map fst final_binds, free_uds) }
1268
1269
1270 specBind rhs_env (Rec pairs) body_uds
1271 -- Note [Specialising a recursive group]
1272 = do { let (bndrs,rhss) = unzip pairs
1273 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1274 ; let scope_uds = body_uds `plusUDs` rhs_uds
1275 -- Includes binds and calls arising from rhss
1276
1277 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1278
1279 ; (bndrs3, spec_defns3, uds3)
1280 <- if null spec_defns1 -- Common case: no specialisation
1281 then return (bndrs1, [], uds1)
1282 else do { -- Specialisation occurred; do it again
1283 (bndrs2, spec_defns2, uds2)
1284 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1285 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1286
1287 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1288 final_bind = recWithDumpedDicts (spec_defns3 ++ zip bndrs3 rhss')
1289 dumped_dbs
1290
1291 ; if float_all then
1292 return ([], final_uds `snocDictBind` final_bind)
1293 else
1294 return ([fst final_bind], final_uds) }
1295
1296
1297 ---------------------------
1298 specDefns :: SpecEnv
1299 -> UsageDetails -- Info on how it is used in its scope
1300 -> [(OutId,InExpr)] -- The things being bound and their un-processed RHS
1301 -> SpecM ([OutId], -- Original Ids with RULES added
1302 [(OutId,OutExpr)], -- Extra, specialised bindings
1303 UsageDetails) -- Stuff to fling upwards from the specialised versions
1304
1305 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1306 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1307 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1308 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1309 -- But not vice versa (it's a fixpoint problem).
1310
1311 specDefns _env uds []
1312 = return ([], [], uds)
1313 specDefns env uds ((bndr,rhs):pairs)
1314 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1315 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1316 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1317
1318 ---------------------------
1319 specDefn :: SpecEnv
1320 -> UsageDetails -- Info on how it is used in its scope
1321 -> OutId -> InExpr -- The thing being bound and its un-processed RHS
1322 -> SpecM (Id, -- Original Id with added RULES
1323 [(Id,CoreExpr)], -- Extra, specialised bindings
1324 UsageDetails) -- Stuff to fling upwards from the specialised versions
1325
1326 specDefn env body_uds fn rhs
1327 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1328 rules_for_me = idCoreRules fn
1329 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1330 calls_for_me fn rhs
1331 ; return ( fn `addIdSpecialisations` rules
1332 , spec_defns
1333 , body_uds_without_me `plusUDs` spec_uds) }
1334 -- It's important that the `plusUDs` is this way
1335 -- round, because body_uds_without_me may bind
1336 -- dictionaries that are used in calls_for_me passed
1337 -- to specDefn. So the dictionary bindings in
1338 -- spec_uds may mention dictionaries bound in
1339 -- body_uds_without_me
1340
1341 ---------------------------
1342 specCalls :: Maybe Module -- Just this_mod => specialising imported fn
1343 -- Nothing => specialising local fn
1344 -> SpecEnv
1345 -> [CoreRule] -- Existing RULES for the fn
1346 -> [CallInfo]
1347 -> OutId -> InExpr
1348 -> SpecM SpecInfo -- New rules, specialised bindings, and usage details
1349
1350 -- This function checks existing rules, and does not create
1351 -- duplicate ones. So the caller does not need to do this filtering.
1352 -- See 'already_covered'
1353
1354 type SpecInfo = ( [CoreRule] -- Specialisation rules
1355 , [(Id,CoreExpr)] -- Specialised definition
1356 , UsageDetails ) -- Usage details from specialised RHSs
1357
1358 specCalls mb_mod env existing_rules calls_for_me fn rhs
1359 -- The first case is the interesting one
1360 | callSpecArity pis <= fn_arity -- See Note [Specialisation Must Preserve Sharing]
1361 && notNull calls_for_me -- And there are some calls to specialise
1362 && not (isNeverActive (idInlineActivation fn))
1363 -- Don't specialise NOINLINE things
1364 -- See Note [Auto-specialisation and RULES]
1365
1366 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1367 -- See Note [Inline specialisation] for why we do not
1368 -- switch off specialisation for inline functions
1369
1370 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr existing_rules) $
1371 foldlM spec_call ([], [], emptyUDs) calls_for_me
1372
1373 | otherwise -- No calls or RHS doesn't fit our preconceptions
1374 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1375 text "Missed specialisation opportunity for"
1376 <+> ppr fn $$ _trace_doc )
1377 -- Note [Specialisation shape]
1378 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1379 return ([], [], emptyUDs)
1380 where
1381 _trace_doc = sep [ ppr rhs_tyvars, ppr rhs_bndrs
1382 , ppr (idInlineActivation fn) ]
1383
1384 fn_type = idType fn
1385 fn_arity = idArity fn
1386 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1387 pis = fst $ splitPiTys fn_type
1388 theta = getTheta pis
1389 n_dicts = length theta
1390 inl_prag = idInlinePragma fn
1391 inl_act = inlinePragmaActivation inl_prag
1392 is_local = isLocalId fn
1393
1394 -- Figure out whether the function has an INLINE pragma
1395 -- See Note [Inline specialisations]
1396
1397 (rhs_bndrs, rhs_body) = collectBindersPushingCo rhs
1398 -- See Note [Account for casts in binding]
1399 rhs_tyvars = filter isTyVar rhs_bndrs
1400
1401 in_scope = CoreSubst.substInScope (se_subst env)
1402
1403 already_covered :: DynFlags -> [CoreRule] -> [CoreExpr] -> Bool
1404 already_covered dflags new_rules args -- Note [Specialisations already covered]
1405 = isJust (lookupRule dflags (in_scope, realIdUnfolding)
1406 (const True) fn args
1407 (new_rules ++ existing_rules))
1408 -- NB: we look both in the new_rules (generated by this invocation
1409 -- of specCalls), and in existing_rules (passed in to specCalls)
1410
1411 ----------------------------------------------------------
1412 -- Specialise to one particular call pattern
1413 spec_call :: SpecInfo -- Accumulating parameter
1414 -> CallInfo -- Call instance
1415 -> SpecM SpecInfo
1416 spec_call spec_acc@(rules_acc, pairs_acc, uds_acc)
1417 (CI { ci_key = call_args, ci_arity = call_arity })
1418 = ASSERT(call_arity <= fn_arity)
1419
1420 -- See Note [Specialising Calls]
1421 do { (rhs_env2, unused_bndrs, rule_bndrs, rule_args, unspec_bndrs, dx_binds, spec_args)
1422 <- specHeader env rhs_bndrs $ dropWhileEndLE isUnspecArg call_args
1423 ; let rhs_body' = mkLams unused_bndrs rhs_body
1424 ; dflags <- getDynFlags
1425 ; if already_covered dflags rules_acc rule_args
1426 then return spec_acc
1427 else -- pprTrace "spec_call" (vcat [ ppr _call_info, ppr fn, ppr rhs_dict_ids
1428 -- , text "rhs_env2" <+> ppr (se_subst rhs_env2)
1429 -- , ppr dx_binds ]) $
1430 do
1431 { -- Figure out the type of the specialised function
1432 let body = mkLams unspec_bndrs rhs_body'
1433 body_ty = substTy rhs_env2 $ exprType body
1434 (lam_extra_args, app_args) -- See Note [Specialisations Must Be Lifted]
1435 | isUnliftedType body_ty -- C.f. WwLib.mkWorkerArgs
1436 , not (isJoinId fn)
1437 = ([voidArgId], unspec_bndrs ++ [voidPrimId])
1438 | otherwise = ([], unspec_bndrs)
1439 join_arity_change = length app_args - length rule_args
1440 spec_join_arity | Just orig_join_arity <- isJoinId_maybe fn
1441 = Just (orig_join_arity + join_arity_change)
1442 | otherwise
1443 = Nothing
1444
1445 ; (spec_rhs, rhs_uds) <- specExpr rhs_env2 (mkLams lam_extra_args body)
1446 ; let spec_id_ty = exprType spec_rhs
1447 ; spec_f <- newSpecIdSM fn spec_id_ty spec_join_arity
1448 ; this_mod <- getModule
1449 ; let
1450 -- The rule to put in the function's specialisation is:
1451 -- forall x @b d1' d2'.
1452 -- f x @T1 @b @T2 d1' d2' = f1 x @b
1453 -- See Note [Specialising Calls]
1454 herald = case mb_mod of
1455 Nothing -- Specialising local fn
1456 -> text "SPEC"
1457 Just this_mod -- Specialising imported fn
1458 -> text "SPEC/" <> ppr this_mod
1459
1460 rule_name = mkFastString $ showSDoc dflags $
1461 herald <+> ftext (occNameFS (getOccName fn))
1462 <+> hsep (mapMaybe ppr_call_key_ty call_args)
1463 -- This name ends up in interface files, so use occNameString.
1464 -- Otherwise uniques end up there, making builds
1465 -- less deterministic (See #4012 comment:61 ff)
1466
1467 rule_wout_eta = mkRule
1468 this_mod
1469 True {- Auto generated -}
1470 is_local
1471 rule_name
1472 inl_act -- Note [Auto-specialisation and RULES]
1473 (idName fn)
1474 rule_bndrs
1475 rule_args
1476 (mkVarApps (Var spec_f) app_args)
1477
1478 spec_rule
1479 = case isJoinId_maybe fn of
1480 Just join_arity -> etaExpandToJoinPointRule join_arity
1481 rule_wout_eta
1482 Nothing -> rule_wout_eta
1483
1484 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1485 -- See Note [Specialising Calls]
1486 spec_uds = foldr consDictBind rhs_uds dx_binds
1487
1488 --------------------------------------
1489 -- Add a suitable unfolding if the spec_inl_prag says so
1490 -- See Note [Inline specialisations]
1491 (spec_inl_prag, spec_unf)
1492 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1493 = (neverInlinePragma, noUnfolding)
1494 -- See Note [Specialising imported functions] in OccurAnal
1495
1496 | InlinePragma { inl_inline = Inlinable } <- inl_prag
1497 = (inl_prag { inl_inline = NoUserInline }, noUnfolding)
1498
1499 | otherwise
1500 = (inl_prag, specUnfolding dflags unspec_bndrs spec_app n_dicts fn_unf)
1501
1502 spec_app e = e `mkApps` spec_args
1503
1504 --------------------------------------
1505 -- Adding arity information just propagates it a bit faster
1506 -- See Note [Arity decrease] in Simplify
1507 -- Copy InlinePragma information from the parent Id.
1508 -- So if f has INLINE[1] so does spec_f
1509 spec_f_w_arity = spec_f `setIdArity` max 0 (fn_arity - n_dicts)
1510 `setInlinePragma` spec_inl_prag
1511 `setIdUnfolding` spec_unf
1512 `asJoinId_maybe` spec_join_arity
1513
1514 _rule_trace_doc = vcat [ ppr spec_f, ppr fn_type, ppr spec_id_ty
1515 , ppr rhs_bndrs, ppr call_args
1516 , ppr spec_rule
1517 ]
1518
1519 ; -- pprTrace "spec_call: rule" _rule_trace_doc
1520 return ( spec_rule : rules_acc
1521 , (spec_f_w_arity, spec_rhs) : pairs_acc
1522 , spec_uds `plusUDs` uds_acc
1523 ) } }
1524
1525 {- Note [Specialisation Must Preserve Sharing]
1526 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1527 Consider a function:
1528
1529 f :: forall a. Eq a => a -> blah
1530 f =
1531 if expensive
1532 then f1
1533 else f2
1534
1535 As written, all calls to 'f' will share 'expensive'. But if we specialise 'f'
1536 at 'Int', eg:
1537
1538 $sfInt = SUBST[a->Int,dict->dEqInt] (if expensive then f1 else f2)
1539
1540 RULE "SPEC f"
1541 forall (d :: Eq Int).
1542 f Int _ = $sfIntf
1543
1544 We've now lost sharing between 'f' and '$sfInt' for 'expensive'. Yikes!
1545
1546 To avoid this, we only generate specialisations for functions whose arity is
1547 enough to bind all of the arguments we need to specialise. This ensures our
1548 specialised functions don't do any work before receiving all of their dicts,
1549 and thus avoids the 'f' case above.
1550
1551 Note [Specialisations Must Be Lifted]
1552 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1553 Consider a function 'f':
1554
1555 f = forall a. Eq a => Array# a
1556
1557 used like
1558
1559 case x of
1560 True -> ...f @Int dEqInt...
1561 False -> 0
1562
1563 Naively, we might generate an (expensive) specialisation
1564
1565 $sfInt :: Array# Int
1566
1567 even in the case that @x = False@! Instead, we add a dummy 'Void#' argument to
1568 the specialisation '$sfInt' ($sfInt :: Void# -> Array# Int) in order to
1569 preserve laziness.
1570
1571 Note [Specialising Calls]
1572 ~~~~~~~~~~~~~~~~~~~~~~~~~
1573 Suppose we have a function:
1574
1575 f :: Int -> forall a b c. (Foo a, Foo c) => Bar -> Qux
1576 f = \x -> /\ a b c -> \d1 d2 bar -> rhs
1577
1578 and suppose it is called at:
1579
1580 f 7 @T1 @T2 @T3 dFooT1 dFooT3 bar
1581
1582 This call is described as a 'CallInfo' whose 'ci_key' is
1583
1584 [ UnspecArg, SpecType T1, UnspecType, SpecType T3, SpecDict dFooT1
1585 , SpecDict dFooT3, UnspecArg ]
1586
1587 Why are 'a' and 'c' identified as 'SpecType', while 'b' is 'UnspecType'?
1588 Because we must specialise the function on type variables that appear
1589 free in its *dictionary* arguments; but not on type variables that do not
1590 appear in any dictionaries, i.e. are fully polymorphic.
1591
1592 Because this call has dictionaries applied, we'd like to specialise
1593 the call on any type argument that appears free in those dictionaries.
1594 In this case, those are (a ~ T1, c ~ T3).
1595
1596 As a result, we'd like to generate a function:
1597
1598 $sf :: Int -> forall b. Bar -> Qux
1599 $sf = SUBST[a->T1, c->T3, d1->d1', d2->d2'] (\x -> /\ b -> \bar -> rhs)
1600
1601 Note that the substitution is applied to the whole thing. This is
1602 convenient, but just slightly fragile. Notably:
1603 * There had better be no name clashes in a/b/c
1604
1605 We must construct a rewrite rule:
1606
1607 RULE "SPEC f @T1 _ @T3"
1608 forall (x :: Int) (@b :: Type) (d1' :: Foo T1) (d2' :: Foo T3).
1609 f x @T1 @b @T3 d1' d2' = $sf x @b
1610
1611 In the rule, d1' and d2' are just wildcards, not used in the RHS. Note
1612 additionally that 'bar' isn't captured by this rule --- we bind only
1613 enough etas in order to capture all of the *specialised* arguments.
1614
1615 Finally, we must also construct the usage-details
1616
1617 { d1' = dx1; d2' = dx2 }
1618
1619 where d1', d2' are cloned versions of d1,d2, with the type substitution
1620 applied. These auxiliary bindings just avoid duplication of dx1, dx2.
1621
1622 Note [Account for casts in binding]
1623 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1624 Consider
1625 f :: Eq a => a -> IO ()
1626 {-# INLINABLE f
1627 StableUnf = (/\a \(d:Eq a) (x:a). blah) |> g
1628 #-}
1629 f = ...
1630
1631 In f's stable unfolding we have done some modest simplification which
1632 has pushed the cast to the outside. (I wonder if this is the Right
1633 Thing, but it's what happens now; see SimplUtils Note [Casts and
1634 lambdas].) Now that stable unfolding must be specialised, so we want
1635 to push the cast back inside. It would be terrible if the cast
1636 defeated specialisation! Hence the use of collectBindersPushingCo.
1637
1638 Note [Evidence foralls]
1639 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1640 Suppose (#12212) that we are specialising
1641 f :: forall a b. (Num a, F a ~ F b) => blah
1642 with a=b=Int. Then the RULE will be something like
1643 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1644 f Int Int d g = f_spec
1645 But both varToCoreExpr (when constructing the LHS args), and the
1646 simplifier (when simplifying the LHS args), will transform to
1647 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1648 f Int Int d <F Int> = f_spec
1649 by replacing g with Refl. So now 'g' is unbound, which results in a later
1650 crash. So we use Refl right off the bat, and do not forall-quantify 'g':
1651 * varToCoreExpr generates a Refl
1652 * exprsFreeIdsList returns the Ids bound by the args,
1653 which won't include g
1654
1655 You might wonder if this will match as often, but the simplifier replaces
1656 complicated Refl coercions with Refl pretty aggressively.
1657
1658 Note [Orphans and auto-generated rules]
1659 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1660 When we specialise an INLINABLE function, or when we have
1661 -fspecialise-aggressively, we auto-generate RULES that are orphans.
1662 We don't want to warn about these, or we'd generate a lot of warnings.
1663 Thus, we only warn about user-specified orphan rules.
1664
1665 Indeed, we don't even treat the module as an orphan module if it has
1666 auto-generated *rule* orphans. Orphan modules are read every time we
1667 compile, so they are pretty obtrusive and slow down every compilation,
1668 even non-optimised ones. (Reason: for type class instances it's a
1669 type correctness issue.) But specialisation rules are strictly for
1670 *optimisation* only so it's fine not to read the interface.
1671
1672 What this means is that a SPEC rules from auto-specialisation in
1673 module M will be used in other modules only if M.hi has been read for
1674 some other reason, which is actually pretty likely.
1675 -}
1676
1677 bindAuxiliaryDicts
1678 :: SpecEnv
1679 -> [DictId] -> [CoreExpr] -- Original dict bndrs, and the witnessing expressions
1680 -> [DictId] -- A cloned dict-id for each dict arg
1681 -> (SpecEnv, -- Substitute for all orig_dicts
1682 [DictBind], -- Auxiliary dict bindings
1683 [CoreExpr]) -- Witnessing expressions (all trivial)
1684 -- Bind any dictionary arguments to fresh names, to preserve sharing
1685 bindAuxiliaryDicts env@(SE { se_subst = subst, se_interesting = interesting })
1686 orig_dict_ids call_ds inst_dict_ids
1687 = (env', dx_binds, spec_dict_args)
1688 where
1689 (dx_binds, spec_dict_args) = go call_ds inst_dict_ids
1690 env' = env { se_subst = subst `CoreSubst.extendSubstList`
1691 (orig_dict_ids `zip` spec_dict_args)
1692 `CoreSubst.extendInScopeList` dx_ids
1693 , se_interesting = interesting `unionVarSet` interesting_dicts }
1694
1695 dx_ids = [dx_id | (NonRec dx_id _, _) <- dx_binds]
1696 interesting_dicts = mkVarSet [ dx_id | (NonRec dx_id dx, _) <- dx_binds
1697 , interestingDict env dx ]
1698 -- See Note [Make the new dictionaries interesting]
1699
1700 go :: [CoreExpr] -> [CoreBndr] -> ([DictBind], [CoreExpr])
1701 go [] _ = ([], [])
1702 go (dx:dxs) (dx_id:dx_ids)
1703 | exprIsTrivial dx = (dx_binds, dx : args)
1704 | otherwise = (mkDB (NonRec dx_id dx) : dx_binds, Var dx_id : args)
1705 where
1706 (dx_binds, args) = go dxs dx_ids
1707 -- In the first case extend the substitution but not bindings;
1708 -- in the latter extend the bindings but not the substitution.
1709 -- For the former, note that we bind the *original* dict in the substitution,
1710 -- overriding any d->dx_id binding put there by substBndrs
1711 go _ _ = pprPanic "bindAuxiliaryDicts" (ppr orig_dict_ids $$ ppr call_ds $$ ppr inst_dict_ids)
1712
1713 {-
1714 Note [Make the new dictionaries interesting]
1715 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1716 Important! We're going to substitute dx_id1 for d
1717 and we want it to look "interesting", else we won't gather *any*
1718 consequential calls. E.g.
1719 f d = ...g d....
1720 If we specialise f for a call (f (dfun dNumInt)), we'll get
1721 a consequent call (g d') with an auxiliary definition
1722 d' = df dNumInt
1723 We want that consequent call to look interesting
1724
1725
1726 Note [From non-recursive to recursive]
1727 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1728 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1729 have built a recursive knot
1730
1731 f a d x = <blah>
1732 MkUD { ud_binds = NonRec d7 (MkD ..f..)
1733 , ud_calls = ...(f T d7)... }
1734
1735 The we generate
1736
1737 Rec { fs x = <blah>[T/a, d7/d]
1738 f a d x = <blah>
1739 RULE f T _ = fs
1740 d7 = ...f... }
1741
1742 Here the recursion is only through the RULE.
1743
1744 However we definitely should /not/ make the Rec in this wildly common
1745 case:
1746 d = ...
1747 MkUD { ud_binds = NonRec d7 (...d...)
1748 , ud_calls = ...(f T d7)... }
1749
1750 Here we want simply to add d to the floats, giving
1751 MkUD { ud_binds = NonRec d (...)
1752 NonRec d7 (...d...)
1753 , ud_calls = ...(f T d7)... }
1754
1755 In general, we need only make this Rec if
1756 - there are some specialisations (spec_binds non-empty)
1757 - there are some dict_binds that depend on f (dump_dbs non-empty)
1758
1759 Note [Avoiding loops]
1760 ~~~~~~~~~~~~~~~~~~~~~
1761 When specialising /dictionary functions/ we must be very careful to
1762 avoid building loops. Here is an example that bit us badly: #3591
1763
1764 class Eq a => C a
1765 instance Eq [a] => C [a]
1766
1767 This translates to
1768 dfun :: Eq [a] -> C [a]
1769 dfun a d = MkD a d (meth d)
1770
1771 d4 :: Eq [T] = <blah>
1772 d2 :: C [T] = dfun T d4
1773 d1 :: Eq [T] = $p1 d2
1774 d3 :: C [T] = dfun T d1
1775
1776 None of these definitions is recursive. What happened was that we
1777 generated a specialisation:
1778
1779 RULE forall d. dfun T d = dT :: C [T]
1780 dT = (MkD a d (meth d)) [T/a, d1/d]
1781 = MkD T d1 (meth d1)
1782
1783 But now we use the RULE on the RHS of d2, to get
1784
1785 d2 = dT = MkD d1 (meth d1)
1786 d1 = $p1 d2
1787
1788 and now d1 is bottom! The problem is that when specialising 'dfun' we
1789 should first dump "below" the binding all floated dictionary bindings
1790 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1791 placed below 'dfun', and thus unavailable to it when specialising
1792 'dfun'. That in turn means that the call (dfun T d1) must be
1793 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1794 d4 doesn't mention dfun.
1795
1796 Solution:
1797 Discard all calls that mention dictionaries that depend
1798 (directly or indirectly) on the dfun we are specialising.
1799 This is done by 'filterCalls'
1800
1801 --------------
1802 Here's another example, this time for an imported dfun, so the call
1803 to filterCalls is in specImports (#13429). Suppose we have
1804 class Monoid v => C v a where ...
1805
1806 We start with a call
1807 f @ [Integer] @ Integer $fC[]Integer
1808
1809 Specialising call to 'f' gives dict bindings
1810 $dMonoid_1 :: Monoid [Integer]
1811 $dMonoid_1 = M.$p1C @ [Integer] $fC[]Integer
1812
1813 $dC_1 :: C [Integer] (Node [Integer] Integer)
1814 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
1815
1816 ...plus a recursive call to
1817 f @ [Integer] @ (Node [Integer] Integer) $dC_1
1818
1819 Specialising that call gives
1820 $dMonoid_2 :: Monoid [Integer]
1821 $dMonoid_2 = M.$p1C @ [Integer] $dC_1
1822
1823 $dC_2 :: C [Integer] (Node [Integer] Integer)
1824 $dC_2 = M.$fCvNode @ [Integer] $dMonoid_2
1825
1826 Now we have two calls to the imported function
1827 M.$fCvNode :: Monoid v => C v a
1828 M.$fCvNode @v @a m = C m some_fun
1829
1830 But we must /not/ use the call (M.$fCvNode @ [Integer] $dMonoid_2)
1831 for specialisation, else we get:
1832
1833 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
1834 $dMonoid_2 = M.$p1C @ [Integer] $dC_1
1835 $s$fCvNode = C $dMonoid_2 ...
1836 RULE M.$fCvNode [Integer] _ _ = $s$fCvNode
1837
1838 Now use the rule to rewrite the call in the RHS of $dC_1
1839 and we get a loop!
1840
1841 --------------
1842 Here's yet another example
1843
1844 class C a where { foo,bar :: [a] -> [a] }
1845
1846 instance C Int where
1847 foo x = r_bar x
1848 bar xs = reverse xs
1849
1850 r_bar :: C a => [a] -> [a]
1851 r_bar xs = bar (xs ++ xs)
1852
1853 That translates to:
1854
1855 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1856
1857 Rec { $fCInt :: C Int = MkC foo_help reverse
1858 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1859
1860 The call (r_bar $fCInt) mentions $fCInt,
1861 which mentions foo_help,
1862 which mentions r_bar
1863 But we DO want to specialise r_bar at Int:
1864
1865 Rec { $fCInt :: C Int = MkC foo_help reverse
1866 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1867
1868 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1869 RULE r_bar Int _ = r_bar_Int
1870
1871 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1872 }
1873
1874 Note that, because of its RULE, r_bar joins the recursive
1875 group. (In this case it'll unravel a short moment later.)
1876
1877
1878 Note [Specialising a recursive group]
1879 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1880 Consider
1881 let rec { f x = ...g x'...
1882 ; g y = ...f y'.... }
1883 in f 'a'
1884 Here we specialise 'f' at Char; but that is very likely to lead to
1885 a specialisation of 'g' at Char. We must do the latter, else the
1886 whole point of specialisation is lost.
1887
1888 But we do not want to keep iterating to a fixpoint, because in the
1889 presence of polymorphic recursion we might generate an infinite number
1890 of specialisations.
1891
1892 So we use the following heuristic:
1893 * Arrange the rec block in dependency order, so far as possible
1894 (the occurrence analyser already does this)
1895
1896 * Specialise it much like a sequence of lets
1897
1898 * Then go through the block a second time, feeding call-info from
1899 the RHSs back in the bottom, as it were
1900
1901 In effect, the ordering maxmimises the effectiveness of each sweep,
1902 and we do just two sweeps. This should catch almost every case of
1903 monomorphic recursion -- the exception could be a very knotted-up
1904 recursion with multiple cycles tied up together.
1905
1906 This plan is implemented in the Rec case of specBindItself.
1907
1908 Note [Specialisations already covered]
1909 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1910 We obviously don't want to generate two specialisations for the same
1911 argument pattern. There are two wrinkles
1912
1913 1. We do the already-covered test in specDefn, not when we generate
1914 the CallInfo in mkCallUDs. We used to test in the latter place, but
1915 we now iterate the specialiser somewhat, and the Id at the call site
1916 might therefore not have all the RULES that we can see in specDefn
1917
1918 2. What about two specialisations where the second is an *instance*
1919 of the first? If the more specific one shows up first, we'll generate
1920 specialisations for both. If the *less* specific one shows up first,
1921 we *don't* currently generate a specialisation for the more specific
1922 one. (See the call to lookupRule in already_covered.) Reasons:
1923 (a) lookupRule doesn't say which matches are exact (bad reason)
1924 (b) if the earlier specialisation is user-provided, it's
1925 far from clear that we should auto-specialise further
1926
1927 Note [Auto-specialisation and RULES]
1928 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1929 Consider:
1930 g :: Num a => a -> a
1931 g = ...
1932
1933 f :: (Int -> Int) -> Int
1934 f w = ...
1935 {-# RULE f g = 0 #-}
1936
1937 Suppose that auto-specialisation makes a specialised version of
1938 g::Int->Int That version won't appear in the LHS of the RULE for f.
1939 So if the specialisation rule fires too early, the rule for f may
1940 never fire.
1941
1942 It might be possible to add new rules, to "complete" the rewrite system.
1943 Thus when adding
1944 RULE forall d. g Int d = g_spec
1945 also add
1946 RULE f g_spec = 0
1947
1948 But that's a bit complicated. For now we ask the programmer's help,
1949 by *copying the INLINE activation pragma* to the auto-specialised
1950 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1951 will also not be active until phase 2. And that's what programmers
1952 should jolly well do anyway, even aside from specialisation, to ensure
1953 that g doesn't inline too early.
1954
1955 This in turn means that the RULE would never fire for a NOINLINE
1956 thing so not much point in generating a specialisation at all.
1957
1958 Note [Specialisation shape]
1959 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1960 We only specialise a function if it has visible top-level lambdas
1961 corresponding to its overloading. E.g. if
1962 f :: forall a. Eq a => ....
1963 then its body must look like
1964 f = /\a. \d. ...
1965
1966 Reason: when specialising the body for a call (f ty dexp), we want to
1967 substitute dexp for d, and pick up specialised calls in the body of f.
1968
1969 This doesn't always work. One example I came across was this:
1970 newtype Gen a = MkGen{ unGen :: Int -> a }
1971
1972 choose :: Eq a => a -> Gen a
1973 choose n = MkGen (\r -> n)
1974
1975 oneof = choose (1::Int)
1976
1977 It's a silly example, but we get
1978 choose = /\a. g `cast` co
1979 where choose doesn't have any dict arguments. Thus far I have not
1980 tried to fix this (wait till there's a real example).
1981
1982 Mind you, then 'choose' will be inlined (since RHS is trivial) so
1983 it doesn't matter. This comes up with single-method classes
1984
1985 class C a where { op :: a -> a }
1986 instance C a => C [a] where ....
1987 ==>
1988 $fCList :: C a => C [a]
1989 $fCList = $copList |> (...coercion>...)
1990 ....(uses of $fCList at particular types)...
1991
1992 So we suppress the WARN if the rhs is trivial.
1993
1994 Note [Inline specialisations]
1995 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1996 Here is what we do with the InlinePragma of the original function
1997 * Activation/RuleMatchInfo: both transferred to the
1998 specialised function
1999 * InlineSpec:
2000 (a) An INLINE pragma is transferred
2001 (b) An INLINABLE pragma is *not* transferred
2002
2003 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
2004 specialise the function at its call site, and arguably that's not so
2005 important for the specialised copies. BUT *pragma-directed*
2006 specialisation now takes place in the typechecker/desugarer, with
2007 manually specified INLINEs. The specialisation here is automatic.
2008 It'd be very odd if a function marked INLINE was specialised (because
2009 of some local use), and then forever after (including importing
2010 modules) the specialised version wasn't INLINEd. After all, the
2011 programmer said INLINE!
2012
2013 You might wonder why we specialise INLINE functions at all. After
2014 all they should be inlined, right? Two reasons:
2015
2016 * Even INLINE functions are sometimes not inlined, when they aren't
2017 applied to interesting arguments. But perhaps the type arguments
2018 alone are enough to specialise (even though the args are too boring
2019 to trigger inlining), and it's certainly better to call the
2020 specialised version.
2021
2022 * The RHS of an INLINE function might call another overloaded function,
2023 and we'd like to generate a specialised version of that function too.
2024 This actually happens a lot. Consider
2025 replicateM_ :: (Monad m) => Int -> m a -> m ()
2026 {-# INLINABLE replicateM_ #-}
2027 replicateM_ d x ma = ...
2028 The strictness analyser may transform to
2029 replicateM_ :: (Monad m) => Int -> m a -> m ()
2030 {-# INLINE replicateM_ #-}
2031 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
2032
2033 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
2034 {-# INLINABLE $wreplicateM_ #-}
2035 $wreplicateM_ = ...
2036 Now an importing module has a specialised call to replicateM_, say
2037 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
2038 This particular example had a huge effect on the call to replicateM_
2039 in nofib/shootout/n-body.
2040
2041 Why (b): discard INLINABLE pragmas? See #4874 for persuasive examples.
2042 Suppose we have
2043 {-# INLINABLE f #-}
2044 f :: Ord a => [a] -> Int
2045 f xs = letrec f' = ...f'... in f'
2046 Then, when f is specialised and optimised we might get
2047 wgo :: [Int] -> Int#
2048 wgo = ...wgo...
2049 f_spec :: [Int] -> Int
2050 f_spec xs = case wgo xs of { r -> I# r }
2051 and we clearly want to inline f_spec at call sites. But if we still
2052 have the big, un-optimised of f (albeit specialised) captured in an
2053 INLINABLE pragma for f_spec, we won't get that optimisation.
2054
2055 So we simply drop INLINABLE pragmas when specialising. It's not really
2056 a complete solution; ignoring specialisation for now, INLINABLE functions
2057 don't get properly strictness analysed, for example. But it works well
2058 for examples involving specialisation, which is the dominant use of
2059 INLINABLE. See #4874.
2060
2061
2062 ************************************************************************
2063 * *
2064 \subsubsection{UsageDetails and suchlike}
2065 * *
2066 ************************************************************************
2067 -}
2068
2069 data UsageDetails
2070 = MkUD {
2071 ud_binds :: !(Bag DictBind),
2072 -- See Note [Floated dictionary bindings]
2073 -- The order is important;
2074 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
2075 -- (Remember, Bags preserve order in GHC.)
2076
2077 ud_calls :: !CallDetails
2078
2079 -- INVARIANT: suppose bs = bindersOf ud_binds
2080 -- Then 'calls' may *mention* 'bs',
2081 -- but there should be no calls *for* bs
2082 }
2083
2084 -- | A 'DictBind' is a binding along with a cached set containing its free
2085 -- variables (both type variables and dictionaries)
2086 type DictBind = (CoreBind, VarSet)
2087
2088 {- Note [Floated dictionary bindings]
2089 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2090 We float out dictionary bindings for the reasons described under
2091 "Dictionary floating" above. But not /just/ dictionary bindings.
2092 Consider
2093
2094 f :: Eq a => blah
2095 f a d = rhs
2096
2097 $c== :: T -> T -> Bool
2098 $c== x y = ...
2099
2100 $df :: Eq T
2101 $df = Eq $c== ...
2102
2103 gurgle = ...(f @T $df)...
2104
2105 We gather the call info for (f @T $df), and we don't want to drop it
2106 when we come across the binding for $df. So we add $df to the floats
2107 and continue. But then we have to add $c== to the floats, and so on.
2108 These all float above the binding for 'f', and and now we can
2109 successfully specialise 'f'.
2110
2111 So the DictBinds in (ud_binds :: Bag DictBind) may contain
2112 non-dictionary bindings too.
2113 -}
2114
2115 instance Outputable UsageDetails where
2116 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
2117 = text "MkUD" <+> braces (sep (punctuate comma
2118 [text "binds" <+> equals <+> ppr dbs,
2119 text "calls" <+> equals <+> ppr calls]))
2120
2121 emptyUDs :: UsageDetails
2122 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv }
2123
2124 ------------------------------------------------------------
2125 type CallDetails = DIdEnv CallInfoSet
2126 -- The order of specialized binds and rules depends on how we linearize
2127 -- CallDetails, so to get determinism we must use a deterministic set here.
2128 -- See Note [Deterministic UniqFM] in UniqDFM
2129
2130 data CallInfoSet = CIS Id (Bag CallInfo)
2131 -- The list of types and dictionaries is guaranteed to
2132 -- match the type of f
2133 -- The Bag may contain duplicate calls (i.e. f @T and another f @T)
2134 -- These dups are eliminated by already_covered in specCalls
2135
2136 data CallInfo
2137 = CI { ci_key :: [SpecArg] -- All arguments
2138 , ci_arity :: Int -- The number of variables necessary to bind
2139 -- all of the specialised arguments
2140 , ci_fvs :: VarSet -- Free vars of the ci_key
2141 -- call (including tyvars)
2142 -- [*not* include the main id itself, of course]
2143 }
2144
2145 type DictExpr = CoreExpr
2146
2147 ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet
2148 ciSetFilter p (CIS id a) = CIS id (filterBag p a)
2149
2150 instance Outputable CallInfoSet where
2151 ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
2152 2 (ppr map)
2153
2154 pprCallInfo :: Id -> CallInfo -> SDoc
2155 pprCallInfo fn (CI { ci_key = key })
2156 = ppr fn <+> ppr key
2157
2158 ppr_call_key_ty :: SpecArg -> Maybe SDoc
2159 ppr_call_key_ty (SpecType ty) = Just $ char '@' <+> pprParendType ty
2160 ppr_call_key_ty UnspecType = Just $ char '_'
2161 ppr_call_key_ty (SpecDict _) = Nothing
2162 ppr_call_key_ty UnspecArg = Nothing
2163
2164 instance Outputable CallInfo where
2165 ppr (CI { ci_key = key, ci_fvs = fvs })
2166 = text "CI" <> braces (hsep [ fsep (mapMaybe ppr_call_key_ty key), ppr fvs ])
2167
2168 unionCalls :: CallDetails -> CallDetails -> CallDetails
2169 unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2
2170
2171 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
2172 unionCallInfoSet (CIS f calls1) (CIS _ calls2) =
2173 CIS f (calls1 `unionBags` calls2)
2174
2175 callDetailsFVs :: CallDetails -> VarSet
2176 callDetailsFVs calls =
2177 nonDetFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls
2178 -- It's OK to use nonDetFoldUDFM here because we forget the ordering
2179 -- immediately by converting to a nondeterministic set.
2180
2181 callInfoFVs :: CallInfoSet -> VarSet
2182 callInfoFVs (CIS _ call_info) =
2183 foldrBag (\(CI { ci_fvs = fv }) vs -> unionVarSet fv vs) emptyVarSet call_info
2184
2185 computeArity :: [SpecArg] -> Int
2186 computeArity = length . filter isValueArg . dropWhileEndLE isUnspecArg
2187
2188 callSpecArity :: [TyCoBinder] -> Int
2189 callSpecArity = length . filter (not . isNamedBinder) . dropWhileEndLE isVisibleBinder
2190
2191 getTheta :: [TyCoBinder] -> [PredType]
2192 getTheta = fmap tyBinderType . filter isInvisibleBinder . filter (not . isNamedBinder)
2193
2194
2195 ------------------------------------------------------------
2196 singleCall :: Id -> [SpecArg] -> UsageDetails
2197 singleCall id args
2198 = MkUD {ud_binds = emptyBag,
2199 ud_calls = unitDVarEnv id $ CIS id $
2200 unitBag (CI { ci_key = args -- used to be tys
2201 , ci_arity = computeArity args
2202 , ci_fvs = call_fvs }) }
2203 where
2204 tys = getSpecTypes args
2205 dicts = getSpecDicts args
2206 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
2207 tys_fvs = tyCoVarsOfTypes tys
2208 -- The type args (tys) are guaranteed to be part of the dictionary
2209 -- types, because they are just the constrained types,
2210 -- and the dictionary is therefore sure to be bound
2211 -- inside the binding for any type variables free in the type;
2212 -- hence it's safe to neglect tyvars free in tys when making
2213 -- the free-var set for this call
2214 -- BUT I don't trust this reasoning; play safe and include tys_fvs
2215 --
2216 -- We don't include the 'id' itself.
2217
2218 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
2219 mkCallUDs env f args
2220 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
2221 res
2222 where
2223 res = mkCallUDs' env f args
2224
2225 mkCallUDs' env f args
2226 | not (want_calls_for f) -- Imported from elsewhere
2227 || null theta -- Not overloaded
2228 = emptyUDs
2229
2230 | not (all type_determines_value theta)
2231 || not (computeArity ci_key <= idArity f)
2232 || not (length dicts == length theta)
2233 || not (any (interestingDict env) dicts) -- Note [Interesting dictionary arguments]
2234 -- See also Note [Specialisations already covered]
2235 = -- pprTrace "mkCallUDs: discarding" _trace_doc
2236 emptyUDs -- Not overloaded, or no specialisation wanted
2237
2238 | otherwise
2239 = -- pprTrace "mkCallUDs: keeping" _trace_doc
2240 singleCall f ci_key
2241 where
2242 _trace_doc = vcat [ppr f, ppr args, ppr (map (interestingDict env) dicts)]
2243 pis = fst $ splitPiTys $ idType f
2244 theta = getTheta pis
2245 constrained_tyvars = tyCoVarsOfTypes theta
2246
2247 ci_key :: [SpecArg]
2248 ci_key = fmap (\(t, a) ->
2249 case t of
2250 Named (binderVar -> tyVar)
2251 | tyVar `elemVarSet` constrained_tyvars
2252 -> case a of
2253 Type ty -> SpecType ty
2254 _ -> pprPanic "ci_key" $ ppr a
2255 | otherwise
2256 -> UnspecType
2257 Anon InvisArg _ -> SpecDict a
2258 Anon VisArg _ -> UnspecArg
2259 ) $ zip pis args
2260
2261 dicts = getSpecDicts ci_key
2262
2263 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
2264 -- For imported things, we gather call instances if
2265 -- there is an unfolding that we could in principle specialise
2266 -- We might still decide not to use it (consulting dflags)
2267 -- in specImports
2268 -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
2269
2270 type_determines_value pred -- See Note [Type determines value]
2271 = case classifyPredType pred of
2272 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs
2273 EqPred {} -> True
2274 IrredPred {} -> True -- Things like (D []) where D is a
2275 -- Constraint-ranged family; #7785
2276 ForAllPred {} -> True
2277
2278 {-
2279 Note [Type determines value]
2280 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2281 Only specialise if all overloading is on non-IP *class* params,
2282 because these are the ones whose *type* determines their *value*. In
2283 parrticular, with implicit params, the type args *don't* say what the
2284 value of the implicit param is! See #7101
2285
2286 However, consider
2287 type family D (v::*->*) :: Constraint
2288 type instance D [] = ()
2289 f :: D v => v Char -> Int
2290 If we see a call (f "foo"), we'll pass a "dictionary"
2291 () |> (g :: () ~ D [])
2292 and it's good to specialise f at this dictionary.
2293
2294 So the question is: can an implicit parameter "hide inside" a
2295 type-family constraint like (D a). Well, no. We don't allow
2296 type instance D Maybe = ?x:Int
2297 Hence the IrredPred case in type_determines_value.
2298 See #7785.
2299
2300 Note [Interesting dictionary arguments]
2301 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2302 Consider this
2303 \a.\d:Eq a. let f = ... in ...(f d)...
2304 There really is not much point in specialising f wrt the dictionary d,
2305 because the code for the specialised f is not improved at all, because
2306 d is lambda-bound. We simply get junk specialisations.
2307
2308 What is "interesting"? Just that it has *some* structure. But what about
2309 variables?
2310
2311 * A variable might be imported, in which case its unfolding
2312 will tell us whether it has useful structure
2313
2314 * Local variables are cloned on the way down (to avoid clashes when
2315 we float dictionaries), and cloning drops the unfolding
2316 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
2317 nuisance to give them unfoldings. So we keep track of the
2318 "interesting" dictionaries as a VarSet in SpecEnv.
2319 We have to take care to put any new interesting dictionary
2320 bindings in the set.
2321
2322 We accidentally lost accurate tracking of local variables for a long
2323 time, because cloned variables don't have unfoldings. But makes a
2324 massive difference in a few cases, eg #5113. For nofib as a
2325 whole it's only a small win: 2.2% improvement in allocation for ansi,
2326 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
2327 -}
2328
2329 interestingDict :: SpecEnv -> CoreExpr -> Bool
2330 -- A dictionary argument is interesting if it has *some* structure
2331 -- NB: "dictionary" arguments include constraints of all sorts,
2332 -- including equality constraints; hence the Coercion case
2333 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
2334 || isDataConWorkId v
2335 || v `elemVarSet` se_interesting env
2336 interestingDict _ (Type _) = False
2337 interestingDict _ (Coercion _) = False
2338 interestingDict env (App fn (Type _)) = interestingDict env fn
2339 interestingDict env (App fn (Coercion _)) = interestingDict env fn
2340 interestingDict env (Tick _ a) = interestingDict env a
2341 interestingDict env (Cast e _) = interestingDict env e
2342 interestingDict _ _ = True
2343
2344 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
2345 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
2346 (MkUD {ud_binds = db2, ud_calls = calls2})
2347 = MkUD { ud_binds = db1 `unionBags` db2
2348 , ud_calls = calls1 `unionCalls` calls2 }
2349
2350 -----------------------------
2351 _dictBindBndrs :: Bag DictBind -> [Id]
2352 _dictBindBndrs dbs = foldrBag ((++) . bindersOf . fst) [] dbs
2353
2354 -- | Construct a 'DictBind' from a 'CoreBind'
2355 mkDB :: CoreBind -> DictBind
2356 mkDB bind = (bind, bind_fvs bind)
2357
2358 -- | Identify the free variables of a 'CoreBind'
2359 bind_fvs :: CoreBind -> VarSet
2360 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
2361 bind_fvs (Rec prs) = foldl' delVarSet rhs_fvs bndrs
2362 where
2363 bndrs = map fst prs
2364 rhs_fvs = unionVarSets (map pair_fvs prs)
2365
2366 pair_fvs :: (Id, CoreExpr) -> VarSet
2367 pair_fvs (bndr, rhs) = exprSomeFreeVars interesting rhs
2368 `unionVarSet` idFreeVars bndr
2369 -- idFreeVars: don't forget variables mentioned in
2370 -- the rules of the bndr. C.f. OccAnal.addRuleUsage
2371 -- Also tyvars mentioned in its type; they may not appear
2372 -- in the RHS
2373 -- type T a = Int
2374 -- x :: T a = 3
2375 where
2376 interesting :: InterestingVarFun
2377 interesting v = isLocalVar v || (isId v && isDFunId v)
2378 -- Very important: include DFunIds /even/ if it is imported
2379 -- Reason: See Note [Avoiding loops], the second exmaple
2380 -- involving an imported dfun. We must know whether
2381 -- a dictionary binding depends on an imported dfun,
2382 -- in case we try to specialise that imported dfun
2383 -- #13429 illustrates
2384
2385 -- | Flatten a set of "dumped" 'DictBind's, and some other binding
2386 -- pairs, into a single recursive binding.
2387 recWithDumpedDicts :: [(Id,CoreExpr)] -> Bag DictBind ->DictBind
2388 recWithDumpedDicts pairs dbs
2389 = (Rec bindings, fvs)
2390 where
2391 (bindings, fvs) = foldrBag add
2392 ([], emptyVarSet)
2393 (dbs `snocBag` mkDB (Rec pairs))
2394 add (NonRec b r, fvs') (pairs, fvs) =
2395 ((b,r) : pairs, fvs `unionVarSet` fvs')
2396 add (Rec prs1, fvs') (pairs, fvs) =
2397 (prs1 ++ pairs, fvs `unionVarSet` fvs')
2398
2399 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
2400 -- Add ud_binds to the tail end of the bindings in uds
2401 snocDictBinds uds dbs
2402 = uds { ud_binds = ud_binds uds `unionBags` listToBag dbs }
2403
2404 consDictBind :: DictBind -> UsageDetails -> UsageDetails
2405 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
2406
2407 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
2408 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
2409
2410 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
2411 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
2412
2413 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
2414 wrapDictBinds dbs binds
2415 = foldrBag add binds dbs
2416 where
2417 add (bind,_) binds = bind : binds
2418
2419 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
2420 wrapDictBindsE dbs expr
2421 = foldrBag add expr dbs
2422 where
2423 add (bind,_) expr = Let bind expr
2424
2425 ----------------------
2426 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
2427 -- Used at a lambda or case binder; just dump anything mentioning the binder
2428 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2429 | null bndrs = (uds, emptyBag) -- Common in case alternatives
2430 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2431 (free_uds, dump_dbs)
2432 where
2433 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2434 bndr_set = mkVarSet bndrs
2435 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2436 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
2437 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
2438 -- no calls for any of the dicts in dump_dbs
2439
2440 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
2441 -- Used at a let(rec) binding.
2442 -- We return a boolean indicating whether the binding itself is mentioned,
2443 -- directly or indirectly, by any of the ud_calls; in that case we want to
2444 -- float the binding itself;
2445 -- See Note [Floated dictionary bindings]
2446 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2447 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2448 (free_uds, dump_dbs, float_all)
2449 where
2450 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2451 bndr_set = mkVarSet bndrs
2452 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2453 free_calls = deleteCallsFor bndrs orig_calls
2454 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
2455
2456 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
2457 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2458 = -- pprTrace ("callsForMe")
2459 -- (vcat [ppr fn,
2460 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
2461 -- text "Orig calls =" <+> ppr orig_calls,
2462 -- text "Dep set =" <+> ppr dep_set,
2463 -- text "Calls for me =" <+> ppr calls_for_me]) $
2464 (uds_without_me, calls_for_me)
2465 where
2466 uds_without_me = MkUD { ud_binds = orig_dbs
2467 , ud_calls = delDVarEnv orig_calls fn }
2468 calls_for_me = case lookupDVarEnv orig_calls fn of
2469 Nothing -> []
2470 Just cis -> filterCalls cis orig_dbs
2471 -- filterCalls: drop calls that (directly or indirectly)
2472 -- refer to fn. See Note [Avoiding loops]
2473
2474 ----------------------
2475 filterCalls :: CallInfoSet -> Bag DictBind -> [CallInfo]
2476 -- See Note [Avoiding loops]
2477 filterCalls (CIS fn call_bag) dbs
2478 = filter ok_call (bagToList call_bag)
2479 where
2480 dump_set = foldlBag go (unitVarSet fn) dbs
2481 -- This dump-set could also be computed by splitDictBinds
2482 -- (_,_,dump_set) = splitDictBinds dbs {fn}
2483 -- But this variant is shorter
2484
2485 go so_far (db,fvs) | fvs `intersectsVarSet` so_far
2486 = extendVarSetList so_far (bindersOf db)
2487 | otherwise = so_far
2488
2489 ok_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` dump_set)
2490
2491 ----------------------
2492 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2493 -- splitDictBinds dbs bndrs returns
2494 -- (free_dbs, dump_dbs, dump_set)
2495 -- where
2496 -- * dump_dbs depends, transitively on bndrs
2497 -- * free_dbs does not depend on bndrs
2498 -- * dump_set = bndrs `union` bndrs(dump_dbs)
2499 splitDictBinds dbs bndr_set
2500 = foldlBag split_db (emptyBag, emptyBag, bndr_set) dbs
2501 -- Important that it's foldl not foldr;
2502 -- we're accumulating the set of dumped ids in dump_set
2503 where
2504 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
2505 | dump_idset `intersectsVarSet` fvs -- Dump it
2506 = (free_dbs, dump_dbs `snocBag` db,
2507 extendVarSetList dump_idset (bindersOf bind))
2508
2509 | otherwise -- Don't dump it
2510 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2511
2512
2513 ----------------------
2514 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2515 -- Remove calls *mentioning* bs in any way
2516 deleteCallsMentioning bs calls
2517 = mapDVarEnv (ciSetFilter keep_call) calls
2518 where
2519 keep_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` bs)
2520
2521 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
2522 -- Remove calls *for* bs
2523 deleteCallsFor bs calls = delDVarEnvList calls bs
2524
2525 {-
2526 ************************************************************************
2527 * *
2528 \subsubsection{Boring helper functions}
2529 * *
2530 ************************************************************************
2531 -}
2532
2533 newtype SpecM a = SpecM (State SpecState a)
2534
2535 data SpecState = SpecState {
2536 spec_uniq_supply :: UniqSupply,
2537 spec_module :: Module,
2538 spec_dflags :: DynFlags
2539 }
2540
2541 instance Functor SpecM where
2542 fmap = liftM
2543
2544 instance Applicative SpecM where
2545 pure x = SpecM $ return x
2546 (<*>) = ap
2547
2548 instance Monad SpecM where
2549 SpecM x >>= f = SpecM $ do y <- x
2550 case f y of
2551 SpecM z ->
2552 z
2553 #if !MIN_VERSION_base(4,13,0)
2554 fail = MonadFail.fail
2555 #endif
2556
2557 instance MonadFail.MonadFail SpecM where
2558 fail str = SpecM $ error str
2559
2560 instance MonadUnique SpecM where
2561 getUniqueSupplyM
2562 = SpecM $ do st <- get
2563 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
2564 put $ st { spec_uniq_supply = us2 }
2565 return us1
2566
2567 getUniqueM
2568 = SpecM $ do st <- get
2569 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
2570 put $ st { spec_uniq_supply = us' }
2571 return u
2572
2573 instance HasDynFlags SpecM where
2574 getDynFlags = SpecM $ liftM spec_dflags get
2575
2576 instance HasModule SpecM where
2577 getModule = SpecM $ liftM spec_module get
2578
2579 runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2580 runSpecM dflags this_mod (SpecM spec)
2581 = do us <- getUniqueSupplyM
2582 let initialState = SpecState {
2583 spec_uniq_supply = us,
2584 spec_module = this_mod,
2585 spec_dflags = dflags
2586 }
2587 return $ evalState spec initialState
2588
2589 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2590 mapAndCombineSM _ [] = return ([], emptyUDs)
2591 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2592 (ys, uds2) <- mapAndCombineSM f xs
2593 return (y:ys, uds1 `plusUDs` uds2)
2594
2595 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2596 extendTvSubstList env tv_binds
2597 = env { se_subst = CoreSubst.extendTvSubstList (se_subst env) tv_binds }
2598
2599 substTy :: SpecEnv -> Type -> Type
2600 substTy env ty = CoreSubst.substTy (se_subst env) ty
2601
2602 substCo :: SpecEnv -> Coercion -> Coercion
2603 substCo env co = CoreSubst.substCo (se_subst env) co
2604
2605 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2606 substBndr env bs = case CoreSubst.substBndr (se_subst env) bs of
2607 (subst', bs') -> (env { se_subst = subst' }, bs')
2608
2609 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2610 substBndrs env bs = case CoreSubst.substBndrs (se_subst env) bs of
2611 (subst', bs') -> (env { se_subst = subst' }, bs')
2612
2613 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2614 -- Clone the binders of the bind; return new bind with the cloned binders
2615 -- Return the substitution to use for RHSs, and the one to use for the body
2616 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2617 = do { us <- getUniqueSupplyM
2618 ; let (subst', bndr') = CoreSubst.cloneIdBndr subst us bndr
2619 interesting' | interestingDict env rhs
2620 = interesting `extendVarSet` bndr'
2621 | otherwise = interesting
2622 ; return (env, env { se_subst = subst', se_interesting = interesting' }
2623 , NonRec bndr' rhs) }
2624
2625 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2626 = do { us <- getUniqueSupplyM
2627 ; let (subst', bndrs') = CoreSubst.cloneRecIdBndrs subst us (map fst pairs)
2628 env' = env { se_subst = subst'
2629 , se_interesting = interesting `extendVarSetList`
2630 [ v | (v,r) <- pairs, interestingDict env r ] }
2631 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2632
2633 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2634 -- Make up completely fresh binders for the dictionaries
2635 -- Their bindings are going to float outwards
2636 newDictBndr env b = do { uniq <- getUniqueM
2637 ; let n = idName b
2638 ty' = substTy env (idType b)
2639 ; return (mkUserLocalOrCoVar (nameOccName n) uniq ty' (getSrcSpan n)) }
2640
2641 newSpecIdSM :: Id -> Type -> Maybe JoinArity -> SpecM Id
2642 -- Give the new Id a similar occurrence name to the old one
2643 newSpecIdSM old_id new_ty join_arity_maybe
2644 = do { uniq <- getUniqueM
2645 ; let name = idName old_id
2646 new_occ = mkSpecOcc (nameOccName name)
2647 new_id = mkUserLocalOrCoVar new_occ uniq new_ty (getSrcSpan name)
2648 `asJoinId_maybe` join_arity_maybe
2649 ; return new_id }
2650
2651 {-
2652 Old (but interesting) stuff about unboxed bindings
2653 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2654
2655 What should we do when a value is specialised to a *strict* unboxed value?
2656
2657 map_*_* f (x:xs) = let h = f x
2658 t = map f xs
2659 in h:t
2660
2661 Could convert let to case:
2662
2663 map_*_Int# f (x:xs) = case f x of h# ->
2664 let t = map f xs
2665 in h#:t
2666
2667 This may be undesirable since it forces evaluation here, but the value
2668 may not be used in all branches of the body. In the general case this
2669 transformation is impossible since the mutual recursion in a letrec
2670 cannot be expressed as a case.
2671
2672 There is also a problem with top-level unboxed values, since our
2673 implementation cannot handle unboxed values at the top level.
2674
2675 Solution: Lift the binding of the unboxed value and extract it when it
2676 is used:
2677
2678 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2679 t = map f xs
2680 in case h of
2681 _Lift h# -> h#:t
2682
2683 Now give it to the simplifier and the _Lifting will be optimised away.
2684
2685 The benefit is that we have given the specialised "unboxed" values a
2686 very simple lifted semantics and then leave it up to the simplifier to
2687 optimise it --- knowing that the overheads will be removed in nearly
2688 all cases.
2689
2690 In particular, the value will only be evaluated in the branches of the
2691 program which use it, rather than being forced at the point where the
2692 value is bound. For example:
2693
2694 filtermap_*_* p f (x:xs)
2695 = let h = f x
2696 t = ...
2697 in case p x of
2698 True -> h:t
2699 False -> t
2700 ==>
2701 filtermap_*_Int# p f (x:xs)
2702 = let h = case (f x) of h# -> _Lift h#
2703 t = ...
2704 in case p x of
2705 True -> case h of _Lift h#
2706 -> h#:t
2707 False -> t
2708
2709 The binding for h can still be inlined in the one branch and the
2710 _Lifting eliminated.
2711
2712
2713 Question: When won't the _Lifting be eliminated?
2714
2715 Answer: When they at the top-level (where it is necessary) or when
2716 inlining would duplicate work (or possibly code depending on
2717 options). However, the _Lifting will still be eliminated if the
2718 strictness analyser deems the lifted binding strict.
2719 -}