Typos in manual and comments [ci skip]
[ghc.git] / compiler / specialise / Specialise.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5 -}
6
7 {-# LANGUAGE CPP #-}
8 module Specialise ( specProgram, specUnfolding ) where
9
10 #include "HsVersions.h"
11
12 import Id
13 import TcType hiding( substTy )
14 import Type hiding( substTy, extendTvSubstList )
15 import Module( Module, HasModule(..) )
16 import Coercion( Coercion )
17 import CoreMonad
18 import qualified CoreSubst
19 import CoreUnfold
20 import VarSet
21 import VarEnv
22 import CoreSyn
23 import Rules
24 import CoreUtils ( exprIsTrivial, applyTypeToArgs, mkCast )
25 import CoreFVs ( exprFreeVars, exprsFreeVars, idFreeVars, exprsFreeIdsList )
26 import UniqSupply
27 import Name
28 import MkId ( voidArgId, voidPrimId )
29 import Maybes ( catMaybes, isJust )
30 import BasicTypes
31 import HscTypes
32 import Bag
33 import DynFlags
34 import Util
35 import Outputable
36 import FastString
37 import State
38 import UniqDFM
39 import TrieMap
40
41 import Control.Monad
42 #if __GLASGOW_HASKELL__ > 710
43 import qualified Control.Monad.Fail as MonadFail
44 #endif
45
46 {-
47 ************************************************************************
48 * *
49 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
50 * *
51 ************************************************************************
52
53 These notes describe how we implement specialisation to eliminate
54 overloading.
55
56 The specialisation pass works on Core
57 syntax, complete with all the explicit dictionary application,
58 abstraction and construction as added by the type checker. The
59 existing type checker remains largely as it is.
60
61 One important thought: the {\em types} passed to an overloaded
62 function, and the {\em dictionaries} passed are mutually redundant.
63 If the same function is applied to the same type(s) then it is sure to
64 be applied to the same dictionary(s)---or rather to the same {\em
65 values}. (The arguments might look different but they will evaluate
66 to the same value.)
67
68 Second important thought: we know that we can make progress by
69 treating dictionary arguments as static and worth specialising on. So
70 we can do without binding-time analysis, and instead specialise on
71 dictionary arguments and no others.
72
73 The basic idea
74 ~~~~~~~~~~~~~~
75 Suppose we have
76
77 let f = <f_rhs>
78 in <body>
79
80 and suppose f is overloaded.
81
82 STEP 1: CALL-INSTANCE COLLECTION
83
84 We traverse <body>, accumulating all applications of f to types and
85 dictionaries.
86
87 (Might there be partial applications, to just some of its types and
88 dictionaries? In principle yes, but in practice the type checker only
89 builds applications of f to all its types and dictionaries, so partial
90 applications could only arise as a result of transformation, and even
91 then I think it's unlikely. In any case, we simply don't accumulate such
92 partial applications.)
93
94
95 STEP 2: EQUIVALENCES
96
97 So now we have a collection of calls to f:
98 f t1 t2 d1 d2
99 f t3 t4 d3 d4
100 ...
101 Notice that f may take several type arguments. To avoid ambiguity, we
102 say that f is called at type t1/t2 and t3/t4.
103
104 We take equivalence classes using equality of the *types* (ignoring
105 the dictionary args, which as mentioned previously are redundant).
106
107 STEP 3: SPECIALISATION
108
109 For each equivalence class, choose a representative (f t1 t2 d1 d2),
110 and create a local instance of f, defined thus:
111
112 f@t1/t2 = <f_rhs> t1 t2 d1 d2
113
114 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
115 of simplification will now result. However we don't actually *do* that
116 simplification. Rather, we leave it for the simplifier to do. If we
117 *did* do it, though, we'd get more call instances from the specialised
118 RHS. We can work out what they are by instantiating the call-instance
119 set from f's RHS with the types t1, t2.
120
121 Add this new id to f's IdInfo, to record that f has a specialised version.
122
123 Before doing any of this, check that f's IdInfo doesn't already
124 tell us about an existing instance of f at the required type/s.
125 (This might happen if specialisation was applied more than once, or
126 it might arise from user SPECIALIZE pragmas.)
127
128 Recursion
129 ~~~~~~~~~
130 Wait a minute! What if f is recursive? Then we can't just plug in
131 its right-hand side, can we?
132
133 But it's ok. The type checker *always* creates non-recursive definitions
134 for overloaded recursive functions. For example:
135
136 f x = f (x+x) -- Yes I know its silly
137
138 becomes
139
140 f a (d::Num a) = let p = +.sel a d
141 in
142 letrec fl (y::a) = fl (p y y)
143 in
144 fl
145
146 We still have recusion for non-overloaded functions which we
147 speciailise, but the recursive call should get specialised to the
148 same recursive version.
149
150
151 Polymorphism 1
152 ~~~~~~~~~~~~~~
153
154 All this is crystal clear when the function is applied to *constant
155 types*; that is, types which have no type variables inside. But what if
156 it is applied to non-constant types? Suppose we find a call of f at type
157 t1/t2. There are two possibilities:
158
159 (a) The free type variables of t1, t2 are in scope at the definition point
160 of f. In this case there's no problem, we proceed just as before. A common
161 example is as follows. Here's the Haskell:
162
163 g y = let f x = x+x
164 in f y + f y
165
166 After typechecking we have
167
168 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
169 in +.sel a d (f a d y) (f a d y)
170
171 Notice that the call to f is at type type "a"; a non-constant type.
172 Both calls to f are at the same type, so we can specialise to give:
173
174 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
175 in +.sel a d (f@a y) (f@a y)
176
177
178 (b) The other case is when the type variables in the instance types
179 are *not* in scope at the definition point of f. The example we are
180 working with above is a good case. There are two instances of (+.sel a d),
181 but "a" is not in scope at the definition of +.sel. Can we do anything?
182 Yes, we can "common them up", a sort of limited common sub-expression deal.
183 This would give:
184
185 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
186 f@a (x::a) = +.sel@a x x
187 in +.sel@a (f@a y) (f@a y)
188
189 This can save work, and can't be spotted by the type checker, because
190 the two instances of +.sel weren't originally at the same type.
191
192 Further notes on (b)
193
194 * There are quite a few variations here. For example, the defn of
195 +.sel could be floated ouside the \y, to attempt to gain laziness.
196 It certainly mustn't be floated outside the \d because the d has to
197 be in scope too.
198
199 * We don't want to inline f_rhs in this case, because
200 that will duplicate code. Just commoning up the call is the point.
201
202 * Nothing gets added to +.sel's IdInfo.
203
204 * Don't bother unless the equivalence class has more than one item!
205
206 Not clear whether this is all worth it. It is of course OK to
207 simply discard call-instances when passing a big lambda.
208
209 Polymorphism 2 -- Overloading
210 ~~~~~~~~~~~~~~
211 Consider a function whose most general type is
212
213 f :: forall a b. Ord a => [a] -> b -> b
214
215 There is really no point in making a version of g at Int/Int and another
216 at Int/Bool, because it's only instancing the type variable "a" which
217 buys us any efficiency. Since g is completely polymorphic in b there
218 ain't much point in making separate versions of g for the different
219 b types.
220
221 That suggests that we should identify which of g's type variables
222 are constrained (like "a") and which are unconstrained (like "b").
223 Then when taking equivalence classes in STEP 2, we ignore the type args
224 corresponding to unconstrained type variable. In STEP 3 we make
225 polymorphic versions. Thus:
226
227 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
228
229 We do this.
230
231
232 Dictionary floating
233 ~~~~~~~~~~~~~~~~~~~
234 Consider this
235
236 f a (d::Num a) = let g = ...
237 in
238 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
239
240 Here, g is only called at one type, but the dictionary isn't in scope at the
241 definition point for g. Usually the type checker would build a
242 definition for d1 which enclosed g, but the transformation system
243 might have moved d1's defn inward. Solution: float dictionary bindings
244 outwards along with call instances.
245
246 Consider
247
248 f x = let g p q = p==q
249 h r s = (r+s, g r s)
250 in
251 h x x
252
253
254 Before specialisation, leaving out type abstractions we have
255
256 f df x = let g :: Eq a => a -> a -> Bool
257 g dg p q = == dg p q
258 h :: Num a => a -> a -> (a, Bool)
259 h dh r s = let deq = eqFromNum dh
260 in (+ dh r s, g deq r s)
261 in
262 h df x x
263
264 After specialising h we get a specialised version of h, like this:
265
266 h' r s = let deq = eqFromNum df
267 in (+ df r s, g deq r s)
268
269 But we can't naively make an instance for g from this, because deq is not in scope
270 at the defn of g. Instead, we have to float out the (new) defn of deq
271 to widen its scope. Notice that this floating can't be done in advance -- it only
272 shows up when specialisation is done.
273
274 User SPECIALIZE pragmas
275 ~~~~~~~~~~~~~~~~~~~~~~~
276 Specialisation pragmas can be digested by the type checker, and implemented
277 by adding extra definitions along with that of f, in the same way as before
278
279 f@t1/t2 = <f_rhs> t1 t2 d1 d2
280
281 Indeed the pragmas *have* to be dealt with by the type checker, because
282 only it knows how to build the dictionaries d1 and d2! For example
283
284 g :: Ord a => [a] -> [a]
285 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
286
287 Here, the specialised version of g is an application of g's rhs to the
288 Ord dictionary for (Tree Int), which only the type checker can conjure
289 up. There might not even *be* one, if (Tree Int) is not an instance of
290 Ord! (All the other specialision has suitable dictionaries to hand
291 from actual calls.)
292
293 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
294 it is buried in a complex (as-yet-un-desugared) binding group.
295 Maybe we should say
296
297 f@t1/t2 = f* t1 t2 d1 d2
298
299 where f* is the Id f with an IdInfo which says "inline me regardless!".
300 Indeed all the specialisation could be done in this way.
301 That in turn means that the simplifier has to be prepared to inline absolutely
302 any in-scope let-bound thing.
303
304
305 Again, the pragma should permit polymorphism in unconstrained variables:
306
307 h :: Ord a => [a] -> b -> b
308 {-# SPECIALIZE h :: [Int] -> b -> b #-}
309
310 We *insist* that all overloaded type variables are specialised to ground types,
311 (and hence there can be no context inside a SPECIALIZE pragma).
312 We *permit* unconstrained type variables to be specialised to
313 - a ground type
314 - or left as a polymorphic type variable
315 but nothing in between. So
316
317 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
318
319 is *illegal*. (It can be handled, but it adds complication, and gains the
320 programmer nothing.)
321
322
323 SPECIALISING INSTANCE DECLARATIONS
324 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
325 Consider
326
327 instance Foo a => Foo [a] where
328 ...
329 {-# SPECIALIZE instance Foo [Int] #-}
330
331 The original instance decl creates a dictionary-function
332 definition:
333
334 dfun.Foo.List :: forall a. Foo a -> Foo [a]
335
336 The SPECIALIZE pragma just makes a specialised copy, just as for
337 ordinary function definitions:
338
339 dfun.Foo.List@Int :: Foo [Int]
340 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
341
342 The information about what instance of the dfun exist gets added to
343 the dfun's IdInfo in the same way as a user-defined function too.
344
345
346 Automatic instance decl specialisation?
347 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
348 Can instance decls be specialised automatically? It's tricky.
349 We could collect call-instance information for each dfun, but
350 then when we specialised their bodies we'd get new call-instances
351 for ordinary functions; and when we specialised their bodies, we might get
352 new call-instances of the dfuns, and so on. This all arises because of
353 the unrestricted mutual recursion between instance decls and value decls.
354
355 Still, there's no actual problem; it just means that we may not do all
356 the specialisation we could theoretically do.
357
358 Furthermore, instance decls are usually exported and used non-locally,
359 so we'll want to compile enough to get those specialisations done.
360
361 Lastly, there's no such thing as a local instance decl, so we can
362 survive solely by spitting out *usage* information, and then reading that
363 back in as a pragma when next compiling the file. So for now,
364 we only specialise instance decls in response to pragmas.
365
366
367 SPITTING OUT USAGE INFORMATION
368 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
369
370 To spit out usage information we need to traverse the code collecting
371 call-instance information for all imported (non-prelude?) functions
372 and data types. Then we equivalence-class it and spit it out.
373
374 This is done at the top-level when all the call instances which escape
375 must be for imported functions and data types.
376
377 *** Not currently done ***
378
379
380 Partial specialisation by pragmas
381 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
382 What about partial specialisation:
383
384 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
385 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
386
387 or even
388
389 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
390
391 Seems quite reasonable. Similar things could be done with instance decls:
392
393 instance (Foo a, Foo b) => Foo (a,b) where
394 ...
395 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
396 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
397
398 Ho hum. Things are complex enough without this. I pass.
399
400
401 Requirements for the simplifier
402 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
403 The simplifier has to be able to take advantage of the specialisation.
404
405 * When the simplifier finds an application of a polymorphic f, it looks in
406 f's IdInfo in case there is a suitable instance to call instead. This converts
407
408 f t1 t2 d1 d2 ===> f_t1_t2
409
410 Note that the dictionaries get eaten up too!
411
412 * Dictionary selection operations on constant dictionaries must be
413 short-circuited:
414
415 +.sel Int d ===> +Int
416
417 The obvious way to do this is in the same way as other specialised
418 calls: +.sel has inside it some IdInfo which tells that if it's applied
419 to the type Int then it should eat a dictionary and transform to +Int.
420
421 In short, dictionary selectors need IdInfo inside them for constant
422 methods.
423
424 * Exactly the same applies if a superclass dictionary is being
425 extracted:
426
427 Eq.sel Int d ===> dEqInt
428
429 * Something similar applies to dictionary construction too. Suppose
430 dfun.Eq.List is the function taking a dictionary for (Eq a) to
431 one for (Eq [a]). Then we want
432
433 dfun.Eq.List Int d ===> dEq.List_Int
434
435 Where does the Eq [Int] dictionary come from? It is built in
436 response to a SPECIALIZE pragma on the Eq [a] instance decl.
437
438 In short, dfun Ids need IdInfo with a specialisation for each
439 constant instance of their instance declaration.
440
441 All this uses a single mechanism: the SpecEnv inside an Id
442
443
444 What does the specialisation IdInfo look like?
445 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
446
447 The SpecEnv of an Id maps a list of types (the template) to an expression
448
449 [Type] |-> Expr
450
451 For example, if f has this RuleInfo:
452
453 [Int, a] -> \d:Ord Int. f' a
454
455 it means that we can replace the call
456
457 f Int t ===> (\d. f' t)
458
459 This chucks one dictionary away and proceeds with the
460 specialised version of f, namely f'.
461
462
463 What can't be done this way?
464 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
465 There is no way, post-typechecker, to get a dictionary for (say)
466 Eq a from a dictionary for Eq [a]. So if we find
467
468 ==.sel [t] d
469
470 we can't transform to
471
472 eqList (==.sel t d')
473
474 where
475 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
476
477 Of course, we currently have no way to automatically derive
478 eqList, nor to connect it to the Eq [a] instance decl, but you
479 can imagine that it might somehow be possible. Taking advantage
480 of this is permanently ruled out.
481
482 Still, this is no great hardship, because we intend to eliminate
483 overloading altogether anyway!
484
485 A note about non-tyvar dictionaries
486 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
487 Some Ids have types like
488
489 forall a,b,c. Eq a -> Ord [a] -> tau
490
491 This seems curious at first, because we usually only have dictionary
492 args whose types are of the form (C a) where a is a type variable.
493 But this doesn't hold for the functions arising from instance decls,
494 which sometimes get arguments with types of form (C (T a)) for some
495 type constructor T.
496
497 Should we specialise wrt this compound-type dictionary? We used to say
498 "no", saying:
499 "This is a heuristic judgement, as indeed is the fact that we
500 specialise wrt only dictionaries. We choose *not* to specialise
501 wrt compound dictionaries because at the moment the only place
502 they show up is in instance decls, where they are simply plugged
503 into a returned dictionary. So nothing is gained by specialising
504 wrt them."
505
506 But it is simpler and more uniform to specialise wrt these dicts too;
507 and in future GHC is likely to support full fledged type signatures
508 like
509 f :: Eq [(a,b)] => ...
510
511
512 ************************************************************************
513 * *
514 \subsubsection{The new specialiser}
515 * *
516 ************************************************************************
517
518 Our basic game plan is this. For let(rec) bound function
519 f :: (C a, D c) => (a,b,c,d) -> Bool
520
521 * Find any specialised calls of f, (f ts ds), where
522 ts are the type arguments t1 .. t4, and
523 ds are the dictionary arguments d1 .. d2.
524
525 * Add a new definition for f1 (say):
526
527 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
528
529 Note that we abstract over the unconstrained type arguments.
530
531 * Add the mapping
532
533 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
534
535 to the specialisations of f. This will be used by the
536 simplifier to replace calls
537 (f t1 t2 t3 t4) da db
538 by
539 (\d1 d1 -> f1 t2 t4) da db
540
541 All the stuff about how many dictionaries to discard, and what types
542 to apply the specialised function to, are handled by the fact that the
543 SpecEnv contains a template for the result of the specialisation.
544
545 We don't build *partial* specialisations for f. For example:
546
547 f :: Eq a => a -> a -> Bool
548 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
549
550 Here, little is gained by making a specialised copy of f.
551 There's a distinct danger that the specialised version would
552 first build a dictionary for (Eq b, Eq c), and then select the (==)
553 method from it! Even if it didn't, not a great deal is saved.
554
555 We do, however, generate polymorphic, but not overloaded, specialisations:
556
557 f :: Eq a => [a] -> b -> b -> b
558 ... SPECIALISE f :: [Int] -> b -> b -> b ...
559
560 Hence, the invariant is this:
561
562 *** no specialised version is overloaded ***
563
564
565 ************************************************************************
566 * *
567 \subsubsection{The exported function}
568 * *
569 ************************************************************************
570 -}
571
572 -- | Specialise calls to type-class overloaded functions occuring in a program.
573 specProgram :: ModGuts -> CoreM ModGuts
574 specProgram guts@(ModGuts { mg_module = this_mod
575 , mg_rules = local_rules
576 , mg_binds = binds })
577 = do { dflags <- getDynFlags
578
579 -- Specialise the bindings of this module
580 ; (binds', uds) <- runSpecM dflags this_mod (go binds)
581
582 -- Specialise imported functions
583 ; hpt_rules <- getRuleBase
584 ; let rule_base = extendRuleBaseList hpt_rules local_rules
585 ; (new_rules, spec_binds) <- specImports dflags this_mod top_env emptyVarSet
586 [] rule_base (ud_calls uds)
587
588 -- Don't forget to wrap the specialized bindings with bindings
589 -- for the needed dictionaries.
590 -- See Note [Wrap bindings returned by specImports]
591 ; let spec_binds' = wrapDictBinds (ud_binds uds) spec_binds
592
593 ; let final_binds
594 | null spec_binds' = binds'
595 | otherwise = Rec (flattenBinds spec_binds') : binds'
596 -- Note [Glom the bindings if imported functions are specialised]
597
598 ; return (guts { mg_binds = final_binds
599 , mg_rules = new_rules ++ local_rules }) }
600 where
601 -- We need to start with a Subst that knows all the things
602 -- that are in scope, so that the substitution engine doesn't
603 -- accidentally re-use a unique that's already in use
604 -- Easiest thing is to do it all at once, as if all the top-level
605 -- decls were mutually recursive
606 top_env = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
607 bindersOfBinds binds
608 , se_interesting = emptyVarSet }
609
610 go [] = return ([], emptyUDs)
611 go (bind:binds) = do (binds', uds) <- go binds
612 (bind', uds') <- specBind top_env bind uds
613 return (bind' ++ binds', uds')
614
615 {-
616 Note [Wrap bindings returned by specImports]
617 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
618 'specImports' returns a set of specialized bindings. However, these are lacking
619 necessary floated dictionary bindings, which are returned by
620 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
621 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
622 for instance, the 'specImports' call in 'specProgram'.
623
624
625 Note [Disabling cross-module specialisation]
626 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
627 Since GHC 7.10 we have performed specialisation of INLINABLE bindings living
628 in modules outside of the current module. This can sometimes uncover user code
629 which explodes in size when aggressively optimized. The
630 -fno-cross-module-specialise option was introduced to allow users to being
631 bitten by such instances to revert to the pre-7.10 behavior.
632
633 See Trac #10491
634 -}
635
636 -- | Specialise a set of calls to imported bindings
637 specImports :: DynFlags
638 -> Module
639 -> SpecEnv -- Passed in so that all top-level Ids are in scope
640 -> VarSet -- Don't specialise these ones
641 -- See Note [Avoiding recursive specialisation]
642 -> [Id] -- Stack of imported functions being specialised
643 -> RuleBase -- Rules from this module and the home package
644 -- (but not external packages, which can change)
645 -> CallDetails -- Calls for imported things, and floating bindings
646 -> CoreM ( [CoreRule] -- New rules
647 , [CoreBind] ) -- Specialised bindings
648 -- See Note [Wrapping bindings returned by specImports]
649 specImports dflags this_mod top_env done callers rule_base cds
650 -- See Note [Disabling cross-module specialisation]
651 | not $ gopt Opt_CrossModuleSpecialise dflags =
652 return ([], [])
653
654 | otherwise =
655 do { let import_calls = dVarEnvElts cds
656 ; (rules, spec_binds) <- go rule_base import_calls
657 ; return (rules, spec_binds) }
658 where
659 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
660 go _ [] = return ([], [])
661 go rb (cis@(CIS fn _calls_for_fn) : other_calls)
662 = do { (rules1, spec_binds1) <- specImport dflags this_mod top_env
663 done callers rb fn $
664 ciSetToList cis
665 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
666 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
667
668 specImport :: DynFlags
669 -> Module
670 -> SpecEnv -- Passed in so that all top-level Ids are in scope
671 -> VarSet -- Don't specialise these
672 -- See Note [Avoiding recursive specialisation]
673 -> [Id] -- Stack of imported functions being specialised
674 -> RuleBase -- Rules from this module
675 -> Id -> [CallInfo] -- Imported function and calls for it
676 -> CoreM ( [CoreRule] -- New rules
677 , [CoreBind] ) -- Specialised bindings
678 specImport dflags this_mod top_env done callers rb fn calls_for_fn
679 | fn `elemVarSet` done
680 = return ([], []) -- No warning. This actually happens all the time
681 -- when specialising a recursive function, because
682 -- the RHS of the specialised function contains a recursive
683 -- call to the original function
684
685 | null calls_for_fn -- We filtered out all the calls in deleteCallsMentioning
686 = return ([], [])
687
688 | wantSpecImport dflags unfolding
689 , Just rhs <- maybeUnfoldingTemplate unfolding
690 = do { -- Get rules from the external package state
691 -- We keep doing this in case we "page-fault in"
692 -- more rules as we go along
693 ; hsc_env <- getHscEnv
694 ; eps <- liftIO $ hscEPS hsc_env
695 ; vis_orphs <- getVisibleOrphanMods
696 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
697 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
698
699 ; (rules1, spec_pairs, uds) <- -- pprTrace "specImport1" (vcat [ppr fn, ppr calls_for_fn, ppr rhs]) $
700 runSpecM dflags this_mod $
701 specCalls (Just this_mod) top_env rules_for_fn calls_for_fn fn rhs
702 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
703 -- After the rules kick in we may get recursion, but
704 -- we rely on a global GlomBinds to sort that out later
705 -- See Note [Glom the bindings if imported functions are specialised]
706
707 -- Now specialise any cascaded calls
708 ; (rules2, spec_binds2) <- -- pprTrace "specImport 2" (ppr fn $$ ppr rules1 $$ ppr spec_binds1) $
709 specImports dflags this_mod top_env
710 (extendVarSet done fn)
711 (fn:callers)
712 (extendRuleBaseList rb rules1)
713 (ud_calls uds)
714
715 -- Don't forget to wrap the specialized bindings with bindings
716 -- for the needed dictionaries
717 -- See Note [Wrap bindings returned by specImports]
718 ; let final_binds = wrapDictBinds (ud_binds uds)
719 (spec_binds2 ++ spec_binds1)
720
721 ; return (rules2 ++ rules1, final_binds) }
722
723 | warnMissingSpecs dflags callers
724 = do { warnMsg (vcat [ hang (text "Could not specialise imported function" <+> quotes (ppr fn))
725 2 (vcat [ text "when specialising" <+> quotes (ppr caller)
726 | caller <- callers])
727 , ifPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
728 , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ])
729 ; return ([], []) }
730
731 | otherwise
732 = return ([], [])
733 where
734 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
735
736 warnMissingSpecs :: DynFlags -> [Id] -> Bool
737 -- See Note [Warning about missed specialisations]
738 warnMissingSpecs dflags callers
739 | wopt Opt_WarnAllMissedSpecs dflags = True
740 | not (wopt Opt_WarnMissedSpecs dflags) = False
741 | null callers = False
742 | otherwise = all has_inline_prag callers
743 where
744 has_inline_prag id = isAnyInlinePragma (idInlinePragma id)
745
746 wantSpecImport :: DynFlags -> Unfolding -> Bool
747 -- See Note [Specialise imported INLINABLE things]
748 wantSpecImport dflags unf
749 = case unf of
750 NoUnfolding -> False
751 BootUnfolding -> False
752 OtherCon {} -> False
753 DFunUnfolding {} -> True
754 CoreUnfolding { uf_src = src, uf_guidance = _guidance }
755 | gopt Opt_SpecialiseAggressively dflags -> True
756 | isStableSource src -> True
757 -- Specialise even INLINE things; it hasn't inlined yet,
758 -- so perhaps it never will. Moreover it may have calls
759 -- inside it that we want to specialise
760 | otherwise -> False -- Stable, not INLINE, hence INLINABLE
761
762 {- Note [Warning about missed specialisations]
763 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
764 Suppose
765 * In module Lib, you carefully mark a function 'foo' INLINABLE
766 * Import Lib(foo) into another module M
767 * Call 'foo' at some specialised type in M
768 Then you jolly well expect it to be specialised in M. But what if
769 'foo' calls another function 'Lib.bar'. Then you'd like 'bar' to be
770 specialised too. But if 'bar' is not marked INLINABLE it may well
771 not be specialised. The warning Opt_WarnMissedSpecs warns about this.
772
773 It's more noisy to warning about a missed specialisation opportunity
774 for /every/ overloaded imported function, but sometimes useful. That
775 is what Opt_WarnAllMissedSpecs does.
776
777 ToDo: warn about missed opportunities for local functions.
778
779 Note [Specialise imported INLINABLE things]
780 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
781 What imported functions do we specialise? The basic set is
782 * DFuns and things with INLINABLE pragmas.
783 but with -fspecialise-aggressively we add
784 * Anything with an unfolding template
785
786 Trac #8874 has a good example of why we want to auto-specialise DFuns.
787
788 We have the -fspecialise-aggressively flag (usually off), because we
789 risk lots of orphan modules from over-vigorous specialisation.
790 However it's not a big deal: anything non-recursive with an
791 unfolding-template will probably have been inlined already.
792
793 Note [Glom the bindings if imported functions are specialised]
794 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
795 Suppose we have an imported, *recursive*, INLINABLE function
796 f :: Eq a => a -> a
797 f = /\a \d x. ...(f a d)...
798 In the module being compiled we have
799 g x = f (x::Int)
800 Now we'll make a specialised function
801 f_spec :: Int -> Int
802 f_spec = \x -> ...(f Int dInt)...
803 {-# RULE f Int _ = f_spec #-}
804 g = \x. f Int dInt x
805 Note that f_spec doesn't look recursive
806 After rewriting with the RULE, we get
807 f_spec = \x -> ...(f_spec)...
808 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
809 The occurrence analyser never turns a NonRec into a Rec. So we must
810 make sure that f_spec is recursive. Easiest thing is to make all
811 the specialisations for imported bindings recursive.
812
813
814 Note [Avoiding recursive specialisation]
815 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
816 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
817 'f's RHS. So we want to specialise g,h. But we don't want to
818 specialise f any more! It's possible that f's RHS might have a
819 recursive yet-more-specialised call, so we'd diverge in that case.
820 And if the call is to the same type, one specialisation is enough.
821 Avoiding this recursive specialisation loop is the reason for the
822 'done' VarSet passed to specImports and specImport.
823
824 ************************************************************************
825 * *
826 \subsubsection{@specExpr@: the main function}
827 * *
828 ************************************************************************
829 -}
830
831 data SpecEnv
832 = SE { se_subst :: CoreSubst.Subst
833 -- We carry a substitution down:
834 -- a) we must clone any binding that might float outwards,
835 -- to avoid name clashes
836 -- b) we carry a type substitution to use when analysing
837 -- the RHS of specialised bindings (no type-let!)
838
839
840 , se_interesting :: VarSet
841 -- Dict Ids that we know something about
842 -- and hence may be worth specialising against
843 -- See Note [Interesting dictionary arguments]
844 }
845
846 specVar :: SpecEnv -> Id -> CoreExpr
847 specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v
848
849 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
850
851 ---------------- First the easy cases --------------------
852 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
853 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
854 specExpr env (Var v) = return (specVar env v, emptyUDs)
855 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
856 specExpr env (Cast e co)
857 = do { (e', uds) <- specExpr env e
858 ; return ((mkCast e' (substCo env co)), uds) }
859 specExpr env (Tick tickish body)
860 = do { (body', uds) <- specExpr env body
861 ; return (Tick (specTickish env tickish) body', uds) }
862
863 ---------------- Applications might generate a call instance --------------------
864 specExpr env expr@(App {})
865 = go expr []
866 where
867 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
868 (fun', uds_app) <- go fun (arg':args)
869 return (App fun' arg', uds_arg `plusUDs` uds_app)
870
871 go (Var f) args = case specVar env f of
872 Var f' -> return (Var f', mkCallUDs env f' args)
873 e' -> return (e', emptyUDs) -- I don't expect this!
874 go other _ = specExpr env other
875
876 ---------------- Lambda/case require dumping of usage details --------------------
877 specExpr env e@(Lam _ _) = do
878 (body', uds) <- specExpr env' body
879 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
880 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
881 where
882 (bndrs, body) = collectBinders e
883 (env', bndrs') = substBndrs env bndrs
884 -- More efficient to collect a group of binders together all at once
885 -- and we don't want to split a lambda group with dumped bindings
886
887 specExpr env (Case scrut case_bndr ty alts)
888 = do { (scrut', scrut_uds) <- specExpr env scrut
889 ; (scrut'', case_bndr', alts', alts_uds)
890 <- specCase env scrut' case_bndr alts
891 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
892 , scrut_uds `plusUDs` alts_uds) }
893
894 ---------------- Finally, let is the interesting case --------------------
895 specExpr env (Let bind body)
896 = do { -- Clone binders
897 (rhs_env, body_env, bind') <- cloneBindSM env bind
898
899 -- Deal with the body
900 ; (body', body_uds) <- specExpr body_env body
901
902 -- Deal with the bindings
903 ; (binds', uds) <- specBind rhs_env bind' body_uds
904
905 -- All done
906 ; return (foldr Let body' binds', uds) }
907
908 specTickish :: SpecEnv -> Tickish Id -> Tickish Id
909 specTickish env (Breakpoint ix ids)
910 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
911 -- drop vars from the list if they have a non-variable substitution.
912 -- should never happen, but it's harmless to drop them anyway.
913 specTickish _ other_tickish = other_tickish
914
915 specCase :: SpecEnv
916 -> CoreExpr -- Scrutinee, already done
917 -> Id -> [CoreAlt]
918 -> SpecM ( CoreExpr -- New scrutinee
919 , Id
920 , [CoreAlt]
921 , UsageDetails)
922 specCase env scrut' case_bndr [(con, args, rhs)]
923 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
924 , interestingDict env scrut'
925 , not (isDeadBinder case_bndr && null sc_args')
926 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
927
928 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
929 [(con, args', Var sc_arg')]
930 | sc_arg' <- sc_args' ]
931
932 -- Extend the substitution for RHS to map the *original* binders
933 -- to their floated verions.
934 mb_sc_flts :: [Maybe DictId]
935 mb_sc_flts = map (lookupVarEnv clone_env) args'
936 clone_env = zipVarEnv sc_args' sc_args_flt
937 subst_prs = (case_bndr, Var case_bndr_flt)
938 : [ (arg, Var sc_flt)
939 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
940 env_rhs' = env_rhs { se_subst = CoreSubst.extendIdSubstList (se_subst env_rhs) subst_prs
941 , se_interesting = se_interesting env_rhs `extendVarSetList`
942 (case_bndr_flt : sc_args_flt) }
943
944 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
945 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
946 case_bndr_set = unitVarSet case_bndr_flt
947 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
948 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
949 flt_binds = scrut_bind : sc_binds
950 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
951 all_uds = flt_binds `addDictBinds` free_uds
952 alt' = (con, args', wrapDictBindsE dumped_dbs rhs')
953 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
954 where
955 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
956 sc_args' = filter is_flt_sc_arg args'
957
958 clone_me bndr = do { uniq <- getUniqueM
959 ; return (mkUserLocalOrCoVar occ uniq ty loc) }
960 where
961 name = idName bndr
962 ty = idType bndr
963 occ = nameOccName name
964 loc = getSrcSpan name
965
966 arg_set = mkVarSet args'
967 is_flt_sc_arg var = isId var
968 && not (isDeadBinder var)
969 && isDictTy var_ty
970 && not (tyCoVarsOfType var_ty `intersectsVarSet` arg_set)
971 where
972 var_ty = idType var
973
974
975 specCase env scrut case_bndr alts
976 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
977 ; return (scrut, case_bndr', alts', uds_alts) }
978 where
979 (env_alt, case_bndr') = substBndr env case_bndr
980 spec_alt (con, args, rhs) = do
981 (rhs', uds) <- specExpr env_rhs rhs
982 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
983 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
984 where
985 (env_rhs, args') = substBndrs env_alt args
986
987 {-
988 Note [Floating dictionaries out of cases]
989 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
990 Consider
991 g = \d. case d of { MkD sc ... -> ...(f sc)... }
992 Naively we can't float d2's binding out of the case expression,
993 because 'sc' is bound by the case, and that in turn means we can't
994 specialise f, which seems a pity.
995
996 So we invert the case, by floating out a binding
997 for 'sc_flt' thus:
998 sc_flt = case d of { MkD sc ... -> sc }
999 Now we can float the call instance for 'f'. Indeed this is just
1000 what'll happen if 'sc' was originally bound with a let binding,
1001 but case is more efficient, and necessary with equalities. So it's
1002 good to work with both.
1003
1004 You might think that this won't make any difference, because the
1005 call instance will only get nuked by the \d. BUT if 'g' itself is
1006 specialised, then transitively we should be able to specialise f.
1007
1008 In general, given
1009 case e of cb { MkD sc ... -> ...(f sc)... }
1010 we transform to
1011 let cb_flt = e
1012 sc_flt = case cb_flt of { MkD sc ... -> sc }
1013 in
1014 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1015
1016 The "_flt" things are the floated binds; we use the current substitution
1017 to substitute sc -> sc_flt in the RHS
1018
1019 ************************************************************************
1020 * *
1021 Dealing with a binding
1022 * *
1023 ************************************************************************
1024 -}
1025
1026 specBind :: SpecEnv -- Use this for RHSs
1027 -> CoreBind -- Binders are already cloned by cloneBindSM,
1028 -- but RHSs are un-processed
1029 -> UsageDetails -- Info on how the scope of the binding
1030 -> SpecM ([CoreBind], -- New bindings
1031 UsageDetails) -- And info to pass upstream
1032
1033 -- Returned UsageDetails:
1034 -- No calls for binders of this bind
1035 specBind rhs_env (NonRec fn rhs) body_uds
1036 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1037 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds fn rhs
1038
1039 ; let pairs = spec_defns ++ [(fn', rhs')]
1040 -- fn' mentions the spec_defns in its rules,
1041 -- so put the latter first
1042
1043 combined_uds = body_uds1 `plusUDs` rhs_uds
1044 -- This way round a call in rhs_uds of a function f
1045 -- at type T will override a call of f at T in body_uds1; and
1046 -- that is good because it'll tend to keep "earlier" calls
1047 -- See Note [Specialisation of dictionary functions]
1048
1049 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1050 -- See Note [From non-recursive to recursive]
1051
1052 final_binds :: [DictBind]
1053 final_binds
1054 | isEmptyBag dump_dbs = [mkDB $ NonRec b r | (b,r) <- pairs]
1055 | otherwise = [flattenDictBinds dump_dbs pairs]
1056
1057 ; if float_all then
1058 -- Rather than discard the calls mentioning the bound variables
1059 -- we float this binding along with the others
1060 return ([], free_uds `snocDictBinds` final_binds)
1061 else
1062 -- No call in final_uds mentions bound variables,
1063 -- so we can just leave the binding here
1064 return (map fst final_binds, free_uds) }
1065
1066
1067 specBind rhs_env (Rec pairs) body_uds
1068 -- Note [Specialising a recursive group]
1069 = do { let (bndrs,rhss) = unzip pairs
1070 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1071 ; let scope_uds = body_uds `plusUDs` rhs_uds
1072 -- Includes binds and calls arising from rhss
1073
1074 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1075
1076 ; (bndrs3, spec_defns3, uds3)
1077 <- if null spec_defns1 -- Common case: no specialisation
1078 then return (bndrs1, [], uds1)
1079 else do { -- Specialisation occurred; do it again
1080 (bndrs2, spec_defns2, uds2)
1081 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1082 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1083
1084 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1085 bind = flattenDictBinds dumped_dbs
1086 (spec_defns3 ++ zip bndrs3 rhss')
1087
1088 ; if float_all then
1089 return ([], final_uds `snocDictBind` bind)
1090 else
1091 return ([fst bind], final_uds) }
1092
1093
1094 ---------------------------
1095 specDefns :: SpecEnv
1096 -> UsageDetails -- Info on how it is used in its scope
1097 -> [(OutId,InExpr)] -- The things being bound and their un-processed RHS
1098 -> SpecM ([OutId], -- Original Ids with RULES added
1099 [(OutId,OutExpr)], -- Extra, specialised bindings
1100 UsageDetails) -- Stuff to fling upwards from the specialised versions
1101
1102 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1103 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1104 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1105 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1106 -- But not vice versa (it's a fixpoint problem).
1107
1108 specDefns _env uds []
1109 = return ([], [], uds)
1110 specDefns env uds ((bndr,rhs):pairs)
1111 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1112 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1113 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1114
1115 ---------------------------
1116 specDefn :: SpecEnv
1117 -> UsageDetails -- Info on how it is used in its scope
1118 -> OutId -> InExpr -- The thing being bound and its un-processed RHS
1119 -> SpecM (Id, -- Original Id with added RULES
1120 [(Id,CoreExpr)], -- Extra, specialised bindings
1121 UsageDetails) -- Stuff to fling upwards from the specialised versions
1122
1123 specDefn env body_uds fn rhs
1124 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1125 rules_for_me = idCoreRules fn
1126 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1127 calls_for_me fn rhs
1128 ; return ( fn `addIdSpecialisations` rules
1129 , spec_defns
1130 , body_uds_without_me `plusUDs` spec_uds) }
1131 -- It's important that the `plusUDs` is this way
1132 -- round, because body_uds_without_me may bind
1133 -- dictionaries that are used in calls_for_me passed
1134 -- to specDefn. So the dictionary bindings in
1135 -- spec_uds may mention dictionaries bound in
1136 -- body_uds_without_me
1137
1138 ---------------------------
1139 specCalls :: Maybe Module -- Just this_mod => specialising imported fn
1140 -- Nothing => specialising local fn
1141 -> SpecEnv
1142 -> [CoreRule] -- Existing RULES for the fn
1143 -> [CallInfo]
1144 -> OutId -> InExpr
1145 -> SpecM ([CoreRule], -- New RULES for the fn
1146 [(Id,CoreExpr)], -- Extra, specialised bindings
1147 UsageDetails) -- New usage details from the specialised RHSs
1148
1149 -- This function checks existing rules, and does not create
1150 -- duplicate ones. So the caller does not need to do this filtering.
1151 -- See 'already_covered'
1152
1153 specCalls mb_mod env rules_for_me calls_for_me fn rhs
1154 -- The first case is the interesting one
1155 | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas
1156 && rhs_ids `lengthAtLeast` n_dicts -- and enough dict args
1157 && notNull calls_for_me -- And there are some calls to specialise
1158 && not (isNeverActive (idInlineActivation fn))
1159 -- Don't specialise NOINLINE things
1160 -- See Note [Auto-specialisation and RULES]
1161
1162 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1163 -- See Note [Inline specialisation] for why we do not
1164 -- switch off specialisation for inline functions
1165
1166 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr rules_for_me) $
1167 do { stuff <- mapM spec_call calls_for_me
1168 ; let (spec_defns, spec_uds, spec_rules) = unzip3 (catMaybes stuff)
1169 ; return (spec_rules, spec_defns, plusUDList spec_uds) }
1170
1171 | otherwise -- No calls or RHS doesn't fit our preconceptions
1172 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1173 text "Missed specialisation opportunity for"
1174 <+> ppr fn $$ _trace_doc )
1175 -- Note [Specialisation shape]
1176 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1177 return ([], [], emptyUDs)
1178 where
1179 _trace_doc = sep [ ppr rhs_tyvars, ppr n_tyvars
1180 , ppr rhs_ids, ppr n_dicts
1181 , ppr (idInlineActivation fn) ]
1182
1183 fn_type = idType fn
1184 fn_arity = idArity fn
1185 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1186 (tyvars, theta, _) = tcSplitSigmaTy fn_type
1187 n_tyvars = length tyvars
1188 n_dicts = length theta
1189 inl_prag = idInlinePragma fn
1190 inl_act = inlinePragmaActivation inl_prag
1191 is_local = isLocalId fn
1192
1193 -- Figure out whether the function has an INLINE pragma
1194 -- See Note [Inline specialisations]
1195
1196 (rhs_tyvars, rhs_ids, rhs_body) = collectTyAndValBinders rhs
1197
1198 rhs_dict_ids = take n_dicts rhs_ids
1199 body = mkLams (drop n_dicts rhs_ids) rhs_body
1200 -- Glue back on the non-dict lambdas
1201
1202 already_covered :: DynFlags -> [CoreExpr] -> Bool
1203 already_covered dflags args -- Note [Specialisations already covered]
1204 = isJust (lookupRule dflags
1205 (CoreSubst.substInScope (se_subst env), realIdUnfolding)
1206 (const True)
1207 fn args rules_for_me)
1208
1209 mk_ty_args :: [Maybe Type] -> [TyVar] -> [CoreExpr]
1210 mk_ty_args [] poly_tvs
1211 = ASSERT( null poly_tvs ) []
1212 mk_ty_args (Nothing : call_ts) (poly_tv : poly_tvs)
1213 = Type (mkTyVarTy poly_tv) : mk_ty_args call_ts poly_tvs
1214 mk_ty_args (Just ty : call_ts) poly_tvs
1215 = Type ty : mk_ty_args call_ts poly_tvs
1216 mk_ty_args (Nothing : _) [] = panic "mk_ty_args"
1217
1218 ----------------------------------------------------------
1219 -- Specialise to one particular call pattern
1220 spec_call :: CallInfo -- Call instance
1221 -> SpecM (Maybe ((Id,CoreExpr), -- Specialised definition
1222 UsageDetails, -- Usage details from specialised body
1223 CoreRule)) -- Info for the Id's SpecEnv
1224 spec_call _call_info@(CallKey call_ts, (call_ds, _))
1225 = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts )
1226
1227 -- Suppose f's defn is f = /\ a b c -> \ d1 d2 -> rhs
1228 -- Suppose the call is for f [Just t1, Nothing, Just t3] [dx1, dx2]
1229
1230 -- Construct the new binding
1231 -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b -> rhs)
1232 -- PLUS the rule
1233 -- RULE "SPEC f" forall b d1' d2'. f b d1' d2' = f1 b
1234 -- In the rule, d1' and d2' are just wildcards, not used in the RHS
1235 -- PLUS the usage-details
1236 -- { d1' = dx1; d2' = dx2 }
1237 -- where d1', d2' are cloned versions of d1,d2, with the type substitution
1238 -- applied. These auxiliary bindings just avoid duplication of dx1, dx2
1239 --
1240 -- Note that the substitution is applied to the whole thing.
1241 -- This is convenient, but just slightly fragile. Notably:
1242 -- * There had better be no name clashes in a/b/c
1243 do { let
1244 -- poly_tyvars = [b] in the example above
1245 -- spec_tyvars = [a,c]
1246 -- ty_args = [t1,b,t3]
1247 spec_tv_binds = [(tv,ty) | (tv, Just ty) <- rhs_tyvars `zip` call_ts]
1248 env1 = extendTvSubstList env spec_tv_binds
1249 (rhs_env, poly_tyvars) = substBndrs env1
1250 [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
1251
1252 -- Clone rhs_dicts, including instantiating their types
1253 ; inst_dict_ids <- mapM (newDictBndr rhs_env) rhs_dict_ids
1254 ; let (rhs_env2, dx_binds, spec_dict_args)
1255 = bindAuxiliaryDicts rhs_env rhs_dict_ids call_ds inst_dict_ids
1256 ty_args = mk_ty_args call_ts poly_tyvars
1257 ev_args = map varToCoreExpr inst_dict_ids -- ev_args, ev_bndrs:
1258 ev_bndrs = exprsFreeIdsList ev_args -- See Note [Evidence foralls]
1259 rule_args = ty_args ++ ev_args
1260 rule_bndrs = poly_tyvars ++ ev_bndrs
1261
1262 ; dflags <- getDynFlags
1263 ; if already_covered dflags rule_args then
1264 return Nothing
1265 else -- pprTrace "spec_call" (vcat [ ppr _call_info, ppr fn, ppr rhs_dict_ids
1266 -- , text "rhs_env2" <+> ppr (se_subst rhs_env2)
1267 -- , ppr dx_binds ]) $
1268 do
1269 { -- Figure out the type of the specialised function
1270 let body_ty = applyTypeToArgs rhs fn_type rule_args
1271 (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted
1272 | isUnliftedType body_ty -- C.f. WwLib.mkWorkerArgs
1273 = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [voidPrimId])
1274 | otherwise = (poly_tyvars, poly_tyvars)
1275 spec_id_ty = mkLamTypes lam_args body_ty
1276
1277 ; spec_f <- newSpecIdSM fn spec_id_ty
1278 ; (spec_rhs, rhs_uds) <- specExpr rhs_env2 (mkLams lam_args body)
1279 ; this_mod <- getModule
1280 ; let
1281 -- The rule to put in the function's specialisation is:
1282 -- forall b, d1',d2'. f t1 b t3 d1' d2' = f1 b
1283 herald = case mb_mod of
1284 Nothing -- Specialising local fn
1285 -> text "SPEC"
1286 Just this_mod -- Specialising imoprted fn
1287 -> text "SPEC/" <> ppr this_mod
1288
1289 rule_name = mkFastString $ showSDocForUser dflags neverQualify $
1290 herald <+> ppr fn <+> hsep (map ppr_call_key_ty call_ts)
1291 -- This name ends up in interface files, so use showSDocForUser,
1292 -- otherwise uniques end up there, making builds
1293 -- less deterministic (See #4012 comment:61 ff)
1294
1295 spec_env_rule = mkRule
1296 this_mod
1297 True {- Auto generated -}
1298 is_local
1299 rule_name
1300 inl_act -- Note [Auto-specialisation and RULES]
1301 (idName fn)
1302 rule_bndrs
1303 rule_args
1304 (mkVarApps (Var spec_f) app_args)
1305
1306 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1307 final_uds = foldr consDictBind rhs_uds dx_binds
1308
1309 --------------------------------------
1310 -- Add a suitable unfolding if the spec_inl_prag says so
1311 -- See Note [Inline specialisations]
1312 (spec_inl_prag, spec_unf)
1313 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1314 = (neverInlinePragma, noUnfolding)
1315 -- See Note [Specialising imported functions] in OccurAnal
1316
1317 | InlinePragma { inl_inline = Inlinable } <- inl_prag
1318 = (inl_prag { inl_inline = EmptyInlineSpec }, noUnfolding)
1319
1320 | otherwise
1321 = (inl_prag, specUnfolding poly_tyvars spec_app
1322 arity_decrease fn_unf)
1323
1324 arity_decrease = length spec_dict_args
1325 spec_app e = (e `mkApps` ty_args) `mkApps` spec_dict_args
1326
1327 --------------------------------------
1328 -- Adding arity information just propagates it a bit faster
1329 -- See Note [Arity decrease] in Simplify
1330 -- Copy InlinePragma information from the parent Id.
1331 -- So if f has INLINE[1] so does spec_f
1332 spec_f_w_arity = spec_f `setIdArity` max 0 (fn_arity - n_dicts)
1333 `setInlinePragma` spec_inl_prag
1334 `setIdUnfolding` spec_unf
1335
1336 ; return (Just ((spec_f_w_arity, spec_rhs), final_uds, spec_env_rule)) } }
1337
1338 {- Note [Evidence foralls]
1339 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1340 Suppose (Trac #12212) that we are specialising
1341 f :: forall a b. (Num a, F a ~ F b) => blah
1342 with a=b=Int. Then the RULE will be something like
1343 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1344 f Int Int d g = f_spec
1345 But both varToCoreExpr (when constructing the LHS args), and the
1346 simplifier (when simplifying the LHS args), will transform to
1347 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1348 f Int Int d <F Int> = f_spec
1349 by replacing g with Refl. So now 'g' is unbound, which results in a later
1350 crash. So we use Refl right off the bat, and do not forall-quantify 'g':
1351 * varToCoreExpr generates a Refl
1352 * exprsFreeIdsList returns the Ids bound by the args,
1353 which won't include g
1354
1355 You might wonder if this will match as often, but the simplifier replaces
1356 complicated Refl coercions with Refl pretty aggressively.
1357
1358 Note [Orphans and auto-generated rules]
1359 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1360 When we specialise an INLINABLE function, or when we have
1361 -fspecialise-aggressively, we auto-generate RULES that are orphans.
1362 We don't want to warn about these, or we'd generate a lot of warnings.
1363 Thus, we only warn about user-specified orphan rules.
1364
1365 Indeed, we don't even treat the module as an orphan module if it has
1366 auto-generated *rule* orphans. Orphan modules are read every time we
1367 compile, so they are pretty obtrusive and slow down every compilation,
1368 even non-optimised ones. (Reason: for type class instances it's a
1369 type correctness issue.) But specialisation rules are strictly for
1370 *optimisation* only so it's fine not to read the interface.
1371
1372 What this means is that a SPEC rules from auto-specialisation in
1373 module M will be used in other modules only if M.hi has been read for
1374 some other reason, which is actually pretty likely.
1375 -}
1376
1377 bindAuxiliaryDicts
1378 :: SpecEnv
1379 -> [DictId] -> [CoreExpr] -- Original dict bndrs, and the witnessing expressions
1380 -> [DictId] -- A cloned dict-id for each dict arg
1381 -> (SpecEnv, -- Substitute for all orig_dicts
1382 [DictBind], -- Auxiliary dict bindings
1383 [CoreExpr]) -- Witnessing expressions (all trivial)
1384 -- Bind any dictionary arguments to fresh names, to preserve sharing
1385 bindAuxiliaryDicts env@(SE { se_subst = subst, se_interesting = interesting })
1386 orig_dict_ids call_ds inst_dict_ids
1387 = (env', dx_binds, spec_dict_args)
1388 where
1389 (dx_binds, spec_dict_args) = go call_ds inst_dict_ids
1390 env' = env { se_subst = subst `CoreSubst.extendSubstList`
1391 (orig_dict_ids `zip` spec_dict_args)
1392 `CoreSubst.extendInScopeList` dx_ids
1393 , se_interesting = interesting `unionVarSet` interesting_dicts }
1394
1395 dx_ids = [dx_id | (NonRec dx_id _, _) <- dx_binds]
1396 interesting_dicts = mkVarSet [ dx_id | (NonRec dx_id dx, _) <- dx_binds
1397 , interestingDict env dx ]
1398 -- See Note [Make the new dictionaries interesting]
1399
1400 go :: [CoreExpr] -> [CoreBndr] -> ([DictBind], [CoreExpr])
1401 go [] _ = ([], [])
1402 go (dx:dxs) (dx_id:dx_ids)
1403 | exprIsTrivial dx = (dx_binds, dx : args)
1404 | otherwise = (mkDB (NonRec dx_id dx) : dx_binds, Var dx_id : args)
1405 where
1406 (dx_binds, args) = go dxs dx_ids
1407 -- In the first case extend the substitution but not bindings;
1408 -- in the latter extend the bindings but not the substitution.
1409 -- For the former, note that we bind the *original* dict in the substitution,
1410 -- overriding any d->dx_id binding put there by substBndrs
1411 go _ _ = pprPanic "bindAuxiliaryDicts" (ppr orig_dict_ids $$ ppr call_ds $$ ppr inst_dict_ids)
1412
1413 {-
1414 Note [Make the new dictionaries interesting]
1415 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1416 Important! We're going to substitute dx_id1 for d
1417 and we want it to look "interesting", else we won't gather *any*
1418 consequential calls. E.g.
1419 f d = ...g d....
1420 If we specialise f for a call (f (dfun dNumInt)), we'll get
1421 a consequent call (g d') with an auxiliary definition
1422 d' = df dNumInt
1423 We want that consequent call to look interesting
1424
1425
1426 Note [From non-recursive to recursive]
1427 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1428 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1429 have built a recursive knot
1430
1431 f a d x = <blah>
1432 MkUD { ud_binds = d7 = MkD ..f..
1433 , ud_calls = ...(f T d7)... }
1434
1435 The we generate
1436
1437 Rec { fs x = <blah>[T/a, d7/d]
1438 f a d x = <blah>
1439 RULE f T _ = fs
1440 d7 = ...f... }
1441
1442 Here the recursion is only through the RULE.
1443
1444
1445 Note [Specialisation of dictionary functions]
1446 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1447 Here is a nasty example that bit us badly: see Trac #3591
1448
1449 class Eq a => C a
1450 instance Eq [a] => C [a]
1451
1452 ---------------
1453 dfun :: Eq [a] -> C [a]
1454 dfun a d = MkD a d (meth d)
1455
1456 d4 :: Eq [T] = <blah>
1457 d2 :: C [T] = dfun T d4
1458 d1 :: Eq [T] = $p1 d2
1459 d3 :: C [T] = dfun T d1
1460
1461 None of these definitions is recursive. What happened was that we
1462 generated a specialisation:
1463
1464 RULE forall d. dfun T d = dT :: C [T]
1465 dT = (MkD a d (meth d)) [T/a, d1/d]
1466 = MkD T d1 (meth d1)
1467
1468 But now we use the RULE on the RHS of d2, to get
1469
1470 d2 = dT = MkD d1 (meth d1)
1471 d1 = $p1 d2
1472
1473 and now d1 is bottom! The problem is that when specialising 'dfun' we
1474 should first dump "below" the binding all floated dictionary bindings
1475 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1476 placed below 'dfun', and thus unavailable to it when specialising
1477 'dfun'. That in turn means that the call (dfun T d1) must be
1478 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1479 d4 doesn't mention dfun.
1480
1481 But look at this:
1482
1483 class C a where { foo,bar :: [a] -> [a] }
1484
1485 instance C Int where
1486 foo x = r_bar x
1487 bar xs = reverse xs
1488
1489 r_bar :: C a => [a] -> [a]
1490 r_bar xs = bar (xs ++ xs)
1491
1492 That translates to:
1493
1494 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1495
1496 Rec { $fCInt :: C Int = MkC foo_help reverse
1497 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1498
1499 The call (r_bar $fCInt) mentions $fCInt,
1500 which mentions foo_help,
1501 which mentions r_bar
1502 But we DO want to specialise r_bar at Int:
1503
1504 Rec { $fCInt :: C Int = MkC foo_help reverse
1505 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1506
1507 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1508 RULE r_bar Int _ = r_bar_Int
1509
1510 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1511 }
1512
1513 Note that, because of its RULE, r_bar joins the recursive
1514 group. (In this case it'll unravel a short moment later.)
1515
1516
1517 Conclusion: we catch the nasty case using filter_dfuns in
1518 callsForMe. To be honest I'm not 100% certain that this is 100%
1519 right, but it works. Sigh.
1520
1521
1522 Note [Specialising a recursive group]
1523 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1524 Consider
1525 let rec { f x = ...g x'...
1526 ; g y = ...f y'.... }
1527 in f 'a'
1528 Here we specialise 'f' at Char; but that is very likely to lead to
1529 a specialisation of 'g' at Char. We must do the latter, else the
1530 whole point of specialisation is lost.
1531
1532 But we do not want to keep iterating to a fixpoint, because in the
1533 presence of polymorphic recursion we might generate an infinite number
1534 of specialisations.
1535
1536 So we use the following heuristic:
1537 * Arrange the rec block in dependency order, so far as possible
1538 (the occurrence analyser already does this)
1539
1540 * Specialise it much like a sequence of lets
1541
1542 * Then go through the block a second time, feeding call-info from
1543 the RHSs back in the bottom, as it were
1544
1545 In effect, the ordering maxmimises the effectiveness of each sweep,
1546 and we do just two sweeps. This should catch almost every case of
1547 monomorphic recursion -- the exception could be a very knotted-up
1548 recursion with multiple cycles tied up together.
1549
1550 This plan is implemented in the Rec case of specBindItself.
1551
1552 Note [Specialisations already covered]
1553 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1554 We obviously don't want to generate two specialisations for the same
1555 argument pattern. There are two wrinkles
1556
1557 1. We do the already-covered test in specDefn, not when we generate
1558 the CallInfo in mkCallUDs. We used to test in the latter place, but
1559 we now iterate the specialiser somewhat, and the Id at the call site
1560 might therefore not have all the RULES that we can see in specDefn
1561
1562 2. What about two specialisations where the second is an *instance*
1563 of the first? If the more specific one shows up first, we'll generate
1564 specialisations for both. If the *less* specific one shows up first,
1565 we *don't* currently generate a specialisation for the more specific
1566 one. (See the call to lookupRule in already_covered.) Reasons:
1567 (a) lookupRule doesn't say which matches are exact (bad reason)
1568 (b) if the earlier specialisation is user-provided, it's
1569 far from clear that we should auto-specialise further
1570
1571 Note [Auto-specialisation and RULES]
1572 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1573 Consider:
1574 g :: Num a => a -> a
1575 g = ...
1576
1577 f :: (Int -> Int) -> Int
1578 f w = ...
1579 {-# RULE f g = 0 #-}
1580
1581 Suppose that auto-specialisation makes a specialised version of
1582 g::Int->Int That version won't appear in the LHS of the RULE for f.
1583 So if the specialisation rule fires too early, the rule for f may
1584 never fire.
1585
1586 It might be possible to add new rules, to "complete" the rewrite system.
1587 Thus when adding
1588 RULE forall d. g Int d = g_spec
1589 also add
1590 RULE f g_spec = 0
1591
1592 But that's a bit complicated. For now we ask the programmer's help,
1593 by *copying the INLINE activation pragma* to the auto-specialised
1594 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1595 will also not be active until phase 2. And that's what programmers
1596 should jolly well do anyway, even aside from specialisation, to ensure
1597 that g doesn't inline too early.
1598
1599 This in turn means that the RULE would never fire for a NOINLINE
1600 thing so not much point in generating a specialisation at all.
1601
1602 Note [Specialisation shape]
1603 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1604 We only specialise a function if it has visible top-level lambdas
1605 corresponding to its overloading. E.g. if
1606 f :: forall a. Eq a => ....
1607 then its body must look like
1608 f = /\a. \d. ...
1609
1610 Reason: when specialising the body for a call (f ty dexp), we want to
1611 substitute dexp for d, and pick up specialised calls in the body of f.
1612
1613 This doesn't always work. One example I came across was this:
1614 newtype Gen a = MkGen{ unGen :: Int -> a }
1615
1616 choose :: Eq a => a -> Gen a
1617 choose n = MkGen (\r -> n)
1618
1619 oneof = choose (1::Int)
1620
1621 It's a silly exapmle, but we get
1622 choose = /\a. g `cast` co
1623 where choose doesn't have any dict arguments. Thus far I have not
1624 tried to fix this (wait till there's a real example).
1625
1626 Mind you, then 'choose' will be inlined (since RHS is trivial) so
1627 it doesn't matter. This comes up with single-method classes
1628
1629 class C a where { op :: a -> a }
1630 instance C a => C [a] where ....
1631 ==>
1632 $fCList :: C a => C [a]
1633 $fCList = $copList |> (...coercion>...)
1634 ....(uses of $fCList at particular types)...
1635
1636 So we suppress the WARN if the rhs is trivial.
1637
1638 Note [Inline specialisations]
1639 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1640 Here is what we do with the InlinePragma of the original function
1641 * Activation/RuleMatchInfo: both transferred to the
1642 specialised function
1643 * InlineSpec:
1644 (a) An INLINE pragma is transferred
1645 (b) An INLINABLE pragma is *not* transferred
1646
1647 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
1648 specialise the function at its call site, and arguably that's not so
1649 important for the specialised copies. BUT *pragma-directed*
1650 specialisation now takes place in the typechecker/desugarer, with
1651 manually specified INLINEs. The specialisation here is automatic.
1652 It'd be very odd if a function marked INLINE was specialised (because
1653 of some local use), and then forever after (including importing
1654 modules) the specialised version wasn't INLINEd. After all, the
1655 programmer said INLINE!
1656
1657 You might wonder why we specialise INLINE functions at all. After
1658 all they should be inlined, right? Two reasons:
1659
1660 * Even INLINE functions are sometimes not inlined, when they aren't
1661 applied to interesting arguments. But perhaps the type arguments
1662 alone are enough to specialise (even though the args are too boring
1663 to trigger inlining), and it's certainly better to call the
1664 specialised version.
1665
1666 * The RHS of an INLINE function might call another overloaded function,
1667 and we'd like to generate a specialised version of that function too.
1668 This actually happens a lot. Consider
1669 replicateM_ :: (Monad m) => Int -> m a -> m ()
1670 {-# INLINABLE replicateM_ #-}
1671 replicateM_ d x ma = ...
1672 The strictness analyser may transform to
1673 replicateM_ :: (Monad m) => Int -> m a -> m ()
1674 {-# INLINE replicateM_ #-}
1675 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
1676
1677 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
1678 {-# INLINABLE $wreplicateM_ #-}
1679 $wreplicateM_ = ...
1680 Now an importing module has a specialised call to replicateM_, say
1681 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
1682 This particular example had a huge effect on the call to replicateM_
1683 in nofib/shootout/n-body.
1684
1685 Why (b): discard INLINABLE pragmas? See Trac #4874 for persuasive examples.
1686 Suppose we have
1687 {-# INLINABLE f #-}
1688 f :: Ord a => [a] -> Int
1689 f xs = letrec f' = ...f'... in f'
1690 Then, when f is specialised and optimised we might get
1691 wgo :: [Int] -> Int#
1692 wgo = ...wgo...
1693 f_spec :: [Int] -> Int
1694 f_spec xs = case wgo xs of { r -> I# r }
1695 and we clearly want to inline f_spec at call sites. But if we still
1696 have the big, un-optimised of f (albeit specialised) captured in an
1697 INLINABLE pragma for f_spec, we won't get that optimisation.
1698
1699 So we simply drop INLINABLE pragmas when specialising. It's not really
1700 a complete solution; ignoring specialisation for now, INLINABLE functions
1701 don't get properly strictness analysed, for example. But it works well
1702 for examples involving specialisation, which is the dominant use of
1703 INLINABLE. See Trac #4874.
1704
1705
1706 ************************************************************************
1707 * *
1708 \subsubsection{UsageDetails and suchlike}
1709 * *
1710 ************************************************************************
1711 -}
1712
1713 data UsageDetails
1714 = MkUD {
1715 ud_binds :: !(Bag DictBind),
1716 -- Floated dictionary bindings
1717 -- The order is important;
1718 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
1719 -- (Remember, Bags preserve order in GHC.)
1720
1721 ud_calls :: !CallDetails
1722
1723 -- INVARIANT: suppose bs = bindersOf ud_binds
1724 -- Then 'calls' may *mention* 'bs',
1725 -- but there should be no calls *for* bs
1726 }
1727
1728 instance Outputable UsageDetails where
1729 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
1730 = text "MkUD" <+> braces (sep (punctuate comma
1731 [text "binds" <+> equals <+> ppr dbs,
1732 text "calls" <+> equals <+> ppr calls]))
1733
1734 -- | A 'DictBind' is a binding along with a cached set containing its free
1735 -- variables (both type variables and dictionaries)
1736 type DictBind = (CoreBind, VarSet)
1737
1738 type DictExpr = CoreExpr
1739
1740 emptyUDs :: UsageDetails
1741 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv }
1742
1743 ------------------------------------------------------------
1744 type CallDetails = DIdEnv CallInfoSet
1745 -- The order of specialized binds and rules depends on how we linearize
1746 -- CallDetails, so to get determinism we must use a deterministic set here.
1747 -- See Note [Deterministic UniqFM] in UniqDFM
1748 newtype CallKey = CallKey [Maybe Type]
1749 -- Nothing => unconstrained type argument
1750
1751 data CallInfoSet = CIS Id (Bag CallInfo)
1752 -- The list of types and dictionaries is guaranteed to
1753 -- match the type of f
1754
1755 {-
1756 Note [CallInfoSet determinism]
1757 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1758 CallInfoSet holds a Bag of (CallKey, [DictExpr], VarSet) triplets for a given
1759 Id. They represent the types that the function is instantiated at along with
1760 the dictionaries and free variables.
1761
1762 We use this information to generate specialized versions of a given function.
1763 CallInfoSet used to be defined as:
1764
1765 data CallInfoSet = CIS Id (Map CallKey ([DictExpr], VarSet))
1766
1767 Unfortunately this was not deterministic. The Ord instance of CallKey was
1768 defined in terms of nonDetCmpType which is not deterministic.
1769 See Note [nonDetCmpType nondeterminism].
1770 The end result was that if the function had multiple specializations they would
1771 be generated in arbitrary order.
1772
1773 We need a container that:
1774 a) when turned into a list has only one element per each CallKey and the list
1775 has deterministic order
1776 b) supports union
1777 c) supports singleton
1778 d) supports filter
1779
1780 We can't use UniqDFM here because there's no one Unique that we can key on.
1781
1782 The current approach is to implement the set as a Bag with duplicates.
1783 This makes b), c), d) trivial and pushes a) towards the end. The deduplication
1784 is done by using a TrieMap for membership tests on CallKey. This lets us delete
1785 the nondeterministic Ord CallKey instance.
1786
1787 An alternative approach would be to augment the Map the same way that UniqDFM
1788 is augmented, by keeping track of insertion order and using it to order the
1789 resulting lists. It would mean keeping the nondeterministic Ord CallKey
1790 instance making it easy to reintroduce nondeterminism in the future.
1791 -}
1792
1793 ciSetToList :: CallInfoSet -> [CallInfo]
1794 ciSetToList (CIS _ b) = snd $ foldrBag combine (emptyTM, []) b
1795 where
1796 -- This is where we eliminate duplicates, recording the CallKeys we've
1797 -- already seen in the TrieMap. See Note [CallInfoSet determinism].
1798 combine :: CallInfo -> (CallKeySet, [CallInfo]) -> (CallKeySet, [CallInfo])
1799 combine ci@(CallKey key, _) (set, acc)
1800 | Just _ <- lookupTM key set = (set, acc)
1801 | otherwise = (insertTM key () set, ci:acc)
1802
1803 type CallKeySet = ListMap (MaybeMap TypeMap) ()
1804 -- We only use it in ciSetToList to check for membership
1805
1806 ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet
1807 ciSetFilter p (CIS id a) = CIS id (filterBag p a)
1808
1809 type CallInfo = (CallKey, ([DictExpr], VarSet))
1810 -- Range is dict args and the vars of the whole
1811 -- call (including tyvars)
1812 -- [*not* include the main id itself, of course]
1813
1814 instance Outputable CallInfoSet where
1815 ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
1816 2 (ppr map)
1817
1818 pprCallInfo :: Id -> CallInfo -> SDoc
1819 pprCallInfo fn (CallKey mb_tys, (_dxs, _))
1820 = hang (ppr fn)
1821 2 (fsep (map ppr_call_key_ty mb_tys {- ++ map pprParendExpr _dxs -}))
1822
1823 ppr_call_key_ty :: Maybe Type -> SDoc
1824 ppr_call_key_ty Nothing = char '_'
1825 ppr_call_key_ty (Just ty) = char '@' <+> pprParendType ty
1826
1827 instance Outputable CallKey where
1828 ppr (CallKey ts) = ppr ts
1829
1830 unionCalls :: CallDetails -> CallDetails -> CallDetails
1831 unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2
1832
1833 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
1834 unionCallInfoSet (CIS f calls1) (CIS _ calls2) =
1835 CIS f (calls1 `unionBags` calls2)
1836
1837 callDetailsFVs :: CallDetails -> VarSet
1838 callDetailsFVs calls =
1839 nonDetFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls
1840 -- It's OK to use nonDetFoldUDFM here because we forget the ordering
1841 -- immediately by converting to a nondeterministic set.
1842
1843 callInfoFVs :: CallInfoSet -> VarSet
1844 callInfoFVs (CIS _ call_info) =
1845 foldrBag (\(_, (_,fv)) vs -> unionVarSet fv vs) emptyVarSet call_info
1846
1847 ------------------------------------------------------------
1848 singleCall :: Id -> [Maybe Type] -> [DictExpr] -> UsageDetails
1849 singleCall id tys dicts
1850 = MkUD {ud_binds = emptyBag,
1851 ud_calls = unitDVarEnv id $ CIS id $
1852 unitBag (CallKey tys, (dicts, call_fvs)) }
1853 where
1854 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
1855 tys_fvs = tyCoVarsOfTypes (catMaybes tys)
1856 -- The type args (tys) are guaranteed to be part of the dictionary
1857 -- types, because they are just the constrained types,
1858 -- and the dictionary is therefore sure to be bound
1859 -- inside the binding for any type variables free in the type;
1860 -- hence it's safe to neglect tyvars free in tys when making
1861 -- the free-var set for this call
1862 -- BUT I don't trust this reasoning; play safe and include tys_fvs
1863 --
1864 -- We don't include the 'id' itself.
1865
1866 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
1867 mkCallUDs env f args
1868 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
1869 res
1870 where
1871 res = mkCallUDs' env f args
1872
1873 mkCallUDs' env f args
1874 | not (want_calls_for f) -- Imported from elsewhere
1875 || null theta -- Not overloaded
1876 = emptyUDs
1877
1878 | not (all type_determines_value theta)
1879 || not (spec_tys `lengthIs` n_tyvars)
1880 || not ( dicts `lengthIs` n_dicts)
1881 || not (any (interestingDict env) dicts) -- Note [Interesting dictionary arguments]
1882 -- See also Note [Specialisations already covered]
1883 = -- pprTrace "mkCallUDs: discarding" _trace_doc
1884 emptyUDs -- Not overloaded, or no specialisation wanted
1885
1886 | otherwise
1887 = -- pprTrace "mkCallUDs: keeping" _trace_doc
1888 singleCall f spec_tys dicts
1889 where
1890 _trace_doc = vcat [ppr f, ppr args, ppr n_tyvars, ppr n_dicts
1891 , ppr (map (interestingDict env) dicts)]
1892 (tyvars, theta, _) = tcSplitSigmaTy (idType f)
1893 constrained_tyvars = tyCoVarsOfTypes theta
1894 n_tyvars = length tyvars
1895 n_dicts = length theta
1896
1897 spec_tys = [mk_spec_ty tv ty | (tv, ty) <- tyvars `type_zip` args]
1898 dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
1899
1900 -- ignores Coercion arguments
1901 type_zip :: [TyVar] -> [CoreExpr] -> [(TyVar, Type)]
1902 type_zip tvs (Coercion _ : args) = type_zip tvs args
1903 type_zip (tv:tvs) (Type ty : args) = (tv, ty) : type_zip tvs args
1904 type_zip _ _ = []
1905
1906 mk_spec_ty tyvar ty
1907 | tyvar `elemVarSet` constrained_tyvars = Just ty
1908 | otherwise = Nothing
1909
1910 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
1911 -- For imported things, we gather call instances if
1912 -- there is an unfolding that we could in principle specialise
1913 -- We might still decide not to use it (consulting dflags)
1914 -- in specImports
1915 -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
1916
1917 type_determines_value pred -- See Note [Type determines value]
1918 = case classifyPredType pred of
1919 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs
1920 EqPred {} -> True
1921 IrredPred {} -> True -- Things like (D []) where D is a
1922 -- Constraint-ranged family; Trac #7785
1923
1924 {-
1925 Note [Type determines value]
1926 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1927 Only specialise if all overloading is on non-IP *class* params,
1928 because these are the ones whose *type* determines their *value*. In
1929 parrticular, with implicit params, the type args *don't* say what the
1930 value of the implicit param is! See Trac #7101
1931
1932 However, consider
1933 type family D (v::*->*) :: Constraint
1934 type instance D [] = ()
1935 f :: D v => v Char -> Int
1936 If we see a call (f "foo"), we'll pass a "dictionary"
1937 () |> (g :: () ~ D [])
1938 and it's good to specialise f at this dictionary.
1939
1940 So the question is: can an implicit parameter "hide inside" a
1941 type-family constraint like (D a). Well, no. We don't allow
1942 type instance D Maybe = ?x:Int
1943 Hence the IrredPred case in type_determines_value.
1944 See Trac #7785.
1945
1946 Note [Interesting dictionary arguments]
1947 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1948 Consider this
1949 \a.\d:Eq a. let f = ... in ...(f d)...
1950 There really is not much point in specialising f wrt the dictionary d,
1951 because the code for the specialised f is not improved at all, because
1952 d is lambda-bound. We simply get junk specialisations.
1953
1954 What is "interesting"? Just that it has *some* structure. But what about
1955 variables?
1956
1957 * A variable might be imported, in which case its unfolding
1958 will tell us whether it has useful structure
1959
1960 * Local variables are cloned on the way down (to avoid clashes when
1961 we float dictionaries), and cloning drops the unfolding
1962 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
1963 nuisance to give them unfoldings. So we keep track of the
1964 "interesting" dictionaries as a VarSet in SpecEnv.
1965 We have to take care to put any new interesting dictionary
1966 bindings in the set.
1967
1968 We accidentally lost accurate tracking of local variables for a long
1969 time, because cloned variables don't have unfoldings. But makes a
1970 massive difference in a few cases, eg Trac #5113. For nofib as a
1971 whole it's only a small win: 2.2% improvement in allocation for ansi,
1972 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
1973 -}
1974
1975 interestingDict :: SpecEnv -> CoreExpr -> Bool
1976 -- A dictionary argument is interesting if it has *some* structure
1977 -- NB: "dictionary" arguments include constraints of all sorts,
1978 -- including equality constraints; hence the Coercion case
1979 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
1980 || isDataConWorkId v
1981 || v `elemVarSet` se_interesting env
1982 interestingDict _ (Type _) = False
1983 interestingDict _ (Coercion _) = False
1984 interestingDict env (App fn (Type _)) = interestingDict env fn
1985 interestingDict env (App fn (Coercion _)) = interestingDict env fn
1986 interestingDict env (Tick _ a) = interestingDict env a
1987 interestingDict env (Cast e _) = interestingDict env e
1988 interestingDict _ _ = True
1989
1990 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
1991 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
1992 (MkUD {ud_binds = db2, ud_calls = calls2})
1993 = MkUD { ud_binds = db1 `unionBags` db2
1994 , ud_calls = calls1 `unionCalls` calls2 }
1995
1996 plusUDList :: [UsageDetails] -> UsageDetails
1997 plusUDList = foldr plusUDs emptyUDs
1998
1999 -----------------------------
2000 _dictBindBndrs :: Bag DictBind -> [Id]
2001 _dictBindBndrs dbs = foldrBag ((++) . bindersOf . fst) [] dbs
2002
2003 -- | Construct a 'DictBind' from a 'CoreBind'
2004 mkDB :: CoreBind -> DictBind
2005 mkDB bind = (bind, bind_fvs bind)
2006
2007 -- | Identify the free variables of a 'CoreBind'
2008 bind_fvs :: CoreBind -> VarSet
2009 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
2010 bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs
2011 where
2012 bndrs = map fst prs
2013 rhs_fvs = unionVarSets (map pair_fvs prs)
2014
2015 pair_fvs :: (Id, CoreExpr) -> VarSet
2016 pair_fvs (bndr, rhs) = exprFreeVars rhs `unionVarSet` idFreeVars bndr
2017 -- Don't forget variables mentioned in the
2018 -- rules of the bndr. C.f. OccAnal.addRuleUsage
2019 -- Also tyvars mentioned in its type; they may not appear in the RHS
2020 -- type T a = Int
2021 -- x :: T a = 3
2022
2023 -- | Flatten a set of 'DictBind's and some other binding pairs into a single
2024 -- recursive binding, including some additional bindings.
2025 flattenDictBinds :: Bag DictBind -> [(Id,CoreExpr)] -> DictBind
2026 flattenDictBinds dbs pairs
2027 = (Rec bindings, fvs)
2028 where
2029 (bindings, fvs) = foldrBag add
2030 ([], emptyVarSet)
2031 (dbs `snocBag` mkDB (Rec pairs))
2032 add (NonRec b r, fvs') (pairs, fvs) =
2033 ((b,r) : pairs, fvs `unionVarSet` fvs')
2034 add (Rec prs1, fvs') (pairs, fvs) =
2035 (prs1 ++ pairs, fvs `unionVarSet` fvs')
2036
2037 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
2038 -- Add ud_binds to the tail end of the bindings in uds
2039 snocDictBinds uds dbs
2040 = uds { ud_binds = ud_binds uds `unionBags`
2041 foldr consBag emptyBag dbs }
2042
2043 consDictBind :: DictBind -> UsageDetails -> UsageDetails
2044 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
2045
2046 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
2047 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
2048
2049 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
2050 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
2051
2052 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
2053 wrapDictBinds dbs binds
2054 = foldrBag add binds dbs
2055 where
2056 add (bind,_) binds = bind : binds
2057
2058 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
2059 wrapDictBindsE dbs expr
2060 = foldrBag add expr dbs
2061 where
2062 add (bind,_) expr = Let bind expr
2063
2064 ----------------------
2065 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
2066 -- Used at a lambda or case binder; just dump anything mentioning the binder
2067 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2068 | null bndrs = (uds, emptyBag) -- Common in case alternatives
2069 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2070 (free_uds, dump_dbs)
2071 where
2072 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2073 bndr_set = mkVarSet bndrs
2074 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2075 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
2076 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
2077 -- no calls for any of the dicts in dump_dbs
2078
2079 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
2080 -- Used at a lambda or case binder; just dump anything mentioning the binder
2081 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2082 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2083 (free_uds, dump_dbs, float_all)
2084 where
2085 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2086 bndr_set = mkVarSet bndrs
2087 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2088 free_calls = deleteCallsFor bndrs orig_calls
2089 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
2090
2091 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
2092 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2093 = -- pprTrace ("callsForMe")
2094 -- (vcat [ppr fn,
2095 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
2096 -- text "Orig calls =" <+> ppr orig_calls,
2097 -- text "Dep set =" <+> ppr dep_set,
2098 -- text "Calls for me =" <+> ppr calls_for_me]) $
2099 (uds_without_me, calls_for_me)
2100 where
2101 uds_without_me = MkUD { ud_binds = orig_dbs
2102 , ud_calls = delDVarEnv orig_calls fn }
2103 calls_for_me = case lookupDVarEnv orig_calls fn of
2104 Nothing -> []
2105 Just cis -> filter_dfuns (ciSetToList cis)
2106
2107 dep_set = foldlBag go (unitVarSet fn) orig_dbs
2108 go dep_set (db,fvs) | fvs `intersectsVarSet` dep_set
2109 = extendVarSetList dep_set (bindersOf db)
2110 | otherwise = dep_set
2111
2112 -- Note [Specialisation of dictionary functions]
2113 filter_dfuns | isDFunId fn = filter ok_call
2114 | otherwise = \cs -> cs
2115
2116 ok_call (_, (_,fvs)) = not (fvs `intersectsVarSet` dep_set)
2117
2118 ----------------------
2119 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2120 -- Returns (free_dbs, dump_dbs, dump_set)
2121 splitDictBinds dbs bndr_set
2122 = foldlBag split_db (emptyBag, emptyBag, bndr_set) dbs
2123 -- Important that it's foldl not foldr;
2124 -- we're accumulating the set of dumped ids in dump_set
2125 where
2126 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
2127 | dump_idset `intersectsVarSet` fvs -- Dump it
2128 = (free_dbs, dump_dbs `snocBag` db,
2129 extendVarSetList dump_idset (bindersOf bind))
2130
2131 | otherwise -- Don't dump it
2132 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2133
2134
2135 ----------------------
2136 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2137 -- Remove calls *mentioning* bs
2138 deleteCallsMentioning bs calls
2139 = mapDVarEnv (ciSetFilter keep_call) calls
2140 where
2141 keep_call (_, (_, fvs)) = not (fvs `intersectsVarSet` bs)
2142
2143 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
2144 -- Remove calls *for* bs
2145 deleteCallsFor bs calls = delDVarEnvList calls bs
2146
2147 {-
2148 ************************************************************************
2149 * *
2150 \subsubsection{Boring helper functions}
2151 * *
2152 ************************************************************************
2153 -}
2154
2155 newtype SpecM a = SpecM (State SpecState a)
2156
2157 data SpecState = SpecState {
2158 spec_uniq_supply :: UniqSupply,
2159 spec_module :: Module,
2160 spec_dflags :: DynFlags
2161 }
2162
2163 instance Functor SpecM where
2164 fmap = liftM
2165
2166 instance Applicative SpecM where
2167 pure x = SpecM $ return x
2168 (<*>) = ap
2169
2170 instance Monad SpecM where
2171 SpecM x >>= f = SpecM $ do y <- x
2172 case f y of
2173 SpecM z ->
2174 z
2175 fail str = SpecM $ fail str
2176
2177 #if __GLASGOW_HASKELL__ > 710
2178 instance MonadFail.MonadFail SpecM where
2179 fail str = SpecM $ fail str
2180 #endif
2181
2182 instance MonadUnique SpecM where
2183 getUniqueSupplyM
2184 = SpecM $ do st <- get
2185 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
2186 put $ st { spec_uniq_supply = us2 }
2187 return us1
2188
2189 getUniqueM
2190 = SpecM $ do st <- get
2191 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
2192 put $ st { spec_uniq_supply = us' }
2193 return u
2194
2195 instance HasDynFlags SpecM where
2196 getDynFlags = SpecM $ liftM spec_dflags get
2197
2198 instance HasModule SpecM where
2199 getModule = SpecM $ liftM spec_module get
2200
2201 runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2202 runSpecM dflags this_mod (SpecM spec)
2203 = do us <- getUniqueSupplyM
2204 let initialState = SpecState {
2205 spec_uniq_supply = us,
2206 spec_module = this_mod,
2207 spec_dflags = dflags
2208 }
2209 return $ evalState spec initialState
2210
2211 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2212 mapAndCombineSM _ [] = return ([], emptyUDs)
2213 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2214 (ys, uds2) <- mapAndCombineSM f xs
2215 return (y:ys, uds1 `plusUDs` uds2)
2216
2217 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2218 extendTvSubstList env tv_binds
2219 = env { se_subst = CoreSubst.extendTvSubstList (se_subst env) tv_binds }
2220
2221 substTy :: SpecEnv -> Type -> Type
2222 substTy env ty = CoreSubst.substTy (se_subst env) ty
2223
2224 substCo :: SpecEnv -> Coercion -> Coercion
2225 substCo env co = CoreSubst.substCo (se_subst env) co
2226
2227 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2228 substBndr env bs = case CoreSubst.substBndr (se_subst env) bs of
2229 (subst', bs') -> (env { se_subst = subst' }, bs')
2230
2231 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2232 substBndrs env bs = case CoreSubst.substBndrs (se_subst env) bs of
2233 (subst', bs') -> (env { se_subst = subst' }, bs')
2234
2235 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2236 -- Clone the binders of the bind; return new bind with the cloned binders
2237 -- Return the substitution to use for RHSs, and the one to use for the body
2238 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2239 = do { us <- getUniqueSupplyM
2240 ; let (subst', bndr') = CoreSubst.cloneIdBndr subst us bndr
2241 interesting' | interestingDict env rhs
2242 = interesting `extendVarSet` bndr'
2243 | otherwise = interesting
2244 ; return (env, env { se_subst = subst', se_interesting = interesting' }
2245 , NonRec bndr' rhs) }
2246
2247 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2248 = do { us <- getUniqueSupplyM
2249 ; let (subst', bndrs') = CoreSubst.cloneRecIdBndrs subst us (map fst pairs)
2250 env' = env { se_subst = subst'
2251 , se_interesting = interesting `extendVarSetList`
2252 [ v | (v,r) <- pairs, interestingDict env r ] }
2253 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2254
2255 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2256 -- Make up completely fresh binders for the dictionaries
2257 -- Their bindings are going to float outwards
2258 newDictBndr env b = do { uniq <- getUniqueM
2259 ; let n = idName b
2260 ty' = substTy env (idType b)
2261 ; return (mkUserLocalOrCoVar (nameOccName n) uniq ty' (getSrcSpan n)) }
2262
2263 newSpecIdSM :: Id -> Type -> SpecM Id
2264 -- Give the new Id a similar occurrence name to the old one
2265 newSpecIdSM old_id new_ty
2266 = do { uniq <- getUniqueM
2267 ; let name = idName old_id
2268 new_occ = mkSpecOcc (nameOccName name)
2269 new_id = mkUserLocalOrCoVar new_occ uniq new_ty (getSrcSpan name)
2270 ; return new_id }
2271
2272 {-
2273 Old (but interesting) stuff about unboxed bindings
2274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2275
2276 What should we do when a value is specialised to a *strict* unboxed value?
2277
2278 map_*_* f (x:xs) = let h = f x
2279 t = map f xs
2280 in h:t
2281
2282 Could convert let to case:
2283
2284 map_*_Int# f (x:xs) = case f x of h# ->
2285 let t = map f xs
2286 in h#:t
2287
2288 This may be undesirable since it forces evaluation here, but the value
2289 may not be used in all branches of the body. In the general case this
2290 transformation is impossible since the mutual recursion in a letrec
2291 cannot be expressed as a case.
2292
2293 There is also a problem with top-level unboxed values, since our
2294 implementation cannot handle unboxed values at the top level.
2295
2296 Solution: Lift the binding of the unboxed value and extract it when it
2297 is used:
2298
2299 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2300 t = map f xs
2301 in case h of
2302 _Lift h# -> h#:t
2303
2304 Now give it to the simplifier and the _Lifting will be optimised away.
2305
2306 The benefit is that we have given the specialised "unboxed" values a
2307 very simple lifted semantics and then leave it up to the simplifier to
2308 optimise it --- knowing that the overheads will be removed in nearly
2309 all cases.
2310
2311 In particular, the value will only be evaluated in the branches of the
2312 program which use it, rather than being forced at the point where the
2313 value is bound. For example:
2314
2315 filtermap_*_* p f (x:xs)
2316 = let h = f x
2317 t = ...
2318 in case p x of
2319 True -> h:t
2320 False -> t
2321 ==>
2322 filtermap_*_Int# p f (x:xs)
2323 = let h = case (f x) of h# -> _Lift h#
2324 t = ...
2325 in case p x of
2326 True -> case h of _Lift h#
2327 -> h#:t
2328 False -> t
2329
2330 The binding for h can still be inlined in the one branch and the
2331 _Lifting eliminated.
2332
2333
2334 Question: When won't the _Lifting be eliminated?
2335
2336 Answer: When they at the top-level (where it is necessary) or when
2337 inlining would duplicate work (or possibly code depending on
2338 options). However, the _Lifting will still be eliminated if the
2339 strictness analyser deems the lifted binding strict.
2340 -}