bccf600c1022552b8dd55622562a86308f4ed13d
[ghc.git] / compiler / specialise / Specialise.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5 -}
6
7 {-# LANGUAGE CPP #-}
8 module Specialise ( specProgram, specUnfolding ) where
9
10 #include "HsVersions.h"
11
12 import Id
13 import TcType hiding( substTy, extendTCvSubstList )
14 import Type hiding( substTy, extendTCvSubstList )
15 import Coercion( Coercion )
16 import Module( Module, HasModule(..) )
17 import CoreMonad
18 import qualified CoreSubst
19 import CoreUnfold
20 import VarSet
21 import VarEnv
22 import CoreSyn
23 import Rules
24 import CoreUtils ( exprIsTrivial, applyTypeToArgs, mkCast )
25 import CoreFVs ( exprFreeVars, exprsFreeVars, idFreeVars )
26 import UniqSupply
27 import Name
28 import MkId ( voidArgId, voidPrimId )
29 import Maybes ( catMaybes, isJust )
30 import BasicTypes
31 import HscTypes
32 import Bag
33 import DynFlags
34 import Util
35 import Outputable
36 import FastString
37 import State
38
39 import Control.Monad
40 #if __GLASGOW_HASKELL__ > 710
41 import qualified Control.Monad.Fail as MonadFail
42 #endif
43 import Data.Map (Map)
44 import qualified Data.Map as Map
45 import qualified FiniteMap as Map
46
47 {-
48 ************************************************************************
49 * *
50 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
51 * *
52 ************************************************************************
53
54 These notes describe how we implement specialisation to eliminate
55 overloading.
56
57 The specialisation pass works on Core
58 syntax, complete with all the explicit dictionary application,
59 abstraction and construction as added by the type checker. The
60 existing type checker remains largely as it is.
61
62 One important thought: the {\em types} passed to an overloaded
63 function, and the {\em dictionaries} passed are mutually redundant.
64 If the same function is applied to the same type(s) then it is sure to
65 be applied to the same dictionary(s)---or rather to the same {\em
66 values}. (The arguments might look different but they will evaluate
67 to the same value.)
68
69 Second important thought: we know that we can make progress by
70 treating dictionary arguments as static and worth specialising on. So
71 we can do without binding-time analysis, and instead specialise on
72 dictionary arguments and no others.
73
74 The basic idea
75 ~~~~~~~~~~~~~~
76 Suppose we have
77
78 let f = <f_rhs>
79 in <body>
80
81 and suppose f is overloaded.
82
83 STEP 1: CALL-INSTANCE COLLECTION
84
85 We traverse <body>, accumulating all applications of f to types and
86 dictionaries.
87
88 (Might there be partial applications, to just some of its types and
89 dictionaries? In principle yes, but in practice the type checker only
90 builds applications of f to all its types and dictionaries, so partial
91 applications could only arise as a result of transformation, and even
92 then I think it's unlikely. In any case, we simply don't accumulate such
93 partial applications.)
94
95
96 STEP 2: EQUIVALENCES
97
98 So now we have a collection of calls to f:
99 f t1 t2 d1 d2
100 f t3 t4 d3 d4
101 ...
102 Notice that f may take several type arguments. To avoid ambiguity, we
103 say that f is called at type t1/t2 and t3/t4.
104
105 We take equivalence classes using equality of the *types* (ignoring
106 the dictionary args, which as mentioned previously are redundant).
107
108 STEP 3: SPECIALISATION
109
110 For each equivalence class, choose a representative (f t1 t2 d1 d2),
111 and create a local instance of f, defined thus:
112
113 f@t1/t2 = <f_rhs> t1 t2 d1 d2
114
115 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
116 of simplification will now result. However we don't actually *do* that
117 simplification. Rather, we leave it for the simplifier to do. If we
118 *did* do it, though, we'd get more call instances from the specialised
119 RHS. We can work out what they are by instantiating the call-instance
120 set from f's RHS with the types t1, t2.
121
122 Add this new id to f's IdInfo, to record that f has a specialised version.
123
124 Before doing any of this, check that f's IdInfo doesn't already
125 tell us about an existing instance of f at the required type/s.
126 (This might happen if specialisation was applied more than once, or
127 it might arise from user SPECIALIZE pragmas.)
128
129 Recursion
130 ~~~~~~~~~
131 Wait a minute! What if f is recursive? Then we can't just plug in
132 its right-hand side, can we?
133
134 But it's ok. The type checker *always* creates non-recursive definitions
135 for overloaded recursive functions. For example:
136
137 f x = f (x+x) -- Yes I know its silly
138
139 becomes
140
141 f a (d::Num a) = let p = +.sel a d
142 in
143 letrec fl (y::a) = fl (p y y)
144 in
145 fl
146
147 We still have recusion for non-overloaded functions which we
148 speciailise, but the recursive call should get specialised to the
149 same recursive version.
150
151
152 Polymorphism 1
153 ~~~~~~~~~~~~~~
154
155 All this is crystal clear when the function is applied to *constant
156 types*; that is, types which have no type variables inside. But what if
157 it is applied to non-constant types? Suppose we find a call of f at type
158 t1/t2. There are two possibilities:
159
160 (a) The free type variables of t1, t2 are in scope at the definition point
161 of f. In this case there's no problem, we proceed just as before. A common
162 example is as follows. Here's the Haskell:
163
164 g y = let f x = x+x
165 in f y + f y
166
167 After typechecking we have
168
169 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
170 in +.sel a d (f a d y) (f a d y)
171
172 Notice that the call to f is at type type "a"; a non-constant type.
173 Both calls to f are at the same type, so we can specialise to give:
174
175 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
176 in +.sel a d (f@a y) (f@a y)
177
178
179 (b) The other case is when the type variables in the instance types
180 are *not* in scope at the definition point of f. The example we are
181 working with above is a good case. There are two instances of (+.sel a d),
182 but "a" is not in scope at the definition of +.sel. Can we do anything?
183 Yes, we can "common them up", a sort of limited common sub-expression deal.
184 This would give:
185
186 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
187 f@a (x::a) = +.sel@a x x
188 in +.sel@a (f@a y) (f@a y)
189
190 This can save work, and can't be spotted by the type checker, because
191 the two instances of +.sel weren't originally at the same type.
192
193 Further notes on (b)
194
195 * There are quite a few variations here. For example, the defn of
196 +.sel could be floated ouside the \y, to attempt to gain laziness.
197 It certainly mustn't be floated outside the \d because the d has to
198 be in scope too.
199
200 * We don't want to inline f_rhs in this case, because
201 that will duplicate code. Just commoning up the call is the point.
202
203 * Nothing gets added to +.sel's IdInfo.
204
205 * Don't bother unless the equivalence class has more than one item!
206
207 Not clear whether this is all worth it. It is of course OK to
208 simply discard call-instances when passing a big lambda.
209
210 Polymorphism 2 -- Overloading
211 ~~~~~~~~~~~~~~
212 Consider a function whose most general type is
213
214 f :: forall a b. Ord a => [a] -> b -> b
215
216 There is really no point in making a version of g at Int/Int and another
217 at Int/Bool, because it's only instancing the type variable "a" which
218 buys us any efficiency. Since g is completely polymorphic in b there
219 ain't much point in making separate versions of g for the different
220 b types.
221
222 That suggests that we should identify which of g's type variables
223 are constrained (like "a") and which are unconstrained (like "b").
224 Then when taking equivalence classes in STEP 2, we ignore the type args
225 corresponding to unconstrained type variable. In STEP 3 we make
226 polymorphic versions. Thus:
227
228 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
229
230 We do this.
231
232
233 Dictionary floating
234 ~~~~~~~~~~~~~~~~~~~
235 Consider this
236
237 f a (d::Num a) = let g = ...
238 in
239 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
240
241 Here, g is only called at one type, but the dictionary isn't in scope at the
242 definition point for g. Usually the type checker would build a
243 definition for d1 which enclosed g, but the transformation system
244 might have moved d1's defn inward. Solution: float dictionary bindings
245 outwards along with call instances.
246
247 Consider
248
249 f x = let g p q = p==q
250 h r s = (r+s, g r s)
251 in
252 h x x
253
254
255 Before specialisation, leaving out type abstractions we have
256
257 f df x = let g :: Eq a => a -> a -> Bool
258 g dg p q = == dg p q
259 h :: Num a => a -> a -> (a, Bool)
260 h dh r s = let deq = eqFromNum dh
261 in (+ dh r s, g deq r s)
262 in
263 h df x x
264
265 After specialising h we get a specialised version of h, like this:
266
267 h' r s = let deq = eqFromNum df
268 in (+ df r s, g deq r s)
269
270 But we can't naively make an instance for g from this, because deq is not in scope
271 at the defn of g. Instead, we have to float out the (new) defn of deq
272 to widen its scope. Notice that this floating can't be done in advance -- it only
273 shows up when specialisation is done.
274
275 User SPECIALIZE pragmas
276 ~~~~~~~~~~~~~~~~~~~~~~~
277 Specialisation pragmas can be digested by the type checker, and implemented
278 by adding extra definitions along with that of f, in the same way as before
279
280 f@t1/t2 = <f_rhs> t1 t2 d1 d2
281
282 Indeed the pragmas *have* to be dealt with by the type checker, because
283 only it knows how to build the dictionaries d1 and d2! For example
284
285 g :: Ord a => [a] -> [a]
286 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
287
288 Here, the specialised version of g is an application of g's rhs to the
289 Ord dictionary for (Tree Int), which only the type checker can conjure
290 up. There might not even *be* one, if (Tree Int) is not an instance of
291 Ord! (All the other specialision has suitable dictionaries to hand
292 from actual calls.)
293
294 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
295 it is buried in a complex (as-yet-un-desugared) binding group.
296 Maybe we should say
297
298 f@t1/t2 = f* t1 t2 d1 d2
299
300 where f* is the Id f with an IdInfo which says "inline me regardless!".
301 Indeed all the specialisation could be done in this way.
302 That in turn means that the simplifier has to be prepared to inline absolutely
303 any in-scope let-bound thing.
304
305
306 Again, the pragma should permit polymorphism in unconstrained variables:
307
308 h :: Ord a => [a] -> b -> b
309 {-# SPECIALIZE h :: [Int] -> b -> b #-}
310
311 We *insist* that all overloaded type variables are specialised to ground types,
312 (and hence there can be no context inside a SPECIALIZE pragma).
313 We *permit* unconstrained type variables to be specialised to
314 - a ground type
315 - or left as a polymorphic type variable
316 but nothing in between. So
317
318 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
319
320 is *illegal*. (It can be handled, but it adds complication, and gains the
321 programmer nothing.)
322
323
324 SPECIALISING INSTANCE DECLARATIONS
325 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
326 Consider
327
328 instance Foo a => Foo [a] where
329 ...
330 {-# SPECIALIZE instance Foo [Int] #-}
331
332 The original instance decl creates a dictionary-function
333 definition:
334
335 dfun.Foo.List :: forall a. Foo a -> Foo [a]
336
337 The SPECIALIZE pragma just makes a specialised copy, just as for
338 ordinary function definitions:
339
340 dfun.Foo.List@Int :: Foo [Int]
341 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
342
343 The information about what instance of the dfun exist gets added to
344 the dfun's IdInfo in the same way as a user-defined function too.
345
346
347 Automatic instance decl specialisation?
348 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
349 Can instance decls be specialised automatically? It's tricky.
350 We could collect call-instance information for each dfun, but
351 then when we specialised their bodies we'd get new call-instances
352 for ordinary functions; and when we specialised their bodies, we might get
353 new call-instances of the dfuns, and so on. This all arises because of
354 the unrestricted mutual recursion between instance decls and value decls.
355
356 Still, there's no actual problem; it just means that we may not do all
357 the specialisation we could theoretically do.
358
359 Furthermore, instance decls are usually exported and used non-locally,
360 so we'll want to compile enough to get those specialisations done.
361
362 Lastly, there's no such thing as a local instance decl, so we can
363 survive solely by spitting out *usage* information, and then reading that
364 back in as a pragma when next compiling the file. So for now,
365 we only specialise instance decls in response to pragmas.
366
367
368 SPITTING OUT USAGE INFORMATION
369 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
370
371 To spit out usage information we need to traverse the code collecting
372 call-instance information for all imported (non-prelude?) functions
373 and data types. Then we equivalence-class it and spit it out.
374
375 This is done at the top-level when all the call instances which escape
376 must be for imported functions and data types.
377
378 *** Not currently done ***
379
380
381 Partial specialisation by pragmas
382 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
383 What about partial specialisation:
384
385 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
386 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
387
388 or even
389
390 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
391
392 Seems quite reasonable. Similar things could be done with instance decls:
393
394 instance (Foo a, Foo b) => Foo (a,b) where
395 ...
396 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
397 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
398
399 Ho hum. Things are complex enough without this. I pass.
400
401
402 Requirements for the simplifer
403 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
404 The simplifier has to be able to take advantage of the specialisation.
405
406 * When the simplifier finds an application of a polymorphic f, it looks in
407 f's IdInfo in case there is a suitable instance to call instead. This converts
408
409 f t1 t2 d1 d2 ===> f_t1_t2
410
411 Note that the dictionaries get eaten up too!
412
413 * Dictionary selection operations on constant dictionaries must be
414 short-circuited:
415
416 +.sel Int d ===> +Int
417
418 The obvious way to do this is in the same way as other specialised
419 calls: +.sel has inside it some IdInfo which tells that if it's applied
420 to the type Int then it should eat a dictionary and transform to +Int.
421
422 In short, dictionary selectors need IdInfo inside them for constant
423 methods.
424
425 * Exactly the same applies if a superclass dictionary is being
426 extracted:
427
428 Eq.sel Int d ===> dEqInt
429
430 * Something similar applies to dictionary construction too. Suppose
431 dfun.Eq.List is the function taking a dictionary for (Eq a) to
432 one for (Eq [a]). Then we want
433
434 dfun.Eq.List Int d ===> dEq.List_Int
435
436 Where does the Eq [Int] dictionary come from? It is built in
437 response to a SPECIALIZE pragma on the Eq [a] instance decl.
438
439 In short, dfun Ids need IdInfo with a specialisation for each
440 constant instance of their instance declaration.
441
442 All this uses a single mechanism: the SpecEnv inside an Id
443
444
445 What does the specialisation IdInfo look like?
446 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
447
448 The SpecEnv of an Id maps a list of types (the template) to an expression
449
450 [Type] |-> Expr
451
452 For example, if f has this RuleInfo:
453
454 [Int, a] -> \d:Ord Int. f' a
455
456 it means that we can replace the call
457
458 f Int t ===> (\d. f' t)
459
460 This chucks one dictionary away and proceeds with the
461 specialised version of f, namely f'.
462
463
464 What can't be done this way?
465 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
466 There is no way, post-typechecker, to get a dictionary for (say)
467 Eq a from a dictionary for Eq [a]. So if we find
468
469 ==.sel [t] d
470
471 we can't transform to
472
473 eqList (==.sel t d')
474
475 where
476 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
477
478 Of course, we currently have no way to automatically derive
479 eqList, nor to connect it to the Eq [a] instance decl, but you
480 can imagine that it might somehow be possible. Taking advantage
481 of this is permanently ruled out.
482
483 Still, this is no great hardship, because we intend to eliminate
484 overloading altogether anyway!
485
486 A note about non-tyvar dictionaries
487 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
488 Some Ids have types like
489
490 forall a,b,c. Eq a -> Ord [a] -> tau
491
492 This seems curious at first, because we usually only have dictionary
493 args whose types are of the form (C a) where a is a type variable.
494 But this doesn't hold for the functions arising from instance decls,
495 which sometimes get arguments with types of form (C (T a)) for some
496 type constructor T.
497
498 Should we specialise wrt this compound-type dictionary? We used to say
499 "no", saying:
500 "This is a heuristic judgement, as indeed is the fact that we
501 specialise wrt only dictionaries. We choose *not* to specialise
502 wrt compound dictionaries because at the moment the only place
503 they show up is in instance decls, where they are simply plugged
504 into a returned dictionary. So nothing is gained by specialising
505 wrt them."
506
507 But it is simpler and more uniform to specialise wrt these dicts too;
508 and in future GHC is likely to support full fledged type signatures
509 like
510 f :: Eq [(a,b)] => ...
511
512
513 ************************************************************************
514 * *
515 \subsubsection{The new specialiser}
516 * *
517 ************************************************************************
518
519 Our basic game plan is this. For let(rec) bound function
520 f :: (C a, D c) => (a,b,c,d) -> Bool
521
522 * Find any specialised calls of f, (f ts ds), where
523 ts are the type arguments t1 .. t4, and
524 ds are the dictionary arguments d1 .. d2.
525
526 * Add a new definition for f1 (say):
527
528 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
529
530 Note that we abstract over the unconstrained type arguments.
531
532 * Add the mapping
533
534 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
535
536 to the specialisations of f. This will be used by the
537 simplifier to replace calls
538 (f t1 t2 t3 t4) da db
539 by
540 (\d1 d1 -> f1 t2 t4) da db
541
542 All the stuff about how many dictionaries to discard, and what types
543 to apply the specialised function to, are handled by the fact that the
544 SpecEnv contains a template for the result of the specialisation.
545
546 We don't build *partial* specialisations for f. For example:
547
548 f :: Eq a => a -> a -> Bool
549 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
550
551 Here, little is gained by making a specialised copy of f.
552 There's a distinct danger that the specialised version would
553 first build a dictionary for (Eq b, Eq c), and then select the (==)
554 method from it! Even if it didn't, not a great deal is saved.
555
556 We do, however, generate polymorphic, but not overloaded, specialisations:
557
558 f :: Eq a => [a] -> b -> b -> b
559 ... SPECIALISE f :: [Int] -> b -> b -> b ...
560
561 Hence, the invariant is this:
562
563 *** no specialised version is overloaded ***
564
565
566 ************************************************************************
567 * *
568 \subsubsection{The exported function}
569 * *
570 ************************************************************************
571 -}
572
573 -- | Specialise calls to type-class overloaded functions occuring in a program.
574 specProgram :: ModGuts -> CoreM ModGuts
575 specProgram guts@(ModGuts { mg_module = this_mod
576 , mg_rules = local_rules
577 , mg_binds = binds })
578 = do { dflags <- getDynFlags
579
580 -- Specialise the bindings of this module
581 ; (binds', uds) <- runSpecM dflags this_mod (go binds)
582
583 -- Specialise imported functions
584 ; hpt_rules <- getRuleBase
585 ; let rule_base = extendRuleBaseList hpt_rules local_rules
586 ; (new_rules, spec_binds) <- specImports dflags this_mod top_env emptyVarSet
587 [] rule_base (ud_calls uds)
588
589 -- Don't forget to wrap the specialized bindings with bindings
590 -- for the needed dictionaries.
591 -- See Note [Wrap bindings returned by specImports]
592 ; let spec_binds' = wrapDictBinds (ud_binds uds) spec_binds
593
594 ; let final_binds
595 | null spec_binds' = binds'
596 | otherwise = Rec (flattenBinds spec_binds') : binds'
597 -- Note [Glom the bindings if imported functions are specialised]
598
599 ; return (guts { mg_binds = final_binds
600 , mg_rules = new_rules ++ local_rules }) }
601 where
602 -- We need to start with a Subst that knows all the things
603 -- that are in scope, so that the substitution engine doesn't
604 -- accidentally re-use a unique that's already in use
605 -- Easiest thing is to do it all at once, as if all the top-level
606 -- decls were mutually recursive
607 top_env = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
608 bindersOfBinds binds
609 , se_interesting = emptyVarSet }
610
611 go [] = return ([], emptyUDs)
612 go (bind:binds) = do (binds', uds) <- go binds
613 (bind', uds') <- specBind top_env bind uds
614 return (bind' ++ binds', uds')
615
616 {-
617 Note [Wrap bindings returned by specImports]
618 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
619 'specImports' returns a set of specialized bindings. However, these are lacking
620 necessary floated dictionary bindings, which are returned by
621 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
622 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
623 for instance, the 'specImports' call in 'specProgram'.
624
625
626 Note [Disabling cross-module specialisation]
627 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
628 Since GHC 7.10 we have performed specialisation of INLINEABLE bindings living
629 in modules outside of the current module. This can sometimes uncover user code
630 which explodes in size when aggressively optimized. The
631 -fno-cross-module-specialise option was introduced to allow users to being
632 bitten by such instances to revert to the pre-7.10 behavior.
633
634 See Trac #10491
635 -}
636
637 -- | Specialise a set of calls to imported bindings
638 specImports :: DynFlags
639 -> Module
640 -> SpecEnv -- Passed in so that all top-level Ids are in scope
641 -> VarSet -- Don't specialise these ones
642 -- See Note [Avoiding recursive specialisation]
643 -> [Id] -- Stack of imported functions being specialised
644 -> RuleBase -- Rules from this module and the home package
645 -- (but not external packages, which can change)
646 -> CallDetails -- Calls for imported things, and floating bindings
647 -> CoreM ( [CoreRule] -- New rules
648 , [CoreBind] ) -- Specialised bindings
649 -- See Note [Wrapping bindings returned by specImports]
650 specImports dflags this_mod top_env done callers rule_base cds
651 -- See Note [Disabling cross-module specialisation]
652 | not $ gopt Opt_CrossModuleSpecialise dflags =
653 return ([], [])
654
655 | otherwise =
656 do { let import_calls = varEnvElts cds
657 ; (rules, spec_binds) <- go rule_base import_calls
658 ; return (rules, spec_binds) }
659 where
660 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
661 go _ [] = return ([], [])
662 go rb (CIS fn calls_for_fn : other_calls)
663 = do { (rules1, spec_binds1) <- specImport dflags this_mod top_env
664 done callers rb fn $
665 Map.toList calls_for_fn
666 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
667 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
668
669 specImport :: DynFlags
670 -> Module
671 -> SpecEnv -- Passed in so that all top-level Ids are in scope
672 -> VarSet -- Don't specialise these
673 -- See Note [Avoiding recursive specialisation]
674 -> [Id] -- Stack of imported functions being specialised
675 -> RuleBase -- Rules from this module
676 -> Id -> [CallInfo] -- Imported function and calls for it
677 -> CoreM ( [CoreRule] -- New rules
678 , [CoreBind] ) -- Specialised bindings
679 specImport dflags this_mod top_env done callers rb fn calls_for_fn
680 | fn `elemVarSet` done
681 = return ([], []) -- No warning. This actually happens all the time
682 -- when specialising a recursive function, because
683 -- the RHS of the specialised function contains a recursive
684 -- call to the original function
685
686 | null calls_for_fn -- We filtered out all the calls in deleteCallsMentioning
687 = return ([], [])
688
689 | wantSpecImport dflags unfolding
690 , Just rhs <- maybeUnfoldingTemplate unfolding
691 = do { -- Get rules from the external package state
692 -- We keep doing this in case we "page-fault in"
693 -- more rules as we go along
694 ; hsc_env <- getHscEnv
695 ; eps <- liftIO $ hscEPS hsc_env
696 ; vis_orphs <- getVisibleOrphanMods
697 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
698 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
699
700 ; (rules1, spec_pairs, uds) <- -- pprTrace "specImport1" (vcat [ppr fn, ppr calls_for_fn, ppr rhs]) $
701 runSpecM dflags this_mod $
702 specCalls (Just this_mod) top_env rules_for_fn calls_for_fn fn rhs
703 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
704 -- After the rules kick in we may get recursion, but
705 -- we rely on a global GlomBinds to sort that out later
706 -- See Note [Glom the bindings if imported functions are specialised]
707
708 -- Now specialise any cascaded calls
709 ; (rules2, spec_binds2) <- -- pprTrace "specImport 2" (ppr fn $$ ppr rules1 $$ ppr spec_binds1) $
710 specImports dflags this_mod top_env
711 (extendVarSet done fn)
712 (fn:callers)
713 (extendRuleBaseList rb rules1)
714 (ud_calls uds)
715
716 -- Don't forget to wrap the specialized bindings with bindings
717 -- for the needed dictionaries
718 -- See Note [Wrap bindings returned by specImports]
719 ; let final_binds = wrapDictBinds (ud_binds uds)
720 (spec_binds2 ++ spec_binds1)
721
722 ; return (rules2 ++ rules1, final_binds) }
723
724 | warnMissingSpecs dflags callers
725 = do { warnMsg (vcat [ hang (text "Could not specialise imported function" <+> quotes (ppr fn))
726 2 (vcat [ text "when specialising" <+> quotes (ppr caller)
727 | caller <- callers])
728 , ifPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
729 , text "Probable fix: add INLINEABLE pragma on" <+> quotes (ppr fn) ])
730 ; return ([], []) }
731
732 | otherwise
733 = return ([], [])
734 where
735 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
736
737 warnMissingSpecs :: DynFlags -> [Id] -> Bool
738 -- See Note [Warning about missed specialisations]
739 warnMissingSpecs dflags callers
740 | wopt Opt_WarnAllMissedSpecs dflags = True
741 | not (wopt Opt_WarnMissedSpecs dflags) = False
742 | null callers = False
743 | otherwise = all has_inline_prag callers
744 where
745 has_inline_prag id = isAnyInlinePragma (idInlinePragma id)
746
747 wantSpecImport :: DynFlags -> Unfolding -> Bool
748 -- See Note [Specialise imported INLINABLE things]
749 wantSpecImport dflags unf
750 = case unf of
751 NoUnfolding -> False
752 OtherCon {} -> False
753 DFunUnfolding {} -> True
754 CoreUnfolding { uf_src = src, uf_guidance = _guidance }
755 | gopt Opt_SpecialiseAggressively dflags -> True
756 | isStableSource src -> True
757 -- Specialise even INLINE things; it hasn't inlined yet,
758 -- so perhaps it never will. Moreover it may have calls
759 -- inside it that we want to specialise
760 | otherwise -> False -- Stable, not INLINE, hence INLINEABLE
761
762 {- Note [Warning about missed specialisations]
763 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
764 Suppose
765 * In module Lib, you carefully mark a function 'foo' INLINEABLE
766 * Import Lib(foo) into another module M
767 * Call 'foo' at some specialised type in M
768 Then you jolly well expect it to be specialised in M. But what if
769 'foo' calls another fuction 'Lib.bar'. Then you'd like 'bar' to be
770 specialised too. But if 'bar' is not marked INLINEABLE it may well
771 not be specialised. The warning Opt_WarnMissedSpecs warns about this.
772
773 It's more noisy to warning about a missed specialisation opportunity
774 for /every/ overloaded imported function, but sometimes useful. That
775 is what Opt_WarnAllMissedSpecs does.
776
777 ToDo: warn about missed opportunities for local functions.
778
779 Note [Specialise imported INLINABLE things]
780 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
781 What imported functions do we specialise? The basic set is
782 * DFuns and things with INLINABLE pragmas.
783 but with -fspecialise-aggressively we add
784 * Anything with an unfolding template
785
786 Trac #8874 has a good example of why we want to auto-specialise DFuns.
787
788 We have the -fspecialise-aggressively flag (usually off), because we
789 risk lots of orphan modules from over-vigorous specialisation.
790 However it's not a big deal: anything non-recursive with an
791 unfolding-template will probably have been inlined already.
792
793 Note [Glom the bindings if imported functions are specialised]
794 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
795 Suppose we have an imported, *recursive*, INLINABLE function
796 f :: Eq a => a -> a
797 f = /\a \d x. ...(f a d)...
798 In the module being compiled we have
799 g x = f (x::Int)
800 Now we'll make a specialised function
801 f_spec :: Int -> Int
802 f_spec = \x -> ...(f Int dInt)...
803 {-# RULE f Int _ = f_spec #-}
804 g = \x. f Int dInt x
805 Note that f_spec doesn't look recursive
806 After rewriting with the RULE, we get
807 f_spec = \x -> ...(f_spec)...
808 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
809 The occurrence analyser never turns a NonRec into a Rec. So we must
810 make sure that f_spec is recursive. Easiest thing is to make all
811 the specialisations for imported bindings recursive.
812
813
814 Note [Avoiding recursive specialisation]
815 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
816 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
817 'f's RHS. So we want to specialise g,h. But we don't want to
818 specialise f any more! It's possible that f's RHS might have a
819 recursive yet-more-specialised call, so we'd diverge in that case.
820 And if the call is to the same type, one specialisation is enough.
821 Avoiding this recursive specialisation loop is the reason for the
822 'done' VarSet passed to specImports and specImport.
823
824 ************************************************************************
825 * *
826 \subsubsection{@specExpr@: the main function}
827 * *
828 ************************************************************************
829 -}
830
831 data SpecEnv
832 = SE { se_subst :: CoreSubst.Subst
833 -- We carry a substitution down:
834 -- a) we must clone any binding that might float outwards,
835 -- to avoid name clashes
836 -- b) we carry a type substitution to use when analysing
837 -- the RHS of specialised bindings (no type-let!)
838
839
840 , se_interesting :: VarSet
841 -- Dict Ids that we know something about
842 -- and hence may be worth specialising against
843 -- See Note [Interesting dictionary arguments]
844 }
845
846 specVar :: SpecEnv -> Id -> CoreExpr
847 specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v
848
849 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
850
851 ---------------- First the easy cases --------------------
852 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
853 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
854 specExpr env (Var v) = return (specVar env v, emptyUDs)
855 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
856 specExpr env (Cast e co)
857 = do { (e', uds) <- specExpr env e
858 ; return ((mkCast e' (substCo env co)), uds) }
859 specExpr env (Tick tickish body)
860 = do { (body', uds) <- specExpr env body
861 ; return (Tick (specTickish env tickish) body', uds) }
862
863 ---------------- Applications might generate a call instance --------------------
864 specExpr env expr@(App {})
865 = go expr []
866 where
867 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
868 (fun', uds_app) <- go fun (arg':args)
869 return (App fun' arg', uds_arg `plusUDs` uds_app)
870
871 go (Var f) args = case specVar env f of
872 Var f' -> return (Var f', mkCallUDs env f' args)
873 e' -> return (e', emptyUDs) -- I don't expect this!
874 go other _ = specExpr env other
875
876 ---------------- Lambda/case require dumping of usage details --------------------
877 specExpr env e@(Lam _ _) = do
878 (body', uds) <- specExpr env' body
879 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
880 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
881 where
882 (bndrs, body) = collectBinders e
883 (env', bndrs') = substBndrs env bndrs
884 -- More efficient to collect a group of binders together all at once
885 -- and we don't want to split a lambda group with dumped bindings
886
887 specExpr env (Case scrut case_bndr ty alts)
888 = do { (scrut', scrut_uds) <- specExpr env scrut
889 ; (scrut'', case_bndr', alts', alts_uds)
890 <- specCase env scrut' case_bndr alts
891 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
892 , scrut_uds `plusUDs` alts_uds) }
893
894 ---------------- Finally, let is the interesting case --------------------
895 specExpr env (Let bind body)
896 = do { -- Clone binders
897 (rhs_env, body_env, bind') <- cloneBindSM env bind
898
899 -- Deal with the body
900 ; (body', body_uds) <- specExpr body_env body
901
902 -- Deal with the bindings
903 ; (binds', uds) <- specBind rhs_env bind' body_uds
904
905 -- All done
906 ; return (foldr Let body' binds', uds) }
907
908 specTickish :: SpecEnv -> Tickish Id -> Tickish Id
909 specTickish env (Breakpoint ix ids)
910 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
911 -- drop vars from the list if they have a non-variable substitution.
912 -- should never happen, but it's harmless to drop them anyway.
913 specTickish _ other_tickish = other_tickish
914
915 specCase :: SpecEnv
916 -> CoreExpr -- Scrutinee, already done
917 -> Id -> [CoreAlt]
918 -> SpecM ( CoreExpr -- New scrutinee
919 , Id
920 , [CoreAlt]
921 , UsageDetails)
922 specCase env scrut' case_bndr [(con, args, rhs)]
923 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
924 , interestingDict env scrut'
925 , not (isDeadBinder case_bndr && null sc_args')
926 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
927
928 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
929 [(con, args', Var sc_arg')]
930 | sc_arg' <- sc_args' ]
931
932 -- Extend the substitution for RHS to map the *original* binders
933 -- to their floated verions.
934 mb_sc_flts :: [Maybe DictId]
935 mb_sc_flts = map (lookupVarEnv clone_env) args'
936 clone_env = zipVarEnv sc_args' sc_args_flt
937 subst_prs = (case_bndr, Var case_bndr_flt)
938 : [ (arg, Var sc_flt)
939 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
940 env_rhs' = env_rhs { se_subst = CoreSubst.extendIdSubstList (se_subst env_rhs) subst_prs
941 , se_interesting = se_interesting env_rhs `extendVarSetList`
942 (case_bndr_flt : sc_args_flt) }
943
944 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
945 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
946 case_bndr_set = unitVarSet case_bndr_flt
947 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
948 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
949 flt_binds = scrut_bind : sc_binds
950 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
951 all_uds = flt_binds `addDictBinds` free_uds
952 alt' = (con, args', wrapDictBindsE dumped_dbs rhs')
953 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
954 where
955 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
956 sc_args' = filter is_flt_sc_arg args'
957
958 clone_me bndr = do { uniq <- getUniqueM
959 ; return (mkUserLocalOrCoVar occ uniq ty loc) }
960 where
961 name = idName bndr
962 ty = idType bndr
963 occ = nameOccName name
964 loc = getSrcSpan name
965
966 arg_set = mkVarSet args'
967 is_flt_sc_arg var = isId var
968 && not (isDeadBinder var)
969 && isDictTy var_ty
970 && not (tyCoVarsOfType var_ty `intersectsVarSet` arg_set)
971 where
972 var_ty = idType var
973
974
975 specCase env scrut case_bndr alts
976 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
977 ; return (scrut, case_bndr', alts', uds_alts) }
978 where
979 (env_alt, case_bndr') = substBndr env case_bndr
980 spec_alt (con, args, rhs) = do
981 (rhs', uds) <- specExpr env_rhs rhs
982 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
983 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
984 where
985 (env_rhs, args') = substBndrs env_alt args
986
987 {-
988 Note [Floating dictionaries out of cases]
989 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
990 Consider
991 g = \d. case d of { MkD sc ... -> ...(f sc)... }
992 Naively we can't float d2's binding out of the case expression,
993 because 'sc' is bound by the case, and that in turn means we can't
994 specialise f, which seems a pity.
995
996 So we invert the case, by floating out a binding
997 for 'sc_flt' thus:
998 sc_flt = case d of { MkD sc ... -> sc }
999 Now we can float the call instance for 'f'. Indeed this is just
1000 what'll happen if 'sc' was originally bound with a let binding,
1001 but case is more efficient, and necessary with equalities. So it's
1002 good to work with both.
1003
1004 You might think that this won't make any difference, because the
1005 call instance will only get nuked by the \d. BUT if 'g' itself is
1006 specialised, then transitively we should be able to specialise f.
1007
1008 In general, given
1009 case e of cb { MkD sc ... -> ...(f sc)... }
1010 we transform to
1011 let cb_flt = e
1012 sc_flt = case cb_flt of { MkD sc ... -> sc }
1013 in
1014 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1015
1016 The "_flt" things are the floated binds; we use the current substitution
1017 to substitute sc -> sc_flt in the RHS
1018
1019 ************************************************************************
1020 * *
1021 Dealing with a binding
1022 * *
1023 ************************************************************************
1024 -}
1025
1026 specBind :: SpecEnv -- Use this for RHSs
1027 -> CoreBind
1028 -> UsageDetails -- Info on how the scope of the binding
1029 -> SpecM ([CoreBind], -- New bindings
1030 UsageDetails) -- And info to pass upstream
1031
1032 -- Returned UsageDetails:
1033 -- No calls for binders of this bind
1034 specBind rhs_env (NonRec fn rhs) body_uds
1035 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1036 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds fn rhs
1037
1038 ; let pairs = spec_defns ++ [(fn', rhs')]
1039 -- fn' mentions the spec_defns in its rules,
1040 -- so put the latter first
1041
1042 combined_uds = body_uds1 `plusUDs` rhs_uds
1043 -- This way round a call in rhs_uds of a function f
1044 -- at type T will override a call of f at T in body_uds1; and
1045 -- that is good because it'll tend to keep "earlier" calls
1046 -- See Note [Specialisation of dictionary functions]
1047
1048 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1049 -- See Note [From non-recursive to recursive]
1050
1051 final_binds :: [DictBind]
1052 final_binds
1053 | isEmptyBag dump_dbs = [mkDB $ NonRec b r | (b,r) <- pairs]
1054 | otherwise = [flattenDictBinds dump_dbs pairs]
1055
1056 ; if float_all then
1057 -- Rather than discard the calls mentioning the bound variables
1058 -- we float this binding along with the others
1059 return ([], free_uds `snocDictBinds` final_binds)
1060 else
1061 -- No call in final_uds mentions bound variables,
1062 -- so we can just leave the binding here
1063 return (map fst final_binds, free_uds) }
1064
1065
1066 specBind rhs_env (Rec pairs) body_uds
1067 -- Note [Specialising a recursive group]
1068 = do { let (bndrs,rhss) = unzip pairs
1069 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1070 ; let scope_uds = body_uds `plusUDs` rhs_uds
1071 -- Includes binds and calls arising from rhss
1072
1073 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1074
1075 ; (bndrs3, spec_defns3, uds3)
1076 <- if null spec_defns1 -- Common case: no specialisation
1077 then return (bndrs1, [], uds1)
1078 else do { -- Specialisation occurred; do it again
1079 (bndrs2, spec_defns2, uds2)
1080 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1081 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1082
1083 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1084 bind = flattenDictBinds dumped_dbs
1085 (spec_defns3 ++ zip bndrs3 rhss')
1086
1087 ; if float_all then
1088 return ([], final_uds `snocDictBind` bind)
1089 else
1090 return ([fst bind], final_uds) }
1091
1092
1093 ---------------------------
1094 specDefns :: SpecEnv
1095 -> UsageDetails -- Info on how it is used in its scope
1096 -> [(Id,CoreExpr)] -- The things being bound and their un-processed RHS
1097 -> SpecM ([Id], -- Original Ids with RULES added
1098 [(Id,CoreExpr)], -- Extra, specialised bindings
1099 UsageDetails) -- Stuff to fling upwards from the specialised versions
1100
1101 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1102 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1103 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1104 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1105 -- But not vice versa (it's a fixpoint problem).
1106
1107 specDefns _env uds []
1108 = return ([], [], uds)
1109 specDefns env uds ((bndr,rhs):pairs)
1110 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1111 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1112 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1113
1114 ---------------------------
1115 specDefn :: SpecEnv
1116 -> UsageDetails -- Info on how it is used in its scope
1117 -> Id -> CoreExpr -- The thing being bound and its un-processed RHS
1118 -> SpecM (Id, -- Original Id with added RULES
1119 [(Id,CoreExpr)], -- Extra, specialised bindings
1120 UsageDetails) -- Stuff to fling upwards from the specialised versions
1121
1122 specDefn env body_uds fn rhs
1123 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1124 rules_for_me = idCoreRules fn
1125 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1126 calls_for_me fn rhs
1127 ; return ( fn `addIdSpecialisations` rules
1128 , spec_defns
1129 , body_uds_without_me `plusUDs` spec_uds) }
1130 -- It's important that the `plusUDs` is this way
1131 -- round, because body_uds_without_me may bind
1132 -- dictionaries that are used in calls_for_me passed
1133 -- to specDefn. So the dictionary bindings in
1134 -- spec_uds may mention dictionaries bound in
1135 -- body_uds_without_me
1136
1137 ---------------------------
1138 specCalls :: Maybe Module -- Just this_mod => specialising imported fn
1139 -- Nothing => specialising local fn
1140 -> SpecEnv
1141 -> [CoreRule] -- Existing RULES for the fn
1142 -> [CallInfo]
1143 -> Id -> CoreExpr
1144 -> SpecM ([CoreRule], -- New RULES for the fn
1145 [(Id,CoreExpr)], -- Extra, specialised bindings
1146 UsageDetails) -- New usage details from the specialised RHSs
1147
1148 -- This function checks existing rules, and does not create
1149 -- duplicate ones. So the caller does not need to do this filtering.
1150 -- See 'already_covered'
1151
1152 specCalls mb_mod env rules_for_me calls_for_me fn rhs
1153 -- The first case is the interesting one
1154 | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas
1155 && rhs_ids `lengthAtLeast` n_dicts -- and enough dict args
1156 && notNull calls_for_me -- And there are some calls to specialise
1157 && not (isNeverActive (idInlineActivation fn))
1158 -- Don't specialise NOINLINE things
1159 -- See Note [Auto-specialisation and RULES]
1160
1161 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1162 -- See Note [Inline specialisation] for why we do not
1163 -- switch off specialisation for inline functions
1164
1165 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr rules_for_me) $
1166 do { stuff <- mapM spec_call calls_for_me
1167 ; let (spec_defns, spec_uds, spec_rules) = unzip3 (catMaybes stuff)
1168 ; return (spec_rules, spec_defns, plusUDList spec_uds) }
1169
1170 | otherwise -- No calls or RHS doesn't fit our preconceptions
1171 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1172 text "Missed specialisation opportunity for"
1173 <+> ppr fn $$ _trace_doc )
1174 -- Note [Specialisation shape]
1175 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1176 return ([], [], emptyUDs)
1177 where
1178 _trace_doc = sep [ ppr rhs_tyvars, ppr n_tyvars
1179 , ppr rhs_ids, ppr n_dicts
1180 , ppr (idInlineActivation fn) ]
1181
1182 fn_type = idType fn
1183 fn_arity = idArity fn
1184 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1185 (tyvars, theta, _) = tcSplitSigmaTy fn_type
1186 n_tyvars = length tyvars
1187 n_dicts = length theta
1188 inl_prag = idInlinePragma fn
1189 inl_act = inlinePragmaActivation inl_prag
1190 is_local = isLocalId fn
1191
1192 -- Figure out whether the function has an INLINE pragma
1193 -- See Note [Inline specialisations]
1194
1195 (rhs_tyvars, rhs_ids, rhs_body) = collectTyAndValBinders rhs
1196
1197 rhs_dict_ids = take n_dicts rhs_ids
1198 body = mkLams (drop n_dicts rhs_ids) rhs_body
1199 -- Glue back on the non-dict lambdas
1200
1201 already_covered :: DynFlags -> [CoreExpr] -> Bool
1202 already_covered dflags args -- Note [Specialisations already covered]
1203 = isJust (lookupRule dflags
1204 (CoreSubst.substInScope (se_subst env), realIdUnfolding)
1205 (const True)
1206 fn args rules_for_me)
1207
1208 mk_ty_args :: [Maybe Type] -> [TyVar] -> [CoreExpr]
1209 mk_ty_args [] poly_tvs
1210 = ASSERT( null poly_tvs ) []
1211 mk_ty_args (Nothing : call_ts) (poly_tv : poly_tvs)
1212 = Type (mkTyVarTy poly_tv) : mk_ty_args call_ts poly_tvs
1213 mk_ty_args (Just ty : call_ts) poly_tvs
1214 = Type ty : mk_ty_args call_ts poly_tvs
1215 mk_ty_args (Nothing : _) [] = panic "mk_ty_args"
1216
1217 ----------------------------------------------------------
1218 -- Specialise to one particular call pattern
1219 spec_call :: CallInfo -- Call instance
1220 -> SpecM (Maybe ((Id,CoreExpr), -- Specialised definition
1221 UsageDetails, -- Usage details from specialised body
1222 CoreRule)) -- Info for the Id's SpecEnv
1223 spec_call (CallKey call_ts, (call_ds, _))
1224 = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts )
1225
1226 -- Suppose f's defn is f = /\ a b c -> \ d1 d2 -> rhs
1227 -- Suppose the call is for f [Just t1, Nothing, Just t3] [dx1, dx2]
1228
1229 -- Construct the new binding
1230 -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b -> rhs)
1231 -- PLUS the usage-details
1232 -- { d1' = dx1; d2' = dx2 }
1233 -- where d1', d2' are cloned versions of d1,d2, with the type substitution
1234 -- applied. These auxiliary bindings just avoid duplication of dx1, dx2
1235 --
1236 -- Note that the substitution is applied to the whole thing.
1237 -- This is convenient, but just slightly fragile. Notably:
1238 -- * There had better be no name clashes in a/b/c
1239 do { let
1240 -- poly_tyvars = [b] in the example above
1241 -- spec_tyvars = [a,c]
1242 -- ty_args = [t1,b,t3]
1243 spec_tv_binds = [(tv,ty) | (tv, Just ty) <- rhs_tyvars `zip` call_ts]
1244 env1 = extendTCvSubstList env spec_tv_binds
1245 (rhs_env, poly_tyvars) = substBndrs env1
1246 [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
1247
1248 -- Clone rhs_dicts, including instantiating their types
1249 ; inst_dict_ids <- mapM (newDictBndr rhs_env) rhs_dict_ids
1250 ; let (rhs_env2, dx_binds, spec_dict_args)
1251 = bindAuxiliaryDicts rhs_env rhs_dict_ids call_ds inst_dict_ids
1252 ty_args = mk_ty_args call_ts poly_tyvars
1253 rule_args = ty_args ++ map Var inst_dict_ids
1254 rule_bndrs = poly_tyvars ++ inst_dict_ids
1255
1256 ; dflags <- getDynFlags
1257 ; if already_covered dflags rule_args then
1258 return Nothing
1259 else do
1260 { -- Figure out the type of the specialised function
1261 let body_ty = applyTypeToArgs rhs fn_type rule_args
1262 (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted
1263 | isUnliftedType body_ty -- C.f. WwLib.mkWorkerArgs
1264 = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [voidPrimId])
1265 | otherwise = (poly_tyvars, poly_tyvars)
1266 spec_id_ty = mkPiTypes lam_args body_ty
1267
1268 ; spec_f <- newSpecIdSM fn spec_id_ty
1269 ; (spec_rhs, rhs_uds) <- specExpr rhs_env2 (mkLams lam_args body)
1270 ; this_mod <- getModule
1271 ; let
1272 -- The rule to put in the function's specialisation is:
1273 -- forall b, d1',d2'. f t1 b t3 d1' d2' = f1 b
1274 herald = case mb_mod of
1275 Nothing -- Specialising local fn
1276 -> text "SPEC"
1277 Just this_mod -- Specialising imoprted fn
1278 -> text "SPEC/" <> ppr this_mod
1279
1280 rule_name = mkFastString $ showSDocForUser dflags neverQualify $
1281 herald <+> ppr fn <+> hsep (map ppr_call_key_ty call_ts)
1282 -- This name ends up in interface files, so use showSDocForUser,
1283 -- otherwise uniques end up there, making builds
1284 -- less deterministic (See #4012 comment:61 ff)
1285
1286 spec_env_rule = mkRule
1287 this_mod
1288 True {- Auto generated -}
1289 is_local
1290 rule_name
1291 inl_act -- Note [Auto-specialisation and RULES]
1292 (idName fn)
1293 rule_bndrs
1294 rule_args
1295 (mkVarApps (Var spec_f) app_args)
1296
1297 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1298 final_uds = foldr consDictBind rhs_uds dx_binds
1299
1300 --------------------------------------
1301 -- Add a suitable unfolding if the spec_inl_prag says so
1302 -- See Note [Inline specialisations]
1303 (spec_inl_prag, spec_unf)
1304 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1305 = (neverInlinePragma, noUnfolding)
1306 -- See Note [Specialising imported functions] in OccurAnal
1307
1308 | InlinePragma { inl_inline = Inlinable } <- inl_prag
1309 = (inl_prag { inl_inline = EmptyInlineSpec }, noUnfolding)
1310
1311 | otherwise
1312 = (inl_prag, specUnfolding dflags (se_subst env)
1313 poly_tyvars (ty_args ++ spec_dict_args)
1314 fn_unf)
1315
1316 --------------------------------------
1317 -- Adding arity information just propagates it a bit faster
1318 -- See Note [Arity decrease] in Simplify
1319 -- Copy InlinePragma information from the parent Id.
1320 -- So if f has INLINE[1] so does spec_f
1321 spec_f_w_arity = spec_f `setIdArity` max 0 (fn_arity - n_dicts)
1322 `setInlinePragma` spec_inl_prag
1323 `setIdUnfolding` spec_unf
1324
1325 ; return (Just ((spec_f_w_arity, spec_rhs), final_uds, spec_env_rule)) } }
1326
1327 {-
1328 Note [Orphans and auto-generated rules]
1329 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1330 When we specialise an INLINEABLE function, or when we have
1331 -fspecialise-aggressively, we auto-generate RULES that are orphans.
1332 We don't want to warn about these, or we'd generate a lot of warnings.
1333 Thus, we only warn about user-specified orphan rules.
1334
1335 Indeed, we don't even treat the module as an orphan module if it has
1336 auto-generated *rule* orphans. Orphan modules are read every time we
1337 compile, so they are pretty obtrusive and slow down every compilation,
1338 even non-optimised ones. (Reason: for type class instances it's a
1339 type correctness issue.) But specialisation rules are strictly for
1340 *optimisation* only so it's fine not to read the interface.
1341
1342 What this means is that a SPEC rules from auto-specialisation in
1343 module M will be used in other modules only if M.hi has been read for
1344 some other reason, which is actually pretty likely.
1345 -}
1346
1347 bindAuxiliaryDicts
1348 :: SpecEnv
1349 -> [DictId] -> [CoreExpr] -- Original dict bndrs, and the witnessing expressions
1350 -> [DictId] -- A cloned dict-id for each dict arg
1351 -> (SpecEnv, -- Substitute for all orig_dicts
1352 [DictBind], -- Auxiliary dict bindings
1353 [CoreExpr]) -- Witnessing expressions (all trivial)
1354 -- Bind any dictionary arguments to fresh names, to preserve sharing
1355 bindAuxiliaryDicts env@(SE { se_subst = subst, se_interesting = interesting })
1356 orig_dict_ids call_ds inst_dict_ids
1357 = (env', dx_binds, spec_dict_args)
1358 where
1359 (dx_binds, spec_dict_args) = go call_ds inst_dict_ids
1360 env' = env { se_subst = CoreSubst.extendIdSubstList subst (orig_dict_ids `zip` spec_dict_args)
1361 , se_interesting = interesting `unionVarSet` interesting_dicts }
1362
1363 interesting_dicts = mkVarSet [ dx_id | (NonRec dx_id dx, _) <- dx_binds
1364 , interestingDict env dx ]
1365 -- See Note [Make the new dictionaries interesting]
1366
1367 go :: [CoreExpr] -> [CoreBndr] -> ([DictBind], [CoreExpr])
1368 go [] _ = ([], [])
1369 go (dx:dxs) (dx_id:dx_ids)
1370 | exprIsTrivial dx = (dx_binds, dx:args)
1371 | otherwise = (mkDB (NonRec dx_id dx) : dx_binds, Var dx_id : args)
1372 where
1373 (dx_binds, args) = go dxs dx_ids
1374 -- In the first case extend the substitution but not bindings;
1375 -- in the latter extend the bindings but not the substitution.
1376 -- For the former, note that we bind the *original* dict in the substitution,
1377 -- overriding any d->dx_id binding put there by substBndrs
1378 go _ _ = pprPanic "bindAuxiliaryDicts" (ppr orig_dict_ids $$ ppr call_ds $$ ppr inst_dict_ids)
1379
1380 {-
1381 Note [Make the new dictionaries interesting]
1382 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1383 Important! We're going to substitute dx_id1 for d
1384 and we want it to look "interesting", else we won't gather *any*
1385 consequential calls. E.g.
1386 f d = ...g d....
1387 If we specialise f for a call (f (dfun dNumInt)), we'll get
1388 a consequent call (g d') with an auxiliary definition
1389 d' = df dNumInt
1390 We want that consequent call to look interesting
1391
1392
1393 Note [From non-recursive to recursive]
1394 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1395 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1396 have built a recursive knot
1397
1398 f a d x = <blah>
1399 MkUD { ud_binds = d7 = MkD ..f..
1400 , ud_calls = ...(f T d7)... }
1401
1402 The we generate
1403
1404 Rec { fs x = <blah>[T/a, d7/d]
1405 f a d x = <blah>
1406 RULE f T _ = fs
1407 d7 = ...f... }
1408
1409 Here the recursion is only through the RULE.
1410
1411
1412 Note [Specialisation of dictionary functions]
1413 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1414 Here is a nasty example that bit us badly: see Trac #3591
1415
1416 class Eq a => C a
1417 instance Eq [a] => C [a]
1418
1419 ---------------
1420 dfun :: Eq [a] -> C [a]
1421 dfun a d = MkD a d (meth d)
1422
1423 d4 :: Eq [T] = <blah>
1424 d2 :: C [T] = dfun T d4
1425 d1 :: Eq [T] = $p1 d2
1426 d3 :: C [T] = dfun T d1
1427
1428 None of these definitions is recursive. What happened was that we
1429 generated a specialisation:
1430
1431 RULE forall d. dfun T d = dT :: C [T]
1432 dT = (MkD a d (meth d)) [T/a, d1/d]
1433 = MkD T d1 (meth d1)
1434
1435 But now we use the RULE on the RHS of d2, to get
1436
1437 d2 = dT = MkD d1 (meth d1)
1438 d1 = $p1 d2
1439
1440 and now d1 is bottom! The problem is that when specialising 'dfun' we
1441 should first dump "below" the binding all floated dictionary bindings
1442 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1443 placed below 'dfun', and thus unavailable to it when specialising
1444 'dfun'. That in turn means that the call (dfun T d1) must be
1445 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1446 d4 doesn't mention dfun.
1447
1448 But look at this:
1449
1450 class C a where { foo,bar :: [a] -> [a] }
1451
1452 instance C Int where
1453 foo x = r_bar x
1454 bar xs = reverse xs
1455
1456 r_bar :: C a => [a] -> [a]
1457 r_bar xs = bar (xs ++ xs)
1458
1459 That translates to:
1460
1461 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1462
1463 Rec { $fCInt :: C Int = MkC foo_help reverse
1464 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1465
1466 The call (r_bar $fCInt) mentions $fCInt,
1467 which mentions foo_help,
1468 which mentions r_bar
1469 But we DO want to specialise r_bar at Int:
1470
1471 Rec { $fCInt :: C Int = MkC foo_help reverse
1472 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1473
1474 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1475 RULE r_bar Int _ = r_bar_Int
1476
1477 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1478 }
1479
1480 Note that, because of its RULE, r_bar joins the recursive
1481 group. (In this case it'll unravel a short moment later.)
1482
1483
1484 Conclusion: we catch the nasty case using filter_dfuns in
1485 callsForMe. To be honest I'm not 100% certain that this is 100%
1486 right, but it works. Sigh.
1487
1488
1489 Note [Specialising a recursive group]
1490 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1491 Consider
1492 let rec { f x = ...g x'...
1493 ; g y = ...f y'.... }
1494 in f 'a'
1495 Here we specialise 'f' at Char; but that is very likely to lead to
1496 a specialisation of 'g' at Char. We must do the latter, else the
1497 whole point of specialisation is lost.
1498
1499 But we do not want to keep iterating to a fixpoint, because in the
1500 presence of polymorphic recursion we might generate an infinite number
1501 of specialisations.
1502
1503 So we use the following heuristic:
1504 * Arrange the rec block in dependency order, so far as possible
1505 (the occurrence analyser already does this)
1506
1507 * Specialise it much like a sequence of lets
1508
1509 * Then go through the block a second time, feeding call-info from
1510 the RHSs back in the bottom, as it were
1511
1512 In effect, the ordering maxmimises the effectiveness of each sweep,
1513 and we do just two sweeps. This should catch almost every case of
1514 monomorphic recursion -- the exception could be a very knotted-up
1515 recursion with multiple cycles tied up together.
1516
1517 This plan is implemented in the Rec case of specBindItself.
1518
1519 Note [Specialisations already covered]
1520 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1521 We obviously don't want to generate two specialisations for the same
1522 argument pattern. There are two wrinkles
1523
1524 1. We do the already-covered test in specDefn, not when we generate
1525 the CallInfo in mkCallUDs. We used to test in the latter place, but
1526 we now iterate the specialiser somewhat, and the Id at the call site
1527 might therefore not have all the RULES that we can see in specDefn
1528
1529 2. What about two specialisations where the second is an *instance*
1530 of the first? If the more specific one shows up first, we'll generate
1531 specialisations for both. If the *less* specific one shows up first,
1532 we *don't* currently generate a specialisation for the more specific
1533 one. (See the call to lookupRule in already_covered.) Reasons:
1534 (a) lookupRule doesn't say which matches are exact (bad reason)
1535 (b) if the earlier specialisation is user-provided, it's
1536 far from clear that we should auto-specialise further
1537
1538 Note [Auto-specialisation and RULES]
1539 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1540 Consider:
1541 g :: Num a => a -> a
1542 g = ...
1543
1544 f :: (Int -> Int) -> Int
1545 f w = ...
1546 {-# RULE f g = 0 #-}
1547
1548 Suppose that auto-specialisation makes a specialised version of
1549 g::Int->Int That version won't appear in the LHS of the RULE for f.
1550 So if the specialisation rule fires too early, the rule for f may
1551 never fire.
1552
1553 It might be possible to add new rules, to "complete" the rewrite system.
1554 Thus when adding
1555 RULE forall d. g Int d = g_spec
1556 also add
1557 RULE f g_spec = 0
1558
1559 But that's a bit complicated. For now we ask the programmer's help,
1560 by *copying the INLINE activation pragma* to the auto-specialised
1561 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1562 will also not be active until phase 2. And that's what programmers
1563 should jolly well do anyway, even aside from specialisation, to ensure
1564 that g doesn't inline too early.
1565
1566 This in turn means that the RULE would never fire for a NOINLINE
1567 thing so not much point in generating a specialisation at all.
1568
1569 Note [Specialisation shape]
1570 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1571 We only specialise a function if it has visible top-level lambdas
1572 corresponding to its overloading. E.g. if
1573 f :: forall a. Eq a => ....
1574 then its body must look like
1575 f = /\a. \d. ...
1576
1577 Reason: when specialising the body for a call (f ty dexp), we want to
1578 substitute dexp for d, and pick up specialised calls in the body of f.
1579
1580 This doesn't always work. One example I came across was this:
1581 newtype Gen a = MkGen{ unGen :: Int -> a }
1582
1583 choose :: Eq a => a -> Gen a
1584 choose n = MkGen (\r -> n)
1585
1586 oneof = choose (1::Int)
1587
1588 It's a silly exapmle, but we get
1589 choose = /\a. g `cast` co
1590 where choose doesn't have any dict arguments. Thus far I have not
1591 tried to fix this (wait till there's a real example).
1592
1593 Mind you, then 'choose' will be inlined (since RHS is trivial) so
1594 it doesn't matter. This comes up with single-method classes
1595
1596 class C a where { op :: a -> a }
1597 instance C a => C [a] where ....
1598 ==>
1599 $fCList :: C a => C [a]
1600 $fCList = $copList |> (...coercion>...)
1601 ....(uses of $fCList at particular types)...
1602
1603 So we suppress the WARN if the rhs is trivial.
1604
1605 Note [Inline specialisations]
1606 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1607 Here is what we do with the InlinePragma of the original function
1608 * Activation/RuleMatchInfo: both transferred to the
1609 specialised function
1610 * InlineSpec:
1611 (a) An INLINE pragma is transferred
1612 (b) An INLINABLE pragma is *not* transferred
1613
1614 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
1615 specialise the function at its call site, and arguably that's not so
1616 important for the specialised copies. BUT *pragma-directed*
1617 specialisation now takes place in the typechecker/desugarer, with
1618 manually specified INLINEs. The specialisation here is automatic.
1619 It'd be very odd if a function marked INLINE was specialised (because
1620 of some local use), and then forever after (including importing
1621 modules) the specialised version wasn't INLINEd. After all, the
1622 programmer said INLINE!
1623
1624 You might wonder why we specialise INLINE functions at all. After
1625 all they should be inlined, right? Two reasons:
1626
1627 * Even INLINE functions are sometimes not inlined, when they aren't
1628 applied to interesting arguments. But perhaps the type arguments
1629 alone are enough to specialise (even though the args are too boring
1630 to trigger inlining), and it's certainly better to call the
1631 specialised version.
1632
1633 * The RHS of an INLINE function might call another overloaded function,
1634 and we'd like to generate a specialised version of that function too.
1635 This actually happens a lot. Consider
1636 replicateM_ :: (Monad m) => Int -> m a -> m ()
1637 {-# INLINABLE replicateM_ #-}
1638 replicateM_ d x ma = ...
1639 The strictness analyser may transform to
1640 replicateM_ :: (Monad m) => Int -> m a -> m ()
1641 {-# INLINE replicateM_ #-}
1642 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
1643
1644 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
1645 {-# INLINABLE $wreplicateM_ #-}
1646 $wreplicateM_ = ...
1647 Now an importing module has a specialised call to replicateM_, say
1648 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
1649 This particular example had a huge effect on the call to replicateM_
1650 in nofib/shootout/n-body.
1651
1652 Why (b): discard INLINEABLE pragmas? See Trac #4874 for persuasive examples.
1653 Suppose we have
1654 {-# INLINABLE f #-}
1655 f :: Ord a => [a] -> Int
1656 f xs = letrec f' = ...f'... in f'
1657 Then, when f is specialised and optimised we might get
1658 wgo :: [Int] -> Int#
1659 wgo = ...wgo...
1660 f_spec :: [Int] -> Int
1661 f_spec xs = case wgo xs of { r -> I# r }
1662 and we clearly want to inline f_spec at call sites. But if we still
1663 have the big, un-optimised of f (albeit specialised) captured in an
1664 INLINABLE pragma for f_spec, we won't get that optimisation.
1665
1666 So we simply drop INLINABLE pragmas when specialising. It's not really
1667 a complete solution; ignoring specialisation for now, INLINABLE functions
1668 don't get properly strictness analysed, for example. But it works well
1669 for examples involving specialisation, which is the dominant use of
1670 INLINABLE. See Trac #4874.
1671
1672
1673 ************************************************************************
1674 * *
1675 \subsubsection{UsageDetails and suchlike}
1676 * *
1677 ************************************************************************
1678 -}
1679
1680 data UsageDetails
1681 = MkUD {
1682 ud_binds :: !(Bag DictBind),
1683 -- Floated dictionary bindings
1684 -- The order is important;
1685 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
1686 -- (Remember, Bags preserve order in GHC.)
1687
1688 ud_calls :: !CallDetails
1689
1690 -- INVARIANT: suppose bs = bindersOf ud_binds
1691 -- Then 'calls' may *mention* 'bs',
1692 -- but there should be no calls *for* bs
1693 }
1694
1695 instance Outputable UsageDetails where
1696 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
1697 = text "MkUD" <+> braces (sep (punctuate comma
1698 [text "binds" <+> equals <+> ppr dbs,
1699 text "calls" <+> equals <+> ppr calls]))
1700
1701 -- | A 'DictBind' is a binding along with a cached set containing its free
1702 -- variables (both type variables and dictionaries)
1703 type DictBind = (CoreBind, VarSet)
1704
1705 type DictExpr = CoreExpr
1706
1707 emptyUDs :: UsageDetails
1708 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyVarEnv }
1709
1710 ------------------------------------------------------------
1711 type CallDetails = IdEnv CallInfoSet
1712 newtype CallKey = CallKey [Maybe Type] -- Nothing => unconstrained type argument
1713
1714 -- CallInfo uses a Map, thereby ensuring that
1715 -- we record only one call instance for any key
1716 --
1717 -- The list of types and dictionaries is guaranteed to
1718 -- match the type of f
1719 data CallInfoSet = CIS Id (Map CallKey ([DictExpr], VarSet))
1720 -- Range is dict args and the vars of the whole
1721 -- call (including tyvars)
1722 -- [*not* include the main id itself, of course]
1723
1724 type CallInfo = (CallKey, ([DictExpr], VarSet))
1725
1726 instance Outputable CallInfoSet where
1727 ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
1728 2 (ppr map)
1729
1730 pprCallInfo :: Id -> CallInfo -> SDoc
1731 pprCallInfo fn (CallKey mb_tys, (_dxs, _))
1732 = hang (ppr fn)
1733 2 (fsep (map ppr_call_key_ty mb_tys {- ++ map pprParendExpr _dxs -}))
1734
1735 ppr_call_key_ty :: Maybe Type -> SDoc
1736 ppr_call_key_ty Nothing = char '_'
1737 ppr_call_key_ty (Just ty) = char '@' <+> pprParendType ty
1738
1739 instance Outputable CallKey where
1740 ppr (CallKey ts) = ppr ts
1741
1742 -- Type isn't an instance of Ord, so that we can control which
1743 -- instance we use. That's tiresome here. Oh well
1744 instance Eq CallKey where
1745 k1 == k2 = case k1 `compare` k2 of { EQ -> True; _ -> False }
1746
1747 instance Ord CallKey where
1748 compare (CallKey k1) (CallKey k2) = cmpList cmp k1 k2
1749 where
1750 cmp Nothing Nothing = EQ
1751 cmp Nothing (Just _) = LT
1752 cmp (Just _) Nothing = GT
1753 cmp (Just t1) (Just t2) = cmpType t1 t2
1754
1755 unionCalls :: CallDetails -> CallDetails -> CallDetails
1756 unionCalls c1 c2 = plusVarEnv_C unionCallInfoSet c1 c2
1757
1758 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
1759 unionCallInfoSet (CIS f calls1) (CIS _ calls2) = CIS f (calls1 `Map.union` calls2)
1760
1761 callDetailsFVs :: CallDetails -> VarSet
1762 callDetailsFVs calls = foldVarEnv (unionVarSet . callInfoFVs) emptyVarSet calls
1763
1764 callInfoFVs :: CallInfoSet -> VarSet
1765 callInfoFVs (CIS _ call_info) = Map.foldRight (\(_,fv) vs -> unionVarSet fv vs) emptyVarSet call_info
1766
1767 ------------------------------------------------------------
1768 singleCall :: Id -> [Maybe Type] -> [DictExpr] -> UsageDetails
1769 singleCall id tys dicts
1770 = MkUD {ud_binds = emptyBag,
1771 ud_calls = unitVarEnv id $ CIS id $
1772 Map.singleton (CallKey tys) (dicts, call_fvs) }
1773 where
1774 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
1775 tys_fvs = tyCoVarsOfTypes (catMaybes tys)
1776 -- The type args (tys) are guaranteed to be part of the dictionary
1777 -- types, because they are just the constrained types,
1778 -- and the dictionary is therefore sure to be bound
1779 -- inside the binding for any type variables free in the type;
1780 -- hence it's safe to neglect tyvars free in tys when making
1781 -- the free-var set for this call
1782 -- BUT I don't trust this reasoning; play safe and include tys_fvs
1783 --
1784 -- We don't include the 'id' itself.
1785
1786 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
1787 mkCallUDs env f args
1788 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
1789 res
1790 where
1791 res = mkCallUDs' env f args
1792
1793 mkCallUDs' env f args
1794 | not (want_calls_for f) -- Imported from elsewhere
1795 || null theta -- Not overloaded
1796 = emptyUDs
1797
1798 | not (all type_determines_value theta)
1799 || not (spec_tys `lengthIs` n_tyvars)
1800 || not ( dicts `lengthIs` n_dicts)
1801 || not (any (interestingDict env) dicts) -- Note [Interesting dictionary arguments]
1802 -- See also Note [Specialisations already covered]
1803 = -- pprTrace "mkCallUDs: discarding" _trace_doc
1804 emptyUDs -- Not overloaded, or no specialisation wanted
1805
1806 | otherwise
1807 = -- pprTrace "mkCallUDs: keeping" _trace_doc
1808 singleCall f spec_tys dicts
1809 where
1810 _trace_doc = vcat [ppr f, ppr args, ppr n_tyvars, ppr n_dicts
1811 , ppr (map (interestingDict env) dicts)]
1812 (tyvars, theta, _) = tcSplitSigmaTy (idType f)
1813 constrained_tyvars = tyCoVarsOfTypes theta
1814 n_tyvars = length tyvars
1815 n_dicts = length theta
1816
1817 spec_tys = [mk_spec_ty tv ty | (tv, ty) <- tyvars `type_zip` args]
1818 dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
1819
1820 -- ignores Coercion arguments
1821 type_zip :: [TyVar] -> [CoreExpr] -> [(TyVar, Type)]
1822 type_zip tvs (Coercion _ : args) = type_zip tvs args
1823 type_zip (tv:tvs) (Type ty : args) = (tv, ty) : type_zip tvs args
1824 type_zip _ _ = []
1825
1826 mk_spec_ty tyvar ty
1827 | tyvar `elemVarSet` constrained_tyvars = Just ty
1828 | otherwise = Nothing
1829
1830 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
1831 -- For imported things, we gather call instances if
1832 -- there is an unfolding that we could in principle specialise
1833 -- We might still decide not to use it (consulting dflags)
1834 -- in specImports
1835 -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
1836
1837 type_determines_value pred -- See Note [Type determines value]
1838 = case classifyPredType pred of
1839 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs
1840 EqPred {} -> True
1841 IrredPred {} -> True -- Things like (D []) where D is a
1842 -- Constraint-ranged family; Trac #7785
1843
1844 {-
1845 Note [Type determines value]
1846 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1847 Only specialise if all overloading is on non-IP *class* params,
1848 because these are the ones whose *type* determines their *value*. In
1849 parrticular, with implicit params, the type args *don't* say what the
1850 value of the implicit param is! See Trac #7101
1851
1852 However, consider
1853 type family D (v::*->*) :: Constraint
1854 type instance D [] = ()
1855 f :: D v => v Char -> Int
1856 If we see a call (f "foo"), we'll pass a "dictionary"
1857 () |> (g :: () ~ D [])
1858 and it's good to specialise f at this dictionary.
1859
1860 So the question is: can an implicit parameter "hide inside" a
1861 type-family constraint like (D a). Well, no. We don't allow
1862 type instance D Maybe = ?x:Int
1863 Hence the IrredPred case in type_determines_value.
1864 See Trac #7785.
1865
1866 Note [Interesting dictionary arguments]
1867 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1868 Consider this
1869 \a.\d:Eq a. let f = ... in ...(f d)...
1870 There really is not much point in specialising f wrt the dictionary d,
1871 because the code for the specialised f is not improved at all, because
1872 d is lambda-bound. We simply get junk specialisations.
1873
1874 What is "interesting"? Just that it has *some* structure. But what about
1875 variables?
1876
1877 * A variable might be imported, in which case its unfolding
1878 will tell us whether it has useful structure
1879
1880 * Local variables are cloned on the way down (to avoid clashes when
1881 we float dictionaries), and cloning drops the unfolding
1882 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
1883 nuisance to give them unfoldings. So we keep track of the
1884 "interesting" dictionaries as a VarSet in SpecEnv.
1885 We have to take care to put any new interesting dictionary
1886 bindings in the set.
1887
1888 We accidentally lost accurate tracking of local variables for a long
1889 time, because cloned variables don't have unfoldings. But makes a
1890 massive difference in a few cases, eg Trac #5113. For nofib as a
1891 whole it's only a small win: 2.2% improvement in allocation for ansi,
1892 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
1893 -}
1894
1895 interestingDict :: SpecEnv -> CoreExpr -> Bool
1896 -- A dictionary argument is interesting if it has *some* structure
1897 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
1898 || isDataConWorkId v
1899 || v `elemVarSet` se_interesting env
1900 interestingDict _ (Type _) = False
1901 interestingDict _ (Coercion _) = False
1902 interestingDict env (App fn (Type _)) = interestingDict env fn
1903 interestingDict env (App fn (Coercion _)) = interestingDict env fn
1904 interestingDict env (Tick _ a) = interestingDict env a
1905 interestingDict env (Cast e _) = interestingDict env e
1906 interestingDict _ _ = True
1907
1908 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
1909 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
1910 (MkUD {ud_binds = db2, ud_calls = calls2})
1911 = MkUD { ud_binds = db1 `unionBags` db2
1912 , ud_calls = calls1 `unionCalls` calls2 }
1913
1914 plusUDList :: [UsageDetails] -> UsageDetails
1915 plusUDList = foldr plusUDs emptyUDs
1916
1917 -----------------------------
1918 _dictBindBndrs :: Bag DictBind -> [Id]
1919 _dictBindBndrs dbs = foldrBag ((++) . bindersOf . fst) [] dbs
1920
1921 -- | Construct a 'DictBind' from a 'CoreBind'
1922 mkDB :: CoreBind -> DictBind
1923 mkDB bind = (bind, bind_fvs bind)
1924
1925 -- | Identify the free variables of a 'CoreBind'
1926 bind_fvs :: CoreBind -> VarSet
1927 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
1928 bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs
1929 where
1930 bndrs = map fst prs
1931 rhs_fvs = unionVarSets (map pair_fvs prs)
1932
1933 pair_fvs :: (Id, CoreExpr) -> VarSet
1934 pair_fvs (bndr, rhs) = exprFreeVars rhs `unionVarSet` idFreeVars bndr
1935 -- Don't forget variables mentioned in the
1936 -- rules of the bndr. C.f. OccAnal.addRuleUsage
1937 -- Also tyvars mentioned in its type; they may not appear in the RHS
1938 -- type T a = Int
1939 -- x :: T a = 3
1940
1941 -- | Flatten a set of 'DictBind's and some other binding pairs into a single
1942 -- recursive binding, including some additional bindings.
1943 flattenDictBinds :: Bag DictBind -> [(Id,CoreExpr)] -> DictBind
1944 flattenDictBinds dbs pairs
1945 = (Rec bindings, fvs)
1946 where
1947 (bindings, fvs) = foldrBag add
1948 ([], emptyVarSet)
1949 (dbs `snocBag` mkDB (Rec pairs))
1950 add (NonRec b r, fvs') (pairs, fvs) =
1951 ((b,r) : pairs, fvs `unionVarSet` fvs')
1952 add (Rec prs1, fvs') (pairs, fvs) =
1953 (prs1 ++ pairs, fvs `unionVarSet` fvs')
1954
1955 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
1956 -- Add ud_binds to the tail end of the bindings in uds
1957 snocDictBinds uds dbs
1958 = uds { ud_binds = ud_binds uds `unionBags`
1959 foldr consBag emptyBag dbs }
1960
1961 consDictBind :: DictBind -> UsageDetails -> UsageDetails
1962 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
1963
1964 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
1965 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
1966
1967 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
1968 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
1969
1970 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
1971 wrapDictBinds dbs binds
1972 = foldrBag add binds dbs
1973 where
1974 add (bind,_) binds = bind : binds
1975
1976 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
1977 wrapDictBindsE dbs expr
1978 = foldrBag add expr dbs
1979 where
1980 add (bind,_) expr = Let bind expr
1981
1982 ----------------------
1983 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
1984 -- Used at a lambda or case binder; just dump anything mentioning the binder
1985 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
1986 | null bndrs = (uds, emptyBag) -- Common in case alternatives
1987 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
1988 (free_uds, dump_dbs)
1989 where
1990 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
1991 bndr_set = mkVarSet bndrs
1992 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
1993 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
1994 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
1995 -- no calls for any of the dicts in dump_dbs
1996
1997 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
1998 -- Used at a lambda or case binder; just dump anything mentioning the binder
1999 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2000 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2001 (free_uds, dump_dbs, float_all)
2002 where
2003 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2004 bndr_set = mkVarSet bndrs
2005 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2006 free_calls = deleteCallsFor bndrs orig_calls
2007 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
2008
2009 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
2010 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2011 = -- pprTrace ("callsForMe")
2012 -- (vcat [ppr fn,
2013 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
2014 -- text "Orig calls =" <+> ppr orig_calls,
2015 -- text "Dep set =" <+> ppr dep_set,
2016 -- text "Calls for me =" <+> ppr calls_for_me]) $
2017 (uds_without_me, calls_for_me)
2018 where
2019 uds_without_me = MkUD { ud_binds = orig_dbs, ud_calls = delVarEnv orig_calls fn }
2020 calls_for_me = case lookupVarEnv orig_calls fn of
2021 Nothing -> []
2022 Just (CIS _ calls) -> filter_dfuns (Map.toList calls)
2023
2024 dep_set = foldlBag go (unitVarSet fn) orig_dbs
2025 go dep_set (db,fvs) | fvs `intersectsVarSet` dep_set
2026 = extendVarSetList dep_set (bindersOf db)
2027 | otherwise = dep_set
2028
2029 -- Note [Specialisation of dictionary functions]
2030 filter_dfuns | isDFunId fn = filter ok_call
2031 | otherwise = \cs -> cs
2032
2033 ok_call (_, (_,fvs)) = not (fvs `intersectsVarSet` dep_set)
2034
2035 ----------------------
2036 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2037 -- Returns (free_dbs, dump_dbs, dump_set)
2038 splitDictBinds dbs bndr_set
2039 = foldlBag split_db (emptyBag, emptyBag, bndr_set) dbs
2040 -- Important that it's foldl not foldr;
2041 -- we're accumulating the set of dumped ids in dump_set
2042 where
2043 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
2044 | dump_idset `intersectsVarSet` fvs -- Dump it
2045 = (free_dbs, dump_dbs `snocBag` db,
2046 extendVarSetList dump_idset (bindersOf bind))
2047
2048 | otherwise -- Don't dump it
2049 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2050
2051
2052 ----------------------
2053 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2054 -- Remove calls *mentioning* bs
2055 deleteCallsMentioning bs calls
2056 = mapVarEnv filter_calls calls
2057 where
2058 filter_calls :: CallInfoSet -> CallInfoSet
2059 filter_calls (CIS f calls) = CIS f (Map.filter keep_call calls)
2060 keep_call (_, fvs) = not (fvs `intersectsVarSet` bs)
2061
2062 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
2063 -- Remove calls *for* bs
2064 deleteCallsFor bs calls = delVarEnvList calls bs
2065
2066 {-
2067 ************************************************************************
2068 * *
2069 \subsubsection{Boring helper functions}
2070 * *
2071 ************************************************************************
2072 -}
2073
2074 newtype SpecM a = SpecM (State SpecState a)
2075
2076 data SpecState = SpecState {
2077 spec_uniq_supply :: UniqSupply,
2078 spec_module :: Module,
2079 spec_dflags :: DynFlags
2080 }
2081
2082 instance Functor SpecM where
2083 fmap = liftM
2084
2085 instance Applicative SpecM where
2086 pure x = SpecM $ return x
2087 (<*>) = ap
2088
2089 instance Monad SpecM where
2090 SpecM x >>= f = SpecM $ do y <- x
2091 case f y of
2092 SpecM z ->
2093 z
2094 fail str = SpecM $ fail str
2095
2096 #if __GLASGOW_HASKELL__ > 710
2097 instance MonadFail.MonadFail SpecM where
2098 fail str = SpecM $ fail str
2099 #endif
2100
2101 instance MonadUnique SpecM where
2102 getUniqueSupplyM
2103 = SpecM $ do st <- get
2104 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
2105 put $ st { spec_uniq_supply = us2 }
2106 return us1
2107
2108 getUniqueM
2109 = SpecM $ do st <- get
2110 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
2111 put $ st { spec_uniq_supply = us' }
2112 return u
2113
2114 instance HasDynFlags SpecM where
2115 getDynFlags = SpecM $ liftM spec_dflags get
2116
2117 instance HasModule SpecM where
2118 getModule = SpecM $ liftM spec_module get
2119
2120 runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2121 runSpecM dflags this_mod (SpecM spec)
2122 = do us <- getUniqueSupplyM
2123 let initialState = SpecState {
2124 spec_uniq_supply = us,
2125 spec_module = this_mod,
2126 spec_dflags = dflags
2127 }
2128 return $ evalState spec initialState
2129
2130 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2131 mapAndCombineSM _ [] = return ([], emptyUDs)
2132 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2133 (ys, uds2) <- mapAndCombineSM f xs
2134 return (y:ys, uds1 `plusUDs` uds2)
2135
2136 extendTCvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2137 extendTCvSubstList env tv_binds
2138 = env { se_subst = CoreSubst.extendTCvSubstList (se_subst env) tv_binds }
2139
2140 substTy :: SpecEnv -> Type -> Type
2141 substTy env ty = CoreSubst.substTy (se_subst env) ty
2142
2143 substCo :: SpecEnv -> Coercion -> Coercion
2144 substCo env co = CoreSubst.substCo (se_subst env) co
2145
2146 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2147 substBndr env bs = case CoreSubst.substBndr (se_subst env) bs of
2148 (subst', bs') -> (env { se_subst = subst' }, bs')
2149
2150 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2151 substBndrs env bs = case CoreSubst.substBndrs (se_subst env) bs of
2152 (subst', bs') -> (env { se_subst = subst' }, bs')
2153
2154 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2155 -- Clone the binders of the bind; return new bind with the cloned binders
2156 -- Return the substitution to use for RHSs, and the one to use for the body
2157 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2158 = do { us <- getUniqueSupplyM
2159 ; let (subst', bndr') = CoreSubst.cloneIdBndr subst us bndr
2160 interesting' | interestingDict env rhs
2161 = interesting `extendVarSet` bndr'
2162 | otherwise = interesting
2163 ; return (env, env { se_subst = subst', se_interesting = interesting' }
2164 , NonRec bndr' rhs) }
2165
2166 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2167 = do { us <- getUniqueSupplyM
2168 ; let (subst', bndrs') = CoreSubst.cloneRecIdBndrs subst us (map fst pairs)
2169 env' = env { se_subst = subst'
2170 , se_interesting = interesting `extendVarSetList`
2171 [ v | (v,r) <- pairs, interestingDict env r ] }
2172 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2173
2174 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2175 -- Make up completely fresh binders for the dictionaries
2176 -- Their bindings are going to float outwards
2177 newDictBndr env b = do { uniq <- getUniqueM
2178 ; let n = idName b
2179 ty' = substTy env (idType b)
2180 ; return (mkUserLocalOrCoVar (nameOccName n) uniq ty' (getSrcSpan n)) }
2181
2182 newSpecIdSM :: Id -> Type -> SpecM Id
2183 -- Give the new Id a similar occurrence name to the old one
2184 newSpecIdSM old_id new_ty
2185 = do { uniq <- getUniqueM
2186 ; let name = idName old_id
2187 new_occ = mkSpecOcc (nameOccName name)
2188 new_id = mkUserLocalOrCoVar new_occ uniq new_ty (getSrcSpan name)
2189 ; return new_id }
2190
2191 {-
2192 Old (but interesting) stuff about unboxed bindings
2193 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2194
2195 What should we do when a value is specialised to a *strict* unboxed value?
2196
2197 map_*_* f (x:xs) = let h = f x
2198 t = map f xs
2199 in h:t
2200
2201 Could convert let to case:
2202
2203 map_*_Int# f (x:xs) = case f x of h# ->
2204 let t = map f xs
2205 in h#:t
2206
2207 This may be undesirable since it forces evaluation here, but the value
2208 may not be used in all branches of the body. In the general case this
2209 transformation is impossible since the mutual recursion in a letrec
2210 cannot be expressed as a case.
2211
2212 There is also a problem with top-level unboxed values, since our
2213 implementation cannot handle unboxed values at the top level.
2214
2215 Solution: Lift the binding of the unboxed value and extract it when it
2216 is used:
2217
2218 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2219 t = map f xs
2220 in case h of
2221 _Lift h# -> h#:t
2222
2223 Now give it to the simplifier and the _Lifting will be optimised away.
2224
2225 The benfit is that we have given the specialised "unboxed" values a
2226 very simplep lifted semantics and then leave it up to the simplifier to
2227 optimise it --- knowing that the overheads will be removed in nearly
2228 all cases.
2229
2230 In particular, the value will only be evaluted in the branches of the
2231 program which use it, rather than being forced at the point where the
2232 value is bound. For example:
2233
2234 filtermap_*_* p f (x:xs)
2235 = let h = f x
2236 t = ...
2237 in case p x of
2238 True -> h:t
2239 False -> t
2240 ==>
2241 filtermap_*_Int# p f (x:xs)
2242 = let h = case (f x) of h# -> _Lift h#
2243 t = ...
2244 in case p x of
2245 True -> case h of _Lift h#
2246 -> h#:t
2247 False -> t
2248
2249 The binding for h can still be inlined in the one branch and the
2250 _Lifting eliminated.
2251
2252
2253 Question: When won't the _Lifting be eliminated?
2254
2255 Answer: When they at the top-level (where it is necessary) or when
2256 inlining would duplicate work (or possibly code depending on
2257 options). However, the _Lifting will still be eliminated if the
2258 strictness analyser deems the lifted binding strict.
2259 -}