Typos in comments [skip ci]
[ghc.git] / compiler / specialise / Specialise.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5 -}
6
7 {-# LANGUAGE CPP #-}
8 module Specialise ( specProgram, specUnfolding ) where
9
10 #include "HsVersions.h"
11
12 import Id
13 import TcType hiding( substTy )
14 import Type hiding( substTy, extendTvSubstList )
15 import Module( Module, HasModule(..) )
16 import Coercion( Coercion )
17 import CoreMonad
18 import qualified CoreSubst
19 import CoreUnfold
20 import VarSet
21 import VarEnv
22 import CoreSyn
23 import Rules
24 import CoreUtils ( exprIsTrivial, applyTypeToArgs, mkCast )
25 import CoreFVs ( exprFreeVars, exprsFreeVars, idFreeVars, exprsFreeIdsList )
26 import UniqSupply
27 import Name
28 import MkId ( voidArgId, voidPrimId )
29 import Maybes ( catMaybes, isJust )
30 import BasicTypes
31 import HscTypes
32 import Bag
33 import DynFlags
34 import Util
35 import Outputable
36 import FastString
37 import State
38 import UniqDFM
39 import TrieMap
40
41 import Control.Monad
42 #if __GLASGOW_HASKELL__ > 710
43 import qualified Control.Monad.Fail as MonadFail
44 #endif
45
46 {-
47 ************************************************************************
48 * *
49 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
50 * *
51 ************************************************************************
52
53 These notes describe how we implement specialisation to eliminate
54 overloading.
55
56 The specialisation pass works on Core
57 syntax, complete with all the explicit dictionary application,
58 abstraction and construction as added by the type checker. The
59 existing type checker remains largely as it is.
60
61 One important thought: the {\em types} passed to an overloaded
62 function, and the {\em dictionaries} passed are mutually redundant.
63 If the same function is applied to the same type(s) then it is sure to
64 be applied to the same dictionary(s)---or rather to the same {\em
65 values}. (The arguments might look different but they will evaluate
66 to the same value.)
67
68 Second important thought: we know that we can make progress by
69 treating dictionary arguments as static and worth specialising on. So
70 we can do without binding-time analysis, and instead specialise on
71 dictionary arguments and no others.
72
73 The basic idea
74 ~~~~~~~~~~~~~~
75 Suppose we have
76
77 let f = <f_rhs>
78 in <body>
79
80 and suppose f is overloaded.
81
82 STEP 1: CALL-INSTANCE COLLECTION
83
84 We traverse <body>, accumulating all applications of f to types and
85 dictionaries.
86
87 (Might there be partial applications, to just some of its types and
88 dictionaries? In principle yes, but in practice the type checker only
89 builds applications of f to all its types and dictionaries, so partial
90 applications could only arise as a result of transformation, and even
91 then I think it's unlikely. In any case, we simply don't accumulate such
92 partial applications.)
93
94
95 STEP 2: EQUIVALENCES
96
97 So now we have a collection of calls to f:
98 f t1 t2 d1 d2
99 f t3 t4 d3 d4
100 ...
101 Notice that f may take several type arguments. To avoid ambiguity, we
102 say that f is called at type t1/t2 and t3/t4.
103
104 We take equivalence classes using equality of the *types* (ignoring
105 the dictionary args, which as mentioned previously are redundant).
106
107 STEP 3: SPECIALISATION
108
109 For each equivalence class, choose a representative (f t1 t2 d1 d2),
110 and create a local instance of f, defined thus:
111
112 f@t1/t2 = <f_rhs> t1 t2 d1 d2
113
114 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
115 of simplification will now result. However we don't actually *do* that
116 simplification. Rather, we leave it for the simplifier to do. If we
117 *did* do it, though, we'd get more call instances from the specialised
118 RHS. We can work out what they are by instantiating the call-instance
119 set from f's RHS with the types t1, t2.
120
121 Add this new id to f's IdInfo, to record that f has a specialised version.
122
123 Before doing any of this, check that f's IdInfo doesn't already
124 tell us about an existing instance of f at the required type/s.
125 (This might happen if specialisation was applied more than once, or
126 it might arise from user SPECIALIZE pragmas.)
127
128 Recursion
129 ~~~~~~~~~
130 Wait a minute! What if f is recursive? Then we can't just plug in
131 its right-hand side, can we?
132
133 But it's ok. The type checker *always* creates non-recursive definitions
134 for overloaded recursive functions. For example:
135
136 f x = f (x+x) -- Yes I know its silly
137
138 becomes
139
140 f a (d::Num a) = let p = +.sel a d
141 in
142 letrec fl (y::a) = fl (p y y)
143 in
144 fl
145
146 We still have recusion for non-overloaded functions which we
147 speciailise, but the recursive call should get specialised to the
148 same recursive version.
149
150
151 Polymorphism 1
152 ~~~~~~~~~~~~~~
153
154 All this is crystal clear when the function is applied to *constant
155 types*; that is, types which have no type variables inside. But what if
156 it is applied to non-constant types? Suppose we find a call of f at type
157 t1/t2. There are two possibilities:
158
159 (a) The free type variables of t1, t2 are in scope at the definition point
160 of f. In this case there's no problem, we proceed just as before. A common
161 example is as follows. Here's the Haskell:
162
163 g y = let f x = x+x
164 in f y + f y
165
166 After typechecking we have
167
168 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
169 in +.sel a d (f a d y) (f a d y)
170
171 Notice that the call to f is at type type "a"; a non-constant type.
172 Both calls to f are at the same type, so we can specialise to give:
173
174 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
175 in +.sel a d (f@a y) (f@a y)
176
177
178 (b) The other case is when the type variables in the instance types
179 are *not* in scope at the definition point of f. The example we are
180 working with above is a good case. There are two instances of (+.sel a d),
181 but "a" is not in scope at the definition of +.sel. Can we do anything?
182 Yes, we can "common them up", a sort of limited common sub-expression deal.
183 This would give:
184
185 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
186 f@a (x::a) = +.sel@a x x
187 in +.sel@a (f@a y) (f@a y)
188
189 This can save work, and can't be spotted by the type checker, because
190 the two instances of +.sel weren't originally at the same type.
191
192 Further notes on (b)
193
194 * There are quite a few variations here. For example, the defn of
195 +.sel could be floated ouside the \y, to attempt to gain laziness.
196 It certainly mustn't be floated outside the \d because the d has to
197 be in scope too.
198
199 * We don't want to inline f_rhs in this case, because
200 that will duplicate code. Just commoning up the call is the point.
201
202 * Nothing gets added to +.sel's IdInfo.
203
204 * Don't bother unless the equivalence class has more than one item!
205
206 Not clear whether this is all worth it. It is of course OK to
207 simply discard call-instances when passing a big lambda.
208
209 Polymorphism 2 -- Overloading
210 ~~~~~~~~~~~~~~
211 Consider a function whose most general type is
212
213 f :: forall a b. Ord a => [a] -> b -> b
214
215 There is really no point in making a version of g at Int/Int and another
216 at Int/Bool, because it's only instancing the type variable "a" which
217 buys us any efficiency. Since g is completely polymorphic in b there
218 ain't much point in making separate versions of g for the different
219 b types.
220
221 That suggests that we should identify which of g's type variables
222 are constrained (like "a") and which are unconstrained (like "b").
223 Then when taking equivalence classes in STEP 2, we ignore the type args
224 corresponding to unconstrained type variable. In STEP 3 we make
225 polymorphic versions. Thus:
226
227 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
228
229 We do this.
230
231
232 Dictionary floating
233 ~~~~~~~~~~~~~~~~~~~
234 Consider this
235
236 f a (d::Num a) = let g = ...
237 in
238 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
239
240 Here, g is only called at one type, but the dictionary isn't in scope at the
241 definition point for g. Usually the type checker would build a
242 definition for d1 which enclosed g, but the transformation system
243 might have moved d1's defn inward. Solution: float dictionary bindings
244 outwards along with call instances.
245
246 Consider
247
248 f x = let g p q = p==q
249 h r s = (r+s, g r s)
250 in
251 h x x
252
253
254 Before specialisation, leaving out type abstractions we have
255
256 f df x = let g :: Eq a => a -> a -> Bool
257 g dg p q = == dg p q
258 h :: Num a => a -> a -> (a, Bool)
259 h dh r s = let deq = eqFromNum dh
260 in (+ dh r s, g deq r s)
261 in
262 h df x x
263
264 After specialising h we get a specialised version of h, like this:
265
266 h' r s = let deq = eqFromNum df
267 in (+ df r s, g deq r s)
268
269 But we can't naively make an instance for g from this, because deq is not in scope
270 at the defn of g. Instead, we have to float out the (new) defn of deq
271 to widen its scope. Notice that this floating can't be done in advance -- it only
272 shows up when specialisation is done.
273
274 User SPECIALIZE pragmas
275 ~~~~~~~~~~~~~~~~~~~~~~~
276 Specialisation pragmas can be digested by the type checker, and implemented
277 by adding extra definitions along with that of f, in the same way as before
278
279 f@t1/t2 = <f_rhs> t1 t2 d1 d2
280
281 Indeed the pragmas *have* to be dealt with by the type checker, because
282 only it knows how to build the dictionaries d1 and d2! For example
283
284 g :: Ord a => [a] -> [a]
285 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
286
287 Here, the specialised version of g is an application of g's rhs to the
288 Ord dictionary for (Tree Int), which only the type checker can conjure
289 up. There might not even *be* one, if (Tree Int) is not an instance of
290 Ord! (All the other specialision has suitable dictionaries to hand
291 from actual calls.)
292
293 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
294 it is buried in a complex (as-yet-un-desugared) binding group.
295 Maybe we should say
296
297 f@t1/t2 = f* t1 t2 d1 d2
298
299 where f* is the Id f with an IdInfo which says "inline me regardless!".
300 Indeed all the specialisation could be done in this way.
301 That in turn means that the simplifier has to be prepared to inline absolutely
302 any in-scope let-bound thing.
303
304
305 Again, the pragma should permit polymorphism in unconstrained variables:
306
307 h :: Ord a => [a] -> b -> b
308 {-# SPECIALIZE h :: [Int] -> b -> b #-}
309
310 We *insist* that all overloaded type variables are specialised to ground types,
311 (and hence there can be no context inside a SPECIALIZE pragma).
312 We *permit* unconstrained type variables to be specialised to
313 - a ground type
314 - or left as a polymorphic type variable
315 but nothing in between. So
316
317 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
318
319 is *illegal*. (It can be handled, but it adds complication, and gains the
320 programmer nothing.)
321
322
323 SPECIALISING INSTANCE DECLARATIONS
324 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
325 Consider
326
327 instance Foo a => Foo [a] where
328 ...
329 {-# SPECIALIZE instance Foo [Int] #-}
330
331 The original instance decl creates a dictionary-function
332 definition:
333
334 dfun.Foo.List :: forall a. Foo a -> Foo [a]
335
336 The SPECIALIZE pragma just makes a specialised copy, just as for
337 ordinary function definitions:
338
339 dfun.Foo.List@Int :: Foo [Int]
340 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
341
342 The information about what instance of the dfun exist gets added to
343 the dfun's IdInfo in the same way as a user-defined function too.
344
345
346 Automatic instance decl specialisation?
347 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
348 Can instance decls be specialised automatically? It's tricky.
349 We could collect call-instance information for each dfun, but
350 then when we specialised their bodies we'd get new call-instances
351 for ordinary functions; and when we specialised their bodies, we might get
352 new call-instances of the dfuns, and so on. This all arises because of
353 the unrestricted mutual recursion between instance decls and value decls.
354
355 Still, there's no actual problem; it just means that we may not do all
356 the specialisation we could theoretically do.
357
358 Furthermore, instance decls are usually exported and used non-locally,
359 so we'll want to compile enough to get those specialisations done.
360
361 Lastly, there's no such thing as a local instance decl, so we can
362 survive solely by spitting out *usage* information, and then reading that
363 back in as a pragma when next compiling the file. So for now,
364 we only specialise instance decls in response to pragmas.
365
366
367 SPITTING OUT USAGE INFORMATION
368 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
369
370 To spit out usage information we need to traverse the code collecting
371 call-instance information for all imported (non-prelude?) functions
372 and data types. Then we equivalence-class it and spit it out.
373
374 This is done at the top-level when all the call instances which escape
375 must be for imported functions and data types.
376
377 *** Not currently done ***
378
379
380 Partial specialisation by pragmas
381 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
382 What about partial specialisation:
383
384 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
385 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
386
387 or even
388
389 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
390
391 Seems quite reasonable. Similar things could be done with instance decls:
392
393 instance (Foo a, Foo b) => Foo (a,b) where
394 ...
395 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
396 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
397
398 Ho hum. Things are complex enough without this. I pass.
399
400
401 Requirements for the simplifier
402 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
403 The simplifier has to be able to take advantage of the specialisation.
404
405 * When the simplifier finds an application of a polymorphic f, it looks in
406 f's IdInfo in case there is a suitable instance to call instead. This converts
407
408 f t1 t2 d1 d2 ===> f_t1_t2
409
410 Note that the dictionaries get eaten up too!
411
412 * Dictionary selection operations on constant dictionaries must be
413 short-circuited:
414
415 +.sel Int d ===> +Int
416
417 The obvious way to do this is in the same way as other specialised
418 calls: +.sel has inside it some IdInfo which tells that if it's applied
419 to the type Int then it should eat a dictionary and transform to +Int.
420
421 In short, dictionary selectors need IdInfo inside them for constant
422 methods.
423
424 * Exactly the same applies if a superclass dictionary is being
425 extracted:
426
427 Eq.sel Int d ===> dEqInt
428
429 * Something similar applies to dictionary construction too. Suppose
430 dfun.Eq.List is the function taking a dictionary for (Eq a) to
431 one for (Eq [a]). Then we want
432
433 dfun.Eq.List Int d ===> dEq.List_Int
434
435 Where does the Eq [Int] dictionary come from? It is built in
436 response to a SPECIALIZE pragma on the Eq [a] instance decl.
437
438 In short, dfun Ids need IdInfo with a specialisation for each
439 constant instance of their instance declaration.
440
441 All this uses a single mechanism: the SpecEnv inside an Id
442
443
444 What does the specialisation IdInfo look like?
445 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
446
447 The SpecEnv of an Id maps a list of types (the template) to an expression
448
449 [Type] |-> Expr
450
451 For example, if f has this RuleInfo:
452
453 [Int, a] -> \d:Ord Int. f' a
454
455 it means that we can replace the call
456
457 f Int t ===> (\d. f' t)
458
459 This chucks one dictionary away and proceeds with the
460 specialised version of f, namely f'.
461
462
463 What can't be done this way?
464 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
465 There is no way, post-typechecker, to get a dictionary for (say)
466 Eq a from a dictionary for Eq [a]. So if we find
467
468 ==.sel [t] d
469
470 we can't transform to
471
472 eqList (==.sel t d')
473
474 where
475 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
476
477 Of course, we currently have no way to automatically derive
478 eqList, nor to connect it to the Eq [a] instance decl, but you
479 can imagine that it might somehow be possible. Taking advantage
480 of this is permanently ruled out.
481
482 Still, this is no great hardship, because we intend to eliminate
483 overloading altogether anyway!
484
485 A note about non-tyvar dictionaries
486 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
487 Some Ids have types like
488
489 forall a,b,c. Eq a -> Ord [a] -> tau
490
491 This seems curious at first, because we usually only have dictionary
492 args whose types are of the form (C a) where a is a type variable.
493 But this doesn't hold for the functions arising from instance decls,
494 which sometimes get arguments with types of form (C (T a)) for some
495 type constructor T.
496
497 Should we specialise wrt this compound-type dictionary? We used to say
498 "no", saying:
499 "This is a heuristic judgement, as indeed is the fact that we
500 specialise wrt only dictionaries. We choose *not* to specialise
501 wrt compound dictionaries because at the moment the only place
502 they show up is in instance decls, where they are simply plugged
503 into a returned dictionary. So nothing is gained by specialising
504 wrt them."
505
506 But it is simpler and more uniform to specialise wrt these dicts too;
507 and in future GHC is likely to support full fledged type signatures
508 like
509 f :: Eq [(a,b)] => ...
510
511
512 ************************************************************************
513 * *
514 \subsubsection{The new specialiser}
515 * *
516 ************************************************************************
517
518 Our basic game plan is this. For let(rec) bound function
519 f :: (C a, D c) => (a,b,c,d) -> Bool
520
521 * Find any specialised calls of f, (f ts ds), where
522 ts are the type arguments t1 .. t4, and
523 ds are the dictionary arguments d1 .. d2.
524
525 * Add a new definition for f1 (say):
526
527 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
528
529 Note that we abstract over the unconstrained type arguments.
530
531 * Add the mapping
532
533 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
534
535 to the specialisations of f. This will be used by the
536 simplifier to replace calls
537 (f t1 t2 t3 t4) da db
538 by
539 (\d1 d1 -> f1 t2 t4) da db
540
541 All the stuff about how many dictionaries to discard, and what types
542 to apply the specialised function to, are handled by the fact that the
543 SpecEnv contains a template for the result of the specialisation.
544
545 We don't build *partial* specialisations for f. For example:
546
547 f :: Eq a => a -> a -> Bool
548 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
549
550 Here, little is gained by making a specialised copy of f.
551 There's a distinct danger that the specialised version would
552 first build a dictionary for (Eq b, Eq c), and then select the (==)
553 method from it! Even if it didn't, not a great deal is saved.
554
555 We do, however, generate polymorphic, but not overloaded, specialisations:
556
557 f :: Eq a => [a] -> b -> b -> b
558 ... SPECIALISE f :: [Int] -> b -> b -> b ...
559
560 Hence, the invariant is this:
561
562 *** no specialised version is overloaded ***
563
564
565 ************************************************************************
566 * *
567 \subsubsection{The exported function}
568 * *
569 ************************************************************************
570 -}
571
572 -- | Specialise calls to type-class overloaded functions occuring in a program.
573 specProgram :: ModGuts -> CoreM ModGuts
574 specProgram guts@(ModGuts { mg_module = this_mod
575 , mg_rules = local_rules
576 , mg_binds = binds })
577 = do { dflags <- getDynFlags
578
579 -- Specialise the bindings of this module
580 ; (binds', uds) <- runSpecM dflags this_mod (go binds)
581
582 -- Specialise imported functions
583 ; hpt_rules <- getRuleBase
584 ; let rule_base = extendRuleBaseList hpt_rules local_rules
585 ; (new_rules, spec_binds) <- specImports dflags this_mod top_env emptyVarSet
586 [] rule_base (ud_calls uds)
587
588 -- Don't forget to wrap the specialized bindings with bindings
589 -- for the needed dictionaries.
590 -- See Note [Wrap bindings returned by specImports]
591 ; let spec_binds' = wrapDictBinds (ud_binds uds) spec_binds
592
593 ; let final_binds
594 | null spec_binds' = binds'
595 | otherwise = Rec (flattenBinds spec_binds') : binds'
596 -- Note [Glom the bindings if imported functions are specialised]
597
598 ; return (guts { mg_binds = final_binds
599 , mg_rules = new_rules ++ local_rules }) }
600 where
601 -- We need to start with a Subst that knows all the things
602 -- that are in scope, so that the substitution engine doesn't
603 -- accidentally re-use a unique that's already in use
604 -- Easiest thing is to do it all at once, as if all the top-level
605 -- decls were mutually recursive
606 top_env = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
607 bindersOfBinds binds
608 , se_interesting = emptyVarSet }
609
610 go [] = return ([], emptyUDs)
611 go (bind:binds) = do (binds', uds) <- go binds
612 (bind', uds') <- specBind top_env bind uds
613 return (bind' ++ binds', uds')
614
615 {-
616 Note [Wrap bindings returned by specImports]
617 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
618 'specImports' returns a set of specialized bindings. However, these are lacking
619 necessary floated dictionary bindings, which are returned by
620 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
621 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
622 for instance, the 'specImports' call in 'specProgram'.
623
624
625 Note [Disabling cross-module specialisation]
626 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
627 Since GHC 7.10 we have performed specialisation of INLINEABLE bindings living
628 in modules outside of the current module. This can sometimes uncover user code
629 which explodes in size when aggressively optimized. The
630 -fno-cross-module-specialise option was introduced to allow users to being
631 bitten by such instances to revert to the pre-7.10 behavior.
632
633 See Trac #10491
634 -}
635
636 -- | Specialise a set of calls to imported bindings
637 specImports :: DynFlags
638 -> Module
639 -> SpecEnv -- Passed in so that all top-level Ids are in scope
640 -> VarSet -- Don't specialise these ones
641 -- See Note [Avoiding recursive specialisation]
642 -> [Id] -- Stack of imported functions being specialised
643 -> RuleBase -- Rules from this module and the home package
644 -- (but not external packages, which can change)
645 -> CallDetails -- Calls for imported things, and floating bindings
646 -> CoreM ( [CoreRule] -- New rules
647 , [CoreBind] ) -- Specialised bindings
648 -- See Note [Wrapping bindings returned by specImports]
649 specImports dflags this_mod top_env done callers rule_base cds
650 -- See Note [Disabling cross-module specialisation]
651 | not $ gopt Opt_CrossModuleSpecialise dflags =
652 return ([], [])
653
654 | otherwise =
655 do { let import_calls = dVarEnvElts cds
656 ; (rules, spec_binds) <- go rule_base import_calls
657 ; return (rules, spec_binds) }
658 where
659 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
660 go _ [] = return ([], [])
661 go rb (cis@(CIS fn _calls_for_fn) : other_calls)
662 = do { (rules1, spec_binds1) <- specImport dflags this_mod top_env
663 done callers rb fn $
664 ciSetToList cis
665 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
666 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
667
668 specImport :: DynFlags
669 -> Module
670 -> SpecEnv -- Passed in so that all top-level Ids are in scope
671 -> VarSet -- Don't specialise these
672 -- See Note [Avoiding recursive specialisation]
673 -> [Id] -- Stack of imported functions being specialised
674 -> RuleBase -- Rules from this module
675 -> Id -> [CallInfo] -- Imported function and calls for it
676 -> CoreM ( [CoreRule] -- New rules
677 , [CoreBind] ) -- Specialised bindings
678 specImport dflags this_mod top_env done callers rb fn calls_for_fn
679 | fn `elemVarSet` done
680 = return ([], []) -- No warning. This actually happens all the time
681 -- when specialising a recursive function, because
682 -- the RHS of the specialised function contains a recursive
683 -- call to the original function
684
685 | null calls_for_fn -- We filtered out all the calls in deleteCallsMentioning
686 = return ([], [])
687
688 | wantSpecImport dflags unfolding
689 , Just rhs <- maybeUnfoldingTemplate unfolding
690 = do { -- Get rules from the external package state
691 -- We keep doing this in case we "page-fault in"
692 -- more rules as we go along
693 ; hsc_env <- getHscEnv
694 ; eps <- liftIO $ hscEPS hsc_env
695 ; vis_orphs <- getVisibleOrphanMods
696 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
697 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
698
699 ; (rules1, spec_pairs, uds) <- -- pprTrace "specImport1" (vcat [ppr fn, ppr calls_for_fn, ppr rhs]) $
700 runSpecM dflags this_mod $
701 specCalls (Just this_mod) top_env rules_for_fn calls_for_fn fn rhs
702 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
703 -- After the rules kick in we may get recursion, but
704 -- we rely on a global GlomBinds to sort that out later
705 -- See Note [Glom the bindings if imported functions are specialised]
706
707 -- Now specialise any cascaded calls
708 ; (rules2, spec_binds2) <- -- pprTrace "specImport 2" (ppr fn $$ ppr rules1 $$ ppr spec_binds1) $
709 specImports dflags this_mod top_env
710 (extendVarSet done fn)
711 (fn:callers)
712 (extendRuleBaseList rb rules1)
713 (ud_calls uds)
714
715 -- Don't forget to wrap the specialized bindings with bindings
716 -- for the needed dictionaries
717 -- See Note [Wrap bindings returned by specImports]
718 ; let final_binds = wrapDictBinds (ud_binds uds)
719 (spec_binds2 ++ spec_binds1)
720
721 ; return (rules2 ++ rules1, final_binds) }
722
723 | warnMissingSpecs dflags callers
724 = do { warnMsg (vcat [ hang (text "Could not specialise imported function" <+> quotes (ppr fn))
725 2 (vcat [ text "when specialising" <+> quotes (ppr caller)
726 | caller <- callers])
727 , ifPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
728 , text "Probable fix: add INLINEABLE pragma on" <+> quotes (ppr fn) ])
729 ; return ([], []) }
730
731 | otherwise
732 = return ([], [])
733 where
734 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
735
736 warnMissingSpecs :: DynFlags -> [Id] -> Bool
737 -- See Note [Warning about missed specialisations]
738 warnMissingSpecs dflags callers
739 | wopt Opt_WarnAllMissedSpecs dflags = True
740 | not (wopt Opt_WarnMissedSpecs dflags) = False
741 | null callers = False
742 | otherwise = all has_inline_prag callers
743 where
744 has_inline_prag id = isAnyInlinePragma (idInlinePragma id)
745
746 wantSpecImport :: DynFlags -> Unfolding -> Bool
747 -- See Note [Specialise imported INLINABLE things]
748 wantSpecImport dflags unf
749 = case unf of
750 NoUnfolding -> False
751 OtherCon {} -> False
752 DFunUnfolding {} -> True
753 CoreUnfolding { uf_src = src, uf_guidance = _guidance }
754 | gopt Opt_SpecialiseAggressively dflags -> True
755 | isStableSource src -> True
756 -- Specialise even INLINE things; it hasn't inlined yet,
757 -- so perhaps it never will. Moreover it may have calls
758 -- inside it that we want to specialise
759 | otherwise -> False -- Stable, not INLINE, hence INLINEABLE
760
761 {- Note [Warning about missed specialisations]
762 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
763 Suppose
764 * In module Lib, you carefully mark a function 'foo' INLINEABLE
765 * Import Lib(foo) into another module M
766 * Call 'foo' at some specialised type in M
767 Then you jolly well expect it to be specialised in M. But what if
768 'foo' calls another function 'Lib.bar'. Then you'd like 'bar' to be
769 specialised too. But if 'bar' is not marked INLINEABLE it may well
770 not be specialised. The warning Opt_WarnMissedSpecs warns about this.
771
772 It's more noisy to warning about a missed specialisation opportunity
773 for /every/ overloaded imported function, but sometimes useful. That
774 is what Opt_WarnAllMissedSpecs does.
775
776 ToDo: warn about missed opportunities for local functions.
777
778 Note [Specialise imported INLINABLE things]
779 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
780 What imported functions do we specialise? The basic set is
781 * DFuns and things with INLINABLE pragmas.
782 but with -fspecialise-aggressively we add
783 * Anything with an unfolding template
784
785 Trac #8874 has a good example of why we want to auto-specialise DFuns.
786
787 We have the -fspecialise-aggressively flag (usually off), because we
788 risk lots of orphan modules from over-vigorous specialisation.
789 However it's not a big deal: anything non-recursive with an
790 unfolding-template will probably have been inlined already.
791
792 Note [Glom the bindings if imported functions are specialised]
793 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
794 Suppose we have an imported, *recursive*, INLINABLE function
795 f :: Eq a => a -> a
796 f = /\a \d x. ...(f a d)...
797 In the module being compiled we have
798 g x = f (x::Int)
799 Now we'll make a specialised function
800 f_spec :: Int -> Int
801 f_spec = \x -> ...(f Int dInt)...
802 {-# RULE f Int _ = f_spec #-}
803 g = \x. f Int dInt x
804 Note that f_spec doesn't look recursive
805 After rewriting with the RULE, we get
806 f_spec = \x -> ...(f_spec)...
807 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
808 The occurrence analyser never turns a NonRec into a Rec. So we must
809 make sure that f_spec is recursive. Easiest thing is to make all
810 the specialisations for imported bindings recursive.
811
812
813 Note [Avoiding recursive specialisation]
814 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
815 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
816 'f's RHS. So we want to specialise g,h. But we don't want to
817 specialise f any more! It's possible that f's RHS might have a
818 recursive yet-more-specialised call, so we'd diverge in that case.
819 And if the call is to the same type, one specialisation is enough.
820 Avoiding this recursive specialisation loop is the reason for the
821 'done' VarSet passed to specImports and specImport.
822
823 ************************************************************************
824 * *
825 \subsubsection{@specExpr@: the main function}
826 * *
827 ************************************************************************
828 -}
829
830 data SpecEnv
831 = SE { se_subst :: CoreSubst.Subst
832 -- We carry a substitution down:
833 -- a) we must clone any binding that might float outwards,
834 -- to avoid name clashes
835 -- b) we carry a type substitution to use when analysing
836 -- the RHS of specialised bindings (no type-let!)
837
838
839 , se_interesting :: VarSet
840 -- Dict Ids that we know something about
841 -- and hence may be worth specialising against
842 -- See Note [Interesting dictionary arguments]
843 }
844
845 specVar :: SpecEnv -> Id -> CoreExpr
846 specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v
847
848 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
849
850 ---------------- First the easy cases --------------------
851 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
852 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
853 specExpr env (Var v) = return (specVar env v, emptyUDs)
854 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
855 specExpr env (Cast e co)
856 = do { (e', uds) <- specExpr env e
857 ; return ((mkCast e' (substCo env co)), uds) }
858 specExpr env (Tick tickish body)
859 = do { (body', uds) <- specExpr env body
860 ; return (Tick (specTickish env tickish) body', uds) }
861
862 ---------------- Applications might generate a call instance --------------------
863 specExpr env expr@(App {})
864 = go expr []
865 where
866 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
867 (fun', uds_app) <- go fun (arg':args)
868 return (App fun' arg', uds_arg `plusUDs` uds_app)
869
870 go (Var f) args = case specVar env f of
871 Var f' -> return (Var f', mkCallUDs env f' args)
872 e' -> return (e', emptyUDs) -- I don't expect this!
873 go other _ = specExpr env other
874
875 ---------------- Lambda/case require dumping of usage details --------------------
876 specExpr env e@(Lam _ _) = do
877 (body', uds) <- specExpr env' body
878 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
879 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
880 where
881 (bndrs, body) = collectBinders e
882 (env', bndrs') = substBndrs env bndrs
883 -- More efficient to collect a group of binders together all at once
884 -- and we don't want to split a lambda group with dumped bindings
885
886 specExpr env (Case scrut case_bndr ty alts)
887 = do { (scrut', scrut_uds) <- specExpr env scrut
888 ; (scrut'', case_bndr', alts', alts_uds)
889 <- specCase env scrut' case_bndr alts
890 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
891 , scrut_uds `plusUDs` alts_uds) }
892
893 ---------------- Finally, let is the interesting case --------------------
894 specExpr env (Let bind body)
895 = do { -- Clone binders
896 (rhs_env, body_env, bind') <- cloneBindSM env bind
897
898 -- Deal with the body
899 ; (body', body_uds) <- specExpr body_env body
900
901 -- Deal with the bindings
902 ; (binds', uds) <- specBind rhs_env bind' body_uds
903
904 -- All done
905 ; return (foldr Let body' binds', uds) }
906
907 specTickish :: SpecEnv -> Tickish Id -> Tickish Id
908 specTickish env (Breakpoint ix ids)
909 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
910 -- drop vars from the list if they have a non-variable substitution.
911 -- should never happen, but it's harmless to drop them anyway.
912 specTickish _ other_tickish = other_tickish
913
914 specCase :: SpecEnv
915 -> CoreExpr -- Scrutinee, already done
916 -> Id -> [CoreAlt]
917 -> SpecM ( CoreExpr -- New scrutinee
918 , Id
919 , [CoreAlt]
920 , UsageDetails)
921 specCase env scrut' case_bndr [(con, args, rhs)]
922 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
923 , interestingDict env scrut'
924 , not (isDeadBinder case_bndr && null sc_args')
925 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
926
927 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
928 [(con, args', Var sc_arg')]
929 | sc_arg' <- sc_args' ]
930
931 -- Extend the substitution for RHS to map the *original* binders
932 -- to their floated verions.
933 mb_sc_flts :: [Maybe DictId]
934 mb_sc_flts = map (lookupVarEnv clone_env) args'
935 clone_env = zipVarEnv sc_args' sc_args_flt
936 subst_prs = (case_bndr, Var case_bndr_flt)
937 : [ (arg, Var sc_flt)
938 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
939 env_rhs' = env_rhs { se_subst = CoreSubst.extendIdSubstList (se_subst env_rhs) subst_prs
940 , se_interesting = se_interesting env_rhs `extendVarSetList`
941 (case_bndr_flt : sc_args_flt) }
942
943 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
944 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
945 case_bndr_set = unitVarSet case_bndr_flt
946 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
947 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
948 flt_binds = scrut_bind : sc_binds
949 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
950 all_uds = flt_binds `addDictBinds` free_uds
951 alt' = (con, args', wrapDictBindsE dumped_dbs rhs')
952 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
953 where
954 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
955 sc_args' = filter is_flt_sc_arg args'
956
957 clone_me bndr = do { uniq <- getUniqueM
958 ; return (mkUserLocalOrCoVar occ uniq ty loc) }
959 where
960 name = idName bndr
961 ty = idType bndr
962 occ = nameOccName name
963 loc = getSrcSpan name
964
965 arg_set = mkVarSet args'
966 is_flt_sc_arg var = isId var
967 && not (isDeadBinder var)
968 && isDictTy var_ty
969 && not (tyCoVarsOfType var_ty `intersectsVarSet` arg_set)
970 where
971 var_ty = idType var
972
973
974 specCase env scrut case_bndr alts
975 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
976 ; return (scrut, case_bndr', alts', uds_alts) }
977 where
978 (env_alt, case_bndr') = substBndr env case_bndr
979 spec_alt (con, args, rhs) = do
980 (rhs', uds) <- specExpr env_rhs rhs
981 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
982 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
983 where
984 (env_rhs, args') = substBndrs env_alt args
985
986 {-
987 Note [Floating dictionaries out of cases]
988 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
989 Consider
990 g = \d. case d of { MkD sc ... -> ...(f sc)... }
991 Naively we can't float d2's binding out of the case expression,
992 because 'sc' is bound by the case, and that in turn means we can't
993 specialise f, which seems a pity.
994
995 So we invert the case, by floating out a binding
996 for 'sc_flt' thus:
997 sc_flt = case d of { MkD sc ... -> sc }
998 Now we can float the call instance for 'f'. Indeed this is just
999 what'll happen if 'sc' was originally bound with a let binding,
1000 but case is more efficient, and necessary with equalities. So it's
1001 good to work with both.
1002
1003 You might think that this won't make any difference, because the
1004 call instance will only get nuked by the \d. BUT if 'g' itself is
1005 specialised, then transitively we should be able to specialise f.
1006
1007 In general, given
1008 case e of cb { MkD sc ... -> ...(f sc)... }
1009 we transform to
1010 let cb_flt = e
1011 sc_flt = case cb_flt of { MkD sc ... -> sc }
1012 in
1013 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1014
1015 The "_flt" things are the floated binds; we use the current substitution
1016 to substitute sc -> sc_flt in the RHS
1017
1018 ************************************************************************
1019 * *
1020 Dealing with a binding
1021 * *
1022 ************************************************************************
1023 -}
1024
1025 specBind :: SpecEnv -- Use this for RHSs
1026 -> CoreBind
1027 -> UsageDetails -- Info on how the scope of the binding
1028 -> SpecM ([CoreBind], -- New bindings
1029 UsageDetails) -- And info to pass upstream
1030
1031 -- Returned UsageDetails:
1032 -- No calls for binders of this bind
1033 specBind rhs_env (NonRec fn rhs) body_uds
1034 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1035 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds fn rhs
1036
1037 ; let pairs = spec_defns ++ [(fn', rhs')]
1038 -- fn' mentions the spec_defns in its rules,
1039 -- so put the latter first
1040
1041 combined_uds = body_uds1 `plusUDs` rhs_uds
1042 -- This way round a call in rhs_uds of a function f
1043 -- at type T will override a call of f at T in body_uds1; and
1044 -- that is good because it'll tend to keep "earlier" calls
1045 -- See Note [Specialisation of dictionary functions]
1046
1047 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1048 -- See Note [From non-recursive to recursive]
1049
1050 final_binds :: [DictBind]
1051 final_binds
1052 | isEmptyBag dump_dbs = [mkDB $ NonRec b r | (b,r) <- pairs]
1053 | otherwise = [flattenDictBinds dump_dbs pairs]
1054
1055 ; if float_all then
1056 -- Rather than discard the calls mentioning the bound variables
1057 -- we float this binding along with the others
1058 return ([], free_uds `snocDictBinds` final_binds)
1059 else
1060 -- No call in final_uds mentions bound variables,
1061 -- so we can just leave the binding here
1062 return (map fst final_binds, free_uds) }
1063
1064
1065 specBind rhs_env (Rec pairs) body_uds
1066 -- Note [Specialising a recursive group]
1067 = do { let (bndrs,rhss) = unzip pairs
1068 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1069 ; let scope_uds = body_uds `plusUDs` rhs_uds
1070 -- Includes binds and calls arising from rhss
1071
1072 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1073
1074 ; (bndrs3, spec_defns3, uds3)
1075 <- if null spec_defns1 -- Common case: no specialisation
1076 then return (bndrs1, [], uds1)
1077 else do { -- Specialisation occurred; do it again
1078 (bndrs2, spec_defns2, uds2)
1079 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1080 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1081
1082 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1083 bind = flattenDictBinds dumped_dbs
1084 (spec_defns3 ++ zip bndrs3 rhss')
1085
1086 ; if float_all then
1087 return ([], final_uds `snocDictBind` bind)
1088 else
1089 return ([fst bind], final_uds) }
1090
1091
1092 ---------------------------
1093 specDefns :: SpecEnv
1094 -> UsageDetails -- Info on how it is used in its scope
1095 -> [(Id,CoreExpr)] -- The things being bound and their un-processed RHS
1096 -> SpecM ([Id], -- Original Ids with RULES added
1097 [(Id,CoreExpr)], -- Extra, specialised bindings
1098 UsageDetails) -- Stuff to fling upwards from the specialised versions
1099
1100 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1101 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1102 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1103 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1104 -- But not vice versa (it's a fixpoint problem).
1105
1106 specDefns _env uds []
1107 = return ([], [], uds)
1108 specDefns env uds ((bndr,rhs):pairs)
1109 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1110 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1111 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1112
1113 ---------------------------
1114 specDefn :: SpecEnv
1115 -> UsageDetails -- Info on how it is used in its scope
1116 -> Id -> CoreExpr -- The thing being bound and its un-processed RHS
1117 -> SpecM (Id, -- Original Id with added RULES
1118 [(Id,CoreExpr)], -- Extra, specialised bindings
1119 UsageDetails) -- Stuff to fling upwards from the specialised versions
1120
1121 specDefn env body_uds fn rhs
1122 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1123 rules_for_me = idCoreRules fn
1124 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1125 calls_for_me fn rhs
1126 ; return ( fn `addIdSpecialisations` rules
1127 , spec_defns
1128 , body_uds_without_me `plusUDs` spec_uds) }
1129 -- It's important that the `plusUDs` is this way
1130 -- round, because body_uds_without_me may bind
1131 -- dictionaries that are used in calls_for_me passed
1132 -- to specDefn. So the dictionary bindings in
1133 -- spec_uds may mention dictionaries bound in
1134 -- body_uds_without_me
1135
1136 ---------------------------
1137 specCalls :: Maybe Module -- Just this_mod => specialising imported fn
1138 -- Nothing => specialising local fn
1139 -> SpecEnv
1140 -> [CoreRule] -- Existing RULES for the fn
1141 -> [CallInfo]
1142 -> Id -> CoreExpr
1143 -> SpecM ([CoreRule], -- New RULES for the fn
1144 [(Id,CoreExpr)], -- Extra, specialised bindings
1145 UsageDetails) -- New usage details from the specialised RHSs
1146
1147 -- This function checks existing rules, and does not create
1148 -- duplicate ones. So the caller does not need to do this filtering.
1149 -- See 'already_covered'
1150
1151 specCalls mb_mod env rules_for_me calls_for_me fn rhs
1152 -- The first case is the interesting one
1153 | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas
1154 && rhs_ids `lengthAtLeast` n_dicts -- and enough dict args
1155 && notNull calls_for_me -- And there are some calls to specialise
1156 && not (isNeverActive (idInlineActivation fn))
1157 -- Don't specialise NOINLINE things
1158 -- See Note [Auto-specialisation and RULES]
1159
1160 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1161 -- See Note [Inline specialisation] for why we do not
1162 -- switch off specialisation for inline functions
1163
1164 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr rules_for_me) $
1165 do { stuff <- mapM spec_call calls_for_me
1166 ; let (spec_defns, spec_uds, spec_rules) = unzip3 (catMaybes stuff)
1167 ; return (spec_rules, spec_defns, plusUDList spec_uds) }
1168
1169 | otherwise -- No calls or RHS doesn't fit our preconceptions
1170 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1171 text "Missed specialisation opportunity for"
1172 <+> ppr fn $$ _trace_doc )
1173 -- Note [Specialisation shape]
1174 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1175 return ([], [], emptyUDs)
1176 where
1177 _trace_doc = sep [ ppr rhs_tyvars, ppr n_tyvars
1178 , ppr rhs_ids, ppr n_dicts
1179 , ppr (idInlineActivation fn) ]
1180
1181 fn_type = idType fn
1182 fn_arity = idArity fn
1183 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1184 (tyvars, theta, _) = tcSplitSigmaTy fn_type
1185 n_tyvars = length tyvars
1186 n_dicts = length theta
1187 inl_prag = idInlinePragma fn
1188 inl_act = inlinePragmaActivation inl_prag
1189 is_local = isLocalId fn
1190
1191 -- Figure out whether the function has an INLINE pragma
1192 -- See Note [Inline specialisations]
1193
1194 (rhs_tyvars, rhs_ids, rhs_body) = collectTyAndValBinders rhs
1195
1196 rhs_dict_ids = take n_dicts rhs_ids
1197 body = mkLams (drop n_dicts rhs_ids) rhs_body
1198 -- Glue back on the non-dict lambdas
1199
1200 already_covered :: DynFlags -> [CoreExpr] -> Bool
1201 already_covered dflags args -- Note [Specialisations already covered]
1202 = isJust (lookupRule dflags
1203 (CoreSubst.substInScope (se_subst env), realIdUnfolding)
1204 (const True)
1205 fn args rules_for_me)
1206
1207 mk_ty_args :: [Maybe Type] -> [TyVar] -> [CoreExpr]
1208 mk_ty_args [] poly_tvs
1209 = ASSERT( null poly_tvs ) []
1210 mk_ty_args (Nothing : call_ts) (poly_tv : poly_tvs)
1211 = Type (mkTyVarTy poly_tv) : mk_ty_args call_ts poly_tvs
1212 mk_ty_args (Just ty : call_ts) poly_tvs
1213 = Type ty : mk_ty_args call_ts poly_tvs
1214 mk_ty_args (Nothing : _) [] = panic "mk_ty_args"
1215
1216 ----------------------------------------------------------
1217 -- Specialise to one particular call pattern
1218 spec_call :: CallInfo -- Call instance
1219 -> SpecM (Maybe ((Id,CoreExpr), -- Specialised definition
1220 UsageDetails, -- Usage details from specialised body
1221 CoreRule)) -- Info for the Id's SpecEnv
1222 spec_call _call_info@(CallKey call_ts, (call_ds, _))
1223 = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts )
1224
1225 -- Suppose f's defn is f = /\ a b c -> \ d1 d2 -> rhs
1226 -- Suppose the call is for f [Just t1, Nothing, Just t3] [dx1, dx2]
1227
1228 -- Construct the new binding
1229 -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b -> rhs)
1230 -- PLUS the rule
1231 -- RULE "SPEC f" forall b d1' d2'. f b d1' d2' = f1 b
1232 -- In the rule, d1' and d2' are just wildcards, not used in the RHS
1233 -- PLUS the usage-details
1234 -- { d1' = dx1; d2' = dx2 }
1235 -- where d1', d2' are cloned versions of d1,d2, with the type substitution
1236 -- applied. These auxiliary bindings just avoid duplication of dx1, dx2
1237 --
1238 -- Note that the substitution is applied to the whole thing.
1239 -- This is convenient, but just slightly fragile. Notably:
1240 -- * There had better be no name clashes in a/b/c
1241 do { let
1242 -- poly_tyvars = [b] in the example above
1243 -- spec_tyvars = [a,c]
1244 -- ty_args = [t1,b,t3]
1245 spec_tv_binds = [(tv,ty) | (tv, Just ty) <- rhs_tyvars `zip` call_ts]
1246 env1 = extendTvSubstList env spec_tv_binds
1247 (rhs_env, poly_tyvars) = substBndrs env1
1248 [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
1249
1250 -- Clone rhs_dicts, including instantiating their types
1251 ; inst_dict_ids <- mapM (newDictBndr rhs_env) rhs_dict_ids
1252 ; let (rhs_env2, dx_binds, spec_dict_args)
1253 = bindAuxiliaryDicts rhs_env rhs_dict_ids call_ds inst_dict_ids
1254 ty_args = mk_ty_args call_ts poly_tyvars
1255 ev_args = map varToCoreExpr inst_dict_ids -- ev_args, ev_bndrs:
1256 ev_bndrs = exprsFreeIdsList ev_args -- See Note [Evidence foralls]
1257 rule_args = ty_args ++ ev_args
1258 rule_bndrs = poly_tyvars ++ ev_bndrs
1259
1260 ; dflags <- getDynFlags
1261 ; if already_covered dflags rule_args then
1262 return Nothing
1263 else -- pprTrace "spec_call" (vcat [ ppr _call_info, ppr fn, ppr rhs_dict_ids
1264 -- , text "rhs_env2" <+> ppr (se_subst rhs_env2)
1265 -- , ppr dx_binds ]) $
1266 do
1267 { -- Figure out the type of the specialised function
1268 let body_ty = applyTypeToArgs rhs fn_type rule_args
1269 (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted
1270 | isUnliftedType body_ty -- C.f. WwLib.mkWorkerArgs
1271 = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [voidPrimId])
1272 | otherwise = (poly_tyvars, poly_tyvars)
1273 spec_id_ty = mkLamTypes lam_args body_ty
1274
1275 ; spec_f <- newSpecIdSM fn spec_id_ty
1276 ; (spec_rhs, rhs_uds) <- specExpr rhs_env2 (mkLams lam_args body)
1277 ; this_mod <- getModule
1278 ; let
1279 -- The rule to put in the function's specialisation is:
1280 -- forall b, d1',d2'. f t1 b t3 d1' d2' = f1 b
1281 herald = case mb_mod of
1282 Nothing -- Specialising local fn
1283 -> text "SPEC"
1284 Just this_mod -- Specialising imoprted fn
1285 -> text "SPEC/" <> ppr this_mod
1286
1287 rule_name = mkFastString $ showSDocForUser dflags neverQualify $
1288 herald <+> ppr fn <+> hsep (map ppr_call_key_ty call_ts)
1289 -- This name ends up in interface files, so use showSDocForUser,
1290 -- otherwise uniques end up there, making builds
1291 -- less deterministic (See #4012 comment:61 ff)
1292
1293 spec_env_rule = mkRule
1294 this_mod
1295 True {- Auto generated -}
1296 is_local
1297 rule_name
1298 inl_act -- Note [Auto-specialisation and RULES]
1299 (idName fn)
1300 rule_bndrs
1301 rule_args
1302 (mkVarApps (Var spec_f) app_args)
1303
1304 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1305 final_uds = foldr consDictBind rhs_uds dx_binds
1306
1307 --------------------------------------
1308 -- Add a suitable unfolding if the spec_inl_prag says so
1309 -- See Note [Inline specialisations]
1310 (spec_inl_prag, spec_unf)
1311 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1312 = (neverInlinePragma, noUnfolding)
1313 -- See Note [Specialising imported functions] in OccurAnal
1314
1315 | InlinePragma { inl_inline = Inlinable } <- inl_prag
1316 = (inl_prag { inl_inline = EmptyInlineSpec }, noUnfolding)
1317
1318 | otherwise
1319 = (inl_prag, specUnfolding dflags spec_unf_subst poly_tyvars
1320 spec_unf_args fn_unf)
1321
1322 spec_unf_args = ty_args ++ spec_dict_args
1323 spec_unf_subst = CoreSubst.setInScope (se_subst env)
1324 (CoreSubst.substInScope (se_subst rhs_env2))
1325 -- Extend the in-scope set to satisfy the precondition of
1326 -- specUnfolding, namely that in-scope(unf_subst) includes
1327 -- the free vars of spec_unf_args. The in-scope set of rhs_env2
1328 -- is just the ticket; but the actual substitution we want is
1329 -- the same old one from 'env'
1330
1331 --------------------------------------
1332 -- Adding arity information just propagates it a bit faster
1333 -- See Note [Arity decrease] in Simplify
1334 -- Copy InlinePragma information from the parent Id.
1335 -- So if f has INLINE[1] so does spec_f
1336 spec_f_w_arity = spec_f `setIdArity` max 0 (fn_arity - n_dicts)
1337 `setInlinePragma` spec_inl_prag
1338 `setIdUnfolding` spec_unf
1339
1340 ; return (Just ((spec_f_w_arity, spec_rhs), final_uds, spec_env_rule)) } }
1341
1342 {- Note [Evidence foralls]
1343 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1344 Suppose (Trac #12212) that we are specialising
1345 f :: forall a b. (Num a, F a ~ F b) => blah
1346 with a=b=Int. Then the RULE will be something like
1347 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1348 f Int Int d g = f_spec
1349 But both varToCoreExpr (when constructing the LHS args), and the
1350 simplifier (when simplifying the LHS args), will transform to
1351 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1352 f Int Int d <F Int> = f_spec
1353 by replacing g with Refl. So now 'g' is unbound, which results in a later
1354 crash. So we use Refl right off the bat, and do not forall-quantify 'g':
1355 * varToCoreExpr generates a Refl
1356 * exprsFreeIdsList returns the Ids bound by the args,
1357 which won't include g
1358
1359 You might wonder if this will match as often, but the simplifier replaces
1360 complicated Refl coercions with Refl pretty aggressively.
1361
1362 Note [Orphans and auto-generated rules]
1363 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1364 When we specialise an INLINEABLE function, or when we have
1365 -fspecialise-aggressively, we auto-generate RULES that are orphans.
1366 We don't want to warn about these, or we'd generate a lot of warnings.
1367 Thus, we only warn about user-specified orphan rules.
1368
1369 Indeed, we don't even treat the module as an orphan module if it has
1370 auto-generated *rule* orphans. Orphan modules are read every time we
1371 compile, so they are pretty obtrusive and slow down every compilation,
1372 even non-optimised ones. (Reason: for type class instances it's a
1373 type correctness issue.) But specialisation rules are strictly for
1374 *optimisation* only so it's fine not to read the interface.
1375
1376 What this means is that a SPEC rules from auto-specialisation in
1377 module M will be used in other modules only if M.hi has been read for
1378 some other reason, which is actually pretty likely.
1379 -}
1380
1381 bindAuxiliaryDicts
1382 :: SpecEnv
1383 -> [DictId] -> [CoreExpr] -- Original dict bndrs, and the witnessing expressions
1384 -> [DictId] -- A cloned dict-id for each dict arg
1385 -> (SpecEnv, -- Substitute for all orig_dicts
1386 [DictBind], -- Auxiliary dict bindings
1387 [CoreExpr]) -- Witnessing expressions (all trivial)
1388 -- Bind any dictionary arguments to fresh names, to preserve sharing
1389 bindAuxiliaryDicts env@(SE { se_subst = subst, se_interesting = interesting })
1390 orig_dict_ids call_ds inst_dict_ids
1391 = (env', dx_binds, spec_dict_args)
1392 where
1393 (dx_binds, spec_dict_args) = go call_ds inst_dict_ids
1394 env' = env { se_subst = subst `CoreSubst.extendSubstList`
1395 (orig_dict_ids `zip` spec_dict_args)
1396 `CoreSubst.extendInScopeList` dx_ids
1397 , se_interesting = interesting `unionVarSet` interesting_dicts }
1398
1399 dx_ids = [dx_id | (NonRec dx_id _, _) <- dx_binds]
1400 interesting_dicts = mkVarSet [ dx_id | (NonRec dx_id dx, _) <- dx_binds
1401 , interestingDict env dx ]
1402 -- See Note [Make the new dictionaries interesting]
1403
1404 go :: [CoreExpr] -> [CoreBndr] -> ([DictBind], [CoreExpr])
1405 go [] _ = ([], [])
1406 go (dx:dxs) (dx_id:dx_ids)
1407 | exprIsTrivial dx = (dx_binds, dx : args)
1408 | otherwise = (mkDB (NonRec dx_id dx) : dx_binds, Var dx_id : args)
1409 where
1410 (dx_binds, args) = go dxs dx_ids
1411 -- In the first case extend the substitution but not bindings;
1412 -- in the latter extend the bindings but not the substitution.
1413 -- For the former, note that we bind the *original* dict in the substitution,
1414 -- overriding any d->dx_id binding put there by substBndrs
1415 go _ _ = pprPanic "bindAuxiliaryDicts" (ppr orig_dict_ids $$ ppr call_ds $$ ppr inst_dict_ids)
1416
1417 {-
1418 Note [Make the new dictionaries interesting]
1419 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1420 Important! We're going to substitute dx_id1 for d
1421 and we want it to look "interesting", else we won't gather *any*
1422 consequential calls. E.g.
1423 f d = ...g d....
1424 If we specialise f for a call (f (dfun dNumInt)), we'll get
1425 a consequent call (g d') with an auxiliary definition
1426 d' = df dNumInt
1427 We want that consequent call to look interesting
1428
1429
1430 Note [From non-recursive to recursive]
1431 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1432 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1433 have built a recursive knot
1434
1435 f a d x = <blah>
1436 MkUD { ud_binds = d7 = MkD ..f..
1437 , ud_calls = ...(f T d7)... }
1438
1439 The we generate
1440
1441 Rec { fs x = <blah>[T/a, d7/d]
1442 f a d x = <blah>
1443 RULE f T _ = fs
1444 d7 = ...f... }
1445
1446 Here the recursion is only through the RULE.
1447
1448
1449 Note [Specialisation of dictionary functions]
1450 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1451 Here is a nasty example that bit us badly: see Trac #3591
1452
1453 class Eq a => C a
1454 instance Eq [a] => C [a]
1455
1456 ---------------
1457 dfun :: Eq [a] -> C [a]
1458 dfun a d = MkD a d (meth d)
1459
1460 d4 :: Eq [T] = <blah>
1461 d2 :: C [T] = dfun T d4
1462 d1 :: Eq [T] = $p1 d2
1463 d3 :: C [T] = dfun T d1
1464
1465 None of these definitions is recursive. What happened was that we
1466 generated a specialisation:
1467
1468 RULE forall d. dfun T d = dT :: C [T]
1469 dT = (MkD a d (meth d)) [T/a, d1/d]
1470 = MkD T d1 (meth d1)
1471
1472 But now we use the RULE on the RHS of d2, to get
1473
1474 d2 = dT = MkD d1 (meth d1)
1475 d1 = $p1 d2
1476
1477 and now d1 is bottom! The problem is that when specialising 'dfun' we
1478 should first dump "below" the binding all floated dictionary bindings
1479 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1480 placed below 'dfun', and thus unavailable to it when specialising
1481 'dfun'. That in turn means that the call (dfun T d1) must be
1482 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1483 d4 doesn't mention dfun.
1484
1485 But look at this:
1486
1487 class C a where { foo,bar :: [a] -> [a] }
1488
1489 instance C Int where
1490 foo x = r_bar x
1491 bar xs = reverse xs
1492
1493 r_bar :: C a => [a] -> [a]
1494 r_bar xs = bar (xs ++ xs)
1495
1496 That translates to:
1497
1498 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1499
1500 Rec { $fCInt :: C Int = MkC foo_help reverse
1501 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1502
1503 The call (r_bar $fCInt) mentions $fCInt,
1504 which mentions foo_help,
1505 which mentions r_bar
1506 But we DO want to specialise r_bar at Int:
1507
1508 Rec { $fCInt :: C Int = MkC foo_help reverse
1509 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1510
1511 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1512 RULE r_bar Int _ = r_bar_Int
1513
1514 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1515 }
1516
1517 Note that, because of its RULE, r_bar joins the recursive
1518 group. (In this case it'll unravel a short moment later.)
1519
1520
1521 Conclusion: we catch the nasty case using filter_dfuns in
1522 callsForMe. To be honest I'm not 100% certain that this is 100%
1523 right, but it works. Sigh.
1524
1525
1526 Note [Specialising a recursive group]
1527 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1528 Consider
1529 let rec { f x = ...g x'...
1530 ; g y = ...f y'.... }
1531 in f 'a'
1532 Here we specialise 'f' at Char; but that is very likely to lead to
1533 a specialisation of 'g' at Char. We must do the latter, else the
1534 whole point of specialisation is lost.
1535
1536 But we do not want to keep iterating to a fixpoint, because in the
1537 presence of polymorphic recursion we might generate an infinite number
1538 of specialisations.
1539
1540 So we use the following heuristic:
1541 * Arrange the rec block in dependency order, so far as possible
1542 (the occurrence analyser already does this)
1543
1544 * Specialise it much like a sequence of lets
1545
1546 * Then go through the block a second time, feeding call-info from
1547 the RHSs back in the bottom, as it were
1548
1549 In effect, the ordering maxmimises the effectiveness of each sweep,
1550 and we do just two sweeps. This should catch almost every case of
1551 monomorphic recursion -- the exception could be a very knotted-up
1552 recursion with multiple cycles tied up together.
1553
1554 This plan is implemented in the Rec case of specBindItself.
1555
1556 Note [Specialisations already covered]
1557 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1558 We obviously don't want to generate two specialisations for the same
1559 argument pattern. There are two wrinkles
1560
1561 1. We do the already-covered test in specDefn, not when we generate
1562 the CallInfo in mkCallUDs. We used to test in the latter place, but
1563 we now iterate the specialiser somewhat, and the Id at the call site
1564 might therefore not have all the RULES that we can see in specDefn
1565
1566 2. What about two specialisations where the second is an *instance*
1567 of the first? If the more specific one shows up first, we'll generate
1568 specialisations for both. If the *less* specific one shows up first,
1569 we *don't* currently generate a specialisation for the more specific
1570 one. (See the call to lookupRule in already_covered.) Reasons:
1571 (a) lookupRule doesn't say which matches are exact (bad reason)
1572 (b) if the earlier specialisation is user-provided, it's
1573 far from clear that we should auto-specialise further
1574
1575 Note [Auto-specialisation and RULES]
1576 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1577 Consider:
1578 g :: Num a => a -> a
1579 g = ...
1580
1581 f :: (Int -> Int) -> Int
1582 f w = ...
1583 {-# RULE f g = 0 #-}
1584
1585 Suppose that auto-specialisation makes a specialised version of
1586 g::Int->Int That version won't appear in the LHS of the RULE for f.
1587 So if the specialisation rule fires too early, the rule for f may
1588 never fire.
1589
1590 It might be possible to add new rules, to "complete" the rewrite system.
1591 Thus when adding
1592 RULE forall d. g Int d = g_spec
1593 also add
1594 RULE f g_spec = 0
1595
1596 But that's a bit complicated. For now we ask the programmer's help,
1597 by *copying the INLINE activation pragma* to the auto-specialised
1598 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1599 will also not be active until phase 2. And that's what programmers
1600 should jolly well do anyway, even aside from specialisation, to ensure
1601 that g doesn't inline too early.
1602
1603 This in turn means that the RULE would never fire for a NOINLINE
1604 thing so not much point in generating a specialisation at all.
1605
1606 Note [Specialisation shape]
1607 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1608 We only specialise a function if it has visible top-level lambdas
1609 corresponding to its overloading. E.g. if
1610 f :: forall a. Eq a => ....
1611 then its body must look like
1612 f = /\a. \d. ...
1613
1614 Reason: when specialising the body for a call (f ty dexp), we want to
1615 substitute dexp for d, and pick up specialised calls in the body of f.
1616
1617 This doesn't always work. One example I came across was this:
1618 newtype Gen a = MkGen{ unGen :: Int -> a }
1619
1620 choose :: Eq a => a -> Gen a
1621 choose n = MkGen (\r -> n)
1622
1623 oneof = choose (1::Int)
1624
1625 It's a silly exapmle, but we get
1626 choose = /\a. g `cast` co
1627 where choose doesn't have any dict arguments. Thus far I have not
1628 tried to fix this (wait till there's a real example).
1629
1630 Mind you, then 'choose' will be inlined (since RHS is trivial) so
1631 it doesn't matter. This comes up with single-method classes
1632
1633 class C a where { op :: a -> a }
1634 instance C a => C [a] where ....
1635 ==>
1636 $fCList :: C a => C [a]
1637 $fCList = $copList |> (...coercion>...)
1638 ....(uses of $fCList at particular types)...
1639
1640 So we suppress the WARN if the rhs is trivial.
1641
1642 Note [Inline specialisations]
1643 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1644 Here is what we do with the InlinePragma of the original function
1645 * Activation/RuleMatchInfo: both transferred to the
1646 specialised function
1647 * InlineSpec:
1648 (a) An INLINE pragma is transferred
1649 (b) An INLINABLE pragma is *not* transferred
1650
1651 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
1652 specialise the function at its call site, and arguably that's not so
1653 important for the specialised copies. BUT *pragma-directed*
1654 specialisation now takes place in the typechecker/desugarer, with
1655 manually specified INLINEs. The specialisation here is automatic.
1656 It'd be very odd if a function marked INLINE was specialised (because
1657 of some local use), and then forever after (including importing
1658 modules) the specialised version wasn't INLINEd. After all, the
1659 programmer said INLINE!
1660
1661 You might wonder why we specialise INLINE functions at all. After
1662 all they should be inlined, right? Two reasons:
1663
1664 * Even INLINE functions are sometimes not inlined, when they aren't
1665 applied to interesting arguments. But perhaps the type arguments
1666 alone are enough to specialise (even though the args are too boring
1667 to trigger inlining), and it's certainly better to call the
1668 specialised version.
1669
1670 * The RHS of an INLINE function might call another overloaded function,
1671 and we'd like to generate a specialised version of that function too.
1672 This actually happens a lot. Consider
1673 replicateM_ :: (Monad m) => Int -> m a -> m ()
1674 {-# INLINABLE replicateM_ #-}
1675 replicateM_ d x ma = ...
1676 The strictness analyser may transform to
1677 replicateM_ :: (Monad m) => Int -> m a -> m ()
1678 {-# INLINE replicateM_ #-}
1679 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
1680
1681 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
1682 {-# INLINABLE $wreplicateM_ #-}
1683 $wreplicateM_ = ...
1684 Now an importing module has a specialised call to replicateM_, say
1685 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
1686 This particular example had a huge effect on the call to replicateM_
1687 in nofib/shootout/n-body.
1688
1689 Why (b): discard INLINEABLE pragmas? See Trac #4874 for persuasive examples.
1690 Suppose we have
1691 {-# INLINABLE f #-}
1692 f :: Ord a => [a] -> Int
1693 f xs = letrec f' = ...f'... in f'
1694 Then, when f is specialised and optimised we might get
1695 wgo :: [Int] -> Int#
1696 wgo = ...wgo...
1697 f_spec :: [Int] -> Int
1698 f_spec xs = case wgo xs of { r -> I# r }
1699 and we clearly want to inline f_spec at call sites. But if we still
1700 have the big, un-optimised of f (albeit specialised) captured in an
1701 INLINABLE pragma for f_spec, we won't get that optimisation.
1702
1703 So we simply drop INLINABLE pragmas when specialising. It's not really
1704 a complete solution; ignoring specialisation for now, INLINABLE functions
1705 don't get properly strictness analysed, for example. But it works well
1706 for examples involving specialisation, which is the dominant use of
1707 INLINABLE. See Trac #4874.
1708
1709
1710 ************************************************************************
1711 * *
1712 \subsubsection{UsageDetails and suchlike}
1713 * *
1714 ************************************************************************
1715 -}
1716
1717 data UsageDetails
1718 = MkUD {
1719 ud_binds :: !(Bag DictBind),
1720 -- Floated dictionary bindings
1721 -- The order is important;
1722 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
1723 -- (Remember, Bags preserve order in GHC.)
1724
1725 ud_calls :: !CallDetails
1726
1727 -- INVARIANT: suppose bs = bindersOf ud_binds
1728 -- Then 'calls' may *mention* 'bs',
1729 -- but there should be no calls *for* bs
1730 }
1731
1732 instance Outputable UsageDetails where
1733 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
1734 = text "MkUD" <+> braces (sep (punctuate comma
1735 [text "binds" <+> equals <+> ppr dbs,
1736 text "calls" <+> equals <+> ppr calls]))
1737
1738 -- | A 'DictBind' is a binding along with a cached set containing its free
1739 -- variables (both type variables and dictionaries)
1740 type DictBind = (CoreBind, VarSet)
1741
1742 type DictExpr = CoreExpr
1743
1744 emptyUDs :: UsageDetails
1745 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv }
1746
1747 ------------------------------------------------------------
1748 type CallDetails = DIdEnv CallInfoSet
1749 -- The order of specialized binds and rules depends on how we linearize
1750 -- CallDetails, so to get determinism we must use a deterministic set here.
1751 -- See Note [Deterministic UniqFM] in UniqDFM
1752 newtype CallKey = CallKey [Maybe Type]
1753 -- Nothing => unconstrained type argument
1754
1755 data CallInfoSet = CIS Id (Bag CallInfo)
1756 -- The list of types and dictionaries is guaranteed to
1757 -- match the type of f
1758
1759 {-
1760 Note [CallInfoSet determinism]
1761 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1762 CallInfoSet holds a Bag of (CallKey, [DictExpr], VarSet) triplets for a given
1763 Id. They represent the types that the function is instantiated at along with
1764 the dictionaries and free variables.
1765
1766 We use this information to generate specialized versions of a given function.
1767 CallInfoSet used to be defined as:
1768
1769 data CallInfoSet = CIS Id (Map CallKey ([DictExpr], VarSet))
1770
1771 Unfortunately this was not deterministic. The Ord instance of CallKey was
1772 defined in terms of nonDetCmpType which is not deterministic.
1773 See Note [nonDetCmpType nondeterminism].
1774 The end result was that if the function had multiple specializations they would
1775 be generated in arbitrary order.
1776
1777 We need a container that:
1778 a) when turned into a list has only one element per each CallKey and the list
1779 has deterministic order
1780 b) supports union
1781 c) supports singleton
1782 d) supports filter
1783
1784 We can't use UniqDFM here because there's no one Unique that we can key on.
1785
1786 The current approach is to implement the set as a Bag with duplicates.
1787 This makes b), c), d) trivial and pushes a) towards the end. The deduplication
1788 is done by using a TrieMap for membership tests on CallKey. This lets us delete
1789 the nondeterministic Ord CallKey instance.
1790
1791 An alternative approach would be to augument the Map the same way that UniqDFM
1792 is augumented, by keeping track of insertion order and using it to order the
1793 resulting lists. It would mean keeping the nondeterministic Ord CallKey
1794 instance making it easy to reintroduce nondeterminism in the future.
1795 -}
1796
1797 ciSetToList :: CallInfoSet -> [CallInfo]
1798 ciSetToList (CIS _ b) = snd $ foldrBag combine (emptyTM, []) b
1799 where
1800 -- This is where we eliminate duplicates, recording the CallKeys we've
1801 -- already seen in the TrieMap. See Note [CallInfoSet determinism].
1802 combine :: CallInfo -> (CallKeySet, [CallInfo]) -> (CallKeySet, [CallInfo])
1803 combine ci@(CallKey key, _) (set, acc)
1804 | Just _ <- lookupTM key set = (set, acc)
1805 | otherwise = (insertTM key () set, ci:acc)
1806
1807 type CallKeySet = ListMap (MaybeMap TypeMap) ()
1808 -- We only use it in ciSetToList to check for membership
1809
1810 ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet
1811 ciSetFilter p (CIS id a) = CIS id (filterBag p a)
1812
1813 type CallInfo = (CallKey, ([DictExpr], VarSet))
1814 -- Range is dict args and the vars of the whole
1815 -- call (including tyvars)
1816 -- [*not* include the main id itself, of course]
1817
1818 instance Outputable CallInfoSet where
1819 ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
1820 2 (ppr map)
1821
1822 pprCallInfo :: Id -> CallInfo -> SDoc
1823 pprCallInfo fn (CallKey mb_tys, (_dxs, _))
1824 = hang (ppr fn)
1825 2 (fsep (map ppr_call_key_ty mb_tys {- ++ map pprParendExpr _dxs -}))
1826
1827 ppr_call_key_ty :: Maybe Type -> SDoc
1828 ppr_call_key_ty Nothing = char '_'
1829 ppr_call_key_ty (Just ty) = char '@' <+> pprParendType ty
1830
1831 instance Outputable CallKey where
1832 ppr (CallKey ts) = ppr ts
1833
1834 unionCalls :: CallDetails -> CallDetails -> CallDetails
1835 unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2
1836
1837 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
1838 unionCallInfoSet (CIS f calls1) (CIS _ calls2) =
1839 CIS f (calls1 `unionBags` calls2)
1840
1841 callDetailsFVs :: CallDetails -> VarSet
1842 callDetailsFVs calls =
1843 nonDetFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls
1844 -- It's OK to use nonDetFoldUDFM here because we forget the ordering
1845 -- immediately by converting to a nondeterministic set.
1846
1847 callInfoFVs :: CallInfoSet -> VarSet
1848 callInfoFVs (CIS _ call_info) =
1849 foldrBag (\(_, (_,fv)) vs -> unionVarSet fv vs) emptyVarSet call_info
1850
1851 ------------------------------------------------------------
1852 singleCall :: Id -> [Maybe Type] -> [DictExpr] -> UsageDetails
1853 singleCall id tys dicts
1854 = MkUD {ud_binds = emptyBag,
1855 ud_calls = unitDVarEnv id $ CIS id $
1856 unitBag (CallKey tys, (dicts, call_fvs)) }
1857 where
1858 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
1859 tys_fvs = tyCoVarsOfTypes (catMaybes tys)
1860 -- The type args (tys) are guaranteed to be part of the dictionary
1861 -- types, because they are just the constrained types,
1862 -- and the dictionary is therefore sure to be bound
1863 -- inside the binding for any type variables free in the type;
1864 -- hence it's safe to neglect tyvars free in tys when making
1865 -- the free-var set for this call
1866 -- BUT I don't trust this reasoning; play safe and include tys_fvs
1867 --
1868 -- We don't include the 'id' itself.
1869
1870 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
1871 mkCallUDs env f args
1872 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
1873 res
1874 where
1875 res = mkCallUDs' env f args
1876
1877 mkCallUDs' env f args
1878 | not (want_calls_for f) -- Imported from elsewhere
1879 || null theta -- Not overloaded
1880 = emptyUDs
1881
1882 | not (all type_determines_value theta)
1883 || not (spec_tys `lengthIs` n_tyvars)
1884 || not ( dicts `lengthIs` n_dicts)
1885 || not (any (interestingDict env) dicts) -- Note [Interesting dictionary arguments]
1886 -- See also Note [Specialisations already covered]
1887 = -- pprTrace "mkCallUDs: discarding" _trace_doc
1888 emptyUDs -- Not overloaded, or no specialisation wanted
1889
1890 | otherwise
1891 = -- pprTrace "mkCallUDs: keeping" _trace_doc
1892 singleCall f spec_tys dicts
1893 where
1894 _trace_doc = vcat [ppr f, ppr args, ppr n_tyvars, ppr n_dicts
1895 , ppr (map (interestingDict env) dicts)]
1896 (tyvars, theta, _) = tcSplitSigmaTy (idType f)
1897 constrained_tyvars = tyCoVarsOfTypes theta
1898 n_tyvars = length tyvars
1899 n_dicts = length theta
1900
1901 spec_tys = [mk_spec_ty tv ty | (tv, ty) <- tyvars `type_zip` args]
1902 dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
1903
1904 -- ignores Coercion arguments
1905 type_zip :: [TyVar] -> [CoreExpr] -> [(TyVar, Type)]
1906 type_zip tvs (Coercion _ : args) = type_zip tvs args
1907 type_zip (tv:tvs) (Type ty : args) = (tv, ty) : type_zip tvs args
1908 type_zip _ _ = []
1909
1910 mk_spec_ty tyvar ty
1911 | tyvar `elemVarSet` constrained_tyvars = Just ty
1912 | otherwise = Nothing
1913
1914 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
1915 -- For imported things, we gather call instances if
1916 -- there is an unfolding that we could in principle specialise
1917 -- We might still decide not to use it (consulting dflags)
1918 -- in specImports
1919 -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
1920
1921 type_determines_value pred -- See Note [Type determines value]
1922 = case classifyPredType pred of
1923 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs
1924 EqPred {} -> True
1925 IrredPred {} -> True -- Things like (D []) where D is a
1926 -- Constraint-ranged family; Trac #7785
1927
1928 {-
1929 Note [Type determines value]
1930 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1931 Only specialise if all overloading is on non-IP *class* params,
1932 because these are the ones whose *type* determines their *value*. In
1933 parrticular, with implicit params, the type args *don't* say what the
1934 value of the implicit param is! See Trac #7101
1935
1936 However, consider
1937 type family D (v::*->*) :: Constraint
1938 type instance D [] = ()
1939 f :: D v => v Char -> Int
1940 If we see a call (f "foo"), we'll pass a "dictionary"
1941 () |> (g :: () ~ D [])
1942 and it's good to specialise f at this dictionary.
1943
1944 So the question is: can an implicit parameter "hide inside" a
1945 type-family constraint like (D a). Well, no. We don't allow
1946 type instance D Maybe = ?x:Int
1947 Hence the IrredPred case in type_determines_value.
1948 See Trac #7785.
1949
1950 Note [Interesting dictionary arguments]
1951 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1952 Consider this
1953 \a.\d:Eq a. let f = ... in ...(f d)...
1954 There really is not much point in specialising f wrt the dictionary d,
1955 because the code for the specialised f is not improved at all, because
1956 d is lambda-bound. We simply get junk specialisations.
1957
1958 What is "interesting"? Just that it has *some* structure. But what about
1959 variables?
1960
1961 * A variable might be imported, in which case its unfolding
1962 will tell us whether it has useful structure
1963
1964 * Local variables are cloned on the way down (to avoid clashes when
1965 we float dictionaries), and cloning drops the unfolding
1966 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
1967 nuisance to give them unfoldings. So we keep track of the
1968 "interesting" dictionaries as a VarSet in SpecEnv.
1969 We have to take care to put any new interesting dictionary
1970 bindings in the set.
1971
1972 We accidentally lost accurate tracking of local variables for a long
1973 time, because cloned variables don't have unfoldings. But makes a
1974 massive difference in a few cases, eg Trac #5113. For nofib as a
1975 whole it's only a small win: 2.2% improvement in allocation for ansi,
1976 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
1977 -}
1978
1979 interestingDict :: SpecEnv -> CoreExpr -> Bool
1980 -- A dictionary argument is interesting if it has *some* structure
1981 -- NB: "dictionary" arguments include constraints of all sorts,
1982 -- including equality constraints; hence the Coercion case
1983 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
1984 || isDataConWorkId v
1985 || v `elemVarSet` se_interesting env
1986 interestingDict _ (Type _) = False
1987 interestingDict _ (Coercion _) = False
1988 interestingDict env (App fn (Type _)) = interestingDict env fn
1989 interestingDict env (App fn (Coercion _)) = interestingDict env fn
1990 interestingDict env (Tick _ a) = interestingDict env a
1991 interestingDict env (Cast e _) = interestingDict env e
1992 interestingDict _ _ = True
1993
1994 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
1995 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
1996 (MkUD {ud_binds = db2, ud_calls = calls2})
1997 = MkUD { ud_binds = db1 `unionBags` db2
1998 , ud_calls = calls1 `unionCalls` calls2 }
1999
2000 plusUDList :: [UsageDetails] -> UsageDetails
2001 plusUDList = foldr plusUDs emptyUDs
2002
2003 -----------------------------
2004 _dictBindBndrs :: Bag DictBind -> [Id]
2005 _dictBindBndrs dbs = foldrBag ((++) . bindersOf . fst) [] dbs
2006
2007 -- | Construct a 'DictBind' from a 'CoreBind'
2008 mkDB :: CoreBind -> DictBind
2009 mkDB bind = (bind, bind_fvs bind)
2010
2011 -- | Identify the free variables of a 'CoreBind'
2012 bind_fvs :: CoreBind -> VarSet
2013 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
2014 bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs
2015 where
2016 bndrs = map fst prs
2017 rhs_fvs = unionVarSets (map pair_fvs prs)
2018
2019 pair_fvs :: (Id, CoreExpr) -> VarSet
2020 pair_fvs (bndr, rhs) = exprFreeVars rhs `unionVarSet` idFreeVars bndr
2021 -- Don't forget variables mentioned in the
2022 -- rules of the bndr. C.f. OccAnal.addRuleUsage
2023 -- Also tyvars mentioned in its type; they may not appear in the RHS
2024 -- type T a = Int
2025 -- x :: T a = 3
2026
2027 -- | Flatten a set of 'DictBind's and some other binding pairs into a single
2028 -- recursive binding, including some additional bindings.
2029 flattenDictBinds :: Bag DictBind -> [(Id,CoreExpr)] -> DictBind
2030 flattenDictBinds dbs pairs
2031 = (Rec bindings, fvs)
2032 where
2033 (bindings, fvs) = foldrBag add
2034 ([], emptyVarSet)
2035 (dbs `snocBag` mkDB (Rec pairs))
2036 add (NonRec b r, fvs') (pairs, fvs) =
2037 ((b,r) : pairs, fvs `unionVarSet` fvs')
2038 add (Rec prs1, fvs') (pairs, fvs) =
2039 (prs1 ++ pairs, fvs `unionVarSet` fvs')
2040
2041 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
2042 -- Add ud_binds to the tail end of the bindings in uds
2043 snocDictBinds uds dbs
2044 = uds { ud_binds = ud_binds uds `unionBags`
2045 foldr consBag emptyBag dbs }
2046
2047 consDictBind :: DictBind -> UsageDetails -> UsageDetails
2048 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
2049
2050 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
2051 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
2052
2053 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
2054 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
2055
2056 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
2057 wrapDictBinds dbs binds
2058 = foldrBag add binds dbs
2059 where
2060 add (bind,_) binds = bind : binds
2061
2062 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
2063 wrapDictBindsE dbs expr
2064 = foldrBag add expr dbs
2065 where
2066 add (bind,_) expr = Let bind expr
2067
2068 ----------------------
2069 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
2070 -- Used at a lambda or case binder; just dump anything mentioning the binder
2071 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2072 | null bndrs = (uds, emptyBag) -- Common in case alternatives
2073 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2074 (free_uds, dump_dbs)
2075 where
2076 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2077 bndr_set = mkVarSet bndrs
2078 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2079 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
2080 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
2081 -- no calls for any of the dicts in dump_dbs
2082
2083 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
2084 -- Used at a lambda or case binder; just dump anything mentioning the binder
2085 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2086 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2087 (free_uds, dump_dbs, float_all)
2088 where
2089 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2090 bndr_set = mkVarSet bndrs
2091 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2092 free_calls = deleteCallsFor bndrs orig_calls
2093 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
2094
2095 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
2096 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2097 = -- pprTrace ("callsForMe")
2098 -- (vcat [ppr fn,
2099 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
2100 -- text "Orig calls =" <+> ppr orig_calls,
2101 -- text "Dep set =" <+> ppr dep_set,
2102 -- text "Calls for me =" <+> ppr calls_for_me]) $
2103 (uds_without_me, calls_for_me)
2104 where
2105 uds_without_me = MkUD { ud_binds = orig_dbs
2106 , ud_calls = delDVarEnv orig_calls fn }
2107 calls_for_me = case lookupDVarEnv orig_calls fn of
2108 Nothing -> []
2109 Just cis -> filter_dfuns (ciSetToList cis)
2110
2111 dep_set = foldlBag go (unitVarSet fn) orig_dbs
2112 go dep_set (db,fvs) | fvs `intersectsVarSet` dep_set
2113 = extendVarSetList dep_set (bindersOf db)
2114 | otherwise = dep_set
2115
2116 -- Note [Specialisation of dictionary functions]
2117 filter_dfuns | isDFunId fn = filter ok_call
2118 | otherwise = \cs -> cs
2119
2120 ok_call (_, (_,fvs)) = not (fvs `intersectsVarSet` dep_set)
2121
2122 ----------------------
2123 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2124 -- Returns (free_dbs, dump_dbs, dump_set)
2125 splitDictBinds dbs bndr_set
2126 = foldlBag split_db (emptyBag, emptyBag, bndr_set) dbs
2127 -- Important that it's foldl not foldr;
2128 -- we're accumulating the set of dumped ids in dump_set
2129 where
2130 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
2131 | dump_idset `intersectsVarSet` fvs -- Dump it
2132 = (free_dbs, dump_dbs `snocBag` db,
2133 extendVarSetList dump_idset (bindersOf bind))
2134
2135 | otherwise -- Don't dump it
2136 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2137
2138
2139 ----------------------
2140 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2141 -- Remove calls *mentioning* bs
2142 deleteCallsMentioning bs calls
2143 = mapDVarEnv (ciSetFilter keep_call) calls
2144 where
2145 keep_call (_, (_, fvs)) = not (fvs `intersectsVarSet` bs)
2146
2147 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
2148 -- Remove calls *for* bs
2149 deleteCallsFor bs calls = delDVarEnvList calls bs
2150
2151 {-
2152 ************************************************************************
2153 * *
2154 \subsubsection{Boring helper functions}
2155 * *
2156 ************************************************************************
2157 -}
2158
2159 newtype SpecM a = SpecM (State SpecState a)
2160
2161 data SpecState = SpecState {
2162 spec_uniq_supply :: UniqSupply,
2163 spec_module :: Module,
2164 spec_dflags :: DynFlags
2165 }
2166
2167 instance Functor SpecM where
2168 fmap = liftM
2169
2170 instance Applicative SpecM where
2171 pure x = SpecM $ return x
2172 (<*>) = ap
2173
2174 instance Monad SpecM where
2175 SpecM x >>= f = SpecM $ do y <- x
2176 case f y of
2177 SpecM z ->
2178 z
2179 fail str = SpecM $ fail str
2180
2181 #if __GLASGOW_HASKELL__ > 710
2182 instance MonadFail.MonadFail SpecM where
2183 fail str = SpecM $ fail str
2184 #endif
2185
2186 instance MonadUnique SpecM where
2187 getUniqueSupplyM
2188 = SpecM $ do st <- get
2189 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
2190 put $ st { spec_uniq_supply = us2 }
2191 return us1
2192
2193 getUniqueM
2194 = SpecM $ do st <- get
2195 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
2196 put $ st { spec_uniq_supply = us' }
2197 return u
2198
2199 instance HasDynFlags SpecM where
2200 getDynFlags = SpecM $ liftM spec_dflags get
2201
2202 instance HasModule SpecM where
2203 getModule = SpecM $ liftM spec_module get
2204
2205 runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2206 runSpecM dflags this_mod (SpecM spec)
2207 = do us <- getUniqueSupplyM
2208 let initialState = SpecState {
2209 spec_uniq_supply = us,
2210 spec_module = this_mod,
2211 spec_dflags = dflags
2212 }
2213 return $ evalState spec initialState
2214
2215 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2216 mapAndCombineSM _ [] = return ([], emptyUDs)
2217 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2218 (ys, uds2) <- mapAndCombineSM f xs
2219 return (y:ys, uds1 `plusUDs` uds2)
2220
2221 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2222 extendTvSubstList env tv_binds
2223 = env { se_subst = CoreSubst.extendTvSubstList (se_subst env) tv_binds }
2224
2225 substTy :: SpecEnv -> Type -> Type
2226 substTy env ty = CoreSubst.substTy (se_subst env) ty
2227
2228 substCo :: SpecEnv -> Coercion -> Coercion
2229 substCo env co = CoreSubst.substCo (se_subst env) co
2230
2231 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2232 substBndr env bs = case CoreSubst.substBndr (se_subst env) bs of
2233 (subst', bs') -> (env { se_subst = subst' }, bs')
2234
2235 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2236 substBndrs env bs = case CoreSubst.substBndrs (se_subst env) bs of
2237 (subst', bs') -> (env { se_subst = subst' }, bs')
2238
2239 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2240 -- Clone the binders of the bind; return new bind with the cloned binders
2241 -- Return the substitution to use for RHSs, and the one to use for the body
2242 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2243 = do { us <- getUniqueSupplyM
2244 ; let (subst', bndr') = CoreSubst.cloneIdBndr subst us bndr
2245 interesting' | interestingDict env rhs
2246 = interesting `extendVarSet` bndr'
2247 | otherwise = interesting
2248 ; return (env, env { se_subst = subst', se_interesting = interesting' }
2249 , NonRec bndr' rhs) }
2250
2251 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2252 = do { us <- getUniqueSupplyM
2253 ; let (subst', bndrs') = CoreSubst.cloneRecIdBndrs subst us (map fst pairs)
2254 env' = env { se_subst = subst'
2255 , se_interesting = interesting `extendVarSetList`
2256 [ v | (v,r) <- pairs, interestingDict env r ] }
2257 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2258
2259 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2260 -- Make up completely fresh binders for the dictionaries
2261 -- Their bindings are going to float outwards
2262 newDictBndr env b = do { uniq <- getUniqueM
2263 ; let n = idName b
2264 ty' = substTy env (idType b)
2265 ; return (mkUserLocalOrCoVar (nameOccName n) uniq ty' (getSrcSpan n)) }
2266
2267 newSpecIdSM :: Id -> Type -> SpecM Id
2268 -- Give the new Id a similar occurrence name to the old one
2269 newSpecIdSM old_id new_ty
2270 = do { uniq <- getUniqueM
2271 ; let name = idName old_id
2272 new_occ = mkSpecOcc (nameOccName name)
2273 new_id = mkUserLocalOrCoVar new_occ uniq new_ty (getSrcSpan name)
2274 ; return new_id }
2275
2276 {-
2277 Old (but interesting) stuff about unboxed bindings
2278 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2279
2280 What should we do when a value is specialised to a *strict* unboxed value?
2281
2282 map_*_* f (x:xs) = let h = f x
2283 t = map f xs
2284 in h:t
2285
2286 Could convert let to case:
2287
2288 map_*_Int# f (x:xs) = case f x of h# ->
2289 let t = map f xs
2290 in h#:t
2291
2292 This may be undesirable since it forces evaluation here, but the value
2293 may not be used in all branches of the body. In the general case this
2294 transformation is impossible since the mutual recursion in a letrec
2295 cannot be expressed as a case.
2296
2297 There is also a problem with top-level unboxed values, since our
2298 implementation cannot handle unboxed values at the top level.
2299
2300 Solution: Lift the binding of the unboxed value and extract it when it
2301 is used:
2302
2303 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2304 t = map f xs
2305 in case h of
2306 _Lift h# -> h#:t
2307
2308 Now give it to the simplifier and the _Lifting will be optimised away.
2309
2310 The benfit is that we have given the specialised "unboxed" values a
2311 very simplep lifted semantics and then leave it up to the simplifier to
2312 optimise it --- knowing that the overheads will be removed in nearly
2313 all cases.
2314
2315 In particular, the value will only be evaluted in the branches of the
2316 program which use it, rather than being forced at the point where the
2317 value is bound. For example:
2318
2319 filtermap_*_* p f (x:xs)
2320 = let h = f x
2321 t = ...
2322 in case p x of
2323 True -> h:t
2324 False -> t
2325 ==>
2326 filtermap_*_Int# p f (x:xs)
2327 = let h = case (f x) of h# -> _Lift h#
2328 t = ...
2329 in case p x of
2330 True -> case h of _Lift h#
2331 -> h#:t
2332 False -> t
2333
2334 The binding for h can still be inlined in the one branch and the
2335 _Lifting eliminated.
2336
2337
2338 Question: When won't the _Lifting be eliminated?
2339
2340 Answer: When they at the top-level (where it is necessary) or when
2341 inlining would duplicate work (or possibly code depending on
2342 options). However, the _Lifting will still be eliminated if the
2343 strictness analyser deems the lifted binding strict.
2344 -}