2b4d9f51857d15d7d3e6fb65456aabea7a34c324
[ghc.git] / compiler / specialise / Specialise.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5 -}
6
7 {-# LANGUAGE CPP #-}
8 module Specialise ( specProgram, specUnfolding ) where
9
10 #include "HsVersions.h"
11
12 import Id
13 import TcType hiding( substTy )
14 import Type hiding( substTy, extendTvSubstList )
15 import Module( Module, HasModule(..) )
16 import Coercion( Coercion )
17 import CoreMonad
18 import qualified CoreSubst
19 import CoreUnfold
20 import VarSet
21 import VarEnv
22 import CoreSyn
23 import Rules
24 import CoreUtils ( exprIsTrivial, applyTypeToArgs, mkCast )
25 import CoreFVs ( exprFreeVars, exprsFreeVars, idFreeVars, exprsFreeIdsList )
26 import CoreArity ( etaExpandToJoinPointRule )
27 import UniqSupply
28 import Name
29 import MkId ( voidArgId, voidPrimId )
30 import Maybes ( catMaybes, isJust )
31 import BasicTypes
32 import HscTypes
33 import Bag
34 import DynFlags
35 import Util
36 import Outputable
37 import FastString
38 import State
39 import UniqDFM
40 import TrieMap
41
42 import Control.Monad
43 #if __GLASGOW_HASKELL__ > 710
44 import qualified Control.Monad.Fail as MonadFail
45 #endif
46
47 {-
48 ************************************************************************
49 * *
50 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
51 * *
52 ************************************************************************
53
54 These notes describe how we implement specialisation to eliminate
55 overloading.
56
57 The specialisation pass works on Core
58 syntax, complete with all the explicit dictionary application,
59 abstraction and construction as added by the type checker. The
60 existing type checker remains largely as it is.
61
62 One important thought: the {\em types} passed to an overloaded
63 function, and the {\em dictionaries} passed are mutually redundant.
64 If the same function is applied to the same type(s) then it is sure to
65 be applied to the same dictionary(s)---or rather to the same {\em
66 values}. (The arguments might look different but they will evaluate
67 to the same value.)
68
69 Second important thought: we know that we can make progress by
70 treating dictionary arguments as static and worth specialising on. So
71 we can do without binding-time analysis, and instead specialise on
72 dictionary arguments and no others.
73
74 The basic idea
75 ~~~~~~~~~~~~~~
76 Suppose we have
77
78 let f = <f_rhs>
79 in <body>
80
81 and suppose f is overloaded.
82
83 STEP 1: CALL-INSTANCE COLLECTION
84
85 We traverse <body>, accumulating all applications of f to types and
86 dictionaries.
87
88 (Might there be partial applications, to just some of its types and
89 dictionaries? In principle yes, but in practice the type checker only
90 builds applications of f to all its types and dictionaries, so partial
91 applications could only arise as a result of transformation, and even
92 then I think it's unlikely. In any case, we simply don't accumulate such
93 partial applications.)
94
95
96 STEP 2: EQUIVALENCES
97
98 So now we have a collection of calls to f:
99 f t1 t2 d1 d2
100 f t3 t4 d3 d4
101 ...
102 Notice that f may take several type arguments. To avoid ambiguity, we
103 say that f is called at type t1/t2 and t3/t4.
104
105 We take equivalence classes using equality of the *types* (ignoring
106 the dictionary args, which as mentioned previously are redundant).
107
108 STEP 3: SPECIALISATION
109
110 For each equivalence class, choose a representative (f t1 t2 d1 d2),
111 and create a local instance of f, defined thus:
112
113 f@t1/t2 = <f_rhs> t1 t2 d1 d2
114
115 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
116 of simplification will now result. However we don't actually *do* that
117 simplification. Rather, we leave it for the simplifier to do. If we
118 *did* do it, though, we'd get more call instances from the specialised
119 RHS. We can work out what they are by instantiating the call-instance
120 set from f's RHS with the types t1, t2.
121
122 Add this new id to f's IdInfo, to record that f has a specialised version.
123
124 Before doing any of this, check that f's IdInfo doesn't already
125 tell us about an existing instance of f at the required type/s.
126 (This might happen if specialisation was applied more than once, or
127 it might arise from user SPECIALIZE pragmas.)
128
129 Recursion
130 ~~~~~~~~~
131 Wait a minute! What if f is recursive? Then we can't just plug in
132 its right-hand side, can we?
133
134 But it's ok. The type checker *always* creates non-recursive definitions
135 for overloaded recursive functions. For example:
136
137 f x = f (x+x) -- Yes I know its silly
138
139 becomes
140
141 f a (d::Num a) = let p = +.sel a d
142 in
143 letrec fl (y::a) = fl (p y y)
144 in
145 fl
146
147 We still have recusion for non-overloaded functions which we
148 speciailise, but the recursive call should get specialised to the
149 same recursive version.
150
151
152 Polymorphism 1
153 ~~~~~~~~~~~~~~
154
155 All this is crystal clear when the function is applied to *constant
156 types*; that is, types which have no type variables inside. But what if
157 it is applied to non-constant types? Suppose we find a call of f at type
158 t1/t2. There are two possibilities:
159
160 (a) The free type variables of t1, t2 are in scope at the definition point
161 of f. In this case there's no problem, we proceed just as before. A common
162 example is as follows. Here's the Haskell:
163
164 g y = let f x = x+x
165 in f y + f y
166
167 After typechecking we have
168
169 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
170 in +.sel a d (f a d y) (f a d y)
171
172 Notice that the call to f is at type type "a"; a non-constant type.
173 Both calls to f are at the same type, so we can specialise to give:
174
175 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
176 in +.sel a d (f@a y) (f@a y)
177
178
179 (b) The other case is when the type variables in the instance types
180 are *not* in scope at the definition point of f. The example we are
181 working with above is a good case. There are two instances of (+.sel a d),
182 but "a" is not in scope at the definition of +.sel. Can we do anything?
183 Yes, we can "common them up", a sort of limited common sub-expression deal.
184 This would give:
185
186 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
187 f@a (x::a) = +.sel@a x x
188 in +.sel@a (f@a y) (f@a y)
189
190 This can save work, and can't be spotted by the type checker, because
191 the two instances of +.sel weren't originally at the same type.
192
193 Further notes on (b)
194
195 * There are quite a few variations here. For example, the defn of
196 +.sel could be floated ouside the \y, to attempt to gain laziness.
197 It certainly mustn't be floated outside the \d because the d has to
198 be in scope too.
199
200 * We don't want to inline f_rhs in this case, because
201 that will duplicate code. Just commoning up the call is the point.
202
203 * Nothing gets added to +.sel's IdInfo.
204
205 * Don't bother unless the equivalence class has more than one item!
206
207 Not clear whether this is all worth it. It is of course OK to
208 simply discard call-instances when passing a big lambda.
209
210 Polymorphism 2 -- Overloading
211 ~~~~~~~~~~~~~~
212 Consider a function whose most general type is
213
214 f :: forall a b. Ord a => [a] -> b -> b
215
216 There is really no point in making a version of g at Int/Int and another
217 at Int/Bool, because it's only instancing the type variable "a" which
218 buys us any efficiency. Since g is completely polymorphic in b there
219 ain't much point in making separate versions of g for the different
220 b types.
221
222 That suggests that we should identify which of g's type variables
223 are constrained (like "a") and which are unconstrained (like "b").
224 Then when taking equivalence classes in STEP 2, we ignore the type args
225 corresponding to unconstrained type variable. In STEP 3 we make
226 polymorphic versions. Thus:
227
228 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
229
230 We do this.
231
232
233 Dictionary floating
234 ~~~~~~~~~~~~~~~~~~~
235 Consider this
236
237 f a (d::Num a) = let g = ...
238 in
239 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
240
241 Here, g is only called at one type, but the dictionary isn't in scope at the
242 definition point for g. Usually the type checker would build a
243 definition for d1 which enclosed g, but the transformation system
244 might have moved d1's defn inward. Solution: float dictionary bindings
245 outwards along with call instances.
246
247 Consider
248
249 f x = let g p q = p==q
250 h r s = (r+s, g r s)
251 in
252 h x x
253
254
255 Before specialisation, leaving out type abstractions we have
256
257 f df x = let g :: Eq a => a -> a -> Bool
258 g dg p q = == dg p q
259 h :: Num a => a -> a -> (a, Bool)
260 h dh r s = let deq = eqFromNum dh
261 in (+ dh r s, g deq r s)
262 in
263 h df x x
264
265 After specialising h we get a specialised version of h, like this:
266
267 h' r s = let deq = eqFromNum df
268 in (+ df r s, g deq r s)
269
270 But we can't naively make an instance for g from this, because deq is not in scope
271 at the defn of g. Instead, we have to float out the (new) defn of deq
272 to widen its scope. Notice that this floating can't be done in advance -- it only
273 shows up when specialisation is done.
274
275 User SPECIALIZE pragmas
276 ~~~~~~~~~~~~~~~~~~~~~~~
277 Specialisation pragmas can be digested by the type checker, and implemented
278 by adding extra definitions along with that of f, in the same way as before
279
280 f@t1/t2 = <f_rhs> t1 t2 d1 d2
281
282 Indeed the pragmas *have* to be dealt with by the type checker, because
283 only it knows how to build the dictionaries d1 and d2! For example
284
285 g :: Ord a => [a] -> [a]
286 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
287
288 Here, the specialised version of g is an application of g's rhs to the
289 Ord dictionary for (Tree Int), which only the type checker can conjure
290 up. There might not even *be* one, if (Tree Int) is not an instance of
291 Ord! (All the other specialision has suitable dictionaries to hand
292 from actual calls.)
293
294 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
295 it is buried in a complex (as-yet-un-desugared) binding group.
296 Maybe we should say
297
298 f@t1/t2 = f* t1 t2 d1 d2
299
300 where f* is the Id f with an IdInfo which says "inline me regardless!".
301 Indeed all the specialisation could be done in this way.
302 That in turn means that the simplifier has to be prepared to inline absolutely
303 any in-scope let-bound thing.
304
305
306 Again, the pragma should permit polymorphism in unconstrained variables:
307
308 h :: Ord a => [a] -> b -> b
309 {-# SPECIALIZE h :: [Int] -> b -> b #-}
310
311 We *insist* that all overloaded type variables are specialised to ground types,
312 (and hence there can be no context inside a SPECIALIZE pragma).
313 We *permit* unconstrained type variables to be specialised to
314 - a ground type
315 - or left as a polymorphic type variable
316 but nothing in between. So
317
318 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
319
320 is *illegal*. (It can be handled, but it adds complication, and gains the
321 programmer nothing.)
322
323
324 SPECIALISING INSTANCE DECLARATIONS
325 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
326 Consider
327
328 instance Foo a => Foo [a] where
329 ...
330 {-# SPECIALIZE instance Foo [Int] #-}
331
332 The original instance decl creates a dictionary-function
333 definition:
334
335 dfun.Foo.List :: forall a. Foo a -> Foo [a]
336
337 The SPECIALIZE pragma just makes a specialised copy, just as for
338 ordinary function definitions:
339
340 dfun.Foo.List@Int :: Foo [Int]
341 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
342
343 The information about what instance of the dfun exist gets added to
344 the dfun's IdInfo in the same way as a user-defined function too.
345
346
347 Automatic instance decl specialisation?
348 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
349 Can instance decls be specialised automatically? It's tricky.
350 We could collect call-instance information for each dfun, but
351 then when we specialised their bodies we'd get new call-instances
352 for ordinary functions; and when we specialised their bodies, we might get
353 new call-instances of the dfuns, and so on. This all arises because of
354 the unrestricted mutual recursion between instance decls and value decls.
355
356 Still, there's no actual problem; it just means that we may not do all
357 the specialisation we could theoretically do.
358
359 Furthermore, instance decls are usually exported and used non-locally,
360 so we'll want to compile enough to get those specialisations done.
361
362 Lastly, there's no such thing as a local instance decl, so we can
363 survive solely by spitting out *usage* information, and then reading that
364 back in as a pragma when next compiling the file. So for now,
365 we only specialise instance decls in response to pragmas.
366
367
368 SPITTING OUT USAGE INFORMATION
369 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
370
371 To spit out usage information we need to traverse the code collecting
372 call-instance information for all imported (non-prelude?) functions
373 and data types. Then we equivalence-class it and spit it out.
374
375 This is done at the top-level when all the call instances which escape
376 must be for imported functions and data types.
377
378 *** Not currently done ***
379
380
381 Partial specialisation by pragmas
382 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
383 What about partial specialisation:
384
385 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
386 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
387
388 or even
389
390 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
391
392 Seems quite reasonable. Similar things could be done with instance decls:
393
394 instance (Foo a, Foo b) => Foo (a,b) where
395 ...
396 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
397 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
398
399 Ho hum. Things are complex enough without this. I pass.
400
401
402 Requirements for the simplifier
403 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
404 The simplifier has to be able to take advantage of the specialisation.
405
406 * When the simplifier finds an application of a polymorphic f, it looks in
407 f's IdInfo in case there is a suitable instance to call instead. This converts
408
409 f t1 t2 d1 d2 ===> f_t1_t2
410
411 Note that the dictionaries get eaten up too!
412
413 * Dictionary selection operations on constant dictionaries must be
414 short-circuited:
415
416 +.sel Int d ===> +Int
417
418 The obvious way to do this is in the same way as other specialised
419 calls: +.sel has inside it some IdInfo which tells that if it's applied
420 to the type Int then it should eat a dictionary and transform to +Int.
421
422 In short, dictionary selectors need IdInfo inside them for constant
423 methods.
424
425 * Exactly the same applies if a superclass dictionary is being
426 extracted:
427
428 Eq.sel Int d ===> dEqInt
429
430 * Something similar applies to dictionary construction too. Suppose
431 dfun.Eq.List is the function taking a dictionary for (Eq a) to
432 one for (Eq [a]). Then we want
433
434 dfun.Eq.List Int d ===> dEq.List_Int
435
436 Where does the Eq [Int] dictionary come from? It is built in
437 response to a SPECIALIZE pragma on the Eq [a] instance decl.
438
439 In short, dfun Ids need IdInfo with a specialisation for each
440 constant instance of their instance declaration.
441
442 All this uses a single mechanism: the SpecEnv inside an Id
443
444
445 What does the specialisation IdInfo look like?
446 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
447
448 The SpecEnv of an Id maps a list of types (the template) to an expression
449
450 [Type] |-> Expr
451
452 For example, if f has this RuleInfo:
453
454 [Int, a] -> \d:Ord Int. f' a
455
456 it means that we can replace the call
457
458 f Int t ===> (\d. f' t)
459
460 This chucks one dictionary away and proceeds with the
461 specialised version of f, namely f'.
462
463
464 What can't be done this way?
465 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
466 There is no way, post-typechecker, to get a dictionary for (say)
467 Eq a from a dictionary for Eq [a]. So if we find
468
469 ==.sel [t] d
470
471 we can't transform to
472
473 eqList (==.sel t d')
474
475 where
476 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
477
478 Of course, we currently have no way to automatically derive
479 eqList, nor to connect it to the Eq [a] instance decl, but you
480 can imagine that it might somehow be possible. Taking advantage
481 of this is permanently ruled out.
482
483 Still, this is no great hardship, because we intend to eliminate
484 overloading altogether anyway!
485
486 A note about non-tyvar dictionaries
487 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
488 Some Ids have types like
489
490 forall a,b,c. Eq a -> Ord [a] -> tau
491
492 This seems curious at first, because we usually only have dictionary
493 args whose types are of the form (C a) where a is a type variable.
494 But this doesn't hold for the functions arising from instance decls,
495 which sometimes get arguments with types of form (C (T a)) for some
496 type constructor T.
497
498 Should we specialise wrt this compound-type dictionary? We used to say
499 "no", saying:
500 "This is a heuristic judgement, as indeed is the fact that we
501 specialise wrt only dictionaries. We choose *not* to specialise
502 wrt compound dictionaries because at the moment the only place
503 they show up is in instance decls, where they are simply plugged
504 into a returned dictionary. So nothing is gained by specialising
505 wrt them."
506
507 But it is simpler and more uniform to specialise wrt these dicts too;
508 and in future GHC is likely to support full fledged type signatures
509 like
510 f :: Eq [(a,b)] => ...
511
512
513 ************************************************************************
514 * *
515 \subsubsection{The new specialiser}
516 * *
517 ************************************************************************
518
519 Our basic game plan is this. For let(rec) bound function
520 f :: (C a, D c) => (a,b,c,d) -> Bool
521
522 * Find any specialised calls of f, (f ts ds), where
523 ts are the type arguments t1 .. t4, and
524 ds are the dictionary arguments d1 .. d2.
525
526 * Add a new definition for f1 (say):
527
528 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
529
530 Note that we abstract over the unconstrained type arguments.
531
532 * Add the mapping
533
534 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
535
536 to the specialisations of f. This will be used by the
537 simplifier to replace calls
538 (f t1 t2 t3 t4) da db
539 by
540 (\d1 d1 -> f1 t2 t4) da db
541
542 All the stuff about how many dictionaries to discard, and what types
543 to apply the specialised function to, are handled by the fact that the
544 SpecEnv contains a template for the result of the specialisation.
545
546 We don't build *partial* specialisations for f. For example:
547
548 f :: Eq a => a -> a -> Bool
549 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
550
551 Here, little is gained by making a specialised copy of f.
552 There's a distinct danger that the specialised version would
553 first build a dictionary for (Eq b, Eq c), and then select the (==)
554 method from it! Even if it didn't, not a great deal is saved.
555
556 We do, however, generate polymorphic, but not overloaded, specialisations:
557
558 f :: Eq a => [a] -> b -> b -> b
559 ... SPECIALISE f :: [Int] -> b -> b -> b ...
560
561 Hence, the invariant is this:
562
563 *** no specialised version is overloaded ***
564
565
566 ************************************************************************
567 * *
568 \subsubsection{The exported function}
569 * *
570 ************************************************************************
571 -}
572
573 -- | Specialise calls to type-class overloaded functions occuring in a program.
574 specProgram :: ModGuts -> CoreM ModGuts
575 specProgram guts@(ModGuts { mg_module = this_mod
576 , mg_rules = local_rules
577 , mg_binds = binds })
578 = do { dflags <- getDynFlags
579
580 -- Specialise the bindings of this module
581 ; (binds', uds) <- runSpecM dflags this_mod (go binds)
582
583 -- Specialise imported functions
584 ; hpt_rules <- getRuleBase
585 ; let rule_base = extendRuleBaseList hpt_rules local_rules
586 ; (new_rules, spec_binds) <- specImports dflags this_mod top_env emptyVarSet
587 [] rule_base (ud_calls uds)
588
589 -- Don't forget to wrap the specialized bindings with bindings
590 -- for the needed dictionaries.
591 -- See Note [Wrap bindings returned by specImports]
592 ; let spec_binds' = wrapDictBinds (ud_binds uds) spec_binds
593
594 ; let final_binds
595 | null spec_binds' = binds'
596 | otherwise = Rec (flattenBinds spec_binds') : binds'
597 -- Note [Glom the bindings if imported functions are specialised]
598
599 ; return (guts { mg_binds = final_binds
600 , mg_rules = new_rules ++ local_rules }) }
601 where
602 -- We need to start with a Subst that knows all the things
603 -- that are in scope, so that the substitution engine doesn't
604 -- accidentally re-use a unique that's already in use
605 -- Easiest thing is to do it all at once, as if all the top-level
606 -- decls were mutually recursive
607 top_env = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
608 bindersOfBinds binds
609 , se_interesting = emptyVarSet }
610
611 go [] = return ([], emptyUDs)
612 go (bind:binds) = do (binds', uds) <- go binds
613 (bind', uds') <- specBind top_env bind uds
614 return (bind' ++ binds', uds')
615
616 {-
617 Note [Wrap bindings returned by specImports]
618 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
619 'specImports' returns a set of specialized bindings. However, these are lacking
620 necessary floated dictionary bindings, which are returned by
621 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
622 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
623 for instance, the 'specImports' call in 'specProgram'.
624
625
626 Note [Disabling cross-module specialisation]
627 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
628 Since GHC 7.10 we have performed specialisation of INLINABLE bindings living
629 in modules outside of the current module. This can sometimes uncover user code
630 which explodes in size when aggressively optimized. The
631 -fno-cross-module-specialise option was introduced to allow users to being
632 bitten by such instances to revert to the pre-7.10 behavior.
633
634 See Trac #10491
635 -}
636
637 -- | Specialise a set of calls to imported bindings
638 specImports :: DynFlags
639 -> Module
640 -> SpecEnv -- Passed in so that all top-level Ids are in scope
641 -> VarSet -- Don't specialise these ones
642 -- See Note [Avoiding recursive specialisation]
643 -> [Id] -- Stack of imported functions being specialised
644 -> RuleBase -- Rules from this module and the home package
645 -- (but not external packages, which can change)
646 -> CallDetails -- Calls for imported things, and floating bindings
647 -> CoreM ( [CoreRule] -- New rules
648 , [CoreBind] ) -- Specialised bindings
649 -- See Note [Wrapping bindings returned by specImports]
650 specImports dflags this_mod top_env done callers rule_base cds
651 -- See Note [Disabling cross-module specialisation]
652 | not $ gopt Opt_CrossModuleSpecialise dflags =
653 return ([], [])
654
655 | otherwise =
656 do { let import_calls = dVarEnvElts cds
657 ; (rules, spec_binds) <- go rule_base import_calls
658 ; return (rules, spec_binds) }
659 where
660 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
661 go _ [] = return ([], [])
662 go rb (cis@(CIS fn _calls_for_fn) : other_calls)
663 = do { (rules1, spec_binds1) <- specImport dflags this_mod top_env
664 done callers rb fn $
665 ciSetToList cis
666 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
667 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
668
669 specImport :: DynFlags
670 -> Module
671 -> SpecEnv -- Passed in so that all top-level Ids are in scope
672 -> VarSet -- Don't specialise these
673 -- See Note [Avoiding recursive specialisation]
674 -> [Id] -- Stack of imported functions being specialised
675 -> RuleBase -- Rules from this module
676 -> Id -> [CallInfo] -- Imported function and calls for it
677 -> CoreM ( [CoreRule] -- New rules
678 , [CoreBind] ) -- Specialised bindings
679 specImport dflags this_mod top_env done callers rb fn calls_for_fn
680 | fn `elemVarSet` done
681 = return ([], []) -- No warning. This actually happens all the time
682 -- when specialising a recursive function, because
683 -- the RHS of the specialised function contains a recursive
684 -- call to the original function
685
686 | null calls_for_fn -- We filtered out all the calls in deleteCallsMentioning
687 = return ([], [])
688
689 | wantSpecImport dflags unfolding
690 , Just rhs <- maybeUnfoldingTemplate unfolding
691 = do { -- Get rules from the external package state
692 -- We keep doing this in case we "page-fault in"
693 -- more rules as we go along
694 ; hsc_env <- getHscEnv
695 ; eps <- liftIO $ hscEPS hsc_env
696 ; vis_orphs <- getVisibleOrphanMods
697 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
698 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
699
700 ; (rules1, spec_pairs, uds) <- -- pprTrace "specImport1" (vcat [ppr fn, ppr calls_for_fn, ppr rhs]) $
701 runSpecM dflags this_mod $
702 specCalls (Just this_mod) top_env rules_for_fn calls_for_fn fn rhs
703 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
704 -- After the rules kick in we may get recursion, but
705 -- we rely on a global GlomBinds to sort that out later
706 -- See Note [Glom the bindings if imported functions are specialised]
707
708 -- Now specialise any cascaded calls
709 ; (rules2, spec_binds2) <- -- pprTrace "specImport 2" (ppr fn $$ ppr rules1 $$ ppr spec_binds1) $
710 specImports dflags this_mod top_env
711 (extendVarSet done fn)
712 (fn:callers)
713 (extendRuleBaseList rb rules1)
714 (ud_calls uds)
715
716 -- Don't forget to wrap the specialized bindings with bindings
717 -- for the needed dictionaries
718 -- See Note [Wrap bindings returned by specImports]
719 ; let final_binds = wrapDictBinds (ud_binds uds)
720 (spec_binds2 ++ spec_binds1)
721
722 ; return (rules2 ++ rules1, final_binds) }
723
724 | warnMissingSpecs dflags callers
725 = do { warnMsg (vcat [ hang (text "Could not specialise imported function" <+> quotes (ppr fn))
726 2 (vcat [ text "when specialising" <+> quotes (ppr caller)
727 | caller <- callers])
728 , ifPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
729 , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ])
730 ; return ([], []) }
731
732 | otherwise
733 = return ([], [])
734 where
735 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
736
737 warnMissingSpecs :: DynFlags -> [Id] -> Bool
738 -- See Note [Warning about missed specialisations]
739 warnMissingSpecs dflags callers
740 | wopt Opt_WarnAllMissedSpecs dflags = True
741 | not (wopt Opt_WarnMissedSpecs dflags) = False
742 | null callers = False
743 | otherwise = all has_inline_prag callers
744 where
745 has_inline_prag id = isAnyInlinePragma (idInlinePragma id)
746
747 wantSpecImport :: DynFlags -> Unfolding -> Bool
748 -- See Note [Specialise imported INLINABLE things]
749 wantSpecImport dflags unf
750 = case unf of
751 NoUnfolding -> False
752 BootUnfolding -> False
753 OtherCon {} -> False
754 DFunUnfolding {} -> True
755 CoreUnfolding { uf_src = src, uf_guidance = _guidance }
756 | gopt Opt_SpecialiseAggressively dflags -> True
757 | isStableSource src -> True
758 -- Specialise even INLINE things; it hasn't inlined yet,
759 -- so perhaps it never will. Moreover it may have calls
760 -- inside it that we want to specialise
761 | otherwise -> False -- Stable, not INLINE, hence INLINABLE
762
763 {- Note [Warning about missed specialisations]
764 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
765 Suppose
766 * In module Lib, you carefully mark a function 'foo' INLINABLE
767 * Import Lib(foo) into another module M
768 * Call 'foo' at some specialised type in M
769 Then you jolly well expect it to be specialised in M. But what if
770 'foo' calls another function 'Lib.bar'. Then you'd like 'bar' to be
771 specialised too. But if 'bar' is not marked INLINABLE it may well
772 not be specialised. The warning Opt_WarnMissedSpecs warns about this.
773
774 It's more noisy to warning about a missed specialisation opportunity
775 for /every/ overloaded imported function, but sometimes useful. That
776 is what Opt_WarnAllMissedSpecs does.
777
778 ToDo: warn about missed opportunities for local functions.
779
780 Note [Specialise imported INLINABLE things]
781 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
782 What imported functions do we specialise? The basic set is
783 * DFuns and things with INLINABLE pragmas.
784 but with -fspecialise-aggressively we add
785 * Anything with an unfolding template
786
787 Trac #8874 has a good example of why we want to auto-specialise DFuns.
788
789 We have the -fspecialise-aggressively flag (usually off), because we
790 risk lots of orphan modules from over-vigorous specialisation.
791 However it's not a big deal: anything non-recursive with an
792 unfolding-template will probably have been inlined already.
793
794 Note [Glom the bindings if imported functions are specialised]
795 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
796 Suppose we have an imported, *recursive*, INLINABLE function
797 f :: Eq a => a -> a
798 f = /\a \d x. ...(f a d)...
799 In the module being compiled we have
800 g x = f (x::Int)
801 Now we'll make a specialised function
802 f_spec :: Int -> Int
803 f_spec = \x -> ...(f Int dInt)...
804 {-# RULE f Int _ = f_spec #-}
805 g = \x. f Int dInt x
806 Note that f_spec doesn't look recursive
807 After rewriting with the RULE, we get
808 f_spec = \x -> ...(f_spec)...
809 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
810 The occurrence analyser never turns a NonRec into a Rec. So we must
811 make sure that f_spec is recursive. Easiest thing is to make all
812 the specialisations for imported bindings recursive.
813
814
815 Note [Avoiding recursive specialisation]
816 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
817 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
818 'f's RHS. So we want to specialise g,h. But we don't want to
819 specialise f any more! It's possible that f's RHS might have a
820 recursive yet-more-specialised call, so we'd diverge in that case.
821 And if the call is to the same type, one specialisation is enough.
822 Avoiding this recursive specialisation loop is the reason for the
823 'done' VarSet passed to specImports and specImport.
824
825 ************************************************************************
826 * *
827 \subsubsection{@specExpr@: the main function}
828 * *
829 ************************************************************************
830 -}
831
832 data SpecEnv
833 = SE { se_subst :: CoreSubst.Subst
834 -- We carry a substitution down:
835 -- a) we must clone any binding that might float outwards,
836 -- to avoid name clashes
837 -- b) we carry a type substitution to use when analysing
838 -- the RHS of specialised bindings (no type-let!)
839
840
841 , se_interesting :: VarSet
842 -- Dict Ids that we know something about
843 -- and hence may be worth specialising against
844 -- See Note [Interesting dictionary arguments]
845 }
846
847 specVar :: SpecEnv -> Id -> CoreExpr
848 specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v
849
850 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
851
852 ---------------- First the easy cases --------------------
853 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
854 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
855 specExpr env (Var v) = return (specVar env v, emptyUDs)
856 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
857 specExpr env (Cast e co)
858 = do { (e', uds) <- specExpr env e
859 ; return ((mkCast e' (substCo env co)), uds) }
860 specExpr env (Tick tickish body)
861 = do { (body', uds) <- specExpr env body
862 ; return (Tick (specTickish env tickish) body', uds) }
863
864 ---------------- Applications might generate a call instance --------------------
865 specExpr env expr@(App {})
866 = go expr []
867 where
868 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
869 (fun', uds_app) <- go fun (arg':args)
870 return (App fun' arg', uds_arg `plusUDs` uds_app)
871
872 go (Var f) args = case specVar env f of
873 Var f' -> return (Var f', mkCallUDs env f' args)
874 e' -> return (e', emptyUDs) -- I don't expect this!
875 go other _ = specExpr env other
876
877 ---------------- Lambda/case require dumping of usage details --------------------
878 specExpr env e@(Lam _ _) = do
879 (body', uds) <- specExpr env' body
880 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
881 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
882 where
883 (bndrs, body) = collectBinders e
884 (env', bndrs') = substBndrs env bndrs
885 -- More efficient to collect a group of binders together all at once
886 -- and we don't want to split a lambda group with dumped bindings
887
888 specExpr env (Case scrut case_bndr ty alts)
889 = do { (scrut', scrut_uds) <- specExpr env scrut
890 ; (scrut'', case_bndr', alts', alts_uds)
891 <- specCase env scrut' case_bndr alts
892 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
893 , scrut_uds `plusUDs` alts_uds) }
894
895 ---------------- Finally, let is the interesting case --------------------
896 specExpr env (Let bind body)
897 = do { -- Clone binders
898 (rhs_env, body_env, bind') <- cloneBindSM env bind
899
900 -- Deal with the body
901 ; (body', body_uds) <- specExpr body_env body
902
903 -- Deal with the bindings
904 ; (binds', uds) <- specBind rhs_env bind' body_uds
905
906 -- All done
907 ; return (foldr Let body' binds', uds) }
908
909 specTickish :: SpecEnv -> Tickish Id -> Tickish Id
910 specTickish env (Breakpoint ix ids)
911 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
912 -- drop vars from the list if they have a non-variable substitution.
913 -- should never happen, but it's harmless to drop them anyway.
914 specTickish _ other_tickish = other_tickish
915
916 specCase :: SpecEnv
917 -> CoreExpr -- Scrutinee, already done
918 -> Id -> [CoreAlt]
919 -> SpecM ( CoreExpr -- New scrutinee
920 , Id
921 , [CoreAlt]
922 , UsageDetails)
923 specCase env scrut' case_bndr [(con, args, rhs)]
924 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
925 , interestingDict env scrut'
926 , not (isDeadBinder case_bndr && null sc_args')
927 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
928
929 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
930 [(con, args', Var sc_arg')]
931 | sc_arg' <- sc_args' ]
932
933 -- Extend the substitution for RHS to map the *original* binders
934 -- to their floated verions.
935 mb_sc_flts :: [Maybe DictId]
936 mb_sc_flts = map (lookupVarEnv clone_env) args'
937 clone_env = zipVarEnv sc_args' sc_args_flt
938 subst_prs = (case_bndr, Var case_bndr_flt)
939 : [ (arg, Var sc_flt)
940 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
941 env_rhs' = env_rhs { se_subst = CoreSubst.extendIdSubstList (se_subst env_rhs) subst_prs
942 , se_interesting = se_interesting env_rhs `extendVarSetList`
943 (case_bndr_flt : sc_args_flt) }
944
945 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
946 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
947 case_bndr_set = unitVarSet case_bndr_flt
948 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
949 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
950 flt_binds = scrut_bind : sc_binds
951 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
952 all_uds = flt_binds `addDictBinds` free_uds
953 alt' = (con, args', wrapDictBindsE dumped_dbs rhs')
954 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
955 where
956 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
957 sc_args' = filter is_flt_sc_arg args'
958
959 clone_me bndr = do { uniq <- getUniqueM
960 ; return (mkUserLocalOrCoVar occ uniq ty loc) }
961 where
962 name = idName bndr
963 ty = idType bndr
964 occ = nameOccName name
965 loc = getSrcSpan name
966
967 arg_set = mkVarSet args'
968 is_flt_sc_arg var = isId var
969 && not (isDeadBinder var)
970 && isDictTy var_ty
971 && not (tyCoVarsOfType var_ty `intersectsVarSet` arg_set)
972 where
973 var_ty = idType var
974
975
976 specCase env scrut case_bndr alts
977 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
978 ; return (scrut, case_bndr', alts', uds_alts) }
979 where
980 (env_alt, case_bndr') = substBndr env case_bndr
981 spec_alt (con, args, rhs) = do
982 (rhs', uds) <- specExpr env_rhs rhs
983 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
984 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
985 where
986 (env_rhs, args') = substBndrs env_alt args
987
988 {-
989 Note [Floating dictionaries out of cases]
990 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
991 Consider
992 g = \d. case d of { MkD sc ... -> ...(f sc)... }
993 Naively we can't float d2's binding out of the case expression,
994 because 'sc' is bound by the case, and that in turn means we can't
995 specialise f, which seems a pity.
996
997 So we invert the case, by floating out a binding
998 for 'sc_flt' thus:
999 sc_flt = case d of { MkD sc ... -> sc }
1000 Now we can float the call instance for 'f'. Indeed this is just
1001 what'll happen if 'sc' was originally bound with a let binding,
1002 but case is more efficient, and necessary with equalities. So it's
1003 good to work with both.
1004
1005 You might think that this won't make any difference, because the
1006 call instance will only get nuked by the \d. BUT if 'g' itself is
1007 specialised, then transitively we should be able to specialise f.
1008
1009 In general, given
1010 case e of cb { MkD sc ... -> ...(f sc)... }
1011 we transform to
1012 let cb_flt = e
1013 sc_flt = case cb_flt of { MkD sc ... -> sc }
1014 in
1015 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1016
1017 The "_flt" things are the floated binds; we use the current substitution
1018 to substitute sc -> sc_flt in the RHS
1019
1020 ************************************************************************
1021 * *
1022 Dealing with a binding
1023 * *
1024 ************************************************************************
1025 -}
1026
1027 specBind :: SpecEnv -- Use this for RHSs
1028 -> CoreBind -- Binders are already cloned by cloneBindSM,
1029 -- but RHSs are un-processed
1030 -> UsageDetails -- Info on how the scope of the binding
1031 -> SpecM ([CoreBind], -- New bindings
1032 UsageDetails) -- And info to pass upstream
1033
1034 -- Returned UsageDetails:
1035 -- No calls for binders of this bind
1036 specBind rhs_env (NonRec fn rhs) body_uds
1037 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1038 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds fn rhs
1039
1040 ; let pairs = spec_defns ++ [(fn', rhs')]
1041 -- fn' mentions the spec_defns in its rules,
1042 -- so put the latter first
1043
1044 combined_uds = body_uds1 `plusUDs` rhs_uds
1045 -- This way round a call in rhs_uds of a function f
1046 -- at type T will override a call of f at T in body_uds1; and
1047 -- that is good because it'll tend to keep "earlier" calls
1048 -- See Note [Specialisation of dictionary functions]
1049
1050 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1051 -- See Note [From non-recursive to recursive]
1052
1053 final_binds :: [DictBind]
1054 final_binds
1055 | isEmptyBag dump_dbs = [mkDB $ NonRec b r | (b,r) <- pairs]
1056 | otherwise = [flattenDictBinds dump_dbs pairs]
1057
1058 ; if float_all then
1059 -- Rather than discard the calls mentioning the bound variables
1060 -- we float this binding along with the others
1061 return ([], free_uds `snocDictBinds` final_binds)
1062 else
1063 -- No call in final_uds mentions bound variables,
1064 -- so we can just leave the binding here
1065 return (map fst final_binds, free_uds) }
1066
1067
1068 specBind rhs_env (Rec pairs) body_uds
1069 -- Note [Specialising a recursive group]
1070 = do { let (bndrs,rhss) = unzip pairs
1071 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1072 ; let scope_uds = body_uds `plusUDs` rhs_uds
1073 -- Includes binds and calls arising from rhss
1074
1075 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1076
1077 ; (bndrs3, spec_defns3, uds3)
1078 <- if null spec_defns1 -- Common case: no specialisation
1079 then return (bndrs1, [], uds1)
1080 else do { -- Specialisation occurred; do it again
1081 (bndrs2, spec_defns2, uds2)
1082 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1083 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1084
1085 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1086 bind = flattenDictBinds dumped_dbs
1087 (spec_defns3 ++ zip bndrs3 rhss')
1088
1089 ; if float_all then
1090 return ([], final_uds `snocDictBind` bind)
1091 else
1092 return ([fst bind], final_uds) }
1093
1094
1095 ---------------------------
1096 specDefns :: SpecEnv
1097 -> UsageDetails -- Info on how it is used in its scope
1098 -> [(OutId,InExpr)] -- The things being bound and their un-processed RHS
1099 -> SpecM ([OutId], -- Original Ids with RULES added
1100 [(OutId,OutExpr)], -- Extra, specialised bindings
1101 UsageDetails) -- Stuff to fling upwards from the specialised versions
1102
1103 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1104 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1105 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1106 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1107 -- But not vice versa (it's a fixpoint problem).
1108
1109 specDefns _env uds []
1110 = return ([], [], uds)
1111 specDefns env uds ((bndr,rhs):pairs)
1112 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1113 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1114 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1115
1116 ---------------------------
1117 specDefn :: SpecEnv
1118 -> UsageDetails -- Info on how it is used in its scope
1119 -> OutId -> InExpr -- The thing being bound and its un-processed RHS
1120 -> SpecM (Id, -- Original Id with added RULES
1121 [(Id,CoreExpr)], -- Extra, specialised bindings
1122 UsageDetails) -- Stuff to fling upwards from the specialised versions
1123
1124 specDefn env body_uds fn rhs
1125 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1126 rules_for_me = idCoreRules fn
1127 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1128 calls_for_me fn rhs
1129 ; return ( fn `addIdSpecialisations` rules
1130 , spec_defns
1131 , body_uds_without_me `plusUDs` spec_uds) }
1132 -- It's important that the `plusUDs` is this way
1133 -- round, because body_uds_without_me may bind
1134 -- dictionaries that are used in calls_for_me passed
1135 -- to specDefn. So the dictionary bindings in
1136 -- spec_uds may mention dictionaries bound in
1137 -- body_uds_without_me
1138
1139 ---------------------------
1140 specCalls :: Maybe Module -- Just this_mod => specialising imported fn
1141 -- Nothing => specialising local fn
1142 -> SpecEnv
1143 -> [CoreRule] -- Existing RULES for the fn
1144 -> [CallInfo]
1145 -> OutId -> InExpr
1146 -> SpecM ([CoreRule], -- New RULES for the fn
1147 [(Id,CoreExpr)], -- Extra, specialised bindings
1148 UsageDetails) -- New usage details from the specialised RHSs
1149
1150 -- This function checks existing rules, and does not create
1151 -- duplicate ones. So the caller does not need to do this filtering.
1152 -- See 'already_covered'
1153
1154 specCalls mb_mod env rules_for_me calls_for_me fn rhs
1155 -- The first case is the interesting one
1156 | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas
1157 && rhs_ids `lengthAtLeast` n_dicts -- and enough dict args
1158 && notNull calls_for_me -- And there are some calls to specialise
1159 && not (isNeverActive (idInlineActivation fn))
1160 -- Don't specialise NOINLINE things
1161 -- See Note [Auto-specialisation and RULES]
1162
1163 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1164 -- See Note [Inline specialisation] for why we do not
1165 -- switch off specialisation for inline functions
1166
1167 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr rules_for_me) $
1168 do { stuff <- mapM spec_call calls_for_me
1169 ; let (spec_defns, spec_uds, spec_rules) = unzip3 (catMaybes stuff)
1170 ; return (spec_rules, spec_defns, plusUDList spec_uds) }
1171
1172 | otherwise -- No calls or RHS doesn't fit our preconceptions
1173 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1174 text "Missed specialisation opportunity for"
1175 <+> ppr fn $$ _trace_doc )
1176 -- Note [Specialisation shape]
1177 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1178 return ([], [], emptyUDs)
1179 where
1180 _trace_doc = sep [ ppr rhs_tyvars, ppr n_tyvars
1181 , ppr rhs_ids, ppr n_dicts
1182 , ppr (idInlineActivation fn) ]
1183
1184 fn_type = idType fn
1185 fn_arity = idArity fn
1186 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1187 (tyvars, theta, _) = tcSplitSigmaTy fn_type
1188 n_tyvars = length tyvars
1189 n_dicts = length theta
1190 inl_prag = idInlinePragma fn
1191 inl_act = inlinePragmaActivation inl_prag
1192 is_local = isLocalId fn
1193
1194 -- Figure out whether the function has an INLINE pragma
1195 -- See Note [Inline specialisations]
1196
1197 (rhs_tyvars, rhs_ids, rhs_body) = collectTyAndValBinders rhs
1198
1199 rhs_dict_ids = take n_dicts rhs_ids
1200 body = mkLams (drop n_dicts rhs_ids) rhs_body
1201 -- Glue back on the non-dict lambdas
1202
1203 already_covered :: DynFlags -> [CoreExpr] -> Bool
1204 already_covered dflags args -- Note [Specialisations already covered]
1205 = isJust (lookupRule dflags
1206 (CoreSubst.substInScope (se_subst env), realIdUnfolding)
1207 (const True)
1208 fn args rules_for_me)
1209
1210 mk_ty_args :: [Maybe Type] -> [TyVar] -> [CoreExpr]
1211 mk_ty_args [] poly_tvs
1212 = ASSERT( null poly_tvs ) []
1213 mk_ty_args (Nothing : call_ts) (poly_tv : poly_tvs)
1214 = Type (mkTyVarTy poly_tv) : mk_ty_args call_ts poly_tvs
1215 mk_ty_args (Just ty : call_ts) poly_tvs
1216 = Type ty : mk_ty_args call_ts poly_tvs
1217 mk_ty_args (Nothing : _) [] = panic "mk_ty_args"
1218
1219 ----------------------------------------------------------
1220 -- Specialise to one particular call pattern
1221 spec_call :: CallInfo -- Call instance
1222 -> SpecM (Maybe ((Id,CoreExpr), -- Specialised definition
1223 UsageDetails, -- Usage details from specialised body
1224 CoreRule)) -- Info for the Id's SpecEnv
1225 spec_call _call_info@(CallKey call_ts, (call_ds, _))
1226 = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts )
1227
1228 -- Suppose f's defn is f = /\ a b c -> \ d1 d2 -> rhs
1229 -- Suppose the call is for f [Just t1, Nothing, Just t3] [dx1, dx2]
1230
1231 -- Construct the new binding
1232 -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b -> rhs)
1233 -- PLUS the rule
1234 -- RULE "SPEC f" forall b d1' d2'. f b d1' d2' = f1 b
1235 -- In the rule, d1' and d2' are just wildcards, not used in the RHS
1236 -- PLUS the usage-details
1237 -- { d1' = dx1; d2' = dx2 }
1238 -- where d1', d2' are cloned versions of d1,d2, with the type substitution
1239 -- applied. These auxiliary bindings just avoid duplication of dx1, dx2
1240 --
1241 -- Note that the substitution is applied to the whole thing.
1242 -- This is convenient, but just slightly fragile. Notably:
1243 -- * There had better be no name clashes in a/b/c
1244 do { let
1245 -- poly_tyvars = [b] in the example above
1246 -- spec_tyvars = [a,c]
1247 -- ty_args = [t1,b,t3]
1248 spec_tv_binds = [(tv,ty) | (tv, Just ty) <- rhs_tyvars `zip` call_ts]
1249 env1 = extendTvSubstList env spec_tv_binds
1250 (rhs_env, poly_tyvars) = substBndrs env1
1251 [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
1252
1253 -- Clone rhs_dicts, including instantiating their types
1254 ; inst_dict_ids <- mapM (newDictBndr rhs_env) rhs_dict_ids
1255 ; let (rhs_env2, dx_binds, spec_dict_args)
1256 = bindAuxiliaryDicts rhs_env rhs_dict_ids call_ds inst_dict_ids
1257 ty_args = mk_ty_args call_ts poly_tyvars
1258 ev_args = map varToCoreExpr inst_dict_ids -- ev_args, ev_bndrs:
1259 ev_bndrs = exprsFreeIdsList ev_args -- See Note [Evidence foralls]
1260 rule_args = ty_args ++ ev_args
1261 rule_bndrs = poly_tyvars ++ ev_bndrs
1262
1263 ; dflags <- getDynFlags
1264 ; if already_covered dflags rule_args then
1265 return Nothing
1266 else -- pprTrace "spec_call" (vcat [ ppr _call_info, ppr fn, ppr rhs_dict_ids
1267 -- , text "rhs_env2" <+> ppr (se_subst rhs_env2)
1268 -- , ppr dx_binds ]) $
1269 do
1270 { -- Figure out the type of the specialised function
1271 let body_ty = applyTypeToArgs rhs fn_type rule_args
1272 (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted
1273 | isUnliftedType body_ty -- C.f. WwLib.mkWorkerArgs
1274 , not (isJoinId fn)
1275 = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [voidPrimId])
1276 | otherwise = (poly_tyvars, poly_tyvars)
1277 spec_id_ty = mkLamTypes lam_args body_ty
1278 join_arity_change = length app_args - length rule_args
1279 spec_join_arity | Just orig_join_arity <- isJoinId_maybe fn
1280 = Just (orig_join_arity + join_arity_change)
1281 | otherwise
1282 = Nothing
1283
1284 ; spec_f <- newSpecIdSM fn spec_id_ty spec_join_arity
1285 ; (spec_rhs, rhs_uds) <- specExpr rhs_env2 (mkLams lam_args body)
1286 ; this_mod <- getModule
1287 ; let
1288 -- The rule to put in the function's specialisation is:
1289 -- forall b, d1',d2'. f t1 b t3 d1' d2' = f1 b
1290 herald = case mb_mod of
1291 Nothing -- Specialising local fn
1292 -> text "SPEC"
1293 Just this_mod -- Specialising imoprted fn
1294 -> text "SPEC/" <> ppr this_mod
1295
1296 rule_name = mkFastString $ showSDocForUser dflags neverQualify $
1297 herald <+> ppr fn <+> hsep (map ppr_call_key_ty call_ts)
1298 -- This name ends up in interface files, so use showSDocForUser,
1299 -- otherwise uniques end up there, making builds
1300 -- less deterministic (See #4012 comment:61 ff)
1301
1302 rule_wout_eta = mkRule
1303 this_mod
1304 True {- Auto generated -}
1305 is_local
1306 rule_name
1307 inl_act -- Note [Auto-specialisation and RULES]
1308 (idName fn)
1309 rule_bndrs
1310 rule_args
1311 (mkVarApps (Var spec_f) app_args)
1312
1313 spec_env_rule
1314 = case isJoinId_maybe fn of
1315 Just join_arity -> etaExpandToJoinPointRule join_arity
1316 rule_wout_eta
1317 Nothing -> rule_wout_eta
1318
1319 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1320 final_uds = foldr consDictBind rhs_uds dx_binds
1321
1322 --------------------------------------
1323 -- Add a suitable unfolding if the spec_inl_prag says so
1324 -- See Note [Inline specialisations]
1325 (spec_inl_prag, spec_unf)
1326 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1327 = (neverInlinePragma, noUnfolding)
1328 -- See Note [Specialising imported functions] in OccurAnal
1329
1330 | InlinePragma { inl_inline = Inlinable } <- inl_prag
1331 = (inl_prag { inl_inline = EmptyInlineSpec }, noUnfolding)
1332
1333 | otherwise
1334 = (inl_prag, specUnfolding poly_tyvars spec_app
1335 arity_decrease fn_unf)
1336
1337 arity_decrease = length spec_dict_args
1338 spec_app e = (e `mkApps` ty_args) `mkApps` spec_dict_args
1339
1340 --------------------------------------
1341 -- Adding arity information just propagates it a bit faster
1342 -- See Note [Arity decrease] in Simplify
1343 -- Copy InlinePragma information from the parent Id.
1344 -- So if f has INLINE[1] so does spec_f
1345 spec_f_w_arity = spec_f `setIdArity` max 0 (fn_arity - n_dicts)
1346 `setInlinePragma` spec_inl_prag
1347 `setIdUnfolding` spec_unf
1348 `asJoinId_maybe` spec_join_arity
1349
1350 ; return (Just ((spec_f_w_arity, spec_rhs), final_uds, spec_env_rule)) } }
1351
1352 {- Note [Evidence foralls]
1353 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1354 Suppose (Trac #12212) that we are specialising
1355 f :: forall a b. (Num a, F a ~ F b) => blah
1356 with a=b=Int. Then the RULE will be something like
1357 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1358 f Int Int d g = f_spec
1359 But both varToCoreExpr (when constructing the LHS args), and the
1360 simplifier (when simplifying the LHS args), will transform to
1361 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1362 f Int Int d <F Int> = f_spec
1363 by replacing g with Refl. So now 'g' is unbound, which results in a later
1364 crash. So we use Refl right off the bat, and do not forall-quantify 'g':
1365 * varToCoreExpr generates a Refl
1366 * exprsFreeIdsList returns the Ids bound by the args,
1367 which won't include g
1368
1369 You might wonder if this will match as often, but the simplifier replaces
1370 complicated Refl coercions with Refl pretty aggressively.
1371
1372 Note [Orphans and auto-generated rules]
1373 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1374 When we specialise an INLINABLE function, or when we have
1375 -fspecialise-aggressively, we auto-generate RULES that are orphans.
1376 We don't want to warn about these, or we'd generate a lot of warnings.
1377 Thus, we only warn about user-specified orphan rules.
1378
1379 Indeed, we don't even treat the module as an orphan module if it has
1380 auto-generated *rule* orphans. Orphan modules are read every time we
1381 compile, so they are pretty obtrusive and slow down every compilation,
1382 even non-optimised ones. (Reason: for type class instances it's a
1383 type correctness issue.) But specialisation rules are strictly for
1384 *optimisation* only so it's fine not to read the interface.
1385
1386 What this means is that a SPEC rules from auto-specialisation in
1387 module M will be used in other modules only if M.hi has been read for
1388 some other reason, which is actually pretty likely.
1389 -}
1390
1391 bindAuxiliaryDicts
1392 :: SpecEnv
1393 -> [DictId] -> [CoreExpr] -- Original dict bndrs, and the witnessing expressions
1394 -> [DictId] -- A cloned dict-id for each dict arg
1395 -> (SpecEnv, -- Substitute for all orig_dicts
1396 [DictBind], -- Auxiliary dict bindings
1397 [CoreExpr]) -- Witnessing expressions (all trivial)
1398 -- Bind any dictionary arguments to fresh names, to preserve sharing
1399 bindAuxiliaryDicts env@(SE { se_subst = subst, se_interesting = interesting })
1400 orig_dict_ids call_ds inst_dict_ids
1401 = (env', dx_binds, spec_dict_args)
1402 where
1403 (dx_binds, spec_dict_args) = go call_ds inst_dict_ids
1404 env' = env { se_subst = subst `CoreSubst.extendSubstList`
1405 (orig_dict_ids `zip` spec_dict_args)
1406 `CoreSubst.extendInScopeList` dx_ids
1407 , se_interesting = interesting `unionVarSet` interesting_dicts }
1408
1409 dx_ids = [dx_id | (NonRec dx_id _, _) <- dx_binds]
1410 interesting_dicts = mkVarSet [ dx_id | (NonRec dx_id dx, _) <- dx_binds
1411 , interestingDict env dx ]
1412 -- See Note [Make the new dictionaries interesting]
1413
1414 go :: [CoreExpr] -> [CoreBndr] -> ([DictBind], [CoreExpr])
1415 go [] _ = ([], [])
1416 go (dx:dxs) (dx_id:dx_ids)
1417 | exprIsTrivial dx = (dx_binds, dx : args)
1418 | otherwise = (mkDB (NonRec dx_id dx) : dx_binds, Var dx_id : args)
1419 where
1420 (dx_binds, args) = go dxs dx_ids
1421 -- In the first case extend the substitution but not bindings;
1422 -- in the latter extend the bindings but not the substitution.
1423 -- For the former, note that we bind the *original* dict in the substitution,
1424 -- overriding any d->dx_id binding put there by substBndrs
1425 go _ _ = pprPanic "bindAuxiliaryDicts" (ppr orig_dict_ids $$ ppr call_ds $$ ppr inst_dict_ids)
1426
1427 {-
1428 Note [Make the new dictionaries interesting]
1429 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1430 Important! We're going to substitute dx_id1 for d
1431 and we want it to look "interesting", else we won't gather *any*
1432 consequential calls. E.g.
1433 f d = ...g d....
1434 If we specialise f for a call (f (dfun dNumInt)), we'll get
1435 a consequent call (g d') with an auxiliary definition
1436 d' = df dNumInt
1437 We want that consequent call to look interesting
1438
1439
1440 Note [From non-recursive to recursive]
1441 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1442 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1443 have built a recursive knot
1444
1445 f a d x = <blah>
1446 MkUD { ud_binds = d7 = MkD ..f..
1447 , ud_calls = ...(f T d7)... }
1448
1449 The we generate
1450
1451 Rec { fs x = <blah>[T/a, d7/d]
1452 f a d x = <blah>
1453 RULE f T _ = fs
1454 d7 = ...f... }
1455
1456 Here the recursion is only through the RULE.
1457
1458
1459 Note [Specialisation of dictionary functions]
1460 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1461 Here is a nasty example that bit us badly: see Trac #3591
1462
1463 class Eq a => C a
1464 instance Eq [a] => C [a]
1465
1466 ---------------
1467 dfun :: Eq [a] -> C [a]
1468 dfun a d = MkD a d (meth d)
1469
1470 d4 :: Eq [T] = <blah>
1471 d2 :: C [T] = dfun T d4
1472 d1 :: Eq [T] = $p1 d2
1473 d3 :: C [T] = dfun T d1
1474
1475 None of these definitions is recursive. What happened was that we
1476 generated a specialisation:
1477
1478 RULE forall d. dfun T d = dT :: C [T]
1479 dT = (MkD a d (meth d)) [T/a, d1/d]
1480 = MkD T d1 (meth d1)
1481
1482 But now we use the RULE on the RHS of d2, to get
1483
1484 d2 = dT = MkD d1 (meth d1)
1485 d1 = $p1 d2
1486
1487 and now d1 is bottom! The problem is that when specialising 'dfun' we
1488 should first dump "below" the binding all floated dictionary bindings
1489 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1490 placed below 'dfun', and thus unavailable to it when specialising
1491 'dfun'. That in turn means that the call (dfun T d1) must be
1492 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1493 d4 doesn't mention dfun.
1494
1495 But look at this:
1496
1497 class C a where { foo,bar :: [a] -> [a] }
1498
1499 instance C Int where
1500 foo x = r_bar x
1501 bar xs = reverse xs
1502
1503 r_bar :: C a => [a] -> [a]
1504 r_bar xs = bar (xs ++ xs)
1505
1506 That translates to:
1507
1508 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1509
1510 Rec { $fCInt :: C Int = MkC foo_help reverse
1511 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1512
1513 The call (r_bar $fCInt) mentions $fCInt,
1514 which mentions foo_help,
1515 which mentions r_bar
1516 But we DO want to specialise r_bar at Int:
1517
1518 Rec { $fCInt :: C Int = MkC foo_help reverse
1519 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1520
1521 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1522 RULE r_bar Int _ = r_bar_Int
1523
1524 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1525 }
1526
1527 Note that, because of its RULE, r_bar joins the recursive
1528 group. (In this case it'll unravel a short moment later.)
1529
1530
1531 Conclusion: we catch the nasty case using filter_dfuns in
1532 callsForMe. To be honest I'm not 100% certain that this is 100%
1533 right, but it works. Sigh.
1534
1535
1536 Note [Specialising a recursive group]
1537 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1538 Consider
1539 let rec { f x = ...g x'...
1540 ; g y = ...f y'.... }
1541 in f 'a'
1542 Here we specialise 'f' at Char; but that is very likely to lead to
1543 a specialisation of 'g' at Char. We must do the latter, else the
1544 whole point of specialisation is lost.
1545
1546 But we do not want to keep iterating to a fixpoint, because in the
1547 presence of polymorphic recursion we might generate an infinite number
1548 of specialisations.
1549
1550 So we use the following heuristic:
1551 * Arrange the rec block in dependency order, so far as possible
1552 (the occurrence analyser already does this)
1553
1554 * Specialise it much like a sequence of lets
1555
1556 * Then go through the block a second time, feeding call-info from
1557 the RHSs back in the bottom, as it were
1558
1559 In effect, the ordering maxmimises the effectiveness of each sweep,
1560 and we do just two sweeps. This should catch almost every case of
1561 monomorphic recursion -- the exception could be a very knotted-up
1562 recursion with multiple cycles tied up together.
1563
1564 This plan is implemented in the Rec case of specBindItself.
1565
1566 Note [Specialisations already covered]
1567 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1568 We obviously don't want to generate two specialisations for the same
1569 argument pattern. There are two wrinkles
1570
1571 1. We do the already-covered test in specDefn, not when we generate
1572 the CallInfo in mkCallUDs. We used to test in the latter place, but
1573 we now iterate the specialiser somewhat, and the Id at the call site
1574 might therefore not have all the RULES that we can see in specDefn
1575
1576 2. What about two specialisations where the second is an *instance*
1577 of the first? If the more specific one shows up first, we'll generate
1578 specialisations for both. If the *less* specific one shows up first,
1579 we *don't* currently generate a specialisation for the more specific
1580 one. (See the call to lookupRule in already_covered.) Reasons:
1581 (a) lookupRule doesn't say which matches are exact (bad reason)
1582 (b) if the earlier specialisation is user-provided, it's
1583 far from clear that we should auto-specialise further
1584
1585 Note [Auto-specialisation and RULES]
1586 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1587 Consider:
1588 g :: Num a => a -> a
1589 g = ...
1590
1591 f :: (Int -> Int) -> Int
1592 f w = ...
1593 {-# RULE f g = 0 #-}
1594
1595 Suppose that auto-specialisation makes a specialised version of
1596 g::Int->Int That version won't appear in the LHS of the RULE for f.
1597 So if the specialisation rule fires too early, the rule for f may
1598 never fire.
1599
1600 It might be possible to add new rules, to "complete" the rewrite system.
1601 Thus when adding
1602 RULE forall d. g Int d = g_spec
1603 also add
1604 RULE f g_spec = 0
1605
1606 But that's a bit complicated. For now we ask the programmer's help,
1607 by *copying the INLINE activation pragma* to the auto-specialised
1608 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1609 will also not be active until phase 2. And that's what programmers
1610 should jolly well do anyway, even aside from specialisation, to ensure
1611 that g doesn't inline too early.
1612
1613 This in turn means that the RULE would never fire for a NOINLINE
1614 thing so not much point in generating a specialisation at all.
1615
1616 Note [Specialisation shape]
1617 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1618 We only specialise a function if it has visible top-level lambdas
1619 corresponding to its overloading. E.g. if
1620 f :: forall a. Eq a => ....
1621 then its body must look like
1622 f = /\a. \d. ...
1623
1624 Reason: when specialising the body for a call (f ty dexp), we want to
1625 substitute dexp for d, and pick up specialised calls in the body of f.
1626
1627 This doesn't always work. One example I came across was this:
1628 newtype Gen a = MkGen{ unGen :: Int -> a }
1629
1630 choose :: Eq a => a -> Gen a
1631 choose n = MkGen (\r -> n)
1632
1633 oneof = choose (1::Int)
1634
1635 It's a silly exapmle, but we get
1636 choose = /\a. g `cast` co
1637 where choose doesn't have any dict arguments. Thus far I have not
1638 tried to fix this (wait till there's a real example).
1639
1640 Mind you, then 'choose' will be inlined (since RHS is trivial) so
1641 it doesn't matter. This comes up with single-method classes
1642
1643 class C a where { op :: a -> a }
1644 instance C a => C [a] where ....
1645 ==>
1646 $fCList :: C a => C [a]
1647 $fCList = $copList |> (...coercion>...)
1648 ....(uses of $fCList at particular types)...
1649
1650 So we suppress the WARN if the rhs is trivial.
1651
1652 Note [Inline specialisations]
1653 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1654 Here is what we do with the InlinePragma of the original function
1655 * Activation/RuleMatchInfo: both transferred to the
1656 specialised function
1657 * InlineSpec:
1658 (a) An INLINE pragma is transferred
1659 (b) An INLINABLE pragma is *not* transferred
1660
1661 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
1662 specialise the function at its call site, and arguably that's not so
1663 important for the specialised copies. BUT *pragma-directed*
1664 specialisation now takes place in the typechecker/desugarer, with
1665 manually specified INLINEs. The specialisation here is automatic.
1666 It'd be very odd if a function marked INLINE was specialised (because
1667 of some local use), and then forever after (including importing
1668 modules) the specialised version wasn't INLINEd. After all, the
1669 programmer said INLINE!
1670
1671 You might wonder why we specialise INLINE functions at all. After
1672 all they should be inlined, right? Two reasons:
1673
1674 * Even INLINE functions are sometimes not inlined, when they aren't
1675 applied to interesting arguments. But perhaps the type arguments
1676 alone are enough to specialise (even though the args are too boring
1677 to trigger inlining), and it's certainly better to call the
1678 specialised version.
1679
1680 * The RHS of an INLINE function might call another overloaded function,
1681 and we'd like to generate a specialised version of that function too.
1682 This actually happens a lot. Consider
1683 replicateM_ :: (Monad m) => Int -> m a -> m ()
1684 {-# INLINABLE replicateM_ #-}
1685 replicateM_ d x ma = ...
1686 The strictness analyser may transform to
1687 replicateM_ :: (Monad m) => Int -> m a -> m ()
1688 {-# INLINE replicateM_ #-}
1689 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
1690
1691 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
1692 {-# INLINABLE $wreplicateM_ #-}
1693 $wreplicateM_ = ...
1694 Now an importing module has a specialised call to replicateM_, say
1695 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
1696 This particular example had a huge effect on the call to replicateM_
1697 in nofib/shootout/n-body.
1698
1699 Why (b): discard INLINABLE pragmas? See Trac #4874 for persuasive examples.
1700 Suppose we have
1701 {-# INLINABLE f #-}
1702 f :: Ord a => [a] -> Int
1703 f xs = letrec f' = ...f'... in f'
1704 Then, when f is specialised and optimised we might get
1705 wgo :: [Int] -> Int#
1706 wgo = ...wgo...
1707 f_spec :: [Int] -> Int
1708 f_spec xs = case wgo xs of { r -> I# r }
1709 and we clearly want to inline f_spec at call sites. But if we still
1710 have the big, un-optimised of f (albeit specialised) captured in an
1711 INLINABLE pragma for f_spec, we won't get that optimisation.
1712
1713 So we simply drop INLINABLE pragmas when specialising. It's not really
1714 a complete solution; ignoring specialisation for now, INLINABLE functions
1715 don't get properly strictness analysed, for example. But it works well
1716 for examples involving specialisation, which is the dominant use of
1717 INLINABLE. See Trac #4874.
1718
1719
1720 ************************************************************************
1721 * *
1722 \subsubsection{UsageDetails and suchlike}
1723 * *
1724 ************************************************************************
1725 -}
1726
1727 data UsageDetails
1728 = MkUD {
1729 ud_binds :: !(Bag DictBind),
1730 -- Floated dictionary bindings
1731 -- The order is important;
1732 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
1733 -- (Remember, Bags preserve order in GHC.)
1734
1735 ud_calls :: !CallDetails
1736
1737 -- INVARIANT: suppose bs = bindersOf ud_binds
1738 -- Then 'calls' may *mention* 'bs',
1739 -- but there should be no calls *for* bs
1740 }
1741
1742 instance Outputable UsageDetails where
1743 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
1744 = text "MkUD" <+> braces (sep (punctuate comma
1745 [text "binds" <+> equals <+> ppr dbs,
1746 text "calls" <+> equals <+> ppr calls]))
1747
1748 -- | A 'DictBind' is a binding along with a cached set containing its free
1749 -- variables (both type variables and dictionaries)
1750 type DictBind = (CoreBind, VarSet)
1751
1752 type DictExpr = CoreExpr
1753
1754 emptyUDs :: UsageDetails
1755 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv }
1756
1757 ------------------------------------------------------------
1758 type CallDetails = DIdEnv CallInfoSet
1759 -- The order of specialized binds and rules depends on how we linearize
1760 -- CallDetails, so to get determinism we must use a deterministic set here.
1761 -- See Note [Deterministic UniqFM] in UniqDFM
1762 newtype CallKey = CallKey [Maybe Type]
1763 -- Nothing => unconstrained type argument
1764
1765 data CallInfoSet = CIS Id (Bag CallInfo)
1766 -- The list of types and dictionaries is guaranteed to
1767 -- match the type of f
1768
1769 {-
1770 Note [CallInfoSet determinism]
1771 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1772 CallInfoSet holds a Bag of (CallKey, [DictExpr], VarSet) triplets for a given
1773 Id. They represent the types that the function is instantiated at along with
1774 the dictionaries and free variables.
1775
1776 We use this information to generate specialized versions of a given function.
1777 CallInfoSet used to be defined as:
1778
1779 data CallInfoSet = CIS Id (Map CallKey ([DictExpr], VarSet))
1780
1781 Unfortunately this was not deterministic. The Ord instance of CallKey was
1782 defined in terms of nonDetCmpType which is not deterministic.
1783 See Note [nonDetCmpType nondeterminism].
1784 The end result was that if the function had multiple specializations they would
1785 be generated in arbitrary order.
1786
1787 We need a container that:
1788 a) when turned into a list has only one element per each CallKey and the list
1789 has deterministic order
1790 b) supports union
1791 c) supports singleton
1792 d) supports filter
1793
1794 We can't use UniqDFM here because there's no one Unique that we can key on.
1795
1796 The current approach is to implement the set as a Bag with duplicates.
1797 This makes b), c), d) trivial and pushes a) towards the end. The deduplication
1798 is done by using a TrieMap for membership tests on CallKey. This lets us delete
1799 the nondeterministic Ord CallKey instance.
1800
1801 An alternative approach would be to augment the Map the same way that UniqDFM
1802 is augmented, by keeping track of insertion order and using it to order the
1803 resulting lists. It would mean keeping the nondeterministic Ord CallKey
1804 instance making it easy to reintroduce nondeterminism in the future.
1805 -}
1806
1807 ciSetToList :: CallInfoSet -> [CallInfo]
1808 ciSetToList (CIS _ b) = snd $ foldrBag combine (emptyTM, []) b
1809 where
1810 -- This is where we eliminate duplicates, recording the CallKeys we've
1811 -- already seen in the TrieMap. See Note [CallInfoSet determinism].
1812 combine :: CallInfo -> (CallKeySet, [CallInfo]) -> (CallKeySet, [CallInfo])
1813 combine ci@(CallKey key, _) (set, acc)
1814 | Just _ <- lookupTM key set = (set, acc)
1815 | otherwise = (insertTM key () set, ci:acc)
1816
1817 type CallKeySet = ListMap (MaybeMap TypeMap) ()
1818 -- We only use it in ciSetToList to check for membership
1819
1820 ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet
1821 ciSetFilter p (CIS id a) = CIS id (filterBag p a)
1822
1823 type CallInfo = (CallKey, ([DictExpr], VarSet))
1824 -- Range is dict args and the vars of the whole
1825 -- call (including tyvars)
1826 -- [*not* include the main id itself, of course]
1827
1828 instance Outputable CallInfoSet where
1829 ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
1830 2 (ppr map)
1831
1832 pprCallInfo :: Id -> CallInfo -> SDoc
1833 pprCallInfo fn (CallKey mb_tys, (_dxs, _))
1834 = hang (ppr fn)
1835 2 (fsep (map ppr_call_key_ty mb_tys {- ++ map pprParendExpr _dxs -}))
1836
1837 ppr_call_key_ty :: Maybe Type -> SDoc
1838 ppr_call_key_ty Nothing = char '_'
1839 ppr_call_key_ty (Just ty) = char '@' <+> pprParendType ty
1840
1841 instance Outputable CallKey where
1842 ppr (CallKey ts) = ppr ts
1843
1844 unionCalls :: CallDetails -> CallDetails -> CallDetails
1845 unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2
1846
1847 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
1848 unionCallInfoSet (CIS f calls1) (CIS _ calls2) =
1849 CIS f (calls1 `unionBags` calls2)
1850
1851 callDetailsFVs :: CallDetails -> VarSet
1852 callDetailsFVs calls =
1853 nonDetFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls
1854 -- It's OK to use nonDetFoldUDFM here because we forget the ordering
1855 -- immediately by converting to a nondeterministic set.
1856
1857 callInfoFVs :: CallInfoSet -> VarSet
1858 callInfoFVs (CIS _ call_info) =
1859 foldrBag (\(_, (_,fv)) vs -> unionVarSet fv vs) emptyVarSet call_info
1860
1861 ------------------------------------------------------------
1862 singleCall :: Id -> [Maybe Type] -> [DictExpr] -> UsageDetails
1863 singleCall id tys dicts
1864 = MkUD {ud_binds = emptyBag,
1865 ud_calls = unitDVarEnv id $ CIS id $
1866 unitBag (CallKey tys, (dicts, call_fvs)) }
1867 where
1868 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
1869 tys_fvs = tyCoVarsOfTypes (catMaybes tys)
1870 -- The type args (tys) are guaranteed to be part of the dictionary
1871 -- types, because they are just the constrained types,
1872 -- and the dictionary is therefore sure to be bound
1873 -- inside the binding for any type variables free in the type;
1874 -- hence it's safe to neglect tyvars free in tys when making
1875 -- the free-var set for this call
1876 -- BUT I don't trust this reasoning; play safe and include tys_fvs
1877 --
1878 -- We don't include the 'id' itself.
1879
1880 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
1881 mkCallUDs env f args
1882 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
1883 res
1884 where
1885 res = mkCallUDs' env f args
1886
1887 mkCallUDs' env f args
1888 | not (want_calls_for f) -- Imported from elsewhere
1889 || null theta -- Not overloaded
1890 = emptyUDs
1891
1892 | not (all type_determines_value theta)
1893 || not (spec_tys `lengthIs` n_tyvars)
1894 || not ( dicts `lengthIs` n_dicts)
1895 || not (any (interestingDict env) dicts) -- Note [Interesting dictionary arguments]
1896 -- See also Note [Specialisations already covered]
1897 = -- pprTrace "mkCallUDs: discarding" _trace_doc
1898 emptyUDs -- Not overloaded, or no specialisation wanted
1899
1900 | otherwise
1901 = -- pprTrace "mkCallUDs: keeping" _trace_doc
1902 singleCall f spec_tys dicts
1903 where
1904 _trace_doc = vcat [ppr f, ppr args, ppr n_tyvars, ppr n_dicts
1905 , ppr (map (interestingDict env) dicts)]
1906 (tyvars, theta, _) = tcSplitSigmaTy (idType f)
1907 constrained_tyvars = tyCoVarsOfTypes theta
1908 n_tyvars = length tyvars
1909 n_dicts = length theta
1910
1911 spec_tys = [mk_spec_ty tv ty | (tv, ty) <- tyvars `type_zip` args]
1912 dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
1913
1914 -- ignores Coercion arguments
1915 type_zip :: [TyVar] -> [CoreExpr] -> [(TyVar, Type)]
1916 type_zip tvs (Coercion _ : args) = type_zip tvs args
1917 type_zip (tv:tvs) (Type ty : args) = (tv, ty) : type_zip tvs args
1918 type_zip _ _ = []
1919
1920 mk_spec_ty tyvar ty
1921 | tyvar `elemVarSet` constrained_tyvars = Just ty
1922 | otherwise = Nothing
1923
1924 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
1925 -- For imported things, we gather call instances if
1926 -- there is an unfolding that we could in principle specialise
1927 -- We might still decide not to use it (consulting dflags)
1928 -- in specImports
1929 -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
1930
1931 type_determines_value pred -- See Note [Type determines value]
1932 = case classifyPredType pred of
1933 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs
1934 EqPred {} -> True
1935 IrredPred {} -> True -- Things like (D []) where D is a
1936 -- Constraint-ranged family; Trac #7785
1937
1938 {-
1939 Note [Type determines value]
1940 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1941 Only specialise if all overloading is on non-IP *class* params,
1942 because these are the ones whose *type* determines their *value*. In
1943 parrticular, with implicit params, the type args *don't* say what the
1944 value of the implicit param is! See Trac #7101
1945
1946 However, consider
1947 type family D (v::*->*) :: Constraint
1948 type instance D [] = ()
1949 f :: D v => v Char -> Int
1950 If we see a call (f "foo"), we'll pass a "dictionary"
1951 () |> (g :: () ~ D [])
1952 and it's good to specialise f at this dictionary.
1953
1954 So the question is: can an implicit parameter "hide inside" a
1955 type-family constraint like (D a). Well, no. We don't allow
1956 type instance D Maybe = ?x:Int
1957 Hence the IrredPred case in type_determines_value.
1958 See Trac #7785.
1959
1960 Note [Interesting dictionary arguments]
1961 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1962 Consider this
1963 \a.\d:Eq a. let f = ... in ...(f d)...
1964 There really is not much point in specialising f wrt the dictionary d,
1965 because the code for the specialised f is not improved at all, because
1966 d is lambda-bound. We simply get junk specialisations.
1967
1968 What is "interesting"? Just that it has *some* structure. But what about
1969 variables?
1970
1971 * A variable might be imported, in which case its unfolding
1972 will tell us whether it has useful structure
1973
1974 * Local variables are cloned on the way down (to avoid clashes when
1975 we float dictionaries), and cloning drops the unfolding
1976 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
1977 nuisance to give them unfoldings. So we keep track of the
1978 "interesting" dictionaries as a VarSet in SpecEnv.
1979 We have to take care to put any new interesting dictionary
1980 bindings in the set.
1981
1982 We accidentally lost accurate tracking of local variables for a long
1983 time, because cloned variables don't have unfoldings. But makes a
1984 massive difference in a few cases, eg Trac #5113. For nofib as a
1985 whole it's only a small win: 2.2% improvement in allocation for ansi,
1986 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
1987 -}
1988
1989 interestingDict :: SpecEnv -> CoreExpr -> Bool
1990 -- A dictionary argument is interesting if it has *some* structure
1991 -- NB: "dictionary" arguments include constraints of all sorts,
1992 -- including equality constraints; hence the Coercion case
1993 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
1994 || isDataConWorkId v
1995 || v `elemVarSet` se_interesting env
1996 interestingDict _ (Type _) = False
1997 interestingDict _ (Coercion _) = False
1998 interestingDict env (App fn (Type _)) = interestingDict env fn
1999 interestingDict env (App fn (Coercion _)) = interestingDict env fn
2000 interestingDict env (Tick _ a) = interestingDict env a
2001 interestingDict env (Cast e _) = interestingDict env e
2002 interestingDict _ _ = True
2003
2004 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
2005 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
2006 (MkUD {ud_binds = db2, ud_calls = calls2})
2007 = MkUD { ud_binds = db1 `unionBags` db2
2008 , ud_calls = calls1 `unionCalls` calls2 }
2009
2010 plusUDList :: [UsageDetails] -> UsageDetails
2011 plusUDList = foldr plusUDs emptyUDs
2012
2013 -----------------------------
2014 _dictBindBndrs :: Bag DictBind -> [Id]
2015 _dictBindBndrs dbs = foldrBag ((++) . bindersOf . fst) [] dbs
2016
2017 -- | Construct a 'DictBind' from a 'CoreBind'
2018 mkDB :: CoreBind -> DictBind
2019 mkDB bind = (bind, bind_fvs bind)
2020
2021 -- | Identify the free variables of a 'CoreBind'
2022 bind_fvs :: CoreBind -> VarSet
2023 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
2024 bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs
2025 where
2026 bndrs = map fst prs
2027 rhs_fvs = unionVarSets (map pair_fvs prs)
2028
2029 pair_fvs :: (Id, CoreExpr) -> VarSet
2030 pair_fvs (bndr, rhs) = exprFreeVars rhs `unionVarSet` idFreeVars bndr
2031 -- Don't forget variables mentioned in the
2032 -- rules of the bndr. C.f. OccAnal.addRuleUsage
2033 -- Also tyvars mentioned in its type; they may not appear in the RHS
2034 -- type T a = Int
2035 -- x :: T a = 3
2036
2037 -- | Flatten a set of 'DictBind's and some other binding pairs into a single
2038 -- recursive binding, including some additional bindings.
2039 flattenDictBinds :: Bag DictBind -> [(Id,CoreExpr)] -> DictBind
2040 flattenDictBinds dbs pairs
2041 = (Rec bindings, fvs)
2042 where
2043 (bindings, fvs) = foldrBag add
2044 ([], emptyVarSet)
2045 (dbs `snocBag` mkDB (Rec pairs))
2046 add (NonRec b r, fvs') (pairs, fvs) =
2047 ((b,r) : pairs, fvs `unionVarSet` fvs')
2048 add (Rec prs1, fvs') (pairs, fvs) =
2049 (prs1 ++ pairs, fvs `unionVarSet` fvs')
2050
2051 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
2052 -- Add ud_binds to the tail end of the bindings in uds
2053 snocDictBinds uds dbs
2054 = uds { ud_binds = ud_binds uds `unionBags`
2055 foldr consBag emptyBag dbs }
2056
2057 consDictBind :: DictBind -> UsageDetails -> UsageDetails
2058 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
2059
2060 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
2061 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
2062
2063 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
2064 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
2065
2066 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
2067 wrapDictBinds dbs binds
2068 = foldrBag add binds dbs
2069 where
2070 add (bind,_) binds = bind : binds
2071
2072 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
2073 wrapDictBindsE dbs expr
2074 = foldrBag add expr dbs
2075 where
2076 add (bind,_) expr = Let bind expr
2077
2078 ----------------------
2079 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
2080 -- Used at a lambda or case binder; just dump anything mentioning the binder
2081 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2082 | null bndrs = (uds, emptyBag) -- Common in case alternatives
2083 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2084 (free_uds, dump_dbs)
2085 where
2086 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2087 bndr_set = mkVarSet bndrs
2088 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2089 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
2090 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
2091 -- no calls for any of the dicts in dump_dbs
2092
2093 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
2094 -- Used at a lambda or case binder; just dump anything mentioning the binder
2095 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2096 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2097 (free_uds, dump_dbs, float_all)
2098 where
2099 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2100 bndr_set = mkVarSet bndrs
2101 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2102 free_calls = deleteCallsFor bndrs orig_calls
2103 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
2104
2105 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
2106 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2107 = -- pprTrace ("callsForMe")
2108 -- (vcat [ppr fn,
2109 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
2110 -- text "Orig calls =" <+> ppr orig_calls,
2111 -- text "Dep set =" <+> ppr dep_set,
2112 -- text "Calls for me =" <+> ppr calls_for_me]) $
2113 (uds_without_me, calls_for_me)
2114 where
2115 uds_without_me = MkUD { ud_binds = orig_dbs
2116 , ud_calls = delDVarEnv orig_calls fn }
2117 calls_for_me = case lookupDVarEnv orig_calls fn of
2118 Nothing -> []
2119 Just cis -> filter_dfuns (ciSetToList cis)
2120
2121 dep_set = foldlBag go (unitVarSet fn) orig_dbs
2122 go dep_set (db,fvs) | fvs `intersectsVarSet` dep_set
2123 = extendVarSetList dep_set (bindersOf db)
2124 | otherwise = dep_set
2125
2126 -- Note [Specialisation of dictionary functions]
2127 filter_dfuns | isDFunId fn = filter ok_call
2128 | otherwise = \cs -> cs
2129
2130 ok_call (_, (_,fvs)) = not (fvs `intersectsVarSet` dep_set)
2131
2132 ----------------------
2133 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2134 -- Returns (free_dbs, dump_dbs, dump_set)
2135 splitDictBinds dbs bndr_set
2136 = foldlBag split_db (emptyBag, emptyBag, bndr_set) dbs
2137 -- Important that it's foldl not foldr;
2138 -- we're accumulating the set of dumped ids in dump_set
2139 where
2140 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
2141 | dump_idset `intersectsVarSet` fvs -- Dump it
2142 = (free_dbs, dump_dbs `snocBag` db,
2143 extendVarSetList dump_idset (bindersOf bind))
2144
2145 | otherwise -- Don't dump it
2146 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2147
2148
2149 ----------------------
2150 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2151 -- Remove calls *mentioning* bs
2152 deleteCallsMentioning bs calls
2153 = mapDVarEnv (ciSetFilter keep_call) calls
2154 where
2155 keep_call (_, (_, fvs)) = not (fvs `intersectsVarSet` bs)
2156
2157 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
2158 -- Remove calls *for* bs
2159 deleteCallsFor bs calls = delDVarEnvList calls bs
2160
2161 {-
2162 ************************************************************************
2163 * *
2164 \subsubsection{Boring helper functions}
2165 * *
2166 ************************************************************************
2167 -}
2168
2169 newtype SpecM a = SpecM (State SpecState a)
2170
2171 data SpecState = SpecState {
2172 spec_uniq_supply :: UniqSupply,
2173 spec_module :: Module,
2174 spec_dflags :: DynFlags
2175 }
2176
2177 instance Functor SpecM where
2178 fmap = liftM
2179
2180 instance Applicative SpecM where
2181 pure x = SpecM $ return x
2182 (<*>) = ap
2183
2184 instance Monad SpecM where
2185 SpecM x >>= f = SpecM $ do y <- x
2186 case f y of
2187 SpecM z ->
2188 z
2189 fail str = SpecM $ fail str
2190
2191 #if __GLASGOW_HASKELL__ > 710
2192 instance MonadFail.MonadFail SpecM where
2193 fail str = SpecM $ fail str
2194 #endif
2195
2196 instance MonadUnique SpecM where
2197 getUniqueSupplyM
2198 = SpecM $ do st <- get
2199 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
2200 put $ st { spec_uniq_supply = us2 }
2201 return us1
2202
2203 getUniqueM
2204 = SpecM $ do st <- get
2205 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
2206 put $ st { spec_uniq_supply = us' }
2207 return u
2208
2209 instance HasDynFlags SpecM where
2210 getDynFlags = SpecM $ liftM spec_dflags get
2211
2212 instance HasModule SpecM where
2213 getModule = SpecM $ liftM spec_module get
2214
2215 runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2216 runSpecM dflags this_mod (SpecM spec)
2217 = do us <- getUniqueSupplyM
2218 let initialState = SpecState {
2219 spec_uniq_supply = us,
2220 spec_module = this_mod,
2221 spec_dflags = dflags
2222 }
2223 return $ evalState spec initialState
2224
2225 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2226 mapAndCombineSM _ [] = return ([], emptyUDs)
2227 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2228 (ys, uds2) <- mapAndCombineSM f xs
2229 return (y:ys, uds1 `plusUDs` uds2)
2230
2231 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2232 extendTvSubstList env tv_binds
2233 = env { se_subst = CoreSubst.extendTvSubstList (se_subst env) tv_binds }
2234
2235 substTy :: SpecEnv -> Type -> Type
2236 substTy env ty = CoreSubst.substTy (se_subst env) ty
2237
2238 substCo :: SpecEnv -> Coercion -> Coercion
2239 substCo env co = CoreSubst.substCo (se_subst env) co
2240
2241 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2242 substBndr env bs = case CoreSubst.substBndr (se_subst env) bs of
2243 (subst', bs') -> (env { se_subst = subst' }, bs')
2244
2245 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2246 substBndrs env bs = case CoreSubst.substBndrs (se_subst env) bs of
2247 (subst', bs') -> (env { se_subst = subst' }, bs')
2248
2249 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2250 -- Clone the binders of the bind; return new bind with the cloned binders
2251 -- Return the substitution to use for RHSs, and the one to use for the body
2252 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2253 = do { us <- getUniqueSupplyM
2254 ; let (subst', bndr') = CoreSubst.cloneIdBndr subst us bndr
2255 interesting' | interestingDict env rhs
2256 = interesting `extendVarSet` bndr'
2257 | otherwise = interesting
2258 ; return (env, env { se_subst = subst', se_interesting = interesting' }
2259 , NonRec bndr' rhs) }
2260
2261 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2262 = do { us <- getUniqueSupplyM
2263 ; let (subst', bndrs') = CoreSubst.cloneRecIdBndrs subst us (map fst pairs)
2264 env' = env { se_subst = subst'
2265 , se_interesting = interesting `extendVarSetList`
2266 [ v | (v,r) <- pairs, interestingDict env r ] }
2267 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2268
2269 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2270 -- Make up completely fresh binders for the dictionaries
2271 -- Their bindings are going to float outwards
2272 newDictBndr env b = do { uniq <- getUniqueM
2273 ; let n = idName b
2274 ty' = substTy env (idType b)
2275 ; return (mkUserLocalOrCoVar (nameOccName n) uniq ty' (getSrcSpan n)) }
2276
2277 newSpecIdSM :: Id -> Type -> Maybe JoinArity -> SpecM Id
2278 -- Give the new Id a similar occurrence name to the old one
2279 newSpecIdSM old_id new_ty join_arity_maybe
2280 = do { uniq <- getUniqueM
2281 ; let name = idName old_id
2282 new_occ = mkSpecOcc (nameOccName name)
2283 new_id = mkUserLocalOrCoVar new_occ uniq new_ty (getSrcSpan name)
2284 `asJoinId_maybe` join_arity_maybe
2285 ; return new_id }
2286
2287 {-
2288 Old (but interesting) stuff about unboxed bindings
2289 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2290
2291 What should we do when a value is specialised to a *strict* unboxed value?
2292
2293 map_*_* f (x:xs) = let h = f x
2294 t = map f xs
2295 in h:t
2296
2297 Could convert let to case:
2298
2299 map_*_Int# f (x:xs) = case f x of h# ->
2300 let t = map f xs
2301 in h#:t
2302
2303 This may be undesirable since it forces evaluation here, but the value
2304 may not be used in all branches of the body. In the general case this
2305 transformation is impossible since the mutual recursion in a letrec
2306 cannot be expressed as a case.
2307
2308 There is also a problem with top-level unboxed values, since our
2309 implementation cannot handle unboxed values at the top level.
2310
2311 Solution: Lift the binding of the unboxed value and extract it when it
2312 is used:
2313
2314 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2315 t = map f xs
2316 in case h of
2317 _Lift h# -> h#:t
2318
2319 Now give it to the simplifier and the _Lifting will be optimised away.
2320
2321 The benefit is that we have given the specialised "unboxed" values a
2322 very simple lifted semantics and then leave it up to the simplifier to
2323 optimise it --- knowing that the overheads will be removed in nearly
2324 all cases.
2325
2326 In particular, the value will only be evaluated in the branches of the
2327 program which use it, rather than being forced at the point where the
2328 value is bound. For example:
2329
2330 filtermap_*_* p f (x:xs)
2331 = let h = f x
2332 t = ...
2333 in case p x of
2334 True -> h:t
2335 False -> t
2336 ==>
2337 filtermap_*_Int# p f (x:xs)
2338 = let h = case (f x) of h# -> _Lift h#
2339 t = ...
2340 in case p x of
2341 True -> case h of _Lift h#
2342 -> h#:t
2343 False -> t
2344
2345 The binding for h can still be inlined in the one branch and the
2346 _Lifting eliminated.
2347
2348
2349 Question: When won't the _Lifting be eliminated?
2350
2351 Answer: When they at the top-level (where it is necessary) or when
2352 inlining would duplicate work (or possibly code depending on
2353 options). However, the _Lifting will still be eliminated if the
2354 strictness analyser deems the lifted binding strict.
2355 -}