bd1c7aecf040088d19bc5ef131907fe8212fd1fe
[ghc.git] / compiler / specialise / Specialise.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5 -}
6
7 {-# LANGUAGE CPP #-}
8 module Specialise ( specProgram, specUnfolding ) where
9
10 #include "HsVersions.h"
11
12 import Id
13 import TcType hiding( substTy )
14 import Type hiding( substTy, extendTvSubstList )
15 import Module( Module, HasModule(..) )
16 import Coercion( Coercion )
17 import CoreMonad
18 import qualified CoreSubst
19 import CoreUnfold
20 import VarSet
21 import VarEnv
22 import CoreSyn
23 import Rules
24 import CoreUtils ( exprIsTrivial, applyTypeToArgs, mkCast )
25 import CoreFVs ( exprFreeVars, exprsFreeVars, idFreeVars, exprsFreeIdsList )
26 import CoreArity ( etaExpandToJoinPointRule )
27 import UniqSupply
28 import Name
29 import MkId ( voidArgId, voidPrimId )
30 import Maybes ( catMaybes, isJust )
31 import BasicTypes
32 import HscTypes
33 import Bag
34 import DynFlags
35 import Util
36 import Outputable
37 import FastString
38 import State
39 import UniqDFM
40 import TrieMap
41
42 import Control.Monad
43 #if __GLASGOW_HASKELL__ > 710
44 import qualified Control.Monad.Fail as MonadFail
45 #endif
46
47 {-
48 ************************************************************************
49 * *
50 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
51 * *
52 ************************************************************************
53
54 These notes describe how we implement specialisation to eliminate
55 overloading.
56
57 The specialisation pass works on Core
58 syntax, complete with all the explicit dictionary application,
59 abstraction and construction as added by the type checker. The
60 existing type checker remains largely as it is.
61
62 One important thought: the {\em types} passed to an overloaded
63 function, and the {\em dictionaries} passed are mutually redundant.
64 If the same function is applied to the same type(s) then it is sure to
65 be applied to the same dictionary(s)---or rather to the same {\em
66 values}. (The arguments might look different but they will evaluate
67 to the same value.)
68
69 Second important thought: we know that we can make progress by
70 treating dictionary arguments as static and worth specialising on. So
71 we can do without binding-time analysis, and instead specialise on
72 dictionary arguments and no others.
73
74 The basic idea
75 ~~~~~~~~~~~~~~
76 Suppose we have
77
78 let f = <f_rhs>
79 in <body>
80
81 and suppose f is overloaded.
82
83 STEP 1: CALL-INSTANCE COLLECTION
84
85 We traverse <body>, accumulating all applications of f to types and
86 dictionaries.
87
88 (Might there be partial applications, to just some of its types and
89 dictionaries? In principle yes, but in practice the type checker only
90 builds applications of f to all its types and dictionaries, so partial
91 applications could only arise as a result of transformation, and even
92 then I think it's unlikely. In any case, we simply don't accumulate such
93 partial applications.)
94
95
96 STEP 2: EQUIVALENCES
97
98 So now we have a collection of calls to f:
99 f t1 t2 d1 d2
100 f t3 t4 d3 d4
101 ...
102 Notice that f may take several type arguments. To avoid ambiguity, we
103 say that f is called at type t1/t2 and t3/t4.
104
105 We take equivalence classes using equality of the *types* (ignoring
106 the dictionary args, which as mentioned previously are redundant).
107
108 STEP 3: SPECIALISATION
109
110 For each equivalence class, choose a representative (f t1 t2 d1 d2),
111 and create a local instance of f, defined thus:
112
113 f@t1/t2 = <f_rhs> t1 t2 d1 d2
114
115 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
116 of simplification will now result. However we don't actually *do* that
117 simplification. Rather, we leave it for the simplifier to do. If we
118 *did* do it, though, we'd get more call instances from the specialised
119 RHS. We can work out what they are by instantiating the call-instance
120 set from f's RHS with the types t1, t2.
121
122 Add this new id to f's IdInfo, to record that f has a specialised version.
123
124 Before doing any of this, check that f's IdInfo doesn't already
125 tell us about an existing instance of f at the required type/s.
126 (This might happen if specialisation was applied more than once, or
127 it might arise from user SPECIALIZE pragmas.)
128
129 Recursion
130 ~~~~~~~~~
131 Wait a minute! What if f is recursive? Then we can't just plug in
132 its right-hand side, can we?
133
134 But it's ok. The type checker *always* creates non-recursive definitions
135 for overloaded recursive functions. For example:
136
137 f x = f (x+x) -- Yes I know its silly
138
139 becomes
140
141 f a (d::Num a) = let p = +.sel a d
142 in
143 letrec fl (y::a) = fl (p y y)
144 in
145 fl
146
147 We still have recusion for non-overloaded functions which we
148 specialise, but the recursive call should get specialised to the
149 same recursive version.
150
151
152 Polymorphism 1
153 ~~~~~~~~~~~~~~
154
155 All this is crystal clear when the function is applied to *constant
156 types*; that is, types which have no type variables inside. But what if
157 it is applied to non-constant types? Suppose we find a call of f at type
158 t1/t2. There are two possibilities:
159
160 (a) The free type variables of t1, t2 are in scope at the definition point
161 of f. In this case there's no problem, we proceed just as before. A common
162 example is as follows. Here's the Haskell:
163
164 g y = let f x = x+x
165 in f y + f y
166
167 After typechecking we have
168
169 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
170 in +.sel a d (f a d y) (f a d y)
171
172 Notice that the call to f is at type type "a"; a non-constant type.
173 Both calls to f are at the same type, so we can specialise to give:
174
175 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
176 in +.sel a d (f@a y) (f@a y)
177
178
179 (b) The other case is when the type variables in the instance types
180 are *not* in scope at the definition point of f. The example we are
181 working with above is a good case. There are two instances of (+.sel a d),
182 but "a" is not in scope at the definition of +.sel. Can we do anything?
183 Yes, we can "common them up", a sort of limited common sub-expression deal.
184 This would give:
185
186 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
187 f@a (x::a) = +.sel@a x x
188 in +.sel@a (f@a y) (f@a y)
189
190 This can save work, and can't be spotted by the type checker, because
191 the two instances of +.sel weren't originally at the same type.
192
193 Further notes on (b)
194
195 * There are quite a few variations here. For example, the defn of
196 +.sel could be floated ouside the \y, to attempt to gain laziness.
197 It certainly mustn't be floated outside the \d because the d has to
198 be in scope too.
199
200 * We don't want to inline f_rhs in this case, because
201 that will duplicate code. Just commoning up the call is the point.
202
203 * Nothing gets added to +.sel's IdInfo.
204
205 * Don't bother unless the equivalence class has more than one item!
206
207 Not clear whether this is all worth it. It is of course OK to
208 simply discard call-instances when passing a big lambda.
209
210 Polymorphism 2 -- Overloading
211 ~~~~~~~~~~~~~~
212 Consider a function whose most general type is
213
214 f :: forall a b. Ord a => [a] -> b -> b
215
216 There is really no point in making a version of g at Int/Int and another
217 at Int/Bool, because it's only instantiating the type variable "a" which
218 buys us any efficiency. Since g is completely polymorphic in b there
219 ain't much point in making separate versions of g for the different
220 b types.
221
222 That suggests that we should identify which of g's type variables
223 are constrained (like "a") and which are unconstrained (like "b").
224 Then when taking equivalence classes in STEP 2, we ignore the type args
225 corresponding to unconstrained type variable. In STEP 3 we make
226 polymorphic versions. Thus:
227
228 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
229
230 We do this.
231
232
233 Dictionary floating
234 ~~~~~~~~~~~~~~~~~~~
235 Consider this
236
237 f a (d::Num a) = let g = ...
238 in
239 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
240
241 Here, g is only called at one type, but the dictionary isn't in scope at the
242 definition point for g. Usually the type checker would build a
243 definition for d1 which enclosed g, but the transformation system
244 might have moved d1's defn inward. Solution: float dictionary bindings
245 outwards along with call instances.
246
247 Consider
248
249 f x = let g p q = p==q
250 h r s = (r+s, g r s)
251 in
252 h x x
253
254
255 Before specialisation, leaving out type abstractions we have
256
257 f df x = let g :: Eq a => a -> a -> Bool
258 g dg p q = == dg p q
259 h :: Num a => a -> a -> (a, Bool)
260 h dh r s = let deq = eqFromNum dh
261 in (+ dh r s, g deq r s)
262 in
263 h df x x
264
265 After specialising h we get a specialised version of h, like this:
266
267 h' r s = let deq = eqFromNum df
268 in (+ df r s, g deq r s)
269
270 But we can't naively make an instance for g from this, because deq is not in scope
271 at the defn of g. Instead, we have to float out the (new) defn of deq
272 to widen its scope. Notice that this floating can't be done in advance -- it only
273 shows up when specialisation is done.
274
275 User SPECIALIZE pragmas
276 ~~~~~~~~~~~~~~~~~~~~~~~
277 Specialisation pragmas can be digested by the type checker, and implemented
278 by adding extra definitions along with that of f, in the same way as before
279
280 f@t1/t2 = <f_rhs> t1 t2 d1 d2
281
282 Indeed the pragmas *have* to be dealt with by the type checker, because
283 only it knows how to build the dictionaries d1 and d2! For example
284
285 g :: Ord a => [a] -> [a]
286 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
287
288 Here, the specialised version of g is an application of g's rhs to the
289 Ord dictionary for (Tree Int), which only the type checker can conjure
290 up. There might not even *be* one, if (Tree Int) is not an instance of
291 Ord! (All the other specialision has suitable dictionaries to hand
292 from actual calls.)
293
294 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
295 it is buried in a complex (as-yet-un-desugared) binding group.
296 Maybe we should say
297
298 f@t1/t2 = f* t1 t2 d1 d2
299
300 where f* is the Id f with an IdInfo which says "inline me regardless!".
301 Indeed all the specialisation could be done in this way.
302 That in turn means that the simplifier has to be prepared to inline absolutely
303 any in-scope let-bound thing.
304
305
306 Again, the pragma should permit polymorphism in unconstrained variables:
307
308 h :: Ord a => [a] -> b -> b
309 {-# SPECIALIZE h :: [Int] -> b -> b #-}
310
311 We *insist* that all overloaded type variables are specialised to ground types,
312 (and hence there can be no context inside a SPECIALIZE pragma).
313 We *permit* unconstrained type variables to be specialised to
314 - a ground type
315 - or left as a polymorphic type variable
316 but nothing in between. So
317
318 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
319
320 is *illegal*. (It can be handled, but it adds complication, and gains the
321 programmer nothing.)
322
323
324 SPECIALISING INSTANCE DECLARATIONS
325 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
326 Consider
327
328 instance Foo a => Foo [a] where
329 ...
330 {-# SPECIALIZE instance Foo [Int] #-}
331
332 The original instance decl creates a dictionary-function
333 definition:
334
335 dfun.Foo.List :: forall a. Foo a -> Foo [a]
336
337 The SPECIALIZE pragma just makes a specialised copy, just as for
338 ordinary function definitions:
339
340 dfun.Foo.List@Int :: Foo [Int]
341 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
342
343 The information about what instance of the dfun exist gets added to
344 the dfun's IdInfo in the same way as a user-defined function too.
345
346
347 Automatic instance decl specialisation?
348 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
349 Can instance decls be specialised automatically? It's tricky.
350 We could collect call-instance information for each dfun, but
351 then when we specialised their bodies we'd get new call-instances
352 for ordinary functions; and when we specialised their bodies, we might get
353 new call-instances of the dfuns, and so on. This all arises because of
354 the unrestricted mutual recursion between instance decls and value decls.
355
356 Still, there's no actual problem; it just means that we may not do all
357 the specialisation we could theoretically do.
358
359 Furthermore, instance decls are usually exported and used non-locally,
360 so we'll want to compile enough to get those specialisations done.
361
362 Lastly, there's no such thing as a local instance decl, so we can
363 survive solely by spitting out *usage* information, and then reading that
364 back in as a pragma when next compiling the file. So for now,
365 we only specialise instance decls in response to pragmas.
366
367
368 SPITTING OUT USAGE INFORMATION
369 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
370
371 To spit out usage information we need to traverse the code collecting
372 call-instance information for all imported (non-prelude?) functions
373 and data types. Then we equivalence-class it and spit it out.
374
375 This is done at the top-level when all the call instances which escape
376 must be for imported functions and data types.
377
378 *** Not currently done ***
379
380
381 Partial specialisation by pragmas
382 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
383 What about partial specialisation:
384
385 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
386 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
387
388 or even
389
390 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
391
392 Seems quite reasonable. Similar things could be done with instance decls:
393
394 instance (Foo a, Foo b) => Foo (a,b) where
395 ...
396 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
397 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
398
399 Ho hum. Things are complex enough without this. I pass.
400
401
402 Requirements for the simplifier
403 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
404 The simplifier has to be able to take advantage of the specialisation.
405
406 * When the simplifier finds an application of a polymorphic f, it looks in
407 f's IdInfo in case there is a suitable instance to call instead. This converts
408
409 f t1 t2 d1 d2 ===> f_t1_t2
410
411 Note that the dictionaries get eaten up too!
412
413 * Dictionary selection operations on constant dictionaries must be
414 short-circuited:
415
416 +.sel Int d ===> +Int
417
418 The obvious way to do this is in the same way as other specialised
419 calls: +.sel has inside it some IdInfo which tells that if it's applied
420 to the type Int then it should eat a dictionary and transform to +Int.
421
422 In short, dictionary selectors need IdInfo inside them for constant
423 methods.
424
425 * Exactly the same applies if a superclass dictionary is being
426 extracted:
427
428 Eq.sel Int d ===> dEqInt
429
430 * Something similar applies to dictionary construction too. Suppose
431 dfun.Eq.List is the function taking a dictionary for (Eq a) to
432 one for (Eq [a]). Then we want
433
434 dfun.Eq.List Int d ===> dEq.List_Int
435
436 Where does the Eq [Int] dictionary come from? It is built in
437 response to a SPECIALIZE pragma on the Eq [a] instance decl.
438
439 In short, dfun Ids need IdInfo with a specialisation for each
440 constant instance of their instance declaration.
441
442 All this uses a single mechanism: the SpecEnv inside an Id
443
444
445 What does the specialisation IdInfo look like?
446 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
447
448 The SpecEnv of an Id maps a list of types (the template) to an expression
449
450 [Type] |-> Expr
451
452 For example, if f has this RuleInfo:
453
454 [Int, a] -> \d:Ord Int. f' a
455
456 it means that we can replace the call
457
458 f Int t ===> (\d. f' t)
459
460 This chucks one dictionary away and proceeds with the
461 specialised version of f, namely f'.
462
463
464 What can't be done this way?
465 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
466 There is no way, post-typechecker, to get a dictionary for (say)
467 Eq a from a dictionary for Eq [a]. So if we find
468
469 ==.sel [t] d
470
471 we can't transform to
472
473 eqList (==.sel t d')
474
475 where
476 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
477
478 Of course, we currently have no way to automatically derive
479 eqList, nor to connect it to the Eq [a] instance decl, but you
480 can imagine that it might somehow be possible. Taking advantage
481 of this is permanently ruled out.
482
483 Still, this is no great hardship, because we intend to eliminate
484 overloading altogether anyway!
485
486 A note about non-tyvar dictionaries
487 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
488 Some Ids have types like
489
490 forall a,b,c. Eq a -> Ord [a] -> tau
491
492 This seems curious at first, because we usually only have dictionary
493 args whose types are of the form (C a) where a is a type variable.
494 But this doesn't hold for the functions arising from instance decls,
495 which sometimes get arguments with types of form (C (T a)) for some
496 type constructor T.
497
498 Should we specialise wrt this compound-type dictionary? We used to say
499 "no", saying:
500 "This is a heuristic judgement, as indeed is the fact that we
501 specialise wrt only dictionaries. We choose *not* to specialise
502 wrt compound dictionaries because at the moment the only place
503 they show up is in instance decls, where they are simply plugged
504 into a returned dictionary. So nothing is gained by specialising
505 wrt them."
506
507 But it is simpler and more uniform to specialise wrt these dicts too;
508 and in future GHC is likely to support full fledged type signatures
509 like
510 f :: Eq [(a,b)] => ...
511
512
513 ************************************************************************
514 * *
515 \subsubsection{The new specialiser}
516 * *
517 ************************************************************************
518
519 Our basic game plan is this. For let(rec) bound function
520 f :: (C a, D c) => (a,b,c,d) -> Bool
521
522 * Find any specialised calls of f, (f ts ds), where
523 ts are the type arguments t1 .. t4, and
524 ds are the dictionary arguments d1 .. d2.
525
526 * Add a new definition for f1 (say):
527
528 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
529
530 Note that we abstract over the unconstrained type arguments.
531
532 * Add the mapping
533
534 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
535
536 to the specialisations of f. This will be used by the
537 simplifier to replace calls
538 (f t1 t2 t3 t4) da db
539 by
540 (\d1 d1 -> f1 t2 t4) da db
541
542 All the stuff about how many dictionaries to discard, and what types
543 to apply the specialised function to, are handled by the fact that the
544 SpecEnv contains a template for the result of the specialisation.
545
546 We don't build *partial* specialisations for f. For example:
547
548 f :: Eq a => a -> a -> Bool
549 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
550
551 Here, little is gained by making a specialised copy of f.
552 There's a distinct danger that the specialised version would
553 first build a dictionary for (Eq b, Eq c), and then select the (==)
554 method from it! Even if it didn't, not a great deal is saved.
555
556 We do, however, generate polymorphic, but not overloaded, specialisations:
557
558 f :: Eq a => [a] -> b -> b -> b
559 ... SPECIALISE f :: [Int] -> b -> b -> b ...
560
561 Hence, the invariant is this:
562
563 *** no specialised version is overloaded ***
564
565
566 ************************************************************************
567 * *
568 \subsubsection{The exported function}
569 * *
570 ************************************************************************
571 -}
572
573 -- | Specialise calls to type-class overloaded functions occuring in a program.
574 specProgram :: ModGuts -> CoreM ModGuts
575 specProgram guts@(ModGuts { mg_module = this_mod
576 , mg_rules = local_rules
577 , mg_binds = binds })
578 = do { dflags <- getDynFlags
579
580 -- Specialise the bindings of this module
581 ; (binds', uds) <- runSpecM dflags this_mod (go binds)
582
583 -- Specialise imported functions
584 ; hpt_rules <- getRuleBase
585 ; let rule_base = extendRuleBaseList hpt_rules local_rules
586 ; (new_rules, spec_binds) <- specImports dflags this_mod top_env emptyVarSet
587 [] rule_base (ud_calls uds)
588
589 -- Don't forget to wrap the specialized bindings with bindings
590 -- for the needed dictionaries.
591 -- See Note [Wrap bindings returned by specImports]
592 ; let spec_binds' = wrapDictBinds (ud_binds uds) spec_binds
593
594 ; let final_binds
595 | null spec_binds' = binds'
596 | otherwise = Rec (flattenBinds spec_binds') : binds'
597 -- Note [Glom the bindings if imported functions are specialised]
598
599 ; return (guts { mg_binds = final_binds
600 , mg_rules = new_rules ++ local_rules }) }
601 where
602 -- We need to start with a Subst that knows all the things
603 -- that are in scope, so that the substitution engine doesn't
604 -- accidentally re-use a unique that's already in use
605 -- Easiest thing is to do it all at once, as if all the top-level
606 -- decls were mutually recursive
607 top_env = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
608 bindersOfBinds binds
609 , se_interesting = emptyVarSet }
610
611 go [] = return ([], emptyUDs)
612 go (bind:binds) = do (binds', uds) <- go binds
613 (bind', uds') <- specBind top_env bind uds
614 return (bind' ++ binds', uds')
615
616 {-
617 Note [Wrap bindings returned by specImports]
618 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
619 'specImports' returns a set of specialized bindings. However, these are lacking
620 necessary floated dictionary bindings, which are returned by
621 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
622 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
623 for instance, the 'specImports' call in 'specProgram'.
624
625
626 Note [Disabling cross-module specialisation]
627 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
628 Since GHC 7.10 we have performed specialisation of INLINABLE bindings living
629 in modules outside of the current module. This can sometimes uncover user code
630 which explodes in size when aggressively optimized. The
631 -fno-cross-module-specialise option was introduced to allow users to being
632 bitten by such instances to revert to the pre-7.10 behavior.
633
634 See Trac #10491
635 -}
636
637 -- | Specialise a set of calls to imported bindings
638 specImports :: DynFlags
639 -> Module
640 -> SpecEnv -- Passed in so that all top-level Ids are in scope
641 -> VarSet -- Don't specialise these ones
642 -- See Note [Avoiding recursive specialisation]
643 -> [Id] -- Stack of imported functions being specialised
644 -> RuleBase -- Rules from this module and the home package
645 -- (but not external packages, which can change)
646 -> CallDetails -- Calls for imported things, and floating bindings
647 -> CoreM ( [CoreRule] -- New rules
648 , [CoreBind] ) -- Specialised bindings
649 -- See Note [Wrapping bindings returned by specImports]
650 specImports dflags this_mod top_env done callers rule_base cds
651 -- See Note [Disabling cross-module specialisation]
652 | not $ gopt Opt_CrossModuleSpecialise dflags =
653 return ([], [])
654
655 | otherwise =
656 do { let import_calls = dVarEnvElts cds
657 ; (rules, spec_binds) <- go rule_base import_calls
658 ; return (rules, spec_binds) }
659 where
660 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
661 go _ [] = return ([], [])
662 go rb (cis@(CIS fn _calls_for_fn) : other_calls)
663 = do { (rules1, spec_binds1) <- specImport dflags this_mod top_env
664 done callers rb fn $
665 ciSetToList cis
666 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
667 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
668
669 specImport :: DynFlags
670 -> Module
671 -> SpecEnv -- Passed in so that all top-level Ids are in scope
672 -> VarSet -- Don't specialise these
673 -- See Note [Avoiding recursive specialisation]
674 -> [Id] -- Stack of imported functions being specialised
675 -> RuleBase -- Rules from this module
676 -> Id -> [CallInfo] -- Imported function and calls for it
677 -> CoreM ( [CoreRule] -- New rules
678 , [CoreBind] ) -- Specialised bindings
679 specImport dflags this_mod top_env done callers rb fn calls_for_fn
680 | fn `elemVarSet` done
681 = return ([], []) -- No warning. This actually happens all the time
682 -- when specialising a recursive function, because
683 -- the RHS of the specialised function contains a recursive
684 -- call to the original function
685
686 | null calls_for_fn -- We filtered out all the calls in deleteCallsMentioning
687 = return ([], [])
688
689 | wantSpecImport dflags unfolding
690 , Just rhs <- maybeUnfoldingTemplate unfolding
691 = do { -- Get rules from the external package state
692 -- We keep doing this in case we "page-fault in"
693 -- more rules as we go along
694 ; hsc_env <- getHscEnv
695 ; eps <- liftIO $ hscEPS hsc_env
696 ; vis_orphs <- getVisibleOrphanMods
697 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
698 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
699
700 ; (rules1, spec_pairs, uds) <- -- pprTrace "specImport1" (vcat [ppr fn, ppr calls_for_fn, ppr rhs]) $
701 runSpecM dflags this_mod $
702 specCalls (Just this_mod) top_env rules_for_fn calls_for_fn fn rhs
703 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
704 -- After the rules kick in we may get recursion, but
705 -- we rely on a global GlomBinds to sort that out later
706 -- See Note [Glom the bindings if imported functions are specialised]
707
708 -- Now specialise any cascaded calls
709 ; (rules2, spec_binds2) <- -- pprTrace "specImport 2" (ppr fn $$ ppr rules1 $$ ppr spec_binds1) $
710 specImports dflags this_mod top_env
711 (extendVarSet done fn)
712 (fn:callers)
713 (extendRuleBaseList rb rules1)
714 (ud_calls uds)
715
716 -- Don't forget to wrap the specialized bindings with bindings
717 -- for the needed dictionaries
718 -- See Note [Wrap bindings returned by specImports]
719 ; let final_binds = wrapDictBinds (ud_binds uds)
720 (spec_binds2 ++ spec_binds1)
721
722 ; return (rules2 ++ rules1, final_binds) }
723
724 | warnMissingSpecs dflags callers
725 = do { warnMsg (vcat [ hang (text "Could not specialise imported function" <+> quotes (ppr fn))
726 2 (vcat [ text "when specialising" <+> quotes (ppr caller)
727 | caller <- callers])
728 , ifPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
729 , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ])
730 ; return ([], []) }
731
732 | otherwise
733 = return ([], [])
734 where
735 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
736
737 warnMissingSpecs :: DynFlags -> [Id] -> Bool
738 -- See Note [Warning about missed specialisations]
739 warnMissingSpecs dflags callers
740 | wopt Opt_WarnAllMissedSpecs dflags = True
741 | not (wopt Opt_WarnMissedSpecs dflags) = False
742 | null callers = False
743 | otherwise = all has_inline_prag callers
744 where
745 has_inline_prag id = isAnyInlinePragma (idInlinePragma id)
746
747 wantSpecImport :: DynFlags -> Unfolding -> Bool
748 -- See Note [Specialise imported INLINABLE things]
749 wantSpecImport dflags unf
750 = case unf of
751 NoUnfolding -> False
752 BootUnfolding -> False
753 OtherCon {} -> False
754 DFunUnfolding {} -> True
755 CoreUnfolding { uf_src = src, uf_guidance = _guidance }
756 | gopt Opt_SpecialiseAggressively dflags -> True
757 | isStableSource src -> True
758 -- Specialise even INLINE things; it hasn't inlined yet,
759 -- so perhaps it never will. Moreover it may have calls
760 -- inside it that we want to specialise
761 | otherwise -> False -- Stable, not INLINE, hence INLINABLE
762
763 {- Note [Warning about missed specialisations]
764 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
765 Suppose
766 * In module Lib, you carefully mark a function 'foo' INLINABLE
767 * Import Lib(foo) into another module M
768 * Call 'foo' at some specialised type in M
769 Then you jolly well expect it to be specialised in M. But what if
770 'foo' calls another function 'Lib.bar'. Then you'd like 'bar' to be
771 specialised too. But if 'bar' is not marked INLINABLE it may well
772 not be specialised. The warning Opt_WarnMissedSpecs warns about this.
773
774 It's more noisy to warning about a missed specialisation opportunity
775 for /every/ overloaded imported function, but sometimes useful. That
776 is what Opt_WarnAllMissedSpecs does.
777
778 ToDo: warn about missed opportunities for local functions.
779
780 Note [Specialise imported INLINABLE things]
781 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
782 What imported functions do we specialise? The basic set is
783 * DFuns and things with INLINABLE pragmas.
784 but with -fspecialise-aggressively we add
785 * Anything with an unfolding template
786
787 Trac #8874 has a good example of why we want to auto-specialise DFuns.
788
789 We have the -fspecialise-aggressively flag (usually off), because we
790 risk lots of orphan modules from over-vigorous specialisation.
791 However it's not a big deal: anything non-recursive with an
792 unfolding-template will probably have been inlined already.
793
794 Note [Glom the bindings if imported functions are specialised]
795 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
796 Suppose we have an imported, *recursive*, INLINABLE function
797 f :: Eq a => a -> a
798 f = /\a \d x. ...(f a d)...
799 In the module being compiled we have
800 g x = f (x::Int)
801 Now we'll make a specialised function
802 f_spec :: Int -> Int
803 f_spec = \x -> ...(f Int dInt)...
804 {-# RULE f Int _ = f_spec #-}
805 g = \x. f Int dInt x
806 Note that f_spec doesn't look recursive
807 After rewriting with the RULE, we get
808 f_spec = \x -> ...(f_spec)...
809 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
810 The occurrence analyser never turns a NonRec into a Rec. So we must
811 make sure that f_spec is recursive. Easiest thing is to make all
812 the specialisations for imported bindings recursive.
813
814
815 Note [Avoiding recursive specialisation]
816 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
817 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
818 'f's RHS. So we want to specialise g,h. But we don't want to
819 specialise f any more! It's possible that f's RHS might have a
820 recursive yet-more-specialised call, so we'd diverge in that case.
821 And if the call is to the same type, one specialisation is enough.
822 Avoiding this recursive specialisation loop is the reason for the
823 'done' VarSet passed to specImports and specImport.
824
825 ************************************************************************
826 * *
827 \subsubsection{@specExpr@: the main function}
828 * *
829 ************************************************************************
830 -}
831
832 data SpecEnv
833 = SE { se_subst :: CoreSubst.Subst
834 -- We carry a substitution down:
835 -- a) we must clone any binding that might float outwards,
836 -- to avoid name clashes
837 -- b) we carry a type substitution to use when analysing
838 -- the RHS of specialised bindings (no type-let!)
839
840
841 , se_interesting :: VarSet
842 -- Dict Ids that we know something about
843 -- and hence may be worth specialising against
844 -- See Note [Interesting dictionary arguments]
845 }
846
847 specVar :: SpecEnv -> Id -> CoreExpr
848 specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v
849
850 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
851
852 ---------------- First the easy cases --------------------
853 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
854 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
855 specExpr env (Var v) = return (specVar env v, emptyUDs)
856 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
857 specExpr env (Cast e co)
858 = do { (e', uds) <- specExpr env e
859 ; return ((mkCast e' (substCo env co)), uds) }
860 specExpr env (Tick tickish body)
861 = do { (body', uds) <- specExpr env body
862 ; return (Tick (specTickish env tickish) body', uds) }
863
864 ---------------- Applications might generate a call instance --------------------
865 specExpr env expr@(App {})
866 = go expr []
867 where
868 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
869 (fun', uds_app) <- go fun (arg':args)
870 return (App fun' arg', uds_arg `plusUDs` uds_app)
871
872 go (Var f) args = case specVar env f of
873 Var f' -> return (Var f', mkCallUDs env f' args)
874 e' -> return (e', emptyUDs) -- I don't expect this!
875 go other _ = specExpr env other
876
877 ---------------- Lambda/case require dumping of usage details --------------------
878 specExpr env e@(Lam _ _) = do
879 (body', uds) <- specExpr env' body
880 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
881 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
882 where
883 (bndrs, body) = collectBinders e
884 (env', bndrs') = substBndrs env bndrs
885 -- More efficient to collect a group of binders together all at once
886 -- and we don't want to split a lambda group with dumped bindings
887
888 specExpr env (Case scrut case_bndr ty alts)
889 = do { (scrut', scrut_uds) <- specExpr env scrut
890 ; (scrut'', case_bndr', alts', alts_uds)
891 <- specCase env scrut' case_bndr alts
892 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
893 , scrut_uds `plusUDs` alts_uds) }
894
895 ---------------- Finally, let is the interesting case --------------------
896 specExpr env (Let bind body)
897 = do { -- Clone binders
898 (rhs_env, body_env, bind') <- cloneBindSM env bind
899
900 -- Deal with the body
901 ; (body', body_uds) <- specExpr body_env body
902
903 -- Deal with the bindings
904 ; (binds', uds) <- specBind rhs_env bind' body_uds
905
906 -- All done
907 ; return (foldr Let body' binds', uds) }
908
909 specTickish :: SpecEnv -> Tickish Id -> Tickish Id
910 specTickish env (Breakpoint ix ids)
911 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
912 -- drop vars from the list if they have a non-variable substitution.
913 -- should never happen, but it's harmless to drop them anyway.
914 specTickish _ other_tickish = other_tickish
915
916 specCase :: SpecEnv
917 -> CoreExpr -- Scrutinee, already done
918 -> Id -> [CoreAlt]
919 -> SpecM ( CoreExpr -- New scrutinee
920 , Id
921 , [CoreAlt]
922 , UsageDetails)
923 specCase env scrut' case_bndr [(con, args, rhs)]
924 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
925 , interestingDict env scrut'
926 , not (isDeadBinder case_bndr && null sc_args')
927 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
928
929 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
930 [(con, args', Var sc_arg')]
931 | sc_arg' <- sc_args' ]
932
933 -- Extend the substitution for RHS to map the *original* binders
934 -- to their floated verions.
935 mb_sc_flts :: [Maybe DictId]
936 mb_sc_flts = map (lookupVarEnv clone_env) args'
937 clone_env = zipVarEnv sc_args' sc_args_flt
938 subst_prs = (case_bndr, Var case_bndr_flt)
939 : [ (arg, Var sc_flt)
940 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
941 env_rhs' = env_rhs { se_subst = CoreSubst.extendIdSubstList (se_subst env_rhs) subst_prs
942 , se_interesting = se_interesting env_rhs `extendVarSetList`
943 (case_bndr_flt : sc_args_flt) }
944
945 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
946 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
947 case_bndr_set = unitVarSet case_bndr_flt
948 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
949 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
950 flt_binds = scrut_bind : sc_binds
951 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
952 all_uds = flt_binds `addDictBinds` free_uds
953 alt' = (con, args', wrapDictBindsE dumped_dbs rhs')
954 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
955 where
956 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
957 sc_args' = filter is_flt_sc_arg args'
958
959 clone_me bndr = do { uniq <- getUniqueM
960 ; return (mkUserLocalOrCoVar occ uniq ty loc) }
961 where
962 name = idName bndr
963 ty = idType bndr
964 occ = nameOccName name
965 loc = getSrcSpan name
966
967 arg_set = mkVarSet args'
968 is_flt_sc_arg var = isId var
969 && not (isDeadBinder var)
970 && isDictTy var_ty
971 && not (tyCoVarsOfType var_ty `intersectsVarSet` arg_set)
972 where
973 var_ty = idType var
974
975
976 specCase env scrut case_bndr alts
977 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
978 ; return (scrut, case_bndr', alts', uds_alts) }
979 where
980 (env_alt, case_bndr') = substBndr env case_bndr
981 spec_alt (con, args, rhs) = do
982 (rhs', uds) <- specExpr env_rhs rhs
983 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
984 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
985 where
986 (env_rhs, args') = substBndrs env_alt args
987
988 {-
989 Note [Floating dictionaries out of cases]
990 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
991 Consider
992 g = \d. case d of { MkD sc ... -> ...(f sc)... }
993 Naively we can't float d2's binding out of the case expression,
994 because 'sc' is bound by the case, and that in turn means we can't
995 specialise f, which seems a pity.
996
997 So we invert the case, by floating out a binding
998 for 'sc_flt' thus:
999 sc_flt = case d of { MkD sc ... -> sc }
1000 Now we can float the call instance for 'f'. Indeed this is just
1001 what'll happen if 'sc' was originally bound with a let binding,
1002 but case is more efficient, and necessary with equalities. So it's
1003 good to work with both.
1004
1005 You might think that this won't make any difference, because the
1006 call instance will only get nuked by the \d. BUT if 'g' itself is
1007 specialised, then transitively we should be able to specialise f.
1008
1009 In general, given
1010 case e of cb { MkD sc ... -> ...(f sc)... }
1011 we transform to
1012 let cb_flt = e
1013 sc_flt = case cb_flt of { MkD sc ... -> sc }
1014 in
1015 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1016
1017 The "_flt" things are the floated binds; we use the current substitution
1018 to substitute sc -> sc_flt in the RHS
1019
1020 ************************************************************************
1021 * *
1022 Dealing with a binding
1023 * *
1024 ************************************************************************
1025 -}
1026
1027 specBind :: SpecEnv -- Use this for RHSs
1028 -> CoreBind -- Binders are already cloned by cloneBindSM,
1029 -- but RHSs are un-processed
1030 -> UsageDetails -- Info on how the scope of the binding
1031 -> SpecM ([CoreBind], -- New bindings
1032 UsageDetails) -- And info to pass upstream
1033
1034 -- Returned UsageDetails:
1035 -- No calls for binders of this bind
1036 specBind rhs_env (NonRec fn rhs) body_uds
1037 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1038 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds fn rhs
1039
1040 ; let pairs = spec_defns ++ [(fn', rhs')]
1041 -- fn' mentions the spec_defns in its rules,
1042 -- so put the latter first
1043
1044 combined_uds = body_uds1 `plusUDs` rhs_uds
1045 -- This way round a call in rhs_uds of a function f
1046 -- at type T will override a call of f at T in body_uds1; and
1047 -- that is good because it'll tend to keep "earlier" calls
1048 -- See Note [Specialisation of dictionary functions]
1049
1050 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1051 -- See Note [From non-recursive to recursive]
1052
1053 final_binds :: [DictBind]
1054 final_binds
1055 | isEmptyBag dump_dbs = [mkDB $ NonRec b r | (b,r) <- pairs]
1056 | otherwise = [flattenDictBinds dump_dbs pairs]
1057
1058 ; if float_all then
1059 -- Rather than discard the calls mentioning the bound variables
1060 -- we float this binding along with the others
1061 return ([], free_uds `snocDictBinds` final_binds)
1062 else
1063 -- No call in final_uds mentions bound variables,
1064 -- so we can just leave the binding here
1065 return (map fst final_binds, free_uds) }
1066
1067
1068 specBind rhs_env (Rec pairs) body_uds
1069 -- Note [Specialising a recursive group]
1070 = do { let (bndrs,rhss) = unzip pairs
1071 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1072 ; let scope_uds = body_uds `plusUDs` rhs_uds
1073 -- Includes binds and calls arising from rhss
1074
1075 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1076
1077 ; (bndrs3, spec_defns3, uds3)
1078 <- if null spec_defns1 -- Common case: no specialisation
1079 then return (bndrs1, [], uds1)
1080 else do { -- Specialisation occurred; do it again
1081 (bndrs2, spec_defns2, uds2)
1082 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1083 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1084
1085 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1086 bind = flattenDictBinds dumped_dbs
1087 (spec_defns3 ++ zip bndrs3 rhss')
1088
1089 ; if float_all then
1090 return ([], final_uds `snocDictBind` bind)
1091 else
1092 return ([fst bind], final_uds) }
1093
1094
1095 ---------------------------
1096 specDefns :: SpecEnv
1097 -> UsageDetails -- Info on how it is used in its scope
1098 -> [(OutId,InExpr)] -- The things being bound and their un-processed RHS
1099 -> SpecM ([OutId], -- Original Ids with RULES added
1100 [(OutId,OutExpr)], -- Extra, specialised bindings
1101 UsageDetails) -- Stuff to fling upwards from the specialised versions
1102
1103 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1104 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1105 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1106 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1107 -- But not vice versa (it's a fixpoint problem).
1108
1109 specDefns _env uds []
1110 = return ([], [], uds)
1111 specDefns env uds ((bndr,rhs):pairs)
1112 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1113 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1114 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1115
1116 ---------------------------
1117 specDefn :: SpecEnv
1118 -> UsageDetails -- Info on how it is used in its scope
1119 -> OutId -> InExpr -- The thing being bound and its un-processed RHS
1120 -> SpecM (Id, -- Original Id with added RULES
1121 [(Id,CoreExpr)], -- Extra, specialised bindings
1122 UsageDetails) -- Stuff to fling upwards from the specialised versions
1123
1124 specDefn env body_uds fn rhs
1125 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1126 rules_for_me = idCoreRules fn
1127 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1128 calls_for_me fn rhs
1129 ; return ( fn `addIdSpecialisations` rules
1130 , spec_defns
1131 , body_uds_without_me `plusUDs` spec_uds) }
1132 -- It's important that the `plusUDs` is this way
1133 -- round, because body_uds_without_me may bind
1134 -- dictionaries that are used in calls_for_me passed
1135 -- to specDefn. So the dictionary bindings in
1136 -- spec_uds may mention dictionaries bound in
1137 -- body_uds_without_me
1138
1139 ---------------------------
1140 specCalls :: Maybe Module -- Just this_mod => specialising imported fn
1141 -- Nothing => specialising local fn
1142 -> SpecEnv
1143 -> [CoreRule] -- Existing RULES for the fn
1144 -> [CallInfo]
1145 -> OutId -> InExpr
1146 -> SpecM ([CoreRule], -- New RULES for the fn
1147 [(Id,CoreExpr)], -- Extra, specialised bindings
1148 UsageDetails) -- New usage details from the specialised RHSs
1149
1150 -- This function checks existing rules, and does not create
1151 -- duplicate ones. So the caller does not need to do this filtering.
1152 -- See 'already_covered'
1153
1154 specCalls mb_mod env rules_for_me calls_for_me fn rhs
1155 -- The first case is the interesting one
1156 | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas
1157 && rhs_bndrs1 `lengthAtLeast` n_dicts -- and enough dict args
1158 && notNull calls_for_me -- And there are some calls to specialise
1159 && not (isNeverActive (idInlineActivation fn))
1160 -- Don't specialise NOINLINE things
1161 -- See Note [Auto-specialisation and RULES]
1162
1163 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1164 -- See Note [Inline specialisation] for why we do not
1165 -- switch off specialisation for inline functions
1166
1167 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr rules_for_me) $
1168 do { stuff <- mapM spec_call calls_for_me
1169 ; let (spec_defns, spec_uds, spec_rules) = unzip3 (catMaybes stuff)
1170 ; return (spec_rules, spec_defns, plusUDList spec_uds) }
1171
1172 | otherwise -- No calls or RHS doesn't fit our preconceptions
1173 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1174 text "Missed specialisation opportunity for"
1175 <+> ppr fn $$ _trace_doc )
1176 -- Note [Specialisation shape]
1177 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1178 return ([], [], emptyUDs)
1179 where
1180 _trace_doc = sep [ ppr rhs_tyvars, ppr n_tyvars
1181 , ppr rhs_bndrs, ppr n_dicts
1182 , ppr (idInlineActivation fn) ]
1183
1184 fn_type = idType fn
1185 fn_arity = idArity fn
1186 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1187 (tyvars, theta, _) = tcSplitSigmaTy fn_type
1188 n_tyvars = length tyvars
1189 n_dicts = length theta
1190 inl_prag = idInlinePragma fn
1191 inl_act = inlinePragmaActivation inl_prag
1192 is_local = isLocalId fn
1193
1194 -- Figure out whether the function has an INLINE pragma
1195 -- See Note [Inline specialisations]
1196
1197 (rhs_bndrs, rhs_body) = CoreSubst.collectBindersPushingCo rhs
1198 -- See Note [Account for casts in binding]
1199 (rhs_tyvars, rhs_bndrs1) = span isTyVar rhs_bndrs
1200 (rhs_dict_ids, rhs_bndrs2) = splitAt n_dicts rhs_bndrs1
1201 body = mkLams rhs_bndrs2 rhs_body
1202 -- Glue back on the non-dict lambdas
1203
1204 already_covered :: DynFlags -> [CoreExpr] -> Bool
1205 already_covered dflags args -- Note [Specialisations already covered]
1206 = isJust (lookupRule dflags
1207 (CoreSubst.substInScope (se_subst env), realIdUnfolding)
1208 (const True)
1209 fn args rules_for_me)
1210
1211 mk_ty_args :: [Maybe Type] -> [TyVar] -> [CoreExpr]
1212 mk_ty_args [] poly_tvs
1213 = ASSERT( null poly_tvs ) []
1214 mk_ty_args (Nothing : call_ts) (poly_tv : poly_tvs)
1215 = Type (mkTyVarTy poly_tv) : mk_ty_args call_ts poly_tvs
1216 mk_ty_args (Just ty : call_ts) poly_tvs
1217 = Type ty : mk_ty_args call_ts poly_tvs
1218 mk_ty_args (Nothing : _) [] = panic "mk_ty_args"
1219
1220 ----------------------------------------------------------
1221 -- Specialise to one particular call pattern
1222 spec_call :: CallInfo -- Call instance
1223 -> SpecM (Maybe ((Id,CoreExpr), -- Specialised definition
1224 UsageDetails, -- Usage details from specialised body
1225 CoreRule)) -- Info for the Id's SpecEnv
1226 spec_call _call_info@(CallKey call_ts, (call_ds, _))
1227 = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts )
1228
1229 -- Suppose f's defn is f = /\ a b c -> \ d1 d2 -> rhs
1230 -- Suppose the call is for f [Just t1, Nothing, Just t3] [dx1, dx2]
1231
1232 -- Construct the new binding
1233 -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b -> rhs)
1234 -- PLUS the rule
1235 -- RULE "SPEC f" forall b d1' d2'. f b d1' d2' = f1 b
1236 -- In the rule, d1' and d2' are just wildcards, not used in the RHS
1237 -- PLUS the usage-details
1238 -- { d1' = dx1; d2' = dx2 }
1239 -- where d1', d2' are cloned versions of d1,d2, with the type substitution
1240 -- applied. These auxiliary bindings just avoid duplication of dx1, dx2
1241 --
1242 -- Note that the substitution is applied to the whole thing.
1243 -- This is convenient, but just slightly fragile. Notably:
1244 -- * There had better be no name clashes in a/b/c
1245 do { let
1246 -- poly_tyvars = [b] in the example above
1247 -- spec_tyvars = [a,c]
1248 -- ty_args = [t1,b,t3]
1249 spec_tv_binds = [(tv,ty) | (tv, Just ty) <- rhs_tyvars `zip` call_ts]
1250 env1 = extendTvSubstList env spec_tv_binds
1251 (rhs_env, poly_tyvars) = substBndrs env1
1252 [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
1253
1254 -- Clone rhs_dicts, including instantiating their types
1255 ; inst_dict_ids <- mapM (newDictBndr rhs_env) rhs_dict_ids
1256 ; let (rhs_env2, dx_binds, spec_dict_args)
1257 = bindAuxiliaryDicts rhs_env rhs_dict_ids call_ds inst_dict_ids
1258 ty_args = mk_ty_args call_ts poly_tyvars
1259 ev_args = map varToCoreExpr inst_dict_ids -- ev_args, ev_bndrs:
1260 ev_bndrs = exprsFreeIdsList ev_args -- See Note [Evidence foralls]
1261 rule_args = ty_args ++ ev_args
1262 rule_bndrs = poly_tyvars ++ ev_bndrs
1263
1264 ; dflags <- getDynFlags
1265 ; if already_covered dflags rule_args then
1266 return Nothing
1267 else -- pprTrace "spec_call" (vcat [ ppr _call_info, ppr fn, ppr rhs_dict_ids
1268 -- , text "rhs_env2" <+> ppr (se_subst rhs_env2)
1269 -- , ppr dx_binds ]) $
1270 do
1271 { -- Figure out the type of the specialised function
1272 let body_ty = applyTypeToArgs rhs fn_type rule_args
1273 (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted
1274 | isUnliftedType body_ty -- C.f. WwLib.mkWorkerArgs
1275 , not (isJoinId fn)
1276 = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [voidPrimId])
1277 | otherwise = (poly_tyvars, poly_tyvars)
1278 spec_id_ty = mkLamTypes lam_args body_ty
1279 join_arity_change = length app_args - length rule_args
1280 spec_join_arity | Just orig_join_arity <- isJoinId_maybe fn
1281 = Just (orig_join_arity + join_arity_change)
1282 | otherwise
1283 = Nothing
1284
1285 ; spec_f <- newSpecIdSM fn spec_id_ty spec_join_arity
1286 ; (spec_rhs, rhs_uds) <- specExpr rhs_env2 (mkLams lam_args body)
1287 ; this_mod <- getModule
1288 ; let
1289 -- The rule to put in the function's specialisation is:
1290 -- forall b, d1',d2'. f t1 b t3 d1' d2' = f1 b
1291 herald = case mb_mod of
1292 Nothing -- Specialising local fn
1293 -> text "SPEC"
1294 Just this_mod -- Specialising imoprted fn
1295 -> text "SPEC/" <> ppr this_mod
1296
1297 rule_name = mkFastString $ showSDoc dflags $
1298 herald <+> ftext (occNameFS (getOccName fn))
1299 <+> hsep (map ppr_call_key_ty call_ts)
1300 -- This name ends up in interface files, so use occNameString.
1301 -- Otherwise uniques end up there, making builds
1302 -- less deterministic (See #4012 comment:61 ff)
1303
1304 rule_wout_eta = mkRule
1305 this_mod
1306 True {- Auto generated -}
1307 is_local
1308 rule_name
1309 inl_act -- Note [Auto-specialisation and RULES]
1310 (idName fn)
1311 rule_bndrs
1312 rule_args
1313 (mkVarApps (Var spec_f) app_args)
1314
1315 spec_env_rule
1316 = case isJoinId_maybe fn of
1317 Just join_arity -> etaExpandToJoinPointRule join_arity
1318 rule_wout_eta
1319 Nothing -> rule_wout_eta
1320
1321 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1322 final_uds = foldr consDictBind rhs_uds dx_binds
1323
1324 --------------------------------------
1325 -- Add a suitable unfolding if the spec_inl_prag says so
1326 -- See Note [Inline specialisations]
1327 (spec_inl_prag, spec_unf)
1328 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1329 = (neverInlinePragma, noUnfolding)
1330 -- See Note [Specialising imported functions] in OccurAnal
1331
1332 | InlinePragma { inl_inline = Inlinable } <- inl_prag
1333 = (inl_prag { inl_inline = EmptyInlineSpec }, noUnfolding)
1334
1335 | otherwise
1336 = (inl_prag, specUnfolding poly_tyvars spec_app
1337 arity_decrease fn_unf)
1338
1339 arity_decrease = length spec_dict_args
1340 spec_app e = (e `mkApps` ty_args) `mkApps` spec_dict_args
1341
1342 --------------------------------------
1343 -- Adding arity information just propagates it a bit faster
1344 -- See Note [Arity decrease] in Simplify
1345 -- Copy InlinePragma information from the parent Id.
1346 -- So if f has INLINE[1] so does spec_f
1347 spec_f_w_arity = spec_f `setIdArity` max 0 (fn_arity - n_dicts)
1348 `setInlinePragma` spec_inl_prag
1349 `setIdUnfolding` spec_unf
1350 `asJoinId_maybe` spec_join_arity
1351
1352 ; return (Just ((spec_f_w_arity, spec_rhs), final_uds, spec_env_rule)) } }
1353
1354 {- Note [Account for casts in binding]
1355 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1356 Consider
1357 f :: Eq a => a -> IO ()
1358 {-# INLINABLE f
1359 StableUnf = (/\a \(d:Eq a) (x:a). blah) |> g
1360 #-}
1361 f = ...
1362
1363 In f's stable unfolding we have done some modest simplification which
1364 has pushed the cast to the outside. (I wonder if this is the Right
1365 Thing, but it's what happens now; see SimplUtils Note [Casts and
1366 lambdas].) Now that stable unfolding must be specialised, so we want
1367 to push the cast back inside. It would be terrible if the cast
1368 defeated specialisation! Hence the use of collectBindersPushingCo.
1369
1370 Note [Evidence foralls]
1371 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1372 Suppose (Trac #12212) that we are specialising
1373 f :: forall a b. (Num a, F a ~ F b) => blah
1374 with a=b=Int. Then the RULE will be something like
1375 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1376 f Int Int d g = f_spec
1377 But both varToCoreExpr (when constructing the LHS args), and the
1378 simplifier (when simplifying the LHS args), will transform to
1379 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1380 f Int Int d <F Int> = f_spec
1381 by replacing g with Refl. So now 'g' is unbound, which results in a later
1382 crash. So we use Refl right off the bat, and do not forall-quantify 'g':
1383 * varToCoreExpr generates a Refl
1384 * exprsFreeIdsList returns the Ids bound by the args,
1385 which won't include g
1386
1387 You might wonder if this will match as often, but the simplifier replaces
1388 complicated Refl coercions with Refl pretty aggressively.
1389
1390 Note [Orphans and auto-generated rules]
1391 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1392 When we specialise an INLINABLE function, or when we have
1393 -fspecialise-aggressively, we auto-generate RULES that are orphans.
1394 We don't want to warn about these, or we'd generate a lot of warnings.
1395 Thus, we only warn about user-specified orphan rules.
1396
1397 Indeed, we don't even treat the module as an orphan module if it has
1398 auto-generated *rule* orphans. Orphan modules are read every time we
1399 compile, so they are pretty obtrusive and slow down every compilation,
1400 even non-optimised ones. (Reason: for type class instances it's a
1401 type correctness issue.) But specialisation rules are strictly for
1402 *optimisation* only so it's fine not to read the interface.
1403
1404 What this means is that a SPEC rules from auto-specialisation in
1405 module M will be used in other modules only if M.hi has been read for
1406 some other reason, which is actually pretty likely.
1407 -}
1408
1409 bindAuxiliaryDicts
1410 :: SpecEnv
1411 -> [DictId] -> [CoreExpr] -- Original dict bndrs, and the witnessing expressions
1412 -> [DictId] -- A cloned dict-id for each dict arg
1413 -> (SpecEnv, -- Substitute for all orig_dicts
1414 [DictBind], -- Auxiliary dict bindings
1415 [CoreExpr]) -- Witnessing expressions (all trivial)
1416 -- Bind any dictionary arguments to fresh names, to preserve sharing
1417 bindAuxiliaryDicts env@(SE { se_subst = subst, se_interesting = interesting })
1418 orig_dict_ids call_ds inst_dict_ids
1419 = (env', dx_binds, spec_dict_args)
1420 where
1421 (dx_binds, spec_dict_args) = go call_ds inst_dict_ids
1422 env' = env { se_subst = subst `CoreSubst.extendSubstList`
1423 (orig_dict_ids `zip` spec_dict_args)
1424 `CoreSubst.extendInScopeList` dx_ids
1425 , se_interesting = interesting `unionVarSet` interesting_dicts }
1426
1427 dx_ids = [dx_id | (NonRec dx_id _, _) <- dx_binds]
1428 interesting_dicts = mkVarSet [ dx_id | (NonRec dx_id dx, _) <- dx_binds
1429 , interestingDict env dx ]
1430 -- See Note [Make the new dictionaries interesting]
1431
1432 go :: [CoreExpr] -> [CoreBndr] -> ([DictBind], [CoreExpr])
1433 go [] _ = ([], [])
1434 go (dx:dxs) (dx_id:dx_ids)
1435 | exprIsTrivial dx = (dx_binds, dx : args)
1436 | otherwise = (mkDB (NonRec dx_id dx) : dx_binds, Var dx_id : args)
1437 where
1438 (dx_binds, args) = go dxs dx_ids
1439 -- In the first case extend the substitution but not bindings;
1440 -- in the latter extend the bindings but not the substitution.
1441 -- For the former, note that we bind the *original* dict in the substitution,
1442 -- overriding any d->dx_id binding put there by substBndrs
1443 go _ _ = pprPanic "bindAuxiliaryDicts" (ppr orig_dict_ids $$ ppr call_ds $$ ppr inst_dict_ids)
1444
1445 {-
1446 Note [Make the new dictionaries interesting]
1447 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1448 Important! We're going to substitute dx_id1 for d
1449 and we want it to look "interesting", else we won't gather *any*
1450 consequential calls. E.g.
1451 f d = ...g d....
1452 If we specialise f for a call (f (dfun dNumInt)), we'll get
1453 a consequent call (g d') with an auxiliary definition
1454 d' = df dNumInt
1455 We want that consequent call to look interesting
1456
1457
1458 Note [From non-recursive to recursive]
1459 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1460 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1461 have built a recursive knot
1462
1463 f a d x = <blah>
1464 MkUD { ud_binds = d7 = MkD ..f..
1465 , ud_calls = ...(f T d7)... }
1466
1467 The we generate
1468
1469 Rec { fs x = <blah>[T/a, d7/d]
1470 f a d x = <blah>
1471 RULE f T _ = fs
1472 d7 = ...f... }
1473
1474 Here the recursion is only through the RULE.
1475
1476
1477 Note [Specialisation of dictionary functions]
1478 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1479 Here is a nasty example that bit us badly: see Trac #3591
1480
1481 class Eq a => C a
1482 instance Eq [a] => C [a]
1483
1484 ---------------
1485 dfun :: Eq [a] -> C [a]
1486 dfun a d = MkD a d (meth d)
1487
1488 d4 :: Eq [T] = <blah>
1489 d2 :: C [T] = dfun T d4
1490 d1 :: Eq [T] = $p1 d2
1491 d3 :: C [T] = dfun T d1
1492
1493 None of these definitions is recursive. What happened was that we
1494 generated a specialisation:
1495
1496 RULE forall d. dfun T d = dT :: C [T]
1497 dT = (MkD a d (meth d)) [T/a, d1/d]
1498 = MkD T d1 (meth d1)
1499
1500 But now we use the RULE on the RHS of d2, to get
1501
1502 d2 = dT = MkD d1 (meth d1)
1503 d1 = $p1 d2
1504
1505 and now d1 is bottom! The problem is that when specialising 'dfun' we
1506 should first dump "below" the binding all floated dictionary bindings
1507 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1508 placed below 'dfun', and thus unavailable to it when specialising
1509 'dfun'. That in turn means that the call (dfun T d1) must be
1510 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1511 d4 doesn't mention dfun.
1512
1513 But look at this:
1514
1515 class C a where { foo,bar :: [a] -> [a] }
1516
1517 instance C Int where
1518 foo x = r_bar x
1519 bar xs = reverse xs
1520
1521 r_bar :: C a => [a] -> [a]
1522 r_bar xs = bar (xs ++ xs)
1523
1524 That translates to:
1525
1526 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1527
1528 Rec { $fCInt :: C Int = MkC foo_help reverse
1529 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1530
1531 The call (r_bar $fCInt) mentions $fCInt,
1532 which mentions foo_help,
1533 which mentions r_bar
1534 But we DO want to specialise r_bar at Int:
1535
1536 Rec { $fCInt :: C Int = MkC foo_help reverse
1537 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1538
1539 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1540 RULE r_bar Int _ = r_bar_Int
1541
1542 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1543 }
1544
1545 Note that, because of its RULE, r_bar joins the recursive
1546 group. (In this case it'll unravel a short moment later.)
1547
1548
1549 Conclusion: we catch the nasty case using filter_dfuns in
1550 callsForMe. To be honest I'm not 100% certain that this is 100%
1551 right, but it works. Sigh.
1552
1553
1554 Note [Specialising a recursive group]
1555 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1556 Consider
1557 let rec { f x = ...g x'...
1558 ; g y = ...f y'.... }
1559 in f 'a'
1560 Here we specialise 'f' at Char; but that is very likely to lead to
1561 a specialisation of 'g' at Char. We must do the latter, else the
1562 whole point of specialisation is lost.
1563
1564 But we do not want to keep iterating to a fixpoint, because in the
1565 presence of polymorphic recursion we might generate an infinite number
1566 of specialisations.
1567
1568 So we use the following heuristic:
1569 * Arrange the rec block in dependency order, so far as possible
1570 (the occurrence analyser already does this)
1571
1572 * Specialise it much like a sequence of lets
1573
1574 * Then go through the block a second time, feeding call-info from
1575 the RHSs back in the bottom, as it were
1576
1577 In effect, the ordering maxmimises the effectiveness of each sweep,
1578 and we do just two sweeps. This should catch almost every case of
1579 monomorphic recursion -- the exception could be a very knotted-up
1580 recursion with multiple cycles tied up together.
1581
1582 This plan is implemented in the Rec case of specBindItself.
1583
1584 Note [Specialisations already covered]
1585 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1586 We obviously don't want to generate two specialisations for the same
1587 argument pattern. There are two wrinkles
1588
1589 1. We do the already-covered test in specDefn, not when we generate
1590 the CallInfo in mkCallUDs. We used to test in the latter place, but
1591 we now iterate the specialiser somewhat, and the Id at the call site
1592 might therefore not have all the RULES that we can see in specDefn
1593
1594 2. What about two specialisations where the second is an *instance*
1595 of the first? If the more specific one shows up first, we'll generate
1596 specialisations for both. If the *less* specific one shows up first,
1597 we *don't* currently generate a specialisation for the more specific
1598 one. (See the call to lookupRule in already_covered.) Reasons:
1599 (a) lookupRule doesn't say which matches are exact (bad reason)
1600 (b) if the earlier specialisation is user-provided, it's
1601 far from clear that we should auto-specialise further
1602
1603 Note [Auto-specialisation and RULES]
1604 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1605 Consider:
1606 g :: Num a => a -> a
1607 g = ...
1608
1609 f :: (Int -> Int) -> Int
1610 f w = ...
1611 {-# RULE f g = 0 #-}
1612
1613 Suppose that auto-specialisation makes a specialised version of
1614 g::Int->Int That version won't appear in the LHS of the RULE for f.
1615 So if the specialisation rule fires too early, the rule for f may
1616 never fire.
1617
1618 It might be possible to add new rules, to "complete" the rewrite system.
1619 Thus when adding
1620 RULE forall d. g Int d = g_spec
1621 also add
1622 RULE f g_spec = 0
1623
1624 But that's a bit complicated. For now we ask the programmer's help,
1625 by *copying the INLINE activation pragma* to the auto-specialised
1626 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1627 will also not be active until phase 2. And that's what programmers
1628 should jolly well do anyway, even aside from specialisation, to ensure
1629 that g doesn't inline too early.
1630
1631 This in turn means that the RULE would never fire for a NOINLINE
1632 thing so not much point in generating a specialisation at all.
1633
1634 Note [Specialisation shape]
1635 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1636 We only specialise a function if it has visible top-level lambdas
1637 corresponding to its overloading. E.g. if
1638 f :: forall a. Eq a => ....
1639 then its body must look like
1640 f = /\a. \d. ...
1641
1642 Reason: when specialising the body for a call (f ty dexp), we want to
1643 substitute dexp for d, and pick up specialised calls in the body of f.
1644
1645 This doesn't always work. One example I came across was this:
1646 newtype Gen a = MkGen{ unGen :: Int -> a }
1647
1648 choose :: Eq a => a -> Gen a
1649 choose n = MkGen (\r -> n)
1650
1651 oneof = choose (1::Int)
1652
1653 It's a silly exapmle, but we get
1654 choose = /\a. g `cast` co
1655 where choose doesn't have any dict arguments. Thus far I have not
1656 tried to fix this (wait till there's a real example).
1657
1658 Mind you, then 'choose' will be inlined (since RHS is trivial) so
1659 it doesn't matter. This comes up with single-method classes
1660
1661 class C a where { op :: a -> a }
1662 instance C a => C [a] where ....
1663 ==>
1664 $fCList :: C a => C [a]
1665 $fCList = $copList |> (...coercion>...)
1666 ....(uses of $fCList at particular types)...
1667
1668 So we suppress the WARN if the rhs is trivial.
1669
1670 Note [Inline specialisations]
1671 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1672 Here is what we do with the InlinePragma of the original function
1673 * Activation/RuleMatchInfo: both transferred to the
1674 specialised function
1675 * InlineSpec:
1676 (a) An INLINE pragma is transferred
1677 (b) An INLINABLE pragma is *not* transferred
1678
1679 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
1680 specialise the function at its call site, and arguably that's not so
1681 important for the specialised copies. BUT *pragma-directed*
1682 specialisation now takes place in the typechecker/desugarer, with
1683 manually specified INLINEs. The specialisation here is automatic.
1684 It'd be very odd if a function marked INLINE was specialised (because
1685 of some local use), and then forever after (including importing
1686 modules) the specialised version wasn't INLINEd. After all, the
1687 programmer said INLINE!
1688
1689 You might wonder why we specialise INLINE functions at all. After
1690 all they should be inlined, right? Two reasons:
1691
1692 * Even INLINE functions are sometimes not inlined, when they aren't
1693 applied to interesting arguments. But perhaps the type arguments
1694 alone are enough to specialise (even though the args are too boring
1695 to trigger inlining), and it's certainly better to call the
1696 specialised version.
1697
1698 * The RHS of an INLINE function might call another overloaded function,
1699 and we'd like to generate a specialised version of that function too.
1700 This actually happens a lot. Consider
1701 replicateM_ :: (Monad m) => Int -> m a -> m ()
1702 {-# INLINABLE replicateM_ #-}
1703 replicateM_ d x ma = ...
1704 The strictness analyser may transform to
1705 replicateM_ :: (Monad m) => Int -> m a -> m ()
1706 {-# INLINE replicateM_ #-}
1707 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
1708
1709 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
1710 {-# INLINABLE $wreplicateM_ #-}
1711 $wreplicateM_ = ...
1712 Now an importing module has a specialised call to replicateM_, say
1713 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
1714 This particular example had a huge effect on the call to replicateM_
1715 in nofib/shootout/n-body.
1716
1717 Why (b): discard INLINABLE pragmas? See Trac #4874 for persuasive examples.
1718 Suppose we have
1719 {-# INLINABLE f #-}
1720 f :: Ord a => [a] -> Int
1721 f xs = letrec f' = ...f'... in f'
1722 Then, when f is specialised and optimised we might get
1723 wgo :: [Int] -> Int#
1724 wgo = ...wgo...
1725 f_spec :: [Int] -> Int
1726 f_spec xs = case wgo xs of { r -> I# r }
1727 and we clearly want to inline f_spec at call sites. But if we still
1728 have the big, un-optimised of f (albeit specialised) captured in an
1729 INLINABLE pragma for f_spec, we won't get that optimisation.
1730
1731 So we simply drop INLINABLE pragmas when specialising. It's not really
1732 a complete solution; ignoring specialisation for now, INLINABLE functions
1733 don't get properly strictness analysed, for example. But it works well
1734 for examples involving specialisation, which is the dominant use of
1735 INLINABLE. See Trac #4874.
1736
1737
1738 ************************************************************************
1739 * *
1740 \subsubsection{UsageDetails and suchlike}
1741 * *
1742 ************************************************************************
1743 -}
1744
1745 data UsageDetails
1746 = MkUD {
1747 ud_binds :: !(Bag DictBind),
1748 -- Floated dictionary bindings
1749 -- The order is important;
1750 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
1751 -- (Remember, Bags preserve order in GHC.)
1752
1753 ud_calls :: !CallDetails
1754
1755 -- INVARIANT: suppose bs = bindersOf ud_binds
1756 -- Then 'calls' may *mention* 'bs',
1757 -- but there should be no calls *for* bs
1758 }
1759
1760 instance Outputable UsageDetails where
1761 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
1762 = text "MkUD" <+> braces (sep (punctuate comma
1763 [text "binds" <+> equals <+> ppr dbs,
1764 text "calls" <+> equals <+> ppr calls]))
1765
1766 -- | A 'DictBind' is a binding along with a cached set containing its free
1767 -- variables (both type variables and dictionaries)
1768 type DictBind = (CoreBind, VarSet)
1769
1770 type DictExpr = CoreExpr
1771
1772 emptyUDs :: UsageDetails
1773 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv }
1774
1775 ------------------------------------------------------------
1776 type CallDetails = DIdEnv CallInfoSet
1777 -- The order of specialized binds and rules depends on how we linearize
1778 -- CallDetails, so to get determinism we must use a deterministic set here.
1779 -- See Note [Deterministic UniqFM] in UniqDFM
1780 newtype CallKey = CallKey [Maybe Type]
1781 -- Nothing => unconstrained type argument
1782
1783 data CallInfoSet = CIS Id (Bag CallInfo)
1784 -- The list of types and dictionaries is guaranteed to
1785 -- match the type of f
1786
1787 {-
1788 Note [CallInfoSet determinism]
1789 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1790 CallInfoSet holds a Bag of (CallKey, [DictExpr], VarSet) triplets for a given
1791 Id. They represent the types that the function is instantiated at along with
1792 the dictionaries and free variables.
1793
1794 We use this information to generate specialized versions of a given function.
1795 CallInfoSet used to be defined as:
1796
1797 data CallInfoSet = CIS Id (Map CallKey ([DictExpr], VarSet))
1798
1799 Unfortunately this was not deterministic. The Ord instance of CallKey was
1800 defined in terms of nonDetCmpType which is not deterministic.
1801 See Note [nonDetCmpType nondeterminism].
1802 The end result was that if the function had multiple specializations they would
1803 be generated in arbitrary order.
1804
1805 We need a container that:
1806 a) when turned into a list has only one element per each CallKey and the list
1807 has deterministic order
1808 b) supports union
1809 c) supports singleton
1810 d) supports filter
1811
1812 We can't use UniqDFM here because there's no one Unique that we can key on.
1813
1814 The current approach is to implement the set as a Bag with duplicates.
1815 This makes b), c), d) trivial and pushes a) towards the end. The deduplication
1816 is done by using a TrieMap for membership tests on CallKey. This lets us delete
1817 the nondeterministic Ord CallKey instance.
1818
1819 An alternative approach would be to augment the Map the same way that UniqDFM
1820 is augmented, by keeping track of insertion order and using it to order the
1821 resulting lists. It would mean keeping the nondeterministic Ord CallKey
1822 instance making it easy to reintroduce nondeterminism in the future.
1823 -}
1824
1825 ciSetToList :: CallInfoSet -> [CallInfo]
1826 ciSetToList (CIS _ b) = snd $ foldrBag combine (emptyTM, []) b
1827 where
1828 -- This is where we eliminate duplicates, recording the CallKeys we've
1829 -- already seen in the TrieMap. See Note [CallInfoSet determinism].
1830 combine :: CallInfo -> (CallKeySet, [CallInfo]) -> (CallKeySet, [CallInfo])
1831 combine ci@(CallKey key, _) (set, acc)
1832 | Just _ <- lookupTM key set = (set, acc)
1833 | otherwise = (insertTM key () set, ci:acc)
1834
1835 type CallKeySet = ListMap (MaybeMap TypeMap) ()
1836 -- We only use it in ciSetToList to check for membership
1837
1838 ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet
1839 ciSetFilter p (CIS id a) = CIS id (filterBag p a)
1840
1841 type CallInfo = (CallKey, ([DictExpr], VarSet))
1842 -- Range is dict args and the vars of the whole
1843 -- call (including tyvars)
1844 -- [*not* include the main id itself, of course]
1845
1846 instance Outputable CallInfoSet where
1847 ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
1848 2 (ppr map)
1849
1850 pprCallInfo :: Id -> CallInfo -> SDoc
1851 pprCallInfo fn (CallKey mb_tys, (_dxs, _))
1852 = hang (ppr fn)
1853 2 (fsep (map ppr_call_key_ty mb_tys {- ++ map pprParendExpr _dxs -}))
1854
1855 ppr_call_key_ty :: Maybe Type -> SDoc
1856 ppr_call_key_ty Nothing = char '_'
1857 ppr_call_key_ty (Just ty) = char '@' <+> pprParendType ty
1858
1859 instance Outputable CallKey where
1860 ppr (CallKey ts) = ppr ts
1861
1862 unionCalls :: CallDetails -> CallDetails -> CallDetails
1863 unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2
1864
1865 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
1866 unionCallInfoSet (CIS f calls1) (CIS _ calls2) =
1867 CIS f (calls1 `unionBags` calls2)
1868
1869 callDetailsFVs :: CallDetails -> VarSet
1870 callDetailsFVs calls =
1871 nonDetFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls
1872 -- It's OK to use nonDetFoldUDFM here because we forget the ordering
1873 -- immediately by converting to a nondeterministic set.
1874
1875 callInfoFVs :: CallInfoSet -> VarSet
1876 callInfoFVs (CIS _ call_info) =
1877 foldrBag (\(_, (_,fv)) vs -> unionVarSet fv vs) emptyVarSet call_info
1878
1879 ------------------------------------------------------------
1880 singleCall :: Id -> [Maybe Type] -> [DictExpr] -> UsageDetails
1881 singleCall id tys dicts
1882 = MkUD {ud_binds = emptyBag,
1883 ud_calls = unitDVarEnv id $ CIS id $
1884 unitBag (CallKey tys, (dicts, call_fvs)) }
1885 where
1886 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
1887 tys_fvs = tyCoVarsOfTypes (catMaybes tys)
1888 -- The type args (tys) are guaranteed to be part of the dictionary
1889 -- types, because they are just the constrained types,
1890 -- and the dictionary is therefore sure to be bound
1891 -- inside the binding for any type variables free in the type;
1892 -- hence it's safe to neglect tyvars free in tys when making
1893 -- the free-var set for this call
1894 -- BUT I don't trust this reasoning; play safe and include tys_fvs
1895 --
1896 -- We don't include the 'id' itself.
1897
1898 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
1899 mkCallUDs env f args
1900 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
1901 res
1902 where
1903 res = mkCallUDs' env f args
1904
1905 mkCallUDs' env f args
1906 | not (want_calls_for f) -- Imported from elsewhere
1907 || null theta -- Not overloaded
1908 = emptyUDs
1909
1910 | not (all type_determines_value theta)
1911 || not (spec_tys `lengthIs` n_tyvars)
1912 || not ( dicts `lengthIs` n_dicts)
1913 || not (any (interestingDict env) dicts) -- Note [Interesting dictionary arguments]
1914 -- See also Note [Specialisations already covered]
1915 = -- pprTrace "mkCallUDs: discarding" _trace_doc
1916 emptyUDs -- Not overloaded, or no specialisation wanted
1917
1918 | otherwise
1919 = -- pprTrace "mkCallUDs: keeping" _trace_doc
1920 singleCall f spec_tys dicts
1921 where
1922 _trace_doc = vcat [ppr f, ppr args, ppr n_tyvars, ppr n_dicts
1923 , ppr (map (interestingDict env) dicts)]
1924 (tyvars, theta, _) = tcSplitSigmaTy (idType f)
1925 constrained_tyvars = tyCoVarsOfTypes theta
1926 n_tyvars = length tyvars
1927 n_dicts = length theta
1928
1929 spec_tys = [mk_spec_ty tv ty | (tv, ty) <- tyvars `type_zip` args]
1930 dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
1931
1932 -- ignores Coercion arguments
1933 type_zip :: [TyVar] -> [CoreExpr] -> [(TyVar, Type)]
1934 type_zip tvs (Coercion _ : args) = type_zip tvs args
1935 type_zip (tv:tvs) (Type ty : args) = (tv, ty) : type_zip tvs args
1936 type_zip _ _ = []
1937
1938 mk_spec_ty tyvar ty
1939 | tyvar `elemVarSet` constrained_tyvars = Just ty
1940 | otherwise = Nothing
1941
1942 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
1943 -- For imported things, we gather call instances if
1944 -- there is an unfolding that we could in principle specialise
1945 -- We might still decide not to use it (consulting dflags)
1946 -- in specImports
1947 -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
1948
1949 type_determines_value pred -- See Note [Type determines value]
1950 = case classifyPredType pred of
1951 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs
1952 EqPred {} -> True
1953 IrredPred {} -> True -- Things like (D []) where D is a
1954 -- Constraint-ranged family; Trac #7785
1955
1956 {-
1957 Note [Type determines value]
1958 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1959 Only specialise if all overloading is on non-IP *class* params,
1960 because these are the ones whose *type* determines their *value*. In
1961 parrticular, with implicit params, the type args *don't* say what the
1962 value of the implicit param is! See Trac #7101
1963
1964 However, consider
1965 type family D (v::*->*) :: Constraint
1966 type instance D [] = ()
1967 f :: D v => v Char -> Int
1968 If we see a call (f "foo"), we'll pass a "dictionary"
1969 () |> (g :: () ~ D [])
1970 and it's good to specialise f at this dictionary.
1971
1972 So the question is: can an implicit parameter "hide inside" a
1973 type-family constraint like (D a). Well, no. We don't allow
1974 type instance D Maybe = ?x:Int
1975 Hence the IrredPred case in type_determines_value.
1976 See Trac #7785.
1977
1978 Note [Interesting dictionary arguments]
1979 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1980 Consider this
1981 \a.\d:Eq a. let f = ... in ...(f d)...
1982 There really is not much point in specialising f wrt the dictionary d,
1983 because the code for the specialised f is not improved at all, because
1984 d is lambda-bound. We simply get junk specialisations.
1985
1986 What is "interesting"? Just that it has *some* structure. But what about
1987 variables?
1988
1989 * A variable might be imported, in which case its unfolding
1990 will tell us whether it has useful structure
1991
1992 * Local variables are cloned on the way down (to avoid clashes when
1993 we float dictionaries), and cloning drops the unfolding
1994 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
1995 nuisance to give them unfoldings. So we keep track of the
1996 "interesting" dictionaries as a VarSet in SpecEnv.
1997 We have to take care to put any new interesting dictionary
1998 bindings in the set.
1999
2000 We accidentally lost accurate tracking of local variables for a long
2001 time, because cloned variables don't have unfoldings. But makes a
2002 massive difference in a few cases, eg Trac #5113. For nofib as a
2003 whole it's only a small win: 2.2% improvement in allocation for ansi,
2004 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
2005 -}
2006
2007 interestingDict :: SpecEnv -> CoreExpr -> Bool
2008 -- A dictionary argument is interesting if it has *some* structure
2009 -- NB: "dictionary" arguments include constraints of all sorts,
2010 -- including equality constraints; hence the Coercion case
2011 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
2012 || isDataConWorkId v
2013 || v `elemVarSet` se_interesting env
2014 interestingDict _ (Type _) = False
2015 interestingDict _ (Coercion _) = False
2016 interestingDict env (App fn (Type _)) = interestingDict env fn
2017 interestingDict env (App fn (Coercion _)) = interestingDict env fn
2018 interestingDict env (Tick _ a) = interestingDict env a
2019 interestingDict env (Cast e _) = interestingDict env e
2020 interestingDict _ _ = True
2021
2022 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
2023 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
2024 (MkUD {ud_binds = db2, ud_calls = calls2})
2025 = MkUD { ud_binds = db1 `unionBags` db2
2026 , ud_calls = calls1 `unionCalls` calls2 }
2027
2028 plusUDList :: [UsageDetails] -> UsageDetails
2029 plusUDList = foldr plusUDs emptyUDs
2030
2031 -----------------------------
2032 _dictBindBndrs :: Bag DictBind -> [Id]
2033 _dictBindBndrs dbs = foldrBag ((++) . bindersOf . fst) [] dbs
2034
2035 -- | Construct a 'DictBind' from a 'CoreBind'
2036 mkDB :: CoreBind -> DictBind
2037 mkDB bind = (bind, bind_fvs bind)
2038
2039 -- | Identify the free variables of a 'CoreBind'
2040 bind_fvs :: CoreBind -> VarSet
2041 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
2042 bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs
2043 where
2044 bndrs = map fst prs
2045 rhs_fvs = unionVarSets (map pair_fvs prs)
2046
2047 pair_fvs :: (Id, CoreExpr) -> VarSet
2048 pair_fvs (bndr, rhs) = exprFreeVars rhs `unionVarSet` idFreeVars bndr
2049 -- Don't forget variables mentioned in the
2050 -- rules of the bndr. C.f. OccAnal.addRuleUsage
2051 -- Also tyvars mentioned in its type; they may not appear in the RHS
2052 -- type T a = Int
2053 -- x :: T a = 3
2054
2055 -- | Flatten a set of 'DictBind's and some other binding pairs into a single
2056 -- recursive binding, including some additional bindings.
2057 flattenDictBinds :: Bag DictBind -> [(Id,CoreExpr)] -> DictBind
2058 flattenDictBinds dbs pairs
2059 = (Rec bindings, fvs)
2060 where
2061 (bindings, fvs) = foldrBag add
2062 ([], emptyVarSet)
2063 (dbs `snocBag` mkDB (Rec pairs))
2064 add (NonRec b r, fvs') (pairs, fvs) =
2065 ((b,r) : pairs, fvs `unionVarSet` fvs')
2066 add (Rec prs1, fvs') (pairs, fvs) =
2067 (prs1 ++ pairs, fvs `unionVarSet` fvs')
2068
2069 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
2070 -- Add ud_binds to the tail end of the bindings in uds
2071 snocDictBinds uds dbs
2072 = uds { ud_binds = ud_binds uds `unionBags`
2073 foldr consBag emptyBag dbs }
2074
2075 consDictBind :: DictBind -> UsageDetails -> UsageDetails
2076 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
2077
2078 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
2079 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
2080
2081 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
2082 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
2083
2084 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
2085 wrapDictBinds dbs binds
2086 = foldrBag add binds dbs
2087 where
2088 add (bind,_) binds = bind : binds
2089
2090 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
2091 wrapDictBindsE dbs expr
2092 = foldrBag add expr dbs
2093 where
2094 add (bind,_) expr = Let bind expr
2095
2096 ----------------------
2097 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
2098 -- Used at a lambda or case binder; just dump anything mentioning the binder
2099 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2100 | null bndrs = (uds, emptyBag) -- Common in case alternatives
2101 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2102 (free_uds, dump_dbs)
2103 where
2104 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2105 bndr_set = mkVarSet bndrs
2106 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2107 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
2108 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
2109 -- no calls for any of the dicts in dump_dbs
2110
2111 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
2112 -- Used at a lambda or case binder; just dump anything mentioning the binder
2113 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2114 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2115 (free_uds, dump_dbs, float_all)
2116 where
2117 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2118 bndr_set = mkVarSet bndrs
2119 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2120 free_calls = deleteCallsFor bndrs orig_calls
2121 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
2122
2123 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
2124 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2125 = -- pprTrace ("callsForMe")
2126 -- (vcat [ppr fn,
2127 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
2128 -- text "Orig calls =" <+> ppr orig_calls,
2129 -- text "Dep set =" <+> ppr dep_set,
2130 -- text "Calls for me =" <+> ppr calls_for_me]) $
2131 (uds_without_me, calls_for_me)
2132 where
2133 uds_without_me = MkUD { ud_binds = orig_dbs
2134 , ud_calls = delDVarEnv orig_calls fn }
2135 calls_for_me = case lookupDVarEnv orig_calls fn of
2136 Nothing -> []
2137 Just cis -> filter_dfuns (ciSetToList cis)
2138
2139 dep_set = foldlBag go (unitVarSet fn) orig_dbs
2140 go dep_set (db,fvs) | fvs `intersectsVarSet` dep_set
2141 = extendVarSetList dep_set (bindersOf db)
2142 | otherwise = dep_set
2143
2144 -- Note [Specialisation of dictionary functions]
2145 filter_dfuns | isDFunId fn = filter ok_call
2146 | otherwise = \cs -> cs
2147
2148 ok_call (_, (_,fvs)) = not (fvs `intersectsVarSet` dep_set)
2149
2150 ----------------------
2151 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2152 -- Returns (free_dbs, dump_dbs, dump_set)
2153 splitDictBinds dbs bndr_set
2154 = foldlBag split_db (emptyBag, emptyBag, bndr_set) dbs
2155 -- Important that it's foldl not foldr;
2156 -- we're accumulating the set of dumped ids in dump_set
2157 where
2158 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
2159 | dump_idset `intersectsVarSet` fvs -- Dump it
2160 = (free_dbs, dump_dbs `snocBag` db,
2161 extendVarSetList dump_idset (bindersOf bind))
2162
2163 | otherwise -- Don't dump it
2164 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2165
2166
2167 ----------------------
2168 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2169 -- Remove calls *mentioning* bs
2170 deleteCallsMentioning bs calls
2171 = mapDVarEnv (ciSetFilter keep_call) calls
2172 where
2173 keep_call (_, (_, fvs)) = not (fvs `intersectsVarSet` bs)
2174
2175 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
2176 -- Remove calls *for* bs
2177 deleteCallsFor bs calls = delDVarEnvList calls bs
2178
2179 {-
2180 ************************************************************************
2181 * *
2182 \subsubsection{Boring helper functions}
2183 * *
2184 ************************************************************************
2185 -}
2186
2187 newtype SpecM a = SpecM (State SpecState a)
2188
2189 data SpecState = SpecState {
2190 spec_uniq_supply :: UniqSupply,
2191 spec_module :: Module,
2192 spec_dflags :: DynFlags
2193 }
2194
2195 instance Functor SpecM where
2196 fmap = liftM
2197
2198 instance Applicative SpecM where
2199 pure x = SpecM $ return x
2200 (<*>) = ap
2201
2202 instance Monad SpecM where
2203 SpecM x >>= f = SpecM $ do y <- x
2204 case f y of
2205 SpecM z ->
2206 z
2207 fail str = SpecM $ fail str
2208
2209 #if __GLASGOW_HASKELL__ > 710
2210 instance MonadFail.MonadFail SpecM where
2211 fail str = SpecM $ fail str
2212 #endif
2213
2214 instance MonadUnique SpecM where
2215 getUniqueSupplyM
2216 = SpecM $ do st <- get
2217 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
2218 put $ st { spec_uniq_supply = us2 }
2219 return us1
2220
2221 getUniqueM
2222 = SpecM $ do st <- get
2223 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
2224 put $ st { spec_uniq_supply = us' }
2225 return u
2226
2227 instance HasDynFlags SpecM where
2228 getDynFlags = SpecM $ liftM spec_dflags get
2229
2230 instance HasModule SpecM where
2231 getModule = SpecM $ liftM spec_module get
2232
2233 runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2234 runSpecM dflags this_mod (SpecM spec)
2235 = do us <- getUniqueSupplyM
2236 let initialState = SpecState {
2237 spec_uniq_supply = us,
2238 spec_module = this_mod,
2239 spec_dflags = dflags
2240 }
2241 return $ evalState spec initialState
2242
2243 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2244 mapAndCombineSM _ [] = return ([], emptyUDs)
2245 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2246 (ys, uds2) <- mapAndCombineSM f xs
2247 return (y:ys, uds1 `plusUDs` uds2)
2248
2249 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2250 extendTvSubstList env tv_binds
2251 = env { se_subst = CoreSubst.extendTvSubstList (se_subst env) tv_binds }
2252
2253 substTy :: SpecEnv -> Type -> Type
2254 substTy env ty = CoreSubst.substTy (se_subst env) ty
2255
2256 substCo :: SpecEnv -> Coercion -> Coercion
2257 substCo env co = CoreSubst.substCo (se_subst env) co
2258
2259 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2260 substBndr env bs = case CoreSubst.substBndr (se_subst env) bs of
2261 (subst', bs') -> (env { se_subst = subst' }, bs')
2262
2263 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2264 substBndrs env bs = case CoreSubst.substBndrs (se_subst env) bs of
2265 (subst', bs') -> (env { se_subst = subst' }, bs')
2266
2267 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2268 -- Clone the binders of the bind; return new bind with the cloned binders
2269 -- Return the substitution to use for RHSs, and the one to use for the body
2270 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2271 = do { us <- getUniqueSupplyM
2272 ; let (subst', bndr') = CoreSubst.cloneIdBndr subst us bndr
2273 interesting' | interestingDict env rhs
2274 = interesting `extendVarSet` bndr'
2275 | otherwise = interesting
2276 ; return (env, env { se_subst = subst', se_interesting = interesting' }
2277 , NonRec bndr' rhs) }
2278
2279 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2280 = do { us <- getUniqueSupplyM
2281 ; let (subst', bndrs') = CoreSubst.cloneRecIdBndrs subst us (map fst pairs)
2282 env' = env { se_subst = subst'
2283 , se_interesting = interesting `extendVarSetList`
2284 [ v | (v,r) <- pairs, interestingDict env r ] }
2285 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2286
2287 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2288 -- Make up completely fresh binders for the dictionaries
2289 -- Their bindings are going to float outwards
2290 newDictBndr env b = do { uniq <- getUniqueM
2291 ; let n = idName b
2292 ty' = substTy env (idType b)
2293 ; return (mkUserLocalOrCoVar (nameOccName n) uniq ty' (getSrcSpan n)) }
2294
2295 newSpecIdSM :: Id -> Type -> Maybe JoinArity -> SpecM Id
2296 -- Give the new Id a similar occurrence name to the old one
2297 newSpecIdSM old_id new_ty join_arity_maybe
2298 = do { uniq <- getUniqueM
2299 ; let name = idName old_id
2300 new_occ = mkSpecOcc (nameOccName name)
2301 new_id = mkUserLocalOrCoVar new_occ uniq new_ty (getSrcSpan name)
2302 `asJoinId_maybe` join_arity_maybe
2303 ; return new_id }
2304
2305 {-
2306 Old (but interesting) stuff about unboxed bindings
2307 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2308
2309 What should we do when a value is specialised to a *strict* unboxed value?
2310
2311 map_*_* f (x:xs) = let h = f x
2312 t = map f xs
2313 in h:t
2314
2315 Could convert let to case:
2316
2317 map_*_Int# f (x:xs) = case f x of h# ->
2318 let t = map f xs
2319 in h#:t
2320
2321 This may be undesirable since it forces evaluation here, but the value
2322 may not be used in all branches of the body. In the general case this
2323 transformation is impossible since the mutual recursion in a letrec
2324 cannot be expressed as a case.
2325
2326 There is also a problem with top-level unboxed values, since our
2327 implementation cannot handle unboxed values at the top level.
2328
2329 Solution: Lift the binding of the unboxed value and extract it when it
2330 is used:
2331
2332 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2333 t = map f xs
2334 in case h of
2335 _Lift h# -> h#:t
2336
2337 Now give it to the simplifier and the _Lifting will be optimised away.
2338
2339 The benefit is that we have given the specialised "unboxed" values a
2340 very simple lifted semantics and then leave it up to the simplifier to
2341 optimise it --- knowing that the overheads will be removed in nearly
2342 all cases.
2343
2344 In particular, the value will only be evaluated in the branches of the
2345 program which use it, rather than being forced at the point where the
2346 value is bound. For example:
2347
2348 filtermap_*_* p f (x:xs)
2349 = let h = f x
2350 t = ...
2351 in case p x of
2352 True -> h:t
2353 False -> t
2354 ==>
2355 filtermap_*_Int# p f (x:xs)
2356 = let h = case (f x) of h# -> _Lift h#
2357 t = ...
2358 in case p x of
2359 True -> case h of _Lift h#
2360 -> h#:t
2361 False -> t
2362
2363 The binding for h can still be inlined in the one branch and the
2364 _Lifting eliminated.
2365
2366
2367 Question: When won't the _Lifting be eliminated?
2368
2369 Answer: When they at the top-level (where it is necessary) or when
2370 inlining would duplicate work (or possibly code depending on
2371 options). However, the _Lifting will still be eliminated if the
2372 strictness analyser deems the lifted binding strict.
2373 -}