e3501dfd38f094dc9c8ee3aab4b65f14edd69b59
[ghc.git] / compiler / specialise / Specialise.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5 -}
6
7 {-# LANGUAGE CPP #-}
8 module Specialise ( specProgram, specUnfolding ) where
9
10 #include "HsVersions.h"
11
12 import Id
13 import TcType hiding( substTy, extendTvSubstList )
14 import Type hiding( substTy, extendTvSubstList )
15 import Coercion( Coercion )
16 import Module( Module, HasModule(..) )
17 import CoreMonad
18 import qualified CoreSubst
19 import CoreUnfold
20 import VarSet
21 import VarEnv
22 import CoreSyn
23 import Rules
24 import CoreUtils ( exprIsTrivial, applyTypeToArgs )
25 import CoreFVs ( exprFreeVars, exprsFreeVars, idFreeVars )
26 import UniqSupply
27 import Name
28 import MkId ( voidArgId, voidPrimId )
29 import Maybes ( catMaybes, isJust )
30 import BasicTypes
31 import HscTypes
32 import Bag
33 import DynFlags
34 import Util
35 import Outputable
36 import FastString
37 import State
38
39 #if __GLASGOW_HASKELL__ < 709
40 import Control.Applicative (Applicative(..))
41 #endif
42 import Control.Monad
43 import Data.Map (Map)
44 import qualified Data.Map as Map
45 import qualified FiniteMap as Map
46
47 {-
48 ************************************************************************
49 * *
50 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
51 * *
52 ************************************************************************
53
54 These notes describe how we implement specialisation to eliminate
55 overloading.
56
57 The specialisation pass works on Core
58 syntax, complete with all the explicit dictionary application,
59 abstraction and construction as added by the type checker. The
60 existing type checker remains largely as it is.
61
62 One important thought: the {\em types} passed to an overloaded
63 function, and the {\em dictionaries} passed are mutually redundant.
64 If the same function is applied to the same type(s) then it is sure to
65 be applied to the same dictionary(s)---or rather to the same {\em
66 values}. (The arguments might look different but they will evaluate
67 to the same value.)
68
69 Second important thought: we know that we can make progress by
70 treating dictionary arguments as static and worth specialising on. So
71 we can do without binding-time analysis, and instead specialise on
72 dictionary arguments and no others.
73
74 The basic idea
75 ~~~~~~~~~~~~~~
76 Suppose we have
77
78 let f = <f_rhs>
79 in <body>
80
81 and suppose f is overloaded.
82
83 STEP 1: CALL-INSTANCE COLLECTION
84
85 We traverse <body>, accumulating all applications of f to types and
86 dictionaries.
87
88 (Might there be partial applications, to just some of its types and
89 dictionaries? In principle yes, but in practice the type checker only
90 builds applications of f to all its types and dictionaries, so partial
91 applications could only arise as a result of transformation, and even
92 then I think it's unlikely. In any case, we simply don't accumulate such
93 partial applications.)
94
95
96 STEP 2: EQUIVALENCES
97
98 So now we have a collection of calls to f:
99 f t1 t2 d1 d2
100 f t3 t4 d3 d4
101 ...
102 Notice that f may take several type arguments. To avoid ambiguity, we
103 say that f is called at type t1/t2 and t3/t4.
104
105 We take equivalence classes using equality of the *types* (ignoring
106 the dictionary args, which as mentioned previously are redundant).
107
108 STEP 3: SPECIALISATION
109
110 For each equivalence class, choose a representative (f t1 t2 d1 d2),
111 and create a local instance of f, defined thus:
112
113 f@t1/t2 = <f_rhs> t1 t2 d1 d2
114
115 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
116 of simplification will now result. However we don't actually *do* that
117 simplification. Rather, we leave it for the simplifier to do. If we
118 *did* do it, though, we'd get more call instances from the specialised
119 RHS. We can work out what they are by instantiating the call-instance
120 set from f's RHS with the types t1, t2.
121
122 Add this new id to f's IdInfo, to record that f has a specialised version.
123
124 Before doing any of this, check that f's IdInfo doesn't already
125 tell us about an existing instance of f at the required type/s.
126 (This might happen if specialisation was applied more than once, or
127 it might arise from user SPECIALIZE pragmas.)
128
129 Recursion
130 ~~~~~~~~~
131 Wait a minute! What if f is recursive? Then we can't just plug in
132 its right-hand side, can we?
133
134 But it's ok. The type checker *always* creates non-recursive definitions
135 for overloaded recursive functions. For example:
136
137 f x = f (x+x) -- Yes I know its silly
138
139 becomes
140
141 f a (d::Num a) = let p = +.sel a d
142 in
143 letrec fl (y::a) = fl (p y y)
144 in
145 fl
146
147 We still have recusion for non-overloaded functions which we
148 speciailise, but the recursive call should get specialised to the
149 same recursive version.
150
151
152 Polymorphism 1
153 ~~~~~~~~~~~~~~
154
155 All this is crystal clear when the function is applied to *constant
156 types*; that is, types which have no type variables inside. But what if
157 it is applied to non-constant types? Suppose we find a call of f at type
158 t1/t2. There are two possibilities:
159
160 (a) The free type variables of t1, t2 are in scope at the definition point
161 of f. In this case there's no problem, we proceed just as before. A common
162 example is as follows. Here's the Haskell:
163
164 g y = let f x = x+x
165 in f y + f y
166
167 After typechecking we have
168
169 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
170 in +.sel a d (f a d y) (f a d y)
171
172 Notice that the call to f is at type type "a"; a non-constant type.
173 Both calls to f are at the same type, so we can specialise to give:
174
175 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
176 in +.sel a d (f@a y) (f@a y)
177
178
179 (b) The other case is when the type variables in the instance types
180 are *not* in scope at the definition point of f. The example we are
181 working with above is a good case. There are two instances of (+.sel a d),
182 but "a" is not in scope at the definition of +.sel. Can we do anything?
183 Yes, we can "common them up", a sort of limited common sub-expression deal.
184 This would give:
185
186 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
187 f@a (x::a) = +.sel@a x x
188 in +.sel@a (f@a y) (f@a y)
189
190 This can save work, and can't be spotted by the type checker, because
191 the two instances of +.sel weren't originally at the same type.
192
193 Further notes on (b)
194
195 * There are quite a few variations here. For example, the defn of
196 +.sel could be floated ouside the \y, to attempt to gain laziness.
197 It certainly mustn't be floated outside the \d because the d has to
198 be in scope too.
199
200 * We don't want to inline f_rhs in this case, because
201 that will duplicate code. Just commoning up the call is the point.
202
203 * Nothing gets added to +.sel's IdInfo.
204
205 * Don't bother unless the equivalence class has more than one item!
206
207 Not clear whether this is all worth it. It is of course OK to
208 simply discard call-instances when passing a big lambda.
209
210 Polymorphism 2 -- Overloading
211 ~~~~~~~~~~~~~~
212 Consider a function whose most general type is
213
214 f :: forall a b. Ord a => [a] -> b -> b
215
216 There is really no point in making a version of g at Int/Int and another
217 at Int/Bool, because it's only instancing the type variable "a" which
218 buys us any efficiency. Since g is completely polymorphic in b there
219 ain't much point in making separate versions of g for the different
220 b types.
221
222 That suggests that we should identify which of g's type variables
223 are constrained (like "a") and which are unconstrained (like "b").
224 Then when taking equivalence classes in STEP 2, we ignore the type args
225 corresponding to unconstrained type variable. In STEP 3 we make
226 polymorphic versions. Thus:
227
228 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
229
230 We do this.
231
232
233 Dictionary floating
234 ~~~~~~~~~~~~~~~~~~~
235 Consider this
236
237 f a (d::Num a) = let g = ...
238 in
239 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
240
241 Here, g is only called at one type, but the dictionary isn't in scope at the
242 definition point for g. Usually the type checker would build a
243 definition for d1 which enclosed g, but the transformation system
244 might have moved d1's defn inward. Solution: float dictionary bindings
245 outwards along with call instances.
246
247 Consider
248
249 f x = let g p q = p==q
250 h r s = (r+s, g r s)
251 in
252 h x x
253
254
255 Before specialisation, leaving out type abstractions we have
256
257 f df x = let g :: Eq a => a -> a -> Bool
258 g dg p q = == dg p q
259 h :: Num a => a -> a -> (a, Bool)
260 h dh r s = let deq = eqFromNum dh
261 in (+ dh r s, g deq r s)
262 in
263 h df x x
264
265 After specialising h we get a specialised version of h, like this:
266
267 h' r s = let deq = eqFromNum df
268 in (+ df r s, g deq r s)
269
270 But we can't naively make an instance for g from this, because deq is not in scope
271 at the defn of g. Instead, we have to float out the (new) defn of deq
272 to widen its scope. Notice that this floating can't be done in advance -- it only
273 shows up when specialisation is done.
274
275 User SPECIALIZE pragmas
276 ~~~~~~~~~~~~~~~~~~~~~~~
277 Specialisation pragmas can be digested by the type checker, and implemented
278 by adding extra definitions along with that of f, in the same way as before
279
280 f@t1/t2 = <f_rhs> t1 t2 d1 d2
281
282 Indeed the pragmas *have* to be dealt with by the type checker, because
283 only it knows how to build the dictionaries d1 and d2! For example
284
285 g :: Ord a => [a] -> [a]
286 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
287
288 Here, the specialised version of g is an application of g's rhs to the
289 Ord dictionary for (Tree Int), which only the type checker can conjure
290 up. There might not even *be* one, if (Tree Int) is not an instance of
291 Ord! (All the other specialision has suitable dictionaries to hand
292 from actual calls.)
293
294 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
295 it is buried in a complex (as-yet-un-desugared) binding group.
296 Maybe we should say
297
298 f@t1/t2 = f* t1 t2 d1 d2
299
300 where f* is the Id f with an IdInfo which says "inline me regardless!".
301 Indeed all the specialisation could be done in this way.
302 That in turn means that the simplifier has to be prepared to inline absolutely
303 any in-scope let-bound thing.
304
305
306 Again, the pragma should permit polymorphism in unconstrained variables:
307
308 h :: Ord a => [a] -> b -> b
309 {-# SPECIALIZE h :: [Int] -> b -> b #-}
310
311 We *insist* that all overloaded type variables are specialised to ground types,
312 (and hence there can be no context inside a SPECIALIZE pragma).
313 We *permit* unconstrained type variables to be specialised to
314 - a ground type
315 - or left as a polymorphic type variable
316 but nothing in between. So
317
318 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
319
320 is *illegal*. (It can be handled, but it adds complication, and gains the
321 programmer nothing.)
322
323
324 SPECIALISING INSTANCE DECLARATIONS
325 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
326 Consider
327
328 instance Foo a => Foo [a] where
329 ...
330 {-# SPECIALIZE instance Foo [Int] #-}
331
332 The original instance decl creates a dictionary-function
333 definition:
334
335 dfun.Foo.List :: forall a. Foo a -> Foo [a]
336
337 The SPECIALIZE pragma just makes a specialised copy, just as for
338 ordinary function definitions:
339
340 dfun.Foo.List@Int :: Foo [Int]
341 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
342
343 The information about what instance of the dfun exist gets added to
344 the dfun's IdInfo in the same way as a user-defined function too.
345
346
347 Automatic instance decl specialisation?
348 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
349 Can instance decls be specialised automatically? It's tricky.
350 We could collect call-instance information for each dfun, but
351 then when we specialised their bodies we'd get new call-instances
352 for ordinary functions; and when we specialised their bodies, we might get
353 new call-instances of the dfuns, and so on. This all arises because of
354 the unrestricted mutual recursion between instance decls and value decls.
355
356 Still, there's no actual problem; it just means that we may not do all
357 the specialisation we could theoretically do.
358
359 Furthermore, instance decls are usually exported and used non-locally,
360 so we'll want to compile enough to get those specialisations done.
361
362 Lastly, there's no such thing as a local instance decl, so we can
363 survive solely by spitting out *usage* information, and then reading that
364 back in as a pragma when next compiling the file. So for now,
365 we only specialise instance decls in response to pragmas.
366
367
368 SPITTING OUT USAGE INFORMATION
369 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
370
371 To spit out usage information we need to traverse the code collecting
372 call-instance information for all imported (non-prelude?) functions
373 and data types. Then we equivalence-class it and spit it out.
374
375 This is done at the top-level when all the call instances which escape
376 must be for imported functions and data types.
377
378 *** Not currently done ***
379
380
381 Partial specialisation by pragmas
382 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
383 What about partial specialisation:
384
385 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
386 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
387
388 or even
389
390 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
391
392 Seems quite reasonable. Similar things could be done with instance decls:
393
394 instance (Foo a, Foo b) => Foo (a,b) where
395 ...
396 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
397 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
398
399 Ho hum. Things are complex enough without this. I pass.
400
401
402 Requirements for the simplifer
403 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
404 The simplifier has to be able to take advantage of the specialisation.
405
406 * When the simplifier finds an application of a polymorphic f, it looks in
407 f's IdInfo in case there is a suitable instance to call instead. This converts
408
409 f t1 t2 d1 d2 ===> f_t1_t2
410
411 Note that the dictionaries get eaten up too!
412
413 * Dictionary selection operations on constant dictionaries must be
414 short-circuited:
415
416 +.sel Int d ===> +Int
417
418 The obvious way to do this is in the same way as other specialised
419 calls: +.sel has inside it some IdInfo which tells that if it's applied
420 to the type Int then it should eat a dictionary and transform to +Int.
421
422 In short, dictionary selectors need IdInfo inside them for constant
423 methods.
424
425 * Exactly the same applies if a superclass dictionary is being
426 extracted:
427
428 Eq.sel Int d ===> dEqInt
429
430 * Something similar applies to dictionary construction too. Suppose
431 dfun.Eq.List is the function taking a dictionary for (Eq a) to
432 one for (Eq [a]). Then we want
433
434 dfun.Eq.List Int d ===> dEq.List_Int
435
436 Where does the Eq [Int] dictionary come from? It is built in
437 response to a SPECIALIZE pragma on the Eq [a] instance decl.
438
439 In short, dfun Ids need IdInfo with a specialisation for each
440 constant instance of their instance declaration.
441
442 All this uses a single mechanism: the SpecEnv inside an Id
443
444
445 What does the specialisation IdInfo look like?
446 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
447
448 The SpecEnv of an Id maps a list of types (the template) to an expression
449
450 [Type] |-> Expr
451
452 For example, if f has this SpecInfo:
453
454 [Int, a] -> \d:Ord Int. f' a
455
456 it means that we can replace the call
457
458 f Int t ===> (\d. f' t)
459
460 This chucks one dictionary away and proceeds with the
461 specialised version of f, namely f'.
462
463
464 What can't be done this way?
465 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
466 There is no way, post-typechecker, to get a dictionary for (say)
467 Eq a from a dictionary for Eq [a]. So if we find
468
469 ==.sel [t] d
470
471 we can't transform to
472
473 eqList (==.sel t d')
474
475 where
476 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
477
478 Of course, we currently have no way to automatically derive
479 eqList, nor to connect it to the Eq [a] instance decl, but you
480 can imagine that it might somehow be possible. Taking advantage
481 of this is permanently ruled out.
482
483 Still, this is no great hardship, because we intend to eliminate
484 overloading altogether anyway!
485
486 A note about non-tyvar dictionaries
487 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
488 Some Ids have types like
489
490 forall a,b,c. Eq a -> Ord [a] -> tau
491
492 This seems curious at first, because we usually only have dictionary
493 args whose types are of the form (C a) where a is a type variable.
494 But this doesn't hold for the functions arising from instance decls,
495 which sometimes get arguments with types of form (C (T a)) for some
496 type constructor T.
497
498 Should we specialise wrt this compound-type dictionary? We used to say
499 "no", saying:
500 "This is a heuristic judgement, as indeed is the fact that we
501 specialise wrt only dictionaries. We choose *not* to specialise
502 wrt compound dictionaries because at the moment the only place
503 they show up is in instance decls, where they are simply plugged
504 into a returned dictionary. So nothing is gained by specialising
505 wrt them."
506
507 But it is simpler and more uniform to specialise wrt these dicts too;
508 and in future GHC is likely to support full fledged type signatures
509 like
510 f :: Eq [(a,b)] => ...
511
512
513 ************************************************************************
514 * *
515 \subsubsection{The new specialiser}
516 * *
517 ************************************************************************
518
519 Our basic game plan is this. For let(rec) bound function
520 f :: (C a, D c) => (a,b,c,d) -> Bool
521
522 * Find any specialised calls of f, (f ts ds), where
523 ts are the type arguments t1 .. t4, and
524 ds are the dictionary arguments d1 .. d2.
525
526 * Add a new definition for f1 (say):
527
528 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
529
530 Note that we abstract over the unconstrained type arguments.
531
532 * Add the mapping
533
534 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
535
536 to the specialisations of f. This will be used by the
537 simplifier to replace calls
538 (f t1 t2 t3 t4) da db
539 by
540 (\d1 d1 -> f1 t2 t4) da db
541
542 All the stuff about how many dictionaries to discard, and what types
543 to apply the specialised function to, are handled by the fact that the
544 SpecEnv contains a template for the result of the specialisation.
545
546 We don't build *partial* specialisations for f. For example:
547
548 f :: Eq a => a -> a -> Bool
549 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
550
551 Here, little is gained by making a specialised copy of f.
552 There's a distinct danger that the specialised version would
553 first build a dictionary for (Eq b, Eq c), and then select the (==)
554 method from it! Even if it didn't, not a great deal is saved.
555
556 We do, however, generate polymorphic, but not overloaded, specialisations:
557
558 f :: Eq a => [a] -> b -> b -> b
559 ... SPECIALISE f :: [Int] -> b -> b -> b ...
560
561 Hence, the invariant is this:
562
563 *** no specialised version is overloaded ***
564
565
566 ************************************************************************
567 * *
568 \subsubsection{The exported function}
569 * *
570 ************************************************************************
571 -}
572
573 -- | Specialise calls to type-class overloaded functions occuring in a program.
574 specProgram :: ModGuts -> CoreM ModGuts
575 specProgram guts@(ModGuts { mg_module = this_mod
576 , mg_rules = local_rules
577 , mg_binds = binds })
578 = do { dflags <- getDynFlags
579
580 -- Specialise the bindings of this module
581 ; (binds', uds) <- runSpecM dflags this_mod (go binds)
582
583 -- Specialise imported functions
584 ; hpt_rules <- getRuleBase
585 ; let rule_base = extendRuleBaseList hpt_rules local_rules
586 ; (new_rules, spec_binds) <- specImports dflags this_mod top_env emptyVarSet
587 [] rule_base (ud_calls uds)
588
589 -- Don't forget to wrap the specialized bindings with bindings
590 -- for the needed dictionaries.
591 -- See Note [Wrap bindings returned by specImports]
592 ; let spec_binds' = wrapDictBinds (ud_binds uds) spec_binds
593
594 ; let final_binds
595 | null spec_binds' = binds'
596 | otherwise = Rec (flattenBinds spec_binds') : binds'
597 -- Note [Glom the bindings if imported functions are specialised]
598
599 ; return (guts { mg_binds = final_binds
600 , mg_rules = new_rules ++ local_rules }) }
601 where
602 -- We need to start with a Subst that knows all the things
603 -- that are in scope, so that the substitution engine doesn't
604 -- accidentally re-use a unique that's already in use
605 -- Easiest thing is to do it all at once, as if all the top-level
606 -- decls were mutually recursive
607 top_env = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
608 bindersOfBinds binds
609 , se_interesting = emptyVarSet }
610
611 go [] = return ([], emptyUDs)
612 go (bind:binds) = do (binds', uds) <- go binds
613 (bind', uds') <- specBind top_env bind uds
614 return (bind' ++ binds', uds')
615
616 {-
617 Note [Wrap bindings returned by specImports]
618 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
619 'specImports' returns a set of specialized bindings. However, these are lacking
620 necessary floated dictionary bindings, which are returned by
621 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
622 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
623 for instance, the 'specImports' call in 'specProgram'.
624
625
626 Note [Disabling cross-module specialisation]
627 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
628 Since GHC 7.10 we have performed specialisation of INLINEABLE bindings living
629 in modules outside of the current module. This can sometimes uncover user code
630 which explodes in size when aggressively optimized. The
631 -fno-cross-module-specialise option was introduced to allow users to being
632 bitten by such instances to revert to the pre-7.10 behavior.
633
634 See Trac #10491
635 -}
636
637 -- | Specialise a set of calls to imported bindings
638 specImports :: DynFlags
639 -> Module
640 -> SpecEnv -- Passed in so that all top-level Ids are in scope
641 -> VarSet -- Don't specialise these ones
642 -- See Note [Avoiding recursive specialisation]
643 -> [Id] -- Stack of imported functions being specialised
644 -> RuleBase -- Rules from this module and the home package
645 -- (but not external packages, which can change)
646 -> CallDetails -- Calls for imported things, and floating bindings
647 -> CoreM ( [CoreRule] -- New rules
648 , [CoreBind] ) -- Specialised bindings
649 -- See Note [Wrapping bindings returned by specImports]
650 specImports dflags this_mod top_env done callers rule_base cds
651 -- See Note [Disabling cross-module specialisation]
652 | not $ gopt Opt_CrossModuleSpecialise dflags =
653 return ([], [])
654
655 | otherwise =
656 do { let import_calls = varEnvElts cds
657 ; (rules, spec_binds) <- go rule_base import_calls
658 ; return (rules, spec_binds) }
659 where
660 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
661 go _ [] = return ([], [])
662 go rb (CIS fn calls_for_fn : other_calls)
663 = do { (rules1, spec_binds1) <- specImport dflags this_mod top_env
664 done callers rb fn $
665 Map.toList calls_for_fn
666 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
667 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
668
669 specImport :: DynFlags
670 -> Module
671 -> SpecEnv -- Passed in so that all top-level Ids are in scope
672 -> VarSet -- Don't specialise these
673 -- See Note [Avoiding recursive specialisation]
674 -> [Id] -- Stack of imported functions being specialised
675 -> RuleBase -- Rules from this module
676 -> Id -> [CallInfo] -- Imported function and calls for it
677 -> CoreM ( [CoreRule] -- New rules
678 , [CoreBind] ) -- Specialised bindings
679 specImport dflags this_mod top_env done callers rb fn calls_for_fn
680 | fn `elemVarSet` done
681 = return ([], []) -- No warning. This actually happens all the time
682 -- when specialising a recursive function, because
683 -- the RHS of the specialised function contains a recursive
684 -- call to the original function
685
686 | null calls_for_fn -- We filtered out all the calls in deleteCallsMentioning
687 = return ([], [])
688
689 | wantSpecImport dflags unfolding
690 , Just rhs <- maybeUnfoldingTemplate unfolding
691 = do { -- Get rules from the external package state
692 -- We keep doing this in case we "page-fault in"
693 -- more rules as we go along
694 ; hsc_env <- getHscEnv
695 ; eps <- liftIO $ hscEPS hsc_env
696 ; vis_orphs <- getVisibleOrphanMods
697 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
698 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
699
700 ; (rules1, spec_pairs, uds) <- -- pprTrace "specImport1" (vcat [ppr fn, ppr calls_for_fn, ppr rhs]) $
701 runSpecM dflags this_mod $
702 specCalls (Just this_mod) top_env rules_for_fn calls_for_fn fn rhs
703 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
704 -- After the rules kick in we may get recursion, but
705 -- we rely on a global GlomBinds to sort that out later
706 -- See Note [Glom the bindings if imported functions are specialised]
707
708 -- Now specialise any cascaded calls
709 ; (rules2, spec_binds2) <- -- pprTrace "specImport 2" (ppr fn $$ ppr rules1 $$ ppr spec_binds1) $
710 specImports dflags this_mod top_env
711 (extendVarSet done fn)
712 (fn:callers)
713 (extendRuleBaseList rb rules1)
714 (ud_calls uds)
715
716 -- Don't forget to wrap the specialized bindings with bindings
717 -- for the needed dictionaries
718 -- See Note [Wrap bindings returned by specImports]
719 ; let final_binds = wrapDictBinds (ud_binds uds)
720 (spec_binds2 ++ spec_binds1)
721
722 ; return (rules2 ++ rules1, final_binds) }
723
724 | warnMissingSpecs dflags callers
725 = do { warnMsg (vcat [ hang (ptext (sLit "Could not specialise imported function") <+> quotes (ppr fn))
726 2 (vcat [ ptext (sLit "when specialising") <+> quotes (ppr caller)
727 | caller <- callers])
728 , ifPprDebug (ptext (sLit "calls:") <+> vcat (map (pprCallInfo fn) calls_for_fn))
729 , ptext (sLit "Probable fix: add INLINEABLE pragma on") <+> quotes (ppr fn) ])
730 ; return ([], []) }
731
732 | otherwise
733 = return ([], [])
734 where
735 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
736
737 warnMissingSpecs :: DynFlags -> [Id] -> Bool
738 -- See Note [Warning about missed specialisations]
739 warnMissingSpecs dflags callers
740 | wopt Opt_WarnAllMissedSpecs dflags = True
741 | not (wopt Opt_WarnMissedSpecs dflags) = False
742 | null callers = False
743 | otherwise = all has_inline_prag callers
744 where
745 has_inline_prag id = isAnyInlinePragma (idInlinePragma id)
746
747 wantSpecImport :: DynFlags -> Unfolding -> Bool
748 -- See Note [Specialise imported INLINABLE things]
749 wantSpecImport dflags unf
750 = case unf of
751 NoUnfolding -> False
752 OtherCon {} -> False
753 DFunUnfolding {} -> True
754 CoreUnfolding { uf_src = src, uf_guidance = _guidance }
755 | gopt Opt_SpecialiseAggressively dflags -> True
756 | isStableSource src -> True
757 -- Specialise even INLINE things; it hasn't inlined yet,
758 -- so perhaps it never will. Moreover it may have calls
759 -- inside it that we want to specialise
760 | otherwise -> False -- Stable, not INLINE, hence INLINEABLE
761
762 {- Note [Warning about missed specialisations]
763 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
764 Suppose
765 * In module Lib, you carefully mark a function 'foo' INLINEABLE
766 * Import Lib(foo) into another module M
767 * Call 'foo' at some specialised type in M
768 Then you jolly well expect it to be specialised in M. But what if
769 'foo' calls another fuction 'Lib.bar'. Then you'd like 'bar' to be
770 specialised too. But if 'bar' is not marked INLINEABLE it may well
771 not be specialised. The warning Opt_WarnMissedSpecs warns about this.
772
773 It's more noisy to warning about a missed specialisation opportunity
774 for /every/ overloaded imported function, but sometimes useful. That
775 is what Opt_WarnAllMissedSpecs does.
776
777 ToDo: warn about missed opportunities for local functions.
778
779 Note [Specialise imported INLINABLE things]
780 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
781 What imported functions do we specialise? The basic set is
782 * DFuns and things with INLINABLE pragmas.
783 but with -fspecialise-aggressively we add
784 * Anything with an unfolding template
785
786 Trac #8874 has a good example of why we want to auto-specialise DFuns.
787
788 We have the -fspecialise-aggressively flag (usually off), because we
789 risk lots of orphan modules from over-vigorous specialisation.
790 However it's not a big deal: anything non-recursive with an
791 unfolding-template will probably have been inlined already.
792
793 Note [Glom the bindings if imported functions are specialised]
794 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
795 Suppose we have an imported, *recursive*, INLINABLE function
796 f :: Eq a => a -> a
797 f = /\a \d x. ...(f a d)...
798 In the module being compiled we have
799 g x = f (x::Int)
800 Now we'll make a specialised function
801 f_spec :: Int -> Int
802 f_spec = \x -> ...(f Int dInt)...
803 {-# RULE f Int _ = f_spec #-}
804 g = \x. f Int dInt x
805 Note that f_spec doesn't look recursive
806 After rewriting with the RULE, we get
807 f_spec = \x -> ...(f_spec)...
808 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
809 The occurrence analyser never turns a NonRec into a Rec. So we must
810 make sure that f_spec is recursive. Easiest thing is to make all
811 the specialisations for imported bindings recursive.
812
813
814 Note [Avoiding recursive specialisation]
815 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
816 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
817 'f's RHS. So we want to specialise g,h. But we don't want to
818 specialise f any more! It's possible that f's RHS might have a
819 recursive yet-more-specialised call, so we'd diverge in that case.
820 And if the call is to the same type, one specialisation is enough.
821 Avoiding this recursive specialisation loop is the reason for the
822 'done' VarSet passed to specImports and specImport.
823
824 ************************************************************************
825 * *
826 \subsubsection{@specExpr@: the main function}
827 * *
828 ************************************************************************
829 -}
830
831 data SpecEnv
832 = SE { se_subst :: CoreSubst.Subst
833 -- We carry a substitution down:
834 -- a) we must clone any binding that might float outwards,
835 -- to avoid name clashes
836 -- b) we carry a type substitution to use when analysing
837 -- the RHS of specialised bindings (no type-let!)
838
839
840 , se_interesting :: VarSet
841 -- Dict Ids that we know something about
842 -- and hence may be worth specialising against
843 -- See Note [Interesting dictionary arguments]
844 }
845
846 specVar :: SpecEnv -> Id -> CoreExpr
847 specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v
848
849 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
850
851 ---------------- First the easy cases --------------------
852 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
853 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
854 specExpr env (Var v) = return (specVar env v, emptyUDs)
855 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
856 specExpr env (Cast e co)
857 = do { (e', uds) <- specExpr env e
858 ; return ((Cast e' (substCo env co)), uds) }
859 specExpr env (Tick tickish body)
860 = do { (body', uds) <- specExpr env body
861 ; return (Tick (specTickish env tickish) body', uds) }
862
863 ---------------- Applications might generate a call instance --------------------
864 specExpr env expr@(App {})
865 = go expr []
866 where
867 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
868 (fun', uds_app) <- go fun (arg':args)
869 return (App fun' arg', uds_arg `plusUDs` uds_app)
870
871 go (Var f) args = case specVar env f of
872 Var f' -> return (Var f', mkCallUDs env f' args)
873 e' -> return (e', emptyUDs) -- I don't expect this!
874 go other _ = specExpr env other
875
876 ---------------- Lambda/case require dumping of usage details --------------------
877 specExpr env e@(Lam _ _) = do
878 (body', uds) <- specExpr env' body
879 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
880 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
881 where
882 (bndrs, body) = collectBinders e
883 (env', bndrs') = substBndrs env bndrs
884 -- More efficient to collect a group of binders together all at once
885 -- and we don't want to split a lambda group with dumped bindings
886
887 specExpr env (Case scrut case_bndr ty alts)
888 = do { (scrut', scrut_uds) <- specExpr env scrut
889 ; (scrut'', case_bndr', alts', alts_uds)
890 <- specCase env scrut' case_bndr alts
891 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
892 , scrut_uds `plusUDs` alts_uds) }
893
894 ---------------- Finally, let is the interesting case --------------------
895 specExpr env (Let bind body)
896 = do { -- Clone binders
897 (rhs_env, body_env, bind') <- cloneBindSM env bind
898
899 -- Deal with the body
900 ; (body', body_uds) <- specExpr body_env body
901
902 -- Deal with the bindings
903 ; (binds', uds) <- specBind rhs_env bind' body_uds
904
905 -- All done
906 ; return (foldr Let body' binds', uds) }
907
908 specTickish :: SpecEnv -> Tickish Id -> Tickish Id
909 specTickish env (Breakpoint ix ids)
910 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
911 -- drop vars from the list if they have a non-variable substitution.
912 -- should never happen, but it's harmless to drop them anyway.
913 specTickish _ other_tickish = other_tickish
914
915 specCase :: SpecEnv
916 -> CoreExpr -- Scrutinee, already done
917 -> Id -> [CoreAlt]
918 -> SpecM ( CoreExpr -- New scrutinee
919 , Id
920 , [CoreAlt]
921 , UsageDetails)
922 specCase env scrut' case_bndr [(con, args, rhs)]
923 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
924 , interestingDict env scrut'
925 , not (isDeadBinder case_bndr && null sc_args')
926 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
927
928 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
929 [(con, args', Var sc_arg')]
930 | sc_arg' <- sc_args' ]
931
932 -- Extend the substitution for RHS to map the *original* binders
933 -- to their floated verions.
934 mb_sc_flts :: [Maybe DictId]
935 mb_sc_flts = map (lookupVarEnv clone_env) args'
936 clone_env = zipVarEnv sc_args' sc_args_flt
937 subst_prs = (case_bndr, Var case_bndr_flt)
938 : [ (arg, Var sc_flt)
939 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
940 env_rhs' = env_rhs { se_subst = CoreSubst.extendIdSubstList (se_subst env_rhs) subst_prs
941 , se_interesting = se_interesting env_rhs `extendVarSetList`
942 (case_bndr_flt : sc_args_flt) }
943
944 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
945 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
946 case_bndr_set = unitVarSet case_bndr_flt
947 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
948 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
949 flt_binds = scrut_bind : sc_binds
950 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
951 all_uds = flt_binds `addDictBinds` free_uds
952 alt' = (con, args', wrapDictBindsE dumped_dbs rhs')
953 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
954 where
955 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
956 sc_args' = filter is_flt_sc_arg args'
957
958 clone_me bndr = do { uniq <- getUniqueM
959 ; return (mkUserLocal occ uniq ty loc) }
960 where
961 name = idName bndr
962 ty = idType bndr
963 occ = nameOccName name
964 loc = getSrcSpan name
965
966 arg_set = mkVarSet args'
967 is_flt_sc_arg var = isId var
968 && not (isDeadBinder var)
969 && isDictTy var_ty
970 && not (tyVarsOfType var_ty `intersectsVarSet` arg_set)
971 where
972 var_ty = idType var
973
974
975 specCase env scrut case_bndr alts
976 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
977 ; return (scrut, case_bndr', alts', uds_alts) }
978 where
979 (env_alt, case_bndr') = substBndr env case_bndr
980 spec_alt (con, args, rhs) = do
981 (rhs', uds) <- specExpr env_rhs rhs
982 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
983 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
984 where
985 (env_rhs, args') = substBndrs env_alt args
986
987 {-
988 Note [Floating dictionaries out of cases]
989 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
990 Consider
991 g = \d. case d of { MkD sc ... -> ...(f sc)... }
992 Naively we can't float d2's binding out of the case expression,
993 because 'sc' is bound by the case, and that in turn means we can't
994 specialise f, which seems a pity.
995
996 So we invert the case, by floating out a binding
997 for 'sc_flt' thus:
998 sc_flt = case d of { MkD sc ... -> sc }
999 Now we can float the call instance for 'f'. Indeed this is just
1000 what'll happen if 'sc' was originally bound with a let binding,
1001 but case is more efficient, and necessary with equalities. So it's
1002 good to work with both.
1003
1004 You might think that this won't make any difference, because the
1005 call instance will only get nuked by the \d. BUT if 'g' itself is
1006 specialised, then transitively we should be able to specialise f.
1007
1008 In general, given
1009 case e of cb { MkD sc ... -> ...(f sc)... }
1010 we transform to
1011 let cb_flt = e
1012 sc_flt = case cb_flt of { MkD sc ... -> sc }
1013 in
1014 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1015
1016 The "_flt" things are the floated binds; we use the current substitution
1017 to substitute sc -> sc_flt in the RHS
1018
1019 ************************************************************************
1020 * *
1021 Dealing with a binding
1022 * *
1023 ************************************************************************
1024 -}
1025
1026 specBind :: SpecEnv -- Use this for RHSs
1027 -> CoreBind
1028 -> UsageDetails -- Info on how the scope of the binding
1029 -> SpecM ([CoreBind], -- New bindings
1030 UsageDetails) -- And info to pass upstream
1031
1032 -- Returned UsageDetails:
1033 -- No calls for binders of this bind
1034 specBind rhs_env (NonRec fn rhs) body_uds
1035 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1036 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds fn rhs
1037
1038 ; let pairs = spec_defns ++ [(fn', rhs')]
1039 -- fn' mentions the spec_defns in its rules,
1040 -- so put the latter first
1041
1042 combined_uds = body_uds1 `plusUDs` rhs_uds
1043 -- This way round a call in rhs_uds of a function f
1044 -- at type T will override a call of f at T in body_uds1; and
1045 -- that is good because it'll tend to keep "earlier" calls
1046 -- See Note [Specialisation of dictionary functions]
1047
1048 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1049 -- See Note [From non-recursive to recursive]
1050
1051 final_binds :: [DictBind]
1052 final_binds
1053 | isEmptyBag dump_dbs = [mkDB $ NonRec b r | (b,r) <- pairs]
1054 | otherwise = [flattenDictBinds dump_dbs pairs]
1055
1056 ; if float_all then
1057 -- Rather than discard the calls mentioning the bound variables
1058 -- we float this binding along with the others
1059 return ([], free_uds `snocDictBinds` final_binds)
1060 else
1061 -- No call in final_uds mentions bound variables,
1062 -- so we can just leave the binding here
1063 return (map fst final_binds, free_uds) }
1064
1065
1066 specBind rhs_env (Rec pairs) body_uds
1067 -- Note [Specialising a recursive group]
1068 = do { let (bndrs,rhss) = unzip pairs
1069 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1070 ; let scope_uds = body_uds `plusUDs` rhs_uds
1071 -- Includes binds and calls arising from rhss
1072
1073 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1074
1075 ; (bndrs3, spec_defns3, uds3)
1076 <- if null spec_defns1 -- Common case: no specialisation
1077 then return (bndrs1, [], uds1)
1078 else do { -- Specialisation occurred; do it again
1079 (bndrs2, spec_defns2, uds2)
1080 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1081 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1082
1083 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1084 bind = flattenDictBinds dumped_dbs
1085 (spec_defns3 ++ zip bndrs3 rhss')
1086
1087 ; if float_all then
1088 return ([], final_uds `snocDictBind` bind)
1089 else
1090 return ([fst bind], final_uds) }
1091
1092
1093 ---------------------------
1094 specDefns :: SpecEnv
1095 -> UsageDetails -- Info on how it is used in its scope
1096 -> [(Id,CoreExpr)] -- The things being bound and their un-processed RHS
1097 -> SpecM ([Id], -- Original Ids with RULES added
1098 [(Id,CoreExpr)], -- Extra, specialised bindings
1099 UsageDetails) -- Stuff to fling upwards from the specialised versions
1100
1101 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1102 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1103 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1104 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1105 -- But not vice versa (it's a fixpoint problem).
1106
1107 specDefns _env uds []
1108 = return ([], [], uds)
1109 specDefns env uds ((bndr,rhs):pairs)
1110 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1111 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1112 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1113
1114 ---------------------------
1115 specDefn :: SpecEnv
1116 -> UsageDetails -- Info on how it is used in its scope
1117 -> Id -> CoreExpr -- The thing being bound and its un-processed RHS
1118 -> SpecM (Id, -- Original Id with added RULES
1119 [(Id,CoreExpr)], -- Extra, specialised bindings
1120 UsageDetails) -- Stuff to fling upwards from the specialised versions
1121
1122 specDefn env body_uds fn rhs
1123 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1124 rules_for_me = idCoreRules fn
1125 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1126 calls_for_me fn rhs
1127 ; return ( fn `addIdSpecialisations` rules
1128 , spec_defns
1129 , body_uds_without_me `plusUDs` spec_uds) }
1130 -- It's important that the `plusUDs` is this way
1131 -- round, because body_uds_without_me may bind
1132 -- dictionaries that are used in calls_for_me passed
1133 -- to specDefn. So the dictionary bindings in
1134 -- spec_uds may mention dictionaries bound in
1135 -- body_uds_without_me
1136
1137 ---------------------------
1138 specCalls :: Maybe Module -- Just this_mod => specialising imported fn
1139 -- Nothing => specialising local fn
1140 -> SpecEnv
1141 -> [CoreRule] -- Existing RULES for the fn
1142 -> [CallInfo]
1143 -> Id -> CoreExpr
1144 -> SpecM ([CoreRule], -- New RULES for the fn
1145 [(Id,CoreExpr)], -- Extra, specialised bindings
1146 UsageDetails) -- New usage details from the specialised RHSs
1147
1148 -- This function checks existing rules, and does not create
1149 -- duplicate ones. So the caller does not need to do this filtering.
1150 -- See 'already_covered'
1151
1152 specCalls mb_mod env rules_for_me calls_for_me fn rhs
1153 -- The first case is the interesting one
1154 | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas
1155 && rhs_ids `lengthAtLeast` n_dicts -- and enough dict args
1156 && notNull calls_for_me -- And there are some calls to specialise
1157 && not (isNeverActive (idInlineActivation fn))
1158 -- Don't specialise NOINLINE things
1159 -- See Note [Auto-specialisation and RULES]
1160
1161 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1162 -- See Note [Inline specialisation] for why we do not
1163 -- switch off specialisation for inline functions
1164
1165 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr rules_for_me) $
1166 do { stuff <- mapM spec_call calls_for_me
1167 ; let (spec_defns, spec_uds, spec_rules) = unzip3 (catMaybes stuff)
1168 ; return (spec_rules, spec_defns, plusUDList spec_uds) }
1169
1170 | otherwise -- No calls or RHS doesn't fit our preconceptions
1171 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1172 ptext (sLit "Missed specialisation opportunity for")
1173 <+> ppr fn $$ _trace_doc )
1174 -- Note [Specialisation shape]
1175 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1176 return ([], [], emptyUDs)
1177 where
1178 _trace_doc = sep [ ppr rhs_tyvars, ppr n_tyvars
1179 , ppr rhs_ids, ppr n_dicts
1180 , ppr (idInlineActivation fn) ]
1181
1182 fn_type = idType fn
1183 fn_arity = idArity fn
1184 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1185 (tyvars, theta, _) = tcSplitSigmaTy fn_type
1186 n_tyvars = length tyvars
1187 n_dicts = length theta
1188 inl_prag = idInlinePragma fn
1189 inl_act = inlinePragmaActivation inl_prag
1190 is_local = isLocalId fn
1191
1192 -- Figure out whether the function has an INLINE pragma
1193 -- See Note [Inline specialisations]
1194
1195 (rhs_tyvars, rhs_ids, rhs_body) = collectTyAndValBinders rhs
1196
1197 rhs_dict_ids = take n_dicts rhs_ids
1198 body = mkLams (drop n_dicts rhs_ids) rhs_body
1199 -- Glue back on the non-dict lambdas
1200
1201 already_covered :: DynFlags -> [CoreExpr] -> Bool
1202 already_covered dflags args -- Note [Specialisations already covered]
1203 = isJust (lookupRule dflags
1204 (CoreSubst.substInScope (se_subst env), realIdUnfolding)
1205 (const True)
1206 fn args rules_for_me)
1207
1208 mk_ty_args :: [Maybe Type] -> [TyVar] -> [CoreExpr]
1209 mk_ty_args [] poly_tvs
1210 = ASSERT( null poly_tvs ) []
1211 mk_ty_args (Nothing : call_ts) (poly_tv : poly_tvs)
1212 = Type (mkTyVarTy poly_tv) : mk_ty_args call_ts poly_tvs
1213 mk_ty_args (Just ty : call_ts) poly_tvs
1214 = Type ty : mk_ty_args call_ts poly_tvs
1215 mk_ty_args (Nothing : _) [] = panic "mk_ty_args"
1216
1217 ----------------------------------------------------------
1218 -- Specialise to one particular call pattern
1219 spec_call :: CallInfo -- Call instance
1220 -> SpecM (Maybe ((Id,CoreExpr), -- Specialised definition
1221 UsageDetails, -- Usage details from specialised body
1222 CoreRule)) -- Info for the Id's SpecEnv
1223 spec_call (CallKey call_ts, (call_ds, _))
1224 = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts )
1225
1226 -- Suppose f's defn is f = /\ a b c -> \ d1 d2 -> rhs
1227 -- Suppose the call is for f [Just t1, Nothing, Just t3] [dx1, dx2]
1228
1229 -- Construct the new binding
1230 -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b -> rhs)
1231 -- PLUS the usage-details
1232 -- { d1' = dx1; d2' = dx2 }
1233 -- where d1', d2' are cloned versions of d1,d2, with the type substitution
1234 -- applied. These auxiliary bindings just avoid duplication of dx1, dx2
1235 --
1236 -- Note that the substitution is applied to the whole thing.
1237 -- This is convenient, but just slightly fragile. Notably:
1238 -- * There had better be no name clashes in a/b/c
1239 do { let
1240 -- poly_tyvars = [b] in the example above
1241 -- spec_tyvars = [a,c]
1242 -- ty_args = [t1,b,t3]
1243 spec_tv_binds = [(tv,ty) | (tv, Just ty) <- rhs_tyvars `zip` call_ts]
1244 env1 = extendTvSubstList env spec_tv_binds
1245 (rhs_env, poly_tyvars) = substBndrs env1
1246 [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
1247
1248 -- Clone rhs_dicts, including instantiating their types
1249 ; inst_dict_ids <- mapM (newDictBndr rhs_env) rhs_dict_ids
1250 ; let (rhs_env2, dx_binds, spec_dict_args)
1251 = bindAuxiliaryDicts rhs_env rhs_dict_ids call_ds inst_dict_ids
1252 ty_args = mk_ty_args call_ts poly_tyvars
1253 rule_args = ty_args ++ map Var inst_dict_ids
1254 rule_bndrs = poly_tyvars ++ inst_dict_ids
1255
1256 ; dflags <- getDynFlags
1257 ; if already_covered dflags rule_args then
1258 return Nothing
1259 else do
1260 { -- Figure out the type of the specialised function
1261 let body_ty = applyTypeToArgs rhs fn_type rule_args
1262 (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted
1263 | isUnLiftedType body_ty -- C.f. WwLib.mkWorkerArgs
1264 = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [voidPrimId])
1265 | otherwise = (poly_tyvars, poly_tyvars)
1266 spec_id_ty = mkPiTypes lam_args body_ty
1267
1268 ; spec_f <- newSpecIdSM fn spec_id_ty
1269 ; (spec_rhs, rhs_uds) <- specExpr rhs_env2 (mkLams lam_args body)
1270 ; this_mod <- getModule
1271 ; let
1272 -- The rule to put in the function's specialisation is:
1273 -- forall b, d1',d2'. f t1 b t3 d1' d2' = f1 b
1274 herald = case mb_mod of
1275 Nothing -- Specialising local fn
1276 -> ptext (sLit "SPEC")
1277 Just this_mod -- Specialising imoprted fn
1278 -> ptext (sLit "SPEC/") <> ppr this_mod
1279
1280 rule_name = mkFastString $ showSDocForUser dflags neverQualify $
1281 herald <+> ppr fn <+> hsep (map ppr_call_key_ty call_ts)
1282 -- This name ends up in interface files, so use showSDocForUser,
1283 -- otherwise uniques end up there, making builds
1284 -- less deterministic (See #4012 comment:61 ff)
1285
1286 spec_env_rule = mkRule
1287 this_mod
1288 True {- Auto generated -}
1289 is_local
1290 rule_name
1291 inl_act -- Note [Auto-specialisation and RULES]
1292 (idName fn)
1293 rule_bndrs
1294 rule_args
1295 (mkVarApps (Var spec_f) app_args)
1296
1297 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1298 final_uds = foldr consDictBind rhs_uds dx_binds
1299
1300 --------------------------------------
1301 -- Add a suitable unfolding if the spec_inl_prag says so
1302 -- See Note [Inline specialisations]
1303 (spec_inl_prag, spec_unf)
1304 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1305 = (neverInlinePragma, noUnfolding)
1306 -- See Note [Specialising imported functions] in OccurAnal
1307
1308 | InlinePragma { inl_inline = Inlinable } <- inl_prag
1309 = (inl_prag { inl_inline = EmptyInlineSpec }, noUnfolding)
1310
1311 | otherwise
1312 = (inl_prag, specUnfolding dflags (se_subst env)
1313 poly_tyvars (ty_args ++ spec_dict_args)
1314 fn_unf)
1315
1316 --------------------------------------
1317 -- Adding arity information just propagates it a bit faster
1318 -- See Note [Arity decrease] in Simplify
1319 -- Copy InlinePragma information from the parent Id.
1320 -- So if f has INLINE[1] so does spec_f
1321 spec_f_w_arity = spec_f `setIdArity` max 0 (fn_arity - n_dicts)
1322 `setInlinePragma` spec_inl_prag
1323 `setIdUnfolding` spec_unf
1324
1325 ; return (Just ((spec_f_w_arity, spec_rhs), final_uds, spec_env_rule)) } }
1326
1327 bindAuxiliaryDicts
1328 :: SpecEnv
1329 -> [DictId] -> [CoreExpr] -- Original dict bndrs, and the witnessing expressions
1330 -> [DictId] -- A cloned dict-id for each dict arg
1331 -> (SpecEnv, -- Substitute for all orig_dicts
1332 [DictBind], -- Auxiliary dict bindings
1333 [CoreExpr]) -- Witnessing expressions (all trivial)
1334 -- Bind any dictionary arguments to fresh names, to preserve sharing
1335 bindAuxiliaryDicts env@(SE { se_subst = subst, se_interesting = interesting })
1336 orig_dict_ids call_ds inst_dict_ids
1337 = (env', dx_binds, spec_dict_args)
1338 where
1339 (dx_binds, spec_dict_args) = go call_ds inst_dict_ids
1340 env' = env { se_subst = CoreSubst.extendIdSubstList subst (orig_dict_ids `zip` spec_dict_args)
1341 , se_interesting = interesting `unionVarSet` interesting_dicts }
1342
1343 interesting_dicts = mkVarSet [ dx_id | (NonRec dx_id dx, _) <- dx_binds
1344 , interestingDict env dx ]
1345 -- See Note [Make the new dictionaries interesting]
1346
1347 go :: [CoreExpr] -> [CoreBndr] -> ([DictBind], [CoreExpr])
1348 go [] _ = ([], [])
1349 go (dx:dxs) (dx_id:dx_ids)
1350 | exprIsTrivial dx = (dx_binds, dx:args)
1351 | otherwise = (mkDB (NonRec dx_id dx) : dx_binds, Var dx_id : args)
1352 where
1353 (dx_binds, args) = go dxs dx_ids
1354 -- In the first case extend the substitution but not bindings;
1355 -- in the latter extend the bindings but not the substitution.
1356 -- For the former, note that we bind the *original* dict in the substitution,
1357 -- overriding any d->dx_id binding put there by substBndrs
1358 go _ _ = pprPanic "bindAuxiliaryDicts" (ppr orig_dict_ids $$ ppr call_ds $$ ppr inst_dict_ids)
1359
1360 {-
1361 Note [Make the new dictionaries interesting]
1362 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1363 Important! We're going to substitute dx_id1 for d
1364 and we want it to look "interesting", else we won't gather *any*
1365 consequential calls. E.g.
1366 f d = ...g d....
1367 If we specialise f for a call (f (dfun dNumInt)), we'll get
1368 a consequent call (g d') with an auxiliary definition
1369 d' = df dNumInt
1370 We want that consequent call to look interesting
1371
1372
1373 Note [From non-recursive to recursive]
1374 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1375 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1376 have built a recursive knot
1377
1378 f a d x = <blah>
1379 MkUD { ud_binds = d7 = MkD ..f..
1380 , ud_calls = ...(f T d7)... }
1381
1382 The we generate
1383
1384 Rec { fs x = <blah>[T/a, d7/d]
1385 f a d x = <blah>
1386 RULE f T _ = fs
1387 d7 = ...f... }
1388
1389 Here the recursion is only through the RULE.
1390
1391
1392 Note [Specialisation of dictionary functions]
1393 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1394 Here is a nasty example that bit us badly: see Trac #3591
1395
1396 class Eq a => C a
1397 instance Eq [a] => C [a]
1398
1399 ---------------
1400 dfun :: Eq [a] -> C [a]
1401 dfun a d = MkD a d (meth d)
1402
1403 d4 :: Eq [T] = <blah>
1404 d2 :: C [T] = dfun T d4
1405 d1 :: Eq [T] = $p1 d2
1406 d3 :: C [T] = dfun T d1
1407
1408 None of these definitions is recursive. What happened was that we
1409 generated a specialisation:
1410
1411 RULE forall d. dfun T d = dT :: C [T]
1412 dT = (MkD a d (meth d)) [T/a, d1/d]
1413 = MkD T d1 (meth d1)
1414
1415 But now we use the RULE on the RHS of d2, to get
1416
1417 d2 = dT = MkD d1 (meth d1)
1418 d1 = $p1 d2
1419
1420 and now d1 is bottom! The problem is that when specialising 'dfun' we
1421 should first dump "below" the binding all floated dictionary bindings
1422 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1423 placed below 'dfun', and thus unavailable to it when specialising
1424 'dfun'. That in turn means that the call (dfun T d1) must be
1425 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1426 d4 doesn't mention dfun.
1427
1428 But look at this:
1429
1430 class C a where { foo,bar :: [a] -> [a] }
1431
1432 instance C Int where
1433 foo x = r_bar x
1434 bar xs = reverse xs
1435
1436 r_bar :: C a => [a] -> [a]
1437 r_bar xs = bar (xs ++ xs)
1438
1439 That translates to:
1440
1441 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1442
1443 Rec { $fCInt :: C Int = MkC foo_help reverse
1444 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1445
1446 The call (r_bar $fCInt) mentions $fCInt,
1447 which mentions foo_help,
1448 which mentions r_bar
1449 But we DO want to specialise r_bar at Int:
1450
1451 Rec { $fCInt :: C Int = MkC foo_help reverse
1452 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1453
1454 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1455 RULE r_bar Int _ = r_bar_Int
1456
1457 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1458 }
1459
1460 Note that, because of its RULE, r_bar joins the recursive
1461 group. (In this case it'll unravel a short moment later.)
1462
1463
1464 Conclusion: we catch the nasty case using filter_dfuns in
1465 callsForMe. To be honest I'm not 100% certain that this is 100%
1466 right, but it works. Sigh.
1467
1468
1469 Note [Specialising a recursive group]
1470 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1471 Consider
1472 let rec { f x = ...g x'...
1473 ; g y = ...f y'.... }
1474 in f 'a'
1475 Here we specialise 'f' at Char; but that is very likely to lead to
1476 a specialisation of 'g' at Char. We must do the latter, else the
1477 whole point of specialisation is lost.
1478
1479 But we do not want to keep iterating to a fixpoint, because in the
1480 presence of polymorphic recursion we might generate an infinite number
1481 of specialisations.
1482
1483 So we use the following heuristic:
1484 * Arrange the rec block in dependency order, so far as possible
1485 (the occurrence analyser already does this)
1486
1487 * Specialise it much like a sequence of lets
1488
1489 * Then go through the block a second time, feeding call-info from
1490 the RHSs back in the bottom, as it were
1491
1492 In effect, the ordering maxmimises the effectiveness of each sweep,
1493 and we do just two sweeps. This should catch almost every case of
1494 monomorphic recursion -- the exception could be a very knotted-up
1495 recursion with multiple cycles tied up together.
1496
1497 This plan is implemented in the Rec case of specBindItself.
1498
1499 Note [Specialisations already covered]
1500 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1501 We obviously don't want to generate two specialisations for the same
1502 argument pattern. There are two wrinkles
1503
1504 1. We do the already-covered test in specDefn, not when we generate
1505 the CallInfo in mkCallUDs. We used to test in the latter place, but
1506 we now iterate the specialiser somewhat, and the Id at the call site
1507 might therefore not have all the RULES that we can see in specDefn
1508
1509 2. What about two specialisations where the second is an *instance*
1510 of the first? If the more specific one shows up first, we'll generate
1511 specialisations for both. If the *less* specific one shows up first,
1512 we *don't* currently generate a specialisation for the more specific
1513 one. (See the call to lookupRule in already_covered.) Reasons:
1514 (a) lookupRule doesn't say which matches are exact (bad reason)
1515 (b) if the earlier specialisation is user-provided, it's
1516 far from clear that we should auto-specialise further
1517
1518 Note [Auto-specialisation and RULES]
1519 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1520 Consider:
1521 g :: Num a => a -> a
1522 g = ...
1523
1524 f :: (Int -> Int) -> Int
1525 f w = ...
1526 {-# RULE f g = 0 #-}
1527
1528 Suppose that auto-specialisation makes a specialised version of
1529 g::Int->Int That version won't appear in the LHS of the RULE for f.
1530 So if the specialisation rule fires too early, the rule for f may
1531 never fire.
1532
1533 It might be possible to add new rules, to "complete" the rewrite system.
1534 Thus when adding
1535 RULE forall d. g Int d = g_spec
1536 also add
1537 RULE f g_spec = 0
1538
1539 But that's a bit complicated. For now we ask the programmer's help,
1540 by *copying the INLINE activation pragma* to the auto-specialised
1541 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1542 will also not be active until phase 2. And that's what programmers
1543 should jolly well do anyway, even aside from specialisation, to ensure
1544 that g doesn't inline too early.
1545
1546 This in turn means that the RULE would never fire for a NOINLINE
1547 thing so not much point in generating a specialisation at all.
1548
1549 Note [Specialisation shape]
1550 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1551 We only specialise a function if it has visible top-level lambdas
1552 corresponding to its overloading. E.g. if
1553 f :: forall a. Eq a => ....
1554 then its body must look like
1555 f = /\a. \d. ...
1556
1557 Reason: when specialising the body for a call (f ty dexp), we want to
1558 substitute dexp for d, and pick up specialised calls in the body of f.
1559
1560 This doesn't always work. One example I came across was this:
1561 newtype Gen a = MkGen{ unGen :: Int -> a }
1562
1563 choose :: Eq a => a -> Gen a
1564 choose n = MkGen (\r -> n)
1565
1566 oneof = choose (1::Int)
1567
1568 It's a silly exapmle, but we get
1569 choose = /\a. g `cast` co
1570 where choose doesn't have any dict arguments. Thus far I have not
1571 tried to fix this (wait till there's a real example).
1572
1573 Mind you, then 'choose' will be inlined (since RHS is trivial) so
1574 it doesn't matter. This comes up with single-method classes
1575
1576 class C a where { op :: a -> a }
1577 instance C a => C [a] where ....
1578 ==>
1579 $fCList :: C a => C [a]
1580 $fCList = $copList |> (...coercion>...)
1581 ....(uses of $fCList at particular types)...
1582
1583 So we suppress the WARN if the rhs is trivial.
1584
1585 Note [Inline specialisations]
1586 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1587 Here is what we do with the InlinePragma of the original function
1588 * Activation/RuleMatchInfo: both transferred to the
1589 specialised function
1590 * InlineSpec:
1591 (a) An INLINE pragma is transferred
1592 (b) An INLINABLE pragma is *not* transferred
1593
1594 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
1595 specialise the function at its call site, and arguably that's not so
1596 important for the specialised copies. BUT *pragma-directed*
1597 specialisation now takes place in the typechecker/desugarer, with
1598 manually specified INLINEs. The specialisation here is automatic.
1599 It'd be very odd if a function marked INLINE was specialised (because
1600 of some local use), and then forever after (including importing
1601 modules) the specialised version wasn't INLINEd. After all, the
1602 programmer said INLINE!
1603
1604 You might wonder why we specialise INLINE functions at all. After
1605 all they should be inlined, right? Two reasons:
1606
1607 * Even INLINE functions are sometimes not inlined, when they aren't
1608 applied to interesting arguments. But perhaps the type arguments
1609 alone are enough to specialise (even though the args are too boring
1610 to trigger inlining), and it's certainly better to call the
1611 specialised version.
1612
1613 * The RHS of an INLINE function might call another overloaded function,
1614 and we'd like to generate a specialised version of that function too.
1615 This actually happens a lot. Consider
1616 replicateM_ :: (Monad m) => Int -> m a -> m ()
1617 {-# INLINABLE replicateM_ #-}
1618 replicateM_ d x ma = ...
1619 The strictness analyser may transform to
1620 replicateM_ :: (Monad m) => Int -> m a -> m ()
1621 {-# INLINE replicateM_ #-}
1622 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
1623
1624 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
1625 {-# INLINABLE $wreplicateM_ #-}
1626 $wreplicateM_ = ...
1627 Now an importing module has a specialised call to replicateM_, say
1628 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
1629 This particular example had a huge effect on the call to replicateM_
1630 in nofib/shootout/n-body.
1631
1632 Why (b): discard INLINEABLE pragmas? See Trac #4874 for persuasive examples.
1633 Suppose we have
1634 {-# INLINABLE f #-}
1635 f :: Ord a => [a] -> Int
1636 f xs = letrec f' = ...f'... in f'
1637 Then, when f is specialised and optimised we might get
1638 wgo :: [Int] -> Int#
1639 wgo = ...wgo...
1640 f_spec :: [Int] -> Int
1641 f_spec xs = case wgo xs of { r -> I# r }
1642 and we clearly want to inline f_spec at call sites. But if we still
1643 have the big, un-optimised of f (albeit specialised) captured in an
1644 INLINABLE pragma for f_spec, we won't get that optimisation.
1645
1646 So we simply drop INLINABLE pragmas when specialising. It's not really
1647 a complete solution; ignoring specalisation for now, INLINABLE functions
1648 don't get properly strictness analysed, for example. But it works well
1649 for examples involving specialisation, which is the dominant use of
1650 INLINABLE. See Trac #4874.
1651
1652
1653 ************************************************************************
1654 * *
1655 \subsubsection{UsageDetails and suchlike}
1656 * *
1657 ************************************************************************
1658 -}
1659
1660 data UsageDetails
1661 = MkUD {
1662 ud_binds :: !(Bag DictBind),
1663 -- Floated dictionary bindings
1664 -- The order is important;
1665 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
1666 -- (Remember, Bags preserve order in GHC.)
1667
1668 ud_calls :: !CallDetails
1669
1670 -- INVARIANT: suppose bs = bindersOf ud_binds
1671 -- Then 'calls' may *mention* 'bs',
1672 -- but there should be no calls *for* bs
1673 }
1674
1675 instance Outputable UsageDetails where
1676 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
1677 = ptext (sLit "MkUD") <+> braces (sep (punctuate comma
1678 [ptext (sLit "binds") <+> equals <+> ppr dbs,
1679 ptext (sLit "calls") <+> equals <+> ppr calls]))
1680
1681 -- | A 'DictBind' is a binding along with a cached set containing its free
1682 -- variables (both type variables and dictionaries)
1683 type DictBind = (CoreBind, VarSet)
1684
1685 type DictExpr = CoreExpr
1686
1687 emptyUDs :: UsageDetails
1688 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyVarEnv }
1689
1690 ------------------------------------------------------------
1691 type CallDetails = IdEnv CallInfoSet
1692 newtype CallKey = CallKey [Maybe Type] -- Nothing => unconstrained type argument
1693
1694 -- CallInfo uses a Map, thereby ensuring that
1695 -- we record only one call instance for any key
1696 --
1697 -- The list of types and dictionaries is guaranteed to
1698 -- match the type of f
1699 data CallInfoSet = CIS Id (Map CallKey ([DictExpr], VarSet))
1700 -- Range is dict args and the vars of the whole
1701 -- call (including tyvars)
1702 -- [*not* include the main id itself, of course]
1703
1704 type CallInfo = (CallKey, ([DictExpr], VarSet))
1705
1706 instance Outputable CallInfoSet where
1707 ppr (CIS fn map) = hang (ptext (sLit "CIS") <+> ppr fn)
1708 2 (ppr map)
1709
1710 pprCallInfo :: Id -> CallInfo -> SDoc
1711 pprCallInfo fn (CallKey mb_tys, (_dxs, _))
1712 = hang (ppr fn)
1713 2 (fsep (map ppr_call_key_ty mb_tys {- ++ map pprParendExpr _dxs -}))
1714
1715 ppr_call_key_ty :: Maybe Type -> SDoc
1716 ppr_call_key_ty Nothing = char '_'
1717 ppr_call_key_ty (Just ty) = char '@' <+> pprParendType ty
1718
1719 instance Outputable CallKey where
1720 ppr (CallKey ts) = ppr ts
1721
1722 -- Type isn't an instance of Ord, so that we can control which
1723 -- instance we use. That's tiresome here. Oh well
1724 instance Eq CallKey where
1725 k1 == k2 = case k1 `compare` k2 of { EQ -> True; _ -> False }
1726
1727 instance Ord CallKey where
1728 compare (CallKey k1) (CallKey k2) = cmpList cmp k1 k2
1729 where
1730 cmp Nothing Nothing = EQ
1731 cmp Nothing (Just _) = LT
1732 cmp (Just _) Nothing = GT
1733 cmp (Just t1) (Just t2) = cmpType t1 t2
1734
1735 unionCalls :: CallDetails -> CallDetails -> CallDetails
1736 unionCalls c1 c2 = plusVarEnv_C unionCallInfoSet c1 c2
1737
1738 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
1739 unionCallInfoSet (CIS f calls1) (CIS _ calls2) = CIS f (calls1 `Map.union` calls2)
1740
1741 callDetailsFVs :: CallDetails -> VarSet
1742 callDetailsFVs calls = foldVarEnv (unionVarSet . callInfoFVs) emptyVarSet calls
1743
1744 callInfoFVs :: CallInfoSet -> VarSet
1745 callInfoFVs (CIS _ call_info) = Map.foldRight (\(_,fv) vs -> unionVarSet fv vs) emptyVarSet call_info
1746
1747 ------------------------------------------------------------
1748 singleCall :: Id -> [Maybe Type] -> [DictExpr] -> UsageDetails
1749 singleCall id tys dicts
1750 = MkUD {ud_binds = emptyBag,
1751 ud_calls = unitVarEnv id $ CIS id $
1752 Map.singleton (CallKey tys) (dicts, call_fvs) }
1753 where
1754 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
1755 tys_fvs = tyVarsOfTypes (catMaybes tys)
1756 -- The type args (tys) are guaranteed to be part of the dictionary
1757 -- types, because they are just the constrained types,
1758 -- and the dictionary is therefore sure to be bound
1759 -- inside the binding for any type variables free in the type;
1760 -- hence it's safe to neglect tyvars free in tys when making
1761 -- the free-var set for this call
1762 -- BUT I don't trust this reasoning; play safe and include tys_fvs
1763 --
1764 -- We don't include the 'id' itself.
1765
1766 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
1767 mkCallUDs env f args
1768 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
1769 res
1770 where
1771 res = mkCallUDs' env f args
1772
1773 mkCallUDs' env f args
1774 | not (want_calls_for f) -- Imported from elsewhere
1775 || null theta -- Not overloaded
1776 = emptyUDs
1777
1778 | not (all type_determines_value theta)
1779 || not (spec_tys `lengthIs` n_tyvars)
1780 || not ( dicts `lengthIs` n_dicts)
1781 || not (any (interestingDict env) dicts) -- Note [Interesting dictionary arguments]
1782 -- See also Note [Specialisations already covered]
1783 = -- pprTrace "mkCallUDs: discarding" _trace_doc
1784 emptyUDs -- Not overloaded, or no specialisation wanted
1785
1786 | otherwise
1787 = -- pprTrace "mkCallUDs: keeping" _trace_doc
1788 singleCall f spec_tys dicts
1789 where
1790 _trace_doc = vcat [ppr f, ppr args, ppr n_tyvars, ppr n_dicts
1791 , ppr (map (interestingDict env) dicts)]
1792 (tyvars, theta, _) = tcSplitSigmaTy (idType f)
1793 constrained_tyvars = closeOverKinds (tyVarsOfTypes theta)
1794 n_tyvars = length tyvars
1795 n_dicts = length theta
1796
1797 spec_tys = [mk_spec_ty tv ty | (tv, Type ty) <- tyvars `zip` args]
1798 dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
1799
1800 mk_spec_ty tyvar ty
1801 | tyvar `elemVarSet` constrained_tyvars = Just ty
1802 | otherwise = Nothing
1803
1804 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
1805 -- For imported things, we gather call instances if
1806 -- there is an unfolding that we could in principle specialise
1807 -- We might still decide not to use it (consulting dflags)
1808 -- in specImports
1809 -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
1810
1811 type_determines_value pred -- See Note [Type determines value]
1812 = case classifyPredType pred of
1813 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs
1814 EqPred {} -> True
1815 IrredPred {} -> True -- Things like (D []) where D is a
1816 -- Constraint-ranged family; Trac #7785
1817
1818 {-
1819 Note [Type determines value]
1820 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1821 Only specialise if all overloading is on non-IP *class* params,
1822 because these are the ones whose *type* determines their *value*. In
1823 parrticular, with implicit params, the type args *don't* say what the
1824 value of the implicit param is! See Trac #7101
1825
1826 However, consider
1827 type family D (v::*->*) :: Constraint
1828 type instance D [] = ()
1829 f :: D v => v Char -> Int
1830 If we see a call (f "foo"), we'll pass a "dictionary"
1831 () |> (g :: () ~ D [])
1832 and it's good to specialise f at this dictionary.
1833
1834 So the question is: can an implicit parameter "hide inside" a
1835 type-family constraint like (D a). Well, no. We don't allow
1836 type instance D Maybe = ?x:Int
1837 Hence the IrredPred case in type_determines_value.
1838 See Trac #7785.
1839
1840 Note [Interesting dictionary arguments]
1841 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1842 Consider this
1843 \a.\d:Eq a. let f = ... in ...(f d)...
1844 There really is not much point in specialising f wrt the dictionary d,
1845 because the code for the specialised f is not improved at all, because
1846 d is lambda-bound. We simply get junk specialisations.
1847
1848 What is "interesting"? Just that it has *some* structure. But what about
1849 variables?
1850
1851 * A variable might be imported, in which case its unfolding
1852 will tell us whether it has useful structure
1853
1854 * Local variables are cloned on the way down (to avoid clashes when
1855 we float dictionaries), and cloning drops the unfolding
1856 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
1857 nuisance to give them unfoldings. So we keep track of the
1858 "interesting" dictionaries as a VarSet in SpecEnv.
1859 We have to take care to put any new interesting dictionary
1860 bindings in the set.
1861
1862 We accidentally lost accurate tracking of local variables for a long
1863 time, because cloned variables don't have unfoldings. But makes a
1864 massive difference in a few cases, eg Trac #5113. For nofib as a
1865 whole it's only a small win: 2.2% improvement in allocation for ansi,
1866 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
1867 -}
1868
1869 interestingDict :: SpecEnv -> CoreExpr -> Bool
1870 -- A dictionary argument is interesting if it has *some* structure
1871 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
1872 || isDataConWorkId v
1873 || v `elemVarSet` se_interesting env
1874 interestingDict _ (Type _) = False
1875 interestingDict _ (Coercion _) = False
1876 interestingDict env (App fn (Type _)) = interestingDict env fn
1877 interestingDict env (App fn (Coercion _)) = interestingDict env fn
1878 interestingDict env (Tick _ a) = interestingDict env a
1879 interestingDict env (Cast e _) = interestingDict env e
1880 interestingDict _ _ = True
1881
1882 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
1883 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
1884 (MkUD {ud_binds = db2, ud_calls = calls2})
1885 = MkUD { ud_binds = db1 `unionBags` db2
1886 , ud_calls = calls1 `unionCalls` calls2 }
1887
1888 plusUDList :: [UsageDetails] -> UsageDetails
1889 plusUDList = foldr plusUDs emptyUDs
1890
1891 -----------------------------
1892 _dictBindBndrs :: Bag DictBind -> [Id]
1893 _dictBindBndrs dbs = foldrBag ((++) . bindersOf . fst) [] dbs
1894
1895 -- | Construct a 'DictBind' from a 'CoreBind'
1896 mkDB :: CoreBind -> DictBind
1897 mkDB bind = (bind, bind_fvs bind)
1898
1899 -- | Identify the free variables of a 'CoreBind'
1900 bind_fvs :: CoreBind -> VarSet
1901 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
1902 bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs
1903 where
1904 bndrs = map fst prs
1905 rhs_fvs = unionVarSets (map pair_fvs prs)
1906
1907 pair_fvs :: (Id, CoreExpr) -> VarSet
1908 pair_fvs (bndr, rhs) = exprFreeVars rhs `unionVarSet` idFreeVars bndr
1909 -- Don't forget variables mentioned in the
1910 -- rules of the bndr. C.f. OccAnal.addRuleUsage
1911 -- Also tyvars mentioned in its type; they may not appear in the RHS
1912 -- type T a = Int
1913 -- x :: T a = 3
1914
1915 -- | Flatten a set of 'DictBind's and some other binding pairs into a single
1916 -- recursive binding, including some additional bindings.
1917 flattenDictBinds :: Bag DictBind -> [(Id,CoreExpr)] -> DictBind
1918 flattenDictBinds dbs pairs
1919 = (Rec bindings, fvs)
1920 where
1921 (bindings, fvs) = foldrBag add
1922 ([], emptyVarSet)
1923 (dbs `snocBag` mkDB (Rec pairs))
1924 add (NonRec b r, fvs') (pairs, fvs) =
1925 ((b,r) : pairs, fvs `unionVarSet` fvs')
1926 add (Rec prs1, fvs') (pairs, fvs) =
1927 (prs1 ++ pairs, fvs `unionVarSet` fvs')
1928
1929 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
1930 -- Add ud_binds to the tail end of the bindings in uds
1931 snocDictBinds uds dbs
1932 = uds { ud_binds = ud_binds uds `unionBags`
1933 foldr consBag emptyBag dbs }
1934
1935 consDictBind :: DictBind -> UsageDetails -> UsageDetails
1936 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
1937
1938 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
1939 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
1940
1941 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
1942 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
1943
1944 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
1945 wrapDictBinds dbs binds
1946 = foldrBag add binds dbs
1947 where
1948 add (bind,_) binds = bind : binds
1949
1950 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
1951 wrapDictBindsE dbs expr
1952 = foldrBag add expr dbs
1953 where
1954 add (bind,_) expr = Let bind expr
1955
1956 ----------------------
1957 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
1958 -- Used at a lambda or case binder; just dump anything mentioning the binder
1959 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
1960 | null bndrs = (uds, emptyBag) -- Common in case alternatives
1961 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
1962 (free_uds, dump_dbs)
1963 where
1964 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
1965 bndr_set = mkVarSet bndrs
1966 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
1967 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
1968 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
1969 -- no calls for any of the dicts in dump_dbs
1970
1971 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
1972 -- Used at a lambda or case binder; just dump anything mentioning the binder
1973 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
1974 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
1975 (free_uds, dump_dbs, float_all)
1976 where
1977 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
1978 bndr_set = mkVarSet bndrs
1979 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
1980 free_calls = deleteCallsFor bndrs orig_calls
1981 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
1982
1983 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
1984 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
1985 = -- pprTrace ("callsForMe")
1986 -- (vcat [ppr fn,
1987 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
1988 -- text "Orig calls =" <+> ppr orig_calls,
1989 -- text "Dep set =" <+> ppr dep_set,
1990 -- text "Calls for me =" <+> ppr calls_for_me]) $
1991 (uds_without_me, calls_for_me)
1992 where
1993 uds_without_me = MkUD { ud_binds = orig_dbs, ud_calls = delVarEnv orig_calls fn }
1994 calls_for_me = case lookupVarEnv orig_calls fn of
1995 Nothing -> []
1996 Just (CIS _ calls) -> filter_dfuns (Map.toList calls)
1997
1998 dep_set = foldlBag go (unitVarSet fn) orig_dbs
1999 go dep_set (db,fvs) | fvs `intersectsVarSet` dep_set
2000 = extendVarSetList dep_set (bindersOf db)
2001 | otherwise = dep_set
2002
2003 -- Note [Specialisation of dictionary functions]
2004 filter_dfuns | isDFunId fn = filter ok_call
2005 | otherwise = \cs -> cs
2006
2007 ok_call (_, (_,fvs)) = not (fvs `intersectsVarSet` dep_set)
2008
2009 ----------------------
2010 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2011 -- Returns (free_dbs, dump_dbs, dump_set)
2012 splitDictBinds dbs bndr_set
2013 = foldlBag split_db (emptyBag, emptyBag, bndr_set) dbs
2014 -- Important that it's foldl not foldr;
2015 -- we're accumulating the set of dumped ids in dump_set
2016 where
2017 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
2018 | dump_idset `intersectsVarSet` fvs -- Dump it
2019 = (free_dbs, dump_dbs `snocBag` db,
2020 extendVarSetList dump_idset (bindersOf bind))
2021
2022 | otherwise -- Don't dump it
2023 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2024
2025
2026 ----------------------
2027 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2028 -- Remove calls *mentioning* bs
2029 deleteCallsMentioning bs calls
2030 = mapVarEnv filter_calls calls
2031 where
2032 filter_calls :: CallInfoSet -> CallInfoSet
2033 filter_calls (CIS f calls) = CIS f (Map.filter keep_call calls)
2034 keep_call (_, fvs) = not (fvs `intersectsVarSet` bs)
2035
2036 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
2037 -- Remove calls *for* bs
2038 deleteCallsFor bs calls = delVarEnvList calls bs
2039
2040 {-
2041 ************************************************************************
2042 * *
2043 \subsubsection{Boring helper functions}
2044 * *
2045 ************************************************************************
2046 -}
2047
2048 newtype SpecM a = SpecM (State SpecState a)
2049
2050 data SpecState = SpecState {
2051 spec_uniq_supply :: UniqSupply,
2052 spec_module :: Module,
2053 spec_dflags :: DynFlags
2054 }
2055
2056 instance Functor SpecM where
2057 fmap = liftM
2058
2059 instance Applicative SpecM where
2060 pure = return
2061 (<*>) = ap
2062
2063 instance Monad SpecM where
2064 SpecM x >>= f = SpecM $ do y <- x
2065 case f y of
2066 SpecM z ->
2067 z
2068 return x = SpecM $ return x
2069 fail str = SpecM $ fail str
2070
2071 instance MonadUnique SpecM where
2072 getUniqueSupplyM
2073 = SpecM $ do st <- get
2074 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
2075 put $ st { spec_uniq_supply = us2 }
2076 return us1
2077
2078 getUniqueM
2079 = SpecM $ do st <- get
2080 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
2081 put $ st { spec_uniq_supply = us' }
2082 return u
2083
2084 instance HasDynFlags SpecM where
2085 getDynFlags = SpecM $ liftM spec_dflags get
2086
2087 instance HasModule SpecM where
2088 getModule = SpecM $ liftM spec_module get
2089
2090 runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2091 runSpecM dflags this_mod (SpecM spec)
2092 = do us <- getUniqueSupplyM
2093 let initialState = SpecState {
2094 spec_uniq_supply = us,
2095 spec_module = this_mod,
2096 spec_dflags = dflags
2097 }
2098 return $ evalState spec initialState
2099
2100 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2101 mapAndCombineSM _ [] = return ([], emptyUDs)
2102 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2103 (ys, uds2) <- mapAndCombineSM f xs
2104 return (y:ys, uds1 `plusUDs` uds2)
2105
2106 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2107 extendTvSubstList env tv_binds
2108 = env { se_subst = CoreSubst.extendTvSubstList (se_subst env) tv_binds }
2109
2110 substTy :: SpecEnv -> Type -> Type
2111 substTy env ty = CoreSubst.substTy (se_subst env) ty
2112
2113 substCo :: SpecEnv -> Coercion -> Coercion
2114 substCo env co = CoreSubst.substCo (se_subst env) co
2115
2116 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2117 substBndr env bs = case CoreSubst.substBndr (se_subst env) bs of
2118 (subst', bs') -> (env { se_subst = subst' }, bs')
2119
2120 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2121 substBndrs env bs = case CoreSubst.substBndrs (se_subst env) bs of
2122 (subst', bs') -> (env { se_subst = subst' }, bs')
2123
2124 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2125 -- Clone the binders of the bind; return new bind with the cloned binders
2126 -- Return the substitution to use for RHSs, and the one to use for the body
2127 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2128 = do { us <- getUniqueSupplyM
2129 ; let (subst', bndr') = CoreSubst.cloneIdBndr subst us bndr
2130 interesting' | interestingDict env rhs
2131 = interesting `extendVarSet` bndr'
2132 | otherwise = interesting
2133 ; return (env, env { se_subst = subst', se_interesting = interesting' }
2134 , NonRec bndr' rhs) }
2135
2136 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2137 = do { us <- getUniqueSupplyM
2138 ; let (subst', bndrs') = CoreSubst.cloneRecIdBndrs subst us (map fst pairs)
2139 env' = env { se_subst = subst'
2140 , se_interesting = interesting `extendVarSetList`
2141 [ v | (v,r) <- pairs, interestingDict env r ] }
2142 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2143
2144 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2145 -- Make up completely fresh binders for the dictionaries
2146 -- Their bindings are going to float outwards
2147 newDictBndr env b = do { uniq <- getUniqueM
2148 ; let n = idName b
2149 ty' = substTy env (idType b)
2150 ; return (mkUserLocal (nameOccName n) uniq ty' (getSrcSpan n)) }
2151
2152 newSpecIdSM :: Id -> Type -> SpecM Id
2153 -- Give the new Id a similar occurrence name to the old one
2154 newSpecIdSM old_id new_ty
2155 = do { uniq <- getUniqueM
2156 ; let name = idName old_id
2157 new_occ = mkSpecOcc (nameOccName name)
2158 new_id = mkUserLocal new_occ uniq new_ty (getSrcSpan name)
2159 ; return new_id }
2160
2161 {-
2162 Old (but interesting) stuff about unboxed bindings
2163 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2164
2165 What should we do when a value is specialised to a *strict* unboxed value?
2166
2167 map_*_* f (x:xs) = let h = f x
2168 t = map f xs
2169 in h:t
2170
2171 Could convert let to case:
2172
2173 map_*_Int# f (x:xs) = case f x of h# ->
2174 let t = map f xs
2175 in h#:t
2176
2177 This may be undesirable since it forces evaluation here, but the value
2178 may not be used in all branches of the body. In the general case this
2179 transformation is impossible since the mutual recursion in a letrec
2180 cannot be expressed as a case.
2181
2182 There is also a problem with top-level unboxed values, since our
2183 implementation cannot handle unboxed values at the top level.
2184
2185 Solution: Lift the binding of the unboxed value and extract it when it
2186 is used:
2187
2188 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2189 t = map f xs
2190 in case h of
2191 _Lift h# -> h#:t
2192
2193 Now give it to the simplifier and the _Lifting will be optimised away.
2194
2195 The benfit is that we have given the specialised "unboxed" values a
2196 very simplep lifted semantics and then leave it up to the simplifier to
2197 optimise it --- knowing that the overheads will be removed in nearly
2198 all cases.
2199
2200 In particular, the value will only be evaluted in the branches of the
2201 program which use it, rather than being forced at the point where the
2202 value is bound. For example:
2203
2204 filtermap_*_* p f (x:xs)
2205 = let h = f x
2206 t = ...
2207 in case p x of
2208 True -> h:t
2209 False -> t
2210 ==>
2211 filtermap_*_Int# p f (x:xs)
2212 = let h = case (f x) of h# -> _Lift h#
2213 t = ...
2214 in case p x of
2215 True -> case h of _Lift h#
2216 -> h#:t
2217 False -> t
2218
2219 The binding for h can still be inlined in the one branch and the
2220 _Lifting eliminated.
2221
2222
2223 Question: When won't the _Lifting be eliminated?
2224
2225 Answer: When they at the top-level (where it is necessary) or when
2226 inlining would duplicate work (or possibly code depending on
2227 options). However, the _Lifting will still be eliminated if the
2228 strictness analyser deems the lifted binding strict.
2229 -}