Imported ids without signatures are still inferred.
[ghc.git] / compiler / specialise / Specialise.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5 -}
6
7 {-# LANGUAGE CPP #-}
8 module Specialise ( specProgram, specUnfolding ) where
9
10 #include "HsVersions.h"
11
12 import Id
13 import TcType hiding( substTy, extendTvSubstList )
14 import Type hiding( substTy, extendTvSubstList )
15 import Coercion( Coercion )
16 import Module( Module, HasModule(..) )
17 import CoreMonad
18 import qualified CoreSubst
19 import CoreUnfold
20 import VarSet
21 import VarEnv
22 import CoreSyn
23 import Rules
24 import PprCore ( pprParendExpr )
25 import CoreUtils ( exprIsTrivial, applyTypeToArgs )
26 import CoreFVs ( exprFreeVars, exprsFreeVars, idFreeVars )
27 import UniqSupply
28 import Name
29 import MkId ( voidArgId, voidPrimId )
30 import Maybes ( catMaybes, isJust )
31 import BasicTypes
32 import HscTypes
33 import Bag
34 import DynFlags
35 import Util
36 import Outputable
37 import FastString
38 import State
39
40 #if __GLASGOW_HASKELL__ < 709
41 import Control.Applicative (Applicative(..))
42 #endif
43 import Control.Monad
44 import Data.Map (Map)
45 import qualified Data.Map as Map
46 import qualified FiniteMap as Map
47
48 {-
49 ************************************************************************
50 * *
51 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
52 * *
53 ************************************************************************
54
55 These notes describe how we implement specialisation to eliminate
56 overloading.
57
58 The specialisation pass works on Core
59 syntax, complete with all the explicit dictionary application,
60 abstraction and construction as added by the type checker. The
61 existing type checker remains largely as it is.
62
63 One important thought: the {\em types} passed to an overloaded
64 function, and the {\em dictionaries} passed are mutually redundant.
65 If the same function is applied to the same type(s) then it is sure to
66 be applied to the same dictionary(s)---or rather to the same {\em
67 values}. (The arguments might look different but they will evaluate
68 to the same value.)
69
70 Second important thought: we know that we can make progress by
71 treating dictionary arguments as static and worth specialising on. So
72 we can do without binding-time analysis, and instead specialise on
73 dictionary arguments and no others.
74
75 The basic idea
76 ~~~~~~~~~~~~~~
77 Suppose we have
78
79 let f = <f_rhs>
80 in <body>
81
82 and suppose f is overloaded.
83
84 STEP 1: CALL-INSTANCE COLLECTION
85
86 We traverse <body>, accumulating all applications of f to types and
87 dictionaries.
88
89 (Might there be partial applications, to just some of its types and
90 dictionaries? In principle yes, but in practice the type checker only
91 builds applications of f to all its types and dictionaries, so partial
92 applications could only arise as a result of transformation, and even
93 then I think it's unlikely. In any case, we simply don't accumulate such
94 partial applications.)
95
96
97 STEP 2: EQUIVALENCES
98
99 So now we have a collection of calls to f:
100 f t1 t2 d1 d2
101 f t3 t4 d3 d4
102 ...
103 Notice that f may take several type arguments. To avoid ambiguity, we
104 say that f is called at type t1/t2 and t3/t4.
105
106 We take equivalence classes using equality of the *types* (ignoring
107 the dictionary args, which as mentioned previously are redundant).
108
109 STEP 3: SPECIALISATION
110
111 For each equivalence class, choose a representative (f t1 t2 d1 d2),
112 and create a local instance of f, defined thus:
113
114 f@t1/t2 = <f_rhs> t1 t2 d1 d2
115
116 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
117 of simplification will now result. However we don't actually *do* that
118 simplification. Rather, we leave it for the simplifier to do. If we
119 *did* do it, though, we'd get more call instances from the specialised
120 RHS. We can work out what they are by instantiating the call-instance
121 set from f's RHS with the types t1, t2.
122
123 Add this new id to f's IdInfo, to record that f has a specialised version.
124
125 Before doing any of this, check that f's IdInfo doesn't already
126 tell us about an existing instance of f at the required type/s.
127 (This might happen if specialisation was applied more than once, or
128 it might arise from user SPECIALIZE pragmas.)
129
130 Recursion
131 ~~~~~~~~~
132 Wait a minute! What if f is recursive? Then we can't just plug in
133 its right-hand side, can we?
134
135 But it's ok. The type checker *always* creates non-recursive definitions
136 for overloaded recursive functions. For example:
137
138 f x = f (x+x) -- Yes I know its silly
139
140 becomes
141
142 f a (d::Num a) = let p = +.sel a d
143 in
144 letrec fl (y::a) = fl (p y y)
145 in
146 fl
147
148 We still have recusion for non-overloaded functions which we
149 speciailise, but the recursive call should get specialised to the
150 same recursive version.
151
152
153 Polymorphism 1
154 ~~~~~~~~~~~~~~
155
156 All this is crystal clear when the function is applied to *constant
157 types*; that is, types which have no type variables inside. But what if
158 it is applied to non-constant types? Suppose we find a call of f at type
159 t1/t2. There are two possibilities:
160
161 (a) The free type variables of t1, t2 are in scope at the definition point
162 of f. In this case there's no problem, we proceed just as before. A common
163 example is as follows. Here's the Haskell:
164
165 g y = let f x = x+x
166 in f y + f y
167
168 After typechecking we have
169
170 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
171 in +.sel a d (f a d y) (f a d y)
172
173 Notice that the call to f is at type type "a"; a non-constant type.
174 Both calls to f are at the same type, so we can specialise to give:
175
176 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
177 in +.sel a d (f@a y) (f@a y)
178
179
180 (b) The other case is when the type variables in the instance types
181 are *not* in scope at the definition point of f. The example we are
182 working with above is a good case. There are two instances of (+.sel a d),
183 but "a" is not in scope at the definition of +.sel. Can we do anything?
184 Yes, we can "common them up", a sort of limited common sub-expression deal.
185 This would give:
186
187 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
188 f@a (x::a) = +.sel@a x x
189 in +.sel@a (f@a y) (f@a y)
190
191 This can save work, and can't be spotted by the type checker, because
192 the two instances of +.sel weren't originally at the same type.
193
194 Further notes on (b)
195
196 * There are quite a few variations here. For example, the defn of
197 +.sel could be floated ouside the \y, to attempt to gain laziness.
198 It certainly mustn't be floated outside the \d because the d has to
199 be in scope too.
200
201 * We don't want to inline f_rhs in this case, because
202 that will duplicate code. Just commoning up the call is the point.
203
204 * Nothing gets added to +.sel's IdInfo.
205
206 * Don't bother unless the equivalence class has more than one item!
207
208 Not clear whether this is all worth it. It is of course OK to
209 simply discard call-instances when passing a big lambda.
210
211 Polymorphism 2 -- Overloading
212 ~~~~~~~~~~~~~~
213 Consider a function whose most general type is
214
215 f :: forall a b. Ord a => [a] -> b -> b
216
217 There is really no point in making a version of g at Int/Int and another
218 at Int/Bool, because it's only instancing the type variable "a" which
219 buys us any efficiency. Since g is completely polymorphic in b there
220 ain't much point in making separate versions of g for the different
221 b types.
222
223 That suggests that we should identify which of g's type variables
224 are constrained (like "a") and which are unconstrained (like "b").
225 Then when taking equivalence classes in STEP 2, we ignore the type args
226 corresponding to unconstrained type variable. In STEP 3 we make
227 polymorphic versions. Thus:
228
229 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
230
231 We do this.
232
233
234 Dictionary floating
235 ~~~~~~~~~~~~~~~~~~~
236 Consider this
237
238 f a (d::Num a) = let g = ...
239 in
240 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
241
242 Here, g is only called at one type, but the dictionary isn't in scope at the
243 definition point for g. Usually the type checker would build a
244 definition for d1 which enclosed g, but the transformation system
245 might have moved d1's defn inward. Solution: float dictionary bindings
246 outwards along with call instances.
247
248 Consider
249
250 f x = let g p q = p==q
251 h r s = (r+s, g r s)
252 in
253 h x x
254
255
256 Before specialisation, leaving out type abstractions we have
257
258 f df x = let g :: Eq a => a -> a -> Bool
259 g dg p q = == dg p q
260 h :: Num a => a -> a -> (a, Bool)
261 h dh r s = let deq = eqFromNum dh
262 in (+ dh r s, g deq r s)
263 in
264 h df x x
265
266 After specialising h we get a specialised version of h, like this:
267
268 h' r s = let deq = eqFromNum df
269 in (+ df r s, g deq r s)
270
271 But we can't naively make an instance for g from this, because deq is not in scope
272 at the defn of g. Instead, we have to float out the (new) defn of deq
273 to widen its scope. Notice that this floating can't be done in advance -- it only
274 shows up when specialisation is done.
275
276 User SPECIALIZE pragmas
277 ~~~~~~~~~~~~~~~~~~~~~~~
278 Specialisation pragmas can be digested by the type checker, and implemented
279 by adding extra definitions along with that of f, in the same way as before
280
281 f@t1/t2 = <f_rhs> t1 t2 d1 d2
282
283 Indeed the pragmas *have* to be dealt with by the type checker, because
284 only it knows how to build the dictionaries d1 and d2! For example
285
286 g :: Ord a => [a] -> [a]
287 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
288
289 Here, the specialised version of g is an application of g's rhs to the
290 Ord dictionary for (Tree Int), which only the type checker can conjure
291 up. There might not even *be* one, if (Tree Int) is not an instance of
292 Ord! (All the other specialision has suitable dictionaries to hand
293 from actual calls.)
294
295 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
296 it is buried in a complex (as-yet-un-desugared) binding group.
297 Maybe we should say
298
299 f@t1/t2 = f* t1 t2 d1 d2
300
301 where f* is the Id f with an IdInfo which says "inline me regardless!".
302 Indeed all the specialisation could be done in this way.
303 That in turn means that the simplifier has to be prepared to inline absolutely
304 any in-scope let-bound thing.
305
306
307 Again, the pragma should permit polymorphism in unconstrained variables:
308
309 h :: Ord a => [a] -> b -> b
310 {-# SPECIALIZE h :: [Int] -> b -> b #-}
311
312 We *insist* that all overloaded type variables are specialised to ground types,
313 (and hence there can be no context inside a SPECIALIZE pragma).
314 We *permit* unconstrained type variables to be specialised to
315 - a ground type
316 - or left as a polymorphic type variable
317 but nothing in between. So
318
319 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
320
321 is *illegal*. (It can be handled, but it adds complication, and gains the
322 programmer nothing.)
323
324
325 SPECIALISING INSTANCE DECLARATIONS
326 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
327 Consider
328
329 instance Foo a => Foo [a] where
330 ...
331 {-# SPECIALIZE instance Foo [Int] #-}
332
333 The original instance decl creates a dictionary-function
334 definition:
335
336 dfun.Foo.List :: forall a. Foo a -> Foo [a]
337
338 The SPECIALIZE pragma just makes a specialised copy, just as for
339 ordinary function definitions:
340
341 dfun.Foo.List@Int :: Foo [Int]
342 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
343
344 The information about what instance of the dfun exist gets added to
345 the dfun's IdInfo in the same way as a user-defined function too.
346
347
348 Automatic instance decl specialisation?
349 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
350 Can instance decls be specialised automatically? It's tricky.
351 We could collect call-instance information for each dfun, but
352 then when we specialised their bodies we'd get new call-instances
353 for ordinary functions; and when we specialised their bodies, we might get
354 new call-instances of the dfuns, and so on. This all arises because of
355 the unrestricted mutual recursion between instance decls and value decls.
356
357 Still, there's no actual problem; it just means that we may not do all
358 the specialisation we could theoretically do.
359
360 Furthermore, instance decls are usually exported and used non-locally,
361 so we'll want to compile enough to get those specialisations done.
362
363 Lastly, there's no such thing as a local instance decl, so we can
364 survive solely by spitting out *usage* information, and then reading that
365 back in as a pragma when next compiling the file. So for now,
366 we only specialise instance decls in response to pragmas.
367
368
369 SPITTING OUT USAGE INFORMATION
370 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
371
372 To spit out usage information we need to traverse the code collecting
373 call-instance information for all imported (non-prelude?) functions
374 and data types. Then we equivalence-class it and spit it out.
375
376 This is done at the top-level when all the call instances which escape
377 must be for imported functions and data types.
378
379 *** Not currently done ***
380
381
382 Partial specialisation by pragmas
383 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
384 What about partial specialisation:
385
386 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
387 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
388
389 or even
390
391 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
392
393 Seems quite reasonable. Similar things could be done with instance decls:
394
395 instance (Foo a, Foo b) => Foo (a,b) where
396 ...
397 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
398 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
399
400 Ho hum. Things are complex enough without this. I pass.
401
402
403 Requirements for the simplifer
404 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
405 The simplifier has to be able to take advantage of the specialisation.
406
407 * When the simplifier finds an application of a polymorphic f, it looks in
408 f's IdInfo in case there is a suitable instance to call instead. This converts
409
410 f t1 t2 d1 d2 ===> f_t1_t2
411
412 Note that the dictionaries get eaten up too!
413
414 * Dictionary selection operations on constant dictionaries must be
415 short-circuited:
416
417 +.sel Int d ===> +Int
418
419 The obvious way to do this is in the same way as other specialised
420 calls: +.sel has inside it some IdInfo which tells that if it's applied
421 to the type Int then it should eat a dictionary and transform to +Int.
422
423 In short, dictionary selectors need IdInfo inside them for constant
424 methods.
425
426 * Exactly the same applies if a superclass dictionary is being
427 extracted:
428
429 Eq.sel Int d ===> dEqInt
430
431 * Something similar applies to dictionary construction too. Suppose
432 dfun.Eq.List is the function taking a dictionary for (Eq a) to
433 one for (Eq [a]). Then we want
434
435 dfun.Eq.List Int d ===> dEq.List_Int
436
437 Where does the Eq [Int] dictionary come from? It is built in
438 response to a SPECIALIZE pragma on the Eq [a] instance decl.
439
440 In short, dfun Ids need IdInfo with a specialisation for each
441 constant instance of their instance declaration.
442
443 All this uses a single mechanism: the SpecEnv inside an Id
444
445
446 What does the specialisation IdInfo look like?
447 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
448
449 The SpecEnv of an Id maps a list of types (the template) to an expression
450
451 [Type] |-> Expr
452
453 For example, if f has this SpecInfo:
454
455 [Int, a] -> \d:Ord Int. f' a
456
457 it means that we can replace the call
458
459 f Int t ===> (\d. f' t)
460
461 This chucks one dictionary away and proceeds with the
462 specialised version of f, namely f'.
463
464
465 What can't be done this way?
466 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
467 There is no way, post-typechecker, to get a dictionary for (say)
468 Eq a from a dictionary for Eq [a]. So if we find
469
470 ==.sel [t] d
471
472 we can't transform to
473
474 eqList (==.sel t d')
475
476 where
477 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
478
479 Of course, we currently have no way to automatically derive
480 eqList, nor to connect it to the Eq [a] instance decl, but you
481 can imagine that it might somehow be possible. Taking advantage
482 of this is permanently ruled out.
483
484 Still, this is no great hardship, because we intend to eliminate
485 overloading altogether anyway!
486
487 A note about non-tyvar dictionaries
488 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
489 Some Ids have types like
490
491 forall a,b,c. Eq a -> Ord [a] -> tau
492
493 This seems curious at first, because we usually only have dictionary
494 args whose types are of the form (C a) where a is a type variable.
495 But this doesn't hold for the functions arising from instance decls,
496 which sometimes get arguments with types of form (C (T a)) for some
497 type constructor T.
498
499 Should we specialise wrt this compound-type dictionary? We used to say
500 "no", saying:
501 "This is a heuristic judgement, as indeed is the fact that we
502 specialise wrt only dictionaries. We choose *not* to specialise
503 wrt compound dictionaries because at the moment the only place
504 they show up is in instance decls, where they are simply plugged
505 into a returned dictionary. So nothing is gained by specialising
506 wrt them."
507
508 But it is simpler and more uniform to specialise wrt these dicts too;
509 and in future GHC is likely to support full fledged type signatures
510 like
511 f :: Eq [(a,b)] => ...
512
513
514 ************************************************************************
515 * *
516 \subsubsection{The new specialiser}
517 * *
518 ************************************************************************
519
520 Our basic game plan is this. For let(rec) bound function
521 f :: (C a, D c) => (a,b,c,d) -> Bool
522
523 * Find any specialised calls of f, (f ts ds), where
524 ts are the type arguments t1 .. t4, and
525 ds are the dictionary arguments d1 .. d2.
526
527 * Add a new definition for f1 (say):
528
529 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
530
531 Note that we abstract over the unconstrained type arguments.
532
533 * Add the mapping
534
535 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
536
537 to the specialisations of f. This will be used by the
538 simplifier to replace calls
539 (f t1 t2 t3 t4) da db
540 by
541 (\d1 d1 -> f1 t2 t4) da db
542
543 All the stuff about how many dictionaries to discard, and what types
544 to apply the specialised function to, are handled by the fact that the
545 SpecEnv contains a template for the result of the specialisation.
546
547 We don't build *partial* specialisations for f. For example:
548
549 f :: Eq a => a -> a -> Bool
550 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
551
552 Here, little is gained by making a specialised copy of f.
553 There's a distinct danger that the specialised version would
554 first build a dictionary for (Eq b, Eq c), and then select the (==)
555 method from it! Even if it didn't, not a great deal is saved.
556
557 We do, however, generate polymorphic, but not overloaded, specialisations:
558
559 f :: Eq a => [a] -> b -> b -> b
560 ... SPECIALISE f :: [Int] -> b -> b -> b ...
561
562 Hence, the invariant is this:
563
564 *** no specialised version is overloaded ***
565
566
567 ************************************************************************
568 * *
569 \subsubsection{The exported function}
570 * *
571 ************************************************************************
572 -}
573
574 specProgram :: ModGuts -> CoreM ModGuts
575 specProgram guts@(ModGuts { mg_module = this_mod
576 , mg_rules = local_rules
577 , mg_binds = binds })
578 = do { dflags <- getDynFlags
579
580 -- Specialise the bindings of this module
581 ; (binds', uds) <- runSpecM dflags this_mod (go binds)
582
583 -- Specialise imported functions
584 ; hpt_rules <- getRuleBase
585 ; let rule_base = extendRuleBaseList hpt_rules local_rules
586 ; (new_rules, spec_binds) <- specImports dflags this_mod emptyVarSet rule_base uds
587
588 ; let final_binds | null spec_binds = binds'
589 | otherwise = Rec (flattenBinds spec_binds) : binds'
590 -- Note [Glom the bindings if imported functions are specialised]
591
592 ; return (guts { mg_binds = final_binds
593 , mg_rules = new_rules ++ local_rules }) }
594 where
595 -- We need to start with a Subst that knows all the things
596 -- that are in scope, so that the substitution engine doesn't
597 -- accidentally re-use a unique that's already in use
598 -- Easiest thing is to do it all at once, as if all the top-level
599 -- decls were mutually recursive
600 top_subst = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
601 bindersOfBinds binds
602 , se_interesting = emptyVarSet }
603
604 go [] = return ([], emptyUDs)
605 go (bind:binds) = do (binds', uds) <- go binds
606 (bind', uds') <- specBind top_subst bind uds
607 return (bind' ++ binds', uds')
608
609 specImports :: DynFlags
610 -> Module
611 -> VarSet -- Don't specialise these ones
612 -- See Note [Avoiding recursive specialisation]
613 -> RuleBase -- Rules from this module and the home package
614 -- (but not external packages, which can change)
615 -> UsageDetails -- Calls for imported things, and floating bindings
616 -> CoreM ( [CoreRule] -- New rules
617 , [CoreBind] ) -- Specialised bindings and floating bindings
618 specImports dflags this_mod done rule_base uds
619 = do { let import_calls = varEnvElts (ud_calls uds)
620 ; (rules, spec_binds) <- go rule_base import_calls
621 ; return (rules, wrapDictBinds (ud_binds uds) spec_binds) }
622 where
623 go _ [] = return ([], [])
624 go rb (CIS fn calls_for_fn : other_calls)
625 = do { (rules1, spec_binds1) <- specImport dflags this_mod done rb fn $
626 Map.toList calls_for_fn
627 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
628 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
629
630 specImport :: DynFlags
631 -> Module
632 -> VarSet -- Don't specialise these
633 -- See Note [Avoiding recursive specialisation]
634 -> RuleBase -- Rules from this module
635 -> Id -> [CallInfo] -- Imported function and calls for it
636 -> CoreM ( [CoreRule] -- New rules
637 , [CoreBind] ) -- Specialised bindings
638 specImport dflags this_mod done rb fn calls_for_fn
639 | fn `elemVarSet` done
640 = return ([], []) -- No warning. This actually happens all the time
641 -- when specialising a recursive function, because
642 -- the RHS of the specialised function contains a recursive
643 -- call to the original function
644
645 | null calls_for_fn -- We filtered out all the calls in deleteCallsMentioning
646 = return ([], [])
647
648 | wantSpecImport dflags unfolding
649 , Just rhs <- maybeUnfoldingTemplate unfolding
650 = do { -- Get rules from the external package state
651 -- We keep doing this in case we "page-fault in"
652 -- more rules as we go along
653 ; hsc_env <- getHscEnv
654 ; eps <- liftIO $ hscEPS hsc_env
655 ; vis_orphs <- getVisibleOrphanMods
656 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
657 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
658
659 ; (rules1, spec_pairs, uds) <- runSpecM dflags this_mod $
660 specCalls (Just this_mod) emptySpecEnv rules_for_fn calls_for_fn fn rhs
661 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
662 -- After the rules kick in we may get recursion, but
663 -- we rely on a global GlomBinds to sort that out later
664 -- See Note [Glom the bindings if imported functions are specialised]
665
666 -- Now specialise any cascaded calls
667 ; (rules2, spec_binds2) <- -- pprTrace "specImport" (ppr fn $$ ppr uds $$ ppr rhs) $
668 specImports dflags this_mod (extendVarSet done fn)
669 (extendRuleBaseList rb rules1)
670 uds
671
672 ; return (rules2 ++ rules1, spec_binds2 ++ spec_binds1) }
673
674 | otherwise
675 = WARN( True, hang (ptext (sLit "specImport discarding:") <+> ppr fn <+> dcolon <+> ppr (idType fn))
676 2 ( (text "want:" <+> ppr (wantSpecImport dflags unfolding))
677 $$ (text "stable:" <+> ppr (isStableUnfolding unfolding))
678 $$ (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn)) ) )
679 return ([], [])
680 where
681 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
682
683 wantSpecImport :: DynFlags -> Unfolding -> Bool
684 -- See Note [Specialise imported INLINABLE things]
685 wantSpecImport dflags unf
686 = case unf of
687 NoUnfolding -> False
688 OtherCon {} -> False
689 DFunUnfolding {} -> True
690 CoreUnfolding { uf_src = src, uf_guidance = _guidance }
691 | gopt Opt_SpecialiseAggressively dflags -> True
692 | isStableSource src -> True
693 -- Specialise even INLINE things; it hasn't inlined yet,
694 -- so perhaps it never will. Moreover it may have calls
695 -- inside it that we want to specialise
696 | otherwise -> False -- Stable, not INLINE, hence INLINEABLE
697
698 {-
699 Note [Specialise imported INLINABLE things]
700 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
701 What imported functions do we specialise? The basic set is
702 * DFuns and things with INLINABLE pragmas.
703 but with -fspecialise-aggressively we add
704 * Anything with an unfolding template
705
706 Trac #8874 has a good example of why we want to auto-specialise DFuns.
707
708 We have the -fspecialise-aggressively flag (usually off), because we
709 risk lots of orphan modules from over-vigorous specialisation.
710 However it's not a big deal: anything non-recursive with an
711 unfolding-template will probably have been inlined already.
712
713 Note [Glom the bindings if imported functions are specialised]
714 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
715 Suppose we have an imported, *recursive*, INLINABLE function
716 f :: Eq a => a -> a
717 f = /\a \d x. ...(f a d)...
718 In the module being compiled we have
719 g x = f (x::Int)
720 Now we'll make a specialised function
721 f_spec :: Int -> Int
722 f_spec = \x -> ...(f Int dInt)...
723 {-# RULE f Int _ = f_spec #-}
724 g = \x. f Int dInt x
725 Note that f_spec doesn't look recursive
726 After rewriting with the RULE, we get
727 f_spec = \x -> ...(f_spec)...
728 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
729 The occurrence analyser never turns a NonRec into a Rec. So we must
730 make sure that f_spec is recursive. Easiest thing is to make all
731 the specialisations for imported bindings recursive.
732
733
734 Note [Avoiding recursive specialisation]
735 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
736 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
737 'f's RHS. So we want to specialise g,h. But we don't want to
738 specialise f any more! It's possible that f's RHS might have a
739 recursive yet-more-specialised call, so we'd diverge in that case.
740 And if the call is to the same type, one specialisation is enough.
741 Avoiding this recursive specialisation loop is the reason for the
742 'done' VarSet passed to specImports and specImport.
743
744 ************************************************************************
745 * *
746 \subsubsection{@specExpr@: the main function}
747 * *
748 ************************************************************************
749 -}
750
751 data SpecEnv
752 = SE { se_subst :: CoreSubst.Subst
753 -- We carry a substitution down:
754 -- a) we must clone any binding that might float outwards,
755 -- to avoid name clashes
756 -- b) we carry a type substitution to use when analysing
757 -- the RHS of specialised bindings (no type-let!)
758
759
760 , se_interesting :: VarSet
761 -- Dict Ids that we know something about
762 -- and hence may be worth specialising against
763 -- See Note [Interesting dictionary arguments]
764 }
765
766 emptySpecEnv :: SpecEnv
767 emptySpecEnv = SE { se_subst = CoreSubst.emptySubst, se_interesting = emptyVarSet}
768
769 specVar :: SpecEnv -> Id -> CoreExpr
770 specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v
771
772 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
773
774 ---------------- First the easy cases --------------------
775 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
776 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
777 specExpr env (Var v) = return (specVar env v, emptyUDs)
778 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
779 specExpr env (Cast e co)
780 = do { (e', uds) <- specExpr env e
781 ; return ((Cast e' (substCo env co)), uds) }
782 specExpr env (Tick tickish body)
783 = do { (body', uds) <- specExpr env body
784 ; return (Tick (specTickish env tickish) body', uds) }
785
786 ---------------- Applications might generate a call instance --------------------
787 specExpr env expr@(App {})
788 = go expr []
789 where
790 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
791 (fun', uds_app) <- go fun (arg':args)
792 return (App fun' arg', uds_arg `plusUDs` uds_app)
793
794 go (Var f) args = case specVar env f of
795 Var f' -> return (Var f', mkCallUDs env f' args)
796 e' -> return (e', emptyUDs) -- I don't expect this!
797 go other _ = specExpr env other
798
799 ---------------- Lambda/case require dumping of usage details --------------------
800 specExpr env e@(Lam _ _) = do
801 (body', uds) <- specExpr env' body
802 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
803 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
804 where
805 (bndrs, body) = collectBinders e
806 (env', bndrs') = substBndrs env bndrs
807 -- More efficient to collect a group of binders together all at once
808 -- and we don't want to split a lambda group with dumped bindings
809
810 specExpr env (Case scrut case_bndr ty alts)
811 = do { (scrut', scrut_uds) <- specExpr env scrut
812 ; (scrut'', case_bndr', alts', alts_uds)
813 <- specCase env scrut' case_bndr alts
814 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
815 , scrut_uds `plusUDs` alts_uds) }
816
817 ---------------- Finally, let is the interesting case --------------------
818 specExpr env (Let bind body)
819 = do { -- Clone binders
820 (rhs_env, body_env, bind') <- cloneBindSM env bind
821
822 -- Deal with the body
823 ; (body', body_uds) <- specExpr body_env body
824
825 -- Deal with the bindings
826 ; (binds', uds) <- specBind rhs_env bind' body_uds
827
828 -- All done
829 ; return (foldr Let body' binds', uds) }
830
831 specTickish :: SpecEnv -> Tickish Id -> Tickish Id
832 specTickish env (Breakpoint ix ids)
833 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
834 -- drop vars from the list if they have a non-variable substitution.
835 -- should never happen, but it's harmless to drop them anyway.
836 specTickish _ other_tickish = other_tickish
837
838 specCase :: SpecEnv
839 -> CoreExpr -- Scrutinee, already done
840 -> Id -> [CoreAlt]
841 -> SpecM ( CoreExpr -- New scrutinee
842 , Id
843 , [CoreAlt]
844 , UsageDetails)
845 specCase env scrut' case_bndr [(con, args, rhs)]
846 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
847 , interestingDict env scrut'
848 , not (isDeadBinder case_bndr && null sc_args')
849 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
850
851 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
852 [(con, args', Var sc_arg')]
853 | sc_arg' <- sc_args' ]
854
855 -- Extend the substitution for RHS to map the *original* binders
856 -- to their floated verions.
857 mb_sc_flts :: [Maybe DictId]
858 mb_sc_flts = map (lookupVarEnv clone_env) args'
859 clone_env = zipVarEnv sc_args' sc_args_flt
860 subst_prs = (case_bndr, Var case_bndr_flt)
861 : [ (arg, Var sc_flt)
862 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
863 env_rhs' = env_rhs { se_subst = CoreSubst.extendIdSubstList (se_subst env_rhs) subst_prs
864 , se_interesting = se_interesting env_rhs `extendVarSetList`
865 (case_bndr_flt : sc_args_flt) }
866
867 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
868 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
869 case_bndr_set = unitVarSet case_bndr_flt
870 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
871 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
872 flt_binds = scrut_bind : sc_binds
873 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
874 all_uds = flt_binds `addDictBinds` free_uds
875 alt' = (con, args', wrapDictBindsE dumped_dbs rhs')
876 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
877 where
878 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
879 sc_args' = filter is_flt_sc_arg args'
880
881 clone_me bndr = do { uniq <- getUniqueM
882 ; return (mkUserLocal occ uniq ty has_sig loc) }
883 where
884 name = idName bndr
885 ty = idType bndr
886 has_sig = idHasSig bndr
887 occ = nameOccName name
888 loc = getSrcSpan name
889
890 arg_set = mkVarSet args'
891 is_flt_sc_arg var = isId var
892 && not (isDeadBinder var)
893 && isDictTy var_ty
894 && not (tyVarsOfType var_ty `intersectsVarSet` arg_set)
895 where
896 var_ty = idType var
897
898
899 specCase env scrut case_bndr alts
900 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
901 ; return (scrut, case_bndr', alts', uds_alts) }
902 where
903 (env_alt, case_bndr') = substBndr env case_bndr
904 spec_alt (con, args, rhs) = do
905 (rhs', uds) <- specExpr env_rhs rhs
906 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
907 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
908 where
909 (env_rhs, args') = substBndrs env_alt args
910
911 {-
912 Note [Floating dictionaries out of cases]
913 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
914 Consider
915 g = \d. case d of { MkD sc ... -> ...(f sc)... }
916 Naively we can't float d2's binding out of the case expression,
917 because 'sc' is bound by the case, and that in turn means we can't
918 specialise f, which seems a pity.
919
920 So we invert the case, by floating out a binding
921 for 'sc_flt' thus:
922 sc_flt = case d of { MkD sc ... -> sc }
923 Now we can float the call instance for 'f'. Indeed this is just
924 what'll happen if 'sc' was originally bound with a let binding,
925 but case is more efficient, and necessary with equalities. So it's
926 good to work with both.
927
928 You might think that this won't make any difference, because the
929 call instance will only get nuked by the \d. BUT if 'g' itself is
930 specialised, then transitively we should be able to specialise f.
931
932 In general, given
933 case e of cb { MkD sc ... -> ...(f sc)... }
934 we transform to
935 let cb_flt = e
936 sc_flt = case cb_flt of { MkD sc ... -> sc }
937 in
938 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
939
940 The "_flt" things are the floated binds; we use the current substitution
941 to substitute sc -> sc_flt in the RHS
942
943 ************************************************************************
944 * *
945 Dealing with a binding
946 * *
947 ************************************************************************
948 -}
949
950 specBind :: SpecEnv -- Use this for RHSs
951 -> CoreBind
952 -> UsageDetails -- Info on how the scope of the binding
953 -> SpecM ([CoreBind], -- New bindings
954 UsageDetails) -- And info to pass upstream
955
956 -- Returned UsageDetails:
957 -- No calls for binders of this bind
958 specBind rhs_env (NonRec fn rhs) body_uds
959 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
960 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds fn rhs
961
962 ; let pairs = spec_defns ++ [(fn', rhs')]
963 -- fn' mentions the spec_defns in its rules,
964 -- so put the latter first
965
966 combined_uds = body_uds1 `plusUDs` rhs_uds
967 -- This way round a call in rhs_uds of a function f
968 -- at type T will override a call of f at T in body_uds1; and
969 -- that is good because it'll tend to keep "earlier" calls
970 -- See Note [Specialisation of dictionary functions]
971
972 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
973 -- See Note [From non-recursive to recursive]
974
975 final_binds | isEmptyBag dump_dbs = [NonRec b r | (b,r) <- pairs]
976 | otherwise = [Rec (flattenDictBinds dump_dbs pairs)]
977
978 ; if float_all then
979 -- Rather than discard the calls mentioning the bound variables
980 -- we float this binding along with the others
981 return ([], free_uds `snocDictBinds` final_binds)
982 else
983 -- No call in final_uds mentions bound variables,
984 -- so we can just leave the binding here
985 return (final_binds, free_uds) }
986
987
988 specBind rhs_env (Rec pairs) body_uds
989 -- Note [Specialising a recursive group]
990 = do { let (bndrs,rhss) = unzip pairs
991 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
992 ; let scope_uds = body_uds `plusUDs` rhs_uds
993 -- Includes binds and calls arising from rhss
994
995 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
996
997 ; (bndrs3, spec_defns3, uds3)
998 <- if null spec_defns1 -- Common case: no specialisation
999 then return (bndrs1, [], uds1)
1000 else do { -- Specialisation occurred; do it again
1001 (bndrs2, spec_defns2, uds2)
1002 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1003 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1004
1005 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1006 bind = Rec (flattenDictBinds dumped_dbs $
1007 spec_defns3 ++ zip bndrs3 rhss')
1008
1009 ; if float_all then
1010 return ([], final_uds `snocDictBind` bind)
1011 else
1012 return ([bind], final_uds) }
1013
1014
1015 ---------------------------
1016 specDefns :: SpecEnv
1017 -> UsageDetails -- Info on how it is used in its scope
1018 -> [(Id,CoreExpr)] -- The things being bound and their un-processed RHS
1019 -> SpecM ([Id], -- Original Ids with RULES added
1020 [(Id,CoreExpr)], -- Extra, specialised bindings
1021 UsageDetails) -- Stuff to fling upwards from the specialised versions
1022
1023 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1024 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1025 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1026 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1027 -- But not vice versa (it's a fixpoint problem).
1028
1029 specDefns _env uds []
1030 = return ([], [], uds)
1031 specDefns env uds ((bndr,rhs):pairs)
1032 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1033 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1034 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1035
1036 ---------------------------
1037 specDefn :: SpecEnv
1038 -> UsageDetails -- Info on how it is used in its scope
1039 -> Id -> CoreExpr -- The thing being bound and its un-processed RHS
1040 -> SpecM (Id, -- Original Id with added RULES
1041 [(Id,CoreExpr)], -- Extra, specialised bindings
1042 UsageDetails) -- Stuff to fling upwards from the specialised versions
1043
1044 specDefn env body_uds fn rhs
1045 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1046 rules_for_me = idCoreRules fn
1047 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1048 calls_for_me fn rhs
1049 ; return ( fn `addIdSpecialisations` rules
1050 , spec_defns
1051 , body_uds_without_me `plusUDs` spec_uds) }
1052 -- It's important that the `plusUDs` is this way
1053 -- round, because body_uds_without_me may bind
1054 -- dictionaries that are used in calls_for_me passed
1055 -- to specDefn. So the dictionary bindings in
1056 -- spec_uds may mention dictionaries bound in
1057 -- body_uds_without_me
1058
1059 ---------------------------
1060 specCalls :: Maybe Module -- Just this_mod => specialising imported fn
1061 -- Nothing => specialising local fn
1062 -> SpecEnv
1063 -> [CoreRule] -- Existing RULES for the fn
1064 -> [CallInfo]
1065 -> Id -> CoreExpr
1066 -> SpecM ([CoreRule], -- New RULES for the fn
1067 [(Id,CoreExpr)], -- Extra, specialised bindings
1068 UsageDetails) -- New usage details from the specialised RHSs
1069
1070 -- This function checks existing rules, and does not create
1071 -- duplicate ones. So the caller does not need to do this filtering.
1072 -- See 'already_covered'
1073
1074 specCalls mb_mod env rules_for_me calls_for_me fn rhs
1075 -- The first case is the interesting one
1076 | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas
1077 && rhs_ids `lengthAtLeast` n_dicts -- and enough dict args
1078 && notNull calls_for_me -- And there are some calls to specialise
1079 && not (isNeverActive (idInlineActivation fn))
1080 -- Don't specialise NOINLINE things
1081 -- See Note [Auto-specialisation and RULES]
1082
1083 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1084 -- See Note [Inline specialisation] for why we do not
1085 -- switch off specialisation for inline functions
1086
1087 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr rules_for_me) $
1088 do { stuff <- mapM spec_call calls_for_me
1089 ; let (spec_defns, spec_uds, spec_rules) = unzip3 (catMaybes stuff)
1090 ; return (spec_rules, spec_defns, plusUDList spec_uds) }
1091
1092 | otherwise -- No calls or RHS doesn't fit our preconceptions
1093 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1094 ptext (sLit "Missed specialisation opportunity for")
1095 <+> ppr fn $$ _trace_doc )
1096 -- Note [Specialisation shape]
1097 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1098 return ([], [], emptyUDs)
1099 where
1100 _trace_doc = sep [ ppr rhs_tyvars, ppr n_tyvars
1101 , ppr rhs_ids, ppr n_dicts
1102 , ppr (idInlineActivation fn) ]
1103
1104 fn_type = idType fn
1105 fn_arity = idArity fn
1106 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1107 (tyvars, theta, _) = tcSplitSigmaTy fn_type
1108 n_tyvars = length tyvars
1109 n_dicts = length theta
1110 inl_prag = idInlinePragma fn
1111 inl_act = inlinePragmaActivation inl_prag
1112 is_local = isLocalId fn
1113
1114 -- Figure out whether the function has an INLINE pragma
1115 -- See Note [Inline specialisations]
1116
1117 (rhs_tyvars, rhs_ids, rhs_body) = collectTyAndValBinders rhs
1118
1119 rhs_dict_ids = take n_dicts rhs_ids
1120 body = mkLams (drop n_dicts rhs_ids) rhs_body
1121 -- Glue back on the non-dict lambdas
1122
1123 already_covered :: DynFlags -> [CoreExpr] -> Bool
1124 already_covered dflags args -- Note [Specialisations already covered]
1125 = isJust (lookupRule dflags
1126 (CoreSubst.substInScope (se_subst env), realIdUnfolding)
1127 (const True)
1128 fn args rules_for_me)
1129
1130 mk_ty_args :: [Maybe Type] -> [TyVar] -> [CoreExpr]
1131 mk_ty_args [] poly_tvs
1132 = ASSERT( null poly_tvs ) []
1133 mk_ty_args (Nothing : call_ts) (poly_tv : poly_tvs)
1134 = Type (mkTyVarTy poly_tv) : mk_ty_args call_ts poly_tvs
1135 mk_ty_args (Just ty : call_ts) poly_tvs
1136 = Type ty : mk_ty_args call_ts poly_tvs
1137 mk_ty_args (Nothing : _) [] = panic "mk_ty_args"
1138
1139 ----------------------------------------------------------
1140 -- Specialise to one particular call pattern
1141 spec_call :: CallInfo -- Call instance
1142 -> SpecM (Maybe ((Id,CoreExpr), -- Specialised definition
1143 UsageDetails, -- Usage details from specialised body
1144 CoreRule)) -- Info for the Id's SpecEnv
1145 spec_call (CallKey call_ts, (call_ds, _))
1146 = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts )
1147
1148 -- Suppose f's defn is f = /\ a b c -> \ d1 d2 -> rhs
1149 -- Supppose the call is for f [Just t1, Nothing, Just t3] [dx1, dx2]
1150
1151 -- Construct the new binding
1152 -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b -> rhs)
1153 -- PLUS the usage-details
1154 -- { d1' = dx1; d2' = dx2 }
1155 -- where d1', d2' are cloned versions of d1,d2, with the type substitution
1156 -- applied. These auxiliary bindings just avoid duplication of dx1, dx2
1157 --
1158 -- Note that the substitution is applied to the whole thing.
1159 -- This is convenient, but just slightly fragile. Notably:
1160 -- * There had better be no name clashes in a/b/c
1161 do { let
1162 -- poly_tyvars = [b] in the example above
1163 -- spec_tyvars = [a,c]
1164 -- ty_args = [t1,b,t3]
1165 spec_tv_binds = [(tv,ty) | (tv, Just ty) <- rhs_tyvars `zip` call_ts]
1166 env1 = extendTvSubstList env spec_tv_binds
1167 (rhs_env, poly_tyvars) = substBndrs env1
1168 [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
1169
1170 -- Clone rhs_dicts, including instantiating their types
1171 ; inst_dict_ids <- mapM (newDictBndr rhs_env) rhs_dict_ids
1172 ; let (rhs_env2, dx_binds, spec_dict_args)
1173 = bindAuxiliaryDicts rhs_env rhs_dict_ids call_ds inst_dict_ids
1174 ty_args = mk_ty_args call_ts poly_tyvars
1175 rule_args = ty_args ++ map Var inst_dict_ids
1176 rule_bndrs = poly_tyvars ++ inst_dict_ids
1177
1178 ; dflags <- getDynFlags
1179 ; if already_covered dflags rule_args then
1180 return Nothing
1181 else do
1182 { -- Figure out the type of the specialised function
1183 let body_ty = applyTypeToArgs rhs fn_type rule_args
1184 (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted
1185 | isUnLiftedType body_ty -- C.f. WwLib.mkWorkerArgs
1186 = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [voidPrimId])
1187 | otherwise = (poly_tyvars, poly_tyvars)
1188 spec_id_ty = mkPiTypes lam_args body_ty
1189
1190 ; spec_f <- newSpecIdSM fn spec_id_ty
1191 ; (spec_rhs, rhs_uds) <- specExpr rhs_env2 (mkLams lam_args body)
1192 ; this_mod <- getModule
1193 ; let
1194 -- The rule to put in the function's specialisation is:
1195 -- forall b, d1',d2'. f t1 b t3 d1' d2' = f1 b
1196 herald = case mb_mod of
1197 Nothing -- Specialising local fn
1198 -> ptext (sLit "SPEC")
1199 Just this_mod -- Specialising imoprted fn
1200 -> ptext (sLit "SPEC/") <> ppr this_mod
1201
1202 rule_name = mkFastString $ showSDocForUser dflags neverQualify $
1203 herald <+> ppr fn <+> hsep (map ppr_call_key_ty call_ts)
1204 -- This name ends up in interface files, so use showSDocForUser,
1205 -- otherwise uniques end up there, making builds
1206 -- less deterministic (See #4012 comment:61 ff)
1207
1208 spec_env_rule = mkRule
1209 this_mod
1210 True {- Auto generated -}
1211 is_local
1212 rule_name
1213 inl_act -- Note [Auto-specialisation and RULES]
1214 (idName fn)
1215 rule_bndrs
1216 rule_args
1217 (mkVarApps (Var spec_f) app_args)
1218
1219 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1220 final_uds = foldr consDictBind rhs_uds dx_binds
1221
1222 --------------------------------------
1223 -- Add a suitable unfolding if the spec_inl_prag says so
1224 -- See Note [Inline specialisations]
1225 (spec_inl_prag, spec_unf)
1226 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1227 = (neverInlinePragma, noUnfolding)
1228 -- See Note [Specialising imported functions] in OccurAnal
1229
1230 | InlinePragma { inl_inline = Inlinable } <- inl_prag
1231 = (inl_prag { inl_inline = EmptyInlineSpec }, noUnfolding)
1232
1233 | otherwise
1234 = (inl_prag, specUnfolding dflags (se_subst env)
1235 poly_tyvars (ty_args ++ spec_dict_args)
1236 fn_unf)
1237
1238 --------------------------------------
1239 -- Adding arity information just propagates it a bit faster
1240 -- See Note [Arity decrease] in Simplify
1241 -- Copy InlinePragma information from the parent Id.
1242 -- So if f has INLINE[1] so does spec_f
1243 spec_f_w_arity = spec_f `setIdArity` max 0 (fn_arity - n_dicts)
1244 `setInlinePragma` spec_inl_prag
1245 `setIdUnfolding` spec_unf
1246
1247 ; return (Just ((spec_f_w_arity, spec_rhs), final_uds, spec_env_rule)) } }
1248
1249 bindAuxiliaryDicts
1250 :: SpecEnv
1251 -> [DictId] -> [CoreExpr] -- Original dict bndrs, and the witnessing expressions
1252 -> [DictId] -- A cloned dict-id for each dict arg
1253 -> (SpecEnv, -- Substitute for all orig_dicts
1254 [CoreBind], -- Auxiliary dict bindings
1255 [CoreExpr]) -- Witnessing expressions (all trivial)
1256 -- Bind any dictionary arguments to fresh names, to preserve sharing
1257 bindAuxiliaryDicts env@(SE { se_subst = subst, se_interesting = interesting })
1258 orig_dict_ids call_ds inst_dict_ids
1259 = (env', dx_binds, spec_dict_args)
1260 where
1261 (dx_binds, spec_dict_args) = go call_ds inst_dict_ids
1262 env' = env { se_subst = CoreSubst.extendIdSubstList subst (orig_dict_ids `zip` spec_dict_args)
1263 , se_interesting = interesting `unionVarSet` interesting_dicts }
1264
1265 interesting_dicts = mkVarSet [ dx_id | NonRec dx_id dx <- dx_binds
1266 , interestingDict env dx ]
1267 -- See Note [Make the new dictionaries interesting]
1268
1269 go [] _ = ([], [])
1270 go (dx:dxs) (dx_id:dx_ids)
1271 | exprIsTrivial dx = (dx_binds, dx:args)
1272 | otherwise = (NonRec dx_id dx : dx_binds, Var dx_id : args)
1273 where
1274 (dx_binds, args) = go dxs dx_ids
1275 -- In the first case extend the substitution but not bindings;
1276 -- in the latter extend the bindings but not the substitution.
1277 -- For the former, note that we bind the *original* dict in the substitution,
1278 -- overriding any d->dx_id binding put there by substBndrs
1279 go _ _ = pprPanic "bindAuxiliaryDicts" (ppr orig_dict_ids $$ ppr call_ds $$ ppr inst_dict_ids)
1280
1281 {-
1282 Note [Make the new dictionaries interesting]
1283 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1284 Important! We're going to substitute dx_id1 for d
1285 and we want it to look "interesting", else we won't gather *any*
1286 consequential calls. E.g.
1287 f d = ...g d....
1288 If we specialise f for a call (f (dfun dNumInt)), we'll get
1289 a consequent call (g d') with an auxiliary definition
1290 d' = df dNumInt
1291 We want that consequent call to look interesting
1292
1293
1294 Note [From non-recursive to recursive]
1295 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1296 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1297 have built a recursive knot
1298
1299 f a d x = <blah>
1300 MkUD { ud_binds = d7 = MkD ..f..
1301 , ud_calls = ...(f T d7)... }
1302
1303 The we generate
1304
1305 Rec { fs x = <blah>[T/a, d7/d]
1306 f a d x = <blah>
1307 RULE f T _ = fs
1308 d7 = ...f... }
1309
1310 Here the recursion is only through the RULE.
1311
1312
1313 Note [Specialisation of dictionary functions]
1314 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1315 Here is a nasty example that bit us badly: see Trac #3591
1316
1317 class Eq a => C a
1318 instance Eq [a] => C [a]
1319
1320 ---------------
1321 dfun :: Eq [a] -> C [a]
1322 dfun a d = MkD a d (meth d)
1323
1324 d4 :: Eq [T] = <blah>
1325 d2 :: C [T] = dfun T d4
1326 d1 :: Eq [T] = $p1 d2
1327 d3 :: C [T] = dfun T d1
1328
1329 None of these definitions is recursive. What happened was that we
1330 generated a specialisation:
1331
1332 RULE forall d. dfun T d = dT :: C [T]
1333 dT = (MkD a d (meth d)) [T/a, d1/d]
1334 = MkD T d1 (meth d1)
1335
1336 But now we use the RULE on the RHS of d2, to get
1337
1338 d2 = dT = MkD d1 (meth d1)
1339 d1 = $p1 d2
1340
1341 and now d1 is bottom! The problem is that when specialising 'dfun' we
1342 should first dump "below" the binding all floated dictionary bindings
1343 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1344 placed below 'dfun', and thus unavailable to it when specialising
1345 'dfun'. That in turn means that the call (dfun T d1) must be
1346 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1347 d4 doesn't mention dfun.
1348
1349 But look at this:
1350
1351 class C a where { foo,bar :: [a] -> [a] }
1352
1353 instance C Int where
1354 foo x = r_bar x
1355 bar xs = reverse xs
1356
1357 r_bar :: C a => [a] -> [a]
1358 r_bar xs = bar (xs ++ xs)
1359
1360 That translates to:
1361
1362 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1363
1364 Rec { $fCInt :: C Int = MkC foo_help reverse
1365 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1366
1367 The call (r_bar $fCInt) mentions $fCInt,
1368 which mentions foo_help,
1369 which mentions r_bar
1370 But we DO want to specialise r_bar at Int:
1371
1372 Rec { $fCInt :: C Int = MkC foo_help reverse
1373 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1374
1375 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1376 RULE r_bar Int _ = r_bar_Int
1377
1378 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1379 }
1380
1381 Note that, because of its RULE, r_bar joins the recursive
1382 group. (In this case it'll unravel a short moment later.)
1383
1384
1385 Conclusion: we catch the nasty case using filter_dfuns in
1386 callsForMe. To be honest I'm not 100% certain that this is 100%
1387 right, but it works. Sigh.
1388
1389
1390 Note [Specialising a recursive group]
1391 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1392 Consider
1393 let rec { f x = ...g x'...
1394 ; g y = ...f y'.... }
1395 in f 'a'
1396 Here we specialise 'f' at Char; but that is very likely to lead to
1397 a specialisation of 'g' at Char. We must do the latter, else the
1398 whole point of specialisation is lost.
1399
1400 But we do not want to keep iterating to a fixpoint, because in the
1401 presence of polymorphic recursion we might generate an infinite number
1402 of specialisations.
1403
1404 So we use the following heuristic:
1405 * Arrange the rec block in dependency order, so far as possible
1406 (the occurrence analyser already does this)
1407
1408 * Specialise it much like a sequence of lets
1409
1410 * Then go through the block a second time, feeding call-info from
1411 the RHSs back in the bottom, as it were
1412
1413 In effect, the ordering maxmimises the effectiveness of each sweep,
1414 and we do just two sweeps. This should catch almost every case of
1415 monomorphic recursion -- the exception could be a very knotted-up
1416 recursion with multiple cycles tied up together.
1417
1418 This plan is implemented in the Rec case of specBindItself.
1419
1420 Note [Specialisations already covered]
1421 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1422 We obviously don't want to generate two specialisations for the same
1423 argument pattern. There are two wrinkles
1424
1425 1. We do the already-covered test in specDefn, not when we generate
1426 the CallInfo in mkCallUDs. We used to test in the latter place, but
1427 we now iterate the specialiser somewhat, and the Id at the call site
1428 might therefore not have all the RULES that we can see in specDefn
1429
1430 2. What about two specialisations where the second is an *instance*
1431 of the first? If the more specific one shows up first, we'll generate
1432 specialisations for both. If the *less* specific one shows up first,
1433 we *don't* currently generate a specialisation for the more specific
1434 one. (See the call to lookupRule in already_covered.) Reasons:
1435 (a) lookupRule doesn't say which matches are exact (bad reason)
1436 (b) if the earlier specialisation is user-provided, it's
1437 far from clear that we should auto-specialise further
1438
1439 Note [Auto-specialisation and RULES]
1440 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1441 Consider:
1442 g :: Num a => a -> a
1443 g = ...
1444
1445 f :: (Int -> Int) -> Int
1446 f w = ...
1447 {-# RULE f g = 0 #-}
1448
1449 Suppose that auto-specialisation makes a specialised version of
1450 g::Int->Int That version won't appear in the LHS of the RULE for f.
1451 So if the specialisation rule fires too early, the rule for f may
1452 never fire.
1453
1454 It might be possible to add new rules, to "complete" the rewrite system.
1455 Thus when adding
1456 RULE forall d. g Int d = g_spec
1457 also add
1458 RULE f g_spec = 0
1459
1460 But that's a bit complicated. For now we ask the programmer's help,
1461 by *copying the INLINE activation pragma* to the auto-specialised
1462 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1463 will also not be active until phase 2. And that's what programmers
1464 should jolly well do anyway, even aside from specialisation, to ensure
1465 that g doesn't inline too early.
1466
1467 This in turn means that the RULE would never fire for a NOINLINE
1468 thing so not much point in generating a specialisation at all.
1469
1470 Note [Specialisation shape]
1471 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1472 We only specialise a function if it has visible top-level lambdas
1473 corresponding to its overloading. E.g. if
1474 f :: forall a. Eq a => ....
1475 then its body must look like
1476 f = /\a. \d. ...
1477
1478 Reason: when specialising the body for a call (f ty dexp), we want to
1479 substitute dexp for d, and pick up specialised calls in the body of f.
1480
1481 This doesn't always work. One example I came across was this:
1482 newtype Gen a = MkGen{ unGen :: Int -> a }
1483
1484 choose :: Eq a => a -> Gen a
1485 choose n = MkGen (\r -> n)
1486
1487 oneof = choose (1::Int)
1488
1489 It's a silly exapmle, but we get
1490 choose = /\a. g `cast` co
1491 where choose doesn't have any dict arguments. Thus far I have not
1492 tried to fix this (wait till there's a real example).
1493
1494 Mind you, then 'choose' will be inlined (since RHS is trivial) so
1495 it doesn't matter. This comes up with single-method classes
1496
1497 class C a where { op :: a -> a }
1498 instance C a => C [a] where ....
1499 ==>
1500 $fCList :: C a => C [a]
1501 $fCList = $copList |> (...coercion>...)
1502 ....(uses of $fCList at particular types)...
1503
1504 So we suppress the WARN if the rhs is trivial.
1505
1506 Note [Inline specialisations]
1507 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1508 Here is what we do with the InlinePragma of the original function
1509 * Activation/RuleMatchInfo: both transferred to the
1510 specialised function
1511 * InlineSpec:
1512 (a) An INLINE pragma is transferred
1513 (b) An INLINABLE pragma is *not* transferred
1514
1515 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
1516 specialise the function at its call site, and arguably that's not so
1517 important for the specialised copies. BUT *pragma-directed*
1518 specialisation now takes place in the typechecker/desugarer, with
1519 manually specified INLINEs. The specialisation here is automatic.
1520 It'd be very odd if a function marked INLINE was specialised (because
1521 of some local use), and then forever after (including importing
1522 modules) the specialised version wasn't INLINEd. After all, the
1523 programmer said INLINE!
1524
1525 You might wonder why we specialise INLINE functions at all. After
1526 all they should be inlined, right? Two reasons:
1527
1528 * Even INLINE functions are sometimes not inlined, when they aren't
1529 applied to interesting arguments. But perhaps the type arguments
1530 alone are enough to specialise (even though the args are too boring
1531 to trigger inlining), and it's certainly better to call the
1532 specialised version.
1533
1534 * The RHS of an INLINE function might call another overloaded function,
1535 and we'd like to generate a specialised version of that function too.
1536 This actually happens a lot. Consider
1537 replicateM_ :: (Monad m) => Int -> m a -> m ()
1538 {-# INLINABLE replicateM_ #-}
1539 replicateM_ d x ma = ...
1540 The strictness analyser may transform to
1541 replicateM_ :: (Monad m) => Int -> m a -> m ()
1542 {-# INLINE replicateM_ #-}
1543 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
1544
1545 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
1546 {-# INLINABLE $wreplicateM_ #-}
1547 $wreplicateM_ = ...
1548 Now an importing module has a specialised call to replicateM_, say
1549 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
1550 This particular example had a huge effect on the call to replicateM_
1551 in nofib/shootout/n-body.
1552
1553 Why (b): discard INLINEABLE pragmas? See Trac #4874 for persuasive examples.
1554 Suppose we have
1555 {-# INLINABLE f #-}
1556 f :: Ord a => [a] -> Int
1557 f xs = letrec f' = ...f'... in f'
1558 Then, when f is specialised and optimised we might get
1559 wgo :: [Int] -> Int#
1560 wgo = ...wgo...
1561 f_spec :: [Int] -> Int
1562 f_spec xs = case wgo xs of { r -> I# r }
1563 and we clearly want to inline f_spec at call sites. But if we still
1564 have the big, un-optimised of f (albeit specialised) captured in an
1565 INLINABLE pragma for f_spec, we won't get that optimisation.
1566
1567 So we simply drop INLINABLE pragmas when specialising. It's not really
1568 a complete solution; ignoring specalisation for now, INLINABLE functions
1569 don't get properly strictness analysed, for example. But it works well
1570 for examples involving specialisation, which is the dominant use of
1571 INLINABLE. See Trac #4874.
1572
1573
1574 ************************************************************************
1575 * *
1576 \subsubsection{UsageDetails and suchlike}
1577 * *
1578 ************************************************************************
1579 -}
1580
1581 data UsageDetails
1582 = MkUD {
1583 ud_binds :: !(Bag DictBind),
1584 -- Floated dictionary bindings
1585 -- The order is important;
1586 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
1587 -- (Remember, Bags preserve order in GHC.)
1588
1589 ud_calls :: !CallDetails
1590
1591 -- INVARIANT: suppose bs = bindersOf ud_binds
1592 -- Then 'calls' may *mention* 'bs',
1593 -- but there should be no calls *for* bs
1594 }
1595
1596 instance Outputable UsageDetails where
1597 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
1598 = ptext (sLit "MkUD") <+> braces (sep (punctuate comma
1599 [ptext (sLit "binds") <+> equals <+> ppr dbs,
1600 ptext (sLit "calls") <+> equals <+> ppr calls]))
1601
1602 type DictBind = (CoreBind, VarSet)
1603 -- The set is the free vars of the binding
1604 -- both tyvars and dicts
1605
1606 type DictExpr = CoreExpr
1607
1608 emptyUDs :: UsageDetails
1609 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyVarEnv }
1610
1611 ------------------------------------------------------------
1612 type CallDetails = IdEnv CallInfoSet
1613 newtype CallKey = CallKey [Maybe Type] -- Nothing => unconstrained type argument
1614
1615 -- CallInfo uses a Map, thereby ensuring that
1616 -- we record only one call instance for any key
1617 --
1618 -- The list of types and dictionaries is guaranteed to
1619 -- match the type of f
1620 data CallInfoSet = CIS Id (Map CallKey ([DictExpr], VarSet))
1621 -- Range is dict args and the vars of the whole
1622 -- call (including tyvars)
1623 -- [*not* include the main id itself, of course]
1624
1625 type CallInfo = (CallKey, ([DictExpr], VarSet))
1626
1627 instance Outputable CallInfoSet where
1628 ppr (CIS fn map) = hang (ptext (sLit "CIS") <+> ppr fn)
1629 2 (ppr map)
1630
1631 pprCallInfo :: Id -> CallInfo -> SDoc
1632 pprCallInfo fn (CallKey mb_tys, (dxs, _))
1633 = hang (ppr fn)
1634 2 (fsep (map ppr_call_key_ty mb_tys ++ map pprParendExpr dxs))
1635
1636 ppr_call_key_ty :: Maybe Type -> SDoc
1637 ppr_call_key_ty Nothing = char '_'
1638 ppr_call_key_ty (Just ty) = char '@' <+> pprParendType ty
1639
1640 instance Outputable CallKey where
1641 ppr (CallKey ts) = ppr ts
1642
1643 -- Type isn't an instance of Ord, so that we can control which
1644 -- instance we use. That's tiresome here. Oh well
1645 instance Eq CallKey where
1646 k1 == k2 = case k1 `compare` k2 of { EQ -> True; _ -> False }
1647
1648 instance Ord CallKey where
1649 compare (CallKey k1) (CallKey k2) = cmpList cmp k1 k2
1650 where
1651 cmp Nothing Nothing = EQ
1652 cmp Nothing (Just _) = LT
1653 cmp (Just _) Nothing = GT
1654 cmp (Just t1) (Just t2) = cmpType t1 t2
1655
1656 unionCalls :: CallDetails -> CallDetails -> CallDetails
1657 unionCalls c1 c2 = plusVarEnv_C unionCallInfoSet c1 c2
1658
1659 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
1660 unionCallInfoSet (CIS f calls1) (CIS _ calls2) = CIS f (calls1 `Map.union` calls2)
1661
1662 callDetailsFVs :: CallDetails -> VarSet
1663 callDetailsFVs calls = foldVarEnv (unionVarSet . callInfoFVs) emptyVarSet calls
1664
1665 callInfoFVs :: CallInfoSet -> VarSet
1666 callInfoFVs (CIS _ call_info) = Map.foldRight (\(_,fv) vs -> unionVarSet fv vs) emptyVarSet call_info
1667
1668 ------------------------------------------------------------
1669 singleCall :: Id -> [Maybe Type] -> [DictExpr] -> UsageDetails
1670 singleCall id tys dicts
1671 = MkUD {ud_binds = emptyBag,
1672 ud_calls = unitVarEnv id $ CIS id $
1673 Map.singleton (CallKey tys) (dicts, call_fvs) }
1674 where
1675 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
1676 tys_fvs = tyVarsOfTypes (catMaybes tys)
1677 -- The type args (tys) are guaranteed to be part of the dictionary
1678 -- types, because they are just the constrained types,
1679 -- and the dictionary is therefore sure to be bound
1680 -- inside the binding for any type variables free in the type;
1681 -- hence it's safe to neglect tyvars free in tys when making
1682 -- the free-var set for this call
1683 -- BUT I don't trust this reasoning; play safe and include tys_fvs
1684 --
1685 -- We don't include the 'id' itself.
1686
1687 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
1688 mkCallUDs env f args
1689 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
1690 res
1691 where
1692 res = mkCallUDs' env f args
1693
1694 mkCallUDs' env f args
1695 | not (want_calls_for f) -- Imported from elsewhere
1696 || null theta -- Not overloaded
1697 = emptyUDs
1698
1699 | not (all type_determines_value theta)
1700 || not (spec_tys `lengthIs` n_tyvars)
1701 || not ( dicts `lengthIs` n_dicts)
1702 || not (any (interestingDict env) dicts) -- Note [Interesting dictionary arguments]
1703 -- See also Note [Specialisations already covered]
1704 = -- pprTrace "mkCallUDs: discarding" _trace_doc
1705 emptyUDs -- Not overloaded, or no specialisation wanted
1706
1707 | otherwise
1708 = -- pprTrace "mkCallUDs: keeping" _trace_doc
1709 singleCall f spec_tys dicts
1710 where
1711 _trace_doc = vcat [ppr f, ppr args, ppr n_tyvars, ppr n_dicts
1712 , ppr (map (interestingDict env) dicts)]
1713 (tyvars, theta, _) = tcSplitSigmaTy (idType f)
1714 constrained_tyvars = closeOverKinds (tyVarsOfTypes theta)
1715 n_tyvars = length tyvars
1716 n_dicts = length theta
1717
1718 spec_tys = [mk_spec_ty tv ty | (tv, Type ty) <- tyvars `zip` args]
1719 dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
1720
1721 mk_spec_ty tyvar ty
1722 | tyvar `elemVarSet` constrained_tyvars = Just ty
1723 | otherwise = Nothing
1724
1725 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
1726 -- For imported things, we gather call instances if
1727 -- there is an unfolding that we could in principle specialise
1728 -- We might still decide not to use it (consulting dflags)
1729 -- in specImports
1730 -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
1731
1732 type_determines_value pred -- See Note [Type determines value]
1733 = case classifyPredType pred of
1734 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs
1735 EqPred {} -> True
1736 IrredPred {} -> True -- Things like (D []) where D is a
1737 -- Constraint-ranged family; Trac #7785
1738
1739 {-
1740 Note [Type determines value]
1741 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1742 Only specialise if all overloading is on non-IP *class* params,
1743 because these are the ones whose *type* determines their *value*. In
1744 parrticular, with implicit params, the type args *don't* say what the
1745 value of the implicit param is! See Trac #7101
1746
1747 However, consider
1748 type family D (v::*->*) :: Constraint
1749 type instance D [] = ()
1750 f :: D v => v Char -> Int
1751 If we see a call (f "foo"), we'll pass a "dictionary"
1752 () |> (g :: () ~ D [])
1753 and it's good to specialise f at this dictionary.
1754
1755 So the question is: can an implicit parameter "hide inside" a
1756 type-family constraint like (D a). Well, no. We don't allow
1757 type instance D Maybe = ?x:Int
1758 Hence the IrredPred case in type_determines_value.
1759 See Trac #7785.
1760
1761 Note [Interesting dictionary arguments]
1762 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1763 Consider this
1764 \a.\d:Eq a. let f = ... in ...(f d)...
1765 There really is not much point in specialising f wrt the dictionary d,
1766 because the code for the specialised f is not improved at all, because
1767 d is lambda-bound. We simply get junk specialisations.
1768
1769 What is "interesting"? Just that it has *some* structure. But what about
1770 variables?
1771
1772 * A variable might be imported, in which case its unfolding
1773 will tell us whether it has useful structure
1774
1775 * Local variables are cloned on the way down (to avoid clashes when
1776 we float dictionaries), and cloning drops the unfolding
1777 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
1778 nuisance to give them unfoldings. So we keep track of the
1779 "interesting" dictionaries as a VarSet in SpecEnv.
1780 We have to take care to put any new interesting dictionary
1781 bindings in the set.
1782
1783 We accidentally lost accurate tracking of local variables for a long
1784 time, because cloned variables don't have unfoldings. But makes a
1785 massive difference in a few cases, eg Trac #5113. For nofib as a
1786 whole it's only a small win: 2.2% improvement in allocation for ansi,
1787 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
1788 -}
1789
1790 interestingDict :: SpecEnv -> CoreExpr -> Bool
1791 -- A dictionary argument is interesting if it has *some* structure
1792 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
1793 || isDataConWorkId v
1794 || v `elemVarSet` se_interesting env
1795 interestingDict _ (Type _) = False
1796 interestingDict _ (Coercion _) = False
1797 interestingDict env (App fn (Type _)) = interestingDict env fn
1798 interestingDict env (App fn (Coercion _)) = interestingDict env fn
1799 interestingDict env (Tick _ a) = interestingDict env a
1800 interestingDict env (Cast e _) = interestingDict env e
1801 interestingDict _ _ = True
1802
1803 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
1804 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
1805 (MkUD {ud_binds = db2, ud_calls = calls2})
1806 = MkUD { ud_binds = db1 `unionBags` db2
1807 , ud_calls = calls1 `unionCalls` calls2 }
1808
1809 plusUDList :: [UsageDetails] -> UsageDetails
1810 plusUDList = foldr plusUDs emptyUDs
1811
1812 -----------------------------
1813 _dictBindBndrs :: Bag DictBind -> [Id]
1814 _dictBindBndrs dbs = foldrBag ((++) . bindersOf . fst) [] dbs
1815
1816 mkDB :: CoreBind -> DictBind
1817 mkDB bind = (bind, bind_fvs bind)
1818
1819 bind_fvs :: CoreBind -> VarSet
1820 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
1821 bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs
1822 where
1823 bndrs = map fst prs
1824 rhs_fvs = unionVarSets (map pair_fvs prs)
1825
1826 pair_fvs :: (Id, CoreExpr) -> VarSet
1827 pair_fvs (bndr, rhs) = exprFreeVars rhs `unionVarSet` idFreeVars bndr
1828 -- Don't forget variables mentioned in the
1829 -- rules of the bndr. C.f. OccAnal.addRuleUsage
1830 -- Also tyvars mentioned in its type; they may not appear in the RHS
1831 -- type T a = Int
1832 -- x :: T a = 3
1833
1834 flattenDictBinds :: Bag DictBind -> [(Id,CoreExpr)] -> [(Id,CoreExpr)]
1835 flattenDictBinds dbs pairs
1836 = foldrBag add pairs dbs
1837 where
1838 add (NonRec b r,_) pairs = (b,r) : pairs
1839 add (Rec prs1, _) pairs = prs1 ++ pairs
1840
1841 snocDictBinds :: UsageDetails -> [CoreBind] -> UsageDetails
1842 -- Add ud_binds to the tail end of the bindings in uds
1843 snocDictBinds uds dbs
1844 = uds { ud_binds = ud_binds uds `unionBags`
1845 foldr (consBag . mkDB) emptyBag dbs }
1846
1847 consDictBind :: CoreBind -> UsageDetails -> UsageDetails
1848 consDictBind bind uds = uds { ud_binds = mkDB bind `consBag` ud_binds uds }
1849
1850 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
1851 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
1852
1853 snocDictBind :: UsageDetails -> CoreBind -> UsageDetails
1854 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` mkDB bind }
1855
1856 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
1857 wrapDictBinds dbs binds
1858 = foldrBag add binds dbs
1859 where
1860 add (bind,_) binds = bind : binds
1861
1862 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
1863 wrapDictBindsE dbs expr
1864 = foldrBag add expr dbs
1865 where
1866 add (bind,_) expr = Let bind expr
1867
1868 ----------------------
1869 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
1870 -- Used at a lambda or case binder; just dump anything mentioning the binder
1871 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
1872 | null bndrs = (uds, emptyBag) -- Common in case alternatives
1873 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
1874 (free_uds, dump_dbs)
1875 where
1876 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
1877 bndr_set = mkVarSet bndrs
1878 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
1879 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
1880 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
1881 -- no calls for any of the dicts in dump_dbs
1882
1883 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
1884 -- Used at a lambda or case binder; just dump anything mentioning the binder
1885 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
1886 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
1887 (free_uds, dump_dbs, float_all)
1888 where
1889 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
1890 bndr_set = mkVarSet bndrs
1891 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
1892 free_calls = deleteCallsFor bndrs orig_calls
1893 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
1894
1895 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
1896 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
1897 = -- pprTrace ("callsForMe")
1898 -- (vcat [ppr fn,
1899 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
1900 -- text "Orig calls =" <+> ppr orig_calls,
1901 -- text "Dep set =" <+> ppr dep_set,
1902 -- text "Calls for me =" <+> ppr calls_for_me]) $
1903 (uds_without_me, calls_for_me)
1904 where
1905 uds_without_me = MkUD { ud_binds = orig_dbs, ud_calls = delVarEnv orig_calls fn }
1906 calls_for_me = case lookupVarEnv orig_calls fn of
1907 Nothing -> []
1908 Just (CIS _ calls) -> filter_dfuns (Map.toList calls)
1909
1910 dep_set = foldlBag go (unitVarSet fn) orig_dbs
1911 go dep_set (db,fvs) | fvs `intersectsVarSet` dep_set
1912 = extendVarSetList dep_set (bindersOf db)
1913 | otherwise = dep_set
1914
1915 -- Note [Specialisation of dictionary functions]
1916 filter_dfuns | isDFunId fn = filter ok_call
1917 | otherwise = \cs -> cs
1918
1919 ok_call (_, (_,fvs)) = not (fvs `intersectsVarSet` dep_set)
1920
1921 ----------------------
1922 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
1923 -- Returns (free_dbs, dump_dbs, dump_set)
1924 splitDictBinds dbs bndr_set
1925 = foldlBag split_db (emptyBag, emptyBag, bndr_set) dbs
1926 -- Important that it's foldl not foldr;
1927 -- we're accumulating the set of dumped ids in dump_set
1928 where
1929 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
1930 | dump_idset `intersectsVarSet` fvs -- Dump it
1931 = (free_dbs, dump_dbs `snocBag` db,
1932 extendVarSetList dump_idset (bindersOf bind))
1933
1934 | otherwise -- Don't dump it
1935 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
1936
1937
1938 ----------------------
1939 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
1940 -- Remove calls *mentioning* bs
1941 deleteCallsMentioning bs calls
1942 = mapVarEnv filter_calls calls
1943 where
1944 filter_calls :: CallInfoSet -> CallInfoSet
1945 filter_calls (CIS f calls) = CIS f (Map.filter keep_call calls)
1946 keep_call (_, fvs) = not (fvs `intersectsVarSet` bs)
1947
1948 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
1949 -- Remove calls *for* bs
1950 deleteCallsFor bs calls = delVarEnvList calls bs
1951
1952 {-
1953 ************************************************************************
1954 * *
1955 \subsubsection{Boring helper functions}
1956 * *
1957 ************************************************************************
1958 -}
1959
1960 newtype SpecM a = SpecM (State SpecState a)
1961
1962 data SpecState = SpecState {
1963 spec_uniq_supply :: UniqSupply,
1964 spec_module :: Module,
1965 spec_dflags :: DynFlags
1966 }
1967
1968 instance Functor SpecM where
1969 fmap = liftM
1970
1971 instance Applicative SpecM where
1972 pure = return
1973 (<*>) = ap
1974
1975 instance Monad SpecM where
1976 SpecM x >>= f = SpecM $ do y <- x
1977 case f y of
1978 SpecM z ->
1979 z
1980 return x = SpecM $ return x
1981 fail str = SpecM $ fail str
1982
1983 instance MonadUnique SpecM where
1984 getUniqueSupplyM
1985 = SpecM $ do st <- get
1986 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
1987 put $ st { spec_uniq_supply = us2 }
1988 return us1
1989
1990 getUniqueM
1991 = SpecM $ do st <- get
1992 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
1993 put $ st { spec_uniq_supply = us' }
1994 return u
1995
1996 instance HasDynFlags SpecM where
1997 getDynFlags = SpecM $ liftM spec_dflags get
1998
1999 instance HasModule SpecM where
2000 getModule = SpecM $ liftM spec_module get
2001
2002 runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2003 runSpecM dflags this_mod (SpecM spec)
2004 = do us <- getUniqueSupplyM
2005 let initialState = SpecState {
2006 spec_uniq_supply = us,
2007 spec_module = this_mod,
2008 spec_dflags = dflags
2009 }
2010 return $ evalState spec initialState
2011
2012 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2013 mapAndCombineSM _ [] = return ([], emptyUDs)
2014 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2015 (ys, uds2) <- mapAndCombineSM f xs
2016 return (y:ys, uds1 `plusUDs` uds2)
2017
2018 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2019 extendTvSubstList env tv_binds
2020 = env { se_subst = CoreSubst.extendTvSubstList (se_subst env) tv_binds }
2021
2022 substTy :: SpecEnv -> Type -> Type
2023 substTy env ty = CoreSubst.substTy (se_subst env) ty
2024
2025 substCo :: SpecEnv -> Coercion -> Coercion
2026 substCo env co = CoreSubst.substCo (se_subst env) co
2027
2028 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2029 substBndr env bs = case CoreSubst.substBndr (se_subst env) bs of
2030 (subst', bs') -> (env { se_subst = subst' }, bs')
2031
2032 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2033 substBndrs env bs = case CoreSubst.substBndrs (se_subst env) bs of
2034 (subst', bs') -> (env { se_subst = subst' }, bs')
2035
2036 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2037 -- Clone the binders of the bind; return new bind with the cloned binders
2038 -- Return the substitution to use for RHSs, and the one to use for the body
2039 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2040 = do { us <- getUniqueSupplyM
2041 ; let (subst', bndr') = CoreSubst.cloneIdBndr subst us bndr
2042 interesting' | interestingDict env rhs
2043 = interesting `extendVarSet` bndr'
2044 | otherwise = interesting
2045 ; return (env, env { se_subst = subst', se_interesting = interesting' }
2046 , NonRec bndr' rhs) }
2047
2048 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2049 = do { us <- getUniqueSupplyM
2050 ; let (subst', bndrs') = CoreSubst.cloneRecIdBndrs subst us (map fst pairs)
2051 env' = env { se_subst = subst'
2052 , se_interesting = interesting `extendVarSetList`
2053 [ v | (v,r) <- pairs, interestingDict env r ] }
2054 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2055
2056 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2057 -- Make up completely fresh binders for the dictionaries
2058 -- Their bindings are going to float outwards
2059 newDictBndr env b = do { uniq <- getUniqueM
2060 ; let n = idName b
2061 ty' = substTy env (idType b)
2062 ; return (mkUserLocal (nameOccName n) uniq ty' HasSigId (getSrcSpan n)) }
2063
2064 newSpecIdSM :: Id -> Type -> SpecM Id
2065 -- Give the new Id a similar occurrence name to the old one
2066 newSpecIdSM old_id new_ty
2067 = do { uniq <- getUniqueM
2068 ; let name = idName old_id
2069 new_occ = mkSpecOcc (nameOccName name)
2070 new_id = mkUserLocal new_occ uniq new_ty NoSigId (getSrcSpan name)
2071 ; return new_id }
2072
2073 {-
2074 Old (but interesting) stuff about unboxed bindings
2075 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2076
2077 What should we do when a value is specialised to a *strict* unboxed value?
2078
2079 map_*_* f (x:xs) = let h = f x
2080 t = map f xs
2081 in h:t
2082
2083 Could convert let to case:
2084
2085 map_*_Int# f (x:xs) = case f x of h# ->
2086 let t = map f xs
2087 in h#:t
2088
2089 This may be undesirable since it forces evaluation here, but the value
2090 may not be used in all branches of the body. In the general case this
2091 transformation is impossible since the mutual recursion in a letrec
2092 cannot be expressed as a case.
2093
2094 There is also a problem with top-level unboxed values, since our
2095 implementation cannot handle unboxed values at the top level.
2096
2097 Solution: Lift the binding of the unboxed value and extract it when it
2098 is used:
2099
2100 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2101 t = map f xs
2102 in case h of
2103 _Lift h# -> h#:t
2104
2105 Now give it to the simplifier and the _Lifting will be optimised away.
2106
2107 The benfit is that we have given the specialised "unboxed" values a
2108 very simplep lifted semantics and then leave it up to the simplifier to
2109 optimise it --- knowing that the overheads will be removed in nearly
2110 all cases.
2111
2112 In particular, the value will only be evaluted in the branches of the
2113 program which use it, rather than being forced at the point where the
2114 value is bound. For example:
2115
2116 filtermap_*_* p f (x:xs)
2117 = let h = f x
2118 t = ...
2119 in case p x of
2120 True -> h:t
2121 False -> t
2122 ==>
2123 filtermap_*_Int# p f (x:xs)
2124 = let h = case (f x) of h# -> _Lift h#
2125 t = ...
2126 in case p x of
2127 True -> case h of _Lift h#
2128 -> h#:t
2129 False -> t
2130
2131 The binding for h can still be inlined in the one branch and the
2132 _Lifting eliminated.
2133
2134
2135 Question: When won't the _Lifting be eliminated?
2136
2137 Answer: When they at the top-level (where it is necessary) or when
2138 inlining would duplicate work (or possibly code depending on
2139 options). However, the _Lifting will still be eliminated if the
2140 strictness analyser deems the lifted binding strict.
2141 -}