c4fe042b1fb8f55e2ed032e8b491fadcc270f770
[ghc.git] / compiler / specialise / Specialise.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5 -}
6
7 {-# LANGUAGE CPP #-}
8 module Specialise ( specProgram, specUnfolding ) where
9
10 #include "HsVersions.h"
11
12 import GhcPrelude
13
14 import Id
15 import TcType hiding( substTy )
16 import Type hiding( substTy, extendTvSubstList )
17 import Module( Module, HasModule(..) )
18 import Coercion( Coercion )
19 import CoreMonad
20 import qualified CoreSubst
21 import CoreUnfold
22 import Var ( isLocalVar )
23 import VarSet
24 import VarEnv
25 import CoreSyn
26 import Rules
27 import CoreOpt ( collectBindersPushingCo )
28 import CoreUtils ( exprIsTrivial, applyTypeToArgs, mkCast )
29 import CoreFVs
30 import FV ( InterestingVarFun )
31 import CoreArity ( etaExpandToJoinPointRule )
32 import UniqSupply
33 import Name
34 import MkId ( voidArgId, voidPrimId )
35 import Maybes ( catMaybes, isJust )
36 import MonadUtils ( foldlM )
37 import BasicTypes
38 import HscTypes
39 import Bag
40 import DynFlags
41 import Util
42 import Outputable
43 import FastString
44 import State
45 import UniqDFM
46
47 import Control.Monad
48 import qualified Control.Monad.Fail as MonadFail
49
50 {-
51 ************************************************************************
52 * *
53 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
54 * *
55 ************************************************************************
56
57 These notes describe how we implement specialisation to eliminate
58 overloading.
59
60 The specialisation pass works on Core
61 syntax, complete with all the explicit dictionary application,
62 abstraction and construction as added by the type checker. The
63 existing type checker remains largely as it is.
64
65 One important thought: the {\em types} passed to an overloaded
66 function, and the {\em dictionaries} passed are mutually redundant.
67 If the same function is applied to the same type(s) then it is sure to
68 be applied to the same dictionary(s)---or rather to the same {\em
69 values}. (The arguments might look different but they will evaluate
70 to the same value.)
71
72 Second important thought: we know that we can make progress by
73 treating dictionary arguments as static and worth specialising on. So
74 we can do without binding-time analysis, and instead specialise on
75 dictionary arguments and no others.
76
77 The basic idea
78 ~~~~~~~~~~~~~~
79 Suppose we have
80
81 let f = <f_rhs>
82 in <body>
83
84 and suppose f is overloaded.
85
86 STEP 1: CALL-INSTANCE COLLECTION
87
88 We traverse <body>, accumulating all applications of f to types and
89 dictionaries.
90
91 (Might there be partial applications, to just some of its types and
92 dictionaries? In principle yes, but in practice the type checker only
93 builds applications of f to all its types and dictionaries, so partial
94 applications could only arise as a result of transformation, and even
95 then I think it's unlikely. In any case, we simply don't accumulate such
96 partial applications.)
97
98
99 STEP 2: EQUIVALENCES
100
101 So now we have a collection of calls to f:
102 f t1 t2 d1 d2
103 f t3 t4 d3 d4
104 ...
105 Notice that f may take several type arguments. To avoid ambiguity, we
106 say that f is called at type t1/t2 and t3/t4.
107
108 We take equivalence classes using equality of the *types* (ignoring
109 the dictionary args, which as mentioned previously are redundant).
110
111 STEP 3: SPECIALISATION
112
113 For each equivalence class, choose a representative (f t1 t2 d1 d2),
114 and create a local instance of f, defined thus:
115
116 f@t1/t2 = <f_rhs> t1 t2 d1 d2
117
118 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
119 of simplification will now result. However we don't actually *do* that
120 simplification. Rather, we leave it for the simplifier to do. If we
121 *did* do it, though, we'd get more call instances from the specialised
122 RHS. We can work out what they are by instantiating the call-instance
123 set from f's RHS with the types t1, t2.
124
125 Add this new id to f's IdInfo, to record that f has a specialised version.
126
127 Before doing any of this, check that f's IdInfo doesn't already
128 tell us about an existing instance of f at the required type/s.
129 (This might happen if specialisation was applied more than once, or
130 it might arise from user SPECIALIZE pragmas.)
131
132 Recursion
133 ~~~~~~~~~
134 Wait a minute! What if f is recursive? Then we can't just plug in
135 its right-hand side, can we?
136
137 But it's ok. The type checker *always* creates non-recursive definitions
138 for overloaded recursive functions. For example:
139
140 f x = f (x+x) -- Yes I know its silly
141
142 becomes
143
144 f a (d::Num a) = let p = +.sel a d
145 in
146 letrec fl (y::a) = fl (p y y)
147 in
148 fl
149
150 We still have recursion for non-overloaded functions which we
151 specialise, but the recursive call should get specialised to the
152 same recursive version.
153
154
155 Polymorphism 1
156 ~~~~~~~~~~~~~~
157
158 All this is crystal clear when the function is applied to *constant
159 types*; that is, types which have no type variables inside. But what if
160 it is applied to non-constant types? Suppose we find a call of f at type
161 t1/t2. There are two possibilities:
162
163 (a) The free type variables of t1, t2 are in scope at the definition point
164 of f. In this case there's no problem, we proceed just as before. A common
165 example is as follows. Here's the Haskell:
166
167 g y = let f x = x+x
168 in f y + f y
169
170 After typechecking we have
171
172 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
173 in +.sel a d (f a d y) (f a d y)
174
175 Notice that the call to f is at type type "a"; a non-constant type.
176 Both calls to f are at the same type, so we can specialise to give:
177
178 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
179 in +.sel a d (f@a y) (f@a y)
180
181
182 (b) The other case is when the type variables in the instance types
183 are *not* in scope at the definition point of f. The example we are
184 working with above is a good case. There are two instances of (+.sel a d),
185 but "a" is not in scope at the definition of +.sel. Can we do anything?
186 Yes, we can "common them up", a sort of limited common sub-expression deal.
187 This would give:
188
189 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
190 f@a (x::a) = +.sel@a x x
191 in +.sel@a (f@a y) (f@a y)
192
193 This can save work, and can't be spotted by the type checker, because
194 the two instances of +.sel weren't originally at the same type.
195
196 Further notes on (b)
197
198 * There are quite a few variations here. For example, the defn of
199 +.sel could be floated ouside the \y, to attempt to gain laziness.
200 It certainly mustn't be floated outside the \d because the d has to
201 be in scope too.
202
203 * We don't want to inline f_rhs in this case, because
204 that will duplicate code. Just commoning up the call is the point.
205
206 * Nothing gets added to +.sel's IdInfo.
207
208 * Don't bother unless the equivalence class has more than one item!
209
210 Not clear whether this is all worth it. It is of course OK to
211 simply discard call-instances when passing a big lambda.
212
213 Polymorphism 2 -- Overloading
214 ~~~~~~~~~~~~~~
215 Consider a function whose most general type is
216
217 f :: forall a b. Ord a => [a] -> b -> b
218
219 There is really no point in making a version of g at Int/Int and another
220 at Int/Bool, because it's only instantiating the type variable "a" which
221 buys us any efficiency. Since g is completely polymorphic in b there
222 ain't much point in making separate versions of g for the different
223 b types.
224
225 That suggests that we should identify which of g's type variables
226 are constrained (like "a") and which are unconstrained (like "b").
227 Then when taking equivalence classes in STEP 2, we ignore the type args
228 corresponding to unconstrained type variable. In STEP 3 we make
229 polymorphic versions. Thus:
230
231 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
232
233 We do this.
234
235
236 Dictionary floating
237 ~~~~~~~~~~~~~~~~~~~
238 Consider this
239
240 f a (d::Num a) = let g = ...
241 in
242 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
243
244 Here, g is only called at one type, but the dictionary isn't in scope at the
245 definition point for g. Usually the type checker would build a
246 definition for d1 which enclosed g, but the transformation system
247 might have moved d1's defn inward. Solution: float dictionary bindings
248 outwards along with call instances.
249
250 Consider
251
252 f x = let g p q = p==q
253 h r s = (r+s, g r s)
254 in
255 h x x
256
257
258 Before specialisation, leaving out type abstractions we have
259
260 f df x = let g :: Eq a => a -> a -> Bool
261 g dg p q = == dg p q
262 h :: Num a => a -> a -> (a, Bool)
263 h dh r s = let deq = eqFromNum dh
264 in (+ dh r s, g deq r s)
265 in
266 h df x x
267
268 After specialising h we get a specialised version of h, like this:
269
270 h' r s = let deq = eqFromNum df
271 in (+ df r s, g deq r s)
272
273 But we can't naively make an instance for g from this, because deq is not in scope
274 at the defn of g. Instead, we have to float out the (new) defn of deq
275 to widen its scope. Notice that this floating can't be done in advance -- it only
276 shows up when specialisation is done.
277
278 User SPECIALIZE pragmas
279 ~~~~~~~~~~~~~~~~~~~~~~~
280 Specialisation pragmas can be digested by the type checker, and implemented
281 by adding extra definitions along with that of f, in the same way as before
282
283 f@t1/t2 = <f_rhs> t1 t2 d1 d2
284
285 Indeed the pragmas *have* to be dealt with by the type checker, because
286 only it knows how to build the dictionaries d1 and d2! For example
287
288 g :: Ord a => [a] -> [a]
289 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
290
291 Here, the specialised version of g is an application of g's rhs to the
292 Ord dictionary for (Tree Int), which only the type checker can conjure
293 up. There might not even *be* one, if (Tree Int) is not an instance of
294 Ord! (All the other specialision has suitable dictionaries to hand
295 from actual calls.)
296
297 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
298 it is buried in a complex (as-yet-un-desugared) binding group.
299 Maybe we should say
300
301 f@t1/t2 = f* t1 t2 d1 d2
302
303 where f* is the Id f with an IdInfo which says "inline me regardless!".
304 Indeed all the specialisation could be done in this way.
305 That in turn means that the simplifier has to be prepared to inline absolutely
306 any in-scope let-bound thing.
307
308
309 Again, the pragma should permit polymorphism in unconstrained variables:
310
311 h :: Ord a => [a] -> b -> b
312 {-# SPECIALIZE h :: [Int] -> b -> b #-}
313
314 We *insist* that all overloaded type variables are specialised to ground types,
315 (and hence there can be no context inside a SPECIALIZE pragma).
316 We *permit* unconstrained type variables to be specialised to
317 - a ground type
318 - or left as a polymorphic type variable
319 but nothing in between. So
320
321 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
322
323 is *illegal*. (It can be handled, but it adds complication, and gains the
324 programmer nothing.)
325
326
327 SPECIALISING INSTANCE DECLARATIONS
328 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
329 Consider
330
331 instance Foo a => Foo [a] where
332 ...
333 {-# SPECIALIZE instance Foo [Int] #-}
334
335 The original instance decl creates a dictionary-function
336 definition:
337
338 dfun.Foo.List :: forall a. Foo a -> Foo [a]
339
340 The SPECIALIZE pragma just makes a specialised copy, just as for
341 ordinary function definitions:
342
343 dfun.Foo.List@Int :: Foo [Int]
344 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
345
346 The information about what instance of the dfun exist gets added to
347 the dfun's IdInfo in the same way as a user-defined function too.
348
349
350 Automatic instance decl specialisation?
351 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
352 Can instance decls be specialised automatically? It's tricky.
353 We could collect call-instance information for each dfun, but
354 then when we specialised their bodies we'd get new call-instances
355 for ordinary functions; and when we specialised their bodies, we might get
356 new call-instances of the dfuns, and so on. This all arises because of
357 the unrestricted mutual recursion between instance decls and value decls.
358
359 Still, there's no actual problem; it just means that we may not do all
360 the specialisation we could theoretically do.
361
362 Furthermore, instance decls are usually exported and used non-locally,
363 so we'll want to compile enough to get those specialisations done.
364
365 Lastly, there's no such thing as a local instance decl, so we can
366 survive solely by spitting out *usage* information, and then reading that
367 back in as a pragma when next compiling the file. So for now,
368 we only specialise instance decls in response to pragmas.
369
370
371 SPITTING OUT USAGE INFORMATION
372 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
373
374 To spit out usage information we need to traverse the code collecting
375 call-instance information for all imported (non-prelude?) functions
376 and data types. Then we equivalence-class it and spit it out.
377
378 This is done at the top-level when all the call instances which escape
379 must be for imported functions and data types.
380
381 *** Not currently done ***
382
383
384 Partial specialisation by pragmas
385 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
386 What about partial specialisation:
387
388 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
389 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
390
391 or even
392
393 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
394
395 Seems quite reasonable. Similar things could be done with instance decls:
396
397 instance (Foo a, Foo b) => Foo (a,b) where
398 ...
399 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
400 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
401
402 Ho hum. Things are complex enough without this. I pass.
403
404
405 Requirements for the simplifier
406 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
407 The simplifier has to be able to take advantage of the specialisation.
408
409 * When the simplifier finds an application of a polymorphic f, it looks in
410 f's IdInfo in case there is a suitable instance to call instead. This converts
411
412 f t1 t2 d1 d2 ===> f_t1_t2
413
414 Note that the dictionaries get eaten up too!
415
416 * Dictionary selection operations on constant dictionaries must be
417 short-circuited:
418
419 +.sel Int d ===> +Int
420
421 The obvious way to do this is in the same way as other specialised
422 calls: +.sel has inside it some IdInfo which tells that if it's applied
423 to the type Int then it should eat a dictionary and transform to +Int.
424
425 In short, dictionary selectors need IdInfo inside them for constant
426 methods.
427
428 * Exactly the same applies if a superclass dictionary is being
429 extracted:
430
431 Eq.sel Int d ===> dEqInt
432
433 * Something similar applies to dictionary construction too. Suppose
434 dfun.Eq.List is the function taking a dictionary for (Eq a) to
435 one for (Eq [a]). Then we want
436
437 dfun.Eq.List Int d ===> dEq.List_Int
438
439 Where does the Eq [Int] dictionary come from? It is built in
440 response to a SPECIALIZE pragma on the Eq [a] instance decl.
441
442 In short, dfun Ids need IdInfo with a specialisation for each
443 constant instance of their instance declaration.
444
445 All this uses a single mechanism: the SpecEnv inside an Id
446
447
448 What does the specialisation IdInfo look like?
449 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
450
451 The SpecEnv of an Id maps a list of types (the template) to an expression
452
453 [Type] |-> Expr
454
455 For example, if f has this RuleInfo:
456
457 [Int, a] -> \d:Ord Int. f' a
458
459 it means that we can replace the call
460
461 f Int t ===> (\d. f' t)
462
463 This chucks one dictionary away and proceeds with the
464 specialised version of f, namely f'.
465
466
467 What can't be done this way?
468 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
469 There is no way, post-typechecker, to get a dictionary for (say)
470 Eq a from a dictionary for Eq [a]. So if we find
471
472 ==.sel [t] d
473
474 we can't transform to
475
476 eqList (==.sel t d')
477
478 where
479 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
480
481 Of course, we currently have no way to automatically derive
482 eqList, nor to connect it to the Eq [a] instance decl, but you
483 can imagine that it might somehow be possible. Taking advantage
484 of this is permanently ruled out.
485
486 Still, this is no great hardship, because we intend to eliminate
487 overloading altogether anyway!
488
489 A note about non-tyvar dictionaries
490 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
491 Some Ids have types like
492
493 forall a,b,c. Eq a -> Ord [a] -> tau
494
495 This seems curious at first, because we usually only have dictionary
496 args whose types are of the form (C a) where a is a type variable.
497 But this doesn't hold for the functions arising from instance decls,
498 which sometimes get arguments with types of form (C (T a)) for some
499 type constructor T.
500
501 Should we specialise wrt this compound-type dictionary? We used to say
502 "no", saying:
503 "This is a heuristic judgement, as indeed is the fact that we
504 specialise wrt only dictionaries. We choose *not* to specialise
505 wrt compound dictionaries because at the moment the only place
506 they show up is in instance decls, where they are simply plugged
507 into a returned dictionary. So nothing is gained by specialising
508 wrt them."
509
510 But it is simpler and more uniform to specialise wrt these dicts too;
511 and in future GHC is likely to support full fledged type signatures
512 like
513 f :: Eq [(a,b)] => ...
514
515
516 ************************************************************************
517 * *
518 \subsubsection{The new specialiser}
519 * *
520 ************************************************************************
521
522 Our basic game plan is this. For let(rec) bound function
523 f :: (C a, D c) => (a,b,c,d) -> Bool
524
525 * Find any specialised calls of f, (f ts ds), where
526 ts are the type arguments t1 .. t4, and
527 ds are the dictionary arguments d1 .. d2.
528
529 * Add a new definition for f1 (say):
530
531 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
532
533 Note that we abstract over the unconstrained type arguments.
534
535 * Add the mapping
536
537 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
538
539 to the specialisations of f. This will be used by the
540 simplifier to replace calls
541 (f t1 t2 t3 t4) da db
542 by
543 (\d1 d1 -> f1 t2 t4) da db
544
545 All the stuff about how many dictionaries to discard, and what types
546 to apply the specialised function to, are handled by the fact that the
547 SpecEnv contains a template for the result of the specialisation.
548
549 We don't build *partial* specialisations for f. For example:
550
551 f :: Eq a => a -> a -> Bool
552 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
553
554 Here, little is gained by making a specialised copy of f.
555 There's a distinct danger that the specialised version would
556 first build a dictionary for (Eq b, Eq c), and then select the (==)
557 method from it! Even if it didn't, not a great deal is saved.
558
559 We do, however, generate polymorphic, but not overloaded, specialisations:
560
561 f :: Eq a => [a] -> b -> b -> b
562 ... SPECIALISE f :: [Int] -> b -> b -> b ...
563
564 Hence, the invariant is this:
565
566 *** no specialised version is overloaded ***
567
568
569 ************************************************************************
570 * *
571 \subsubsection{The exported function}
572 * *
573 ************************************************************************
574 -}
575
576 -- | Specialise calls to type-class overloaded functions occuring in a program.
577 specProgram :: ModGuts -> CoreM ModGuts
578 specProgram guts@(ModGuts { mg_module = this_mod
579 , mg_rules = local_rules
580 , mg_binds = binds })
581 = do { dflags <- getDynFlags
582
583 -- Specialise the bindings of this module
584 ; (binds', uds) <- runSpecM dflags this_mod (go binds)
585
586 -- Specialise imported functions
587 ; hpt_rules <- getRuleBase
588 ; let rule_base = extendRuleBaseList hpt_rules local_rules
589 ; (new_rules, spec_binds) <- specImports dflags this_mod top_env emptyVarSet
590 [] rule_base uds
591
592 ; let final_binds
593 | null spec_binds = binds'
594 | otherwise = Rec (flattenBinds spec_binds) : binds'
595 -- Note [Glom the bindings if imported functions are specialised]
596
597 ; return (guts { mg_binds = final_binds
598 , mg_rules = new_rules ++ local_rules }) }
599 where
600 -- We need to start with a Subst that knows all the things
601 -- that are in scope, so that the substitution engine doesn't
602 -- accidentally re-use a unique that's already in use
603 -- Easiest thing is to do it all at once, as if all the top-level
604 -- decls were mutually recursive
605 top_env = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
606 bindersOfBinds binds
607 , se_interesting = emptyVarSet }
608
609 go [] = return ([], emptyUDs)
610 go (bind:binds) = do (binds', uds) <- go binds
611 (bind', uds') <- specBind top_env bind uds
612 return (bind' ++ binds', uds')
613
614 {-
615 Note [Wrap bindings returned by specImports]
616 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
617 'specImports' returns a set of specialized bindings. However, these are lacking
618 necessary floated dictionary bindings, which are returned by
619 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
620 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
621 for instance, the 'specImports' call in 'specProgram'.
622
623
624 Note [Disabling cross-module specialisation]
625 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
626 Since GHC 7.10 we have performed specialisation of INLINABLE bindings living
627 in modules outside of the current module. This can sometimes uncover user code
628 which explodes in size when aggressively optimized. The
629 -fno-cross-module-specialise option was introduced to allow users to being
630 bitten by such instances to revert to the pre-7.10 behavior.
631
632 See Trac #10491
633 -}
634
635 -- | Specialise a set of calls to imported bindings
636 specImports :: DynFlags
637 -> Module
638 -> SpecEnv -- Passed in so that all top-level Ids are in scope
639 -> VarSet -- Don't specialise these ones
640 -- See Note [Avoiding recursive specialisation]
641 -> [Id] -- Stack of imported functions being specialised
642 -> RuleBase -- Rules from this module and the home package
643 -- (but not external packages, which can change)
644 -> UsageDetails -- Calls for imported things, and floating bindings
645 -> CoreM ( [CoreRule] -- New rules
646 , [CoreBind] ) -- Specialised bindings
647 -- See Note [Wrapping bindings returned by specImports]
648 specImports dflags this_mod top_env done callers rule_base
649 (MkUD { ud_binds = dict_binds, ud_calls = calls })
650 -- See Note [Disabling cross-module specialisation]
651 | not $ gopt Opt_CrossModuleSpecialise dflags
652 = return ([], [])
653
654 | otherwise
655 = do { let import_calls = dVarEnvElts calls
656 ; (rules, spec_binds) <- go rule_base import_calls
657
658 -- Don't forget to wrap the specialized bindings with
659 -- bindings for the needed dictionaries.
660 -- See Note [Wrap bindings returned by specImports]
661 ; let spec_binds' = wrapDictBinds dict_binds spec_binds
662
663 ; return (rules, spec_binds') }
664 where
665 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
666 go _ [] = return ([], [])
667 go rb (cis@(CIS fn _) : other_calls)
668 = do { let ok_calls = filterCalls cis dict_binds
669 -- Drop calls that (directly or indirectly) refer to fn
670 -- See Note [Avoiding loops]
671 -- ; debugTraceMsg (text "specImport" <+> vcat [ ppr fn
672 -- , text "calls" <+> ppr cis
673 -- , text "ud_binds =" <+> ppr dict_binds
674 -- , text "dump set =" <+> ppr dump_set
675 -- , text "filtered calls =" <+> ppr ok_calls ])
676 ; (rules1, spec_binds1) <- specImport dflags this_mod top_env
677 done callers rb fn ok_calls
678
679 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
680 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
681
682 specImport :: DynFlags
683 -> Module
684 -> SpecEnv -- Passed in so that all top-level Ids are in scope
685 -> VarSet -- Don't specialise these
686 -- See Note [Avoiding recursive specialisation]
687 -> [Id] -- Stack of imported functions being specialised
688 -> RuleBase -- Rules from this module
689 -> Id -> [CallInfo] -- Imported function and calls for it
690 -> CoreM ( [CoreRule] -- New rules
691 , [CoreBind] ) -- Specialised bindings
692 specImport dflags this_mod top_env done callers rb fn calls_for_fn
693 | fn `elemVarSet` done
694 = return ([], []) -- No warning. This actually happens all the time
695 -- when specialising a recursive function, because
696 -- the RHS of the specialised function contains a recursive
697 -- call to the original function
698
699 | null calls_for_fn -- We filtered out all the calls in deleteCallsMentioning
700 = return ([], [])
701
702 | wantSpecImport dflags unfolding
703 , Just rhs <- maybeUnfoldingTemplate unfolding
704 = do { -- Get rules from the external package state
705 -- We keep doing this in case we "page-fault in"
706 -- more rules as we go along
707 ; hsc_env <- getHscEnv
708 ; eps <- liftIO $ hscEPS hsc_env
709 ; vis_orphs <- getVisibleOrphanMods
710 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
711 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
712
713 ; (rules1, spec_pairs, uds)
714 <- -- pprTrace "specImport1" (vcat [ppr fn, ppr calls_for_fn, ppr rhs]) $
715 runSpecM dflags this_mod $
716 specCalls (Just this_mod) top_env rules_for_fn calls_for_fn fn rhs
717 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
718 -- After the rules kick in we may get recursion, but
719 -- we rely on a global GlomBinds to sort that out later
720 -- See Note [Glom the bindings if imported functions are specialised]
721
722 -- Now specialise any cascaded calls
723 ; (rules2, spec_binds2) <- -- pprTrace "specImport 2" (ppr fn $$ ppr rules1 $$ ppr spec_binds1) $
724 specImports dflags this_mod top_env
725 (extendVarSet done fn)
726 (fn:callers)
727 (extendRuleBaseList rb rules1)
728 uds
729
730 ; let final_binds = spec_binds2 ++ spec_binds1
731
732 ; return (rules2 ++ rules1, final_binds) }
733
734 | warnMissingSpecs dflags callers
735 = do { warnMsg (vcat [ hang (text "Could not specialise imported function" <+> quotes (ppr fn))
736 2 (vcat [ text "when specialising" <+> quotes (ppr caller)
737 | caller <- callers])
738 , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
739 , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ])
740 ; return ([], []) }
741
742 | otherwise
743 = return ([], [])
744 where
745 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
746
747 warnMissingSpecs :: DynFlags -> [Id] -> Bool
748 -- See Note [Warning about missed specialisations]
749 warnMissingSpecs dflags callers
750 | wopt Opt_WarnAllMissedSpecs dflags = True
751 | not (wopt Opt_WarnMissedSpecs dflags) = False
752 | null callers = False
753 | otherwise = all has_inline_prag callers
754 where
755 has_inline_prag id = isAnyInlinePragma (idInlinePragma id)
756
757 wantSpecImport :: DynFlags -> Unfolding -> Bool
758 -- See Note [Specialise imported INLINABLE things]
759 wantSpecImport dflags unf
760 = case unf of
761 NoUnfolding -> False
762 BootUnfolding -> False
763 OtherCon {} -> False
764 DFunUnfolding {} -> True
765 CoreUnfolding { uf_src = src, uf_guidance = _guidance }
766 | gopt Opt_SpecialiseAggressively dflags -> True
767 | isStableSource src -> True
768 -- Specialise even INLINE things; it hasn't inlined yet,
769 -- so perhaps it never will. Moreover it may have calls
770 -- inside it that we want to specialise
771 | otherwise -> False -- Stable, not INLINE, hence INLINABLE
772
773 {- Note [Warning about missed specialisations]
774 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
775 Suppose
776 * In module Lib, you carefully mark a function 'foo' INLINABLE
777 * Import Lib(foo) into another module M
778 * Call 'foo' at some specialised type in M
779 Then you jolly well expect it to be specialised in M. But what if
780 'foo' calls another function 'Lib.bar'. Then you'd like 'bar' to be
781 specialised too. But if 'bar' is not marked INLINABLE it may well
782 not be specialised. The warning Opt_WarnMissedSpecs warns about this.
783
784 It's more noisy to warning about a missed specialisation opportunity
785 for /every/ overloaded imported function, but sometimes useful. That
786 is what Opt_WarnAllMissedSpecs does.
787
788 ToDo: warn about missed opportunities for local functions.
789
790 Note [Specialise imported INLINABLE things]
791 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
792 What imported functions do we specialise? The basic set is
793 * DFuns and things with INLINABLE pragmas.
794 but with -fspecialise-aggressively we add
795 * Anything with an unfolding template
796
797 Trac #8874 has a good example of why we want to auto-specialise DFuns.
798
799 We have the -fspecialise-aggressively flag (usually off), because we
800 risk lots of orphan modules from over-vigorous specialisation.
801 However it's not a big deal: anything non-recursive with an
802 unfolding-template will probably have been inlined already.
803
804 Note [Glom the bindings if imported functions are specialised]
805 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
806 Suppose we have an imported, *recursive*, INLINABLE function
807 f :: Eq a => a -> a
808 f = /\a \d x. ...(f a d)...
809 In the module being compiled we have
810 g x = f (x::Int)
811 Now we'll make a specialised function
812 f_spec :: Int -> Int
813 f_spec = \x -> ...(f Int dInt)...
814 {-# RULE f Int _ = f_spec #-}
815 g = \x. f Int dInt x
816 Note that f_spec doesn't look recursive
817 After rewriting with the RULE, we get
818 f_spec = \x -> ...(f_spec)...
819 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
820 The occurrence analyser never turns a NonRec into a Rec. So we must
821 make sure that f_spec is recursive. Easiest thing is to make all
822 the specialisations for imported bindings recursive.
823
824
825 Note [Avoiding recursive specialisation]
826 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
827 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
828 'f's RHS. So we want to specialise g,h. But we don't want to
829 specialise f any more! It's possible that f's RHS might have a
830 recursive yet-more-specialised call, so we'd diverge in that case.
831 And if the call is to the same type, one specialisation is enough.
832 Avoiding this recursive specialisation loop is the reason for the
833 'done' VarSet passed to specImports and specImport.
834
835 ************************************************************************
836 * *
837 \subsubsection{@specExpr@: the main function}
838 * *
839 ************************************************************************
840 -}
841
842 data SpecEnv
843 = SE { se_subst :: CoreSubst.Subst
844 -- We carry a substitution down:
845 -- a) we must clone any binding that might float outwards,
846 -- to avoid name clashes
847 -- b) we carry a type substitution to use when analysing
848 -- the RHS of specialised bindings (no type-let!)
849
850
851 , se_interesting :: VarSet
852 -- Dict Ids that we know something about
853 -- and hence may be worth specialising against
854 -- See Note [Interesting dictionary arguments]
855 }
856
857 specVar :: SpecEnv -> Id -> CoreExpr
858 specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v
859
860 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
861
862 ---------------- First the easy cases --------------------
863 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
864 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
865 specExpr env (Var v) = return (specVar env v, emptyUDs)
866 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
867 specExpr env (Cast e co)
868 = do { (e', uds) <- specExpr env e
869 ; return ((mkCast e' (substCo env co)), uds) }
870 specExpr env (Tick tickish body)
871 = do { (body', uds) <- specExpr env body
872 ; return (Tick (specTickish env tickish) body', uds) }
873
874 ---------------- Applications might generate a call instance --------------------
875 specExpr env expr@(App {})
876 = go expr []
877 where
878 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
879 (fun', uds_app) <- go fun (arg':args)
880 return (App fun' arg', uds_arg `plusUDs` uds_app)
881
882 go (Var f) args = case specVar env f of
883 Var f' -> return (Var f', mkCallUDs env f' args)
884 e' -> return (e', emptyUDs) -- I don't expect this!
885 go other _ = specExpr env other
886
887 ---------------- Lambda/case require dumping of usage details --------------------
888 specExpr env e@(Lam _ _) = do
889 (body', uds) <- specExpr env' body
890 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
891 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
892 where
893 (bndrs, body) = collectBinders e
894 (env', bndrs') = substBndrs env bndrs
895 -- More efficient to collect a group of binders together all at once
896 -- and we don't want to split a lambda group with dumped bindings
897
898 specExpr env (Case scrut case_bndr ty alts)
899 = do { (scrut', scrut_uds) <- specExpr env scrut
900 ; (scrut'', case_bndr', alts', alts_uds)
901 <- specCase env scrut' case_bndr alts
902 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
903 , scrut_uds `plusUDs` alts_uds) }
904
905 ---------------- Finally, let is the interesting case --------------------
906 specExpr env (Let bind body)
907 = do { -- Clone binders
908 (rhs_env, body_env, bind') <- cloneBindSM env bind
909
910 -- Deal with the body
911 ; (body', body_uds) <- specExpr body_env body
912
913 -- Deal with the bindings
914 ; (binds', uds) <- specBind rhs_env bind' body_uds
915
916 -- All done
917 ; return (foldr Let body' binds', uds) }
918
919 specTickish :: SpecEnv -> Tickish Id -> Tickish Id
920 specTickish env (Breakpoint ix ids)
921 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
922 -- drop vars from the list if they have a non-variable substitution.
923 -- should never happen, but it's harmless to drop them anyway.
924 specTickish _ other_tickish = other_tickish
925
926 specCase :: SpecEnv
927 -> CoreExpr -- Scrutinee, already done
928 -> Id -> [CoreAlt]
929 -> SpecM ( CoreExpr -- New scrutinee
930 , Id
931 , [CoreAlt]
932 , UsageDetails)
933 specCase env scrut' case_bndr [(con, args, rhs)]
934 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
935 , interestingDict env scrut'
936 , not (isDeadBinder case_bndr && null sc_args')
937 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
938
939 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
940 [(con, args', Var sc_arg')]
941 | sc_arg' <- sc_args' ]
942
943 -- Extend the substitution for RHS to map the *original* binders
944 -- to their floated verions.
945 mb_sc_flts :: [Maybe DictId]
946 mb_sc_flts = map (lookupVarEnv clone_env) args'
947 clone_env = zipVarEnv sc_args' sc_args_flt
948 subst_prs = (case_bndr, Var case_bndr_flt)
949 : [ (arg, Var sc_flt)
950 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
951 env_rhs' = env_rhs { se_subst = CoreSubst.extendIdSubstList (se_subst env_rhs) subst_prs
952 , se_interesting = se_interesting env_rhs `extendVarSetList`
953 (case_bndr_flt : sc_args_flt) }
954
955 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
956 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
957 case_bndr_set = unitVarSet case_bndr_flt
958 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
959 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
960 flt_binds = scrut_bind : sc_binds
961 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
962 all_uds = flt_binds `addDictBinds` free_uds
963 alt' = (con, args', wrapDictBindsE dumped_dbs rhs')
964 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
965 where
966 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
967 sc_args' = filter is_flt_sc_arg args'
968
969 clone_me bndr = do { uniq <- getUniqueM
970 ; return (mkUserLocalOrCoVar occ uniq ty loc) }
971 where
972 name = idName bndr
973 ty = idType bndr
974 occ = nameOccName name
975 loc = getSrcSpan name
976
977 arg_set = mkVarSet args'
978 is_flt_sc_arg var = isId var
979 && not (isDeadBinder var)
980 && isDictTy var_ty
981 && not (tyCoVarsOfType var_ty `intersectsVarSet` arg_set)
982 where
983 var_ty = idType var
984
985
986 specCase env scrut case_bndr alts
987 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
988 ; return (scrut, case_bndr', alts', uds_alts) }
989 where
990 (env_alt, case_bndr') = substBndr env case_bndr
991 spec_alt (con, args, rhs) = do
992 (rhs', uds) <- specExpr env_rhs rhs
993 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
994 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
995 where
996 (env_rhs, args') = substBndrs env_alt args
997
998 {-
999 Note [Floating dictionaries out of cases]
1000 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1001 Consider
1002 g = \d. case d of { MkD sc ... -> ...(f sc)... }
1003 Naively we can't float d2's binding out of the case expression,
1004 because 'sc' is bound by the case, and that in turn means we can't
1005 specialise f, which seems a pity.
1006
1007 So we invert the case, by floating out a binding
1008 for 'sc_flt' thus:
1009 sc_flt = case d of { MkD sc ... -> sc }
1010 Now we can float the call instance for 'f'. Indeed this is just
1011 what'll happen if 'sc' was originally bound with a let binding,
1012 but case is more efficient, and necessary with equalities. So it's
1013 good to work with both.
1014
1015 You might think that this won't make any difference, because the
1016 call instance will only get nuked by the \d. BUT if 'g' itself is
1017 specialised, then transitively we should be able to specialise f.
1018
1019 In general, given
1020 case e of cb { MkD sc ... -> ...(f sc)... }
1021 we transform to
1022 let cb_flt = e
1023 sc_flt = case cb_flt of { MkD sc ... -> sc }
1024 in
1025 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1026
1027 The "_flt" things are the floated binds; we use the current substitution
1028 to substitute sc -> sc_flt in the RHS
1029
1030 ************************************************************************
1031 * *
1032 Dealing with a binding
1033 * *
1034 ************************************************************************
1035 -}
1036
1037 specBind :: SpecEnv -- Use this for RHSs
1038 -> CoreBind -- Binders are already cloned by cloneBindSM,
1039 -- but RHSs are un-processed
1040 -> UsageDetails -- Info on how the scope of the binding
1041 -> SpecM ([CoreBind], -- New bindings
1042 UsageDetails) -- And info to pass upstream
1043
1044 -- Returned UsageDetails:
1045 -- No calls for binders of this bind
1046 specBind rhs_env (NonRec fn rhs) body_uds
1047 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1048 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds fn rhs
1049
1050 ; let pairs = spec_defns ++ [(fn', rhs')]
1051 -- fn' mentions the spec_defns in its rules,
1052 -- so put the latter first
1053
1054 combined_uds = body_uds1 `plusUDs` rhs_uds
1055
1056 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1057
1058 final_binds :: [DictBind]
1059 -- See Note [From non-recursive to recursive]
1060 final_binds
1061 | not (isEmptyBag dump_dbs)
1062 , not (null spec_defns)
1063 = [recWithDumpedDicts pairs dump_dbs]
1064 | otherwise
1065 = [mkDB $ NonRec b r | (b,r) <- pairs]
1066 ++ bagToList dump_dbs
1067
1068 ; if float_all then
1069 -- Rather than discard the calls mentioning the bound variables
1070 -- we float this (dictionary) binding along with the others
1071 return ([], free_uds `snocDictBinds` final_binds)
1072 else
1073 -- No call in final_uds mentions bound variables,
1074 -- so we can just leave the binding here
1075 return (map fst final_binds, free_uds) }
1076
1077
1078 specBind rhs_env (Rec pairs) body_uds
1079 -- Note [Specialising a recursive group]
1080 = do { let (bndrs,rhss) = unzip pairs
1081 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1082 ; let scope_uds = body_uds `plusUDs` rhs_uds
1083 -- Includes binds and calls arising from rhss
1084
1085 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1086
1087 ; (bndrs3, spec_defns3, uds3)
1088 <- if null spec_defns1 -- Common case: no specialisation
1089 then return (bndrs1, [], uds1)
1090 else do { -- Specialisation occurred; do it again
1091 (bndrs2, spec_defns2, uds2)
1092 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1093 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1094
1095 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1096 final_bind = recWithDumpedDicts (spec_defns3 ++ zip bndrs3 rhss')
1097 dumped_dbs
1098
1099 ; if float_all then
1100 return ([], final_uds `snocDictBind` final_bind)
1101 else
1102 return ([fst final_bind], final_uds) }
1103
1104
1105 ---------------------------
1106 specDefns :: SpecEnv
1107 -> UsageDetails -- Info on how it is used in its scope
1108 -> [(OutId,InExpr)] -- The things being bound and their un-processed RHS
1109 -> SpecM ([OutId], -- Original Ids with RULES added
1110 [(OutId,OutExpr)], -- Extra, specialised bindings
1111 UsageDetails) -- Stuff to fling upwards from the specialised versions
1112
1113 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1114 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1115 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1116 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1117 -- But not vice versa (it's a fixpoint problem).
1118
1119 specDefns _env uds []
1120 = return ([], [], uds)
1121 specDefns env uds ((bndr,rhs):pairs)
1122 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1123 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1124 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1125
1126 ---------------------------
1127 specDefn :: SpecEnv
1128 -> UsageDetails -- Info on how it is used in its scope
1129 -> OutId -> InExpr -- The thing being bound and its un-processed RHS
1130 -> SpecM (Id, -- Original Id with added RULES
1131 [(Id,CoreExpr)], -- Extra, specialised bindings
1132 UsageDetails) -- Stuff to fling upwards from the specialised versions
1133
1134 specDefn env body_uds fn rhs
1135 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1136 rules_for_me = idCoreRules fn
1137 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1138 calls_for_me fn rhs
1139 ; return ( fn `addIdSpecialisations` rules
1140 , spec_defns
1141 , body_uds_without_me `plusUDs` spec_uds) }
1142 -- It's important that the `plusUDs` is this way
1143 -- round, because body_uds_without_me may bind
1144 -- dictionaries that are used in calls_for_me passed
1145 -- to specDefn. So the dictionary bindings in
1146 -- spec_uds may mention dictionaries bound in
1147 -- body_uds_without_me
1148
1149 ---------------------------
1150 specCalls :: Maybe Module -- Just this_mod => specialising imported fn
1151 -- Nothing => specialising local fn
1152 -> SpecEnv
1153 -> [CoreRule] -- Existing RULES for the fn
1154 -> [CallInfo]
1155 -> OutId -> InExpr
1156 -> SpecM SpecInfo -- New rules, specialised bindings, and usage details
1157
1158 -- This function checks existing rules, and does not create
1159 -- duplicate ones. So the caller does not need to do this filtering.
1160 -- See 'already_covered'
1161
1162 type SpecInfo = ( [CoreRule] -- Specialisation rules
1163 , [(Id,CoreExpr)] -- Specialised definition
1164 , UsageDetails ) -- Usage details from specialised RHSs
1165
1166 specCalls mb_mod env existing_rules calls_for_me fn rhs
1167 -- The first case is the interesting one
1168 | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas
1169 && rhs_bndrs1 `lengthAtLeast` n_dicts -- and enough dict args
1170 && notNull calls_for_me -- And there are some calls to specialise
1171 && not (isNeverActive (idInlineActivation fn))
1172 -- Don't specialise NOINLINE things
1173 -- See Note [Auto-specialisation and RULES]
1174
1175 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1176 -- See Note [Inline specialisation] for why we do not
1177 -- switch off specialisation for inline functions
1178
1179 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr existing_rules) $
1180 foldlM spec_call ([], [], emptyUDs) calls_for_me
1181
1182 | otherwise -- No calls or RHS doesn't fit our preconceptions
1183 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1184 text "Missed specialisation opportunity for"
1185 <+> ppr fn $$ _trace_doc )
1186 -- Note [Specialisation shape]
1187 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1188 return ([], [], emptyUDs)
1189 where
1190 _trace_doc = sep [ ppr rhs_tyvars, ppr n_tyvars
1191 , ppr rhs_bndrs, ppr n_dicts
1192 , ppr (idInlineActivation fn) ]
1193
1194 fn_type = idType fn
1195 fn_arity = idArity fn
1196 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1197 (tyvars, theta, _) = tcSplitSigmaTy fn_type
1198 n_tyvars = length tyvars
1199 n_dicts = length theta
1200 inl_prag = idInlinePragma fn
1201 inl_act = inlinePragmaActivation inl_prag
1202 is_local = isLocalId fn
1203
1204 -- Figure out whether the function has an INLINE pragma
1205 -- See Note [Inline specialisations]
1206
1207 (rhs_bndrs, rhs_body) = collectBindersPushingCo rhs
1208 -- See Note [Account for casts in binding]
1209 (rhs_tyvars, rhs_bndrs1) = span isTyVar rhs_bndrs
1210 (rhs_dict_ids, rhs_bndrs2) = splitAt n_dicts rhs_bndrs1
1211 body = mkLams rhs_bndrs2 rhs_body
1212 -- Glue back on the non-dict lambdas
1213
1214 in_scope = CoreSubst.substInScope (se_subst env)
1215
1216 already_covered :: DynFlags -> [CoreRule] -> [CoreExpr] -> Bool
1217 already_covered dflags new_rules args -- Note [Specialisations already covered]
1218 = isJust (lookupRule dflags (in_scope, realIdUnfolding)
1219 (const True) fn args
1220 (new_rules ++ existing_rules))
1221 -- NB: we look both in the new_rules (generated by this invocation
1222 -- of specCalls), and in existing_rules (passed in to specCalls)
1223
1224 mk_ty_args :: [Maybe Type] -> [TyVar] -> [CoreExpr]
1225 mk_ty_args [] poly_tvs
1226 = ASSERT( null poly_tvs ) []
1227 mk_ty_args (Nothing : call_ts) (poly_tv : poly_tvs)
1228 = Type (mkTyVarTy poly_tv) : mk_ty_args call_ts poly_tvs
1229 mk_ty_args (Just ty : call_ts) poly_tvs
1230 = Type ty : mk_ty_args call_ts poly_tvs
1231 mk_ty_args (Nothing : _) [] = panic "mk_ty_args"
1232
1233 ----------------------------------------------------------
1234 -- Specialise to one particular call pattern
1235 spec_call :: SpecInfo -- Accumulating parameter
1236 -> CallInfo -- Call instance
1237 -> SpecM SpecInfo
1238 spec_call spec_acc@(rules_acc, pairs_acc, uds_acc)
1239 (CI { ci_key = CallKey call_ts, ci_args = call_ds })
1240 = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts )
1241
1242 -- Suppose f's defn is f = /\ a b c -> \ d1 d2 -> rhs
1243 -- Suppose the call is for f [Just t1, Nothing, Just t3] [dx1, dx2]
1244
1245 -- Construct the new binding
1246 -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b -> rhs)
1247 -- PLUS the rule
1248 -- RULE "SPEC f" forall b d1' d2'. f b d1' d2' = f1 b
1249 -- In the rule, d1' and d2' are just wildcards, not used in the RHS
1250 -- PLUS the usage-details
1251 -- { d1' = dx1; d2' = dx2 }
1252 -- where d1', d2' are cloned versions of d1,d2, with the type substitution
1253 -- applied. These auxiliary bindings just avoid duplication of dx1, dx2
1254 --
1255 -- Note that the substitution is applied to the whole thing.
1256 -- This is convenient, but just slightly fragile. Notably:
1257 -- * There had better be no name clashes in a/b/c
1258 do { let
1259 -- poly_tyvars = [b] in the example above
1260 -- spec_tyvars = [a,c]
1261 -- ty_args = [t1,b,t3]
1262 spec_tv_binds = [(tv,ty) | (tv, Just ty) <- rhs_tyvars `zip` call_ts]
1263 env1 = extendTvSubstList env spec_tv_binds
1264 (rhs_env, poly_tyvars) = substBndrs env1
1265 [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
1266
1267 -- Clone rhs_dicts, including instantiating their types
1268 ; inst_dict_ids <- mapM (newDictBndr rhs_env) rhs_dict_ids
1269 ; let (rhs_env2, dx_binds, spec_dict_args)
1270 = bindAuxiliaryDicts rhs_env rhs_dict_ids call_ds inst_dict_ids
1271 ty_args = mk_ty_args call_ts poly_tyvars
1272 ev_args = map varToCoreExpr inst_dict_ids -- ev_args, ev_bndrs:
1273 ev_bndrs = exprsFreeIdsList ev_args -- See Note [Evidence foralls]
1274 rule_args = ty_args ++ ev_args
1275 rule_bndrs = poly_tyvars ++ ev_bndrs
1276
1277 ; dflags <- getDynFlags
1278 ; if already_covered dflags rules_acc rule_args
1279 then return spec_acc
1280 else -- pprTrace "spec_call" (vcat [ ppr _call_info, ppr fn, ppr rhs_dict_ids
1281 -- , text "rhs_env2" <+> ppr (se_subst rhs_env2)
1282 -- , ppr dx_binds ]) $
1283 do
1284 { -- Figure out the type of the specialised function
1285 let body_ty = applyTypeToArgs rhs fn_type rule_args
1286 (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted
1287 | isUnliftedType body_ty -- C.f. WwLib.mkWorkerArgs
1288 , not (isJoinId fn)
1289 = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [voidPrimId])
1290 | otherwise = (poly_tyvars, poly_tyvars)
1291 spec_id_ty = mkLamTypes lam_args body_ty
1292 join_arity_change = length app_args - length rule_args
1293 spec_join_arity | Just orig_join_arity <- isJoinId_maybe fn
1294 = Just (orig_join_arity + join_arity_change)
1295 | otherwise
1296 = Nothing
1297
1298 ; spec_f <- newSpecIdSM fn spec_id_ty spec_join_arity
1299 ; (spec_rhs, rhs_uds) <- specExpr rhs_env2 (mkLams lam_args body)
1300 ; this_mod <- getModule
1301 ; let
1302 -- The rule to put in the function's specialisation is:
1303 -- forall b, d1',d2'. f t1 b t3 d1' d2' = f1 b
1304 herald = case mb_mod of
1305 Nothing -- Specialising local fn
1306 -> text "SPEC"
1307 Just this_mod -- Specialising imported fn
1308 -> text "SPEC/" <> ppr this_mod
1309
1310 rule_name = mkFastString $ showSDoc dflags $
1311 herald <+> ftext (occNameFS (getOccName fn))
1312 <+> hsep (map ppr_call_key_ty call_ts)
1313 -- This name ends up in interface files, so use occNameString.
1314 -- Otherwise uniques end up there, making builds
1315 -- less deterministic (See #4012 comment:61 ff)
1316
1317 rule_wout_eta = mkRule
1318 this_mod
1319 True {- Auto generated -}
1320 is_local
1321 rule_name
1322 inl_act -- Note [Auto-specialisation and RULES]
1323 (idName fn)
1324 rule_bndrs
1325 rule_args
1326 (mkVarApps (Var spec_f) app_args)
1327
1328 spec_rule
1329 = case isJoinId_maybe fn of
1330 Just join_arity -> etaExpandToJoinPointRule join_arity
1331 rule_wout_eta
1332 Nothing -> rule_wout_eta
1333
1334 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1335 spec_uds = foldr consDictBind rhs_uds dx_binds
1336
1337 --------------------------------------
1338 -- Add a suitable unfolding if the spec_inl_prag says so
1339 -- See Note [Inline specialisations]
1340 (spec_inl_prag, spec_unf)
1341 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1342 = (neverInlinePragma, noUnfolding)
1343 -- See Note [Specialising imported functions] in OccurAnal
1344
1345 | InlinePragma { inl_inline = Inlinable } <- inl_prag
1346 = (inl_prag { inl_inline = NoUserInline }, noUnfolding)
1347
1348 | otherwise
1349 = (inl_prag, specUnfolding poly_tyvars spec_app
1350 arity_decrease fn_unf)
1351
1352 arity_decrease = length spec_dict_args
1353 spec_app e = (e `mkApps` ty_args) `mkApps` spec_dict_args
1354
1355 --------------------------------------
1356 -- Adding arity information just propagates it a bit faster
1357 -- See Note [Arity decrease] in Simplify
1358 -- Copy InlinePragma information from the parent Id.
1359 -- So if f has INLINE[1] so does spec_f
1360 spec_f_w_arity = spec_f `setIdArity` max 0 (fn_arity - n_dicts)
1361 `setInlinePragma` spec_inl_prag
1362 `setIdUnfolding` spec_unf
1363 `asJoinId_maybe` spec_join_arity
1364
1365 ; return ( spec_rule : rules_acc
1366 , (spec_f_w_arity, spec_rhs) : pairs_acc
1367 , spec_uds `plusUDs` uds_acc
1368 ) } }
1369
1370 {- Note [Account for casts in binding]
1371 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1372 Consider
1373 f :: Eq a => a -> IO ()
1374 {-# INLINABLE f
1375 StableUnf = (/\a \(d:Eq a) (x:a). blah) |> g
1376 #-}
1377 f = ...
1378
1379 In f's stable unfolding we have done some modest simplification which
1380 has pushed the cast to the outside. (I wonder if this is the Right
1381 Thing, but it's what happens now; see SimplUtils Note [Casts and
1382 lambdas].) Now that stable unfolding must be specialised, so we want
1383 to push the cast back inside. It would be terrible if the cast
1384 defeated specialisation! Hence the use of collectBindersPushingCo.
1385
1386 Note [Evidence foralls]
1387 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1388 Suppose (Trac #12212) that we are specialising
1389 f :: forall a b. (Num a, F a ~ F b) => blah
1390 with a=b=Int. Then the RULE will be something like
1391 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1392 f Int Int d g = f_spec
1393 But both varToCoreExpr (when constructing the LHS args), and the
1394 simplifier (when simplifying the LHS args), will transform to
1395 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1396 f Int Int d <F Int> = f_spec
1397 by replacing g with Refl. So now 'g' is unbound, which results in a later
1398 crash. So we use Refl right off the bat, and do not forall-quantify 'g':
1399 * varToCoreExpr generates a Refl
1400 * exprsFreeIdsList returns the Ids bound by the args,
1401 which won't include g
1402
1403 You might wonder if this will match as often, but the simplifier replaces
1404 complicated Refl coercions with Refl pretty aggressively.
1405
1406 Note [Orphans and auto-generated rules]
1407 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1408 When we specialise an INLINABLE function, or when we have
1409 -fspecialise-aggressively, we auto-generate RULES that are orphans.
1410 We don't want to warn about these, or we'd generate a lot of warnings.
1411 Thus, we only warn about user-specified orphan rules.
1412
1413 Indeed, we don't even treat the module as an orphan module if it has
1414 auto-generated *rule* orphans. Orphan modules are read every time we
1415 compile, so they are pretty obtrusive and slow down every compilation,
1416 even non-optimised ones. (Reason: for type class instances it's a
1417 type correctness issue.) But specialisation rules are strictly for
1418 *optimisation* only so it's fine not to read the interface.
1419
1420 What this means is that a SPEC rules from auto-specialisation in
1421 module M will be used in other modules only if M.hi has been read for
1422 some other reason, which is actually pretty likely.
1423 -}
1424
1425 bindAuxiliaryDicts
1426 :: SpecEnv
1427 -> [DictId] -> [CoreExpr] -- Original dict bndrs, and the witnessing expressions
1428 -> [DictId] -- A cloned dict-id for each dict arg
1429 -> (SpecEnv, -- Substitute for all orig_dicts
1430 [DictBind], -- Auxiliary dict bindings
1431 [CoreExpr]) -- Witnessing expressions (all trivial)
1432 -- Bind any dictionary arguments to fresh names, to preserve sharing
1433 bindAuxiliaryDicts env@(SE { se_subst = subst, se_interesting = interesting })
1434 orig_dict_ids call_ds inst_dict_ids
1435 = (env', dx_binds, spec_dict_args)
1436 where
1437 (dx_binds, spec_dict_args) = go call_ds inst_dict_ids
1438 env' = env { se_subst = subst `CoreSubst.extendSubstList`
1439 (orig_dict_ids `zip` spec_dict_args)
1440 `CoreSubst.extendInScopeList` dx_ids
1441 , se_interesting = interesting `unionVarSet` interesting_dicts }
1442
1443 dx_ids = [dx_id | (NonRec dx_id _, _) <- dx_binds]
1444 interesting_dicts = mkVarSet [ dx_id | (NonRec dx_id dx, _) <- dx_binds
1445 , interestingDict env dx ]
1446 -- See Note [Make the new dictionaries interesting]
1447
1448 go :: [CoreExpr] -> [CoreBndr] -> ([DictBind], [CoreExpr])
1449 go [] _ = ([], [])
1450 go (dx:dxs) (dx_id:dx_ids)
1451 | exprIsTrivial dx = (dx_binds, dx : args)
1452 | otherwise = (mkDB (NonRec dx_id dx) : dx_binds, Var dx_id : args)
1453 where
1454 (dx_binds, args) = go dxs dx_ids
1455 -- In the first case extend the substitution but not bindings;
1456 -- in the latter extend the bindings but not the substitution.
1457 -- For the former, note that we bind the *original* dict in the substitution,
1458 -- overriding any d->dx_id binding put there by substBndrs
1459 go _ _ = pprPanic "bindAuxiliaryDicts" (ppr orig_dict_ids $$ ppr call_ds $$ ppr inst_dict_ids)
1460
1461 {-
1462 Note [Make the new dictionaries interesting]
1463 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1464 Important! We're going to substitute dx_id1 for d
1465 and we want it to look "interesting", else we won't gather *any*
1466 consequential calls. E.g.
1467 f d = ...g d....
1468 If we specialise f for a call (f (dfun dNumInt)), we'll get
1469 a consequent call (g d') with an auxiliary definition
1470 d' = df dNumInt
1471 We want that consequent call to look interesting
1472
1473
1474 Note [From non-recursive to recursive]
1475 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1476 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1477 have built a recursive knot
1478
1479 f a d x = <blah>
1480 MkUD { ud_binds = NonRec d7 (MkD ..f..)
1481 , ud_calls = ...(f T d7)... }
1482
1483 The we generate
1484
1485 Rec { fs x = <blah>[T/a, d7/d]
1486 f a d x = <blah>
1487 RULE f T _ = fs
1488 d7 = ...f... }
1489
1490 Here the recursion is only through the RULE.
1491
1492 However we definitely should /not/ make the Rec in this wildly common
1493 case:
1494 d = ...
1495 MkUD { ud_binds = NonRec d7 (...d...)
1496 , ud_calls = ...(f T d7)... }
1497
1498 Here we want simply to add d to the floats, giving
1499 MkUD { ud_binds = NonRec d (...)
1500 NonRec d7 (...d...)
1501 , ud_calls = ...(f T d7)... }
1502
1503 In general, we need only make this Rec if
1504 - there are some specialisations (spec_binds non-empty)
1505 - there are some dict_binds that depend on f (dump_dbs non-empty)
1506
1507 Note [Avoiding loops]
1508 ~~~~~~~~~~~~~~~~~~~~~
1509 When specialising /dictionary functions/ we must be very careful to
1510 avoid building loops. Here is an example that bit us badly: Trac #3591
1511
1512 class Eq a => C a
1513 instance Eq [a] => C [a]
1514
1515 This translates to
1516 dfun :: Eq [a] -> C [a]
1517 dfun a d = MkD a d (meth d)
1518
1519 d4 :: Eq [T] = <blah>
1520 d2 :: C [T] = dfun T d4
1521 d1 :: Eq [T] = $p1 d2
1522 d3 :: C [T] = dfun T d1
1523
1524 None of these definitions is recursive. What happened was that we
1525 generated a specialisation:
1526
1527 RULE forall d. dfun T d = dT :: C [T]
1528 dT = (MkD a d (meth d)) [T/a, d1/d]
1529 = MkD T d1 (meth d1)
1530
1531 But now we use the RULE on the RHS of d2, to get
1532
1533 d2 = dT = MkD d1 (meth d1)
1534 d1 = $p1 d2
1535
1536 and now d1 is bottom! The problem is that when specialising 'dfun' we
1537 should first dump "below" the binding all floated dictionary bindings
1538 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1539 placed below 'dfun', and thus unavailable to it when specialising
1540 'dfun'. That in turn means that the call (dfun T d1) must be
1541 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1542 d4 doesn't mention dfun.
1543
1544 Solution:
1545 Discard all calls that mention dictionaries that depend
1546 (directly or indirectly) on the dfun we are specialising.
1547 This is done by 'filterCalls'
1548
1549 --------------
1550 Here's another example, this time for an imported dfun, so the call
1551 to filterCalls is in specImports (Trac #13429). Suppose we have
1552 class Monoid v => C v a where ...
1553
1554 We start with a call
1555 f @ [Integer] @ Integer $fC[]Integer
1556
1557 Specialising call to 'f' gives dict bindings
1558 $dMonoid_1 :: Monoid [Integer]
1559 $dMonoid_1 = M.$p1C @ [Integer] $fC[]Integer
1560
1561 $dC_1 :: C [Integer] (Node [Integer] Integer)
1562 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
1563
1564 ...plus a recursive call to
1565 f @ [Integer] @ (Node [Integer] Integer) $dC_1
1566
1567 Specialising that call gives
1568 $dMonoid_2 :: Monoid [Integer]
1569 $dMonoid_2 = M.$p1C @ [Integer] $dC_1
1570
1571 $dC_2 :: C [Integer] (Node [Integer] Integer)
1572 $dC_2 = M.$fCvNode @ [Integer] $dMonoid_2
1573
1574 Now we have two calls to the imported function
1575 M.$fCvNode :: Monoid v => C v a
1576 M.$fCvNode @v @a m = C m some_fun
1577
1578 But we must /not/ use the call (M.$fCvNode @ [Integer] $dMonoid_2)
1579 for specialisation, else we get:
1580
1581 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
1582 $dMonoid_2 = M.$p1C @ [Integer] $dC_1
1583 $s$fCvNode = C $dMonoid_2 ...
1584 RULE M.$fCvNode [Integer] _ _ = $s$fCvNode
1585
1586 Now use the rule to rewrite the call in the RHS of $dC_1
1587 and we get a loop!
1588
1589 --------------
1590 Here's yet another example
1591
1592 class C a where { foo,bar :: [a] -> [a] }
1593
1594 instance C Int where
1595 foo x = r_bar x
1596 bar xs = reverse xs
1597
1598 r_bar :: C a => [a] -> [a]
1599 r_bar xs = bar (xs ++ xs)
1600
1601 That translates to:
1602
1603 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1604
1605 Rec { $fCInt :: C Int = MkC foo_help reverse
1606 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1607
1608 The call (r_bar $fCInt) mentions $fCInt,
1609 which mentions foo_help,
1610 which mentions r_bar
1611 But we DO want to specialise r_bar at Int:
1612
1613 Rec { $fCInt :: C Int = MkC foo_help reverse
1614 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1615
1616 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1617 RULE r_bar Int _ = r_bar_Int
1618
1619 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1620 }
1621
1622 Note that, because of its RULE, r_bar joins the recursive
1623 group. (In this case it'll unravel a short moment later.)
1624
1625
1626 Note [Specialising a recursive group]
1627 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1628 Consider
1629 let rec { f x = ...g x'...
1630 ; g y = ...f y'.... }
1631 in f 'a'
1632 Here we specialise 'f' at Char; but that is very likely to lead to
1633 a specialisation of 'g' at Char. We must do the latter, else the
1634 whole point of specialisation is lost.
1635
1636 But we do not want to keep iterating to a fixpoint, because in the
1637 presence of polymorphic recursion we might generate an infinite number
1638 of specialisations.
1639
1640 So we use the following heuristic:
1641 * Arrange the rec block in dependency order, so far as possible
1642 (the occurrence analyser already does this)
1643
1644 * Specialise it much like a sequence of lets
1645
1646 * Then go through the block a second time, feeding call-info from
1647 the RHSs back in the bottom, as it were
1648
1649 In effect, the ordering maxmimises the effectiveness of each sweep,
1650 and we do just two sweeps. This should catch almost every case of
1651 monomorphic recursion -- the exception could be a very knotted-up
1652 recursion with multiple cycles tied up together.
1653
1654 This plan is implemented in the Rec case of specBindItself.
1655
1656 Note [Specialisations already covered]
1657 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1658 We obviously don't want to generate two specialisations for the same
1659 argument pattern. There are two wrinkles
1660
1661 1. We do the already-covered test in specDefn, not when we generate
1662 the CallInfo in mkCallUDs. We used to test in the latter place, but
1663 we now iterate the specialiser somewhat, and the Id at the call site
1664 might therefore not have all the RULES that we can see in specDefn
1665
1666 2. What about two specialisations where the second is an *instance*
1667 of the first? If the more specific one shows up first, we'll generate
1668 specialisations for both. If the *less* specific one shows up first,
1669 we *don't* currently generate a specialisation for the more specific
1670 one. (See the call to lookupRule in already_covered.) Reasons:
1671 (a) lookupRule doesn't say which matches are exact (bad reason)
1672 (b) if the earlier specialisation is user-provided, it's
1673 far from clear that we should auto-specialise further
1674
1675 Note [Auto-specialisation and RULES]
1676 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1677 Consider:
1678 g :: Num a => a -> a
1679 g = ...
1680
1681 f :: (Int -> Int) -> Int
1682 f w = ...
1683 {-# RULE f g = 0 #-}
1684
1685 Suppose that auto-specialisation makes a specialised version of
1686 g::Int->Int That version won't appear in the LHS of the RULE for f.
1687 So if the specialisation rule fires too early, the rule for f may
1688 never fire.
1689
1690 It might be possible to add new rules, to "complete" the rewrite system.
1691 Thus when adding
1692 RULE forall d. g Int d = g_spec
1693 also add
1694 RULE f g_spec = 0
1695
1696 But that's a bit complicated. For now we ask the programmer's help,
1697 by *copying the INLINE activation pragma* to the auto-specialised
1698 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1699 will also not be active until phase 2. And that's what programmers
1700 should jolly well do anyway, even aside from specialisation, to ensure
1701 that g doesn't inline too early.
1702
1703 This in turn means that the RULE would never fire for a NOINLINE
1704 thing so not much point in generating a specialisation at all.
1705
1706 Note [Specialisation shape]
1707 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1708 We only specialise a function if it has visible top-level lambdas
1709 corresponding to its overloading. E.g. if
1710 f :: forall a. Eq a => ....
1711 then its body must look like
1712 f = /\a. \d. ...
1713
1714 Reason: when specialising the body for a call (f ty dexp), we want to
1715 substitute dexp for d, and pick up specialised calls in the body of f.
1716
1717 This doesn't always work. One example I came across was this:
1718 newtype Gen a = MkGen{ unGen :: Int -> a }
1719
1720 choose :: Eq a => a -> Gen a
1721 choose n = MkGen (\r -> n)
1722
1723 oneof = choose (1::Int)
1724
1725 It's a silly exapmle, but we get
1726 choose = /\a. g `cast` co
1727 where choose doesn't have any dict arguments. Thus far I have not
1728 tried to fix this (wait till there's a real example).
1729
1730 Mind you, then 'choose' will be inlined (since RHS is trivial) so
1731 it doesn't matter. This comes up with single-method classes
1732
1733 class C a where { op :: a -> a }
1734 instance C a => C [a] where ....
1735 ==>
1736 $fCList :: C a => C [a]
1737 $fCList = $copList |> (...coercion>...)
1738 ....(uses of $fCList at particular types)...
1739
1740 So we suppress the WARN if the rhs is trivial.
1741
1742 Note [Inline specialisations]
1743 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1744 Here is what we do with the InlinePragma of the original function
1745 * Activation/RuleMatchInfo: both transferred to the
1746 specialised function
1747 * InlineSpec:
1748 (a) An INLINE pragma is transferred
1749 (b) An INLINABLE pragma is *not* transferred
1750
1751 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
1752 specialise the function at its call site, and arguably that's not so
1753 important for the specialised copies. BUT *pragma-directed*
1754 specialisation now takes place in the typechecker/desugarer, with
1755 manually specified INLINEs. The specialisation here is automatic.
1756 It'd be very odd if a function marked INLINE was specialised (because
1757 of some local use), and then forever after (including importing
1758 modules) the specialised version wasn't INLINEd. After all, the
1759 programmer said INLINE!
1760
1761 You might wonder why we specialise INLINE functions at all. After
1762 all they should be inlined, right? Two reasons:
1763
1764 * Even INLINE functions are sometimes not inlined, when they aren't
1765 applied to interesting arguments. But perhaps the type arguments
1766 alone are enough to specialise (even though the args are too boring
1767 to trigger inlining), and it's certainly better to call the
1768 specialised version.
1769
1770 * The RHS of an INLINE function might call another overloaded function,
1771 and we'd like to generate a specialised version of that function too.
1772 This actually happens a lot. Consider
1773 replicateM_ :: (Monad m) => Int -> m a -> m ()
1774 {-# INLINABLE replicateM_ #-}
1775 replicateM_ d x ma = ...
1776 The strictness analyser may transform to
1777 replicateM_ :: (Monad m) => Int -> m a -> m ()
1778 {-# INLINE replicateM_ #-}
1779 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
1780
1781 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
1782 {-# INLINABLE $wreplicateM_ #-}
1783 $wreplicateM_ = ...
1784 Now an importing module has a specialised call to replicateM_, say
1785 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
1786 This particular example had a huge effect on the call to replicateM_
1787 in nofib/shootout/n-body.
1788
1789 Why (b): discard INLINABLE pragmas? See Trac #4874 for persuasive examples.
1790 Suppose we have
1791 {-# INLINABLE f #-}
1792 f :: Ord a => [a] -> Int
1793 f xs = letrec f' = ...f'... in f'
1794 Then, when f is specialised and optimised we might get
1795 wgo :: [Int] -> Int#
1796 wgo = ...wgo...
1797 f_spec :: [Int] -> Int
1798 f_spec xs = case wgo xs of { r -> I# r }
1799 and we clearly want to inline f_spec at call sites. But if we still
1800 have the big, un-optimised of f (albeit specialised) captured in an
1801 INLINABLE pragma for f_spec, we won't get that optimisation.
1802
1803 So we simply drop INLINABLE pragmas when specialising. It's not really
1804 a complete solution; ignoring specialisation for now, INLINABLE functions
1805 don't get properly strictness analysed, for example. But it works well
1806 for examples involving specialisation, which is the dominant use of
1807 INLINABLE. See Trac #4874.
1808
1809
1810 ************************************************************************
1811 * *
1812 \subsubsection{UsageDetails and suchlike}
1813 * *
1814 ************************************************************************
1815 -}
1816
1817 data UsageDetails
1818 = MkUD {
1819 ud_binds :: !(Bag DictBind),
1820 -- See Note [Floated dictionary bindings]
1821 -- The order is important;
1822 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
1823 -- (Remember, Bags preserve order in GHC.)
1824
1825 ud_calls :: !CallDetails
1826
1827 -- INVARIANT: suppose bs = bindersOf ud_binds
1828 -- Then 'calls' may *mention* 'bs',
1829 -- but there should be no calls *for* bs
1830 }
1831
1832 -- | A 'DictBind' is a binding along with a cached set containing its free
1833 -- variables (both type variables and dictionaries)
1834 type DictBind = (CoreBind, VarSet)
1835
1836 {- Note [Floated dictionary bindings]
1837 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1838 We float out dictionary bindings for the reasons described under
1839 "Dictionary floating" above. But not /just/ dictionary bindings.
1840 Consider
1841
1842 f :: Eq a => blah
1843 f a d = rhs
1844
1845 $c== :: T -> T -> Bool
1846 $c== x y = ...
1847
1848 $df :: Eq T
1849 $df = Eq $c== ...
1850
1851 gurgle = ...(f @T $df)...
1852
1853 We gather the call info for (f @T $df), and we don't want to drop it
1854 when we come across the binding for $df. So we add $df to the floats
1855 and continue. But then we have to add $c== to the floats, and so on.
1856 These all float above the binding for 'f', and and now we can
1857 successfully specialise 'f'.
1858
1859 So the DictBinds in (ud_binds :: Bag DictBind) may contain
1860 non-dictionary bindings too.
1861 -}
1862
1863 instance Outputable UsageDetails where
1864 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
1865 = text "MkUD" <+> braces (sep (punctuate comma
1866 [text "binds" <+> equals <+> ppr dbs,
1867 text "calls" <+> equals <+> ppr calls]))
1868
1869 emptyUDs :: UsageDetails
1870 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv }
1871
1872 ------------------------------------------------------------
1873 type CallDetails = DIdEnv CallInfoSet
1874 -- The order of specialized binds and rules depends on how we linearize
1875 -- CallDetails, so to get determinism we must use a deterministic set here.
1876 -- See Note [Deterministic UniqFM] in UniqDFM
1877
1878 data CallInfoSet = CIS Id (Bag CallInfo)
1879 -- The list of types and dictionaries is guaranteed to
1880 -- match the type of f
1881 -- The Bag may contain duplicate calls (i.e. f @T and another f @T)
1882 -- These dups are eliminated by already_covered in specCalls
1883
1884 data CallInfo
1885 = CI { ci_key :: CallKey -- Type arguments
1886 , ci_args :: [DictExpr] -- Dictionary arguments
1887 , ci_fvs :: VarSet -- Free vars of the ci_key and ci_args
1888 -- call (including tyvars)
1889 -- [*not* include the main id itself, of course]
1890 }
1891
1892 newtype CallKey = CallKey [Maybe Type]
1893 -- Nothing => unconstrained type argument
1894
1895 type DictExpr = CoreExpr
1896
1897 ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet
1898 ciSetFilter p (CIS id a) = CIS id (filterBag p a)
1899
1900 instance Outputable CallInfoSet where
1901 ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
1902 2 (ppr map)
1903
1904 pprCallInfo :: Id -> CallInfo -> SDoc
1905 pprCallInfo fn (CI { ci_key = key })
1906 = ppr fn <+> ppr key
1907
1908 ppr_call_key_ty :: Maybe Type -> SDoc
1909 ppr_call_key_ty Nothing = char '_'
1910 ppr_call_key_ty (Just ty) = char '@' <+> pprParendType ty
1911
1912 instance Outputable CallKey where
1913 ppr (CallKey ts) = brackets (fsep (map ppr_call_key_ty ts))
1914
1915 instance Outputable CallInfo where
1916 ppr (CI { ci_key = key, ci_args = args, ci_fvs = fvs })
1917 = text "CI" <> braces (hsep [ ppr key, ppr args, ppr fvs ])
1918
1919 unionCalls :: CallDetails -> CallDetails -> CallDetails
1920 unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2
1921
1922 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
1923 unionCallInfoSet (CIS f calls1) (CIS _ calls2) =
1924 CIS f (calls1 `unionBags` calls2)
1925
1926 callDetailsFVs :: CallDetails -> VarSet
1927 callDetailsFVs calls =
1928 nonDetFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls
1929 -- It's OK to use nonDetFoldUDFM here because we forget the ordering
1930 -- immediately by converting to a nondeterministic set.
1931
1932 callInfoFVs :: CallInfoSet -> VarSet
1933 callInfoFVs (CIS _ call_info) =
1934 foldrBag (\(CI { ci_fvs = fv }) vs -> unionVarSet fv vs) emptyVarSet call_info
1935
1936 ------------------------------------------------------------
1937 singleCall :: Id -> [Maybe Type] -> [DictExpr] -> UsageDetails
1938 singleCall id tys dicts
1939 = MkUD {ud_binds = emptyBag,
1940 ud_calls = unitDVarEnv id $ CIS id $
1941 unitBag (CI { ci_key = CallKey tys
1942 , ci_args = dicts
1943 , ci_fvs = call_fvs }) }
1944 where
1945 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
1946 tys_fvs = tyCoVarsOfTypes (catMaybes tys)
1947 -- The type args (tys) are guaranteed to be part of the dictionary
1948 -- types, because they are just the constrained types,
1949 -- and the dictionary is therefore sure to be bound
1950 -- inside the binding for any type variables free in the type;
1951 -- hence it's safe to neglect tyvars free in tys when making
1952 -- the free-var set for this call
1953 -- BUT I don't trust this reasoning; play safe and include tys_fvs
1954 --
1955 -- We don't include the 'id' itself.
1956
1957 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
1958 mkCallUDs env f args
1959 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
1960 res
1961 where
1962 res = mkCallUDs' env f args
1963
1964 mkCallUDs' env f args
1965 | not (want_calls_for f) -- Imported from elsewhere
1966 || null theta -- Not overloaded
1967 = emptyUDs
1968
1969 | not (all type_determines_value theta)
1970 || not (spec_tys `lengthIs` n_tyvars)
1971 || not ( dicts `lengthIs` n_dicts)
1972 || not (any (interestingDict env) dicts) -- Note [Interesting dictionary arguments]
1973 -- See also Note [Specialisations already covered]
1974 = -- pprTrace "mkCallUDs: discarding" _trace_doc
1975 emptyUDs -- Not overloaded, or no specialisation wanted
1976
1977 | otherwise
1978 = -- pprTrace "mkCallUDs: keeping" _trace_doc
1979 singleCall f spec_tys dicts
1980 where
1981 _trace_doc = vcat [ppr f, ppr args, ppr n_tyvars, ppr n_dicts
1982 , ppr (map (interestingDict env) dicts)]
1983 (tyvars, theta, _) = tcSplitSigmaTy (idType f)
1984 constrained_tyvars = tyCoVarsOfTypes theta
1985 n_tyvars = length tyvars
1986 n_dicts = length theta
1987
1988 spec_tys = [mk_spec_ty tv ty | (tv, ty) <- tyvars `type_zip` args]
1989 dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
1990
1991 -- ignores Coercion arguments
1992 type_zip :: [TyVar] -> [CoreExpr] -> [(TyVar, Type)]
1993 type_zip tvs (Coercion _ : args) = type_zip tvs args
1994 type_zip (tv:tvs) (Type ty : args) = (tv, ty) : type_zip tvs args
1995 type_zip _ _ = []
1996
1997 mk_spec_ty tyvar ty
1998 | tyvar `elemVarSet` constrained_tyvars = Just ty
1999 | otherwise = Nothing
2000
2001 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
2002 -- For imported things, we gather call instances if
2003 -- there is an unfolding that we could in principle specialise
2004 -- We might still decide not to use it (consulting dflags)
2005 -- in specImports
2006 -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
2007
2008 type_determines_value pred -- See Note [Type determines value]
2009 = case classifyPredType pred of
2010 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs
2011 EqPred {} -> True
2012 IrredPred {} -> True -- Things like (D []) where D is a
2013 -- Constraint-ranged family; Trac #7785
2014
2015 {-
2016 Note [Type determines value]
2017 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2018 Only specialise if all overloading is on non-IP *class* params,
2019 because these are the ones whose *type* determines their *value*. In
2020 parrticular, with implicit params, the type args *don't* say what the
2021 value of the implicit param is! See Trac #7101
2022
2023 However, consider
2024 type family D (v::*->*) :: Constraint
2025 type instance D [] = ()
2026 f :: D v => v Char -> Int
2027 If we see a call (f "foo"), we'll pass a "dictionary"
2028 () |> (g :: () ~ D [])
2029 and it's good to specialise f at this dictionary.
2030
2031 So the question is: can an implicit parameter "hide inside" a
2032 type-family constraint like (D a). Well, no. We don't allow
2033 type instance D Maybe = ?x:Int
2034 Hence the IrredPred case in type_determines_value.
2035 See Trac #7785.
2036
2037 Note [Interesting dictionary arguments]
2038 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2039 Consider this
2040 \a.\d:Eq a. let f = ... in ...(f d)...
2041 There really is not much point in specialising f wrt the dictionary d,
2042 because the code for the specialised f is not improved at all, because
2043 d is lambda-bound. We simply get junk specialisations.
2044
2045 What is "interesting"? Just that it has *some* structure. But what about
2046 variables?
2047
2048 * A variable might be imported, in which case its unfolding
2049 will tell us whether it has useful structure
2050
2051 * Local variables are cloned on the way down (to avoid clashes when
2052 we float dictionaries), and cloning drops the unfolding
2053 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
2054 nuisance to give them unfoldings. So we keep track of the
2055 "interesting" dictionaries as a VarSet in SpecEnv.
2056 We have to take care to put any new interesting dictionary
2057 bindings in the set.
2058
2059 We accidentally lost accurate tracking of local variables for a long
2060 time, because cloned variables don't have unfoldings. But makes a
2061 massive difference in a few cases, eg Trac #5113. For nofib as a
2062 whole it's only a small win: 2.2% improvement in allocation for ansi,
2063 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
2064 -}
2065
2066 interestingDict :: SpecEnv -> CoreExpr -> Bool
2067 -- A dictionary argument is interesting if it has *some* structure
2068 -- NB: "dictionary" arguments include constraints of all sorts,
2069 -- including equality constraints; hence the Coercion case
2070 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
2071 || isDataConWorkId v
2072 || v `elemVarSet` se_interesting env
2073 interestingDict _ (Type _) = False
2074 interestingDict _ (Coercion _) = False
2075 interestingDict env (App fn (Type _)) = interestingDict env fn
2076 interestingDict env (App fn (Coercion _)) = interestingDict env fn
2077 interestingDict env (Tick _ a) = interestingDict env a
2078 interestingDict env (Cast e _) = interestingDict env e
2079 interestingDict _ _ = True
2080
2081 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
2082 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
2083 (MkUD {ud_binds = db2, ud_calls = calls2})
2084 = MkUD { ud_binds = db1 `unionBags` db2
2085 , ud_calls = calls1 `unionCalls` calls2 }
2086
2087 -----------------------------
2088 _dictBindBndrs :: Bag DictBind -> [Id]
2089 _dictBindBndrs dbs = foldrBag ((++) . bindersOf . fst) [] dbs
2090
2091 -- | Construct a 'DictBind' from a 'CoreBind'
2092 mkDB :: CoreBind -> DictBind
2093 mkDB bind = (bind, bind_fvs bind)
2094
2095 -- | Identify the free variables of a 'CoreBind'
2096 bind_fvs :: CoreBind -> VarSet
2097 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
2098 bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs
2099 where
2100 bndrs = map fst prs
2101 rhs_fvs = unionVarSets (map pair_fvs prs)
2102
2103 pair_fvs :: (Id, CoreExpr) -> VarSet
2104 pair_fvs (bndr, rhs) = exprSomeFreeVars interesting rhs
2105 `unionVarSet` idFreeVars bndr
2106 -- idFreeVars: don't forget variables mentioned in
2107 -- the rules of the bndr. C.f. OccAnal.addRuleUsage
2108 -- Also tyvars mentioned in its type; they may not appear
2109 -- in the RHS
2110 -- type T a = Int
2111 -- x :: T a = 3
2112 where
2113 interesting :: InterestingVarFun
2114 interesting v = isLocalVar v || (isId v && isDFunId v)
2115 -- Very important: include DFunIds /even/ if it is imported
2116 -- Reason: See Note [Avoiding loops], the second exmaple
2117 -- involving an imported dfun. We must know whether
2118 -- a dictionary binding depends on an imported dfun,
2119 -- in case we try to specialise that imported dfun
2120 -- Trac #13429 illustrates
2121
2122 -- | Flatten a set of "dumped" 'DictBind's, and some other binding
2123 -- pairs, into a single recursive binding.
2124 recWithDumpedDicts :: [(Id,CoreExpr)] -> Bag DictBind ->DictBind
2125 recWithDumpedDicts pairs dbs
2126 = (Rec bindings, fvs)
2127 where
2128 (bindings, fvs) = foldrBag add
2129 ([], emptyVarSet)
2130 (dbs `snocBag` mkDB (Rec pairs))
2131 add (NonRec b r, fvs') (pairs, fvs) =
2132 ((b,r) : pairs, fvs `unionVarSet` fvs')
2133 add (Rec prs1, fvs') (pairs, fvs) =
2134 (prs1 ++ pairs, fvs `unionVarSet` fvs')
2135
2136 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
2137 -- Add ud_binds to the tail end of the bindings in uds
2138 snocDictBinds uds dbs
2139 = uds { ud_binds = ud_binds uds `unionBags` listToBag dbs }
2140
2141 consDictBind :: DictBind -> UsageDetails -> UsageDetails
2142 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
2143
2144 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
2145 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
2146
2147 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
2148 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
2149
2150 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
2151 wrapDictBinds dbs binds
2152 = foldrBag add binds dbs
2153 where
2154 add (bind,_) binds = bind : binds
2155
2156 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
2157 wrapDictBindsE dbs expr
2158 = foldrBag add expr dbs
2159 where
2160 add (bind,_) expr = Let bind expr
2161
2162 ----------------------
2163 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
2164 -- Used at a lambda or case binder; just dump anything mentioning the binder
2165 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2166 | null bndrs = (uds, emptyBag) -- Common in case alternatives
2167 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2168 (free_uds, dump_dbs)
2169 where
2170 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2171 bndr_set = mkVarSet bndrs
2172 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2173 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
2174 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
2175 -- no calls for any of the dicts in dump_dbs
2176
2177 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
2178 -- Used at a let(rec) binding.
2179 -- We return a boolean indicating whether the binding itself is mentioned
2180 -- is mentioned, directly or indirectly, by any of the ud_calls; in that
2181 -- case we want to float the binding itself;
2182 -- See Note [Floated dictionary bindings]
2183 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2184 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2185 (free_uds, dump_dbs, float_all)
2186 where
2187 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2188 bndr_set = mkVarSet bndrs
2189 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2190 free_calls = deleteCallsFor bndrs orig_calls
2191 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
2192
2193 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
2194 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2195 = -- pprTrace ("callsForMe")
2196 -- (vcat [ppr fn,
2197 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
2198 -- text "Orig calls =" <+> ppr orig_calls,
2199 -- text "Dep set =" <+> ppr dep_set,
2200 -- text "Calls for me =" <+> ppr calls_for_me]) $
2201 (uds_without_me, calls_for_me)
2202 where
2203 uds_without_me = MkUD { ud_binds = orig_dbs
2204 , ud_calls = delDVarEnv orig_calls fn }
2205 calls_for_me = case lookupDVarEnv orig_calls fn of
2206 Nothing -> []
2207 Just cis -> filterCalls cis orig_dbs
2208 -- filterCalls: drop calls that (directly or indirectly)
2209 -- refer to fn. See Note [Avoiding loops]
2210
2211 ----------------------
2212 filterCalls :: CallInfoSet -> Bag DictBind -> [CallInfo]
2213 -- See Note [Avoiding loops]
2214 filterCalls (CIS fn call_bag) dbs
2215 = filter ok_call (bagToList call_bag)
2216 where
2217 dump_set = foldlBag go (unitVarSet fn) dbs
2218 -- This dump-set could also be computed by splitDictBinds
2219 -- (_,_,dump_set) = splitDictBinds dbs {fn}
2220 -- But this variant is shorter
2221
2222 go so_far (db,fvs) | fvs `intersectsVarSet` so_far
2223 = extendVarSetList so_far (bindersOf db)
2224 | otherwise = so_far
2225
2226 ok_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` dump_set)
2227
2228 ----------------------
2229 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2230 -- splitDictBinds dbs bndrs returns
2231 -- (free_dbs, dump_dbs, dump_set)
2232 -- where
2233 -- * dump_dbs depends, transitively on bndrs
2234 -- * free_dbs does not depend on bndrs
2235 -- * dump_set = bndrs `union` bndrs(dump_dbs)
2236 splitDictBinds dbs bndr_set
2237 = foldlBag split_db (emptyBag, emptyBag, bndr_set) dbs
2238 -- Important that it's foldl not foldr;
2239 -- we're accumulating the set of dumped ids in dump_set
2240 where
2241 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
2242 | dump_idset `intersectsVarSet` fvs -- Dump it
2243 = (free_dbs, dump_dbs `snocBag` db,
2244 extendVarSetList dump_idset (bindersOf bind))
2245
2246 | otherwise -- Don't dump it
2247 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2248
2249
2250 ----------------------
2251 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2252 -- Remove calls *mentioning* bs in any way
2253 deleteCallsMentioning bs calls
2254 = mapDVarEnv (ciSetFilter keep_call) calls
2255 where
2256 keep_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` bs)
2257
2258 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
2259 -- Remove calls *for* bs
2260 deleteCallsFor bs calls = delDVarEnvList calls bs
2261
2262 {-
2263 ************************************************************************
2264 * *
2265 \subsubsection{Boring helper functions}
2266 * *
2267 ************************************************************************
2268 -}
2269
2270 newtype SpecM a = SpecM (State SpecState a)
2271
2272 data SpecState = SpecState {
2273 spec_uniq_supply :: UniqSupply,
2274 spec_module :: Module,
2275 spec_dflags :: DynFlags
2276 }
2277
2278 instance Functor SpecM where
2279 fmap = liftM
2280
2281 instance Applicative SpecM where
2282 pure x = SpecM $ return x
2283 (<*>) = ap
2284
2285 instance Monad SpecM where
2286 SpecM x >>= f = SpecM $ do y <- x
2287 case f y of
2288 SpecM z ->
2289 z
2290 fail = MonadFail.fail
2291
2292 instance MonadFail.MonadFail SpecM where
2293 fail str = SpecM $ fail str
2294
2295 instance MonadUnique SpecM where
2296 getUniqueSupplyM
2297 = SpecM $ do st <- get
2298 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
2299 put $ st { spec_uniq_supply = us2 }
2300 return us1
2301
2302 getUniqueM
2303 = SpecM $ do st <- get
2304 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
2305 put $ st { spec_uniq_supply = us' }
2306 return u
2307
2308 instance HasDynFlags SpecM where
2309 getDynFlags = SpecM $ liftM spec_dflags get
2310
2311 instance HasModule SpecM where
2312 getModule = SpecM $ liftM spec_module get
2313
2314 runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2315 runSpecM dflags this_mod (SpecM spec)
2316 = do us <- getUniqueSupplyM
2317 let initialState = SpecState {
2318 spec_uniq_supply = us,
2319 spec_module = this_mod,
2320 spec_dflags = dflags
2321 }
2322 return $ evalState spec initialState
2323
2324 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2325 mapAndCombineSM _ [] = return ([], emptyUDs)
2326 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2327 (ys, uds2) <- mapAndCombineSM f xs
2328 return (y:ys, uds1 `plusUDs` uds2)
2329
2330 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2331 extendTvSubstList env tv_binds
2332 = env { se_subst = CoreSubst.extendTvSubstList (se_subst env) tv_binds }
2333
2334 substTy :: SpecEnv -> Type -> Type
2335 substTy env ty = CoreSubst.substTy (se_subst env) ty
2336
2337 substCo :: SpecEnv -> Coercion -> Coercion
2338 substCo env co = CoreSubst.substCo (se_subst env) co
2339
2340 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2341 substBndr env bs = case CoreSubst.substBndr (se_subst env) bs of
2342 (subst', bs') -> (env { se_subst = subst' }, bs')
2343
2344 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2345 substBndrs env bs = case CoreSubst.substBndrs (se_subst env) bs of
2346 (subst', bs') -> (env { se_subst = subst' }, bs')
2347
2348 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2349 -- Clone the binders of the bind; return new bind with the cloned binders
2350 -- Return the substitution to use for RHSs, and the one to use for the body
2351 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2352 = do { us <- getUniqueSupplyM
2353 ; let (subst', bndr') = CoreSubst.cloneIdBndr subst us bndr
2354 interesting' | interestingDict env rhs
2355 = interesting `extendVarSet` bndr'
2356 | otherwise = interesting
2357 ; return (env, env { se_subst = subst', se_interesting = interesting' }
2358 , NonRec bndr' rhs) }
2359
2360 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2361 = do { us <- getUniqueSupplyM
2362 ; let (subst', bndrs') = CoreSubst.cloneRecIdBndrs subst us (map fst pairs)
2363 env' = env { se_subst = subst'
2364 , se_interesting = interesting `extendVarSetList`
2365 [ v | (v,r) <- pairs, interestingDict env r ] }
2366 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2367
2368 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2369 -- Make up completely fresh binders for the dictionaries
2370 -- Their bindings are going to float outwards
2371 newDictBndr env b = do { uniq <- getUniqueM
2372 ; let n = idName b
2373 ty' = substTy env (idType b)
2374 ; return (mkUserLocalOrCoVar (nameOccName n) uniq ty' (getSrcSpan n)) }
2375
2376 newSpecIdSM :: Id -> Type -> Maybe JoinArity -> SpecM Id
2377 -- Give the new Id a similar occurrence name to the old one
2378 newSpecIdSM old_id new_ty join_arity_maybe
2379 = do { uniq <- getUniqueM
2380 ; let name = idName old_id
2381 new_occ = mkSpecOcc (nameOccName name)
2382 new_id = mkUserLocalOrCoVar new_occ uniq new_ty (getSrcSpan name)
2383 `asJoinId_maybe` join_arity_maybe
2384 ; return new_id }
2385
2386 {-
2387 Old (but interesting) stuff about unboxed bindings
2388 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2389
2390 What should we do when a value is specialised to a *strict* unboxed value?
2391
2392 map_*_* f (x:xs) = let h = f x
2393 t = map f xs
2394 in h:t
2395
2396 Could convert let to case:
2397
2398 map_*_Int# f (x:xs) = case f x of h# ->
2399 let t = map f xs
2400 in h#:t
2401
2402 This may be undesirable since it forces evaluation here, but the value
2403 may not be used in all branches of the body. In the general case this
2404 transformation is impossible since the mutual recursion in a letrec
2405 cannot be expressed as a case.
2406
2407 There is also a problem with top-level unboxed values, since our
2408 implementation cannot handle unboxed values at the top level.
2409
2410 Solution: Lift the binding of the unboxed value and extract it when it
2411 is used:
2412
2413 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2414 t = map f xs
2415 in case h of
2416 _Lift h# -> h#:t
2417
2418 Now give it to the simplifier and the _Lifting will be optimised away.
2419
2420 The benefit is that we have given the specialised "unboxed" values a
2421 very simple lifted semantics and then leave it up to the simplifier to
2422 optimise it --- knowing that the overheads will be removed in nearly
2423 all cases.
2424
2425 In particular, the value will only be evaluated in the branches of the
2426 program which use it, rather than being forced at the point where the
2427 value is bound. For example:
2428
2429 filtermap_*_* p f (x:xs)
2430 = let h = f x
2431 t = ...
2432 in case p x of
2433 True -> h:t
2434 False -> t
2435 ==>
2436 filtermap_*_Int# p f (x:xs)
2437 = let h = case (f x) of h# -> _Lift h#
2438 t = ...
2439 in case p x of
2440 True -> case h of _Lift h#
2441 -> h#:t
2442 False -> t
2443
2444 The binding for h can still be inlined in the one branch and the
2445 _Lifting eliminated.
2446
2447
2448 Question: When won't the _Lifting be eliminated?
2449
2450 Answer: When they at the top-level (where it is necessary) or when
2451 inlining would duplicate work (or possibly code depending on
2452 options). However, the _Lifting will still be eliminated if the
2453 strictness analyser deems the lifted binding strict.
2454 -}