f391781c4356c0a14ccbe4835f015a6a0e7675c7
[ghc.git] / compiler / specialise / Specialise.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5 -}
6
7 {-# LANGUAGE CPP #-}
8 module Specialise ( specProgram, specUnfolding ) where
9
10 #include "HsVersions.h"
11
12 import GhcPrelude
13
14 import Id
15 import TcType hiding( substTy )
16 import Type hiding( substTy, extendTvSubstList )
17 import Module( Module, HasModule(..) )
18 import Coercion( Coercion )
19 import CoreMonad
20 import qualified CoreSubst
21 import CoreUnfold
22 import Var ( isLocalVar )
23 import VarSet
24 import VarEnv
25 import CoreSyn
26 import Rules
27 import CoreOpt ( collectBindersPushingCo )
28 import CoreUtils ( exprIsTrivial, applyTypeToArgs, mkCast )
29 import CoreFVs
30 import CoreArity ( etaExpandToJoinPointRule )
31 import UniqSupply
32 import Name
33 import MkId ( voidArgId, voidPrimId )
34 import Maybes ( catMaybes, isJust )
35 import MonadUtils ( foldlM )
36 import BasicTypes
37 import HscTypes
38 import Bag
39 import DynFlags
40 import Util
41 import Outputable
42 import FastString
43 import State
44 import UniqDFM
45
46 import Control.Monad
47 import qualified Control.Monad.Fail as MonadFail
48
49 {-
50 ************************************************************************
51 * *
52 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
53 * *
54 ************************************************************************
55
56 These notes describe how we implement specialisation to eliminate
57 overloading.
58
59 The specialisation pass works on Core
60 syntax, complete with all the explicit dictionary application,
61 abstraction and construction as added by the type checker. The
62 existing type checker remains largely as it is.
63
64 One important thought: the {\em types} passed to an overloaded
65 function, and the {\em dictionaries} passed are mutually redundant.
66 If the same function is applied to the same type(s) then it is sure to
67 be applied to the same dictionary(s)---or rather to the same {\em
68 values}. (The arguments might look different but they will evaluate
69 to the same value.)
70
71 Second important thought: we know that we can make progress by
72 treating dictionary arguments as static and worth specialising on. So
73 we can do without binding-time analysis, and instead specialise on
74 dictionary arguments and no others.
75
76 The basic idea
77 ~~~~~~~~~~~~~~
78 Suppose we have
79
80 let f = <f_rhs>
81 in <body>
82
83 and suppose f is overloaded.
84
85 STEP 1: CALL-INSTANCE COLLECTION
86
87 We traverse <body>, accumulating all applications of f to types and
88 dictionaries.
89
90 (Might there be partial applications, to just some of its types and
91 dictionaries? In principle yes, but in practice the type checker only
92 builds applications of f to all its types and dictionaries, so partial
93 applications could only arise as a result of transformation, and even
94 then I think it's unlikely. In any case, we simply don't accumulate such
95 partial applications.)
96
97
98 STEP 2: EQUIVALENCES
99
100 So now we have a collection of calls to f:
101 f t1 t2 d1 d2
102 f t3 t4 d3 d4
103 ...
104 Notice that f may take several type arguments. To avoid ambiguity, we
105 say that f is called at type t1/t2 and t3/t4.
106
107 We take equivalence classes using equality of the *types* (ignoring
108 the dictionary args, which as mentioned previously are redundant).
109
110 STEP 3: SPECIALISATION
111
112 For each equivalence class, choose a representative (f t1 t2 d1 d2),
113 and create a local instance of f, defined thus:
114
115 f@t1/t2 = <f_rhs> t1 t2 d1 d2
116
117 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
118 of simplification will now result. However we don't actually *do* that
119 simplification. Rather, we leave it for the simplifier to do. If we
120 *did* do it, though, we'd get more call instances from the specialised
121 RHS. We can work out what they are by instantiating the call-instance
122 set from f's RHS with the types t1, t2.
123
124 Add this new id to f's IdInfo, to record that f has a specialised version.
125
126 Before doing any of this, check that f's IdInfo doesn't already
127 tell us about an existing instance of f at the required type/s.
128 (This might happen if specialisation was applied more than once, or
129 it might arise from user SPECIALIZE pragmas.)
130
131 Recursion
132 ~~~~~~~~~
133 Wait a minute! What if f is recursive? Then we can't just plug in
134 its right-hand side, can we?
135
136 But it's ok. The type checker *always* creates non-recursive definitions
137 for overloaded recursive functions. For example:
138
139 f x = f (x+x) -- Yes I know its silly
140
141 becomes
142
143 f a (d::Num a) = let p = +.sel a d
144 in
145 letrec fl (y::a) = fl (p y y)
146 in
147 fl
148
149 We still have recursion for non-overloaded functions which we
150 specialise, but the recursive call should get specialised to the
151 same recursive version.
152
153
154 Polymorphism 1
155 ~~~~~~~~~~~~~~
156
157 All this is crystal clear when the function is applied to *constant
158 types*; that is, types which have no type variables inside. But what if
159 it is applied to non-constant types? Suppose we find a call of f at type
160 t1/t2. There are two possibilities:
161
162 (a) The free type variables of t1, t2 are in scope at the definition point
163 of f. In this case there's no problem, we proceed just as before. A common
164 example is as follows. Here's the Haskell:
165
166 g y = let f x = x+x
167 in f y + f y
168
169 After typechecking we have
170
171 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
172 in +.sel a d (f a d y) (f a d y)
173
174 Notice that the call to f is at type type "a"; a non-constant type.
175 Both calls to f are at the same type, so we can specialise to give:
176
177 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
178 in +.sel a d (f@a y) (f@a y)
179
180
181 (b) The other case is when the type variables in the instance types
182 are *not* in scope at the definition point of f. The example we are
183 working with above is a good case. There are two instances of (+.sel a d),
184 but "a" is not in scope at the definition of +.sel. Can we do anything?
185 Yes, we can "common them up", a sort of limited common sub-expression deal.
186 This would give:
187
188 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
189 f@a (x::a) = +.sel@a x x
190 in +.sel@a (f@a y) (f@a y)
191
192 This can save work, and can't be spotted by the type checker, because
193 the two instances of +.sel weren't originally at the same type.
194
195 Further notes on (b)
196
197 * There are quite a few variations here. For example, the defn of
198 +.sel could be floated ouside the \y, to attempt to gain laziness.
199 It certainly mustn't be floated outside the \d because the d has to
200 be in scope too.
201
202 * We don't want to inline f_rhs in this case, because
203 that will duplicate code. Just commoning up the call is the point.
204
205 * Nothing gets added to +.sel's IdInfo.
206
207 * Don't bother unless the equivalence class has more than one item!
208
209 Not clear whether this is all worth it. It is of course OK to
210 simply discard call-instances when passing a big lambda.
211
212 Polymorphism 2 -- Overloading
213 ~~~~~~~~~~~~~~
214 Consider a function whose most general type is
215
216 f :: forall a b. Ord a => [a] -> b -> b
217
218 There is really no point in making a version of g at Int/Int and another
219 at Int/Bool, because it's only instantiating the type variable "a" which
220 buys us any efficiency. Since g is completely polymorphic in b there
221 ain't much point in making separate versions of g for the different
222 b types.
223
224 That suggests that we should identify which of g's type variables
225 are constrained (like "a") and which are unconstrained (like "b").
226 Then when taking equivalence classes in STEP 2, we ignore the type args
227 corresponding to unconstrained type variable. In STEP 3 we make
228 polymorphic versions. Thus:
229
230 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
231
232 We do this.
233
234
235 Dictionary floating
236 ~~~~~~~~~~~~~~~~~~~
237 Consider this
238
239 f a (d::Num a) = let g = ...
240 in
241 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
242
243 Here, g is only called at one type, but the dictionary isn't in scope at the
244 definition point for g. Usually the type checker would build a
245 definition for d1 which enclosed g, but the transformation system
246 might have moved d1's defn inward. Solution: float dictionary bindings
247 outwards along with call instances.
248
249 Consider
250
251 f x = let g p q = p==q
252 h r s = (r+s, g r s)
253 in
254 h x x
255
256
257 Before specialisation, leaving out type abstractions we have
258
259 f df x = let g :: Eq a => a -> a -> Bool
260 g dg p q = == dg p q
261 h :: Num a => a -> a -> (a, Bool)
262 h dh r s = let deq = eqFromNum dh
263 in (+ dh r s, g deq r s)
264 in
265 h df x x
266
267 After specialising h we get a specialised version of h, like this:
268
269 h' r s = let deq = eqFromNum df
270 in (+ df r s, g deq r s)
271
272 But we can't naively make an instance for g from this, because deq is not in scope
273 at the defn of g. Instead, we have to float out the (new) defn of deq
274 to widen its scope. Notice that this floating can't be done in advance -- it only
275 shows up when specialisation is done.
276
277 User SPECIALIZE pragmas
278 ~~~~~~~~~~~~~~~~~~~~~~~
279 Specialisation pragmas can be digested by the type checker, and implemented
280 by adding extra definitions along with that of f, in the same way as before
281
282 f@t1/t2 = <f_rhs> t1 t2 d1 d2
283
284 Indeed the pragmas *have* to be dealt with by the type checker, because
285 only it knows how to build the dictionaries d1 and d2! For example
286
287 g :: Ord a => [a] -> [a]
288 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
289
290 Here, the specialised version of g is an application of g's rhs to the
291 Ord dictionary for (Tree Int), which only the type checker can conjure
292 up. There might not even *be* one, if (Tree Int) is not an instance of
293 Ord! (All the other specialision has suitable dictionaries to hand
294 from actual calls.)
295
296 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
297 it is buried in a complex (as-yet-un-desugared) binding group.
298 Maybe we should say
299
300 f@t1/t2 = f* t1 t2 d1 d2
301
302 where f* is the Id f with an IdInfo which says "inline me regardless!".
303 Indeed all the specialisation could be done in this way.
304 That in turn means that the simplifier has to be prepared to inline absolutely
305 any in-scope let-bound thing.
306
307
308 Again, the pragma should permit polymorphism in unconstrained variables:
309
310 h :: Ord a => [a] -> b -> b
311 {-# SPECIALIZE h :: [Int] -> b -> b #-}
312
313 We *insist* that all overloaded type variables are specialised to ground types,
314 (and hence there can be no context inside a SPECIALIZE pragma).
315 We *permit* unconstrained type variables to be specialised to
316 - a ground type
317 - or left as a polymorphic type variable
318 but nothing in between. So
319
320 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
321
322 is *illegal*. (It can be handled, but it adds complication, and gains the
323 programmer nothing.)
324
325
326 SPECIALISING INSTANCE DECLARATIONS
327 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
328 Consider
329
330 instance Foo a => Foo [a] where
331 ...
332 {-# SPECIALIZE instance Foo [Int] #-}
333
334 The original instance decl creates a dictionary-function
335 definition:
336
337 dfun.Foo.List :: forall a. Foo a -> Foo [a]
338
339 The SPECIALIZE pragma just makes a specialised copy, just as for
340 ordinary function definitions:
341
342 dfun.Foo.List@Int :: Foo [Int]
343 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
344
345 The information about what instance of the dfun exist gets added to
346 the dfun's IdInfo in the same way as a user-defined function too.
347
348
349 Automatic instance decl specialisation?
350 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
351 Can instance decls be specialised automatically? It's tricky.
352 We could collect call-instance information for each dfun, but
353 then when we specialised their bodies we'd get new call-instances
354 for ordinary functions; and when we specialised their bodies, we might get
355 new call-instances of the dfuns, and so on. This all arises because of
356 the unrestricted mutual recursion between instance decls and value decls.
357
358 Still, there's no actual problem; it just means that we may not do all
359 the specialisation we could theoretically do.
360
361 Furthermore, instance decls are usually exported and used non-locally,
362 so we'll want to compile enough to get those specialisations done.
363
364 Lastly, there's no such thing as a local instance decl, so we can
365 survive solely by spitting out *usage* information, and then reading that
366 back in as a pragma when next compiling the file. So for now,
367 we only specialise instance decls in response to pragmas.
368
369
370 SPITTING OUT USAGE INFORMATION
371 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
372
373 To spit out usage information we need to traverse the code collecting
374 call-instance information for all imported (non-prelude?) functions
375 and data types. Then we equivalence-class it and spit it out.
376
377 This is done at the top-level when all the call instances which escape
378 must be for imported functions and data types.
379
380 *** Not currently done ***
381
382
383 Partial specialisation by pragmas
384 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
385 What about partial specialisation:
386
387 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
388 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
389
390 or even
391
392 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
393
394 Seems quite reasonable. Similar things could be done with instance decls:
395
396 instance (Foo a, Foo b) => Foo (a,b) where
397 ...
398 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
399 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
400
401 Ho hum. Things are complex enough without this. I pass.
402
403
404 Requirements for the simplifier
405 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
406 The simplifier has to be able to take advantage of the specialisation.
407
408 * When the simplifier finds an application of a polymorphic f, it looks in
409 f's IdInfo in case there is a suitable instance to call instead. This converts
410
411 f t1 t2 d1 d2 ===> f_t1_t2
412
413 Note that the dictionaries get eaten up too!
414
415 * Dictionary selection operations on constant dictionaries must be
416 short-circuited:
417
418 +.sel Int d ===> +Int
419
420 The obvious way to do this is in the same way as other specialised
421 calls: +.sel has inside it some IdInfo which tells that if it's applied
422 to the type Int then it should eat a dictionary and transform to +Int.
423
424 In short, dictionary selectors need IdInfo inside them for constant
425 methods.
426
427 * Exactly the same applies if a superclass dictionary is being
428 extracted:
429
430 Eq.sel Int d ===> dEqInt
431
432 * Something similar applies to dictionary construction too. Suppose
433 dfun.Eq.List is the function taking a dictionary for (Eq a) to
434 one for (Eq [a]). Then we want
435
436 dfun.Eq.List Int d ===> dEq.List_Int
437
438 Where does the Eq [Int] dictionary come from? It is built in
439 response to a SPECIALIZE pragma on the Eq [a] instance decl.
440
441 In short, dfun Ids need IdInfo with a specialisation for each
442 constant instance of their instance declaration.
443
444 All this uses a single mechanism: the SpecEnv inside an Id
445
446
447 What does the specialisation IdInfo look like?
448 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
449
450 The SpecEnv of an Id maps a list of types (the template) to an expression
451
452 [Type] |-> Expr
453
454 For example, if f has this RuleInfo:
455
456 [Int, a] -> \d:Ord Int. f' a
457
458 it means that we can replace the call
459
460 f Int t ===> (\d. f' t)
461
462 This chucks one dictionary away and proceeds with the
463 specialised version of f, namely f'.
464
465
466 What can't be done this way?
467 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
468 There is no way, post-typechecker, to get a dictionary for (say)
469 Eq a from a dictionary for Eq [a]. So if we find
470
471 ==.sel [t] d
472
473 we can't transform to
474
475 eqList (==.sel t d')
476
477 where
478 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
479
480 Of course, we currently have no way to automatically derive
481 eqList, nor to connect it to the Eq [a] instance decl, but you
482 can imagine that it might somehow be possible. Taking advantage
483 of this is permanently ruled out.
484
485 Still, this is no great hardship, because we intend to eliminate
486 overloading altogether anyway!
487
488 A note about non-tyvar dictionaries
489 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
490 Some Ids have types like
491
492 forall a,b,c. Eq a -> Ord [a] -> tau
493
494 This seems curious at first, because we usually only have dictionary
495 args whose types are of the form (C a) where a is a type variable.
496 But this doesn't hold for the functions arising from instance decls,
497 which sometimes get arguments with types of form (C (T a)) for some
498 type constructor T.
499
500 Should we specialise wrt this compound-type dictionary? We used to say
501 "no", saying:
502 "This is a heuristic judgement, as indeed is the fact that we
503 specialise wrt only dictionaries. We choose *not* to specialise
504 wrt compound dictionaries because at the moment the only place
505 they show up is in instance decls, where they are simply plugged
506 into a returned dictionary. So nothing is gained by specialising
507 wrt them."
508
509 But it is simpler and more uniform to specialise wrt these dicts too;
510 and in future GHC is likely to support full fledged type signatures
511 like
512 f :: Eq [(a,b)] => ...
513
514
515 ************************************************************************
516 * *
517 \subsubsection{The new specialiser}
518 * *
519 ************************************************************************
520
521 Our basic game plan is this. For let(rec) bound function
522 f :: (C a, D c) => (a,b,c,d) -> Bool
523
524 * Find any specialised calls of f, (f ts ds), where
525 ts are the type arguments t1 .. t4, and
526 ds are the dictionary arguments d1 .. d2.
527
528 * Add a new definition for f1 (say):
529
530 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
531
532 Note that we abstract over the unconstrained type arguments.
533
534 * Add the mapping
535
536 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
537
538 to the specialisations of f. This will be used by the
539 simplifier to replace calls
540 (f t1 t2 t3 t4) da db
541 by
542 (\d1 d1 -> f1 t2 t4) da db
543
544 All the stuff about how many dictionaries to discard, and what types
545 to apply the specialised function to, are handled by the fact that the
546 SpecEnv contains a template for the result of the specialisation.
547
548 We don't build *partial* specialisations for f. For example:
549
550 f :: Eq a => a -> a -> Bool
551 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
552
553 Here, little is gained by making a specialised copy of f.
554 There's a distinct danger that the specialised version would
555 first build a dictionary for (Eq b, Eq c), and then select the (==)
556 method from it! Even if it didn't, not a great deal is saved.
557
558 We do, however, generate polymorphic, but not overloaded, specialisations:
559
560 f :: Eq a => [a] -> b -> b -> b
561 ... SPECIALISE f :: [Int] -> b -> b -> b ...
562
563 Hence, the invariant is this:
564
565 *** no specialised version is overloaded ***
566
567
568 ************************************************************************
569 * *
570 \subsubsection{The exported function}
571 * *
572 ************************************************************************
573 -}
574
575 -- | Specialise calls to type-class overloaded functions occuring in a program.
576 specProgram :: ModGuts -> CoreM ModGuts
577 specProgram guts@(ModGuts { mg_module = this_mod
578 , mg_rules = local_rules
579 , mg_binds = binds })
580 = do { dflags <- getDynFlags
581
582 -- Specialise the bindings of this module
583 ; (binds', uds) <- runSpecM dflags this_mod (go binds)
584
585 -- Specialise imported functions
586 ; hpt_rules <- getRuleBase
587 ; let rule_base = extendRuleBaseList hpt_rules local_rules
588 ; (new_rules, spec_binds) <- specImports dflags this_mod top_env emptyVarSet
589 [] rule_base uds
590
591 ; let final_binds
592 | null spec_binds = binds'
593 | otherwise = Rec (flattenBinds spec_binds) : binds'
594 -- Note [Glom the bindings if imported functions are specialised]
595
596 ; return (guts { mg_binds = final_binds
597 , mg_rules = new_rules ++ local_rules }) }
598 where
599 -- We need to start with a Subst that knows all the things
600 -- that are in scope, so that the substitution engine doesn't
601 -- accidentally re-use a unique that's already in use
602 -- Easiest thing is to do it all at once, as if all the top-level
603 -- decls were mutually recursive
604 top_env = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
605 bindersOfBinds binds
606 , se_interesting = emptyVarSet }
607
608 go [] = return ([], emptyUDs)
609 go (bind:binds) = do (binds', uds) <- go binds
610 (bind', uds') <- specBind top_env bind uds
611 return (bind' ++ binds', uds')
612
613 {-
614 Note [Wrap bindings returned by specImports]
615 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
616 'specImports' returns a set of specialized bindings. However, these are lacking
617 necessary floated dictionary bindings, which are returned by
618 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
619 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
620 for instance, the 'specImports' call in 'specProgram'.
621
622
623 Note [Disabling cross-module specialisation]
624 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
625 Since GHC 7.10 we have performed specialisation of INLINABLE bindings living
626 in modules outside of the current module. This can sometimes uncover user code
627 which explodes in size when aggressively optimized. The
628 -fno-cross-module-specialise option was introduced to allow users to being
629 bitten by such instances to revert to the pre-7.10 behavior.
630
631 See Trac #10491
632 -}
633
634 -- | Specialise a set of calls to imported bindings
635 specImports :: DynFlags
636 -> Module
637 -> SpecEnv -- Passed in so that all top-level Ids are in scope
638 -> VarSet -- Don't specialise these ones
639 -- See Note [Avoiding recursive specialisation]
640 -> [Id] -- Stack of imported functions being specialised
641 -> RuleBase -- Rules from this module and the home package
642 -- (but not external packages, which can change)
643 -> UsageDetails -- Calls for imported things, and floating bindings
644 -> CoreM ( [CoreRule] -- New rules
645 , [CoreBind] ) -- Specialised bindings
646 -- See Note [Wrapping bindings returned by specImports]
647 specImports dflags this_mod top_env done callers rule_base
648 (MkUD { ud_binds = dict_binds, ud_calls = calls })
649 -- See Note [Disabling cross-module specialisation]
650 | not $ gopt Opt_CrossModuleSpecialise dflags
651 = return ([], [])
652
653 | otherwise
654 = do { let import_calls = dVarEnvElts calls
655 ; (rules, spec_binds) <- go rule_base import_calls
656
657 -- Don't forget to wrap the specialized bindings with
658 -- bindings for the needed dictionaries.
659 -- See Note [Wrap bindings returned by specImports]
660 ; let spec_binds' = wrapDictBinds dict_binds spec_binds
661
662 ; return (rules, spec_binds') }
663 where
664 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
665 go _ [] = return ([], [])
666 go rb (cis@(CIS fn _) : other_calls)
667 = do { let ok_calls = filterCalls cis dict_binds
668 -- Drop calls that (directly or indirectly) refer to fn
669 -- See Note [Avoiding loops]
670 -- ; debugTraceMsg (text "specImport" <+> vcat [ ppr fn
671 -- , text "calls" <+> ppr cis
672 -- , text "ud_binds =" <+> ppr dict_binds
673 -- , text "dump set =" <+> ppr dump_set
674 -- , text "filtered calls =" <+> ppr ok_calls ])
675 ; (rules1, spec_binds1) <- specImport dflags this_mod top_env
676 done callers rb fn ok_calls
677
678 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
679 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
680
681 specImport :: DynFlags
682 -> Module
683 -> SpecEnv -- Passed in so that all top-level Ids are in scope
684 -> VarSet -- Don't specialise these
685 -- See Note [Avoiding recursive specialisation]
686 -> [Id] -- Stack of imported functions being specialised
687 -> RuleBase -- Rules from this module
688 -> Id -> [CallInfo] -- Imported function and calls for it
689 -> CoreM ( [CoreRule] -- New rules
690 , [CoreBind] ) -- Specialised bindings
691 specImport dflags this_mod top_env done callers rb fn calls_for_fn
692 | fn `elemVarSet` done
693 = return ([], []) -- No warning. This actually happens all the time
694 -- when specialising a recursive function, because
695 -- the RHS of the specialised function contains a recursive
696 -- call to the original function
697
698 | null calls_for_fn -- We filtered out all the calls in deleteCallsMentioning
699 = return ([], [])
700
701 | wantSpecImport dflags unfolding
702 , Just rhs <- maybeUnfoldingTemplate unfolding
703 = do { -- Get rules from the external package state
704 -- We keep doing this in case we "page-fault in"
705 -- more rules as we go along
706 ; hsc_env <- getHscEnv
707 ; eps <- liftIO $ hscEPS hsc_env
708 ; vis_orphs <- getVisibleOrphanMods
709 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
710 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
711
712 ; (rules1, spec_pairs, uds)
713 <- -- pprTrace "specImport1" (vcat [ppr fn, ppr calls_for_fn, ppr rhs]) $
714 runSpecM dflags this_mod $
715 specCalls (Just this_mod) top_env rules_for_fn calls_for_fn fn rhs
716 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
717 -- After the rules kick in we may get recursion, but
718 -- we rely on a global GlomBinds to sort that out later
719 -- See Note [Glom the bindings if imported functions are specialised]
720
721 -- Now specialise any cascaded calls
722 ; (rules2, spec_binds2) <- -- pprTrace "specImport 2" (ppr fn $$ ppr rules1 $$ ppr spec_binds1) $
723 specImports dflags this_mod top_env
724 (extendVarSet done fn)
725 (fn:callers)
726 (extendRuleBaseList rb rules1)
727 uds
728
729 ; let final_binds = spec_binds2 ++ spec_binds1
730
731 ; return (rules2 ++ rules1, final_binds) }
732
733 | warnMissingSpecs dflags callers
734 = do { warnMsg (vcat [ hang (text "Could not specialise imported function" <+> quotes (ppr fn))
735 2 (vcat [ text "when specialising" <+> quotes (ppr caller)
736 | caller <- callers])
737 , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
738 , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ])
739 ; return ([], []) }
740
741 | otherwise
742 = return ([], [])
743 where
744 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
745
746 warnMissingSpecs :: DynFlags -> [Id] -> Bool
747 -- See Note [Warning about missed specialisations]
748 warnMissingSpecs dflags callers
749 | wopt Opt_WarnAllMissedSpecs dflags = True
750 | not (wopt Opt_WarnMissedSpecs dflags) = False
751 | null callers = False
752 | otherwise = all has_inline_prag callers
753 where
754 has_inline_prag id = isAnyInlinePragma (idInlinePragma id)
755
756 wantSpecImport :: DynFlags -> Unfolding -> Bool
757 -- See Note [Specialise imported INLINABLE things]
758 wantSpecImport dflags unf
759 = case unf of
760 NoUnfolding -> False
761 BootUnfolding -> False
762 OtherCon {} -> False
763 DFunUnfolding {} -> True
764 CoreUnfolding { uf_src = src, uf_guidance = _guidance }
765 | gopt Opt_SpecialiseAggressively dflags -> True
766 | isStableSource src -> True
767 -- Specialise even INLINE things; it hasn't inlined yet,
768 -- so perhaps it never will. Moreover it may have calls
769 -- inside it that we want to specialise
770 | otherwise -> False -- Stable, not INLINE, hence INLINABLE
771
772 {- Note [Warning about missed specialisations]
773 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
774 Suppose
775 * In module Lib, you carefully mark a function 'foo' INLINABLE
776 * Import Lib(foo) into another module M
777 * Call 'foo' at some specialised type in M
778 Then you jolly well expect it to be specialised in M. But what if
779 'foo' calls another function 'Lib.bar'. Then you'd like 'bar' to be
780 specialised too. But if 'bar' is not marked INLINABLE it may well
781 not be specialised. The warning Opt_WarnMissedSpecs warns about this.
782
783 It's more noisy to warning about a missed specialisation opportunity
784 for /every/ overloaded imported function, but sometimes useful. That
785 is what Opt_WarnAllMissedSpecs does.
786
787 ToDo: warn about missed opportunities for local functions.
788
789 Note [Specialise imported INLINABLE things]
790 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
791 What imported functions do we specialise? The basic set is
792 * DFuns and things with INLINABLE pragmas.
793 but with -fspecialise-aggressively we add
794 * Anything with an unfolding template
795
796 Trac #8874 has a good example of why we want to auto-specialise DFuns.
797
798 We have the -fspecialise-aggressively flag (usually off), because we
799 risk lots of orphan modules from over-vigorous specialisation.
800 However it's not a big deal: anything non-recursive with an
801 unfolding-template will probably have been inlined already.
802
803 Note [Glom the bindings if imported functions are specialised]
804 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
805 Suppose we have an imported, *recursive*, INLINABLE function
806 f :: Eq a => a -> a
807 f = /\a \d x. ...(f a d)...
808 In the module being compiled we have
809 g x = f (x::Int)
810 Now we'll make a specialised function
811 f_spec :: Int -> Int
812 f_spec = \x -> ...(f Int dInt)...
813 {-# RULE f Int _ = f_spec #-}
814 g = \x. f Int dInt x
815 Note that f_spec doesn't look recursive
816 After rewriting with the RULE, we get
817 f_spec = \x -> ...(f_spec)...
818 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
819 The occurrence analyser never turns a NonRec into a Rec. So we must
820 make sure that f_spec is recursive. Easiest thing is to make all
821 the specialisations for imported bindings recursive.
822
823
824 Note [Avoiding recursive specialisation]
825 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
826 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
827 'f's RHS. So we want to specialise g,h. But we don't want to
828 specialise f any more! It's possible that f's RHS might have a
829 recursive yet-more-specialised call, so we'd diverge in that case.
830 And if the call is to the same type, one specialisation is enough.
831 Avoiding this recursive specialisation loop is the reason for the
832 'done' VarSet passed to specImports and specImport.
833
834 ************************************************************************
835 * *
836 \subsubsection{@specExpr@: the main function}
837 * *
838 ************************************************************************
839 -}
840
841 data SpecEnv
842 = SE { se_subst :: CoreSubst.Subst
843 -- We carry a substitution down:
844 -- a) we must clone any binding that might float outwards,
845 -- to avoid name clashes
846 -- b) we carry a type substitution to use when analysing
847 -- the RHS of specialised bindings (no type-let!)
848
849
850 , se_interesting :: VarSet
851 -- Dict Ids that we know something about
852 -- and hence may be worth specialising against
853 -- See Note [Interesting dictionary arguments]
854 }
855
856 specVar :: SpecEnv -> Id -> CoreExpr
857 specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v
858
859 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
860
861 ---------------- First the easy cases --------------------
862 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
863 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
864 specExpr env (Var v) = return (specVar env v, emptyUDs)
865 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
866 specExpr env (Cast e co)
867 = do { (e', uds) <- specExpr env e
868 ; return ((mkCast e' (substCo env co)), uds) }
869 specExpr env (Tick tickish body)
870 = do { (body', uds) <- specExpr env body
871 ; return (Tick (specTickish env tickish) body', uds) }
872
873 ---------------- Applications might generate a call instance --------------------
874 specExpr env expr@(App {})
875 = go expr []
876 where
877 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
878 (fun', uds_app) <- go fun (arg':args)
879 return (App fun' arg', uds_arg `plusUDs` uds_app)
880
881 go (Var f) args = case specVar env f of
882 Var f' -> return (Var f', mkCallUDs env f' args)
883 e' -> return (e', emptyUDs) -- I don't expect this!
884 go other _ = specExpr env other
885
886 ---------------- Lambda/case require dumping of usage details --------------------
887 specExpr env e@(Lam _ _) = do
888 (body', uds) <- specExpr env' body
889 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
890 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
891 where
892 (bndrs, body) = collectBinders e
893 (env', bndrs') = substBndrs env bndrs
894 -- More efficient to collect a group of binders together all at once
895 -- and we don't want to split a lambda group with dumped bindings
896
897 specExpr env (Case scrut case_bndr ty alts)
898 = do { (scrut', scrut_uds) <- specExpr env scrut
899 ; (scrut'', case_bndr', alts', alts_uds)
900 <- specCase env scrut' case_bndr alts
901 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
902 , scrut_uds `plusUDs` alts_uds) }
903
904 ---------------- Finally, let is the interesting case --------------------
905 specExpr env (Let bind body)
906 = do { -- Clone binders
907 (rhs_env, body_env, bind') <- cloneBindSM env bind
908
909 -- Deal with the body
910 ; (body', body_uds) <- specExpr body_env body
911
912 -- Deal with the bindings
913 ; (binds', uds) <- specBind rhs_env bind' body_uds
914
915 -- All done
916 ; return (foldr Let body' binds', uds) }
917
918 specTickish :: SpecEnv -> Tickish Id -> Tickish Id
919 specTickish env (Breakpoint ix ids)
920 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
921 -- drop vars from the list if they have a non-variable substitution.
922 -- should never happen, but it's harmless to drop them anyway.
923 specTickish _ other_tickish = other_tickish
924
925 specCase :: SpecEnv
926 -> CoreExpr -- Scrutinee, already done
927 -> Id -> [CoreAlt]
928 -> SpecM ( CoreExpr -- New scrutinee
929 , Id
930 , [CoreAlt]
931 , UsageDetails)
932 specCase env scrut' case_bndr [(con, args, rhs)]
933 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
934 , interestingDict env scrut'
935 , not (isDeadBinder case_bndr && null sc_args')
936 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
937
938 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
939 [(con, args', Var sc_arg')]
940 | sc_arg' <- sc_args' ]
941
942 -- Extend the substitution for RHS to map the *original* binders
943 -- to their floated versions.
944 mb_sc_flts :: [Maybe DictId]
945 mb_sc_flts = map (lookupVarEnv clone_env) args'
946 clone_env = zipVarEnv sc_args' sc_args_flt
947 subst_prs = (case_bndr, Var case_bndr_flt)
948 : [ (arg, Var sc_flt)
949 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
950 env_rhs' = env_rhs { se_subst = CoreSubst.extendIdSubstList (se_subst env_rhs) subst_prs
951 , se_interesting = se_interesting env_rhs `extendVarSetList`
952 (case_bndr_flt : sc_args_flt) }
953
954 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
955 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
956 case_bndr_set = unitVarSet case_bndr_flt
957 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
958 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
959 flt_binds = scrut_bind : sc_binds
960 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
961 all_uds = flt_binds `addDictBinds` free_uds
962 alt' = (con, args', wrapDictBindsE dumped_dbs rhs')
963 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
964 where
965 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
966 sc_args' = filter is_flt_sc_arg args'
967
968 clone_me bndr = do { uniq <- getUniqueM
969 ; return (mkUserLocalOrCoVar occ uniq ty loc) }
970 where
971 name = idName bndr
972 ty = idType bndr
973 occ = nameOccName name
974 loc = getSrcSpan name
975
976 arg_set = mkVarSet args'
977 is_flt_sc_arg var = isId var
978 && not (isDeadBinder var)
979 && isDictTy var_ty
980 && not (tyCoVarsOfType var_ty `intersectsVarSet` arg_set)
981 where
982 var_ty = idType var
983
984
985 specCase env scrut case_bndr alts
986 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
987 ; return (scrut, case_bndr', alts', uds_alts) }
988 where
989 (env_alt, case_bndr') = substBndr env case_bndr
990 spec_alt (con, args, rhs) = do
991 (rhs', uds) <- specExpr env_rhs rhs
992 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
993 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
994 where
995 (env_rhs, args') = substBndrs env_alt args
996
997 {-
998 Note [Floating dictionaries out of cases]
999 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1000 Consider
1001 g = \d. case d of { MkD sc ... -> ...(f sc)... }
1002 Naively we can't float d2's binding out of the case expression,
1003 because 'sc' is bound by the case, and that in turn means we can't
1004 specialise f, which seems a pity.
1005
1006 So we invert the case, by floating out a binding
1007 for 'sc_flt' thus:
1008 sc_flt = case d of { MkD sc ... -> sc }
1009 Now we can float the call instance for 'f'. Indeed this is just
1010 what'll happen if 'sc' was originally bound with a let binding,
1011 but case is more efficient, and necessary with equalities. So it's
1012 good to work with both.
1013
1014 You might think that this won't make any difference, because the
1015 call instance will only get nuked by the \d. BUT if 'g' itself is
1016 specialised, then transitively we should be able to specialise f.
1017
1018 In general, given
1019 case e of cb { MkD sc ... -> ...(f sc)... }
1020 we transform to
1021 let cb_flt = e
1022 sc_flt = case cb_flt of { MkD sc ... -> sc }
1023 in
1024 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1025
1026 The "_flt" things are the floated binds; we use the current substitution
1027 to substitute sc -> sc_flt in the RHS
1028
1029 ************************************************************************
1030 * *
1031 Dealing with a binding
1032 * *
1033 ************************************************************************
1034 -}
1035
1036 specBind :: SpecEnv -- Use this for RHSs
1037 -> CoreBind -- Binders are already cloned by cloneBindSM,
1038 -- but RHSs are un-processed
1039 -> UsageDetails -- Info on how the scope of the binding
1040 -> SpecM ([CoreBind], -- New bindings
1041 UsageDetails) -- And info to pass upstream
1042
1043 -- Returned UsageDetails:
1044 -- No calls for binders of this bind
1045 specBind rhs_env (NonRec fn rhs) body_uds
1046 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1047 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds fn rhs
1048
1049 ; let pairs = spec_defns ++ [(fn', rhs')]
1050 -- fn' mentions the spec_defns in its rules,
1051 -- so put the latter first
1052
1053 combined_uds = body_uds1 `plusUDs` rhs_uds
1054
1055 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1056
1057 final_binds :: [DictBind]
1058 -- See Note [From non-recursive to recursive]
1059 final_binds
1060 | not (isEmptyBag dump_dbs)
1061 , not (null spec_defns)
1062 = [recWithDumpedDicts pairs dump_dbs]
1063 | otherwise
1064 = [mkDB $ NonRec b r | (b,r) <- pairs]
1065 ++ bagToList dump_dbs
1066
1067 ; if float_all then
1068 -- Rather than discard the calls mentioning the bound variables
1069 -- we float this (dictionary) binding along with the others
1070 return ([], free_uds `snocDictBinds` final_binds)
1071 else
1072 -- No call in final_uds mentions bound variables,
1073 -- so we can just leave the binding here
1074 return (map fst final_binds, free_uds) }
1075
1076
1077 specBind rhs_env (Rec pairs) body_uds
1078 -- Note [Specialising a recursive group]
1079 = do { let (bndrs,rhss) = unzip pairs
1080 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1081 ; let scope_uds = body_uds `plusUDs` rhs_uds
1082 -- Includes binds and calls arising from rhss
1083
1084 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1085
1086 ; (bndrs3, spec_defns3, uds3)
1087 <- if null spec_defns1 -- Common case: no specialisation
1088 then return (bndrs1, [], uds1)
1089 else do { -- Specialisation occurred; do it again
1090 (bndrs2, spec_defns2, uds2)
1091 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1092 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1093
1094 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1095 final_bind = recWithDumpedDicts (spec_defns3 ++ zip bndrs3 rhss')
1096 dumped_dbs
1097
1098 ; if float_all then
1099 return ([], final_uds `snocDictBind` final_bind)
1100 else
1101 return ([fst final_bind], final_uds) }
1102
1103
1104 ---------------------------
1105 specDefns :: SpecEnv
1106 -> UsageDetails -- Info on how it is used in its scope
1107 -> [(OutId,InExpr)] -- The things being bound and their un-processed RHS
1108 -> SpecM ([OutId], -- Original Ids with RULES added
1109 [(OutId,OutExpr)], -- Extra, specialised bindings
1110 UsageDetails) -- Stuff to fling upwards from the specialised versions
1111
1112 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1113 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1114 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1115 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1116 -- But not vice versa (it's a fixpoint problem).
1117
1118 specDefns _env uds []
1119 = return ([], [], uds)
1120 specDefns env uds ((bndr,rhs):pairs)
1121 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1122 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1123 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1124
1125 ---------------------------
1126 specDefn :: SpecEnv
1127 -> UsageDetails -- Info on how it is used in its scope
1128 -> OutId -> InExpr -- The thing being bound and its un-processed RHS
1129 -> SpecM (Id, -- Original Id with added RULES
1130 [(Id,CoreExpr)], -- Extra, specialised bindings
1131 UsageDetails) -- Stuff to fling upwards from the specialised versions
1132
1133 specDefn env body_uds fn rhs
1134 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1135 rules_for_me = idCoreRules fn
1136 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1137 calls_for_me fn rhs
1138 ; return ( fn `addIdSpecialisations` rules
1139 , spec_defns
1140 , body_uds_without_me `plusUDs` spec_uds) }
1141 -- It's important that the `plusUDs` is this way
1142 -- round, because body_uds_without_me may bind
1143 -- dictionaries that are used in calls_for_me passed
1144 -- to specDefn. So the dictionary bindings in
1145 -- spec_uds may mention dictionaries bound in
1146 -- body_uds_without_me
1147
1148 ---------------------------
1149 specCalls :: Maybe Module -- Just this_mod => specialising imported fn
1150 -- Nothing => specialising local fn
1151 -> SpecEnv
1152 -> [CoreRule] -- Existing RULES for the fn
1153 -> [CallInfo]
1154 -> OutId -> InExpr
1155 -> SpecM SpecInfo -- New rules, specialised bindings, and usage details
1156
1157 -- This function checks existing rules, and does not create
1158 -- duplicate ones. So the caller does not need to do this filtering.
1159 -- See 'already_covered'
1160
1161 type SpecInfo = ( [CoreRule] -- Specialisation rules
1162 , [(Id,CoreExpr)] -- Specialised definition
1163 , UsageDetails ) -- Usage details from specialised RHSs
1164
1165 specCalls mb_mod env existing_rules calls_for_me fn rhs
1166 -- The first case is the interesting one
1167 | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas
1168 && rhs_bndrs1 `lengthAtLeast` n_dicts -- and enough dict args
1169 && notNull calls_for_me -- And there are some calls to specialise
1170 && not (isNeverActive (idInlineActivation fn))
1171 -- Don't specialise NOINLINE things
1172 -- See Note [Auto-specialisation and RULES]
1173
1174 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1175 -- See Note [Inline specialisation] for why we do not
1176 -- switch off specialisation for inline functions
1177
1178 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr existing_rules) $
1179 foldlM spec_call ([], [], emptyUDs) calls_for_me
1180
1181 | otherwise -- No calls or RHS doesn't fit our preconceptions
1182 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1183 text "Missed specialisation opportunity for"
1184 <+> ppr fn $$ _trace_doc )
1185 -- Note [Specialisation shape]
1186 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1187 return ([], [], emptyUDs)
1188 where
1189 _trace_doc = sep [ ppr rhs_tyvars, ppr n_tyvars
1190 , ppr rhs_bndrs, ppr n_dicts
1191 , ppr (idInlineActivation fn) ]
1192
1193 fn_type = idType fn
1194 fn_arity = idArity fn
1195 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1196 (tyvars, theta, _) = tcSplitSigmaTy fn_type
1197 n_tyvars = length tyvars
1198 n_dicts = length theta
1199 inl_prag = idInlinePragma fn
1200 inl_act = inlinePragmaActivation inl_prag
1201 is_local = isLocalId fn
1202
1203 -- Figure out whether the function has an INLINE pragma
1204 -- See Note [Inline specialisations]
1205
1206 (rhs_bndrs, rhs_body) = collectBindersPushingCo rhs
1207 -- See Note [Account for casts in binding]
1208 (rhs_tyvars, rhs_bndrs1) = span isTyVar rhs_bndrs
1209 (rhs_dict_ids, rhs_bndrs2) = splitAt n_dicts rhs_bndrs1
1210 body = mkLams rhs_bndrs2 rhs_body
1211 -- Glue back on the non-dict lambdas
1212
1213 in_scope = CoreSubst.substInScope (se_subst env)
1214
1215 already_covered :: DynFlags -> [CoreRule] -> [CoreExpr] -> Bool
1216 already_covered dflags new_rules args -- Note [Specialisations already covered]
1217 = isJust (lookupRule dflags (in_scope, realIdUnfolding)
1218 (const True) fn args
1219 (new_rules ++ existing_rules))
1220 -- NB: we look both in the new_rules (generated by this invocation
1221 -- of specCalls), and in existing_rules (passed in to specCalls)
1222
1223 mk_ty_args :: [Maybe Type] -> [TyVar] -> [CoreExpr]
1224 mk_ty_args [] poly_tvs
1225 = ASSERT( null poly_tvs ) []
1226 mk_ty_args (Nothing : call_ts) (poly_tv : poly_tvs)
1227 = Type (mkTyVarTy poly_tv) : mk_ty_args call_ts poly_tvs
1228 mk_ty_args (Just ty : call_ts) poly_tvs
1229 = Type ty : mk_ty_args call_ts poly_tvs
1230 mk_ty_args (Nothing : _) [] = panic "mk_ty_args"
1231
1232 ----------------------------------------------------------
1233 -- Specialise to one particular call pattern
1234 spec_call :: SpecInfo -- Accumulating parameter
1235 -> CallInfo -- Call instance
1236 -> SpecM SpecInfo
1237 spec_call spec_acc@(rules_acc, pairs_acc, uds_acc)
1238 (CI { ci_key = CallKey call_ts, ci_args = call_ds })
1239 = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts )
1240
1241 -- Suppose f's defn is f = /\ a b c -> \ d1 d2 -> rhs
1242 -- Suppose the call is for f [Just t1, Nothing, Just t3] [dx1, dx2]
1243
1244 -- Construct the new binding
1245 -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b -> rhs)
1246 -- PLUS the rule
1247 -- RULE "SPEC f" forall b d1' d2'. f b d1' d2' = f1 b
1248 -- In the rule, d1' and d2' are just wildcards, not used in the RHS
1249 -- PLUS the usage-details
1250 -- { d1' = dx1; d2' = dx2 }
1251 -- where d1', d2' are cloned versions of d1,d2, with the type substitution
1252 -- applied. These auxiliary bindings just avoid duplication of dx1, dx2
1253 --
1254 -- Note that the substitution is applied to the whole thing.
1255 -- This is convenient, but just slightly fragile. Notably:
1256 -- * There had better be no name clashes in a/b/c
1257 do { let
1258 -- poly_tyvars = [b] in the example above
1259 -- spec_tyvars = [a,c]
1260 -- ty_args = [t1,b,t3]
1261 spec_tv_binds = [(tv,ty) | (tv, Just ty) <- rhs_tyvars `zip` call_ts]
1262 env1 = extendTvSubstList env spec_tv_binds
1263 (rhs_env, poly_tyvars) = substBndrs env1
1264 [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
1265
1266 -- Clone rhs_dicts, including instantiating their types
1267 ; inst_dict_ids <- mapM (newDictBndr rhs_env) rhs_dict_ids
1268 ; let (rhs_env2, dx_binds, spec_dict_args)
1269 = bindAuxiliaryDicts rhs_env rhs_dict_ids call_ds inst_dict_ids
1270 ty_args = mk_ty_args call_ts poly_tyvars
1271 ev_args = map varToCoreExpr inst_dict_ids -- ev_args, ev_bndrs:
1272 ev_bndrs = exprsFreeIdsList ev_args -- See Note [Evidence foralls]
1273 rule_args = ty_args ++ ev_args
1274 rule_bndrs = poly_tyvars ++ ev_bndrs
1275
1276 ; dflags <- getDynFlags
1277 ; if already_covered dflags rules_acc rule_args
1278 then return spec_acc
1279 else -- pprTrace "spec_call" (vcat [ ppr _call_info, ppr fn, ppr rhs_dict_ids
1280 -- , text "rhs_env2" <+> ppr (se_subst rhs_env2)
1281 -- , ppr dx_binds ]) $
1282 do
1283 { -- Figure out the type of the specialised function
1284 let body_ty = applyTypeToArgs rhs fn_type rule_args
1285 (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted
1286 | isUnliftedType body_ty -- C.f. WwLib.mkWorkerArgs
1287 , not (isJoinId fn)
1288 = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [voidPrimId])
1289 | otherwise = (poly_tyvars, poly_tyvars)
1290 spec_id_ty = mkLamTypes lam_args body_ty
1291 join_arity_change = length app_args - length rule_args
1292 spec_join_arity | Just orig_join_arity <- isJoinId_maybe fn
1293 = Just (orig_join_arity + join_arity_change)
1294 | otherwise
1295 = Nothing
1296
1297 ; spec_f <- newSpecIdSM fn spec_id_ty spec_join_arity
1298 ; (spec_rhs, rhs_uds) <- specExpr rhs_env2 (mkLams lam_args body)
1299 ; this_mod <- getModule
1300 ; let
1301 -- The rule to put in the function's specialisation is:
1302 -- forall b, d1',d2'. f t1 b t3 d1' d2' = f1 b
1303 herald = case mb_mod of
1304 Nothing -- Specialising local fn
1305 -> text "SPEC"
1306 Just this_mod -- Specialising imported fn
1307 -> text "SPEC/" <> ppr this_mod
1308
1309 rule_name = mkFastString $ showSDoc dflags $
1310 herald <+> ftext (occNameFS (getOccName fn))
1311 <+> hsep (map ppr_call_key_ty call_ts)
1312 -- This name ends up in interface files, so use occNameString.
1313 -- Otherwise uniques end up there, making builds
1314 -- less deterministic (See #4012 comment:61 ff)
1315
1316 rule_wout_eta = mkRule
1317 this_mod
1318 True {- Auto generated -}
1319 is_local
1320 rule_name
1321 inl_act -- Note [Auto-specialisation and RULES]
1322 (idName fn)
1323 rule_bndrs
1324 rule_args
1325 (mkVarApps (Var spec_f) app_args)
1326
1327 spec_rule
1328 = case isJoinId_maybe fn of
1329 Just join_arity -> etaExpandToJoinPointRule join_arity
1330 rule_wout_eta
1331 Nothing -> rule_wout_eta
1332
1333 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1334 spec_uds = foldr consDictBind rhs_uds dx_binds
1335
1336 --------------------------------------
1337 -- Add a suitable unfolding if the spec_inl_prag says so
1338 -- See Note [Inline specialisations]
1339 (spec_inl_prag, spec_unf)
1340 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1341 = (neverInlinePragma, noUnfolding)
1342 -- See Note [Specialising imported functions] in OccurAnal
1343
1344 | InlinePragma { inl_inline = Inlinable } <- inl_prag
1345 = (inl_prag { inl_inline = NoUserInline }, noUnfolding)
1346
1347 | otherwise
1348 = (inl_prag, specUnfolding dflags poly_tyvars spec_app
1349 arity_decrease fn_unf)
1350
1351 arity_decrease = length spec_dict_args
1352 spec_app e = (e `mkApps` ty_args) `mkApps` spec_dict_args
1353
1354 --------------------------------------
1355 -- Adding arity information just propagates it a bit faster
1356 -- See Note [Arity decrease] in Simplify
1357 -- Copy InlinePragma information from the parent Id.
1358 -- So if f has INLINE[1] so does spec_f
1359 spec_f_w_arity = spec_f `setIdArity` max 0 (fn_arity - n_dicts)
1360 `setInlinePragma` spec_inl_prag
1361 `setIdUnfolding` spec_unf
1362 `asJoinId_maybe` spec_join_arity
1363
1364 ; return ( spec_rule : rules_acc
1365 , (spec_f_w_arity, spec_rhs) : pairs_acc
1366 , spec_uds `plusUDs` uds_acc
1367 ) } }
1368
1369 {- Note [Account for casts in binding]
1370 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1371 Consider
1372 f :: Eq a => a -> IO ()
1373 {-# INLINABLE f
1374 StableUnf = (/\a \(d:Eq a) (x:a). blah) |> g
1375 #-}
1376 f = ...
1377
1378 In f's stable unfolding we have done some modest simplification which
1379 has pushed the cast to the outside. (I wonder if this is the Right
1380 Thing, but it's what happens now; see SimplUtils Note [Casts and
1381 lambdas].) Now that stable unfolding must be specialised, so we want
1382 to push the cast back inside. It would be terrible if the cast
1383 defeated specialisation! Hence the use of collectBindersPushingCo.
1384
1385 Note [Evidence foralls]
1386 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1387 Suppose (Trac #12212) that we are specialising
1388 f :: forall a b. (Num a, F a ~ F b) => blah
1389 with a=b=Int. Then the RULE will be something like
1390 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1391 f Int Int d g = f_spec
1392 But both varToCoreExpr (when constructing the LHS args), and the
1393 simplifier (when simplifying the LHS args), will transform to
1394 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1395 f Int Int d <F Int> = f_spec
1396 by replacing g with Refl. So now 'g' is unbound, which results in a later
1397 crash. So we use Refl right off the bat, and do not forall-quantify 'g':
1398 * varToCoreExpr generates a Refl
1399 * exprsFreeIdsList returns the Ids bound by the args,
1400 which won't include g
1401
1402 You might wonder if this will match as often, but the simplifier replaces
1403 complicated Refl coercions with Refl pretty aggressively.
1404
1405 Note [Orphans and auto-generated rules]
1406 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1407 When we specialise an INLINABLE function, or when we have
1408 -fspecialise-aggressively, we auto-generate RULES that are orphans.
1409 We don't want to warn about these, or we'd generate a lot of warnings.
1410 Thus, we only warn about user-specified orphan rules.
1411
1412 Indeed, we don't even treat the module as an orphan module if it has
1413 auto-generated *rule* orphans. Orphan modules are read every time we
1414 compile, so they are pretty obtrusive and slow down every compilation,
1415 even non-optimised ones. (Reason: for type class instances it's a
1416 type correctness issue.) But specialisation rules are strictly for
1417 *optimisation* only so it's fine not to read the interface.
1418
1419 What this means is that a SPEC rules from auto-specialisation in
1420 module M will be used in other modules only if M.hi has been read for
1421 some other reason, which is actually pretty likely.
1422 -}
1423
1424 bindAuxiliaryDicts
1425 :: SpecEnv
1426 -> [DictId] -> [CoreExpr] -- Original dict bndrs, and the witnessing expressions
1427 -> [DictId] -- A cloned dict-id for each dict arg
1428 -> (SpecEnv, -- Substitute for all orig_dicts
1429 [DictBind], -- Auxiliary dict bindings
1430 [CoreExpr]) -- Witnessing expressions (all trivial)
1431 -- Bind any dictionary arguments to fresh names, to preserve sharing
1432 bindAuxiliaryDicts env@(SE { se_subst = subst, se_interesting = interesting })
1433 orig_dict_ids call_ds inst_dict_ids
1434 = (env', dx_binds, spec_dict_args)
1435 where
1436 (dx_binds, spec_dict_args) = go call_ds inst_dict_ids
1437 env' = env { se_subst = subst `CoreSubst.extendSubstList`
1438 (orig_dict_ids `zip` spec_dict_args)
1439 `CoreSubst.extendInScopeList` dx_ids
1440 , se_interesting = interesting `unionVarSet` interesting_dicts }
1441
1442 dx_ids = [dx_id | (NonRec dx_id _, _) <- dx_binds]
1443 interesting_dicts = mkVarSet [ dx_id | (NonRec dx_id dx, _) <- dx_binds
1444 , interestingDict env dx ]
1445 -- See Note [Make the new dictionaries interesting]
1446
1447 go :: [CoreExpr] -> [CoreBndr] -> ([DictBind], [CoreExpr])
1448 go [] _ = ([], [])
1449 go (dx:dxs) (dx_id:dx_ids)
1450 | exprIsTrivial dx = (dx_binds, dx : args)
1451 | otherwise = (mkDB (NonRec dx_id dx) : dx_binds, Var dx_id : args)
1452 where
1453 (dx_binds, args) = go dxs dx_ids
1454 -- In the first case extend the substitution but not bindings;
1455 -- in the latter extend the bindings but not the substitution.
1456 -- For the former, note that we bind the *original* dict in the substitution,
1457 -- overriding any d->dx_id binding put there by substBndrs
1458 go _ _ = pprPanic "bindAuxiliaryDicts" (ppr orig_dict_ids $$ ppr call_ds $$ ppr inst_dict_ids)
1459
1460 {-
1461 Note [Make the new dictionaries interesting]
1462 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1463 Important! We're going to substitute dx_id1 for d
1464 and we want it to look "interesting", else we won't gather *any*
1465 consequential calls. E.g.
1466 f d = ...g d....
1467 If we specialise f for a call (f (dfun dNumInt)), we'll get
1468 a consequent call (g d') with an auxiliary definition
1469 d' = df dNumInt
1470 We want that consequent call to look interesting
1471
1472
1473 Note [From non-recursive to recursive]
1474 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1475 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1476 have built a recursive knot
1477
1478 f a d x = <blah>
1479 MkUD { ud_binds = NonRec d7 (MkD ..f..)
1480 , ud_calls = ...(f T d7)... }
1481
1482 The we generate
1483
1484 Rec { fs x = <blah>[T/a, d7/d]
1485 f a d x = <blah>
1486 RULE f T _ = fs
1487 d7 = ...f... }
1488
1489 Here the recursion is only through the RULE.
1490
1491 However we definitely should /not/ make the Rec in this wildly common
1492 case:
1493 d = ...
1494 MkUD { ud_binds = NonRec d7 (...d...)
1495 , ud_calls = ...(f T d7)... }
1496
1497 Here we want simply to add d to the floats, giving
1498 MkUD { ud_binds = NonRec d (...)
1499 NonRec d7 (...d...)
1500 , ud_calls = ...(f T d7)... }
1501
1502 In general, we need only make this Rec if
1503 - there are some specialisations (spec_binds non-empty)
1504 - there are some dict_binds that depend on f (dump_dbs non-empty)
1505
1506 Note [Avoiding loops]
1507 ~~~~~~~~~~~~~~~~~~~~~
1508 When specialising /dictionary functions/ we must be very careful to
1509 avoid building loops. Here is an example that bit us badly: Trac #3591
1510
1511 class Eq a => C a
1512 instance Eq [a] => C [a]
1513
1514 This translates to
1515 dfun :: Eq [a] -> C [a]
1516 dfun a d = MkD a d (meth d)
1517
1518 d4 :: Eq [T] = <blah>
1519 d2 :: C [T] = dfun T d4
1520 d1 :: Eq [T] = $p1 d2
1521 d3 :: C [T] = dfun T d1
1522
1523 None of these definitions is recursive. What happened was that we
1524 generated a specialisation:
1525
1526 RULE forall d. dfun T d = dT :: C [T]
1527 dT = (MkD a d (meth d)) [T/a, d1/d]
1528 = MkD T d1 (meth d1)
1529
1530 But now we use the RULE on the RHS of d2, to get
1531
1532 d2 = dT = MkD d1 (meth d1)
1533 d1 = $p1 d2
1534
1535 and now d1 is bottom! The problem is that when specialising 'dfun' we
1536 should first dump "below" the binding all floated dictionary bindings
1537 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1538 placed below 'dfun', and thus unavailable to it when specialising
1539 'dfun'. That in turn means that the call (dfun T d1) must be
1540 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1541 d4 doesn't mention dfun.
1542
1543 Solution:
1544 Discard all calls that mention dictionaries that depend
1545 (directly or indirectly) on the dfun we are specialising.
1546 This is done by 'filterCalls'
1547
1548 --------------
1549 Here's another example, this time for an imported dfun, so the call
1550 to filterCalls is in specImports (Trac #13429). Suppose we have
1551 class Monoid v => C v a where ...
1552
1553 We start with a call
1554 f @ [Integer] @ Integer $fC[]Integer
1555
1556 Specialising call to 'f' gives dict bindings
1557 $dMonoid_1 :: Monoid [Integer]
1558 $dMonoid_1 = M.$p1C @ [Integer] $fC[]Integer
1559
1560 $dC_1 :: C [Integer] (Node [Integer] Integer)
1561 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
1562
1563 ...plus a recursive call to
1564 f @ [Integer] @ (Node [Integer] Integer) $dC_1
1565
1566 Specialising that call gives
1567 $dMonoid_2 :: Monoid [Integer]
1568 $dMonoid_2 = M.$p1C @ [Integer] $dC_1
1569
1570 $dC_2 :: C [Integer] (Node [Integer] Integer)
1571 $dC_2 = M.$fCvNode @ [Integer] $dMonoid_2
1572
1573 Now we have two calls to the imported function
1574 M.$fCvNode :: Monoid v => C v a
1575 M.$fCvNode @v @a m = C m some_fun
1576
1577 But we must /not/ use the call (M.$fCvNode @ [Integer] $dMonoid_2)
1578 for specialisation, else we get:
1579
1580 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
1581 $dMonoid_2 = M.$p1C @ [Integer] $dC_1
1582 $s$fCvNode = C $dMonoid_2 ...
1583 RULE M.$fCvNode [Integer] _ _ = $s$fCvNode
1584
1585 Now use the rule to rewrite the call in the RHS of $dC_1
1586 and we get a loop!
1587
1588 --------------
1589 Here's yet another example
1590
1591 class C a where { foo,bar :: [a] -> [a] }
1592
1593 instance C Int where
1594 foo x = r_bar x
1595 bar xs = reverse xs
1596
1597 r_bar :: C a => [a] -> [a]
1598 r_bar xs = bar (xs ++ xs)
1599
1600 That translates to:
1601
1602 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1603
1604 Rec { $fCInt :: C Int = MkC foo_help reverse
1605 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1606
1607 The call (r_bar $fCInt) mentions $fCInt,
1608 which mentions foo_help,
1609 which mentions r_bar
1610 But we DO want to specialise r_bar at Int:
1611
1612 Rec { $fCInt :: C Int = MkC foo_help reverse
1613 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1614
1615 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1616 RULE r_bar Int _ = r_bar_Int
1617
1618 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1619 }
1620
1621 Note that, because of its RULE, r_bar joins the recursive
1622 group. (In this case it'll unravel a short moment later.)
1623
1624
1625 Note [Specialising a recursive group]
1626 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1627 Consider
1628 let rec { f x = ...g x'...
1629 ; g y = ...f y'.... }
1630 in f 'a'
1631 Here we specialise 'f' at Char; but that is very likely to lead to
1632 a specialisation of 'g' at Char. We must do the latter, else the
1633 whole point of specialisation is lost.
1634
1635 But we do not want to keep iterating to a fixpoint, because in the
1636 presence of polymorphic recursion we might generate an infinite number
1637 of specialisations.
1638
1639 So we use the following heuristic:
1640 * Arrange the rec block in dependency order, so far as possible
1641 (the occurrence analyser already does this)
1642
1643 * Specialise it much like a sequence of lets
1644
1645 * Then go through the block a second time, feeding call-info from
1646 the RHSs back in the bottom, as it were
1647
1648 In effect, the ordering maxmimises the effectiveness of each sweep,
1649 and we do just two sweeps. This should catch almost every case of
1650 monomorphic recursion -- the exception could be a very knotted-up
1651 recursion with multiple cycles tied up together.
1652
1653 This plan is implemented in the Rec case of specBindItself.
1654
1655 Note [Specialisations already covered]
1656 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1657 We obviously don't want to generate two specialisations for the same
1658 argument pattern. There are two wrinkles
1659
1660 1. We do the already-covered test in specDefn, not when we generate
1661 the CallInfo in mkCallUDs. We used to test in the latter place, but
1662 we now iterate the specialiser somewhat, and the Id at the call site
1663 might therefore not have all the RULES that we can see in specDefn
1664
1665 2. What about two specialisations where the second is an *instance*
1666 of the first? If the more specific one shows up first, we'll generate
1667 specialisations for both. If the *less* specific one shows up first,
1668 we *don't* currently generate a specialisation for the more specific
1669 one. (See the call to lookupRule in already_covered.) Reasons:
1670 (a) lookupRule doesn't say which matches are exact (bad reason)
1671 (b) if the earlier specialisation is user-provided, it's
1672 far from clear that we should auto-specialise further
1673
1674 Note [Auto-specialisation and RULES]
1675 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1676 Consider:
1677 g :: Num a => a -> a
1678 g = ...
1679
1680 f :: (Int -> Int) -> Int
1681 f w = ...
1682 {-# RULE f g = 0 #-}
1683
1684 Suppose that auto-specialisation makes a specialised version of
1685 g::Int->Int That version won't appear in the LHS of the RULE for f.
1686 So if the specialisation rule fires too early, the rule for f may
1687 never fire.
1688
1689 It might be possible to add new rules, to "complete" the rewrite system.
1690 Thus when adding
1691 RULE forall d. g Int d = g_spec
1692 also add
1693 RULE f g_spec = 0
1694
1695 But that's a bit complicated. For now we ask the programmer's help,
1696 by *copying the INLINE activation pragma* to the auto-specialised
1697 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1698 will also not be active until phase 2. And that's what programmers
1699 should jolly well do anyway, even aside from specialisation, to ensure
1700 that g doesn't inline too early.
1701
1702 This in turn means that the RULE would never fire for a NOINLINE
1703 thing so not much point in generating a specialisation at all.
1704
1705 Note [Specialisation shape]
1706 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1707 We only specialise a function if it has visible top-level lambdas
1708 corresponding to its overloading. E.g. if
1709 f :: forall a. Eq a => ....
1710 then its body must look like
1711 f = /\a. \d. ...
1712
1713 Reason: when specialising the body for a call (f ty dexp), we want to
1714 substitute dexp for d, and pick up specialised calls in the body of f.
1715
1716 This doesn't always work. One example I came across was this:
1717 newtype Gen a = MkGen{ unGen :: Int -> a }
1718
1719 choose :: Eq a => a -> Gen a
1720 choose n = MkGen (\r -> n)
1721
1722 oneof = choose (1::Int)
1723
1724 It's a silly example, but we get
1725 choose = /\a. g `cast` co
1726 where choose doesn't have any dict arguments. Thus far I have not
1727 tried to fix this (wait till there's a real example).
1728
1729 Mind you, then 'choose' will be inlined (since RHS is trivial) so
1730 it doesn't matter. This comes up with single-method classes
1731
1732 class C a where { op :: a -> a }
1733 instance C a => C [a] where ....
1734 ==>
1735 $fCList :: C a => C [a]
1736 $fCList = $copList |> (...coercion>...)
1737 ....(uses of $fCList at particular types)...
1738
1739 So we suppress the WARN if the rhs is trivial.
1740
1741 Note [Inline specialisations]
1742 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1743 Here is what we do with the InlinePragma of the original function
1744 * Activation/RuleMatchInfo: both transferred to the
1745 specialised function
1746 * InlineSpec:
1747 (a) An INLINE pragma is transferred
1748 (b) An INLINABLE pragma is *not* transferred
1749
1750 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
1751 specialise the function at its call site, and arguably that's not so
1752 important for the specialised copies. BUT *pragma-directed*
1753 specialisation now takes place in the typechecker/desugarer, with
1754 manually specified INLINEs. The specialisation here is automatic.
1755 It'd be very odd if a function marked INLINE was specialised (because
1756 of some local use), and then forever after (including importing
1757 modules) the specialised version wasn't INLINEd. After all, the
1758 programmer said INLINE!
1759
1760 You might wonder why we specialise INLINE functions at all. After
1761 all they should be inlined, right? Two reasons:
1762
1763 * Even INLINE functions are sometimes not inlined, when they aren't
1764 applied to interesting arguments. But perhaps the type arguments
1765 alone are enough to specialise (even though the args are too boring
1766 to trigger inlining), and it's certainly better to call the
1767 specialised version.
1768
1769 * The RHS of an INLINE function might call another overloaded function,
1770 and we'd like to generate a specialised version of that function too.
1771 This actually happens a lot. Consider
1772 replicateM_ :: (Monad m) => Int -> m a -> m ()
1773 {-# INLINABLE replicateM_ #-}
1774 replicateM_ d x ma = ...
1775 The strictness analyser may transform to
1776 replicateM_ :: (Monad m) => Int -> m a -> m ()
1777 {-# INLINE replicateM_ #-}
1778 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
1779
1780 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
1781 {-# INLINABLE $wreplicateM_ #-}
1782 $wreplicateM_ = ...
1783 Now an importing module has a specialised call to replicateM_, say
1784 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
1785 This particular example had a huge effect on the call to replicateM_
1786 in nofib/shootout/n-body.
1787
1788 Why (b): discard INLINABLE pragmas? See Trac #4874 for persuasive examples.
1789 Suppose we have
1790 {-# INLINABLE f #-}
1791 f :: Ord a => [a] -> Int
1792 f xs = letrec f' = ...f'... in f'
1793 Then, when f is specialised and optimised we might get
1794 wgo :: [Int] -> Int#
1795 wgo = ...wgo...
1796 f_spec :: [Int] -> Int
1797 f_spec xs = case wgo xs of { r -> I# r }
1798 and we clearly want to inline f_spec at call sites. But if we still
1799 have the big, un-optimised of f (albeit specialised) captured in an
1800 INLINABLE pragma for f_spec, we won't get that optimisation.
1801
1802 So we simply drop INLINABLE pragmas when specialising. It's not really
1803 a complete solution; ignoring specialisation for now, INLINABLE functions
1804 don't get properly strictness analysed, for example. But it works well
1805 for examples involving specialisation, which is the dominant use of
1806 INLINABLE. See Trac #4874.
1807
1808
1809 ************************************************************************
1810 * *
1811 \subsubsection{UsageDetails and suchlike}
1812 * *
1813 ************************************************************************
1814 -}
1815
1816 data UsageDetails
1817 = MkUD {
1818 ud_binds :: !(Bag DictBind),
1819 -- See Note [Floated dictionary bindings]
1820 -- The order is important;
1821 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
1822 -- (Remember, Bags preserve order in GHC.)
1823
1824 ud_calls :: !CallDetails
1825
1826 -- INVARIANT: suppose bs = bindersOf ud_binds
1827 -- Then 'calls' may *mention* 'bs',
1828 -- but there should be no calls *for* bs
1829 }
1830
1831 -- | A 'DictBind' is a binding along with a cached set containing its free
1832 -- variables (both type variables and dictionaries)
1833 type DictBind = (CoreBind, VarSet)
1834
1835 {- Note [Floated dictionary bindings]
1836 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1837 We float out dictionary bindings for the reasons described under
1838 "Dictionary floating" above. But not /just/ dictionary bindings.
1839 Consider
1840
1841 f :: Eq a => blah
1842 f a d = rhs
1843
1844 $c== :: T -> T -> Bool
1845 $c== x y = ...
1846
1847 $df :: Eq T
1848 $df = Eq $c== ...
1849
1850 gurgle = ...(f @T $df)...
1851
1852 We gather the call info for (f @T $df), and we don't want to drop it
1853 when we come across the binding for $df. So we add $df to the floats
1854 and continue. But then we have to add $c== to the floats, and so on.
1855 These all float above the binding for 'f', and and now we can
1856 successfully specialise 'f'.
1857
1858 So the DictBinds in (ud_binds :: Bag DictBind) may contain
1859 non-dictionary bindings too.
1860 -}
1861
1862 instance Outputable UsageDetails where
1863 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
1864 = text "MkUD" <+> braces (sep (punctuate comma
1865 [text "binds" <+> equals <+> ppr dbs,
1866 text "calls" <+> equals <+> ppr calls]))
1867
1868 emptyUDs :: UsageDetails
1869 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv }
1870
1871 ------------------------------------------------------------
1872 type CallDetails = DIdEnv CallInfoSet
1873 -- The order of specialized binds and rules depends on how we linearize
1874 -- CallDetails, so to get determinism we must use a deterministic set here.
1875 -- See Note [Deterministic UniqFM] in UniqDFM
1876
1877 data CallInfoSet = CIS Id (Bag CallInfo)
1878 -- The list of types and dictionaries is guaranteed to
1879 -- match the type of f
1880 -- The Bag may contain duplicate calls (i.e. f @T and another f @T)
1881 -- These dups are eliminated by already_covered in specCalls
1882
1883 data CallInfo
1884 = CI { ci_key :: CallKey -- Type arguments
1885 , ci_args :: [DictExpr] -- Dictionary arguments
1886 , ci_fvs :: VarSet -- Free vars of the ci_key and ci_args
1887 -- call (including tyvars)
1888 -- [*not* include the main id itself, of course]
1889 }
1890
1891 newtype CallKey = CallKey [Maybe Type]
1892 -- Nothing => unconstrained type argument
1893
1894 type DictExpr = CoreExpr
1895
1896 ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet
1897 ciSetFilter p (CIS id a) = CIS id (filterBag p a)
1898
1899 instance Outputable CallInfoSet where
1900 ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
1901 2 (ppr map)
1902
1903 pprCallInfo :: Id -> CallInfo -> SDoc
1904 pprCallInfo fn (CI { ci_key = key })
1905 = ppr fn <+> ppr key
1906
1907 ppr_call_key_ty :: Maybe Type -> SDoc
1908 ppr_call_key_ty Nothing = char '_'
1909 ppr_call_key_ty (Just ty) = char '@' <+> pprParendType ty
1910
1911 instance Outputable CallKey where
1912 ppr (CallKey ts) = brackets (fsep (map ppr_call_key_ty ts))
1913
1914 instance Outputable CallInfo where
1915 ppr (CI { ci_key = key, ci_args = args, ci_fvs = fvs })
1916 = text "CI" <> braces (hsep [ ppr key, ppr args, ppr fvs ])
1917
1918 unionCalls :: CallDetails -> CallDetails -> CallDetails
1919 unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2
1920
1921 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
1922 unionCallInfoSet (CIS f calls1) (CIS _ calls2) =
1923 CIS f (calls1 `unionBags` calls2)
1924
1925 callDetailsFVs :: CallDetails -> VarSet
1926 callDetailsFVs calls =
1927 nonDetFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls
1928 -- It's OK to use nonDetFoldUDFM here because we forget the ordering
1929 -- immediately by converting to a nondeterministic set.
1930
1931 callInfoFVs :: CallInfoSet -> VarSet
1932 callInfoFVs (CIS _ call_info) =
1933 foldrBag (\(CI { ci_fvs = fv }) vs -> unionVarSet fv vs) emptyVarSet call_info
1934
1935 ------------------------------------------------------------
1936 singleCall :: Id -> [Maybe Type] -> [DictExpr] -> UsageDetails
1937 singleCall id tys dicts
1938 = MkUD {ud_binds = emptyBag,
1939 ud_calls = unitDVarEnv id $ CIS id $
1940 unitBag (CI { ci_key = CallKey tys
1941 , ci_args = dicts
1942 , ci_fvs = call_fvs }) }
1943 where
1944 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
1945 tys_fvs = tyCoVarsOfTypes (catMaybes tys)
1946 -- The type args (tys) are guaranteed to be part of the dictionary
1947 -- types, because they are just the constrained types,
1948 -- and the dictionary is therefore sure to be bound
1949 -- inside the binding for any type variables free in the type;
1950 -- hence it's safe to neglect tyvars free in tys when making
1951 -- the free-var set for this call
1952 -- BUT I don't trust this reasoning; play safe and include tys_fvs
1953 --
1954 -- We don't include the 'id' itself.
1955
1956 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
1957 mkCallUDs env f args
1958 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
1959 res
1960 where
1961 res = mkCallUDs' env f args
1962
1963 mkCallUDs' env f args
1964 | not (want_calls_for f) -- Imported from elsewhere
1965 || null theta -- Not overloaded
1966 = emptyUDs
1967
1968 | not (all type_determines_value theta)
1969 || not (spec_tys `lengthIs` n_tyvars)
1970 || not ( dicts `lengthIs` n_dicts)
1971 || not (any (interestingDict env) dicts) -- Note [Interesting dictionary arguments]
1972 -- See also Note [Specialisations already covered]
1973 = -- pprTrace "mkCallUDs: discarding" _trace_doc
1974 emptyUDs -- Not overloaded, or no specialisation wanted
1975
1976 | otherwise
1977 = -- pprTrace "mkCallUDs: keeping" _trace_doc
1978 singleCall f spec_tys dicts
1979 where
1980 _trace_doc = vcat [ppr f, ppr args, ppr n_tyvars, ppr n_dicts
1981 , ppr (map (interestingDict env) dicts)]
1982 (tyvars, theta, _) = tcSplitSigmaTy (idType f)
1983 constrained_tyvars = tyCoVarsOfTypes theta
1984 n_tyvars = length tyvars
1985 n_dicts = length theta
1986
1987 spec_tys = [mk_spec_ty tv ty | (tv, ty) <- tyvars `type_zip` args]
1988 dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
1989
1990 -- ignores Coercion arguments
1991 type_zip :: [TyVar] -> [CoreExpr] -> [(TyVar, Type)]
1992 type_zip tvs (Coercion _ : args) = type_zip tvs args
1993 type_zip (tv:tvs) (Type ty : args) = (tv, ty) : type_zip tvs args
1994 type_zip _ _ = []
1995
1996 mk_spec_ty tyvar ty
1997 | tyvar `elemVarSet` constrained_tyvars = Just ty
1998 | otherwise = Nothing
1999
2000 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
2001 -- For imported things, we gather call instances if
2002 -- there is an unfolding that we could in principle specialise
2003 -- We might still decide not to use it (consulting dflags)
2004 -- in specImports
2005 -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
2006
2007 type_determines_value pred -- See Note [Type determines value]
2008 = case classifyPredType pred of
2009 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs
2010 EqPred {} -> True
2011 IrredPred {} -> True -- Things like (D []) where D is a
2012 -- Constraint-ranged family; Trac #7785
2013 ForAllPred {} -> True
2014
2015 {-
2016 Note [Type determines value]
2017 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2018 Only specialise if all overloading is on non-IP *class* params,
2019 because these are the ones whose *type* determines their *value*. In
2020 parrticular, with implicit params, the type args *don't* say what the
2021 value of the implicit param is! See Trac #7101
2022
2023 However, consider
2024 type family D (v::*->*) :: Constraint
2025 type instance D [] = ()
2026 f :: D v => v Char -> Int
2027 If we see a call (f "foo"), we'll pass a "dictionary"
2028 () |> (g :: () ~ D [])
2029 and it's good to specialise f at this dictionary.
2030
2031 So the question is: can an implicit parameter "hide inside" a
2032 type-family constraint like (D a). Well, no. We don't allow
2033 type instance D Maybe = ?x:Int
2034 Hence the IrredPred case in type_determines_value.
2035 See Trac #7785.
2036
2037 Note [Interesting dictionary arguments]
2038 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2039 Consider this
2040 \a.\d:Eq a. let f = ... in ...(f d)...
2041 There really is not much point in specialising f wrt the dictionary d,
2042 because the code for the specialised f is not improved at all, because
2043 d is lambda-bound. We simply get junk specialisations.
2044
2045 What is "interesting"? Just that it has *some* structure. But what about
2046 variables?
2047
2048 * A variable might be imported, in which case its unfolding
2049 will tell us whether it has useful structure
2050
2051 * Local variables are cloned on the way down (to avoid clashes when
2052 we float dictionaries), and cloning drops the unfolding
2053 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
2054 nuisance to give them unfoldings. So we keep track of the
2055 "interesting" dictionaries as a VarSet in SpecEnv.
2056 We have to take care to put any new interesting dictionary
2057 bindings in the set.
2058
2059 We accidentally lost accurate tracking of local variables for a long
2060 time, because cloned variables don't have unfoldings. But makes a
2061 massive difference in a few cases, eg Trac #5113. For nofib as a
2062 whole it's only a small win: 2.2% improvement in allocation for ansi,
2063 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
2064 -}
2065
2066 interestingDict :: SpecEnv -> CoreExpr -> Bool
2067 -- A dictionary argument is interesting if it has *some* structure
2068 -- NB: "dictionary" arguments include constraints of all sorts,
2069 -- including equality constraints; hence the Coercion case
2070 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
2071 || isDataConWorkId v
2072 || v `elemVarSet` se_interesting env
2073 interestingDict _ (Type _) = False
2074 interestingDict _ (Coercion _) = False
2075 interestingDict env (App fn (Type _)) = interestingDict env fn
2076 interestingDict env (App fn (Coercion _)) = interestingDict env fn
2077 interestingDict env (Tick _ a) = interestingDict env a
2078 interestingDict env (Cast e _) = interestingDict env e
2079 interestingDict _ _ = True
2080
2081 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
2082 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
2083 (MkUD {ud_binds = db2, ud_calls = calls2})
2084 = MkUD { ud_binds = db1 `unionBags` db2
2085 , ud_calls = calls1 `unionCalls` calls2 }
2086
2087 -----------------------------
2088 _dictBindBndrs :: Bag DictBind -> [Id]
2089 _dictBindBndrs dbs = foldrBag ((++) . bindersOf . fst) [] dbs
2090
2091 -- | Construct a 'DictBind' from a 'CoreBind'
2092 mkDB :: CoreBind -> DictBind
2093 mkDB bind = (bind, bind_fvs bind)
2094
2095 -- | Identify the free variables of a 'CoreBind'
2096 bind_fvs :: CoreBind -> VarSet
2097 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
2098 bind_fvs (Rec prs) = foldl' delVarSet rhs_fvs bndrs
2099 where
2100 bndrs = map fst prs
2101 rhs_fvs = unionVarSets (map pair_fvs prs)
2102
2103 pair_fvs :: (Id, CoreExpr) -> VarSet
2104 pair_fvs (bndr, rhs) = exprSomeFreeVars interesting rhs
2105 `unionVarSet` idFreeVars bndr
2106 -- idFreeVars: don't forget variables mentioned in
2107 -- the rules of the bndr. C.f. OccAnal.addRuleUsage
2108 -- Also tyvars mentioned in its type; they may not appear
2109 -- in the RHS
2110 -- type T a = Int
2111 -- x :: T a = 3
2112 where
2113 interesting :: InterestingVarFun
2114 interesting v = isLocalVar v || (isId v && isDFunId v)
2115 -- Very important: include DFunIds /even/ if it is imported
2116 -- Reason: See Note [Avoiding loops], the second exmaple
2117 -- involving an imported dfun. We must know whether
2118 -- a dictionary binding depends on an imported dfun,
2119 -- in case we try to specialise that imported dfun
2120 -- Trac #13429 illustrates
2121
2122 -- | Flatten a set of "dumped" 'DictBind's, and some other binding
2123 -- pairs, into a single recursive binding.
2124 recWithDumpedDicts :: [(Id,CoreExpr)] -> Bag DictBind ->DictBind
2125 recWithDumpedDicts pairs dbs
2126 = (Rec bindings, fvs)
2127 where
2128 (bindings, fvs) = foldrBag add
2129 ([], emptyVarSet)
2130 (dbs `snocBag` mkDB (Rec pairs))
2131 add (NonRec b r, fvs') (pairs, fvs) =
2132 ((b,r) : pairs, fvs `unionVarSet` fvs')
2133 add (Rec prs1, fvs') (pairs, fvs) =
2134 (prs1 ++ pairs, fvs `unionVarSet` fvs')
2135
2136 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
2137 -- Add ud_binds to the tail end of the bindings in uds
2138 snocDictBinds uds dbs
2139 = uds { ud_binds = ud_binds uds `unionBags` listToBag dbs }
2140
2141 consDictBind :: DictBind -> UsageDetails -> UsageDetails
2142 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
2143
2144 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
2145 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
2146
2147 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
2148 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
2149
2150 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
2151 wrapDictBinds dbs binds
2152 = foldrBag add binds dbs
2153 where
2154 add (bind,_) binds = bind : binds
2155
2156 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
2157 wrapDictBindsE dbs expr
2158 = foldrBag add expr dbs
2159 where
2160 add (bind,_) expr = Let bind expr
2161
2162 ----------------------
2163 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
2164 -- Used at a lambda or case binder; just dump anything mentioning the binder
2165 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2166 | null bndrs = (uds, emptyBag) -- Common in case alternatives
2167 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2168 (free_uds, dump_dbs)
2169 where
2170 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2171 bndr_set = mkVarSet bndrs
2172 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2173 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
2174 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
2175 -- no calls for any of the dicts in dump_dbs
2176
2177 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
2178 -- Used at a let(rec) binding.
2179 -- We return a boolean indicating whether the binding itself is mentioned,
2180 -- directly or indirectly, by any of the ud_calls; in that case we want to
2181 -- float the binding itself;
2182 -- See Note [Floated dictionary bindings]
2183 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2184 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2185 (free_uds, dump_dbs, float_all)
2186 where
2187 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2188 bndr_set = mkVarSet bndrs
2189 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2190 free_calls = deleteCallsFor bndrs orig_calls
2191 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
2192
2193 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
2194 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2195 = -- pprTrace ("callsForMe")
2196 -- (vcat [ppr fn,
2197 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
2198 -- text "Orig calls =" <+> ppr orig_calls,
2199 -- text "Dep set =" <+> ppr dep_set,
2200 -- text "Calls for me =" <+> ppr calls_for_me]) $
2201 (uds_without_me, calls_for_me)
2202 where
2203 uds_without_me = MkUD { ud_binds = orig_dbs
2204 , ud_calls = delDVarEnv orig_calls fn }
2205 calls_for_me = case lookupDVarEnv orig_calls fn of
2206 Nothing -> []
2207 Just cis -> filterCalls cis orig_dbs
2208 -- filterCalls: drop calls that (directly or indirectly)
2209 -- refer to fn. See Note [Avoiding loops]
2210
2211 ----------------------
2212 filterCalls :: CallInfoSet -> Bag DictBind -> [CallInfo]
2213 -- See Note [Avoiding loops]
2214 filterCalls (CIS fn call_bag) dbs
2215 = filter ok_call (bagToList call_bag)
2216 where
2217 dump_set = foldlBag go (unitVarSet fn) dbs
2218 -- This dump-set could also be computed by splitDictBinds
2219 -- (_,_,dump_set) = splitDictBinds dbs {fn}
2220 -- But this variant is shorter
2221
2222 go so_far (db,fvs) | fvs `intersectsVarSet` so_far
2223 = extendVarSetList so_far (bindersOf db)
2224 | otherwise = so_far
2225
2226 ok_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` dump_set)
2227
2228 ----------------------
2229 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2230 -- splitDictBinds dbs bndrs returns
2231 -- (free_dbs, dump_dbs, dump_set)
2232 -- where
2233 -- * dump_dbs depends, transitively on bndrs
2234 -- * free_dbs does not depend on bndrs
2235 -- * dump_set = bndrs `union` bndrs(dump_dbs)
2236 splitDictBinds dbs bndr_set
2237 = foldlBag split_db (emptyBag, emptyBag, bndr_set) dbs
2238 -- Important that it's foldl not foldr;
2239 -- we're accumulating the set of dumped ids in dump_set
2240 where
2241 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
2242 | dump_idset `intersectsVarSet` fvs -- Dump it
2243 = (free_dbs, dump_dbs `snocBag` db,
2244 extendVarSetList dump_idset (bindersOf bind))
2245
2246 | otherwise -- Don't dump it
2247 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2248
2249
2250 ----------------------
2251 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2252 -- Remove calls *mentioning* bs in any way
2253 deleteCallsMentioning bs calls
2254 = mapDVarEnv (ciSetFilter keep_call) calls
2255 where
2256 keep_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` bs)
2257
2258 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
2259 -- Remove calls *for* bs
2260 deleteCallsFor bs calls = delDVarEnvList calls bs
2261
2262 {-
2263 ************************************************************************
2264 * *
2265 \subsubsection{Boring helper functions}
2266 * *
2267 ************************************************************************
2268 -}
2269
2270 newtype SpecM a = SpecM (State SpecState a)
2271
2272 data SpecState = SpecState {
2273 spec_uniq_supply :: UniqSupply,
2274 spec_module :: Module,
2275 spec_dflags :: DynFlags
2276 }
2277
2278 instance Functor SpecM where
2279 fmap = liftM
2280
2281 instance Applicative SpecM where
2282 pure x = SpecM $ return x
2283 (<*>) = ap
2284
2285 instance Monad SpecM where
2286 SpecM x >>= f = SpecM $ do y <- x
2287 case f y of
2288 SpecM z ->
2289 z
2290 #if !MIN_VERSION_base(4,13,0)
2291 fail = MonadFail.fail
2292 #endif
2293
2294 instance MonadFail.MonadFail SpecM where
2295 fail str = SpecM $ error str
2296
2297 instance MonadUnique SpecM where
2298 getUniqueSupplyM
2299 = SpecM $ do st <- get
2300 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
2301 put $ st { spec_uniq_supply = us2 }
2302 return us1
2303
2304 getUniqueM
2305 = SpecM $ do st <- get
2306 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
2307 put $ st { spec_uniq_supply = us' }
2308 return u
2309
2310 instance HasDynFlags SpecM where
2311 getDynFlags = SpecM $ liftM spec_dflags get
2312
2313 instance HasModule SpecM where
2314 getModule = SpecM $ liftM spec_module get
2315
2316 runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2317 runSpecM dflags this_mod (SpecM spec)
2318 = do us <- getUniqueSupplyM
2319 let initialState = SpecState {
2320 spec_uniq_supply = us,
2321 spec_module = this_mod,
2322 spec_dflags = dflags
2323 }
2324 return $ evalState spec initialState
2325
2326 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2327 mapAndCombineSM _ [] = return ([], emptyUDs)
2328 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2329 (ys, uds2) <- mapAndCombineSM f xs
2330 return (y:ys, uds1 `plusUDs` uds2)
2331
2332 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2333 extendTvSubstList env tv_binds
2334 = env { se_subst = CoreSubst.extendTvSubstList (se_subst env) tv_binds }
2335
2336 substTy :: SpecEnv -> Type -> Type
2337 substTy env ty = CoreSubst.substTy (se_subst env) ty
2338
2339 substCo :: SpecEnv -> Coercion -> Coercion
2340 substCo env co = CoreSubst.substCo (se_subst env) co
2341
2342 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2343 substBndr env bs = case CoreSubst.substBndr (se_subst env) bs of
2344 (subst', bs') -> (env { se_subst = subst' }, bs')
2345
2346 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2347 substBndrs env bs = case CoreSubst.substBndrs (se_subst env) bs of
2348 (subst', bs') -> (env { se_subst = subst' }, bs')
2349
2350 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2351 -- Clone the binders of the bind; return new bind with the cloned binders
2352 -- Return the substitution to use for RHSs, and the one to use for the body
2353 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2354 = do { us <- getUniqueSupplyM
2355 ; let (subst', bndr') = CoreSubst.cloneIdBndr subst us bndr
2356 interesting' | interestingDict env rhs
2357 = interesting `extendVarSet` bndr'
2358 | otherwise = interesting
2359 ; return (env, env { se_subst = subst', se_interesting = interesting' }
2360 , NonRec bndr' rhs) }
2361
2362 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2363 = do { us <- getUniqueSupplyM
2364 ; let (subst', bndrs') = CoreSubst.cloneRecIdBndrs subst us (map fst pairs)
2365 env' = env { se_subst = subst'
2366 , se_interesting = interesting `extendVarSetList`
2367 [ v | (v,r) <- pairs, interestingDict env r ] }
2368 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2369
2370 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2371 -- Make up completely fresh binders for the dictionaries
2372 -- Their bindings are going to float outwards
2373 newDictBndr env b = do { uniq <- getUniqueM
2374 ; let n = idName b
2375 ty' = substTy env (idType b)
2376 ; return (mkUserLocalOrCoVar (nameOccName n) uniq ty' (getSrcSpan n)) }
2377
2378 newSpecIdSM :: Id -> Type -> Maybe JoinArity -> SpecM Id
2379 -- Give the new Id a similar occurrence name to the old one
2380 newSpecIdSM old_id new_ty join_arity_maybe
2381 = do { uniq <- getUniqueM
2382 ; let name = idName old_id
2383 new_occ = mkSpecOcc (nameOccName name)
2384 new_id = mkUserLocalOrCoVar new_occ uniq new_ty (getSrcSpan name)
2385 `asJoinId_maybe` join_arity_maybe
2386 ; return new_id }
2387
2388 {-
2389 Old (but interesting) stuff about unboxed bindings
2390 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2391
2392 What should we do when a value is specialised to a *strict* unboxed value?
2393
2394 map_*_* f (x:xs) = let h = f x
2395 t = map f xs
2396 in h:t
2397
2398 Could convert let to case:
2399
2400 map_*_Int# f (x:xs) = case f x of h# ->
2401 let t = map f xs
2402 in h#:t
2403
2404 This may be undesirable since it forces evaluation here, but the value
2405 may not be used in all branches of the body. In the general case this
2406 transformation is impossible since the mutual recursion in a letrec
2407 cannot be expressed as a case.
2408
2409 There is also a problem with top-level unboxed values, since our
2410 implementation cannot handle unboxed values at the top level.
2411
2412 Solution: Lift the binding of the unboxed value and extract it when it
2413 is used:
2414
2415 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2416 t = map f xs
2417 in case h of
2418 _Lift h# -> h#:t
2419
2420 Now give it to the simplifier and the _Lifting will be optimised away.
2421
2422 The benefit is that we have given the specialised "unboxed" values a
2423 very simple lifted semantics and then leave it up to the simplifier to
2424 optimise it --- knowing that the overheads will be removed in nearly
2425 all cases.
2426
2427 In particular, the value will only be evaluated in the branches of the
2428 program which use it, rather than being forced at the point where the
2429 value is bound. For example:
2430
2431 filtermap_*_* p f (x:xs)
2432 = let h = f x
2433 t = ...
2434 in case p x of
2435 True -> h:t
2436 False -> t
2437 ==>
2438 filtermap_*_Int# p f (x:xs)
2439 = let h = case (f x) of h# -> _Lift h#
2440 t = ...
2441 in case p x of
2442 True -> case h of _Lift h#
2443 -> h#:t
2444 False -> t
2445
2446 The binding for h can still be inlined in the one branch and the
2447 _Lifting eliminated.
2448
2449
2450 Question: When won't the _Lifting be eliminated?
2451
2452 Answer: When they at the top-level (where it is necessary) or when
2453 inlining would duplicate work (or possibly code depending on
2454 options). However, the _Lifting will still be eliminated if the
2455 strictness analyser deems the lifted binding strict.
2456 -}