Allow CSE'ing of work-wrapped bindings (#14186)
[ghc.git] / compiler / specialise / Specialise.hs
1 {-
2 (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
3
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
5 -}
6
7 {-# LANGUAGE CPP #-}
8 module Specialise ( specProgram, specUnfolding ) where
9
10 #include "HsVersions.h"
11
12 import Id
13 import TcType hiding( substTy )
14 import Type hiding( substTy, extendTvSubstList )
15 import Module( Module, HasModule(..) )
16 import Coercion( Coercion )
17 import CoreMonad
18 import qualified CoreSubst
19 import CoreUnfold
20 import Var ( isLocalVar )
21 import VarSet
22 import VarEnv
23 import CoreSyn
24 import Rules
25 import CoreOpt ( collectBindersPushingCo )
26 import CoreUtils ( exprIsTrivial, applyTypeToArgs, mkCast )
27 import CoreFVs
28 import FV ( InterestingVarFun )
29 import CoreArity ( etaExpandToJoinPointRule )
30 import UniqSupply
31 import Name
32 import MkId ( voidArgId, voidPrimId )
33 import Maybes ( catMaybes, isJust )
34 import MonadUtils ( foldlM )
35 import BasicTypes
36 import HscTypes
37 import Bag
38 import DynFlags
39 import Util
40 import Outputable
41 import FastString
42 import State
43 import UniqDFM
44
45 import Control.Monad
46 import qualified Control.Monad.Fail as MonadFail
47
48 {-
49 ************************************************************************
50 * *
51 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
52 * *
53 ************************************************************************
54
55 These notes describe how we implement specialisation to eliminate
56 overloading.
57
58 The specialisation pass works on Core
59 syntax, complete with all the explicit dictionary application,
60 abstraction and construction as added by the type checker. The
61 existing type checker remains largely as it is.
62
63 One important thought: the {\em types} passed to an overloaded
64 function, and the {\em dictionaries} passed are mutually redundant.
65 If the same function is applied to the same type(s) then it is sure to
66 be applied to the same dictionary(s)---or rather to the same {\em
67 values}. (The arguments might look different but they will evaluate
68 to the same value.)
69
70 Second important thought: we know that we can make progress by
71 treating dictionary arguments as static and worth specialising on. So
72 we can do without binding-time analysis, and instead specialise on
73 dictionary arguments and no others.
74
75 The basic idea
76 ~~~~~~~~~~~~~~
77 Suppose we have
78
79 let f = <f_rhs>
80 in <body>
81
82 and suppose f is overloaded.
83
84 STEP 1: CALL-INSTANCE COLLECTION
85
86 We traverse <body>, accumulating all applications of f to types and
87 dictionaries.
88
89 (Might there be partial applications, to just some of its types and
90 dictionaries? In principle yes, but in practice the type checker only
91 builds applications of f to all its types and dictionaries, so partial
92 applications could only arise as a result of transformation, and even
93 then I think it's unlikely. In any case, we simply don't accumulate such
94 partial applications.)
95
96
97 STEP 2: EQUIVALENCES
98
99 So now we have a collection of calls to f:
100 f t1 t2 d1 d2
101 f t3 t4 d3 d4
102 ...
103 Notice that f may take several type arguments. To avoid ambiguity, we
104 say that f is called at type t1/t2 and t3/t4.
105
106 We take equivalence classes using equality of the *types* (ignoring
107 the dictionary args, which as mentioned previously are redundant).
108
109 STEP 3: SPECIALISATION
110
111 For each equivalence class, choose a representative (f t1 t2 d1 d2),
112 and create a local instance of f, defined thus:
113
114 f@t1/t2 = <f_rhs> t1 t2 d1 d2
115
116 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
117 of simplification will now result. However we don't actually *do* that
118 simplification. Rather, we leave it for the simplifier to do. If we
119 *did* do it, though, we'd get more call instances from the specialised
120 RHS. We can work out what they are by instantiating the call-instance
121 set from f's RHS with the types t1, t2.
122
123 Add this new id to f's IdInfo, to record that f has a specialised version.
124
125 Before doing any of this, check that f's IdInfo doesn't already
126 tell us about an existing instance of f at the required type/s.
127 (This might happen if specialisation was applied more than once, or
128 it might arise from user SPECIALIZE pragmas.)
129
130 Recursion
131 ~~~~~~~~~
132 Wait a minute! What if f is recursive? Then we can't just plug in
133 its right-hand side, can we?
134
135 But it's ok. The type checker *always* creates non-recursive definitions
136 for overloaded recursive functions. For example:
137
138 f x = f (x+x) -- Yes I know its silly
139
140 becomes
141
142 f a (d::Num a) = let p = +.sel a d
143 in
144 letrec fl (y::a) = fl (p y y)
145 in
146 fl
147
148 We still have recursion for non-overloaded functions which we
149 specialise, but the recursive call should get specialised to the
150 same recursive version.
151
152
153 Polymorphism 1
154 ~~~~~~~~~~~~~~
155
156 All this is crystal clear when the function is applied to *constant
157 types*; that is, types which have no type variables inside. But what if
158 it is applied to non-constant types? Suppose we find a call of f at type
159 t1/t2. There are two possibilities:
160
161 (a) The free type variables of t1, t2 are in scope at the definition point
162 of f. In this case there's no problem, we proceed just as before. A common
163 example is as follows. Here's the Haskell:
164
165 g y = let f x = x+x
166 in f y + f y
167
168 After typechecking we have
169
170 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
171 in +.sel a d (f a d y) (f a d y)
172
173 Notice that the call to f is at type type "a"; a non-constant type.
174 Both calls to f are at the same type, so we can specialise to give:
175
176 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
177 in +.sel a d (f@a y) (f@a y)
178
179
180 (b) The other case is when the type variables in the instance types
181 are *not* in scope at the definition point of f. The example we are
182 working with above is a good case. There are two instances of (+.sel a d),
183 but "a" is not in scope at the definition of +.sel. Can we do anything?
184 Yes, we can "common them up", a sort of limited common sub-expression deal.
185 This would give:
186
187 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
188 f@a (x::a) = +.sel@a x x
189 in +.sel@a (f@a y) (f@a y)
190
191 This can save work, and can't be spotted by the type checker, because
192 the two instances of +.sel weren't originally at the same type.
193
194 Further notes on (b)
195
196 * There are quite a few variations here. For example, the defn of
197 +.sel could be floated ouside the \y, to attempt to gain laziness.
198 It certainly mustn't be floated outside the \d because the d has to
199 be in scope too.
200
201 * We don't want to inline f_rhs in this case, because
202 that will duplicate code. Just commoning up the call is the point.
203
204 * Nothing gets added to +.sel's IdInfo.
205
206 * Don't bother unless the equivalence class has more than one item!
207
208 Not clear whether this is all worth it. It is of course OK to
209 simply discard call-instances when passing a big lambda.
210
211 Polymorphism 2 -- Overloading
212 ~~~~~~~~~~~~~~
213 Consider a function whose most general type is
214
215 f :: forall a b. Ord a => [a] -> b -> b
216
217 There is really no point in making a version of g at Int/Int and another
218 at Int/Bool, because it's only instantiating the type variable "a" which
219 buys us any efficiency. Since g is completely polymorphic in b there
220 ain't much point in making separate versions of g for the different
221 b types.
222
223 That suggests that we should identify which of g's type variables
224 are constrained (like "a") and which are unconstrained (like "b").
225 Then when taking equivalence classes in STEP 2, we ignore the type args
226 corresponding to unconstrained type variable. In STEP 3 we make
227 polymorphic versions. Thus:
228
229 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
230
231 We do this.
232
233
234 Dictionary floating
235 ~~~~~~~~~~~~~~~~~~~
236 Consider this
237
238 f a (d::Num a) = let g = ...
239 in
240 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
241
242 Here, g is only called at one type, but the dictionary isn't in scope at the
243 definition point for g. Usually the type checker would build a
244 definition for d1 which enclosed g, but the transformation system
245 might have moved d1's defn inward. Solution: float dictionary bindings
246 outwards along with call instances.
247
248 Consider
249
250 f x = let g p q = p==q
251 h r s = (r+s, g r s)
252 in
253 h x x
254
255
256 Before specialisation, leaving out type abstractions we have
257
258 f df x = let g :: Eq a => a -> a -> Bool
259 g dg p q = == dg p q
260 h :: Num a => a -> a -> (a, Bool)
261 h dh r s = let deq = eqFromNum dh
262 in (+ dh r s, g deq r s)
263 in
264 h df x x
265
266 After specialising h we get a specialised version of h, like this:
267
268 h' r s = let deq = eqFromNum df
269 in (+ df r s, g deq r s)
270
271 But we can't naively make an instance for g from this, because deq is not in scope
272 at the defn of g. Instead, we have to float out the (new) defn of deq
273 to widen its scope. Notice that this floating can't be done in advance -- it only
274 shows up when specialisation is done.
275
276 User SPECIALIZE pragmas
277 ~~~~~~~~~~~~~~~~~~~~~~~
278 Specialisation pragmas can be digested by the type checker, and implemented
279 by adding extra definitions along with that of f, in the same way as before
280
281 f@t1/t2 = <f_rhs> t1 t2 d1 d2
282
283 Indeed the pragmas *have* to be dealt with by the type checker, because
284 only it knows how to build the dictionaries d1 and d2! For example
285
286 g :: Ord a => [a] -> [a]
287 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
288
289 Here, the specialised version of g is an application of g's rhs to the
290 Ord dictionary for (Tree Int), which only the type checker can conjure
291 up. There might not even *be* one, if (Tree Int) is not an instance of
292 Ord! (All the other specialision has suitable dictionaries to hand
293 from actual calls.)
294
295 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
296 it is buried in a complex (as-yet-un-desugared) binding group.
297 Maybe we should say
298
299 f@t1/t2 = f* t1 t2 d1 d2
300
301 where f* is the Id f with an IdInfo which says "inline me regardless!".
302 Indeed all the specialisation could be done in this way.
303 That in turn means that the simplifier has to be prepared to inline absolutely
304 any in-scope let-bound thing.
305
306
307 Again, the pragma should permit polymorphism in unconstrained variables:
308
309 h :: Ord a => [a] -> b -> b
310 {-# SPECIALIZE h :: [Int] -> b -> b #-}
311
312 We *insist* that all overloaded type variables are specialised to ground types,
313 (and hence there can be no context inside a SPECIALIZE pragma).
314 We *permit* unconstrained type variables to be specialised to
315 - a ground type
316 - or left as a polymorphic type variable
317 but nothing in between. So
318
319 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
320
321 is *illegal*. (It can be handled, but it adds complication, and gains the
322 programmer nothing.)
323
324
325 SPECIALISING INSTANCE DECLARATIONS
326 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
327 Consider
328
329 instance Foo a => Foo [a] where
330 ...
331 {-# SPECIALIZE instance Foo [Int] #-}
332
333 The original instance decl creates a dictionary-function
334 definition:
335
336 dfun.Foo.List :: forall a. Foo a -> Foo [a]
337
338 The SPECIALIZE pragma just makes a specialised copy, just as for
339 ordinary function definitions:
340
341 dfun.Foo.List@Int :: Foo [Int]
342 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
343
344 The information about what instance of the dfun exist gets added to
345 the dfun's IdInfo in the same way as a user-defined function too.
346
347
348 Automatic instance decl specialisation?
349 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
350 Can instance decls be specialised automatically? It's tricky.
351 We could collect call-instance information for each dfun, but
352 then when we specialised their bodies we'd get new call-instances
353 for ordinary functions; and when we specialised their bodies, we might get
354 new call-instances of the dfuns, and so on. This all arises because of
355 the unrestricted mutual recursion between instance decls and value decls.
356
357 Still, there's no actual problem; it just means that we may not do all
358 the specialisation we could theoretically do.
359
360 Furthermore, instance decls are usually exported and used non-locally,
361 so we'll want to compile enough to get those specialisations done.
362
363 Lastly, there's no such thing as a local instance decl, so we can
364 survive solely by spitting out *usage* information, and then reading that
365 back in as a pragma when next compiling the file. So for now,
366 we only specialise instance decls in response to pragmas.
367
368
369 SPITTING OUT USAGE INFORMATION
370 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
371
372 To spit out usage information we need to traverse the code collecting
373 call-instance information for all imported (non-prelude?) functions
374 and data types. Then we equivalence-class it and spit it out.
375
376 This is done at the top-level when all the call instances which escape
377 must be for imported functions and data types.
378
379 *** Not currently done ***
380
381
382 Partial specialisation by pragmas
383 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
384 What about partial specialisation:
385
386 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
387 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
388
389 or even
390
391 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
392
393 Seems quite reasonable. Similar things could be done with instance decls:
394
395 instance (Foo a, Foo b) => Foo (a,b) where
396 ...
397 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
398 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
399
400 Ho hum. Things are complex enough without this. I pass.
401
402
403 Requirements for the simplifier
404 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
405 The simplifier has to be able to take advantage of the specialisation.
406
407 * When the simplifier finds an application of a polymorphic f, it looks in
408 f's IdInfo in case there is a suitable instance to call instead. This converts
409
410 f t1 t2 d1 d2 ===> f_t1_t2
411
412 Note that the dictionaries get eaten up too!
413
414 * Dictionary selection operations on constant dictionaries must be
415 short-circuited:
416
417 +.sel Int d ===> +Int
418
419 The obvious way to do this is in the same way as other specialised
420 calls: +.sel has inside it some IdInfo which tells that if it's applied
421 to the type Int then it should eat a dictionary and transform to +Int.
422
423 In short, dictionary selectors need IdInfo inside them for constant
424 methods.
425
426 * Exactly the same applies if a superclass dictionary is being
427 extracted:
428
429 Eq.sel Int d ===> dEqInt
430
431 * Something similar applies to dictionary construction too. Suppose
432 dfun.Eq.List is the function taking a dictionary for (Eq a) to
433 one for (Eq [a]). Then we want
434
435 dfun.Eq.List Int d ===> dEq.List_Int
436
437 Where does the Eq [Int] dictionary come from? It is built in
438 response to a SPECIALIZE pragma on the Eq [a] instance decl.
439
440 In short, dfun Ids need IdInfo with a specialisation for each
441 constant instance of their instance declaration.
442
443 All this uses a single mechanism: the SpecEnv inside an Id
444
445
446 What does the specialisation IdInfo look like?
447 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
448
449 The SpecEnv of an Id maps a list of types (the template) to an expression
450
451 [Type] |-> Expr
452
453 For example, if f has this RuleInfo:
454
455 [Int, a] -> \d:Ord Int. f' a
456
457 it means that we can replace the call
458
459 f Int t ===> (\d. f' t)
460
461 This chucks one dictionary away and proceeds with the
462 specialised version of f, namely f'.
463
464
465 What can't be done this way?
466 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
467 There is no way, post-typechecker, to get a dictionary for (say)
468 Eq a from a dictionary for Eq [a]. So if we find
469
470 ==.sel [t] d
471
472 we can't transform to
473
474 eqList (==.sel t d')
475
476 where
477 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
478
479 Of course, we currently have no way to automatically derive
480 eqList, nor to connect it to the Eq [a] instance decl, but you
481 can imagine that it might somehow be possible. Taking advantage
482 of this is permanently ruled out.
483
484 Still, this is no great hardship, because we intend to eliminate
485 overloading altogether anyway!
486
487 A note about non-tyvar dictionaries
488 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
489 Some Ids have types like
490
491 forall a,b,c. Eq a -> Ord [a] -> tau
492
493 This seems curious at first, because we usually only have dictionary
494 args whose types are of the form (C a) where a is a type variable.
495 But this doesn't hold for the functions arising from instance decls,
496 which sometimes get arguments with types of form (C (T a)) for some
497 type constructor T.
498
499 Should we specialise wrt this compound-type dictionary? We used to say
500 "no", saying:
501 "This is a heuristic judgement, as indeed is the fact that we
502 specialise wrt only dictionaries. We choose *not* to specialise
503 wrt compound dictionaries because at the moment the only place
504 they show up is in instance decls, where they are simply plugged
505 into a returned dictionary. So nothing is gained by specialising
506 wrt them."
507
508 But it is simpler and more uniform to specialise wrt these dicts too;
509 and in future GHC is likely to support full fledged type signatures
510 like
511 f :: Eq [(a,b)] => ...
512
513
514 ************************************************************************
515 * *
516 \subsubsection{The new specialiser}
517 * *
518 ************************************************************************
519
520 Our basic game plan is this. For let(rec) bound function
521 f :: (C a, D c) => (a,b,c,d) -> Bool
522
523 * Find any specialised calls of f, (f ts ds), where
524 ts are the type arguments t1 .. t4, and
525 ds are the dictionary arguments d1 .. d2.
526
527 * Add a new definition for f1 (say):
528
529 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
530
531 Note that we abstract over the unconstrained type arguments.
532
533 * Add the mapping
534
535 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
536
537 to the specialisations of f. This will be used by the
538 simplifier to replace calls
539 (f t1 t2 t3 t4) da db
540 by
541 (\d1 d1 -> f1 t2 t4) da db
542
543 All the stuff about how many dictionaries to discard, and what types
544 to apply the specialised function to, are handled by the fact that the
545 SpecEnv contains a template for the result of the specialisation.
546
547 We don't build *partial* specialisations for f. For example:
548
549 f :: Eq a => a -> a -> Bool
550 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
551
552 Here, little is gained by making a specialised copy of f.
553 There's a distinct danger that the specialised version would
554 first build a dictionary for (Eq b, Eq c), and then select the (==)
555 method from it! Even if it didn't, not a great deal is saved.
556
557 We do, however, generate polymorphic, but not overloaded, specialisations:
558
559 f :: Eq a => [a] -> b -> b -> b
560 ... SPECIALISE f :: [Int] -> b -> b -> b ...
561
562 Hence, the invariant is this:
563
564 *** no specialised version is overloaded ***
565
566
567 ************************************************************************
568 * *
569 \subsubsection{The exported function}
570 * *
571 ************************************************************************
572 -}
573
574 -- | Specialise calls to type-class overloaded functions occuring in a program.
575 specProgram :: ModGuts -> CoreM ModGuts
576 specProgram guts@(ModGuts { mg_module = this_mod
577 , mg_rules = local_rules
578 , mg_binds = binds })
579 = do { dflags <- getDynFlags
580
581 -- Specialise the bindings of this module
582 ; (binds', uds) <- runSpecM dflags this_mod (go binds)
583
584 -- Specialise imported functions
585 ; hpt_rules <- getRuleBase
586 ; let rule_base = extendRuleBaseList hpt_rules local_rules
587 ; (new_rules, spec_binds) <- specImports dflags this_mod top_env emptyVarSet
588 [] rule_base uds
589
590 ; let final_binds
591 | null spec_binds = binds'
592 | otherwise = Rec (flattenBinds spec_binds) : binds'
593 -- Note [Glom the bindings if imported functions are specialised]
594
595 ; return (guts { mg_binds = final_binds
596 , mg_rules = new_rules ++ local_rules }) }
597 where
598 -- We need to start with a Subst that knows all the things
599 -- that are in scope, so that the substitution engine doesn't
600 -- accidentally re-use a unique that's already in use
601 -- Easiest thing is to do it all at once, as if all the top-level
602 -- decls were mutually recursive
603 top_env = SE { se_subst = CoreSubst.mkEmptySubst $ mkInScopeSet $ mkVarSet $
604 bindersOfBinds binds
605 , se_interesting = emptyVarSet }
606
607 go [] = return ([], emptyUDs)
608 go (bind:binds) = do (binds', uds) <- go binds
609 (bind', uds') <- specBind top_env bind uds
610 return (bind' ++ binds', uds')
611
612 {-
613 Note [Wrap bindings returned by specImports]
614 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
615 'specImports' returns a set of specialized bindings. However, these are lacking
616 necessary floated dictionary bindings, which are returned by
617 UsageDetails(ud_binds). These dictionaries need to be brought into scope with
618 'wrapDictBinds' before the bindings returned by 'specImports' can be used. See,
619 for instance, the 'specImports' call in 'specProgram'.
620
621
622 Note [Disabling cross-module specialisation]
623 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
624 Since GHC 7.10 we have performed specialisation of INLINABLE bindings living
625 in modules outside of the current module. This can sometimes uncover user code
626 which explodes in size when aggressively optimized. The
627 -fno-cross-module-specialise option was introduced to allow users to being
628 bitten by such instances to revert to the pre-7.10 behavior.
629
630 See Trac #10491
631 -}
632
633 -- | Specialise a set of calls to imported bindings
634 specImports :: DynFlags
635 -> Module
636 -> SpecEnv -- Passed in so that all top-level Ids are in scope
637 -> VarSet -- Don't specialise these ones
638 -- See Note [Avoiding recursive specialisation]
639 -> [Id] -- Stack of imported functions being specialised
640 -> RuleBase -- Rules from this module and the home package
641 -- (but not external packages, which can change)
642 -> UsageDetails -- Calls for imported things, and floating bindings
643 -> CoreM ( [CoreRule] -- New rules
644 , [CoreBind] ) -- Specialised bindings
645 -- See Note [Wrapping bindings returned by specImports]
646 specImports dflags this_mod top_env done callers rule_base
647 (MkUD { ud_binds = dict_binds, ud_calls = calls })
648 -- See Note [Disabling cross-module specialisation]
649 | not $ gopt Opt_CrossModuleSpecialise dflags
650 = return ([], [])
651
652 | otherwise
653 = do { let import_calls = dVarEnvElts calls
654 ; (rules, spec_binds) <- go rule_base import_calls
655
656 -- Don't forget to wrap the specialized bindings with
657 -- bindings for the needed dictionaries.
658 -- See Note [Wrap bindings returned by specImports]
659 ; let spec_binds' = wrapDictBinds dict_binds spec_binds
660
661 ; return (rules, spec_binds') }
662 where
663 go :: RuleBase -> [CallInfoSet] -> CoreM ([CoreRule], [CoreBind])
664 go _ [] = return ([], [])
665 go rb (cis@(CIS fn _) : other_calls)
666 = do { let ok_calls = filterCalls cis dict_binds
667 -- Drop calls that (directly or indirectly) refer to fn
668 -- See Note [Avoiding loops]
669 -- ; debugTraceMsg (text "specImport" <+> vcat [ ppr fn
670 -- , text "calls" <+> ppr cis
671 -- , text "ud_binds =" <+> ppr dict_binds
672 -- , text "dump set =" <+> ppr dump_set
673 -- , text "filtered calls =" <+> ppr ok_calls ])
674 ; (rules1, spec_binds1) <- specImport dflags this_mod top_env
675 done callers rb fn ok_calls
676
677 ; (rules2, spec_binds2) <- go (extendRuleBaseList rb rules1) other_calls
678 ; return (rules1 ++ rules2, spec_binds1 ++ spec_binds2) }
679
680 specImport :: DynFlags
681 -> Module
682 -> SpecEnv -- Passed in so that all top-level Ids are in scope
683 -> VarSet -- Don't specialise these
684 -- See Note [Avoiding recursive specialisation]
685 -> [Id] -- Stack of imported functions being specialised
686 -> RuleBase -- Rules from this module
687 -> Id -> [CallInfo] -- Imported function and calls for it
688 -> CoreM ( [CoreRule] -- New rules
689 , [CoreBind] ) -- Specialised bindings
690 specImport dflags this_mod top_env done callers rb fn calls_for_fn
691 | fn `elemVarSet` done
692 = return ([], []) -- No warning. This actually happens all the time
693 -- when specialising a recursive function, because
694 -- the RHS of the specialised function contains a recursive
695 -- call to the original function
696
697 | null calls_for_fn -- We filtered out all the calls in deleteCallsMentioning
698 = return ([], [])
699
700 | wantSpecImport dflags unfolding
701 , Just rhs <- maybeUnfoldingTemplate unfolding
702 = do { -- Get rules from the external package state
703 -- We keep doing this in case we "page-fault in"
704 -- more rules as we go along
705 ; hsc_env <- getHscEnv
706 ; eps <- liftIO $ hscEPS hsc_env
707 ; vis_orphs <- getVisibleOrphanMods
708 ; let full_rb = unionRuleBase rb (eps_rule_base eps)
709 rules_for_fn = getRules (RuleEnv full_rb vis_orphs) fn
710
711 ; (rules1, spec_pairs, uds)
712 <- -- pprTrace "specImport1" (vcat [ppr fn, ppr calls_for_fn, ppr rhs]) $
713 runSpecM dflags this_mod $
714 specCalls (Just this_mod) top_env rules_for_fn calls_for_fn fn rhs
715 ; let spec_binds1 = [NonRec b r | (b,r) <- spec_pairs]
716 -- After the rules kick in we may get recursion, but
717 -- we rely on a global GlomBinds to sort that out later
718 -- See Note [Glom the bindings if imported functions are specialised]
719
720 -- Now specialise any cascaded calls
721 ; (rules2, spec_binds2) <- -- pprTrace "specImport 2" (ppr fn $$ ppr rules1 $$ ppr spec_binds1) $
722 specImports dflags this_mod top_env
723 (extendVarSet done fn)
724 (fn:callers)
725 (extendRuleBaseList rb rules1)
726 uds
727
728 ; let final_binds = spec_binds2 ++ spec_binds1
729
730 ; return (rules2 ++ rules1, final_binds) }
731
732 | warnMissingSpecs dflags callers
733 = do { warnMsg (vcat [ hang (text "Could not specialise imported function" <+> quotes (ppr fn))
734 2 (vcat [ text "when specialising" <+> quotes (ppr caller)
735 | caller <- callers])
736 , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn))
737 , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ])
738 ; return ([], []) }
739
740 | otherwise
741 = return ([], [])
742 where
743 unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers
744
745 warnMissingSpecs :: DynFlags -> [Id] -> Bool
746 -- See Note [Warning about missed specialisations]
747 warnMissingSpecs dflags callers
748 | wopt Opt_WarnAllMissedSpecs dflags = True
749 | not (wopt Opt_WarnMissedSpecs dflags) = False
750 | null callers = False
751 | otherwise = all has_inline_prag callers
752 where
753 has_inline_prag id = isAnyInlinePragma (idInlinePragma id)
754
755 wantSpecImport :: DynFlags -> Unfolding -> Bool
756 -- See Note [Specialise imported INLINABLE things]
757 wantSpecImport dflags unf
758 = case unf of
759 NoUnfolding -> False
760 BootUnfolding -> False
761 OtherCon {} -> False
762 DFunUnfolding {} -> True
763 CoreUnfolding { uf_src = src, uf_guidance = _guidance }
764 | gopt Opt_SpecialiseAggressively dflags -> True
765 | isStableSource src -> True
766 -- Specialise even INLINE things; it hasn't inlined yet,
767 -- so perhaps it never will. Moreover it may have calls
768 -- inside it that we want to specialise
769 | otherwise -> False -- Stable, not INLINE, hence INLINABLE
770
771 {- Note [Warning about missed specialisations]
772 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
773 Suppose
774 * In module Lib, you carefully mark a function 'foo' INLINABLE
775 * Import Lib(foo) into another module M
776 * Call 'foo' at some specialised type in M
777 Then you jolly well expect it to be specialised in M. But what if
778 'foo' calls another function 'Lib.bar'. Then you'd like 'bar' to be
779 specialised too. But if 'bar' is not marked INLINABLE it may well
780 not be specialised. The warning Opt_WarnMissedSpecs warns about this.
781
782 It's more noisy to warning about a missed specialisation opportunity
783 for /every/ overloaded imported function, but sometimes useful. That
784 is what Opt_WarnAllMissedSpecs does.
785
786 ToDo: warn about missed opportunities for local functions.
787
788 Note [Specialise imported INLINABLE things]
789 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
790 What imported functions do we specialise? The basic set is
791 * DFuns and things with INLINABLE pragmas.
792 but with -fspecialise-aggressively we add
793 * Anything with an unfolding template
794
795 Trac #8874 has a good example of why we want to auto-specialise DFuns.
796
797 We have the -fspecialise-aggressively flag (usually off), because we
798 risk lots of orphan modules from over-vigorous specialisation.
799 However it's not a big deal: anything non-recursive with an
800 unfolding-template will probably have been inlined already.
801
802 Note [Glom the bindings if imported functions are specialised]
803 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
804 Suppose we have an imported, *recursive*, INLINABLE function
805 f :: Eq a => a -> a
806 f = /\a \d x. ...(f a d)...
807 In the module being compiled we have
808 g x = f (x::Int)
809 Now we'll make a specialised function
810 f_spec :: Int -> Int
811 f_spec = \x -> ...(f Int dInt)...
812 {-# RULE f Int _ = f_spec #-}
813 g = \x. f Int dInt x
814 Note that f_spec doesn't look recursive
815 After rewriting with the RULE, we get
816 f_spec = \x -> ...(f_spec)...
817 BUT since f_spec was non-recursive before it'll *stay* non-recursive.
818 The occurrence analyser never turns a NonRec into a Rec. So we must
819 make sure that f_spec is recursive. Easiest thing is to make all
820 the specialisations for imported bindings recursive.
821
822
823 Note [Avoiding recursive specialisation]
824 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
825 When we specialise 'f' we may find new overloaded calls to 'g', 'h' in
826 'f's RHS. So we want to specialise g,h. But we don't want to
827 specialise f any more! It's possible that f's RHS might have a
828 recursive yet-more-specialised call, so we'd diverge in that case.
829 And if the call is to the same type, one specialisation is enough.
830 Avoiding this recursive specialisation loop is the reason for the
831 'done' VarSet passed to specImports and specImport.
832
833 ************************************************************************
834 * *
835 \subsubsection{@specExpr@: the main function}
836 * *
837 ************************************************************************
838 -}
839
840 data SpecEnv
841 = SE { se_subst :: CoreSubst.Subst
842 -- We carry a substitution down:
843 -- a) we must clone any binding that might float outwards,
844 -- to avoid name clashes
845 -- b) we carry a type substitution to use when analysing
846 -- the RHS of specialised bindings (no type-let!)
847
848
849 , se_interesting :: VarSet
850 -- Dict Ids that we know something about
851 -- and hence may be worth specialising against
852 -- See Note [Interesting dictionary arguments]
853 }
854
855 specVar :: SpecEnv -> Id -> CoreExpr
856 specVar env v = CoreSubst.lookupIdSubst (text "specVar") (se_subst env) v
857
858 specExpr :: SpecEnv -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
859
860 ---------------- First the easy cases --------------------
861 specExpr env (Type ty) = return (Type (substTy env ty), emptyUDs)
862 specExpr env (Coercion co) = return (Coercion (substCo env co), emptyUDs)
863 specExpr env (Var v) = return (specVar env v, emptyUDs)
864 specExpr _ (Lit lit) = return (Lit lit, emptyUDs)
865 specExpr env (Cast e co)
866 = do { (e', uds) <- specExpr env e
867 ; return ((mkCast e' (substCo env co)), uds) }
868 specExpr env (Tick tickish body)
869 = do { (body', uds) <- specExpr env body
870 ; return (Tick (specTickish env tickish) body', uds) }
871
872 ---------------- Applications might generate a call instance --------------------
873 specExpr env expr@(App {})
874 = go expr []
875 where
876 go (App fun arg) args = do (arg', uds_arg) <- specExpr env arg
877 (fun', uds_app) <- go fun (arg':args)
878 return (App fun' arg', uds_arg `plusUDs` uds_app)
879
880 go (Var f) args = case specVar env f of
881 Var f' -> return (Var f', mkCallUDs env f' args)
882 e' -> return (e', emptyUDs) -- I don't expect this!
883 go other _ = specExpr env other
884
885 ---------------- Lambda/case require dumping of usage details --------------------
886 specExpr env e@(Lam _ _) = do
887 (body', uds) <- specExpr env' body
888 let (free_uds, dumped_dbs) = dumpUDs bndrs' uds
889 return (mkLams bndrs' (wrapDictBindsE dumped_dbs body'), free_uds)
890 where
891 (bndrs, body) = collectBinders e
892 (env', bndrs') = substBndrs env bndrs
893 -- More efficient to collect a group of binders together all at once
894 -- and we don't want to split a lambda group with dumped bindings
895
896 specExpr env (Case scrut case_bndr ty alts)
897 = do { (scrut', scrut_uds) <- specExpr env scrut
898 ; (scrut'', case_bndr', alts', alts_uds)
899 <- specCase env scrut' case_bndr alts
900 ; return (Case scrut'' case_bndr' (substTy env ty) alts'
901 , scrut_uds `plusUDs` alts_uds) }
902
903 ---------------- Finally, let is the interesting case --------------------
904 specExpr env (Let bind body)
905 = do { -- Clone binders
906 (rhs_env, body_env, bind') <- cloneBindSM env bind
907
908 -- Deal with the body
909 ; (body', body_uds) <- specExpr body_env body
910
911 -- Deal with the bindings
912 ; (binds', uds) <- specBind rhs_env bind' body_uds
913
914 -- All done
915 ; return (foldr Let body' binds', uds) }
916
917 specTickish :: SpecEnv -> Tickish Id -> Tickish Id
918 specTickish env (Breakpoint ix ids)
919 = Breakpoint ix [ id' | id <- ids, Var id' <- [specVar env id]]
920 -- drop vars from the list if they have a non-variable substitution.
921 -- should never happen, but it's harmless to drop them anyway.
922 specTickish _ other_tickish = other_tickish
923
924 specCase :: SpecEnv
925 -> CoreExpr -- Scrutinee, already done
926 -> Id -> [CoreAlt]
927 -> SpecM ( CoreExpr -- New scrutinee
928 , Id
929 , [CoreAlt]
930 , UsageDetails)
931 specCase env scrut' case_bndr [(con, args, rhs)]
932 | isDictId case_bndr -- See Note [Floating dictionaries out of cases]
933 , interestingDict env scrut'
934 , not (isDeadBinder case_bndr && null sc_args')
935 = do { (case_bndr_flt : sc_args_flt) <- mapM clone_me (case_bndr' : sc_args')
936
937 ; let sc_rhss = [ Case (Var case_bndr_flt) case_bndr' (idType sc_arg')
938 [(con, args', Var sc_arg')]
939 | sc_arg' <- sc_args' ]
940
941 -- Extend the substitution for RHS to map the *original* binders
942 -- to their floated verions.
943 mb_sc_flts :: [Maybe DictId]
944 mb_sc_flts = map (lookupVarEnv clone_env) args'
945 clone_env = zipVarEnv sc_args' sc_args_flt
946 subst_prs = (case_bndr, Var case_bndr_flt)
947 : [ (arg, Var sc_flt)
948 | (arg, Just sc_flt) <- args `zip` mb_sc_flts ]
949 env_rhs' = env_rhs { se_subst = CoreSubst.extendIdSubstList (se_subst env_rhs) subst_prs
950 , se_interesting = se_interesting env_rhs `extendVarSetList`
951 (case_bndr_flt : sc_args_flt) }
952
953 ; (rhs', rhs_uds) <- specExpr env_rhs' rhs
954 ; let scrut_bind = mkDB (NonRec case_bndr_flt scrut')
955 case_bndr_set = unitVarSet case_bndr_flt
956 sc_binds = [(NonRec sc_arg_flt sc_rhs, case_bndr_set)
957 | (sc_arg_flt, sc_rhs) <- sc_args_flt `zip` sc_rhss ]
958 flt_binds = scrut_bind : sc_binds
959 (free_uds, dumped_dbs) = dumpUDs (case_bndr':args') rhs_uds
960 all_uds = flt_binds `addDictBinds` free_uds
961 alt' = (con, args', wrapDictBindsE dumped_dbs rhs')
962 ; return (Var case_bndr_flt, case_bndr', [alt'], all_uds) }
963 where
964 (env_rhs, (case_bndr':args')) = substBndrs env (case_bndr:args)
965 sc_args' = filter is_flt_sc_arg args'
966
967 clone_me bndr = do { uniq <- getUniqueM
968 ; return (mkUserLocalOrCoVar occ uniq ty loc) }
969 where
970 name = idName bndr
971 ty = idType bndr
972 occ = nameOccName name
973 loc = getSrcSpan name
974
975 arg_set = mkVarSet args'
976 is_flt_sc_arg var = isId var
977 && not (isDeadBinder var)
978 && isDictTy var_ty
979 && not (tyCoVarsOfType var_ty `intersectsVarSet` arg_set)
980 where
981 var_ty = idType var
982
983
984 specCase env scrut case_bndr alts
985 = do { (alts', uds_alts) <- mapAndCombineSM spec_alt alts
986 ; return (scrut, case_bndr', alts', uds_alts) }
987 where
988 (env_alt, case_bndr') = substBndr env case_bndr
989 spec_alt (con, args, rhs) = do
990 (rhs', uds) <- specExpr env_rhs rhs
991 let (free_uds, dumped_dbs) = dumpUDs (case_bndr' : args') uds
992 return ((con, args', wrapDictBindsE dumped_dbs rhs'), free_uds)
993 where
994 (env_rhs, args') = substBndrs env_alt args
995
996 {-
997 Note [Floating dictionaries out of cases]
998 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
999 Consider
1000 g = \d. case d of { MkD sc ... -> ...(f sc)... }
1001 Naively we can't float d2's binding out of the case expression,
1002 because 'sc' is bound by the case, and that in turn means we can't
1003 specialise f, which seems a pity.
1004
1005 So we invert the case, by floating out a binding
1006 for 'sc_flt' thus:
1007 sc_flt = case d of { MkD sc ... -> sc }
1008 Now we can float the call instance for 'f'. Indeed this is just
1009 what'll happen if 'sc' was originally bound with a let binding,
1010 but case is more efficient, and necessary with equalities. So it's
1011 good to work with both.
1012
1013 You might think that this won't make any difference, because the
1014 call instance will only get nuked by the \d. BUT if 'g' itself is
1015 specialised, then transitively we should be able to specialise f.
1016
1017 In general, given
1018 case e of cb { MkD sc ... -> ...(f sc)... }
1019 we transform to
1020 let cb_flt = e
1021 sc_flt = case cb_flt of { MkD sc ... -> sc }
1022 in
1023 case cb_flt of bg { MkD sc ... -> ....(f sc_flt)... }
1024
1025 The "_flt" things are the floated binds; we use the current substitution
1026 to substitute sc -> sc_flt in the RHS
1027
1028 ************************************************************************
1029 * *
1030 Dealing with a binding
1031 * *
1032 ************************************************************************
1033 -}
1034
1035 specBind :: SpecEnv -- Use this for RHSs
1036 -> CoreBind -- Binders are already cloned by cloneBindSM,
1037 -- but RHSs are un-processed
1038 -> UsageDetails -- Info on how the scope of the binding
1039 -> SpecM ([CoreBind], -- New bindings
1040 UsageDetails) -- And info to pass upstream
1041
1042 -- Returned UsageDetails:
1043 -- No calls for binders of this bind
1044 specBind rhs_env (NonRec fn rhs) body_uds
1045 = do { (rhs', rhs_uds) <- specExpr rhs_env rhs
1046 ; (fn', spec_defns, body_uds1) <- specDefn rhs_env body_uds fn rhs
1047
1048 ; let pairs = spec_defns ++ [(fn', rhs')]
1049 -- fn' mentions the spec_defns in its rules,
1050 -- so put the latter first
1051
1052 combined_uds = body_uds1 `plusUDs` rhs_uds
1053
1054 (free_uds, dump_dbs, float_all) = dumpBindUDs [fn] combined_uds
1055
1056 final_binds :: [DictBind]
1057 -- See Note [From non-recursive to recursive]
1058 final_binds
1059 | not (isEmptyBag dump_dbs)
1060 , not (null spec_defns)
1061 = [recWithDumpedDicts pairs dump_dbs]
1062 | otherwise
1063 = [mkDB $ NonRec b r | (b,r) <- pairs]
1064 ++ bagToList dump_dbs
1065
1066 ; if float_all then
1067 -- Rather than discard the calls mentioning the bound variables
1068 -- we float this (dictionary) binding along with the others
1069 return ([], free_uds `snocDictBinds` final_binds)
1070 else
1071 -- No call in final_uds mentions bound variables,
1072 -- so we can just leave the binding here
1073 return (map fst final_binds, free_uds) }
1074
1075
1076 specBind rhs_env (Rec pairs) body_uds
1077 -- Note [Specialising a recursive group]
1078 = do { let (bndrs,rhss) = unzip pairs
1079 ; (rhss', rhs_uds) <- mapAndCombineSM (specExpr rhs_env) rhss
1080 ; let scope_uds = body_uds `plusUDs` rhs_uds
1081 -- Includes binds and calls arising from rhss
1082
1083 ; (bndrs1, spec_defns1, uds1) <- specDefns rhs_env scope_uds pairs
1084
1085 ; (bndrs3, spec_defns3, uds3)
1086 <- if null spec_defns1 -- Common case: no specialisation
1087 then return (bndrs1, [], uds1)
1088 else do { -- Specialisation occurred; do it again
1089 (bndrs2, spec_defns2, uds2)
1090 <- specDefns rhs_env uds1 (bndrs1 `zip` rhss)
1091 ; return (bndrs2, spec_defns2 ++ spec_defns1, uds2) }
1092
1093 ; let (final_uds, dumped_dbs, float_all) = dumpBindUDs bndrs uds3
1094 final_bind = recWithDumpedDicts (spec_defns3 ++ zip bndrs3 rhss')
1095 dumped_dbs
1096
1097 ; if float_all then
1098 return ([], final_uds `snocDictBind` final_bind)
1099 else
1100 return ([fst final_bind], final_uds) }
1101
1102
1103 ---------------------------
1104 specDefns :: SpecEnv
1105 -> UsageDetails -- Info on how it is used in its scope
1106 -> [(OutId,InExpr)] -- The things being bound and their un-processed RHS
1107 -> SpecM ([OutId], -- Original Ids with RULES added
1108 [(OutId,OutExpr)], -- Extra, specialised bindings
1109 UsageDetails) -- Stuff to fling upwards from the specialised versions
1110
1111 -- Specialise a list of bindings (the contents of a Rec), but flowing usages
1112 -- upwards binding by binding. Example: { f = ...g ...; g = ...f .... }
1113 -- Then if the input CallDetails has a specialised call for 'g', whose specialisation
1114 -- in turn generates a specialised call for 'f', we catch that in this one sweep.
1115 -- But not vice versa (it's a fixpoint problem).
1116
1117 specDefns _env uds []
1118 = return ([], [], uds)
1119 specDefns env uds ((bndr,rhs):pairs)
1120 = do { (bndrs1, spec_defns1, uds1) <- specDefns env uds pairs
1121 ; (bndr1, spec_defns2, uds2) <- specDefn env uds1 bndr rhs
1122 ; return (bndr1 : bndrs1, spec_defns1 ++ spec_defns2, uds2) }
1123
1124 ---------------------------
1125 specDefn :: SpecEnv
1126 -> UsageDetails -- Info on how it is used in its scope
1127 -> OutId -> InExpr -- The thing being bound and its un-processed RHS
1128 -> SpecM (Id, -- Original Id with added RULES
1129 [(Id,CoreExpr)], -- Extra, specialised bindings
1130 UsageDetails) -- Stuff to fling upwards from the specialised versions
1131
1132 specDefn env body_uds fn rhs
1133 = do { let (body_uds_without_me, calls_for_me) = callsForMe fn body_uds
1134 rules_for_me = idCoreRules fn
1135 ; (rules, spec_defns, spec_uds) <- specCalls Nothing env rules_for_me
1136 calls_for_me fn rhs
1137 ; return ( fn `addIdSpecialisations` rules
1138 , spec_defns
1139 , body_uds_without_me `plusUDs` spec_uds) }
1140 -- It's important that the `plusUDs` is this way
1141 -- round, because body_uds_without_me may bind
1142 -- dictionaries that are used in calls_for_me passed
1143 -- to specDefn. So the dictionary bindings in
1144 -- spec_uds may mention dictionaries bound in
1145 -- body_uds_without_me
1146
1147 ---------------------------
1148 specCalls :: Maybe Module -- Just this_mod => specialising imported fn
1149 -- Nothing => specialising local fn
1150 -> SpecEnv
1151 -> [CoreRule] -- Existing RULES for the fn
1152 -> [CallInfo]
1153 -> OutId -> InExpr
1154 -> SpecM SpecInfo -- New rules, specialised bindings, and usage details
1155
1156 -- This function checks existing rules, and does not create
1157 -- duplicate ones. So the caller does not need to do this filtering.
1158 -- See 'already_covered'
1159
1160 type SpecInfo = ( [CoreRule] -- Specialisation rules
1161 , [(Id,CoreExpr)] -- Specialised definition
1162 , UsageDetails ) -- Usage details from specialised RHSs
1163
1164 specCalls mb_mod env existing_rules calls_for_me fn rhs
1165 -- The first case is the interesting one
1166 | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas
1167 && rhs_bndrs1 `lengthAtLeast` n_dicts -- and enough dict args
1168 && notNull calls_for_me -- And there are some calls to specialise
1169 && not (isNeverActive (idInlineActivation fn))
1170 -- Don't specialise NOINLINE things
1171 -- See Note [Auto-specialisation and RULES]
1172
1173 -- && not (certainlyWillInline (idUnfolding fn)) -- And it's not small
1174 -- See Note [Inline specialisation] for why we do not
1175 -- switch off specialisation for inline functions
1176
1177 = -- pprTrace "specDefn: some" (ppr fn $$ ppr calls_for_me $$ ppr existing_rules) $
1178 foldlM spec_call ([], [], emptyUDs) calls_for_me
1179
1180 | otherwise -- No calls or RHS doesn't fit our preconceptions
1181 = WARN( not (exprIsTrivial rhs) && notNull calls_for_me,
1182 text "Missed specialisation opportunity for"
1183 <+> ppr fn $$ _trace_doc )
1184 -- Note [Specialisation shape]
1185 -- pprTrace "specDefn: none" (ppr fn <+> ppr calls_for_me) $
1186 return ([], [], emptyUDs)
1187 where
1188 _trace_doc = sep [ ppr rhs_tyvars, ppr n_tyvars
1189 , ppr rhs_bndrs, ppr n_dicts
1190 , ppr (idInlineActivation fn) ]
1191
1192 fn_type = idType fn
1193 fn_arity = idArity fn
1194 fn_unf = realIdUnfolding fn -- Ignore loop-breaker-ness here
1195 (tyvars, theta, _) = tcSplitSigmaTy fn_type
1196 n_tyvars = length tyvars
1197 n_dicts = length theta
1198 inl_prag = idInlinePragma fn
1199 inl_act = inlinePragmaActivation inl_prag
1200 is_local = isLocalId fn
1201
1202 -- Figure out whether the function has an INLINE pragma
1203 -- See Note [Inline specialisations]
1204
1205 (rhs_bndrs, rhs_body) = collectBindersPushingCo rhs
1206 -- See Note [Account for casts in binding]
1207 (rhs_tyvars, rhs_bndrs1) = span isTyVar rhs_bndrs
1208 (rhs_dict_ids, rhs_bndrs2) = splitAt n_dicts rhs_bndrs1
1209 body = mkLams rhs_bndrs2 rhs_body
1210 -- Glue back on the non-dict lambdas
1211
1212 in_scope = CoreSubst.substInScope (se_subst env)
1213
1214 already_covered :: DynFlags -> [CoreRule] -> [CoreExpr] -> Bool
1215 already_covered dflags new_rules args -- Note [Specialisations already covered]
1216 = isJust (lookupRule dflags (in_scope, realIdUnfolding)
1217 (const True) fn args
1218 (new_rules ++ existing_rules))
1219 -- NB: we look both in the new_rules (generated by this invocation
1220 -- of specCalls), and in existing_rules (passed in to specCalls)
1221
1222 mk_ty_args :: [Maybe Type] -> [TyVar] -> [CoreExpr]
1223 mk_ty_args [] poly_tvs
1224 = ASSERT( null poly_tvs ) []
1225 mk_ty_args (Nothing : call_ts) (poly_tv : poly_tvs)
1226 = Type (mkTyVarTy poly_tv) : mk_ty_args call_ts poly_tvs
1227 mk_ty_args (Just ty : call_ts) poly_tvs
1228 = Type ty : mk_ty_args call_ts poly_tvs
1229 mk_ty_args (Nothing : _) [] = panic "mk_ty_args"
1230
1231 ----------------------------------------------------------
1232 -- Specialise to one particular call pattern
1233 spec_call :: SpecInfo -- Accumulating parameter
1234 -> CallInfo -- Call instance
1235 -> SpecM SpecInfo
1236 spec_call spec_acc@(rules_acc, pairs_acc, uds_acc)
1237 (CI { ci_key = CallKey call_ts, ci_args = call_ds })
1238 = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts )
1239
1240 -- Suppose f's defn is f = /\ a b c -> \ d1 d2 -> rhs
1241 -- Suppose the call is for f [Just t1, Nothing, Just t3] [dx1, dx2]
1242
1243 -- Construct the new binding
1244 -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b -> rhs)
1245 -- PLUS the rule
1246 -- RULE "SPEC f" forall b d1' d2'. f b d1' d2' = f1 b
1247 -- In the rule, d1' and d2' are just wildcards, not used in the RHS
1248 -- PLUS the usage-details
1249 -- { d1' = dx1; d2' = dx2 }
1250 -- where d1', d2' are cloned versions of d1,d2, with the type substitution
1251 -- applied. These auxiliary bindings just avoid duplication of dx1, dx2
1252 --
1253 -- Note that the substitution is applied to the whole thing.
1254 -- This is convenient, but just slightly fragile. Notably:
1255 -- * There had better be no name clashes in a/b/c
1256 do { let
1257 -- poly_tyvars = [b] in the example above
1258 -- spec_tyvars = [a,c]
1259 -- ty_args = [t1,b,t3]
1260 spec_tv_binds = [(tv,ty) | (tv, Just ty) <- rhs_tyvars `zip` call_ts]
1261 env1 = extendTvSubstList env spec_tv_binds
1262 (rhs_env, poly_tyvars) = substBndrs env1
1263 [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
1264
1265 -- Clone rhs_dicts, including instantiating their types
1266 ; inst_dict_ids <- mapM (newDictBndr rhs_env) rhs_dict_ids
1267 ; let (rhs_env2, dx_binds, spec_dict_args)
1268 = bindAuxiliaryDicts rhs_env rhs_dict_ids call_ds inst_dict_ids
1269 ty_args = mk_ty_args call_ts poly_tyvars
1270 ev_args = map varToCoreExpr inst_dict_ids -- ev_args, ev_bndrs:
1271 ev_bndrs = exprsFreeIdsList ev_args -- See Note [Evidence foralls]
1272 rule_args = ty_args ++ ev_args
1273 rule_bndrs = poly_tyvars ++ ev_bndrs
1274
1275 ; dflags <- getDynFlags
1276 ; if already_covered dflags rules_acc rule_args
1277 then return spec_acc
1278 else -- pprTrace "spec_call" (vcat [ ppr _call_info, ppr fn, ppr rhs_dict_ids
1279 -- , text "rhs_env2" <+> ppr (se_subst rhs_env2)
1280 -- , ppr dx_binds ]) $
1281 do
1282 { -- Figure out the type of the specialised function
1283 let body_ty = applyTypeToArgs rhs fn_type rule_args
1284 (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted
1285 | isUnliftedType body_ty -- C.f. WwLib.mkWorkerArgs
1286 , not (isJoinId fn)
1287 = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [voidPrimId])
1288 | otherwise = (poly_tyvars, poly_tyvars)
1289 spec_id_ty = mkLamTypes lam_args body_ty
1290 join_arity_change = length app_args - length rule_args
1291 spec_join_arity | Just orig_join_arity <- isJoinId_maybe fn
1292 = Just (orig_join_arity + join_arity_change)
1293 | otherwise
1294 = Nothing
1295
1296 ; spec_f <- newSpecIdSM fn spec_id_ty spec_join_arity
1297 ; (spec_rhs, rhs_uds) <- specExpr rhs_env2 (mkLams lam_args body)
1298 ; this_mod <- getModule
1299 ; let
1300 -- The rule to put in the function's specialisation is:
1301 -- forall b, d1',d2'. f t1 b t3 d1' d2' = f1 b
1302 herald = case mb_mod of
1303 Nothing -- Specialising local fn
1304 -> text "SPEC"
1305 Just this_mod -- Specialising imported fn
1306 -> text "SPEC/" <> ppr this_mod
1307
1308 rule_name = mkFastString $ showSDoc dflags $
1309 herald <+> ftext (occNameFS (getOccName fn))
1310 <+> hsep (map ppr_call_key_ty call_ts)
1311 -- This name ends up in interface files, so use occNameString.
1312 -- Otherwise uniques end up there, making builds
1313 -- less deterministic (See #4012 comment:61 ff)
1314
1315 rule_wout_eta = mkRule
1316 this_mod
1317 True {- Auto generated -}
1318 is_local
1319 rule_name
1320 inl_act -- Note [Auto-specialisation and RULES]
1321 (idName fn)
1322 rule_bndrs
1323 rule_args
1324 (mkVarApps (Var spec_f) app_args)
1325
1326 spec_rule
1327 = case isJoinId_maybe fn of
1328 Just join_arity -> etaExpandToJoinPointRule join_arity
1329 rule_wout_eta
1330 Nothing -> rule_wout_eta
1331
1332 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
1333 spec_uds = foldr consDictBind rhs_uds dx_binds
1334
1335 --------------------------------------
1336 -- Add a suitable unfolding if the spec_inl_prag says so
1337 -- See Note [Inline specialisations]
1338 (spec_inl_prag, spec_unf)
1339 | not is_local && isStrongLoopBreaker (idOccInfo fn)
1340 = (neverInlinePragma, noUnfolding)
1341 -- See Note [Specialising imported functions] in OccurAnal
1342
1343 | InlinePragma { inl_inline = Inlinable } <- inl_prag
1344 = (inl_prag { inl_inline = NoUserInline }, noUnfolding)
1345
1346 | otherwise
1347 = (inl_prag, specUnfolding poly_tyvars spec_app
1348 arity_decrease fn_unf)
1349
1350 arity_decrease = length spec_dict_args
1351 spec_app e = (e `mkApps` ty_args) `mkApps` spec_dict_args
1352
1353 --------------------------------------
1354 -- Adding arity information just propagates it a bit faster
1355 -- See Note [Arity decrease] in Simplify
1356 -- Copy InlinePragma information from the parent Id.
1357 -- So if f has INLINE[1] so does spec_f
1358 spec_f_w_arity = spec_f `setIdArity` max 0 (fn_arity - n_dicts)
1359 `setInlinePragma` spec_inl_prag
1360 `setIdUnfolding` spec_unf
1361 `asJoinId_maybe` spec_join_arity
1362
1363 ; return ( spec_rule : rules_acc
1364 , (spec_f_w_arity, spec_rhs) : pairs_acc
1365 , spec_uds `plusUDs` uds_acc
1366 ) } }
1367
1368 {- Note [Account for casts in binding]
1369 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1370 Consider
1371 f :: Eq a => a -> IO ()
1372 {-# INLINABLE f
1373 StableUnf = (/\a \(d:Eq a) (x:a). blah) |> g
1374 #-}
1375 f = ...
1376
1377 In f's stable unfolding we have done some modest simplification which
1378 has pushed the cast to the outside. (I wonder if this is the Right
1379 Thing, but it's what happens now; see SimplUtils Note [Casts and
1380 lambdas].) Now that stable unfolding must be specialised, so we want
1381 to push the cast back inside. It would be terrible if the cast
1382 defeated specialisation! Hence the use of collectBindersPushingCo.
1383
1384 Note [Evidence foralls]
1385 ~~~~~~~~~~~~~~~~~~~~~~~~~~
1386 Suppose (Trac #12212) that we are specialising
1387 f :: forall a b. (Num a, F a ~ F b) => blah
1388 with a=b=Int. Then the RULE will be something like
1389 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1390 f Int Int d g = f_spec
1391 But both varToCoreExpr (when constructing the LHS args), and the
1392 simplifier (when simplifying the LHS args), will transform to
1393 RULE forall (d:Num Int) (g :: F Int ~ F Int).
1394 f Int Int d <F Int> = f_spec
1395 by replacing g with Refl. So now 'g' is unbound, which results in a later
1396 crash. So we use Refl right off the bat, and do not forall-quantify 'g':
1397 * varToCoreExpr generates a Refl
1398 * exprsFreeIdsList returns the Ids bound by the args,
1399 which won't include g
1400
1401 You might wonder if this will match as often, but the simplifier replaces
1402 complicated Refl coercions with Refl pretty aggressively.
1403
1404 Note [Orphans and auto-generated rules]
1405 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1406 When we specialise an INLINABLE function, or when we have
1407 -fspecialise-aggressively, we auto-generate RULES that are orphans.
1408 We don't want to warn about these, or we'd generate a lot of warnings.
1409 Thus, we only warn about user-specified orphan rules.
1410
1411 Indeed, we don't even treat the module as an orphan module if it has
1412 auto-generated *rule* orphans. Orphan modules are read every time we
1413 compile, so they are pretty obtrusive and slow down every compilation,
1414 even non-optimised ones. (Reason: for type class instances it's a
1415 type correctness issue.) But specialisation rules are strictly for
1416 *optimisation* only so it's fine not to read the interface.
1417
1418 What this means is that a SPEC rules from auto-specialisation in
1419 module M will be used in other modules only if M.hi has been read for
1420 some other reason, which is actually pretty likely.
1421 -}
1422
1423 bindAuxiliaryDicts
1424 :: SpecEnv
1425 -> [DictId] -> [CoreExpr] -- Original dict bndrs, and the witnessing expressions
1426 -> [DictId] -- A cloned dict-id for each dict arg
1427 -> (SpecEnv, -- Substitute for all orig_dicts
1428 [DictBind], -- Auxiliary dict bindings
1429 [CoreExpr]) -- Witnessing expressions (all trivial)
1430 -- Bind any dictionary arguments to fresh names, to preserve sharing
1431 bindAuxiliaryDicts env@(SE { se_subst = subst, se_interesting = interesting })
1432 orig_dict_ids call_ds inst_dict_ids
1433 = (env', dx_binds, spec_dict_args)
1434 where
1435 (dx_binds, spec_dict_args) = go call_ds inst_dict_ids
1436 env' = env { se_subst = subst `CoreSubst.extendSubstList`
1437 (orig_dict_ids `zip` spec_dict_args)
1438 `CoreSubst.extendInScopeList` dx_ids
1439 , se_interesting = interesting `unionVarSet` interesting_dicts }
1440
1441 dx_ids = [dx_id | (NonRec dx_id _, _) <- dx_binds]
1442 interesting_dicts = mkVarSet [ dx_id | (NonRec dx_id dx, _) <- dx_binds
1443 , interestingDict env dx ]
1444 -- See Note [Make the new dictionaries interesting]
1445
1446 go :: [CoreExpr] -> [CoreBndr] -> ([DictBind], [CoreExpr])
1447 go [] _ = ([], [])
1448 go (dx:dxs) (dx_id:dx_ids)
1449 | exprIsTrivial dx = (dx_binds, dx : args)
1450 | otherwise = (mkDB (NonRec dx_id dx) : dx_binds, Var dx_id : args)
1451 where
1452 (dx_binds, args) = go dxs dx_ids
1453 -- In the first case extend the substitution but not bindings;
1454 -- in the latter extend the bindings but not the substitution.
1455 -- For the former, note that we bind the *original* dict in the substitution,
1456 -- overriding any d->dx_id binding put there by substBndrs
1457 go _ _ = pprPanic "bindAuxiliaryDicts" (ppr orig_dict_ids $$ ppr call_ds $$ ppr inst_dict_ids)
1458
1459 {-
1460 Note [Make the new dictionaries interesting]
1461 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1462 Important! We're going to substitute dx_id1 for d
1463 and we want it to look "interesting", else we won't gather *any*
1464 consequential calls. E.g.
1465 f d = ...g d....
1466 If we specialise f for a call (f (dfun dNumInt)), we'll get
1467 a consequent call (g d') with an auxiliary definition
1468 d' = df dNumInt
1469 We want that consequent call to look interesting
1470
1471
1472 Note [From non-recursive to recursive]
1473 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1474 Even in the non-recursive case, if any dict-binds depend on 'fn' we might
1475 have built a recursive knot
1476
1477 f a d x = <blah>
1478 MkUD { ud_binds = NonRec d7 (MkD ..f..)
1479 , ud_calls = ...(f T d7)... }
1480
1481 The we generate
1482
1483 Rec { fs x = <blah>[T/a, d7/d]
1484 f a d x = <blah>
1485 RULE f T _ = fs
1486 d7 = ...f... }
1487
1488 Here the recursion is only through the RULE.
1489
1490 However we definitely should /not/ make the Rec in this wildly common
1491 case:
1492 d = ...
1493 MkUD { ud_binds = NonRec d7 (...d...)
1494 , ud_calls = ...(f T d7)... }
1495
1496 Here we want simply to add d to the floats, giving
1497 MkUD { ud_binds = NonRec d (...)
1498 NonRec d7 (...d...)
1499 , ud_calls = ...(f T d7)... }
1500
1501 In general, we need only make this Rec if
1502 - there are some specialisations (spec_binds non-empty)
1503 - there are some dict_binds that depend on f (dump_dbs non-empty)
1504
1505 Note [Avoiding loops]
1506 ~~~~~~~~~~~~~~~~~~~~~
1507 When specialising /dictionary functions/ we must be very careful to
1508 avoid building loops. Here is an example that bit us badly: Trac #3591
1509
1510 class Eq a => C a
1511 instance Eq [a] => C [a]
1512
1513 This translates to
1514 dfun :: Eq [a] -> C [a]
1515 dfun a d = MkD a d (meth d)
1516
1517 d4 :: Eq [T] = <blah>
1518 d2 :: C [T] = dfun T d4
1519 d1 :: Eq [T] = $p1 d2
1520 d3 :: C [T] = dfun T d1
1521
1522 None of these definitions is recursive. What happened was that we
1523 generated a specialisation:
1524
1525 RULE forall d. dfun T d = dT :: C [T]
1526 dT = (MkD a d (meth d)) [T/a, d1/d]
1527 = MkD T d1 (meth d1)
1528
1529 But now we use the RULE on the RHS of d2, to get
1530
1531 d2 = dT = MkD d1 (meth d1)
1532 d1 = $p1 d2
1533
1534 and now d1 is bottom! The problem is that when specialising 'dfun' we
1535 should first dump "below" the binding all floated dictionary bindings
1536 that mention 'dfun' itself. So d2 and d3 (and hence d1) must be
1537 placed below 'dfun', and thus unavailable to it when specialising
1538 'dfun'. That in turn means that the call (dfun T d1) must be
1539 discarded. On the other hand, the call (dfun T d4) is fine, assuming
1540 d4 doesn't mention dfun.
1541
1542 Solution:
1543 Discard all calls that mention dictionaries that depend
1544 (directly or indirectly) on the dfun we are specialising.
1545 This is done by 'filterCalls'
1546
1547 --------------
1548 Here's another example, this time for an imported dfun, so the call
1549 to filterCalls is in specImports (Trac #13429). Suppose we have
1550 class Monoid v => C v a where ...
1551
1552 We start with a call
1553 f @ [Integer] @ Integer $fC[]Integer
1554
1555 Specialising call to 'f' gives dict bindings
1556 $dMonoid_1 :: Monoid [Integer]
1557 $dMonoid_1 = M.$p1C @ [Integer] $fC[]Integer
1558
1559 $dC_1 :: C [Integer] (Node [Integer] Integer)
1560 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
1561
1562 ...plus a recursive call to
1563 f @ [Integer] @ (Node [Integer] Integer) $dC_1
1564
1565 Specialising that call gives
1566 $dMonoid_2 :: Monoid [Integer]
1567 $dMonoid_2 = M.$p1C @ [Integer] $dC_1
1568
1569 $dC_2 :: C [Integer] (Node [Integer] Integer)
1570 $dC_2 = M.$fCvNode @ [Integer] $dMonoid_2
1571
1572 Now we have two calls to the imported function
1573 M.$fCvNode :: Monoid v => C v a
1574 M.$fCvNode @v @a m = C m some_fun
1575
1576 But we must /not/ use the call (M.$fCvNode @ [Integer] $dMonoid_2)
1577 for specialisation, else we get:
1578
1579 $dC_1 = M.$fCvNode @ [Integer] $dMonoid_1
1580 $dMonoid_2 = M.$p1C @ [Integer] $dC_1
1581 $s$fCvNode = C $dMonoid_2 ...
1582 RULE M.$fCvNode [Integer] _ _ = $s$fCvNode
1583
1584 Now use the rule to rewrite the call in the RHS of $dC_1
1585 and we get a loop!
1586
1587 --------------
1588 Here's yet another example
1589
1590 class C a where { foo,bar :: [a] -> [a] }
1591
1592 instance C Int where
1593 foo x = r_bar x
1594 bar xs = reverse xs
1595
1596 r_bar :: C a => [a] -> [a]
1597 r_bar xs = bar (xs ++ xs)
1598
1599 That translates to:
1600
1601 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1602
1603 Rec { $fCInt :: C Int = MkC foo_help reverse
1604 foo_help (xs::[Int]) = r_bar Int $fCInt xs }
1605
1606 The call (r_bar $fCInt) mentions $fCInt,
1607 which mentions foo_help,
1608 which mentions r_bar
1609 But we DO want to specialise r_bar at Int:
1610
1611 Rec { $fCInt :: C Int = MkC foo_help reverse
1612 foo_help (xs::[Int]) = r_bar Int $fCInt xs
1613
1614 r_bar a (c::C a) (xs::[a]) = bar a d (xs ++ xs)
1615 RULE r_bar Int _ = r_bar_Int
1616
1617 r_bar_Int xs = bar Int $fCInt (xs ++ xs)
1618 }
1619
1620 Note that, because of its RULE, r_bar joins the recursive
1621 group. (In this case it'll unravel a short moment later.)
1622
1623
1624 Note [Specialising a recursive group]
1625 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1626 Consider
1627 let rec { f x = ...g x'...
1628 ; g y = ...f y'.... }
1629 in f 'a'
1630 Here we specialise 'f' at Char; but that is very likely to lead to
1631 a specialisation of 'g' at Char. We must do the latter, else the
1632 whole point of specialisation is lost.
1633
1634 But we do not want to keep iterating to a fixpoint, because in the
1635 presence of polymorphic recursion we might generate an infinite number
1636 of specialisations.
1637
1638 So we use the following heuristic:
1639 * Arrange the rec block in dependency order, so far as possible
1640 (the occurrence analyser already does this)
1641
1642 * Specialise it much like a sequence of lets
1643
1644 * Then go through the block a second time, feeding call-info from
1645 the RHSs back in the bottom, as it were
1646
1647 In effect, the ordering maxmimises the effectiveness of each sweep,
1648 and we do just two sweeps. This should catch almost every case of
1649 monomorphic recursion -- the exception could be a very knotted-up
1650 recursion with multiple cycles tied up together.
1651
1652 This plan is implemented in the Rec case of specBindItself.
1653
1654 Note [Specialisations already covered]
1655 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1656 We obviously don't want to generate two specialisations for the same
1657 argument pattern. There are two wrinkles
1658
1659 1. We do the already-covered test in specDefn, not when we generate
1660 the CallInfo in mkCallUDs. We used to test in the latter place, but
1661 we now iterate the specialiser somewhat, and the Id at the call site
1662 might therefore not have all the RULES that we can see in specDefn
1663
1664 2. What about two specialisations where the second is an *instance*
1665 of the first? If the more specific one shows up first, we'll generate
1666 specialisations for both. If the *less* specific one shows up first,
1667 we *don't* currently generate a specialisation for the more specific
1668 one. (See the call to lookupRule in already_covered.) Reasons:
1669 (a) lookupRule doesn't say which matches are exact (bad reason)
1670 (b) if the earlier specialisation is user-provided, it's
1671 far from clear that we should auto-specialise further
1672
1673 Note [Auto-specialisation and RULES]
1674 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1675 Consider:
1676 g :: Num a => a -> a
1677 g = ...
1678
1679 f :: (Int -> Int) -> Int
1680 f w = ...
1681 {-# RULE f g = 0 #-}
1682
1683 Suppose that auto-specialisation makes a specialised version of
1684 g::Int->Int That version won't appear in the LHS of the RULE for f.
1685 So if the specialisation rule fires too early, the rule for f may
1686 never fire.
1687
1688 It might be possible to add new rules, to "complete" the rewrite system.
1689 Thus when adding
1690 RULE forall d. g Int d = g_spec
1691 also add
1692 RULE f g_spec = 0
1693
1694 But that's a bit complicated. For now we ask the programmer's help,
1695 by *copying the INLINE activation pragma* to the auto-specialised
1696 rule. So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
1697 will also not be active until phase 2. And that's what programmers
1698 should jolly well do anyway, even aside from specialisation, to ensure
1699 that g doesn't inline too early.
1700
1701 This in turn means that the RULE would never fire for a NOINLINE
1702 thing so not much point in generating a specialisation at all.
1703
1704 Note [Specialisation shape]
1705 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1706 We only specialise a function if it has visible top-level lambdas
1707 corresponding to its overloading. E.g. if
1708 f :: forall a. Eq a => ....
1709 then its body must look like
1710 f = /\a. \d. ...
1711
1712 Reason: when specialising the body for a call (f ty dexp), we want to
1713 substitute dexp for d, and pick up specialised calls in the body of f.
1714
1715 This doesn't always work. One example I came across was this:
1716 newtype Gen a = MkGen{ unGen :: Int -> a }
1717
1718 choose :: Eq a => a -> Gen a
1719 choose n = MkGen (\r -> n)
1720
1721 oneof = choose (1::Int)
1722
1723 It's a silly exapmle, but we get
1724 choose = /\a. g `cast` co
1725 where choose doesn't have any dict arguments. Thus far I have not
1726 tried to fix this (wait till there's a real example).
1727
1728 Mind you, then 'choose' will be inlined (since RHS is trivial) so
1729 it doesn't matter. This comes up with single-method classes
1730
1731 class C a where { op :: a -> a }
1732 instance C a => C [a] where ....
1733 ==>
1734 $fCList :: C a => C [a]
1735 $fCList = $copList |> (...coercion>...)
1736 ....(uses of $fCList at particular types)...
1737
1738 So we suppress the WARN if the rhs is trivial.
1739
1740 Note [Inline specialisations]
1741 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1742 Here is what we do with the InlinePragma of the original function
1743 * Activation/RuleMatchInfo: both transferred to the
1744 specialised function
1745 * InlineSpec:
1746 (a) An INLINE pragma is transferred
1747 (b) An INLINABLE pragma is *not* transferred
1748
1749 Why (a): transfer INLINE pragmas? The point of INLINE was precisely to
1750 specialise the function at its call site, and arguably that's not so
1751 important for the specialised copies. BUT *pragma-directed*
1752 specialisation now takes place in the typechecker/desugarer, with
1753 manually specified INLINEs. The specialisation here is automatic.
1754 It'd be very odd if a function marked INLINE was specialised (because
1755 of some local use), and then forever after (including importing
1756 modules) the specialised version wasn't INLINEd. After all, the
1757 programmer said INLINE!
1758
1759 You might wonder why we specialise INLINE functions at all. After
1760 all they should be inlined, right? Two reasons:
1761
1762 * Even INLINE functions are sometimes not inlined, when they aren't
1763 applied to interesting arguments. But perhaps the type arguments
1764 alone are enough to specialise (even though the args are too boring
1765 to trigger inlining), and it's certainly better to call the
1766 specialised version.
1767
1768 * The RHS of an INLINE function might call another overloaded function,
1769 and we'd like to generate a specialised version of that function too.
1770 This actually happens a lot. Consider
1771 replicateM_ :: (Monad m) => Int -> m a -> m ()
1772 {-# INLINABLE replicateM_ #-}
1773 replicateM_ d x ma = ...
1774 The strictness analyser may transform to
1775 replicateM_ :: (Monad m) => Int -> m a -> m ()
1776 {-# INLINE replicateM_ #-}
1777 replicateM_ d x ma = case x of I# x' -> $wreplicateM_ d x' ma
1778
1779 $wreplicateM_ :: (Monad m) => Int# -> m a -> m ()
1780 {-# INLINABLE $wreplicateM_ #-}
1781 $wreplicateM_ = ...
1782 Now an importing module has a specialised call to replicateM_, say
1783 (replicateM_ dMonadIO). We certainly want to specialise $wreplicateM_!
1784 This particular example had a huge effect on the call to replicateM_
1785 in nofib/shootout/n-body.
1786
1787 Why (b): discard INLINABLE pragmas? See Trac #4874 for persuasive examples.
1788 Suppose we have
1789 {-# INLINABLE f #-}
1790 f :: Ord a => [a] -> Int
1791 f xs = letrec f' = ...f'... in f'
1792 Then, when f is specialised and optimised we might get
1793 wgo :: [Int] -> Int#
1794 wgo = ...wgo...
1795 f_spec :: [Int] -> Int
1796 f_spec xs = case wgo xs of { r -> I# r }
1797 and we clearly want to inline f_spec at call sites. But if we still
1798 have the big, un-optimised of f (albeit specialised) captured in an
1799 INLINABLE pragma for f_spec, we won't get that optimisation.
1800
1801 So we simply drop INLINABLE pragmas when specialising. It's not really
1802 a complete solution; ignoring specialisation for now, INLINABLE functions
1803 don't get properly strictness analysed, for example. But it works well
1804 for examples involving specialisation, which is the dominant use of
1805 INLINABLE. See Trac #4874.
1806
1807
1808 ************************************************************************
1809 * *
1810 \subsubsection{UsageDetails and suchlike}
1811 * *
1812 ************************************************************************
1813 -}
1814
1815 data UsageDetails
1816 = MkUD {
1817 ud_binds :: !(Bag DictBind),
1818 -- See Note [Floated dictionary bindings]
1819 -- The order is important;
1820 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
1821 -- (Remember, Bags preserve order in GHC.)
1822
1823 ud_calls :: !CallDetails
1824
1825 -- INVARIANT: suppose bs = bindersOf ud_binds
1826 -- Then 'calls' may *mention* 'bs',
1827 -- but there should be no calls *for* bs
1828 }
1829
1830 -- | A 'DictBind' is a binding along with a cached set containing its free
1831 -- variables (both type variables and dictionaries)
1832 type DictBind = (CoreBind, VarSet)
1833
1834 {- Note [Floated dictionary bindings]
1835 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1836 We float out dictionary bindings for the reasons described under
1837 "Dictionary floating" above. But not /just/ dictionary bindings.
1838 Consider
1839
1840 f :: Eq a => blah
1841 f a d = rhs
1842
1843 $c== :: T -> T -> Bool
1844 $c== x y = ...
1845
1846 $df :: Eq T
1847 $df = Eq $c== ...
1848
1849 gurgle = ...(f @T $df)...
1850
1851 We gather the call info for (f @T $df), and we don't want to drop it
1852 when we come across the binding for $df. So we add $df to the floats
1853 and continue. But then we have to add $c== to the floats, and so on.
1854 These all float above the binding for 'f', and and now we can
1855 successfully specialise 'f'.
1856
1857 So the DictBinds in (ud_binds :: Bag DictBind) may contain
1858 non-dictionary bindings too.
1859 -}
1860
1861 instance Outputable UsageDetails where
1862 ppr (MkUD { ud_binds = dbs, ud_calls = calls })
1863 = text "MkUD" <+> braces (sep (punctuate comma
1864 [text "binds" <+> equals <+> ppr dbs,
1865 text "calls" <+> equals <+> ppr calls]))
1866
1867 emptyUDs :: UsageDetails
1868 emptyUDs = MkUD { ud_binds = emptyBag, ud_calls = emptyDVarEnv }
1869
1870 ------------------------------------------------------------
1871 type CallDetails = DIdEnv CallInfoSet
1872 -- The order of specialized binds and rules depends on how we linearize
1873 -- CallDetails, so to get determinism we must use a deterministic set here.
1874 -- See Note [Deterministic UniqFM] in UniqDFM
1875
1876 data CallInfoSet = CIS Id (Bag CallInfo)
1877 -- The list of types and dictionaries is guaranteed to
1878 -- match the type of f
1879 -- The Bag may contain duplicate calls (i.e. f @T and another f @T)
1880 -- These dups are eliminated by already_covered in specCalls
1881
1882 data CallInfo
1883 = CI { ci_key :: CallKey -- Type arguments
1884 , ci_args :: [DictExpr] -- Dictionary arguments
1885 , ci_fvs :: VarSet -- Free vars of the ci_key and ci_args
1886 -- call (including tyvars)
1887 -- [*not* include the main id itself, of course]
1888 }
1889
1890 newtype CallKey = CallKey [Maybe Type]
1891 -- Nothing => unconstrained type argument
1892
1893 type DictExpr = CoreExpr
1894
1895 ciSetFilter :: (CallInfo -> Bool) -> CallInfoSet -> CallInfoSet
1896 ciSetFilter p (CIS id a) = CIS id (filterBag p a)
1897
1898 instance Outputable CallInfoSet where
1899 ppr (CIS fn map) = hang (text "CIS" <+> ppr fn)
1900 2 (ppr map)
1901
1902 pprCallInfo :: Id -> CallInfo -> SDoc
1903 pprCallInfo fn (CI { ci_key = key })
1904 = ppr fn <+> ppr key
1905
1906 ppr_call_key_ty :: Maybe Type -> SDoc
1907 ppr_call_key_ty Nothing = char '_'
1908 ppr_call_key_ty (Just ty) = char '@' <+> pprParendType ty
1909
1910 instance Outputable CallKey where
1911 ppr (CallKey ts) = brackets (fsep (map ppr_call_key_ty ts))
1912
1913 instance Outputable CallInfo where
1914 ppr (CI { ci_key = key, ci_args = args, ci_fvs = fvs })
1915 = text "CI" <> braces (hsep [ ppr key, ppr args, ppr fvs ])
1916
1917 unionCalls :: CallDetails -> CallDetails -> CallDetails
1918 unionCalls c1 c2 = plusDVarEnv_C unionCallInfoSet c1 c2
1919
1920 unionCallInfoSet :: CallInfoSet -> CallInfoSet -> CallInfoSet
1921 unionCallInfoSet (CIS f calls1) (CIS _ calls2) =
1922 CIS f (calls1 `unionBags` calls2)
1923
1924 callDetailsFVs :: CallDetails -> VarSet
1925 callDetailsFVs calls =
1926 nonDetFoldUDFM (unionVarSet . callInfoFVs) emptyVarSet calls
1927 -- It's OK to use nonDetFoldUDFM here because we forget the ordering
1928 -- immediately by converting to a nondeterministic set.
1929
1930 callInfoFVs :: CallInfoSet -> VarSet
1931 callInfoFVs (CIS _ call_info) =
1932 foldrBag (\(CI { ci_fvs = fv }) vs -> unionVarSet fv vs) emptyVarSet call_info
1933
1934 ------------------------------------------------------------
1935 singleCall :: Id -> [Maybe Type] -> [DictExpr] -> UsageDetails
1936 singleCall id tys dicts
1937 = MkUD {ud_binds = emptyBag,
1938 ud_calls = unitDVarEnv id $ CIS id $
1939 unitBag (CI { ci_key = CallKey tys
1940 , ci_args = dicts
1941 , ci_fvs = call_fvs }) }
1942 where
1943 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
1944 tys_fvs = tyCoVarsOfTypes (catMaybes tys)
1945 -- The type args (tys) are guaranteed to be part of the dictionary
1946 -- types, because they are just the constrained types,
1947 -- and the dictionary is therefore sure to be bound
1948 -- inside the binding for any type variables free in the type;
1949 -- hence it's safe to neglect tyvars free in tys when making
1950 -- the free-var set for this call
1951 -- BUT I don't trust this reasoning; play safe and include tys_fvs
1952 --
1953 -- We don't include the 'id' itself.
1954
1955 mkCallUDs, mkCallUDs' :: SpecEnv -> Id -> [CoreExpr] -> UsageDetails
1956 mkCallUDs env f args
1957 = -- pprTrace "mkCallUDs" (vcat [ ppr f, ppr args, ppr res ])
1958 res
1959 where
1960 res = mkCallUDs' env f args
1961
1962 mkCallUDs' env f args
1963 | not (want_calls_for f) -- Imported from elsewhere
1964 || null theta -- Not overloaded
1965 = emptyUDs
1966
1967 | not (all type_determines_value theta)
1968 || not (spec_tys `lengthIs` n_tyvars)
1969 || not ( dicts `lengthIs` n_dicts)
1970 || not (any (interestingDict env) dicts) -- Note [Interesting dictionary arguments]
1971 -- See also Note [Specialisations already covered]
1972 = -- pprTrace "mkCallUDs: discarding" _trace_doc
1973 emptyUDs -- Not overloaded, or no specialisation wanted
1974
1975 | otherwise
1976 = -- pprTrace "mkCallUDs: keeping" _trace_doc
1977 singleCall f spec_tys dicts
1978 where
1979 _trace_doc = vcat [ppr f, ppr args, ppr n_tyvars, ppr n_dicts
1980 , ppr (map (interestingDict env) dicts)]
1981 (tyvars, theta, _) = tcSplitSigmaTy (idType f)
1982 constrained_tyvars = tyCoVarsOfTypes theta
1983 n_tyvars = length tyvars
1984 n_dicts = length theta
1985
1986 spec_tys = [mk_spec_ty tv ty | (tv, ty) <- tyvars `type_zip` args]
1987 dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
1988
1989 -- ignores Coercion arguments
1990 type_zip :: [TyVar] -> [CoreExpr] -> [(TyVar, Type)]
1991 type_zip tvs (Coercion _ : args) = type_zip tvs args
1992 type_zip (tv:tvs) (Type ty : args) = (tv, ty) : type_zip tvs args
1993 type_zip _ _ = []
1994
1995 mk_spec_ty tyvar ty
1996 | tyvar `elemVarSet` constrained_tyvars = Just ty
1997 | otherwise = Nothing
1998
1999 want_calls_for f = isLocalId f || isJust (maybeUnfoldingTemplate (realIdUnfolding f))
2000 -- For imported things, we gather call instances if
2001 -- there is an unfolding that we could in principle specialise
2002 -- We might still decide not to use it (consulting dflags)
2003 -- in specImports
2004 -- Use 'realIdUnfolding' to ignore the loop-breaker flag!
2005
2006 type_determines_value pred -- See Note [Type determines value]
2007 = case classifyPredType pred of
2008 ClassPred cls _ -> not (isIPClass cls) -- Superclasses can't be IPs
2009 EqPred {} -> True
2010 IrredPred {} -> True -- Things like (D []) where D is a
2011 -- Constraint-ranged family; Trac #7785
2012
2013 {-
2014 Note [Type determines value]
2015 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2016 Only specialise if all overloading is on non-IP *class* params,
2017 because these are the ones whose *type* determines their *value*. In
2018 parrticular, with implicit params, the type args *don't* say what the
2019 value of the implicit param is! See Trac #7101
2020
2021 However, consider
2022 type family D (v::*->*) :: Constraint
2023 type instance D [] = ()
2024 f :: D v => v Char -> Int
2025 If we see a call (f "foo"), we'll pass a "dictionary"
2026 () |> (g :: () ~ D [])
2027 and it's good to specialise f at this dictionary.
2028
2029 So the question is: can an implicit parameter "hide inside" a
2030 type-family constraint like (D a). Well, no. We don't allow
2031 type instance D Maybe = ?x:Int
2032 Hence the IrredPred case in type_determines_value.
2033 See Trac #7785.
2034
2035 Note [Interesting dictionary arguments]
2036 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2037 Consider this
2038 \a.\d:Eq a. let f = ... in ...(f d)...
2039 There really is not much point in specialising f wrt the dictionary d,
2040 because the code for the specialised f is not improved at all, because
2041 d is lambda-bound. We simply get junk specialisations.
2042
2043 What is "interesting"? Just that it has *some* structure. But what about
2044 variables?
2045
2046 * A variable might be imported, in which case its unfolding
2047 will tell us whether it has useful structure
2048
2049 * Local variables are cloned on the way down (to avoid clashes when
2050 we float dictionaries), and cloning drops the unfolding
2051 (cloneIdBndr). Moreover, we make up some new bindings, and it's a
2052 nuisance to give them unfoldings. So we keep track of the
2053 "interesting" dictionaries as a VarSet in SpecEnv.
2054 We have to take care to put any new interesting dictionary
2055 bindings in the set.
2056
2057 We accidentally lost accurate tracking of local variables for a long
2058 time, because cloned variables don't have unfoldings. But makes a
2059 massive difference in a few cases, eg Trac #5113. For nofib as a
2060 whole it's only a small win: 2.2% improvement in allocation for ansi,
2061 1.2% for bspt, but mostly 0.0! Average 0.1% increase in binary size.
2062 -}
2063
2064 interestingDict :: SpecEnv -> CoreExpr -> Bool
2065 -- A dictionary argument is interesting if it has *some* structure
2066 -- NB: "dictionary" arguments include constraints of all sorts,
2067 -- including equality constraints; hence the Coercion case
2068 interestingDict env (Var v) = hasSomeUnfolding (idUnfolding v)
2069 || isDataConWorkId v
2070 || v `elemVarSet` se_interesting env
2071 interestingDict _ (Type _) = False
2072 interestingDict _ (Coercion _) = False
2073 interestingDict env (App fn (Type _)) = interestingDict env fn
2074 interestingDict env (App fn (Coercion _)) = interestingDict env fn
2075 interestingDict env (Tick _ a) = interestingDict env a
2076 interestingDict env (Cast e _) = interestingDict env e
2077 interestingDict _ _ = True
2078
2079 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
2080 plusUDs (MkUD {ud_binds = db1, ud_calls = calls1})
2081 (MkUD {ud_binds = db2, ud_calls = calls2})
2082 = MkUD { ud_binds = db1 `unionBags` db2
2083 , ud_calls = calls1 `unionCalls` calls2 }
2084
2085 -----------------------------
2086 _dictBindBndrs :: Bag DictBind -> [Id]
2087 _dictBindBndrs dbs = foldrBag ((++) . bindersOf . fst) [] dbs
2088
2089 -- | Construct a 'DictBind' from a 'CoreBind'
2090 mkDB :: CoreBind -> DictBind
2091 mkDB bind = (bind, bind_fvs bind)
2092
2093 -- | Identify the free variables of a 'CoreBind'
2094 bind_fvs :: CoreBind -> VarSet
2095 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
2096 bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs
2097 where
2098 bndrs = map fst prs
2099 rhs_fvs = unionVarSets (map pair_fvs prs)
2100
2101 pair_fvs :: (Id, CoreExpr) -> VarSet
2102 pair_fvs (bndr, rhs) = exprSomeFreeVars interesting rhs
2103 `unionVarSet` idFreeVars bndr
2104 -- idFreeVars: don't forget variables mentioned in
2105 -- the rules of the bndr. C.f. OccAnal.addRuleUsage
2106 -- Also tyvars mentioned in its type; they may not appear
2107 -- in the RHS
2108 -- type T a = Int
2109 -- x :: T a = 3
2110 where
2111 interesting :: InterestingVarFun
2112 interesting v = isLocalVar v || (isId v && isDFunId v)
2113 -- Very important: include DFunIds /even/ if it is imported
2114 -- Reason: See Note [Avoiding loops], the second exmaple
2115 -- involving an imported dfun. We must know whether
2116 -- a dictionary binding depends on an imported dfun,
2117 -- in case we try to specialise that imported dfun
2118 -- Trac #13429 illustrates
2119
2120 -- | Flatten a set of "dumped" 'DictBind's, and some other binding
2121 -- pairs, into a single recursive binding.
2122 recWithDumpedDicts :: [(Id,CoreExpr)] -> Bag DictBind ->DictBind
2123 recWithDumpedDicts pairs dbs
2124 = (Rec bindings, fvs)
2125 where
2126 (bindings, fvs) = foldrBag add
2127 ([], emptyVarSet)
2128 (dbs `snocBag` mkDB (Rec pairs))
2129 add (NonRec b r, fvs') (pairs, fvs) =
2130 ((b,r) : pairs, fvs `unionVarSet` fvs')
2131 add (Rec prs1, fvs') (pairs, fvs) =
2132 (prs1 ++ pairs, fvs `unionVarSet` fvs')
2133
2134 snocDictBinds :: UsageDetails -> [DictBind] -> UsageDetails
2135 -- Add ud_binds to the tail end of the bindings in uds
2136 snocDictBinds uds dbs
2137 = uds { ud_binds = ud_binds uds `unionBags` listToBag dbs }
2138
2139 consDictBind :: DictBind -> UsageDetails -> UsageDetails
2140 consDictBind bind uds = uds { ud_binds = bind `consBag` ud_binds uds }
2141
2142 addDictBinds :: [DictBind] -> UsageDetails -> UsageDetails
2143 addDictBinds binds uds = uds { ud_binds = listToBag binds `unionBags` ud_binds uds }
2144
2145 snocDictBind :: UsageDetails -> DictBind -> UsageDetails
2146 snocDictBind uds bind = uds { ud_binds = ud_binds uds `snocBag` bind }
2147
2148 wrapDictBinds :: Bag DictBind -> [CoreBind] -> [CoreBind]
2149 wrapDictBinds dbs binds
2150 = foldrBag add binds dbs
2151 where
2152 add (bind,_) binds = bind : binds
2153
2154 wrapDictBindsE :: Bag DictBind -> CoreExpr -> CoreExpr
2155 wrapDictBindsE dbs expr
2156 = foldrBag add expr dbs
2157 where
2158 add (bind,_) expr = Let bind expr
2159
2160 ----------------------
2161 dumpUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind)
2162 -- Used at a lambda or case binder; just dump anything mentioning the binder
2163 dumpUDs bndrs uds@(MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2164 | null bndrs = (uds, emptyBag) -- Common in case alternatives
2165 | otherwise = -- pprTrace "dumpUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2166 (free_uds, dump_dbs)
2167 where
2168 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2169 bndr_set = mkVarSet bndrs
2170 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2171 free_calls = deleteCallsMentioning dump_set $ -- Drop calls mentioning bndr_set on the floor
2172 deleteCallsFor bndrs orig_calls -- Discard calls for bndr_set; there should be
2173 -- no calls for any of the dicts in dump_dbs
2174
2175 dumpBindUDs :: [CoreBndr] -> UsageDetails -> (UsageDetails, Bag DictBind, Bool)
2176 -- Used at a let(rec) binding.
2177 -- We return a boolean indicating whether the binding itself is mentioned
2178 -- is mentioned, directly or indirectly, by any of the ud_calls; in that
2179 -- case we want to float the binding itself;
2180 -- See Note [Floated dictionary bindings]
2181 dumpBindUDs bndrs (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2182 = -- pprTrace "dumpBindUDs" (ppr bndrs $$ ppr free_uds $$ ppr dump_dbs) $
2183 (free_uds, dump_dbs, float_all)
2184 where
2185 free_uds = MkUD { ud_binds = free_dbs, ud_calls = free_calls }
2186 bndr_set = mkVarSet bndrs
2187 (free_dbs, dump_dbs, dump_set) = splitDictBinds orig_dbs bndr_set
2188 free_calls = deleteCallsFor bndrs orig_calls
2189 float_all = dump_set `intersectsVarSet` callDetailsFVs free_calls
2190
2191 callsForMe :: Id -> UsageDetails -> (UsageDetails, [CallInfo])
2192 callsForMe fn (MkUD { ud_binds = orig_dbs, ud_calls = orig_calls })
2193 = -- pprTrace ("callsForMe")
2194 -- (vcat [ppr fn,
2195 -- text "Orig dbs =" <+> ppr (_dictBindBndrs orig_dbs),
2196 -- text "Orig calls =" <+> ppr orig_calls,
2197 -- text "Dep set =" <+> ppr dep_set,
2198 -- text "Calls for me =" <+> ppr calls_for_me]) $
2199 (uds_without_me, calls_for_me)
2200 where
2201 uds_without_me = MkUD { ud_binds = orig_dbs
2202 , ud_calls = delDVarEnv orig_calls fn }
2203 calls_for_me = case lookupDVarEnv orig_calls fn of
2204 Nothing -> []
2205 Just cis -> filterCalls cis orig_dbs
2206 -- filterCalls: drop calls that (directly or indirectly)
2207 -- refer to fn. See Note [Avoiding loops]
2208
2209 ----------------------
2210 filterCalls :: CallInfoSet -> Bag DictBind -> [CallInfo]
2211 -- See Note [Avoiding loops]
2212 filterCalls (CIS fn call_bag) dbs
2213 = filter ok_call (bagToList call_bag)
2214 where
2215 dump_set = foldlBag go (unitVarSet fn) dbs
2216 -- This dump-set could also be computed by splitDictBinds
2217 -- (_,_,dump_set) = splitDictBinds dbs {fn}
2218 -- But this variant is shorter
2219
2220 go so_far (db,fvs) | fvs `intersectsVarSet` so_far
2221 = extendVarSetList so_far (bindersOf db)
2222 | otherwise = so_far
2223
2224 ok_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` dump_set)
2225
2226 ----------------------
2227 splitDictBinds :: Bag DictBind -> IdSet -> (Bag DictBind, Bag DictBind, IdSet)
2228 -- splitDictBinds dbs bndrs returns
2229 -- (free_dbs, dump_dbs, dump_set)
2230 -- where
2231 -- * dump_dbs depends, transitively on bndrs
2232 -- * free_dbs does not depend on bndrs
2233 -- * dump_set = bndrs `union` bndrs(dump_dbs)
2234 splitDictBinds dbs bndr_set
2235 = foldlBag split_db (emptyBag, emptyBag, bndr_set) dbs
2236 -- Important that it's foldl not foldr;
2237 -- we're accumulating the set of dumped ids in dump_set
2238 where
2239 split_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
2240 | dump_idset `intersectsVarSet` fvs -- Dump it
2241 = (free_dbs, dump_dbs `snocBag` db,
2242 extendVarSetList dump_idset (bindersOf bind))
2243
2244 | otherwise -- Don't dump it
2245 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
2246
2247
2248 ----------------------
2249 deleteCallsMentioning :: VarSet -> CallDetails -> CallDetails
2250 -- Remove calls *mentioning* bs in any way
2251 deleteCallsMentioning bs calls
2252 = mapDVarEnv (ciSetFilter keep_call) calls
2253 where
2254 keep_call (CI { ci_fvs = fvs }) = not (fvs `intersectsVarSet` bs)
2255
2256 deleteCallsFor :: [Id] -> CallDetails -> CallDetails
2257 -- Remove calls *for* bs
2258 deleteCallsFor bs calls = delDVarEnvList calls bs
2259
2260 {-
2261 ************************************************************************
2262 * *
2263 \subsubsection{Boring helper functions}
2264 * *
2265 ************************************************************************
2266 -}
2267
2268 newtype SpecM a = SpecM (State SpecState a)
2269
2270 data SpecState = SpecState {
2271 spec_uniq_supply :: UniqSupply,
2272 spec_module :: Module,
2273 spec_dflags :: DynFlags
2274 }
2275
2276 instance Functor SpecM where
2277 fmap = liftM
2278
2279 instance Applicative SpecM where
2280 pure x = SpecM $ return x
2281 (<*>) = ap
2282
2283 instance Monad SpecM where
2284 SpecM x >>= f = SpecM $ do y <- x
2285 case f y of
2286 SpecM z ->
2287 z
2288 fail = MonadFail.fail
2289
2290 instance MonadFail.MonadFail SpecM where
2291 fail str = SpecM $ fail str
2292
2293 instance MonadUnique SpecM where
2294 getUniqueSupplyM
2295 = SpecM $ do st <- get
2296 let (us1, us2) = splitUniqSupply $ spec_uniq_supply st
2297 put $ st { spec_uniq_supply = us2 }
2298 return us1
2299
2300 getUniqueM
2301 = SpecM $ do st <- get
2302 let (u,us') = takeUniqFromSupply $ spec_uniq_supply st
2303 put $ st { spec_uniq_supply = us' }
2304 return u
2305
2306 instance HasDynFlags SpecM where
2307 getDynFlags = SpecM $ liftM spec_dflags get
2308
2309 instance HasModule SpecM where
2310 getModule = SpecM $ liftM spec_module get
2311
2312 runSpecM :: DynFlags -> Module -> SpecM a -> CoreM a
2313 runSpecM dflags this_mod (SpecM spec)
2314 = do us <- getUniqueSupplyM
2315 let initialState = SpecState {
2316 spec_uniq_supply = us,
2317 spec_module = this_mod,
2318 spec_dflags = dflags
2319 }
2320 return $ evalState spec initialState
2321
2322 mapAndCombineSM :: (a -> SpecM (b, UsageDetails)) -> [a] -> SpecM ([b], UsageDetails)
2323 mapAndCombineSM _ [] = return ([], emptyUDs)
2324 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
2325 (ys, uds2) <- mapAndCombineSM f xs
2326 return (y:ys, uds1 `plusUDs` uds2)
2327
2328 extendTvSubstList :: SpecEnv -> [(TyVar,Type)] -> SpecEnv
2329 extendTvSubstList env tv_binds
2330 = env { se_subst = CoreSubst.extendTvSubstList (se_subst env) tv_binds }
2331
2332 substTy :: SpecEnv -> Type -> Type
2333 substTy env ty = CoreSubst.substTy (se_subst env) ty
2334
2335 substCo :: SpecEnv -> Coercion -> Coercion
2336 substCo env co = CoreSubst.substCo (se_subst env) co
2337
2338 substBndr :: SpecEnv -> CoreBndr -> (SpecEnv, CoreBndr)
2339 substBndr env bs = case CoreSubst.substBndr (se_subst env) bs of
2340 (subst', bs') -> (env { se_subst = subst' }, bs')
2341
2342 substBndrs :: SpecEnv -> [CoreBndr] -> (SpecEnv, [CoreBndr])
2343 substBndrs env bs = case CoreSubst.substBndrs (se_subst env) bs of
2344 (subst', bs') -> (env { se_subst = subst' }, bs')
2345
2346 cloneBindSM :: SpecEnv -> CoreBind -> SpecM (SpecEnv, SpecEnv, CoreBind)
2347 -- Clone the binders of the bind; return new bind with the cloned binders
2348 -- Return the substitution to use for RHSs, and the one to use for the body
2349 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (NonRec bndr rhs)
2350 = do { us <- getUniqueSupplyM
2351 ; let (subst', bndr') = CoreSubst.cloneIdBndr subst us bndr
2352 interesting' | interestingDict env rhs
2353 = interesting `extendVarSet` bndr'
2354 | otherwise = interesting
2355 ; return (env, env { se_subst = subst', se_interesting = interesting' }
2356 , NonRec bndr' rhs) }
2357
2358 cloneBindSM env@(SE { se_subst = subst, se_interesting = interesting }) (Rec pairs)
2359 = do { us <- getUniqueSupplyM
2360 ; let (subst', bndrs') = CoreSubst.cloneRecIdBndrs subst us (map fst pairs)
2361 env' = env { se_subst = subst'
2362 , se_interesting = interesting `extendVarSetList`
2363 [ v | (v,r) <- pairs, interestingDict env r ] }
2364 ; return (env', env', Rec (bndrs' `zip` map snd pairs)) }
2365
2366 newDictBndr :: SpecEnv -> CoreBndr -> SpecM CoreBndr
2367 -- Make up completely fresh binders for the dictionaries
2368 -- Their bindings are going to float outwards
2369 newDictBndr env b = do { uniq <- getUniqueM
2370 ; let n = idName b
2371 ty' = substTy env (idType b)
2372 ; return (mkUserLocalOrCoVar (nameOccName n) uniq ty' (getSrcSpan n)) }
2373
2374 newSpecIdSM :: Id -> Type -> Maybe JoinArity -> SpecM Id
2375 -- Give the new Id a similar occurrence name to the old one
2376 newSpecIdSM old_id new_ty join_arity_maybe
2377 = do { uniq <- getUniqueM
2378 ; let name = idName old_id
2379 new_occ = mkSpecOcc (nameOccName name)
2380 new_id = mkUserLocalOrCoVar new_occ uniq new_ty (getSrcSpan name)
2381 `asJoinId_maybe` join_arity_maybe
2382 ; return new_id }
2383
2384 {-
2385 Old (but interesting) stuff about unboxed bindings
2386 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2387
2388 What should we do when a value is specialised to a *strict* unboxed value?
2389
2390 map_*_* f (x:xs) = let h = f x
2391 t = map f xs
2392 in h:t
2393
2394 Could convert let to case:
2395
2396 map_*_Int# f (x:xs) = case f x of h# ->
2397 let t = map f xs
2398 in h#:t
2399
2400 This may be undesirable since it forces evaluation here, but the value
2401 may not be used in all branches of the body. In the general case this
2402 transformation is impossible since the mutual recursion in a letrec
2403 cannot be expressed as a case.
2404
2405 There is also a problem with top-level unboxed values, since our
2406 implementation cannot handle unboxed values at the top level.
2407
2408 Solution: Lift the binding of the unboxed value and extract it when it
2409 is used:
2410
2411 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
2412 t = map f xs
2413 in case h of
2414 _Lift h# -> h#:t
2415
2416 Now give it to the simplifier and the _Lifting will be optimised away.
2417
2418 The benefit is that we have given the specialised "unboxed" values a
2419 very simple lifted semantics and then leave it up to the simplifier to
2420 optimise it --- knowing that the overheads will be removed in nearly
2421 all cases.
2422
2423 In particular, the value will only be evaluated in the branches of the
2424 program which use it, rather than being forced at the point where the
2425 value is bound. For example:
2426
2427 filtermap_*_* p f (x:xs)
2428 = let h = f x
2429 t = ...
2430 in case p x of
2431 True -> h:t
2432 False -> t
2433 ==>
2434 filtermap_*_Int# p f (x:xs)
2435 = let h = case (f x) of h# -> _Lift h#
2436 t = ...
2437 in case p x of
2438 True -> case h of _Lift h#
2439 -> h#:t
2440 False -> t
2441
2442 The binding for h can still be inlined in the one branch and the
2443 _Lifting eliminated.
2444
2445
2446 Question: When won't the _Lifting be eliminated?
2447
2448 Answer: When they at the top-level (where it is necessary) or when
2449 inlining would duplicate work (or possibly code depending on
2450 options). However, the _Lifting will still be eliminated if the
2451 strictness analyser deems the lifted binding strict.
2452 -}