Don't shortcut SRTs for static functions (#15544)
[ghc.git] / compiler / cmm / CmmBuildInfoTables.hs
1 {-# LANGUAGE GADTs, BangPatterns, RecordWildCards,
2 GeneralizedNewtypeDeriving, NondecreasingIndentation, TupleSections #-}
3
4 module CmmBuildInfoTables
5 ( CAFSet, CAFEnv, cafAnal
6 , doSRTs, ModuleSRTInfo, emptySRT
7 ) where
8
9 import GhcPrelude hiding (succ)
10
11 import Id
12 import BlockId
13 import Hoopl.Block
14 import Hoopl.Graph
15 import Hoopl.Label
16 import Hoopl.Collections
17 import Hoopl.Dataflow
18 import Module
19 import Platform
20 import Digraph
21 import CLabel
22 import PprCmmDecl ()
23 import Cmm
24 import CmmUtils
25 import DynFlags
26 import Maybes
27 import Outputable
28 import SMRep
29 import UniqSupply
30 import CostCentre
31 import StgCmmHeap
32
33 import PprCmm()
34 import Control.Monad
35 import Data.Map (Map)
36 import qualified Data.Map as Map
37 import Data.Set (Set)
38 import qualified Data.Set as Set
39 import Data.Tuple
40 import Control.Monad.Trans.State
41 import Control.Monad.Trans.Class
42
43
44 {- Note [SRTs]
45
46 SRTs are the mechanism by which the garbage collector can determine
47 the live CAFs in the program.
48
49 Representation
50 ^^^^^^^^^^^^^^
51
52 +------+
53 | info |
54 | | +-----+---+---+---+
55 | -------->|SRT_2| | | | | 0 |
56 |------| +-----+-|-+-|-+---+
57 | | | |
58 | code | | |
59 | | v v
60
61 An SRT is simply an object in the program's data segment. It has the
62 same representation as a static constructor. There are 16
63 pre-compiled SRT info tables: stg_SRT_1_info, .. stg_SRT_16_info,
64 representing SRT objects with 1-16 pointers, respectively.
65
66 The entries of an SRT object point to static closures, which are either
67 - FUN_STATIC, THUNK_STATIC or CONSTR
68 - Another SRT (actually just a CONSTR)
69
70 The final field of the SRT is the static link field, used by the
71 garbage collector to chain together static closures that it visits and
72 to determine whether a static closure has been visited or not. (see
73 Note [STATIC_LINK fields])
74
75 By traversing the transitive closure of an SRT, the GC will reach all
76 of the CAFs that are reachable from the code associated with this SRT.
77
78 If we need to create an SRT with more than 16 entries, we build a
79 chain of SRT objects with all but the last having 16 entries.
80
81 +-----+---+- -+---+---+
82 |SRT16| | | | | | 0 |
83 +-----+-|-+- -+-|-+---+
84 | |
85 v v
86 +----+---+---+---+
87 |SRT2| | | | | 0 |
88 +----+-|-+-|-+---+
89 | |
90 | |
91 v v
92
93 Referring to an SRT from the info table
94 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
95
96 The following things have SRTs:
97
98 - Static functions (FUN)
99 - Static thunks (THUNK), ie. CAFs
100 - Continuations (RET_SMALL, etc.)
101
102 In each case, the info table points to the SRT.
103
104 - info->srt is zero if there's no SRT, otherwise:
105 - info->srt == 1 and info->f.srt_offset points to the SRT
106
107 e.g. for a FUN with an SRT:
108
109 StgFunInfoTable +------+
110 info->f.srt_offset | ------------> offset to SRT object
111 StgStdInfoTable +------+
112 info->layout.ptrs | ... |
113 info->layout.nptrs | ... |
114 info->srt | 1 |
115 info->type | ... |
116 |------|
117
118 On x86_64, we optimise the info table representation further. The
119 offset to the SRT can be stored in 32 bits (all code lives within a
120 2GB region in x86_64's small memory model), so we can save a word in
121 the info table by storing the srt_offset in the srt field, which is
122 half a word.
123
124 On x86_64 with TABLES_NEXT_TO_CODE (except on MachO, due to #15169):
125
126 - info->srt is zero if there's no SRT, otherwise:
127 - info->srt is an offset from the info pointer to the SRT object
128
129 StgStdInfoTable +------+
130 info->layout.ptrs | |
131 info->layout.nptrs | |
132 info->srt | ------------> offset to SRT object
133 |------|
134
135
136 EXAMPLE
137 ^^^^^^^
138
139 f = \x. ... g ...
140 where
141 g = \y. ... h ... c1 ...
142 h = \z. ... c2 ...
143
144 c1 & c2 are CAFs
145
146 g and h are local functions, but they have no static closures. When
147 we generate code for f, we start with a CmmGroup of four CmmDecls:
148
149 [ f_closure, f_entry, g_entry, h_entry ]
150
151 we process each CmmDecl separately in cpsTop, giving us a list of
152 CmmDecls. e.g. for f_entry, we might end up with
153
154 [ f_entry, f1_ret, f2_proc ]
155
156 where f1_ret is a return point, and f2_proc is a proc-point. We have
157 a CAFSet for each of these CmmDecls, let's suppose they are
158
159 [ f_entry{g_info}, f1_ret{g_info}, f2_proc{} ]
160 [ g_entry{h_info, c1_closure} ]
161 [ h_entry{c2_closure} ]
162
163 Next, we make an SRT for each of these functions:
164
165 f_srt : [g_info]
166 g_srt : [h_info, c1_closure]
167 h_srt : [c2_closure]
168
169 Now, for g_info and h_info, we want to refer to the SRTs for g and h
170 respectively, which we'll label g_srt and h_srt:
171
172 f_srt : [g_srt]
173 g_srt : [h_srt, c1_closure]
174 h_srt : [c2_closure]
175
176 Now, when an SRT has a single entry, we don't actually generate an SRT
177 closure for it, instead we just replace references to it with its
178 single element. So, since h_srt == c2_closure, we have
179
180 f_srt : [g_srt]
181 g_srt : [c2_closure, c1_closure]
182 h_srt : [c2_closure]
183
184 and the only SRT closure we generate is
185
186 g_srt = SRT_2 [c2_closure, c1_closure]
187
188
189 Optimisations
190 ^^^^^^^^^^^^^
191
192 To reduce the code size overhead and the cost of traversing SRTs in
193 the GC, we want to simplify SRTs where possible. We therefore apply
194 the following optimisations. Each has a [keyword]; search for the
195 keyword in the code below to see where the optimisation is
196 implemented.
197
198 1. [Inline] we never create an SRT with a single entry, instead we
199 point to the single entry directly from the info table.
200
201 i.e. instead of
202
203 +------+
204 | info |
205 | | +-----+---+---+
206 | -------->|SRT_1| | | 0 |
207 |------| +-----+-|-+---+
208 | | |
209 | code | |
210 | | v
211 C
212
213 we can point directly to the closure:
214
215 +------+
216 | info |
217 | |
218 | -------->C
219 |------|
220 | |
221 | code |
222 | |
223
224
225 Furthermore, the SRT for any code that refers to this info table
226 can point directly to C.
227
228 The exception to this is when we're doing dynamic linking. In that
229 case, if the closure is not locally defined then we can't point to
230 it directly from the info table, because this is the text section
231 which cannot contain runtime relocations. In this case we skip this
232 optimisation and generate the singleton SRT, becase SRTs are in the
233 data section and *can* have relocatable references.
234
235 2. [FUN] A static function closure can also be an SRT, we simply put
236 the SRT entries as fields in the static closure. This makes a lot
237 of sense: the static references are just like the free variables of
238 the FUN closure.
239
240 i.e. instead of
241
242 f_closure:
243 +-----+---+
244 | | | 0 |
245 +- |--+---+
246 | +------+
247 | | info | f_srt:
248 | | | +-----+---+---+---+
249 | | -------->|SRT_2| | | | + 0 |
250 `----------->|------| +-----+-|-+-|-+---+
251 | | | |
252 | code | | |
253 | | v v
254
255
256 We can generate:
257
258 f_closure:
259 +-----+---+---+---+
260 | | | | | | | 0 |
261 +- |--+-|-+-|-+---+
262 | | | +------+
263 | v v | info |
264 | | |
265 | | 0 |
266 `----------->|------|
267 | |
268 | code |
269 | |
270
271
272 (note: we can't do this for THUNKs, because the thunk gets
273 overwritten when it is entered, so we wouldn't be able to share
274 this SRT with other info tables that want to refer to it (see
275 [Common] below). FUNs are immutable so don't have this problem.)
276
277 3. [Common] Identical SRTs can be commoned up.
278
279 4. [Filter] If an SRT A refers to an SRT B and a closure C, and B also
280 refers to C (perhaps transitively), then we can omit the reference
281 to C from A.
282
283
284 Note that there are many other optimisations that we could do, but
285 aren't implemented. In general, we could omit any reference from an
286 SRT if everything reachable from it is also reachable from the other
287 fields in the SRT. Our [Filter] optimisation is a special case of
288 this.
289
290 Another opportunity we don't exploit is this:
291
292 A = {X,Y,Z}
293 B = {Y,Z}
294 C = {X,B}
295
296 Here we could use C = {A} and therefore [Inline] C = A.
297 -}
298
299 -- ---------------------------------------------------------------------
300 {- Note [Invalid optimisation: shortcutting]
301
302 You might think that if we have something like
303
304 A's SRT = {B}
305 B's SRT = {X}
306
307 that we could replace the reference to B in A's SRT with X.
308
309 A's SRT = {X}
310 B's SRT = {X}
311
312 and thereby perhaps save a little work at runtime, because we don't
313 have to visit B.
314
315 But this is NOT valid.
316
317 Consider these cases:
318
319 0. B can't be a constructor, because constructors don't have SRTs
320
321 1. B is a CAF. This is the easy one. Obviously we want A's SRT to
322 point to B, so that it keeps B alive.
323
324 2. B is a function. This is the tricky one. The reason we can't
325 shortcut in this case is that we aren't allowed to resurrect static
326 objects.
327
328 == How does this cause a problem? ==
329
330 The particular case that cropped up when we tried this was #15544.
331 - A is a thunk
332 - B is a static function
333 - X is a CAF
334 - suppose we GC when A is alive, and B is not otherwise reachable.
335 - B is "collected", meaning that it doesn't make it onto the static
336 objects list during this GC, but nothing bad happens yet.
337 - Next, suppose we enter A, and then call B. (remember that A refers to B)
338 At the entry point to B, we GC. This puts B on the stack, as part of the
339 RET_FUN stack frame that gets pushed when we GC at a function entry point.
340 - This GC will now reach B
341 - But because B was previous "collected", it breaks the assumption
342 that static objects are never resurrected. See Note [STATIC_LINK
343 fields] in rts/sm/Storage.h for why this is bad.
344 - In practice, the GC thinks that B has already been visited, and so
345 doesn't visit X, and catastrophe ensues.
346
347 == Isn't this caused by the RET_FUN business? ==
348
349 Maybe, but could you prove that RET_FUN is the only way that
350 resurrection can occur?
351
352 So, no shortcutting.
353 -}
354
355 -- ---------------------------------------------------------------------
356 -- Label types
357
358 -- Labels that come from cafAnal can be:
359 -- - _closure labels for static functions or CAFs
360 -- - _info labels for dynamic functions, thunks, or continuations
361 -- - _entry labels for functions or thunks
362 --
363 -- Meanwhile the labels on top-level blocks are _entry labels.
364 --
365 -- To put everything in the same namespace we convert all labels to
366 -- closure labels using toClosureLbl. Note that some of these
367 -- labels will not actually exist; that's ok because we're going to
368 -- map them to SRTEntry later, which ranges over labels that do exist.
369 --
370 newtype CAFLabel = CAFLabel CLabel
371 deriving (Eq,Ord,Outputable)
372
373 type CAFSet = Set CAFLabel
374 type CAFEnv = LabelMap CAFSet
375
376 mkCAFLabel :: CLabel -> CAFLabel
377 mkCAFLabel lbl = CAFLabel (toClosureLbl lbl)
378
379 -- This is a label that we can put in an SRT. It *must* be a closure label,
380 -- pointing to either a FUN_STATIC, THUNK_STATIC, or CONSTR.
381 newtype SRTEntry = SRTEntry CLabel
382 deriving (Eq, Ord, Outputable)
383
384 -- ---------------------------------------------------------------------
385 -- CAF analysis
386
387 -- |
388 -- For each code block:
389 -- - collect the references reachable from this code block to FUN,
390 -- THUNK or RET labels for which hasCAF == True
391 --
392 -- This gives us a `CAFEnv`: a mapping from code block to sets of labels
393 --
394 cafAnal
395 :: LabelSet -- The blocks representing continuations, ie. those
396 -- that will get RET info tables. These labels will
397 -- get their own SRTs, so we don't aggregate CAFs from
398 -- references to these labels, we just use the label.
399 -> CLabel -- The top label of the proc
400 -> CmmGraph
401 -> CAFEnv
402 cafAnal contLbls topLbl cmmGraph =
403 analyzeCmmBwd cafLattice
404 (cafTransfers contLbls (g_entry cmmGraph) topLbl) cmmGraph mapEmpty
405
406
407 cafLattice :: DataflowLattice CAFSet
408 cafLattice = DataflowLattice Set.empty add
409 where
410 add (OldFact old) (NewFact new) =
411 let !new' = old `Set.union` new
412 in changedIf (Set.size new' > Set.size old) new'
413
414
415 cafTransfers :: LabelSet -> Label -> CLabel -> TransferFun CAFSet
416 cafTransfers contLbls entry topLbl
417 (BlockCC eNode middle xNode) fBase =
418 let joined = cafsInNode xNode $! live'
419 !result = foldNodesBwdOO cafsInNode middle joined
420
421 facts = mapMaybe successorFact (successors xNode)
422 live' = joinFacts cafLattice facts
423
424 successorFact s
425 -- If this is a loop back to the entry, we can refer to the
426 -- entry label.
427 | s == entry = Just (add topLbl Set.empty)
428 -- If this is a continuation, we want to refer to the
429 -- SRT for the continuation's info table
430 | s `setMember` contLbls
431 = Just (Set.singleton (mkCAFLabel (infoTblLbl s)))
432 -- Otherwise, takes the CAF references from the destination
433 | otherwise
434 = lookupFact s fBase
435
436 cafsInNode :: CmmNode e x -> CAFSet -> CAFSet
437 cafsInNode node set = foldExpDeep addCaf node set
438
439 addCaf expr !set =
440 case expr of
441 CmmLit (CmmLabel c) -> add c set
442 CmmLit (CmmLabelOff c _) -> add c set
443 CmmLit (CmmLabelDiffOff c1 c2 _ _) -> add c1 $! add c2 set
444 _ -> set
445 add l s | hasCAF l = Set.insert (mkCAFLabel l) s
446 | otherwise = s
447
448 in mapSingleton (entryLabel eNode) result
449
450
451 -- -----------------------------------------------------------------------------
452 -- ModuleSRTInfo
453
454 data ModuleSRTInfo = ModuleSRTInfo
455 { thisModule :: Module
456 -- ^ Current module being compiled. Required for calling labelDynamic.
457 , dedupSRTs :: Map (Set SRTEntry) SRTEntry
458 -- ^ previous SRTs we've emitted, so we can de-duplicate.
459 -- Used to implement the [Common] optimisation.
460 , flatSRTs :: Map SRTEntry (Set SRTEntry)
461 -- ^ The reverse mapping, so that we can remove redundant
462 -- entries. e.g. if we have an SRT [a,b,c], and we know that b
463 -- points to [c,d], we can omit c and emit [a,b].
464 -- Used to implement the [Filter] optimisation.
465 }
466 instance Outputable ModuleSRTInfo where
467 ppr ModuleSRTInfo{..} =
468 text "ModuleSRTInfo:" <+> ppr dedupSRTs <+> ppr flatSRTs
469
470 emptySRT :: Module -> ModuleSRTInfo
471 emptySRT mod =
472 ModuleSRTInfo
473 { thisModule = mod
474 , dedupSRTs = Map.empty
475 , flatSRTs = Map.empty }
476
477 -- -----------------------------------------------------------------------------
478 -- Constructing SRTs
479
480 {- Implementation notes
481
482 - In each CmmDecl there is a mapping info_tbls from Label -> CmmInfoTable
483
484 - The entry in info_tbls corresponding to g_entry is the closure info
485 table, the rest are continuations.
486
487 - Each entry in info_tbls possibly needs an SRT. We need to make a
488 label for each of these.
489
490 - We get the CAFSet for each entry from the CAFEnv
491
492 -}
493
494 -- | Return a (Label,CLabel) pair for each labelled block of a CmmDecl,
495 -- where the label is
496 -- - the info label for a continuation or dynamic closure
497 -- - the closure label for a top-level function (not a CAF)
498 getLabelledBlocks :: CmmDecl -> [(Label, CAFLabel)]
499 getLabelledBlocks (CmmData _ _) = []
500 getLabelledBlocks (CmmProc top_info _ _ _) =
501 [ (blockId, mkCAFLabel (cit_lbl info))
502 | (blockId, info) <- mapToList (info_tbls top_info)
503 , let rep = cit_rep info
504 , not (isStaticRep rep) || not (isThunkRep rep)
505 ]
506
507
508 -- | Put the labelled blocks that we will be annotating with SRTs into
509 -- dependency order. This is so that we can process them one at a
510 -- time, resolving references to earlier blocks to point to their
511 -- SRTs. CAFs themselves are not included here; see getCAFs below.
512 depAnalSRTs
513 :: CAFEnv
514 -> [CmmDecl]
515 -> [SCC (Label, CAFLabel, Set CAFLabel)]
516 depAnalSRTs cafEnv decls =
517 srtTrace "depAnalSRTs" (ppr graph) graph
518 where
519 labelledBlocks = concatMap getLabelledBlocks decls
520 labelToBlock = Map.fromList (map swap labelledBlocks)
521 graph = stronglyConnCompFromEdgedVerticesOrd
522 [ let cafs' = Set.delete lbl cafs in
523 DigraphNode (l,lbl,cafs') l
524 (mapMaybe (flip Map.lookup labelToBlock) (Set.toList cafs'))
525 | (l, lbl) <- labelledBlocks
526 , Just cafs <- [mapLookup l cafEnv] ]
527
528
529 -- | Get (Label, CAFLabel, Set CAFLabel) for each block that represents a CAF.
530 -- These are treated differently from other labelled blocks:
531 -- - we never shortcut a reference to a CAF to the contents of its
532 -- SRT, since the point of SRTs is to keep CAFs alive.
533 -- - CAFs therefore don't take part in the dependency analysis in depAnalSRTs.
534 -- instead we generate their SRTs after everything else.
535 getCAFs :: CAFEnv -> [CmmDecl] -> [(Label, CAFLabel, Set CAFLabel)]
536 getCAFs cafEnv decls =
537 [ (g_entry g, mkCAFLabel topLbl, cafs)
538 | CmmProc top_info topLbl _ g <- decls
539 , Just info <- [mapLookup (g_entry g) (info_tbls top_info)]
540 , let rep = cit_rep info
541 , isStaticRep rep && isThunkRep rep
542 , Just cafs <- [mapLookup (g_entry g) cafEnv]
543 ]
544
545
546 -- | Get the list of blocks that correspond to the entry points for
547 -- FUN_STATIC closures. These are the blocks for which if we have an
548 -- SRT we can merge it with the static closure. [FUN]
549 getStaticFuns :: [CmmDecl] -> [(BlockId, CLabel)]
550 getStaticFuns decls =
551 [ (g_entry g, lbl)
552 | CmmProc top_info _ _ g <- decls
553 , Just info <- [mapLookup (g_entry g) (info_tbls top_info)]
554 , Just (id, _) <- [cit_clo info]
555 , let rep = cit_rep info
556 , isStaticRep rep && isFunRep rep
557 , let lbl = mkLocalClosureLabel (idName id) (idCafInfo id)
558 ]
559
560
561 -- | Maps labels from 'cafAnal' to the final CLabel that will appear
562 -- in the SRT.
563 -- - closures with singleton SRTs resolve to their single entry
564 -- - closures with larger SRTs map to the label for that SRT
565 -- - CAFs must not map to anything!
566 -- - if a labels maps to Nothing, we found that this label's SRT
567 -- is empty, so we don't need to refer to it from other SRTs.
568 type SRTMap = Map CAFLabel (Maybe SRTEntry)
569
570 -- | resolve a CAFLabel to its SRTEntry using the SRTMap
571 resolveCAF :: SRTMap -> CAFLabel -> Maybe SRTEntry
572 resolveCAF srtMap lbl@(CAFLabel l) =
573 Map.findWithDefault (Just (SRTEntry (toClosureLbl l))) lbl srtMap
574
575
576 -- | Attach SRTs to all info tables in the CmmDecls, and add SRT
577 -- declarations to the ModuleSRTInfo.
578 --
579 doSRTs
580 :: DynFlags
581 -> ModuleSRTInfo
582 -> [(CAFEnv, [CmmDecl])]
583 -> IO (ModuleSRTInfo, [CmmDecl])
584
585 doSRTs dflags moduleSRTInfo tops = do
586 us <- mkSplitUniqSupply 'u'
587
588 -- Ignore the original grouping of decls, and combine all the
589 -- CAFEnvs into a single CAFEnv.
590 let (cafEnvs, declss) = unzip tops
591 cafEnv = mapUnions cafEnvs
592 decls = concat declss
593 staticFuns = mapFromList (getStaticFuns decls)
594
595 -- Put the decls in dependency order. Why? So that we can implement
596 -- [Inline] and [Filter]. If we need to refer to an SRT that has
597 -- a single entry, we use the entry itself, which means that we
598 -- don't need to generate the singleton SRT in the first place. But
599 -- to do this we need to process blocks before things that depend on
600 -- them.
601 let
602 sccs = depAnalSRTs cafEnv decls
603 cafsWithSRTs = getCAFs cafEnv decls
604
605 -- On each strongly-connected group of decls, construct the SRT
606 -- closures and the SRT fields for info tables.
607 let result ::
608 [ ( [CmmDecl] -- generated SRTs
609 , [(Label, CLabel)] -- SRT fields for info tables
610 , [(Label, [SRTEntry])] -- SRTs to attach to static functions
611 ) ]
612 ((result, _srtMap), moduleSRTInfo') =
613 initUs_ us $
614 flip runStateT moduleSRTInfo $
615 flip runStateT Map.empty $ do
616 nonCAFs <- mapM (doSCC dflags staticFuns) sccs
617 cAFs <- forM cafsWithSRTs $ \(l, cafLbl, cafs) ->
618 oneSRT dflags staticFuns [l] [cafLbl] True{-is a CAF-} cafs
619 return (nonCAFs ++ cAFs)
620
621 (declss, pairs, funSRTs) = unzip3 result
622
623 -- Next, update the info tables with the SRTs
624 let
625 srtFieldMap = mapFromList (concat pairs)
626 funSRTMap = mapFromList (concat funSRTs)
627 decls' = concatMap (updInfoSRTs dflags srtFieldMap funSRTMap) decls
628
629 return (moduleSRTInfo', concat declss ++ decls')
630
631
632 -- | Build the SRT for a strongly-connected component of blocks
633 doSCC
634 :: DynFlags
635 -> LabelMap CLabel -- which blocks are static function entry points
636 -> SCC (Label, CAFLabel, Set CAFLabel)
637 -> StateT SRTMap
638 (StateT ModuleSRTInfo UniqSM)
639 ( [CmmDecl] -- generated SRTs
640 , [(Label, CLabel)] -- SRT fields for info tables
641 , [(Label, [SRTEntry])] -- SRTs to attach to static functions
642 )
643
644 doSCC dflags staticFuns (AcyclicSCC (l, cafLbl, cafs)) =
645 oneSRT dflags staticFuns [l] [cafLbl] False cafs
646
647 doSCC dflags staticFuns (CyclicSCC nodes) = do
648 -- build a single SRT for the whole cycle, see Note [recursive SRTs]
649 let (blockids, lbls, cafsets) = unzip3 nodes
650 cafs = Set.unions cafsets
651 oneSRT dflags staticFuns blockids lbls False cafs
652
653
654 {- Note [recursive SRTs]
655
656 If the dependency analyser has found us a recursive group of
657 declarations, then we build a single SRT for the whole group, on the
658 grounds that everything in the group is reachable from everything
659 else, so we lose nothing by having a single SRT.
660
661 However, there are a couple of wrinkles to be aware of.
662
663 * The Set CAFLabel for this SRT will contain labels in the group
664 itself. The SRTMap will therefore not contain entries for these labels
665 yet, so we can't turn them into SRTEntries using resolveCAF. BUT we
666 can just remove recursive references from the Set CAFLabel before
667 generating the SRT - the SRT will still contain all the CAFLabels that
668 we need to refer to from this group's SRT.
669
670 * That is, EXCEPT for static function closures. For the same reason
671 described in Note [Invalid optimisation: shortcutting], we cannot omit
672 references to static function closures.
673 - But, since we will merge the SRT with one of the static function
674 closures (see [FUN]), we can omit references to *that* static
675 function closure from the SRT.
676 -}
677
678 -- | Build an SRT for a set of blocks
679 oneSRT
680 :: DynFlags
681 -> LabelMap CLabel -- which blocks are static function entry points
682 -> [Label] -- blocks in this set
683 -> [CAFLabel] -- labels for those blocks
684 -> Bool -- True <=> this SRT is for a CAF
685 -> Set CAFLabel -- SRT for this set
686 -> StateT SRTMap
687 (StateT ModuleSRTInfo UniqSM)
688 ( [CmmDecl] -- SRT objects we built
689 , [(Label, CLabel)] -- SRT fields for these blocks' itbls
690 , [(Label, [SRTEntry])] -- SRTs to attach to static functions
691 )
692
693 oneSRT dflags staticFuns blockids lbls isCAF cafs = do
694 srtMap <- get
695 topSRT <- lift get
696 let
697 -- Can we merge this SRT with a FUN_STATIC closure?
698 (maybeFunClosure, otherFunLabels) =
699 case [ (l,b) | b <- blockids, Just l <- [mapLookup b staticFuns] ] of
700 [] -> (Nothing, [])
701 ((l,b):xs) -> (Just (l,b), map (mkCAFLabel . fst) xs)
702
703 -- Remove recursive references from the SRT, except for (all but
704 -- one of the) static functions. See Note [recursive SRTs].
705 nonRec = cafs `Set.difference`
706 Set.fromList lbls `Set.difference` Set.fromList otherFunLabels
707
708 -- First resolve all the CAFLabels to SRTEntries
709 -- Implements the [Inline] optimisation.
710 resolved =
711 Set.fromList $
712 catMaybes (map (resolveCAF srtMap) (Set.toList nonRec))
713
714 -- The set of all SRTEntries in SRTs that we refer to from here.
715 allBelow =
716 Set.unions [ lbls | caf <- Set.toList resolved
717 , Just lbls <- [Map.lookup caf (flatSRTs topSRT)] ]
718
719 -- Remove SRTEntries that are also in an SRT that we refer to.
720 -- Implements the [Filter] optimisation.
721 filtered = Set.difference resolved allBelow
722
723 srtTrace "oneSRT:"
724 (ppr cafs <+> ppr resolved <+> ppr allBelow <+> ppr filtered) $ return ()
725
726 let
727 isStaticFun = isJust maybeFunClosure
728
729 -- For a label without a closure (e.g. a continuation), we must
730 -- update the SRTMap for the label to point to a closure. It's
731 -- important that we don't do this for static functions or CAFs,
732 -- see Note [Invalid optimisation: shortcutting].
733 updateSRTMap srtEntry =
734 when (not isCAF && not isStaticFun) $ do
735 let newSRTMap = Map.fromList [(cafLbl, srtEntry) | cafLbl <- lbls]
736 put (Map.union newSRTMap srtMap)
737
738 this_mod = thisModule topSRT
739
740 case Set.toList filtered of
741 [] -> do
742 srtTrace "oneSRT: empty" (ppr lbls) $ return ()
743 updateSRTMap Nothing
744 return ([], [], [])
745
746 -- [Inline] - when we have only one entry there is no need to
747 -- build an SRT object at all, instead we put the singleton SRT
748 -- entry in the info table.
749 [one@(SRTEntry lbl)]
750 | -- Info tables refer to SRTs by offset (as noted in the section
751 -- "Referring to an SRT from the info table" of Note [SRTs]). However,
752 -- when dynamic linking is used we cannot guarantee that the offset
753 -- between the SRT and the info table will fit in the offset field.
754 -- Consequently we build a singleton SRT in in this case.
755 not (labelDynamic dflags this_mod lbl)
756
757 -- MachO relocations can't express offsets between compilation units at
758 -- all, so we are always forced to build a singleton SRT in this case.
759 && (not (osMachOTarget $ platformOS $ targetPlatform dflags)
760 || isLocalCLabel this_mod lbl) -> do
761
762 -- If we have a static function closure, then it becomes the
763 -- SRT object, and everything else points to it. (the only way
764 -- we could have multiple labels here is if this is a
765 -- recursive group, see Note [recursive SRTs])
766 case maybeFunClosure of
767 Just (staticFunLbl,staticFunBlock) -> return ([], withLabels, [])
768 where
769 withLabels =
770 [ (b, if b == staticFunBlock then lbl else staticFunLbl)
771 | b <- blockids ]
772 Nothing -> do
773 updateSRTMap (Just one)
774 return ([], map (,lbl) blockids, [])
775
776 cafList ->
777 -- Check whether an SRT with the same entries has been emitted already.
778 -- Implements the [Common] optimisation.
779 case Map.lookup filtered (dedupSRTs topSRT) of
780 Just srtEntry@(SRTEntry srtLbl) -> do
781 srtTrace "oneSRT [Common]" (ppr lbls <+> ppr srtLbl) $ return ()
782 updateSRTMap (Just srtEntry)
783 return ([], map (,srtLbl) blockids, [])
784 Nothing -> do
785 -- No duplicates: we have to build a new SRT object
786 srtTrace "oneSRT: new" (ppr lbls <+> ppr filtered) $ return ()
787 (decls, funSRTs, srtEntry) <-
788 case maybeFunClosure of
789 Just (fun,block) ->
790 return ( [], [(block, cafList)], SRTEntry fun )
791 Nothing -> do
792 (decls, entry) <- lift . lift $ buildSRTChain dflags cafList
793 return (decls, [], entry)
794 updateSRTMap (Just srtEntry)
795 let allBelowThis = Set.union allBelow filtered
796 oldFlatSRTs = flatSRTs topSRT
797 newFlatSRTs = Map.insert srtEntry allBelowThis oldFlatSRTs
798 newDedupSRTs = Map.insert filtered srtEntry (dedupSRTs topSRT)
799 lift (put (topSRT { dedupSRTs = newDedupSRTs
800 , flatSRTs = newFlatSRTs }))
801 let SRTEntry lbl = srtEntry
802 return (decls, map (,lbl) blockids, funSRTs)
803
804
805 -- | build a static SRT object (or a chain of objects) from a list of
806 -- SRTEntries.
807 buildSRTChain
808 :: DynFlags
809 -> [SRTEntry]
810 -> UniqSM
811 ( [CmmDecl] -- The SRT object(s)
812 , SRTEntry -- label to use in the info table
813 )
814 buildSRTChain _ [] = panic "buildSRT: empty"
815 buildSRTChain dflags cafSet =
816 case splitAt mAX_SRT_SIZE cafSet of
817 (these, []) -> do
818 (decl,lbl) <- buildSRT dflags these
819 return ([decl], lbl)
820 (these,those) -> do
821 (rest, rest_lbl) <- buildSRTChain dflags (head these : those)
822 (decl,lbl) <- buildSRT dflags (rest_lbl : tail these)
823 return (decl:rest, lbl)
824 where
825 mAX_SRT_SIZE = 16
826
827
828 buildSRT :: DynFlags -> [SRTEntry] -> UniqSM (CmmDecl, SRTEntry)
829 buildSRT dflags refs = do
830 id <- getUniqueM
831 let
832 lbl = mkSRTLabel id
833 srt_n_info = mkSRTInfoLabel (length refs)
834 fields =
835 mkStaticClosure dflags srt_n_info dontCareCCS
836 [ CmmLabel lbl | SRTEntry lbl <- refs ]
837 [] -- no padding
838 [mkIntCLit dflags 0] -- link field
839 [] -- no saved info
840 return (mkDataLits (Section Data lbl) lbl fields, SRTEntry lbl)
841
842
843 -- | Update info tables with references to their SRTs. Also generate
844 -- static closures, splicing in SRT fields as necessary.
845 updInfoSRTs
846 :: DynFlags
847 -> LabelMap CLabel -- SRT labels for each block
848 -> LabelMap [SRTEntry] -- SRTs to merge into FUN_STATIC closures
849 -> CmmDecl
850 -> [CmmDecl]
851
852 updInfoSRTs dflags srt_env funSRTEnv (CmmProc top_info top_l live g)
853 | Just (_,closure) <- maybeStaticClosure = [ proc, closure ]
854 | otherwise = [ proc ]
855 where
856 proc = CmmProc top_info { info_tbls = newTopInfo } top_l live g
857 newTopInfo = mapMapWithKey updInfoTbl (info_tbls top_info)
858 updInfoTbl l info_tbl
859 | l == g_entry g, Just (inf, _) <- maybeStaticClosure = inf
860 | otherwise = info_tbl { cit_srt = mapLookup l srt_env }
861
862 -- Generate static closures [FUN]. Note that this also generates
863 -- static closures for thunks (CAFs), because it's easier to treat
864 -- them uniformly in the code generator.
865 maybeStaticClosure :: Maybe (CmmInfoTable, CmmDecl)
866 maybeStaticClosure
867 | Just info_tbl@CmmInfoTable{..} <-
868 mapLookup (g_entry g) (info_tbls top_info)
869 , Just (id, ccs) <- cit_clo
870 , isStaticRep cit_rep =
871 let
872 (newInfo, srtEntries) = case mapLookup (g_entry g) funSRTEnv of
873 Nothing ->
874 -- if we don't add SRT entries to this closure, then we
875 -- want to set the srt field in its info table as usual
876 (info_tbl { cit_srt = mapLookup (g_entry g) srt_env }, [])
877 Just srtEntries -> srtTrace "maybeStaticFun" (ppr res)
878 (info_tbl { cit_rep = new_rep }, res)
879 where res = [ CmmLabel lbl | SRTEntry lbl <- srtEntries ]
880 fields = mkStaticClosureFields dflags info_tbl ccs (idCafInfo id)
881 srtEntries
882 new_rep = case cit_rep of
883 HeapRep sta ptrs nptrs ty ->
884 HeapRep sta (ptrs + length srtEntries) nptrs ty
885 _other -> panic "maybeStaticFun"
886 lbl = mkLocalClosureLabel (idName id) (idCafInfo id)
887 in
888 Just (newInfo, mkDataLits (Section Data lbl) lbl fields)
889 | otherwise = Nothing
890
891 updInfoSRTs _ _ _ t = [t]
892
893
894 srtTrace :: String -> SDoc -> b -> b
895 -- srtTrace = pprTrace
896 srtTrace _ _ b = b