Fix #11407.
[ghc.git] / compiler / cmm / CmmProcPoint.hs
1 {-# LANGUAGE GADTs, DisambiguateRecordFields #-}
2
3 module CmmProcPoint
4 ( ProcPointSet, Status(..)
5 , callProcPoints, minimalProcPointSet
6 , splitAtProcPoints, procPointAnalysis
7 , attachContInfoTables
8 )
9 where
10
11 import Prelude hiding (last, unzip, succ, zip)
12
13 import DynFlags
14 import BlockId
15 import CLabel
16 import Cmm
17 import PprCmm ()
18 import CmmUtils
19 import CmmInfo
20 import CmmLive (cmmGlobalLiveness)
21 import CmmSwitch
22 import Data.List (sortBy)
23 import Maybes
24 import Control.Monad
25 import Outputable
26 import Platform
27 import UniqSupply
28
29 import Hoopl
30
31 -- Compute a minimal set of proc points for a control-flow graph.
32
33 -- Determine a protocol for each proc point (which live variables will
34 -- be passed as arguments and which will be on the stack).
35
36 {-
37 A proc point is a basic block that, after CPS transformation, will
38 start a new function. The entry block of the original function is a
39 proc point, as is the continuation of each function call.
40 A third kind of proc point arises if we want to avoid copying code.
41 Suppose we have code like the following:
42
43 f() {
44 if (...) { ..1..; call foo(); ..2..}
45 else { ..3..; call bar(); ..4..}
46 x = y + z;
47 return x;
48 }
49
50 The statement 'x = y + z' can be reached from two different proc
51 points: the continuations of foo() and bar(). We would prefer not to
52 put a copy in each continuation; instead we would like 'x = y + z' to
53 be the start of a new procedure to which the continuations can jump:
54
55 f_cps () {
56 if (...) { ..1..; push k_foo; jump foo_cps(); }
57 else { ..3..; push k_bar; jump bar_cps(); }
58 }
59 k_foo() { ..2..; jump k_join(y, z); }
60 k_bar() { ..4..; jump k_join(y, z); }
61 k_join(y, z) { x = y + z; return x; }
62
63 You might think then that a criterion to make a node a proc point is
64 that it is directly reached by two distinct proc points. (Note
65 [Direct reachability].) But this criterion is a bit too simple; for
66 example, 'return x' is also reached by two proc points, yet there is
67 no point in pulling it out of k_join. A good criterion would be to
68 say that a node should be made a proc point if it is reached by a set
69 of proc points that is different than its immediate dominator. NR
70 believes this criterion can be shown to produce a minimum set of proc
71 points, and given a dominator tree, the proc points can be chosen in
72 time linear in the number of blocks. Lacking a dominator analysis,
73 however, we turn instead to an iterative solution, starting with no
74 proc points and adding them according to these rules:
75
76 1. The entry block is a proc point.
77 2. The continuation of a call is a proc point.
78 3. A node is a proc point if it is directly reached by more proc
79 points than one of its predecessors.
80
81 Because we don't understand the problem very well, we apply rule 3 at
82 most once per iteration, then recompute the reachability information.
83 (See Note [No simple dataflow].) The choice of the new proc point is
84 arbitrary, and I don't know if the choice affects the final solution,
85 so I don't know if the number of proc points chosen is the
86 minimum---but the set will be minimal.
87
88
89
90 Note [Proc-point analysis]
91 ~~~~~~~~~~~~~~~~~~~~~~~~~~
92
93 Given a specified set of proc-points (a set of block-ids), "proc-point
94 analysis" figures out, for every block, which proc-point it belongs to.
95 All the blocks belonging to proc-point P will constitute a single
96 top-level C procedure.
97
98 A non-proc-point block B "belongs to" a proc-point P iff B is
99 reachable from P without going through another proc-point.
100
101 Invariant: a block B should belong to at most one proc-point; if it
102 belongs to two, that's a bug.
103
104 Note [Non-existing proc-points]
105 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
106
107 On some architectures it might happen that the list of proc-points
108 computed before stack layout pass will be invalidated by the stack
109 layout. This will happen if stack layout removes from the graph
110 blocks that were determined to be proc-points. Later on in the pipeline
111 we use list of proc-points to perform [Proc-point analysis], but
112 if a proc-point does not exist anymore then we will get compiler panic.
113 See #8205.
114 -}
115
116 type ProcPointSet = BlockSet
117
118 data Status
119 = ReachedBy ProcPointSet -- set of proc points that directly reach the block
120 | ProcPoint -- this block is itself a proc point
121
122 instance Outputable Status where
123 ppr (ReachedBy ps)
124 | setNull ps = text "<not-reached>"
125 | otherwise = text "reached by" <+>
126 (hsep $ punctuate comma $ map ppr $ setElems ps)
127 ppr ProcPoint = text "<procpt>"
128
129 --------------------------------------------------
130 -- Proc point analysis
131
132 procPointAnalysis :: ProcPointSet -> CmmGraph -> UniqSM (BlockEnv Status)
133 -- Once you know what the proc-points are, figure out
134 -- what proc-points each block is reachable from
135 -- See Note [Proc-point analysis]
136 procPointAnalysis procPoints g@(CmmGraph {g_graph = graph}) =
137 -- pprTrace "procPointAnalysis" (ppr procPoints) $
138 dataflowAnalFwdBlocks g initProcPoints $ analFwd lattice forward
139 where initProcPoints = [(id, ProcPoint) | id <- setElems procPoints,
140 id `setMember` labelsInGraph ]
141 -- See Note [Non-existing proc-points]
142 labelsInGraph = labelsDefined graph
143 -- transfer equations
144
145 forward :: FwdTransfer CmmNode Status
146 forward = mkFTransfer3 first middle last
147 where
148 first :: CmmNode C O -> Status -> Status
149 first (CmmEntry id _) ProcPoint = ReachedBy $ setSingleton id
150 first _ x = x
151
152 middle _ x = x
153
154 last :: CmmNode O C -> Status -> FactBase Status
155 last l x = mkFactBase lattice $ map (\id -> (id, x)) (successors l)
156
157 lattice :: DataflowLattice Status
158 lattice = DataflowLattice "direct proc-point reachability" unreached add_to
159 where unreached = ReachedBy setEmpty
160 add_to _ (OldFact ProcPoint) _ = (NoChange, ProcPoint)
161 add_to _ _ (NewFact ProcPoint) = (SomeChange, ProcPoint)
162 -- because of previous case
163 add_to _ (OldFact (ReachedBy p)) (NewFact (ReachedBy p'))
164 | setSize union > setSize p = (SomeChange, ReachedBy union)
165 | otherwise = (NoChange, ReachedBy p)
166 where
167 union = setUnion p' p
168
169 ----------------------------------------------------------------------
170
171 -- It is worth distinguishing two sets of proc points: those that are
172 -- induced by calls in the original graph and those that are
173 -- introduced because they're reachable from multiple proc points.
174 --
175 -- Extract the set of Continuation BlockIds, see Note [Continuation BlockIds].
176 callProcPoints :: CmmGraph -> ProcPointSet
177 callProcPoints g = foldGraphBlocks add (setSingleton (g_entry g)) g
178 where add :: CmmBlock -> BlockSet -> BlockSet
179 add b set = case lastNode b of
180 CmmCall {cml_cont = Just k} -> setInsert k set
181 CmmForeignCall {succ=k} -> setInsert k set
182 _ -> set
183
184 minimalProcPointSet :: Platform -> ProcPointSet -> CmmGraph
185 -> UniqSM ProcPointSet
186 -- Given the set of successors of calls (which must be proc-points)
187 -- figure out the minimal set of necessary proc-points
188 minimalProcPointSet platform callProcPoints g
189 = extendPPSet platform g (postorderDfs g) callProcPoints
190
191 extendPPSet :: Platform -> CmmGraph -> [CmmBlock] -> ProcPointSet -> UniqSM ProcPointSet
192 extendPPSet platform g blocks procPoints =
193 do env <- procPointAnalysis procPoints g
194 -- pprTrace "extensPPSet" (ppr env) $ return ()
195 let add block pps = let id = entryLabel block
196 in case mapLookup id env of
197 Just ProcPoint -> setInsert id pps
198 _ -> pps
199 procPoints' = foldGraphBlocks add setEmpty g
200 newPoints = mapMaybe ppSuccessor blocks
201 newPoint = listToMaybe newPoints
202 ppSuccessor b =
203 let nreached id = case mapLookup id env `orElse`
204 pprPanic "no ppt" (ppr id <+> ppr b) of
205 ProcPoint -> 1
206 ReachedBy ps -> setSize ps
207 block_procpoints = nreached (entryLabel b)
208 -- | Looking for a successor of b that is reached by
209 -- more proc points than b and is not already a proc
210 -- point. If found, it can become a proc point.
211 newId succ_id = not (setMember succ_id procPoints') &&
212 nreached succ_id > block_procpoints
213 in listToMaybe $ filter newId $ successors b
214 {-
215 case newPoints of
216 [] -> return procPoints'
217 pps -> extendPPSet g blocks
218 (foldl extendBlockSet procPoints' pps)
219 -}
220 case newPoint of
221 Just id ->
222 if setMember id procPoints'
223 then panic "added old proc pt"
224 else extendPPSet platform g blocks (setInsert id procPoints')
225 Nothing -> return procPoints'
226
227
228 -- At this point, we have found a set of procpoints, each of which should be
229 -- the entry point of a procedure.
230 -- Now, we create the procedure for each proc point,
231 -- which requires that we:
232 -- 1. build a map from proc points to the blocks reachable from the proc point
233 -- 2. turn each branch to a proc point into a jump
234 -- 3. turn calls and returns into jumps
235 -- 4. build info tables for the procedures -- and update the info table for
236 -- the SRTs in the entry procedure as well.
237 -- Input invariant: A block should only be reachable from a single ProcPoint.
238 -- ToDo: use the _ret naming convention that the old code generator
239 -- used. -- EZY
240 splitAtProcPoints :: DynFlags -> CLabel -> ProcPointSet-> ProcPointSet -> BlockEnv Status ->
241 CmmDecl -> UniqSM [CmmDecl]
242 splitAtProcPoints dflags entry_label callPPs procPoints procMap
243 (CmmProc (TopInfo {info_tbls = info_tbls})
244 top_l _ g@(CmmGraph {g_entry=entry})) =
245 do -- Build a map from procpoints to the blocks they reach
246 let addBlock b graphEnv =
247 case mapLookup bid procMap of
248 Just ProcPoint -> add graphEnv bid bid b
249 Just (ReachedBy set) ->
250 case setElems set of
251 [] -> graphEnv
252 [id] -> add graphEnv id bid b
253 _ -> panic "Each block should be reachable from only one ProcPoint"
254 Nothing -> graphEnv
255 where bid = entryLabel b
256 add graphEnv procId bid b = mapInsert procId graph' graphEnv
257 where graph = mapLookup procId graphEnv `orElse` mapEmpty
258 graph' = mapInsert bid b graph
259
260 let liveness = cmmGlobalLiveness dflags g
261 let ppLiveness pp = filter isArgReg $
262 regSetToList $
263 expectJust "ppLiveness" $ mapLookup pp liveness
264
265 graphEnv <- return $ foldGraphBlocks addBlock emptyBlockMap g
266
267 -- Build a map from proc point BlockId to pairs of:
268 -- * Labels for their new procedures
269 -- * Labels for the info tables of their new procedures (only if
270 -- the proc point is a callPP)
271 -- Due to common blockification, we may overestimate the set of procpoints.
272 let add_label map pp = mapInsert pp lbls map
273 where lbls | pp == entry = (entry_label, fmap cit_lbl (mapLookup entry info_tbls))
274 | otherwise = (block_lbl, guard (setMember pp callPPs) >>
275 Just (toInfoLbl block_lbl))
276 where block_lbl = blockLbl pp
277
278 procLabels :: LabelMap (CLabel, Maybe CLabel)
279 procLabels = foldl add_label mapEmpty
280 (filter (flip mapMember (toBlockMap g)) (setElems procPoints))
281
282 -- In each new graph, add blocks jumping off to the new procedures,
283 -- and replace branches to procpoints with branches to the jump-off blocks
284 let add_jump_block (env, bs) (pp, l) =
285 do bid <- liftM mkBlockId getUniqueM
286 let b = blockJoin (CmmEntry bid GlobalScope) emptyBlock jump
287 live = ppLiveness pp
288 jump = CmmCall (CmmLit (CmmLabel l)) Nothing live 0 0 0
289 return (mapInsert pp bid env, b : bs)
290
291 add_jumps newGraphEnv (ppId, blockEnv) =
292 do let needed_jumps = -- find which procpoints we currently branch to
293 mapFold add_if_branch_to_pp [] blockEnv
294 add_if_branch_to_pp :: CmmBlock -> [(BlockId, CLabel)] -> [(BlockId, CLabel)]
295 add_if_branch_to_pp block rst =
296 case lastNode block of
297 CmmBranch id -> add_if_pp id rst
298 CmmCondBranch _ ti fi _ -> add_if_pp ti (add_if_pp fi rst)
299 CmmSwitch _ ids -> foldr add_if_pp rst $ switchTargetsToList ids
300 _ -> rst
301
302 -- when jumping to a PP that has an info table, if
303 -- tablesNextToCode is off we must jump to the entry
304 -- label instead.
305 jump_label (Just info_lbl) _
306 | tablesNextToCode dflags = info_lbl
307 | otherwise = toEntryLbl info_lbl
308 jump_label Nothing block_lbl = block_lbl
309
310 add_if_pp id rst = case mapLookup id procLabels of
311 Just (lbl, mb_info_lbl) -> (id, jump_label mb_info_lbl lbl) : rst
312 Nothing -> rst
313 (jumpEnv, jumpBlocks) <-
314 foldM add_jump_block (mapEmpty, []) needed_jumps
315 -- update the entry block
316 let b = expectJust "block in env" $ mapLookup ppId blockEnv
317 blockEnv' = mapInsert ppId b blockEnv
318 -- replace branches to procpoints with branches to jumps
319 blockEnv'' = toBlockMap $ replaceBranches jumpEnv $ ofBlockMap ppId blockEnv'
320 -- add the jump blocks to the graph
321 blockEnv''' = foldl (flip insertBlock) blockEnv'' jumpBlocks
322 let g' = ofBlockMap ppId blockEnv'''
323 -- pprTrace "g' pre jumps" (ppr g') $ do
324 return (mapInsert ppId g' newGraphEnv)
325
326 graphEnv <- foldM add_jumps emptyBlockMap $ mapToList graphEnv
327
328 let to_proc (bid, g)
329 | bid == entry
330 = CmmProc (TopInfo {info_tbls = info_tbls,
331 stack_info = stack_info})
332 top_l live g'
333 | otherwise
334 = case expectJust "pp label" $ mapLookup bid procLabels of
335 (lbl, Just info_lbl)
336 -> CmmProc (TopInfo { info_tbls = mapSingleton (g_entry g) (mkEmptyContInfoTable info_lbl)
337 , stack_info=stack_info})
338 lbl live g'
339 (lbl, Nothing)
340 -> CmmProc (TopInfo {info_tbls = mapEmpty, stack_info=stack_info})
341 lbl live g'
342 where
343 g' = replacePPIds g
344 live = ppLiveness (g_entry g')
345 stack_info = StackInfo { arg_space = 0
346 , updfr_space = Nothing
347 , do_layout = True }
348 -- cannot use panic, this is printed by -ddump-cmm
349
350 -- References to procpoint IDs can now be replaced with the
351 -- infotable's label
352 replacePPIds g = {-# SCC "replacePPIds" #-}
353 mapGraphNodes (id, mapExp repl, mapExp repl) g
354 where repl e@(CmmLit (CmmBlock bid)) =
355 case mapLookup bid procLabels of
356 Just (_, Just info_lbl) -> CmmLit (CmmLabel info_lbl)
357 _ -> e
358 repl e = e
359
360 -- The C back end expects to see return continuations before the
361 -- call sites. Here, we sort them in reverse order -- it gets
362 -- reversed later.
363 let (_, block_order) = foldl add_block_num (0::Int, emptyBlockMap) (postorderDfs g)
364 add_block_num (i, map) block = (i+1, mapInsert (entryLabel block) i map)
365 sort_fn (bid, _) (bid', _) =
366 compare (expectJust "block_order" $ mapLookup bid block_order)
367 (expectJust "block_order" $ mapLookup bid' block_order)
368 procs <- return $ map to_proc $ sortBy sort_fn $ mapToList graphEnv
369 return -- pprTrace "procLabels" (ppr procLabels)
370 -- pprTrace "splitting graphs" (ppr procs)
371 procs
372 splitAtProcPoints _ _ _ _ _ t@(CmmData _ _) = return [t]
373
374 -- Only called from CmmProcPoint.splitAtProcPoints. NB. does a
375 -- recursive lookup, see comment below.
376 replaceBranches :: BlockEnv BlockId -> CmmGraph -> CmmGraph
377 replaceBranches env cmmg
378 = {-# SCC "replaceBranches" #-}
379 ofBlockMap (g_entry cmmg) $ mapMap f $ toBlockMap cmmg
380 where
381 f block = replaceLastNode block $ last (lastNode block)
382
383 last :: CmmNode O C -> CmmNode O C
384 last (CmmBranch id) = CmmBranch (lookup id)
385 last (CmmCondBranch e ti fi l) = CmmCondBranch e (lookup ti) (lookup fi) l
386 last (CmmSwitch e ids) = CmmSwitch e (mapSwitchTargets lookup ids)
387 last l@(CmmCall {}) = l { cml_cont = Nothing }
388 -- NB. remove the continuation of a CmmCall, since this
389 -- label will now be in a different CmmProc. Not only
390 -- is this tidier, it stops CmmLint from complaining.
391 last l@(CmmForeignCall {}) = l
392 lookup id = fmap lookup (mapLookup id env) `orElse` id
393 -- XXX: this is a recursive lookup, it follows chains
394 -- until the lookup returns Nothing, at which point we
395 -- return the last BlockId
396
397 -- --------------------------------------------------------------
398 -- Not splitting proc points: add info tables for continuations
399
400 attachContInfoTables :: ProcPointSet -> CmmDecl -> CmmDecl
401 attachContInfoTables call_proc_points (CmmProc top_info top_l live g)
402 = CmmProc top_info{info_tbls = info_tbls'} top_l live g
403 where
404 info_tbls' = mapUnion (info_tbls top_info) $
405 mapFromList [ (l, mkEmptyContInfoTable (infoTblLbl l))
406 | l <- setElems call_proc_points
407 , l /= g_entry g ]
408 attachContInfoTables _ other_decl
409 = other_decl
410
411 ----------------------------------------------------------------
412
413 {-
414 Note [Direct reachability]
415
416 Block B is directly reachable from proc point P iff control can flow
417 from P to B without passing through an intervening proc point.
418 -}
419
420 ----------------------------------------------------------------
421
422 {-
423 Note [No simple dataflow]
424
425 Sadly, it seems impossible to compute the proc points using a single
426 dataflow pass. One might attempt to use this simple lattice:
427
428 data Location = Unknown
429 | InProc BlockId -- node is in procedure headed by the named proc point
430 | ProcPoint -- node is itself a proc point
431
432 At a join, a node in two different blocks becomes a proc point.
433 The difficulty is that the change of information during iterative
434 computation may promote a node prematurely. Here's a program that
435 illustrates the difficulty:
436
437 f () {
438 entry:
439 ....
440 L1:
441 if (...) { ... }
442 else { ... }
443
444 L2: if (...) { g(); goto L1; }
445 return x + y;
446 }
447
448 The only proc-point needed (besides the entry) is L1. But in an
449 iterative analysis, consider what happens to L2. On the first pass
450 through, it rises from Unknown to 'InProc entry', but when L1 is
451 promoted to a proc point (because it's the successor of g()), L1's
452 successors will be promoted to 'InProc L1'. The problem hits when the
453 new fact 'InProc L1' flows into L2 which is already bound to 'InProc entry'.
454 The join operation makes it a proc point when in fact it needn't be,
455 because its immediate dominator L1 is already a proc point and there
456 are no other proc points that directly reach L2.
457 -}
458
459
460
461 {- Note [Separate Adams optimization]
462 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
463 It may be worthwhile to attempt the Adams optimization by rewriting
464 the graph before the assignment of proc-point protocols. Here are a
465 couple of rules:
466
467 g() returns to k; g() returns to L;
468 k: CopyIn c ress; goto L:
469 ... ==> ...
470 L: // no CopyIn node here L: CopyIn c ress;
471
472
473 And when c == c' and ress == ress', this also:
474
475 g() returns to k; g() returns to L;
476 k: CopyIn c ress; goto L:
477 ... ==> ...
478 L: CopyIn c' ress' L: CopyIn c' ress' ;
479
480 In both cases the goal is to eliminate k.
481 -}