1 {-

2 (c) The GRASP/AQUA Project, Glasgow University, 1992-1998

4 ************************************************************************

5 * *

6 \section[FloatIn]{Floating Inwards pass}

7 * *

8 ************************************************************************

10 The main purpose of @floatInwards@ is floating into branches of a

11 case, so that we don't allocate things, save them on the stack, and

12 then discover that they aren't needed in the chosen branch.

13 -}

15 {-# LANGUAGE CPP #-}

16 {-# OPTIONS_GHC -fprof-auto #-}

22 import CoreSyn

23 import MkCore

25 import CoreUtils

26 import CoreFVs

29 import Var

30 import Type

31 import VarSet

32 import Util

33 import DynFlags

34 import Outputable

35 -- import Data.List ( mapAccumL )

38 {-

39 Top-level interface function, @floatInwards@. Note that we do not

40 actually float any bindings downwards from the top-level.

41 -}

47 where

54 {-

55 ************************************************************************

56 * *

57 \subsection{Mail from Andr\'e [edited]}

58 * *

59 ************************************************************************

61 {\em Will wrote: What??? I thought the idea was to float as far

62 inwards as possible, no matter what. This is dropping all bindings

63 every time it sees a lambda of any kind. Help! }

65 You are assuming we DO DO full laziness AFTER floating inwards! We

66 have to [not float inside lambdas] if we don't.

68 If we indeed do full laziness after the floating inwards (we could

69 check the compilation flags for that) then I agree we could be more

70 aggressive and do float inwards past lambdas.

72 Actually we are not doing a proper full laziness (see below), which

73 was another reason for not floating inwards past a lambda.

75 This can easily be fixed. The problem is that we float lets outwards,

76 but there are a few expressions which are not let bound, like case

77 scrutinees and case alternatives. After floating inwards the

78 simplifier could decide to inline the let and the laziness would be

79 lost, e.g.

81 \begin{verbatim}

82 let a = expensive ==> \b -> case expensive of ...

83 in \ b -> case a of ...

84 \end{verbatim}

85 The fix is

86 \begin{enumerate}

87 \item

88 to let bind the algebraic case scrutinees (done, I think) and

89 the case alternatives (except the ones with an

90 unboxed type)(not done, I think). This is best done in the

91 SetLevels.hs module, which tags things with their level numbers.

92 \item

93 do the full laziness pass (floating lets outwards).

94 \item

95 simplify. The simplifier inlines the (trivial) lets that were

96 created but were not floated outwards.

97 \end{enumerate}

99 With the fix I think Will's suggestion that we can gain even more from

100 strictness by floating inwards past lambdas makes sense.

102 We still gain even without going past lambdas, as things may be

103 strict in the (new) context of a branch (where it was floated to) or

104 of a let rhs, e.g.

105 \begin{verbatim}

106 let a = something case x of

107 in case x of alt1 -> case something of a -> a + a

108 alt1 -> a + a ==> alt2 -> b

109 alt2 -> b

111 let a = something let b = case something of a -> a + a

112 in let b = a + a ==> in (b,b)

113 in (b,b)

114 \end{verbatim}

115 Also, even if a is not found to be strict in the new context and is

116 still left as a let, if the branch is not taken (or b is not entered)

117 the closure for a is not built.

119 ************************************************************************

120 * *

121 \subsection{Main floating-inwards code}

122 * *

123 ************************************************************************

124 -}

130 -- The FreeVarSet is the free variables of the binding. In the case

131 -- of recursive bindings, the set doesn't include the bound

132 -- variables.

135 -- In reverse dependency order (innermost binder first)

137 fiExpr :: DynFlags

139 -- as far "inwards" as possible

150 where

154 to_drop

156 {-

157 Applications: we do float inside applications, mainly because we

158 need to get at all the arguments. The next simplifier run will

159 pull out any silly ones.

160 -}

164 mkTicks ticks $

167 where

170 fun_fvs = freeVarsOf ann_fun

176 to_drop

177 -- Shortcut behaviour: if to_drop is empty,

178 -- sepBindsByDropPoint returns a suitable bunch of empty

179 -- lists without evaluating extra_fvs, and hence without

180 -- peering into each argument

185 _ -> emptyDVarSet

186 -- Don't float the binding for f into f x y z; see Note [Join points]

187 -- for why we *can't* do it when f is a join point. (If f isn't a

188 -- join point, floating it in isn't especially harmful but it's

189 -- useless since the simplifier will immediately float it back out.)

196 | noFloatIntoArg arg arg_ty

198 | otherwise

200 where

203 {-

204 Note [Do not destroy the let/app invariant]

205 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

206 Watch out for

207 f (x +# y)

208 We don't want to float bindings into here

209 f (case ... of { x -> x +# y })

210 because that might destroy the let/app invariant, which requires

211 unlifted function arguments to be ok-for-speculation.

213 Note [Join points]

214 ~~~~~~~~~~~~~~~~~~

215 Generally, we don't need to worry about join points - there are places we're

216 not allowed to float them, but since they can't have occurrences in those

217 places, we're not tempted.

219 We do need to be careful about jumps, however:

221 joinrec j x y z = ... in

222 jump j a b c

224 Previous versions often floated the definition of a recursive function into its

225 only non-recursive occurrence. But for a join point, this is a disaster:

227 (joinrec j x y z = ... in

228 jump j) a b c -- wrong!

230 Every jump must be exact, so the jump to j must have three arguments. Hence

231 we're careful not to float into the target of a jump (though we can float into

232 the arguments just fine).

234 Note [Floating in past a lambda group]

235 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

236 * We must be careful about floating inside a value lambda.

237 That risks losing laziness.

238 The float-out pass might rescue us, but then again it might not.

240 * We must be careful about type lambdas too. At one time we did, and

241 there is no risk of duplicating work thereby, but we do need to be

242 careful. In particular, here is a bad case (it happened in the

243 cichelli benchmark:

244 let v = ...

245 in let f = /\t -> \a -> ...

246 ==>

247 let f = /\t -> let v = ... in \a -> ...

248 This is bad as now f is an updatable closure (update PAP)

249 and has arity 0.

251 * Hack alert! We only float in through one-shot lambdas,

252 not (as you might guess) through lone big lambdas.

253 Reason: we float *out* past big lambdas (see the test in the Lam

254 case of FloatOut.floatExpr) and we don't want to float straight

255 back in again.

257 It *is* important to float into one-shot lambdas, however;

258 see the remarks with noFloatIntoRhs.

260 So we treat lambda in groups, using the following rule:

262 Float in if (a) there is at least one Id,

263 and (b) there are no non-one-shot Ids

265 Otherwise drop all the bindings outside the group.

267 This is what the 'go' function in the AnnLam case is doing.

269 (Join points are handled similarly: a join point is considered one-shot iff

270 it's non-recursive, so we float only into non-recursive join points.)

272 Urk! if all are tyvars, and we don't float in, we may miss an

273 opportunity to float inside a nested case branch

276 Note [Floating coercions]

277 ~~~~~~~~~~~~~~~~~~~~~~~~~

278 We could, in principle, have a coercion binding like

279 case f x of co { DEFAULT -> e1 e2 }

280 It's not common to have a function that returns a coercion, but nothing

281 in Core prohibits it. If so, 'co' might be mentioned in e1 or e2

282 /only in a type/. E.g. suppose e1 was

283 let (x :: Int |> co) = blah in blah2

286 But, with coercions appearing in types, there is a complication: we

287 might be floating in a "strict let" -- that is, a case. Case expressions

288 mention their return type. We absolutely can't float a coercion binding

289 inward to the point that the type of the expression it's about to wrap

290 mentions the coercion. So we include the union of the sets of free variables

291 of the types of all the drop points involved. If any of the floaters

292 bind a coercion variable mentioned in any of the types, that binder must

293 be dropped right away.

295 -}

298 | noFloatIntoLam bndrs -- Dump it all here

299 -- NB: Must line up with noFloatIntoRhs (AnnLam...); see Trac #7088

305 where

308 {-

309 We don't float lets inwards past an SCC.

310 ToDo: keep info on current cc, and when passing

311 one, if it is not the same, annotate all lets in binds with current

312 cc, change current cc to the new one and float binds into expr.

313 -}

316 | tickish `tickishScopesLike` SoftScope

322 {-

323 For @Lets@, the possible ``drop points'' for the \tr{to_drop}

324 bindings are: (a)~in the body, (b1)~in the RHS of a NonRec binding,

325 or~(b2), in each of the RHSs of the pairs of a @Rec@.

327 Note that we do {\em weird things} with this let's binding. Consider:

328 \begin{verbatim}

329 let

330 w = ...

331 in {

332 let v = ... w ...

333 in ... v .. w ...

334 }

335 \end{verbatim}

336 Look at the inner \tr{let}. As \tr{w} is used in both the bind and

337 body of the inner let, we could panic and leave \tr{w}'s binding where

338 it is. But \tr{v} is floatable further into the body of the inner let, and

339 {\em then} \tr{w} will also be only in the body of that inner let.

341 So: rather than drop \tr{w}'s binding here, we add it onto the list of

342 things to drop in the outer let's body, and let nature take its

343 course.

345 Note [extra_fvs (1): avoid floating into RHS]

346 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

347 Consider let x=\y....t... in body. We do not necessarily want to float

348 a binding for t into the RHS, because it'll immediately be floated out

349 again. (It won't go inside the lambda else we risk losing work.)

350 In letrec, we need to be more careful still. We don't want to transform

351 let x# = y# +# 1#

352 in

353 letrec f = \z. ...x#...f...

354 in ...

355 into

356 letrec f = let x# = y# +# 1# in \z. ...x#...f... in ...

357 because now we can't float the let out again, because a letrec

358 can't have unboxed bindings.

360 So we make "extra_fvs" which is the rhs_fvs of such bindings, and

361 arrange to dump bindings that bind extra_fvs before the entire let.

363 Note [extra_fvs (2): free variables of rules]

364 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

365 Consider

366 let x{rule mentioning y} = rhs in body

367 Here y is not free in rhs or body; but we still want to dump bindings

368 that bind y outside the let. So we augment extra_fvs with the

369 idRuleAndUnfoldingVars of x. No need for type variables, hence not using

370 idFreeVars.

371 -}

375 -- to_drop is in reverse dependency order

376 where

378 body_fvs = freeVarsOf body

380 {- Note [Floating primops]

381 ~~~~~~~~~~~~~~~~~~~~~~~~~~

382 We try to float-in a case expression over an unlifted type. The

383 motivating example was Trac #5658: in particular, this change allows

384 array indexing operations, which have a single DEFAULT alternative

385 without any binders, to be floated inward.

387 SIMD primops for unpacking SIMD vectors into an unboxed tuple of unboxed

388 scalars also need to be floated inward, but unpacks have a single non-DEFAULT

389 alternative that binds the elements of the tuple. We now therefore also support

390 floating in cases with a single alternative that may bind values.

392 But there are wrinkles

394 * Which unlifted cases do we float? See PrimOp.hs

395 Note [PrimOp can_fail and has_side_effects] which explains:

396 - We can float-in can_fail primops, but we can't float them out.

397 - But we can float a has_side_effects primop, but NOT inside a lambda,

398 so for now we don't float them at all.

399 Hence exprOkForSideEffects

401 * Because we can float can-fail primops (array indexing, division) inwards

402 but not outwards, we must be careful not to transform

403 case a /# b of r -> f (F# r)

404 ===>

405 f (case a /# b of r -> F# r)

406 because that creates a new thunk that wasn't there before. And

407 because it can't be floated out (can_fail), the thunk will stay

408 there. Disaster! (This happened in nofib 'simple' and 'scs'.)

410 Solution: only float cases into the branches of other cases, and

411 not into the arguments of an application, or the RHS of a let. This

412 is somewhat conservative, but it's simple. And it still hits the

413 cases like Trac #5658. This is implemented in sepBindsByJoinPoint;

414 if is_case is False we dump all floating cases right here.

416 For @Case@, the possible drop points for the 'to_drop'

417 bindings are:

418 (a) inside the scrutinee

419 (b) inside one of the alternatives/default (default FVs always /first/!).

421 -}

426 -- See Note [Floating primops]

429 where

432 scrut' = fiExpr dflags scrut_binds scrut

434 scrut_fvs = freeVarsOf scrut

439 to_drop

443 wrapFloats drop_here2 $

446 where

447 -- Float into the scrut and alts-considered-together just like App

451 to_drop

453 -- Float into the alts with the is_case flag set

458 scrut_fvs = freeVarsOf scrut

460 all_alts_fvs = unionDVarSets alts_fvs

463 -- Delete case_bndr and args from free vars of rhs

464 -- to get free vars of alt

468 ------------------

469 fiBind :: DynFlags

471 -- as far "inwards" as possible

480 -- See Note [extra_fvs (1,2)]

485 where

489 extra_fvs | noFloatIntoRhs NonRecursive id rhs

490 = rule_fvs `unionDVarSet` rhs_fvs

491 | otherwise

492 = rule_fvs

493 -- See Note [extra_fvs (1): avoid floating into RHS]

494 -- No point in floating in only to float straight out again

495 -- We *can't* float into ok-for-speculation unlifted RHSs

496 -- But do float into join points

501 to_drop

503 -- Push rhs_binds into the right hand side of the binding

505 rhs_fvs' = rhs_fvs `unionDVarSet` floatedBindsFVs rhs_binds `unionDVarSet` rule_fvs

506 -- Don't forget the rule_fvs; the binding mentions them!

513 where

517 -- See Note [extra_fvs (1,2)]

518 rule_fvs = mapUnionDVarSet bndrRuleAndUnfoldingVarsDSet ids

519 extra_fvs = rule_fvs `unionDVarSet`

526 to_drop

528 rhs_fvs' = unionDVarSets rhss_fvs `unionDVarSet`

530 rule_fvs -- Don't forget the rule variables!

532 -- Push rhs_binds into the right hand side of the binding

537 fi_bind to_drops pairs

541 ------------------

543 fiRhs dflags to_drop bndr rhs

544 | Just join_arity <- isJoinId_maybe bndr

547 | otherwise

548 = fiExpr dflags to_drop rhs

550 ------------------

553 where

555 -- Don't float inside a non-one-shot lambda

558 -- ^ True if it's a bad idea to float bindings into this RHS

559 noFloatIntoRhs is_rec bndr rhs

560 | isJoinId bndr

563 | otherwise

567 noFloatIntoArg expr expr_ty

568 | isUnliftedType expr_ty

571 | AnnLam bndr e <- expr

575 -- See Note [noFloatInto considerations] wrinkle 2

578 = exprIsTrivial deann_expr || exprIsHNF deann_expr

579 where

582 {- Note [noFloatInto considerations]

583 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

584 When do we want to float bindings into

585 - noFloatIntoRHs: the RHS of a let-binding

586 - noFloatIntoArg: the argument of a function application

588 Definitely don't float in if it has unlifted type; that

589 would destroy the let/app invariant.

591 * Wrinkle 1: do not float in if

592 (a) any non-one-shot value lambdas

593 or (b) all type lambdas

594 In both cases we'll float straight back out again

595 NB: Must line up with fiExpr (AnnLam...); see Trac #7088

597 (a) is important: we /must/ float into a one-shot lambda group

598 (which includes join points). This makes a big difference

599 for things like

600 f x# = let x = I# x#

601 in let j = \() -> ...x...

602 in if <condition> then normal-path else j ()

603 If x is used only in the error case join point, j, we must float the

604 boxing constructor into it, else we box it every time which is very

605 bad news indeed.

607 * Wrinkle 2: for RHSs, do not float into a HNF; we'll just float right

608 back out again... not tragic, but a waste of time.

610 For function arguments we will still end up with this

611 in-then-out stuff; consider

612 letrec x = e in f x

613 Here x is not a HNF, so we'll produce

614 f (letrec x = e in x)

615 which is OK... it's not that common, and we'll end up

616 floating out again, in CorePrep if not earlier.

617 Still, we use exprIsTrivial to catch this case (sigh)

620 ************************************************************************

621 * *

622 \subsection{@sepBindsByDropPoint@}

623 * *

624 ************************************************************************

626 This is the crucial function. The idea is: We have a wad of bindings

627 that we'd like to distribute inside a collection of {\em drop points};

628 insides the alternatives of a \tr{case} would be one example of some

629 drop points; the RHS and body of a non-recursive \tr{let} binding

630 would be another (2-element) collection.

632 So: We're given a list of sets-of-free-variables, one per drop point,

633 and a list of floating-inwards bindings. If a binding can go into

634 only one drop point (without suddenly making something out-of-scope),

635 in it goes. If a binding is used inside {\em multiple} drop points,

636 then it has to go in a you-must-drop-it-above-all-these-drop-points

637 point.

639 We have to maintain the order on these drop-point-related lists.

640 -}

642 -- pprFIB :: FloatInBinds -> SDoc

643 -- pprFIB fibs = text "FIB:" <+> ppr [b | FB _ _ b <- fibs]

645 sepBindsByDropPoint

646 :: DynFlags

649 -- Always at least two long!

652 -- inside any drop point; the rest correspond

653 -- one-to-one with the input list of FV sets

655 -- Every input floater is returned somewhere in the result;

656 -- none are dropped, not even ones which don't seem to be

657 -- free in *any* of the drop-point fvs. Why? Because, for example,

658 -- a binding (let x = E in B) might have a specialised version of

659 -- x (say x') stored inside x, but x' isn't free in E or B.

663 sepBindsByDropPoint dflags is_case drop_pts floaters

667 | otherwise

670 where

674 -- The *first* one in the argument list is the drop_here set

675 -- The FloatInBinds in the lists are in the reverse of

676 -- the normal FloatInBinds order; that is, they are the right way round!

681 = go binds new_boxes

682 where

683 -- "here" means the group of bindings dropped at the top of the fork

688 drop_here = used_here || cant_push

692 cant_push

694 -- Remember n_alts > 1

696 -- floatIsDupable: see Note [Duplicating floats]

699 -- floatIsCase: see Note [Floating primops]

705 fork_boxes used_in_flags

716 {- Note [Duplicating floats]

717 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

719 For case expressions we duplicate the binding if it is reasonably

720 small, and if it is not used in all the RHSs This is good for

721 situations like

722 let x = I# y in

723 case e of

724 C -> error x

725 D -> error x

726 E -> ...not mentioning x...

728 If the thing is used in all RHSs there is nothing gained,

729 so we don't duplicate then.

730 -}

733 floatedBindsFVs binds = mapUnionDVarSet fbFVs binds

739 -- Remember FloatInBinds is in *reverse* dependency order