1 {-

2 (c) The University of Glasgow 2006

3 (c) The GRASP/AQUA Project, Glasgow University, 1992-1998

6 Arity and eta expansion

7 -}

9 {-# LANGUAGE CPP #-}

11 -- | Arity and eta expansion

16 exprBotStrictness_maybe

21 import CoreSyn

22 import CoreFVs

23 import CoreUtils

24 import CoreSubst

25 import Demand

26 import Var

27 import VarEnv

28 import Id

29 import Type

31 import Coercion

32 import BasicTypes

33 import Unique

35 import Outputable

36 import FastString

37 import Pair

40 {-

41 ************************************************************************

42 * *

43 manifestArity and exprArity

44 * *

45 ************************************************************************

47 exprArity is a cheap-and-cheerful version of exprEtaExpandArity.

48 It tells how many things the expression can be applied to before doing

49 any work. It doesn't look inside cases, lets, etc. The idea is that

50 exprEtaExpandArity will do the hard work, leaving something that's easy

51 for exprArity to grapple with. In particular, Simplify uses exprArity to

52 compute the ArityInfo for the Id.

54 Originally I thought that it was enough just to look for top-level lambdas, but

55 it isn't. I've seen this

57 foo = PrelBase.timesInt

59 We want foo to get arity 2 even though the eta-expander will leave it

60 unchanged, in the expectation that it'll be inlined. But occasionally it

61 isn't, because foo is blacklisted (used in a rule).

63 Similarly, see the ok_note check in exprEtaExpandArity. So

64 f = __inline_me (\x -> e)

65 won't be eta-expanded.

67 And in any case it seems more robust to have exprArity be a bit more intelligent.

68 But note that (\x y z -> f x y z)

69 should have arity 3, regardless of f's arity.

70 -}

73 -- ^ manifestArity sees how many leading value lambdas there are,

74 -- after looking through casts

82 -- Join points are supposed to have manifestly-visible

83 -- lambdas at the top: no ticks, no casts, nothing

84 -- Moreover, type lambdas count in JoinArity

89 ---------------

91 -- ^ An approximate, fast, version of 'exprEtaExpandArity'

92 exprArity e = go e

93 where

99 -- Note [exprArity invariant]

102 -- See Note [exprArity for applications]

103 -- NB: coercions count as a value argument

110 ---------------

112 -- How many value arrows are visible in the type?

113 -- We look through foralls, and newtypes

114 -- See Note [exprArity invariant]

115 typeArity ty

116 = go initRecTc ty

117 where

118 go rec_nts ty

128 -- in TyCon

129 -- , not (isClassTyCon tc) -- Do not eta-expand through newtype classes

130 -- -- See Note [Newtype classes and eta expansion]

131 -- (no longer required)

133 -- Important to look through non-recursive newtypes, so that, eg

134 -- (f x) where f has arity 2, f :: Int -> IO ()

135 -- Here we want to get arity 1 for the result!

136 --

137 -- AND through a layer of recursive newtypes

138 -- e.g. newtype Stream m a b = Stream (m (Either b (a, Stream m a b)))

140 | otherwise

141 = []

143 ---------------

145 -- A cheap and cheerful function that identifies bottoming functions

146 -- and gives them a suitable strictness signatures. It's used during

147 -- float-out

148 exprBotStrictness_maybe e

150 Nothing -> Nothing

152 where

155 -- For this purpose we can be very simple

156 -- exnRes is a bit less aggressive than botRes

158 {-

159 Note [exprArity invariant]

160 ~~~~~~~~~~~~~~~~~~~~~~~~~~

161 exprArity has the following invariant:

163 (1) If typeArity (exprType e) = n,

164 then manifestArity (etaExpand e n) = n

166 That is, etaExpand can always expand as much as typeArity says

167 So the case analysis in etaExpand and in typeArity must match

169 (2) exprArity e <= typeArity (exprType e)

171 (3) Hence if (exprArity e) = n, then manifestArity (etaExpand e n) = n

173 That is, if exprArity says "the arity is n" then etaExpand really

174 can get "n" manifest lambdas to the top.

176 Why is this important? Because

177 - In TidyPgm we use exprArity to fix the *final arity* of

178 each top-level Id, and in

179 - In CorePrep we use etaExpand on each rhs, so that the visible lambdas

180 actually match that arity, which in turn means

181 that the StgRhs has the right number of lambdas

183 An alternative would be to do the eta-expansion in TidyPgm, at least

184 for top-level bindings, in which case we would not need the trim_arity

185 in exprArity. That is a less local change, so I'm going to leave it for today!

187 Note [Newtype classes and eta expansion]

188 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

189 NB: this nasty special case is no longer required, because

190 for newtype classes we don't use the class-op rule mechanism

191 at all. See Note [Single-method classes] in TcInstDcls. SLPJ May 2013

193 -------- Old out of date comments, just for interest -----------

194 We have to be careful when eta-expanding through newtypes. In general

195 it's a good idea, but annoyingly it interacts badly with the class-op

196 rule mechanism. Consider

198 class C a where { op :: a -> a }

199 instance C b => C [b] where

200 op x = ...

202 These translate to

204 co :: forall a. (a->a) ~ C a

206 $copList :: C b -> [b] -> [b]

207 $copList d x = ...

209 $dfList :: C b -> C [b]

210 {-# DFunUnfolding = [$copList] #-}

213 Now suppose we have:

221 blah = $copList dCInt

233 -------- End of old out of date comments, just for interest -----------

237 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

238 When we come to an application we check that the arg is trivial.

247 have arity decreases.

253 ************************************************************************

254 * *

256 * *

257 ************************************************************************

260 ~~~~~~~~~~~~~~~~~~~~~~~~~~

263 converges rapidly

265 Or, to put it another way

267 there is no work lost in duplicating the partial

282 ~~~~~~~~~~~~~~~~~~~~~~~

283 Consider one-shot lambdas

288 ~~~~~~~~~~~~~~~~~~~~~~~~~~

289 A Big Deal with computing arities is expressions like

304 would lose an important transformation for many programs. (See

307 Consider also

320 have several tickets reporting unexpected behaviour resulting from

331 going to diverge eventually anyway then getting the best arity

342 ~~~~~~~~~~~~~~~~~~~~~~~~

348 Suppose we have

349 e = coerce T f

362 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

363 Suppose we have

364 f = e

366 a usage type like

382 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

395 Extrude the g2

400 Discard args for bottomming function

405 Extrude g1.g3

418 After inlining (>>) we get

434 ~~~~~~~~~~~~~~~~

453 its definition.

457 themselves.

466 Example:

479 -------------------- Main arity code ----------------------------

480 -}

482 -- See Note [ArityType]

484 -- There is always an explicit lambda

485 -- to justify the [OneShot], or the Arity

491 vanillaArityType :: ArityType

494 -- ^ The Arity returned is the number of value args the

495 -- expression can be applied to without doing much work

497 -- exprEtaExpandArity is used when eta expanding

498 -- e ==> \xy -> e x y

499 exprEtaExpandArity dflags e

502 ABot n -> n

503 where

508 -- Arity of a divergent function

510 getBotArity _ = Nothing

513 mk_cheap_fn dflags cheap_app

516 | otherwise

520 Just ty -> isDictLikeTy ty

523 ----------------------

525 -- This implements the fixpoint loop for arity analysis

526 -- See Note [Arity analysis]

527 -- If findRhsArity e = (n, is_bot) then

528 -- (a) any application of e to <n arguments will not do much work,

529 -- so it is safe to expand e ==> (\x1..xn. e x1 .. xn)

530 -- (b) if is_bot=True, then e applied to n args is guaranteed bottom

531 findRhsArity dflags bndr rhs old_arity

533 -- We always call exprEtaExpandArity once, but usually

534 -- that produces a result equal to old_arity, and then

535 -- we stop right away (since arities should not decrease)

536 -- Result: the common case is that there is just one iteration

537 where

538 is_lam = has_lam rhs

544 init_cheap_app :: CheapAppFun

545 init_cheap_app fn n_val_args

555 pprTrace "Exciting arity"

558 #endif

559 go new_info

560 where

563 cheap_app :: CheapAppFun

564 cheap_app fn n_val_args

569 get_arity cheap_app

575 where

579 {-

580 Note [Arity analysis]

581 ~~~~~~~~~~~~~~~~~~~~~

582 The motivating example for arity analysis is this:

584 f = \x. let g = f (x+1)

585 in \y. ...g...

587 What arity does f have? Really it should have arity 2, but a naive

588 look at the RHS won't see that. You need a fixpoint analysis which

589 says it has arity "infinity" the first time round.

591 This example happens a lot; it first showed up in Andy Gill's thesis,

592 fifteen years ago! It also shows up in the code for 'rnf' on lists

593 in Trac #4138.

595 The analysis is easy to achieve because exprEtaExpandArity takes an

596 argument

597 type CheapFun = CoreExpr -> Maybe Type -> Bool

598 used to decide if an expression is cheap enough to push inside a

599 lambda. And exprIsCheap' in turn takes an argument

600 type CheapAppFun = Id -> Int -> Bool

601 which tells when an application is cheap. This makes it easy to

602 write the analysis loop.

604 The analysis is cheap-and-cheerful because it doesn't deal with

605 mutual recursion. But the self-recursive case is the important one.

608 Note [Eta expanding through dictionaries]

609 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

610 If the experimental -fdicts-cheap flag is on, we eta-expand through

611 dictionary bindings. This improves arities. Thereby, it also

612 means that full laziness is less prone to floating out the

613 application of a function to its dictionary arguments, which

614 can thereby lose opportunities for fusion. Example:

615 foo :: Ord a => a -> ...

616 foo = /\a \(d:Ord a). let d' = ...d... in \(x:a). ....

617 -- So foo has arity 1

619 f = \x. foo dInt $ bar x

621 The (foo DInt) is floated out, and makes ineffective a RULE

622 foo (bar x) = ...

624 One could go further and make exprIsCheap reply True to any

625 dictionary-typed expression, but that's more work.

627 See Note [Dictionary-like types] in TcType.hs for why we use

628 isDictLikeTy here rather than isDictTy

630 Note [Eta expanding thunks]

631 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

632 We don't eta-expand

633 * Trivial RHSs x = y

634 * PAPs x = map g

635 * Thunks f = case y of p -> \x -> blah

637 When we see

638 f = case y of p -> \x -> blah

639 should we eta-expand it? Well, if 'x' is a one-shot state token

640 then 'yes' because 'f' will only be applied once. But otherwise

641 we (conservatively) say no. My main reason is to avoid expanding

642 PAPSs

643 f = g d ==> f = \x. g d x

644 because that might in turn make g inline (if it has an inline pragma),

645 which we might not want. After all, INLINE pragmas say "inline only

646 when saturated" so we don't want to be too gung-ho about saturating!

647 -}

654 -- We have something like (let x = E in b),

655 -- where b has the given arity type.

659 -- If E is not cheap, keep arity only for one-shots

662 -- Processing (fun arg) where at is the ArityType of fun,

663 -- Knock off an argument and behave like 'let'

679 {- Note [ABot branches: use max]

680 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

681 Consider case x of

682 True -> \x. error "urk"

683 False -> \xy. error "urk2"

685 Remember: ABot n means "if you apply to n args, it'll definitely diverge".

686 So we need (ABot 2) for the whole thing, the /max/ of the ABot arities.

688 Note [Combining case branches]

689 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

690 Consider

691 go = \x. let z = go e0

692 go2 = \x. case x of

693 True -> z

694 False -> \s(one-shot). e1

695 in go2 x

696 We *really* want to eta-expand go and go2.

697 When combining the barnches of the case we have

698 ATop [] `andAT` ATop [OneShotLam]

699 and we want to get ATop [OneShotLam]. But if the inner

700 lambda wasn't one-shot we don't want to do this.

701 (We need a proper arity analysis to justify that.)

703 So we combine the best of the two branches, on the (slightly dodgy)

704 basis that if we know one branch is one-shot, then they all must be.

705 -}

707 ---------------------------

709 -- How to decide if an expression is cheap

710 -- If the Maybe is Just, the type is the type

711 -- of the expression; Nothing means "don't know"

713 data ArityEnv

716 }

724 where

726 -- See Note [exprArity invariant] (2); must be true of

727 -- arityType too, since that is how we compute the arity

728 -- of variables, and they in turn affect result of exprArity

729 -- Trac #5441 is a nice demo

730 -- However, do make sure that ATop -> ATop and ABot -> ABot!

731 -- Casts don't affect that part. Getting this wrong provoked #5475

734 | strict_sig <- idStrictness v

740 | otherwise

742 where

746 -- Lambdas; increase arity

751 -- Applications; decrease arity, except for types

753 = arityType env fun

757 -- Case/Let; keep arity if either the expression is cheap

758 -- or it's a 1-shot lambda

759 -- The former is not really right for Haskell

760 -- f x = case x of { (a,b) -> \y. e }

761 -- ===>

762 -- f x y = case x of { (a,b) -> e }

763 -- The difference is observable using 'seq'

764 --

766 | exprIsBottom scrut || null alts

768 -- See Note [Dealing with bottom (1)]

769 | otherwise

773 -- See Note [Dealing with bottom (2)]

779 where

784 where

792 arityType _ _ = vanillaArityType

794 {-

795 %************************************************************************

796 %* *

797 The main eta-expander

798 %* *

799 %************************************************************************

801 We go for:

802 f = \x1..xn -> N ==> f = \x1..xn y1..ym -> N y1..ym

803 (n >= 0)

805 where (in both cases)

807 * The xi can include type variables

809 * The yi are all value variables

811 * N is a NORMAL FORM (i.e. no redexes anywhere)

812 wanting a suitable number of extra args.

814 The biggest reason for doing this is for cases like

816 f = \x -> case x of

817 True -> \y -> e1

818 False -> \y -> e2

820 Here we want to get the lambdas together. A good example is the nofib

821 program fibheaps, which gets 25% more allocation if you don't do this

822 eta-expansion.

824 We may have to sandwich some coerces between the lambdas

825 to make the types work. exprEtaExpandArity looks through coerces

826 when computing arity; and etaExpand adds the coerces as necessary when

827 actually computing the expansion.

829 Note [No crap in eta-expanded code]

830 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

831 The eta expander is careful not to introduce "crap". In particular,

832 given a CoreExpr satisfying the 'CpeRhs' invariant (in CorePrep), it

833 returns a CoreExpr satisfying the same invariant. See Note [Eta

834 expansion and the CorePrep invariants] in CorePrep.

836 This means the eta-expander has to do a bit of on-the-fly

837 simplification but it's not too hard. The alernative, of relying on

838 a subsequent clean-up phase of the Simplifier to de-crapify the result,

839 means you can't really use it in CorePrep, which is painful.

841 Note [Eta expansion for join points]

842 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

843 The no-crap rule is very tiresome to guarantee when

844 we have join points. Consider eta-expanding

845 let j :: Int -> Int -> Bool

846 j x = e

847 in b

849 The simple way is

850 \(y::Int). (let j x = e in b) y

852 The no-crap way is

853 \(y::Int). let j' :: Int -> Bool

854 j' x = e y

855 in b[j'/j] y

856 where I have written to stress that j's type has

857 changed. Note that (of course!) we have to push the application

858 inside the RHS of the join as well as into the body. AND if j

859 has an unfolding we have to push it into there too. AND j might

860 be recursive...

862 So for now I'm abandonig the no-crap rule in this case. I think

863 that for the use in CorePrep it really doesn't matter; and if

864 it does, then CoreToStg.myCollectArgs will fall over.

866 (Moreover, I think that casts can make the no-crap rule fail too.)

868 Note [Eta expansion and SCCs]

869 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

870 Note that SCCs are not treated specially by etaExpand. If we have

871 etaExpand 2 (\x -> scc "foo" e)

872 = (\xy -> (scc "foo" e) y)

873 So the costs of evaluating 'e' (not 'e y') are attributed to "foo"

875 Note [Eta expansion and source notes]

876 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

877 CorePrep puts floatable ticks outside of value applications, but not

878 type applications. As a result we might be trying to eta-expand an

879 expression like

881 (src<...> v) @a

883 which we want to lead to code like

885 \x -> src<...> v @a x

887 This means that we need to look through type applications and be ready

888 to re-add floats on the top.

890 -}

892 -- | @etaExpand n e@ returns an expression with

893 -- the same meaning as @e@, but with arity @n@.

894 --

895 -- Given:

896 --

897 -- > e' = etaExpand n e

898 --

899 -- We should have that:

900 --

901 -- > ty = exprType e = exprType e'

904 -> CoreExpr

905 -- etaExpand arity e = res

906 -- Then 'res' has at least 'arity' lambdas at the top

907 --

908 -- etaExpand deals with for-alls. For example:

909 -- etaExpand 1 E

910 -- where E :: forall a. a -> a

911 -- would return

912 -- (/\b. \y::a -> E b y)

913 --

914 -- It deals with coerces too, though they are now rare

915 -- so perhaps the extra code isn't worth it

917 etaExpand n orig_expr

918 = go n orig_expr

919 where

920 -- Strip off existing lambdas and casts

921 -- Note [Eta expansion and SCCs]

926 go n expr

929 where

934 -- Find ticks behind type apps.

935 -- See Note [Eta expansion and source notes]

941 -- Abstraction Application

942 --------------

944 -- \x. [] [] x

945 | EtaCo Coercion -- [] |> sym co [] |> co

953 | isReflCo co = eis

955 where

956 co = co1 `mkTransCo` co2

960 --------------

966 --------------

968 -- (etaInfoApp s e eis) returns something equivalent to

969 -- ((substExpr s e) `appliedto` eis)

976 where

981 where

986 where

991 -- See Note [Eta expansion for join points]

993 where

999 etaInfoApp subst expr _

1003 = subst_expr subst expr

1005 etaInfoApp subst e eis

1007 where

1008 go e [] = e

1013 --------------

1015 -- If e :: ty

1016 -- then etaInfoApp e eis :: etaInfoApp ty eis

1017 etaInfoAppTy ty [] = ty

1021 --------------

1024 -- EtaInfo contains fresh variables,

1025 -- not free in the incoming CoreExpr

1026 -- Outgoing InScopeSet includes the EtaInfo vars

1027 -- and the original free vars

1029 mkEtaWW orig_n orig_expr in_scope orig_ty

1031 where

1032 empty_subst = mkEmptyTCvSubst in_scope

1034 go n subst ty eis -- See Note [exprArity invariant]

1040 -- Avoid free vars of the original expression

1045 -- See Note [Levity polymorphism invariants] in CoreSyn

1046 -- See also test case typecheck/should_run/EtaExpandLevPoly

1049 -- Avoid free vars of the original expression

1054 -- newtype T = MkT ([T] -> Int)

1055 -- Consider eta-expanding this

1056 -- eta_expand 1 e T

1057 -- We want to get

1058 -- coerce T (\x::[T] -> (coerce ([T]->Int) e) x)

1062 -- but its type isn't a function, or a binder

1063 -- is levity-polymorphic

1066 -- This *can* legitmately happen:

1067 -- e.g. coerce Int (\x. x) Essentially the programmer is

1068 -- playing fast and loose with types (Happy does this a lot).

1069 -- So we simply decline to eta-expand. Otherwise we'd end up

1070 -- with an explicit lambda having a non-function type

1074 --------------

1075 -- Don't use short-cutting substitution - we may be changing the types of join

1076 -- points, so applying the in-scope set is necessary

1077 -- TODO Check if we actually *are* changing any join points' types

1083 --------------

1085 -- | Split an expression into the given number of binders and a body,

1086 -- eta-expanding if necessary. Counts value *and* type binders.

1088 etaExpandToJoinPoint join_arity expr

1090 where

1099 -- How did a local binding get a built-in rule anyway? Probably a plugin.

1100 rule

1104 = rule

1107 | otherwise

1110 where

1113 new_args = varsToCoreExprs new_bndrs

1115 -- Adds as many binders as asked for; assumes expr is not a lambda

1117 etaBodyForJoinPoint need_args body

1119 where

1120 go 0 _ _ rev_bs e

1122 go n ty subst rev_bs e

1129 | otherwise

1135 --------------

1137 -- Make a fresh Id, with specified type (after applying substitution)

1138 -- It should be "fresh" in the sense that it's not in the in-scope set

1139 -- of the TvSubstEnv; and it should itself then be added to the in-scope

1140 -- set of the TvSubstEnv

1141 --

1142 -- The Int is just a reasonable starting point for generating a unique;

1143 -- it does not necessarily have to be unique itself.

1144 freshEtaId n subst ty

1146 where