1 {-

2 (c) The University of Glasgow 2006

3 (c) The GRASP/AQUA Project, Glasgow University, 1992-1998

6 Arity and eta expansion

7 -}

9 {-# LANGUAGE CPP #-}

11 -- | Arity and eta expansion

16 exprBotStrictness_maybe

21 import GhcPrelude

23 import CoreSyn

24 import CoreFVs

25 import CoreUtils

26 import CoreSubst

27 import Demand

28 import Var

29 import VarEnv

30 import Id

31 import Type

33 import Coercion

34 import BasicTypes

35 import Unique

37 import Outputable

38 import FastString

39 import Pair

42 {-

43 ************************************************************************

44 * *

45 manifestArity and exprArity

46 * *

47 ************************************************************************

49 exprArity is a cheap-and-cheerful version of exprEtaExpandArity.

50 It tells how many things the expression can be applied to before doing

51 any work. It doesn't look inside cases, lets, etc. The idea is that

52 exprEtaExpandArity will do the hard work, leaving something that's easy

53 for exprArity to grapple with. In particular, Simplify uses exprArity to

54 compute the ArityInfo for the Id.

56 Originally I thought that it was enough just to look for top-level lambdas, but

57 it isn't. I've seen this

59 foo = PrelBase.timesInt

61 We want foo to get arity 2 even though the eta-expander will leave it

62 unchanged, in the expectation that it'll be inlined. But occasionally it

63 isn't, because foo is blacklisted (used in a rule).

65 Similarly, see the ok_note check in exprEtaExpandArity. So

66 f = __inline_me (\x -> e)

67 won't be eta-expanded.

69 And in any case it seems more robust to have exprArity be a bit more intelligent.

70 But note that (\x y z -> f x y z)

71 should have arity 3, regardless of f's arity.

72 -}

75 -- ^ manifestArity sees how many leading value lambdas there are,

76 -- after looking through casts

84 -- Join points are supposed to have manifestly-visible

85 -- lambdas at the top: no ticks, no casts, nothing

86 -- Moreover, type lambdas count in JoinArity

91 ---------------

93 -- ^ An approximate, fast, version of 'exprEtaExpandArity'

94 exprArity e = go e

95 where

101 -- Note [exprArity invariant]

104 -- See Note [exprArity for applications]

105 -- NB: coercions count as a value argument

112 ---------------

114 -- How many value arrows are visible in the type?

115 -- We look through foralls, and newtypes

116 -- See Note [exprArity invariant]

117 typeArity ty

118 = go initRecTc ty

119 where

120 go rec_nts ty

130 -- in TyCon

131 -- , not (isClassTyCon tc) -- Do not eta-expand through newtype classes

132 -- -- See Note [Newtype classes and eta expansion]

133 -- (no longer required)

135 -- Important to look through non-recursive newtypes, so that, eg

136 -- (f x) where f has arity 2, f :: Int -> IO ()

137 -- Here we want to get arity 1 for the result!

138 --

139 -- AND through a layer of recursive newtypes

140 -- e.g. newtype Stream m a b = Stream (m (Either b (a, Stream m a b)))

142 | otherwise

143 = []

145 ---------------

147 -- A cheap and cheerful function that identifies bottoming functions

148 -- and gives them a suitable strictness signatures. It's used during

149 -- float-out

150 exprBotStrictness_maybe e

152 Nothing -> Nothing

154 where

157 -- For this purpose we can be very simple

158 -- exnRes is a bit less aggressive than botRes

160 {-

161 Note [exprArity invariant]

162 ~~~~~~~~~~~~~~~~~~~~~~~~~~

163 exprArity has the following invariant:

165 (1) If typeArity (exprType e) = n,

166 then manifestArity (etaExpand e n) = n

168 That is, etaExpand can always expand as much as typeArity says

169 So the case analysis in etaExpand and in typeArity must match

171 (2) exprArity e <= typeArity (exprType e)

173 (3) Hence if (exprArity e) = n, then manifestArity (etaExpand e n) = n

175 That is, if exprArity says "the arity is n" then etaExpand really

176 can get "n" manifest lambdas to the top.

178 Why is this important? Because

179 - In TidyPgm we use exprArity to fix the *final arity* of

180 each top-level Id, and in

181 - In CorePrep we use etaExpand on each rhs, so that the visible lambdas

182 actually match that arity, which in turn means

183 that the StgRhs has the right number of lambdas

185 An alternative would be to do the eta-expansion in TidyPgm, at least

186 for top-level bindings, in which case we would not need the trim_arity

187 in exprArity. That is a less local change, so I'm going to leave it for today!

189 Note [Newtype classes and eta expansion]

190 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

191 NB: this nasty special case is no longer required, because

192 for newtype classes we don't use the class-op rule mechanism

193 at all. See Note [Single-method classes] in TcInstDcls. SLPJ May 2013

195 -------- Old out of date comments, just for interest -----------

196 We have to be careful when eta-expanding through newtypes. In general

197 it's a good idea, but annoyingly it interacts badly with the class-op

198 rule mechanism. Consider

200 class C a where { op :: a -> a }

201 instance C b => C [b] where

202 op x = ...

204 These translate to

206 co :: forall a. (a->a) ~ C a

208 $copList :: C b -> [b] -> [b]

209 $copList d x = ...

211 $dfList :: C b -> C [b]

212 {-# DFunUnfolding = [$copList] #-}

215 Now suppose we have:

223 blah = $copList dCInt

235 -------- End of old out of date comments, just for interest -----------

239 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

240 When we come to an application we check that the arg is trivial.

249 have arity decreases.

255 ************************************************************************

256 * *

258 * *

259 ************************************************************************

262 ~~~~~~~~~~~~~~~~~~~~~~~~~~

265 converges rapidly

267 Or, to put it another way

269 there is no work lost in duplicating the partial

284 ~~~~~~~~~~~~~~~~~~~~~~~

285 Consider one-shot lambdas

290 ~~~~~~~~~~~~~~~~~~~~~~~~~~

291 A Big Deal with computing arities is expressions like

306 would lose an important transformation for many programs. (See

309 Consider also

322 have several tickets reporting unexpected behaviour resulting from

333 going to diverge eventually anyway then getting the best arity

344 ~~~~~~~~~~~~~~~~~~~~~~~~

350 Suppose we have

351 e = coerce T f

364 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

365 Suppose we have

366 f = e

368 a usage type like

384 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

397 Extrude the g2

402 Discard args for bottomming function

407 Extrude g1.g3

420 After inlining (>>) we get

436 ~~~~~~~~~~~~~~~~

455 its definition.

459 themselves.

468 Example:

481 -------------------- Main arity code ----------------------------

482 -}

484 -- See Note [ArityType]

486 -- There is always an explicit lambda

487 -- to justify the [OneShot], or the Arity

493 vanillaArityType :: ArityType

496 -- ^ The Arity returned is the number of value args the

497 -- expression can be applied to without doing much work

499 -- exprEtaExpandArity is used when eta expanding

500 -- e ==> \xy -> e x y

501 exprEtaExpandArity dflags e

504 ABot n -> n

505 where

510 -- Arity of a divergent function

512 getBotArity _ = Nothing

515 mk_cheap_fn dflags cheap_app

518 | otherwise

522 Just ty -> isDictLikeTy ty

525 ----------------------

527 -- This implements the fixpoint loop for arity analysis

528 -- See Note [Arity analysis]

529 -- If findRhsArity e = (n, is_bot) then

530 -- (a) any application of e to <n arguments will not do much work,

531 -- so it is safe to expand e ==> (\x1..xn. e x1 .. xn)

532 -- (b) if is_bot=True, then e applied to n args is guaranteed bottom

533 findRhsArity dflags bndr rhs old_arity

535 -- We always call exprEtaExpandArity once, but usually

536 -- that produces a result equal to old_arity, and then

537 -- we stop right away (since arities should not decrease)

538 -- Result: the common case is that there is just one iteration

539 where

540 is_lam = has_lam rhs

546 init_cheap_app :: CheapAppFun

547 init_cheap_app fn n_val_args

557 pprTrace "Exciting arity"

560 #endif

561 go new_info

562 where

565 cheap_app :: CheapAppFun

566 cheap_app fn n_val_args

571 get_arity cheap_app

577 where

581 {-

582 Note [Arity analysis]

583 ~~~~~~~~~~~~~~~~~~~~~

584 The motivating example for arity analysis is this:

586 f = \x. let g = f (x+1)

587 in \y. ...g...

589 What arity does f have? Really it should have arity 2, but a naive

590 look at the RHS won't see that. You need a fixpoint analysis which

591 says it has arity "infinity" the first time round.

593 This example happens a lot; it first showed up in Andy Gill's thesis,

594 fifteen years ago! It also shows up in the code for 'rnf' on lists

595 in Trac #4138.

597 The analysis is easy to achieve because exprEtaExpandArity takes an

598 argument

599 type CheapFun = CoreExpr -> Maybe Type -> Bool

600 used to decide if an expression is cheap enough to push inside a

601 lambda. And exprIsCheap' in turn takes an argument

602 type CheapAppFun = Id -> Int -> Bool

603 which tells when an application is cheap. This makes it easy to

604 write the analysis loop.

606 The analysis is cheap-and-cheerful because it doesn't deal with

607 mutual recursion. But the self-recursive case is the important one.

610 Note [Eta expanding through dictionaries]

611 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

612 If the experimental -fdicts-cheap flag is on, we eta-expand through

613 dictionary bindings. This improves arities. Thereby, it also

614 means that full laziness is less prone to floating out the

615 application of a function to its dictionary arguments, which

616 can thereby lose opportunities for fusion. Example:

617 foo :: Ord a => a -> ...

618 foo = /\a \(d:Ord a). let d' = ...d... in \(x:a). ....

619 -- So foo has arity 1

621 f = \x. foo dInt $ bar x

623 The (foo DInt) is floated out, and makes ineffective a RULE

624 foo (bar x) = ...

626 One could go further and make exprIsCheap reply True to any

627 dictionary-typed expression, but that's more work.

629 See Note [Dictionary-like types] in TcType.hs for why we use

630 isDictLikeTy here rather than isDictTy

632 Note [Eta expanding thunks]

633 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

634 We don't eta-expand

635 * Trivial RHSs x = y

636 * PAPs x = map g

637 * Thunks f = case y of p -> \x -> blah

639 When we see

640 f = case y of p -> \x -> blah

641 should we eta-expand it? Well, if 'x' is a one-shot state token

642 then 'yes' because 'f' will only be applied once. But otherwise

643 we (conservatively) say no. My main reason is to avoid expanding

644 PAPSs

645 f = g d ==> f = \x. g d x

646 because that might in turn make g inline (if it has an inline pragma),

647 which we might not want. After all, INLINE pragmas say "inline only

648 when saturated" so we don't want to be too gung-ho about saturating!

649 -}

656 -- We have something like (let x = E in b),

657 -- where b has the given arity type.

661 -- If E is not cheap, keep arity only for one-shots

664 -- Processing (fun arg) where at is the ArityType of fun,

665 -- Knock off an argument and behave like 'let'

681 {- Note [ABot branches: use max]

682 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

683 Consider case x of

684 True -> \x. error "urk"

685 False -> \xy. error "urk2"

687 Remember: ABot n means "if you apply to n args, it'll definitely diverge".

688 So we need (ABot 2) for the whole thing, the /max/ of the ABot arities.

690 Note [Combining case branches]

691 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

692 Consider

693 go = \x. let z = go e0

694 go2 = \x. case x of

695 True -> z

696 False -> \s(one-shot). e1

697 in go2 x

698 We *really* want to eta-expand go and go2.

699 When combining the barnches of the case we have

700 ATop [] `andAT` ATop [OneShotLam]

701 and we want to get ATop [OneShotLam]. But if the inner

702 lambda wasn't one-shot we don't want to do this.

703 (We need a proper arity analysis to justify that.)

705 So we combine the best of the two branches, on the (slightly dodgy)

706 basis that if we know one branch is one-shot, then they all must be.

707 -}

709 ---------------------------

711 -- How to decide if an expression is cheap

712 -- If the Maybe is Just, the type is the type

713 -- of the expression; Nothing means "don't know"

715 data ArityEnv

718 }

726 where

728 -- See Note [exprArity invariant] (2); must be true of

729 -- arityType too, since that is how we compute the arity

730 -- of variables, and they in turn affect result of exprArity

731 -- Trac #5441 is a nice demo

732 -- However, do make sure that ATop -> ATop and ABot -> ABot!

733 -- Casts don't affect that part. Getting this wrong provoked #5475

736 | strict_sig <- idStrictness v

742 | otherwise

744 where

748 -- Lambdas; increase arity

753 -- Applications; decrease arity, except for types

755 = arityType env fun

759 -- Case/Let; keep arity if either the expression is cheap

760 -- or it's a 1-shot lambda

761 -- The former is not really right for Haskell

762 -- f x = case x of { (a,b) -> \y. e }

763 -- ===>

764 -- f x y = case x of { (a,b) -> e }

765 -- The difference is observable using 'seq'

766 --

768 | exprIsBottom scrut || null alts

770 -- See Note [Dealing with bottom (1)]

771 | otherwise

775 -- See Note [Dealing with bottom (2)]

781 where

786 where

794 arityType _ _ = vanillaArityType

796 {-

797 %************************************************************************

798 %* *

799 The main eta-expander

800 %* *

801 %************************************************************************

803 We go for:

804 f = \x1..xn -> N ==> f = \x1..xn y1..ym -> N y1..ym

805 (n >= 0)

807 where (in both cases)

809 * The xi can include type variables

811 * The yi are all value variables

813 * N is a NORMAL FORM (i.e. no redexes anywhere)

814 wanting a suitable number of extra args.

816 The biggest reason for doing this is for cases like

818 f = \x -> case x of

819 True -> \y -> e1

820 False -> \y -> e2

822 Here we want to get the lambdas together. A good example is the nofib

823 program fibheaps, which gets 25% more allocation if you don't do this

824 eta-expansion.

826 We may have to sandwich some coerces between the lambdas

827 to make the types work. exprEtaExpandArity looks through coerces

828 when computing arity; and etaExpand adds the coerces as necessary when

829 actually computing the expansion.

831 Note [No crap in eta-expanded code]

832 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

833 The eta expander is careful not to introduce "crap". In particular,

834 given a CoreExpr satisfying the 'CpeRhs' invariant (in CorePrep), it

835 returns a CoreExpr satisfying the same invariant. See Note [Eta

836 expansion and the CorePrep invariants] in CorePrep.

838 This means the eta-expander has to do a bit of on-the-fly

839 simplification but it's not too hard. The alernative, of relying on

840 a subsequent clean-up phase of the Simplifier to de-crapify the result,

841 means you can't really use it in CorePrep, which is painful.

843 Note [Eta expansion for join points]

844 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

845 The no-crap rule is very tiresome to guarantee when

846 we have join points. Consider eta-expanding

847 let j :: Int -> Int -> Bool

848 j x = e

849 in b

851 The simple way is

852 \(y::Int). (let j x = e in b) y

854 The no-crap way is

855 \(y::Int). let j' :: Int -> Bool

856 j' x = e y

857 in b[j'/j] y

858 where I have written to stress that j's type has

859 changed. Note that (of course!) we have to push the application

860 inside the RHS of the join as well as into the body. AND if j

861 has an unfolding we have to push it into there too. AND j might

862 be recursive...

864 So for now I'm abandonig the no-crap rule in this case. I think

865 that for the use in CorePrep it really doesn't matter; and if

866 it does, then CoreToStg.myCollectArgs will fall over.

868 (Moreover, I think that casts can make the no-crap rule fail too.)

870 Note [Eta expansion and SCCs]

871 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

872 Note that SCCs are not treated specially by etaExpand. If we have

873 etaExpand 2 (\x -> scc "foo" e)

874 = (\xy -> (scc "foo" e) y)

875 So the costs of evaluating 'e' (not 'e y') are attributed to "foo"

877 Note [Eta expansion and source notes]

878 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

879 CorePrep puts floatable ticks outside of value applications, but not

880 type applications. As a result we might be trying to eta-expand an

881 expression like

883 (src<...> v) @a

885 which we want to lead to code like

887 \x -> src<...> v @a x

889 This means that we need to look through type applications and be ready

890 to re-add floats on the top.

892 -}

894 -- | @etaExpand n e@ returns an expression with

895 -- the same meaning as @e@, but with arity @n@.

896 --

897 -- Given:

898 --

899 -- > e' = etaExpand n e

900 --

901 -- We should have that:

902 --

903 -- > ty = exprType e = exprType e'

906 -> CoreExpr

907 -- etaExpand arity e = res

908 -- Then 'res' has at least 'arity' lambdas at the top

909 --

910 -- etaExpand deals with for-alls. For example:

911 -- etaExpand 1 E

912 -- where E :: forall a. a -> a

913 -- would return

914 -- (/\b. \y::a -> E b y)

915 --

916 -- It deals with coerces too, though they are now rare

917 -- so perhaps the extra code isn't worth it

919 etaExpand n orig_expr

920 = go n orig_expr

921 where

922 -- Strip off existing lambdas and casts

923 -- Note [Eta expansion and SCCs]

928 go n expr

931 where

936 -- Find ticks behind type apps.

937 -- See Note [Eta expansion and source notes]

943 -- Abstraction Application

944 --------------

946 -- \x. [] [] x

947 | EtaCo Coercion -- [] |> sym co [] |> co

955 | isReflCo co = eis

957 where

958 co = co1 `mkTransCo` co2

962 --------------

968 --------------

970 -- (etaInfoApp s e eis) returns something equivalent to

971 -- ((substExpr s e) `appliedto` eis)

978 where

983 where

988 where

993 -- See Note [Eta expansion for join points]

995 where

1001 etaInfoApp subst expr _

1005 = subst_expr subst expr

1007 etaInfoApp subst e eis

1009 where

1010 go e [] = e

1015 --------------

1017 -- If e :: ty

1018 -- then etaInfoApp e eis :: etaInfoApp ty eis

1019 etaInfoAppTy ty [] = ty

1023 --------------

1026 -- EtaInfo contains fresh variables,

1027 -- not free in the incoming CoreExpr

1028 -- Outgoing InScopeSet includes the EtaInfo vars

1029 -- and the original free vars

1031 mkEtaWW orig_n orig_expr in_scope orig_ty

1033 where

1034 empty_subst = mkEmptyTCvSubst in_scope

1036 go n subst ty eis -- See Note [exprArity invariant]

1042 -- Avoid free vars of the original expression

1047 -- See Note [Levity polymorphism invariants] in CoreSyn

1048 -- See also test case typecheck/should_run/EtaExpandLevPoly

1051 -- Avoid free vars of the original expression

1056 -- newtype T = MkT ([T] -> Int)

1057 -- Consider eta-expanding this

1058 -- eta_expand 1 e T

1059 -- We want to get

1060 -- coerce T (\x::[T] -> (coerce ([T]->Int) e) x)

1064 -- but its type isn't a function, or a binder

1065 -- is levity-polymorphic

1068 -- This *can* legitmately happen:

1069 -- e.g. coerce Int (\x. x) Essentially the programmer is

1070 -- playing fast and loose with types (Happy does this a lot).

1071 -- So we simply decline to eta-expand. Otherwise we'd end up

1072 -- with an explicit lambda having a non-function type

1076 --------------

1077 -- Don't use short-cutting substitution - we may be changing the types of join

1078 -- points, so applying the in-scope set is necessary

1079 -- TODO Check if we actually *are* changing any join points' types

1085 --------------

1087 -- | Split an expression into the given number of binders and a body,

1088 -- eta-expanding if necessary. Counts value *and* type binders.

1090 etaExpandToJoinPoint join_arity expr

1092 where

1101 -- How did a local binding get a built-in rule anyway? Probably a plugin.

1102 rule

1106 = rule

1109 | otherwise

1112 where

1115 new_args = varsToCoreExprs new_bndrs

1117 -- Adds as many binders as asked for; assumes expr is not a lambda

1119 etaBodyForJoinPoint need_args body

1121 where

1122 go 0 _ _ rev_bs e

1124 go n ty subst rev_bs e

1131 | otherwise

1137 --------------

1139 -- Make a fresh Id, with specified type (after applying substitution)

1140 -- It should be "fresh" in the sense that it's not in the in-scope set

1141 -- of the TvSubstEnv; and it should itself then be added to the in-scope

1142 -- set of the TvSubstEnv

1143 --

1144 -- The Int is just a reasonable starting point for generating a unique;

1145 -- it does not necessarily have to be unique itself.

1146 freshEtaId n subst ty

1148 where