\newif\ifhaskellworkshop\haskellworkshoptrue

\iffalse

Payload: alternatives for functions polymorphic in node type

(n C O -> a, n O O -> b, n O C -> c)

vs

(forall e x . n e x -> ShapeTag e x, forall e x . n e x -> result e x)

where
  data ShapeTag e x where
    First  :: ShapeTag C O
    Middle :: ShapeTag O O
    Last   :: ShapeTag O C

  result C O = a
  result O O = b
  result O C = c

The function returning ShapeTag can be passed in a type-class
dictionary.
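A minimal sketch of this alternative (O, C, and ShapeTag are redeclared here so
the fragment stands alone; none of this is part of Hoopl's interface):

```haskell
{-# LANGUAGE GADTs #-}
-- Sketch of the ShapeTag alternative described in this note.
-- O and C are the paper's shape flags.
data O  -- open
data C  -- closed

data ShapeTag e x where
  First  :: ShapeTag C O
  Middle :: ShapeTag O O
  Last   :: ShapeTag O C

-- A consumer can dispatch on the tag to learn a node's shape
-- at run time, without a shape-polymorphic pattern match:
shapeName :: ShapeTag e x -> String
shapeName First  = "first"
shapeName Middle = "middle"
shapeName Last   = "last"
```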

Say something about the proliferation of heavyweight type signatures
required for GADT pattern matches. When the signature is three times
the size of the function, something is wrong...

I'm going to leave this point out, because in all the client code
we've written, we are (a) matching only on a node, and (b) matching on
all cases. In this scenario the GADT exhaustiveness checker provides
no additional benefits. Indeed, GADTs can be an irritant to the
client: in many pattern matches, GADTs make it impossible to default
cases.

I was thinking again about unwrapped nodes, cons-lists, snoc-lists,
and tree fringe. I think there's an analogy with ordinary lists, and
the analogy is 'concatMap' vs 'fold'. Just as with lists, the common

unwrapping of elements. We nevertheless prefer 'concatMap' because it
provides better separation of concerns. But this analogy suggests
several ideas:

- Can block processing be written in higher-order fashion?

- Can it be done in both styles (concatMap *and* fold)?

- Once we have framed the problem in these terms, can we write
  fold-style code that is not too hard to understand and maintain?

- Finally, is there an analog of stream fusion that would work for
  control-flow graphs, enabling us to write some graph combinators
  that are both perspicuous and efficient?

These are good observations for the paper and for future work.

----------------------------------------------------------------

P.S. The three of us should have a nice little Skype chat about
higher-rank types. A lot of the questions still at issue boil down to
the following observations:

- It's really convenient for the *implementation* of Hoopl to put
  forall to the left of an arrow. Polymorphic functions as
  arguments are powerful and keen, and they make it really easy for
  Hoopl to do its job.

- It's a bit inconvenient for a *client* of Hoopl to be forced to
  supply functions that are polymorphic in shape. All constructors
  have to be written out by hand; defaulting is not possible. (Or
  at least John and I haven't figured out how to do it.)

- Using higher-rank types in Hoopl's interface makes some very
  desirable things downright impossible:

  - One can't write a generic dominator analysis with type

      dom :: NonLocal n => FwdPass n Dominators

  - One can't write a generic debugging wrapper of type

      debug :: (Show n, Show f)
            => TraceFn -> WhatToTrace -> FwdPass n f -> FwdPass n f

  - One can't write a combinator of type

      product :: FwdPass n f -> FwdPass n f' -> FwdPass n (f, f')

  I submit that these things are all very desirable.

I'm therefore strongly in favor of removing the *requirement* that a
FwdPass include transfer and rewrite functions that are polymorphic in
shape. It will be a bit tedious to return to triples of functions,
but for those lucky clients who *can* write polymorphic functions and
who wish to, we can provide injection functions.

I should stress that I believe the tedium can be contained within
reasonable bounds---for example, the arfXXX functions that are
internal to Hoopl all remain polymorphic in shape (although not
higher-rank any longer).
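The injection functions mentioned here can be sketched as follows. This is a
standalone illustration, not Hoopl's code; the names Transfer, mkTransfer3,
mkTransfer, and the toy Node type are all made up for the sketch (Hoopl's real
API exposes analogous mkFTransfer/mkFTransfer3 combinators):

```haskell
{-# LANGUAGE GADTs, RankNTypes #-}
-- A transfer type that is internally a triple of shape-specific
-- functions, with injections from both client styles.
data O
data C

newtype Transfer n f = Transfer
  ( n C O -> f -> f
  , n O O -> f -> f
  , n O C -> f -> f )

-- The tedious-but-flexible style: one function per shape.
mkTransfer3 :: (n C O -> f -> f) -> (n O O -> f -> f)
            -> (n O C -> f -> f) -> Transfer n f
mkTransfer3 f m l = Transfer (f, m, l)

-- The injection for clients who *can* write one polymorphic function.
mkTransfer :: (forall e x . n e x -> f -> f) -> Transfer n f
mkTransfer t = Transfer (t, t, t)

-- A toy node type and a selector for the middle-node component:
data Node e x where
  Entry :: Node C O
  Instr :: Int -> Node O O
  Exit  :: Node O C

applyMiddle :: Transfer n f -> n O O -> f -> f
applyMiddle (Transfer (_, m, _)) = m
```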

\fi

%
% TODO
%
%
% AGraph = graph under construction
% Graph  = replacement graph
%
%%% Hoopl assumes that the replacement graph for a node open on exit
%%% doesn't contain any additional exits
%%%
%%% introduce replacement graph in section 3, after graph.
%%% it has arbitrarily complicated internal control flow, but only
%%% one exit (if open on exit)
%%%
%%% a requirement on the client that is not checked statically.

\newif\ifnoauthornotes \noauthornotesfalse

% \timestampfalse % it's post-submission time

\genkillfalse

\newif\ifnotesinmargin \notesinmarginfalse

% higher-order optimization library
% ('Hoople' was taken -- see hoople.org)
\let\hoopl\ourlib

% l2h substitution ourlib Hoopl
% l2h substitution hoopl Hoopl

\makeatletter

\vfill
\scriptsize
\if \@preprint
[Copyright notice will appear here
\else
\@toappear
\fi
\if \@reprint
\noindent Reprinted from \@conferencename,
\@proceedings,
\@conferenceinfo,

\let\c@table=

}

\else

\fi
\else

\fi
}

}

\noindent

%{\small\textit{\def\rmdefault{cmr}\rmfamily\phantom{00}\phantom{: \,}}}%
\else

\fi

\else

\verbatim
}

\makeatother

\makeatletter

\makeatother

\let\cite\citep

%%
%% 2009/05/10: removed 'float' package because it breaks multiple
%% \caption's per {figure} environment. ---NR
%%
%% % Put figures in boxes --- WHY??? --NR
%% \usepackage{float}
%% \floatstyle{boxed}
%% \restylefloat{figure}
%% \restylefloat{table}

% ON LINE THREE, set \noauthornotestrue to suppress notes (or not)

%\newcommand{\qed}{QED}
\ifnotesinmargin

\ifvmode

\else

\else
% Simon: please set \notesinmarginfalse on the first line

\fi

\ifnoauthornotes

\fi

\let\remark\norman

\iftimestamp

\fi

%\title{\ourlib: Dataflow Optimization Made Simple}

%\title{Implementing Dataflow Analysis and Optimization by Lifting Node Functions to Basic Blocks and Control-Flow Graphs}
%\subtitle{Programming pearl}

Under consideration for publication in the ACM Haskell Symposium.
Comments
are welcome; please identify this version by the words

}}

\ifnoauthornotes
\makeatletter
\let\HyPsd@Warning=
\@gobble
\makeatother
\fi

% João

\maketitle

We present \ourlib, a Haskell library that makes it easy for a
compiler writer
to implement program transformations based on dataflow analyses.
A~client of \ourlib\ defines a representation of
logical assertions,
a transfer function that computes outgoing assertions from incoming
assertions,
and a rewrite function that improves code when improvements are
justified by the assertions.
\ourlib\ does the actual analysis and transformation.

\ourlib\ implements state-of-the-art algorithms:
Lerner, Grove, and Chambers's
composition of simple analyses and transformations, which achieves
the same precision as complex, handwritten
``super-analyses;''

isolating bugs in a client's code.
\ourlib's implementation is unique in that, unlike previous
implementations,
it carefully separates the tricky
elements of each of these algorithms, so that they can be examined and
understood independently.

\fi

Dataflow analysis and transformation of control-flow graphs is
pervasive in optimizing compilers, but it is typically tightly

unusually easy to define new analyses and

\ourlib's interface is modular and polymorphic,
and it offers unusually strong static guarantees.
The implementation encapsulates
state-of-the-art algorithms (interleaved analysis and rewriting,
dynamic error isolation), and it cleanly separates their tricky elements
so that they can be understood independently.
%
%\ourlib\ will be the workhorse of a new
%back end for the Glasgow Haskell Compiler (version~6.14, forthcoming).

\else

\fi
%%% Source: http://www.cs.tufts.edu/~nr/drop/popl-index.pdf

A mature optimizing compiler for an imperative language includes many
analyses, the results of which justify the optimizer's
code-improving transformations.
Many analyses and transformations---constant
propagation, live-variable analysis, inlining, sinking of loads,
and so on---should be regarded as particular cases of

%% \remark{I do not feel compelled to cite Muchnick (or anyone else) here}
Dataflow analysis is over thirty years old,

describing a powerful but subtle way to

piggybacks on the other.

Because optimizations based on dataflow analysis
share a common intellectual framework, and because that framework is
subtle, it~is tempting to
try to build a single reusable library that embodies the
subtle ideas, while
making it easy for clients to instantiate the library for different
situations.
% Tempting, but difficult.
Although such libraries exist, as we discuss

and none implements the Lerner/Grove/Chambers technique.

optimization library''), a new Haskell library for dataflow analysis and
optimization. It has the following distinctive characteristics:

\item
\ourlib\ is purely functional.
Although pure functional languages are not obviously suited to writing
standard algorithms that
transform control-flow graphs,
pure functional code is actually easier to write, and far easier
to write correctly, than code that is mostly functional but uses a mutable
representation of graphs

When analysis and transformation

without knowing whether
a~transformed graph will be retained or discarded,
pure functional code offers even more benefits.

\item
\ourlib\ is polymorphic. Just as a list library is

the nodes that inhabit graphs and in the dataflow facts that

\item
The paper by Lerner, Grove, and Chambers is inspiring but abstract.
We articulate their ideas in a concrete, simple API,
which hides
a subtle implementation

You provide a representation for assertions,
a transfer function that transforms assertions across a node,
and a rewrite function that uses an assertion to
justify rewriting a node.
\ourlib\ ``lifts'' these node-level functions to work over
control-flow graphs, solves recursion equations,
and interleaves rewriting with analysis.
Designing APIs is surprisingly
hard; we have been through over a dozen significantly different
iterations, and we offer our API as a contribution.

\item
Because the client
can perform very local reasoning (``@y@ is live before
@x:=y+2@''),

analyses and transformations built on \ourlib\
are small, simple, and easy to get right.
Moreover, \ourlib\ helps you write correct optimizations:
it~statically rules out transformations that violate invariants

and dynamically it can help find the first transformation that introduces a fault

%%\finalremark{SLPJ: I wanted to write more about open/closed,
%%but I like this sentence with its claim to both static and dynamic assistance,
%%and maybe the open/closed story is hard to understand here.}

% \item \ourlib{} makes use of GADTs and type functions to offer unusually
% strong static guarantees. In particular, nodes, basic blocks, and
% graphs are all statically typed by their open or closedness on entry, and
% their open or closedness on exit (\secref{graph-rep}). For example, an add instruction is
% open on entry and exit, while a branch is open on entry and closed on exit.
% Using these types we can statically guarantee that, say, an add instruction
% is rewritten to a graph that is also open at both entry and exit; and
% that the user cannot construct a block where an add instruction follows an
% unconditional branch. We know of no other system that offers
% static guarantees this strong.

(a)~interleaved analysis and rewriting, (b)~speculative rewriting,
(c)~computing fixed points, and (d)~dynamic fault isolation.
Previous implementations of these algorithms---including three of our
own---are complicated and hard to understand, because the tricky pieces
are implemented all together, inseparably.
In~this paper, each tricky piece is handled in just

We~emphasize this implementation as an object of interest in
its own right.

Our work bridges the gap between abstract, theoretical presentations
and actual compilers.

One of \hoopl's clients
is the Glasgow Haskell Compiler, which uses \hoopl\ to optimize
imperative code in GHC's back end.

\ourlib's API is made possible by
sophisticated aspects of Haskell's type system, such
as higher-rank polymorphism, GADTs, and type functions.

of these features.

%% We begin by setting the scene, introducing some vocabulary, and
%% showing a small motivating example.

Each block is a sequence of instructions,
beginning with a label and ending with a
control-transfer instruction that branches to other blocks.
% Each block has a label at the beginning,
% a sequence of
% -- each of which has a label at the
% beginning. Each block may branch to other blocks with arbitrarily
% complex control flow.
The goal of dataflow optimization is to compute valid

then use those assertions to justify code-improving

As~a concrete example, we show constant propagation with constant
folding.
On the left we show a basic block; in the middle we show

in the block; and at
the right we show the result of transforming the block
based on the assertions:

Before          Facts           After
------------{}-------------

z := x>5                        z := True

if z goto L1
then goto L1
else goto L2

Constant propagation works
from top to bottom. We start with the empty fact.

transformation?
Yes: we can replace the node with @x:=7@.
Now, given this transformed node
and the original fact, what fact flows out of the bottom of
the transformed node?

Now, can we make another transformation? Yes: constant folding can
replace the node with @z:=True@.
The~process continues to the end of the block, where we
can replace the conditional branch with an unconditional one, @goto L1@.
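The fact-and-rewrite loop just described can be illustrated with a toy,
self-contained version of constant folding over a made-up expression type.
This is purely illustrative (the types Fact, Expr, cfold, and transfer are
inventions of this sketch, not Hoopl's interface, which appears later):

```haskell
-- A fact maps a variable to its known constant value, if any.
import qualified Data.Map as M

type Fact = M.Map String Int

data Expr = Lit Int | Var String | Add Expr Expr
  deriving (Eq, Show)

-- Constant-fold an expression under the incoming fact.
cfold :: Fact -> Expr -> Expr
cfold f (Var x)   = maybe (Var x) Lit (M.lookup x f)
cfold f (Add a b) = case (cfold f a, cfold f b) of
                      (Lit m, Lit n) -> Lit (m + n)
                      (a', b')       -> Add a' b'
cfold _ e         = e

-- Transfer for an assignment x := e: record x if e folds to a
-- constant, otherwise forget anything previously known about x.
transfer :: Fact -> (String, Expr) -> Fact
transfer f (x, e) = case cfold f e of
                      Lit n -> M.insert x n f
                      _     -> M.delete x f
```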

The example above is simple because the program has only straightline code;
when programs have loops, dataflow analysis gets more complicated.
For example,
consider the following graph,
where we assume @L1@ is the entry point:

L2: x=7; goto L3
L3: ...

Because control flows to @L3@ from two places (@L1@~and~@L2@),

All paths to @L3@ produce the fact @y=@$4$,
so we can conclude that this fact holds at~@L3@.
But depending on the path to~@L3@, @x@~may have different
values, so we conclude ``@x=@$\top$'',
meaning that there is no single value held by~@x@ at~@L3@.
%\footnote{
%In this example @x@ really does vary at @L3@, but in general
%the analysis might be conservative.}
The final result of joining the dataflow facts that flow to~@L3@

assertion about the program state, such as ``variable~@x@ holds value~@7@.''

A~prime example is live-variable analysis, where a fact takes the~form
``variable @x@ is live'' and is an assertion about the

live'' at a program point P is an assertion that @x@ is used on some program

The accompanying transformation is called dead-code elimination;
if @x@~is not live, this transformation
replaces the node @x:=e@ with a no-op.
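As a standalone illustration of this fact/transformation pair, here is a toy
sketch over a hypothetical two-constructor statement type (this is not Hoopl's
representation; Stmt, live, and elim are inventions of the sketch):

```haskell
-- The liveness fact is the set of live variables, flowing backward.
import qualified Data.Set as S

data Stmt = Assign String [String]  -- x := e; the list holds e's uses
          | NoOp
  deriving (Eq, Show)

-- Backward transfer: kill the defined variable, gen the uses.
live :: Stmt -> S.Set String -> S.Set String
live (Assign x uses) out = S.delete x out `S.union` S.fromList uses
live NoOp            out = out

-- Dead-code rewrite: an assignment to a dead variable becomes a no-op.
elim :: Stmt -> S.Set String -> Maybe Stmt
elim (Assign x _) out | not (x `S.member` out) = Just NoOp
elim _            _                            = Nothing
```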

Interleaving makes it far easier to write effective analyses.

the transformations.

and the instruction @z:=x>5@,
a pure analysis could produce the outgoing fact

But the subsequent transformation must perform

the transfer function becomes wonderfully simple: it merely has to see if the
right-hand side is a constant.

Another example is the interleaving of liveness analysis and dead-code elimination.

it is sufficient for the analysis to~say ``@y@ is live before @x:=y+2@''.
It is not necessary to have the more complex rule
``if @x@~is live after @x:=y+2@ then @y@ is live before~it,''

the assignment @x:=y+2@ will be eliminated.
If~there are a number of interacting
analyses and/or transformations,
the benefit of interleaving them is even more compelling; for more

%
But these compelling benefits come at a cost.
To~compute valid facts for a program that has loops,
an analysis may require multiple iterations over the program.
Before the final iteration, the analysis may
compute a~fact that is invalid,
and a transformation may use the invalid fact to rewrite the program

To~avoid unjustified rewrites,
any rewrite based on an invalid fact must be rolled back;

\hoopl~manages speculation with cooperation from the client.

While it is wonderful that we can implement complex optimizations by
composing very simple analyses and transformations,
it~is not so wonderful that very simple analyses and transformations,
when composed, can exhibit complex emergent behavior.
Because such behavior is not easily predicted, it is essential to have good
tools for debugging.
\hoopl's primary debugging tool is an implementation of

and transformations driven by these analyses, on control-flow graphs.
Graphs are composed from smaller units, which we discuss from the
bottom up:

Nodes, blocks, and graphs share important properties in common.

control-transfer instruction to a named label.
For example,

\item A shift-left instruction is open on entry (because control can fall into it
from the preceding instruction), and open on exit (because control falls through
to the next instruction).
\item An unconditional branch is open on entry, but closed on exit (because
control cannot fall through to the next instruction).

control to fall through into a branch target), but open on exit.
\item
A~function call should be closed on exit,
because control can flow from a call site to multiple points:
for example, a~return continuation or an exception handler.
(And~after optimization, distinct call sites may share a return
continuation.)

% This taxonomy enables \ourlib{} to enforce invariants:
% only nodes closed on entry can be the targets of branches, and only nodes closed
% on exits can transfer control (see also \secref{nonlocal-class}).
% As~a consequence, all control transfers originate at control-transfer
% instructions and terminate at labels; this invariant dramatically
% simplifies analysis and transformation.
These examples concern nodes, but the same classification applies
to blocks and graphs. For example, the block

is open on entry and closed on exit.

``open/closed;''
we may refer to an ``open/closed block.''

The shape of a thing determines that thing's control-flow properties.
In particular, whenever E is a node, block, or graph,
% : \simon{Removed the claim about a unique entry point.}

\item
If E is open on entry, it has a unique predecessor;
if it is closed, it may have arbitrarily many predecessors---or none.
\item
If E is open on exit, it has a unique successor;
if it is closed, it may have arbitrarily many successors---or none.

%%%%
%%%%
%%%% % \item Regardless of whether E is open or closed,
%%%% %   it has a unique entry point where execution begins.
%%%% \item If E is closed on exit, control leaves \emph{only}
%%%%   by explicit branches from closed-on-exit nodes.
%%%% \item If E is open on exit, control \emph{may} leave E
%%%%   by ``falling through'' from a distinguished exit point.
%%%% \remark{If E is a node or block, control \emph{only} leaves E by
%%%% falling through, but this isn't so for a graph. Example: a body of a
%%%% loop contains a \texttt{break} statement} \simon{I don't understand.
%%%% A break statement would have to be translated as a branch, no?
%%%% Can you give an example? I claim that control only leaves an
%%%% open graph by falling through.}
%%%% \end{itemize}

The primitive constituents of a control-flow graph are

For example, a node might represent a machine instruction, such as an
assignment, a call, or a conditional branch.

so each client can define nodes as it likes.
Because they contain nodes defined by the client,
graphs can include arbitrary client-specified data, including
(say) method calls, C~statements, stack maps, or
whatever.

data `Node e x where
  Label   :: Label -> Node C O
  `Assign :: Var  -> Expr  -> Node O O
  `Store  :: Expr -> Expr  -> Node O O
  `Branch :: Label -> Node O C
  `Cond   :: Expr -> Label -> Label -> Node O C
  ... more constructors ...

closed on entry and exit:
the type of a node has kind @*->*->*@, where the two type parameters
are type-level flags, one for entry and one for exit.
Such a type parameter may be instantiated only with type @O@~(for
open) or type~@C@ (for closed).

As an example,

The type parameters are written @e@ and @x@, for
entry and exit respectively.
The type is a generalized algebraic data type;
the syntax gives the type of each constructor.
%%% \cite{peyton-jones:unification-based-gadts}.
For example, constructor @Label@
takes a @Label@ and returns a node of type @Node C O@, where
the~``@C@'' says ``closed on entry'' and the~``@O@'' says ``open on exit''.
The types @Label@, @O@, and~@C@ are

As another example, constructor @Assign@ takes a variable and an expression, and it
returns a @Node@ open on both entry and exit; constructor @Store@ is
similar.
Types @`Var@ and @`Expr@ are private to the client, and

Finally, control-transfer nodes @Branch@ and @Cond@ (conditional
branch) are open on entry
and closed on exit.

Nodes closed on entry are the only targets of control transfers;
nodes open on entry and exit never perform control transfers;

To obey these invariants,
a node for
a conditional-branch instruction, which typically either transfers control

conditional branch, with the fall-through path in a separate block.

and it costs nothing in practice:
such code is easily sequentialized without superfluous branches.
%a late-stage code-layout pass can readily reconstruct efficient code.
}.
Because of the position each type of node occupies in a
basic block,

respectively.

data `O -- Open
data `C -- Closed

data `Block n e x where
  `BFirst  :: n C O -> Block n C O
  `BMiddle :: n O O -> Block n O O
  `BLast   :: n O C -> Block n O C
  `BCat    :: Block n e O -> Block n O x -> Block n e x

data `Graph n e x where
  `GNil  :: Graph n O O
  `GUnit :: Block n O O -> Graph n O O
  `GMany :: MaybeO e (Block n O C)
         -> LabelMap (Block n C C)
         -> MaybeO x (Block n C O)
         -> Graph n e x

data `MaybeO ^ex t where
  `JustO    :: t -> MaybeO O t
  `NothingO :: MaybeO C t

newtype `Label -- abstract type

class `NonLocal n where
  `entryLabel :: n C x -> Label

% omit MaybeC :: * -> * -> *

\ourlib\ combines the client's nodes into

A~@Block@ is parameterized over the node type~@n@
as well as over the same flag types that make it open or closed at
entry and exit.

The @BFirst@, @BMiddle@, and @BLast@ constructors create one-node
blocks.

Why not use a single constructor

shape?
Because by making the shape known statically, we simplify the
implementation of analysis and transformation in

parts of the representation of blocks are exposed.
Do we tell our readers or just ignore the question?
}

The @BCat@ constructor concatenates blocks in sequence.
It~makes sense to concatenate blocks only when control can fall
through from the first to the second; therefore,

the point of concatenation.
This restriction is enforced by the type of @BCat@, whose first
argument must be open on exit, and whose second argument must be open on entry.
It~is statically impossible, for example, to concatenate a @Branch@
immediately before an
@Assign@.
Indeed, the @Block@ type statically guarantees that any
closed/closed @Block@---which compiler writers normally
call a ``basic block''---consists of exactly one first node

followed by zero or more middle nodes (@Assign@ or @Store@),
and terminated with exactly one
last node (@Branch@ or @Cond@).
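The shape discipline can be demonstrated with a self-contained sketch. The
declarations below are minimal stand-ins for the paper's Node and Block types
(the Lbl newtype and the size function are inventions of the sketch); with
them, an ill-formed block such as a Branch concatenated before an Assign is
simply a type error:

```haskell
{-# LANGUAGE GADTs #-}
data O  -- open
data C  -- closed
newtype Lbl = Lbl Int deriving (Eq, Show)

data Node e x where
  Label  :: Lbl -> Node C O
  Assign :: String -> Int -> Node O O
  Branch :: Lbl -> Node O C

data Block n e x where
  BFirst  :: n C O -> Block n C O
  BMiddle :: n O O -> Block n O O
  BLast   :: n O C -> Block n O C
  BCat    :: Block n e O -> Block n O x -> Block n e x

-- A well-typed closed/closed block: label, assignment, branch.
bb :: Block Node C C
bb = BFirst (Label (Lbl 1))
     `BCat` BMiddle (Assign "x" 7)
     `BCat` BLast (Branch (Lbl 2))

-- Count the nodes in a block.
size :: Block n e x -> Int
size (BCat a b) = size a + size b
size _          = 1
```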

1002 Using GADTs to enforce these invariants is one of

In a block, a control-flow edge is implicit in every application of
the @BCat@ constructor.
An~implicit edge originates in a first node or a middle node and flows
to a middle node or a last node.

an~explicit edge originates in a last node and flows to a (labelled)
first node.
\hoopl\ discovers explicit edges by using the @successors@ and

Any~edge, whether implicit or explicit, is considered a program point,
and an analysis written using \hoopl\ computes a dataflow fact at each
such point.

Like @Block@, the data type @Graph@ is parameterized over
both nodes @n@ and its open/closed shape (@e@ and @x@).
It has three constructors. The first two
deal with the base cases of open/open graphs:
an empty graph is represented by @GNil@, while a single-block graph
is represented by @GUnit@.

More general graphs are represented by @GMany@, which has three
fields: an optional entry sequence, a body, and an optional exit
sequence.

\item

@Block n O C@.
We could represent this sequence as a value of type
@Maybe (Block n O C)@, but we can do better:

if the graph is open on entry.
We~express our compile-time knowledge by using the type
@MaybeO e (Block n O C)@, a type-indexed version of @Maybe@

the type @MaybeO O a@ is isomorphic to~@a@, while
the type @MaybeO C a@ is isomorphic to~@()@.
\item

To~facilitate traversal of the graph, we represent a body as a finite
map from label to block.
\item

sequence, its presence or absence is deducible from the static type of the graph.
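The two isomorphisms just mentioned can be witnessed directly (MaybeO is
reproduced from the paper so the fragment stands alone; the function names
fromJustO and toUnit are inventions of this sketch):

```haskell
{-# LANGUAGE GADTs #-}
data O
data C

data MaybeO ex t where
  JustO    :: t -> MaybeO O t
  NothingO :: MaybeO C t

-- MaybeO O a ~ a: an open end always carries a value, so this
-- single-equation match is exhaustive (NothingO has type MaybeO C t).
fromJustO :: MaybeO O t -> t
fromJustO (JustO x) = x

-- MaybeO C a ~ (): a closed end carries nothing at all.
toUnit :: MaybeO C t -> ()
toUnit NothingO = ()
```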

Graphs can be spliced together nicely; the cost is logarithmic in the
number of closed/closed blocks.
Unlike blocks, two graphs may be spliced together
not only when they are both open
at the splice point but also
when they are both closed---and not in the other two cases:
\begingroup

`gSplice :: Graph n e a -> Graph n a x -> Graph n e x
gSplice GNil g2 = g2
gSplice g1 GNil = g1\^
gSplice (GUnit ^b1) (GUnit ^b2) = GUnit (b1 `BCat` b2)\^
gSplice (GUnit b) (GMany (JustO e) bs x)
  = GMany (JustO (b `BCat` e)) bs x\^
gSplice (GMany e ^bs (JustO x)) (GUnit b2)
  = GMany e bs (JustO (x `BCat` b2))\^
gSplice (GMany e1 ^bs1 (JustO x1)) (GMany (JustO e2) ^bs2 x2)
  = GMany e1 (bs1 `mapUnion` (b `addBlock` bs2)) x2
  where b = x1 `BCat` e2\^
gSplice (GMany e1 bs1 NothingO) (GMany NothingO bs2 x2)
  = GMany e1 (bs1 `mapUnion` bs2) x2

\endgroup
% omit mapUnion :: LabelMap a -> LabelMap a -> LabelMap a
% omit addBlock :: NonLocal n => Block n C C -> LabelMap (Block n C C) -> LabelMap (Block n C C)

This definition illustrates the power of GADTs: the
pattern matching is exhaustive, and all the open/closed invariants are
statically checked. For example, consider the second-to-last equation for @gSplice@.
Since the exit sequence of the first argument is @JustO x1@,
we know that type parameter~@a@ is~@O@, and hence the entry sequence of the second
argument must be @JustO e2@.
Moreover, block~@x1@ must be
closed/open, and block~@e2@ must be open/closed.
We can therefore concatenate
\ifpagetuning
@x1@~and~@e2@
\else
them
\fi
with @BCat@ to produce a closed/closed block~@b@, which is
added to the body of the result.

The representation of @Graph@s is exposed to \hoopl's clients.

We~have carefully crafted the types so that if @BCat@
is considered as an associative operator,
every graph has a unique representation.
%% \simon{Well, you were the one who was so keen on a unique representation!
%% And since we have one, I think it is useful to say so. Lastly, the representation of
%% singleton blocks is not entirely obvious.}

%%%%
%%%% An empty open/open graph is represented
%%%% by @GNil@, while a closed/closed one is @gNilCC@:
%%%% \par {\small
%%%% \begin{code}
%%%% gNilCC :: Graph C C
%%%% gNilCC = GMany NothingO BodyEmpty NothingO
%%%% \end{code}}
%%%% The representation of a @Graph@ consisting of a single block~@b@
%%%% depends on the shape of~@b@:\remark{Does anyone care?}
%%%% \par{\small
%%%% \begin{code}
%%%% gUnitOO :: Block O O -> Graph O O
%%%% gUnitOC :: Block O C -> Graph O C
%%%% gUnitCO :: Block C O -> Graph C O
%%%% gUnitCC :: Block C C -> Graph C C
%%%% gUnitOO b = GUnit b
%%%% gUnitOC b = GMany (JustO b) BodyEmpty NothingO
%%%% gUnitCO b = GMany NothingO BodyEmpty (JustO b)
%%%% gUnitCC b = GMany NothingO (BodyUnit b) NothingO
%%%% \end{code}}
%%%% Multi-block graphs are similar.
%%%% From these definitions

To guarantee uniqueness, @GUnit@ is restricted to open/open
blocks.
If~@GUnit@ were more polymorphic, there would be
more than one way to represent some graphs, and it wouldn't be obvious
to a client which representation to choose---or whether the choice made a difference.

it~still needs to know how a node may transfer control from one
block to another.
\hoopl\ also
needs to know what @Label@ is on the first node in a block.
If \hoopl\ is polymorphic in the node type,
how can it{} know these things?
\hoopl\ requires the client to make the node type an instance of

and returns its @Label@;
the @successors@ method takes a last node (closed on exit) and returns
the @Label@s to
which it can transfer control.
A~middle node, which is open on both entry and exit, transfers control
only locally, to its successor within a basic block,
so no corresponding interrogation function is needed.

%% A node type defined by a client must be an instance of @NonLocal@.

the client's instance declaration for @Node@ would be

instance NonLocal Node where
  entryLabel (Label l) = l
  successors (Branch l)   = [l]
  successors (Cond _ t f) = [t, f]

Again, the pattern matching for both functions is exhaustive, and
the compiler statically checks this fact.
Here, @entryLabel@
cannot be applied to an @Assign@ or @Branch@ node,
and any attempt to define a case for @Assign@ or @Branch@ would result
in a type error.

While the client provides this information about
nodes, it is convenient for \ourlib\ to get the same information
about blocks.
Internally,

instance NonLocal n => NonLocal (Block n) where
  entryLabel (BFirst n) = entryLabel n
  entryLabel (BCat b _) = entryLabel b
  successors (BLast n)  = successors n
  successors (BCat _ b) = successors b

Because the functions @entryLabel@ and @successors@ are used to track control

entry label or successors of a @Graph@ itself.

if a @Graph@ is closed on entry, it need not have a unique entry label.
%
% A slight infelicity is that we cannot make @Graph@ an instance of @NonLocal@,
% because a graph closed on entry has no unique label. Fortunately, we never
% need @Graph@s to be in @NonLocal@.
%
% ``Never apologize. Never confess to infelicity.'' ---SLPJ

Now that we have graphs, how do we optimize them?
A~client must supply the following pieces:

that let the client build control-flow graphs out of nodes.

Each analysis uses facts that are specific to that particular analysis,

the fact type.

performs a monadic action,
and returns either @Nothing@
or @Just g@,
where @g@~is a graph that should
replace the node.
The ability to replace a node by a graph
% that may include internal control flow
is crucial for many code-improving transformations.
%We discuss the rewrite function
%in Sections \ref{sec:rewrites} and \ref{sec:shallow-vs-deep}.

Because facts, transfer functions, and rewrite functions work together,
}
Control-flow graphs           & Us  & Us  & One \\
Dataflow fact~$F$             & You & You & One type per logic \\
Transfer functions            & Us  & You & One per analysis \\
Analyze-and-rewrite functions & Us  & Us  & Two (forward, backward) \\
}

Given a node type~@n@ and a @FwdPass@, a client can analyze and rewrite a
graph.
\hoopl\ provides a fully polymorphic interface, but for purposes of
exposition, we present a function that is specialized to a
closed/closed graph:
\begin{code}
`analyzeAndRewriteFwdBody
   :: ( CkpointMonad m  -- Roll back speculative actions
      , NonLocal n )    -- Extract non-local flow edges
   => FwdPass m n f     -- Lattice, transfer, rewrite
   -> [Label]           -- Entry point(s)
   -> Graph n C C       -- Input graph
   -> FactBase f        -- Input fact(s)
   -> m ( Graph n C C   -- Result graph
        , FactBase f )  -- ... and its facts
\end{code}
Given a @FwdPass@ and a list of entry points, the
analyze-and-rewrite function transforms a graph into
an optimized graph.
As its type shows, this function
is polymorphic in the types of nodes~@n@ and facts~@f@;
these types are chosen by the client.
The type of the monad~@m@ is also chosen by the client.

As well as taking and returning a graph, the
function also takes input facts (the @FactBase@) and produces output facts.
A~@FactBase@ is a finite mapping from @Label@ to facts;
if~a @Label@ is not in the domain of the @FactBase@, its fact is the
bottom element of the lattice.
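The bottom-as-default convention can be sketched in a few lines (an illustrative model using `Data.Map`, not Hoopl's actual `LabelMap` internals; the name `lookupFact` is an assumption):

```haskell
import qualified Data.Map as M

type Label = Int
type FactBase f = M.Map Label f

-- Looking up a label absent from the FactBase yields the
-- lattice's bottom element, supplied here as an argument.
lookupFact :: f -> FactBase f -> Label -> f
lookupFact bot fb l = M.findWithDefault bot l fb

main :: IO ()
main = do
  let fb = M.fromList [(1, "x=3")] :: FactBase String
  print (lookupFact "" fb 1)  -- a label with a known fact
  print (lookupFact "" fb 2)  -- an absent label: bottom
```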

For~example, if the graph
represents the body of a procedure
with parameters $x,y,z$, we would map the entry @Label@ to a fact
stating that the parameters are not known to be constants.
%%
%% The
%% output @FactBase@ maps each @Label@ in the body to its fact

The client's model of how @analyzeAndRewriteFwdBody@ works is as follows:
\ourlib\ walks forward over each block in the graph.
At~each node, \ourlib\ applies the
rewrite function to the node and the incoming fact. If the rewrite
function returns @Nothing@, the node is retained as part of the output
graph, the transfer function is used to compute the downstream fact,
and \ourlib\ moves on to the next node.
But if the rewrite function returns @Just g@,
indicating that it wants to rewrite the node to the replacement graph~@g@, then
\ourlib\ recursively analyzes and may further rewrite~@g@
before moving on to the next node.
A~node following a rewritten node sees up-to-date facts:
its input fact is computed by
analyzing the replacement graph.

A~rewrite function may take any action that is justified by
the incoming fact.
If~further analysis invalidates the fact, \hoopl\ rolls
back the action.
Because graphs cannot be mutated,
rolling back to the original graph is easy.
But rolling back a rewrite function's monadic action
requires cooperation from the client:
the~client must provide @checkpoint@ and
@restart@ operations, which
make~@m@ an instance of
\hoopl's @CkpointMonad@ class.

Below we flesh out the
interface to @analyzeAndRewriteFwdBody@, leaving the implementation for
later.
%{\hfuzz=7.2pt\par}

\begin{code}
data `FwdPass m n f
  = FwdPass { `fp_lattice  :: DataflowLattice f
            , `fp_transfer :: FwdTransfer n f
            , `fp_rewrite  :: FwdRewrite m n f }

------- Lattice ----------
data `DataflowLattice f = DataflowLattice
  { `fact_bot  :: f
  , `fact_join :: JoinFun f }
type `JoinFun f =
  OldFact f -> NewFact f -> (ChangeFlag, f)
newtype `OldFact f = OldFact f
newtype `NewFact f = NewFact f
data `ChangeFlag = `NoChange | `SomeChange

------- Transfers ----------
newtype `FwdTransfer n f  -- abstract type
`mkFTransfer
  :: (forall e x . n e x -> f -> Fact x f)
  -> FwdTransfer n f

------- Rewrites ----------
newtype `FwdRewrite m n f -- abstract type
`mkFRewrite :: FuelMonad m
  => (forall e x . n e x -> f -> m (Maybe (Graph n e x)))
  -> FwdRewrite m n f
thenFwdRw :: FwdRewrite m n f -> FwdRewrite m n f
          -> FwdRewrite m n f
iterFwdRw :: FwdRewrite m n f -> FwdRewrite m n f
`noFwdRw  :: Monad m => FwdRewrite m n f

------- Fact-like things, aka "fact(s)" -----
type family `Fact x f :: *
type instance Fact O f = f
type instance Fact C f = FactBase f

------- FactBase -------
type `FactBase f = LabelMap f
-- A finite mapping from Labels to facts f

------- Rolling back speculative rewrites ----
class Monad m => CkpointMonad m where
  type Checkpoint m
  checkpoint :: m (Checkpoint m)
  restart    :: Checkpoint m -> m ()

------- Optimization fuel ----
type `Fuel = Int
class Monad m => `FuelMonad m where
  `getFuel :: m Fuel
  `setFuel :: Fuel -> m ()
\end{code}
% omit mkFactBase :: [(Label, f)] -> FactBase f
%%%% \simon{We previously renamed @fact\_join@ to @fact\_extend@ because it really
%%%% is not a symmetrical join; we're extending an old fact with a new one.
%%%% NR: Yes, but the asymmetry is now explicit in the \emph{type}, so it
%%%% needn't also be reflected in the \emph{name}.}

For each analysis or transformation, the client must define a type
of dataflow facts.
A~dataflow fact often represents an assertion
about a program point.
For example, the assertion ``@x@ holds @3@'' at point~P
claims that variable @x@ holds value @3@ at P, regardless of the
path by which P is reached.
Likewise, the
assertion ``@x@ is dead'' at point P claims that no path from P uses
variable @x@.

For each fact type, the client must specify
(a)~the bottom element of the lattice and (b)~how to take
the least upper bound (join) of two elements.
To ensure that analysis
terminates, it is enough if every fact has a finite number of
distinct facts above it, so that repeated joins
eventually reach a fixed point.
%% \simon{There is no mention here of the @OldFact@ and @NewFact@ types.
%% Shall we nuke them for the purposes of the paper?
%% NR: They should definitely not be nuked; they're needed if the reader
%% wants to understand the asymmetrical behavior of @fact\_join@.
%% I'm~also mindful that this paper will be the primary explanation of
%% \hoopl\ to its users, and I don't want to hide this part of the
%% interface.
%% As for mentions in the text,
%% if you look carefully, you'll
%% see that I've changed the subscripts in the text to ``old'' and
%% ``new''.
%% Together with the code, I believe that's enough to get the point across.
%% SLPJ -- OK. I don't agree that it's enough, but we have more important fish to fry.
%% }

In practice, joins are computed at labels.
If a fact is already associated with a
label~$L$,
and a new fact flows
into label~$L$,
the two facts are joined.
And \hoopl\ wants to know whether the join equals the old fact,
because if not, the analysis has not reached a fixed point.

The bottom element and join operation of a lattice of facts of
type~@f@ are stored in a value of type @DataflowLattice f@.
%%%% \simon{Can we shorten ``@DataflowLattice@'' to just
%%%% ``@Lattice@''?} % bigger fish ---NR
%%Such a value is part of the @FwdPass n f@ that is passed to
%%@analyzeAndRewriteFwd@ above.
As noted in the previous paragraph,
\hoopl\ needs to know whether the result of a join is equal
to the old fact.
Because this information can often be computed cheaply together
with the join, \ourlib\ does not
require a separate equality test on facts, which might be expensive.
Instead, \ourlib\ requires that @fact_join@ return a @ChangeFlag@ as
well as the least upper bound.
The @ChangeFlag@ should be @NoChange@ if
the result is the same as the old fact, and
@SomeChange@ if the result differs.
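The change-flag convention can be sketched with a join for a flat ``known constant or Top'' lattice (a self-contained illustration in the style of the interface above, not Hoopl's shipped code):

```haskell
data ChangeFlag = NoChange | SomeChange deriving (Eq, Show)
newtype OldFact f = OldFact f
newtype NewFact f = NewFact f

data WithTop a = PElem a | Top deriving (Eq, Show)

-- Join two facts and report, as a ChangeFlag, whether the result
-- differs from the old fact -- no separate equality test needed.
joinLit :: Eq a => OldFact (WithTop a) -> NewFact (WithTop a)
        -> (ChangeFlag, WithTop a)
joinLit (OldFact old) (NewFact new)
  | old == new  = (NoChange, old)
  | Top <- old  = (NoChange, Top)    -- already at top: nothing changed
  | otherwise   = (SomeChange, Top)  -- distinct elements join to Top

main :: IO ()
main = do
  print (joinLit (OldFact (PElem 3)) (NewFact (PElem 3 :: WithTop Int)))
  print (joinLit (OldFact (PElem 3)) (NewFact (PElem 4)))
  print (joinLit (OldFact Top)       (NewFact (PElem 4)))
```

Note the asymmetry the `OldFact`/`NewFact` wrappers make explicit: the flag reports change relative to the *old* fact only.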

To help clients create lattices and join functions,
\hoopl\ includes functions and constructors that can extend a type~@a@
with top and bottom elements.
In this paper, we use only type @`WithTop@, which comes with value
constructors that have these types:
\begin{code}
`PElem :: a -> WithTop a
`Top   :: WithTop a
\end{code}
\hoopl\ provides combinators that make it easier to create join
functions that use @Top@.
The most useful is @extendJoinDomain@:
\begin{code}
`extendJoinDomain
  :: (OldFact a -> NewFact a -> (ChangeFlag, WithTop a))
  -> JoinFun (WithTop a)
\end{code}
A client supplies a join function that consumes only facts of
type~@a@, but may produce @Top@ (as well as a fact of type~@a@)---as
in the constant-propagation example below.
Calling @extendJoinDomain@ extends the client's function to a proper
join function on the type @WithTop a@,
and @extendJoinDomain@ makes sure that joins
involving @Top@ obey the appropriate algebraic laws.

\hoopl\ also provides a value constructor @`Bot@ and type constructors
@`WithBot@ and @`WithTopAndBot@, along with similar functions.
Constructors @Top@ and @Bot@ are polymorphic, so for example,
@Top@~also has type @WithTopAndBot a@.

It is also very common to use a lattice that takes the form of a
finite~map.
In~such lattices it is typical to join maps pointwise, and \hoopl\
provides a function that makes it convenient to do so:
\begin{code}
`joinMaps :: Ord k => JoinFun v -> JoinFun (Map k v)
\end{code}
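The pointwise lifting that `joinMaps` performs can be sketched as follows (a simplified model over `Data.Map` that omits the `OldFact`/`NewFact` wrappers; it is not Hoopl's implementation):

```haskell
import qualified Data.Map as M

data ChangeFlag = NoChange | SomeChange deriving (Eq, Show)
type JoinFun v = v -> v -> (ChangeFlag, v)

-- Lift a join on values to a pointwise join on finite maps.
-- A key present only in the new map is a change; a key present in
-- both is joined with the value-level join function.
joinMaps :: Ord k => JoinFun v -> JoinFun (M.Map k v)
joinMaps j old new = M.foldrWithKey add (NoChange, old) new
  where
    add k v (ch, m) = case M.lookup k m of
      Nothing -> (SomeChange, M.insert k v m)
      Just v' -> case j v' v of
        (NoChange,   _)   -> (ch, m)
        (SomeChange, v'') -> (SomeChange, M.insert k v'' m)

main :: IO ()
main = do
  let jInt a b = if a == b then (NoChange, a) else (SomeChange, max a b)
      old = M.fromList [("x", 1), ("y", 2 :: Int)]
      new = M.fromList [("y", 5), ("z", 3)]
  print (joinMaps jInt old new)
```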

A~forward transfer function is presented with the dataflow fact
coming
into a node, and it computes dataflow fact(s) on the node's outgoing edge(s).
In a forward analysis, \hoopl\ starts with the fact at the
beginning of a block and applies the transfer function to successive
nodes in that block, until eventually the transfer function for the last node
computes the facts that are propagated to the block's successors.
For example, consider
the following graph, with entry at @L1@:
\begin{code}
L1: x=3; goto L2
L2: if x>0 then goto L2 else return
\end{code}
Initially, a fact holding the bottom element is associated with every label
except the entry~@L1@.
Analyzing~@L1@ propagates this fact forward, applying the transfer
function successively to the nodes of the block, producing a fact
that flows to~@L2@.
This new fact is joined with the existing (bottom) fact at~@L2@.
Now~the analysis propagates @L2@'s fact forward, again applying the transfer
functions.
Again the new fact is joined with the existing fact at~@L2@, and the process
repeats until the facts for each label reach a fixed point.
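The iterate-and-join scheme just described can be modelled in miniature (an illustrative toy, not Hoopl's engine: blocks are collapsed to single edge-transfer functions, and the two edges below correspond to the example graph's control flow):

```haskell
import qualified Data.Map as M

type Label = Int
data Const = Val Int | Top deriving (Eq, Show)

-- Join for a flat "known constant or Top" fact
joinC :: Const -> Const -> Const
joinC a b = if a == b then a else Top

-- L1 sets x=3 and flows to L2; L2 leaves x alone and may loop to L2.
edges :: [(Label, Label, Const -> Const)]
edges = [ (1, 2, const (Val 3))  -- L1: x=3; goto L2
        , (2, 2, id) ]           -- L2: if x>0 then goto L2 else return

-- One pass: push each label's fact along each edge, joining at the target.
step :: M.Map Label Const -> M.Map Label Const
step fb = foldr propagate fb edges
  where propagate (from, to, f) m = case M.lookup from m of
          Nothing   -> m
          Just fact -> M.insertWith joinC to (f fact) m

-- Repeat until no fact changes.
fixpoint :: M.Map Label Const -> M.Map Label Const
fixpoint fb = let fb' = step fb in if fb' == fb then fb else fixpoint fb'

main :: IO ()
main = print (fixpoint (M.fromList [(1, Top)]))
```

Starting from the entry fact at `L1`, the loop stabilizes with `x = 3` at `L2`, matching the narrative above.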

A~transfer function has an unusual sort of type:
not quite a dependent type, but not a bog-standard polymorphic type either.
The result type is indexed by the shape (that is,
the type) of the node argument:
if the node is open on exit, the transfer function produces a single fact.
But if the node is \emph{closed} on exit,
the transfer function
produces a collection of (@Label@,fact) pairs: one for each outgoing edge.
%
The indexing is expressed by Haskell's (recently added)
\emph{type families}.
A~forward transfer function supplied by a client,
which would be passed to @mkFTransfer@,
is typically a function polymorphic in @e@ and @x@.
It~takes a
node of type \mbox{@n@ @e@ @x@} and returns a fact transformer of type
@f -> Fact x f@.
Type constructor @Fact@
should be thought of as a type-level function: its signature is given in the
@type family@ declaration, while its definition is given by two @type instance@
declarations. The first declaration says that a @Fact O f@, which
comes out of a node \emph{open} on exit, is just a fact~@f@.
The second declaration says that a @Fact C f@, which comes out of a
node \emph{closed} on exit, is a finite mapping from @Label@ to facts.

%%
%% \begin{code}
%% `transfer_fn :: forall e x . n e x -> f -> Fact x f
%% `node        :: n e x
%% \end{code}
%% then @(transfer_fn node)@ is a fact transformer:
%% \begin{code}
%% transfer_fn node :: f -> Fact x f
%% \end{code}
%%

We compute dataflow facts in order to enable code-improving
transformations.
In our constant-propagation example,
the dataflow facts may enable us
to simplify an expression by performing constant folding, or to
turn a conditional branch into an unconditional one.
Similarly, a liveness analysis may allow us to
replace a dead assignment with a no-op.

A~programmer creating a rewrite function chooses the type of a node~@n@ and
a dataflow fact~@f@.
A~rewrite function might also want to consume fresh
names (e.g., to label new blocks) or take other actions (e.g., logging
rewrites).
So~that a rewrite function may take actions, \hoopl\
requires that a programmer creating a rewrite function also choose a
monad~@m@.
So~that \hoopl\ may roll back actions taken by speculative rewrites, the monad
must satisfy the constraint @CkpointMonad m@, as described above.
The programmer may write code that works with any such monad,
may create a monad just for the client,
or may use a monad supplied by \hoopl.

When these choices are made, the easy way to create a rewrite
function is to call @mkFRewrite@.
The client supplies a function that is specialized to a particular
node, fact, and (possibly) monad, but is polymorphic in the
\emph{shape} of the node.
The function, which we will call~$r$, takes a node and a fact and returns a monadic
computation, but what is the result of that computation?
One might
expect that the result should be a new node, but that is not enough:
in~general, it must be possible for rewriting to result in a graph.
For example,
we might want to remove a node by returning the empty graph,
or more ambitiously, we might want to replace a high-level operation
with a tree of conditional branches or a loop, which would entail
returning a graph containing new blocks with internal control flow.

It~must also be possible for a rewrite function to decide to do nothing.
The result of the monadic computation returned by~$r$ may therefore be
@Nothing@, indicating that the node should
not be rewritten,
or $@Just@\;\ag$, indicating that the node should
be replaced with~\ag: the replacement graph.
%%The additional value $\rw$ tells \hoopl\ whether
%%and how the replacement graph~$\ag$ should be analyzed and rewritten
%%further;
%%we explain~$\rw$ in \secref{shallow-vs-deep}.

The type of @mkFRewrite@ guarantees that
the replacement graph $\ag$ has
the same shape as the node it replaces.
For example, a branch instruction can be replaced only by a graph
closed on exit.
%% Moreover, because only an open/open graph can be
%% empty---look at the type of @GNil@ in \figref{graph}---the type
%% of @FwdRewrite@
%% guarantees, at compile time, that no head of a block (closed/open)
%% or tail of a block (open/closed) can ever be deleted by being
%% rewritten to an empty graph.

When a node is rewritten, the replacement graph~$g$
must itself be analyzed, and its nodes may be further rewritten.
\hoopl\ can make a recursive call to
@analyzeAndRewriteFwdBody@---but what @FwdPass@ should it use?
There are two common situations:
\begin{itemize}
\item Sometimes we want to analyze and transform the replacement graph
with an unmodified @FwdPass@, thereby further rewriting the replacement graph.
This procedure is called \emph{deep rewriting}.
When deep rewriting is used, the client's rewrite function must
ensure that the graphs it produces are not rewritten indefinitely.
\item
A client may want to analyze the
replacement graph \emph{without} rewriting it further, a procedure
called \emph{shallow rewriting}.
\end{itemize}
Deep rewriting is essential to achieve the full benefits of
interleaved analysis and transformation.
But shallow rewriting can be vital as well;
for example, a backward dataflow pass that inserts
a spill before a call must not rewrite the call again, lest it attempt
to insert infinitely many spills.

An~innovation of \hoopl\ is to build the choice of shallow or deep
rewriting into each rewrite function, through the use of combinators.
A~function returned by @mkFRewrite@ is shallow by default, but it can
be made deep by iterating it:
\begin{code}
`iterFwdRw :: FwdRewrite m n f -> FwdRewrite m n f
\end{code}
For~convenience, we~provide the~function @`deepFwdRw@,
which is the composition of @iterFwdRw@ and~@mkFRewrite@.
Another useful combinator enables \hoopl\ to try any number of functions to
rewrite a node, provided only that they all use the same fact:
\begin{code}
`thenFwdRw :: FwdRewrite m n f -> FwdRewrite m n f
           -> FwdRewrite m n f
\end{code}
% defn thenFwdRw
Rewrite function @r1 `thenFwdRw` r2@ first does the rewrites of~@r1@,
then the rewrites of~@r2@.
The rewriter @noFwdRw@ is the identity of @thenFwdRw@.
%%
%%
%% by using
%% Rewrite functions are potentially more plentiful than transfer
%% functions, because
%% a~single dataflow fact might justify more than one kind of rewrite.
%% \hoopl\ makes it easy for a client to combine multiple rewrite
%% functions that use the same fact:

These combinators satisfy the algebraic law
\begin{code}
iterFwdRw r = r `thenFwdRw` iterFwdRw r
\end{code}
but this law cannot serve as a \emph{definition} of
@iterFwdRw@:
if~we used this equation to define @iterFwdRw@, then when @r@ declined
to rewrite, @iterFwdRw r@ would diverge.
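The termination issue can be seen in a toy model of the combinators, with a ``rewrite'' represented as a function from `Int` to `Maybe Int` (`Nothing` meaning ``decline''); this is an illustrative simplification, not the library's representation:

```haskell
type Rw = Int -> Maybe Int

-- Try r1; if it declines, try r2. (The real combinator would also
-- keep rewriting r1's result; this sketch keeps only the "first
-- success wins" skeleton.)
thenRw :: Rw -> Rw -> Rw
thenRw r1 r2 x = case r1 x of
  Nothing -> r2 x
  Just y  -> Just y

-- The identity of thenRw: always declines.
noRw :: Rw
noRw _ = Nothing

-- Iterate a rewrite to a fixed point. The Nothing base case is what
-- the naive equation "iterRw r = r `thenRw` iterRw r" lacks: without
-- it, a declining r would cause divergence.
iterRw :: Rw -> Rw
iterRw r x = case r x of
  Nothing -> Nothing
  Just y  -> Just (maybe y id (iterRw r y))

-- Example rewrite: halve even numbers, decline on odd ones.
halve :: Rw
halve x = if even x then Just (x `div` 2) else Nothing

main :: IO ()
main = do
  print (iterRw halve 40)          -- Just 5
  print (iterRw halve 7)           -- Nothing: halve declines at once
  print ((halve `thenRw` noRw) 7)  -- Nothing: noRw is the identity
```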

%%%% \hoopl\ provides
%%%% a function that makes a shallow rewrite deep:\finalremark{algebraic
%%%% law wanted!}
%%%% \remark{Do we really care about @iterFwdRw@ or can we just drop it?}
%%%% \verbatiminput{iterf}
%%%% % defn iterFwdRw

the type of the transfer function's result
depends on the argument's shape on exit.
It~is easy for a client to write a type-indexed transfer function,
because the client defines the constructor and shape for each node.
The client's transfer functions discriminate on the constructor
and so can return a result that is indexed by each node's shape.

What if you want to write a transfer function that
does not know the type of the node?
%
For example, a dominator analysis need not scrutinize nodes;
it needs to know only about
labels and edges in the graph.
Ideally, a dominator analysis would work
with \emph{any} node type~@n@,
provided only that @n@~is an
instance of the @NonLocal@ type class.
But if we don't know
the type of~@n@,
we can't write a function of type
@n e x -> f -> Fact x f@,
because the only way to get the result type right is to
scrutinize the constructors of~@n@.

There is another way;
instead of requiring a single function that is polymorphic in shape,
\hoopl\ will also accept a triple of functions, each of which is
polymorphic in the node's type but monomorphic in its shape:
\begin{code}
`mkFTransfer3 :: (n C O -> f -> Fact O f)
              -> (n O O -> f -> Fact O f)
              -> (n O C -> f -> Fact C f)
              -> FwdTransfer n f
\end{code}
We have used this interface to write a number of functions that are
polymorphic in the node type~@n@:
\begin{itemize}
\item
A function that takes a @FwdTransfer@ and wraps it in logging code, so
an analysis can be debugged by watching the facts flow through the
nodes
\item
A pairing function that runs two passes interleaved, not sequentially,
potentially producing better results than any sequence:
\begin{code}
`pairFwd :: Monad m
  => FwdPass m n f1 -> FwdPass m n f2
  -> FwdPass m n (f1,f2)
\end{code}
% defn pairFwd
\item
An efficient dominator analysis
in the style of
\citet{cooper-harvey-kennedy:simple-dominance},
whose transfer function is implemented
using only the
functions in the @NonLocal@ type class
\end{itemize}


% omit Binop :: Operator -> Expr -> Expr -> Expr
% omit Add :: Operator

% local node
% defn ConstFact
% defn constLattice
% defn constFactAdd
% defn varHasLit
% local ft
% defn constProp
% local cp
% local lookup
% defn simplify
% local simp
% local s_node
% local s_exp
% defn constPropPass
\caption{The client for constant propagation and constant folding\break (extracted automatically from code distributed with Hoopl)}

Our example client implements
constant propagation and constant folding.
For each variable, at each program point, the analysis concludes one
of three facts:
the variable holds a constant value of type~@Lit@,
the variable might hold a non-constant value,
or what the variable holds is unknown.
We~represent these facts using a finite map from a variable to a
value of type @WithTop Lit@.
% Any one procedure has only
% finitely many variables; only finitely many facts are computed at any
% program point; and in this lattice any one fact can increase at most
% twice. These properties ensure that the dataflow engine will reach a
% fixed point.
A~variable with a constant value maps to @Just (PElem k)@, where
@k@~is the constant value;
a variable with a non-constant value maps to @Just Top@;
and a variable with an unknown value maps to @Nothing@ (it is not
in the domain of the finite map).
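This three-valued representation can be sketched directly (an illustrative miniature in which `Lit` is simplified to `Int`; `hasLit` is a hypothetical helper, not part of the client in the figure):

```haskell
import qualified Data.Map as M

data WithTop a = PElem a | Top deriving (Eq, Show)

type Var = String
type ConstFact = M.Map Var (WithTop Int)

-- A variable is usable as a constant only when it maps to PElem k;
-- Top (non-constant) and absence (unknown) both yield Nothing.
hasLit :: ConstFact -> Var -> Maybe Int
hasLit f x = case M.lookup x f of
  Just (PElem k) -> Just k
  _              -> Nothing

main :: IO ()
main = do
  let f = M.fromList [("x", PElem 3), ("y", Top)] :: ConstFact
  print (hasLit f "x")  -- known constant
  print (hasLit f "y")  -- non-constant
  print (hasLit f "z")  -- unknown
```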

% \afterpage{\clearpage}

The definition of the lattice (@constLattice@) is straightforward.
The bottom element is an empty map (nothing is known about what
any variable holds).
%
The join function is implemented with the help of combinators provided
by \hoopl.
The client writes a simple function, @constFactAdd@, which
compares two values of type @Lit@ and returns a result of type
@WithTop Lit@.
The client uses @extendJoinDomain@ to lift @constFactAdd@ into a join
function on @WithTop Lit@, then uses
@joinMaps@ to lift that join function to one on finite maps
containing facts for all variables.

The forward transfer function @varHasLit@ is defined using the
shape-polymorphic auxiliary function~@ft@.
For most nodes~@n@, @ft n@ simply propagates the input fact forward.
But for an assignment node, if a variable~@x@ gets a constant value~@k@,
@ft@ extends the input fact by mapping @x@ to~@PElem k@.
And if a variable~@x@ is assigned a non-constant value,
@ft@ extends the input fact by mapping @x@ to~@Top@.
There is one other interesting case:
a conditional branch where the condition
is a variable.
If the conditional branch flows to the true successor,
the variable holds @True@, and similarly for the false successor.
Function @ft@ updates the fact flowing to each successor accordingly.
\remark{Transfer functions default the uninteresting cases.}

The transfer function need not consider complicated cases such as
an assignment @x:=y@ where @y@ holds a constant value~@k@.
Instead, we rely on the interleaving of transformation
and analysis to first transform the assignment to @x:=k@,
which is exactly what our simple transfer function expects.
This interleaving makes it possible to write
very simple transfer functions, without missing
opportunities to improve the code.

The rewrite function @constProp@
rewrites each use of a variable to its constant value.
The client has defined auxiliary functions that may change expressions
or nodes:
\begin{code}
type `MaybeChange a = a -> Maybe a
`mapVE   :: (Var -> Maybe Expr) -> MaybeChange Expr
`mapEE   :: MaybeChange Expr -> MaybeChange Expr
`mapEN   :: MaybeChange Expr -> MaybeChange (Node e x)
`mapVN   :: (Var -> Maybe Expr) -> MaybeChange (Node e x)
`nodeToG :: Node e x -> Graph Node e x
\end{code}
The function @constProp@ uses @mapVN@
to apply @lookup@ to each use of a variable in each kind of node;
@lookup@ substitutes for each variable that has a constant value.
Applying @liftM nodeToG@ lifts the final node, if present, into a~@Graph@.

A second rewrite function is used for constant
folding: @simplify@.
This function
rewrites constant expressions to their values,
and it rewrites a conditional branch on a
boolean constant to an unconditional branch.
To~rewrite constant expressions,
it runs @s_exp@ on every subexpression.
Function @simplify@ does not check whether a variable holds a
constant value; it relies on @constProp@ to have replaced the
variable by the constant.
Indeed, @simplify@ does not consult the
incoming fact, so it is polymorphic in~@f@.
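The flavor of such a fact-independent folding step can be sketched on a toy expression type (the paper's `Expr` and its operators are richer, and this helper is illustrative, not the figure's `s_exp`):

```haskell
data Expr = Lit Int | Var String | Add Expr Expr
  deriving (Eq, Show)

-- Fold one layer of constant subexpressions, or decline with Nothing.
-- Like simplify, it never consults a dataflow fact.
foldExpr :: Expr -> Maybe Expr
foldExpr (Add (Lit a) (Lit b)) = Just (Lit (a + b))
foldExpr (Add e1 e2) = case (foldExpr e1, foldExpr e2) of
  (Nothing, Nothing) -> Nothing           -- nothing to fold: decline
  (m1, m2) -> Just (Add (maybe e1 id m1) (maybe e2 id m2))
foldExpr _ = Nothing

main :: IO ()
main = do
  print (foldExpr (Add (Lit 1) (Add (Lit 2) (Lit 3))))
  print (foldExpr (Var "x"))
```

Note that one application folds only the innermost redex here (`1 + (2+3)` becomes `1 + 5`); fully reducing the expression requires iterating the rewrite, which is exactly the role deep rewriting plays in \hoopl.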

The @FwdRewrite@ functions @constProp@ and @simplify@
are useful independently.
In this case, however, we
want their combined effect,
so we compose them with @thenFwdRw@.
The composition, along with the lattice and the
transfer function,
is packaged as the @FwdPass@ @constPropPass@.
Given @constPropPass@, we can
improve a graph~@g@ by passing @constPropPass@ and~@g@
to
@analyzeAndRewriteFwdBody@.


2121 In the presence of loops, a rewrite function could make a change

2122 that later has to be rolled back.

2123 For example, consider constant propagation in this loop, which

2124 computes factorial:
2126 i = 1; prod = 1;

2127 L1: if (i >= n) goto L3 else goto L2;

2128 L2: i = i + 1; prod = prod * i;

2129 goto L1;

2130 L3: ...

2132 Function @analyzeAndRewriteFwdBody@ iterates through this graph until

2133 the dataflow facts stop changing.

2134 On~the first iteration, the assignment @i = i + 1@ will

2135 be analyzed with an incoming fact @i=1@, and the assignment will be

2136 rewritten to the graph @i = 2@.

2137 But~on a later iteration, the incoming fact will increase to @i=@$\top$,

2138 and the~rewrite will no longer be justified.

2139 After each iteration, \hoopl\ starts the next iteration with
2140 the original graph; as a consequence

2141 of using purely functional data structures, rewrites from

2142 previous iterations are automatically ``rolled back.''
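The rollback in this example hinges on the join: the fact @i=1@ from the first iteration joins with @i=2@ to give @i=@$\top$. A flat lattice with that behavior can be sketched as follows; the names are invented, and \hoopl's actual join operation takes a slightly different form (it also receives a label and reports change with a dedicated type).

```haskell
-- Sketch of a flat lattice of constant values: Bot < Const k < Top.
-- Names are illustrative, not Hoopl's.
data ConstFact = Bot | Const Int | Top
  deriving (Eq, Show)

-- Join two facts, also reporting whether the old fact changed
-- (cf. Hoopl's join reporting NoChange/SomeChange).
joinConst :: ConstFact -> ConstFact -> (Bool, ConstFact)
joinConst old new = (j /= old, j)
  where
    j = join old new
    join Bot x   = x
    join x   Bot = x
    join Top _   = Top
    join _   Top = Top
    join (Const a) (Const b)
      | a == b    = Const a
      | otherwise = Top
```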

2143 A~rewrite function may also take monadic actions, such as

2145 acquiring a fresh name.

2146 These~actions must also be rolled back, and because the~client chooses

2147 the monad in which the actions take place, the client must provide the

2148 means to roll them back.
2149 \hoopl\ therefore defines an interface through which the client can roll

2151 back monadic actions.

2152 The~interface is defined by the type class @CkpointMonad@:

2155 class Monad m => `CkpointMonad m where

2156 type `Checkpoint m

2157 `checkpoint :: m (Checkpoint m)

2158 `restart :: Checkpoint m -> m ()

2160 \hoopl\ calls the @checkpoint@ method at the beginning of an

2161 iteration, then calls the @restart@ method if another iteration is

2162 necessary.

2163 These operations must obey the following algebraic law:
2164 \begin{code}

2165 do { s <- checkpoint; m; restart s } == return ()

2166 \end{code}

2167 where @m@~represents any combination of monadic actions that might be

2168 taken by rewrite functions.

2169 (The safest course is to make sure the law holds throughout the entire

2170 monad.)

2171 The~type of the saved checkpoint~@s@ is up to the client; \hoopl\ merely saves the checkpoint and later passes it back to @restart@.
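To make the contract concrete, here is a hedged sketch of a client instance: a tiny name-supply monad whose checkpoint is just the supply counter. The @Supply@ monad and @freshName@ are invented for illustration; the class declaration restates @CkpointMonad@ from above.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Sketch only: a hand-rolled name-supply monad. Restoring the
-- counter is what "rolling back" means for this monad.
newtype Supply a = Supply { runSupply :: Int -> (a, Int) }

instance Functor Supply where
  fmap f (Supply g) = Supply (\s -> let (a, s') = g s in (f a, s'))

instance Applicative Supply where
  pure a = Supply (\s -> (a, s))
  Supply f <*> Supply g =
    Supply (\s -> let (h, s')  = f s
                      (a, s'') = g s'
                  in (h a, s''))

instance Monad Supply where
  return = pure
  Supply g >>= k =
    Supply (\s -> let (a, s') = g s in runSupply (k a) s')

freshName :: Supply String
freshName = Supply (\s -> ("x" ++ show s, s + 1))

-- The class from the text, restated so the sketch is self-contained.
class Monad m => CkpointMonad m where
  type Checkpoint m
  checkpoint :: m (Checkpoint m)
  restart    :: Checkpoint m -> m ()

instance CkpointMonad Supply where
  type Checkpoint Supply = Int
  checkpoint = Supply (\s -> (s, s))
  restart s  = Supply (\_ -> ((), s))
```

After a @restart@, the supply hands out the same names again, which is exactly the algebraic law specialized to this monad.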

2180 % Facts computed by @analyzeAndRewriteFwdBody@ depend on graphs produced by the rewrite

2181 Facts computed by the transfer function depend on graphs produced by the rewrite

2182 function, which in turn depend on facts computed by the transfer function.

2183 How~do we know that this algorithm is sound, or that it terminates?

2184 A~proof requires a POPL paper;

2186 here we merely state the conditions for correctness as applied to \hoopl:

2188 \item
2189 The lattice must have no infinite ascending chains; that is,

2190 every sequence of calls to @fact_join@ must eventually return @NoChange@.

2191 \item

2192 The transfer function must be
2193 monotonic: given a more informative fact in,

2194 it must produce a more informative fact out.

2195 \item
2196 The rewrite function must be sound:

2197 if it replaces a node @n@ by a replacement graph~@g@, then @g@~must be

2198 observationally equivalent to~@n@ under the

2199 assumptions expressed by the incoming dataflow fact~@f@.

2200 %%\footnote{We do not permit a transformation to change

2201 %% the @Label@ of a node. We have not found any optimizations

2202 %% that are prevented (or even affected) by this restriction.}

2203 %

2204 Moreover, analysis of~@g@ must produce output fact(s)

2205 that are at least as informative as the fact(s) produced by applying

2206 the transfer function to~@n@.

2207 %% \item

2208 %% The rewrite function must be \emph{consistent} with the transfer function;

2209 %% that is, \mbox{@`transfer n f@ $\sqsubseteq$ @transfer g f@}.

2210 For example, if the transfer function says that @x=7@ after the node~@n@,

2211 then after analysis of~@g@,

2212 @x@~had better still be~@7@.

2213 \item

2214 % To ensure termination,

2215 A transformation that uses deep rewriting

2216 must not return a replacement graph that

2217 contains a node that could be rewritten indefinitely.

2219 %% Without the conditions on monotonicity and consistency,

2220 %% our algorithm will terminate,

2221 %% but there is no guarantee that it will compute

2222 %% a fixed point of the analysis. And that in turn threatens the

2223 %% soundness of rewrites based on possibly bogus ``facts''.

2224 Under these conditions, the algorithm terminates and is

2225 sound.

2226 %%

2227 %% \begin{itemize}

2228 %% \item

2229 %% The algorithm terminates. The fixed-point loop must terminate because the

2230 %% lattice has no infinite ascending chains. And the client is responsible

2231 %% for avoiding infinite recursion when deep rewriting is used.

2232 %% \item

2233 %% The algorithm is sound. Why? Because if each rewrite is sound (in the sense given above),

2234 %% then applying a succession of rewrites is also sound.

2235 %% Moreover, a~sound analysis of the replacement graph

2236 %% may generate only dataflow facts that could have been

2237 %% generated by a more complicated analysis of the original graph.

2238 %% \end{itemize}

2239 %%

2240 %% \finalremark{Doesn't the rewrite have to be have the following property:

2241 %% for a forward analysis/transform, if (rewrite P s) = Just s',

2242 %% then (transfer P s $\sqsubseteq$ transfer P s').

2243 %% For backward: if (rewrite Q s) = Just s', then (transfer Q s' $\sqsubseteq$ transfer Q s).

2244 %% Works for liveness.

2245 %% ``It works for liveness, so it must be true'' (NR).

2246 %% If this is true, it's worth a QuickCheck property!

2247 %% }%

2248 %% \finalremark{Version 2, after further rumination. Let's define

2249 %% $\scriptstyle \mathit{rt}(f,s) = \mathit{transform}(f, \mathit{rewrite}(f,s))$.

2250 %% Then $\mathit{rt}$ should

2251 %% be monotonic in~$f$. We think this is true of liveness, but we are not sure

2252 %% whether it's just a generally good idea, or whether it's actually a

2253 %% precondition for some (as yet unarticulated) property of \ourlib{} to hold.}%

2255 %%%% \simon{The rewrite functions must presumably satisfy

2256 %%%% some monotonicity property. Something like: given a more informative

2257 %%%% fact, the rewrite function will rewrite a node to a more informative graph

2258 %%%% (in the fact lattice.).

2259 %%%% \textbf{NR}: actually the only obligation of the rewrite function is

2260 %%%% to preserve observable behavior. There's no requirement that it be

2261 %%%% monotonic or indeed that it do anything useful. It just has to

2262 %%%% preserve semantics (and be a pure function of course).

2263 %%%% \textbf{SLPJ} In that case I think I could cook up a program that

2264 %%%% would never reach a fixed point. Imagine a liveness analysis with a loop;

2265 %%%% x is initially unused anywhere.

2266 %%%% At some assignment node inside the loop, the rewriter behaves as follows:

2267 %%%% if (and only if) x is dead downstream,

2268 %%%% make it alive by rewriting the assignment to mention x.

2269 %%%% Now in each successive iteration x will go live/dead/live/dead etc. I

2270 %%%% maintain my claim that rewrite functions must satisfy some

2271 %%%% monotonicity property.

2272 %%%% \textbf{JD}: in the example you cite, monotonicity of facts at labels

2273 %%%% means x cannot go live/dead/live/dead etc. The only way we can think

2274 %%%% of not to terminate is infinite ``deep rewriting.''

2275 %%%% }

2286 So far, this paper has given a client's-eye view of \hoopl, showing how to use

2287 it to create analyses and transformations.

2288 \hoopl's interface is simple, but
2290 the types of its exported functions

2291 do not describe their implementation. We have written

2292 at least three previous implementations, all of which

2293 were long and hard to understand, and only one of which

2294 provided compile-time guarantees about open and closed shapes.

2295 We are not confident that any of our previous implementations are correct.

2297 In this paper we describe a new implementation. It is elegant and short

2298 (about a third of the size of our last attempt), and it offers

2299 strong compile-time guarantees about shapes.

2301 %
2302 This section describes our implementation of forward

2303 analysis and transformation.

2304 The implementations of backward analysis and transformation are

2305 exactly analogous and are included in \hoopl.
2307 We also explain how a client of \hoopl\ can

2308 isolate errors in faulty optimizers, and how the fault-isolation

2309 machinery is integrated with the rest of the implementation.

2318 %% We on @analyzeAndRewriteFwd@, whose type is more general

2319 %% than that of @analyzeAndRewriteFwdBody@:

2320 %% \begin{smallcode}

2321 %% `analyzeAndRewriteFwd

2322 %% :: forall m n f e x. (FuelMonad m, NonLocal n)

2323 %% => FwdPass m n f -- lattice, transfers, rewrites

2324 %% -> MaybeC e [Label] -- entry points for a closed graph

2325 %% -> Graph n e x -- the original graph

2326 %% -> Fact e f -- fact(s) flowing into the entry/entries

2327 %% -> m (Graph n e x, FactBase f, MaybeO x f)

2328 %% \end{smallcode}

2329 %% We analyze graphs of all shapes; a single @FwdPass@ may be used with

2330 %% multiple shapes.

2331 %% If a graph is closed on entry, a list of entry points must be

2332 %% provided;

2333 %% if the graph is open on entry, it must be the case that the

2334 %% implicit entry point is the only entry point.

2335 %% The fact or set of facts flowing into the entries is similarly

2336 %% determined by the shape of the entry point.

2337 %% Finally, the result is a rewritten graph, a @FactBase@ that gives a

2338 %% fixed point of the analysis on the rewritten graph, and if the graph

2339 %% is open on exit, an ``exit fact'' flowing out.

2341 Instead of the interface function @analyzeAndRewriteFwdBody@, we present

2342 the private function @arfGraph@ (short for ``analyze and rewrite

2343 forward graph''):

2345 `arfGraph

2346 :: forall m n f e x. (FuelMonad m, NonLocal n)

2347 => FwdPass m n f -- lattice, transfers, rewrites
2348 -> MaybeC e [Label] -- entry points for a closed graph

2349 -> Graph n e x -- the original graph

2350 -> Fact e f -- fact(s) flowing into the entry/entries

2351 -> m (DG f n e x, Fact x f)

2353 Function @arfGraph@ has a more general type than

2354 the function @analyzeAndRewriteFwdBody@ % praying for a decent line break

2355 because @arfGraph@ is used recursively

2356 to analyze graphs of all shapes.

2357 If a graph is closed on entry, a list of entry points must be

2358 provided;

2359 if the graph is open on entry,

2360 the graph's entry sequence must be the only entry point.

2361 The~graph's shape on entry also determines the type of fact or facts

2362 flowing in.

2363 Finally, the result is a ``decorated graph''

2364 @DG f n e x@,

2365 and if the graph

2366 is open on exit, an ``exit fact'' flowing out.

2368 %% \simon{I suggest (a) putting the paragraph break one sentence earlier,

2369 %%% so that this para is all about DGs.}

2370 %%% NR: previous para is about the type of arfGraph; I don't want to

2371 %%% leave the result type dangling. I hope the opening sentence of

2372 %%% this para suggests that the para is all about DGs.

2373 A~``decorated graph'' is one in which each block is decorated with the

2374 fact that holds at the start of the block.

2375 @DG@ actually shares a representation with @Graph@,

2376 which is possible because the definition of
2377 @Graph@ is a type

2378 synonym for an underlying type @`Graph'@, which takes the type of block

2379 as an additional parameter.
2380 (Similarly, the graph-splicing function is a

2381 higher-order function that takes a block-concatenation function as a

2382 parameter.)

2383 The truth about @Graph@ and @DG@ is as follows:

2385 % defn DG

2386 % defn DBlock

2387 Type~@DG@ is internal to \hoopl; it is not seen by any client.

2388 To~convert a~@DG@ to the @Graph@ and @FactBase@

2389 that are returned by the API function @analyzeAndRewriteFwdBody@,

2390 we use a 12-line function:

2392 `normalizeGraph

2393 :: NonLocal n => DG f n e x -> (Graph n e x, FactBase f)
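The relationship between @Graph@, @DG@, and @normalizeGraph@ can be sketched with deliberately simplified, invented types (no @e@/@x@ shape indices): a graph is parameterized by its block type, a decorated block pairs a fact with a block, and normalization splits the decorations off into a fact base.

```haskell
import qualified Data.Map as Map

-- All names below are invented stand-ins for illustration.
newtype Graph'' blk = Graph'' (Map.Map Int blk)   -- blocks indexed by label

type Block'    = [String]                         -- a block of instructions
data DBlock' f = DBlock' f Block'                 -- block decorated with its entry fact

type PlainGraph = Graph'' Block'                  -- plays the role of Graph
type DG' f      = Graph'' (DBlock' f)             -- plays the role of DG f

-- Splitting decorations from blocks mirrors what normalizeGraph
-- does: the result pairs an undecorated graph with a fact base.
normalize' :: DG' f -> (PlainGraph, Map.Map Int f)
normalize' (Graph'' dbs) =
  ( Graph'' (Map.map (\(DBlock' _ b) -> b) dbs)
  , Map.map (\(DBlock' f _) -> f) dbs )
```

Because the decorated and undecorated graphs differ only in the block parameter, one representation (and one splicing function) serves both.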

2396 Function @arfGraph@ is implemented as follows:

2397 \begingroup

2400 arfGraph ^pass entries = graph

2401 where\^

2402 node :: forall e x . (ShapeLifter e x)

2403 => n e x -> f -> m (DG f n e x, Fact x f)\^

2404 block:: forall e x .

2405 Block n e x -> f -> m (DG f n e x, Fact x f)\^
2406 `body :: [Label] -> LabelMap (Block n C C)

2407 -> Fact C f -> m (DG f n C C, Fact C f)\^

2408 `graph:: Graph n e x -> Fact e f -> m (DG f n e x, Fact x f)\^

2409 ... definitions of 'node', 'block', 'body', and 'graph' ...

2411 The four auxiliary functions help us separate concerns: for example, only

2412 \endgroup

2413 @node@ knows about rewrite functions;

2414 and only @body@ knows about fixed points.

2415 %% All four functions have access to the @FwdPass@ and any entry points

2416 %% that are passed to @arfGraph@.

2417 %% These functions also have access to type variables bound by

2418 %% @arfGraph@:

2419 %% @n@~is the type of nodes; @f@~is the type of facts;

2420 %% @m@~is the monad used in the rewrite functions of the @FwdPass@;

2421 %% and

2422 %% @e@~and~@x@ give the shape of the graph passed to @arfGraph@.

2423 %% The types of the inner functions are

2424 %% \begin{smallcode}

2425 %% \end{smallcode}

2426 Each auxiliary function works the same way: it~takes a ``thing'' and
2427 returns an \emph{extended fact transformer}.

2428 An~extended fact transformer takes dataflow fact(s) coming into

2429 the ``thing,'' and it returns an output fact.

2430 It~also returns a decorated graph representing the (possibly
2431 rewritten) ``thing.''

2432 Finally, because a rewrite may require fresh names provided

2433 by the client,

2434 may wish to consult state provided by the client,
2435 and may consume optimization fuel,

2436 every extended fact transformer is monadic.

2438 %%%% \begin{figure}

2439 %%%% SIMON HAS ASKED IF TYPE SYNONYMS MIGHT IMPROVE THINGS FOR EXTENDED

2440 %%%% FACT TRANSFORMERS. JUDGE FOR YOURSELF.

2441 %%%% FIRST, SOMETHING THAT IS SOMEWHAT READABLE BUT IS NOT LEGAL HASKELL:

2442 %%%% \begin{smallcode}

2443 %%%% type EFFX ipt e x = ipt -> m (DG f n e x, Fact x f)

2444 %%%% -- extended forward fact transformer

2445 %%%%

2446 %%%% node :: forall e x . (ShapeLifter e x)

2447 %%%% => n e x -> EFFX f e x

2448 %%%% block :: forall e x .

2449 %%%% Block n e x -> EFFX f e x

2450 %%%% body :: [Label] -> LabelMap (Block n C C)

2451 %%%% -> EFFX (Fact C f) C C

2452 %%%% graph :: Graph n e x -> EFFX (Fact e f) e x

2453 %%%% \end{smallcode}

2454 %%%% IF WE MAKE IT LEGAL HASKELL, IT BECOMES COMPLETELY HOPELESS:

2455 %%%% \begin{smallcode}

2456 %%%% type EFFX m n f ipt e x = ipt -> m (DG f n e x, Fact x f)

2457 %%%% -- extended forward fact transformer

2458 %%%%

2459 %%%% node :: forall e x . (ShapeLifter e x)

2460 %%%% => n e x -> EFFX m n f f e x

2461 %%%% block :: forall e x .

2462 %%%% Block n e x -> EFFX m n f f e x

2463 %%%% body :: [Label] -> LabelMap (Block n C C)

2464 %%%% -> EFFX m n f (Fact C f) C C

2465 %%%% graph :: Graph n e x -> EFFX m n f (Fact e f) e x

2466 %%%% \end{smallcode}

2467 %%%% \caption{EXPERIMENTS WITH TYPE SYNONYMS}

2468 %%%% \end{figure}

2469 %%%%

2472 The types of the four extended fact transformers are not quite

2473 identical:

2475 \item

2476 Extended fact transformers for nodes and blocks have the same type;

2477 like forward transfer functions,

2478 they expect a fact~@f@ rather than the more general @Fact e f@

2479 required for a graph.

2480 Because a node or a block has

2481 exactly one fact flowing into the entry, it is easiest simply to pass

2482 that fact.

2483 \item

2484 Extended fact transformers for graphs have the most general type,

2485 as expressed using @Fact@:

2486 if the graph is open on entry, its fact transformer expects a

2487 single fact;

2488 if the graph is closed on entry, its fact transformer expects a

2489 @FactBase@.

2490 \item

2491 Extended fact transformers for bodies have the same type as

2492 extended fact transformers for closed/closed graphs.

2496 Function @arfGraph@ and its four auxiliary functions comprise a cycle of

2497 mutual recursion:

2498 @arfGraph@ calls @graph@;

2499 @graph@ calls @body@ and @block@;

2500 @body@ calls @block@;

2501 @block@ calls @node@;

2502 and

2503 @node@ calls @arfGraph@.

2504 These five functions do three different kinds of work:

2505 compose extended fact transformers, analyze and rewrite nodes, and compute

2506 fixed points.

2513 Extended fact transformers compose nicely.

2514 For example, @block@ is implemented thus:

2515 %\ifpagetuning\par\vfilbreak[1in]\fi % horrible page break

2517 `block :: forall e x .

2518 Block n e x -> f -> m (DG f n e x, Fact x f)

2519 block (BFirst n) = node n

2520 block (BMiddle n) = node n

2521 block (BLast n) = node n

2522 block (BCat b1 b2) = block b1 `cat` block b2

2524 The composition function @cat@ feeds facts from one extended fact

2525 transformer to another, and it splices decorated graphs.

2526 It~has a very general type:

2528 cat :: forall m e a x f f1 f2. Monad m

2529 => (f -> m (DG f n e a, f1))

2530 -> (f1 -> m (DG f n a x, f2))

2531 -> (f -> m (DG f n e x, f2))

2532 `cat ft1 ft2 f = do { (g1,f1) <- ft1 f

2533 ; (g2,f2) <- ft2 f1

2534 ; return (g1 `dgSplice` g2, f2) }

2536 (Function @`dgSplice@ is the same splicing function used for an ordinary

2537 @Graph@, but it uses a one-line block-concatenation function suitable

2538 for @DBlock@s.)

2539 The name @cat@ comes from the concatenation of the decorated graphs,

2540 but it is also appropriate because the style in which it is used is

2541 reminiscent of @concatMap@, with the @node@ and @block@ functions

2542 playing the role of @map@.
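The @concatMap@ analogy shows through in a simplified sketch of @cat@ in which the ``decorated graph'' is just a list and the shape indices are dropped; @cat'@ is invented for illustration and works in any monad.

```haskell
-- Sketch only: a simplified cat. The "decorated graph" is a list of
-- strings; f, f1, f2 are the facts threaded between transformers.
cat' :: Monad m
     => (f  -> m ([String], f1))
     -> (f1 -> m ([String], f2))
     -> (f  -> m ([String], f2))
cat' ft1 ft2 f = do
  (g1, f1) <- ft1 f
  (g2, f2) <- ft2 f1
  return (g1 ++ g2, f2)
```

Threading the fact through while concatenating output is Kleisli composition plus accumulation, which is what makes code written with @cat@ read like @concatMap@.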

2546 Function @graph@ is much like @block@, but it has more cases.

2549 %%%%

2550 %%%% \begin{itemize}

2551 %%%% \item

2552 %%%% The @arfNode@ function processes nodes (\secref{arf-node}).

2553 %%%% It handles the subtleties of interleaved analysis and rewriting,

2554 %%%% and it deals with fuel consumption. It calls @arfGraph@ to analyze

2555 %%%% and transform rewritten graphs.

2556 %%%% \item

2557 %%%% Based on @arfNode@ it is extremely easy to write @arfBlock@, which lifts

2558 %%%% the analysis and rewriting from nodes to blocks (\secref{arf-block}).

2559 %%%%

2560 %%%%

2561 %%%%

2562 %%%% \item

2563 %%%% Using @arfBlock@ we define @arfBody@, which analyzes and rewrites a

2564 %%%% @Body@: a~group of closed/closed blocks linked by arbitrary

2565 %%%% control flow.

2566 %%%% Because a @Body@ is

2567 %%%% always closed/closed and does not take shape parameters, function

2568 %%%% @arfBody@ is less polymorphic than the others; its type is what

2569 %%%% would be obtained by expanding and specializing the definition of

2570 %%%% @ARF@ for a @thing@ which is always closed/closed and is equivalent to

2571 %%%% a @Body@.

2572 %%%%

2573 %%%% Function @arfBody@ takes care of fixed points (\secref{arf-body}).

2574 %%%% \item

2575 %%%% Based on @arfBody@ it is easy to write @arfGraph@ (\secref{arf-graph}).

2576 %%%% \end{itemize}

2577 %%%% Given these functions, writing the main analyzer is a simple

2578 %%%% matter of matching the external API to the internal functions:

2579 %%%% \begin{code}

2580 %%%% `analyzeAndRewriteFwdBody

2581 %%%% :: forall n f. NonLocal n

2582 %%%% => FwdPass n f -> Body n -> FactBase f

2583 %%%% -> FuelMonad (Body n, FactBase f)

2584 %%%%

2585 %%%% analyzeAndRewriteFwdBody pass ^body facts

2586 %%%% = do { (^rg, _) <- arfBody pass body facts

2587 %%%% ; return (normalizeBody rg) }

2588 %%%% \end{code}

2589 %%%%

2590 %%%% \subsection{From nodes to blocks} \seclabel{arf-block}

2591 %%%% \seclabel{arf-graph}

2592 %%%%

2593 %%%% We begin our explanation with the second task:

2594 %%%% writing @arfBlock@, which analyzes and transforms blocks.

2595 %%%% \begin{code}

2596 %%%% `arfBlock :: NonLocal n => ARF (Block n) n

2597 %%%% arfBlock pass (BUnit node) f

2598 %%%% = arfNode pass node f

2599 %%%% arfBlock pass (BCat b1 b2) f

2600 %%%% = do { (g1,f1) <- arfBlock pass b1 f

2601 %%%% ; (g2,f2) <- arfBlock pass b2 f1

2602 %%%% ; return (g1 `DGCatO` g2, f2) }

2603 %%%% \end{code}

2604 %%%% The code is delightfully simple.

2605 %%%% The @BUnit@ case is implemented by @arfNode@.

2606 %%%% The @BCat@ case is implemented by recursively applying @arfBlock@ to the two

2607 %%%% sub-blocks, threading the output fact from the first as the

2608 %%%% input to the second.

2609 %%%% Each recursive call produces a rewritten graph;

2610 %%%% we concatenate them with @DGCatO@.

2611 %%%%

2612 %%%% Function @arfGraph@ is equally straightforward:

2613 %%%% XXXXXXXXXXXXXXX

2614 %%%% The pattern is the same as for @arfBlock@: thread

2615 %%%% facts through the sequence, and concatenate the results.

2616 %%%% Because the constructors of type~@DG@ are more polymorphic than those

2617 %%%% of @Graph@, type~@DG@ can represent

2618 %%%% graphs more simply than @Graph@; for example, each element of a

2619 %%%% @GMany@ becomes a single @DG@ object, and these @DG@ objects are then

2620 %%%% concatenated to form a single result of type~@DG@.

2621 %%%%

2625 The @node@ function is where we interleave analysis with rewriting:

2628 % defn ShapeLifter

2629 % defn singletonDG

2630 % defn fwdEntryFact

2631 % defn fwdEntryLabel

2632 % defn ftransfer

2633 % defn frewrite

2634 %\begin{smallfuzzcode}{10.5pt}

2635 %node :: forall e x . (ShapeLifter e x, FuelMonad m)

2636 % => n e x -> f -> m (DG f n e x, Fact x f)

2637 %node n f

2638 % = do { ^rew <- frewrite pass n f

2639 % ; case rew of

2640 % Nothing -> return (singletonDG f n,

2641 % ftransfer pass n f)

2642 % Just (FwdRew g rw) ->

2643 % let ^pass' = pass { fp_rewrite = rw }

2644 % f' = fwdEntryFact n f

2645 % in arfGraph pass' (fwdEntryLabel n) g f' }

2646 %

2647 %class `ShapeLifter e x where

2648 % `singletonDG :: f -> n e x -> DG f n e x

2649 % `fwdEntryFact :: NonLocal n => n e x -> f -> Fact e f

2650 % `fwdEntryLabel :: NonLocal n => n e x -> MaybeC e [Label]

2651 % `ftransfer :: FwdPass m n f -> n e x -> f -> Fact x f

2652 % `frewrite :: FwdPass m n f -> n e x

2653 % -> f -> m (Maybe (FwdRew m n f e x))

2654 %\end{smallfuzzcode}

2656 Function @node@ uses @frewrite@ to extract the rewrite function from

2657 @pass@,

2658 and applies the rewrite function to the node~@n@ and the incoming fact~@f@.

2659 The result, @rew@, is

2660 scrutinized by the @case@ expression.

2662 In the @Nothing@ case, no rewrite takes place.

2663 We~return node~@n@ and its incoming fact~@f@

2664 as the decorated graph @singletonDG f n@.

2665 To produce the outgoing fact, we apply the transfer function

2666 @ftransfer pass@ to @n@~and~@f@.

2668 In the @Just@ case, we receive a replacement

2669 graph~@g@ and a new rewrite function~@rw@.

2670 We~recursively analyze @g@ with @arfGraph@.

2671 This analysis uses @pass'@, which contains the original lattice and transfer

2672 function from @pass@, together with the new rewrite function~@rw@.

2673 Function @fwdEntryFact@ converts fact~@f@ from the type~@f@,

2674 which @node@ expects, to the type @Fact e f@, which @arfGraph@ expects.

2676 As you see, several functions called in @node@ are overloaded over a

2677 (private) class @ShapeLifter@, because their implementations depend

2678 on the open/closed shape of the node.

2679 By design, the shape of a node is known statically everywhere @node@

2680 is called, so

2681 this use of @ShapeLifter@ is specialized

2682 away by the compiler.

2684 %% And that's it! If~the client wanted deep rewriting, it is

2685 %% implemented by the call to @arfGraph@;

2686 %% if the client wanted

2687 %% shallow rewriting, the rewrite function will have returned

2688 %% @noFwdRw@ as~@rw@, which is implanted in @pass'@

2689 %% (\secref{shallow-vs-deep}).

2694 The fourth and final auxiliary function of @arfGraph@ is

2695 @body@, which iterates to a fixed point.

2696 This is the only really tricky part of the implementation, and it is

2697 cleanly separated from everything else:

2699 % defn body

2700 % local do_block

2701 % local blocks

2702 % local lattice

2703 % local entryFact

2704 % local entries

2705 % local init_fbase

2706 % local blockmap

2707 Function @getFact@ looks up a fact by its label.

2708 If the label is not found,

2709 @getFact@ returns

2710 the bottom element of the lattice:

2712 `getFact :: DataflowLattice f -> Label -> FactBase f -> f
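Assuming a @FactBase@ is a map from labels to facts and that the lattice record carries its bottom element, @getFact@ amounts to a defaulted lookup. The following sketch uses simplified, invented types (@Label'@, @Lattice@, @getFact'@):

```haskell
import qualified Data.Map as Map
import Data.Maybe (fromMaybe)

-- Simplified stand-ins: Label as Int, FactBase as a Map, and a
-- lattice record carrying only its bottom element.
type Label'      = Int
type FactBase' f = Map.Map Label' f
data Lattice f   = Lattice { latticeBot :: f }

-- Look up a fact by label, defaulting to bottom when absent.
getFact' :: Lattice f -> Label' -> FactBase' f -> f
getFact' lat l fb = fromMaybe (latticeBot lat) (Map.lookup l fb)
```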

2714 Function @forwardBlockList@ takes a list of possible entry points and

2715 a finite map from labels to blocks.

2716 It returns a list of

2717 blocks, sorted into an order that makes forward dataflow efficient.\footnote

2718 {The order of the blocks does not affect the fixed point or any other

2719 part of the answer; it affects only the number of iterations needed to

2720 reach the fixed point.}

2722 `forwardBlockList

2723 :: NonLocal n

2726 For

2727 example, if the entry point is at~@L2@, and the block at~@L2@

2728 branches to~@L1@, but not vice versa, then \hoopl\ will reach a fixed point

2729 more quickly if we process @L2@ before~@L1@.

2730 To~find an efficient order, @forwardBlockList@ uses

2731 the methods of the @NonLocal@ class---@entryLabel@ and @successors@---to

2732 perform a reverse postorder depth-first traversal of the control-flow graph.
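Such a traversal can be sketched independently of \hoopl's block types; @reversePostorder@ below is a standard depth-first search over a successor function, invented for illustration. Note that blocks unreachable from the entries never appear in the result.

```haskell
import qualified Data.Set as Set

-- Sketch only: reverse postorder DFS from a list of entries, given
-- each node's successors. Consing a node onto the accumulator when
-- it finishes yields reverse finishing order, i.e. reverse postorder.
reversePostorder :: Ord a => (a -> [a]) -> [a] -> [a]
reversePostorder succs entries = snd (foldl visit (Set.empty, []) entries)
  where
    visit (seen, acc) n
      | n `Set.member` seen = (seen, acc)
      | otherwise =
          let (seen', acc') = foldl visit (Set.insert n seen, acc) (succs n)
          in (seen', n : acc')
```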

2733 %%

2734 %%The @NonLocal@ type-class constraint on~@n@ propagates to all the

2735 %%@`arfThing@ functions.

2736 %% paragraph carrying too much freight

2737 %%

2739 The rest of the work is done by @`fixpoint@, which is shared by

2740 both forward and backward analyses:

2742 % defn Direction

2743 % defn Fwd

2744 % defn Bwd

2745 Except for the @Direction@ passed as the first argument,

2746 the type signature tells the story.

2747 The third argument is an extended fact transformer for a single block;

2748 @fixpoint@ applies that function successively to each block in the list

2749 passed as the fourth argument.

2750 The result is an extended fact transformer for the list.

2752 The extended fact transformer returned by @fixpoint@

2753 maintains a

2754 ``current @FactBase@''

2755 which grows monotonically:

2756 as each block is analyzed,

2757 the block's input fact is taken from

2758 the current @FactBase@,

2759 and the current @FactBase@

2760 is

2761 augmented with the facts that flow out of the block.

2762 %

2763 The initial value of the current @FactBase@ is the input @FactBase@,

2764 and

2765 the extended fact transformer

2766 iterates over the blocks until the current @FactBase@

2767 stops changing.
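The iteration scheme just described can be sketched generically: apply a per-block step to a fact base until the fact base stops changing. This deliberately ignores rewriting, reachability, and \hoopl's change-tracking, and all names are invented; termination assumes the facts eventually stabilize (no infinite ascending chains).

```haskell
import qualified Data.Map as Map

-- Sketch only. 'step' stands in for the extended fact transformer of
-- one block: it reads its input fact from the fact base and joins
-- its output facts back in. We iterate passes to a fixed point.
fixpointSketch :: Eq f
               => (Int -> Map.Map Int f -> Map.Map Int f)  -- analyze one block
               -> [Int]                                    -- blocks, in order
               -> Map.Map Int f                            -- input fact base
               -> Map.Map Int f
fixpointSketch step blocks fb
  | fb' == fb = fb
  | otherwise = fixpointSketch step blocks fb'
  where fb' = foldl (flip step) fb blocks
```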

2771 Implementing @fixpoint@ requires about 90 lines,

2772 formatted narrowly for display in one column.

2773 %%

2774 %% for completeness, it

2775 %% appears in Appendix~\ref{app:fixpoint}.

2776 The~code is mostly straightforward, although we try to be clever

2777 about deciding when a new fact means that another iteration over the

2778 blocks will be required.

2780 There is one more subtle point worth mentioning, which we highlight by

2781 considering a forward analysis of this graph, where execution starts at~@L1@:

2783 L1: x:=3; goto L4

2784 L2: x:=4; goto L4

2785 L4: if x>3 goto L2 else goto L5

2787 Block @L2@ is unreachable.

2788 But if we \naively\ process all the blocks (say in

2789 order @L1@, @L4@, @L2@), then we will start with the bottom fact for @L2@, propagate
2790 \{x=4\} to~@L4@, where it joins \{x=3\} to produce \{x=$\top$\}.

2792 Given @x=@$\top$, the

2793 conditional in @L4@ cannot be rewritten, and @L2@~seems reachable. We have

2794 lost a good optimization.

2796 Function @fixpoint@ solves this problem

2797 by analyzing a block only if the block is

2798 reachable from an entry point.

2799 This trick is safe only for a forward analysis, which

2800 is why

2801 @fixpoint@ takes a @Direction@ as its first argument.

2803 %% Although the trick can be implemented in just a couple of lines of

2804 %% code, the reasoning behind it is quite subtle---exactly the sort of

2805 %% thing that should be implemented once in \hoopl, so clients don't have

2806 %% to worry about it.

2815 When optimization produces a faulty program, the faulty rewrite can be found automatically:

2817 given a program that fails when compiled with optimization,

2818 a binary search on the number of rewrites

2819 finds an~$n$ such that the program works correctly after $n-1$ rewrites

2820 but fails after $n$~rewrites.

2821 The $n$th rewrite is faulty.
2822 This

2823 technique enables us to debug complex optimizations by

2824 identifying one single rewrite that is faulty.

2826 This debugging technique requires the~ability to~limit

2827 the number of~rewrites,
2828 which \hoopl\ provides in the form of \emph{optimization fuel}.

2829 If each rewrite consumes one unit of optimization fuel

2830 (and does nothing if there is no remaining fuel),

2831 we can perform the binary search by running the optimization

2832 with different supplies of fuel.
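The binary search itself is simple. The sketch below uses invented names and standard bisection; it assumes a predicate @works n@ that compiles with a fuel supply of $n$ rewrites, runs the result, and reports success, with @works 0@ succeeding and the full supply failing.

```haskell
-- Sketch only: find the smallest fuel supply n for which the
-- compiled program fails; rewrite number n is then the faulty one.
-- Invariant maintained by go: works lo holds, works bad fails.
findFaultyRewrite :: (Int -> Bool) -> Int -> Int
findFaultyRewrite works hi = go 0 hi
  where
    go lo bad
      | lo + 1 == bad = bad
      | works mid     = go mid bad
      | otherwise     = go lo mid
      where mid = (lo + bad) `div` 2
```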

2833 To~limit rewrites, \hoopl\ provides a @FuelMonad@ type class,

2834 which must be implemented by the client's monad @m@.

2835 And to~ensure that each rewrite consumes one~unit of~fuel,

2836 the function @mkFRewrite@ wraps the client's rewrite function, which is oblivious to fuel,

2837 in another function that satisfies the following contract:

2839 \item

2840 If the fuel supply is empty, the wrapped function always returns @Nothing@.

2841 \item

2842 If the wrapped function returns @Just g@, it has the monadic effect of

2843 reducing the fuel supply by one unit.
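This contract can be sketched with fuel threaded explicitly as an integer; @withFuelSketch@ is invented and differs in form from a monadic fuel-debiting wrapper, but it enforces the same two rules.

```haskell
-- Sketch only: wrap a rewrite result so that no rewrite fires when
-- fuel is exhausted, and a successful rewrite debits one unit.
type Fuel = Int

withFuelSketch :: Maybe a -> Fuel -> (Maybe a, Fuel)
withFuelSketch _        0    = (Nothing, 0)       -- out of fuel: suppress rewrite
withFuelSketch Nothing  fuel = (Nothing, fuel)    -- no rewrite: fuel untouched
withFuelSketch (Just a) fuel = (Just a, fuel - 1) -- rewrite fires: debit one unit
```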

2846 %%%% \seclabel{fuel-monad}

2847 %%%%

2848 %%%% \begin{ntext}

2849 %%%% \subsection{Rewrite functions}

2850 %%%%

2851 %%%%

2852 %%%%

2853 %%%% \begin{code}

2854 %%%% `withFuel :: FuelMonad m => Maybe a -> m (Maybe a)

2855 %%%% \end{code}

2856 %%%%

2857 %%%%

2858 %%%% as expressed by the

2859 %%%% @FwdRew@ type returned by a @FwdRewrite@ (\figref{api-types}).

2860 %%%% The first component of the @FwdRew@ is the replacement graph, as discussed earlier.

2861 %%%% The second component, $\rw$, is a

2862 %%%% \emph{new rewrite function} to use when recursively processing

2863 %%%% the replacement graph.

2864 %%%% For shallow rewriting this new function is

2865 %%%% the constant @Nothing@ function; for deep rewriting it is the original

2866 %%%% rewrite function.

2867 %%%% While @mkFRewrite@ allows for general rewriting, most clients will

2868 %%%% take advantage of \hoopl's support for these two common cases:

2869 %%%% \begin{smallcode}

2870 %%%% `deepFwdRw, `shallowFwdRw

2871 %%%% :: Monad m

2872 %%%% => (forall e x . n e x -> f -> m (Maybe (Graph n e x))

2873 %%%% -> FwdRewrite m n f

2874 %%%% \end{smallcode}

2875 %%%% \end{ntext}

2881 While there is a vast body of literature on

2882 dataflow analysis and optimization,

2883 relatively little can be found on
2884 the design of optimizers themselves.

2885 We therefore focus on the foundations of dataflow analysis

2886 and on the implementations of some comparable dataflow frameworks.

2890 When transfer functions are monotone and lattices are finite in height,

2891 iterative dataflow analysis converges to a fixed point.

2893 If~the lattice's join operation distributes over transfer

2894 functions,

2895 this fixed point is equivalent to a join-over-all-paths solution to

2896 the recursive dataflow equations

2898 {Kildall uses meets, not joins.

2899 Lattice orientation is a matter of convention, and conventions have changed.

2900 We use Dana Scott's

2901 orientation, in which higher elements carry more information.}

2903 monotone functions.

2904 Each~client of \hoopl\ must guarantee monotonicity.

2906 \ifcutting

2908 \else

2910 \fi

2911 introduce abstract interpretation as a technique for developing

2912 lattices for program analysis.

2914 a dataflow analysis can be implemented using model checking;

2916 expands on this~result by showing that

2917 an all-paths dataflow problem can be viewed as model checking an

2918 abstract interpretation.

2921 present a survey of different methods for performing dataflow analyses,

2922 with emphasis on theoretical results.

2924 presents many examples of both particular analyses and related

2925 algorithms.

2928 The soundness of interleaving analysis and transformation,

2929 even when not all speculative transformations are performed on later

2930 iterations, was shown by

2935 Most dataflow frameworks support only analysis, not transformation.

2936 The framework computes a fixed point of transfer functions, and it is

2937 up to the client of

2938 the framework to use that fixed point for transformation.

2939 Omitting transformation makes it much easier to build frameworks,

2940 and one can find a spectrum of designs.

2941 We~describe two representative

2942 designs, then move on to the prior frameworks that support interleaved

2943 analysis and transformation.

2945 The Soot framework is designed for analysis of Java programs

2947 best for Soot in general, but there doesn't appear

2948 to be any formal publication that actually details the dataflow

2949 framework part. ---JD}

2950 While Soot's dataflow library supports only analysis, not

2951 transformation, we found much

2952 to admire in its design.

2953 Soot's library is abstracted over the representation of

2954 the control-flow graph and the representation of instructions.

2955 Soot's interface for defining lattice and analysis functions is

2956 like our own,

2957 although because Soot is implemented in an imperative style,

2958 additional functions are needed to copy lattice elements.

2962 for same reason as Soot below ---JD}

2963 supports both analysis and rewriting of C~programs,

2964 but rewriting is clearly distinct from analysis:

2965 one runs an analysis to completion and then rewrites based on the

2966 results.

2967 The framework is limited to one representation of control-flow graphs

2968 and one representation of instructions, both of which are provided by

2969 the framework.

2970 The~API is complicated;

2971 much of the complexity is needed to enable the client to

2972 affect which instructions

2973 the analysis iterates over.

2977 but I'll be darned if I can find anything that might be termed a dataflow framework.}

2979 The Whirlwind compiler contains the dataflow framework implemented

2981 interleave analysis and transformation.

2982 Their implementation is much like our early efforts:

2983 it is a complicated mix of code that simultaneously manages interleaving,

2984 deep rewriting, and fixed-point computation.

2985 By~separating these tasks,

2986 our implementation simplifies the problem dramatically.

2987 Whirlwind's implementation also suffers from the difficulty of

2988 maintaining pointer invariants in a mutable representation of

2989 control-flow graphs, a problem we have discussed elsewhere

2992 Because speculative transformation is difficult in an imperative setting,

2993 Whirlwind's implementation is split into two phases.

2994 The first phase runs the interleaved analyses and transformations

2995 to compute the final dataflow facts and a representation of the transformations

2996 that should be applied to the input graph.

2997 The second phase executes the transformations.

2998 In~\hoopl, because control-flow graphs are immutable, speculative transformations

2999 can be applied immediately, and there is no need

3000 for a phase distinction.

3002 %%% % repetitious...

3003 %%%

3004 %%% \ourlib\ also improves upon Whirlwind's dataflow framework by providing

3005 %%% new support for the optimization writer:

3006 %%% \begin{itemize}

3007 %%% \item Using static type guarantees, \hoopl\ rules out a whole

3008 %%% class of possible bugs: transformations that produced malformed

3009 %%% control-flow graphs.

3010 %%% \item Using dynamic testing,

3011 %%% we can isolate the rewrite that transforms a working program

3012 %%% into a faulty program,

3013 %%% using Whalley's \citeyearpar{whalley:isolation} fault-isolation technique.

3014 %%% \end{itemize}

3016 %% what follows now looks redundant with discussion below ---NR

3018 %% In previous work \cite{ramsey-dias:applicative-flow-graph}, we

3019 %% described a zipper-based representation of control-flow

3020 %% graphs, stressing the advantages

3021 %% of immutability.

3022 %% Our new representation, described in \secref{graph-rep}, is a significant improvement:

3023 %% \begin{itemize}

3024 %% \item

3025 %% We can concatenate nodes, blocks, and graphs in constant time.

3026 %% %Previously, we had to resort to Hughes's

3027 %% %\citeyearpar{hughes:lists-representation:article} technique, representing

3028 %% %a graph as a function.

3029 %% \item

3030 %% We can do a backward analysis without having

3031 %% to ``unzip'' (and allocate a copy of) each block.

3032 %% \item

3033 %% Using GADTs, we can represent a flow-graph

3034 %% node using a single type, instead of the triple of first, middle, and

3035 %% last types used in our earlier representation.

3036 %% This change simplifies the interface significantly:

3037 %% instead of providing three transfer functions and three rewrite

3038 %% functions per pass---one for

3039 %% each type of node---a client of \hoopl\ provides only one transfer

3040 %% function and one rewrite function per pass.

3041 %% \item

3042 %% Errors in concatenation are ruled out at

3043 %% compile time by Haskell's static

3044 %% type system.

3045 %% In~earlier implementations, such errors were not detected until

3046 %% the compiler~ran, at which point we tried to compensate

3047 %% for the errors---but

3048 %% the compensation code harbored subtle faults,

3049 %% which we discovered while developing a new back end

3050 %% for the Glasgow Haskell Compiler.

3051 %% \end{itemize}

3052 %%

3053 %% The implementation of \ourlib\ is also much better than

3054 %% our earlier implementations.

3055 %% Not only is the code simpler conceptually,

3056 %% but it is also shorter:

3057 %% our new implementation is about a third as long

3058 %% as the previous version, which is part of GHC, version 6.12.

3065 Our work on \hoopl\ is too new for us to be able to say much

3066 about performance.

3067 It's~important to know how well \hoopl\ performs, but the

3068 question is comparative, and there isn't another library we can compare

3069 \hoopl\ with.

3070 For example, \hoopl\ is not a drop-in replacement for an existing

3071 component of GHC; we introduced \hoopl\ to GHC as part of a

3072 major refactoring of GHC's back end.

3074 the previous version, and we

3075 don't know what portion of the slowdown can be attributed to the

3076 optimizer.

3077 %

3078 What we can say is that the costs of using \hoopl\ seem reasonable;

3079 there is no ``big performance hit.''

3081 language, actually improved performance in an apples-to-apples

3082 comparison with a library using a mutable control-flow graph

3085 Although a~thorough evaluation of \hoopl's performance must await

3086 future work, we can identify some design decisions that affect

3087 performance.

3089 \item

3091 Using this representation, a block of $N$~nodes is represented using

3093 We~have also implemented a representation of blocks that include

3094 ``cons-like'' and ``snoc-like'' constructors;

3095 this representation requires only $N+1$ heap objects.

3096 We~don't know what difference this choice makes to performance.

3097 \item

3099 rewrites the body of a control-flow graph, and @fixpoint@ iterates

3100 this analysis until it reaches a fixed point.

3101 Decorated graphs computed on earlier iterations are thrown away.

3102 But~for each decorated graph of $N$~nodes, it is necessary to allocate

3104 @singletonDG@ in~@node@ and of @dgSplice@ in~@cat@.

3105 In~an earlier version of \hoopl, this overhead was

3106 eliminated by splitting @arfGraph@ into two very similar functions: one to compute the

3107 fixed point, and the other to produce the rewritten graph.

3108 Having a single version of @arfGraph@ is simpler and easier

3109 to maintain, but we don't know the cost of allocating the

3110 additional thunks.

3111 \item

3112 The representation of a forward-transfer function is private to

3113 \hoopl.

3114 Two representations are possible:

3115 we may store a triple of functions, one for each shape a node may

3116 have;

3117 or we may store a single, polymorphic function.

3118 If~we use triples throughout, the costs are straightforward, but the

3119 code is complex.

3120 If~we use a single, polymorphic function, we sometimes have to use a

3122 transfer functions.

3123 Using a shape classifier may introduce extra @case@ discriminations

3124 every time a transfer function or rewrite function is applied to a

3125 node.

3126 We~don't know how these extra discriminations might affect

3127 performance.

3129 In summary, \hoopl\ performs well enough for use in~GHC,

3130 but there is much we don't know. Systematic investigation

3131 is indicated.

3137 We built \hoopl\ in order to combine three good ideas (interleaved

3138 analysis and transformation, optimization fuel, and an applicative

3139 control-flow graph) in a way that could easily be reused by many

3140 compiler writers.

3141 To~evaluate how well we succeeded, we examine how \hoopl\ has been

3142 used,

3143 we~examine the API, and we examine the implementation.

3147 As~suggested by the constant-propagation example in

3149 dataflow analyses.

3150 Students using \hoopl\ in a class at Tufts were able to implement

3152 and induction-variable elimination

3154 Students at Yale and at Portland State have also succeeded in building

3155 a variety of optimizations.

3157 \hoopl's data types can support optimizations beyond classic

3158 dataflow.

3159 For example, within GHC, \hoopl's graphs are used

3160 to implement optimizations based on control flow,

3161 such as eliminating branch chains.

3163 \hoopl\ is SSA-neutral:

3164 although we know of no attempt to use

3165 \hoopl\ to establish or enforce SSA~invariants,

3167 representation of first nodes,

3168 and if a transformation preserves SSA~invariants, it will continue to do

3169 so when implemented in \hoopl.

3174 itself,

3175 but there are a couple of properties we think are worth highlighting.

3176 First, it's a good sign that the API provides many higher-order

3177 combinators that make it easier to write client code. % with simple, expressive types.

3178 We~have had space to mention only a few:

3179 @extendJoinDomain@,

3180 @thenFwdRw@, @deepFwdRw@, and @pairFwd@.

3182 Second,

3183 the static encoding of open and closed shapes at compile time worked

3184 out well.

3185 % especially because it applies equally to nodes, blocks, and graphs.

3186 Shapes may

3187 seem like a small refinement, but they helped eliminate a number of

3188 bugs from GHC, and we expect them to help other clients too.

3189 GADTs are a convenient way to express shapes, and for clients

3190 written in Haskell, they are clearly appropriate.

3191 If~one wished to port \hoopl\ to a language without GADTs,

3192 many of the benefits could be realized by making the shapes phantom

3193 types, but without GADTs, pattern matching would be significantly more

3194 tedious and error-prone.

3197 % An~advantage of our ``shapely'' node API is that a client can

3198 % write a \emph{single} transfer function that is polymorphic in shape.

3199 % To~make this design work, however, we \emph{must} have

3200 % the type-level function

3201 % @Fact@ (\figref{api-types}), to express how incoming

3202 % and outgoing facts depend on the shape of a node.

3203 % Without type-level functions, we would have had to force clients to

3204 % use \emph{only} the triple-of-functions interface described in

3205 % \secref{triple-of-functions}.

3209 If you are thinking of adopting \hoopl, you had better consider not

3210 only whether you like the API, but whether, in case of emergency, you

3211 could maintain the implementation.

3213 implementation is a clear improvement over previous implementations

3214 of similar ideas.

3215 % The implementation is more difficult to evaluate than the~API.

3216 % Previous implementations of similar ideas have rolled the problems

3217 % into a big ball of mud.

3218 By~decomposing our implementation into @node@, @block@, @body@,

3219 @graph@, @cat@, @fixpoint@, and @mkFRewrite@, we have clearly separated

3220 multiple concerns:

3221 interleaving analysis with rewriting,

3222 throttling rewriting using optimization fuel,

3223 and

3224 computing a fixed point using speculative rewriting.

3225 Because of this separation of concerns,

3226 we believe our implementation will be much easier to maintain than

3227 anything that preceded it.

3229 Another good sign is that we have been able to make substantial

3230 changes in the implementation without changing the API.

3231 For example, in addition to ``@concatMap@ style,'' we have also

3232 implemented @arfGraph@ in ``fold style'' and in continuation-passing

3233 style.

3234 Which style is preferred is a matter of taste, and possibly

3235 a matter of performance.

3240 \iffalse

3242 (We have also implemented a ``fold style,'' but because the types are

3243 a little less intuitive, we are sticking with @concatMap@ style for now.)

3246 > Some numbers, I have used it nine times, and would need the general fold

3247 > once to define blockToNodeList (or CB* equivalent suggested by you).

3248 > (We are using it in GHC to

3249 > - computing hash of the blocks from the nodes

3250 > - finding the last node of a block

3251 > - converting block to the old representation (2x)

3252 > - computing interference graph

3253 > - counting Area used by a block (2x)

3254 > - counting stack high-water mark for a block

3255 > - prettyprinting block)

3258 type-parameter hell, newtype hell, typechecking hell, instance hell,

3259 triple hell

3261 \fi

3264 % We have spent six years implementing and reimplementing frameworks for

3265 % dataflow analysis and transformation.

3266 % This formidable design problem taught us

3267 % two kinds of lessons:

3268 % we learned some very specific lessons about representations and

3269 % algorithms for optimizing compilers,

3270 % and we were forcibly reminded of some very general, old lessons that are well

3271 % known not just to functional programmers, but to programmers

3272 % everywhere.

3276 %%%% \remark{Orphaned: but for transfer functions that

3277 %%%% approximate weakest preconditions or strongest postconditions,

3278 %%%% monotonicity falls out naturally.}

3279 %%%%

3280 %%%%

3281 %%%% In conclusion we offer the following lessons from the experience of designing

3282 %%%% and implementing \ourlib{}.

3283 %%%% \begin{itemize}

3284 %%%% \item

3285 %%%% Although we have not stressed this point, there is a close connection

3286 %%%% between dataflow analyses and program logic:

3287 %%%% \begin{itemize}

3288 %%%% \item

3289 %%%% A forward dataflow analysis is represented by a predicate transformer

3290 %%%% that is related to \emph{strongest postconditions}

3291 %%%% \cite{floyd:meaning}.\footnote

3292 %%%% {In Floyd's paper the output of the predicate transformer is called

3293 %%%% the \emph{strongest verifiable consequent}, not the ``strongest

3294 %%%% postcondition.''}

3295 %%%% \item

3296 %%%% A backward dataflow analysis is represented by a predicate transformer

3297 %%%% that is related to \emph{weakest preconditions} \cite{dijkstra:discipline}.

3298 %%%% \end{itemize}

3299 %%%% Logicians write down the predicate transformers for the primitive

3300 %%%% program fragments, and then use compositional rules to ``lift'' them

3301 %%%% to a logic for whole programs. In the same way \ourlib{} lets the client

3302 %%%% write simple predicate transformers,

3303 %%%% and local rewrites based on those assertions, and ``lifts'' them to entire

3304 %%%% function bodies with arbitrary control flow.

3306 \iffalse

3309 Reuse requires abstraction, and as is well known,

3310 designing good abstractions is challenging.

3311 \hoopl's data types and the functions over those types have been

3314 As~we were refining our design, we~found it invaluable to operate in

3315 two modes:

3316 In the first mode, we designed, built, and used a framework as an

3318 In the second mode, we designed and built a standalone library, then

3319 redesigned and rebuilt it, sometimes going through several significant

3320 changes in a week.

3321 Operating in the first mode---inside a live compiler---forced us to

3322 make sure that no corners were cut, that we were solving a real

3323 problem, and that we did not inadvertently

3324 cripple some other part of the compiler.

3325 Operating in the second mode---as a standalone library---enabled us to

3326 iterate furiously, trying out many more ideas than would have

3327 been possible in the first mode.

3328 Alternating between these two modes has led to a

3329 better design than operating in either mode alone.

3331 %% We were forcibly reminded of timeless truths:

3332 It is a truth universally acknowledged that

3333 interfaces are more important than implementations and data

3334 is more important than code.

3335 This truth is reflected in this paper, in which

3336 we

3339 We have evaluated \hoopl's API through small, focused classroom

3340 projects and by using \hoopl\ in the back end of the Glasgow Haskell

3341 Compiler.

3345 We were also reminded that Haskell's type system (polymorphism, GADTs,

3346 higher-order functions, type classes, and so on) is a remarkably

3347 effective

3348 language for thinking about data and code---and that

3349 Haskell lacks a language of interfaces (like ML's signatures) that

3350 would make it equally effective for thinking about APIs at a larger scale.

3351 Still, as usual, the types were a remarkable aid to writing the code:

3352 when we finally agreed on the types presented above, the

3353 code almost wrote itself.

3355 Types are widely appreciated at ICFP, but here are three specific

3356 examples of how types helped us:

3358 \item

3359 Reuse is enabled by representation-independence, which in a functional

3360 language is

3361 expressed through parametric polymorphism.

3363 made the code simpler, easier to understand, and easier to maintain.

3365 \ourlib\ must know about nodes, and to embody that knowledge in the

3367 \item

3369 %

3370 % this paper is just not about run-time performance ---NR

3371 %

3372 %%%% Moreover, the implementation is faster than it would otherwise be,

3373 %%%% because, say, a @(Fact O f)e@ is known to be just an @f@ rather than

3374 %%%% being a sum type that must be tested (with a statically known outcome!).

3375 %

3377 to nodes, blocks, and graphs helped our

3378 thinking and helped to structure the implementation.

3379 \item

3382 \fi

3386 Why do we allow the client to define the monad~@m@ used in rewrite

3387 functions and in @analyzeAndRewriteFwdBody@?

3389 The~obvious alternative, which we have implemented and explored, is to require

3391 This alternative has advantages:

3392 we implement

3393 @checkpoint@, @restart@,

3394 @setFuel@, and @getFuel@,

3395 and we make sure they are right, without placing any obligation on

3396 the client.

3397 Our~alternative API also provides a supply of unique names.

3398 But~we are wary of mandating the representation of such a central

3399 abstraction as a supply of unique names;

3400 no~matter how well designed the~API,

3401 if it does not accommodate design choices already embodied in existing

3402 compilers, it won't be used.

3403 %

3404 And~experience has shown us that for the client, the convenience

3405 of a custom monad is well worth the cognitive costs of understanding

3406 the more complex API and of meeting the contracts for

3407 instances of @FuelMonad@ and @CkpointMonad@.

3408 As~a very simple example, a client might want one set of

3409 unique names for labels and a different set for registers.

3410 Or~in order to judge the effectiveness of an optimization,

3411 a client might want to log how many rewrites take place, or in what

3412 functions they take place.

3414 Primitive Redex

3415 Speculation, a code-improving transformation that can create new

3416 function definitions.

3417 A~\hoopl\ client implementing this transformation would define a monad

3418 to accumulate the new definitions.

3419 In~these examples, the law governing @checkpoint@ and @restart@

3420 ensures that a speculative rewrite, if later rolled back, does not

3421 create a log entry or a function definition.

3422 }

3425 In~addition to major API variations like ``who controls the monad,''

3426 we have considered myriad minor variations,

3427 most of them driven by

3428 implementation concerns.

3429 There are some variations in implementation that don't affect the

3430 interface, but there are also many that have small effects.

3431 We~have space only to raise a few of the questions.

3432 An~earlier implementation of the fixed-point computation stored the

3433 ``current'' @FactBase@ in a monad; we~find it easier to understand and

3434 maintain the code that passes the current @FactBase@ as an argument.

3437 We~have also implemented a ``fold style'' and continuation-passing

3438 style.

3439 No~one of these styles obviously makes all the code easiest to read

3440 and understand: @concatMap@ style is relatively straightforward throughout;

3441 fold~style is similar overall but different in detail;

3442 and continuation-passing style is very clear and elegant to those who

3443 like continuations, but a bit baffling to those who don't.

3444 Another question is which value constructors should be

3445 polymorphic in the shapes of their arguments and which should be

3446 monomorphic.

3447 We~experimented with a polymorphic

3449 `BNode :: n e x -> Block n e x

3451 but we found that there are significant advantages to knowing the type

3452 of every node statically, using purely local information---so instead

3453 we use

3454 the three monomorphic constructors @BFirst@, @BMiddle@, and @BLast@

3456 Similar questions arise about the polymorphic @BCat@ and about the

3457 graph constructors, and even among ourselves, we are divided about how

3458 best to answer them.

3459 Yet another question is whether it is worthwhile to save a level of

3460 indirection by providing a cons-like constructor to concatenate a node

3461 and a block.

3462 Perhaps some of these questions can be answered by appealing to

3463 performance, but the experiments that will provide the answers have

3464 yet to be devised.
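The monomorphic-constructor choice can be sketched as follows. With one constructor per shape, the type of every node is known statically from purely local information; the @Node@ type below is illustrative, and the constructors follow the @BFirst@/@BMiddle@/@BLast@/@BCat@ names used above.

```haskell
{-# LANGUAGE GADTs #-}
-- Phantom openness tags.
data O
data C

-- Illustrative node type, one constructor per shape.
data Node e x where
  Label  :: String -> Node C O
  Assign :: String -> Node O O
  Branch :: String -> Node O C

-- Monomorphic block constructors, as in the text.
data Block n e x where
  BFirst  :: n C O -> Block n C O
  BMiddle :: n O O -> Block n O O
  BLast   :: n O C -> Block n O C
  BCat    :: Block n e O -> Block n O x -> Block n e x

-- Because the constructors are monomorphic, a block that is closed
-- on entry can begin only with BFirst or BCat; the type rules out
-- BMiddle and BLast, so this match is exhaustive.
firstNode :: Block n C x -> n C O
firstNode (BFirst n) = n
firstNode (BCat b _) = firstNode b

example :: Block Node C C
example =
  BFirst (Label "L1") `BCat` BMiddle (Assign "x")
                      `BCat` BLast (Branch "L2")
```

With a single polymorphic @BNode :: n e x -> Block n e x@ instead, @firstNode@ would need a dynamic shape test, since the constructor alone would say nothing about the node's type.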

3472 Dataflow optimization is usually described as a way to improve imperative

3473 programs by mutating control-flow graphs.

3474 Such transformations appear very different from the tree rewriting

3475 that functional languages are so well known for, and that makes

3476 \ifhaskellworkshop

3477 Haskell

3478 \else

3479 functional languages

3480 \fi

3481 so attractive for writing other parts of compilers.

3482 But even though dataflow optimization looks very different from

3483 what we are used to,

3484 writing a dataflow optimizer

3485 in

3486 \ifhaskellworkshop

3487 Haskell

3488 \else

3489 a pure functional language

3490 \fi

3491 was a win:

3492 % We could not possibly have conceived \ourlib{} in C++.

3493 we had to make every input and output explicit,

3494 and we had a strong incentive to implement things compositionally.

3495 Using Haskell helped us make real improvements in the implementation of

3496 some very sophisticated ideas.

3497 % %%

3498 % %%

3499 % %% In~a pure functional language, not only do we know that

3500 % %% no data structure will be unexpectedly mutated,

3501 % %% but we are forced to be

3502 % %% explicit about every input and output,

3503 % %% and we are encouraged to implement things compositionally.

3504 % %% This kind of thinking has helped us make

3505 % %% significant improvements to the already tricky work of Lerner, Grove,

3506 % %% and Chambers:

3507 % %% per-function control of shallow vs deep rewriting

3508 % %% (\secref{shallow-vs-deep}),

3509 % %% optimization fuel (\secref{fuel}),

3510 % %% and transparent management of unreachable blocks (\secref{arf-body}).

3511 % We~trust that the improvements are right only because they are

3512 % implemented in separate

3513 % parts of the code that cannot interact except through

3514 % explicit function calls.

3515 % %% %%

3516 % %% %%An ancestor of \ourlib{} is in the Glasgow Haskell Compiler today,

3517 % %% %%in version~6.12.

3518 % %% With this new, improved design in hand, we are now moving back to

3519 % %% live-compiler mode, pushing \hoopl\ into version

3520 % %% 6.13 of the Glasgow Haskell Compiler.

3523 \acks

3525 Brian Huffman and Graham Hutton helped with algebraic laws.

3526 Sukyoung Ryu told us about Primitive Redex Speculation.

3527 Several anonymous reviewers helped improve the presentation.

3528 % , especially reviewer~C, who suggested better language with which to

3529 % describe our work.

3531 The first and second authors were funded

3532 by a grant from Intel Corporation and

3534 These authors also thank Microsoft Research Ltd, UK, for funding

3535 extended visits to the third author.

3538 \makeatother

3545 \clearpage

3547 \appendix

3549 % omit LabelSet :: *

3550 % omit LabelMap :: *

3551 % omit delFromFactBase :: FactBase f -> [(Label,f)] -> FactBase f

3552 % omit elemFactBase :: Label -> FactBase f -> Bool

3553 % omit elemLabelSet :: Label -> LabelSet -> Bool

3554 % omit emptyLabelSet :: LabelSet

3555 % omit factBaseLabels :: FactBase f -> [Label]

3556 % omit extendFactBase :: FactBase f -> Label -> f -> FactBase f

3557 % omit extendLabelSet :: LabelSet -> Label -> LabelSet

3558 % omit lookupFact :: FactBase f -> Label -> Maybe f

3559 % omit factBaseList :: FactBase f -> [(Label, f)]

3561 %% \section{Code for \textmd{\texttt{fixpoint}}}

3562 %% \label{app:fixpoint}

3563 %%

3564 %% {\def\baselinestretch{0.95}\hfuzz=20pt

3565 %% \begin{smallcode}

3566 %% data `TxFactBase n f

3567 %% = `TxFB { `tfb_fbase :: FactBase f

3568 %% , `tfb_rg :: DG n f C C -- Transformed blocks

3569 %% , `tfb_cha :: ChangeFlag

3570 %% , `tfb_lbls :: LabelSet }

3571 %% -- Set the tfb_cha flag iff

3572 %% -- (a) the fact in tfb_fbase for a block L changes

3573 %% -- (b) L is in tfb_lbls.

3574 %% -- The tfb_lbls are all Labels of the *original*

3575 %% -- (not transformed) blocks

3576 %%

3577 %% `updateFact :: DataflowLattice f -> LabelSet -> (Label, f)

3578 %% -> (ChangeFlag, FactBase f)

3579 %% -> (ChangeFlag, FactBase f)

3580 %% updateFact ^lat ^lbls (lbl, ^new_fact) (^cha, fbase)

3581 %% | NoChange <- ^cha2 = (cha, fbase)

3582 %% | lbl `elemLabelSet` lbls = (SomeChange, new_fbase)

3583 %% | otherwise = (cha, new_fbase)

3584 %% where

3585 %% (cha2, ^res_fact)

3586 %% = case lookupFact fbase lbl of

3587 %% Nothing -> (SomeChange, new_fact)

3588 %% Just ^old_fact -> fact_extend lat old_fact new_fact

3589 %% ^new_fbase = extendFactBase fbase lbl res_fact

3590 %%

3591 %% fixpoint :: forall n f. NonLocal n

3592 %% => Bool -- Going forwards?

3593 %% -> DataflowLattice f

3594 %% -> (Block n C C -> FactBase f

3595 %% -> FuelMonad (DG n f C C, FactBase f))

3596 %% -> FactBase f -> [(Label, Block n C C)]

3597 %% -> FuelMonad (DG n f C C, FactBase f)

3598 %% fixpoint ^is_fwd lat ^do_block ^init_fbase ^blocks

3599 %% = do { ^fuel <- getFuel

3600 %% ; ^tx_fb <- loop fuel init_fbase

3601 %% ; return (tfb_rg tx_fb,

3602 %% tfb_fbase tx_fb `delFromFactBase` blocks) }

3603 %% -- The outgoing FactBase contains facts only for

3604 %% -- Labels *not* in the blocks of the graph

3605 %% where

3606 %% `tx_blocks :: [(Label, Block n C C)]

3607 %% -> TxFactBase n f -> FuelMonad (TxFactBase n f)

3608 %% tx_blocks [] tx_fb = return tx_fb

3609 %% tx_blocks ((lbl,blk):bs) tx_fb = tx_block lbl blk tx_fb

3610 %% >>= tx_blocks bs

3611 %%

3612 %% `tx_block :: Label -> Block n C C

3613 %% -> TxFactBase n f -> FuelMonad (TxFactBase n f)

3614 %% tx_block ^lbl ^blk tx_fb@(TxFB { tfb_fbase = fbase

3615 %% , tfb_lbls = lbls

3616 %% , tfb_rg = ^blks

3617 %% , tfb_cha = cha })

3618 %% | is_fwd && not (lbl `elemFactBase` fbase)

3619 %% = return tx_fb -- Note [Unreachable blocks]

3620 %% | otherwise

3621 %% = do { (rg, ^out_facts) <- do_block blk fbase

3622 %% ; let (^cha', ^fbase')

3623 %% = foldr (updateFact lat lbls) (cha,fbase)

3624 %% (factBaseList out_facts)

3625 %% ; return (TxFB { tfb_lbls = extendLabelSet lbls lbl

3626 %% , tfb_rg = rg `DGCatC` blks

3627 %% , tfb_fbase = fbase'

3628 %% , tfb_cha = cha' }) }

3629 %%

3630 %% loop :: Fuel -> FactBase f -> FuelMonad (TxFactBase n f)

3631 %% `loop fuel fbase

3632 %% = do { let ^init_tx_fb = TxFB { tfb_fbase = fbase

3633 %% , tfb_cha = NoChange

3634 %% , tfb_rg = DGNil

3635 %% , tfb_lbls = emptyLabelSet}

3636 %% ; tx_fb <- tx_blocks blocks init_tx_fb

3637 %% ; case tfb_cha tx_fb of

3638 %% NoChange -> return tx_fb

3639 %% SomeChange -> setFuel fuel >>

3640 %% loop fuel (tfb_fbase tx_fb) }

3641 %% \end{smallcode}

3642 %% \par

3643 %% } % end \baselinestretch

3648 This appendix lists every nontrivial identifier used in the body of

3649 the paper.

3650 For each identifier, we list the page on which that identifier is

3651 defined or discussed---or when appropriate, the figure (with line

3652 number where possible).

3653 For those few identifiers not defined or discussed in text, we give

3654 the type signature and the page on which the identifier is first

3655 referred to.

3657 Some identifiers used in the text are defined in the Haskell Prelude;

3658 for those readers less familiar with Haskell (possibly even at the

3659 Haskell Symposium!), these identifiers are

3666 \let\hsprelude\dropit

3675 \noindent

3678 \else

3680 \fi

3681 }

3683 \noindent

3686 \else

3688 \fi

3689 }

3693 \noindent

3696 \else

3697 \texttt{#2} {let}- or $\lambda$-bound on \lineref{#1} of Figure~\ref{#3} on page~\pageref{#3}.\\

3698 \fi

3699 }

3710 \else

3712 \fi

3713 }

3715 \newif\ifundefinedsection\undefinedsectionfalse

3718 \ifundefinedsection

3720 \else

3721 \undefinedsectiontrue

3722 \par

3725 \fi

3726 }

3728 \begingroup

3729 \raggedright

3734 \undefinedsectionfalse

3738 \ifundefinedsection

3740 \else

3741 \undefinedsectiontrue

3742 \par

3745 \fi

3746 }

3747 \let\hspagedef\dropit

3748 \let\omithspagedef\dropit

3749 \let\omithsfigdef\dropit

3750 \let\hsfigdef\dropit

3751 \let\hstabdef\dropit

3752 \let\hspagedefll\dropit

3753 \let\hsfigdefll\dropit

3754 \let\nothspagedef\dropit

3755 \let\nothsfigdef\dropit

3756 \let\nothslinedef\dropit

3763 \endgroup

3766 \iffalse

3786 \fi

3795 THE FUEL PROBLEM:

3798 Here is the problem:

3800 A graph has an entry sequence, a body, and an exit sequence.

3801 Correctly computing facts on and flowing out of the body requires

3802 iteration; computation on the entry and exit sequences do not, since

3803 each is connected to the body by exactly one flow edge.

3805 The problem is to provide the correct fuel supply to the combined

3806 analysis/rewrite (iterator) functions, so that speculative rewriting

3807 is limited by the fuel supply.

3809 I will number iterations from 1 and name the fuel supplies as

3810 follows:

3812 f_pre fuel remaining before analysis/rewriting starts

3813 f_0 fuel remaining after analysis/rewriting of the entry sequence

3814 f_i, i>0 fuel remaining after iteration i of the body

3815 f_post fuel remaining after analysis/rewriting of the exit sequence

3817 The issue here is that only the last iteration of the body 'counts'.

3818 To formalize, I will name fuel consumed:

3820 C_pre fuel consumed by speculative rewrites in entry sequence

3821 C_i fuel consumed by speculative rewrites in iteration i of body

3822 C_post fuel consumed by speculative rewrites in exit sequence

3824 These quantities should be related as follows:

3826 f_0 = f_pre - C_pre

3827 f_i = f_0 - C_i where i > 0

3828 f_post = f_n - C_post where iteration converges after n steps

3830 When the fuel supply is passed explicitly as parameter and result, it

3831 is fairly easy to see how to keep reusing f_0 at every iteration, then

3832 extract f_n for use before the exit sequence. It is not obvious to me

3833 how to do it cleanly using the fuel monad.

3836 Norman
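The explicit-fuel solution alluded to in the note can be sketched directly. Each iteration of the body restarts from f_0, and only when the pass reports no change is its leftover fuel (f_n) kept for the exit sequence; @fixpointFuel@ and its @body@ argument are illustrative names, not \hoopl\ code.

```haskell
type Fuel = Int

-- One analyze/rewrite pass over the body takes the current fact and
-- fuel, and reports the new fact, whether anything changed, and the
-- fuel remaining after the pass.
fixpointFuel :: (fact -> Fuel -> (fact, Bool, Fuel))
             -> fact -> Fuel -> (fact, Fuel)
fixpointFuel body fact f0 =
  let (fact', changed, fn) = body fact f0
  in if changed
       then fixpointFuel body fact' f0  -- restart from f_0: this
                                        -- iteration's fuel use is
                                        -- speculative and discarded
       else (fact', fn)                 -- converged: f_n flows on
```

This is the same accounting the monadic @fixpoint@ performs with @getFuel@ and @setFuel@: saving the fuel before the loop and restoring it whenever another iteration is needed.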