\iffalse

Payload: alternatives for functions polymorphic in node type

  (n C O -> a, n O O -> b, n O C -> c)

vs

  (forall e x . n e x -> ShapeTag e x, forall e x . n e x -> result e x)

where

  data ShapeTag e x where
    First  :: ShapeTag C O
    Middle :: ShapeTag O O
    Last   :: ShapeTag O C

  result C O = a
  result O O = b
  result O C = c

The function returning ShapeTag can be passed in a type-class
dictionary.
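A minimal sketch of the dispatching idea, assuming a hypothetical @Shaped@ class that supplies the tag (all names here are illustrative, not part of Hoopl's API):

```haskell
{-# LANGUAGE GADTs, EmptyDataDecls #-}

data O  -- open
data C  -- closed

data ShapeTag e x where
  First  :: ShapeTag C O
  Middle :: ShapeTag O O
  Last   :: ShapeTag O C

-- Hypothetical class: the type-class dictionary carries the
-- shape-reporting function mentioned above.
class Shaped n where
  shape :: n e x -> ShapeTag e x

-- Recover a shape-polymorphic function from a triple of monomorphic
-- ones: matching on the tag refines e and x, so each call typechecks.
dispatch :: Shaped n
         => (n C O -> r) -> (n O O -> r) -> (n O C -> r)
         -> n e x -> r
dispatch f m l n = case shape n of
  First  -> f n
  Middle -> m n
  Last   -> l n
```
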

Say something about the proliferation of heavyweight type signatures
required for GADT pattern matches. When the signature is three times
the size of the function, something is wrong...

I'm going to leave this point out, because in all the client code
we've written, we are (a) matching only on a node, and (b) matching on
all cases. In this scenario the GADT exhaustiveness checker provides
no additional benefits. Indeed, GADTs can be an irritant to the
client: in many pattern matches, GADTs make it impossible to default
cases.

I was thinking again about unwrapped nodes, cons-lists, snoc-lists,
and tree fringe. I think there's an analogy with ordinary lists, and
the analogy is 'concatMap' vs 'fold'. Just as with lists, the common
unwrapping of elements. We nevertheless prefer 'concatMap' because it
provides better separation of concerns. But this analogy suggests
several ideas:

- Can block processing be written in higher-order fashion?

- Can it be done in both styles (concatMap *and* fold)?

- Once we have framed the problem in these terms, can we write
  fold-style code that is not too hard to understand and maintain?

- Finally, is there an analog of stream fusion that would work for
  control-flow graphs, enabling us to write some graph combinators
  that are both perspicuous and efficient?

These are good observations for the paper and for future work.

----------------------------------------------------------------

P.S. The three of us should have a nice little Skype chat about
higher-rank types. A lot of the questions still at issue boil down to
the following observations:

- It's really convenient for the *implementation* of Hoopl to put
  forall to the left of an arrow. Polymorphic functions as
  arguments are powerful and keen, and they make it really easy for
  Hoopl to do its job.

- It's a bit inconvenient for a *client* of Hoopl to be forced to
  supply functions that are polymorphic in shape. All constructors
  have to be written out by hand; defaulting is not possible. (Or
  at least John and I haven't figured out how to do it.)

- Using higher-rank types in Hoopl's interface makes some very
  desirable things downright impossible:

  - One can't write a generic dominator analysis with type

      dom :: NonLocal n => FwdPass n Dominators

  - One can't write a generic debugging wrapper of type

      debug :: (Show n, Show f)
            => TraceFn -> WhatToTrace -> FwdPass n f -> FwdPass n f

  - One can't write a combinator of type

      product :: FwdPass n f -> FwdPass n f' -> FwdPass n (f, f')

  I submit that these things are all very desirable.

I'm therefore strongly in favor of removing the *requirement* that a
FwdPass include transfer and rewrite functions that are polymorphic in
shape. It will be a bit tedious to return to triples of functions,
but for those lucky clients who *can* write polymorphic functions and
who wish to, we can provide injection functions.

I should stress that I believe the tedium can be contained within
reasonable bounds---for example, the arfXXX functions that are
internal to Hoopl all remain polymorphic in shape (although not
higher-rank any longer).

\fi

%
% TODO
%
%
% AGraph = graph under construction
% Graph  = replacement graph
%
%%% Hoopl assumes that the replacement graph for a node open at the exit
%%% doesn't contain any additional exits
%%%
%%% introduce replacement graph in section 3, after graph.
%%% it has arbitrarily complicated internal control flow, but only
%%% one exit (if open at the exit)
%%%
%%% a requirement on the client that is not checked statically.

\newif\ifnoauthornotes \noauthornotesfalse

% \timestampfalse  % it's post-submission time

\genkillfalse

\newif\ifnotesinmargin \notesinmarginfalse

% higher-order optimization library
%   ('Hoople' was taken -- see hoople.org)
\let\hoopl\ourlib

% l2h substitution ourlib Hoopl
% l2h substitution hoopl Hoopl

\makeatletter

\let\c@table=
}
\else
\fi
\else
\fi
}
}
\noindent
%{\small\textit{\def\rmdefault{cmr}\rmfamily\phantom{00}\phantom{: \,}}}%
\else
\fi
\else
\verbatim
}
\makeatother

\makeatletter
\makeatother

\let\cite\citep

%%
%% 2009/05/10: removed 'float' package because it breaks multiple
%% \caption's per {figure} environment.  ---NR
%%
%% % Put figures in boxes --- WHY??? --NR
%% \usepackage{float}
%% \floatstyle{boxed}
%% \restylefloat{figure}
%% \restylefloat{table}

% ON LINE THREE, set \noauthornotestrue to suppress notes (or not)

%\newcommand{\qed}{QED}
\ifnotesinmargin
\ifvmode
\else
\else
% Simon: please set \notesinmarginfalse on the first line
\fi
\ifnoauthornotes
\fi

\let\remark\norman

% \let \finalremark \remark   % uncomment after submission

\iftimestamp
\fi

%\title{\ourlib: Dataflow Optimization Made Simple}
%\title{Implementing Dataflow Analysis and Optimization by Lifting Node Functions to Basic Blocks and Control-Flow Graphs}
%\subtitle{Programming pearl}

%% \titlebanner{\textsf{\mdseries\itshape
%%   Under consideration for publication in the ACM Haskell Symposium.
%%   Comments
%%   are welcome; please identify this version by the words
%%   \textbf{\mdfivewords}.
%% }}

\ifnoauthornotes
\makeatletter
\let\HyPsd@Warning=
\@gobble
\makeatother
\fi

% João

\maketitle

We present \ourlib, a Haskell library that makes it easy for a
compiler writer
to implement program transformations based on dataflow analyses.
A~client of \ourlib\ defines a representation of
logical assertions,
a transfer function that computes outgoing assertions from incoming
assertions,
and a rewrite function that improves code when improvements are
justified by the assertions.
\ourlib\ does the actual analysis and transformation.

\ourlib\ implements state-of-the-art algorithms:
Lerner, Grove, and Chambers's
composition of simple analyses and transformations, which achieves
the same precision as complex, handwritten
``super-analyses;''
isolating bugs in a client's code.
\ourlib's implementation is unique in that, unlike previous
implementations,
it carefully separates the tricky
elements of each of these algorithms, so that they can be examined and
understood independently.

\fi

Dataflow analysis and transformation of control-flow graphs is
pervasive in optimizing compilers, but it is typically tightly
unusually easy to define new analyses and
\ourlib's interface is modular and polymorphic,
and it offers unusually strong static guarantees.
The implementation
is also far from routine: it encapsulates
state-of-the-art algorithms (interleaved analysis and rewriting,
dynamic error isolation), and it cleanly separates their tricky elements
so that they can be understood independently.
%
%\ourlib\ will be the workhorse of a new
%back end for the Glasgow Haskell Compiler (version~6.14, forthcoming).

%% \emph{Reviewers:} code examples are indexed at {\small\url{http://bit.ly/jkr3K}}
%%% Source: http://www.cs.tufts.edu/~nr/drop/popl-index.pdf

A mature optimizing compiler for an imperative language includes many
analyses, the results of which justify the optimizer's
code-improving transformations.
Many of the most important analyses and transformations---constant
propagation, live-variable analysis, inlining, sinking of loads,
and so on---should be regarded as particular cases of
%% \remark{I do not feel compelled to cite Muchnick (or anyone else) here}
Dataflow analysis is over thirty years old,
describing a powerful but subtle way to
piggybacks on the other.

Because optimizations based on dataflow analysis
share a common intellectual framework, and because that framework is
subtle, it is tempting to
try to build a single reusable library that embodies the
subtle ideas, while
making it easy for clients to instantiate the library for different
situations.
Tempting, but difficult.
Although some such frameworks exist, as we discuss
and none implements the Lerner/Grove/Chambers technique.

optimization library''), a new Haskell library for dataflow analysis and
optimization. It has the following distinctive characteristics:

\item
\ourlib\ is purely functional.
Perhaps surprisingly, code that
manipulates control-flow graphs is easier to write, and far easier
to write correctly, when written in a purely functional style
When analysis and rewriting
without knowing whether
the result of the rewrite will be retained or discarded,
the benefit of a purely functional style is intensified

\item
\ourlib\ is polymorphic. Just as a list library is
the nodes that inhabit graphs, and in the dataflow facts that

\item The paper by Lerner, Grove, and Chambers is inspiring but abstract.
We articulate their ideas in a concrete but simple API that hides
You provide a representation for assertions,
a transfer function that transforms assertions across a node,
and a rewrite function that uses an assertion to
justify rewriting a node.
\ourlib\ ``lifts'' these node-level functions to work over
control-flow graphs, sets up and solves recursion equations,
and interleaves rewriting with analysis.
Designing good abstractions (data types, APIs) is surprisingly
hard; we have been through over a dozen significantly different
iterations, and we offer our API as a contribution.
% ``in its own right'' -- there's an echo in here...

\item
Because the client
can perform very local reasoning (``@y@ is live before
\footnote
``if @x@~is live after @x:=y+2@ then @y@ is live before~it,''
analyses and transformations built on \ourlib\
are small, simple, and easy to get right.
Moreover, \ourlib\ helps you write correct optimizations:
it~statically rules out transformations that violate invariants
and dynamically it can help find the first transformation that introduces a fault
but I like this sentence with its claim to both static and dynamic assistance,
and maybe the open/closed story is hard to understand here.}

% \item \ourlib{} makes use of GADTs and type functions to offer unusually
%   strong static guarantees. In particular, nodes, basic blocks, and
%   graphs are all statically typed by their open or closedness on entry, and
%   their open or closedness on exit (\secref{graph-rep}). For example, an add instruction is
%   open on entry and exit, while a branch is open on entry and closed on exit.
%   Using these types we can statically guarantee that, say, an add instruction
%   is rewritten to a graph that is also open on both entry and exit; and
%   that the user cannot construct a block where an add instruction follows an
%   unconditional branch. We know of no other system that offers
%   static guarantees this strong.

(a)~interleaved analysis and rewriting, (b)~speculative rewriting,
(c)~computing fixed points, and (d)~dynamic fault isolation.
Previous implementations of these algorithms---including three of our
own---are complicated and hard to understand, because the tricky pieces
are implemented all together, inseparably.
A~significant contribution of this paper is
a new way to structure the implementation so that each tricky piece
is handled in just
%\remark{This is a very important claim---is
%it substantiated in \secref{engine}? And shouldn't the word
%\textbf{contribution} appear in this~\P?}
% \simon{Better?}  % yes thanks ---NR
The result is sufficiently elegant that
we emphasize the implementation as an object of interest in
its own right.

It is no toy: an ancestor of this library is
part of the Glasgow Haskell Compiler, where it optimizes the
nicer, and it will be in GHC shortly.

relatively sophisticated aspects of Haskell's type system, such
as higher-rank polymorphism, GADTs, and type functions.
of these features.

We begin by setting the scene, introducing some vocabulary, and
showing a small motivating example.
A control-flow graph, perhaps representing the body of a procedure,
Each block is a sequence of instructions,
beginning with a label and ending with a
control-transfer instruction that branches to other blocks.
% Each block has a label at the beginning,
% a sequence of
% -- each of which has a label at the
% beginning. Each block may branch to other blocks with arbitrarily
% complex control flow.
The goal of dataflow optimization is to compute valid
then use those assertions to justify code-improving

Consider a concrete example: constant propagation with constant folding.
On the left we have a basic block; in the middle we have
in the block; and at
the right we have the result of transforming the block
based on the assertions:

  Before          Facts          After
  ------------{}-------------
  z := x>5                       z := True
  if z                           goto L1
    then goto L1
    else goto L2

Constant propagation works
from top to bottom. We start with the empty fact.
transformation?
Yes! We can replace the node with @x:=7@.
Now, given this transformed node,
and the original fact, what fact flows out of the bottom of
the transformed node?
Now, can we do another transformation? Yes: constant folding can
replace the node with @z:=True@.
And so the process continues to the end of the block, where we
can replace the conditional branch with an unconditional one, @goto L1@.
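The middle-node part of a transfer function for this analysis can be sketched as follows; @ConstFact@, @Lit@, and the @Node@ cases shown are illustrative assumptions built on the example node type of this paper, not \hoopl's actual API:

```haskell
import qualified Data.Map as Map

-- Illustrative fact: a variable maps to Just a known literal,
-- or to Nothing, meaning "top" (no single known value).
type ConstFact = Map.Map Var (Maybe Lit)

-- Transfer for open/open nodes only (a sketch): assigning a literal
-- records a binding; any other assignment invalidates the binding.
middleTransfer :: Node O O -> ConstFact -> ConstFact
middleTransfer (Assign v (Lit k)) f = Map.insert v (Just k) f
middleTransfer (Assign v _)       f = Map.insert v Nothing  f
middleTransfer (Store _ _)        f = f
```
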

The example above is simple because the program has only straightline code;
when programs have loops, dataflow analysis gets more complicated.
For example,
consider the following graph,
where we assume @L1@ is the entry point:

  L2: x=7; goto L3
  L3: ...

Because control flows to @L3@ from two places,
All paths to @L3@ produce the fact @y=@$4$,
so we can conclude that this fact holds at @L3@.
But depending on the path to @L3@, @x@ may have different
values, so we conclude ``@x=@$\top$'',
meaning that there is no single value held by @x@ at @L3@.%
In this example @x@ really does vary at @L3@, but in general
the analysis might be conservative.}
The final result of joining the dataflow facts that flow to @L3@

Interleaving makes it far easier to write effective analyses.
the transformations.
and the instruction @z:=x>5@,
a pure analysis could produce the outgoing fact
But the subsequent transformation must perform
the transfer function becomes laughably simple: it merely has to see if the
The gain is even more compelling if there are a number of interacting
analyses and/or transformations; for more substantial

assertion about the program state (such as ``variable~@x@ holds value~@7@'').
A~prime example is live-variable analysis, where a fact takes the~form
``variable @x@ is live'' and is an assertion about the
live'' at a program point P is an assertion that @x@ is used on some program
The accompanying transformation is called dead-code elimination;
if @x@~is not live, this transformation
replaces the node @x:=e@ with a no-op.

% ----------------------------------------

and transformations driven by these analyses, on control-flow graphs.
Graphs are composed from smaller units, which we discuss from the

Nodes, blocks, and graphs share important properties in common.
control-transfer instruction to a named label.
For example,
\item A shift-left instruction is open on entry (because control can fall into it
from the preceding instruction), and open on exit (because control falls through
to the next instruction).
\item An unconditional branch is open on entry, but closed on exit (because
control cannot fall through to the next instruction).
control to fall through into a branch target), but open on exit.
% This taxonomy enables \ourlib{} to enforce invariants:
% only nodes closed at entry can be the targets of branches, and only nodes closed
% at exits can transfer control (see also \secref{edges}).
% As~a consequence, all control transfers originate at control-transfer
% instructions and terminate at labels; this invariant dramatically
% simplifies analysis and transformation.
These examples concern nodes, but the same classification applies
to blocks and graphs. For example the block
is open on entry and closed on exit.

``open/closed;''
we may refer to an ``open/closed block.''

The shape of a thing determines that thing's control-flow properties.
In particular, whenever E is a node, block, or graph,
% :  \simon{Removed the claim about a unique entry point.}
\item
If E is open at the entry, it has a unique predecessor;
if it is closed, it may have arbitrarily many predecessors---or none.
\item
If E is open at the exit, it has a unique successor;
if it is closed, it may have arbitrarily many successors---or none.

%%%%
%%%%
%%%%  % \item Regardless of whether E is open or closed,
%%%%  %       it has a unique entry point where execution begins.
%%%%  \item If E is closed at exit, control leaves \emph{only}
%%%%        by explicit branches from closed-on-exit nodes.
%%%%  \item If E is open at exit, control \emph{may} leave E
%%%%        by ``falling through'' from a distinguished exit point.
%%%%  \remark{If E is a node or block, control \emph{only} leaves E by
%%%%  falling through, but this isn't so for a graph. Example: a body of a
%%%%  loop contains a \texttt{break} statement} \simon{I don't understand.
%%%%  A break statement would have to be translated as a branch, no?
%%%%  Can you give an example? I claim that control only leaves an
%%%%  open graph by falling through.}
%%%%  \end{itemize}

Typically, a node might represent a machine instruction, such as an
assignment, a call, or a conditional branch.
so each client can define nodes as it likes.
Because they contain nodes defined by the client,
graphs can include arbitrary client-specified data, including
(say) C~statements, method calls in an object-oriented language, or
whatever.

data `Node e x where
  Label       :: Label ->                   Node C O
  `Assign     :: Var  -> Expr ->            Node O O
  `Store      :: Expr -> Expr ->            Node O O
  `Branch     :: Label ->                   Node O C
  `CondBranch :: Expr -> Label -> Label ->  Node O C
  -- ... more constructors ...

closed at entry and exit:
the type of a node has kind @*->*->*@, where the two type parameters
are type-level flags, one for entry and one for exit.
Such a type parameter may be instantiated only with type @O@~(for
open) or type~@C@ (for closed).

As an example,
The type parameters are written @e@ and @x@, for
entry and exit respectively.
The type is a generalized algebraic data type;
the syntax gives the type of each constructor.
%%% \cite{peyton-jones:unification-based-gadts}.
For example, constructor @Label@
takes a @Label@ and returns a node of type @Node C O@, where
the~``@C@'' says ``closed at entry'' and the~``@O@'' says ``open at exit''.
The types @Label@, @O@, and~@C@ are
Similarly, an @Assign@ node takes a variable and an expression, and
returns a @Node@ open at both entry and exit; the @Store@ node is
similar. The types @`Var@ and @`Expr@ are private to the client, and
Finally, the control-transfer nodes @Branch@ and @CondBranch@ are open at entry
and closed at exit.

Nodes closed on entry are the only targets of control transfers;
nodes open on entry and exit never perform control transfers;
To obey these invariants,
a node for
a conditional-branch instruction, which typically either transfers control
conditional branch, with the fall-through path in a separate block.
and it costs nothing in practice:
such code is easily sequentialized without superfluous branches.
%a late-stage code-layout pass can readily reconstruct efficient code.
}.
Because of the position each type of node occupies in a
basic block,
respectively.

data `O  -- Open
data `C  -- Closed

data `Block n e x where
  `BFirst  :: n C O                      -> Block n C O
  `BMiddle :: n O O                      -> Block n O O
  `BLast   :: n O C                      -> Block n O C
  `BCat    :: Block n e O -> Block n O x -> Block n e x

data `Graph n e x where
  `GNil  :: Graph n O O
  `GUnit :: Block n O O -> Graph n O O
  `GMany :: MaybeO e (Block n O C)
         -> LabelMap (Block n C C)
         -> MaybeO x (Block n C O)
         -> Graph n e x

data `MaybeO ^ex t where
  `JustO    :: t -> MaybeO O t
  `NothingO ::      MaybeO C t

newtype `Label = Label Int

class `NonLocal n where
  `entryLabel :: n C x -> Label
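For instance, a basic block for ``@L1: x:=7; goto L2@'' could be assembled directly from these constructors; the labels, variable, and literal expression below are assumed client-side values, and the node constructors are those of the example @Node@ type:

```haskell
-- A closed/closed block: one first node, zero or more middle nodes,
-- and one last node. BCat's type forces the shapes to line up.
exampleBlock :: Label -> Label -> Var -> Expr -> Block Node C C
exampleBlock l1 l2 x seven =
  BFirst (Label l1) `BCat` BMiddle (Assign x seven) `BCat` BLast (Branch l2)
```

Swapping any two of the three nodes is a type error, because the open/closed flags no longer match at the points of concatenation.
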

\ourlib\ combines the client's nodes into
A~@Block@ is parameterized over the node type~@n@
as well as over the same flag types that make it open or closed at
entry and exit.

The @BFirst@, @BMiddle@, and @BLast@ constructors create one-node
blocks.
An~earlier representation of blocks used a single constructor
shape.
uses the power of GADTs to ensure that the shape of every node is
known statically.
This property makes no difference to clients, but it significantly
simplifies the implementation of analysis and transformation in

900 It~makes sense to concatenate blocks only when control can fall

901 through from the first to the second; therefore,

903 the point of concatenation.

904 This restriction is enforced by the type of @BCat@, whose first

905 argument must be open on exit, and whose second argument must be open on entry.

906 It~is statically impossible, for example, to concatenate a @Branch@

907 immediately before an

908 @Assign@.

909 Indeed, the @Block@ type statically guarantees that any

910 closed/closed @Block@---which compiler writers normally

911 call a ``basic block''---consists of exactly one closed/open node

913 followed by zero or more open/open nodes (@Assign@ or @Store@),

914 and terminated with exactly one

915 open/closed node (@Branch@ or @CondBranch@).

916 Using GADTs to enforce these invariants is one of

918 % Also notice that the definition of @Block@ guarantees that every @Block@

919 % has at least one node.

920 % SPJ: I think we've agreed to stick with PNil.

921 % \remark{So what? Who cares? Why is it important

922 % that no block is empty? (And if we had an empty block, we could ditch

923 % @PNil@ from @PGraph@, which woudl be an improvement.)}

Like @Block@, the data type @Graph@ is parameterized over
both nodes @n@ and its open/closed shape (@e@ and @x@).
It has three constructors. The first two
deal with the base cases of open/open graphs:
an empty graph is represented by @GNil@ while a single-block graph
is represented by @GUnit@.

More general graphs are represented by @GMany@, which has three
fields: an optional entry sequence, a body, and an optional exit
sequence.
\item
If the graph is open at the entry, it contains an entry sequence of type
@Block n O C@.
We could represent this sequence as a value of type
@Maybe (Block n O C)@, but we can do better:
if the graph is open at the entry.
We~express our compile-time knowledge by using the type
@MaybeO e (Block n O C)@, a type-indexed version of @Maybe@
the type @MaybeO O a@ is isomorphic to~@a@, while
the type @MaybeO C a@ is isomorphic to~@()@.
\item
The body of the graph is a collection of closed/closed blocks.
To~facilitate traversal of the graph, we represent a body as a finite
map from label to block.
\item
The exit sequence is dual to the entry sequence, and like the entry
sequence, its presence or absence is deducible from the static type of the graph.

We can splice graphs together nicely; the cost is logarithmic in the
number of closed/closed blocks.
Unlike blocks, two graphs may be spliced together
not only when they are both open
at the splice point but also
when they are both closed---and not in the other two cases:

`gSplice :: Graph n e a -> Graph n a x -> Graph n e x
gSplice GNil g2 = g2
gSplice g1 GNil = g1

gSplice (GUnit ^b1) (GUnit ^b2) = GUnit (b1 `BCat` b2)

gSplice (GUnit b) (GMany (JustO e) bs x)
  = GMany (JustO (b `BCat` e)) bs x

gSplice (GMany e ^bs (JustO x)) (GUnit b2)
  = GMany e bs (JustO (x `BCat` b2))

gSplice (GMany e1 ^bs1 (JustO x1)) (GMany (JustO e2) ^bs2 x2)
  = GMany e1 (bs1 `mapUnion` (b `addBlock` bs2)) x2
  where b = x1 `BCat` e2

gSplice (GMany e1 bs1 NothingO) (GMany NothingO bs2 x2)
  = GMany e1 (bs1 `mapUnion` bs2) x2

This definition illustrates the power of GADTs: the
pattern matching is exhaustive, and all the open/closed invariants are
statically checked. For example, consider the second-last equation for @gSplice@.
Since the exit link of the first argument is @JustO x1@,
we know that type parameter~@a@ is~@O@, and hence the entry link of the second
argument must be @JustO e2@.
Moreover, block~@x1@ must be
closed/open, and block~@e2@ must be open/closed.
We can therefore concatenate them
with @BCat@ to produce a closed/closed block, which is
added to the body of the result.

We~have carefully crafted the types so that if @BCat@ and @BodyCat@
are considered as associative operators,

Need some words about how this is part of making the API simple for
the client---I've added something to the end of the paragraph, but I'm
not particularly thrilled.}

%% \simon{Well, you were the one who was so keen on a unique representation!
%% And since we have one, I think it is useful to say so. Lastly, the representation of
%% singleton blocks is not entirely obvious.}
%%%%
%%%% An empty open/open graph is represented
%%%% by @GNil@, while a closed/closed one is @gNilCC@:
%%%% \par {\small
%%%% \begin{code}
%%%% gNilCC :: Graph C C
%%%% gNilCC = GMany NothingO BodyEmpty NothingO
%%%% \end{code}}
%%%% The representation of a @Graph@ consisting of a single block~@b@
%%%% depends on the shape of~@b@:\remark{Does anyone care?}
%%%% \par{\small
%%%% \begin{code}
%%%% gUnitOO :: Block O O -> Graph O O
%%%% gUnitOC :: Block O C -> Graph O C
%%%% gUnitCO :: Block C O -> Graph C O
%%%% gUnitCC :: Block C C -> Graph C C
%%%% gUnitOO b = GUnit b
%%%% gUnitOC b = GMany (JustO b) BodyEmpty NothingO
%%%% gUnitCO b = GMany NothingO BodyEmpty (JustO b)
%%%% gUnitCC b = GMany NothingO (BodyUnit b) NothingO
%%%% \end{code}}
%%%% Multi-block graphs are similar.
%%%% From these definitions

To guarantee uniqueness, @GUnit@ is restricted to open/open
blocks.
If~@GUnit@ were more polymorphic, there would be
more than one way to represent some graphs, and it wouldn't be obvious
to a client which representation to choose---or if the choice made a difference.

it~still needs to know how nonlocal control is transferred from one
block to another:
\hoopl\ must know where
a control transfer can go,
and it must know what @Label@ is at the start of a block.
The client must provide this information by placing its node type into
and returns its @Label@;
the @successors@ method takes a last node (closed on exit) and returns
the @Label@s to
which it can transfer control.
A~middle node, which is open at both entry and exit, cannot refer to
any @Label@s,
so no corresponding interrogation function is needed.

A node type defined by a client must be an instance of @NonLocal@.
the client's instance declaration for @Node@ would be

instance NonLocal Node where
  entryLabel (Label l) = l

Again, the pattern matching for both functions is exhaustive, and
the compiler statically checks this fact. Here, @entryLabel@
cannot be applied to an @Assign@ or @Branch@ node,
and any attempt to define a case for @Assign@ or @Branch@ would result
in a type error.
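The @successors@ cases of this instance, elided above, would plausibly read as follows for the example node type; this is a sketch inferred from the constructors of @Node@, not text from the paper:

```haskell
-- successors :: n O C -> [Label] is the other NonLocal method.
-- An unconditional branch has one successor; a conditional branch
-- has two, one per target label.
instance NonLocal Node where
  entryLabel (Label l)          = l
  successors (Branch l)         = [l]
  successors (CondBranch _ t f) = [t, f]
```
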

While the client is required to provide this information about
nodes, it is very convenient for \ourlib\ to get the same information
about blocks.
For its own internal use,

instance NonLocal n => NonLocal (Block n) where
  entryLabel (BFirst n) = entryLabel n
  entryLabel (BCat b _) = entryLabel b
  successors (BLast n)  = successors n
  successors (BCat _ b) = successors b

Because the functions @entryLabel@ and @successors@ are used to track control
entry label or successors of a @Graph@ itself.
if a @Graph@ is closed at the entry, it does not have a unique entry label.
%
% A slight infelicity is that we cannot make @Graph@ an instance of @NonLocal@,
% because a graph closed on entry has no unique label. Fortunately, we never
% need @Graph@s to be in @NonLocal@.
%
% ``Never apologize. Never confess to infelicity.'' ---SLPJ

}

Control-flow graphs         & Us  & Us  & One                     \\
Dataflow fact~$F$           & You & You & One type per logic      \\
Transfer functions          & Us  & You & One per analysis        \\
Solve-and-rewrite functions & Us  & Us  & Two (forward, backward) \\
}

Now that we have graphs, how do we optimize them?
client to build a new dataflow analysis and optimization.
The client must supply the following pieces:
that let the client build control-flow graphs out of nodes.
Each analysis uses facts that are specific to that particular analysis,
the fact type.
and which returns either @Nothing@
or @(Just g)@ where @g@~is a graph that should
replace the node.
internal control flow is
crucial for many code-improving transformations.
We discuss the rewrite function

Because facts, transfer functions, and rewrite functions work closely together,
we represent their combination

1167 Given a node type~@n@ and a @FwdPass@,

1169 graph.

1170 \hoopl\ provides a fully polymorphic interface, but for purposes of

1171 exposition, we present a function that is specialized to a

1172 closed/closed graph:

analyzeAndRewriteFwdBody
   :: ( FuelMonad m      -- Gensym and other state
      , NonLocal n )     -- Extract non-local flow edges
   => FwdPass m n f      -- Lattice, transfer, rewrite
   -> [Label]            -- Entry point(s)
   -> Graph n C C        -- Input graph
   -> FactBase f         -- Input fact(s)
   -> m ( Graph n C C    -- Result graph
        , FactBase f )   -- ... and its facts

Given a @FwdPass@, the
analyze-and-rewrite function transforms a graph into
an optimized graph.
As its type shows, this function
is polymorphic in the types of nodes~@n@ and facts~@f@;
these types are determined entirely by the client.
The type of the monad~@m@ is also supplied by the client, but it must
meet certain constraints laid out by \hoopl, as described later.

data `FwdPass m n f
  = FwdPass { `fp_lattice  :: DataflowLattice f
            , `fp_transfer :: FwdTransfer n f
            , `fp_rewrite  :: FwdRewrite m n f }

------- Lattice ----------
data `DataflowLattice a = DataflowLattice
  { `fact_bot  :: a
  , `fact_join :: OldFact a -> NewFact a
               -> (ChangeFlag, a) }
data `ChangeFlag = `NoChange | `SomeChange
newtype `OldFact a = OldFact a
newtype `NewFact a = NewFact a

------- Transfers ----------
newtype `FwdTransfer n f  -- abstract type
`mkFTransfer'
  :: (forall e x . n e x -> f -> Fact x f)
  -> FwdTransfer n f

------- Rewrites ----------
newtype `FwdRewrite m n f -- abstract type
`mkFRewrite'
  :: (forall e x . n e x -> f -> m (Maybe (FwdRew m n f e x)))
  -> FwdRewrite m n f
data `FwdRew m n f e x
  = FwdRew (Graph n e x) (FwdRewrite m n f)

------- Fact-like things -------
type family   `Fact x f :: *
type instance Fact O f = f
type instance Fact C f = FactBase f

------- FactBase -------
type `FactBase f = LabelMap f
  -- A finite mapping from Labels to facts f

% omit mkFactBase :: [(Label, f)] -> FactBase f

As well as taking and returning a graph with its entry point(s), the
function also takes input facts (the @FactBase@) and produces output facts.
A~@FactBase@ is simply a finite mapping from @Label@ to facts.
The
output @FactBase@ maps each @Label@ in the @Body@ to its fact; if
the @Label@ is not in the domain of the @FactBase@, its fact is the
bottom element of the lattice.
The point(s) at which control flow may enter the graph are supplied by
the caller; the input @FactBase@ supplies any
facts that hold at these points.
For example, if the graph
represents the body of a procedure
with parameters $x,y,z$, we would map the entry @Label@ to a fact
stating that the
parameters are not known to be constants.

The client's model of how @analyzeAndRewriteFwdBody@ works is as follows:
\ourlib\ walks forward over each block in the graph.
At each node, \ourlib\ applies the
rewrite function to the node and the incoming fact. If the rewrite
function returns @Nothing@, the node is retained as part of the output
graph, the transfer function is used to compute the downstream fact,
and \ourlib\ moves on to the next node.
But if the rewrite function returns @Just (FwdRew g rw)@,
indicating that it wants to rewrite the node to the replacement graph~@g@, then
\ourlib\ recursively analyzes and rewrites~@g@, using the new rewrite
function~@rw@, before moving on to the next node.
A~node following a rewritten node sees up-to-date facts, computed by
analyzing the replacement graph.

In this section we flesh out these pieces.

For each analysis or transformation, the client must define a type
of dataflow facts.
A~dataflow fact often represents an assertion
about a program point.\footnote{Here a \emph{program point} is an edge in a
control-flow graph.}
For example, in a constant-propagation analysis, the fact ``$@x@=3$'' at point~P
claims that variable @x@ holds value @3@ at~P, regardless of the
path by which P is reached.
In a backward analysis, the
assertion ``@x@ is dead'' at point~P claims that no path from P uses
variable @x@.

The client must tell \hoopl\
(a)~the bottom element of the lattice and (b)~how to take
the least upper bound (join) of two elements.
To ensure that analysis
terminates, it is enough if every fact has a finite number of
distinct facts above it, so that repeated joins
eventually reach a fixed point.

In practice, joins are computed at labels.
If $f_{\mathit{old}}$ is the fact currently associated with a
label~$L$,
and a new fact $f_{\mathit{new}}$ is propagated
into the label~$L$,
the dataflow engine replaces $f_{\mathit{old}}$ with the join
$f_{\mathit{old}} \sqcup f_{\mathit{new}}$.
Furthermore, the dataflow engine wants to know if
$f_{\mathit{old}} \sqcup f_{\mathit{new}} = f_{\mathit{old}}$,
because if not, the analysis has not reached a fixed point.

The bottom element and join operation of a lattice of facts of
type~@f@ are stored in a value of type @DataflowLattice f@.
%%Such a value is part of the @FwdPass n f@ that is passed to
%%@analyzeAndRewriteFwd@ above.
As noted in the previous paragraph,
the dataflow engine needs to know when the result of a join is equal
to one of the arguments joined.
Because this information can often be computed cheaply together
with the join, \ourlib\ does not
require a separate equality test on facts (which might be
expensive).
Instead, \ourlib\ requires that @fact_join@ return a @ChangeFlag@ as
well as the least upper bound.
The @ChangeFlag@ should be @NoChange@ if
the result is the same as the old fact, and
@SomeChange@ if the result differs.

To help clients create lattices and join functions,
\hoopl\ includes functions and constructors that extend a type~@a@
with top and bottom elements.
In this paper, we use only type @`WithTop@, which has value
constructors with these types:

`PElem :: a -> WithTop a
`Top   ::      WithTop a

\hoopl\ also provides a @`Bot@ constructor and types @WithTop@,
@`WithBot@, and @`WithTopAndBot@.
All are defined using a single GADT,
so that value constructors @Top@ and @Bot@ may be used with any of the
types.

The real convenience of @WithTop@, @WithBot@, and @WithTopAndBot@ is
that \hoopl\ provides a combinator that lifts
a join function on~@a@ to a join function on
@WithTop a@, and similarly for the other types.
The lifting combinator ensures that joins involving top and bottom
elements not only obey the appropriate algebraic laws but also set the
@ChangeFlag@ properly.
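To illustrate what such a lifting combinator must do, here is a small, self-contained sketch (our own simplified version, not \hoopl's actual code) that lifts a join on~@a@ to a join on a local @WithTop a@, setting the @ChangeFlag@ with respect to the old fact as described above.

```haskell
-- Local stand-ins for the types described in the text
data WithTop a  = PElem a | Top         deriving (Eq, Show)
data ChangeFlag = NoChange | SomeChange deriving (Eq, Show)

-- The underlying join may itself go to Top (e.g., joining two
-- different constants in constant propagation).
joinWithTop :: Eq a
            => (a -> a -> WithTop a)   -- join on a
            -> WithTop a -> WithTop a  -- old fact, new fact
            -> (ChangeFlag, WithTop a)
joinWithTop _ Top       _   = (NoChange,   Top)  -- old is already top
joinWithTop _ (PElem _) Top = (SomeChange, Top)
joinWithTop j (PElem old) (PElem new) =
  case j old new of
    Top     -> (SomeChange, Top)
    PElem x -> (if x == old then NoChange else SomeChange, PElem x)

-- Per-variable join for a constant-propagation-style lattice
constJoin :: Eq a => a -> a -> WithTop a
constJoin old new = if old == new then PElem old else Top
```

The algebraic laws (top absorbs everything) and the change flag both live in the combinator, so the client writes only the interesting middle case.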

% A~possible representation of the facts needed to implement
% constant propagation is shown in \figref{const-prop}.
% A~fact
% is represented as a finite map from a variable to a value of type
% @Maybe HasConst@.%
% A~variable $x$ maps to @Nothing@ iff $x=\bot$;
% $x$~maps to @Just Top@ iff $x=\top$;
% and $x$~maps to $@Just@\, (@I@\, k)$ iff $x=k$ (and similarly for
% Boolean constants).\remark{Ugh}
% Any one procedure has only
% finitely many variables; only finitely many facts are computed at any
% program point; and in this lattice any one fact can increase at most
% twice. These properties ensure that the dataflow engine will reach a
% fixed point.

A~forward transfer function is presented with the dataflow fact(s) on
the edge(s) coming
into a node, and it computes dataflow fact(s) on the outgoing edge(s).
In a forward analysis, the dataflow engine starts with the fact at the
beginning of a block and applies the transfer function to successive
nodes in that block until eventually the transfer function for the last node
computes the facts that are propagated to the block's successors.
For example, consider this graph, with entry at @L1@:

L1: x=3; goto L2
    if x>0 then goto L2 else return

Analyzing @L1@ propagates the fact @x=3@ forward, by applying the transfer
function successively to the nodes of the block.
This new fact is joined with the existing (bottom) fact for @L2@.
Now the analysis propagates @L2@'s fact forward, again using the transfer
function.
Again, the new fact is joined with the existing fact for @L2@, and the process
is iterated until the facts for each label reach a fixed point.

If the node is open at exit, the transfer function produces a single fact.
But a node closed at exit may have multiple outgoing edges.
In that case the transfer function must
produce a list of (@Label@,fact) pairs, one for each outgoing edge.
%
So the type of the transfer function's result
depends on the shape of the node's exit.
Fortunately, this dependency can be expressed precisely, at compile
time, by Haskell's (recently added) indexed type families.

A~forward transfer function supplied by a client,
of type (@FwdTransfer@ @n@ @f@),
is typically a function polymorphic in @e@ and @x@.
It takes a
node of type @n e x@ and an incoming fact of type~@f@, and it produces an
outgoing ``fact-like thing'' of type (@Fact@ @x@ @f@). The
type constructor @Fact@
should be thought of as a type-level function; its signature is given in the
@type family@ declaration, while its definition is given by two @type instance@
declarations. The first declaration says that the fact-like thing
coming out of a node
open at the exit is just a fact~@f@. The second says that
the fact-like thing
coming out of a node
closed at the exit is a finite map from each successor @Label@ to a fact:
a @FactBase f@.
We have ordered the arguments such that if

`transfer_fn :: forall e x . n e x -> f -> Fact x f
`node        :: n e x

then @(transfer_fn node)@ is a fact transformer:

transfer_fn node :: f -> Fact x f
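To make the shapes concrete, here is a sketch of a client transfer function for a much-simplified constant propagation. It uses hypothetical miniature types (our own @Node@, a local @Fact@ family, and @Data.Map@ standing in for \hoopl's @LabelMap@), not \hoopl's real definitions.

```haskell
{-# LANGUAGE GADTs, TypeFamilies, EmptyDataDecls #-}
import qualified Data.Map as M

data O  -- open
data C  -- closed
type Label = Int
type Var   = String

-- A hypothetical miniature node type
data Node e x where
  LabelN :: Label -> Node C O
  Assign :: Var -> Int -> Node O O
  Branch :: Label -> Node O C

type FactBase f = M.Map Label f
type family Fact x f
type instance Fact O f = f          -- open exit: a single fact
type instance Fact C f = FactBase f -- closed exit: facts per successor

-- Simplified facts: variables known to hold integer constants
type ConstFact = M.Map Var Int

-- One function, polymorphic in shape; the result type follows the exit shape
transfer :: Node e x -> ConstFact -> Fact x ConstFact
transfer (LabelN _)   f = f
transfer (Assign x k) f = M.insert x k f
transfer (Branch l)   f = M.singleton l f
```

Matching on the constructor refines @x@, so each equation returns the right fact-like thing without any runtime tagging.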

So much for the interface.
What about the implementation?
The way GADTs work is that the compiler uses a node's value constructor
to determine whether the node's entry and exit are open or
closed.
But \hoopl\ must manipulate functions that are polymorphic in the node
type~@n@!
Such functions include
\begin{itemize}
\item
A function that takes a pair of @FwdTransfer@s for facts @f@~and~@f'@,
and returns a single @FwdTransfer@ for the fact @(f, f')@
\item
A function that takes a @FwdTransfer@ and wraps it in logging code, so
an analysis can be debugged by watching the facts flow through the
nodes
% may need
%functions that are polymorphic in the node type~@n@:
\item
A dominator analysis:
we have written such an analysis using transfer functions that are
polymorphic in~@n@.
\end{itemize}

Because the mapping from
value constructors to shape is different for each node type~@n@, transfer
functions cannot be polymorphic in both
the representation and the shape of nodes.
Our implementation therefore sacrifices polymorphism in shape:
internally, \hoopl\ represents a @FwdTransfer n f@ as a triple of functions,
each polymorphic in~@n@ but monomorphic in shape:

newtype FwdTransfer n f
  = FwdTransfers ( n C O -> f -> f
                 , n O O -> f -> f
                 , n O C -> f -> FactBase f
                 )

Such triples can easily be composed and wrapped without requiring a
pattern match on the value constructor of an unknown node type.
Working with triples is tedious, but only \hoopl\ itself is forced to
work with triples; each client, which knows the node type, may supply
a triple,
but it typically supplies a single function polymorphic in the shape
of the (known) node.

%%%% \begin{figure}
%%%% \begin{smallfuzzcode}{6.6pt}
%%%% type `AGraph n e x
%%%%   = [Label] -> (Graph n e x, [Label])
%%%%
%%%% `withLabels :: Int -> ([Label] -> AGraph n e x)
%%%%             -> AGraph n e x
%%%% withLabels n ^fn = \ls -> fn (take n ^ls) (drop n ls)
%%%%
%%%% `mkIfThenElse :: Expr -> AGraph Node O O
%%%%               -> AGraph Node O O -> AGraph Node O O
%%%% mkIfThenElse p t e
%%%%   = withLabels 3 $ \[l1,l2,l3] ->
%%%%     gUnitOC (BUnit (CondBranch p l1 l2)) `gSplice`
%%%%     mkLabel l1 `gSplice` t `gSplice` mkBranch l3 `gSplice`
%%%%     mkLabel l2 `gSplice` e `gSplice` mkBranch l3 `gSplice`
%%%%     mkLabel l3
%%%%
%%%% `mkLabel l = gUnitCO (BUnit (Label l))
%%%% `mkBranch l = gUnitOC (BUnit (Branch l))
%%%% `gUnitOC b = GMany (JustO b) BodyEmpty NothingO
%%%% `gUnitCO b = GMany NothingO BodyEmpty (JustO b)
%%%% \end{smallfuzzcode}
%%%% \caption{The \texttt{AGraph} type and example constructions} \figlabel{agraph}
%%%% \end{figure}

We compute dataflow facts in order to enable code-improving
transformations.
In our constant-propagation example,
the dataflow facts may enable us
to simplify an expression by performing constant folding, or to
turn a conditional branch into an unconditional one.
Similarly, a liveness analysis may allow us to
replace a dead assignment with a no-op.

A rewriting function takes a node and a fact, and optionally returns\ldots what?
At first one might
expect that rewriting should return a new node, but that is not enough:
We~might want to remove a node by rewriting it to the empty graph,
or more ambitiously, we might want to replace a high-level operation
with a tree of conditional branches or a loop, which would entail
introducing new blocks with internal control flow.
In~general, a rewrite function must be able to return a graph.

Concretely, a @FwdRewrite@ takes a node and a suitably shaped
fact, and returns either @Nothing@, indicating that the node should
not be replaced,
or a monadic result indicating that the node should
be replaced with~\ag: the replacement graph.
The result is monadic because
if the rewriter makes graphs containing blocks,
it may need fresh @Label@s, which are supplied by a monad.
A~client may use its own monad or may use a monad or monad transformer
supplied by \hoopl.

%%%% \ifpagetuning\enlargethispage{0.8\baselineskip}\fi

The type of @mkFRewrite'@ guarantees that
the replacement graph $\ag$ has the same shape as the node it replaces.
For example, a branch instruction can be replaced only by a graph
closed at the exit. Moreover, because only an open/open graph can be
empty, the type
of @FwdRewrite@
guarantees, at compile time, that no head of a block (closed/open)
or tail of a block (open/closed) can ever be deleted by being
rewritten to an empty graph.

Once the rewrite has been performed, what then?
The rewrite returns a replacement graph,
which must itself be analyzed, and its nodes may be further rewritten.
We~call @analyzeAndRewriteFwdBody@ to process the replacement
graph---but what @FwdPass@ should we use?
There are two common situations:
\begin{itemize}
\item Sometimes we want to analyze and transform the replacement graph
with an unmodified @FwdPass@, further rewriting the replacement graph.
This procedure is called \emph{deep} rewriting.
When deep rewriting is used, the client's rewrite function must
ensure that the graphs it produces are not rewritten indefinitely.
\item Other times we want to analyze, but not further rewrite, the
replacement graph; this is \emph{shallow} rewriting.
It~is easily implemented by using a modified @FwdPass@
whose rewriting function always returns @Nothing@.
\end{itemize}
Deep rewriting is essential to achieve the full benefits of
interleaved analysis and transformation.
But shallow rewriting can be vital as well;
for example, a forward dataflow pass that inserts
a spill before a call must not rewrite the call again, lest it attempt
to insert infinitely many spills.

An~innovation of \hoopl\ is to build the choice of shallow or deep
rewriting into each rewrite function,
an idea that is elegantly captured by the @FwdRew@ type returned by a
rewrite.
The first component of the @FwdRew@ is the replacement graph, as discussed earlier.
The second component, $\rw$, is a new rewrite function: the one to use
when analyzing and rewriting
the replacement graph.
For shallow rewriting this new function is
the constant @Nothing@ function; for deep rewriting it is the original
rewriting function.

By requiring each rewrite to return a new rewrite function,
\hoopl\ enables a variety of combinators over
rewrite functions.
% (triple-based) representation. It's not pretty any more.
For example, here is a function
that combines two rewriting functions in sequence:

thenFwdRw :: Monad m
          => FwdRewrite m n f
          -> FwdRewrite m n f
          -> FwdRewrite m n f

What a beautiful type @thenFwdRw@ has!
It tries @rw1@, and if @rw1@
declines to rewrite, it behaves like @rw2@. But if
@rw1@ rewrites, returning a new rewriter @rw1a@, then the overall call also
succeeds, returning a new rewrite function obtained by combining @rw1a@
with @rw2@. (We cannot apply @rw1a@ or @rw2@
directly to the replacement graph~@g@,
because @rw1@~returns a graph and @rw2@~expects a node.)
The rewriter @noFwdRw@ is the identity of @thenFwdRw@.
Finally, @thenFwdRw@ can
combine a deep-rewriting function and a shallow-rewriting function,
to produce a rewriting function that is a combination of deep and shallow.
%%This is something that the Lerner/Grove/Chambers framework could not do,
%%because there was a global shallow/deep flag.
%% Our shortsightedness; Lerner/Grove/Chambers is deep only ---NR
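The control structure just described can be sketched on a much-simplified, pure stand-in for the rewrite type (our own miniature version, without the monad or the graph machinery):

```haskell
-- A rewrite takes a node and a fact; on success it returns a replacement
-- (here just a value of type g) plus the rewrite to use on the replacement.
newtype Rw n f g = Rw { runRw :: n -> f -> Maybe (g, Rw n f g) }

-- Never rewrites: the identity of thenRw
noRw :: Rw n f g
noRw = Rw (\_ _ -> Nothing)

thenRw :: Rw n f g -> Rw n f g -> Rw n f g
thenRw rw1 rw2 = Rw $ \n f ->
  case runRw rw1 n f of
    Nothing        -> runRw rw2 n f                -- rw1 declines: try rw2
    Just (g, rw1a) -> Just (g, rw1a `thenRw` rw2)  -- combine successor rewrites
```

In \hoopl's real @thenFwdRw@ the result is monadic and the replacement is a graph, but the same branching on "declined vs. rewrote" drives the composition.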

A shallow rewriting function can be made deep by iterating it.
If we have shallow rewrites $A$~and~$B$ then we can build $AB$,
$A^*B$, $(AB)^*$,
and so on: sequential composition is @thenFwdRw@ and the Kleene star
is iteration.
% the second argument to @thenFwdRw@ would have to be the constant
% nothing function when applied but would have to be the original triple
% when passed to @thenFwdRw@ as the second argument (in the recursive
% call).

The combinators above operate on rewrite functions that share a common
fact type and transfer function.
It~can also be useful to combine entire dataflow passes that use
different facts.
We~invite you to write one such combinator, with type

`pairFwd :: Monad m
         => FwdPass m n f1
         -> FwdPass m n f2
         -> FwdPass m n (f1,f2)

The two passes run interleaved, not sequentially, and each may help
the other,
yielding better results than running $A$~and then~$B$ or $B$~and then~$A$.
%% call these passes. ``super-analyses;''
%% in \hoopl, construction of super-analyses is
%% particularly concrete.

% omit Binop :: Operator -> Expr -> Expr -> Expr
% omit Add :: Operator

%% \begin{figure}
%% {\small\hfuzz=3pt
%% \begin{code}
%% -- Types and definition of the lattice
%% data `HasConst = `Top | `B Bool | `I Integer
%% type `ConstFact = Map.Map Var HasConst
%% `constLattice = DataflowLattice
%%   { fact_bot = Map.empty
%%   , fact_extend = stdMapJoin constFactAdd }
%%   where
%%     `constFactAdd ^old ^new = (c, j)
%%       where j = if new == old then new else Top
%%             c = if j == old then NoChange else SomeChange
%%
%% -------------------------------------------------------
%% -- Analysis: variable has constant value
%% `varHasConst :: FwdTransfer Node ConstFact
%% varHasConst (Label l) f = lookupFact f l
%% varHasConst (Store _ _) f = f
%% varHasConst (Assign x (Bool b)) f = Map.insert x (B b) f
%% varHasConst (Assign x (Int i)) f = Map.insert x (I i) f
%% varHasConst (Assign x _) f = Map.insert x Top f
%% varHasConst (Branch l) f = mkFactBase [(l, f)]
%% varHasConst (CondBranch (Var x) ^tid ^fid) f
%%   = mkFactBase [(tid, Map.insert x (B True) f),
%%                 (fid, Map.insert x (B False) f)]
%% varHasConst (CondBranch _ tid fid) f
%%   = mkFactBase [(tid, f), (fid, f)]
%%
%% -------------------------------------------------------
%% -- Constant propagation
%% `constProp :: FwdRewrite Node ConstFact
%% constProp node ^facts
%%   = fmap toAGraph (mapE rewriteE node)
%%   where
%%     `rewriteE e (Var x)
%%       = case Map.lookup x facts of
%%           Just (B b) -> Just $ Bool b
%%           Just (I i) -> Just $ Int i
%%           _ -> Nothing
%%     rewriteE e = Nothing
%%
%% -------------------------------------------------------
%% -- Simplification ("constant folding")
%% `simplify :: FwdRewrite Node f
%% simplify (CondBranch (Bool b) t f) _
%%   = Just $ toAGraph $ Branch (if b then t else f)
%% simplify node _ = fmap toAGraph (mapE s_exp node)
%%   where
%%     `s_exp (Binop Add (Int i1) (Int i2))
%%       = Just $ Int $ i1 + i2
%%     ... -- more cases for constant folding
%%
%% -- Rewriting expressions
%% `mapE :: (Expr -> Maybe Expr)
%%       -> Node e x -> Maybe (Node e x)
%% mapE f (Label _) = Nothing
%% mapE f (Assign x e) = fmap (Assign x) $ f e
%% ... -- more cases for rewriting expressions
%%
%% -------------------------------------------------------
%% -- Defining the forward dataflow pass
%% `constPropPass = FwdPass
%%   { fp_lattice = constLattice
%%   , fp_transfer = varHasConst
%%   , fp_rewrite = constProp `thenFwdRw` simplify }
%% \end{code}}
%% \caption{The client for constant propagation and constant folding} \figlabel{old-const-prop}
%% \end{figure}

\caption{The client for constant propagation and constant folding\break (extracted automatically from code distributed with Hoopl)}

The client code in the figure implements
constant propagation and constant folding.
For each variable at each point in a graph, the analysis concludes one of three facts:
the variable holds a constant value (Boolean or integer),
the variable might hold a non-constant value,
or nothing is known about what the variable holds.
We~represent these facts using a finite map from a variable to a
lifted constant value.
A~variable with a constant value maps to @Just (PElem k)@, where
@k@~is the constant value;
a variable with a non-constant value maps to @Just Top@;
and a variable with an unknown value maps to @Nothing@ (it is not
in the domain of the finite map).

% \afterpage{\clearpage}

The definition of the lattice (@constLattice@) is straightforward.
The bottom element is an empty map (nothing is known about the contents of any variable).
We~use the @stdMapJoin@ function to lift the join operation
for a single variable (@constFactAdd@) up to the map containing facts
for all variables.
The utility function @`changeIf :: Bool -> ChangeFlag@ is exported by \hoopl.

% omit stdMapJoin :: Ord k => JoinFun v -> JoinFun (Map.Map k v)
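The behavior of such a map-lifting join can be sketched as follows. This is our own simplified version, not \hoopl's @stdMapJoin@: a key missing from the old map is implicitly bottom, so a binding that appears only in the new map is itself a change.

```haskell
import qualified Data.Map as M

data ChangeFlag = NoChange | SomeChange deriving (Eq, Show)
type JoinFun v  = v -> v -> (ChangeFlag, v)  -- old fact, then new fact

-- Lift a per-value join to a join on whole maps of facts
mapJoin :: Ord k => JoinFun v -> JoinFun (M.Map k v)
mapJoin joinV old new = M.foldrWithKey add (NoChange, old) new
  where
    add k newV (ch, m) =
      case M.lookup k m of
        Nothing   -> (SomeChange, M.insert k newV m)  -- key was bottom
        Just oldV ->
          case joinV oldV newV of
            (NoChange,   _) -> (ch, m)
            (SomeChange, v) -> (SomeChange, M.insert k v m)
```

The overall @ChangeFlag@ is @SomeChange@ exactly when at least one binding changed, which is what the fixed-point loop needs.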

The forward transfer function @varHasLit@ is defined using the
shape-polymorphic auxiliary function~@ft@.
There are two interesting kinds of nodes:
assignment and conditional branch.
In the first two cases for assignment, a variable gets a constant value,
so we produce a dataflow fact mapping the variable to its value.
In the third case for assignment, the variable gets a non-constant value,
so we produce a dataflow fact mapping the variable to @Top@.
The~last interesting case is a conditional branch where the condition
is a variable.
If the conditional branch flows to the true successor,
the variable holds @True@, and similarly for the false successor.
We~update the fact flowing to each successor accordingly.
(Because function @ft@ scrutinizes a GADT, it cannot
default the uninteresting cases.)

We do not need to consider complicated cases such as
an assignment @x:=y@ where @y@ holds a constant value @k@.
Instead, we rely on the interleaving of transformation
and analysis to first transform the assignment to @x:=k@,
which is exactly what our simple transfer function expects.
This interleaving makes it possible to write
the simplest imaginable transfer functions, without missing
opportunities to improve the code.

The rewrite function @constProp@
rewrites each use of a variable to its constant value.
The client has defined auxiliary functions

`mapVE :: (Var -> Maybe Expr) -> (Expr -> Maybe Expr)
`mapEE :: (Expr -> Maybe Expr) -> (Expr -> Maybe Expr)
`mapEN :: (Expr -> Maybe Expr) -> (Node e x -> Maybe (Node e x))
`mapVN :: (Var -> Maybe Expr) -> (Node e x -> Maybe (Node e x))
`nodeToG :: Node e x -> Graph Node e x

The client composes @mapXX@ functions
to apply @lookup@ to each use of a variable in each kind of node;
@lookup@ substitutes for each variable that has a constant value.
Applying @fmap nodeToG@ lifts the final node, if present, into a~@Graph@.
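The flavor of these auxiliary functions can be seen in a small sketch; the @Expr@ type here is a hypothetical miniature stand-in, not the client's real one.

```haskell
import qualified Data.Map as M

type Var = String
data Expr = EVar Var | EInt Int deriving (Eq, Show)

-- Analog of mapVE: lift a rewrite on variables to a rewrite on
-- expressions; Nothing means "no change".
mapVE :: (Var -> Maybe Expr) -> (Expr -> Maybe Expr)
mapVE f (EVar x) = f x
mapVE _ _        = Nothing

-- The "lookup" used in constant propagation: replace a variable
-- known to hold a constant by that constant
lookupVar :: M.Map Var Int -> Var -> Maybe Expr
lookupVar facts x = EInt <$> M.lookup x facts
```

Because @Nothing@ means "no change" at every level, the composed rewrite declines cheaply whenever nothing in the node mentions a known-constant variable.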

The figure also defines a rewrite function for constant
folding: @simplify@.
This function rewrites a conditional branch on a
boolean constant to an unconditional branch, and
to find constant subexpressions,
it runs @s_exp@
on every subexpression.
Function @simplify@ does not need to check whether a variable holds a
constant value; it relies on @constProp@ to have replaced the
variable by the constant.
Indeed, @simplify@ does not consult the
incoming fact, so it is polymorphic in~@f@.

The @FwdRewrite@ functions @constProp@ and @simplify@
are useful independently.
In this case, however, we want their combined effect,
so we compose them with @thenFwdRw@.
The composition, along with the lattice and the
transfer function,
is assembled into the @FwdPass@ called @constPropPass@.
Given @constPropPass@, we can
improve a graph~@g@ by passing @constPropPass@ and~@g@
to
@analyzeAndRewriteFwdBody@.

Debugging an optimization can be tricky:
an optimization may rewrite hundreds of nodes,
and any of those rewrites could be incorrect.
To debug dataflow optimizations, we use Whalley's
technique
to identify the first rewrite that
transforms a program from working code to faulty code.

The key idea is to~limit the number of rewrites that
are performed while optimizing a graph.
In \hoopl, the limit is called
\emph{optimization fuel}:
each rewrite costs one unit of fuel,
and when the fuel is exhausted, no more rewrites are permitted.
Because each rewrite leaves the observable behavior of the
program unchanged, it is safe to stop rewriting at any point.
Given a program that fails when compiled with optimization,
a test infrastructure uses binary search on the amount of
optimization fuel, until
it finds that the program works correctly after $n-1$ rewrites but fails
after $n$~rewrites.
The $n$th rewrite is faulty.

\hoopl\ ensures it can keep track of fuel by imposing the constraint
@FuelMonad m@ on the monad of any rewrite function used with
@analyzeAndRewriteFwdBody@.
This constraint ensures that \hoopl\ can get and set the amount of
fuel remaining.
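The binary search itself is simple. In this sketch, the hypothetical predicate @failsWith n@ stands for "compile the program with @n@ units of fuel and observe whether the test fails"; in a real test harness it would invoke the compiler and run the program.

```haskell
-- Find the index of the first faulty rewrite, assuming the program
-- works with 0 rewrites and fails with maxFuel rewrites.
firstBadRewrite :: (Int -> Bool) -> Int -> Int
firstBadRewrite failsWith maxFuel = go 0 maxFuel
  where
    -- invariant: the program works with lo units of fuel, fails with hi
    go lo hi
      | lo + 1 == hi  = hi          -- rewrite number hi is the faulty one
      | failsWith mid = go lo mid
      | otherwise     = go mid hi
      where mid = (lo + hi) `div` 2
```

Each compile-and-run probe halves the interval, so isolating one bad rewrite among thousands takes only a handful of builds.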

Are rewrites sound, especially when there are loops?
Many analyses compute a fixed point starting from unsound
``facts''; for example, a live-variable analysis starts from the
assumption that all variables are dead.
Rewrites
performed before a fixed point is reached may be unsound, and their results
must be discarded. Each iteration of the fixed-point computation must
start afresh with the original graph.

Nevertheless, \hoopl\ performs rewrites
(speculatively, and possibly recursively),
so that the facts downstream of the replacement graphs are as accurate
as possible.
For~example, consider this graph, with entry at @L1@:

L1: x=0; goto L2

What if the predicate in the conditional branch were @x<10@ instead
of @x==10@?
Using that fact, we would rewrite the conditional branch to an unconditional
branch @goto L3@. No new fact would propagate to @L2@, and we would
have successfully (and soundly) eliminated the loop.
This example is contrived, but it illustrates that
for best results we should
\begin{itemize}
\item Perform the rewrites on every iteration.
\item Begin each new iteration with the original, virgin graph.
\end{itemize}
This sort of algorithm is hard to implement in an imperative setting, where rewrites
mutate a graph in place.
But with an immutable graph, implementing the algorithm
is trivially easy: we simply revert to the original graph at the start
of each fixed-point iteration.

Facts computed by @analyzeAndRewriteFwdBody@ depend on graphs produced by the rewrite
function, which in turn depend on facts computed by the transfer function.
How~do we know this algorithm is sound, or if it terminates?
A~proof requires a POPL paper; here we give only the
intuition.

\hoopl\ requires that a client's functions meet
these preconditions:
\begin{itemize}
\item
The lattice must have no infinite ascending chains; that is,
every sequence of calls to @fact_join@ must eventually return @NoChange@.
\item
The transfer function must be
monotonic: given a more informative fact in,
it should produce a more informative fact out.
\item
The rewrite function must be sound:
if it replaces a node @n@ by a replacement graph~@g@, then @g@~must be
observationally equivalent to~@n@ under the
assumptions expressed by the incoming dataflow fact~@f@.
%%\footnote{We do not permit a transformation to change
%% the @Label@ of a node. We have not found any optimizations
%% that are prevented (or even affected) by this restriction.}
\item
The rewrite function must be \emph{consistent} with the analysis.
For example, if the analysis says that @x@ is dead before the node~@n@,
then it had better still be dead if @n@ is replaced by @g@.
\item
To ensure termination, a transformation that uses deep rewriting
must not return replacement graphs which
contain nodes that could be rewritten indefinitely.
\end{itemize}
Without the conditions on monotonicity and consistency,
our algorithm will terminate,
but there is no guarantee that it will compute
a fixed point of the analysis. And that in turn threatens the
soundness of rewrites based on possibly bogus ``facts''.

However, when the preconditions above are met,
we can make the following claims:
\begin{itemize}
\item
The algorithm terminates. The fixed-point loop must terminate because the
lattice has no infinite ascending chains. And the client is responsible
for avoiding infinite recursion when deep rewriting is used.
\item
The algorithm is sound. Why? Because if each rewrite is sound (in the sense given above),
then applying a succession of rewrites is also sound.
Moreover, a~sound analysis of the replacement graph
may generate only dataflow facts that could have been
generated by a more complicated analysis of the original graph.
\end{itemize}

for a forward analysis/transform, if (rewrite P s) = Just s',
then (transfer P s $\sqsubseteq$ transfer P s').
For backward: if (rewrite Q s) = Just s', then (transfer Q s' $\sqsubseteq$ transfer Q s).
Works for liveness.
``It works for liveness, so it must be true'' (NR).
If this is true, it's worth a QuickCheck property!

be monotonic in~$f$. We think this is true of liveness, but we are not sure
whether it's just a generally good idea, or whether it's actually a
necessary precondition.

%%%% \simon{The rewrite functions must presumably satisfy
%%%% some monotonicity property. Something like: given a more informative
%%%% fact, the rewrite function will rewrite a node to a more informative graph
%%%% (in the fact lattice.).
%%%% \textbf{NR}: actually the only obligation of the rewrite function is
%%%% to preserve observable behavior. There's no requirement that it be
%%%% monotonic or indeed that it do anything useful. It just has to
%%%% preserve semantics (and be a pure function of course).
%%%% \textbf{SLPJ} In that case I think I could cook up a program that
%%%% would never reach a fixed point. Imagine a liveness analysis with a loop;
%%%% x is initially unused anywhere.
%%%% At some assignment node inside the loop, the rewriter behaves as follows:
%%%% if (and only if) x is dead downstream,
%%%% make it alive by rewriting the assignment to mention x.
%%%% Now in each successive iteration x will go live/dead/live/dead etc. I
%%%% maintain my claim that rewrite functions must satisfy some
%%%% monotonicity property.
%%%% \textbf{JD}: in the example you cite, monotonicity of facts at labels
%%%% means x cannot go live/dead/live/dead etc. The only way we can think
%%%% of not to terminate is infinite ``deep rewriting.''
%%%% }

2044 gives a client's-eye view of \hoopl, showing how to use

2045 it to create analyses and transformations.

2046 \hoopl's interface is simple, but

2049 do not describe their implementation. We have written

2050 at least three previous implementations, all of which

2051 were long and hard to understand, and only one of which

2052 provided compile-time guarantees about open and closed shapes.

2053 We are not confident that any of these implementations is correct.

2055 In this paper we describe our new implementation. It is short

2056 (about a third of the size of our last attempt) and elegant, and it offers

2057 strong static shape guarantees.

2061 analysis and transformation.

2062 The implementations of backward analysis and transformation are

2063 exactly analogous and are included in \hoopl.

2070 %% We on @analyzeAndRewriteFwd@, whose type is more general

2071 %% than that of @analyzeAndRewriteFwdBody@:

2072 %% \begin{smallcode}

2073 %% `analyzeAndRewriteFwd

2074 %% :: forall m n f e x. (FuelMonad m, NonLocal n)

2075 %% => FwdPass m n f -- lattice, transfers, rewrites

2076 %% -> MaybeC e [Label] -- entry points for a closed graph

2077 %% -> Graph n e x -- the original graph

2078 %% -> Fact e f -- fact(s) flowing into the entry/entries

2079 %% -> m (Graph n e x, FactBase f, MaybeO x f)

2080 %% \end{smallcode}

2081 %% We analyze graphs of all shapes; a single @FwdPass@ may be used with

2082 %% multiple shapes.

2083 %% If a graph is closed at the entry, a list of entry points must be

2084 %% provided;

2085 %% if the graph is open at the entry, it must be the case that the

2086 %% implicit entry point is the only entry point.

2087 %% The fact or set of facts flowing into the entries is similarly

2088 %% determined by the shape of the entry point.

2089 %% Finally, the result is a rewritten graph, a @FactBase@ that gives a

2090 %% fixed point of the analysis on the rewritten graph, and if the graph

2091 %% is open at the exit, an ``exit fact'' flowing out.

2093 Instead of the interface function @analyzeAndRewriteFwdBody@, we present

2094 the internal function @arfGraph@ (short for ``analyze and rewrite

2095 forward graph''):

2097 `arfGraph

2098 :: forall m n f e x. (FuelMonad m, NonLocal n)

2099 => FwdPass m n f -- lattice, transfers, rewrites

2101 -> Graph n e x -- the original graph

2102 -> Fact e f -- fact(s) flowing into the entry/entries

2103 -> m (DG f n e x, Fact x f)

2105 We analyze graphs of all shapes; a single @FwdPass@ may be used with

2106 multiple shapes.

2107 If a graph is closed at the entry, a list of entry points must be

2108 provided;

2109 if the graph is open at the entry, it must be the case that the

2110 implicit entry point is the only entry point.

2111 The fact or set of facts flowing into the entries is similarly

2112 determined by the shape of the entry point.

2113 Finally, the result is a ``decorated graph''

2114 (@DG f n e x@),

2115 and if the graph

2116 is open at the exit, an ``exit fact'' flowing out.

2118 A~``decorated graph'' is one in which each block is decorated with the

2119 fact that holds at the start of the block.

2120 @DG@ actually shares a representation with @Graph@,

2121 which is possible because the definition of

2123 synonym for an underlying type @`Graph'@, which takes the type of block

2124 as an additional parameter.

2126 higher-order function that takes a block-concatenation function as a

2127 parameter.)

2128 The truth about @Graph@ and @DG@ is as follows:

2130 type Graph = Graph' Block

2131 type DG f n e x = Graph' (DBlock f) n e x

2132 data DBlock f n e x = DBlock f (Block n e x)

2134 Type~@DG@ is internal to \hoopl; it is not seen by any client.

2135 Extracting a @Graph@ and a @FactBase@ for @analyzeAndRewriteFwdBody@

2136 requires a 12-line function.

2137 %%

2138 %% We use it, not @Graph@, for two reasons:

2139 %% \begin{itemize}

2140 %% \item The client is often interested not only in the facts flowing

2141 %% out of the graph (which are returned in the @Fact x f@),

2142 %% but also in the facts on the \emph{internal} blocks

2143 %% of the graph. A~replacement graph of type @(DG n e x)@ is decorated with

2144 %% these internal facts.

2145 %% \item A @Graph@ has deliberately restrictive invariants; for example,

2146 %% a @GMany@ with a @JustO@ is always open at exit (\figref{graph}). It turns

2147 %% out to be awkward to maintain these invariants \emph{during} rewriting,

2148 %% but easy to restore them \emph{after} rewriting by ``normalizing'' an @DG@.

2149 %% \end{itemize}

2150 %% The information in an @DG@ is returned to the client by

2151 %% the normalization function @normalizeBody@, which

2152 %% splits an @DG@ into a @Body@ and its corresponding @FactBase@:

2153 %% \begin{code}

2154 %% `normalizeBody :: NonLocal n => DG n f C C

2155 %% -> (Body n, FactBase f)

2156 %%

2158 Function @arfGraph@ is implemented using four inner functions:

2159 one each for nodes, blocks, graph bodies, and full graphs.

2160 These functions help us separate concerns: for example, only the

2161 analysis of a node has to worry about rewrite functions;

2162 and only the analysis of a body has to worry about fixed points.

2163 All four functions have access to the @FwdPass@ and any entry points

2164 that are passed to @arfGraph@.

2165 These functions also have access to type variables bound by

2166 @arfGraph@:

2167 @n@~is the type of nodes; @f@~is the type of facts;

2168 @m@~is the monad used in the rewriting functions of the @FwdPass@;

2169 and

2170 @e@~and~@x@ give the shape of the graph passed to @arfGraph@.

2171 The types of the inner functions are

2173 node :: forall e x . (ShapeLifter e x)

2174 => n e x -> f -> m (DG f n e x, Fact x f)

2175 block :: forall e x .

2176 Block n e x -> f -> m (DG f n e x, Fact x f)

2178 -> Fact C f -> m (DG f n C C, Fact C f)

2179 graph :: Graph n e x -> Fact e f -> m (DG f n e x, Fact x f)

2181 Each inner function works the same way: it takes a ``thing'' and

2183 An~extended fact transformer takes an input fact or facts suitable to

2184 the ``thing'' and it returns an output fact or facts.

2185 It also returns a decorated graph representing the (possibly rewritten) ``thing.''

2187 Finally, because rewriting may consume fuel or require a fresh name,

2188 every extended fact transformer is monadic.

2190 The types of the four extended fact transformers are not quite

2191 identical:

2193 \item

2194 Extended fact transformers for nodes and blocks have the same type;

2195 they expect a fact~@f@ rather than the more general @Fact e f@

2196 required for a graph.

2197 We~made a similar design choice in the interface for transfer

2199 exactly one fact flowing into the entry, it is easiest simply to pass

2200 that fact.

2201 \item

2202 Extended fact transformers for graphs have the most general type,

2203 which is expressed using the type family @Fact@:

2204 if the graph is open at the entry, its fact transformer expects a

2205 single fact;

2206 if the graph is closed at the entry, its fact transformer expects a

2207 @FactBase@.

2208 \item

2209 Extended fact transformers for bodies have the same type as an

2210 extended fact transformer for a closed/closed graph.

2214 Function @arfGraph@ and its four inner functions comprise a cycle of

2215 mutual recursion:

2216 @arfGraph@ calls @graph@;

2217 @graph@ calls @body@ and @block@;

2218 @body@ calls @block@;

2219 @block@ calls @node@;

2220 and

2221 @node@ calls @arfGraph@.

2222 These five functions do three different kinds of work:

2223 compose extended fact transformers, analyze and rewrite nodes, and compute

2224 fixed points.

2229 Extended fact transformers compose nicely.

2230 The composition function, @cat@, arranges for the functional parts to

2231 compose and for the decorated graphs to be spliceable.

2232 It~has a very general type:

2234 cat :: forall m e a x info info' info''. Monad m

2235 => (info -> m (DG f n e a, info'))

2236 -> (info' -> m (DG f n a x, info''))

2237 -> (info -> m (DG f n e x, info''))

2239 The name @cat@ comes from the concatenation of the decorated graphs,

2240 but it is also appropriate because the style in which it is used is

2241 reminiscent of @concatMap@.

2242 The inner @block@ function exemplifies this style:

2244 block :: forall e x .

2245 Block n e x -> f -> m (DG f n e x, Fact x f)

2246 block (BFirst n) = node n

2247 block (BMiddle n) = node n

2248 block (BLast n) = node n

2249 block (BCat b1 b2) = block b1 `cat` block b2

2251 The implementation of @graph@ is similar but has many more cases.

2253 Function @cat@ threads facts through

2254 monadic fact transformers and accumulates decorated graphs:

2256 `cat ft1 ft2 f = do { (g1,f1) <- ft1 f

2257 ; (g2,f2) <- ft2 f1

2258 ; return (g1 `dgCat` g2, f2) }

2260 (Function @dgCat@ is the same splicing function used for an ordinary

2261 @Graph@, but it takes a one-line block-concatenation function suitable

2262 for @DBlock@s.)

2266 %%%%

2267 %%%% \begin{itemize}

2268 %%%% \item

2269 %%%% The @arfNode@ function processes nodes (\secref{arf-node}).

2270 %%%% It handles the subtleties of interleaved analysis and rewriting,

2271 %%%% and it deals with fuel consumption. It calls @arfGraph@ to analyze

2272 %%%% and transform rewritten graphs.

2273 %%%% \item

2274 %%%% Based on @arfNode@ it is extremely easy to write @arfBlock@, which lifts

2275 %%%% the analysis and rewriting from nodes to blocks (\secref{arf-block}).

2276 %%%%

2277 %%%% %%% \ifpagetuning\penalty -10000 \fi

2278 %%%%

2279 %%%%

2280 %%%% \item

2281 %%%% Using @arfBlock@ we define @arfBody@, which analyzes and rewrites a

2282 %%%% @Body@: a~group of closed/closed blocks linked by arbitrary

2283 %%%% control flow.

2284 %%%% Because a @Body@ is

2285 %%%% always closed/closed and does not take shape parameters, function

2286 %%%% @arfBody@ is less polymorphic than the others; its type is what

2287 %%%% would be obtained by expanding and specializing the definition of

2288 %%%% @ARF@ for a @thing@ which is always closed/closed and is equivalent to

2289 %%%% a @Body@.

2290 %%%%

2291 %%%% Function @arfBody@ takes care of fixed points (\secref{arf-body}).

2292 %%%% \item

2293 %%%% Based on @arfBody@ it is easy to write @arfGraph@ (\secref{arf-graph}).

2294 %%%% \end{itemize}

2295 %%%% Given these functions, writing the main analyzer is a simple

2296 %%%% matter of matching the external API to the internal functions:

2297 %%%% \begin{code}

2298 %%%% `analyzeAndRewriteFwdBody

2299 %%%% :: forall n f. NonLocal n

2300 %%%% => FwdPass n f -> Body n -> FactBase f

2301 %%%% -> FuelMonad (Body n, FactBase f)

2302 %%%%

2303 %%%% analyzeAndRewriteFwdBody pass ^body facts

2304 %%%% = do { (^rg, _) <- arfBody pass body facts

2305 %%%% ; return (normalizeBody rg) }

2306 %%%% \end{code}

2307 %%%%

2308 %%%% \subsection{From nodes to blocks} \seclabel{arf-block}

2309 %%%% \seclabel{arf-graph}

2310 %%%%

2311 %%%% We begin our explanation with the second task:

2312 %%%% writing @arfBlock@, which analyzes and transforms blocks.

2313 %%%% \begin{code}

2314 %%%% `arfBlock :: NonLocal n => ARF (Block n) n

2315 %%%% arfBlock pass (BUnit node) f

2316 %%%% = arfNode pass node f

2317 %%%% arfBlock pass (BCat b1 b2) f

2318 %%%% = do { (g1,f1) <- arfBlock pass b1 f

2319 %%%% ; (g2,f2) <- arfBlock pass b2 f1

2320 %%%% ; return (g1 `DGCatO` g2, f2) }

2321 %%%% \end{code}

2322 %%%% The code is delightfully simple.

2323 %%%% The @BUnit@ case is implemented by @arfNode@.

2324 %%%% The @BCat@ case is implemented by recursively applying @arfBlock@ to the two

2325 %%%% sub-blocks, threading the output fact from the first as the

2326 %%%% input to the second.

2327 %%%% Each recursive call produces a rewritten graph;

2328 %%%% we concatenate them with @DGCatO@.

2329 %%%%

2330 %%%% Function @arfGraph@ is equally straightforward:

2331 %%%% XXXXXXXXXXXXXXX

2332 %%%% The pattern is the same as for @arfBlock@: thread

2333 %%%% facts through the sequence, and concatenate the results.

2334 %%%% Because the constructors of type~@DG@ are more polymorphic than those

2335 %%%% of @Graph@, type~@DG@ can represent

2336 %%%% graphs more simply than @Graph@; for example, each element of a

2337 %%%% @GMany@ becomes a single @DG@ object, and these @DG@ objects are then

2338 %%%% concatenated to form a single result of type~@DG@.

2339 %%%%

2340 %%%% %% \ifpagetuning\penalty -10000 \fi

2344 Although interleaving analysis with transformation is tricky, we have

2345 isolated the algorithm in the @node@ function:

2347 node :: forall e x . (ShapeLifter e x)

2348 => n e x -> f -> m (DG f n e x, Fact x f)

2349 node n f

2350 = do { rew <- frewrite pass n f >>= withFuel

2351 ; case rew of

2352 Nothing -> return (toDg f (toBlock n),

2353 ftransfer pass n f)

2354 Just (FwdRew g rw) ->

2356 f' = fwdEntryFact n f

2357 in arfGraph pass' (maybeEntry n) g f' }

2359 This code is more complicated

2360 but still brief.

2361 %

2362 Functions @`frewrite@, @`ftransfer@, @`toBlock@,

2363 @`maybeEntry@, and @`fwdEntryFact@ are overloaded based on the

2364 node's shape, via the (private) @`ShapeLifter@ class.

2365 We begin by using @frewrite@ to

2366 extract the

2367 rewriting function from the @FwdPass@,

2368 and we apply it to the node~@n@

2369 and the incoming fact~@f@.

2371 The resulting monadic @Maybe@ is passed to @withFuel@, which

2372 accounts for fuel:

2374 `withFuel :: FuelMonad m => Maybe a -> m (Maybe a)

2378 Otherwise, @withFuel@ consumes one unit of fuel and returns its

2379 % defn Fuel

2380 argument (which will be a @Just@). That is all we need say about fuel.
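Given that description, a plausible implementation of @withFuel@ in terms of the @FuelMonad@ methods @getFuel@ and @setFuel@ is only a few lines (a sketch, not necessarily \hoopl's exact code):
\begin{code}
-- Sketch: pass Nothing through unchanged; otherwise succeed
-- only if fuel remains, consuming one unit.
withFuel :: FuelMonad m => Maybe a -> m (Maybe a)
withFuel Nothing  = return Nothing
withFuel (Just a) = do { f <- getFuel
                       ; if f == 0 then return Nothing
                         else do { setFuel (f-1); return (Just a) } }
\end{code}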

2382 In the @Nothing@ case, no rewrite takes place: either fuel is

2383 exhausted, or the rewrite function

2384 declined to rewrite the node.

2385 We~return the single-node

2386 graph @(toDg f (toBlock n))@, decorated with its incoming fact.

2387 We~also apply the transfer function @(ftransfer pass)@

2388 to the incoming fact to produce the outgoing fact.

2390 In the @Just@ case, we receive a replacement

2391 graph~@g@ and a new rewrite function~@rw@.

2392 We~recursively analyze @g@ with @arfGraph@.

2393 This analysis uses @pass'@, which contains the original lattice and transfer

2394 function from @pass@, together with the new rewrite function~@rw@.

2395 Function @fwdEntryFact@ converts fact~@f@ from the type~@f@,

2396 which @node@ expects, to the type @Fact e f@, which @arfGraph@ expects.

2398 And that's it! If~the client wanted deep rewriting, it is

2399 implemented by the call to @arfGraph@;

2400 if the client wanted

2401 shallow rewriting, the rewrite function will have returned

2402 @noFwdRw@ as~@rw@, which is implanted in @pass'@

2407 Lastly, the inner @body@ function iterates to a fixed point.

2408 This part of the implementation is the only really tricky part, and it is

2409 cleanly separated from everything else:

2412 -> Fact C f -> m (DG f n C C, Fact C f)

2413 body entries blocks init_fbase

2414 = fixpoint True lattice do_block init_fbase $

2415 forwardBlockList entries blocks

2416 where

2417 lattice = fp_lattice pass

2418 do_block b fb = do (g, fb) <- block b entryFact

2419 return (g, mapToList fb)

2420 where entryFact = getFact lattice (entryLabel b) fb

2422 Function @forwardBlockList@ takes a list of possible entry points and

2423 a finite map from labels to blocks.

2424 It returns a list of

2425 blocks, sorted into an order that makes forward dataflow efficient:

2427 `forwardBlockList

2428 :: NonLocal n

2431 For

2432 example, if the entry point is at~@L2@, and the block at~@L2@

2433 branches to~@L1@, but not vice versa, then \hoopl\ will reach a fixed point

2434 more quickly if we process @L2@ before~@L1@.

2435 To~find an efficient order, @forwardBlockList@ uses

2436 the methods of the @NonLocal@ class---@entryLabel@ and @successors@---to

2437 perform a reverse postorder depth-first traversal of the control-flow graph.

2438 %%

2439 %%The @NonLocal@ type-class constraint on~@n@ propagates to all the

2440 %%@`arfThing@ functions.

2441 %% paragraph carrying too much freight

2442 %%

2443 (The order of the blocks does not affect the fixed point or any other

2444 part of the answer; it affects only the number of iterations needed to

2445 reach the fixed point.)
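The traversal itself needs nothing beyond @entryLabel@ and @successors@. A simplified sketch follows; it uses a plain list as the visited set where the real code would use a @LabelSet@, and the name @rpo@ is ours, not \hoopl's:
\begin{code}
-- Simplified sketch of a reverse-postorder traversal.
-- Each block is consed onto the accumulator as its
-- depth-first call completes, so the accumulated list is
-- already in reverse postorder.
rpo :: NonLocal n => LabelMap (Block n C C) -> [Label] -> [Block n C C]
rpo blocks entries = snd (foldr (flip visit) ([], []) entries)
  where
    visit st@(seen, acc) lbl
      | lbl `elem` seen = st
      | otherwise =
          case mapLookup lbl blocks of
            Nothing -> st
            Just b  -> let (seen', acc') =
                             foldl visit (lbl:seen, acc) (successors b)
                       in (seen', b : acc')
\end{code}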

2447 The rest of the work is done by @fixpoint@, which is shared by

2448 both forward and backward analyses:

2450 `fixpoint :: forall m n f. (FuelMonad m, NonLocal n)

2451 => Bool -- going Forward?

2452 -> DataflowLattice f

2453 -> (Block n C C -> FactBase f -> m (DG f n C C, FactBase f))

2455 -> (FactBase f -> m (DG f n C C, FactBase f))

2457 Except for the mysterious @Bool@ passed as the first argument,

2458 the type signature tells the story.

2459 The third argument is an extended fact transformer for a single block;

2460 @fixpoint@ applies that function successively to all the blocks,

2461 which are passed as the fourth argument.

2462 The result is an extended fact transformer for the blocks.

2463 This extended fact transformer maintains a

2464 ``Current @FactBase@''

2465 which grows monotonically:

2466 its initial value is the input @FactBase@, and it is

2467 augmented with the new facts that flow

2468 out of each @Block@ as it is analyzed.

2469 The @fixpoint@ function

2470 iterates over the list of blocks until the Current @FactBase@ reaches

2471 a fixed point.

2473 The code for @fixpoint@ is a massive 70 lines long;

2474 for completeness, it

2476 The~code is mostly straightforward, although we try to be clever

2477 about deciding when a new fact means that another iteration over the

2478 blocks will be required.

2480 There is one more subtle point worth mentioning, which we highlight by

2481 considering a forward analysis of this graph, where execution starts at~@L1@:

2483 L1: x:=3; goto L4

2484 L2: x:=4; goto L4

2485 L4: if x>3 goto L2 else goto L5

2487 Block @L2@ is unreachable.

2488 But if we \naively\ process all the blocks (say in

2489 order @L1@, @L4@, @L2@), then we will start with the bottom fact for @L2@, propagate

2492 Given @x=@$\top$, the

2493 conditional in @L4@ cannot be rewritten, and @L2@~seems reachable. We have

2494 lost a good optimization.

2496 Our implementation solves this problem through a trick that is

2497 safe only for a forward analysis;

2498 @fixpoint@ analyzes a block only if the block is

2499 reachable from an entry point.

2500 This trick is not safe for a backward analysis, which

2501 is why

2502 @fixpoint@ takes a @Bool@ as its first argument:

2503 it must know if the analysis goes forward.

2505 Although the trick can be implemented in just a couple of lines of

2506 code, the reasoning behind it is quite subtle---exactly the sort of

2507 thing that should be implemented once in \hoopl, so clients don't have

2508 to worry about it.
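Concretely, the trick can be pictured as a guard of roughly the following shape (a hypothetical sketch, not \hoopl's actual code): in a forward analysis, a block whose label has no entry in the current @FactBase@ has not yet been reached from any entry point, so it may be skipped.
\begin{code}
-- Sketch: in a forward analysis, a block with no incoming
-- fact in the current FactBase is unreachable so far and
-- need not be analyzed.
skipUnreachable :: NonLocal n
                => Bool          -- going forward?
                -> Block n C C -> FactBase f -> Bool
skipUnreachable is_fwd b fbase =
  is_fwd && not (mapMember (entryLabel b) fbase)
\end{code}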

2512 While there is a vast body of literature on

2513 dataflow analysis and optimization,

2514 relatively little can be found on

2516 We therefore focus on the foundations of dataflow analysis

2517 and on the implementations of some comparable dataflow frameworks.

2521 When transfer functions are monotone and lattices are finite in height,

2522 iterative dataflow analysis converges to a fixed point

2524 If~the lattice's join operation distributes over transfer

2525 functions,

2526 this fixed point is equivalent to a join-over-all-paths solution to

2527 the recursive dataflow equations

2529 {Kildall uses meets, not joins.

2530 Lattice orientation is conventional, and conventions have changed.

2531 We use Dana Scott's

2532 orientation, in which higher elements carry more information.}

2534 monotone functions.

2535 Each~client of \hoopl\ must guarantee monotonicity.

2537 \ifcutting

2539 \else

2541 \fi

2542 introduce abstract interpretation as a technique for developing

2543 lattices for program analysis.

2545 an all-paths dataflow problem can be viewed as model checking an

2546 abstract interpretation.

2549 presents many examples of both particular analyses and related

2550 algorithms.

2553 The soundness of interleaving analysis and transformation,

2554 even when not all speculative transformations are performed on later

2555 iterations, was shown by

2560 Most dataflow frameworks support only analysis, not transformation.

2561 The framework computes a fixed point of transfer functions, and it is

2562 up to the client of

2563 the framework to use that fixed point for transformation.

2564 Omitting transformation makes it much easier to build frameworks,

2565 and one can find a spectrum of designs.

2566 We~describe two representative

2567 designs, then move on to the prior frameworks that support interleaved

2568 analysis and transformation.

2571 for same reason as Soot below ---JD}

2572 provides an analysis-only framework for C~programs.

2573 The framework is limited to one representation of control-flow graphs

2574 and one representation of instructions, both of which are provided by

2575 the framework.

2576 The~API is complicated;

2577 much of the complexity is needed to enable the client to

2578 affect which instructions

2579 the analysis iterates over.

2582 The Soot framework is designed for analysis of Java programs

2584 best for Soot in general, but there doesn't appear

2585 to be any formal publication that actually details the dataflow

2586 framework part. ---JD}

2587 While Soot's dataflow library supports only analysis, not

2588 transformation, we found much

2589 to admire in its design.

2590 Soot's library is abstracted over the representation of

2591 the control-flow graph and the representation of instructions.

2592 Soot's interface for defining lattice and analysis functions is

2593 like our own,

2594 although because Soot is implemented in an imperative style,

2595 additional functions are needed to copy lattice elements.

2596 Like CIL, Soot provides only analysis, not transformation.

2600 but I'll be darned if I can find anything that might be termed a dataflow framework.}

2602 The Whirlwind compiler contains the dataflow framework implemented

2604 interleave analysis and transformation.

2605 Their implementation is much like our early efforts:

2606 it is a complicated mix of code that simultaneously manages interleaving,

2607 deep rewriting, and fixed-point computation.

2608 By~separating these tasks,

2609 our implementation simplifies the problem dramatically.

2610 Whirlwind's implementation also suffers from the difficulty of

2611 maintaining pointer invariants in a mutable representation of

2612 control-flow graphs, a problem we have discussed elsewhere

2615 Because speculative transformation is difficult in an imperative setting,

2616 Whirlwind's implementation is split into two phases.

2617 The first phase runs the interleaved analyses and transformations

2618 to compute the final dataflow facts and a representation of the transformations

2619 that should be applied to the input graph.

2620 The second phase executes the transformations.

2621 In~\hoopl, because control-flow graphs are immutable, speculative transformations

2622 can be applied immediately, and there is no need

2623 for a phase distinction.

2625 %%% % repetitious...

2626 %%%

2627 %%% \ourlib\ also improves upon Whirlwind's dataflow framework by providing

2628 %%% new support for the optimization writer:

2629 %%% \begin{itemize}

2630 %%% \item Using static type guarantees, \hoopl\ rules out a whole

2631 %%% class of possible bugs: transformations that produced malformed

2632 %%% control-flow graphs.

2633 %%% \item Using dynamic testing,

2634 %%% we can isolate the rewrite that transforms a working program

2635 %%% into a faulty program,

2636 %%% using Whalley's \citeyearpar{whalley:isolation} fault-isolation technique.

2637 %%% \end{itemize}

2640 described a zipper-based representation of control-flow

2641 graphs, stressing the advantages

2642 of immutability.

2645 \item

2646 We can concatenate nodes, blocks, and graphs in constant time.

2647 %Previously, we had to resort to Hughes's

2648 %\citeyearpar{hughes:lists-representation:article} technique, representing

2649 %a graph as a function.

2650 \item

2651 We can do a backward analysis without having

2652 to ``unzip'' (and allocate a copy of) each block.

2653 \item

2654 Using GADTs, we can represent a flow-graph

2655 node using a single type, instead of the triple of first, middle, and

2656 last types used in our earlier representation.

2657 This change simplifies the interface significantly:

2658 instead of providing three transfer functions and three rewrite

2659 functions per pass---one for

2660 each type of node---a client of \hoopl\ provides only one transfer

2661 function and one rewrite function per pass.

2662 \item

2663 Errors in concatenation are ruled out at

2664 compile time by Haskell's static

2665 type system.

2666 In~earlier implementations, such errors were not detected until

2667 the compiler~ran, at which point we tried to compensate

2668 for the errors---but

2669 the compensation code harbored subtle faults,

2670 which we discovered while developing a new back end

2671 for the Glasgow Haskell Compiler.

2674 The implementation of \ourlib\ is also much better than

2675 our earlier implementations.

2676 Not only is the code simpler conceptually,

2677 but it is also shorter:

2678 our new implementation is about a third as long

2679 as the previous version, which is part of GHC, version 6.12.

2686 (We have also implemented a ``fold style,'' but because the types are

2687 a little less intuitive, we are sticking with @concatMap@ style for now.)

2690 > Some numbers, I have used it nine times, and would need the general fold

2691 > once to define blockToNodeList (or CB* equivalent suggested by you).

2692 > (We are using it in GHC to

2693 > - computing hash of the blocks from the nodes

2694 > - finding the last node of a block

2695 > - converting block to the old representation (2x)

2696 > - computing interference graph

2697 > - counting Area used by a block (2x)

2698 > - counting stack high-water mark for a block

2699 > - prettyprinting block)

2702 type-parameter hell, newtype hell, typechecking hell, instance hell,

2703 triple hell

2707 We have spent six years implementing and reimplementing frameworks for

2708 dataflow analysis and transformation.

2709 This formidable design problem taught us

2710 two kinds of lessons:

2711 we learned some very specific lessons about representations and

2712 algorithms for optimizing compilers,

2713 and we were forcibly reminded of some very general, old lessons that are well

2714 known not just to functional programmers, but to programmers

2715 everywhere.

2719 %% \remark{Orphaned: but for transfer functions that

2720 %% approximate weakest preconditions or strongest postconditions,

2721 %% monotonicity falls out naturally.}

2722 %%

2723 %%

2724 %% In conclusion we offer the following lessons from the experience of designing

2725 %% and implementing \ourlib{}.

2726 %% \begin{itemize}

2727 %% \item

2728 %% Although we have not stressed this point, there is a close connection

2729 %% between dataflow analyses and program logic:

2730 %% \begin{itemize}

2731 %% \item

2732 %% A forward dataflow analysis is represented by a predicate transformer

2733 %% that is related to \emph{strongest postconditions}

2734 %% \cite{floyd:meaning}.\footnote

2735 %% {In Floyd's paper the output of the predicate transformer is called

2736 %% the \emph{strongest verifiable consequent}, not the ``strongest

2737 %% postcondition.''}

2738 %% \item

2739 %% A backward dataflow analysis is represented by a predicate transformer

2740 %% that is related to \emph{weakest preconditions} \cite{dijkstra:discipline}.

2741 %% \end{itemize}

2742 %% Logicians write down the predicate transformers for the primitive

2743 %% program fragments, and then use compositional rules to ``lift'' them

2744 %% to a logic for whole programs. In the same way \ourlib{} lets the client

2745 %% write simple predicate transformers,

2746 %% and local rewrites based on those assertions, and ``lifts'' them to entire

2747 %% function bodies with arbitrary control flow.

2749 Our main goal for \hoopl\ was to combine three good ideas (interleaved

2750 analysis and transformation, optimization fuel, and an applicative

2751 control-flow graph) in a way that could easily be reused by many, many

2752 compiler writers.

2753 Reuse requires abstraction, and as is well known,

2754 designing good abstractions is challenging.

2755 \hoopl's data types and the functions over those types have been

2757 As~we were refining our design, we~found it invaluable to operate in

2758 two modes:

2759 In the first mode, we designed, built, and used a framework as an

2761 In the second mode, we designed and built a standalone library, then

2762 redesigned and rebuilt it, sometimes going through several significant

2763 changes in a week.

2764 Operating in the first mode---inside a live compiler---forced us to

2765 make sure that no corners were cut, that we were solving a real

2766 problem, and that we did not inadvertently

2767 cripple some other part of the compiler.

2768 Operating in the second mode---as a standalone library---enabled us to

2769 iterate furiously, trying out many more ideas than would have

2770 been possible in the first mode.

2771 We~have learned that alternating between these two modes leads to a

2772 better design than operating in either mode alone.

2774 We were forcibly reminded of timeless truths:

2775 that interfaces are more important than implementations, and that data

2776 is more important than code.

2777 These truths are reflected in this paper, in which

2778 we

2781 We were also reminded that Haskell's type system (polymorphism, GADTs,

2782 higher-order functions, type classes, and so on) is a remarkably

2783 effective

2784 language for thinking about data and code---and that

2785 Haskell lacks a language of interfaces (like ML's signatures) that

2786 would make it equally effective for thinking about APIs at a larger scale.

2787 Still, as usual, the types were a remarkable aid to writing the code:

2788 when we finally agreed on the types presented above, the

2789 code almost wrote itself.

2791 Types are widely appreciated at ICFP, but here are three specific

2792 examples of how types helped us:

2794 \item

2795 Reuse is enabled by representation-independence, which in a functional

2796 language is

2797 expressed through parametric polymorphism.

2799 made the code simpler, easier to understand, and easier to maintain.

2801 \ourlib\ must know about nodes, and to embody that knowledge in the

2803 \item

2804 We are proud of using GADTs to

2805 track the open and closed shapes of nodes, blocks, and graphs at

2806 compile time.

2807 Shapes may

2808 seem like a small refinement, but they helped tremendously when

2809 building \hoopl, and we expect them to help clients.

2810 %

2811 % this paper is just not about run-time performance ---NR

2812 %

2813 %%%% Moreover, the implementation is faster than it would otherwise be,

2814 %%%% because, say, a @(Fact O f)e@ is known to be just an @f@ rather than

2815 %%%% being a sum type that must be tested (with a statically known outcome!).

2816 %

2818 to nodes, blocks, and graphs helped our

2819 thinking and helped to structure the implementation.

2820 \item

2822 types: first, middle, and last nodes.

2823 Those designs therefore required

2824 three transfer functions, three rewrite functions, and so~on.

2825 Moving to a single, ``shapely'' node type was a major breakthrough:

2826 not only do we have just one node type, but our client need supply only

2827 one transfer function and one rewrite function.

2829 the type-level function for

2831 and outgoing facts depend on the shape of a node.

2834 Dataflow optimization is usually described as a way to improve imperative

2835 programs by mutating control-flow graphs.

2836 Such transformations appear very different from the tree rewriting

2837 that functional languages are so well known for, and that makes

2838 functional languages so attractive for writing other parts of compilers.

2839 But even though dataflow optimization looks very different from

2840 what we are used to,

2841 writing a dataflow optimizer

2842 in a pure functional language was a huge win.

2843 % We could not possibly have conceived \ourlib{} in C++.

2844 In~a pure functional language, not only do we know that

2845 no data structure will be unexpectedly mutated,

2846 but we are forced to be

2847 explicit about every input and output,

2848 and we are encouraged to implement things compositionally.

2849 This kind of thinking has helped us make

2850 significant improvements to the already tricky work of Lerner, Grove,

2851 and Chambers:

2852 per-function control of shallow vs.\ deep rewriting

2857 We~trust that these improvements are right only because they are

2858 implemented in separate

2859 parts of the code that cannot interact except through

2860 explicit function calls.

2861 %%

2862 %%An ancestor of \ourlib{} is in the Glasgow Haskell Compiler today,

2863 %%in version~6.12.

2864 With this new, improved design in hand, we are now moving back to

2865 live-compiler mode, pushing \hoopl\ into version

2866 6.13 of the Glasgow Haskell Compiler.

2869 \acks

2871 The first and second authors were funded

2872 by a grant from Intel Corporation and

2874 These authors also thank Microsoft Research Ltd, UK, for funding

2875 extended visits to the third author.

2878 \makeatother

2885 \clearpage

2887 \appendix

2889 % omit LabelSet :: *

2890 % omit LabelMap :: *

2891 % omit delFromFactBase :: FactBase f -> [(Label,f)] -> FactBase f

2892 % omit elemFactBase :: Label -> FactBase f -> Bool

2893 % omit elemLabelSet :: Label -> LabelSet -> Bool

2894 % omit emptyLabelSet :: LabelSet

2895 % omit factBaseLabels :: FactBase f -> [Label]

2896 % omit extendFactBase :: FactBase f -> Label -> f -> FactBase f

2897 % omit extendLabelSet :: LabelSet -> Label -> LabelSet

2898 % omit getFuel :: FuelMonad Fuel

2899 % omit setFuel :: Fuel -> FuelMonad ()

2900 % omit lookupFact :: FactBase f -> Label -> Maybe f

2901 % omit factBaseList :: FactBase f -> [(Label, f)]

2908 data `TxFactBase n f

2909 = `TxFB { `tfb_fbase :: FactBase f

2910 , `tfb_rg :: DG n f C C -- Transformed blocks

2911 , `tfb_cha :: ChangeFlag

2912 , `tfb_lbls :: LabelSet }

2913 -- Set the tfb_cha flag iff

2914 -- (a) the fact in tfb_fbase for a block L changes, and

2915 -- (b) L is in tfb_lbls.

2916 -- The tfb_lbls are all Labels of the *original*

2917 -- (not transformed) blocks

2919 `updateFact :: DataflowLattice f -> LabelSet -> (Label, f)

2920 -> (ChangeFlag, FactBase f)

2921 -> (ChangeFlag, FactBase f)

2922 updateFact ^lat ^lbls (lbl, ^new_fact) (^cha, fbase)

2923 | NoChange <- ^cha2 = (cha, fbase)

2924 | lbl `elemLabelSet` lbls = (SomeChange, new_fbase)

2925 | otherwise = (cha, new_fbase)

2926 where

2927 (cha2, ^res_fact)

2928 = case lookupFact fbase lbl of

2929 Nothing -> (SomeChange, new_fact)

2930 Just ^old_fact -> fact_extend lat old_fact new_fact

2931 ^new_fbase = extendFactBase fbase lbl res_fact

2933 fixpoint :: forall n f. NonLocal n

2934 => Bool -- Going forwards?

2935 -> DataflowLattice f

2936 -> (Block n C C -> FactBase f

2937 -> FuelMonad (DG n f C C, FactBase f))

2938 -> FactBase f -> [(Label, Block n C C)]

2939 -> FuelMonad (DG n f C C, FactBase f)

2940 fixpoint ^is_fwd lat ^do_block ^init_fbase ^blocks

2941 = do { ^fuel <- getFuel

2942 ; ^tx_fb <- loop fuel init_fbase

2943 ; return (tfb_rg tx_fb,

2944 tfb_fbase tx_fb `delFromFactBase` blocks) }

2945 -- The outgoing FactBase contains facts only for

2946 -- Labels *not* in the blocks of the graph

2947 where

2948 `tx_blocks :: [(Label, Block n C C)]

2949 -> TxFactBase n f -> FuelMonad (TxFactBase n f)

2950 tx_blocks [] tx_fb = return tx_fb

2951 tx_blocks ((lbl,blk):bs) tx_fb = tx_block lbl blk tx_fb

2952 >>= tx_blocks bs

2954 `tx_block :: Label -> Block n C C

2955 -> TxFactBase n f -> FuelMonad (TxFactBase n f)

2956 tx_block ^lbl ^blk tx_fb@(TxFB { tfb_fbase = fbase

2957 , tfb_lbls = lbls

2958 , tfb_rg = ^blks

2959 , tfb_cha = cha })

2960 | is_fwd && not (lbl `elemFactBase` fbase)

2961 = return tx_fb -- unreachable block

2962 | otherwise

2963 = do { (rg, ^out_facts) <- do_block blk fbase

2964 ; let (^cha', ^fbase')

2965 = foldr (updateFact lat lbls) (cha,fbase)

2966 (factBaseList out_facts)

2967 ; return (TxFB { tfb_lbls = extendLabelSet lbls lbl

2968 , tfb_rg = rg `DGCatC` blks

2969 , tfb_fbase = fbase'

2970 , tfb_cha = cha' }) }

2972 loop :: Fuel -> FactBase f -> FuelMonad (TxFactBase n f)

2973 `loop fuel fbase

2974 = do { let ^init_tx_fb = TxFB { tfb_fbase = fbase

2975 , tfb_cha = NoChange

2976 , tfb_rg = DGNil

2977 , tfb_lbls = emptyLabelSet}

2978 ; tx_fb <- tx_blocks blocks init_tx_fb

2979 ; case tfb_cha tx_fb of

2980 NoChange -> return tx_fb

2981 SomeChange -> setFuel fuel >>

2982 loop fuel (tfb_fbase tx_fb) }

2984 \par

2990 This appendix lists every nontrivial identifier used in the body of

2991 the paper.

2992 For each identifier, we list the page on which that identifier is

2993 defined or discussed---or when appropriate, the figure (with line

2994 number where possible).

2995 For those few identifiers not defined or discussed in text, we give

2996 the type signature and the page on which the identifier is first

2997 referred to.

2999 Some identifiers used in the text are defined in the Haskell Prelude;

3000 for those readers less familiar with Haskell, these identifiers are

3007 \let\hsprelude\dropit

3016 \noindent

3019 \else

3021 \fi

3022 }

3024 \noindent

3027 \else

3029 \fi

3030 }

3034 \noindent

3037 \else

3038 \texttt{#2} {let}- or $\lambda$-bound on \lineref{#1} of Figure~\ref{#3} on page~\pageref{#3}.\\

3039 \fi

3040 }

3051 \else

3053 \fi

3054 }

3056 \newif\ifundefinedsection\undefinedsectionfalse

3059 \ifundefinedsection

3061 \else

3062 \undefinedsectiontrue

3063 \par

3066 \fi

3067 }

3069 \begingroup

3070 \raggedright

3075 \undefinedsectionfalse

3079 \ifundefinedsection

3081 \else

3082 \undefinedsectiontrue

3083 \par

3086 \fi

3087 }

3088 \let\hspagedef\dropit

3089 \let\omithspagedef\dropit

3090 \let\omithsfigdef\dropit

3091 \let\hsfigdef\dropit

3092 \let\hstabdef\dropit

3093 \let\hspagedefll\dropit

3094 \let\hsfigdefll\dropit

3095 \let\nothspagedef\dropit

3096 \let\nothsfigdef\dropit

3097 \let\nothslinedef\dropit

3104 \endgroup

3107 \iffalse

3127 \fi

3136 THE FUEL PROBLEM:

3139 Here is the problem:

3141 A graph has an entry sequence, a body, and an exit sequence.

3142 Correctly computing facts on and flowing out of the body requires

3143 iteration; computation on the entry and exit sequences does not, since

3144 each is connected to the body by exactly one flow edge.

3146 The problem is to provide the correct fuel supply to the combined

3147 analysis/rewrite (iterator) functions, so that speculative rewriting

3148 is limited by the fuel supply.

3150 I will number iterations from 1 and name the fuel supplies as

3151 follows:

3153 f_pre fuel remaining before analysis/rewriting starts

3154 f_0 fuel remaining after analysis/rewriting of the entry sequence

3155 f_i, i>0 fuel remaining after iteration i of the body

3156 f_post fuel remaining after analysis/rewriting of the exit sequence

3158 The issue here is that only the last iteration of the body 'counts'.

3159 To formalize, I will name fuel consumed:

3161 C_pre fuel consumed by speculative rewrites in entry sequence

3162 C_i fuel consumed by speculative rewrites in iteration i of body

3163 C_post fuel consumed by speculative rewrites in exit sequence

3165 These quantities should be related as follows:

3167 f_0 = f_pre - C_pre

3168 f_i = f_0 - C_i where i > 0

3169 f_post = f_n - C_post where iteration converges after n steps

3171 When the fuel supply is passed explicitly as parameter and result, it

3172 is fairly easy to see how to keep reusing f_0 at every iteration, then

3173 extract f_n for use before the exit sequence. It is not obvious to me

3174 how to do it cleanly using the fuel monad.
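
The explicit-parameter accounting can be sketched as a tiny pure function (illustrative names only, not Hoopl's fuel monad):

```haskell
type Fuel = Int

-- fuelLeft fPre cPre bodyCosts cPost computes f_post under the
-- discipline described above: the entry sequence consumes cPre from
-- fPre, giving f_0; every body iteration is restarted from f_0, so
-- only the final iteration's consumption (the last element of
-- bodyCosts) survives as f_n; the exit sequence then consumes cPost.
fuelLeft :: Fuel -> Fuel -> [Fuel] -> Fuel -> Fuel
fuelLeft fPre cPre bodyCosts cPost = fN - cPost
  where
    f0 = fPre - cPre                  -- f_0: after the entry sequence
    fN | null bodyCosts = f0          -- no body iterations needed
       | otherwise      = f0 - last bodyCosts  -- f_n: last pass only
```

This makes the reuse of f_0 explicit; the open question above is how to express the same restart discipline when the fuel supply is hidden inside the monad.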

3177 Norman