\iffalse

Say something about the proliferation of heavyweight type signatures
required for GADT pattern matches. When the signature is three times
the size of the function, something is wrong...


I was thinking again about unwrapped nodes, cons-lists, snoc-lists,
and tree fringe. I think there's an analogy with ordinary lists, and
the analogy is 'concatMap' vs 'fold'. Just as with lists, the common
case of 'concatMap (\a -> [a])' does a lot of unnecessary wrapping and
unwrapping of elements. We nevertheless prefer 'concatMap' because it
provides better separation of concerns. But this analogy suggests
several ideas:

- Can block processing be written in higher-order fashion?

- Can it be done in both styles (concatMap *and* fold)?

- Once we have framed the problem in these terms, can we write
fold-style code that is not too hard to understand and maintain?

- Finally, is there an analog of stream fusion that would work for
control-flow graphs, enabling us to write some graph combinators
that are both perspicuous and efficient?

These are good observations for the paper and for future work.


----------------------------------------------------------------


P.S. The three of us should have a nice little Skype chat about
higher-rank types. A lot of the questions still at issue boil down to
the following observations:

- It's really convenient for the *implementation* of Hoopl to put
forall to the left of an arrow. Polymorphic functions as
arguments are powerful and keen, and they make it really easy for
Hoopl to do its job.

- It's a bit inconvenient for a *client* of Hoopl to be forced to
supply functions that are polymorphic in shape. All constructors
have to be written out by hand; defaulting is not possible. (Or
at least John and I haven't figured out how to do it.)

- Using higher-rank types in Hoopl's interface makes some very
desirable things downright impossible:

- One can't write a generic dominator analysis with type

dom :: Edges n => FwdPass n Dominators

- One can't write a generic debugging wrapper of type

debug :: (Show n, Show f)
      => TraceFn -> WhatToTrace -> FwdPass n f -> FwdPass n f

- One can't write a combinator of type

product :: FwdPass n f -> FwdPass n f' -> FwdPass n (f, f')

I submit that these things are all very desirable.

I'm therefore strongly in favor of removing the *requirement* that a
FwdPass include transfer and rewrite functions that are polymorphic in
shape. It will be a bit tedious to return to triples of functions,
but for those lucky clients who *can* write polymorphic functions and
who wish to, we can provide injection functions.

I should stress that I believe the tedium can be contained within
reasonable bounds---for example, the arfXXX functions that are
internal to Hoopl all remain polymorphic in shape (although not
higher-rank any longer).

\fi


%
% TODO
%
%
% AGraph = graph under construction
% Graph = replacement graph
%
%%% Hoopl assumes that the replacement graph for a node open at the exit
%%% doesn't contain any additional exits
%%%
%%% introduce replacement graph in section 3, after graph.
%%% it has arbitrarily complicated internal control flow, but only
%%% one exit (if open at the exit)
%%%
%%% a requirement on the client that is not checked statically.


\input{dfoptdu.tex}

\newif\ifpagetuning \pagetuningtrue % adjust page breaks

\newif\ifnoauthornotes \noauthornotesfalse

\newif\iftimestamp\timestamptrue % show MD5 stamp of paper

% \timestampfalse % it's post-submission time

\IfFileExists{timestamp.tex}{}{\timestampfalse}

\newif\ifcutting \cuttingfalse % cutting down to submission size


\newif\ifgenkill\genkillfalse % have a section on gen and kill
\genkillfalse


\newif\ifnotesinmargin \notesinmarginfalse
\IfFileExists{notesinmargin.tex}{\notesinmargintrue}{\relax}

\documentclass[blockstyle,preprint,natbib,nocopyrightspace]{sigplanconf}

\newcommand\ourlib{Hoopl}
% higher-order optimization library
% ('Hoople' was taken -- see hoople.org)
\let\hoopl\ourlib

\newcommand\ag{\ensuremath{\mathit{ag}}}
\renewcommand\ag{\ensuremath{g}} % not really seeing that 'ag' is helpful here ---NR
\newcommand\rw{\ensuremath{\mathit{rw}}}

% l2h substitution ourlib Hoopl
% l2h substitution hoopl Hoopl

\newcommand\fs{\ensuremath{\mathit{fs}}} % dataflow facts, possibly plural

\newcommand\vfilbreak[1][\baselineskip]{%
\vskip 0pt plus #1 \penalty -200 \vskip 0pt plus -#1 }

\usepackage{alltt}
\usepackage{array}
\usepackage{afterpage}
\newcommand\lbr{\char`\{}
\newcommand\rbr{\char`\}}

\clubpenalty=10000
\widowpenalty=10000

\usepackage{verbatim} % allows to define \begin{smallcode}
\newenvironment{smallcode}{\par\unskip\small\verbatim}{\endverbatim}
\newenvironment{fuzzcode}[1]{\par\unskip\hfuzz=#1 \verbatim}{\endverbatim}
\newenvironment{smallfuzzcode}[1]{\par\unskip\small\hfuzz=#1 \verbatim}{\endverbatim}

\newcommand\lineref[1]{line~\ref{line:#1}}
\newcommand\linepairref[2]{lines \ref{line:#1}~and~\ref{line:#2}}
\newcommand\linerangeref[2]{\mbox{lines~\ref{line:#1}--\ref{line:#2}}}
\newcommand\Lineref[1]{Line~\ref{line:#1}}
\newcommand\Linepairref[2]{Lines \ref{line:#1}~and~\ref{line:#2}}
\newcommand\Linerangeref[2]{\mbox{Lines~\ref{line:#1}--\ref{line:#2}}}

\makeatletter

\let\c@table=\c@figure % one counter for tables and figures, please

\newcommand\setlabel[1]{%
\setlabel@#1!!\@endsetlabel
}
\def\setlabel@#1!#2!#3\@endsetlabel{%
\ifx*#1*% line begins with label or is empty
\ifx*#2*% line is empty
\verbatim@line{}%
\else
\@stripbangs#3\@endsetlabel%
\label{line:#2}%
\fi
\else
\@stripbangs#1!#2!#3\@endsetlabel%
\fi
}
\def\@stripbangs#1!!\@endsetlabel{%
\verbatim@line{#1}%
}


\verbatim@line{hello mama}

\newcommand{\numberedcodebackspace}{0.5\baselineskip}

\newcounter{codeline}
\newenvironment{numberedcode}
{\endgraf
\def\verbatim@processline{%
\noindent
\expandafter\ifx\expandafter+\the\verbatim@line+ % blank line
%{\small\textit{\def\rmdefault{cmr}\rmfamily\phantom{00}\phantom{: \,}}}%
\else
\refstepcounter{codeline}%
{\small\textit{\def\rmdefault{cmr}\rmfamily\phantom{00}\llap{\arabic{codeline}}: \,}}%
\fi
\expandafter\setlabel\expandafter{\the\verbatim@line}%
\expandafter\ifx\expandafter+\the\verbatim@line+ % blank line
\vspace*{-\numberedcodebackspace}\par%
\else
\the\verbatim@line\par
\fi}%
\verbatim
}
{\endverbatim}

\makeatother

\newcommand\arrow{\rightarrow}

\newcommand\join{\sqcup}
\newcommand\slotof[1]{\ensuremath{s_{#1}}}
\newcommand\tempof[1]{\ensuremath{t_{#1}}}
\let\tempOf=\tempof
\let\slotOf=\slotof

\makeatletter
\newcommand{\nrmono}[1]{%
{\@tempdima = \fontdimen2\font\relax
\texttt{\spaceskip = 1.1\@tempdima #1}}}
\makeatother

\usepackage{times} % denser fonts
\renewcommand{\ttdefault}{aett} % \texttt that goes better with times fonts
\usepackage{enumerate}
\usepackage{url}
\usepackage{graphicx}
\usepackage{natbib} % redundant for Simon
\bibpunct();A{},
\let\cite\citep
\let\citeyearnopar=\citeyear
\let\citeyear=\citeyearpar

\usepackage[ps2pdf,bookmarksopen,breaklinks,pdftitle=dataflow-made-simple]{hyperref}
\usepackage{breakurl}  % enables \burl

\newcommand\naive{na\"\i ve}
\newcommand\naively{na\"\i vely}
\newcommand\Naive{Na\"\i ve}

\usepackage{amsfonts}
\newcommand\naturals{\ensuremath{\mathbb{N}}}
\newcommand\true{\ensuremath{\mathbf{true}}}
\newcommand\implies{\supseteq} % could use \Rightarrow?

\newcommand\PAL{\mbox{C{\texttt{-{}-}}}}
\newcommand\high[1]{\mbox{\fboxsep=1pt \smash{\fbox{\vrule height 6pt
depth 0pt width 0pt \leavevmode \kern 1pt #1}}}}

\usepackage{tabularx}

%%
%% 2009/05/10: removed 'float' package because it breaks multiple
%% \caption's per {figure} environment. ---NR
%%
%% % Put figures in boxes --- WHY??? --NR
%% \usepackage{float}
%% \floatstyle{boxed}
%% \restylefloat{figure}
%% \restylefloat{table}



% ON LINE THREE, set \noauthornotestrue to suppress notes (or not)

%\newcommand{\qed}{QED}
\ifnotesinmargin
\long\def\authornote#1{%
\ifvmode
\marginpar{\raggedright\hbadness=10000
\parindent=8pt \parskip=2pt
\def\baselinestretch{0.8}\tiny
\itshape\noindent #1\par}%
\else
\unskip\raisebox{-3.5pt}{\rlap{$\scriptstyle\diamond$}}%
\marginpar{\raggedright\hbadness=10000
\parindent=8pt \parskip=2pt
\def\baselinestretch{0.8}\tiny
\itshape\noindent #1\par}%
\fi}
\else
% Simon: please set \notesinmarginfalse on the first line
\long\def\authornote#1{{\em #1\/}}
\fi
\ifnoauthornotes
\def\authornote#1{\unskip\relax}
\fi

\newcommand{\simon}[1]{\authornote{SLPJ: #1}}
\newcommand{\norman}[1]{\authornote{NR: #1}}
\let\remark\norman
\def\finalremark#1{\relax}
% \let \finalremark \remark % uncomment after submission
\newcommand{\john}[1]{\authornote{JD: #1}}
\newcommand{\todo}[1]{\textbf{To~do:} \emph{#1}}
\newcommand\delendum[1]{\relax\ifvmode\else\unskip\fi\relax}

\newcommand\secref[1]{Section~\ref{sec:#1}}
\newcommand\secreftwo[2]{Sections \ref{sec:#1}~and~\ref{sec:#2}}
\newcommand\seclabel[1]{\label{sec:#1}}

\newcommand\figref[1]{Figure~\ref{fig:#1}}
\newcommand\figreftwo[2]{Figures \ref{fig:#1}~and~\ref{fig:#2}}
\newcommand\figlabel[1]{\label{fig:#1}}

\newcommand\tabref[1]{Table~\ref{tab:#1}}
\newcommand\tablabel[1]{\label{tab:#1}}


\newcommand{\CPS}{\textbf{StkMan}} % Not sure what to call it.


\usepackage{code} % At-sign notation

\iftimestamp
\input{timestamp}
\preprintfooter{\mdfivestamp}
\fi

\hyphenation{there-by}

\renewcommand{\floatpagefraction}{0.9} % must be less than \topfraction
\renewcommand{\topfraction}{0.95}
\renewcommand{\textfraction}{0.05}

\begin{document}
%\title{\ourlib: Dataflow Optimization Made Simple}
\title{\ourlib: A Modular, Reusable Library for\\ Dataflow Analysis and Transformation}
%\title{Implementing Dataflow Analysis and Optimization by Lifting Node Functions to Basic Blocks and Control-Flow Graphs}
\subtitle{Programming pearl}

\titlebanner{\textsf{\mdseries\itshape
This paper is a revised version of a submission to ICFP'10. Comments
are welcome; please identify this version by the words
\textbf{\mdfivewords}.
}}

\ifnoauthornotes
\makeatletter
\let\HyPsd@Warning=\@gobble
\makeatother
\fi

% João


\authorinfo{Norman Ramsey}{Tufts University}{nr@cs.tufts.edu}
\authorinfo{Jo\~ao Dias}{Tufts University}{dias@cs.tufts.edu}
\authorinfo{Simon Peyton Jones}{Microsoft Research}{simonpj@microsoft.com}


\maketitle

\begin{abstract}
\iffalse % A vote for Simon's abstract
\remark{I have replaced a good abstract of the POPL submission with a
bad abstract of \emph{this} submission.}
We present \ourlib, a Haskell library that makes it easy for a
compiler writer
to implement program transformations based on dataflow analyses.
A~client of \ourlib\ defines a representation of
logical assertions,
a transfer function that computes outgoing assertions from incoming
assertions,
and a rewrite function that improves code when improvements are
justified by the assertions.
\ourlib\ does the actual analysis and transformation.

\ourlib\ implements state-of-the art algorithms:
Lerner, Grove, and Chambers's
\citeyearpar{lerner-grove-chambers:2002}
composition of simple analyses and transformations, which achieves
the same precision as complex, handwritten
``super-analyses;''
and Whalley's \citeyearpar{whalley:isolation} dynamic technique for
isolating bugs in a client's code.
\ourlib's implementation is unique in that unlike previous
implementations,
it carefully separates the tricky
elements of each of these algorithms, so that they can be examined and
understood independently.


\simon{Here is an alternative abstract based on the four-sentence model.}
\remark{Four-sentence model? You must teach me\ldots}
\fi
Dataflow analysis and transformation of control-flow graphs is
pervasive in optimizing compilers, but it is typically tightly
interwoven with the details of a \emph{particular} compiler.
We~describe \ourlib{}, a reusable Haskell library that makes it
unusually easy to define new analyses and
transformations for \emph{any} compiler.
\ourlib's interface is modular and polymorphic,
and it offers unusually strong static guarantees.
The implementation
is also far from routine: it encapsulates
state-of-the-art algorithms (interleaved analysis and rewriting,
dynamic error isolation), and it cleanly separates their tricky elements
so that they can be understood independently.
%
%\ourlib\ will be the workhorse of a new
%back end for the Glasgow Haskell Compiler (version~6.14, forthcoming).

%% \emph{Reviewers:} code examples are indexed at {\small\url{http://bit.ly/jkr3K}}
%%% Source: http://www.cs.tufts.edu/~nr/drop/popl-index.pdf
\end{abstract}

\makeatactive % Enable @foo@ notation

\section{Introduction}

A mature optimizing compiler for an imperative language includes many
analyses, the results of which justify the optimizer's
code-improving transformations.
Many of the most important analyses and transformations---constant
propagation, live-variable analysis, inlining, sinking of loads,
and so on---should be regarded as particular cases of
a single general problem: \emph{dataflow analysis and optimization}.
%% \remark{I do not feel compelled to cite Muchnick (or anyone else) here}
Dataflow analysis is over thirty years old,
but a recent, seminal paper by \citet{lerner-grove-chambers:2002} goes further,
describing a powerful but subtle way to
\emph{interleave} analysis and transformation so that each
piggybacks on the other.

Because optimizations based on dataflow analysis
share a common intellectual framework, and because that framework is
subtle, it is tempting to
try to build a single reusable library that embodies the
subtle ideas, while
making it easy for clients to instantiate the library for different
situations.
Tempting, but difficult.
Although some such frameworks exist, as we discuss
in \secref{related}, they have complex APIs and implementations,
and none implements the Lerner/Grove/Chambers technique.

In this paper we present \ourlib{} (short for ``higher-order
optimization library''), a new Haskell library for dataflow analysis and
optimization. It has the following distinctive characteristics:

\begin{itemize}
\item
\ourlib\ is purely functional.
Perhaps surprisingly, code that
manipulates control-flow graphs is easier to write, and far easier
to write correctly, when written in a purely functional style
\cite{ramsey-dias:applicative-flow-graph}.
When analysis and rewriting
are interleaved, so that rewriting must be done \emph{speculatively},
without knowing whether
the result of the rewrite will be retained or discarded,
the benefit of a purely functional style is intensified
(Sections \ref{sec:overview} and \ref{sec:fixpoints}).

\item
\ourlib\ is polymorphic. Just as a list library is
polymorphic in the list elements, so is \ourlib{} polymorphic, both in
the nodes that inhabit graphs, and in the dataflow facts that
analyses compute over these graphs (\secref{using-hoopl}).

\item The paper by Lerner, Grove, and Chambers is inspiring but abstract.
We articulate their ideas in a concrete but simple API that hides
a subtle implementation (Sections \ref{sec:graph-rep} and \ref{sec:using-hoopl}).
You provide a representation for assertions,
a transfer function that transforms assertions across a node,
and a rewrite function that uses an assertion to
justify rewriting a node.
\ourlib\ ``lifts'' these node-level functions to work over
control-flow graphs, sets up and solves recursion equations,
and interleaves rewriting with analysis.
Designing good abstractions (data types, APIs) is surprisingly
hard; we have been through over a dozen significantly different
iterations, and we offer our API as a contribution.
% ``in its own right'' -- there's an echo in here...

\item
Because the client
can perform very local reasoning (``@y@ is live before
@x:=y+2@''),%
\footnote
{Using \hoopl, it is not necessary to have the more complex rule
``if @x@~is live after @x:=y+2@ then @y@ is live before~it,''
because if @x@~is \emph{not} live after @x:=y+2@, the assignment
@x:=y+2@ will be eliminated.}
analyses and transformations built on \ourlib\
are small, simple, and easy to get right.
Moreover, \ourlib\ helps you write correct optimizations:
it~statically rules out transformations that violate invariants
of the control-flow graph (Sections \ref{sec:graph-rep} and \ref{sec:rewrites}),
and dynamically it can help find the first transformation that introduces a fault
in a test program (\secref{fuel}).
\finalremark{SLPJ: I wanted to write more about open/closed,
but I like this sentence with its claim to both static and dynamic assistance,
and maybe the open/closed story is hard to understand here.}

% \item \ourlib{} makes use of GADTS and type functions to offer unusually
% strong static guarantees. In particular, nodes, basic blocks, and
% graphs are all statically typed by their open or closedness on entry, and
% their open or closedness on exit (\secref{graph-rep}). For example, an add instruction is
% open on entry and exit, while a branch is open on entry and closed on exit.
% Using these types we can statically guarantee that, say, an add instruction
% is rewritten to a graph that is also open on both entry and exit; and
% that the user cannot construct a block where an add instruction follows an
% unconditional branch. We know of no other system that offers
% static guarantees this strong.

\item \ourlib{} implements subtle algorithms, including
(a)~interleaved analysis and rewriting, (b)~speculative rewriting,
(c)~computing fixed points, and (d)~dynamic fault isolation.
Previous implementations of these algorithms---including three of our
own---are complicated and hard to understand, because the tricky pieces
are implemented all together, inseparably.
A~significant contribution of this paper is
a new way to structure the implementation so that each tricky piece
is handled in just
one place, separate from all the others (\secref{engine}).
%\remark{This is a very important claim---is
%it substantiated in \secref{engine}? And shouldn't the word
%\textbf{contribution} appear in this~\P?}
% \simon{Better?} % yes thanks ---NR
The result is sufficiently elegant that
we emphasize the implementation as an object of interest in
its own right.
\end{itemize}
A working prototype of \ourlib{} is available from
\burl{http://ghc.cs.tufts.edu/hoopl} and also from Hackage.
It is no toy: an ancestor of this library is
part of the Glasgow Haskell Compiler, where it optimizes the
imperative {\PAL} code in GHC's back end. The new design is far
nicer, and it will be in GHC shortly.

The API for \ourlib{} seems quite natural, but it requires
relatively sophisticated aspects of Haskell's type system, such
as higher-rank polymorphism, GADTs, and type functions.
As such, \ourlib{} offers a compelling case study in the utility
of these features.


\section{Dataflow analysis {\&} transformation by \texorpdfstring{\rlap{example}}{example}}
\seclabel{overview}
\seclabel{constant-propagation}
\seclabel{example:transforms}
\seclabel{example:xforms}

We begin by setting the scene, introducing some vocabulary, and
showing a small motivating example.
A control-flow graph, perhaps representing the body of a procedure,
is a collection of \emph{basic blocks}---or just ``blocks''.
Each block is a sequence of instructions,
beginning with a label and ending with a
control-transfer instruction that branches to other blocks.
% Each block has a label at the beginning,
% a sequence of
% -- each of which has a label at the
% beginning. Each block may branch to other blocks with arbitrarily
% complex control flow.
The goal of dataflow optimization is to compute valid
\emph{assertions} (or \emph{dataflow facts}),
then use those assertions to justify code-improving
transformations (or \emph{rewrites}) on a \emph{control-flow graph}.

Consider a concrete example: constant propagation with constant folding.
On the left we have a basic block; in the middle we have
facts that hold between statements (or \emph{nodes})
in the block; and at
the right we have the result of transforming the block
based on the assertions:
\begin{verbatim}
Before              Facts              After
---------------------{}--------------------
x := 3+4                               x := 7
-------------------{x=7}--------------------
z := x>5                               z := True
---------------{x=7, z=True}----------------
if z                                   goto L1
  then goto L1
  else goto L2
\end{verbatim}
Constant propagation works
from top to bottom. We start with the empty fact.
Given the empty fact and the node @x:=3+4@, can we make a (constant-folding)
transformation?
Yes! We can replace the node with @x:=7@.
Now, given this transformed node,
and the original fact, what fact flows out of the bottom of
the transformed node?
The~fact \{@x=7@\}.
Given the fact \{@x=7@\} and the node @z:=x>5@, can we make a
transformation? Yes: constant propagation can replace the node with @z:=7>5@.
Now, can we do another transformation? Yes: constant folding can
replace the node with @z:=True@.
And so the process continues to the end of the block, where we
can replace the conditional branch with an unconditional one, @goto L1@.

The example above is simple because the program has only straightline code;
when programs have loops, dataflow analysis gets more complicated.
For example,
consider the following graph,
where we assume @L1@ is the entry point:
\begin{verbatim}
L1: x=3; y=4; if z then goto L2 else goto L3
L2: x=7; goto L3
L3: ...
\end{verbatim}
Because control flows to @L3@ from two places,
we must \emph{join} the facts coming from those two places.
All paths to @L3@ produce the fact @y=@$4$,
so we can conclude that this fact holds at @L3@.
But depending on the path to @L3@, @x@ may have different
values, so we conclude ``@x=@$\top$'',
meaning that there is no single value held by @x@ at @L3@.%
\footnote{
In this example @x@ really does vary at @L3@, but in general
the analysis might be conservative.}
The final result of joining the dataflow facts that flow to @L3@
is the new fact $@x=@\top \land @y=4@ \land @z=@\top$.

\seclabel{const-prop-example}

\seclabel{simple-tx}
\paragraph{Interleaved transformation and analysis.}
Our example \emph{interleaves} transformation and analysis.
Interleaving makes it far easier to write effective analyses.
If, instead, we \emph{first} analyzed the block
and \emph{then} transformed it, the analysis would have to ``predict''
the transformations.
For example, given the incoming fact \{@x=7@\}
and the instruction @z:=x>5@,
a pure analysis could produce the outgoing fact
\{@x=7@, @z=True@\} by simplifying @x>5@ to @True@.
But the subsequent transformation must perform
\emph{exactly the same simplification} when it transforms the instruction to @z:=True@!
If instead we \emph{first} rewrite the node to @z:=True@,
and \emph{then} apply the transfer function to the new node,
the transfer function becomes laughably simple: it merely has to see if the
right hand side is a constant (you can see actual code in \secref{const-prop-client}).
The gain is even more compelling if there are a number of interacting
analyses and/or transformations; for more substantial
examples, consult \citet{lerner-grove-chambers:2002}.

\paragraph{Forwards and backwards.}
Constant propagation works \emph{forwards}, and a fact is typically an
assertion about the program state (such as ``variable~@x@ holds value~@7@'').
Some useful analyses work \emph{backwards}.
A~prime example is live-variable analysis, where a fact takes the~form
``variable @x@ is live'' and is an assertion about the
\emph{continuation} of a program point. For example, the fact ``@x@~is
live'' at a program point P is an assertion that @x@ is used on some program
path starting at \mbox{P}. % TeXbook, exercise 12.6
The accompanying transformation is called dead-code elimination;
if @x@~is not live, this transformation
replaces the node @x:=e@ with a no-op.

% ----------------------------------------
\section{Representing control-flow graphs} \seclabel{graph-rep}

\ourlib{} is a library that makes it easy to define dataflow analyses,
and transformations driven by these analyses, on control-flow graphs.
Graphs are composed from smaller units, which we discuss from the
bottom up:\finalremark{something about replacement graphs?}
\begin{itemize}
\item A \emph{node} is defined by \ourlib's client;
\ourlib{} knows nothing about the representation of nodes (\secref{nodes}).
\item A basic \emph{block} is a sequence of nodes (\secref{blocks}).
\item A \emph{graph} is an arbitrarily complicated control-flow graph,
composed from basic blocks (\secref{graphs}).
\end{itemize}

\subsection{Shapes: Open and closed}

Nodes, blocks, and graphs have important properties in common.
In particular, each can be \emph{open or closed at entry}
and \emph{open or closed at exit}.
An \emph{open} point is one at which control may implicitly ``fall through;''
to transfer control at a \emph{closed} point requires an explicit
control-transfer instruction to a named label.
For example,
\begin{itemize}
\item A shift-left instruction is open on entry (because control can fall into it
from the preceding instruction), and open on exit (because control falls through
to the next instruction).
\item An unconditional branch is open on entry, but closed on exit (because
control cannot fall through to the next instruction).
\item A label is closed on entry (because in \ourlib{} we do not allow
control to fall through into a branch target), but open on exit.
\end{itemize}
% This taxonomy enables \ourlib{} to enforce invariants:
% only nodes closed at entry can be the targets of branches, and only nodes closed
% at exits can transfer control (see also \secref{edges}).
% As~a consequence, all control transfers originate at control-transfer
% instructions and terminated at labels; this invariant dramatically
% simplifies analysis and transformation.
These examples concern nodes, but the same classification applies
to blocks and graphs. For example, the block
\begin{code}
x:=7; y:=x+2; goto L
\end{code}
is open on entry and closed on exit.
This is the block's \emph{shape}, which we often abbreviate
``open/closed;''
we may refer to an ``open/closed block.''

The shape of a thing determines that thing's control-flow properties.
In particular, whenever E is a node, block, or graph,
% : \simon{Removed the claim about a unique entry point.}
\begin{itemize}
\item
If E is open at the entry, it has a unique predecessor;
if it is closed, it may have arbitrarily many predecessors---or none.
\item
If E is open at the exit, it has a unique successor;
if it is closed, it may have arbitrarily many successors---or none.
\end{itemize}
%%%%
%%%%
%%%% % \item Regardless of whether E is open or closed,
%%%% % it has a unique entry point where execution begins.
%%%% \item If E is closed at exit, control leaves \emph{only}
%%%% by explicit branches from closed-on-exit nodes.
%%%% \item If E is open at exit, control \emph{may} leave E
%%%% by ``falling through'' from a distinguished exit point.
%%%% \remark{If E is a node or block, control \emph{only} leaves E by
%%%% falling through, but this isn't so for a graph. Example: a body of a
%%%% loop contains a \texttt{break} statement} \simon{I don't understand.
%%%% A break statement would have to be translated as a branch, no?
%%%% Can you give a an example? I claim that control only leaves an
%%%% open graph by falling through.}
%%%% \end{itemize}


\subsection{Nodes} \seclabel{nodes}

The primitive constituents of a \ourlib{} control-flow graph are
\emph{nodes}, which are defined by the client.
Typically, a node might represent a machine instruction, such as an
assignment, a call, or a conditional branch.
But \ourlib{}'s graph representation is polymorphic in the node type,
so each client can define nodes as it likes.
Because they contain nodes defined by the client,
graphs can include arbitrary client-specified data, including
(say) C~statements, method calls in an object-oriented language, or
whatever.


\begin{figure}
\begin{fuzzcode}{0.98pt}
data `Node e x where
  Label       :: Label ->                  Node C O
  `Assign     :: Var -> Expr ->            Node O O
  `Store      :: Expr -> Expr ->           Node O O
  `Branch     :: Label ->                  Node O C
  `CondBranch :: Expr -> Label -> Label -> Node O C
  -- ... more constructors ...
\end{fuzzcode}
\caption{A typical node type as it might be defined by a client}
\figlabel{cmm-node}
\end{figure}

\ourlib{} knows \emph{at compile time} whether a node is open or
closed at entry and exit:
the type of a node has kind @*->*->*@, where the two type parameters
are type-level flags, one for entry and one for exit.
Such a type parameter may be instantiated only with type @O@~(for
open) or type~@C@ (for closed).
As an example,
\figref{cmm-node} shows a typical node type as it might be written by
one of \ourlib's {clients}.
The type parameters are written @e@ and @x@, for
entry and exit respectively.
The type is a generalized algebraic data type;
the syntax gives the type of each constructor.
%%% \cite{peyton-jones:unification-based-gadts}.
For example, constructor @Label@
takes a @Label@ and returns a node of type @Node C O@, where
the~``@C@'' says ``closed at entry'' and the~``@O@'' says ``open at exit''.
The types @Label@, @O@, and~@C@ are
defined by \ourlib{} (\figref{graph}).

Similarly, an @Assign@ node takes a variable and an expression, and
returns a @Node@ open at both entry and exit; the @Store@ node is
similar. The types @`Var@ and @`Expr@ are private to the client, and
\ourlib{} knows nothing of them.
Finally, the control-transfer nodes @Branch@ and @CondBranch@ are open at entry
and closed at exit.

Nodes closed on entry are the only targets of control transfers;
nodes open on entry and exit never perform control transfers;
and nodes closed on exit always perform control transfers\footnote{%
To obey these invariants,
a node for
a conditional-branch instruction, which typically either transfers control
\emph{or} falls through, must be represented as a two-target
conditional branch, with the fall-through path in a separate block.
This representation is standard \cite{appel:modern},
and it costs nothing in practice:
such code is easily sequentialized without superfluous branches.
%a late-stage code-layout pass can readily reconstruct efficient code.
}.
Because of the position each type of node occupies in a
basic block,
we~often call them \emph{first}, \emph{middle}, and \emph{last} nodes
respectively.

\subsection{Blocks} \seclabel{blocks}

\begin{figure}
\begin{fuzzcode}{0.98pt}
data `O -- Open
data `C -- Closed

data `Block n e x where
  `BFirst  :: n C O -> Block n C O
  `BMiddle :: n O O -> Block n O O
  `BLast   :: n O C -> Block n O C
  `BCat    :: Block n e O -> Block n O x -> Block n e x

data `Graph n e x where
  `GNil  :: Graph n O O
  `GUnit :: Block n O O -> Graph n O O
  `GMany :: MaybeO e (Block n O C)
         -> LabelMap (Block n C C)
         -> MaybeO x (Block n C O)
         -> Graph n e x

data `MaybeO ^ex t where
  `JustO    :: t -> MaybeO O t
  `NothingO :: MaybeO C t

newtype `Label = Label Int

class `Edges n where
  `entryLabel :: n C x -> Label
  `successors :: n e C -> [Label]
\end{fuzzcode}
\caption{The block and graph types defined by \ourlib}
\figlabel{graph} \figlabel{edges}
\end{figure}

\ourlib\ combines the client's nodes into
blocks and graphs, which, unlike the nodes, are defined by \ourlib{}
(\figref{graph}).
A~@Block@ is parameterized over the node type~@n@
as well as over the same flag types that make it open or closed at
entry and exit.

The @BFirst@, @BMiddle@, and @BLast@ constructors create one-node
blocks.
Each of these constructors is polymorphic in the node's \emph{representation}
but monomorphic in its \emph{shape}.
An~earlier representation of blocks used a single constructor
of type \mbox{@n e x -> Block n e x@}, polymorphic in a node's representation \emph{and}
shape.
The representation of blocks in \figref{graph} has more constructors, but it
uses the power of GADTs to ensure that the shape of every node is
known statically.
This property makes no difference to clients, but it significantly
simplifies the implementation of analysis and transformation in
\secref{monomorphic-shape-outcome}.

The @BCat@ constructor concatenates blocks in sequence.
It~makes sense to concatenate blocks only when control can fall
through from the first to the second; therefore,
two blocks may be concatenated {only} if each block is open at
the point of concatenation.
This restriction is enforced by the type of @BCat@, whose first
argument must be open on exit, and whose second argument must be open on entry.
It~is statically impossible, for example, to concatenate a @Branch@
immediately before an
@Assign@.
Indeed, the @Block@ type statically guarantees that any
closed/closed @Block@---which compiler writers normally
call a ``basic block''---consists of exactly one closed/open node
(such as @Label@ in \figref{cmm-node}),
followed by zero or more open/open nodes (@Assign@ or @Store@),
and terminated with exactly one
open/closed node (@Branch@ or @CondBranch@).
Using GADTs to enforce these invariants is one of
\ourlib{}'s innovations.
% Also notice that the definition of @Block@ guarantees that every @Block@
% has at least one node.
% SPJ: I think we've agreed to stick with PNil.
% \remark{So what? Who cares? Why is it important
% that no block is empty? (And if we had an empty block, we could ditch
% @PNil@ from @PGraph@, which woudl be an improvement.)}
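
For example, using the node type of \figref{cmm-node}, a client can
assemble a complete basic block from three one-node blocks, as in the
following sketch (ours, not part of \ourlib's API); it assumes
client-private definitions of @Var@ and @Expr@ in which a variable is
a string and @Lit@ builds a literal expression:
\begin{code}
exampleBlock :: Label -> Label -> Block Node C C
exampleBlock entry target =
  BFirst (Label entry)
  `BCat` BMiddle (Assign "x" (Lit 7))
  `BCat` BLast (Branch target)
\end{code}
The GADT typing does the bookkeeping: the first @BCat@ yields a
closed/open block, and the second yields the closed/closed result.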

\subsection{Graphs} \seclabel{graphs}

\ourlib{} composes blocks into graphs, which are also defined
in \figref{graph}.
Like @Block@, the data type @Graph@ is parameterized over
both nodes @n@ and its open/closed shape (@e@ and @x@).
It has three constructors. The first two
deal with the base cases of open/open graphs:
an empty graph is represented by @GNil@ while a single-block graph
is represented by @GUnit@.

More general graphs are represented by @GMany@, which has three
fields: an optional entry sequence, a body, and an optional exit
sequence.
\begin{itemize}
\item
If the graph is open at the entry, it contains an entry sequence of type
@Block n O C@.
We could represent this sequence as a value of type
@Maybe (Block n O C)@, but we can do better:
a~value of @Maybe@ type requires a \emph{dynamic} test,
but we know \emph{statically}, at compile time, that the sequence is present if and only
if the graph is open at the entry.
We~express our compile-time knowledge by using the type
@MaybeO e (Block n O C)@, a type-indexed version of @Maybe@
which is also defined in \figref{graph}:
the type @MaybeO O a@ is isomorphic to~@a@, while
the type @MaybeO C a@ is isomorphic to~@()@.
\item
The body of the graph is a collection of closed/closed blocks.
To~facilitate traversal of the graph, we represent a body as a finite
map from label to block.
\item
The exit sequence is dual to the entry sequence, and like the entry
sequence, its presence or absence is deducible from the static type of the graph.
\end{itemize}
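For example, a single-block closed/closed graph is built with @GMany@
by placing the block in the body and using @NothingO@ for the absent
entry and exit sequences. In this sketch of ours, @addBlock@ inserts a
block into a @LabelMap@ indexed by the block's entry label, and
@mapEmpty@ is an assumed name for the empty map:
\begin{code}
gUnitCC :: Edges n => Block n C C -> Graph n C C
gUnitCC b = GMany NothingO (addBlock b mapEmpty) NothingO
\end{code}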

We can splice graphs together nicely; the cost is logarithmic in the
number of closed/closed blocks.
Unlike blocks, two graphs may be spliced together
not only when they are both open
at the splice point but also
when they are both closed---and not in the other two cases:
\begin{smallfuzzcode}{10.8pt}
`gSplice :: Graph n e a -> Graph n a x -> Graph n e x
gSplice GNil g2 = g2
gSplice g1 GNil = g1

gSplice (GUnit ^b1) (GUnit ^b2) = GUnit (b1 `BCat` b2)

gSplice (GUnit b) (GMany (JustO e) bs x)
  = GMany (JustO (b `BCat` e)) bs x

gSplice (GMany e ^bs (JustO x)) (GUnit b2)
  = GMany e bs (JustO (x `BCat` b2))

gSplice (GMany e1 ^bs1 (JustO x1)) (GMany (JustO e2) ^bs2 x2)
  = GMany e1 (bs1 `mapUnion` (b `addBlock` bs2)) x2
  where b = x1 `BCat` e2

gSplice (GMany e1 bs1 NothingO) (GMany NothingO bs2 x2)
  = GMany e1 (bs1 `mapUnion` bs2) x2
\end{smallfuzzcode}
This definition illustrates the power of GADTs: the
pattern matching is exhaustive, and all the open/closed invariants are
statically checked. For example, consider the second-last equation for @gSplice@.
Since the exit link of the first argument is @JustO x1@,
we know that type parameter~@a@ is~@O@, and hence the entry link of the second
argument must be @JustO e2@.
Moreover, block~@x1@ must be
closed/open, and block~@e2@ must be open/closed.
We can therefore concatenate them
with @BCat@ to produce a closed/closed block, which is
added to the body of the result.

We~have carefully crafted the types so that if @BCat@ and @BodyCat@
are considered as associative operators,
every graph has a unique representation.\finalremark{So what? (Doumont)
Need some words about how this is part of making the API simple for
the client---I've added something to the end of the paragraph, but I'm
not particularly thrilled.}
%% \simon{Well, you were the one who was so keen on a unique representation!
%% And since we have one, I think tis useful to say so. Lastly, the representation of
%% singleton blocks is not entirely obvious.}
%%%%
%%%% An empty open/open graph is represented
%%%% by @GNil@, while a closed/closed one is @gNilCC@:
%%%% \par {\small
%%%% \begin{code}
%%%% gNilCC :: Graph C C
%%%% gNilCC = GMany NothingO BodyEmpty NothingO
%%%% \end{code}}
%%%% The representation of a @Graph@ consisting of a single block~@b@
%%%% depends on the shape of~@b@:\remark{Does anyone care?}
%%%% \par{\small
%%%% \begin{code}
%%%% gUnitOO :: Block O O -> Graph O O
%%%% gUnitOC :: Block O C -> Graph O C
%%%% gUnitCO :: Block O C -> Graph C O
%%%% gUnitCC :: Block O C -> Graph C C
%%%% gUnitOO b = GUnit b
%%%% gUnitOC b = GMany (JustO b) BodyEmpty NothingO
%%%% gUnitCO b = GMany NothingO BodyEmpty (JustO b)
%%%% gUnitCC b = GMany NothingO (BodyUnit b) NothingO
%%%% \end{code}}
%%%% Multi-block graphs are similar.
%%%% From these definitions
To guarantee uniqueness, @GUnit@ is restricted to open/open
blocks.
If~@GUnit@ were more polymorphic, there would be
more than one way to represent some graphs, and it wouldn't be obvious
to a client which representation to choose---or if the choice made a difference.


\subsection{Labels and successors} \seclabel{edges}

If \ourlib{} knows nothing about nodes, how can it know where
a control transfer goes, or what is the @Label@ at the start of a block?
To~answer such questions,
the standard Haskell idiom is to define a type class whose methods
provide exactly the operations needed;
\ourlib's type class, called @Edges@, is given in \figref{edges}.
The @entryLabel@ method takes a first node (one closed on entry, \secref{nodes})
and returns its @Label@;
the @successors@ method takes a last node (closed on exit) and returns
the @Label@s to
which it can transfer control.
A~middle node, which is open at both entry and exit, cannot refer to
any @Label@s,
so no corresponding interrogation function is needed.

A node type defined by a client must be an instance of @Edges@.
In~\figref{cmm-node},
the client's instance declaration for @Node@ would be
\begin{code}
instance Edges Node where
  entryLabel (Label l) = l
  successors (Branch b) = [b]
  successors (CondBranch e b1 b2) = [b1, b2]
\end{code}
Again, the pattern matching for both functions is exhaustive, and
the compiler statically checks this fact. Here, @entryLabel@
cannot be applied to an @Assign@ or @Branch@ node,
and any attempt to define a case for @Assign@ or @Branch@ would result
in a type error.

While the client must provide this information about nodes, it is
very convenient for \ourlib\ to have the same information about
blocks.
For its own internal use,
\ourlib{} provides this instance declaration for the @Block@ type:
\begin{code}
instance Edges n => Edges (Block n) where
  entryLabel (BFirst n) = entryLabel n
  entryLabel (BCat b _) = entryLabel b
  successors (BLast n)  = successors n
  successors (BCat _ b) = successors b
\end{code}
Because the functions @entryLabel@ and @successors@ are used to track control
flow \emph{within} a graph, \ourlib\ does not need to ask for the
entry label or successors of a @Graph@ itself.
Indeed, @Graph@ \emph{cannot} be an instance of @Edges@, because even
if a @Graph@ is closed at the entry, it does not have a unique entry label.
%
% A slight infelicity is that we cannot make @Graph@ an instance of @Edges@,
% because a graph closed on entry has no unique label. Fortunately, we never
% need @Graph@s to be in @Edges@.
%
% ``Never apologize. Never confess to infelicity.'' ---SLPJ




\begin{table}
\centerline{%
\begin{tabular}{@{}>{\raggedright\arraybackslash}p{1.03in}>{\scshape}c>{\scshape}c>{\raggedright\arraybackslash}p{1.29in}@{}}
&\multicolumn1{r}{\llap{\emph{Specified}}\hspace*{-0.3em}}&
\multicolumn1{l}{\hspace*{-0.4em}\rlap{\emph{Implemented}}}&\\
\multicolumn1{c}{\emph{Part of optimizer}}
&\multicolumn1{c}{\emph{by}}&
\multicolumn1{c}{\emph{by}}&
\multicolumn1{c}{\emph{How many}}%
\\[5pt]
Control-flow graphs& Us & Us & One \\
Nodes in a control-flow graph & You & You & One type per intermediate language \\[3pt]
Dataflow fact~$F$ & You & You & One type per logic \\
Lattice operations & Us & You & One set per logic \\[3pt]
Transfer functions & Us & You & One per analysis \\
Rewrite functions & Us & You & One per \rlap{transformation} \\[3pt]
Solve-and-rewrite functions & Us & Us & Two (forward, backward) \\
\end{tabular}%
}
\caption{Parts of an optimizer built with \ourlib}
\tablabel{parts}
\end{table}



\section {Using \ourlib{} to analyze and transform graphs} \seclabel{using-hoopl}
\seclabel{making-simple}
\seclabel{create-analysis}

Now that we have graphs, how do we optimize them?
\ourlib{} makes it easy for a
client to build a new dataflow analysis and optimization.
The client must supply the following pieces:
\begin{itemize}
\item \emph{A node type} (\secref{nodes}).
\ourlib{} supplies the @Block@ and @Graph@ types
that let the client build control-flow graphs out of nodes.
\item \emph{A data type of facts} and some operations over
those facts (\secref{facts}).
Each analysis uses facts that are specific to that particular analysis,
which \ourlib{} accommodates by being polymorphic in
the fact type.
\item \emph{A transfer function} that takes a node and returns a
\emph{fact transformer}, which takes a fact flowing into the node and
returns the transformed fact that flows out of the node (\secref{transfers}).
\item \emph{A rewrite function} that takes a node and an input fact,
and which returns either @Nothing@
or @(Just g)@ where @g@~is a graph that should
replace the node.
The ability to replace a \emph{node} by a \emph{graph} that may include
internal control flow is
crucial for many code-improving transformations.
We discuss the rewrite function
in Sections \ref{sec:rewrites} and \ref{sec:shallow-vs-deep}.
\end{itemize}
These requirements are summarized in \tabref{parts}.
Because facts, transfer functions, and rewrite functions work closely together,
we represent their combination
as a single record of type @FwdPass@ (\figref{api-types}).


\ifpagetuning\enlargethispage{\baselineskip}\fi

Given a node type~@n@ and a @FwdPass@,
a client can ask \ourlib{}\ to analyze and rewrite a
graph.
\hoopl\ provides a fully polymorphic interface, but for purposes of
exposition, we present a function that is specialized to a
closed/closed graph:
\begin{code}
analyzeAndRewriteFwdBody
  :: ( FuelMonad m       -- Gensym and other state
     , Edges n )         -- Access to flow edges
  => FwdPass m n f       -- Lattice, transfer, rewrite
  -> [Label]             -- Entry point(s)
  -> Graph n C C         -- Input graph
  -> FactBase f          -- Input fact(s)
  -> m ( Graph n C C     -- Result graph
       , FactBase f )    -- ... and its facts
\end{code}
Given a @FwdPass@, the
analyze-and-rewrite function transforms a graph into
an optimized graph.
As its type shows, this function
is polymorphic in the types of nodes~@n@ and facts~@f@;
these types are determined entirely by the client.
The type of the monad~@m@ is also supplied by the client, but it must
meet certain constraints laid out by \hoopl, as described in
\secref{fuel-monad}.


\begin{figure}
\begin{code}
data `FwdPass m n f
  = FwdPass { `fp_lattice  :: DataflowLattice f
            , `fp_transfer :: FwdTransfer n f
            , `fp_rewrite  :: FwdRewrite m n f }

------- Lattice ----------
data `DataflowLattice a = DataflowLattice
  { `fact_bot    :: a
  , `fact_extend :: a -> a -> (ChangeFlag, a) }

data `ChangeFlag = `NoChange | `SomeChange

------- Transfers ----------
newtype `FwdTransfer n f   -- abstract type
`mkFTransfer'
  :: (forall e x . n e x -> f -> Fact x f)
  -> FwdTransfer n f

------- Rewrites ----------
newtype `FwdRewrite m n f  -- abstract type
`mkFRewrite'
  :: (forall e x . n e x -> f -> m (FwdRes m n f e x))
  -> FwdRewrite m n f

data `FwdRes m n f e x
  = FwdRes (Graph n e x) (FwdRewrite m n f)
  | `NoFwdRes

------- Fact-like things -------
type family   `Fact x f :: *
type instance Fact O f = f
type instance Fact C f = FactBase f

------- FactBase -------
type `FactBase f = LabelMap f
  -- A finite mapping from Labels to facts f
\end{code}
\caption{\ourlib{} API data types}
\figlabel{api-types}
\figlabel{lattice-type} \figlabel{lattice}
\figlabel{transfers} \figlabel{rewrites}
\end{figure}
% omit mkFactBase :: [(Label, f)] -> FactBase f



As well as taking and returning a graph with its entry point(s), the
function also takes input facts (the @FactBase@) and produces output facts.
A~@FactBase@ is simply a finite mapping from @Label@ to facts.
The
output @FactBase@ maps each @Label@ in the @Body@ to its fact; if
the @Label@ is not in the domain of the @FactBase@, its fact is the
bottom element of the lattice.
The point(s) at which control flow may enter the graph are supplied by
the caller; the input @FactBase@ supplies any
facts that hold at these points.
For example, in our constant-propagation example from \secref{const-prop-example},
if the graph
represents the body of a procedure
with parameters $x,y,z$, we would map the entry @Label@ to a fact
$@x=@\top \land @y=@\top \land @z=@\top$, to specify that the procedure's
parameters are not known to be constants.

The client's model of how @analyzeAndRewriteFwd@ works is as follows:
\ourlib\ walks forward over each block in the graph.
At each node, \ourlib\ applies the
rewrite function to the node and the incoming fact. If the rewrite
function returns @NoFwdRes@, the node is retained as part of the output
graph, the transfer function is used to compute the downstream fact,
and \ourlib\ moves on to the next node.
But if the rewrite function returns @(FwdRes g rw)@,
indicating that it wants to rewrite the node to the replacement graph~@g@, then
\ourlib\ recursively analyzes and rewrites~@g@, using the new rewrite
function~@rw@, before moving on to the next node.
A~node following a rewritten node sees
\emph{up-to-date} facts; that is, its input fact is computed by
analyzing the replacement graph.

In this section we flesh out the
\emph{interface} to @analyzeAndRewriteFwd@, leaving the implementation for
\secref{engine}.
{\hfuzz=7.2pt\par}

\subsection{Dataflow lattices} \seclabel{lattices} \seclabel{facts}

For each analysis or transformation, the client must define a type
of dataflow facts.
A~dataflow fact often represents an assertion
about a program point,\footnote
{In \ourlib, a program point is simply an edge in a
control-flow graph.}
but in general, dataflow analysis establishes properties of \emph{paths}:
\begin{itemize}
\item An assertion about all paths \emph{to} a program point is established
by a \emph{forwards analysis}. For example the assertion ``$@x@=@3@$'' at point P
claims that variable @x@ holds value @3@ at P, regardless of the
path by which P is reached.
\item An assertion about all paths \emph{from} a program point is
established by a \emph{backwards analysis}. For example, the
assertion ``@x@ is dead'' at point P claims that no path from P uses
variable @x@.
\end{itemize}

A~set of dataflow facts must form a lattice, and \ourlib{} must know
(a)~the bottom element of the lattice and (b)~how to take
the least upper bound (join) of two elements.
To ensure that analysis
terminates, it is enough if every fact has a finite number of
distinct facts above it, so that repeated joins
eventually reach a fixed point.

In practice, joins are computed at labels.
If~$f_{\mathit{id}}$ is the fact currently associated with the
label~$\mathit{id}$,
and if a transfer function propagates a new fact~$f_{\mathit{new}}$
into the label~$\mathit{id}$,
the dataflow engine replaces $f_{\mathit{id}}$ with
the join $f_{\mathit{new}} \join f_{\mathit{id}}$.
Furthermore, the dataflow engine wants to know if
$f_{\mathit{new}} \join f_{\mathit{id}} = f_{\mathit{id}}$,
because if not, the analysis has not reached a fixed point.

The bottom element and join operation of a lattice of facts of
type~@f@ are stored in a value of type @DataflowLattice f@
(\figref{lattice}).
%%Such a value is part of the @FwdPass n f@ that is passed to
%%@analyzeAndRewriteFwd@ above.
As noted in the previous paragraph,
\ourlib{} needs to know when the result of a join is equal
to one of the arguments joined.
Because this information is often available very cheaply at the time
when the join is computed, \ourlib\ does not
require a separate equality test on facts (which might be
expensive).
Instead, \ourlib\ requires that @fact_extend@ return a @ChangeFlag@ as
well as the least upper bound.
The @ChangeFlag@ should be @NoChange@ if
the result is the same as the first argument (the old fact), and
@SomeChange@ if the result differs.
(Function @fact_extend@ is \emph{not} symmetric in
its arguments.)

% A~possible representation of the facts needed to implement
% constant propagation is shown in \figref{const-prop}.
% A~fact
% is represented as a finite map from a variable to a value of type
% @Maybe HasConst@.%
% A~variable $x$ maps to @Nothing@ iff $x=\bot$;
% $x$~maps to @Just Top@ iff $x=\top$;
% and $x$~maps to $@Just@\, (@I@\, k)$ iff $x=k$ (and similarly for
% Boolean constants).\remark{Ugh}
% Any one procedure has only
% finitely many variables; only finitely many facts are computed at any
% program point; and in this lattice any one fact can increase at most
% twice. These properties ensure that the dataflow engine will reach a
% fixed point.
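
To make these requirements concrete, here is a sketch (ours, not
\ourlib's) of a lattice for constant propagation.
A~fact is a finite map from a variable to @Maybe Int@:
a~variable absent from the map is $\bot$ (``not yet seen''), while a
variable mapped to @Nothing@ is $\top$ (``known not to be constant'').
We assume the client's @Var@ type has @Eq@ and @Ord@ instances, and
for brevity we detect change by comparing maps, although a real
client would detect change while joining:
\begin{code}
import qualified Data.Map as Map

type ConstFact = Map.Map Var (Maybe Int)

constLattice :: DataflowLattice ConstFact
constLattice = DataflowLattice
  { fact_bot    = Map.empty
  , fact_extend = join }
  where
    join old new = (changed, result)
      where result  = Map.unionWith joinVar old new
            changed = if result == old then NoChange
                      else SomeChange
            joinVar v1 v2 = if v1 == v2 then v1 else Nothing
\end{code}
In this lattice a variable's fact can rise at most twice (from absent
to @Just k@ to @Nothing@), so repeated joins reach a fixed point.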
1313
1314
1315 \subsection{The transfer function} \seclabel{transfers}
1316
1317 A~forward transfer function is presented with the dataflow fact(s) on
1318 the edge(s) coming
1319 into a node, and it computes dataflow fact(s) on the outgoing edge(s).
1320 In a forward analysis, the dataflow engine starts with the fact at the
1321 beginning of a block and applies the transfer function to successive
1322 nodes in that block until eventually the transfer function for the last node
1323 computes the facts that are propagated to the block's successors.
1324 For example, consider this graph, with entry at @L1@:
1325 \begin{code}
1326 L1: x=3; goto L2
1327 L2: y=x+4; x=x-1;
1328 if x>0 then goto L2 else return
1329 \end{code}
1330 A forward analysis starts with the bottom fact \{\} at every label.
1331 Analyzing @L1@ propagates this fact forward, by applying the transfer
1332 function successively to the nodes
1333 of @L1@, emerging with the fact \{@x=3@\} for @L2@.
1334 This new fact is joined with the existing (bottom) fact for @L2@.
1335 Now the analysis propagates @L2@'s fact forward, again using the transfer
1336 function, this time emerging with a new fact \mbox{\{@x=2@, @y=7@\}} for @L2@.
1337 Again, the new fact is joined with the existing fact for @L2@, and the process
1338 is iterated until the facts for each label reach a fixed point.
1339
1340 But wait! What is the \emph{type} of the transfer function?
1341 If the node is open at exit, the transfer function produces a single fact.
1342 But what if the node is \emph{closed} on exit?
1343 In that case the transfer function must
1344 produce a list of (@Label@,fact) pairs, one for each outgoing edge.
1345 %
1346 \emph{So the type of the transfer function's result
1347 depends on the shape of the node's exit.}
1348 Fortunately, this dependency can be expressed precisely, at compile
1349 time, by Haskell's (recently added)
1350 \emph{indexed type families}.
1351 The relevant part of \ourlib's interface is given in \figref{transfers}.
1352 A~forward transfer function supplied by a client,
1353 of type (@FwdTransfer@ @n@ @f@),
1354 is typically a function polymorphic in @e@ and @x@.
1355 It takes a
1356 node of type \mbox{(@n@ @e@ @x@)} and a fact of type~@f@, and it produces an
1357 outgoing ``fact-like thing'' of type (@Fact@ @x@ @f@). The
1358 type constructor @Fact@
1359 should be thought of as a type-level function; its signature is given in the
1360 @type family@ declaration, while its definition is given by two @type instance@
1361 declarations. The first declaration says that the fact-like thing
1362 coming out of a node
1363 \emph{open} at the exit is just a fact~@f@. The second declaration says that
1364 the fact-like thing
1365 coming out of a node
1366 \emph{closed} at the exit is a mapping from @Label@ to facts.
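In outline, these declarations read roughly as follows, where @FactBase f@
is a finite map from @Label@ to facts of type~@f@:
\begin{code}
type family   Fact x f :: *
type instance Fact O f = f
type instance Fact C f = FactBase f
\end{code}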
1367
1368 We have ordered the arguments such that if
1369 \begin{code}
1370 `transfer_fn :: forall e x . n e x -> f -> Fact x f
1371 `node :: n e x
1372 \end{code}
1373 then @(transfer_fn node)@ is a fact transformer:
1374 \begin{code}
1375 transfer_fn node :: f -> Fact x f
1376 \end{code}
1377
1378 So much for the interface.
1379 What about the implementation?
1380 The way GADTs work is that the compiler uses the value constructor for
1381 type \mbox{@n@ @e@ @x@} to determine whether @e@~and~@x@ are open or
1382 closed.
1383 But we want to write functions that are \emph{polymorphic} in the node
1384 type~@n@!
1385 Such functions include
1386 \begin{itemize}
1387 \item
1388 A function that takes a pair of @FwdTransfer@s for facts @f@~and~@f'@,
1389 and returns a single @FwdTransfer@ for the fact @(f, f')@
1390 \item
1391 A function that takes a @FwdTransfer@ and wraps it in logging code, so
1392 an analysis can be debugged by watching the facts flow through the
1393 nodes
1394 \end{itemize}
1395 Because the mapping from
1396 value constructors to shape is different for each node type~@n@, such
1397 functions cannot be polymorphic in both
1398 the representation and the shape of nodes.
1399 We~therefore, in our implementation, sacrifice polymorphism in shape:
1400 a~@FwdTransfer n f@ is represented by a \emph{triple} of functions,
1401 each polymorphic in the node type but monomorphic in shape:
1402 \begin{code}
1403 newtype FwdTransfer n f
1404 = FwdTransfers ( n C O -> f -> f
1405 , n O O -> f -> f
1406 , n O C -> f -> FactBase f
1407 )
1408 \end{code}
1409 Such triples can easily be composed and wrapped without requiring a
1410 pattern match on the value constructor of an unknown node
1411 type.\remark{Need to refer to this junk in the conclusion}
1412 Working with triples is tedious, but only \hoopl\ itself is forced to
1413 work with triples; each client, which knows the node type, may supply
1414 a triple,
1415 but it typically supplies a single function polymorphic in the shape
1416 of the (known) node.
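For example, an injection function along the following lines (a sketch;
the name is hypothetical) converts a single shape-polymorphic function
into the triple:
\begin{code}
mkFTransfer :: (forall e x . n e x -> f -> Fact x f)
            -> FwdTransfer n f
mkFTransfer f = FwdTransfers (f, f, f)
\end{code}
Each component instantiates the polymorphic argument at one fixed shape,
and the @type instance@ declarations for @Fact@ make all three components
well typed.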
1417
1418
1419
1420
1421 \subsection{The rewrite function}
1422 \seclabel{rewrites}
1423 \seclabel{example-rewrites}
1424
1425 %%%% \begin{figure}
1426 %%%% \begin{smallfuzzcode}{6.6pt}
1427 %%%% type `AGraph n e x
1428 %%%% = [Label] -> (Graph n e x, [Label])
1429 %%%%
1430 %%%% `withLabels :: Int -> ([Label] -> AGraph n e x)
1431 %%%% -> AGraph n e x
1432 %%%% withLabels n ^fn = \ls -> fn (take n ^ls) (drop n ls)
1433 %%%%
1434 %%%% `mkIfThenElse :: Expr -> AGraph Node O O
1435 %%%% -> AGraph Node O O -> AGraph Node O O
1436 %%%% mkIfThenElse p t e
1437 %%%% = withLabels 3 $ \[l1,l2,l3] ->
1438 %%%% gUnitOC (BUnit (CondBranch p l1 l2)) `gSplice`
1439 %%%% mkLabel l1 `gSplice` t `gSplice` mkBranch l3 `gSplice`
1440 %%%% mkLabel l2 `gSplice` e `gSplice` mkBranch l3 `gSplice`
1441 %%%% mkLabel l3
1442 %%%%
1443 %%%% `mkLabel l = gUnitCO (BUnit (Label l))
1444 %%%% `mkBranch l = gUnitOC (BUnit (Branch l))
1445 %%%% `gUnitOC b = GMany (JustO b) BodyEmpty NothingO
1446 %%%% `gUnitCO b = GMany NothingO BodyEmpty (JustO b)
1447 %%%% \end{smallfuzzcode}
1448 %%%% \caption{The \texttt{AGraph} type and example constructions} \figlabel{agraph}
1449 %%%% \end{figure}
1450
1451 We compute dataflow facts in order to enable code-improving
1452 transformations.
1453 In our constant-propagation example,
1454 the dataflow facts may enable us
1455 to simplify an expression by performing constant folding, or to
1456 turn a conditional branch into an unconditional one.
1457 Similarly, a liveness analysis may allow us to
1458 replace a dead assignment with a no-op.
1459
1460 A @FwdPass@ therefore includes a \emph{rewriting function}, whose
1461 type, @FwdRewrite@, is given in \figref{api-types}.
1462 A rewriting function takes a node and a fact, and optionally returns\ldots what?
1463 At first one might
1464 expect that rewriting should return a new node, but that is not enough:
1465 We~might want to remove a node by rewriting it to the empty graph,
1466 or more ambitiously, we might want to replace a high-level operation
1467 with a tree of conditional branches or a loop, which would entail
1468 introducing new blocks with internal control flow.
1469 In~general, a rewrite function must be able to return a
1470 \emph{graph}.
1471
1472 \ifpagetuning\enlargethispage{0.8\baselineskip}\fi
1473
Concretely, a @FwdRewrite@ takes a node and a suitably shaped
fact, and it returns a monadic result: either @NoFwdRes@,
indicating that the node should not be replaced,
or $@(FwdRes@\;\ag\;\rw@)@$, indicating that the node should
be replaced with the replacement graph~\ag.
1479 The result is monadic because
1480 if the rewriter makes graphs containing blocks,
1481 it may need fresh @Label@s, which are supplied by a monad.
1482 A~client may use its own monad or may use a monad or monad transformer
1483 supplied by \hoopl.
1484 \remark{Forward reference??}
1485
1486 %%%% \ifpagetuning\enlargethispage{0.8\baselineskip}\fi
1487
1488 The type of @FwdRewrite@ in \figref{api-types} guarantees
1489 \emph{at compile time} that
1490 the replacement graph $\ag$ has
1491 the \emph{same} open/closed shape as the node being rewritten.
1492 For example, a branch instruction can be replaced only by a graph
1493 closed at the exit. Moreover, because only an open/open graph can be
1494 empty---look at the type of @GNil@ in \figref{graph}---the type
1495 of @FwdRewrite@
1496 guarantees, at compile time, that no head of a block (closed/open)
1497 or tail of a block (open/closed) can ever be deleted by being
1498 rewritten to an empty graph.
1499
1500 \subsection{Shallow vs deep rewriting}
1501 \seclabel{shallow-vs-deep}
1502
1503 Once the rewrite has been performed, what then?
1504 The rewrite returns a replacement graph,
1505 which must itself be analyzed, and its nodes may be further rewritten.
1506 We~call @analyzeAndRewriteFwd@ to process the replacement
1507 graph---but what @FwdPass@ should we use?
1508 There are two common situations:
1509 \begin{itemize}
1510 \item Sometimes we want to analyze and transform the replacement graph
1511 with an unmodified @FwdPass@, further rewriting the replacement graph.
1512 This procedure is
1513 called \emph{deep rewriting}.
1514 When deep rewriting is used, the client's rewrite function must
1515 ensure that the graphs it produces are not rewritten indefinitely
1516 (\secref{correctness}).
1517 \item Sometimes we want to analyze \emph{but not further rewrite} the
1518 replacement graph. This procedure is called \emph{shallow rewriting}.
1519 It~is easily implemented by using a modified @FwdPass@
1520 whose rewriting function always returns @NoFwdRes@.
1521 \end{itemize}
1522 Deep rewriting is essential to achieve the full benefits of
1523 interleaved analysis and transformation
1524 \citep{lerner-grove-chambers:2002}.
1525 But shallow rewriting can be vital as well;
1526 for example, a forward dataflow pass that inserts
1527 a spill before a call must not rewrite the call again, lest it attempt
1528 to insert infinitely many spills.
1529
1530 An~innovation of \hoopl\ is to build the choice of shallow or deep
1531 rewriting into each rewrite function,
1532 an idea that is elegantly captured by the
1533 @FwdRes@ type returned by a @FwdRewrite@ (\figref{api-types}).
1534 The first component of the @FwdRes@ is the replacement graph, as discussed earlier.
1535 The second component, $\rw$, is a
1536 \emph{new rewriting function} to use when recursively processing
1537 the replacement graph.
For shallow rewriting this new function is
one that always returns @NoFwdRes@, declining to rewrite anything
further; for deep rewriting it is the original
rewriting function.
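In outline, then, the result type might be declared as follows (a sketch
consistent with the combinator code in \secref{combinators};
\figref{api-types} gives the authoritative definition):
\begin{code}
data FwdRes m n f e x
  = NoFwdRes                  -- decline to rewrite
  | FwdRes (AGraph n e x)     -- replacement graph
           (FwdRewrite m n f) -- rewriter for that graph
\end{code}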
1541
1542
1543 \subsection{Composing rewrite functions and dataflow passes} \seclabel{combinators}
1544
1545 By requiring each rewrite to return a new rewrite function,
1546 \hoopl\ enables a variety of combinators over
1547 rewrite functions.
1548 \remark{This whole subsection needs to be redone in light of the new
1549 (triple-based) representation. It's not pretty any more.}
1550 For example, here is a function
1551 that combines two rewriting functions in sequence:
1552 \remark{This code must be improved}
1553 \begin{smallcode}
1554 thenFwdRw :: Monad m
1555 => FwdRewrite m n f
1556 -> FwdRewrite m n f
1557 -> FwdRewrite m n f
`thenFwdRw rw1 rw2 = wrapFRewrites2' tryRw rw1 rw2
  where tryRw rw1' rw2' n f = do
          res1 <- rw1' n f
          case res1 of
            NoFwdRes -> rw2' n f
            FwdRes g rw1a ->
              return $ FwdRes g (rw1a `thenFwdRw` rw2)

`noFwdRw :: Monad m => FwdRewrite m n f
noFwdRw = mkFRewrite' $ \ n f -> return NoFwdRes
1568 \end{smallcode}
1569 What a beautiful type @thenFwdRw@ has! It tries @rw1@, and if @rw1@
1570 declines to rewrite, it behaves like @rw2@. But if
1571 @rw1@ rewrites, returning a new rewriter @rw1a@, then the overall call also
1572 succeeds, returning a new rewrite function obtained by combining @rw1a@
1573 with @rw2@. (We cannot apply @rw1a@ or @rw2@
1574 directly to the replacement graph~@g@,
because @rw1@~returns a graph, while @rw1a@ and @rw2@~expect a node.)
1576 The rewriter @noFwdRw@ is the identity of @thenFwdRw@.
1577 Finally, @thenFwdRw@ can
1578 combine a deep-rewriting function and a shallow-rewriting function,
1579 to produce a rewriting function that is a combination of deep and shallow.
1580 %%This is something that the Lerner/Grove/Chambers framework could not do,
1581 %%because there was a global shallow/deep flag.
1582 %% Our shortsightedness; Lerner/Grove/Chambers is deep only ---NR
1583
1584
1585 A shallow rewriting function can be made deep by iterating
1586 it:\remark{Algebraic law wanted!}
1587 \begin{smallfuzzcode}{6.6pt}
1588 iterFwdRw
1589 :: Monad m => FwdRewrite m n f -> FwdRewrite m n f
`iterFwdRw rw = wrapFRewrites' iterRw rw
  where iterRw rw' n f = liftM iter (rw' n f)
1592 iter NoFwdRes = NoFwdRes
1593 iter (FwdRes g rw2) =
1594 FwdRes g (rw2 `thenFwdRw` iterFwdRw rw)
1595 \end{smallfuzzcode}
1596 If we have shallow rewrites $A$~and~$B$ then we can build $AB$,
1597 $A^*B$, $(AB)^*$,
1598 and so on: sequential composition is @thenFwdRw@ and the Kleene star
1599 is @iterFwdRw@.\remark{Do we still believe this claim?}
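For instance, given rewrites @a@ and @b@ (a sketch; the names are ours):
\begin{code}
`aStarB, `abStar :: Monad m
  => FwdRewrite m n f -> FwdRewrite m n f
  -> FwdRewrite m n f
aStarB a b = iterFwdRw a `thenFwdRw` b   -- A*B
abStar a b = iterFwdRw (a `thenFwdRw` b) -- (AB)*
\end{code}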
1600
1601 The combinators above operate on rewrite functions that share a common
1602 fact type and transfer function.
1603 It~can also be useful to combine entire dataflow passes that use
1604 different facts.
1605 We~invite you to write one such combinator, with type
1606 \begin{code}
1607 `thenFwd :: Monad m
1608 => FwdPass m n f1
1609 -> FwdPass m n f2
1610 -> FwdPass m n (f1,f2)
1611 \end{code}
1612 The two passes run interleaved, not sequentially, and each may help
1613 the other,
1614 yielding better results than running $A$~and then~$B$ or $B$~and then~$A$
1615 \citep{lerner-grove-chambers:2002}.
1616 %% call these passes. ``super-analyses;''
1617 %% in \hoopl, construction of super-analyses is
1618 %% particularly concrete.
1619
1620 \subsection{Example: Constant propagation and constant folding}
1621 \seclabel{const-prop-client}
1622
1623 % omit Binop :: Operator -> Expr -> Expr -> Expr
1624 % omit Add :: Operator
1625
1626
1627 \begin{figure}
1628 {\small\hfuzz=3pt
1629 \begin{code}
1630 -- Types and definition of the lattice
1631 data `HasConst = `Top | `B Bool | `I Integer
1632 type `ConstFact = Map.Map Var HasConst
1633 `constLattice = DataflowLattice
1634 { fact_bot = Map.empty
1635 , fact_extend = stdMapJoin constFactAdd }
1636 where
1637 `constFactAdd ^old ^new = (c, j)
1638 where j = if new == old then new else Top
1639 c = if j == old then NoChange else SomeChange
1640
1641 -------------------------------------------------------
1642 -- Analysis: variable has constant value
1643 `varHasConst :: FwdTransfer Node ConstFact
1644 varHasConst (Label l) f = lookupFact f l
1645 varHasConst (Store _ _) f = f
1646 varHasConst (Assign x (Bool b)) f = Map.insert x (B b) f
1647 varHasConst (Assign x (Int i)) f = Map.insert x (I i) f
1648 varHasConst (Assign x _) f = Map.insert x Top f
1649 varHasConst (Branch l) f = mkFactBase [(l, f)]
1650 varHasConst (CondBranch (Var x) ^tid ^fid) f
1651 = mkFactBase [(tid, Map.insert x (B True) f),
1652 (fid, Map.insert x (B False) f)]
1653 varHasConst (CondBranch _ tid fid) f
1654 = mkFactBase [(tid, f), (fid, f)]
1655
1656 -------------------------------------------------------
1657 -- Constant propagation
1658 `constProp :: FwdRewrite Node ConstFact
1659 constProp node ^facts
1660 = fmap toAGraph (mapE rewriteE node)
1661 where
    `rewriteE (Var x)
      = case Map.lookup x facts of
          Just (B b) -> Just $ Bool b
          Just (I i) -> Just $ Int i
          _ -> Nothing
    rewriteE _ = Nothing
1668
1669 -------------------------------------------------------
1670 -- Simplification ("constant folding")
1671 `simplify :: FwdRewrite Node f
1672 simplify (CondBranch (Bool b) t f) _
1673 = Just $ toAGraph $ Branch (if b then t else f)
1674 simplify node _ = fmap toAGraph (mapE s_exp node)
1675 where
1676 `s_exp (Binop Add (Int i1) (Int i2))
1677 = Just $ Int $ i1 + i2
1678 ... -- more cases for constant folding
1679
1680 -- Rewriting expressions
1681 `mapE :: (Expr -> Maybe Expr)
1682 -> Node e x -> Maybe (Node e x)
1683 mapE f (Label _) = Nothing
1684 mapE f (Assign x e) = fmap (Assign x) $ f e
1685 ... -- more cases for rewriting expressions
1686
1687 -------------------------------------------------------
1688 -- Defining the forward dataflow pass
1689 `constPropPass = FwdPass
1690 { fp_lattice = constLattice
1691 , fp_transfer = varHasConst
1692 , fp_rewrite = constProp `thenFwdRw` simplify }
1693 \end{code}}
1694 \caption{The client for constant propagation and constant folding} \figlabel{const-prop}
1695 \end{figure}
1696 \figref{const-prop} shows client code for
1697 constant propagation and constant folding.
1698 For each variable at each point in a graph, the analysis concludes one of three facts:
1699 the variable holds a constant value (Boolean or integer),
1700 the variable might hold a non-constant value,
1701 or nothing is known about what the variable holds.
We~represent these facts using a finite map from a variable to a
value of type @HasConst@; looking up a variable yields a result of
type (@Maybe HasConst@).
A variable bound to a Boolean or integer constant maps to
@Just (B b)@ or @Just (I k)@;
a variable with a non-constant value maps to @Just Top@;
and a variable with an unknown value maps to @Nothing@ (i.e., it is not
in the domain of the finite map).
1708
1709 % \afterpage{\clearpage}
1710
1711 The definition of the lattice (@constLattice@) is straightforward.
1712 The bottom element is an empty map (nothing is known about the contents of any variable).
1713 We~use the @stdMapJoin@ function to lift the join operation
1714 for a single variable (@constFactAdd@) up to the map containing facts
1715 for all variables.
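One plausible implementation of such a lifting follows (a sketch, not
necessarily \ourlib's actual code). A~key missing from a map stands for
bottom, so a binding present only in the new map always constitutes a
change:
\begin{code}
type JoinFun v = v -> v -> (ChangeFlag, v)

stdMapJoin :: Ord k => JoinFun v
           -> JoinFun (Map.Map k v)
stdMapJoin j old new = Map.foldrWithKey add (NoChange, old) new
  where add k new_v (ch, joinmap) =
          case Map.lookup k joinmap of
            Nothing    -> (SomeChange, Map.insert k new_v joinmap)
            Just old_v ->
              case j old_v new_v of
                (NoChange,   _) -> (ch, joinmap)
                (SomeChange, v) -> (SomeChange,
                                    Map.insert k v joinmap)
\end{code}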
1716
1717 % omit stdMapJoin :: Ord k => JoinFun v -> JoinFun (Map.Map k v)
1718
1719 For the transfer function, @varHasConst@,
1720 there are two interesting kinds of nodes:
1721 assignment and conditional branch.
1722 In the first two cases for assignment, a variable gets a constant value,
1723 so we produce a dataflow fact mapping the variable to its value.
1724 In the third case for assignment, the variable gets a non-constant value,
1725 so we produce a dataflow fact mapping the variable to @Top@.
1726 The last interesting case is a conditional branch where the condition
1727 is a variable.
1728 If the conditional branch flows to the true successor,
1729 the variable holds @True@, and similarly for the false successor.
1730 We update the fact flowing to each successor accordingly.
1731
1732 We do not need to consider complicated cases such as
1733 an assignment @x:=y@ where @y@ holds a constant value @k@.
1734 Instead, we rely on the interleaving of transformation
1735 and analysis to first transform the assignment to @x:=k@,
1736 which is exactly what our simple transfer function expects.
1737 As we mention in \secref{simple-tx},
1738 interleaving makes it possible to write
1739 the simplest imaginable transfer functions, without missing
1740 opportunities to improve the code.
1741
1742 The rewrite function for constant propagation, @constProp@,
1743 simply rewrites each use of a variable to its constant value.
1744 We use the auxiliary function @mapE@
1745 to apply @rewriteE@ to each use of a variable in each kind of node;
1746 in turn, the @rewriteE@ function checks if the variable has a constant
1747 value and makes the substitution. We assume an auxiliary function
1748 \begin{code}
1749 `toAGraph :: Node e x -> AGraph Node e x
1750 \end{code}
1751
1752 \figref{const-prop} also gives a completely separate rewrite function
1753 to perform constant
1754 folding, called @simplify@. It rewrites a conditional branch on a
1755 boolean constant to an unconditional branch, and
1756 to find constant subexpressions,
1757 it runs @s_exp@
1758 on every subexpression.
1759 Function @simplify@ does not need to check whether a variable holds a
1760 constant value; it relies on @constProp@ to have replaced the
1761 variable by the constant.
1762 Indeed, @simplify@ does not consult the
1763 incoming fact at all, and hence is polymorphic in~@f@.
1764
1765
1766
1767 We have written two @FwdRewrite@ functions
1768 because they are independently useful. But in this case we
1769 want to apply \emph{both} of them,
1770 so we compose them with @thenFwdRw@.
1771 The composed rewrite functions, along with the lattice and the
1772 transfer function,
1773 go into @constPropPass@ (bottom of \figref{const-prop}).
1774 To improve a particular graph, we pass @constPropPass@ and the graph
1775 to
1776 @analyzeAndRewriteFwd@.
1777
1778
1779 \subsection{Throttling the dataflow engine using ``optimization fuel''}
1780 \seclabel{vpoiso}
1781 \seclabel{fuel}
1782
1783 Debugging an optimization can be tricky:
1784 an optimization may rewrite hundreds of nodes,
1785 and any of those rewrites could be incorrect.
1786 To debug dataflow optimizations, we use Whalley's
1787 \citeyearpar{whalley:isolation} powerful technique
1788 to identify the first rewrite that
1789 transforms a program from working code to faulty code.
1790
1791 The key idea is to~limit the number of rewrites that
1792 are performed while optimizing a graph.
1793 In \hoopl, the limit is called
1794 \emph{optimization fuel}:
1795 each rewrite costs one unit of fuel,
1796 and when the fuel is exhausted, no more rewrites are permitted.
1797 Because each rewrite leaves the observable behavior of the
1798 program unchanged, it is safe to stop rewriting at any point.
1799 Given a program that fails when compiled with optimization,
1800 a test infrastructure uses binary search on the amount of
1801 optimization fuel, until
1802 it finds that the program works correctly after $n-1$ rewrites but fails
1803 after $n$~rewrites.
1804 The $n$th rewrite is faulty.
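The search itself is simple; here is a sketch, in which the hypothetical
@worksWith n@ compiles the subject program with at most @n@ units of fuel
and reports whether the result runs correctly:
\begin{code}
isolate :: (Int -> IO Bool) -> Int -> Int -> IO Int
isolate worksWith good bad
  -- invariant: works after good rewrites, fails after bad
  | good + 1 == bad = return bad  -- rewrite bad is faulty
  | otherwise = do ok <- worksWith mid
                   if ok then isolate worksWith mid  bad
                         else isolate worksWith good mid
  where mid = (good + bad) `div` 2
\end{code}
A~call such as @isolate worksWith 0 total@ needs only logarithmically
many compiles.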
1805
1806
1807 You may have noticed that @analyzeAndRewriteFwd@
1808 returns a value in the @FuelMonad@ (\secref{using-hoopl}).
1809 The @`FuelMonad@ is a simple state monad maintaining the supply of unused
1810 fuel. It also holds a supply of fresh labels, which are used by the rewriter
1811 for making new blocks; more precisely,
1812 \hoopl\ uses these labels to
1813 take the @AGraph@ returned by a pass's rewrite
1814 function (\figref{rewrites}) and convert it to a~@Graph@.
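One plausible representation is a state monad over both supplies
(a sketch; the real @FuelMonad@ may differ in detail):
\begin{code}
type Fuel = Int
newtype FuelMonad a = FM
  { unFM :: Fuel -> [Label] -> (a, Fuel, [Label]) }

instance Monad FuelMonad where
  return a = FM $ \fuel ls -> (a, fuel, ls)
  m >>= k  = FM $ \fuel ls ->
               let (a, fuel', ls') = unFM m fuel ls
               in  unFM (k a) fuel' ls'
\end{code}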
1815
1816
1817 \subsection{Fixed points and speculative rewrites} \seclabel{fixpoints}
1818
1819 Are rewrites sound, especially when there are loops?
1820 Many analyses compute a fixed point starting from unsound
1821 ``facts''; for example, a live-variable analysis starts from the
1822 assumption that all variables are dead. This means \emph{rewrites
1823 performed before a fixed point is reached may be unsound, and their results
1824 must be discarded}. Each iteration of the fixed-point computation must
1825 start afresh with the original graph.
1826
1827
1828 Although the rewrites may be unsound, \emph{they must be performed}
1829 (speculatively, and possibly recursively),
1830 so that the facts downstream of the replacement graphs are as accurate
1831 as possible.
1832 For~example, consider this graph, with entry at @L1@:
1833 \par{\small
1834 \begin{code}
1835 L1: x=0; goto L2
1836 L2: x=x+1; if x==10 then goto L3 else goto L2
1837 \end{code}}
1838 The first traversal of block @L2@ starts with the unsound ``fact'' \{x=0\};
1839 but analysis of the block propagates the new fact \{x=1\} to @L2@, which joins the
1840 existing fact to get \{x=$\top$\}.
1841 What if the predicate in the conditional branch were @x<10@ instead
1842 of @x==10@?
1843 Again the first iteration would begin with the tentative fact \{x=0\}.
1844 Using that fact, we would rewrite the conditional branch to an unconditional
1845 branch @goto L3@. No new fact would propagate to @L2@, and we would
1846 have successfully (and soundly) eliminated the loop.
1847 This example is contrived, but it illustrates that
1848 for best results we should
1849 \begin{itemize}
1850 \item Perform the rewrites on every iteration.
1851 \item Begin each new iteration with the original, virgin graph.
1852 \end{itemize}
1853 This sort of algorithm is hard to implement in an imperative setting, where rewrites
1854 mutate a graph in place.
But with an immutable graph, the algorithm
is easy to implement: we simply revert to the original graph at the start
1857 of each fixed-point iteration.
1858
1859 \subsection{Correctness} \seclabel{correctness}
1860
1861 Facts computed by @analyzeAndRewriteFwd@ depend on graphs produced by the rewrite
1862 function, which in turn depend on facts computed by the transfer function.
1863 How~do we know this algorithm is sound, or if it terminates?
1864 A~proof requires a POPL paper
1865 \cite{lerner-grove-chambers:2002}, but we can give some
1866 intuition.
1867
1868 \hoopl\ requires that a client's functions meet
1869 these preconditions:
1870 \begin{itemize}
1871 \item
1872 The lattice must have no \emph{infinite ascending chains}; that is,
1873 every sequence of calls to @fact_extend@ must eventually return @NoChange@.
1874 \item
1875 The transfer function must be
1876 \emph{monotonic}: given a more informative fact in,
1877 it should produce a more informative fact out.
1878 \item
1879 The rewrite function must be \emph{sound}:
1880 if it replaces a node @n@ by a replacement graph~@g@, then @g@~must be
1881 observationally equivalent to~@n@ under the
1882 assumptions expressed by the incoming dataflow fact~@f@.
1883 %%\footnote{We do not permit a transformation to change
1884 %% the @Label@ of a node. We have not found any optimizations
1885 %% that are prevented (or even affected) by this restriction.}
1886 \item
1887 The rewrite function must be \emph{consistent} with the transfer function;
1888 that is, \mbox{@`transfer n f@ $\sqsubseteq$ @transfer g f@}.
1889 For example, if the analysis says that @x@ is dead before the node~@n@,
1890 then it had better still be dead if @n@ is replaced by @g@.
1891 \item
1892 To ensure termination, a transformation that uses deep rewriting
1893 must not return replacement graphs which
1894 contain nodes that could be rewritten indefinitely.
1895 \end{itemize}
1896 Without the conditions on monotonicity and consistency,
1897 our algorithm will terminate,
1898 but there is no guarantee that it will compute
1899 a fixed point of the analysis. And that in turn threatens the
1900 soundness of rewrites based on possibly bogus ``facts''.
1901
1902 However, when the preconditions above are met,
1903 \begin{itemize}
1904 \item
1905 The algorithm terminates. The fixed-point loop must terminate because the
1906 lattice has no infinite ascending chains. And the client is responsible
1907 for avoiding infinite recursion when deep rewriting is used.
1908 \item
1909 The algorithm is sound. Why? Because if each rewrite is sound (in the sense given above),
1910 then applying a succession of rewrites is also sound.
1911 Moreover, a~sound analysis of the replacement graph
1912 may generate only dataflow facts that could have been
1913 generated by a more complicated analysis of the original graph.
1914 \end{itemize}
1915
1916 \finalremark{Doesn't the rewrite have to be have the following property:
1917 for a forward analysis/transform, if (rewrite P s) = Just s',
1918 then (transfer P s $\sqsubseteq$ transfer P s').
1919 For backward: if (rewrite Q s) = Just s', then (transfer Q s' $\sqsubseteq$ transfer Q s).
1920 Works for liveness.
1921 ``It works for liveness, so it must be true'' (NR).
1922 If this is true, it's worth a QuickCheck property!
1923 }%
1924 \finalremark{Version 2, after further rumination. Let's define
1925 $\scriptstyle \mathit{rt}(f,s) = \mathit{transform}(f, \mathit{rewrite}(f,s))$.
1926 Then $\mathit{rt}$ should
1927 be monotonic in~$f$. We think this is true of liveness, but we are not sure
1928 whether it's just a generally good idea, or whether it's actually a
1929 precondition for some (as yet unarticulated) property of \ourlib{} to hold.}%
1930
1931 %%%% \simon{The rewrite functions must presumably satisfy
1932 %%%% some monotonicity property. Something like: given a more informative
1933 %%%% fact, the rewrite function will rewrite a node to a more informative graph
1934 %%%% (in the fact lattice.).
1935 %%%% \textbf{NR}: actually the only obligation of the rewrite function is
1936 %%%% to preserve observable behavior. There's no requirement that it be
1937 %%%% monotonic or indeed that it do anything useful. It just has to
1938 %%%% preserve semantics (and be a pure function of course).
1939 %%%% \textbf{SLPJ} In that case I think I could cook up a program that
1940 %%%% would never reach a fixed point. Imagine a liveness analysis with a loop;
1941 %%%% x is initially unused anywhere.
1942 %%%% At some assignment node inside the loop, the rewriter behaves as follows:
1943 %%%% if (and only if) x is dead downstream,
1944 %%%% make it alive by rewriting the assignment to mention x.
1945 %%%% Now in each successive iteration x will go live/dead/live/dead etc. I
1946 %%%% maintain my claim that rewrite functions must satisfy some
1947 %%%% monotonicity property.
1948 %%%% \textbf{JD}: in the example you cite, monotonicity of facts at labels
1949 %%%% means x cannot go live/dead/live/dead etc. The only way we can think
1950 %%%% of not to terminate is infinite ``deep rewriting.''
1951 %%%% }
1952
1953
1954
1955
1956 \section{\ourlib's implementation}
1957 \seclabel{engine}
1958 \seclabel{dfengine}
1959
1960 \secref{making-simple}
1961 gives a client's-eye view of \hoopl, showing how to use
1962 it to create analyses and transformations.
1963 \hoopl's interface is simple, but
1964 the \emph{implementation} of interleaved analysis and rewriting is
1965 quite complicated. \citet{lerner-grove-chambers:2002}
1966 do not describe their implementation. We have written
1967 at least three previous implementations, all of which
1968 were long and hard to understand, and only one of which
1969 provided compile-time guarantees about open and closed shapes.
We are not confident that any of these implementations is correct.
1971
1972 In this paper we describe our new implementation. It is short
1973 (about a third of the size of our last attempt), elegant, and offers
1974 strong static shape guarantees. The whole thing is about 300~lines of
1975 code, excluding comments; this count includes both forward and backward
1976 dataflow analysis and transformation.
1977
1978 We describe the implementation of \emph{forward}
1979 analysis and transformation.
1980 The implementations of backward analysis and transformation are
1981 exactly analogous and are included in \hoopl.
1982
1983
1984
1985 \subsection{Overview}
1986
1987 {\hfuzz=0.6pt
1988 We concentrate on implementing @analyzeAndRewriteFwd@, whose
1989 type is in \secref{using-hoopl}.
1990 Its implementation is built on the hierarchy of nodes, blocks, and graphs
1991 described in \secref{graph-rep}. For each thing in the hierarchy,
1992 we develop a function of this type:
1993 \begin{code}
1994 type `ARF ^thing n
1995 = forall f e x. FwdPass n f
1996 -> thing e x -> Fact e f
    -> FuelMonad (RG n f e x, Fact x f)
1998 \end{code}
1999 An @ARF@ (short for ``analyze and rewrite forward'') is a combination of
2000 a rewrite and transfer function.
2001 }%
2002 An @ARF@ takes a @FwdPass@, a @thing@ (a node, block, or graph),
2003 and an input fact,
and it returns a rewritten graph of type @(RG n f e x)@ of the same shape
2005 as the @thing@, plus a suitably shaped output fact.
2006 %
2007 %Regardless of whether @thing@ is a node, block, or graph, the result
2008 %is always a graph.
2009 %
2010 % point is made adequately in the client section ---NR
2011 The type~@RG@ is internal to \hoopl; it is not seen by any client.
2012 We use it, not @Graph@, for two reasons:
2013 \begin{itemize}
2014 \item The client is often interested not only in the facts flowing
2015 out of the graph (which are returned in the @Fact x f@),
2016 but also in the facts on the \emph{internal} blocks
of the graph. A~replacement graph of type @(RG n f e x)@ is decorated with
2018 these internal facts.
2019 \item A @Graph@ has deliberately restrictive invariants; for example,
2020 a @GMany@ with a @JustO@ is always open at exit (\figref{graph}). It turns
2021 out to be awkward to maintain these invariants \emph{during} rewriting,
2022 but easy to restore them \emph{after} rewriting by ``normalizing'' an @RG@.
2023 \end{itemize}
2024 The information in an @RG@ is returned to the client by
2025 the normalization function @normalizeBody@, which
2026 splits an @RG@ into a @Body@ and its corresponding @FactBase@:
2027 \begin{code}
2028 `normalizeBody :: Edges n => RG n f C C
2029 -> (Body n, FactBase f)
2030 \end{code}
2031 \begin{figure}
2032 \hfuzz=0.98pt
2033 \begin{code}
2034 data `RG n f e x where
2035 `RGNil :: RG n f a a
2036 `RGCatO :: RG n f e O -> RG n f O x -> RG n f e x
2037 `RGCatC :: RG n f e C -> RG n f C x -> RG n f e x
2038 `RGUnit :: Fact e f -> Block n e x -> RG n f e x
2039 \end{code}
2040 \caption{The data type \texttt{RG} of rewritten graphs} \figlabel{rg}
2041 \end{figure}
2042 The constructors of @RG@ are given in \figref{rg}.
The essential points are that constructor @RGUnit@ is polymorphic in
the shape of a block and carries a fact as
well as a block, and that the concatenation constructors record the shapes
of the graphs at the point of concatenation.
2047 \remark{Not too happy with calling this ``concatenation''}
2048 (A~record of the shapes is needed so that when
2049 @normalizeBody@ is presented with a block carried by @RGUnit@, it is
2050 known whether the block is an entry sequence, an exit sequence, or a
2051 basic block.)
2052 \remark{
2053 Within \hoopl,
2054 the @RG@~type is a great convenience. Mutter mutter:
2055 it carries facts as well as blocks, and it frees the client's
2056 rewrite functions from any obligation to respect the invariants of
2057 type @Graph@---I'm not convinced.}
2058
2059
2060 We exploit the type distinctions of nodes, @Block@, @Body@,
2061 and @Graph@ to structure the code into several small pieces, each of which
2062 can be understood independently. Specifically, we define a layered set of
2063 functions, each of which calls the previous one:
2064 \begin{code}
2065 arfNode :: Edges n => ARF n n
2066 arfBlock :: Edges n => ARF (Block n) n
2067 arfBody :: Edges n
2068 => FwdPass n f -> Body n -> FactBase f
2069 -> FuelMonad (RG n f C C, FactBase f)
2070 arfGraph :: Edges n => ARF (Graph n) n
2071 \end{code}
2072 \begin{itemize}
2073 \item
2074 The @arfNode@ function processes nodes (\secref{arf-node}).
2075 It handles the subtleties of interleaved analysis and rewriting,
2076 and it deals with fuel consumption. It calls @arfGraph@ to analyze
2077 and transform rewritten graphs.
2078 \item
2079 Based on @arfNode@ it is extremely easy to write @arfBlock@, which lifts
2080 the analysis and rewriting from nodes to blocks (\secref{arf-block}).
2081
2082 %%% \ifpagetuning\penalty -10000 \fi
2083
2084
2085 \item
2086 Using @arfBlock@ we define @arfBody@, which analyzes and rewrites a
2087 @Body@: a~group of closed/closed blocks linked by arbitrary
2088 control flow.
2089 Because a @Body@ is
2090 always closed/closed and does not take shape parameters, function
2091 @arfBody@ is less polymorphic than the others; its type is what
2092 would be obtained by expanding and specializing the definition of
2093 @ARF@ for a @thing@ which is always closed/closed and is equivalent to
2094 a @Body@.
2095
2096 Function @arfBody@ takes care of fixed points (\secref{arf-body}).
2097 \item
2098 Based on @arfBody@ it is easy to write @arfGraph@ (\secref{arf-graph}).
2099 \end{itemize}
2100 Given these functions, writing the main analyzer is a simple
2101 matter of matching the external API to the internal functions:
2102 \begin{code}
2103 `analyzeAndRewriteFwd
2104 :: forall n f. Edges n
2105 => FwdPass n f -> Body n -> FactBase f
2106 -> FuelMonad (Body n, FactBase f)
2107
2108 analyzeAndRewriteFwd pass ^body facts
2109 = do { (^rg, _) <- arfBody pass body facts
2110 ; return (normalizeBody rg) }
2111 \end{code}
2112
2113 \subsection{From nodes to blocks} \seclabel{arf-block}
2114 \seclabel{arf-graph}
2115
2116 We begin our explanation with the second task:
2117 writing @arfBlock@, which analyzes and transforms blocks.
2118 \begin{code}
2119 `arfBlock :: Edges n => ARF (Block n) n
2120 arfBlock pass (BUnit node) f
2121 = arfNode pass node f
2122 arfBlock pass (BCat b1 b2) f
2123 = do { (g1,f1) <- arfBlock pass b1 f
2124 ; (g2,f2) <- arfBlock pass b2 f1
2125 ; return (g1 `RGCatO` g2, f2) }
2126 \end{code}
2127 The code is delightfully simple.
2128 The @BUnit@ case is implemented by @arfNode@.
2129 The @BCat@ case is implemented by recursively applying @arfBlock@ to the two
2130 sub-blocks, threading the output fact from the first as the
2131 input to the second.
2132 Each recursive call produces a rewritten graph;
2133 we concatenate them with @RGCatO@.
2134
2135 Function @arfGraph@ is equally straightforward:
2136 \begin{code}
2137 `arfGraph :: Edges n => ARF (Graph n) n
2138 arfGraph _ GNil f = return (RGNil, f)
2139 arfGraph pass (GUnit blk) f = arfBlock pass blk f
2140 arfGraph pass (GMany NothingO body NothingO) f
2141 = do { (^body', ^fb) <- arfBody pass body f
2142 ; return (body', fb) }
2143 arfGraph pass (GMany NothingO body (JustO ^exit)) f
2144 = do { (body', fb) <- arfBody pass body f
2145 ; (^exit', ^fx) <- arfBlock pass exit fb
2146 ; return (body' `RGCatC` exit', fx) }
2147 -- ... two more equations for GMany ...
2148 \end{code}
2149 The pattern is the same as for @arfBlock@: thread
2150 facts through the sequence, and concatenate the results.
2151 Because the constructors of type~@RG@ are more polymorphic than those
2152 of @Graph@, type~@RG@ can represent
2153 graphs more simply than @Graph@; for example, each element of a
2154 @GMany@ becomes a single @RG@ object, and these @RG@ objects are then
2155 concatenated to form a single result of type~@RG@.
2156
2157 %% \ifpagetuning\penalty -10000 \fi
2158
2159 \subsection{Analyzing and rewriting nodes} \seclabel{arf-node}
2160
2161 Although interleaving analysis with transformation is tricky, we have
2162 succeeded in isolating the algorithm in just two functions,
2163 @arfNode@ and its backward analog, @`arbNode@:
2164 \begin{fuzzcode}{10.5pt}
2165 `arfNode :: Edges n => ARF n n
2166 arfNode ^pass n f
2167 = do { ^mb_g <- withFuel (fp_rewrite pass n f)
2168 ; case mb_g of
2169 Nothing -> return (RGUnit f (BUnit n),
2170 fp_transfer pass n f)
2171 Just (FwdRes ^ag ^rw) ->
2172 do { g <- graphOfAGraph ag
2173 ; let ^pass' = pass { fp_rewrite = rw }
2174 ; arfGraph pass' g f } }
2175 \end{fuzzcode}
2176 The code here is more complicated,
2177 but still admirably brief.
2178 Using the @fp_rewrite@ record selector (\figref{api-types}),
2179 we~begin by extracting the
2180 rewriting function from the @FwdPass@,
2181 and we apply it to the node~@n@
2182 and the incoming fact~@f@.
2183
2184 The resulting @Maybe@ is passed to @withFuel@, which
2185 deals with fuel accounting:
2186 \begin{code}
2187 `withFuel :: Maybe a -> FuelMonad (Maybe a)
2188 \end{code}
2189 If @withFuel@'s argument is @Nothing@, \emph{or} if we have run out of
2190 optimization fuel (\secref{fuel}), @withFuel@ returns @Nothing@.
2191 Otherwise, @withFuel@ consumes one unit of fuel and returns its
2192 % defn Fuel
2193 argument (which will be a @Just@). That is all we need say about fuel.
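A~sketch of an implementation, assuming hypothetical @FuelMonad@ state
primitives @getFuel@ and @setFuel@:
\begin{code}
withFuel Nothing  = return Nothing
withFuel (Just a) =
  do fuel <- getFuel
     if fuel == 0
       then return Nothing
       else do { setFuel (fuel - 1)
               ; return (Just a) }
\end{code}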
2194
2195 In the @Nothing@ case, no rewrite takes place---either because the rewrite function
2196 didn't want one or because fuel is exhausted.
2197 We~return a single-node
2198 graph @(RGUnit f (BUnit n))@, decorated with its incoming fact.
2199 We~also apply the transfer function @(fp_transfer pass)@
2200 to the incoming fact to produce the outgoing fact.
2201 (Like @fp_rewrite@, @fp_transfer@ is a record selector of @FwdPass@.)
2202
2203 In the @Just@ case, we receive a replacement
2204 @AGraph@ @ag@ and a new rewrite function~@rw@.
2205 We~convert @ag@ to a @Graph@, using
2206 \par{\small
2207 \begin{code}
2208 `graphOfAGraph :: AGraph n e x -> FuelMonad (Graph n e x)
2209 \end{code}}
2210 and we analyze the resulting @Graph@ with @arfGraph@.
2211 This analysis uses @pass'@, which contains the original lattice and transfer
function from @pass@, together with the new rewrite function~@rw@.
2213
2214 And that's it! If~the client wanted deep rewriting, it is
2215 implemented by the call to @arfGraph@;
2216 if the client wanted
2217 shallow rewriting, the rewrite function will have returned
2218 @noFwdRw@ as~@rw@, which is implanted in @pass'@
2219 (\secref{shallow-vs-deep}).
2220
2221 \subsection{Fixed points} \seclabel{arf-body}
2222
2223 Lastly, @arfBody@ deals with the fixed-point calculation.
2224 This part of the implementation is the only really tricky part, and it is
2225 cleanly separated from everything else:
2226 \par{\small
2227 \begin{code}
2228 arfBody :: Edges n
2229 => FwdPass n f -> Body n -> FactBase f
2230 -> FuelMonad (RG n f C C, FactBase f)
2231 `arfBody pass body ^fbase
  = fixpoint True (fp_lattice pass) (arfBlock pass) fbase $
2233 forwardBlockList (factBaseLabels fbase) body
2234 \end{code}}
Function @forwardBlockList@ takes a list of possible entry points and
a @Body@, and it
2237 returns a linear list of
2238 blocks, sorted into an order that makes forward dataflow efficient:
2239 \begin{code}
2240 `forwardBlockList
2241 :: Edges n => [Label]
2242 -> Body n -> [(Label,Block n C C)]
2243 \end{code}
2244 For
2245 example, if the @Body@ starts at block~@L2@, and @L2@
2246 branches to~@L1@, but not vice versa, then \hoopl\ will reach a fixed point
2247 more quickly if we process @L2@ before~@L1@.
2248 To~find an efficient order, @forwardBlockList@ uses
2249 the methods of the @Edges@ class---@entryLabel@ and @successors@---to
2250 perform a reverse depth-first traversal of the control-flow graph.
2251 %%
2252 %%The @Edges@ type-class constraint on~@n@ propagates to all the
2253 %%@`arfThing@ functions.
2254 %% paragraph carrying too much freight
2255 %%
2256 The order of the blocks does not affect the fixed point or any other
2257 part of the answer; it affects only the number of iterations needed to
2258 reach the fixed point.
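Such a traversal might be coded as follows (a sketch, not \ourlib's
actual code; it assumes @Ord Label@, with @Set@ standing for @Data.Set@,
and a hypothetical lookup function
@blockOfLabel :: Body n -> Label -> Maybe (Block n C C)@):
\begin{code}
forwardBlockList entries body
  = fst (foldl visit ([], Set.empty) entries)
  where
    visit (acc, seen) l
      | Set.member l seen = (acc, seen)
      | otherwise =
          case blockOfLabel body l of
            Nothing  -> (acc, seen)
            Just blk ->
              let (acc', seen') =
                    foldl visit (acc, Set.insert l seen)
                          (successors blk)
              in ((l, blk) : acc', seen')
\end{code}
Because each block is prepended only after its successors have been
visited, the result is a reverse postorder.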
2259
2260 How do we know what entry points to pass to @forwardBlockList@?
2261 We treat
2262 any block with an entry in the in-flowing @FactBase@ as an entry point.
2263 \finalremark{Why does this work?}
2264 {\hfuzz=0.8pt \par}
2265
2266 The rest of the work is done by @fixpoint@, which is shared by
2267 both forward and backward analyses:
2268 \begin{code}
2269 `fixpoint :: forall n f.
2270 Edges n
2271 => Bool -- going Forward?
2272 -> DataflowLattice f
2273 -> (Block n C C -> FactBase f ->
2274 FuelMonad (RG n f C C, FactBase f))
2275 -> FactBase f
2276 -> [(Label, Block n C C)]
2277 -> FuelMonad (RG n f C C, FactBase f)
2278 \end{code}
2279 Except for the mysterious @Bool@ passed as the first argument,
2280 the type signature tells the story.
2281 The third argument is
2282 a function that analyzes and rewrites a single block;
2283 @fixpoint@ applies that function successively to all the blocks,
2284 which are passed as the fifth argument.\finalremark{For consistency with the transfer
2285 functions, blocks should come before @FactBase@, even though this change will
2286 ugly up the call site some.}
2287 The @fixpoint@ function maintains a
2288 ``Current @FactBase@''
2289 which grows monotonically:
2290 the initial value of the Current @FactBase@ is the fourth argument to
2291 @fixpoint@,
2292 and the Current @FactBase@ is augmented with the new facts that flow
2293 out of each @Block@ as it is analyzed.
2294 The @fixpoint@ function
2295 keeps analyzing blocks until the Current @FactBase@ reaches a fixed
2296 point.
2297
2298 The code for @fixpoint@ is a massive 70 lines long;
2299 for completeness, it
2300 appears in Appendix~\ref{app:fixpoint}.
2301 The~code is mostly straightforward, although we try to be a bit clever
2302 about deciding when a new fact means that another iteration over the
2303 blocks will be required.
2304 There is one more subtle point worth mentioning, which we highlight by
2305 considering a forward analysis of this graph, where execution starts at~@L1@:
2306 \begin{code}
2307 L1: x:=3; goto L4
2308 L2: x:=4; goto L4
2309 L4: if x>3 goto L2 else goto L5
2310 \end{code}
Block @L2@ is unreachable: control first reaches @L4@ with \{@x=3@\},
so the test @x>3@ fails and @L2@ is never entered.
2312 But if we \naively\ process all the blocks (say in
2313 order @L1@, @L4@, @L2@), then we will start with the bottom fact for @L2@, propagate
2314 \{@x=4@\} to @L4@, where it will join with \{@x=3@\} to yield
2315 \{@x=@$\top$\}.
2316 Given @x=@$\top$, the
2317 conditional in @L4@ cannot be rewritten, and @L2@~seems reachable. We have
2318 lost a good optimization.
2319
Our implementation solves this problem through a clever trick that is
safe only for a forward analysis:
2322 @fixpoint@ analyzes a block only if the block is
2323 reachable from an entry point.
2324 This trick is not safe for a backward analysis, which
2325 is why
2326 @fixpoint@ takes a @Bool@ as its first argument:
2327 it must know if the analysis goes forward.
2328
2329 Although the trick can be implemented in just a couple of lines of
2330 code, the reasoning behind it is quite subtle---exactly the sort of
2331 thing that should be implemented once in \hoopl, so clients don't have
2332 to worry about it.
2333
2334 \section {Related work} \seclabel{related}
2335
2336 While there is a vast body of literature on
2337 dataflow analysis and optimization,
2338 relatively little can be found on
2339 the \emph{design} of optimizers, which is the topic of this paper.
2340 We therefore focus on the foundations of dataflow analysis
2341 and on the implementations of some comparable dataflow frameworks.
2342
2343 \paragraph{Foundations}
2344
2345 When transfer functions are monotone and lattices are finite in height,
2346 iterative dataflow analysis converges to a fixed point
2347 \cite{kam-ullman:global-iterative-analysis}.
2348 If~the lattice's join operation distributes over transfer
2349 functions,
2350 this fixed point is equivalent to a join-over-all-paths solution to
2351 the recursive dataflow equations
2352 \cite{kildall:unified-optimization}.\footnote
2353 {Kildall uses meets, not joins.
2354 Lattice orientation is conventional, and conventions have changed.
2355 We use Dana Scott's
2356 orientation, in which higher elements carry more information.}
2357 \citet{kam-ullman:monotone-flow-analysis} generalize to some
2358 monotone functions.
2359 Each~client of \hoopl\ must guarantee monotonicity.
2360
2361 \ifcutting
2362 \citet{cousot:abstract-interpretation:1977}
2363 \else
2364 \citet{cousot:abstract-interpretation:1977,cousot:systematic-analysis-frameworks}
2365 \fi
2366 introduce abstract interpretation as a technique for developing
2367 lattices for program analysis.
2368 \citet{schmidt:data-flow-analysis-model-checking} shows that
2369 an all-paths dataflow problem can be viewed as model checking an
2370 abstract interpretation.
2371
2372 \citet{muchnick:compiler-implementation}
2373 presents many examples of both particular analyses and related
2374 algorithms.
2375
2376
2377 The soundness of interleaving analysis and transformation,
2378 even when not all speculative transformations are performed on later
2379 iterations, was shown by
2380 \citet{lerner-grove-chambers:2002}.
2381
2382
2383 \paragraph{Frameworks}
2384 Most dataflow frameworks support only analysis, not transformation.
2385 The framework computes a fixed point of transfer functions, and it is
2386 up to the client of
2387 the framework to use that fixed point for transformation.
2388 Omitting transformation makes it much easier to build frameworks,
2389 and one can find a spectrum of designs.
2390 We~describe two representative
2391 designs, then move on to the prior frameworks that support interleaved
2392 analysis and transformation.
2393
2394 The CIL toolkit \cite{necula:cil:2002}\finalremark{No good citation
2395 for same reason as Soot below ---JD}
2396 provides an analysis-only framework for C~programs.
2397 The framework is limited to one representation of control-flow graphs
2398 and one representation of instructions, both of which are provided by
2399 the framework.
2400 The~API is complicated;
2401 much of the complexity is needed to enable the client to
2402 affect which instructions
2403 the analysis iterates over.
2404
2405
2406 The Soot framework is designed for analysis of Java programs
2407 \cite{hendren:soot:2000}.\finalremark{This citation is probably the
2408 best for Soot in general, but there doesn't appear
2409 to be any formal publication that actually details the dataflow
2410 framework part. ---JD}
We found much to admire in the design of Soot's dataflow library.
2414 Soot's library is abstracted over the representation of
2415 the control-flow graph and the representation of instructions.
2416 Soot's interface for defining lattice and analysis functions is
2417 like our own,
2418 although because Soot is implemented in an imperative style,
2419 additional functions are needed to copy lattice elements.
2420 Like CIL, Soot provides only analysis, not transformation.
2421
2422
2423 \finalremark{FYI, LLVM has Pass Managers that try to control the order of passes,
2424 but I'll be darned if I can find anything that might be termed a dataflow framework.}
2425
2426 The Whirlwind compiler contains the dataflow framework implemented
2427 by \citet{lerner-grove-chambers:2002}, who were the first to
2428 interleave analysis and transformation.
2429 Their implementation is much like our early efforts:
2430 it is a complicated mix of code that simultaneously manages interleaving,
2431 deep rewriting, and fixed-point computation.
2432 By~separating these tasks,
2433 our implementation simplifies the problem dramatically.
2434 Whirlwind's implementation also suffers from the difficulty of
2435 maintaining pointer invariants in a mutable representation of
2436 control-flow graphs, a problem we have discussed elsewhere
2437 \cite{ramsey-dias:applicative-flow-graph}.
2438
2439 Because speculative transformation is difficult in an imperative setting,
2440 Whirlwind's implementation is split into two phases.
2441 The first phase runs the interleaved analyses and transformations
2442 to compute the final dataflow facts and a representation of the transformations
2443 that should be applied to the input graph.
2444 The second phase executes the transformations.
2445 In~\hoopl, because control-flow graphs are immutable, speculative transformations
2446 can be applied immediately, and there is no need
2447 for a phase distinction.
2448
2449 %%% % repetitious...
2450 %%%
2451 %%% \ourlib\ also improves upon Whirlwind's dataflow framework by providing
2452 %%% new support for the optimization writer:
2453 %%% \begin{itemize}
2454 %%% \item Using static type guarantees, \hoopl\ rules out a whole
2455 %%% class of possible bugs: transformations that produced malformed
2456 %%% control-flow graphs.
2457 %%% \item Using dynamic testing,
2458 %%% we can isolate the rewrite that transforms a working program
2459 %%% into a faulty program,
2460 %%% using Whalley's \citeyearpar{whalley:isolation} fault-isolation technique.
2461 %%% \end{itemize}
2462
2463 In previous work \cite{ramsey-dias:applicative-flow-graph}, we
2464 described a zipper-based representation of control-flow
2465 graphs, stressing the advantages
2466 of immutability.
2467 Our new representation, described in \secref{graph-rep}, is a significant improvement:
2468 \begin{itemize}
2469 \item
2470 We can concatenate nodes, blocks, and graphs in constant time.
2471 %Previously, we had to resort to Hughes's
2472 %\citeyearpar{hughes:lists-representation:article} technique, representing
2473 %a graph as a function.
2474 \item
2475 We can do a backward analysis without having
2476 to ``unzip'' (and allocate a copy of) each block.
2477 \item
2478 Using GADTs, we can represent a flow-graph
2479 node using a single type, instead of the triple of first, middle, and
2480 last types used in our earlier representation.
2481 This change simplifies the interface significantly:
2482 instead of providing three transfer functions and three rewrite
2483 functions per pass---one for
2484 each type of node---a client of \hoopl\ provides only one transfer
2485 function and one rewrite function per pass.
2486 \item
Errors in concatenation are ruled out at
compile time by Haskell's static
type system.
2490 In~earlier implementations, such errors were not detected until
2491 the compiler~ran, at which point we tried to compensate
2492 for the errors---but
2493 the compensation code harbored subtle faults,
2494 which we discovered while developing a new back end
2495 for the Glasgow Haskell Compiler.
2496 \end{itemize}
2497
2498 The implementation of \ourlib\ is also much better than
2499 our earlier implementations.
2500 Not only is the code simpler conceptually,
2501 but it is also shorter:
2502 our new implementation is about a third as long
2503 as the previous version, which is part of GHC, version 6.12.
2504
2505
2506
2507
2508 \section{What we learned}
2509
2510 We have spent six years implementing and reimplementing frameworks for
2511 dataflow analysis and transformation.
2512 This formidable design problem taught us
2513 two kinds of lessons:
2514 we learned some very specific lessons about representations and
2515 algorithms for optimizing compilers,
2516 and we were forcibly reminded of some very general, old lessons that are well
2517 known not just to functional programmers, but to programmers
2518 everywhere.
2519
2520
2521
2522 %% \remark{Orphaned: but for transfer functions that
2523 %% approximate weakest preconditions or strongest postconditions,
2524 %% monotonicity falls out naturally.}
2525 %%
2526 %%
2527 %% In conclusion we offer the following lessons from the experience of designing
2528 %% and implementing \ourlib{}.
2529 %% \begin{itemize}
2530 %% \item
2531 %% Although we have not stressed this point, there is a close connection
2532 %% between dataflow analyses and program logic:
2533 %% \begin{itemize}
2534 %% \item
2535 %% A forward dataflow analysis is represented by a predicate transformer
2536 %% that is related to \emph{strongest postconditions}
2537 %% \cite{floyd:meaning}.\footnote
2538 %% {In Floyd's paper the output of the predicate transformer is called
2539 %% the \emph{strongest verifiable consequent}, not the ``strongest
2540 %% postcondition.''}
2541 %% \item
2542 %% A backward dataflow analysis is represented by a predicate transformer
2543 %% that is related to \emph{weakest preconditions} \cite{dijkstra:discipline}.
2544 %% \end{itemize}
2545 %% Logicians write down the predicate transformers for the primitive
2546 %% program fragments, and then use compositional rules to ``lift'' them
2547 %% to a logic for whole programs. In the same way \ourlib{} lets the client
2548 %% write simple predicate transformers,
2549 %% and local rewrites based on those assertions, and ``lifts'' them to entire
2550 %% function bodies with arbitrary control flow.
2551
2552 Our main goal for \hoopl\ was to combine three good ideas (interleaved
2553 analysis and transformation, optimization fuel, and an applicative
2554 control-flow graph) in a way that could easily be reused by many, many
2555 compiler writers.
2556 Reuse requires abstraction, and as is well known,
2557 designing good abstractions is challenging.
2558 \hoopl's data types and the functions over those types have been
2559 through \emph{dozens} of revisions.
2560 As~we were refining our design, we~found it invaluable to operate in
2561 two modes:
2562 In the first mode, we designed, built, and used a framework as an
2563 important component of a real compiler (first Quick~{\PAL}, then GHC).
2564 In the second mode, we designed and built a standalone library, then
2565 redesigned and rebuilt it, sometimes going through several significant
2566 changes in a week.
2567 Operating in the first mode---inside a live compiler---forced us to
2568 make sure that no corners were cut, that we were solving a real
2569 problem, and that we did not inadvertently
2570 cripple some other part of the compiler.
2571 Operating in the second mode---as a standalone library---enabled us to
2572 iterate furiously, trying out many more ideas than would have
2573 been possible in the first mode.
2574 We~have learned that alternating between these two modes leads to a
2575 better design than operating in either mode alone.
2576
2577 We were forcibly reminded of timeless truths:
2578 that interfaces are more important than implementations, and that data
2579 is more important than code.
2580 These truths are reflected in this paper, in which
2581 we
2582 have given \hoopl's API three times as much space as \hoopl's implementation.
2583
2584 We were also reminded that Haskell's type system (polymorphism, GADTs,
2585 higher-order functions, type classes, and so on) is a remarkably
2586 effective
2587 language for thinking about data and code---and that
2588 Haskell lacks a language of interfaces (like ML's signatures) that
2589 would make it equally effective for thinking about APIs at a larger scale.
2590 Still, as usual, the types were a remarkable aid to writing the code:
2591 when we finally agreed on the types presented above, the
2592 code almost wrote itself.
2593
2594 Types are widely appreciated at ICFP, but here are three specific
2595 examples of how types helped us:
2596 \begin{itemize}
2597 \item
2598 Reuse is enabled by representation-independence, which in a functional
2599 language is
2600 expressed through parametric polymorphism.
2601 Making \ourlib{} polymorphic in the nodes
2602 made the code simpler, easier to understand, and easier to maintain.
2603 In particular, it forced us to make explicit \emph{exactly} what
2604 \ourlib\ must know about nodes, and to embody that knowledge in the
2605 @Edges@ type class (\secref{edges}).
2606 \item
2607 We are proud of using GADTs to
2608 track the open and closed shapes of nodes, blocks, and graphs at
2609 compile time.
2610 Shapes may
2611 seem like a small refinement, but they helped tremendously when
2612 building \hoopl, and we expect them to help clients.
2613 %
2614 % this paper is just not about run-time performance ---NR
2615 %
2616 %%%% Moreover, the implementation is faster than it would otherwise be,
%%%% because, say, a @(Fact O f)@ is known to be just an @f@ rather than
2618 %%%% being a sum type that must be tested (with a statically known outcome!).
2619 %
2620 Giving the \emph{same} shapes
2621 to nodes, blocks, and graphs helped our
2622 thinking and helped to structure the implementation.
2623 \item
2624 In our earlier designs, graphs were parameterized over \emph{three} node
2625 types: first, middle, and last nodes.
2626 Those designs therefore required
2627 three transfer functions, three rewrite functions, and so~on.
2628 Moving to a single, ``shapely'' node type was a major breakthrough:
2629 not only do we have just one node type, but our client need supply only
2630 one transfer function and one rewrite function.
To~make this design work, however, we \emph{must} have
the type-level function @Fact@ (\figref{api-types}), to express how
incoming and outgoing facts depend on the shape of a node;
all three devices are recapped in the sketch just after this list.
2635 \end{itemize}
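
To make these points concrete, here is a simplified recap of the
three devices just mentioned, abbreviated from \secref{edges} and
\figref{api-types} (class contexts and kind annotations are elided):
\begin{smallcode}
data O                 -- open
data C                 -- closed

class Edges n where    -- what Hoopl must know about nodes
  entryLabel :: n C x -> Label    -- label of a node closed on entry
  successors :: n e C -> [Label]  -- successors of a node closed on exit

type family   Fact x f :: *
type instance Fact O f = f           -- one fact flows along an open end
type instance Fact C f = FactBase f  -- closed end: a fact per successor label
\end{smallcode}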
2636
2637 Dataflow optimization is usually described as a way to improve imperative
2638 programs by mutating control-flow graphs.
2639 Such transformations appear very different from the tree rewriting
2640 that functional languages are so well known for, and that makes
2641 functional languages so attractive for writing other parts of compilers.
2642 But even though dataflow optimization looks very different from
2643 what we are used to,
2644 writing a dataflow optimizer
2645 in a pure functional language was a huge win.
2646 % We could not possibly have conceived \ourlib{} in C++.
2647 In~a pure functional language, not only do we know that
2648 no data structure will be unexpectedly mutated,
2649 but we are forced to be
2650 explicit about every input and output,
2651 and we are encouraged to implement things compositionally.
2652 This kind of thinking has helped us make
2653 significant improvements to the already tricky work of Lerner, Grove,
2654 and Chambers:
2655 per-function control of shallow vs deep rewriting
2656 (\secref{shallow-vs-deep}),
2657 combinators for dataflow passes (\secref{combinators}),
2658 optimization fuel (\secref{fuel}),
2659 and transparent management of unreachable blocks (\secref{arf-body}).
2660 We~trust that these improvements are right only because they are
2661 implemented in separate
2662 parts of the code that cannot interact except through
2663 explicit function calls.
2664 %%
2665 %%An ancestor of \ourlib{} is in the Glasgow Haskell Compiler today,
2666 %%in version~6.12.
2667 With this new, improved design in hand, we are now moving back to
2668 live-compiler mode, pushing \hoopl\ into version
2669 6.13 of the Glasgow Haskell Compiler.
2670
2671
2672 \acks
2673
2674 The first and second authors were funded
2675 by a grant from Intel Corporation and
2676 by NSF awards CCF-0838899 and CCF-0311482.
2677 These authors also thank Microsoft Research Ltd, UK, for funding
2678 extended visits to the third author.
2679
2680
2681 \makeatother
2682
2683 \providecommand\includeftpref{\relax} %% total bafflement -- workaround
2684 \IfFileExists{nrbib.tex}{\bibliography{cs,ramsey}}{\bibliography{dfopt}}
2685 \bibliographystyle{plainnatx}
2686
2687
2688 \clearpage
2689
2690 \appendix
2691
2692 % omit LabelSet :: *
2693 % omit LabelMap :: *
2694 % omit delFromFactBase :: FactBase f -> [(Label,f)] -> FactBase f
2695 % omit elemFactBase :: Label -> FactBase f -> Bool
2696 % omit elemLabelSet :: Label -> LabelSet -> Bool
2697 % omit emptyLabelSet :: LabelSet
2698 % omit factBaseLabels :: FactBase f -> [Label]
2699 % omit extendFactBase :: FactBase f -> Label -> f -> FactBase f
2700 % omit extendLabelSet :: LabelSet -> Label -> LabelSet
2701 % omit getFuel :: FuelMonad Fuel
2702 % omit setFuel :: Fuel -> FuelMonad ()
2703 % omit lookupFact :: FactBase f -> Label -> Maybe f
2704 % omit factBaseList :: FactBase f -> [(Label, f)]
2705
2706 \section{Code for \textmd{\texttt{fixpoint}}}
2707 \label{app:fixpoint}
2708
2709 {\def\baselinestretch{0.95}\hfuzz=20pt
2710 \begin{smallcode}
2711 data `TxFactBase n f
2712 = `TxFB { `tfb_fbase :: FactBase f
2713 , `tfb_rg :: RG n f C C -- Transformed blocks
2714 , `tfb_cha :: ChangeFlag
2715 , `tfb_lbls :: LabelSet }
-- Set the tfb_cha flag iff
-- (a) the fact in tfb_fbase for a block L changes, and
-- (b) L is in tfb_lbls.
-- The tfb_lbls are the Labels of the *original*
-- (not transformed) blocks
2721
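-- updateFact lat lbls (lbl, new_fact) joins new_fact into the
-- accumulated FactBase; it reports SomeChange only if the fact
-- stored for lbl actually changes and lbl labels an original block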
2722 `updateFact :: DataflowLattice f -> LabelSet -> (Label, f)
2723 -> (ChangeFlag, FactBase f)
2724 -> (ChangeFlag, FactBase f)
2725 updateFact ^lat ^lbls (lbl, ^new_fact) (^cha, fbase)
2726 | NoChange <- ^cha2 = (cha, fbase)
2727 | lbl `elemLabelSet` lbls = (SomeChange, new_fbase)
2728 | otherwise = (cha, new_fbase)
2729 where
2730 (cha2, ^res_fact)
2731 = case lookupFact fbase lbl of
2732 Nothing -> (SomeChange, new_fact)
2733 Just ^old_fact -> fact_extend lat old_fact new_fact
2734 ^new_fbase = extendFactBase fbase lbl res_fact
2735
2736 fixpoint :: forall n f. Edges n
2737 => Bool -- Going forwards?
2738 -> DataflowLattice f
2739 -> (Block n C C -> FactBase f
2740 -> FuelMonad (RG n f C C, FactBase f))
2741 -> FactBase f -> [(Label, Block n C C)]
2742 -> FuelMonad (RG n f C C, FactBase f)
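-- fixpoint applies do_block to every block, then repeats until no
-- fact in the FactBase changes; in a forward pass, blocks that have
-- no incoming fact yet are skipped as unreachable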
2743 fixpoint ^is_fwd lat ^do_block ^init_fbase ^blocks
2744 = do { ^fuel <- getFuel
2745 ; ^tx_fb <- loop fuel init_fbase
2746 ; return (tfb_rg tx_fb,
2747 tfb_fbase tx_fb `delFromFactBase` blocks) }
2748 -- The outgoing FactBase contains facts only for
2749 -- Labels *not* in the blocks of the graph
2750 where
2751 `tx_blocks :: [(Label, Block n C C)]
2752 -> TxFactBase n f -> FuelMonad (TxFactBase n f)
2753 tx_blocks [] tx_fb = return tx_fb
2754 tx_blocks ((lbl,blk):bs) tx_fb = tx_block lbl blk tx_fb
2755 >>= tx_blocks bs
2756
2757 `tx_block :: Label -> Block n C C
2758 -> TxFactBase n f -> FuelMonad (TxFactBase n f)
2759 tx_block ^lbl ^blk tx_fb@(TxFB { tfb_fbase = fbase
2760 , tfb_lbls = lbls
2761 , tfb_rg = ^blks
2762 , tfb_cha = cha })
2763 | is_fwd && not (lbl `elemFactBase` fbase)
2764 = return tx_fb -- Note [Unreachable blocks]
2765 | otherwise
2766 = do { (rg, ^out_facts) <- do_block blk fbase
2767 ; let (^cha', ^fbase')
2768 = foldr (updateFact lat lbls) (cha,fbase)
2769 (factBaseList out_facts)
2770 ; return (TxFB { tfb_lbls = extendLabelSet lbls lbl
2771 , tfb_rg = rg `RGCatC` blks
2772 , tfb_fbase = fbase'
2773 , tfb_cha = cha' }) }
2774
2775 loop :: Fuel -> FactBase f -> FuelMonad (TxFactBase n f)
2776 `loop fuel fbase
2777 = do { let ^init_tx_fb = TxFB { tfb_fbase = fbase
2778 , tfb_cha = NoChange
2779 , tfb_rg = RGNil
2780 , tfb_lbls = emptyLabelSet}
2781 ; tx_fb <- tx_blocks blocks init_tx_fb
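           -- on SomeChange, restore the fuel to its value at the
           -- start of this iteration: only the fuel consumed by the
           -- final, converging iteration should count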
2782 ; case tfb_cha tx_fb of
2783 NoChange -> return tx_fb
2784 SomeChange -> setFuel fuel >>
2785 loop fuel (tfb_fbase tx_fb) }
2786 \end{smallcode}
2787 \par
2788 } % end \baselinestretch
2789
2790
2791 \section{Index of defined identifiers}
2792
2793 This appendix lists every nontrivial identifier used in the body of
2794 the paper.
2795 For each identifier, we list the page on which that identifier is
2796 defined or discussed---or when appropriate, the figure (with line
2797 number where possible).
2798 For those few identifiers not defined or discussed in text, we give
2799 the type signature and the page on which the identifier is first
2800 referred to.
2801
2802 Some identifiers used in the text are defined in the Haskell Prelude;
2803 for those readers less familiar with Haskell, these identifiers are
2804 listed in Appendix~\ref{sec:prelude}.
2805
2806 \newcommand\dropit[3][]{}
2807
2808 \newcommand\hsprelude[2]{\noindent
2809 \texttt{#1} defined in the Haskell Prelude\\}
2810 \let\hsprelude\dropit
2811
2812 \newcommand\hspagedef[3][]{\noindent
2813 \texttt{#2} defined on page~\pageref{#3}.\\}
2814 \newcommand\omithspagedef[3][]{\noindent
2815 \texttt{#2} not shown (but see page~\pageref{#3}).\\}
2816 \newcommand\omithsfigdef[3][]{\noindent
2817 \texttt{#2} not shown (but see Figure~\ref{#3} on page~\pageref{#3}).\\}
2818 \newcommand\hsfigdef[3][]{%
2819 \noindent
2820 \ifx!#1!%
2821 \texttt{#2} defined in Figure~\ref{#3} on page~\pageref{#3}.\\
2822 \else
2823 \texttt{#2} defined on \lineref{#1} of Figure~\ref{#3} on page~\pageref{#3}.\\
2824 \fi
2825 }
2826 \newcommand\hstabdef[3][]{%
2827 \noindent
\ifx!#1!%
2829 \texttt{#2} defined in Table~\ref{#3} on page~\pageref{#3}.\\
2830 \else
2831 \texttt{#2} defined on \lineref{#1} of Table~\ref{#3} on page~\pageref{#3}.\\
2832 \fi
2833 }
2834 \newcommand\hspagedefll[3][]{\noindent
2835 \texttt{#2} {let}- or $\lambda$-bound on page~\pageref{#3}.\\}
2836 \newcommand\hsfigdefll[3][]{%
2837 \noindent
2838 \ifx!#1!%
2839 \texttt{#2} {let}- or $\lambda$-bound in Figure~\ref{#3} on page~\pageref{#3}.\\
2840 \else
2841 \texttt{#2} {let}- or $\lambda$-bound on \lineref{#1} of Figure~\ref{#3} on page~\pageref{#3}.\\
2842 \fi
2843 }
2844
2845 \newcommand\nothspagedef[3][]{\notdefd\ndpage{#1}{#2}{#3}}
2846 \newcommand\nothsfigdef[3][]{\notdefd\ndfig{#1}{#2}{#3}}
2847 \newcommand\nothslinedef[3][]{\notdefd\ndline{#1}{#2}{#3}}
2848
2849 \newcommand\ndpage[3]{\texttt{#2}~(p\pageref{#3})}
2850 \newcommand\ndfig[3]{\texttt{#2}~(Fig~\ref{#3},~p\pageref{#3})}
2851 \newcommand\ndline[3]{%
2852 \ifx!#1!%
2853 \ndfig{#1}{#2}{#3}%
2854 \else
2855 \texttt{#2}~(Fig~\ref{#3}, line~\lineref{#1}, p\pageref{#3})%
2856 \fi
2857 }
2858
2859 \newif\ifundefinedsection\undefinedsectionfalse
2860
2861 \newcommand\notdefd[4]{%
2862 \ifundefinedsection
2863 , #1{#2}{#3}{#4}%
2864 \else
2865 \undefinedsectiontrue
2866 \par
2867 \section{Undefined identifiers}
2868 #1{#2}{#3}{#4}%
2869 \fi
2870 }
2871
2872 \begingroup
2873 \raggedright
2874
2875 \input{defuse}%
2876 \ifundefinedsection.\fi
2877
2878 \undefinedsectionfalse
2879
2880
2881 \renewcommand\hsprelude[2]{\noindent
2882 \ifundefinedsection
2883 , \texttt{#1}%
2884 \else
2885 \undefinedsectiontrue
2886 \par
2887 \section{Identifiers defined in Haskell Prelude or a standard library}\label{sec:prelude}
2888 \texttt{#1}%
2889 \fi
2890 }
2891 \let\hspagedef\dropit
2892 \let\omithspagedef\dropit
2893 \let\omithsfigdef\dropit
2894 \let\hsfigdef\dropit
2895 \let\hstabdef\dropit
2896 \let\hspagedefll\dropit
2897 \let\hsfigdefll\dropit
2898 \let\nothspagedef\dropit
2899 \let\nothsfigdef\dropit
2900 \let\nothslinedef\dropit
2901
2902 \input{defuse}
2903 \ifundefinedsection.\fi
2904
2905
2906
2907 \endgroup
2908
2909
2910 \iffalse
2911
2912 \section{Dataflow-engine functions}
2913
2914
2915 \begin{figure*}
2916 \setcounter{codeline}{0}
2917 \begin{numberedcode}
2918 \end{numberedcode}
2919 \caption{The forward iterator}
2920 \end{figure*}
2921
2922 \begin{figure*}
2923 \setcounter{codeline}{0}
2924 \begin{numberedcode}
2925 \end{numberedcode}
2926 \caption{The forward actualizer}
2927 \end{figure*}
2928
2929
2930 \fi
2931
2932
2933
2934 \end{document}
2935
2936
2937
2938
2939 THE FUEL PROBLEM:
2940
2941
2942 Here is the problem:
2943
A graph has an entry sequence, a body, and an exit sequence.
Correctly computing facts on and flowing out of the body requires
iteration; computation on the entry and exit sequences does not, since
each is connected to the body by exactly one flow edge.
2948
2949 The problem is to provide the correct fuel supply to the combined
2950 analysis/rewrite (iterator) functions, so that speculative rewriting
2951 is limited by the fuel supply.
2952
2953 I will number iterations from 1 and name the fuel supplies as
2954 follows:
2955
  f_pre     fuel remaining before analysis/rewriting starts
  f_0       fuel remaining after analysis/rewriting of the entry sequence
  f_i, i>0  fuel remaining after iteration i of the body
  f_post    fuel remaining after analysis/rewriting of the exit sequence
2960
The issue here is that only the last iteration of the body 'counts'.
To formalize, I will name the fuel consumed as follows:
2963
2964 C_pre fuel consumed by speculative rewrites in entry sequence
2965 C_i fuel consumed by speculative rewrites in iteration i of body
2966 C_post fuel consumed by speculative rewrites in exit sequence
2967
2968 These quantities should be related as follows:
2969
  f_0    = f_pre - C_pre
  f_i    = f_0   - C_i      where i > 0
  f_post = f_n   - C_post   where iteration converges after n steps
2973
2974 When the fuel supply is passed explicitly as parameter and result, it
2975 is fairly easy to see how to keep reusing f_0 at every iteration, then
2976 extract f_n for use before the exit sequence. It is not obvious to me
2977 how to do it cleanly using the fuel monad.
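
For concreteness, here is a minimal sketch of the explicit-threading
version. The names are hypothetical; 'body' stands for one combined
analysis/rewrite pass over the body of the graph, taking the fuel
remaining and returning the fuel left after its speculative rewrites:

  type Fuel = Int

  -- Iterate 'body' to a fixed point, restarting from fuel supply f_0
  -- each time; the fuel paired with the converging result is f_n,
  -- which the caller then threads into the exit sequence.
  loopBody :: Eq fact
           => (Fuel -> fact -> (Fuel, fact)) -> Fuel -> fact -> (Fuel, fact)
  loopBody body f_0 fact
    | fact' == fact = (f_i, fact')            -- converged: f_i is f_n
    | otherwise     = loopBody body f_0 fact' -- discard f_i, reuse f_0
    where (f_i, fact') = body f_0 fact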
2978
2979
2980 Norman