\iffalse

Say something about the proliferation of heavyweight type signatures
required for GADT pattern matches. When the signature is three times
the size of the function, something is wrong...


I was thinking again about unwrapped nodes, cons-lists, snoc-lists,
and tree fringe. I think there's an analogy with ordinary lists, and
the analogy is 'concatMap' vs 'fold'. Just as with lists, the common
case of 'concatMap (\a -> [a])' does a lot of unnecessary wrapping and
unwrapping of elements. We nevertheless prefer 'concatMap' because it
provides better separation of concerns. But this analogy suggests
several ideas:

  - Can block processing be written in higher-order fashion?

  - Can it be done in both styles (concatMap *and* fold)?

  - Once we have framed the problem in these terms, can we write
    fold-style code that is not too hard to understand and maintain?

  - Finally, is there an analog of stream fusion that would work for
    control-flow graphs, enabling us to write some graph combinators
    that are both perspicuous and efficient?

These are good observations for the paper and for future work.


----------------------------------------------------------------


P.S. The three of us should have a nice little Skype chat about
higher-rank types. A lot of the questions still at issue boil down to
the following observations:

  - It's really convenient for the *implementation* of Hoopl to put
    forall to the left of an arrow. Polymorphic functions as
    arguments are powerful and keen, and they make it really easy for
    Hoopl to do its job.

  - It's a bit inconvenient for a *client* of Hoopl to be forced to
    supply functions that are polymorphic in shape. All constructors
    have to be written out by hand; defaulting is not possible. (Or
    at least John and I haven't figured out how to do it.)

  - Using higher-rank types in Hoopl's interface makes some very
    desirable things downright impossible:

      - One can't write a generic dominator analysis with type

          dom :: Edges n => FwdPass n Dominators

      - One can't write a generic debugging wrapper of type

          debug :: (Show n, Show f)
                => TraceFn -> WhatToTrace -> FwdPass n f -> FwdPass n f

      - One can't write a combinator of type

          product :: FwdPass n f -> FwdPass n f' -> FwdPass n (f, f')

I submit that these things are all very desirable.

I'm therefore strongly in favor of removing the *requirement* that a
FwdPass include transfer and rewrite functions that are polymorphic in
shape. It will be a bit tedious to return to triples of functions,
but for those lucky clients who *can* write polymorphic functions and
who wish to, we can provide injection functions.

I should stress that I believe the tedium can be contained within
reasonable bounds---for example, the arfXXX functions that are
internal to Hoopl all remain polymorphic in shape (although not
higher-rank any longer).

\fi


%
% TODO
%
%
% AGraph = graph under construction
% Graph = replacement graph
%
%%% Hoopl assumes that the replacement graph for a node open at the exit
%%% doesn't contain any additional exits
%%%
%%% introduce replacement graph in section 3, after graph.
%%% it has arbitrarily complicated internal control flow, but only
%%% one exit (if open at the exit)
%%%
%%% a requirement on the client that is not checked statically.


\input{dfoptdu.tex}

\newif\ifpagetuning \pagetuningtrue % adjust page breaks

\newif\ifnoauthornotes \noauthornotesfalse

\newif\iftimestamp\timestamptrue % show MD5 stamp of paper

% \timestampfalse % it's post-submission time

\IfFileExists{timestamp.tex}{}{\timestampfalse}

\newif\ifcutting \cuttingfalse % cutting down to submission size


\newif\ifgenkill\genkillfalse % have a section on gen and kill
\genkillfalse


\newif\ifnotesinmargin \notesinmarginfalse
\IfFileExists{notesinmargin.tex}{\notesinmargintrue}{\relax}

\documentclass[blockstyle,preprint,natbib,nocopyrightspace]{sigplanconf}

\newcommand\ourlib{Hoopl}
% higher-order optimization library
% ('Hoople' was taken -- see hoople.org)
\let\hoopl\ourlib

\newcommand\ag{\ensuremath{\mathit{ag}}}
\renewcommand\ag{\ensuremath{g}} % not really seeing that 'ag' is helpful here ---NR
\newcommand\rw{\ensuremath{\mathit{rw}}}

% l2h substitution ourlib Hoopl
% l2h substitution hoopl Hoopl

\newcommand\fs{\ensuremath{\mathit{fs}}} % dataflow facts, possibly plural

\newcommand\vfilbreak[1][\baselineskip]{%
  \vskip 0pt plus #1 \penalty -200 \vskip 0pt plus -#1 }

\usepackage{alltt}
\usepackage{array}
\usepackage{afterpage}
\newcommand\lbr{\char`\{}
\newcommand\rbr{\char`\}}

\clubpenalty=10000
\widowpenalty=10000

\usepackage{verbatim} % allows us to define \begin{smallcode}
\newenvironment{smallcode}{\par\unskip\small\verbatim}{\endverbatim}
\newenvironment{fuzzcode}[1]{\par\unskip\hfuzz=#1 \verbatim}{\endverbatim}
\newenvironment{smallfuzzcode}[1]{\par\unskip\small\hfuzz=#1 \verbatim}{\endverbatim}

\newcommand\lineref[1]{line~\ref{line:#1}}
\newcommand\linepairref[2]{lines \ref{line:#1}~and~\ref{line:#2}}
\newcommand\linerangeref[2]{\mbox{lines~\ref{line:#1}--\ref{line:#2}}}
\newcommand\Lineref[1]{Line~\ref{line:#1}}
\newcommand\Linepairref[2]{Lines \ref{line:#1}~and~\ref{line:#2}}
\newcommand\Linerangeref[2]{\mbox{Lines~\ref{line:#1}--\ref{line:#2}}}

\makeatletter

\let\c@table=\c@figure % one counter for tables and figures, please

\newcommand\setlabel[1]{%
  \setlabel@#1!!\@endsetlabel
}
\def\setlabel@#1!#2!#3\@endsetlabel{%
  \ifx*#1*% line begins with label or is empty
    \ifx*#2*% line is empty
      \verbatim@line{}%
    \else
      \@stripbangs#3\@endsetlabel%
      \label{line:#2}%
    \fi
  \else
    \@stripbangs#1!#2!#3\@endsetlabel%
  \fi
}
\def\@stripbangs#1!!\@endsetlabel{%
  \verbatim@line{#1}%
}


\verbatim@line{hello mama}

\newcommand{\numberedcodebackspace}{0.5\baselineskip}

\newcounter{codeline}
\newenvironment{numberedcode}
  {\endgraf
   \def\verbatim@processline{%
     \noindent
     \expandafter\ifx\expandafter+\the\verbatim@line+ % blank line
       %{\small\textit{\def\rmdefault{cmr}\rmfamily\phantom{00}\phantom{: \,}}}%
     \else
       \refstepcounter{codeline}%
       {\small\textit{\def\rmdefault{cmr}\rmfamily\phantom{00}\llap{\arabic{codeline}}: \,}}%
     \fi
     \expandafter\setlabel\expandafter{\the\verbatim@line}%
     \expandafter\ifx\expandafter+\the\verbatim@line+ % blank line
       \vspace*{-\numberedcodebackspace}\par%
     \else
       \the\verbatim@line\par
     \fi}%
   \verbatim
  }
  {\endverbatim}

\makeatother

\newcommand\arrow{\rightarrow}

\newcommand\join{\sqcup}
\newcommand\slotof[1]{\ensuremath{s_{#1}}}
\newcommand\tempof[1]{\ensuremath{t_{#1}}}
\let\tempOf=\tempof
\let\slotOf=\slotof

\makeatletter
\newcommand{\nrmono}[1]{%
  {\@tempdima = \fontdimen2\font\relax
   \texttt{\spaceskip = 1.1\@tempdima #1}}}
\makeatother

\usepackage{times} % denser fonts
\renewcommand{\ttdefault}{aett} % \texttt that goes better with times fonts
\usepackage{enumerate}
\usepackage{url}
\usepackage{graphicx}
\usepackage{natbib} % redundant for Simon
\bibpunct();A{},
\let\cite\citep
\let\citeyearnopar=\citeyear
\let\citeyear=\citeyearpar

\usepackage[ps2pdf,bookmarksopen,breaklinks,pdftitle=dataflow-made-simple]{hyperref}
\usepackage{breakurl} % enables \burl

\newcommand\naive{na\"\i ve}
\newcommand\naively{na\"\i vely}
\newcommand\Naive{Na\"\i ve}

\usepackage{amsfonts}
\newcommand\naturals{\ensuremath{\mathbb{N}}}
\newcommand\true{\ensuremath{\mathbf{true}}}
\newcommand\implies{\supseteq} % could use \Rightarrow?

\newcommand\PAL{\mbox{C{\texttt{-{}-}}}}
\newcommand\high[1]{\mbox{\fboxsep=1pt \smash{\fbox{\vrule height 6pt
  depth 0pt width 0pt \leavevmode \kern 1pt #1}}}}

\usepackage{tabularx}

%%
%% 2009/05/10: removed 'float' package because it breaks multiple
%% \caption's per {figure} environment. ---NR
%%
%% % Put figures in boxes --- WHY??? --NR
%% \usepackage{float}
%% \floatstyle{boxed}
%% \restylefloat{figure}
%% \restylefloat{table}



% ON LINE THREE, set \noauthornotestrue to suppress notes (or not)

%\newcommand{\qed}{QED}
\ifnotesinmargin
\long\def\authornote#1{%
  \ifvmode
    \marginpar{\raggedright\hbadness=10000
      \parindent=8pt \parskip=2pt
      \def\baselinestretch{0.8}\tiny
      \itshape\noindent #1\par}%
  \else
    \unskip\raisebox{-3.5pt}{\rlap{$\scriptstyle\diamond$}}%
    \marginpar{\raggedright\hbadness=10000
      \parindent=8pt \parskip=2pt
      \def\baselinestretch{0.8}\tiny
      \itshape\noindent #1\par}%
  \fi}
\else
% Simon: please set \notesinmarginfalse on the first line
\long\def\authornote#1{{\em #1\/}}
\fi
\ifnoauthornotes
\def\authornote#1{\unskip\relax}
\fi

\newcommand{\simon}[1]{\authornote{SLPJ: #1}}
\newcommand{\norman}[1]{\authornote{NR: #1}}
\let\remark\norman
\def\finalremark#1{\relax}
% \let \finalremark \remark % uncomment after submission
\newcommand{\john}[1]{\authornote{JD: #1}}
\newcommand{\todo}[1]{\textbf{To~do:} \emph{#1}}
\newcommand\delendum[1]{\relax\ifvmode\else\unskip\fi\relax}

\newcommand\secref[1]{Section~\ref{sec:#1}}
\newcommand\secreftwo[2]{Sections \ref{sec:#1}~and~\ref{sec:#2}}
\newcommand\seclabel[1]{\label{sec:#1}}

\newcommand\figref[1]{Figure~\ref{fig:#1}}
\newcommand\figreftwo[2]{Figures \ref{fig:#1}~and~\ref{fig:#2}}
\newcommand\figlabel[1]{\label{fig:#1}}

\newcommand\tabref[1]{Table~\ref{tab:#1}}
\newcommand\tablabel[1]{\label{tab:#1}}


\newcommand{\CPS}{\textbf{StkMan}} % Not sure what to call it.


\usepackage{code} % At-sign notation

\iftimestamp
\input{timestamp}
\preprintfooter{\mdfivestamp}
\fi

\hyphenation{there-by}

\renewcommand{\floatpagefraction}{0.9} % must be less than \topfraction
\renewcommand{\topfraction}{0.95}
\renewcommand{\textfraction}{0.05}
\begin{document}
%\title{\ourlib: Dataflow Optimization Made Simple}
\title{\ourlib: A Modular, Reusable Library for\\ Dataflow Analysis and Transformation}
%\title{Implementing Dataflow Analysis and Optimization by Lifting Node Functions to Basic Blocks and Control-Flow Graphs}
\subtitle{Programming pearl}

\titlebanner{\textsf{\mdseries\itshape
This paper is a revised version of a submission to ICFP'10. Comments
are welcome; please identify this version by the words
\textbf{\mdfivewords}.
}}

\ifnoauthornotes
\makeatletter
\let\HyPsd@Warning=\@gobble
\makeatother
\fi

% João


\authorinfo{Norman Ramsey}{Tufts University}{nr@cs.tufts.edu}
\authorinfo{Jo\~ao Dias}{Tufts University}{dias@cs.tufts.edu}
\authorinfo{Simon Peyton Jones}{Microsoft Research}{simonpj@microsoft.com}


\maketitle

\begin{abstract}
\iffalse % A vote for Simon's abstract
\remark{I have replaced a good abstract of the POPL submission with a
bad abstract of \emph{this} submission.}
We present \ourlib, a Haskell library that makes it easy for a
compiler writer
to implement program transformations based on dataflow analyses.
A~client of \ourlib\ defines a representation of
logical assertions,
a transfer function that computes outgoing assertions from incoming
assertions,
and a rewrite function that improves code when improvements are
justified by the assertions.
\ourlib\ does the actual analysis and transformation.

\ourlib\ implements state-of-the-art algorithms:
Lerner, Grove, and Chambers's
\citeyearpar{lerner-grove-chambers:2002}
composition of simple analyses and transformations, which achieves
the same precision as complex, handwritten
``super-analyses;''
and Whalley's \citeyearpar{whalley:isolation} dynamic technique for
isolating bugs in a client's code.
\ourlib's implementation is unique: unlike previous
implementations,
it carefully separates the tricky
elements of each of these algorithms, so that they can be examined and
understood independently.


\simon{Here is an alternative abstract based on the four-sentence model.}
\remark{Four-sentence model? You must teach me\ldots}
\fi
Dataflow analysis and transformation of control-flow graphs is
pervasive in optimizing compilers, but it is typically tightly
interwoven with the details of a \emph{particular} compiler.
We~describe \ourlib{}, a reusable Haskell library that makes it
unusually easy to define new analyses and
transformations for \emph{any} compiler.
\ourlib's interface is modular and polymorphic,
and it offers unusually strong static guarantees.
The implementation
is also far from routine: it encapsulates
state-of-the-art algorithms (interleaved analysis and rewriting,
dynamic error isolation), and it cleanly separates their tricky elements
so that they can be understood independently.
%
%\ourlib\ will be the workhorse of a new
%back end for the Glasgow Haskell Compiler (version~6.14, forthcoming).

%% \emph{Reviewers:} code examples are indexed at {\small\url{http://bit.ly/jkr3K}}
%%% Source: http://www.cs.tufts.edu/~nr/drop/popl-index.pdf
\end{abstract}

\makeatactive % Enable @foo@ notation

\section{Introduction}

A mature optimizing compiler for an imperative language includes many
analyses, the results of which justify the optimizer's
code-improving transformations.
Many of the most important analyses and transformations---constant
propagation, live-variable analysis, inlining, sinking of loads,
and so on---should be regarded as particular cases of
a single general problem: \emph{dataflow analysis and optimization}.
%% \remark{I do not feel compelled to cite Muchnick (or anyone else) here}
Dataflow analysis is over thirty years old,
but a recent, seminal paper by \citet{lerner-grove-chambers:2002} goes further,
describing a powerful but subtle way to
\emph{interleave} analysis and transformation so that each
piggybacks on the other.

Because optimizations based on dataflow analysis
share a common intellectual framework, and because that framework is
subtle, it is tempting to
try to build a single reusable library that embodies the
subtle ideas, while
making it easy for clients to instantiate the library for different
situations.
Tempting, but difficult.
Although some such frameworks exist, as we discuss
in \secref{related}, they have complex APIs and implementations,
and none implements the Lerner/Grove/Chambers technique.

In this paper we present \ourlib{} (short for ``higher-order
optimization library''), a new Haskell library for dataflow analysis and
optimization. It has the following distinctive characteristics:

\begin{itemize}
\item
\ourlib\ is purely functional.
Perhaps surprisingly, code that
manipulates control-flow graphs is easier to write, and far easier
to write correctly, when written in a purely functional style
\cite{ramsey-dias:applicative-flow-graph}.
When analysis and rewriting
are interleaved, so that rewriting must be done \emph{speculatively},
without knowing whether
the result of the rewrite will be retained or discarded,
the benefit of a purely functional style is intensified
(Sections \ref{sec:overview} and \ref{sec:fixpoints}).

\item
\ourlib\ is polymorphic. Just as a list library is
polymorphic in the list elements, so is \ourlib{} polymorphic, both in
the nodes that inhabit graphs, and in the dataflow facts that
analyses compute over these graphs (\secref{using-hoopl}).

\item The paper by Lerner, Grove, and Chambers is inspiring but abstract.
We articulate their ideas in a concrete but simple API that hides
a subtle implementation (Sections \ref{sec:graph-rep} and \ref{sec:using-hoopl}).
You provide a representation for assertions,
a transfer function that transforms assertions across a node,
and a rewrite function that uses an assertion to
justify rewriting a node.
\ourlib\ ``lifts'' these node-level functions to work over
control-flow graphs, sets up and solves recursion equations,
and interleaves rewriting with analysis.
Designing good abstractions (data types, APIs) is surprisingly
hard; we have been through over a dozen significantly different
iterations, and we offer our API as a contribution.
% ``in its own right'' -- there's an echo in here...

\item
Because the client
can perform very local reasoning (``@y@ is live before
@x:=y+2@''),%
\footnote
{Using \hoopl, it is not necessary to have the more complex rule
``if @x@~is live after @x:=y+2@ then @y@ is live before~it,''
because if @x@~is \emph{not} live after @x:=y+2@, the assignment
@x:=y+2@ will be eliminated.}
analyses and transformations built on \ourlib\
are small, simple, and easy to get right.
Moreover, \ourlib\ helps you write correct optimizations:
it~statically rules out transformations that violate invariants
of the control-flow graph (Sections \ref{sec:graph-rep} and \ref{sec:rewrites}),
and dynamically it can help find the first transformation that introduces a fault
in a test program (\secref{fuel}).
\finalremark{SLPJ: I wanted to write more about open/closed,
but I like this sentence with its claim to both static and dynamic assistance,
and maybe the open/closed story is hard to understand here.}

% \item \ourlib{} makes use of GADTs and type functions to offer unusually
%   strong static guarantees. In particular, nodes, basic blocks, and
%   graphs are all statically typed by their open or closedness on entry, and
%   their open or closedness on exit (\secref{graph-rep}). For example, an add instruction is
%   open on entry and exit, while a branch is open on entry and closed on exit.
%   Using these types we can statically guarantee that, say, an add instruction
%   is rewritten to a graph that is also open on both entry and exit; and
%   that the user cannot construct a block where an add instruction follows an
%   unconditional branch. We know of no other system that offers
%   static guarantees this strong.

\item \ourlib{} implements subtle algorithms, including
(a)~interleaved analysis and rewriting, (b)~speculative rewriting,
(c)~computing fixed points, and (d)~dynamic fault isolation.
Previous implementations of these algorithms---including three of our
own---are complicated and hard to understand, because the tricky pieces
are implemented all together, inseparably.
A~significant contribution of this paper is
a new way to structure the implementation so that each tricky piece
is handled in just
one place, separate from all the others (\secref{engine}).
%\remark{This is a very important claim---is
%it substantiated in \secref{engine}? And shouldn't the word
%\textbf{contribution} appear in this~\P?}
% \simon{Better?} % yes thanks ---NR
The result is sufficiently elegant that
we emphasize the implementation as an object of interest in
its own right.
\end{itemize}
A working prototype of \ourlib{} is available from
\burl{http://ghc.cs.tufts.edu/hoopl} and also from Hackage.
It is no toy: an ancestor of this library is
part of the Glasgow Haskell Compiler, where it optimizes the
imperative {\PAL} code in GHC's back end. The new design is far
nicer, and it will be in GHC shortly.

The API for \ourlib{} seems quite natural, but it requires
relatively sophisticated aspects of Haskell's type system, such
as higher-rank polymorphism, GADTs, and type functions.
As such, \ourlib{} offers a compelling case study in the utility
of these features.


\section{Dataflow analysis {\&} transformation by \texorpdfstring{\rlap{example}}{example}}
\seclabel{overview}
\seclabel{constant-propagation}
\seclabel{example:transforms}
\seclabel{example:xforms}

We begin by setting the scene, introducing some vocabulary, and
showing a small motivating example.
A control-flow graph, perhaps representing the body of a procedure,
is a collection of \emph{basic blocks}---or just ``blocks''.
Each block is a sequence of instructions,
beginning with a label and ending with a
control-transfer instruction that branches to other blocks.
% Each block has a label at the beginning,
% a sequence of
% -- each of which has a label at the
% beginning. Each block may branch to other blocks with arbitrarily
% complex control flow.
The goal of dataflow optimization is to compute valid
\emph{assertions} (or \emph{dataflow facts}),
then use those assertions to justify code-improving
transformations (or \emph{rewrites}) on a \emph{control-flow graph}.

Consider a concrete example: constant propagation with constant folding.
On the left we have a basic block; in the middle we have
facts that hold between statements (or \emph{nodes})
in the block; and at
the right we have the result of transforming the block
based on the assertions:
\begin{verbatim}
      Before                Facts                After
 --------------------------{}--------------------------
 x := 3+4                                       x := 7
 -------------------------{x=7}------------------------
 z := x>5                                       z := True
 ---------------------{x=7, z=True}--------------------
 if z then goto L1                              goto L1
      else goto L2
\end{verbatim}
Constant propagation works
from top to bottom. We start with the empty fact.
Given the empty fact and the node @x:=3+4@, can we make a (constant-folding)
transformation?
Yes! We can replace the node with @x:=7@.
Now, given this transformed node,
and the original fact, what fact flows out of the bottom of
the transformed node?
The~fact \{@x=7@\}.
Given the fact \{@x=7@\} and the node @z:=x>5@, can we make a
transformation? Yes: constant propagation can replace the node with @z:=7>5@.
Now, can we do another transformation? Yes: constant folding can
replace the node with @z:=True@.
And so the process continues to the end of the block, where we
can replace the conditional branch with an unconditional one, @goto L1@.
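
This rewrite-then-transfer discipline can be sketched concretely.
The following is an illustrative model only, not \ourlib's actual
API: the @Expr@, @Node@, and @Fact@ types are hypothetical stand-ins
for a client's definitions, and the rewrite returns @Nothing@ to
mean ``no change,'' in the spirit of speculative rewriting.

```haskell
import qualified Data.Map as M

-- Hypothetical client types (not Hoopl's): a tiny expression language
-- and a single kind of node, assignment.
data Expr = Lit Int | Var String | Add Expr Expr
  deriving (Eq, Show)

data Node = Assign String Expr
  deriving (Eq, Show)

-- A fact maps each variable known to hold a constant to that constant.
type Fact = M.Map String Int

-- Substitute known constants, then fold constant subexpressions.
simplify :: Fact -> Expr -> Expr
simplify f (Var x)   = maybe (Var x) Lit (M.lookup x f)
simplify f (Add a b) = case (simplify f a, simplify f b) of
                         (Lit m, Lit n) -> Lit (m + n)
                         (a', b')       -> Add a' b'
simplify _ e         = e

-- Rewrite a node with respect to the incoming fact;
-- Nothing means "no change".
rewrite :: Fact -> Node -> Maybe Node
rewrite f (Assign x e)
  | e' /= e   = Just (Assign x e')
  | otherwise = Nothing
  where e' = simplify f e

-- Transfer function, applied to the (possibly rewritten) node:
-- it only needs to spot a literal right-hand side.
transfer :: Fact -> Node -> Fact
transfer f (Assign x (Lit n)) = M.insert x n f
transfer f (Assign x _)       = M.delete x f
```

Because the rewrite runs first, the transfer function need only check
for a literal right-hand side, exactly as described above.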

The example above is simple because the program has only straight-line code;
when programs have loops, dataflow analysis gets more complicated.
For example,
consider the following graph,
where we assume @L1@ is the entry point:
\begin{verbatim}
  L1: x=3; y=4; if z then goto L2 else goto L3
  L2: x=7; goto L3
  L3: ...
\end{verbatim}
Because control flows to @L3@ from two places,
we must \emph{join} the facts coming from those two places.
All paths to @L3@ produce the fact @y=@$4$,
so we can conclude that this fact holds at @L3@.
But depending on the path to @L3@, @x@ may have different
values, so we conclude ``@x=@$\top$'',
meaning that there is no single value held by @x@ at @L3@.%
\footnote{
In this example @x@ really does vary at @L3@, but in general
the analysis might be conservative.}
The final result of joining the dataflow facts that flow to @L3@
is the new fact $@x=@\top \land @y=4@ \land @z=@\top$.
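
This join can be sketched in a few lines. Again the types are
hypothetical rather than \ourlib's: facts are finite maps from
variables to known constants, and absence from the map plays the
role of $\top$.

```haskell
import qualified Data.Map as M

-- Hypothetical fact representation (not Hoopl's): a variable absent
-- from the map is Top (no single known value).
type Fact = M.Map String Int

-- Join facts arriving from two predecessors: a variable keeps its
-- binding only if both paths agree on the same constant; any
-- disagreement or absence sends it to Top (i.e., drops it).
join :: Fact -> Fact -> Fact
join = M.mergeWithKey both (const M.empty) (const M.empty)
  where both _ m n | m == n    = Just m
                   | otherwise = Nothing   -- disagreement: Top

-- The example from the text: along the path through L2, the earlier
-- assignment x=3 has been overwritten by x=7.
factFromL1, factFromL2, atL3 :: Fact
factFromL1 = M.fromList [("x",3), ("y",4)]
factFromL2 = M.fromList [("x",7), ("y",4)]
atL3       = join factFromL1 factFromL2   -- y=4 survives; x goes to Top
```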

\seclabel{const-prop-example}

\seclabel{simple-tx}
\paragraph{Interleaved transformation and analysis.}
Our example \emph{interleaves} transformation and analysis.
Interleaving makes it far easier to write effective analyses.
If, instead, we \emph{first} analyzed the block
and \emph{then} transformed it, the analysis would have to ``predict''
the transformations.
For example, given the incoming fact \{@x=7@\}
and the instruction @z:=x>5@,
a pure analysis could produce the outgoing fact
\{@x=7@, @z=True@\} by simplifying @x>5@ to @True@.
But the subsequent transformation must perform
\emph{exactly the same simplification} when it transforms the instruction to @z:=True@!
If instead we \emph{first} rewrite the node to @z:=True@,
and \emph{then} apply the transfer function to the new node,
the transfer function becomes laughably simple: it merely has to see if the
right-hand side is a constant (you can see actual code in \secref{const-prop-client}).
The gain is even more compelling if there are a number of interacting
analyses and/or transformations; for more substantial
examples, consult \citet{lerner-grove-chambers:2002}.

\paragraph{Forwards and backwards.}
Constant propagation works \emph{forwards}, and a fact is typically an
assertion about the program state (such as ``variable~@x@ holds value~@7@'').
Some useful analyses work \emph{backwards}.
A~prime example is live-variable analysis, where a fact takes the~form
``variable @x@ is live'' and is an assertion about the
\emph{continuation} of a program point. For example, the fact ``@x@~is
live'' at a program point P is an assertion that @x@ is used on some program
path starting at \mbox{P}. % TeXbook, exercise 12.6
The accompanying transformation is called dead-code elimination;
if @x@~is not live, this transformation
replaces the node @x:=e@ with a no-op.
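
A backward transfer function can be sketched in the same style as the
forward one. The types below are again hypothetical, not \ourlib's
API, and the dead-code test assumes the right-hand side @e@ has no
side effects.

```haskell
import qualified Data.Set as S

-- Hypothetical node type: x := e, where e reads the listed variables.
data Node = Assign String [String]
  deriving (Eq, Show)

-- The fact: the set of variables live at a program point.
type Live = S.Set String

-- Backward transfer: the fact flows from *after* the node to *before*
-- it. An assignment x := e kills x and generates the variables e reads.
liveBefore :: Node -> Live -> Live
liveBefore (Assign x reads) liveAfter =
  S.fromList reads `S.union` S.delete x liveAfter

-- Dead-code elimination: if x is not live after x := e, the node can
-- be replaced with a no-op (assuming e has no side effects).
isDead :: Node -> Live -> Bool
isDead (Assign x _) liveAfter = not (x `S.member` liveAfter)
```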

% ----------------------------------------
\section{Representing control-flow graphs} \seclabel{graph-rep}

\ourlib{} is a library that makes it easy to define dataflow analyses,
and transformations driven by these analyses, on control-flow graphs.
Graphs are composed from smaller units, which we discuss from the
bottom up:\finalremark{something about replacement graphs?}
\begin{itemize}
\item A \emph{node} is defined by \ourlib's client;
\ourlib{} knows nothing about the representation of nodes (\secref{nodes}).
\item A basic \emph{block} is a sequence of nodes (\secref{blocks}).
\item A \emph{graph} is an arbitrarily complicated control-flow graph,
composed from basic blocks (\secref{graphs}).
\end{itemize}

\subsection{Shapes: Open and closed}

Nodes, blocks, and graphs share important properties.
In particular, each can be \emph{open or closed at entry}
and \emph{open or closed at exit}.
An \emph{open} point is one at which control may implicitly ``fall through;''
to transfer control at a \emph{closed} point requires an explicit
control-transfer instruction to a named label.
For example,
\begin{itemize}
\item A shift-left instruction is open on entry (because control can fall into it
from the preceding instruction), and open on exit (because control falls through
to the next instruction).
\item An unconditional branch is open on entry, but closed on exit (because
control cannot fall through to the next instruction).
\item A label is closed on entry (because in \ourlib{} we do not allow
control to fall through into a branch target), but open on exit.
\end{itemize}
% This taxonomy enables \ourlib{} to enforce invariants:
% only nodes closed at entry can be the targets of branches, and only nodes closed
% at exits can transfer control (see also \secref{edges}).
% As~a consequence, all control transfers originate at control-transfer
% instructions and terminate at labels; this invariant dramatically
% simplifies analysis and transformation.
These examples concern nodes, but the same classification applies
to blocks and graphs. For example, the block
\begin{code}
 x:=7; y:=x+2; goto L
\end{code}
is open on entry and closed on exit.
This is the block's \emph{shape}, which we often abbreviate
``open/closed;''
we may refer to an ``open/closed block.''

The shape of a thing determines that thing's control-flow properties.
In particular, whenever E is a node, block, or graph,
% : \simon{Removed the claim about a unique entry point.}
\begin{itemize}
\item
If E is open at the entry, it has a unique predecessor;
if it is closed, it may have arbitrarily many predecessors---or none.
\item
If E is open at the exit, it has a unique successor;
if it is closed, it may have arbitrarily many successors---or none.
\end{itemize}
%%%%
%%%%
%%%% % \item Regardless of whether E is open or closed,
%%%% %   it has a unique entry point where execution begins.
%%%% \item If E is closed at exit, control leaves \emph{only}
%%%%   by explicit branches from closed-on-exit nodes.
%%%% \item If E is open at exit, control \emph{may} leave E
%%%%   by ``falling through'' from a distinguished exit point.
%%%% \remark{If E is a node or block, control \emph{only} leaves E by
%%%%   falling through, but this isn't so for a graph. Example: a body of a
%%%%   loop contains a \texttt{break} statement} \simon{I don't understand.
%%%%   A break statement would have to be translated as a branch, no?
%%%%   Can you give an example? I claim that control only leaves an
%%%%   open graph by falling through.}
%%%% \end{itemize}


\subsection{Nodes} \seclabel{nodes}

The primitive constituents of a \ourlib{} control-flow graph are
\emph{nodes}, which are defined by the client.
Typically, a node might represent a machine instruction, such as an
assignment, a call, or a conditional branch.
But \ourlib{}'s graph representation is polymorphic in the node type,
so each client can define nodes as it likes.
Because they contain nodes defined by the client,
graphs can include arbitrary client-specified data, including
(say) C~statements, method calls in an object-oriented language, or
whatever.


\begin{figure}
\begin{fuzzcode}{0.98pt}
data `Node e x where
  Label       :: Label -> Node C O
  `Assign     :: Var  -> Expr -> Node O O
  `Store      :: Expr -> Expr -> Node O O
  `Branch     :: Label -> Node O C
  `CondBranch :: Expr -> Label -> Label -> Node O C
  -- ... more constructors ...
\end{fuzzcode}
\caption{A typical node type as it might be defined by a client}
\figlabel{cmm-node}
\end{figure}

\ourlib{} knows \emph{at compile time} whether a node is open or
closed at entry and exit:
the type of a node has kind @*->*->*@, where the two type parameters
are type-level flags, one for entry and one for exit.
Such a type parameter may be instantiated only with type @O@~(for
open) or type~@C@ (for closed).
As an example,
\figref{cmm-node} shows a typical node type as it might be written by
one of \ourlib's {clients}.
The type parameters are written @e@ and @x@, for
entry and exit respectively.
The type is a generalized algebraic data type;
the syntax gives the type of each constructor.
%%% \cite{peyton-jones:unification-based-gadts}.
For example, constructor @Label@
takes a @Label@ and returns a node of type @Node C O@, where
the~``@C@'' says ``closed at entry'' and the~``@O@'' says ``open at exit''.
The types @Label@, @O@, and~@C@ are
defined by \ourlib{} (\figref{graph}).

Similarly, an @Assign@ node takes a variable and an expression, and
returns a @Node@ open at both entry and exit; the @Store@ node is
similar. The types @`Var@ and @`Expr@ are private to the client, and
\ourlib{} knows nothing of them.
Finally, the control-transfer nodes @Branch@ and @CondBranch@ are open at entry
and closed at exit.

Nodes closed on entry are the only targets of control transfers;
nodes open on entry and exit never perform control transfers;
and nodes closed on exit always perform control transfers\footnote{%
To obey these invariants,
a node for
a conditional-branch instruction, which typically either transfers control
\emph{or} falls through, must be represented as a two-target
conditional branch, with the fall-through path in a separate block.
This representation is standard \cite{appel:modern},
and it costs nothing in practice:
such code is easily sequentialized without superfluous branches.
%a late-stage code-layout pass can readily reconstruct efficient code.
}.
Because of the position each type of node occupies in a
basic block,
we~often call them \emph{first}, \emph{middle}, and \emph{last} nodes
respectively.

\subsection{Blocks} \seclabel{blocks}

\begin{figure}
\begin{fuzzcode}{0.98pt}
data `O        -- Open
data `C        -- Closed

data `Block n e x where
  `BFirst  :: n C O                      -> Block n C O
  `BMiddle :: n O O                      -> Block n O O
  `BLast   :: n O C                      -> Block n O C
  `BCat    :: Block n e O -> Block n O x -> Block n e x

data `Graph n e x where
  `GNil  :: Graph n O O
  `GUnit :: Block n O O -> Graph n O O
  `GMany :: MaybeO e (Block n O C)
         -> LabelMap (Block n C C)
         -> MaybeO x (Block n C O)
         -> Graph n e x

data `MaybeO ^ex t where
  `JustO    :: t -> MaybeO O t
  `NothingO ::      MaybeO C t

newtype `Label = Label Int

class `Edges n where
  `entryLabel :: n C x -> Label
  `successors :: n e C -> [Label]
\end{fuzzcode}
\caption{The block and graph types defined by \ourlib}
\figlabel{graph} \figlabel{edges}
\end{figure}

\ourlib\ combines the client's nodes into
blocks and graphs, which, unlike the nodes, are defined by \ourlib{}
(\figref{graph}).
A~@Block@ is parameterized over the node type~@n@
as well as over the same flag types that make it open or closed at
entry and exit.

The @BFirst@, @BMiddle@, and @BLast@ constructors create one-node
blocks.
Each of these constructors is polymorphic in the node's \emph{representation}
but monomorphic in its \emph{shape}.
An~earlier representation of blocks used a single constructor
of type \mbox{@n e x -> Block n e x@}, polymorphic in a node's representation \emph{and}
shape.
The representation of blocks in \figref{graph} has more constructors, but it
uses the power of GADTs to ensure that the shape of every node is
known statically.
This property makes no difference to clients, but it significantly
simplifies the implementation of analysis and transformation in
\secref{monomorphic-shape-outcome}.

The @BCat@ constructor concatenates blocks in sequence.
It~makes sense to concatenate blocks only when control can fall
through from the first to the second; therefore,
two blocks may be concatenated {only} if each block is open at
the point of concatenation.
This restriction is enforced by the type of @BCat@, whose first
argument must be open on exit, and whose second argument must be open on entry.
It~is statically impossible, for example, to concatenate a @Branch@
immediately before an
@Assign@.
Indeed, the @Block@ type statically guarantees that any
closed/closed @Block@---which compiler writers normally
call a ``basic block''---consists of exactly one closed/open node
(such as @Label@ in \figref{cmm-node}),
followed by zero or more open/open nodes (@Assign@ or @Store@),
and terminated with exactly one
open/closed node (@Branch@ or @CondBranch@).
Using GADTs to enforce these invariants is one of
\ourlib{}'s innovations.
% Also notice that the definition of @Block@ guarantees that every @Block@
% has at least one node.
% SPJ: I think we've agreed to stick with PNil.
% \remark{So what? Who cares? Why is it important
% that no block is empty? (And if we had an empty block, we could ditch
% @PNil@ from @PGraph@, which would be an improvement.)}
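To make the shape discipline concrete, here is a small self-contained sketch---a toy node type and a cut-down @Block@, not \ourlib's actual definitions---that can be loaded into GHCi. The ill-shaped concatenation left in a comment is rejected by the type checker:

```haskell
{-# LANGUAGE GADTs, EmptyDataDecls #-}

data O  -- open
data C  -- closed

-- A toy node type in the style of the figure (hypothetical constructors).
data Node e x where
  Label  :: Int           -> Node C O
  Assign :: String -> Int -> Node O O
  Branch :: Int           -> Node O C

-- A cut-down version of the Block type from the figure.
data Block n e x where
  BFirst  :: n C O -> Block n C O
  BMiddle :: n O O -> Block n O O
  BLast   :: n O C -> Block n O C
  BCat    :: Block n e O -> Block n O x -> Block n e x

-- A well-formed basic block: one first node, middles, one last node.
basicBlock :: Block Node C C
basicBlock =
  BFirst (Label 1) `BCat` BMiddle (Assign "x" 3) `BCat` BLast (Branch 2)

-- Rejected at compile time: control cannot fall through a Branch.
-- badBlock = BLast (Branch 2) `BCat` BMiddle (Assign "x" 3)

-- Counting nodes needs no shape information at all.
blockLength :: Block n e x -> Int
blockLength (BCat b1 b2) = blockLength b1 + blockLength b2
blockLength _            = 1
```

Because @BCat@ demands open-at-exit on the left and open-at-entry on the right, the commented-out @badBlock@ fails to type-check, while @basicBlock@ is accepted.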

\subsection{Graphs} \seclabel{graphs}

\ourlib{} composes blocks into graphs, which are also defined
in \figref{graph}.
Like @Block@, the data type @Graph@ is parameterized over
both nodes @n@ and its open/closed shape (@e@ and @x@).
It has three constructors. The first two
deal with the base cases of open/open graphs:
an empty graph is represented by @GNil@ while a single-block graph
is represented by @GUnit@.

More general graphs are represented by @GMany@, which has three
fields: an optional entry sequence, a body, and an optional exit
sequence.
\begin{itemize}
\item
If the graph is open at the entry, it contains an entry sequence of type
@Block n O C@.
We could represent this sequence as a value of type
@Maybe (Block n O C)@, but we can do better:
a~value of @Maybe@ type requires a \emph{dynamic} test,
but we know \emph{statically}, at compile time, that the sequence is present if and only
if the graph is open at the entry.
We~express our compile-time knowledge by using the type
@MaybeO e (Block n O C)@, a type-indexed version of @Maybe@
which is also defined in \figref{graph}:
the type @MaybeO O a@ is isomorphic to~@a@, while
the type @MaybeO C a@ is isomorphic to~@()@.
\item
The body of the graph is a collection of closed/closed blocks.
To~facilitate traversal of the graph, we represent a body as a finite
map from label to block.
\item
The exit sequence is dual to the entry sequence, and like the entry
sequence, its presence or absence is deducible from the static type of the graph.
\end{itemize}
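The idea behind @MaybeO@ can be demonstrated in isolation. In this minimal sketch (our own reconstruction, not \ourlib's code), the type index replaces the dynamic test that an ordinary @Maybe@ would require:

```haskell
{-# LANGUAGE GADTs, EmptyDataDecls #-}

data O  -- open
data C  -- closed

-- A type-indexed Maybe, as in the figure: the index says statically
-- whether the value is present.
data MaybeO ex t where
  JustO    :: t -> MaybeO O t
  NothingO ::      MaybeO C t

-- Total without a NothingO case: a value of type MaybeO O t can
-- only be built with JustO.
fromJustO :: MaybeO O t -> t
fromJustO (JustO x) = x

-- Likewise, at index C the only inhabitant is NothingO, so this
-- function is also total.
isAbsent :: MaybeO C t -> Bool
isAbsent NothingO = True
```

GHC's exhaustiveness checker confirms that neither function is missing a case, which is exactly the static knowledge @GMany@ exploits for its entry and exit sequences.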

We can splice graphs together nicely; the cost is logarithmic in the
number of closed/closed blocks.
Unlike blocks, two graphs may be spliced together
not only when they are both open
at the splice point but also
when they are both closed---and not in the other two cases:
\begin{smallfuzzcode}{10.8pt}
`gSplice :: Graph n e a -> Graph n a x -> Graph n e x
gSplice GNil g2 = g2
gSplice g1 GNil = g1

gSplice (GUnit ^b1) (GUnit ^b2) = GUnit (b1 `BCat` b2)

gSplice (GUnit b) (GMany (JustO e) bs x)
  = GMany (JustO (b `BCat` e)) bs x

gSplice (GMany e ^bs (JustO x)) (GUnit b2)
  = GMany e bs (JustO (x `BCat` b2))

gSplice (GMany e1 ^bs1 (JustO x1)) (GMany (JustO e2) ^bs2 x2)
  = GMany e1 (bs1 `mapUnion` (b `addBlock` bs2)) x2
  where b = x1 `BCat` e2

gSplice (GMany e1 bs1 NothingO) (GMany NothingO bs2 x2)
  = GMany e1 (bs1 `mapUnion` bs2) x2
\end{smallfuzzcode}
This definition illustrates the power of GADTs: the
pattern matching is exhaustive, and all the open/closed invariants are
statically checked. For example, consider the second-last equation for @gSplice@.
Since the exit link of the first argument is @JustO x1@,
we know that type parameter~@a@ is~@O@, and hence the entry link of the second
argument must be @JustO e2@.
Moreover, block~@x1@ must be
closed/open, and block~@e2@ must be open/closed.
We can therefore concatenate them
with @BCat@ to produce a closed/closed block, which is
added to the body of the result.

We~have carefully crafted the types so that if @BCat@ and @BodyCat@
are considered as associative operators,
every graph has a unique representation.\finalremark{So what? (Doumont)
Need some words about how this is part of making the API simple for
the client---I've added something to the end of the paragraph, but I'm
not particularly thrilled.}
%% \simon{Well, you were the one who was so keen on a unique representation!
%% And since we have one, I think tis useful to say so. Lastly, the representation of
%% singleton blocks is not entirely obvious.}
%%%%
%%%% An empty open/open graph is represented
%%%% by @GNil@, while a closed/closed one is @gNilCC@:
%%%% \par {\small
%%%% \begin{code}
%%%% gNilCC :: Graph C C
%%%% gNilCC = GMany NothingO BodyEmpty NothingO
%%%% \end{code}}
%%%% The representation of a @Graph@ consisting of a single block~@b@
%%%% depends on the shape of~@b@:\remark{Does anyone care?}
%%%% \par{\small
%%%% \begin{code}
%%%% gUnitOO :: Block O O -> Graph O O
%%%% gUnitOC :: Block O C -> Graph O C
%%%% gUnitCO :: Block O C -> Graph C O
%%%% gUnitCC :: Block O C -> Graph C C
%%%% gUnitOO b = GUnit b
%%%% gUnitOC b = GMany (JustO b) BodyEmpty NothingO
%%%% gUnitCO b = GMany NothingO BodyEmpty (JustO b)
%%%% gUnitCC b = GMany NothingO (BodyUnit b) NothingO
%%%% \end{code}}
%%%% Multi-block graphs are similar.
%%%% From these definitions
To guarantee uniqueness, @GUnit@ is restricted to open/open
blocks.
If~@GUnit@ were more polymorphic, there would be
more than one way to represent some graphs, and it wouldn't be obvious
to a client which representation to choose---or if the choice made a difference.


\subsection{Labels and successors} \seclabel{edges}

If \ourlib{} knows nothing about nodes, how can it know where
a control transfer goes, or what @Label@ sits at the start of a block?
To~answer such questions,
the standard Haskell idiom is to define a type class whose methods
provide exactly the operations needed;
\ourlib's type class, called @Edges@, is given in \figref{edges}.
The @entryLabel@ method takes a first node (one closed on entry, \secref{nodes})
and returns its @Label@;
the @successors@ method takes a last node (closed on exit) and returns
the @Label@s to
which it can transfer control.
A~middle node, which is open at both entry and exit, cannot refer to
any @Label@s,
so no corresponding interrogation function is needed.

A node type defined by a client must be an instance of @Edges@.
In~\figref{cmm-node},
the client's instance declaration for @Node@ would be
\begin{code}
instance Edges Node where
  entryLabel (Label l) = l
  successors (Branch b) = [b]
  successors (CondBranch e b1 b2) = [b1, b2]
\end{code}
Again, the pattern matching for both functions is exhaustive, and
the compiler statically checks this fact. Here, @entryLabel@
cannot be applied to an @Assign@ or @Branch@ node,
and any attempt to define a case for @Assign@ or @Branch@ would result
in a type error.

While the client is required to provide this information about
nodes, it is very convenient for \ourlib\ to have the same information
about blocks.
For its own internal use,
\ourlib{} provides this instance declaration for the @Block@ type:
\begin{code}
instance Edges n => Edges (Block n) where
  entryLabel (BFirst n) = entryLabel n
  entryLabel (BCat b _) = entryLabel b
  successors (BLast n)  = successors n
  successors (BCat _ b) = successors b
\end{code}
Because the functions @entryLabel@ and @successors@ are used to track control
flow \emph{within} a graph, \ourlib\ does not need to ask for the
entry label or successors of a @Graph@ itself.
Indeed, @Graph@ \emph{cannot} be an instance of @Edges@, because even
if a @Graph@ is closed at the entry, it does not have a unique entry label.
%
% A slight infelicity is that we cannot make @Graph@ an instance of @Edges@,
% because a graph closed on entry has no unique label. Fortunately, we never
% need @Graph@s to be in @Edges@.
%
% ``Never apologize. Never confess to infelicity.'' ---SLPJ



\begin{table}
\centerline{%
\begin{tabular}{@{}>{\raggedright\arraybackslash}p{1.03in}>{\scshape}c>{\scshape}c>{\raggedright\arraybackslash}p{1.29in}@{}}
&\multicolumn1{r}{\llap{\emph{Specified}}\hspace*{-0.3em}}&
\multicolumn1{l}{\hspace*{-0.4em}\rlap{\emph{Implemented}}}&\\
\multicolumn1{c}{\emph{Part of optimizer}}
&\multicolumn1{c}{\emph{by}}&
\multicolumn1{c}{\emph{by}}&
\multicolumn1{c}{\emph{How many}}%
\\[5pt]
Control-flow graphs& Us & Us & One \\
Nodes in a control-flow graph & You & You & One type per intermediate language \\[3pt]
Dataflow fact~$F$ & You & You & One type per logic \\
Lattice operations & Us & You & One set per logic \\[3pt]
Transfer functions & Us & You & One per analysis \\
Rewrite functions & Us & You & One per \rlap{transformation} \\[3pt]
Solve-and-rewrite functions & Us & Us & Two (forward, backward) \\
\end{tabular}%
}
\caption{Parts of an optimizer built with \ourlib}
\tablabel{parts}
\end{table}


\section{Using \ourlib{} to analyze and transform graphs} \seclabel{using-hoopl}
\seclabel{making-simple}
\seclabel{create-analysis}

Now that we have graphs, how do we optimize them?
\ourlib{} makes it easy for a
client to build a new dataflow analysis and optimization.
The client must supply the following pieces:
\begin{itemize}
\item \emph{A node type} (\secref{nodes}).
\ourlib{} supplies the @Block@ and @Graph@ types
that let the client build control-flow graphs out of nodes.
\item \emph{A data type of facts} and some operations over
those facts (\secref{facts}).
Each analysis uses facts that are specific to that particular analysis,
which \ourlib{} accommodates by being polymorphic in
the fact type.
\item \emph{A transfer function} that takes a node and returns a
\emph{fact transformer}, which takes a fact flowing into the node and
returns the transformed fact that flows out of the node (\secref{transfers}).
\item \emph{A rewrite function} that takes a node and an input fact,
and which returns either @Nothing@
or @(Just g)@ where @g@~is a graph that should
replace the node.
The ability to replace a \emph{node} by a \emph{graph} that may include
internal control flow is
crucial for many code-improving transformations.
We discuss the rewrite function
in Sections \ref{sec:rewrites} and \ref{sec:shallow-vs-deep}.
\end{itemize}
These requirements are summarized in \tabref{parts}.
Because facts, transfer functions, and rewrite functions work closely together,
we represent their combination
as a single record of type @FwdPass@ (\figref{api-types}).

\ifpagetuning\enlargethispage{\baselineskip}\fi

Given a node type~@n@ and a @FwdPass@,
a client can ask \ourlib{}\ to analyze and rewrite a
graph.
\hoopl\ provides a fully polymorphic interface, but for purposes of
exposition, we present a function that is specialized to a
closed/closed graph:
\begin{code}
analyzeAndRewriteFwdBody
  :: ( FuelMonad m       -- Gensym and other state
     , Edges n )         -- Access to flow edges
  => FwdPass m n f       -- Lattice, transfer, rewrite
  -> [Label]             -- Entry point(s)
  -> Graph n C C         -- Input graph
  -> FactBase f          -- Input fact(s)
  -> m ( Graph n C C     -- Result graph
       , FactBase f )    -- ... and its facts
\end{code}
Given a @FwdPass@, the
analyze-and-rewrite function transforms a graph into
an optimized graph.
As its type shows, this function
is polymorphic in the types of nodes~@n@ and facts~@f@;
these types are determined entirely by the client.
The type of the monad~@m@ is also supplied by the client, but it must
meet certain constraints laid out by \hoopl, as described in
\secref{fuel-monad}.

\begin{figure}
\begin{code}
data `FwdPass m n f
  = FwdPass { `fp_lattice  :: DataflowLattice f
            , `fp_transfer :: FwdTransfer n f
            , `fp_rewrite  :: FwdRewrite m n f }

------- Lattice ----------
data `DataflowLattice a = DataflowLattice
  { `fact_bot    :: a
  , `fact_extend :: a -> a -> (ChangeFlag, a) }

data `ChangeFlag = `NoChange | `SomeChange

------- Transfers ----------
newtype `FwdTransfer n f   -- abstract type
`mkFTransfer'
  :: (forall e x . n e x -> f -> Fact x f)
  -> FwdTransfer n f

------- Rewrites ----------
newtype `FwdRewrite m n f  -- abstract type
`mkFRewrite'
  :: (forall e x . n e x -> f -> m (FwdRes m n f e x))
  -> FwdRewrite m n f

data `FwdRes m n f e x
  = FwdRes (Graph n e x) (FwdRewrite m n f)
  | `NoFwdRes

------- Fact-like things -------
type family `Fact x f :: *
type instance Fact O f = f
type instance Fact C f = FactBase f

------- FactBase -------
type `FactBase f = LabelMap f
  -- A finite mapping from Labels to facts f
\end{code}
\caption{\ourlib{} API data types}
\figlabel{api-types}
\figlabel{lattice-type} \figlabel{lattice}
\figlabel{transfers} \figlabel{rewrites}
\end{figure}
% omit mkFactBase :: [(Label, f)] -> FactBase f



As well as taking and returning a graph with its entry point(s), the
function also takes input facts (the @FactBase@) and produces output facts.
A~@FactBase@ is simply a finite mapping from @Label@ to facts.
The
output @FactBase@ maps each @Label@ in the @Body@ to its fact; if
the @Label@ is not in the domain of the @FactBase@, its fact is the
bottom element of the lattice.
The point(s) at which control flow may enter the graph are supplied by
the caller; the input @FactBase@ supplies any
facts that hold at these points.
For example, in our constant-propagation example from \secref{const-prop-example},
if the graph
represents the body of a procedure
with parameters $x,y,z$, we would map the entry @Label@ to a fact
$@x=@\top \land @y=@\top \land @z=@\top$, to specify that the procedure's
parameters are not known to be constants.

The client's model of how @analyzeAndRewriteFwd@ works is as follows:
\ourlib\ walks forward over each block in the graph.
At each node, \ourlib\ applies the
rewrite function to the node and the incoming fact. If the rewrite
function returns @NoFwdRes@, the node is retained as part of the output
graph, the transfer function is used to compute the downstream fact,
and \ourlib\ moves on to the next node.
But if the rewrite function returns @(FwdRes g rw)@,
indicating that it wants to rewrite the node to the replacement graph~@g@, then
\ourlib\ recursively analyzes and rewrites~@g@, using the new rewrite
function~@rw@, before moving on to the next node.
A~node following a rewritten node sees
\emph{up-to-date} facts; that is, its input fact is computed by
analyzing the replacement graph.

In this section we flesh out the
\emph{interface} to @analyzeAndRewriteFwd@, leaving the implementation for
\secref{engine}.
{\hfuzz=7.2pt\par}

\subsection{Dataflow lattices} \seclabel{lattices} \seclabel{facts}

For each analysis or transformation, the client must define a type
of dataflow facts.
A~dataflow fact often represents an assertion
about a program point,\footnote
{In \ourlib, a program point is simply an edge in a
control-flow graph.}
but in general, dataflow analysis establishes properties of \emph{paths}:
\begin{itemize}
\item An assertion about all paths \emph{to} a program point is established
by a \emph{forwards analysis}. For example, the assertion ``$@x@=@3@$'' at point P
claims that variable @x@ holds value @3@ at P, regardless of the
path by which P is reached.
\item An assertion about all paths \emph{from} a program point is
established by a \emph{backwards analysis}. For example, the
assertion ``@x@ is dead'' at point P claims that no path from P uses
variable @x@.
\end{itemize}

A~set of dataflow facts must form a lattice, and \ourlib{} must know
(a)~the bottom element of the lattice and (b)~how to take
the least upper bound (join) of two elements.
To ensure that analysis
terminates, it is enough if every fact has a finite number of
distinct facts above it, so that repeated joins
eventually reach a fixed point.

In practice, joins are computed at labels.
If~$f_{\mathit{id}}$ is the fact currently associated with the
label~$\mathit{id}$,
and if a transfer function propagates a new fact~$f_{\mathit{new}}$
into the label~$\mathit{id}$,
the dataflow engine replaces $f_{\mathit{id}}$ with
the join $f_{\mathit{new}} \join f_{\mathit{id}}$.
Furthermore, the dataflow engine wants to know if
$f_{\mathit{new}} \join f_{\mathit{id}} = f_{\mathit{id}}$,
because if not, the analysis has not reached a fixed point.

The bottom element and join operation of a lattice of facts of
type~@f@ are stored in a value of type @DataflowLattice f@
(\figref{lattice}).
%%Such a value is part of the @FwdPass n f@ that is passed to
%%@analyzeAndRewriteFwd@ above.
As noted in the previous paragraph,
\ourlib{} needs to know when the result of a join is equal
to one of the arguments joined.
Because this information is often available very cheaply at the time
when the join is computed, \ourlib\ does not
require a separate equality test on facts (which might be
expensive).
Instead, \ourlib\ requires that @fact_extend@ return a @ChangeFlag@ as
well as the least upper bound.
The @ChangeFlag@ should be @NoChange@ if
the result is the same as the first argument (the old fact), and
@SomeChange@ if the result differs.
(Function @fact_extend@ is \emph{not} symmetric in
its arguments.)
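As an illustration, here is a sketch of such a lattice for single-variable constant propagation, written against the @fact_bot@/@fact_extend@ interface of \figref{lattice}. The record type is repeated so the fragment is self-contained, and the fact type @ConstFact@ is our own invention, not part of \ourlib:

```haskell
data ChangeFlag = NoChange | SomeChange deriving (Eq, Show)

-- The lattice record from the figure, repeated for self-containment.
data DataflowLattice a = DataflowLattice
  { fact_bot    :: a
  , fact_extend :: a -> a -> (ChangeFlag, a) }

-- Facts about one variable: Bot = no paths analyzed yet,
-- Const k = the variable always holds k, Top = not known constant.
data ConstFact = Bot | Const Int | Top deriving (Eq, Show)

constLattice :: DataflowLattice ConstFact
constLattice = DataflowLattice { fact_bot = Bot, fact_extend = extend }
  where
    -- fact_extend old new: the flag is NoChange iff the join equals old
    extend old new =
      let j = lub old new
      in (if j == old then NoChange else SomeChange, j)
    lub Bot f = f
    lub f Bot = f
    lub (Const a) (Const b) | a == b = Const a
    lub _ _ = Top
```

Every chain in this lattice has length at most three (@Bot@, @Const k@, @Top@), so repeated joins are guaranteed to reach a fixed point.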

\remark{Notes about @changeIf@, @PElem@, @WithTop@?}

% A~possible representation of the facts needed to implement
% constant propagation is shown in \figref{const-prop}.
% A~fact
% is represented as a finite map from a variable to a value of type
% @Maybe HasConst@.%
% A~variable $x$ maps to @Nothing@ iff $x=\bot$;
% $x$~maps to @Just Top@ iff $x=\top$;
% and $x$~maps to $@Just@\, (@I@\, k)$ iff $x=k$ (and similarly for
% Boolean constants).\remark{Ugh}
% Any one procedure has only
% finitely many variables; only finitely many facts are computed at any
% program point; and in this lattice any one fact can increase at most
% twice. These properties ensure that the dataflow engine will reach a
% fixed point.

\subsection{The transfer function} \seclabel{transfers}

A~forward transfer function is presented with the dataflow fact(s) on
the edge(s) coming
into a node, and it computes dataflow fact(s) on the outgoing edge(s).
In a forward analysis, the dataflow engine starts with the fact at the
beginning of a block and applies the transfer function to successive
nodes in that block until eventually the transfer function for the last node
computes the facts that are propagated to the block's successors.
For example, consider this graph, with entry at @L1@:
\begin{code}
L1: x=3; goto L2
L2: y=x+4; x=x-1;
    if x>0 then goto L2 else return
\end{code}
A forward analysis starts with the bottom fact \{\} at every label.
Analyzing @L1@ propagates this fact forward, by applying the transfer
function successively to the nodes
of @L1@, emerging with the fact \{@x=3@\} for @L2@.
This new fact is joined with the existing (bottom) fact for @L2@.
Now the analysis propagates @L2@'s fact forward, again using the transfer
function, this time emerging with a new fact \mbox{\{@x=2@, @y=7@\}} for @L2@.
Again, the new fact is joined with the existing fact for @L2@, and the process
is iterated until the facts for each label reach a fixed point.

But wait! What is the \emph{type} of the transfer function?
If the node is open at exit, the transfer function produces a single fact.
But what if the node is \emph{closed} on exit?
In that case the transfer function must
produce a list of (@Label@,fact) pairs, one for each outgoing edge.
%
\emph{So the type of the transfer function's result
depends on the shape of the node's exit.}
Fortunately, this dependency can be expressed precisely, at compile
time, by Haskell's (recently added)
\emph{indexed type families}.
The relevant part of \ourlib's interface is given in \figref{transfers}.
A~forward transfer function supplied by a client,
of type (@FwdTransfer@ @n@ @f@),
is typically a function polymorphic in @e@ and @x@.
It takes a
node of type \mbox{(@n@ @e@ @x@)} and a fact of type~@f@, and it produces an
outgoing ``fact-like thing'' of type (@Fact@ @x@ @f@). The
type constructor @Fact@
should be thought of as a type-level function; its signature is given in the
@type family@ declaration, while its definition is given by two @type instance@
declarations. The first declaration says that the fact-like thing
coming out of a node
\emph{open} at the exit is just a fact~@f@. The second declaration says that
the fact-like thing
coming out of a node
\emph{closed} at the exit is a mapping from @Label@ to facts.

We have ordered the arguments such that if
\begin{code}
`transfer_fn :: forall e x . n e x -> f -> Fact x f
`node        :: n e x
\end{code}
then @(transfer_fn node)@ is a fact transformer:
\begin{code}
transfer_fn node :: f -> Fact x f
\end{code}

So much for the interface.
What about the implementation?
The way GADTs work is that the compiler uses the value constructor for
type \mbox{@n@ @e@ @x@} to determine whether @e@~and~@x@ are open or
closed.
But we want to write functions that are \emph{polymorphic} in the node
type~@n@!
Such functions include
\begin{itemize}
\item
A function that takes a pair of @FwdTransfer@s for facts @f@~and~@f'@,
and returns a single @FwdTransfer@ for the fact @(f, f')@
\item
A function that takes a @FwdTransfer@ and wraps it in logging code, so
an analysis can be debugged by watching the facts flow through the
nodes
\end{itemize}
Because the mapping from
value constructors to shape is different for each node type~@n@, such
functions cannot be polymorphic in both
the representation and the shape of nodes.
We~therefore, in our implementation, sacrifice polymorphism in shape:
a~@FwdTransfer n f@ is represented by a \emph{triple} of functions,
each polymorphic in the node type but monomorphic in shape:
\begin{code}
newtype FwdTransfer n f
  = FwdTransfers ( n C O -> f -> f
                 , n O O -> f -> f
                 , n O C -> f -> FactBase f
                 )
\end{code}
Such triples can easily be composed and wrapped without requiring a
pattern match on the value constructor of an unknown node
type.\remark{Need to refer to this junk in the conclusion}
Working with triples is tedious, but only \hoopl\ itself is forced to
work with triples; each client, which knows the node type, may supply
a triple,
but it typically supplies a single function polymorphic in the shape
of the (known) node.
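For instance, pairing two transfer functions needs only component-wise composition of the triples. The sketch below uses a simplified stand-in for @FactBase@ (an association list keyed by integer labels) and an assumed combinator name; it is not \hoopl's actual code:

```haskell
{-# LANGUAGE GADTs, EmptyDataDecls #-}

data O  -- open
data C  -- closed

-- The triple representation, with FactBase simplified to an assoc list.
newtype FwdTransfer n f = FwdTransfers
  ( n C O -> f -> f
  , n O O -> f -> f
  , n O C -> f -> [(Int, f)] )

-- Pair two transfer functions without inspecting node constructors.
-- Assumption: on last nodes, both components emit facts for the same
-- labels in the same order; a real implementation would join label maps.
pairFwd :: FwdTransfer n f -> FwdTransfer n f'
        -> FwdTransfer n (f, f')
pairFwd (FwdTransfers (f1, m1, l1)) (FwdTransfers (f2, m2, l2)) =
  FwdTransfers (first, middle, final)
  where
    first  n (a, b) = (f1 n a, f2 n b)
    middle n (a, b) = (m1 n a, m2 n b)
    final  n (a, b) = zipWith merge (l1 n a) (l2 n b)
    merge (lbl, a) (_, b) = (lbl, (a, b))
```

No pattern match on any node constructor appears anywhere: each component of the result simply applies the corresponding components of the two arguments, which is exactly what the shape-monomorphic triple buys us.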



\subsection{The rewrite function}
\seclabel{rewrites}
\seclabel{example-rewrites}

%%%% \begin{figure}
%%%% \begin{smallfuzzcode}{6.6pt}
%%%% type `AGraph n e x
%%%%   = [Label] -> (Graph n e x, [Label])
%%%%
%%%% `withLabels :: Int -> ([Label] -> AGraph n e x)
%%%%             -> AGraph n e x
%%%% withLabels n ^fn = \ls -> fn (take n ^ls) (drop n ls)
%%%%
%%%% `mkIfThenElse :: Expr -> AGraph Node O O
%%%%               -> AGraph Node O O -> AGraph Node O O
%%%% mkIfThenElse p t e
%%%%   = withLabels 3 $ \[l1,l2,l3] ->
%%%%     gUnitOC (BUnit (CondBranch p l1 l2)) `gSplice`
%%%%     mkLabel l1 `gSplice` t `gSplice` mkBranch l3 `gSplice`
%%%%     mkLabel l2 `gSplice` e `gSplice` mkBranch l3 `gSplice`
%%%%     mkLabel l3
%%%%
%%%% `mkLabel l  = gUnitCO (BUnit (Label l))
%%%% `mkBranch l = gUnitOC (BUnit (Branch l))
%%%% `gUnitOC b  = GMany (JustO b) BodyEmpty NothingO
%%%% `gUnitCO b  = GMany NothingO BodyEmpty (JustO b)
%%%% \end{smallfuzzcode}
%%%% \caption{The \texttt{AGraph} type and example constructions} \figlabel{agraph}
%%%% \end{figure}

We compute dataflow facts in order to enable code-improving
transformations.
In our constant-propagation example,
the dataflow facts may enable us
to simplify an expression by performing constant folding, or to
turn a conditional branch into an unconditional one.
Similarly, a liveness analysis may allow us to
replace a dead assignment with a no-op.

A @FwdPass@ therefore includes a \emph{rewriting function}, whose
type, @FwdRewrite@, is given in \figref{api-types}.
A rewriting function takes a node and a fact, and optionally returns\ldots what?
At first one might
expect that rewriting should return a new node, but that is not enough:
We~might want to remove a node by rewriting it to the empty graph,
or more ambitiously, we might want to replace a high-level operation
with a tree of conditional branches or a loop, which would entail
introducing new blocks with internal control flow.
In~general, a rewrite function must be able to return a
\emph{graph}.

\ifpagetuning\enlargethispage{0.8\baselineskip}\fi

Concretely, a @FwdRewrite@ takes a node and a suitably shaped
fact, and returns either @NoFwdRes@, indicating that the node should
not be replaced,
or $m\;@(FwdRes@\;\ag\;\rw@)@$, indicating that the node should
be replaced with~\ag: the replacement graph.
The result is monadic because
if the rewriter makes graphs containing blocks,
it may need fresh @Label@s, which are supplied by a monad.
A~client may use its own monad or may use a monad or monad transformer
supplied by \hoopl.
\remark{Forward reference??}

%%%% \ifpagetuning\enlargethispage{0.8\baselineskip}\fi

The type of @FwdRewrite@ in \figref{api-types} guarantees
\emph{at compile time} that
the replacement graph $\ag$ has
the \emph{same} open/closed shape as the node being rewritten.
For example, a branch instruction can be replaced only by a graph
closed at the exit. Moreover, because only an open/open graph can be
empty---look at the type of @GNil@ in \figref{graph}---the type
of @FwdRewrite@
guarantees, at compile time, that no head of a block (closed/open)
or tail of a block (open/closed) can ever be deleted by being
rewritten to an empty graph.

\subsection{Shallow vs deep rewriting}
\seclabel{shallow-vs-deep}

Once the rewrite has been performed, what then?
The rewrite returns a replacement graph,
which must itself be analyzed, and its nodes may be further rewritten.
We~call @analyzeAndRewriteFwd@ to process the replacement
graph---but what @FwdPass@ should we use?
There are two common situations:
\begin{itemize}
\item Sometimes we want to analyze and transform the replacement graph
with an unmodified @FwdPass@, further rewriting the replacement graph.
This procedure is
called \emph{deep rewriting}.
When deep rewriting is used, the client's rewrite function must
ensure that the graphs it produces are not rewritten indefinitely
(\secref{correctness}).
\item Sometimes we want to analyze \emph{but not further rewrite} the
replacement graph. This procedure is called \emph{shallow rewriting}.
It~is easily implemented by using a modified @FwdPass@
whose rewriting function always returns @NoFwdRes@.
\end{itemize}
Deep rewriting is essential to achieve the full benefits of
interleaved analysis and transformation
\citep{lerner-grove-chambers:2002}.
But shallow rewriting can be vital as well;
for example, a forward dataflow pass that inserts
a spill before a call must not rewrite the call again, lest it attempt
to insert infinitely many spills.

An~innovation of \hoopl\ is to build the choice of shallow or deep
rewriting into each rewrite function,
an idea that is elegantly captured by the
@FwdRes@ type returned by a @FwdRewrite@ (\figref{api-types}).
The first component of the @FwdRes@ is the replacement graph, as discussed earlier.
The second component, $\rw$, is a
\emph{new rewriting function} to use when recursively processing
the replacement graph.
For shallow rewriting this new function always returns @NoFwdRes@;
for deep rewriting it is the original
rewriting function.
1543
1544
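To make the idea concrete, here is a toy model of a rewriter that
returns both a replacement and the next rewriter to use. The names
(@Rw@, @shallow@, @deep@, @exhaust@) are hypothetical simplifications
of \hoopl's @FwdRes@ mechanism, with graphs, facts, and the monad
elided:

```haskell
-- Toy model: a rewriter either declines (Nothing) or returns a
-- replacement together with the rewriter to use on that replacement.
-- Illustrative sketch, not Hoopl's actual API.
newtype Rw a = Rw { runRw :: a -> Maybe (a, Rw a) }

noRw :: Rw a                       -- never rewrites
noRw = Rw (const Nothing)

shallow :: (a -> Maybe a) -> Rw a  -- rewrite once, then stop
shallow f = Rw (\x -> fmap (\y -> (y, noRw)) (f x))

deep :: (a -> Maybe a) -> Rw a     -- keep rewriting the replacement
deep f = Rw (\x -> fmap (\y -> (y, deep f)) (f x))

-- The engine's recursion: apply a rewriter until it declines.
exhaust :: Rw a -> a -> a
exhaust rw x = case runRw rw x of
  Nothing       -> x
  Just (y, rw') -> exhaust rw' y

-- Example one-step rewrite: halve a positive even number.
halve :: Int -> Maybe Int
halve n = if even n && n > 0 then Just (n `div` 2) else Nothing
```

With this model, @exhaust (shallow halve) 8@ rewrites once (to @4@),
while @exhaust (deep halve) 8@ rewrites to a fixed point (@1@),
mirroring the shallow/deep distinction above.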
\subsection{Composing rewrite functions and dataflow passes} \seclabel{combinators}

By requiring each rewrite to return a new rewrite function,
\hoopl\ enables a variety of combinators over
rewrite functions.
\remark{This whole subsection needs to be redone in light of the new
(triple-based) representation. It's not pretty any more.}
For example, here is a function
that combines two rewriting functions in sequence:
\remark{This code must be improved}
\begin{smallcode}
thenFwdRw :: Monad m
          => FwdRewrite m n f
          -> FwdRewrite m n f
          -> FwdRewrite m n f
`thenFwdRw rw1 rw2 = wrapFRewrites2' f rw1 rw2
  where f rw1 rw2' n f = do
          res1 <- rw1 n f
          case res1 of
            NoFwdRes -> rw2' n f
            FwdRes g rw1a ->
              return $ FwdRes g (rw1a `thenFwdRw` rw2)

`noFwdRw :: Monad m => FwdRewrite m n f
noFwdRw = mkFRewrite' $ \ n f -> NoFwdRes
\end{smallcode}
What a beautiful type @thenFwdRw@ has! It tries @rw1@, and if @rw1@
declines to rewrite, it behaves like @rw2@. But if
@rw1@ rewrites, returning a new rewriter @rw1a@, then the overall call also
succeeds, returning a new rewrite function obtained by combining @rw1a@
with @rw2@. (We cannot apply @rw1a@ or @rw2@
directly to the replacement graph~@g@,
because @rw1@~returns a graph and @rw2@~expects a node.)
The rewriter @noFwdRw@ is the identity of @thenFwdRw@.
Finally, @thenFwdRw@ can
combine a deep-rewriting function and a shallow-rewriting function,
to produce a rewriting function that is a combination of deep and shallow.
%%This is something that the Lerner/Grove/Chambers framework could not do,
%%because there was a global shallow/deep flag.
%% Our shortsightedness; Lerner/Grove/Chambers is deep only ---NR


A shallow rewriting function can be made deep by iterating
it:\remark{Algebraic law wanted!}
\begin{smallfuzzcode}{6.6pt}
iterFwdRw
  :: Monad m => FwdRewrite m n f -> FwdRewrite m n f
`iterFwdRw rw = wrapFRewrites' f rw
  where f rw' n f = liftM iter (rw' n f)
        iter NoFwdRes = NoFwdRes
        iter (FwdRes g rw2) =
          FwdRes g (rw2 `thenFwdRw` iterFwdRw rw)
\end{smallfuzzcode}
If we have shallow rewrites $A$~and~$B$ then we can build $AB$,
$A^*B$, $(AB)^*$,
and so on: sequential composition is @thenFwdRw@ and the Kleene star
is @iterFwdRw@.\remark{Do we still believe this claim?}

The combinators above operate on rewrite functions that share a common
fact type and transfer function.
It~can also be useful to combine entire dataflow passes that use
different facts.
We~invite you to write one such combinator, with type
\begin{code}
`thenFwd :: Monad m
         => FwdPass m n f1
         -> FwdPass m n f2
         -> FwdPass m n (f1,f2)
\end{code}
The two passes run interleaved, not sequentially, and each may help
the other,
yielding better results than running $A$~and then~$B$ or $B$~and then~$A$
\citep{lerner-grove-chambers:2002}.
%% call these passes. ``super-analyses;''
%% in \hoopl, construction of super-analyses is
%% particularly concrete.
\subsection{Example: Constant propagation and constant folding}
\seclabel{const-prop-client}

% omit Binop :: Operator -> Expr -> Expr -> Expr
% omit Add :: Operator


\begin{figure}
{\small
\verbatiminput{cprop}
}
\caption{The client for constant propagation and constant folding\break (extracted automatically from code distributed with Hoopl)}
\figlabel{const-prop}
\end{figure}


\figref{const-prop} shows client code for
constant propagation and constant folding.
For each variable at each point in a graph, the analysis concludes one of three facts:
the variable holds a constant value (Boolean or integer),
the variable might hold a non-constant value,
or nothing is known about what the variable holds.
We~represent these facts using a finite map from a variable to a
fact of type (@Maybe HasConst@).
A variable with a constant value maps to @Just k@, where @k@ is the constant value;
a variable with a non-constant value maps to @Just Top@;
and a variable with an unknown value maps to @Nothing@ (i.e., it is not
in the domain of the finite map).
% \afterpage{\clearpage}

The definition of the lattice (@constLattice@) is straightforward.
The bottom element is an empty map (nothing is known about the contents of any variable).
We~use the @stdMapJoin@ function to lift the join operation
for a single variable (@constFactAdd@) up to the map containing facts
for all variables.
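The lifting that @stdMapJoin@ performs can be sketched as a standalone
function; this is a plausible rendering for illustration (with an
example join, @joinMax@, used only in the examples), not necessarily
\hoopl's exact code:

```haskell
import qualified Data.Map as Map

data ChangeFlag = NoChange | SomeChange deriving (Eq, Show)
type JoinFun v = v -> v -> (ChangeFlag, v)  -- old fact, new fact

-- Lift a per-variable join to a join on whole fact maps: a key
-- present only in the new map is inserted; a key present in both
-- maps is joined pointwise; and we report whether anything changed.
stdMapJoin :: Ord k => JoinFun v -> JoinFun (Map.Map k v)
stdMapJoin j old new = Map.foldrWithKey add (NoChange, old) new
  where add k new_v (ch, m) =
          case Map.lookup k m of
            Nothing    -> (SomeChange, Map.insert k new_v m)
            Just old_v -> case j old_v new_v of
              (NoChange,   _) -> (ch, m)
              (SomeChange, v) -> (SomeChange, Map.insert k v m)

-- Example per-variable join on Int facts: keep the maximum.
joinMax :: JoinFun Int
joinMax old new =
  if new <= old then (NoChange, old) else (SomeChange, new)
```

The change flag lets the fixed-point engine detect when a block's
incoming fact has stabilized.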

% omit stdMapJoin :: Ord k => JoinFun v -> JoinFun (Map.Map k v)

For the transfer function, @varHasConst@,
there are two interesting kinds of nodes:
assignment and conditional branch.
In the first two cases for assignment, a variable gets a constant value,
so we produce a dataflow fact mapping the variable to its value.
In the third case for assignment, the variable gets a non-constant value,
so we produce a dataflow fact mapping the variable to @Top@.
The last interesting case is a conditional branch where the condition
is a variable.
If the conditional branch flows to the true successor,
the variable holds @True@, and similarly for the false successor.
We update the fact flowing to each successor accordingly.

We do not need to consider complicated cases such as
an assignment @x:=y@ where @y@ holds a constant value @k@.
Instead, we rely on the interleaving of transformation
and analysis to first transform the assignment to @x:=k@,
which is exactly what our simple transfer function expects.
As we mention in \secref{simple-tx},
interleaving makes it possible to write
the simplest imaginable transfer functions, without missing
opportunities to improve the code.

The rewrite function for constant propagation, @constProp@,
simply rewrites each use of a variable to its constant value.
We use the auxiliary function @mapE@
to apply @rewriteE@ to each use of a variable in each kind of node;
in turn, the @rewriteE@ function checks if the variable has a constant
value and makes the substitution. We assume an auxiliary function
\begin{code}
`toAGraph :: Node e x -> AGraph Node e x
\end{code}

\figref{const-prop} also gives a completely separate rewrite function
to perform constant
folding, called @simplify@. It rewrites a conditional branch on a
boolean constant to an unconditional branch, and
to find constant subexpressions,
it runs @s_exp@
on every subexpression.
Function @simplify@ does not need to check whether a variable holds a
constant value; it relies on @constProp@ to have replaced the
variable by the constant.
Indeed, @simplify@ does not consult the
incoming fact at all, and hence is polymorphic in~@f@.


We have written two @FwdRewrite@ functions
because they are independently useful. But in this case we
want to apply \emph{both} of them,
so we compose them with @thenFwdRw@.
The composed rewrite functions, along with the lattice and the
transfer function,
go into @constPropPass@ (bottom of \figref{const-prop}).
To improve a particular graph, we pass @constPropPass@ and the graph
to
@analyzeAndRewriteFwd@.

\subsection{Throttling the dataflow engine using ``optimization fuel''}
\seclabel{vpoiso}
\seclabel{fuel}

Debugging an optimization can be tricky:
an optimization may rewrite hundreds of nodes,
and any of those rewrites could be incorrect.
To debug dataflow optimizations, we use Whalley's
\citeyearpar{whalley:isolation} powerful technique
to identify the first rewrite that
transforms a program from working code to faulty code.

The key idea is to~limit the number of rewrites that
are performed while optimizing a graph.
In \hoopl, the limit is called
\emph{optimization fuel}:
each rewrite costs one unit of fuel,
and when the fuel is exhausted, no more rewrites are permitted.
Because each rewrite leaves the observable behavior of the
program unchanged, it is safe to stop rewriting at any point.
Given a program that fails when compiled with optimization,
a test infrastructure uses binary search on the amount of
optimization fuel, until
it finds that the program works correctly after $n-1$ rewrites but fails
after $n$~rewrites.
The $n$th rewrite is faulty.

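The binary search itself is independent of \hoopl; a minimal sketch of
such a test harness (the predicate @works@ and the function name are
hypothetical, not part of \hoopl's API) looks like this:

```haskell
-- Given `works n` (does the program, compiled with n units of fuel,
-- pass its tests?), and assuming `works lo` holds while `works hi`
-- fails, narrow down to the n with works (n-1) but not (works n):
-- the n-th rewrite is the faulty one. Hypothetical harness.
firstFailing :: (Int -> Bool) -> Int -> Int -> Int
firstFailing works lo hi   -- invariant: works lo && not (works hi)
  | lo + 1 == hi = hi
  | works mid    = firstFailing works mid hi
  | otherwise    = firstFailing works lo mid
  where mid = (lo + hi) `div` 2
```

In practice @works@ would recompile and re-run the test suite, so the
search costs a logarithmic number of builds rather than one per rewrite.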
You may have noticed that @analyzeAndRewriteFwd@
returns a value in the @FuelMonad@ (\secref{using-hoopl}).
The @`FuelMonad@ is a simple state monad maintaining the supply of unused
fuel. It also holds a supply of fresh labels, which are used by the rewriter
for making new blocks; more precisely,
\hoopl\ uses these labels to
take the @AGraph@ returned by a pass's rewrite
function (\figref{rewrites}) and convert it to a~@Graph@.

\subsection{Fixed points and speculative rewrites} \seclabel{fixpoints}

Are rewrites sound, especially when there are loops?
Many analyses compute a fixed point starting from unsound
``facts''; for example, a live-variable analysis starts from the
assumption that all variables are dead. This means that \emph{rewrites
performed before a fixed point is reached may be unsound, and their results
must be discarded}. Each iteration of the fixed-point computation must
start afresh with the original graph.

Although the rewrites may be unsound, \emph{they must be performed}
(speculatively, and possibly recursively),
so that the facts downstream of the replacement graphs are as accurate
as possible.
For~example, consider this graph, with entry at @L1@:
\par{\small
\begin{code}
L1: x=0; goto L2
L2: x=x+1; if x==10 then goto L3 else goto L2
\end{code}}
The first traversal of block @L2@ starts with the unsound ``fact'' \{x=0\};
but analysis of the block propagates the new fact \{x=1\} to @L2@, which is
joined with the existing fact to get \{x=$\top$\}.
What if the predicate in the conditional branch were @x<10@ instead
of @x==10@?
Again the first iteration would begin with the tentative fact \{x=0\}.
Using that fact, we would rewrite the conditional branch to an unconditional
branch @goto L3@. No new fact would propagate to @L2@, and we would
have successfully (and soundly) eliminated the loop.
This example is contrived, but it illustrates that
for best results we should
\begin{itemize}
\item Perform the rewrites on every iteration.
\item Begin each new iteration with the original, virgin graph.
\end{itemize}
This sort of algorithm is hard to implement in an imperative setting, where rewrites
mutate a graph in place.
But with an immutable graph, implementing the algorithm
is easy: we simply revert to the original graph at the start
of each fixed-point iteration.

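The essence of that algorithm can be written as a tiny Kleene-iteration
sketch. Here the ``graph'' is just an immutable value consumed by a
pure, monotone step function; the name @fixpointSketch@ is illustrative
and much simpler than \hoopl's actual @fixpoint@:

```haskell
-- Repeatedly re-analyze the *original*, immutable graph g, starting
-- each iteration from the facts computed so far, until the facts
-- stop changing. "Reverting to the original graph" is free because
-- g is never mutated.
fixpointSketch :: Eq fact => (graph -> fact -> fact) -> graph -> fact -> fact
fixpointSketch step g f
  | f' == f   = f
  | otherwise = fixpointSketch step g f'
  where f' = step g f
```

For instance, with a step function that increases a counter up to a
bound, iteration from any start converges to the bound.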
\subsection{Correctness} \seclabel{correctness}

Facts computed by @analyzeAndRewriteFwd@ depend on graphs produced by the rewrite
function, which in turn depend on facts computed by the transfer function.
How~do we know that this algorithm is sound, or that it terminates?
A~proof requires a POPL paper
\cite{lerner-grove-chambers:2002}, but we can give some
intuition.

\hoopl\ requires that a client's functions meet
these preconditions:
\begin{itemize}
\item
The lattice must have no \emph{infinite ascending chains}; that is,
every sequence of calls to @fact_extend@ must eventually return @NoChange@.
\item
The transfer function must be
\emph{monotonic}: given a more informative fact in,
it should produce a more informative fact out.
\item
The rewrite function must be \emph{sound}:
if it replaces a node @n@ by a replacement graph~@g@, then @g@~must be
observationally equivalent to~@n@ under the
assumptions expressed by the incoming dataflow fact~@f@.
%%\footnote{We do not permit a transformation to change
%%  the @Label@ of a node. We have not found any optimizations
%%  that are prevented (or even affected) by this restriction.}
\item
The rewrite function must be \emph{consistent} with the transfer function;
that is, \mbox{@`transfer n f@ $\sqsubseteq$ @transfer g f@}.
For example, if the analysis says that @x@ is dead before the node~@n@,
then it had better still be dead if @n@ is replaced by @g@.
\item
To ensure termination, a transformation that uses deep rewriting
must not return replacement graphs which
contain nodes that could be rewritten indefinitely.
\end{itemize}
Without the conditions on monotonicity and consistency,
our algorithm will terminate,
but there is no guarantee that it will compute
a fixed point of the analysis. And that in turn threatens the
soundness of rewrites based on possibly bogus ``facts''.

However, when the preconditions above are met,
\begin{itemize}
\item
The algorithm terminates. The fixed-point loop must terminate because the
lattice has no infinite ascending chains. And the client is responsible
for avoiding infinite recursion when deep rewriting is used.
\item
The algorithm is sound. Why? Because if each rewrite is sound (in the sense given above),
then applying a succession of rewrites is also sound.
Moreover, a~sound analysis of the replacement graph
may generate only dataflow facts that could have been
generated by a more complicated analysis of the original graph.
\end{itemize}

\finalremark{Doesn't the rewrite have to have the following property:
for a forward analysis/transform, if (rewrite P s) = Just s',
then (transfer P s $\sqsubseteq$ transfer P s').
For backward: if (rewrite Q s) = Just s', then (transfer Q s' $\sqsubseteq$ transfer Q s).
Works for liveness.
``It works for liveness, so it must be true'' (NR).
If this is true, it's worth a QuickCheck property!
}%
\finalremark{Version 2, after further rumination. Let's define
$\scriptstyle \mathit{rt}(f,s) = \mathit{transform}(f, \mathit{rewrite}(f,s))$.
Then $\mathit{rt}$ should
be monotonic in~$f$. We think this is true of liveness, but we are not sure
whether it's just a generally good idea, or whether it's actually a
precondition for some (as yet unarticulated) property of \ourlib{} to hold.}%

%%%% \simon{The rewrite functions must presumably satisfy
%%%% some monotonicity property. Something like: given a more informative
%%%% fact, the rewrite function will rewrite a node to a more informative graph
%%%% (in the fact lattice.).
%%%% \textbf{NR}: actually the only obligation of the rewrite function is
%%%% to preserve observable behavior. There's no requirement that it be
%%%% monotonic or indeed that it do anything useful. It just has to
%%%% preserve semantics (and be a pure function of course).
%%%% \textbf{SLPJ} In that case I think I could cook up a program that
%%%% would never reach a fixed point. Imagine a liveness analysis with a loop;
%%%% x is initially unused anywhere.
%%%% At some assignment node inside the loop, the rewriter behaves as follows:
%%%% if (and only if) x is dead downstream,
%%%% make it alive by rewriting the assignment to mention x.
%%%% Now in each successive iteration x will go live/dead/live/dead etc. I
%%%% maintain my claim that rewrite functions must satisfy some
%%%% monotonicity property.
%%%% \textbf{JD}: in the example you cite, monotonicity of facts at labels
%%%% means x cannot go live/dead/live/dead etc. The only way we can think
%%%% of not to terminate is infinite ``deep rewriting.''
%%%% }




\section{\ourlib's implementation}
\seclabel{engine}
\seclabel{dfengine}

\secref{making-simple}
gives a client's-eye view of \hoopl, showing how to use
it to create analyses and transformations.
\hoopl's interface is simple, but
the \emph{implementation} of interleaved analysis and rewriting is
quite complicated. \citet{lerner-grove-chambers:2002}
do not describe their implementation. We have written
at least three previous implementations, all of which
were long and hard to understand, and only one of which
provided compile-time guarantees about open and closed shapes.
We are not confident that any of these implementations is correct.

In this paper we describe our new implementation. It is short
(about a third of the size of our last attempt) and elegant, and it offers
strong static shape guarantees. The whole thing is about 300~lines of
code, excluding comments; this count includes both forward and backward
dataflow analysis and transformation.

We describe the implementation of \emph{forward}
analysis and transformation.
The implementations of backward analysis and transformation are
exactly analogous and are included in \hoopl.



\subsection{Overview}

{\hfuzz=0.6pt
We concentrate on implementing @analyzeAndRewriteFwd@, whose
type is in \secref{using-hoopl}.
Its implementation is built on the hierarchy of nodes, blocks, and graphs
described in \secref{graph-rep}. For each thing in the hierarchy,
we develop a function of this type:
\begin{code}
type `ARF ^thing n
  = forall f e x. FwdPass n f
    -> thing e x -> Fact e f
    -> FuelMonad (RG n f e x, Fact x f)
\end{code}
An @ARF@ (short for ``analyze and rewrite forward'') is a combination of
a rewrite and transfer function.
}%
An @ARF@ takes a @FwdPass@, a @thing@ (a node, block, or graph),
and an input fact,
and it returns a rewritten graph of type @(RG n f e x)@ of the same shape
as the @thing@, plus a suitably shaped output fact.
%
%Regardless of whether @thing@ is a node, block, or graph, the result
%is always a graph.
%
% point is made adequately in the client section ---NR
The type~@RG@ is internal to \hoopl; it is not seen by any client.
We use it, not @Graph@, for two reasons:
\begin{itemize}
\item The client is often interested not only in the facts flowing
out of the graph (which are returned in the @Fact x f@),
but also in the facts on the \emph{internal} blocks
of the graph. A~replacement graph of type @(RG n f e x)@ is decorated with
these internal facts.
\item A @Graph@ has deliberately restrictive invariants; for example,
a @GMany@ with a @JustO@ is always open at exit (\figref{graph}). It turns
out to be awkward to maintain these invariants \emph{during} rewriting,
but easy to restore them \emph{after} rewriting by ``normalizing'' an @RG@.
\end{itemize}
The information in an @RG@ is returned to the client by
the normalization function @normalizeBody@, which
splits an @RG@ into a @Body@ and its corresponding @FactBase@:
\begin{code}
`normalizeBody :: Edges n => RG n f C C
               -> (Body n, FactBase f)
\end{code}
\begin{figure}
\hfuzz=0.98pt
\begin{code}
data `RG n f e x where
  `RGNil  :: RG n f a a
  `RGCatO :: RG n f e O -> RG n f O x -> RG n f e x
  `RGCatC :: RG n f e C -> RG n f C x -> RG n f e x
  `RGUnit :: Fact e f -> Block n e x -> RG n f e x
\end{code}
\caption{The data type \texttt{RG} of rewritten graphs} \figlabel{rg}
\end{figure}
The constructors of @RG@ are given in \figref{rg}.
The essential points are that constructor @RGUnit@ is polymorphic in
the shape of a block, @RGUnit@ carries a fact as
well as a block, and the concatenation constructors record the shapes
of the graphs at the point of concatenation.
\remark{Not too happy with calling this ``concatenation''}
(A~record of the shapes is needed so that when
@normalizeBody@ is presented with a block carried by @RGUnit@, it is
known whether the block is an entry sequence, an exit sequence, or a
basic block.)
\remark{
Within \hoopl,
the @RG@~type is a great convenience. Mutter mutter:
it carries facts as well as blocks, and it frees the client's
rewrite functions from any obligation to respect the invariants of
type @Graph@---I'm not convinced.}


We exploit the type distinctions of nodes, @Block@, @Body@,
and @Graph@ to structure the code into several small pieces, each of which
can be understood independently. Specifically, we define a layered set of
functions, each of which calls the previous one:
\begin{code}
arfNode  :: Edges n => ARF n n
arfBlock :: Edges n => ARF (Block n) n
arfBody  :: Edges n
         => FwdPass n f -> Body n -> FactBase f
         -> FuelMonad (RG n f C C, FactBase f)
arfGraph :: Edges n => ARF (Graph n) n
\end{code}
\begin{itemize}
\item
The @arfNode@ function processes nodes (\secref{arf-node}).
It handles the subtleties of interleaved analysis and rewriting,
and it deals with fuel consumption. It calls @arfGraph@ to analyze
and transform rewritten graphs.
\item
Based on @arfNode@ it is extremely easy to write @arfBlock@, which lifts
the analysis and rewriting from nodes to blocks (\secref{arf-block}).

%%% \ifpagetuning\penalty -10000 \fi

\item
Using @arfBlock@ we define @arfBody@, which analyzes and rewrites a
@Body@: a~group of closed/closed blocks linked by arbitrary
control flow.
Because a @Body@ is
always closed/closed and does not take shape parameters, function
@arfBody@ is less polymorphic than the others; its type is what
would be obtained by expanding and specializing the definition of
@ARF@ for a @thing@ which is always closed/closed and is equivalent to
a @Body@.

Function @arfBody@ takes care of fixed points (\secref{arf-body}).
\item
Based on @arfBody@ it is easy to write @arfGraph@ (\secref{arf-graph}).
\end{itemize}
Given these functions, writing the main analyzer is a simple
matter of matching the external API to the internal functions:
\begin{code}
`analyzeAndRewriteFwd
  :: forall n f. Edges n
  => FwdPass n f -> Body n -> FactBase f
  -> FuelMonad (Body n, FactBase f)

analyzeAndRewriteFwd pass ^body facts
  = do { (^rg, _) <- arfBody pass body facts
       ; return (normalizeBody rg) }
\end{code}

\subsection{From nodes to blocks} \seclabel{arf-block}
\seclabel{arf-graph}

We begin our explanation with the second task:
writing @arfBlock@, which analyzes and transforms blocks.
\begin{code}
`arfBlock :: Edges n => ARF (Block n) n
arfBlock pass (BUnit node) f
  = arfNode pass node f
arfBlock pass (BCat b1 b2) f
  = do { (g1,f1) <- arfBlock pass b1 f
       ; (g2,f2) <- arfBlock pass b2 f1
       ; return (g1 `RGCatO` g2, f2) }
\end{code}
The code is delightfully simple.
The @BUnit@ case is implemented by @arfNode@.
The @BCat@ case is implemented by recursively applying @arfBlock@ to the two
sub-blocks, threading the output fact from the first as the
input to the second.
Each recursive call produces a rewritten graph;
we concatenate them with @RGCatO@.

Function @arfGraph@ is equally straightforward:
\begin{code}
`arfGraph :: Edges n => ARF (Graph n) n
arfGraph _    GNil        f = return (RGNil, f)
arfGraph pass (GUnit blk) f = arfBlock pass blk f
arfGraph pass (GMany NothingO body NothingO) f
  = do { (^body', ^fb) <- arfBody pass body f
       ; return (body', fb) }
arfGraph pass (GMany NothingO body (JustO ^exit)) f
  = do { (body', fb) <- arfBody pass body f
       ; (^exit', ^fx) <- arfBlock pass exit fb
       ; return (body' `RGCatC` exit', fx) }
-- ... two more equations for GMany ...
\end{code}
The pattern is the same as for @arfBlock@: thread
facts through the sequence, and concatenate the results.
Because the constructors of type~@RG@ are more polymorphic than those
of @Graph@, type~@RG@ can represent
graphs more simply than @Graph@; for example, each element of a
@GMany@ becomes a single @RG@ object, and these @RG@ objects are then
concatenated to form a single result of type~@RG@.

%% \ifpagetuning\penalty -10000 \fi

\subsection{Analyzing and rewriting nodes} \seclabel{arf-node}

Although interleaving analysis with transformation is tricky, we have
succeeded in isolating the algorithm in just two functions,
@arfNode@ and its backward analog, @`arbNode@:
\begin{fuzzcode}{10.5pt}
`arfNode :: Edges n => ARF n n
arfNode ^pass n f
  = do { ^mb_g <- withFuel (fp_rewrite pass n f)
       ; case mb_g of
           Nothing -> return (RGUnit f (BUnit n),
                              fp_transfer pass n f)
           Just (FwdRes ^ag ^rw) ->
             do { g <- graphOfAGraph ag
                ; let ^pass' = pass { fp_rewrite = rw }
                ; arfGraph pass' g f } }
\end{fuzzcode}
The code here is more complicated,
but still admirably brief.
Using the @fp_rewrite@ record selector (\figref{api-types}),
we~begin by extracting the
rewriting function from the @FwdPass@,
and we apply it to the node~@n@
and the incoming fact~@f@.

The resulting @Maybe@ is passed to @withFuel@, which
deals with fuel accounting:
\begin{code}
`withFuel :: Maybe a -> FuelMonad (Maybe a)
\end{code}
If @withFuel@'s argument is @Nothing@, \emph{or} if we have run out of
optimization fuel (\secref{fuel}), @withFuel@ returns @Nothing@.
Otherwise, @withFuel@ consumes one unit of fuel and returns its
% defn Fuel
argument (which will be a @Just@). That is all we need say about fuel.

In the @Nothing@ case, no rewrite takes place---either because the rewrite function
didn't want one or because fuel is exhausted.
We~return a single-node
graph @(RGUnit f (BUnit n))@, decorated with its incoming fact.
We~also apply the transfer function @(fp_transfer pass)@
to the incoming fact to produce the outgoing fact.
(Like @fp_rewrite@, @fp_transfer@ is a record selector of @FwdPass@.)

In the @Just@ case, we receive a replacement
@AGraph@ @ag@ and a new rewrite function~@rw@.
We~convert @ag@ to a @Graph@, using
\par{\small
\begin{code}
`graphOfAGraph :: AGraph n e x -> FuelMonad (Graph n e x)
\end{code}}
and we analyze the resulting @Graph@ with @arfGraph@.
This analysis uses @pass'@, which contains the original lattice and transfer
function from @pass@, together with the new rewrite function~@rw@.

And that's it! If~the client wanted deep rewriting, it is
implemented by the call to @arfGraph@;
if the client wanted
shallow rewriting, the rewrite function will have returned
@noFwdRw@ as~@rw@, which is implanted in @pass'@
(\secref{shallow-vs-deep}).

\subsection{Fixed points} \seclabel{arf-body}

Lastly, @arfBody@ deals with the fixed-point calculation.
This part of the implementation is the only really tricky part, and it is
cleanly separated from everything else:
\par{\small
\begin{code}
arfBody :: Edges n
        => FwdPass n f -> Body n -> FactBase f
        -> FuelMonad (RG n f C C, FactBase f)
`arfBody pass body ^fbase
  = fixpoint True (fp_lattice pass) (arfBlock pass) fbase $
    forwardBlockList (factBaseLabels fbase) body
\end{code}}
Function @forwardBlockList@ takes a list of possible entry points and a @Body@,
and it
returns a linear list of
blocks, sorted into an order that makes forward dataflow efficient:
\begin{code}
`forwardBlockList
   :: Edges n => [Label]
   -> Body n -> [(Label,Block n C C)]
\end{code}
For
example, if the @Body@ starts at block~@L2@, and @L2@
branches to~@L1@, but not vice versa, then \hoopl\ will reach a fixed point
more quickly if we process @L2@ before~@L1@.
To~find an efficient order, @forwardBlockList@ uses
the methods of the @Edges@ class---@entryLabel@ and @successors@---to
perform a reverse depth-first traversal of the control-flow graph.
%%
%%The @Edges@ type-class constraint on~@n@ propagates to all the
%%@`arfThing@ functions.
%% paragraph carrying too much freight
%%
The order of the blocks does not affect the fixed point or any other
part of the answer; it affects only the number of iterations needed to
reach the fixed point.

How do we know what entry points to pass to @forwardBlockList@?
We treat
any block with an entry in the in-flowing @FactBase@ as an entry point.
\finalremark{Why does this work?}
{\hfuzz=0.8pt \par}

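The traversal that @forwardBlockList@ performs can be sketched as a small standalone function. The sketch below is a hypothetical reconstruction, not \hoopl's code: it uses a bare adjacency function in place of the @Edges@ methods. Emitting each label as its depth-first search finishes, most recently finished first, yields a reverse postorder, in which every block precedes its successors except on cycles:

```haskell
import qualified Data.Set as Set

type Label = Int

-- Hypothetical sketch of the block ordering: depth-first search from
-- the entry points, accumulating labels as their searches finish.
-- The head of the result is the last label to finish, so the list is
-- a reverse postorder. Blocks unreachable from any entry never appear.
forwardOrder :: (Label -> [Label]) -> [Label] -> [Label]
forwardOrder successorsOf entries = snd (foldl visit (Set.empty, []) entries)
  where
    visit (seen, order) l
      | l `Set.member` seen = (seen, order)
      | otherwise =
          let (seen', order') =
                foldl visit (Set.insert l seen, order) (successorsOf l)
          in (seen', l : order')  -- l finishes after all its successors
```

In the example above, with successors @2 -> [1]@ and @1 -> []@ and entry point @[2]@, the order is @[2, 1]@, so @L2@ is processed before @L1@.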
The rest of the work is done by @fixpoint@, which is shared by
both forward and backward analyses:
\begin{code}
`fixpoint :: forall n f.
             Edges n
          => Bool   -- going Forward?
          -> DataflowLattice f
          -> (Block n C C -> FactBase f ->
              FuelMonad (RG n f C C, FactBase f))
          -> FactBase f
          -> [(Label, Block n C C)]
          -> FuelMonad (RG n f C C, FactBase f)
\end{code}
Except for the mysterious @Bool@ passed as the first argument,
the type signature tells the story.
The third argument is
a function that analyzes and rewrites a single block;
@fixpoint@ applies that function successively to all the blocks,
which are passed as the fifth argument.\finalremark{For consistency with the transfer
functions, blocks should come before @FactBase@, even though this change will
ugly up the call site some.}
The @fixpoint@ function maintains a
``Current @FactBase@,''
which grows monotonically:
the initial value of the Current @FactBase@ is the fourth argument to
@fixpoint@,
and the Current @FactBase@ is augmented with the new facts that flow
out of each @Block@ as it is analyzed.
The @fixpoint@ function
keeps analyzing blocks until the Current @FactBase@ reaches a fixed
point.

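Stripped of blocks, facts, and fuel, the iteration strategy is the standard one. As a purely illustrative sketch (not \hoopl's code), it amounts to re-running a step function until its output stops changing:

```haskell
-- Illustrative only: iterate a step function until nothing changes.
-- Termination relies on the step being monotone over a lattice of
-- finite height, just as fixpoint relies on the client's lattice.
stabilize :: Eq a => (a -> a) -> a -> a
stabilize step x
  | x' == x   = x
  | otherwise = stabilize step x'
  where x' = step x
```

In @fixpoint@ itself the ``step'' analyzes every block once and the change test is the @ChangeFlag@ maintained alongside the Current @FactBase@.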
The code for @fixpoint@ is a massive 70 lines long;
for completeness, it
appears in Appendix~\ref{app:fixpoint}.
The~code is mostly straightforward, although we try to be a bit clever
about deciding when a new fact means that another iteration over the
blocks will be required.
There is one more subtle point worth mentioning, which we highlight by
considering a forward analysis of this graph, where execution starts at~@L1@:
\begin{code}
L1: x:=3; goto L4
L2: x:=4; goto L4
L4: if x>3 goto L2 else goto L5
\end{code}
Block @L2@ is unreachable.
But if we \naively\ process all the blocks (say in
order @L1@, @L4@, @L2@), then we will start with the bottom fact for @L2@, propagate
\{@x=4@\} to @L4@, where it will join with \{@x=3@\} to yield
\{@x=@$\top$\}.
Given @x=@$\top$, the
conditional in @L4@ cannot be rewritten, and @L2@~seems reachable. We have
lost a good optimization.

Our implementation solves this problem through a clever trick that is
safe only for a forward analysis:
@fixpoint@ analyzes a block only if the block is
reachable from an entry point.
This trick is not safe for a backward analysis, which
is why
@fixpoint@ takes a @Bool@ as its first argument:
it must know if the analysis goes forward.

Although the trick can be implemented in just a couple of lines of
code, the reasoning behind it is quite subtle---exactly the sort of
thing that should be implemented once in \hoopl, so clients don't have
to worry about it.

\section{Related work} \seclabel{related}

While there is a vast body of literature on
dataflow analysis and optimization,
relatively little can be found on
the \emph{design} of optimizers, which is the topic of this paper.
We therefore focus on the foundations of dataflow analysis
and on the implementations of some comparable dataflow frameworks.

\paragraph{Foundations}

When transfer functions are monotone and lattices are finite in height,
iterative dataflow analysis converges to a fixed point
\cite{kam-ullman:global-iterative-analysis}.
If~the lattice's join operation distributes over transfer
functions,
this fixed point is equivalent to a join-over-all-paths solution to
the recursive dataflow equations
\cite{kildall:unified-optimization}.\footnote
{Kildall uses meets, not joins.
Lattice orientation is conventional, and conventions have changed.
We use Dana Scott's
orientation, in which higher elements carry more information.}
\citet{kam-ullman:monotone-flow-analysis} generalize to some
monotone functions.
Each~client of \hoopl\ must guarantee monotonicity.

\ifcutting
\citet{cousot:abstract-interpretation:1977}
\else
\citet{cousot:abstract-interpretation:1977,cousot:systematic-analysis-frameworks}
\fi
introduce abstract interpretation as a technique for developing
lattices for program analysis.
\citet{schmidt:data-flow-analysis-model-checking} shows that
an all-paths dataflow problem can be viewed as model checking an
abstract interpretation.

\citet{muchnick:compiler-implementation}
presents many examples of both particular analyses and related
algorithms.

The soundness of interleaving analysis and transformation,
even when not all speculative transformations are performed on later
iterations, was shown by
\citet{lerner-grove-chambers:2002}.

\paragraph{Frameworks}
Most dataflow frameworks support only analysis, not transformation.
The framework computes a fixed point of transfer functions, and it is
up to the client of
the framework to use that fixed point for transformation.
Omitting transformation makes it much easier to build frameworks,
and one can find a spectrum of designs.
We~describe two representative
designs, then move on to the prior frameworks that support interleaved
analysis and transformation.

The CIL toolkit \cite{necula:cil:2002}\finalremark{No good citation
for same reason as Soot below ---JD}
provides an analysis-only framework for C~programs.
The framework is limited to one representation of control-flow graphs
and one representation of instructions, both of which are provided by
the framework.
The~API is complicated;
much of the complexity is needed to enable the client to
affect which instructions
the analysis iterates over.

The Soot framework is designed for analysis of Java programs
\cite{hendren:soot:2000}.\finalremark{This citation is probably the
best for Soot in general, but there doesn't appear
to be any formal publication that actually details the dataflow
framework part. ---JD}
While Soot's dataflow library, like CIL's, supports only analysis, not
transformation, we found much
to admire in its design.
Soot's library is abstracted over the representation of
the control-flow graph and the representation of instructions.
Soot's interface for defining lattice and analysis functions is
like our own,
although because Soot is implemented in an imperative style,
additional functions are needed to copy lattice elements.

\finalremark{FYI, LLVM has Pass Managers that try to control the order of passes,
but I'll be darned if I can find anything that might be termed a dataflow framework.}

The Whirlwind compiler contains the dataflow framework implemented
by \citet{lerner-grove-chambers:2002}, who were the first to
interleave analysis and transformation.
Their implementation is much like our early efforts:
it is a complicated mix of code that simultaneously manages interleaving,
deep rewriting, and fixed-point computation.
By~separating these tasks,
our implementation simplifies the problem dramatically.
Whirlwind's implementation also suffers from the difficulty of
maintaining pointer invariants in a mutable representation of
control-flow graphs, a problem we have discussed elsewhere
\cite{ramsey-dias:applicative-flow-graph}.

Because speculative transformation is difficult in an imperative setting,
Whirlwind's implementation is split into two phases.
The first phase runs the interleaved analyses and transformations
to compute the final dataflow facts and a representation of the transformations
that should be applied to the input graph.
The second phase executes the transformations.
In~\hoopl, because control-flow graphs are immutable, speculative transformations
can be applied immediately, and there is no need
for a phase distinction.

%%% % repetitious...
%%%
%%% \ourlib\ also improves upon Whirlwind's dataflow framework by providing
%%% new support for the optimization writer:
%%% \begin{itemize}
%%% \item Using static type guarantees, \hoopl\ rules out a whole
%%% class of possible bugs: transformations that produced malformed
%%% control-flow graphs.
%%% \item Using dynamic testing,
%%% we can isolate the rewrite that transforms a working program
%%% into a faulty program,
%%% using Whalley's \citeyearpar{whalley:isolation} fault-isolation technique.
%%% \end{itemize}

In previous work \cite{ramsey-dias:applicative-flow-graph}, we
described a zipper-based representation of control-flow
graphs, stressing the advantages
of immutability.
Our new representation, described in \secref{graph-rep}, is a significant improvement:
\begin{itemize}
\item
We can concatenate nodes, blocks, and graphs in constant time.
%Previously, we had to resort to Hughes's
%\citeyearpar{hughes:lists-representation:article} technique, representing
%a graph as a function.
\item
We can do a backward analysis without having
to ``unzip'' (and allocate a copy of) each block.
\item
Using GADTs, we can represent a flow-graph
node using a single type, instead of the triple of first, middle, and
last types used in our earlier representation.
This change simplifies the interface significantly:
instead of providing three transfer functions and three rewrite
functions per pass---one for
each type of node---a client of \hoopl\ provides only one transfer
function and one rewrite function per pass.
\item
Errors in concatenation are ruled out at
compile time by Haskell's static
type system.
In~earlier implementations, such errors were not detected until
the compiler~ran, at which point we tried to compensate
for the errors---but
the compensation code harbored subtle faults,
which we discovered while developing a new back end
for the Glasgow Haskell Compiler.
\end{itemize}

The implementation of \ourlib\ is also much better than
our earlier implementations.
Not only is the code simpler conceptually,
but it is also shorter:
our new implementation is about a third as long
as the previous version, which is part of GHC, version 6.12.

\section{What we learned}

We have spent six years implementing and reimplementing frameworks for
dataflow analysis and transformation.
This formidable design problem taught us
two kinds of lessons:
we learned some very specific lessons about representations and
algorithms for optimizing compilers,
and we were forcibly reminded of some very general, old lessons that are well
known not just to functional programmers, but to programmers
everywhere.

%% \remark{Orphaned: but for transfer functions that
%% approximate weakest preconditions or strongest postconditions,
%% monotonicity falls out naturally.}
%%
%%
%% In conclusion we offer the following lessons from the experience of designing
%% and implementing \ourlib{}.
%% \begin{itemize}
%% \item
%% Although we have not stressed this point, there is a close connection
%% between dataflow analyses and program logic:
%% \begin{itemize}
%% \item
%% A forward dataflow analysis is represented by a predicate transformer
%% that is related to \emph{strongest postconditions}
%% \cite{floyd:meaning}.\footnote
%% {In Floyd's paper the output of the predicate transformer is called
%% the \emph{strongest verifiable consequent}, not the ``strongest
%% postcondition.''}
%% \item
%% A backward dataflow analysis is represented by a predicate transformer
%% that is related to \emph{weakest preconditions} \cite{dijkstra:discipline}.
%% \end{itemize}
%% Logicians write down the predicate transformers for the primitive
%% program fragments, and then use compositional rules to ``lift'' them
%% to a logic for whole programs. In the same way \ourlib{} lets the client
%% write simple predicate transformers,
%% and local rewrites based on those assertions, and ``lifts'' them to entire
%% function bodies with arbitrary control flow.

Our main goal for \hoopl\ was to combine three good ideas (interleaved
analysis and transformation, optimization fuel, and an applicative
control-flow graph) in a way that could easily be reused by many, many
compiler writers.
Reuse requires abstraction, and as is well known,
designing good abstractions is challenging.
\hoopl's data types and the functions over those types have been
through \emph{dozens} of revisions.
As~we were refining our design, we~found it invaluable to operate in
two modes:
In the first mode, we designed, built, and used a framework as an
important component of a real compiler (first Quick~{\PAL}, then GHC).
In the second mode, we designed and built a standalone library, then
redesigned and rebuilt it, sometimes going through several significant
changes in a week.
Operating in the first mode---inside a live compiler---forced us to
make sure that no corners were cut, that we were solving a real
problem, and that we did not inadvertently
cripple some other part of the compiler.
Operating in the second mode---as a standalone library---enabled us to
iterate furiously, trying out many more ideas than would have
been possible in the first mode.
We~have learned that alternating between these two modes leads to a
better design than operating in either mode alone.

We were forcibly reminded of timeless truths:
that interfaces are more important than implementations, and that data
is more important than code.
These truths are reflected in this paper, in which
we
have given \hoopl's API three times as much space as \hoopl's implementation.

We were also reminded that Haskell's type system (polymorphism, GADTs,
higher-order functions, type classes, and so on) is a remarkably
effective
language for thinking about data and code---and that
Haskell lacks a language of interfaces (like ML's signatures) that
would make it equally effective for thinking about APIs at a larger scale.
Still, as usual, the types were a remarkable aid to writing the code:
when we finally agreed on the types presented above, the
code almost wrote itself.

Types are widely appreciated at ICFP, but here are three specific
examples of how types helped us:
\begin{itemize}
\item
Reuse is enabled by representation-independence, which in a functional
language is
expressed through parametric polymorphism.
Making \ourlib{} polymorphic in the nodes
made the code simpler, easier to understand, and easier to maintain.
In particular, it forced us to make explicit \emph{exactly} what
\ourlib\ must know about nodes, and to embody that knowledge in the
@Edges@ type class (\secref{edges}).
\item
We are proud of using GADTs to
track the open and closed shapes of nodes, blocks, and graphs at
compile time.
Shapes may
seem like a small refinement, but they helped tremendously when
building \hoopl, and we expect them to help clients.
%
% this paper is just not about run-time performance ---NR
%
%%%% Moreover, the implementation is faster than it would otherwise be,
%%%% because, say, a @(Fact O f)e@ is known to be just an @f@ rather than
%%%% being a sum type that must be tested (with a statically known outcome!).
%
Giving the \emph{same} shapes
to nodes, blocks, and graphs helped our
thinking and helped to structure the implementation.
\item
In our earlier designs, graphs were parameterized over \emph{three} node
types: first, middle, and last nodes.
Those designs therefore required
three transfer functions, three rewrite functions, and so~on.
Moving to a single, ``shapely'' node type was a major breakthrough:
not only do we have just one node type, but our client need supply only
one transfer function and one rewrite function.
To~make this design work, however, we \emph{must} have
the type-level function for
@Fact@ (\figref{api-types}), to express how incoming
and outgoing facts depend on the shape of a node.
\end{itemize}

Dataflow optimization is usually described as a way to improve imperative
programs by mutating control-flow graphs.
Such transformations appear very different from the tree rewriting
that functional languages are so well known for, and that makes
functional languages so attractive for writing other parts of compilers.
But even though dataflow optimization looks very different from
what we are used to,
writing a dataflow optimizer
in a pure functional language was a huge win.
% We could not possibly have conceived \ourlib{} in C++.
In~a pure functional language, not only do we know that
no data structure will be unexpectedly mutated,
but we are forced to be
explicit about every input and output,
and we are encouraged to implement things compositionally.
This kind of thinking has helped us make
significant improvements to the already tricky work of Lerner, Grove,
and Chambers:
per-function control of shallow vs deep rewriting
(\secref{shallow-vs-deep}),
combinators for dataflow passes (\secref{combinators}),
optimization fuel (\secref{fuel}),
and transparent management of unreachable blocks (\secref{arf-body}).
We~trust that these improvements are right only because they are
implemented in separate
parts of the code that cannot interact except through
explicit function calls.
%%
%%An ancestor of \ourlib{} is in the Glasgow Haskell Compiler today,
%%in version~6.12.
With this new, improved design in hand, we are now moving back to
live-compiler mode, pushing \hoopl\ into version
6.13 of the Glasgow Haskell Compiler.

\acks

The first and second authors were funded
by a grant from Intel Corporation and
by NSF awards CCF-0838899 and CCF-0311482.
These authors also thank Microsoft Research Ltd, UK, for funding
extended visits to the third author.

\makeatother

\providecommand\includeftpref{\relax} %% total bafflement -- workaround
\IfFileExists{nrbib.tex}{\bibliography{cs,ramsey}}{\bibliography{dfopt}}
\bibliographystyle{plainnatx}

\clearpage

\appendix

% omit LabelSet :: *
% omit LabelMap :: *
% omit delFromFactBase :: FactBase f -> [(Label,f)] -> FactBase f
% omit elemFactBase :: Label -> FactBase f -> Bool
% omit elemLabelSet :: Label -> LabelSet -> Bool
% omit emptyLabelSet :: LabelSet
% omit factBaseLabels :: FactBase f -> [Label]
% omit extendFactBase :: FactBase f -> Label -> f -> FactBase f
% omit extendLabelSet :: LabelSet -> Label -> LabelSet
% omit getFuel :: FuelMonad Fuel
% omit setFuel :: Fuel -> FuelMonad ()
% omit lookupFact :: FactBase f -> Label -> Maybe f
% omit factBaseList :: FactBase f -> [(Label, f)]

\section{Code for \textmd{\texttt{fixpoint}}}
\label{app:fixpoint}

{\def\baselinestretch{0.95}\hfuzz=20pt
\begin{smallcode}
data `TxFactBase n f
  = `TxFB { `tfb_fbase :: FactBase f
          , `tfb_rg    :: RG n f C C -- Transformed blocks
          , `tfb_cha   :: ChangeFlag
          , `tfb_lbls  :: LabelSet }
  -- Set the tfb_cha flag iff
  --   (a) the fact in tfb_fbase for a block L changes, and
  --   (b) L is in tfb_lbls.
  -- The tfb_lbls are all Labels of the *original*
  -- (not transformed) blocks

`updateFact :: DataflowLattice f -> LabelSet -> (Label, f)
            -> (ChangeFlag, FactBase f)
            -> (ChangeFlag, FactBase f)
updateFact ^lat ^lbls (lbl, ^new_fact) (^cha, fbase)
  | NoChange <- ^cha2       = (cha, fbase)
  | lbl `elemLabelSet` lbls = (SomeChange, new_fbase)
  | otherwise               = (cha, new_fbase)
  where
    (cha2, ^res_fact)
       = case lookupFact fbase lbl of
           Nothing        -> (SomeChange, new_fact)
           Just ^old_fact -> fact_extend lat old_fact new_fact
    ^new_fbase = extendFactBase fbase lbl res_fact

fixpoint :: forall n f. Edges n
         => Bool   -- Going forwards?
         -> DataflowLattice f
         -> (Block n C C -> FactBase f
             -> FuelMonad (RG n f C C, FactBase f))
         -> FactBase f -> [(Label, Block n C C)]
         -> FuelMonad (RG n f C C, FactBase f)
fixpoint ^is_fwd lat ^do_block ^init_fbase ^blocks
  = do { ^fuel <- getFuel
       ; ^tx_fb <- loop fuel init_fbase
       ; return (tfb_rg tx_fb,
                 tfb_fbase tx_fb `delFromFactBase` blocks) }
         -- The outgoing FactBase contains facts only for
         -- Labels *not* in the blocks of the graph
  where
    `tx_blocks :: [(Label, Block n C C)]
               -> TxFactBase n f -> FuelMonad (TxFactBase n f)
    tx_blocks []             tx_fb = return tx_fb
    tx_blocks ((lbl,blk):bs) tx_fb = tx_block lbl blk tx_fb
                                     >>= tx_blocks bs

    `tx_block :: Label -> Block n C C
              -> TxFactBase n f -> FuelMonad (TxFactBase n f)
    tx_block ^lbl ^blk tx_fb@(TxFB { tfb_fbase = fbase
                                   , tfb_lbls  = lbls
                                   , tfb_rg    = ^blks
                                   , tfb_cha   = cha })
      | is_fwd && not (lbl `elemFactBase` fbase)
      = return tx_fb  -- Note [Unreachable blocks]
      | otherwise
      = do { (rg, ^out_facts) <- do_block blk fbase
           ; let (^cha', ^fbase')
                    = foldr (updateFact lat lbls) (cha,fbase)
                            (factBaseList out_facts)
           ; return (TxFB { tfb_lbls  = extendLabelSet lbls lbl
                          , tfb_rg    = rg `RGCatC` blks
                          , tfb_fbase = fbase'
                          , tfb_cha   = cha' }) }

    loop :: Fuel -> FactBase f -> FuelMonad (TxFactBase n f)
    `loop fuel fbase
      = do { let ^init_tx_fb = TxFB { tfb_fbase = fbase
                                    , tfb_cha   = NoChange
                                    , tfb_rg    = RGNil
                                    , tfb_lbls  = emptyLabelSet}
           ; tx_fb <- tx_blocks blocks init_tx_fb
           ; case tfb_cha tx_fb of
               NoChange   -> return tx_fb
               SomeChange -> setFuel fuel >>
                             loop fuel (tfb_fbase tx_fb) }
\end{smallcode}
\par
} % end \baselinestretch

\section{Index of defined identifiers}

This appendix lists every nontrivial identifier used in the body of
the paper.
For each identifier, we list the page on which that identifier is
defined or discussed---or when appropriate, the figure (with line
number where possible).
For those few identifiers not defined or discussed in text, we give
the type signature and the page on which the identifier is first
referred to.

Some identifiers used in the text are defined in the Haskell Prelude;
for those readers less familiar with Haskell, these identifiers are
listed in Appendix~\ref{sec:prelude}.

\newcommand\dropit[3][]{}

\newcommand\hsprelude[2]{\noindent
  \texttt{#1} defined in the Haskell Prelude\\}
\let\hsprelude\dropit

\newcommand\hspagedef[3][]{\noindent
  \texttt{#2} defined on page~\pageref{#3}.\\}
\newcommand\omithspagedef[3][]{\noindent
  \texttt{#2} not shown (but see page~\pageref{#3}).\\}
\newcommand\omithsfigdef[3][]{\noindent
  \texttt{#2} not shown (but see Figure~\ref{#3} on page~\pageref{#3}).\\}
\newcommand\hsfigdef[3][]{%
  \noindent
  \ifx!#1!%
    \texttt{#2} defined in Figure~\ref{#3} on page~\pageref{#3}.\\
  \else
    \texttt{#2} defined on \lineref{#1} of Figure~\ref{#3} on page~\pageref{#3}.\\
  \fi
}
\newcommand\hstabdef[3][]{%
  \noindent
  \ifx!#1!
    \texttt{#2} defined in Table~\ref{#3} on page~\pageref{#3}.\\
  \else
    \texttt{#2} defined on \lineref{#1} of Table~\ref{#3} on page~\pageref{#3}.\\
  \fi
}
\newcommand\hspagedefll[3][]{\noindent
  \texttt{#2} {let}- or $\lambda$-bound on page~\pageref{#3}.\\}
\newcommand\hsfigdefll[3][]{%
  \noindent
  \ifx!#1!%
    \texttt{#2} {let}- or $\lambda$-bound in Figure~\ref{#3} on page~\pageref{#3}.\\
  \else
    \texttt{#2} {let}- or $\lambda$-bound on \lineref{#1} of Figure~\ref{#3} on page~\pageref{#3}.\\
  \fi
}

\newcommand\nothspagedef[3][]{\notdefd\ndpage{#1}{#2}{#3}}
\newcommand\nothsfigdef[3][]{\notdefd\ndfig{#1}{#2}{#3}}
\newcommand\nothslinedef[3][]{\notdefd\ndline{#1}{#2}{#3}}

\newcommand\ndpage[3]{\texttt{#2}~(p\pageref{#3})}
\newcommand\ndfig[3]{\texttt{#2}~(Fig~\ref{#3},~p\pageref{#3})}
\newcommand\ndline[3]{%
  \ifx!#1!%
    \ndfig{#1}{#2}{#3}%
  \else
    \texttt{#2}~(Fig~\ref{#3}, line~\lineref{#1}, p\pageref{#3})%
  \fi
}

\newif\ifundefinedsection\undefinedsectionfalse

\newcommand\notdefd[4]{%
  \ifundefinedsection
    , #1{#2}{#3}{#4}%
  \else
    \undefinedsectiontrue
    \par
    \section{Undefined identifiers}
    #1{#2}{#3}{#4}%
  \fi
}

\begingroup
\raggedright

\input{defuse}%
\ifundefinedsection.\fi

\undefinedsectionfalse

\renewcommand\hsprelude[2]{\noindent
  \ifundefinedsection
    , \texttt{#1}%
  \else
    \undefinedsectiontrue
    \par
    \section{Identifiers defined in Haskell Prelude or a standard library}\label{sec:prelude}
    \texttt{#1}%
  \fi
}
\let\hspagedef\dropit
\let\omithspagedef\dropit
\let\omithsfigdef\dropit
\let\hsfigdef\dropit
\let\hstabdef\dropit
\let\hspagedefll\dropit
\let\hsfigdefll\dropit
\let\nothspagedef\dropit
\let\nothsfigdef\dropit
\let\nothslinedef\dropit

\input{defuse}
\ifundefinedsection.\fi

\endgroup

\iffalse

\section{Dataflow-engine functions}

\begin{figure*}
\setcounter{codeline}{0}
\begin{numberedcode}
\end{numberedcode}
\caption{The forward iterator}
\end{figure*}

\begin{figure*}
\setcounter{codeline}{0}
\begin{numberedcode}
\end{numberedcode}
\caption{The forward actualizer}
\end{figure*}

\fi

\end{document}


THE FUEL PROBLEM:

Here is the problem:

A graph has an entry sequence, a body, and an exit sequence.
Correctly computing facts on and flowing out of the body requires
iteration; computing facts on the entry and exit sequences does not, since
each is connected to the body by exactly one flow edge.

The problem is to provide the correct fuel supply to the combined
analysis/rewrite (iterator) functions, so that speculative rewriting
is limited by the fuel supply.

I will number iterations from 1 and name the fuel supplies as
follows:

  f_pre      fuel remaining before analysis/rewriting starts
  f_0        fuel remaining after analysis/rewriting of the entry sequence
  f_i, i>0   fuel remaining after iteration i of the body
  f_post     fuel remaining after analysis/rewriting of the exit sequence

The issue here is that only the last iteration of the body 'counts'.
To formalize, I will name fuel consumed:

  C_pre    fuel consumed by speculative rewrites in entry sequence
  C_i      fuel consumed by speculative rewrites in iteration i of body
  C_post   fuel consumed by speculative rewrites in exit sequence

These quantities should be related as follows:

  f_0    = f_pre - C_pre
  f_i    = f_0 - C_i        where i > 0
  f_post = f_n - C_post     where iteration converges after n steps

When the fuel supply is passed explicitly as parameter and result, it
is fairly easy to see how to keep reusing f_0 at every iteration, then
extract f_n for use before the exit sequence. It is not obvious to me
how to do it cleanly using the fuel monad.


Norman