\documentclass[twocolumn]{article}
\usepackage{vmargin,mathpartir,times,mathptm,graphicx}
% l2h substitution PAL `C--`
\renewcommand{\ttdefault}{aett}
\setcounter{secnumdepth}{0}
\usepackage{verbatim} % allows to define \begin{smallcode}
\newenvironment{code}{\par\unskip\kern-6pt \small\verbatim}{\endverbatim}
\newenvironment{smallcode}{\par\unskip\footnotesize\verbatim}{\endverbatim}
%\newenvironment{fuzzcode}[1]{\par\unskip\hfuzz=#1 \verbatim}{\endverbatim}
%\newenvironment{smallfuzzcode}[1]{\par\unskip\small\hfuzz=#1 \verbatim}{\endverbatim}
\newcommand{\PAL}{\mbox{C\ttfamily-{}-}}
\newcommand\lrtl{\mbox{$\lambda$-RTL}}
% l2h substitution lrtl **lambda**-RTL
\setpapersize{USletter}
% left top right bottom
\setmarginsrb{0.93in}{0.5in}{0.93in}{0.4in}
{\headheight}{\headsep}{\footheight}{\footskip}
\columnsep=23pt
\pagestyle{headings}
\makeatletter
\let\nrlist\@listi
\def\@listi{\nrlist\parsep=0.5\parsep
\itemsep=0.5\itemsep\topsep=0.5\topsep
\parskip=0.5\parskip}
\let\@listI\@listi
\def\@oddhead{\hfil\smash{\raisebox{-10pt}{
\large
\begin{tabular}{c}
\Large {Hoopl} in four pages\\Norman Ramsey, Tufts University
\\(joint work with Jo\~ao Dias and Simon Peyton Jones)\vrule width 0pt
height 0pt depth 8pt
\end{tabular}}}\hfil}
\def\@oddfoot{\hfil\thepage\hfil}
\makeatother
\makeatletter
\newcommand\mysection[1]{%
\par
\vskip 0.5\baselineskip plus 2pt minus 1pt
\noindent{\raggedright\textbf{#1}}
\par
\vskip 0.3\baselineskip plus 2pt minus 1pt
\@afterindentfalse
}
\newcommand\triple[3]{\ensuremath{#1\;\{#2\}\;#3}}
\newcommand\implies{\Rightarrow}
\renewcommand\wp{\ensuremath{\mathit{wlp}}}
\let\wlp=\wp
\renewcommand\sp{\ensuremath{\mathit{sp}}}
\renewcommand\mysection[1]{%
\@startsection{section}{1}{\z@}{-0.5\baselineskip plus -2pt minus -1pt}%
{0.3\baselineskip plus 2pt minus 1pt}%
{\normalfont\raggedright\bfseries}}
\makeatother
\parindent=0pt
\parskip=2pt plus3pt
\newenvironment{twolist}{\itemize}{\enditemize}
\input{diagrams}
\begin{document}
\mysection*{Background}
In my student days I didn't care for dataflow analysis or
optimization.
Dataflow analysis was full of bit vectors and $\cap$~and~$\cup$
symbols and funny words like ``gen'' and ``kill.''
Optimization was even more chaotic: more bit vectors, plus
special-purpose, pointwise, stateful program transformations that
didn't seem to have anything to do with anything else (or each other).
Two things changed my mind: in 2002, Sorin Lerner, David Grove, and Craig
Chambers published a landmark paper which, among other things, showed
me a uniform way to think about program analysis and transformation;
and in 2004, after several years wrestling with traditional, mutable
representations of programs, John Dias and I had the idea of using an
\emph{applicative} control-flow graph based on Huet's zipper.
The applicative control-flow graph turned out to make it almost
trivial to implement the big innovation of Lerner, Grove, and
Chambers: speculative rewriting.
The key idea is an analogy between dataflow analysis and program
logic.
\mysection*{Program logic}
Compilers might be ugly, but semantics can be beautiful.
The rock on which we build is Tony Hoare's axiomatic semantics,
also known as ``Hoare logic,''
where
\begin{mathpar}
\triple P S Q
\end{mathpar}
says that if we execute command~$S$ in any state satisfying precondition~$P$, and
if it terminates, the final state satisfies postcondition~$Q$.
Tony laid out axioms and inference rules for this judgment, of which
the most suggestive are the strengthening and weakening rules:
\begin{mathpar}
\inferrule{P \implies P' \\ \triple {P'} S Q}
{\triple P S Q}
\inferrule{\triple P S {Q'} \\ Q' \implies Q \\ }
{\triple P S Q}
\end{mathpar}
These rules suggest there could be a \emph{best}
postcondition~$Q$---the one that implies all the others---and likewise
a \emph{best} precondition~$P$.
In fact they are Bob Floyd's
\emph{strongest verifiable consequent} (now called ``strongest postcondition'')
and Dijkstra's weakest (liberal) precondition.
And the most natural thing in the world is to try to turn Hoare's
relation into a function: to \emph{compute} \wp~or~\sp.
Unfortunately, in the presence of loops, these computations don't
terminate.
Dijkstra and Hoare got around the problem by forcing the programmer to
write down a \emph{loop invariant}:
\begin{mathpar}
\inferrule{\triple {I \land B} S I}
{\triple I {\mbox{\texttt{while} $B$ \texttt{do} $S$}} {I \land \lnot B}}
\end{mathpar}
Without a loop invariant, this approach seemed like a dead end---but
it isn't.
\mysection*{Dataflow analysis as program logic}
Dataflow analysis typically talks about ``states'' (sometimes
represented as bit vectors) and ``transfer functions.''
I~follow Lerner, Grove, and Chambers, who talk about ``dataflow facts.''
The connection with program logic is simple but breathtaking:\footnote
{As stated here, the connection is \emph{over} simple. More anon.}
\begin{itemize}
\item
A dataflow fact \emph{stands for} a logical formula.
Even better, every logical formula can be \emph{approximated} by a
dataflow fact.
\item
The transfer function for a forward dataflow analysis is a homomorphic
image of \sp\ on the dataflow fact.
And the transfer function for a backward dataflow analysis is a homomorphic
image of \wlp\ on the dataflow fact.
\end{itemize}
In other words, \emph{dataflow analysis is simply predicate
transformers applied to an impoverished program logic}.\footnote
{I'm sure David Schmidt means something similar when he says that
dataflow analysis is model checking of abstract interpretation, but
since I've never been able to understand any of those papers,
I~can't prove~it.}
A~``transfer function'' is a Curried function that takes program code
as an argument and returns a \emph{fact transformer}.
\newcommand\embed{\mathcal E}
\renewcommand\approx{\mathcal A}
% \newarrow{impliedBy} {<=}====
\kern-2\baselineskip
\begin{center}
\begin{diagram}
P' & \lImplies & P & \rTo^{\sp} & Q' & \rImplies & Q \\
& \luTo^{\embed} & \dTo ^ \approx & & & \ruTo^\embed & \\
& & f & \rTo^{\mbox{transfer}} & f' & & \\
\end{diagram}
\end{center}
Because of loops, there's a little more to it than that:
\begin{itemize}
\item
We associate a logical variable with each basic block,
and on each basic block
we run the fact transformers and get out an equation relating logical
variables.
In the presence of loops, the equations are mutually recursive.
\item
We solve the recursive equations \emph{constructively}, through the
method of successive approximations.
If we use a work-list method,
the analogy with iterative dataflow analysis is exact.
\end{itemize}
This observation has two important consequences:
\begin{enumerate}
\item
\label{cbottom}
The representation of dataflow facts, unlike that of
logical formulas, \emph{must} include a bottom element.
\item
\label{cterm}
Unlike the language of formulas,
the language of dataflow facts must be sufficiently impoverished that
there are no infinite ascending chains.
\end{enumerate}
Consequence~\ref{cbottom} gives us a starting point for the method of
successive approximations; consequence~\ref{cterm} ensures it terminates.
\mysection*{The genesis of Hoopl}
Here's why I've spent five years on this problem:
\begin{quote}
\emph{If
the analogy between program logic and dataflow analysis holds up,
we should be able to create optimizers that are powerful, fun to
build, easy to get right, and that are part of an intellectually
coherent family of program transformations.
}
\end{quote}
I've refined this notion into three hypotheses.
\emph{Hypothesis~\#1\quad}
Having a good story changes the way we should think
about dataflow analysis:
\begin{itemize}
\item
Instead of thinking about bit vectors, sets, mutation, gen, kill, and
all that, we should think about \emph{transforming dataflow facts},
which represent logical formulas.
Using a pure functional language and ``wholemeal programming'' as
advocated by Richard Bird, we can write the code the way we think
about problems.
\item
There are only two analyses: weakest preconditions and strongest
postconditions.
And there are an infinite number of ways to approximate formulas.
These approximations will be the source of the next 700 dataflow
analyses.
\end{itemize}
\emph{Hypothesis~\#2\quad}
The classical optimizations, which appear to be such a mess, can
be better understood as being composed from just three
transformations:
\begin{itemize}
\item
Substitution of equals for equals
\item
Elimination of redundant assignments
\item
\emph{Introduction} of redundant assignment
\end{itemize}
Substitution needs no introduction.
A~redundant assignment is the imperative analog of a let-bound
variable that does not appear free in the body: the binding can be
eliminated.
Redundant assignments are \emph{introduced} to enable the first two
transformations.
The simplest example is ``code motion.''
\emph{Hypothesis~\#3\quad}
We should change the way we
code:
\begin{itemize}
\item
If we understand predicate transformers, transfer functions should be
easy to write.
\item
If that diagram really commutes, it should be possible to \emph{test}
that it commutes.
\item
If we can pull our head out of the bit vectors and the other details,
it should be possible to tell beautiful new stories about all the old
optimizations.
\end{itemize}
To investigate these hypotheses, John Dias, Simon Peyton Jones, and~I
have created a reusable library called \emph{Hoopl:} a Higher Order
OPtimization Library.
Hoopl is intended for classical optimization of \emph{imperative} code
such as low-level intermediate code or machine instructions.
An~analysis or optimization written using Hoopl is a \emph{client}.
\mysection*{Representing control-flow graphs}
Authors of clients should be free to think great thoughts about
predicate transformers and fact transformers.
Hoopl keeps track of which predicates flow where.
To~simplify this process we \emph{statically type} the units of the IR
being optimized:
\begin{itemize}
\item
The target of \emph{any} control transfer must be labelled with a
unique \texttt{Label}.
A~labelled node may have any number of predecessors (including zero),
but it has exactly one successor.
In~a typical IR, a labelled node will contain \emph{only} the label.
\item
An ordinary computational node does no control flow; it has
exactly one predecessor and exactly one successor.
This case is the most common and is the simplest for the client,
because it reduces most directly to predicate transformers.
\item
A control-flow node has a unique predecessor, but it may have many
successors.
Such nodes includes calls, returns, and all forms of goto
(conditional, unconditional, and computed).
\emph{Every} successor must be a labelled node; Hoopl does not permit
``fallthrough.''
\end{itemize}
\subsection{Nodes, blocks and graphs; open and closed}
Every node is \emph{open or closed at entry}
and \emph{open or closed at exit}.
An \emph{open} point is one at which control may implicitly ``fall through;''
to transfer control at a \emph{closed} point requires an explicit
control-transfer instruction.
A~sequence of nodes is well typed only if whenever two nodes follow
one another in the sequence, both nodes are open at the point where
they touch.
Such a sequence is called a \emph{block} and can be written using
these constructors:
\begin{code}
data O -- Open
data C -- Closed
data Block n e x where
BFirst :: n C O -> Block n C O
BMiddle :: n O O -> Block n O O
BLast :: n O C -> Block n O C
BCat :: Block n e O -> Block n O x -> Block n e x
\end{code}
Blocks come in four shapes: open/open, open/closed, closed/open, and
closed/closed.
A~closed/closed block is a \emph{basic block} and cannot be further
extended with \texttt{BCat}.
Basic blocks are \emph{not} living dinosaurs; they are a fundamental
consequence of controlling predecessors and successors.
In honor of their position within a basic block,
a closed/open node is called a \emph{first node};
an open/open node is called a \emph{middle node};
and
an open/closed node is called a \emph{last node}.
A \emph{control-flow graph} is a collection of blocks.
Graphs also come in four shapes:
\begin{code}
data Graph n e x where
GNil :: Graph n O O
GUnit :: Block n O O -> Graph n O O
GMany :: MaybeO e (Block n O C)
-> Map Label (Block n C C)
-> MaybeO x (Block n C O)
-> Graph n e x
data MaybeO ex t where
JustO :: t -> MaybeO O t
NothingO :: MaybeO C t
\end{code}
Most blocks are closed/closed; a graph may contain at most one
open/closed \emph{entry sequence} and at most one closed/open
\emph{exit sequence}.
Also, an open/open sequence of middle nodes forms a graph
(by~\texttt{GUnit} or \texttt{GNil}).
Our implementation exploits the fact that the shape of every node and
block is known \emph{statically} from the context in which it occurs.
But our analysis and rewriting functions are all polymorphic in the
shape.
\mysection*{Dataflow passes}
Each dataflow analysis begins with a lattice of dataflow facts.
Hoopl proper does not need to know how to embed a fact into the
language of logical formulas or how to approximate a logical formula
by a fact.
It~needs only to be able to start at the bottom and to take the least
upper bound of a pair of facts.
(When it does take the least upper bound, however, it needs to know if
something changed.)
\begin{smallcode}
data DataflowLattice f = DataflowLattice
{ fact_bot :: f
, fact_extend :: JoinFun f
}
type JoinFun f
= Label -> OldFact f -> NewFact f -> (ChangeFlag, f)
-- the label argument is for debugging purposes only
newtype OldFact f = OldFact f
newtype NewFact f = NewFact f
data ChangeFlag = NoChange | SomeChange
\end{smallcode}
A forward transfer function takes a node of any shape and returns a
fact transformer:
\begin{code}
type FwdTransfer n f
= forall e x. n e x -> f -> Fact x f
type family Fact x f :: *
type instance Fact C f = FactBase f
type instance Fact O f = f
type FactBase f = Map Label f
\end{code}
The type definition uses a new feature of Haskell called \emph{type
families}.
A~type family is a type-level function; this one says that the
transfer function for a node that is open at the exit (and so has a
single successor) returns a single fact~\texttt{f}.
But the transfer function for a node that is \emph{closed} at the exit
could have multiple successors, so it returns a finite map from
successors' labels to facts.
This map is called a \emph{fact base}.
Clients can also \emph{rewrite} nodes:
\begin{smallcode}
type FwdRewrite n f
= forall e x. n e x -> f -> Maybe (FwdRes n f e x)
data FwdRes n f e x = FwdRes (AGraph n e x) (FwdRewrite n f)
-- result of a rewrite is a new graph and
-- a (possibly) new rewrite function
\end{smallcode}
If justified by the incoming fact, a rewrite function can
\emph{replace} a node with a graph, subject only to the requirement
that in any execution in which the incoming fact holds, the graph is
observationally equivalent to the node it replaces.
Rewrite functions implement all three kinds of transformations.
Combine all three and you get a \emph{forward dataflow pass}:
\begin{code}
data FwdPass n f
= FwdPass { fp_lattice :: DataflowLattice f
, fp_transfer :: FwdTransfer n f
, fp_rewrite :: FwdRewrite n f }
\end{code}
What Hoopl does for you is captured in one function:
\begin{smallcode}
analyzeAndRewriteFwd
:: (Edges n, LabelsPtr entries)
=> FwdPass n f
-> entries
-> Graph n e x
-> Fact e f
-> FuelMonad (Graph n e x, FactBase f, MaybeO x f)
\end{smallcode}
This function takes a forward pass, a graph with entry points,
and an incoming fact or fact~base.
It returns a rewritten graph, the fact associated with each label in
the rewritten graph, and if the graph is open at the exit, the fact
flowing out that exit.
Backward dataflow passes use the same lattice type and appropriate
types for transfer and rewrite functions.
\mysection*{Creating a client}
To create an analysis or optimization, you
\begin{enumerate}
\item
Decide on a set of dataflow facts and identify what logical formulas
they approximate.
\item
Write the transfer function as the homomorphic image of the predicate
transformer \wp~or~\sp.
\item
Possibly create a rewrite function that exploits the dataflow fact to
make code-improving transformations.
\end{enumerate}
One beautiful aspect of this approach is that once you have chosen
a representation of nodes, \emph{there is only one correct way to
write \wp~and~\sp}.
You can therefore build a deep understanding of these predicate
transformers and how to approximate them.
Perhaps one day we'll have automated tests or even proofs, such as
Sorin Lerner has developed at UCSD.
\mysection*{The critical bit I left out}
Dataflow analysis does \emph{more} than classical program logic:
it can reason about \emph{paths}.
Classical Hoare logic gives a predicate about states, and the
predicate applies to any state that the machine can be in at a
particular program point, \emph{no matter what path it took to get
there}.
In~the jargon of dataflow analysis, anything analogous to Hoare logic
is an ``all-paths problem.''
There are also ``any-paths problems,'' such as \emph{reachability}:
can a particular node be reached from the entry point? Or~from some
other node.
To decide the redundancy of an assignment to variable~$x$, we solve a ``backwards
any-path problem:'' is there any path from the assignment to a
\emph{use} of~$x$ such that the path is not cut by any other
assignment to~$x$.
As far as I know, it is not known if there is an analogy between path
problems and program logic.
I'm~keenly interested in this question.
I~find it telling that while we have a name for the set of paths
leaving a program
point---the point's \emph{continuation}---I don't know if we have a name
for the set of paths \emph{reaching} a program point.
Dataflow analyses can reason about history and about the future.
How~can we connect this reasoning to program semantics?
\mysection*{An example: dominators}
If \emph{every} path from the entry to
label \texttt{L} must pass through another label \texttt{D},
\texttt{D} \emph{dominates} \texttt{L}.
\texttt{D} is called a \emph{dominator} of \texttt{L}.
There are efficient special-purpose
algorithms for computing dominators, but we can also compute
dominators through a forward dataflow analysis.
The analysis is beautifully simple; it is based on an unpublished
paper by Cooper, Harvey, and Kennedy, who showed that their $O(N^2)$ version
outperforms the classical $O(E\;\log N)$ algorithm of Lengauer and
Tarjan.
\begin{enumerate}
\item
The dataflow fact at a point is a list of all the labels that dominate
the point:
\begin{code}
type DPath = [Label] -- path in dominator tree
\end{code}
The list is ordered such that every label is dominated by all its
successors.
The predicate associated with the dataflow fact is not a logical
formula; it is an assertion about the set of all paths from the entry to
the point at which the fact applies.
The analysis requires a bottom element that is not a list;
bottom corresponds to an assertion that a node that is not reachable
from the entry, i.e., no path exists from the entry to that node.
The bottom element is added by type \texttt{WithBot}:
\begin{code}
type Doms = WithBot DPath
\end{code}
\item
The transfer function is simple: given a first node, it adds the label
to the head of the list.
Transfer for a middle node is the identity, and
transfer for a last node distributes the fact to its successors:
\begin{code}
domFirst n = (entryLabel n :)
domMiddle _ = id
domLast = distributeFact
distributeFact :: Edges n => n O C -> f -> FactBase f
distributeFact n f
= mkFactBase [ (l, f) | l <- successors n ]
\end{code}
\item
The interesting part is the join function:
given two paths, it returns the longest common suffix:
\begin{code}
extend :: JoinFun DPath
extend _ (OldFact l) (NewFact l')
= (changeIf (l `lengthDiffers` j), j)
where j = lcs l l'
lcs :: [Label] -> [Label] -> [Label]
...
\end{code}
The \texttt{WithBot} type comes with a lifting function that returns
\texttt{JoinFun~Doms}.
\end{enumerate}
\subsection{Example results from dominator analysis}
The program is insertation sort, with two nested loops:
\begin{smallcode}
a := 0
goto L1
L1:
if (a != d) then goto L2 else goto L3
L6:
c[b] := e
goto L1
L2:
a := (a + 1)
b := (a - 1)
e := c[b]
goto L4
L4:
if ((b != 0) && (c[(b - 1)] > e)) then goto L5 else goto L6
L5:
c[b] := c[(b - 1)]
b := (b - 1)
goto L4
L3:
\end{smallcode}
\centerline{%
\includegraphics[scale=0.4]{dom.eps}%
}
\end{document}