Test Trac #9036
[ghc.git] / docs / comm / the-beast / desugar.html
1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
2 <html>
3 <head>
4 <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
5 <title>The GHC Commentary - Sugar Free: From Haskell To Core</title>
6 </head>
7
8 <body BGCOLOR="FFFFFF">
9 <h1>The GHC Commentary - Sugar Free: From Haskell To Core</h1>
10 <p>
11 Up until after type checking, GHC keeps the source program in an
12 abstract representation of Haskell source without removing any of the
13 syntactic sugar (such as, list comprehensions) that could easily be
14 represented by more primitive Haskell. This complicates part of the
15 front-end considerably as the abstract syntax of Haskell (as exported by
16 the module <a
17 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>)
18 is much more complex than a simplified representation close to, say, the
19 <a href="http://haskell.org/onlinereport/intro.html#sect1.2">Haskell
20 Kernel</a> would be. However, having a representation that is as close
21 as possible to the surface syntax simplifies the generation of clear
22 error messages. As GHC (quite in contrast to "conventional" compilers)
23 prints code fragments as part of error messages, the choice of
24 representation is especially important.
25 <p>
26 Nonetheless, as soon as the input has passed all static checks, it is
27 transformed into GHC's principal intermediate language that goes by the
28 name of <em>Core</em> and whose representation is exported by the
29 module <a
30 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs"><code>CoreSyn</code></a>.
31 All following compiler phases, except code generation operate on Core.
32 Due to Andrew Tolmach's effort, there is also an <a
33 href="http://www.haskell.org/ghc/docs/papers/core.ps.gz">external
34 representation for Core.</a>
35 <p>
36 The conversion of the compiled module from <code>HsSyn</code> into that
37 of <code>CoreSyn</code> is performed by a phase called the
38 <em>desugarer</em>, which is located in
39 <a
40 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/"><code>fptools/ghc/compiler/deSugar/</code></a>.
41 It's operation is detailed in the following.
42 </p>
43
44 <h2>Auxilliary Functions</h2>
45 <p>
46 The modules <a
47 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMonad.lhs"><code>DsMonad</code></a>
48 defines the desugarer monad (of type <code>DsM</code>) which maintains
49 the environment needed for desugaring. In particular, it encapsulates a
50 unique supply for generating new variables, a map to lookup standard
51 names (such as functions from the prelude), a source location for error
52 messages, and a pool to collect warning messages generated during
53 desugaring. Initialisation of the environment happens in the function <a
54 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Desugar.lhs"><code>Desugar</code></a><code>.desugar</code>,
55 which is also the main entry point into the desugarer.
56 <p>
57 The generation of Core code often involves the use of standard functions
58 for which proper identifiers (i.e., values of type <code>Id</code> that
59 actually refer to the definition in the right Prelude) need to be
60 obtained. This is supported by the function
61 <code>DsMonad.dsLookupGlobalValue :: Name -> DsM Id</code>.
62
63 <h2><a name="patmat">Pattern Matching</a></h2>
64 <p>
65 Nested pattern matching with guards and everything is translated into
66 the simple, flat case expressions of Core by the following modules:
67 <dl>
68 <dt><a
69 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a>:
70 <dd>This modules contains the main pattern-matching compiler in the form
71 of a function called <code>match</code>. There is some documentation
72 as to how <code>match</code> works contained in the module itself.
73 Generally, the implemented algorithm is similar to the one described
74 in Phil Wadler's Chapter ? of Simon Peyton Jones' <em>The
75 Implementation of Functional Programming Languages</em>.
76 <code>Match</code> exports a couple of functions with not really
77 intuitive names. In particular, it exports <code>match</code>,
78 <code>matchWrapper</code>, <code>matchExport</code>, and
79 <code>matchSimply</code>. The function <code>match</code>, which is
80 the main work horse, is only used by the other matching modules. The
81 function <code>matchExport</code> - despite it's name - is merely used
82 internally in <code>Match</code> and handles warning messages (see
83 below for more details). The actual interface to the outside is
84 <code>matchWrapper</code>, which converts the output of the type
85 checker into the form needed by the pattern matching compiler (i.e., a
86 list of <code>EquationInfo</code>). Similar in function to
87 <code>matchWrapper</code> is <code>matchSimply</code>, which provides
88 an interface for the case where a single expression is to be matched
89 against a single pattern (as, for example, is the case in bindings in
90 a <code>do</code> expression).
91 <dt><a
92 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchCon.lhs"><code>MatchCon</code></a>:
93 <dd>This module generates code for a set of alternative constructor
94 patterns that belong to a single type by means of the routine
95 <code>matchConFamily</code>. More precisely, the routine gets a set
96 of equations where the left-most pattern of each equation is a
97 constructor pattern with a head symbol from the same type as that of
98 all the other equations. A Core case expression is generated that
99 distinguihes between all these constructors. The routine is clever
100 enough to generate a sparse case expression and to add a catch-all
101 default case only when needed (i.e., if the case expression isn't
102 exhaustive already). There is also an explanation at the start of the
103 modules.
104 <dt><a
105 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchLit.lhs"><code>MatchLit</code></a>:
106 <dd>Generates code for a set of alternative literal patterns by means of
107 the routine <code>matchLiterals</code>. The principle is similar to
108 that of <code>matchConFamily</code>, but all left-most patterns are
109 literals of the same type.
110 <dt><a
111 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a>:
112 <dd>This module provides a set of auxilliary definitions as well as the
113 data types <code>EquationInfo</code> and <code>MatchResult</code> that
114 form the input and output, respectively, of the pattern matching
115 compiler.
116 <dt><a
117 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>:
118 <dd>This module does not really contribute the compiling pattern
119 matching, but it inspects sets of equations to find whether there are
120 any overlapping patterns or non-exhaustive pattern sets. This task is
121 implemented by the function <code>check</code>, which returns a list of
122 patterns that are part of a non-exhaustive case distinction as well as a
123 set of equation labels that can be reached during execution of the code;
124 thus, the remaining equations are shadowed due to overlapping patterns.
125 The function <code>check</code> is invoked and its result converted into
126 suitable warning messages by the function <code>Match.matchExport</code>
127 (which is a wrapper for <code>Match.match</code>).
128 </dl>
129 <p>
130 The central function <code>match</code>, given a set of equations,
131 proceeds in a number of steps:
132 <ol>
133 <li>It starts by desugaring the left-most pattern of each equation using
134 the function <code>tidy1</code> (indirectly via
135 <code>tidyEqnInfo</code>). During this process, non-elementary
136 pattern (e.g., those using explicit list syntax <code>[x, y, ...,
137 z]</code>) are converted to a standard constructor pattern and also
138 irrefutable pattern are removed.
139 <li>Then, a process called <em>unmixing</em> clusters the equations into
140 blocks (without re-ordering them), such that the left-most pattern of
141 all equations in a block are either all variables, all literals, or
142 all constructors.
143 <li>Each block is, then, compiled by <code>matchUnmixedEqns</code>,
144 which forwards the handling of literal pattern blocks to
145 <code>MatchLit.matchLiterals</code>, of constructor pattern blocks to
146 <code>MatchCon.matchConFamily</code>, and hands variable pattern
147 blocks back to <code>match</code>.
148 </ol>
149
150 <p><hr><small>
151 <!-- hhmts start -->
152 Last modified: Mon Feb 11 22:35:25 EST 2002
153 <!-- hhmts end -->
154 </small>
155 </body>
156 </html>