1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <para>
3 <indexterm><primary>language, GHC</primary></indexterm>
4 <indexterm><primary>extensions, GHC</primary></indexterm>
5 As with all known Haskell systems, GHC implements some extensions to
6 the language. They can all be enabled or disabled by command line flags
7 or language pragmas. By default GHC understands the most recent Haskell
8 version it supports, plus a handful of extensions.
9 </para>
10
11 <para>
12 Some of the Glasgow extensions serve to give you access to the
13 underlying facilities with which we implement Haskell. Thus, you can
14 get at the Raw Iron, if you are willing to write some non-portable
15 code at a more primitive level. You need not be &ldquo;stuck&rdquo;
16 on performance because of the implementation costs of Haskell's
17 &ldquo;high-level&rdquo; features&mdash;you can always code
18 &ldquo;under&rdquo; them. In an extreme case, you can write all your
19 time-critical code in C, and then just glue it together with Haskell!
20 </para>
21
22 <para>
23 Before you get too carried away working at the lowest level (e.g.,
24 sloshing <literal>MutableByteArray&num;</literal>s around your
25 program), you may wish to check if there are libraries that provide a
26 &ldquo;Haskellised veneer&rdquo; over the features you want. The
27 separate <ulink url="../libraries/index.html">libraries
28 documentation</ulink> describes all the libraries that come with GHC.
29 </para>
30
31 <!-- LANGUAGE OPTIONS -->
32 <sect1 id="options-language">
33 <title>Language options</title>
34
35 <indexterm><primary>language</primary><secondary>option</secondary>
36 </indexterm>
37 <indexterm><primary>options</primary><secondary>language</secondary>
38 </indexterm>
39 <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
40 </indexterm>
41
42 <para>The language option flags control which variations of the language are
43 permitted.</para>
44
45 <para>Language options can be controlled in two ways:
46 <itemizedlist>
47 <listitem><para>Every language option can be switched on by a command-line flag "<option>-X...</option>"
48 (e.g. <option>-XTemplateHaskell</option>), and switched off by the flag "<option>-XNo...</option>"
49 (e.g. <option>-XNoTemplateHaskell</option>).</para></listitem>
50 <listitem><para>
51 Language options recognised by Cabal can also be enabled using the <literal>LANGUAGE</literal> pragma,
52 thus <literal>{-# LANGUAGE TemplateHaskell #-}</literal> (see <xref linkend="language-pragma"/>). </para>
53 </listitem>
54 </itemizedlist></para>
55
56 <para>The flag <option>-fglasgow-exts</option>
57 <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
58 is equivalent to enabling the following extensions:
59 &what_glasgow_exts_does;
60 Enabling these options is the <emphasis>only</emphasis>
61 effect of <option>-fglasgow-exts</option>.
62 We are trying to move away from this portmanteau flag,
63 and towards enabling features individually.</para>
64
65 </sect1>
66
67 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
68 <sect1 id="primitives">
69 <title>Unboxed types and primitive operations</title>
70
71 <para>GHC is built on a raft of primitive data types and operations;
72 "primitive" in the sense that they cannot be defined in Haskell itself.
73 While you really can use this stuff to write fast code,
74 we generally find it a lot less painful, and more satisfying in the
75 long run, to use higher-level language features and libraries. With
76 any luck, the code you write will be optimised to the efficient
77 unboxed version in any case. And if it isn't, we'd like to know
78 about it.</para>
79
80 <para>All these primitive data types and operations are exported by the
81 library <literal>GHC.Prim</literal>, for which there is
82 <ulink url="&libraryGhcPrimLocation;/GHC-Prim.html">detailed online documentation</ulink>.
83 (This documentation is generated from the file <filename>compiler/prelude/primops.txt.pp</filename>.)
84 </para>
85
86 <para>
87 If you want to mention any of the primitive data types or operations in your
88 program, you must first import <literal>GHC.Prim</literal> to bring them
89 into scope. Many of them have names ending in "&num;", and to mention such
90 names you need the <option>-XMagicHash</option> extension (<xref linkend="magic-hash"/>).
91 </para>
92
93 <para>The primops make extensive use of <link linkend="glasgow-unboxed">unboxed types</link>
94 and <link linkend="unboxed-tuples">unboxed tuples</link>, which
95 we briefly summarise here. </para>
96
97 <sect2 id="glasgow-unboxed">
98 <title>Unboxed types</title>
99
100 <para>
101 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
102 </para>
103
104 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
105 that values of that type are represented by a pointer to a heap
106 object. The representation of a Haskell <literal>Int</literal>, for
107 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
108 type, however, is represented by the value itself; no pointers or heap
109 allocation are involved.
110 </para>
111
112 <para>
113 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
114 would use in C: <literal>Int&num;</literal> (long int),
115 <literal>Double&num;</literal> (double), <literal>Addr&num;</literal>
116 (void *), etc. The <emphasis>primitive operations</emphasis>
117 (PrimOps) on these types are what you might expect; e.g.,
118 <literal>(+&num;)</literal> is addition on
119 <literal>Int&num;</literal>s, and is the machine-addition that we all
120 know and love&mdash;usually one instruction.
121 </para>
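<para>
As a minimal sketch of how these primitive operations might be used, the
following unpacks two boxed <literal>Int</literal>s, adds the raw values with
<literal>(+#)</literal>, and re-boxes the result. The name
<literal>addInts</literal> is made up for illustration, and the sketch assumes
the primitives are imported via <literal>GHC.Exts</literal>, which re-exports
them together with the <literal>I#</literal> constructor:
</para>
<programlisting>
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), (+#))

-- addInts is a hypothetical helper, not part of any library.
-- Pattern matching on I# exposes the underlying Int# values;
-- (+#) adds them without touching the heap; I# re-boxes the sum.
addInts :: Int -> Int -> Int
addInts (I# x) (I# y) = I# (x +# y)
</programlisting>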
122
123 <para>
124 Primitive (unboxed) types cannot be defined in Haskell, and are
125 therefore built into the language and compiler. Primitive types are
126 always unlifted; that is, a value of a primitive type cannot be
127 bottom. We use the convention (but it is only a convention)
128 that primitive types, values, and
129 operations have a <literal>&num;</literal> suffix (see <xref linkend="magic-hash"/>).
130 For some primitive types we have special syntax for literals, also
131 described in the <link linkend="magic-hash">same section</link>.
132 </para>
133
134 <para>
135 Primitive values are often represented by a simple bit-pattern, such
136 as <literal>Int&num;</literal>, <literal>Float&num;</literal>,
137 <literal>Double&num;</literal>. But this is not necessarily the case:
138 a primitive value might be represented by a pointer to a
139 heap-allocated object. Examples include
140 <literal>Array&num;</literal>, the type of primitive arrays. A
141 primitive array is heap-allocated because it is too big a value to fit
142 in a register, and would be too expensive to copy around; in a sense,
143 it is accidental that it is represented by a pointer. If a pointer
144 represents a primitive value, then it really does point to that value:
145 no unevaluated thunks, no indirections&hellip;nothing other than the
146 primitive value can be at the other end of the pointer.
147 A numerically-intensive program using unboxed types can
148 go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
149 counterpart&mdash;we saw a threefold speedup on one example.
150 </para>
151
152 <para>
153 There are some restrictions on the use of primitive types:
154 <itemizedlist>
155 <listitem><para>The main restriction
156 is that you can't pass a primitive value to a polymorphic
157 function or store one in a polymorphic data type. This rules out
158 things like <literal>[Int&num;]</literal> (i.e. lists of primitive
159 integers). The reason for this restriction is that polymorphic
160 arguments and constructor fields are assumed to be pointers: if an
161 unboxed integer is stored in one of these, the garbage collector would
162 attempt to follow it, leading to unpredictable space leaks. Or a
163 <function>seq</function> operation on the polymorphic component may
164 attempt to dereference the pointer, with disastrous results. Even
165 worse, the unboxed value might be larger than a pointer
166 (<literal>Double&num;</literal> for instance).
167 </para>
168 </listitem>
169 <listitem><para> You cannot define a newtype whose representation type
170 (the argument type of the data constructor) is an unboxed type. Thus,
171 this is illegal:
172 <programlisting>
173 newtype A = MkA Int#
174 </programlisting>
175 </para></listitem>
176 <listitem><para> You cannot bind a variable with an unboxed type
177 in a <emphasis>top-level</emphasis> binding.
178 </para></listitem>
179 <listitem><para> You cannot bind a variable with an unboxed type
180 in a <emphasis>recursive</emphasis> binding.
181 </para></listitem>
182 <listitem><para> You may bind unboxed variables in a (non-recursive,
183 non-top-level) pattern binding, but you must make any such pattern-match
184 strict. For example, rather than:
185 <programlisting>
186 data Foo = Foo Int Int#
187
188 f x = let (Foo a b, w) = ..rhs.. in ..body..
189 </programlisting>
190 you must write:
191 <programlisting>
192 data Foo = Foo Int Int#
193
194 f x = let !(Foo a b, w) = ..rhs.. in ..body..
195 </programlisting>
196 since <literal>b</literal> has type <literal>Int#</literal>.
197 </para>
198 </listitem>
199 </itemizedlist>
200 </para>
201
202 </sect2>
203
204 <sect2 id="unboxed-tuples">
205 <title>Unboxed tuples</title>
206
207 <para>
208 Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>;
209 they are a syntactic extension enabled by the language flag <option>-XUnboxedTuples</option>. An
210 unboxed tuple looks like this:
211 </para>
212
213 <para>
214
215 <programlisting>
216 (# e_1, ..., e_n #)
217 </programlisting>
218
219 </para>
220
221 <para>
222 where <literal>e&lowbar;1..e&lowbar;n</literal> are expressions of any
223 type (primitive or non-primitive). The type of an unboxed tuple looks
224 the same.
225 </para>
226
227 <para>
228 Note that when unboxed tuples are enabled,
229 <literal>(#</literal> is a single lexeme, so for example when using
230 operators like <literal>#</literal> and <literal>#-</literal> you need
231 to write <literal>( # )</literal> and <literal>( #- )</literal> rather than
232 <literal>(#)</literal> and <literal>(#-)</literal>.
233 </para>
234
235 <para>
236 Unboxed tuples are used for functions that need to return multiple
237 values, but they avoid the heap allocation normally associated with
238 using fully-fledged tuples. When an unboxed tuple is returned, the
239 components are put directly into registers or on the stack; the
240 unboxed tuple itself does not have a composite representation. Many
241 of the primitive operations listed in <literal>primops.txt.pp</literal> return unboxed
242 tuples.
243 In particular, the <literal>IO</literal> and <literal>ST</literal> monads use unboxed
244 tuples to avoid unnecessary allocation during sequences of operations.
245 </para>
246
247 <para>
248 There are some restrictions on the use of unboxed tuples:
249 <itemizedlist>
250
251 <listitem>
252 <para>
253 Values of unboxed tuple types are subject to the same restrictions as
254 other unboxed types; i.e. they may not be stored in polymorphic data
255 structures or passed to polymorphic functions.
256 </para>
257 </listitem>
258
259 <listitem>
260 <para>
261 The typical use of unboxed tuples is simply to return multiple values,
262 binding those multiple results with a <literal>case</literal> expression, thus:
263 <programlisting>
264 f x y = (# x+1, y-1 #)
265 g x = case f x x of { (# a, b #) -&#62; a + b }
266 </programlisting>
267 You can have an unboxed tuple in a pattern binding, thus
268 <programlisting>
269 f x = let (# p,q #) = h x in ..body..
270 </programlisting>
271 If the types of <literal>p</literal> and <literal>q</literal> are not unboxed,
272 the resulting binding is lazy like any other Haskell pattern binding. The
273 above example desugars like this:
274 <programlisting>
275 f x = let t = case h x of { (# p,q #) -> (p,q) }
276 p = fst t
277 q = snd t
278 in ..body..
279 </programlisting>
280 Indeed, the bindings can even be recursive.
281 </para>
282 </listitem>
283 </itemizedlist>
284
285 </para>
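<para>
A slightly larger sketch of the same idiom, returning two results in an
unboxed tuple and consuming them with <literal>case</literal>. The names
<literal>quotRem'</literal> and <literal>digitSum</literal> are illustrative
only, and <literal>digitSum</literal> assumes a non-negative argument:
</para>
<programlisting>
{-# LANGUAGE UnboxedTuples #-}

-- Hypothetical helper: quotient and remainder returned together,
-- without allocating a boxed pair for the result.
quotRem' :: Int -> Int -> (# Int, Int #)
quotRem' n d = (# n `quot` d, n `rem` d #)

-- Sum of the decimal digits of a non-negative Int.
digitSum :: Int -> Int
digitSum n
  | n &lt; 10   = n
  | otherwise = case quotRem' n 10 of
                  (# q, r #) -> r + digitSum q
</programlisting>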
286
287 </sect2>
288 </sect1>
289
290
291 <!-- ====================== SYNTACTIC EXTENSIONS ======================= -->
292
293 <sect1 id="syntax-extns">
294 <title>Syntactic extensions</title>
295
296 <sect2 id="unicode-syntax">
297 <title>Unicode syntax</title>
298 <para>The language
299 extension <option>-XUnicodeSyntax</option><indexterm><primary><option>-XUnicodeSyntax</option></primary></indexterm>
300 enables Unicode characters to be used to stand for certain ASCII
301 character sequences. The following alternatives are provided:</para>
302
303 <informaltable>
304 <tgroup cols="4" align="left" colsep="1" rowsep="1">
305 <thead>
306 <row>
307 <entry>ASCII</entry>
308 <entry>Unicode alternative</entry>
309 <entry>Code point</entry>
310 <entry>Name</entry>
311 </row>
312 </thead>
313
314 <!--
315 to find the DocBook entities for these characters, find
316 the Unicode code point (e.g. 0x2237), and grep for it in
317 /usr/share/sgml/docbook/xml-dtd-*/ent/* (or equivalent on
318 your system). Some of these Unicode code points don't have
319 equivalent DocBook entities.
320 -->
321
322 <tbody>
323 <row>
324 <entry><literal>::</literal></entry>
325 <entry>&#x2237;</entry>
326 <entry>0x2237</entry>
327 <entry>PROPORTION</entry>
328 </row>
329 </tbody>
330 <tbody>
331 <row>
332 <entry><literal>=&gt;</literal></entry>
333 <entry>&rArr;</entry>
334 <entry>0x21D2</entry>
335 <entry>RIGHTWARDS DOUBLE ARROW</entry>
336 </row>
337 </tbody>
338 <tbody>
339 <row>
340 <entry><literal>forall</literal></entry>
341 <entry>&forall;</entry>
342 <entry>0x2200</entry>
343 <entry>FOR ALL</entry>
344 </row>
345 </tbody>
346 <tbody>
347 <row>
348 <entry><literal>-&gt;</literal></entry>
349 <entry>&rarr;</entry>
350 <entry>0x2192</entry>
351 <entry>RIGHTWARDS ARROW</entry>
352 </row>
353 </tbody>
354 <tbody>
355 <row>
356 <entry><literal>&lt;-</literal></entry>
357 <entry>&larr;</entry>
358 <entry>0x2190</entry>
359 <entry>LEFTWARDS ARROW</entry>
360 </row>
361 </tbody>
362
363 <tbody>
364 <row>
365 <entry>-&lt;</entry>
366 <entry>&#x2919;</entry>
367 <entry>0x2919</entry>
368 <entry>LEFTWARDS ARROW-TAIL</entry>
369 </row>
370 </tbody>
371
372 <tbody>
373 <row>
374 <entry>&gt;-</entry>
375 <entry>&#x291A;</entry>
376 <entry>0x291A</entry>
377 <entry>RIGHTWARDS ARROW-TAIL</entry>
378 </row>
379 </tbody>
380
381 <tbody>
382 <row>
383 <entry>-&lt;&lt;</entry>
384 <entry>&#x291B;</entry>
385 <entry>0x291B</entry>
386 <entry>LEFTWARDS DOUBLE ARROW-TAIL</entry>
387 </row>
388 </tbody>
389
390 <tbody>
391 <row>
392 <entry>&gt;&gt;-</entry>
393 <entry>&#x291C;</entry>
394 <entry>0x291C</entry>
395 <entry>RIGHTWARDS DOUBLE ARROW-TAIL</entry>
396 </row>
397 </tbody>
398
399 <tbody>
400 <row>
401 <entry>*</entry>
402 <entry>&starf;</entry>
403 <entry>0x2605</entry>
404 <entry>BLACK STAR</entry>
405 </row>
406 </tbody>
407
408 </tgroup>
409 </informaltable>
410 </sect2>
411
412 <sect2 id="magic-hash">
413 <title>The magic hash</title>
414 <para>The language extension <option>-XMagicHash</option> allows "&num;" as a
415 postfix modifier to identifiers. Thus, "x&num;" is a valid variable, and "T&num;" is
416 a valid type constructor or data constructor.</para>
417
418 <para>The hash sign does not change semantics at all. We tend to use variable
419 names ending in "&num;" for unboxed values or types (e.g. <literal>Int&num;</literal>),
420 but there is no requirement to do so; they are just plain ordinary variables.
421 Nor does the <option>-XMagicHash</option> extension bring anything into scope.
422 For example, to bring <literal>Int&num;</literal> into scope you must
423 import <literal>GHC.Prim</literal> (see <xref linkend="primitives"/>);
424 the <option>-XMagicHash</option> extension
425 then allows you to <emphasis>refer</emphasis> to the <literal>Int&num;</literal>
426 that is now in scope. Note that with this option, the meaning of <literal>x&num;y = 0</literal>
427 is changed: it defines a function <literal>x&num;</literal> taking a single argument <literal>y</literal>;
428 to define the operator <literal>&num;</literal>, put a space: <literal>x &num; y = 0</literal>.
429
430 </para>
431 <para> The <option>-XMagicHash</option> extension also enables some new forms of literals (see <xref linkend="glasgow-unboxed"/>):
432 <itemizedlist>
433 <listitem><para> <literal>'x'&num;</literal> has type <literal>Char&num;</literal>.</para> </listitem>
434 <listitem><para> <literal>&quot;foo&quot;&num;</literal> has type <literal>Addr&num;</literal>.</para> </listitem>
435 <listitem><para> <literal>3&num;</literal> has type <literal>Int&num;</literal>. In general,
436 any Haskell integer lexeme followed by a <literal>&num;</literal> is an <literal>Int&num;</literal> literal, e.g.
437 <literal>-0x3A&num;</literal> as well as <literal>32&num;</literal>.</para></listitem>
438 <listitem><para> <literal>3&num;&num;</literal> has type <literal>Word&num;</literal>. In general,
439 any non-negative Haskell integer lexeme followed by <literal>&num;&num;</literal>
440 is a <literal>Word&num;</literal>. </para> </listitem>
441 <listitem><para> <literal>3.2&num;</literal> has type <literal>Float&num;</literal>.</para> </listitem>
442 <listitem><para> <literal>3.2&num;&num;</literal> has type <literal>Double&num;</literal>.</para> </listitem>
443 </itemizedlist>
444 </para>
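<para>
To illustrate, here is a small sketch that wraps several of these literals in
their boxed counterparts. The binding names are invented for the example, and
the boxing constructors (<literal>C#</literal>, <literal>I#</literal>,
<literal>W#</literal>, <literal>F#</literal>, <literal>D#</literal>) are
assumed to be imported from <literal>GHC.Exts</literal>:
</para>
<programlisting>
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Char (C#), Double (D#), Float (F#), Int (I#), Word (W#))

-- Each binding boxes one primitive literal (names are illustrative).
boxedChar :: Char
boxedChar = C# 'x'#

boxedInt :: Int
boxedInt = I# (-0x3A#)      -- an Int# literal may carry a sign

boxedWord :: Word
boxedWord = W# 3##

boxedFloat :: Float
boxedFloat = F# 3.2#

boxedDouble :: Double
boxedDouble = D# 3.2##
</programlisting>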
445 </sect2>
446
447 <sect2 id="negative-literals">
448 <title>Negative literals</title>
449 <para>
450 The literal <literal>-123</literal> is, according to
451 Haskell 98 and Haskell 2010, desugared as
452 <literal>negate (fromInteger 123)</literal>.
453 The language extension <option>-XNegativeLiterals</option>
454 means that it is instead desugared as
455 <literal>fromInteger (-123)</literal>.
456 </para>
457
458 <para>
459 This can make a difference when the positive and negative range of
460 a numeric data type don't match up. For example,
461 in 8-bit arithmetic -128 is representable, but +128 is not.
462 So <literal>negate (fromInteger 128)</literal> will elicit an
463 unexpected integer-literal-overflow message.
464 </para>
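<para>
A minimal sketch of the 8-bit case, using <literal>Int8</literal> from
<literal>Data.Int</literal> (the name <literal>minInt8</literal> is
illustrative). With the extension enabled the literal is desugared as
<literal>fromInteger (-128)</literal>, so 128 never has to be
represented as an <literal>Int8</literal>:
</para>
<programlisting>
{-# LANGUAGE NegativeLiterals #-}

import Data.Int (Int8)

-- Desugars to fromInteger (-128), which is within the Int8 range.
minInt8 :: Int8
minInt8 = -128
</programlisting>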
465 </sect2>
466
467 <sect2 id="num-decimals">
468 <title>Fractional looking integer literals</title>
469 <para>
470 Haskell 2010 and Haskell 98 define floating literals with
471 the syntax <literal>1.2e6</literal>. These literals have the
472 type <literal>Fractional a => a</literal>.
473 </para>
474
475 <para>
476 The language extension <option>-XNumDecimals</option> allows
477 you to also use the floating literal syntax for instances of
478 <literal>Integral</literal>, and have values like
479 <literal>(1.2e6 :: Num a => a)</literal>.
480 </para>
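<para>
For instance, the following sketch gives exponent-notation literals an
<literal>Integer</literal> type. The binding names are illustrative, and the
literals are chosen so that their values are whole numbers:
</para>
<programlisting>
{-# LANGUAGE NumDecimals #-}

-- Floating literal syntax used at an integral type.
million :: Integer
million = 1e6

population :: Integer
population = 1.2e6
</programlisting>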
481 </sect2>
482
483 <sect2 id="binary-literals">
484 <title>Binary integer literals</title>
485 <para>
486 Haskell 2010 and Haskell 98 allow integer literals to
487 be given in decimal, octal (prefixed by
488 <literal>0o</literal> or <literal>0O</literal>), or
489 hexadecimal notation (prefixed by <literal>0x</literal> or
490 <literal>0X</literal>).
491 </para>
492
493 <para>
494 The language extension <option>-XBinaryLiterals</option>
495 adds support for expressing integer literals in binary
496 notation with the prefix <literal>0b</literal> or
497 <literal>0B</literal>. For instance, the binary integer
498 literal <literal>0b11001001</literal> will be desugared into
499 <literal>fromInteger 201</literal> when
500 <option>-XBinaryLiterals</option> is enabled.
501 </para>
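<para>
A short sketch of the literal from the paragraph above in use (the name
<literal>mask</literal> is illustrative):
</para>
<programlisting>
{-# LANGUAGE BinaryLiterals #-}

-- 0b11001001 = 128 + 64 + 8 + 1 = 201
mask :: Int
mask = 0b11001001
</programlisting>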
502 </sect2>
503
504 <!-- ====================== HIERARCHICAL MODULES ======================= -->
505
506
507 <sect2 id="hierarchical-modules">
508 <title>Hierarchical Modules</title>
509
510 <para>GHC supports a small extension to the syntax of module
511 names: a module name is allowed to contain a dot
512 <literal>&lsquo;.&rsquo;</literal>. This is also known as the
513 &ldquo;hierarchical module namespace&rdquo; extension, because
514 it extends the normally flat Haskell module namespace into a
515 more flexible hierarchy of modules.</para>
516
517 <para>This extension has very little impact on the language
518 itself; module names are <emphasis>always</emphasis> fully
519 qualified, so you can just think of the fully qualified module
520 name as <quote>the module name</quote>. In particular, this
521 means that the full module name must be given after the
522 <literal>module</literal> keyword at the beginning of the
523 module; for example, the module <literal>A.B.C</literal> must
524 begin</para>
525
526 <programlisting>module A.B.C</programlisting>
527
528
529 <para>It is a common strategy to use the <literal>as</literal>
530 keyword to save some typing when using qualified names with
531 hierarchical modules. For example:</para>
532
533 <programlisting>
534 import qualified Control.Monad.ST.Strict as ST
535 </programlisting>
536
537 <para>For details on how GHC searches for source and interface
538 files in the presence of hierarchical modules, see <xref
539 linkend="search-path"/>.</para>
540
541 <para>GHC comes with a large collection of libraries arranged
542 hierarchically; see the accompanying <ulink
543 url="../libraries/index.html">library
544 documentation</ulink>. More libraries to install are available
545 from <ulink
546 url="http://hackage.haskell.org/packages/hackage.html">HackageDB</ulink>.</para>
547 </sect2>
548
549 <!-- ====================== PATTERN GUARDS ======================= -->
550
551 <sect2 id="pattern-guards">
552 <title>Pattern guards</title>
553
554 <para>
555 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
556 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ulink url="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ulink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
557 </para>
558
559 <para>
560 Suppose we have an abstract data type of finite maps, with a
561 lookup operation:
562
563 <programlisting>
564 lookup :: FiniteMap -> Int -> Maybe Int
565 </programlisting>
566
567 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
568 where <varname>v</varname> is the value that the key maps to. Now consider the following definition:
569 </para>
570
571 <programlisting>
572 clunky env var1 var2 | ok1 &amp;&amp; ok2 = val1 + val2
573 | otherwise = var1 + var2
574 where
575 m1 = lookup env var1
576 m2 = lookup env var2
577 ok1 = maybeToBool m1
578 ok2 = maybeToBool m2
579 val1 = expectJust m1
580 val2 = expectJust m2
581 </programlisting>
582
583 <para>
584 The auxiliary functions are
585 </para>
586
587 <programlisting>
588 maybeToBool :: Maybe a -&gt; Bool
589 maybeToBool (Just x) = True
590 maybeToBool Nothing = False
591
592 expectJust :: Maybe a -&gt; a
593 expectJust (Just x) = x
594 expectJust Nothing = error "Unexpected Nothing"
595 </programlisting>
596
597 <para>
598 What is <function>clunky</function> doing? The guard <literal>ok1 &amp;&amp;
599 ok2</literal> checks that both lookups succeed, using
600 <function>maybeToBool</function> to convert the <function>Maybe</function>
601 types to booleans. The (lazily evaluated) <function>expectJust</function>
602 calls extract the values from the results of the lookups, and bind the
603 returned values to <varname>val1</varname> and <varname>val2</varname>
604 respectively. If either lookup fails, then clunky takes the
605 <literal>otherwise</literal> case and returns the sum of its arguments.
606 </para>
607
608 <para>
609 This is certainly legal Haskell, but it is a tremendously verbose and
610 un-obvious way to achieve the desired effect. Arguably, a more direct way
611 to write clunky would be to use case expressions:
612 </para>
613
614 <programlisting>
615 clunky env var1 var2 = case lookup env var1 of
616 Nothing -&gt; fail
617 Just val1 -&gt; case lookup env var2 of
618 Nothing -&gt; fail
619 Just val2 -&gt; val1 + val2
620 where
621 fail = var1 + var2
622 </programlisting>
623
624 <para>
625 This is a bit shorter, but hardly better. Of course, we can rewrite any set
626 of pattern-matching, guarded equations as case expressions; that is
627 precisely what the compiler does when compiling equations! The reason that
628 Haskell provides guarded equations is that they allow us to write down
629 the cases we want to consider, one at a time, independently of each other.
630 This structure is hidden in the case version. Two of the right-hand sides
631 are really the same (<function>fail</function>), and the whole expression
632 tends to become more and more indented.
633 </para>
634
635 <para>
636 Here is how I would write clunky:
637 </para>
638
639 <programlisting>
640 clunky env var1 var2
641 | Just val1 &lt;- lookup env var1
642 , Just val2 &lt;- lookup env var2
643 = val1 + val2
644 ...other equations for clunky...
645 </programlisting>
646
647 <para>
648 The semantics should be clear enough. The qualifiers are matched in order.
649 For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
650 right hand side is evaluated and matched against the pattern on the left.
651 If the match fails then the whole guard fails and the next equation is
652 tried. If it succeeds, then the appropriate binding takes place, and the
653 next qualifier is matched, in the augmented environment. Unlike list
654 comprehensions, however, the type of the expression to the right of the
655 <literal>&lt;-</literal> is the same as the type of the pattern to its
656 left. The bindings introduced by pattern guards scope over all the
657 remaining guard qualifiers, and over the right hand side of the equation.
658 </para>
659
660 <para>
661 Just as with list comprehensions, boolean expressions can be freely mixed
662 among the pattern guards. For example:
663 </para>
664
665 <programlisting>
666 f x | [y] &lt;- x
667 , y > 3
668 , Just z &lt;- h y
669 = ...
670 </programlisting>
671
672 <para>
673 Haskell's current guards therefore emerge as a special case, in which the
674 qualifier list has just one element, a boolean expression.
675 </para>
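<para>
For experimentation, here is a self-contained sketch of
<function>clunky</function> written with pattern guards. It assumes that the
abstract finite map is replaced by a plain association list and that the
Prelude's <function>lookup</function> (which takes the key first) stands in
for the lookup operation above; neither choice comes from the original
proposal:
</para>
<programlisting>
-- Assumption: an association list plays the role of FiniteMap.
clunky :: [(Int, Int)] -> Int -> Int -> Int
clunky env var1 var2
  | Just val1 &lt;- lookup var1 env   -- pattern guard: binds val1 on success
  , Just val2 &lt;- lookup var2 env   -- val1 is in scope here
  = val1 + val2
  | otherwise = var1 + var2          -- either lookup failed
</programlisting>
<para>
With this definition, <literal>clunky [(1,10),(2,20)] 1 2</literal> takes the
first equation, whereas <literal>clunky [(1,10)] 1 2</literal> falls through
to the <literal>otherwise</literal> case.
</para>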
676 </sect2>
677
678 <!-- ===================== View patterns =================== -->
679
680 <sect2 id="view-patterns">
681 <title>View patterns
682 </title>
683
684 <para>
685 View patterns are enabled by the flag <literal>-XViewPatterns</literal>.
686 More information and examples of view patterns can be found on the
687 <ulink url="http://ghc.haskell.org/trac/ghc/wiki/ViewPatterns">Wiki
688 page</ulink>.
689 </para>
690
691 <para>
692 View patterns are somewhat like pattern guards that can be nested inside
693 of other patterns. They are a convenient way of pattern-matching
694 against values of abstract types. For example, in a programming language
695 implementation, we might represent the syntax of the types of the
696 language as follows:
697
698 <programlisting>
699 type Typ
700
701 data TypView = Unit
702 | Arrow Typ Typ
703
704 view :: Typ -> TypView
705
706 -- additional operations for constructing Typ's ...
707 </programlisting>
708
709 The representation of Typ is held abstract, permitting implementations
710 to use a fancy representation (e.g., hash-consing to manage sharing).
711
712 Without view patterns, using this signature is a little inconvenient:
713 <programlisting>
714 size :: Typ -> Integer
715 size t = case view t of
716 Unit -> 1
717 Arrow t1 t2 -> size t1 + size t2
718 </programlisting>
719
720 It is necessary to iterate the case, rather than using an equational
721 function definition. And the situation is even worse when the matching
722 against <literal>t</literal> is buried deep inside another pattern.
723 </para>
724
725 <para>
726 View patterns permit calling the view function inside the pattern and
727 matching against the result:
728 <programlisting>
729 size (view -> Unit) = 1
730 size (view -> Arrow t1 t2) = size t1 + size t2
731 </programlisting>
732
733 That is, we add a new form of pattern, written
734 <replaceable>expression</replaceable> <literal>-></literal>
735 <replaceable>pattern</replaceable> that means "apply the expression to
736 whatever we're trying to match against, and then match the result of
737 that application against the pattern". The expression can be any Haskell
738 expression of function type, and view patterns can be used wherever
739 patterns are used.
740 </para>
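<para>
To try this out, the abstract type needs some concrete representation. The
sketch below assumes a deliberately simple one; the constructors
<literal>TUnit</literal> and <literal>TArrow</literal> are hypothetical
stand-ins for whatever representation a real implementation would hide:
</para>
<programlisting>
{-# LANGUAGE ViewPatterns #-}

-- Hypothetical concrete representation of the abstract type Typ.
data Typ = TUnit | TArrow Typ Typ

data TypView = Unit
             | Arrow Typ Typ

view :: Typ -> TypView
view TUnit        = Unit
view (TArrow a b) = Arrow a b

-- Each equation applies view to its argument and matches on the result.
size :: Typ -> Integer
size (view -> Unit)        = 1
size (view -> Arrow t1 t2) = size t1 + size t2
</programlisting>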
741
742 <para>
743 The semantics of a pattern <literal>(</literal>
744 <replaceable>exp</replaceable> <literal>-></literal>
745 <replaceable>pat</replaceable> <literal>)</literal> are as follows:
746
747 <itemizedlist>
748
749 <listitem> Scoping:
750
751 <para>The variables bound by the view pattern are the variables bound by
752 <replaceable>pat</replaceable>.
753 </para>
754
755 <para>
756 Any variables in <replaceable>exp</replaceable> are bound occurrences,
757 but variables bound "to the left" in a pattern are in scope. This
758 feature permits, for example, one argument to a function to be used in
759 the view of another argument. For example, the function
760 <literal>clunky</literal> from <xref linkend="pattern-guards" /> can be
761 written using view patterns as follows:
762
763 <programlisting>
764 clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2
765 ...other equations for clunky...
766 </programlisting>
767 </para>
768
769 <para>
770 More precisely, the scoping rules are:
771 <itemizedlist>
772 <listitem>
773 <para>
774 In a single pattern, variables bound by patterns to the left of a view
775 pattern expression are in scope. For example:
776 <programlisting>
777 example :: Maybe ((String -> Integer,Integer), String) -> Bool
778 example (Just ((f,_), f -> 4)) = True
779 </programlisting>
780
781 Additionally, in function definitions, variables bound by matching earlier curried
782 arguments may be used in view pattern expressions in later arguments:
783 <programlisting>
784 example :: (String -> Integer) -> String -> Bool
785 example f (f -> 4) = True
786 </programlisting>
787 That is, the scoping is the same as it would be if the curried arguments
788 were collected into a tuple.
789 </para>
790 </listitem>
791
792 <listitem>
793 <para>
794 In mutually recursive bindings, such as <literal>let</literal>,
795 <literal>where</literal>, or the top level, view patterns in one
796 declaration may not mention variables bound by other declarations. That
797 is, each declaration must be self-contained. For example, the following
798 program is not allowed:
799 <programlisting>
800 let {(x -> y) = e1 ;
801 (y -> x) = e2 } in x
802 </programlisting>
803
804 (For some amplification on this design choice see
805 <ulink url="http://ghc.haskell.org/trac/ghc/ticket/4061">Trac #4061</ulink>.)
806
807 </para>
808 </listitem>
809 </itemizedlist>
810
811 </para>
812 </listitem>
813
814 <listitem><para> Typing: If <replaceable>exp</replaceable> has type
815 <replaceable>T1</replaceable> <literal>-></literal>
816 <replaceable>T2</replaceable> and <replaceable>pat</replaceable> matches
817 a <replaceable>T2</replaceable>, then the whole view pattern matches a
818 <replaceable>T1</replaceable>.
819 </para></listitem>
820
821 <listitem><para> Matching: To the equations in Section 3.17.3 of the
822 <ulink url="http://www.haskell.org/onlinereport/">Haskell 98
823 Report</ulink>, add the following:
824 <programlisting>
825 case v of { (e -> p) -> e1 ; _ -> e2 }
826 =
827 case (e v) of { p -> e1 ; _ -> e2 }
828 </programlisting>
829 That is, to match a variable <replaceable>v</replaceable> against a pattern
830 <literal>(</literal> <replaceable>exp</replaceable>
831 <literal>-></literal> <replaceable>pat</replaceable>
832 <literal>)</literal>, evaluate <literal>(</literal>
833 <replaceable>exp</replaceable> <replaceable> v</replaceable>
834 <literal>)</literal> and match the result against
835 <replaceable>pat</replaceable>.
836 </para></listitem>
837
838 <listitem><para> Efficiency: When the same view function is applied in
839 multiple branches of a function definition or a case expression (e.g.,
840 in <literal>size</literal> above), GHC makes an attempt to collect these
841 applications into a single nested case expression, so that the view
842 function is only applied once. Pattern compilation in GHC follows the
843 matrix algorithm described in Chapter 4 of <ulink
844 url="http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/">The
845 Implementation of Functional Programming Languages</ulink>. When the
846 top rows of the first column of a matrix are all view patterns with the
847 "same" expression, these patterns are transformed into a single nested
848 case. This includes, for example, adjacent view patterns that line up
849 in a tuple, as in
850 <programlisting>
851 f ((view -> A, p1), p2) = e1
852 f ((view -> B, p3), p4) = e2
853 </programlisting>
854 </para>
855
856 <para> The current notion of when two view pattern expressions are "the
857 same" is very restricted: it is not even full syntactic equality.
858 However, it does include variables, literals, applications, and tuples;
859 e.g., two instances of <literal>view ("hi", "there")</literal> will be
860 collected. However, the current implementation does not compare up to
861 alpha-equivalence, so two instances of <literal>(x, view x ->
862 y)</literal> will not be coalesced.
863 </para>
864
865 </listitem>
866
867 </itemizedlist>
868 </para>
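
<para>
As a rough illustration of the matching rule given earlier in this section, the
following sketch defines one function with a view pattern and a second one with the
corresponding <literal>case</literal> translation. The names <literal>isYes</literal>
and <literal>isYes'</literal> are made up for this example, and the
<literal>ViewPatterns</literal> language pragma is assumed to be available:
</para>
<programlisting>
{-# LANGUAGE ViewPatterns #-}
module Main where

import Data.Char (toLower)

-- Written with a view pattern: apply (map toLower) to the argument,
-- then match the result against the string literal "yes".
isYes :: String -> Bool
isYes (map toLower -> "yes") = True
isYes _                      = False

-- The same function after applying the translation
--   case v of { (e -> p) -> e1 ; _ -> e2 }  =  case (e v) of { p -> e1 ; _ -> e2 }
isYes' :: String -> Bool
isYes' v = case map toLower v of
             "yes" -> True
             _     -> False

main :: IO ()
main = do
  let inputs = ["Yes", "YES", "no"]
  print (map isYes  inputs)   -- [True,True,False]
  print (map isYes' inputs)   -- [True,True,False]
</programlisting>
<para>
Both functions should agree on every input, since the second is just the first
with the view function applied explicitly to the scrutinee.
</para>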
869
870 </sect2>
871
872 <!-- ===================== Pattern synonyms =================== -->
873
874 <sect2 id="pattern-synonyms">
875 <title>Pattern synonyms
876 </title>
877
878 <para>
Pattern synonyms are enabled by the flag
<literal>-XPatternSynonyms</literal>, which is required for defining
them, but <emphasis>not</emphasis> for using them. More information
and examples of pattern synonyms can be found on the <ulink
url="http://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms">Wiki
page</ulink>.
885 </para>
886
887 <para>
Pattern synonyms let you give names to parametrized pattern
schemes. They can also be thought of as abstract constructors that
have no bearing on data representation. For example, in a
programming language implementation, we might represent types of the
language as follows:
893 </para>
894
895 <programlisting>
896 data Type = App String [Type]
897 </programlisting>
898
899 <para>
Consider a few types of the <literal>Type</literal> universe encoded
in this representation:
903 </para>
904
905 <programlisting>
906 App "->" [t1, t2] -- t1 -> t2
907 App "Int" [] -- Int
908 App "Maybe" [App "Int" []] -- Maybe Int
909 </programlisting>
910
911 <para>
912 This representation is very generic in that no types are given special
913 treatment. However, some functions might need to handle some known
914 types specially, for example the following two functions collect all
915 argument types of (nested) arrow types, and recognize the
916 <literal>Int</literal> type, respectively:
917 </para>
918
919 <programlisting>
920 collectArgs :: Type -> [Type]
921 collectArgs (App "->" [t1, t2]) = t1 : collectArgs t2
922 collectArgs _ = []
923
924 isInt :: Type -> Bool
925 isInt (App "Int" []) = True
926 isInt _ = False
927 </programlisting>
928
929 <para>
930 Matching on <literal>App</literal> directly is both hard to read and
931 error prone to write. And the situation is even worse when the
932 matching is nested:
933 </para>
934
935 <programlisting>
936 isIntEndo :: Type -> Bool
937 isIntEndo (App "->" [App "Int" [], App "Int" []]) = True
938 isIntEndo _ = False
939 </programlisting>
940
941 <para>
942 Pattern synonyms permit abstracting from the representation to expose
943 matchers that behave in a constructor-like manner with respect to
944 pattern matching. We can create pattern synonyms for the known types
945 we care about, without committing the representation to them (note
946 that these don't have to be defined in the same module as the
947 <literal>Type</literal> type):
948 </para>
949
950 <programlisting>
951 pattern Arrow t1 t2 = App "->" [t1, t2]
952 pattern Int = App "Int" []
953 pattern Maybe t = App "Maybe" [t]
954 </programlisting>
955
956 <para>
This enables us to rewrite our functions in a much cleaner style:
958 </para>
959
960 <programlisting>
961 collectArgs :: Type -> [Type]
962 collectArgs (Arrow t1 t2) = t1 : collectArgs t2
963 collectArgs _ = []
964
965 isInt :: Type -> Bool
966 isInt Int = True
967 isInt _ = False
968
969 isIntEndo :: Type -> Bool
970 isIntEndo (Arrow Int Int) = True
971 isIntEndo _ = False
972 </programlisting>
973
974 <para>
975 Note that in this example, the pattern synonyms
976 <literal>Int</literal> and <literal>Arrow</literal> can also be used
977 as expressions (they are <emphasis>bidirectional</emphasis>). This
978 is not necessarily the case: <emphasis>unidirectional</emphasis>
979 pattern synonyms can also be declared with the following syntax:
980 </para>
981
982 <programlisting>
983 pattern Head x &lt;- x:xs
984 </programlisting>
985
986 <para>
987 In this case, <literal>Head</literal> <replaceable>x</replaceable>
988 cannot be used in expressions, only patterns, since it wouldn't
989 specify a value for the <replaceable>xs</replaceable> on the
990 right-hand side. We can give an explicit inversion of a pattern
991 synonym using the following syntax:
992 </para>
993
<programlisting>
pattern Head x &lt;- x:xs where
  Head x = [x]
</programlisting>
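
<para>
To tie the pieces of this introduction together, here is a minimal,
self-contained sketch that should compile as a single module. It assumes the
<literal>PatternSynonyms</literal> language pragma; the
<literal>deriving (Eq, Show)</literal> clause and the <literal>main</literal>
function are additions made only for this illustration.
</para>
<programlisting>
{-# LANGUAGE PatternSynonyms #-}
module Main where

-- The generic representation from the text (Eq and Show are derived
-- here only so that the results can be compared and printed).
data Type = App String [Type]
  deriving (Eq, Show)

-- Implicitly bidirectional synonyms: usable as patterns and as expressions.
pattern Arrow t1 t2 = App "->" [t1, t2]
pattern Int         = App "Int" []
pattern Maybe t     = App "Maybe" [t]

collectArgs :: Type -> [Type]
collectArgs (Arrow t1 t2) = t1 : collectArgs t2
collectArgs _             = []

main :: IO ()
main = do
  -- Used as an expression, a bidirectional synonym builds the
  -- underlying App value.
  let ty = Arrow Int (Arrow (Maybe Int) Int)
  print (ty == App "->" [App "Int" [], App "->" [App "Maybe" [App "Int" []], App "Int" []]])  -- True
  -- Used as a pattern (inside collectArgs), it takes the value apart again.
  print (collectArgs ty == [Int, Maybe Int])  -- True
</programlisting>
<para>
Note that the synonyms <literal>Int</literal> and <literal>Maybe</literal> live
in the data constructor namespace, so they do not clash with the types of the
same name.
</para>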
998
999 <para>
1000 The syntax and semantics of pattern synonyms are elaborated in the
1001 following subsections.
1002 See the <ulink
1003 url="http://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms">Wiki
1004 page</ulink> for more details.
1005 </para>
1006
1007 <sect3> <title>Syntax and scoping of pattern synonyms</title>
1008 <para>
1009 A pattern synonym declaration can be either unidirectional or
1010 bidirectional. The syntax for unidirectional pattern synonyms is:
1011 <programlisting>
1012 pattern Name args &lt;- pat
1013 </programlisting>
1014 and the syntax for bidirectional pattern synonyms is:
1015 <programlisting>
1016 pattern Name args = pat
1017 </programlisting> or
<programlisting>
pattern Name args &lt;- pat where
  Name args = expr
</programlisting>
1022 Either prefix or infix syntax can be
1023 used.
1024 </para>
1025 <para>
1026 Pattern synonym declarations can only occur in the top level of a
1027 module. In particular, they are not allowed as local
1028 definitions.
1029 </para>
1030 <para>
The variables in the left-hand side of the definition are bound by
the pattern on the right-hand side. For implicitly bidirectional
pattern synonyms (those declared with <literal>=</literal>), all the
variables of the right-hand side must also occur on the left-hand side;
also, wildcard patterns and view patterns are not allowed. For
unidirectional and explicitly bidirectional pattern synonyms (those
declared with <literal>&lt;-</literal>), there is no restriction
on the right-hand side pattern.
1038 </para>
1039
1040 <para>
1041 Pattern synonyms cannot be defined recursively.
1042 </para>
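
<para>
The following sketch illustrates two of the points above: infix syntax, and the
use of a view pattern on the right-hand side of an explicitly bidirectional
synonym. The names <literal>:+:</literal>, <literal>Snoc</literal>,
<literal>splitLast</literal>, <literal>swapPair</literal> and
<literal>lastOr</literal> are hypothetical and exist only for this example; the
<literal>PatternSynonyms</literal> and <literal>ViewPatterns</literal> pragmas
are assumed.
</para>
<programlisting>
{-# LANGUAGE PatternSynonyms, ViewPatterns #-}
module Main where

-- Infix and implicitly bidirectional: every variable on the right-hand
-- side also occurs on the left-hand side, and no wildcard or view
-- pattern is used.
pattern x :+: y = (x, y)

-- Helper for the view pattern below: split a non-empty list into its
-- initial segment and its last element.
splitLast :: [a] -> Maybe ([a], a)
splitLast [] = Nothing
splitLast ys = Just (init ys, last ys)

-- Explicitly bidirectional: the right-hand side may use a view pattern,
-- and the expression form is given by hand in the where clause.
pattern Snoc xs x &lt;- (splitLast -> Just (xs, x)) where
  Snoc xs x = xs ++ [x]

swapPair :: (a, b) -> (b, a)
swapPair (x :+: y) = y :+: x

lastOr :: a -> [a] -> a
lastOr _ (Snoc _ x) = x
lastOr d _          = d

main :: IO ()
main = do
  print (swapPair (1 :+: 'c'))   -- ('c',1)
  print (lastOr 0 [1, 2, 3])     -- 3
  print (Snoc "ab" 'c')          -- "abc"
</programlisting>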
1043 </sect3>
1044
1045 <sect3 id="patsyn-impexp"> <title>Import and export of pattern synonyms</title>
1046
1047 <para>
1048 The name of the pattern synonym itself is in the same namespace as
1049 proper data constructors. In an export or import specification,
1050 you must prefix pattern
1051 names with the <literal>pattern</literal> keyword, e.g.:
1052 <programlisting>
1053 module Example (pattern Single) where
1054 pattern Single x = [x]
1055 </programlisting>
1056 Without the <literal>pattern</literal> prefix, <literal>Single</literal> would
1057 be interpreted as a type constructor in the export list.
1058 </para>
1059 <para>
1060 You may also use the <literal>pattern</literal> keyword in an import/export
1061 specification to import or export an ordinary data constructor. For example:
1062 <programlisting>
1063 import Data.Maybe( pattern Just )
1064 </programlisting>
1065 would bring into scope the data constructor <literal>Just</literal> from the
1066 <literal>Maybe</literal> type, without also bringing the type constructor
1067 <literal>Maybe</literal> into scope.
1068 </para>
1069 </sect3>
1070
1071 <sect3> <title>Typing of pattern synonyms</title>
1072
1073 <para>
1074 Given a pattern synonym definition of the form
1075 <programlisting>
1076 pattern P var1 var2 ... varN &lt;- pat
1077 </programlisting>
1078 it is assigned a <emphasis>pattern type</emphasis> of the form
1079 <programlisting>
1080 pattern P :: CProv => CReq => t1 -> t2 -> ... -> tN -> t
1081 </programlisting>
1082 where <replaceable>CProv</replaceable> and
1083 <replaceable>CReq</replaceable> are type contexts, and
1084 <replaceable>t1</replaceable>, <replaceable>t2</replaceable>, ...,
1085 <replaceable>tN</replaceable> and <replaceable>t</replaceable> are
1086 types.
1087 Notice the unusual form of the type, with two contexts <replaceable>CProv</replaceable> and <replaceable>CReq</replaceable>:
1088 <itemizedlist>
1089 <listitem><para><replaceable>CReq</replaceable> are the constraints <emphasis>required</emphasis> to match the pattern.</para></listitem>
1090 <listitem><para><replaceable>CProv</replaceable> are the constraints <emphasis>made available (provided)</emphasis>
1091 by a successful pattern match.</para></listitem>
1092 </itemizedlist>
1093 For example, consider
<programlisting>
data T a where
  MkT :: (Show b) => a -> b -> T a

f1 :: (Eq a, Num a) => T a -> String
f1 (MkT 42 x) = show x

pattern ExNumPat :: (Show b) => (Num a, Eq a) => b -> T a
pattern ExNumPat x = MkT 42 x

f2 :: (Eq a, Num a) => T a -> String
f2 (ExNumPat x) = show x
</programlisting>
1107 Here <literal>f1</literal> does not use pattern synonyms. To match against the
1108 numeric pattern <literal>42</literal> <emphasis>requires</emphasis> the caller to
1109 satisfy the constraints <literal>(Num a, Eq a)</literal>,
1110 so they appear in <literal>f1</literal>'s type. The call to <literal>show</literal> generates a <literal>(Show b)</literal>
constraint, where <literal>b</literal> is an existentially quantified type variable bound by the pattern match
1112 on <literal>MkT</literal>. But the same pattern match also <emphasis>provides</emphasis> the constraint
1113 <literal>(Show b)</literal> (see <literal>MkT</literal>'s type), and so all is well.
1114 </para>
1115 <para>
1116 Exactly the same reasoning applies to <literal>ExNumPat</literal>:
1117 matching against <literal>ExNumPat</literal> <emphasis>requires</emphasis>
1118 the constraints <literal>(Num a, Eq a)</literal>, and <emphasis>provides</emphasis>
1119 the constraint <literal>(Show b)</literal>.
1120 </para>
1121 <para>
1122 Note also the following points
1123 <itemizedlist>
1124 <listitem><para>
1125 In the common case where <replaceable>CReq</replaceable> is empty,
1126 <literal>()</literal>, it can be omitted altogether.
1127 </para> </listitem>
1128
1129 <listitem><para>
1130 You may specify an explicit <emphasis>pattern signature</emphasis>, as
1131 we did for <literal>ExNumPat</literal> above, to specify the type of a pattern,
1132 just as you can for a function. As usual, the type signature can be less polymorphic
1133 than the inferred type. For example
1134 <programlisting>
1135 -- Inferred type would be 'a -> [a]'
1136 pattern SinglePair :: (a, a) -> [(a, a)]
1137 pattern SinglePair x = [x]
1138 </programlisting>
1139 </para> </listitem>
1140
1141 <listitem><para>
1142 The GHCi <literal>:info</literal> command shows pattern types in this format.
1143 </para> </listitem>
1144
1145 <listitem><para>
1146 For a bidirectional pattern synonym, a use of the pattern synonym as an expression has the type
1147 <programlisting>
1148 (CProv, CReq) => t1 -> t2 -> ... -> tN -> t
1149 </programlisting>
1150 So in the previous example, when used in an expression, <literal>ExNumPat</literal> has type
1151 <programlisting>
ExNumPat :: (Show b, Num a, Eq a) => b -> T a
1153 </programlisting>
1154 Notice that this is a tiny bit more restrictive than the expression <literal>MkT 42 x</literal>
1155 which would not require <literal>(Eq a)</literal>.
1156 </para> </listitem>
1157
1158 <listitem><para>
1159 Consider these two pattern synonyms:
<programlisting>
data S a where
  S1 :: Bool -> S Bool

pattern P1 :: Bool -> Maybe Bool
pattern P1 b = Just b

pattern P2 b = S1 b          -- P2 :: (b~Bool) => Bool -> S b

f :: Maybe a -> String
f (P1 x) = "no no no"        -- Type-incorrect

g :: S a -> String
g (P2 b) = "yes yes yes"     -- Fine
</programlisting>
1173 Pattern <literal>P1</literal> can only match against a value of type <literal>Maybe Bool</literal>,
1174 so function <literal>f</literal> is rejected because the type signature is <literal>Maybe a</literal>.
1175 (To see this, imagine expanding the pattern synonym.)
1176 </para>
1177 <para>
1178 On the other hand, function <literal>g</literal> works fine, because matching against <literal>P2</literal>
1179 (which wraps the GADT <literal>S</literal>) provides the local equality <literal>(a~Bool)</literal>.
1180 If you were to give an explicit pattern signature <literal>P2 :: Bool -> S Bool</literal>, then <literal>P2</literal>
1181 would become less polymorphic, and would behave exactly like <literal>P1</literal> so that <literal>g</literal>
1182 would then be rejected.
1183 </para>
1184 <para>
In short, if you want GADT-like behaviour for pattern synonyms,
then (unlike concrete data constructors like <literal>S1</literal>)
you must write their types with explicit provided equalities.
1188 For a concrete data constructor like <literal>S1</literal> you can write
1189 its type signature as either <literal>S1 :: Bool -> S Bool</literal> or
1190 <literal>S1 :: (b~Bool) => Bool -> S b</literal>; the two are equivalent.
1191 Not so for pattern synonyms: the two forms are different, in order to
1192 distinguish the two cases above. (See <ulink url="https://ghc.haskell.org/trac/ghc/ticket/9953">Trac #9953</ulink> for
1193 discussion of this choice.)
1194 </para></listitem>
1195 </itemizedlist>
1196 </para>
1197 </sect3>
1198
1199 <sect3><title>Matching of pattern synonyms</title>
1200
1201 <para>
A pattern synonym occurrence in a pattern is evaluated by first
matching against the pattern on the right-hand side of the synonym's
definition, and then matching the argument patterns against the values
it binds. For example, in the following program, <literal>f</literal>
and <literal>f'</literal> are equivalent:
1206 </para>
1207
1208 <programlisting>
1209 pattern Pair x y &lt;- [x, y]
1210
1211 f (Pair True True) = True
1212 f _ = False
1213
1214 f' [x, y] | True &lt;- x, True &lt;- y = True
1215 f' _ = False
1216 </programlisting>
1217
1218 <para>
1219 Note that the strictness of <literal>f</literal> differs from that
1220 of <literal>g</literal> defined below:
1221 <programlisting>
1222 g [True, True] = True
1223 g _ = False
1224
1225 *Main> f (False:undefined)
1226 *** Exception: Prelude.undefined
1227 *Main> g (False:undefined)
1228 False
1229 </programlisting>
1230 </para>
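
<para>
The strictness difference can also be observed from a compiled program rather
than from GHCi. The sketch below is one possible way to do so; it catches the
exception with <literal>try</literal> and <literal>evaluate</literal> from
<literal>Control.Exception</literal>, and the type signatures and the message
text are additions made for this example.
</para>
<programlisting>
{-# LANGUAGE PatternSynonyms #-}
module Main where

import Control.Exception (SomeException, evaluate, try)

pattern Pair x y &lt;- [x, y]

-- Matches the whole list shape [x, y] first, and only then looks at x and y.
f :: [Bool] -> Bool
f (Pair True True) = True
f _                = False

-- Matches left to right, so it can fail on the first element
-- without ever looking at the tail.
g :: [Bool] -> Bool
g [True, True] = True
g _            = False

main :: IO ()
main = do
  r &lt;- try (evaluate (f (False : undefined))) :: IO (Either SomeException Bool)
  putStrLn (either (const "f: exception") show r)   -- f: exception
  print (g (False : undefined))                     -- False
</programlisting>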
1231 </sect3>
1232
1233 </sect2>
1234
1235 <!-- ===================== n+k patterns =================== -->
1236
1237 <sect2 id="n-k-patterns">
1238 <title>n+k patterns</title>
1239 <indexterm><primary><option>-XNPlusKPatterns</option></primary></indexterm>
1240
1241 <para>
1242 <literal>n+k</literal> pattern support is disabled by default. To enable
1243 it, you can use the <option>-XNPlusKPatterns</option> flag.
1244 </para>
1245
1246 </sect2>
1247
1248 <!-- ===================== Traditional record syntax =================== -->
1249
1250 <sect2 id="traditional-record-syntax">
1251 <title>Traditional record syntax</title>
1252 <indexterm><primary><option>-XNoTraditionalRecordSyntax</option></primary></indexterm>
1253
1254 <para>
1255 Traditional record syntax, such as <literal>C {f = x}</literal>, is enabled by default.
1256 To disable it, you can use the <option>-XNoTraditionalRecordSyntax</option> flag.
1257 </para>
1258
1259 </sect2>
1260
1261 <!-- ===================== Recursive do-notation =================== -->
1262
1263 <sect2 id="recursive-do-notation">
1264 <title>The recursive do-notation
1265 </title>
1266
1267 <para>
1268 The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>,
1269 that is, the variables bound in a do-expression are visible only in the textually following
1270 code block. Compare this to a let-expression, where bound variables are visible in the entire binding
1271 group.
1272 </para>
1273
1274 <para>
1275 It turns out that such recursive bindings do indeed make sense for a variety of monads, but
1276 not all. In particular, recursion in this sense requires a fixed-point operator for the underlying
1277 monad, captured by the <literal>mfix</literal> method of the <literal>MonadFix</literal> class, defined in <literal>Control.Monad.Fix</literal> as follows:
<programlisting>
class Monad m => MonadFix m where
  mfix :: (a -> m a) -> m a
</programlisting>
1282 Haskell's
1283 <literal>Maybe</literal>, <literal>[]</literal> (list), <literal>ST</literal> (both strict and lazy versions),
1284 <literal>IO</literal>, and many other monads have <literal>MonadFix</literal> instances. On the negative
1285 side, the continuation monad, with the signature <literal>(a -> r) -> r</literal>, does not.
1286 </para>
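
<para>
As a small sketch of what <literal>mfix</literal> does on its own, before any
special syntax is involved, the following program ties a recursive knot in the
<literal>Maybe</literal> monad. The name <literal>ones</literal> is chosen only
for this example.
</para>
<programlisting>
module Main where

import Control.Monad.Fix (mfix)

-- The argument xs stands for the value that the computation itself
-- will eventually return, so the list refers to itself.
ones :: Maybe [Int]
ones = mfix (\xs -> Just (1 : xs))

main :: IO ()
main = print (fmap (take 3) ones)   -- Just [1,1,1]
</programlisting>
<para>
This works because the function passed to <literal>mfix</literal> never
inspects <literal>xs</literal> before returning; a function that forced its
argument would loop instead.
</para>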
1287
1288 <para>
1289 For monads that do belong to the <literal>MonadFix</literal> class, GHC provides
1290 an extended version of the do-notation that allows recursive bindings.
The <option>-XRecursiveDo</option> flag (language pragma: <literal>RecursiveDo</literal>)
provides the necessary syntactic support, introducing the keywords <literal>mdo</literal> and
<literal>rec</literal> for the higher-level and lower-level forms of the notation respectively. Unlike
1294 bindings in a <literal>do</literal> expression, those introduced by <literal>mdo</literal> and <literal>rec</literal>
1295 are recursively defined, much like in an ordinary let-expression. Due to the new
1296 keyword <literal>mdo</literal>, we also call this notation the <emphasis>mdo-notation</emphasis>.
1297 </para>
1298
1299 <para>
1300 Here is a simple (albeit contrived) example:
1301 <programlisting>
1302 {-# LANGUAGE RecursiveDo #-}
1303 justOnes = mdo { xs &lt;- Just (1:xs)
1304 ; return (map negate xs) }
1305 </programlisting>
1306 or equivalently
1307 <programlisting>
1308 {-# LANGUAGE RecursiveDo #-}
1309 justOnes = do { rec { xs &lt;- Just (1:xs) }
1310 ; return (map negate xs) }
1311 </programlisting>
1312 As you can guess <literal>justOnes</literal> will evaluate to <literal>Just [-1,-1,-1,...</literal>.
1313 </para>
1314
1315 <para>
GHC's implementation of the mdo-notation closely follows the original translation as described in the paper
<ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for Haskell</ulink>, which
in turn is based on the work <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion
in Monadic Computations</ulink>. Furthermore, GHC extends the syntax described in the former paper
with a lower level syntax flagged by the <literal>rec</literal> keyword, as we describe next.
1321 </para>
1322
1323 <sect3>
1324 <title>Recursive binding groups</title>
1325
1326 <para>
1327 The flag <option>-XRecursiveDo</option> also introduces a new keyword <literal>rec</literal>, which wraps a
1328 mutually-recursive group of monadic statements inside a <literal>do</literal> expression, producing a single statement.
1329 Similar to a <literal>let</literal> statement inside a <literal>do</literal>, variables bound in
1330 the <literal>rec</literal> are visible throughout the <literal>rec</literal> group, and below it. For example, compare
<programlisting>
do { a &lt;- getChar               do { a &lt;- getChar
   ; let { r1 = f a r2             ; rec { r1 &lt;- f a r2
   ;     ; r2 = g r1 }             ;     ; r2 &lt;- g r1 }
   ; return (r1 ++ r2) }           ; return (r1 ++ r2) }
</programlisting>
1337 In both cases, <literal>r1</literal> and <literal>r2</literal> are available both throughout
1338 the <literal>let</literal> or <literal>rec</literal> block, and in the statements that follow it.
1339 The difference is that <literal>let</literal> is non-monadic, while <literal>rec</literal> is monadic.
1340 (In Haskell <literal>let</literal> is really <literal>letrec</literal>, of course.)
1341 </para>
1342
1343 <para>
1344 The semantics of <literal>rec</literal> is fairly straightforward. Whenever GHC finds a <literal>rec</literal>
1345 group, it will compute its set of bound variables, and will introduce an appropriate call
1346 to the underlying monadic value-recursion operator <literal>mfix</literal>, belonging to the
1347 <literal>MonadFix</literal> class. Here is an example:
<programlisting>
rec { b &lt;- f a c     ===>    (b,c) &lt;- mfix (\ ~(b,c) -> do { b &lt;- f a c
    ; c &lt;- f b a }                                         ; c &lt;- f b a
                                                           ; return (b,c) })
</programlisting>
1353 As usual, the meta-variables <literal>b</literal>, <literal>c</literal> etc., can be arbitrary patterns.
1354 In general, the statement <literal>rec <replaceable>ss</replaceable></literal> is desugared to the statement
1355 <programlisting>
1356 <replaceable>vs</replaceable> &lt;- mfix (\ ~<replaceable>vs</replaceable> -&gt; do { <replaceable>ss</replaceable>; return <replaceable>vs</replaceable> })
1357 </programlisting>
1358 where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>.
1359 </para>
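
<para>
The desugaring above can be tried out by writing the same computation twice,
once with a <literal>rec</literal> block and once with a hand-written call to
<literal>mfix</literal>. This is only a sketch: the names
<literal>withRec</literal> and <literal>viaMfix</literal> are invented, the
<literal>Maybe</literal> monad is chosen for simplicity, and the inner bindings
are renamed with primes to avoid shadowing.
</para>
<programlisting>
{-# LANGUAGE RecursiveDo #-}
module Main where

import Control.Monad.Fix (mfix)

-- Two mutually recursive lazy lists, bound in a rec block.
withRec :: Maybe ([Int], [Int])
withRec = do
  rec evens &lt;- Just (0 : map (+1) odds)
      odds  &lt;- Just (map (+1) evens)
  return (take 3 evens, take 3 odds)

-- The same computation with the rec block replaced by mfix applied
-- to a function with a lazy tuple pattern.
viaMfix :: Maybe ([Int], [Int])
viaMfix = do
  (evens, odds) &lt;- mfix (\ ~(_es, os) -> do
      es' &lt;- Just (0 : map (+1) os)
      os' &lt;- Just (map (+1) es')
      return (es', os'))
  return (take 3 evens, take 3 odds)

main :: IO ()
main = do
  print withRec              -- Just ([0,2,4],[1,3,5])
  print (withRec == viaMfix) -- True
</programlisting>
<para>
The lazy pattern (<literal>~</literal>) matters: with a strict tuple pattern the
function would force its argument before producing a result, and the knot could
not be tied.
</para>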
1360
1361 <para>
1362 Note in particular that the translation for a <literal>rec</literal> block only involves wrapping a call
1363 to <literal>mfix</literal>: it performs no other analysis on the bindings. The latter is the task
1364 for the <literal>mdo</literal> notation, which is described next.
1365 </para>
1366 </sect3>
1367
1368 <sect3>
1369 <title>The <literal>mdo</literal> notation</title>
1370
1371 <para>
A <literal>rec</literal>-block tells the compiler precisely where the recursive knot should be tied. It turns out that
the placement of the recursive knots can be rather delicate: in particular, we would like each knot to be wrapped
around as small a group of statements as possible. The process of finding these groups is known as <emphasis>segmentation</emphasis>, and is described
1375 in detail in Section 3.2 of <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for
1376 Haskell</ulink>. Segmentation improves polymorphism and reduces the size of the recursive knot. Most importantly, it avoids
1377 unnecessary interference caused by a fundamental issue with the so-called <emphasis>right-shrinking</emphasis>
1378 axiom for monadic recursion. In brief, most monads of interest (IO, strict state, etc.) do <emphasis>not</emphasis>
1379 have recursion operators that satisfy this axiom, and thus not performing segmentation can cause unnecessary
1380 interference, changing the termination behavior of the resulting translation.
1381 (Details can be found in Sections 3.1 and 7.2.2 of
1382 <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.)
1383 </para>
1384
1385 <para>
1386 The <literal>mdo</literal> notation removes the burden of placing
1387 explicit <literal>rec</literal> blocks in the code. Unlike an
1388 ordinary <literal>do</literal> expression, in which variables bound by
1389 statements are only in scope for later statements, variables bound in
1390 an <literal>mdo</literal> expression are in scope for all statements
1391 of the expression. The compiler then automatically identifies minimal
1392 mutually recursively dependent segments of statements, treating them as
1393 if the user had wrapped a <literal>rec</literal> qualifier around them.
1394 </para>
1395
1396 <para>
1397 The definition is syntactic:
1398 </para>
1399 <itemizedlist>
1400 <listitem>
1401 <para>
1402 A generator <replaceable>g</replaceable>
1403 <emphasis>depends</emphasis> on a textually following generator
1404 <replaceable>g'</replaceable>, if
1405 </para>
1406 <itemizedlist>
1407 <listitem>
1408 <para>
1409 <replaceable>g'</replaceable> defines a variable that
1410 is used by <replaceable>g</replaceable>, or
1411 </para>
1412 </listitem>
1413 <listitem>
1414 <para>
1415 <replaceable>g'</replaceable> textually appears between
1416 <replaceable>g</replaceable> and
1417 <replaceable>g''</replaceable>, where <replaceable>g</replaceable>
1418 depends on <replaceable>g''</replaceable>.
1419 </para>
1420 </listitem>
1421 </itemizedlist>
1422 </listitem>
1423 <listitem>
1424 <para>
1425 A <emphasis>segment</emphasis> of a given
1426 <literal>mdo</literal>-expression is a minimal sequence of generators
1427 such that no generator of the sequence depends on an outside
1428 generator. As a special case, although it is not a generator,
1429 the final expression in an <literal>mdo</literal>-expression is
1430 considered to form a segment by itself.
1431 </para>
1432 </listitem>
1433 </itemizedlist>
1434 <para>
1435 Segments in this sense are
1436 related to <emphasis>strongly-connected components</emphasis> analysis,
1437 with the exception that bindings in a segment cannot be reordered and
1438 must be contiguous.
1439 </para>
1440
1441 <para>
1442 Here is an example <literal>mdo</literal>-expression, and its translation to <literal>rec</literal> blocks:
<programlisting>
mdo { a &lt;- getChar      ===> do { a &lt;- getChar
    ; b &lt;- f a c                ; rec { b &lt;- f a c
    ; c &lt;- f b a                ;     ; c &lt;- f b a }
    ; z &lt;- h a b                ; z &lt;- h a b
    ; d &lt;- g d e                ; rec { d &lt;- g d e
    ; e &lt;- g a z                ;     ; e &lt;- g a z }
    ; putChar c }               ; putChar c }
</programlisting>
1452 Note that a given <literal>mdo</literal> expression can cause the creation of multiple <literal>rec</literal> blocks.
1453 If there are no recursive dependencies, <literal>mdo</literal> will introduce no <literal>rec</literal> blocks. In this
1454 latter case an <literal>mdo</literal> expression is precisely the same as a <literal>do</literal> expression, as one
1455 would expect.
1456 </para>
1457
1458 <para>
1459 In summary, given an <literal>mdo</literal> expression, GHC first performs segmentation, introducing
1460 <literal>rec</literal> blocks to wrap over minimal recursive groups. Then, each resulting
1461 <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal> as described
1462 in the previous section. The original <literal>mdo</literal>-expression typechecks exactly when the desugared
1463 version would do so.
1464 </para>
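
<para>
A small runnable sketch of this process, again in the <literal>Maybe</literal>
monad: the first definition uses <literal>mdo</literal>, and the second shows
the <literal>rec</literal> block that segmentation would be expected to
introduce for it. The names <literal>withMdo</literal> and
<literal>withRec</literal> are hypothetical.
</para>
<programlisting>
{-# LANGUAGE RecursiveDo #-}
module Main where

withMdo :: Maybe [Int]
withMdo = mdo
  xs &lt;- Just (1 : ys)      -- refers to ys, which is bound below
  ys &lt;- Just (2 : xs)
  zs &lt;- Just (take 4 xs)   -- depends only on earlier bindings
  return zs

-- Only the first two statements depend on each other, so only they
-- need to be wrapped in a rec block.
withRec :: Maybe [Int]
withRec = do
  rec xs &lt;- Just (1 : ys)
      ys &lt;- Just (2 : xs)
  zs &lt;- Just (take 4 xs)
  return zs

main :: IO ()
main = do
  print withMdo              -- Just [1,2,1,2]
  print (withMdo == withRec) -- True
</programlisting>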
1465
1466 <para>
1467 Here are some other important points in using the recursive-do notation:
1468
1469 <itemizedlist>
1470 <listitem>
1471 <para>
1472 It is enabled with the flag <literal>-XRecursiveDo</literal>, or the <literal>LANGUAGE RecursiveDo</literal>
1473 pragma. (The same flag enables both <literal>mdo</literal>-notation, and the use of <literal>rec</literal>
1474 blocks inside <literal>do</literal> expressions.)
1475 </para>
1476 </listitem>
1477 <listitem>
1478 <para>
<literal>rec</literal> blocks can also be used inside <literal>mdo</literal>-expressions; each such block is
treated as a single statement. However, it is good style to use either <literal>mdo</literal> or
<literal>rec</literal> blocks in a single expression, not both.
1482 </para>
1483 </listitem>
1484 <listitem>
1485 <para>
1486 If recursive bindings are required for a monad, then that monad must be declared an instance of
1487 the <literal>MonadFix</literal> class.
1488 </para>
1489 </listitem>
1490 <listitem>
1491 <para>
1492 The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
1493 Furthermore, the <literal>Control.Monad.ST</literal> and <literal>Control.Monad.ST.Lazy</literal>
1494 modules provide the instances of the <literal>MonadFix</literal> class for Haskell's internal
1495 state monad (strict and lazy, respectively).
1496 </para>
1497 </listitem>
1498 <listitem>
1499 <para>
As with <literal>let</literal> and <literal>where</literal> bindings, name shadowing is not allowed within
an <literal>mdo</literal>-expression or a <literal>rec</literal>-block; that is, all the names bound in
a single <literal>rec</literal> must be distinct. (GHC will complain if this is not the case.)
1503 </para>
1504 </listitem>
1505 </itemizedlist>
1506 </para>
1507 </sect3>
1508
1509
1510 </sect2>
1511
1512
1513 <!-- ===================== PARALLEL LIST COMPREHENSIONS =================== -->
1514
1515 <sect2 id="parallel-list-comprehensions">
1516 <title>Parallel List Comprehensions</title>
1517 <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
1518 </indexterm>
1519 <indexterm><primary>parallel list comprehensions</primary>
1520 </indexterm>
1521
1522 <para>Parallel list comprehensions are a natural extension to list
1523 comprehensions. List comprehensions can be thought of as a nice
1524 syntax for writing maps and filters. Parallel comprehensions
1525 extend this to include the <literal>zipWith</literal> family.</para>
1526
<para>A parallel list comprehension has multiple independent
branches of qualifier lists, each separated by a <literal>|</literal> symbol. For
example, the following zips together two lists:</para>
1530
1531 <programlisting>
1532 [ (x, y) | x &lt;- xs | y &lt;- ys ]
1533 </programlisting>
1534
1535 <para>The behaviour of parallel list comprehensions follows that of
1536 zip, in that the resulting list will have the same length as the
1537 shortest branch.</para>
1538
1539 <para>We can define parallel list comprehensions by translation to
1540 regular comprehensions. Here's the basic idea:</para>
1541
1542 <para>Given a parallel comprehension of the form: </para>
1543
1544 <programlisting>
1545 [ e | p1 &lt;- e11, p2 &lt;- e12, ...
1546 | q1 &lt;- e21, q2 &lt;- e22, ...
1547 ...
1548 ]
1549 </programlisting>
1550
1551 <para>This will be translated to: </para>
1552
1553 <programlisting>
1554 [ e | ((p1,p2), (q1,q2), ...) &lt;- zipN [(p1,p2) | p1 &lt;- e11, p2 &lt;- e12, ...]
1555 [(q1,q2) | q1 &lt;- e21, q2 &lt;- e22, ...]
1556 ...
1557 ]
1558 </programlisting>
1559
<para>where <literal>zipN</literal> is the appropriate zip for the given number of
branches.</para>
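
<para>
The translation can be checked on a concrete case. The sketch below assumes the
<literal>ParallelListComp</literal> language pragma and uses two branches, so
<literal>zipN</literal> is plain <literal>zip</literal>; the names
<literal>parallel</literal> and <literal>translated</literal> are made up for
this example.
</para>
<programlisting>
{-# LANGUAGE ParallelListComp #-}
module Main where

-- Two branches: the first has two generators, the second has one.
parallel :: [(Int, Int, Char)]
parallel = [ (x, y, c) | x &lt;- [1, 2, 3], y &lt;- [10, 20]
                       | c &lt;- "abcd" ]

-- The same list written with an ordinary comprehension and zip,
-- following the translation scheme given above.
translated :: [(Int, Int, Char)]
translated = [ (x, y, c)
             | ((x, y), c) &lt;- zip [ (x, y) | x &lt;- [1, 2, 3], y &lt;- [10, 20] ]
                                  "abcd" ]

main :: IO ()
main = do
  print parallel                 -- [(1,10,'a'),(1,20,'b'),(2,10,'c'),(2,20,'d')]
  print (parallel == translated) -- True
</programlisting>
<para>
The first branch produces six pairs and the second only four characters, so the
result has four elements, the length of the shortest branch.
</para>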
1562
1563 </sect2>
1564
1565 <!-- ===================== TRANSFORM LIST COMPREHENSIONS =================== -->
1566
1567 <sect2 id="generalised-list-comprehensions">
1568 <title>Generalised (SQL-Like) List Comprehensions</title>
1569 <indexterm><primary>list comprehensions</primary><secondary>generalised</secondary>
1570 </indexterm>
1571 <indexterm><primary>extended list comprehensions</primary>
1572 </indexterm>
1573 <indexterm><primary>group</primary></indexterm>
1574 <indexterm><primary>sql</primary></indexterm>
1575
1576
1577 <para>Generalised list comprehensions are a further enhancement to the
1578 list comprehension syntactic sugar to allow operations such as sorting
1579 and grouping which are familiar from SQL. They are fully described in the
1580 paper <ulink url="http://research.microsoft.com/~simonpj/papers/list-comp">
1581 Comprehensive comprehensions: comprehensions with "order by" and "group by"</ulink>,
1582 except that the syntax we use differs slightly from the paper.</para>
1583 <para>The extension is enabled with the flag <option>-XTransformListComp</option>.</para>
1584 <para>Here is an example:
<programlisting>
{-# LANGUAGE TransformListComp #-}
import GHC.Exts (groupWith, sortWith, the)

employees = [ ("Simon", "MS", 80)
            , ("Erik", "MS", 100)
            , ("Phil", "Ed", 40)
            , ("Gordon", "Ed", 45)
            , ("Paul", "Yale", 60) ]

output = [ (the dept, sum salary)
         | (name, dept, salary) &lt;- employees
         , then group by dept using groupWith
         , then sortWith by (sum salary)
         , then take 5 ]
</programlisting>
1598 In this example, the list <literal>output</literal> would take on
1599 the value:
1600
1601 <programlisting>
1602 [("Yale", 60), ("Ed", 85), ("MS", 180)]
1603 </programlisting>
1604 </para>
1605 <para>There are three new keywords: <literal>group</literal>, <literal>by</literal>, and <literal>using</literal>.
1606 (The functions <literal>sortWith</literal> and <literal>groupWith</literal> are not keywords; they are ordinary
1607 functions that are exported by <literal>GHC.Exts</literal>.)</para>
1608
1609 <para>There are four new forms of comprehension qualifier,
1610 all introduced by the (existing) keyword <literal>then</literal>:
1611 <itemizedlist>
1612 <listitem>
1613
1614 <programlisting>
1615 then f
1616 </programlisting>
1617
1618 <para>This statement requires that <literal>f</literal> have the type <literal>
1619 forall a. [a] -> [a]</literal>. You can see an example of its use in the
1620 opening example, where this form is used to apply <literal>take 5</literal>.</para>
1621
1622 </listitem>
1623
1624
1625 <listitem>
1626 <para>
1627 <programlisting>
1628 then f by e
1629 </programlisting>
1630
1631 This form is similar to the previous one, but allows you to create a function
1632 which will be passed as the first argument to f. As a consequence f must have
1633 the type <literal>forall a. (a -> t) -> [a] -> [a]</literal>. As you can see
1634 from the type, this function lets f &quot;project out&quot; some information
1635 from the elements of the list it is transforming.</para>
1636
1637 <para>An example is shown in the opening example, where <literal>sortWith</literal>
1638 is supplied with a function that lets it find out the <literal>sum salary</literal>
1639 for any item in the list comprehension it transforms.</para>
1640
1641 </listitem>
1642
1643
1644 <listitem>
1645
1646 <programlisting>
1647 then group by e using f
1648 </programlisting>
1649
1650 <para>This is the most general of the grouping-type statements. In this form,
1651 f is required to have type <literal>forall a. (a -> t) -> [a] -> [[a]]</literal>.
1652 As with the <literal>then f by e</literal> case above, the first argument
1653 is a function supplied to f by the compiler which lets it compute e on every
1654 element of the list being transformed. However, unlike the non-grouping case,
1655 f additionally partitions the list into a number of sublists: this means that
1656 at every point after this statement, binders occurring before it in the comprehension
1657 refer to <emphasis>lists</emphasis> of possible values, not single values. To help understand
1658 this, let's look at an example:</para>
1659
1660 <programlisting>
1661 -- Like groupWith in GHC.Exts, but doesn't sort its input first (groupBy is from Data.List)
1662 groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
1663 groupRuns f = groupBy (\x y -> f x == f y)
1664
1665 output = [ (the x, y)
1666 | x &lt;- ([1..3] ++ [1..2])
1667 , y &lt;- [4..6]
1668 , then group by x using groupRuns ]
1669 </programlisting>
1670
1671 <para>This results in the variable <literal>output</literal> taking on the value below:</para>
1672
1673 <programlisting>
1674 [(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])]
1675 </programlisting>
1676
1677 <para>Note that we have used the <literal>the</literal> function to change the type
1678 of x from a list to its original numeric type. The variable y, in contrast, is left
1679 unchanged from the list form introduced by the grouping.</para>
1680
1681 </listitem>
1682
1683 <listitem>
1684
1685 <programlisting>
1686 then group using f
1687 </programlisting>
1688
1689 <para>With this form of the group statement, f is required to simply have the type
1690 <literal>forall a. [a] -> [[a]]</literal>, which will be used to group up the
1691 comprehension so far directly. An example of this form is as follows:</para>
1692
1693 <programlisting>
1694 output = [ x
1695 | y &lt;- [1..5]
1696 , x &lt;- "hello"
1697 , then group using inits]
1698 </programlisting>
1699
1700 <para>This will yield a list containing every prefix of the word "hello" written out 5 times:</para>
1701
1702 <programlisting>
1703 ["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...]
1704 </programlisting>
1705
1706 </listitem>
1707 </itemizedlist>
1708 </para>
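<para>To make the meaning of these qualifiers more concrete, here is a sketch that writes the
opening example out as a complete module and sets it beside a hand-written version using
ordinary list functions. The name <literal>outputByHand</literal> is invented for this
illustration, and the hand-written form is only one plausible reading of the comprehension,
not necessarily the code GHC generates.</para>

<programlisting>
{-# LANGUAGE TransformListComp #-}

import GHC.Exts (groupWith, sortWith, the)

employees :: [(String, String, Int)]
employees = [ ("Simon", "MS", 80)
            , ("Erik", "MS", 100)
            , ("Phil", "Ed", 40)
            , ("Gordon", "Ed", 45)
            , ("Paul", "Yale", 60) ]

-- The comprehension from the opening example.
output :: [(String, Int)]
output = [ (the dept, sum salary)
         | (name, dept, salary) &lt;- employees
         , then group by dept using groupWith
         , then sortWith by (sum salary)
         , then take 5 ]

-- A hand-written equivalent: group the rows, turn each group of tuples
-- into a tuple of lists, sort the groups by total salary, keep five.
outputByHand :: [(String, Int)]
outputByHand =
  [ (the depts, sum salaries)
  | (_names, depts, salaries) &lt;-
      take 5
        (sortWith (\(_, _, ss) -> sum ss)
          (map unzip3 (groupWith (\(_, dept, _) -> dept) employees))) ]
</programlisting>

<para>After the grouping step every earlier binder stands for a list, which is why the
hand-written version applies <literal>unzip3</literal> to each group before sorting.</para>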
1709 </sect2>
1710
1711 <!-- ===================== MONAD COMPREHENSIONS ===================== -->
1712
1713 <sect2 id="monad-comprehensions">
1714 <title>Monad comprehensions</title>
1715 <indexterm><primary>monad comprehensions</primary></indexterm>
1716
1717 <para>
1718 Monad comprehensions generalise the list comprehension notation,
1719 including parallel comprehensions
1720 (<xref linkend="parallel-list-comprehensions"/>) and
1721 transform comprehensions (<xref linkend="generalised-list-comprehensions"/>)
1722 to work for any monad.
1723 </para>
1724
1725 <para>Monad comprehensions support:</para>
1726
1727 <itemizedlist>
1728 <listitem>
1729 <para>
1730 Bindings:
1731 </para>
1732
1733 <programlisting>
1734 [ x + y | x &lt;- Just 1, y &lt;- Just 2 ]
1735 </programlisting>
1736
1737 <para>
1738 Bindings are translated with the <literal>(&gt;&gt;=)</literal> and
1739 <literal>return</literal> functions to the usual do-notation:
1740 </para>
1741
1742 <programlisting>
1743 do x &lt;- Just 1
1744 y &lt;- Just 2
1745 return (x+y)
1746 </programlisting>
1747
1748 </listitem>
1749 <listitem>
1750 <para>
1751 Guards:
1752 </para>
1753
1754 <programlisting>
1755 [ x | x &lt;- [1..10], x &lt;= 5 ]
1756 </programlisting>
1757
1758 <para>
1759 Guards are translated with the <literal>guard</literal> function,
1760 which requires a <literal>MonadPlus</literal> instance:
1761 </para>
1762
1763 <programlisting>
1764 do x &lt;- [1..10]
1765 guard (x &lt;= 5)
1766 return x
1767 </programlisting>
1768
1769 </listitem>
1770 <listitem>
1771 <para>
1772 Transform statements (as with <literal>-XTransformListComp</literal>):
1773 </para>
1774
1775 <programlisting>
1776 [ x+y | x &lt;- [1..10], y &lt;- [1..x], then take 2 ]
1777 </programlisting>
1778
1779 <para>
1780 This translates to:
1781 </para>
1782
1783 <programlisting>
1784 do (x,y) &lt;- take 2 (do x &lt;- [1..10]
1785 y &lt;- [1..x]
1786 return (x,y))
1787 return (x+y)
1788 </programlisting>
1789
1790 </listitem>
1791 <listitem>
1792 <para>
1793 Group statements (as with <literal>-XTransformListComp</literal>):
1794 </para>
1795
1796 <programlisting>
1797 [ x | x &lt;- [1,1,2,2,3], then group by x using GHC.Exts.groupWith ]
1798 [ x | x &lt;- [1,1,2,2,3], then group using myGroup ]
1799 </programlisting>
1800
1801 </listitem>
1802 <listitem>
1803 <para>
1804 Parallel statements (as with <literal>-XParallelListComp</literal>):
1805 </para>
1806
1807 <programlisting>
1808 [ (x+y) | x &lt;- [1..10]
1809 | y &lt;- [11..20]
1810 ]
1811 </programlisting>
1812
1813 <para>
1814 Parallel statements are translated using the
1815 <literal>mzip</literal> function, which requires a
1816 <literal>MonadZip</literal> instance defined in
1817 <ulink url="&libraryBaseLocation;/Control-Monad-Zip.html"><literal>Control.Monad.Zip</literal></ulink>:
1818 </para>
1819
1820 <programlisting>
1821 do (x,y) &lt;- mzip (do x &lt;- [1..10]
1822 return x)
1823 (do y &lt;- [11..20]
1824 return y)
1825 return (x+y)
1826 </programlisting>
1827
1828 </listitem>
1829 </itemizedlist>
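<para>The following sketch shows bindings and guards in a monad other than lists. It assumes
only the <literal>Maybe</literal> instances from <literal>base</literal>; the function names
are invented for this example.</para>

<programlisting>
{-# LANGUAGE MonadComprehensions #-}

import Control.Monad (guard)

-- Bindings in the Maybe monad.
sumMaybe :: Maybe Int
sumMaybe = [ x + y | x &lt;- Just 1, y &lt;- Just 2 ]

-- A guard in the Maybe monad: Nothing when the divisor is zero.
safeDiv :: Int -> Int -> Maybe Int
safeDiv n d = [ n `div` d | d /= 0 ]

-- The same function in do-notation, as in the translation of guards above.
safeDivDo :: Int -> Int -> Maybe Int
safeDivDo n d = do
  guard (d /= 0)
  return (n `div` d)
</programlisting>

<para>With these definitions, <literal>safeDiv 6 3</literal> and
<literal>safeDivDo 6 3</literal> should both be <literal>Just 2</literal>, and a zero divisor
should give <literal>Nothing</literal> in both cases.</para>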
1830
1831 <para>
1832 All these features are enabled by default if the
1833 <literal>MonadComprehensions</literal> extension is enabled. The types
1834 and more detailed examples on how to use comprehensions are explained
1835 in the previous sections <xref
1836 linkend="generalised-list-comprehensions"/> and <xref
1837 linkend="parallel-list-comprehensions"/>. In general you just have
1838 to replace the type <literal>[a]</literal> with the type
1839 <literal>Monad m => m a</literal> for monad comprehensions.
1840 </para>
1841
1842 <para>
1843 Note: Even though most of these examples are using the list monad,
1844 monad comprehensions work for any monad.
1845 The <literal>base</literal> package offers all necessary instances for
1846 lists, which make <literal>MonadComprehensions</literal> backward
1847 compatible with built-in, transform and parallel list comprehensions.
1848 </para>
1849 <para> More formally, the desugaring is as follows. We write <literal>D[ e | Q]</literal>
1850 to mean the desugaring of the monad comprehension <literal>[ e | Q]</literal>:
1851 <programlisting>
1852 Expressions: e, e'
1853 Declarations: d
1854 Lists of qualifiers: Q,R,S
1855
1856 -- Basic forms
1857 D[ e | ] = return e
1858 D[ e | p &lt;- e', Q ] = e' &gt;&gt;= \p -&gt; D[ e | Q ]
1859 D[ e | e', Q ] = guard e' &gt;&gt; D[ e | Q ]
1860 D[ e | let d, Q ] = let d in D[ e | Q ]
1861
1862 -- Parallel comprehensions (iterate for multiple parallel branches)
1863 D[ e | (Q | R), S ] = mzip D[ Qv | Q ] D[ Rv | R ] &gt;&gt;= \(Qv,Rv) -&gt; D[ e | S ]
1864
1865 -- Transform comprehensions
1866 D[ e | Q then f, R ] = f D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]
1867
1868 D[ e | Q then f by b, R ] = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]
1869
1870 D[ e | Q then group using f, R ] = f D[ Qv | Q ] &gt;&gt;= \ys -&gt;
1871 case (fmap selQv1 ys, ..., fmap selQvn ys) of
1872 Qv -&gt; D[ e | R ]
1873
1874 D[ e | Q then group by b using f, R ] = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \ys -&gt;
1875 case (fmap selQv1 ys, ..., fmap selQvn ys) of
1876 Qv -&gt; D[ e | R ]
1877
1878 where Qv is the tuple of variables bound by Q (and used subsequently)
1879 selQvi is a selector mapping Qv to the ith component of Qv
1880
1881 Operator Standard binding Expected type
1882 --------------------------------------------------------------------
1883 return GHC.Base t1 -&gt; m t2
1884 (&gt;&gt;=) GHC.Base m1 t1 -&gt; (t2 -&gt; m2 t3) -&gt; m3 t3
1885 (&gt;&gt;) GHC.Base m1 t1 -&gt; m2 t2 -&gt; m3 t3
1886 guard Control.Monad t1 -&gt; m t2
1887 fmap GHC.Base forall a b. (a-&gt;b) -&gt; n a -&gt; n b
1888 mzip Control.Monad.Zip forall a b. m a -&gt; m b -&gt; m (a,b)
1889 </programlisting>
1890 The comprehension should typecheck when its desugaring would typecheck,
1891 except that (as discussed in <xref linkend="generalised-list-comprehensions"/>)
1892 in the "then f" and "then group using f" clauses,
1893 when the "by b" qualifier is omitted, argument f should have a polymorphic type.
1894 In particular, "then Data.List.sort" and
1895 "then group using Data.List.group" are insufficiently polymorphic.
1896 </para>
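<para>As a worked instance of the rules for generators, guards and parallel branches, the
sketch below writes a small comprehension out by hand. It uses the list monad and the standard
bindings; the names <literal>sugared</literal> and <literal>desugared</literal> are invented
for this illustration, and the hand-written form follows the rules above rather than GHC's
actual output.</para>

<programlisting>
{-# LANGUAGE MonadComprehensions, ParallelListComp #-}

import Control.Monad (guard)
import Control.Monad.Zip (mzip)

-- Two parallel branches; the second one carries a guard.
sugared :: [Int]
sugared = [ x * y | x &lt;- [1 .. 4] | y &lt;- [10, 11, 12, 13], even y ]

-- Each branch is desugared on its own, the results are combined with
-- mzip, and the body is then run on every pair.
desugared :: [Int]
desugared =
  mzip ([1 .. 4] &gt;&gt;= \x -&gt; return x)
       ([10, 11, 12, 13] &gt;&gt;= \y -&gt; guard (even y) &gt;&gt; return y)
    &gt;&gt;= \(x, y) -&gt; return (x * y)
</programlisting>

<para>The guard leaves only <literal>10</literal> and <literal>12</literal> in the second
branch, so both definitions should evaluate to <literal>[10, 24]</literal>.</para>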
1897 <para>
1898 Monad comprehensions support rebindable syntax (<xref linkend="rebindable-syntax"/>).
1899 Without rebindable
1900 syntax, the operators from the "standard binding" module are used; with
1901 rebindable syntax, the operators are looked up in the current lexical scope.
1902 For example, parallel comprehensions will be typechecked and desugared
1903 using whatever "<literal>mzip</literal>" is in scope.
1904 </para>
1905 <para>
1906 The rebindable operators must have the "Expected type" given in the
1907 table above. These types are surprisingly general. For example, you can
1908 use a bind operator with the type
1909 <programlisting>
1910 (>>=) :: T x y a -> (a -> T y z b) -> T x z b
1911 </programlisting>
1912 In the case of transform comprehensions, notice that the groups are
1913 parameterised over some arbitrary type <literal>n</literal> (provided it
1914 has an <literal>fmap</literal>), as well as
1915 the comprehension being over an arbitrary monad.
1916 </para>
1917 </sect2>
1918
1919 <!-- ===================== REBINDABLE SYNTAX =================== -->
1920
1921 <sect2 id="rebindable-syntax">
1922 <title>Rebindable syntax and the implicit Prelude import</title>
1923
1924 <para><indexterm><primary>-XNoImplicitPrelude
1925 option</primary></indexterm> GHC normally imports
1926 <filename>Prelude.hi</filename> files for you. If you'd
1927 rather it didn't, then give it a
1928 <option>-XNoImplicitPrelude</option> option. The idea is
1929 that you can then import a Prelude of your own. (But don't
1930 call it <literal>Prelude</literal>; the Haskell module
1931 namespace is flat, and you must not conflict with any
1932 Prelude module.)</para>
1933
1934 <para>Suppose you are importing a Prelude of your own
1935 in order to define your own numeric class
1936 hierarchy. It completely defeats that purpose if the
1937 literal "1" means "<literal>Prelude.fromInteger
1938 1</literal>", which is what the Haskell Report specifies.
1939 So the <option>-XRebindableSyntax</option>
1940 flag causes
1941 the following pieces of built-in syntax to refer to
1942 <emphasis>whatever is in scope</emphasis>, not the Prelude
1943 versions:
1944 <itemizedlist>
1945 <listitem>
1946 <para>An integer literal <literal>368</literal> means
1947 "<literal>fromInteger (368::Integer)</literal>", rather than
1948 "<literal>Prelude.fromInteger (368::Integer)</literal>".
1949 </para> </listitem>
1950
1951 <listitem><para>Fractional literals are handled in just the same way,
1952 except that the translation is
1953 <literal>fromRational (3.68::Rational)</literal>.
1954 </para> </listitem>
1955
1956 <listitem><para>The equality test in an overloaded numeric pattern
1957 uses whatever <literal>(==)</literal> is in scope.
1958 </para> </listitem>
1959
1960 <listitem><para>The subtraction operation, and the
1961 greater-than-or-equal test, in <literal>n+k</literal> patterns
1962 use whatever <literal>(-)</literal> and <literal>(>=)</literal> are in scope.
1963 </para></listitem>
1964
1965 <listitem>
1966 <para>Negation (e.g. "<literal>- (f x)</literal>")
1967 means "<literal>negate (f x)</literal>", both in numeric
1968 patterns, and expressions.
1969 </para></listitem>
1970
1971 <listitem>
1972 <para>Conditionals (e.g. "<literal>if</literal> e1 <literal>then</literal> e2 <literal>else</literal> e3")
1973 mean "<literal>ifThenElse</literal> e1 e2 e3". However, <literal>case</literal> expressions are unaffected.
1974 </para></listitem>
1975
1976 <listitem>
1977 <para>"Do" notation is translated using whatever
1978 functions <literal>(>>=)</literal>,
1979 <literal>(>>)</literal>, and <literal>fail</literal>,
1980 are in scope (not the Prelude
1981 versions). List comprehensions, <literal>mdo</literal>
1982 (<xref linkend="recursive-do-notation"/>), and parallel array
1983 comprehensions, are unaffected. </para></listitem>
1984
1985 <listitem>
1986 <para>Arrow
1987 notation (see <xref linkend="arrow-notation"/>)
1988 uses whatever <literal>arr</literal>,
1989 <literal>(>>>)</literal>, <literal>first</literal>,
1990 <literal>app</literal>, <literal>(|||)</literal> and
1991 <literal>loop</literal> functions are in scope. But unlike the
1992 other constructs, the types of these functions must match the
1993 Prelude types very closely. Details are in flux; if you want
1994 to use this, ask!
1995 </para></listitem>
1996 </itemizedlist>
1997 <option>-XRebindableSyntax</option> implies <option>-XNoImplicitPrelude</option>.
1998 </para>
1999 <para>
2000 In all cases (apart from arrow notation), the static semantics should be that of the desugared form,
2001 even if that is a little unexpected. For example, the
2002 static semantics of the literal <literal>368</literal>
2003 is exactly that of <literal>fromInteger (368::Integer)</literal>; it's fine for
2004 <literal>fromInteger</literal> to have any of the types:
2005 <programlisting>
2006 fromInteger :: Integer -> Integer
2007 fromInteger :: forall a. Foo a => Integer -> a
2008 fromInteger :: Num a => a -> Integer
2009 fromInteger :: Integer -> Bool -> Bool
2010 </programlisting>
2011 </para>
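<para>The sketch below shows how the rebinding of literals and conditionals might look in
practice. The type <literal>Cents</literal> and the functions <literal>unCents</literal> and
<literal>cheaper</literal> are hypothetical names chosen for this example. The Prelude is
imported explicitly because <option>-XRebindableSyntax</option> turns off the implicit
import.</para>

<programlisting>
{-# LANGUAGE RebindableSyntax #-}

import Prelude hiding (fromInteger)

-- A hypothetical numeric wrapper.
newtype Cents = Cents Integer

unCents :: Cents -> Integer
unCents (Cents n) = n

-- This fromInteger, not Prelude.fromInteger, is used for every integer
-- literal in the module, so the definition itself avoids literals.
fromInteger :: Integer -> Cents
fromInteger = Cents

-- Used for every if-then-else expression in the module.
ifThenElse :: Bool -> a -> a -> a
ifThenElse True  t _ = t
ifThenElse False _ e = e

price :: Cents
price = 368          -- means fromInteger (368 :: Integer)

cheaper :: Cents -> Cents -> Cents
cheaper a b = if unCents a &lt;= unCents b then a else b
</programlisting>

<para>Note that every integer literal in such a module has type <literal>Cents</literal>, so
a definition like <literal>n + 1</literal> on plain <literal>Integer</literal> values would no
longer typecheck here.</para>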
2012
2013 <para>Be warned: this is an experimental facility, with
2014 fewer checks than usual. Use <literal>-dcore-lint</literal>
2015 to typecheck the desugared program. If Core Lint is happy
2016 you should be all right.</para>
2017
2018 </sect2>
2019
2020 <sect2 id="postfix-operators">
2021 <title>Postfix operators</title>
2022
2023 <para>
2024 The <option>-XPostfixOperators</option> flag enables a small
2025 extension to the syntax of left operator sections, which allows you to
2026 define postfix operators. The extension is this: the left section
2027 <programlisting>
2028 (e !)
2029 </programlisting>
2030 is equivalent (from the point of view of both type checking and execution) to the expression
2031 <programlisting>
2032 ((!) e)
2033 </programlisting>
2034 (for any expression <literal>e</literal> and operator <literal>(!)</literal>).
2035 The strict Haskell 98 interpretation is that the section is equivalent to
2036 <programlisting>
2037 (\y -> (!) e y)
2038 </programlisting>
2039 That is, the operator must be a function of two arguments. GHC allows it to
2040 take only one argument, and that in turn allows you to write the function
2041 postfix.
2042 </para>
2043 <para>The extension does not extend to the left-hand side of function
2044 definitions; you must define such a function in prefix form.</para>
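<para>For instance, a postfix factorial could be sketched as follows. The operator is defined
in prefix form, as required, and only its use is postfix. The name <literal>fact5</literal> is
invented for this example.</para>

<programlisting>
{-# LANGUAGE PostfixOperators #-}

-- A one-argument operator, defined in prefix form.
(!) :: Integer -> Integer
(!) n = product [1 .. n]

-- The left section (5 !) is treated as ((!) 5).
fact5 :: Integer
fact5 = (5 !)
</programlisting>

<para>Here <literal>fact5</literal> should evaluate to <literal>120</literal>. Without the
flag the section would be read as <literal>\y -> (!) 5 y</literal> and rejected by the type
checker.</para>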
2045
2046 </sect2>
2047
2048 <sect2 id="tuple-sections">
2049 <title>Tuple sections</title>
2050
2051 <para>
2052 The <option>-XTupleSections</option> flag enables Python-style partially applied
2053 tuple constructors. For example, the expression
2054 <programlisting>
2055 (, True)
2056 </programlisting>
2057 is an alternative notation for the more unwieldy
2058 <programlisting>
2059 \x -> (x, True)
2060 </programlisting>
2061 You can omit any combination of arguments to the tuple, as in the following
2062 <programlisting>
2063 (, "I", , , "Love", , 1337)
2064 </programlisting>
2065 which translates to
2066 <programlisting>
2067 \a b c d -> (a, "I", b, c, "Love", d, 1337)
2068 </programlisting>
2069 </para>
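<para>A short sketch of tuple sections in ordinary code follows; the names
<literal>tagged</literal> and <literal>triple</literal> are invented for this example.</para>

<programlisting>
{-# LANGUAGE TupleSections #-}

-- A section with the first component missing, used with map.
tagged :: [(Int, Bool)]
tagged = map (, True) [1, 2, 3]

-- A section with two missing components becomes a two-argument function.
triple :: Char -> Int -> (Char, String, Int)
triple = (, "mid", )
</programlisting>

<para>Here <literal>tagged</literal> should be
<literal>[(1,True),(2,True),(3,True)]</literal> and <literal>triple 'a' 1</literal> should be
<literal>('a',"mid",1)</literal>.</para>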
2070
2071 <para>
2072 If you have <link linkend="unboxed-tuples">unboxed tuples</link> enabled, tuple sections
2073 will also be available for them, like so
2074 <programlisting>
2075 (# , True #)
2076 </programlisting>
2077 Because there is no unboxed unit tuple, the following expression
2078 <programlisting>
2079 (# #)
2080 </programlisting>
2081 continues to stand for the unboxed singleton tuple data constructor.
2082 </para>
2083
2084 </sect2>
2085
2086 <sect2 id="lambda-case">
2087 <title>Lambda-case</title>
2088 <para>
2089 The <option>-XLambdaCase</option> flag enables expressions of the form
2090 <programlisting>
2091 \case { p1 -> e1; ...; pN -> eN }
2092 </programlisting>
2093 which is equivalent to
2094 <programlisting>
2095 \freshName -> case freshName of { p1 -> e1; ...; pN -> eN }
2096 </programlisting>
2097 Note that <literal>\case</literal> starts a layout, so you can write
2098 <programlisting>
2099 \case
2100 p1 -> e1
2101 ...
2102 pN -> eN
2103 </programlisting>
2104 </para>
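<para>As a small sketch of the layout form, with an invented function name
<literal>describe</literal>:</para>

<programlisting>
{-# LANGUAGE LambdaCase #-}

-- The argument is scrutinised directly, without naming it first.
describe :: Maybe Int -> String
describe = \case
  Nothing -> "nothing"
  Just n | n &lt; 0     -> "negative"
         | otherwise -> "non-negative"
</programlisting>

<para>Guards work in the alternatives exactly as they do in an ordinary
<literal>case</literal> expression.</para>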
2105 </sect2>
2106
2107 <sect2 id="empty-case">
2108 <title>Empty case alternatives</title>
2109 <para>
2110 The <option>-XEmptyCase</option> flag enables
2111 case expressions, or lambda-case expressions, that have no alternatives,
2112 thus:
2113 <programlisting>
2114 case e of { } -- No alternatives
2115 or
2116 \case { } -- -XLambdaCase is also required
2117 </programlisting>
2118 This can be useful when you know that the expression being scrutinised
2119 has no non-bottom values. For example:
2120 <programlisting>
2121 data Void
2122 f :: Void -> Int
2123 f x = case x of { }
2124 </programlisting>
2125 With dependently-typed features it is more useful
2126 (see <ulink url="http://ghc.haskell.org/trac/ghc/ticket/2431">Trac</ulink>).
2127 For example, consider these two candidate definitions of <literal>absurd</literal>:
2128 <programlisting>
2129 data a :~: b where
2130 Refl :: a :~: a
2131
2132 absurd :: True :~: False -> a
2133 absurd x = error "absurd" -- (A)
2134 absurd x = case x of {} -- (B)
2135 </programlisting>
2136 We much prefer (B). Why? Because GHC can figure out that <literal>(True :~: False)</literal>
2137 is an empty type. So (B) has no partiality and GHC should be able to compile with
2138 <option>-fwarn-incomplete-patterns</option>. (Though the pattern match checking is not
2139 yet clever enough to do that.)
2140 On the other hand (A) looks dangerous, and GHC doesn't check to make
2141 sure that, in fact, the function can never get called.
2142 </para>
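<para>A minimal sketch of how an empty case might be used with an uninhabited type follows.
The function <literal>fromLeftOnly</literal> is an invented name for this example.</para>

<programlisting>
{-# LANGUAGE EmptyCase #-}

-- A type with no constructors.
data Void

-- No alternatives are needed, because Void has no non-bottom values.
absurd :: Void -> a
absurd v = case v of { }

-- The Right branch cannot carry a real value, so absurd discharges it.
fromLeftOnly :: Either a Void -> a
fromLeftOnly (Left x)  = x
fromLeftOnly (Right v) = absurd v
</programlisting>

<para>This keeps <literal>fromLeftOnly</literal> free of calls to <literal>error</literal>
while still covering both constructors of <literal>Either</literal>.</para>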
2143 </sect2>
2144
2145 <sect2 id="multi-way-if">
2146 <title>Multi-way if-expressions</title>
2147 <para>
2148 With the <option>-XMultiWayIf</option> flag, GHC accepts conditional expressions
2149 with multiple branches:
2150 <programlisting>
2151 if | guard1 -> expr1
2152 | ...
2153 | guardN -> exprN
2154 </programlisting>
2155 which is roughly equivalent to
2156 <programlisting>
2157 case () of
2158 _ | guard1 -> expr1
2159 ...
2160 _ | guardN -> exprN
2161 </programlisting>
2162 </para>
2163
2164 <para>Multi-way if expressions introduce a new layout context. So the
2165 example above is equivalent to:
2166 <programlisting>
2167 if { | guard1 -> expr1
2168 ; | ...
2169 ; | guardN -> exprN
2170 }
2171 </programlisting>
2172 The following behaves as expected:
2173 <programlisting>
2174 if | guard1 -> if | guard2 -> expr2
2175 | guard3 -> expr3
2176 | guard4 -> expr4
2177 </programlisting>
2178 because layout translates it as
2179 <programlisting>
2180 if { | guard1 -> if { | guard2 -> expr2
2181 ; | guard3 -> expr3
2182 }
2183 ; | guard4 -> expr4
2184 }
2185 </programlisting>
2186 Layout with multi-way if works in the same way as other layout
2187 contexts, except that the semi-colons between guards in a multi-way if
2188 are optional. So it is not necessary to line up all the guards at the
2189 same column; this is consistent with the way guards work in function
2190 definitions and case expressions.
2191 </para>
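<para>A concrete instance of the schematic form above, with an invented function name
<literal>classify</literal>, might look like this:</para>

<programlisting>
{-# LANGUAGE MultiWayIf #-}

-- Guards are tried from top to bottom; the first one that holds wins.
classify :: Int -> String
classify n = if | n &lt; 0     -> "negative"
                | n == 0    -> "zero"
                | otherwise -> "positive"
</programlisting>

<para>As with guards elsewhere, a final <literal>otherwise</literal> branch avoids a run-time
failure when no other guard matches.</para>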
2192 </sect2>
2193
2194 <sect2 id="disambiguate-fields">
2195 <title>Record field disambiguation</title>
2196 <para>
2197 In record construction and record pattern matching
2198 it is entirely unambiguous which field is referred to, even if there are two different
2199 data types in scope with a common field name. For example:
2200 <programlisting>
2201 module M where
2202 data S = MkS { x :: Int, y :: Bool }
2203
2204 module Foo where
2205 import M
2206
2207 data T = MkT { x :: Int }
2208
2209 ok1 (MkS { x = n }) = n+1 -- Unambiguous
2210 ok2 n = MkT { x = n+1 } -- Unambiguous
2211
2212 bad1 k = k { x = 3 } -- Ambiguous
2213 bad2 k = x k -- Ambiguous
2214 </programlisting>
2215 Even though there are two <literal>x</literal>'s in scope,
2216 it is clear that the <literal>x</literal> in the pattern in the
2217 definition of <literal>ok1</literal> can only mean the field
2218 <literal>x</literal> from type <literal>S</literal>. Similarly for
2219 the function <literal>ok2</literal>. However, in the record update
2220 in <literal>bad1</literal> and the record selection in <literal>bad2</literal>
2221 it is not clear which of the two types is intended.
2222 </para>
2223 <para>
2224 Haskell 98 regards all four as ambiguous, but with the
2225 <option>-XDisambiguateRecordFields</option> flag, GHC will accept
2226 the former two. The rules are precisely the same as those for instance
2227 declarations in Haskell 98, where the method names on the left-hand side
2228 of the method bindings in an instance declaration refer unambiguously
2229 to the method of that class (provided they are in scope at all), even
2230 if there are other variables in scope with the same name.
2231 This reduces the clutter of qualified names when you import two
2232 records from different modules that use the same field name.
2233 </para>
2234 <para>
2235 Some details:
2236 <itemizedlist>
2237 <listitem><para>
2238 Field disambiguation can be combined with punning (see <xref linkend="record-puns"/>). For example:
2239 <programlisting>
2240 module Foo where
2241 import M
2242 x=True
2243 ok3 (MkS { x }) = x+1 -- Uses both disambiguation and punning
2244 </programlisting>
2245 </para></listitem>
2246
2247 <listitem><para>
2248 With <option>-XDisambiguateRecordFields</option> you can use <emphasis>unqualified</emphasis>
2249 field names even if the corresponding selector is only in scope <emphasis>qualified</emphasis>.
2250 For example, assuming the same module <literal>M</literal> as in our earlier example, this is legal:
2251 <programlisting>
2252 module Foo where
2253 import qualified M -- Note qualified
2254
2255 ok4 (M.MkS { x = n }) = n+1 -- Unambiguous
2256 </programlisting>
2257 Since the constructor <literal>MkS</literal> is only in scope qualified, you must
2258 name it <literal>M.MkS</literal>, but the field <literal>x</literal> does not need
2259 to be qualified even though <literal>M.x</literal> is in scope but <literal>x</literal>
2260 is not. (In effect, it is qualified by the constructor.)
2261 </para></listitem>
2262 </itemizedlist>
2263 </para>
2264
2265 </sect2>
2266
2267 <!-- ===================== Record puns =================== -->
2268
2269 <sect2 id="record-puns">
2270 <title>Record puns
2271 </title>
2272
2273 <para>
2274 Record puns are enabled by the flag <literal>-XNamedFieldPuns</literal>.
2275 </para>
2276
2277 <para>
2278 When using records, it is common to write a pattern that binds a
2279 variable with the same name as a record field, such as:
2280
2281 <programlisting>
2282 data C = C {a :: Int}
2283 f (C {a = a}) = a
2284 </programlisting>
2285 </para>
2286
2287 <para>
2288 Record punning permits the variable name to be elided, so one can simply
2289 write
2290
2291 <programlisting>
2292 f (C {a}) = a
2293 </programlisting>
2294
2295 to mean the same pattern as above. That is, in a record pattern, the
2296 pattern <literal>a</literal> expands into the pattern <literal>a =
2297 a</literal> for the same name <literal>a</literal>.
2298 </para>
2299
2300 <para>
2301 Note that:
2302 <itemizedlist>
2303 <listitem><para>
2304 Record punning can also be used in an expression, writing, for example,
2305 <programlisting>
2306 let a = 1 in C {a}
2307 </programlisting>
2308 instead of
2309 <programlisting>
2310 let a = 1 in C {a = a}
2311 </programlisting>
2312 The expansion is purely syntactic, so the expanded right-hand side
2313 expression refers to the nearest enclosing variable that is spelled the
2314 same as the field name.
2315 </para></listitem>
2316
2317 <listitem><para>
2318 Puns and other patterns can be mixed in the same record:
2319 <programlisting>
2320 data C = C {a :: Int, b :: Int}
2321 f (C {a, b = 4}) = a
2322 </programlisting>
2323 </para></listitem>
2324
2325 <listitem><para>
2326 Puns can be used wherever record patterns occur (e.g. in
2327 <literal>let</literal> bindings or at the top-level).
2328 </para></listitem>
2329
2330 <listitem><para>
2331 A pun on a qualified field name is expanded by stripping off the module qualifier.
2332 For example:
2333 <programlisting>
2334 f (M.C {M.a}) = a
2335 </programlisting>
2336 means
2337 <programlisting>
2338 f (M.C {M.a = a}) = a
2339 </programlisting>
2340 (This is useful if the field selector <literal>a</literal> for constructor <literal>M.C</literal>
2341 is only in scope in qualified form.)
2342 </para></listitem>
2343 </itemizedlist>
2344 </para>
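<para>The following sketch puts puns in patterns and in expressions side by side. The type
<literal>Point</literal> and the function names are invented for this example.</para>

<programlisting>
{-# LANGUAGE NamedFieldPuns #-}

data Point = Point { px :: Int, py :: Int }

-- Puns in a pattern: px and py are bound to the field values.
norm1 :: Point -> Int
norm1 (Point {px, py}) = abs px + abs py

-- Puns in an expression: the fields are filled from the local px and py.
mkDiag :: Int -> Point
mkDiag n = let px = n; py = n in Point {px, py}
</programlisting>

<para>With these definitions <literal>norm1 (mkDiag 3)</literal> should evaluate to
<literal>6</literal>.</para>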
2345
2346
2347 </sect2>
2348
2349 <!-- ===================== Record wildcards =================== -->
2350
2351 <sect2 id="record-wildcards">
2352 <title>Record wildcards
2353 </title>
2354
2355 <para>
2356 Record wildcards are enabled by the flag <literal>-XRecordWildCards</literal>.
2357 This flag implies <literal>-XDisambiguateRecordFields</literal>.
2358 </para>
2359
2360 <para>
2361 For records with many fields, it can be tiresome to write out each field
2362 individually in a record pattern, as in
2363 <programlisting>
2364 data C = C {a :: Int, b :: Int, c :: Int, d :: Int}
2365 f (C {a = 1, b = b, c = c, d = d}) = b + c + d
2366 </programlisting>
2367 </para>
2368
2369 <para>
2370 Record wildcard syntax permits a "<literal>..</literal>" in a record
2371 pattern, where each elided field <literal>f</literal> is replaced by the
2372 pattern <literal>f = f</literal>. For example, the above pattern can be
2373 written as
2374 <programlisting>
2375 f (C {a = 1, ..}) = b + c + d
2376 </programlisting>
2377 </para>
2378
2379 <para>
2380 More details:
2381 <itemizedlist>
2382 <listitem><para>
2383 Record wildcards in patterns can be mixed with other patterns, including puns
2384 (<xref linkend="record-puns"/>); for example, in a pattern <literal>(C {a
2385 = 1, b, ..})</literal>. Additionally, record wildcards can be used
2386 wherever record patterns occur, including in <literal>let</literal>
2387 bindings and at the top-level. For example, the top-level binding
2388 <programlisting>
2389 C {a = 1, ..} = e
2390 </programlisting>
2391 defines <literal>b</literal>, <literal>c</literal>, and
2392 <literal>d</literal>.
2393 </para></listitem>
2394
2395 <listitem><para>
2396 Record wildcards can also be used in an expression, when constructing a record. For example,
2397 <programlisting>
2398 let {a = 1; b = 2; c = 3; d = 4} in C {..}
2399 </programlisting>
2400 in place of
2401 <programlisting>
2402 let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}
2403 </programlisting>
2404 The expansion is purely syntactic, so the record wildcard
2405 expression refers to the nearest enclosing variables that are spelled
2406 the same as the omitted field names.
2407 </para></listitem>
2408
2409 <listitem><para>
2410 Record wildcards may <emphasis>not</emphasis> be used in record <emphasis>updates</emphasis>. For example this
2411 is illegal:
2412 <programlisting>
2413 f r = r { x = 3, .. }
2414 </programlisting>
2415 </para></listitem>
2416
2417 <listitem><para>
2418 For both pattern and expression wildcards, the "<literal>..</literal>" expands to the missing
2419 <emphasis>in-scope</emphasis> record fields.
2420 Specifically the expansion of "<literal>C {..}</literal>" includes
2421 <literal>f</literal> if and only if:
2422 <itemizedlist>
2423 <listitem><para>
2424 <literal>f</literal> is a record field of constructor <literal>C</literal>.
2425 </para></listitem>
2426 <listitem><para>
2427 The record field <literal>f</literal> is in scope somehow (either qualified or unqualified).
2428 </para></listitem>
2429 <listitem><para>
2430 In the case of expressions (but not patterns),
2431 the variable <literal>f</literal> is in scope unqualified,
2432 apart from the binding of the record selector itself.
2433 </para></listitem>
2434 </itemizedlist>
2435 These rules restrict record wildcards to the situations in which the user
2436 could have written the expanded version.
2437 For example
2438 <programlisting>
2439 module M where
2440 data R = R { a,b,c :: Int }
2441 module X where
2442 import M( R(R,a,c) )
2443 f a b = R { .. }
2444 </programlisting>
2445 The <literal>R{..}</literal> expands to <literal>R{M.a=a}</literal>,
2446 omitting <literal>b</literal> since the record field is not in scope,
2447 and omitting <literal>c</literal> since the variable <literal>c</literal>
2448 is not in scope (apart from the binding of the
2449 record selector <literal>c</literal>, of course).
2450 </para></listitem>
2451
2452 <listitem><para>
2453 Record wildcards cannot be used (a) in a record update construct, and (b) for data
2454 constructors that are not declared with record fields. For example:
2455 <programlisting>
2456 f x = x { v=True, .. } -- Illegal (a)
2457
2458 data T = MkT Int Bool
2459 g = MkT { .. } -- Illegal (b)
2460 h (MkT { .. }) = True -- Illegal (b)
2461 </programlisting>
2462 </para></listitem>
2463 </itemizedlist>
2464 </para>
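<para>As an illustration, the sketch below uses a wildcard once in a pattern and once in a
record construction. The type <literal>Config</literal> and its fields are invented for this
example.</para>

<programlisting>
{-# LANGUAGE RecordWildCards #-}

data Config = Config { host :: String, port :: Int, debug :: Bool }

-- Pattern wildcard: binds host, port and debug to the field values.
describe :: Config -> String
describe (Config {..}) =
  host ++ ":" ++ show port ++ (if debug then " [debug]" else "")

-- Expression wildcard: each field is filled from the local variable
-- of the same name.
defaultConfig :: Config
defaultConfig =
  let host = "localhost"; port = 8080; debug = False in Config {..}
</programlisting>

<para>Here <literal>describe defaultConfig</literal> should evaluate to
<literal>"localhost:8080"</literal>.</para>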
2465
2466 </sect2>
2467
2468 <!-- ===================== Local fixity declarations =================== -->
2469
2470 <sect2 id="local-fixity-declarations">
2471 <title>Local Fixity Declarations
2472 </title>
2473
2474 <para>A careful reading of the Haskell 98 Report reveals that fixity
2475 declarations (<literal>infix</literal>, <literal>infixl</literal>, and
2476 <literal>infixr</literal>) are permitted to appear inside local bindings
2477 such as those introduced by <literal>let</literal> and
2478 <literal>where</literal>. However, the Haskell Report does not specify
2479 the semantics of such bindings very precisely.
2480 </para>
2481
2482 <para>In GHC, a fixity declaration may accompany a local binding:
2483 <programlisting>
2484 let f = ...
2485 infixr 3 `f`
2486 in
2487 ...
2488 </programlisting>
2489 and the fixity declaration applies wherever the binding is in scope.
2490 For example, in a <literal>let</literal>, it applies in the right-hand
2491 sides of other <literal>let</literal>-bindings and the body of the
2492 <literal>let</literal>. Or, in recursive <literal>do</literal>
2493 expressions (<xref linkend="recursive-do-notation"/>), the local fixity
2494 declarations of a <literal>let</literal> statement scope over other
2495 statements in the group, just as the bound name does.
2496 </para>
2497
2498 <para>
2499 Moreover, a local fixity declaration <emphasis>must</emphasis> accompany a local binding of
2500 that name: it is not possible to revise the fixity of a name bound
2501 elsewhere, as in
2502 <programlisting>
2503 let infixr 9 $ in ...
2504 </programlisting>
2505
2506 Because local fixity declarations are technically Haskell 98, no flag is
2507 necessary to enable them.
2508 </para>
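<para>
A minimal sketch of a local fixity declaration follows. The operator
<literal>+++</literal> is a hypothetical name chosen for illustration. Because
it is declared <literal>infixr</literal>, the body groups to the right; with
<literal>infixl</literal> the same expression would group to the left and
evaluate to 123 instead.
<programlisting>
module Main where

example :: Int
example =
  let infixr 5 +++            -- local fixity, accompanying the binding below
      x +++ y = x * 10 + y
  in 1 +++ 2 +++ 3            -- parsed as 1 +++ (2 +++ 3)

main :: IO ()
main = print example          -- prints 33
</programlisting>
</para>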
2509 </sect2>
2510
2511 <sect2 id="package-imports">
2512 <title>Import and export extensions</title>
2513
2514 <sect3>
2515 <title>Hiding things the imported module doesn't export</title>
2516
2517 <para>
2518 Technically in Haskell 2010 this is illegal:
2519 <programlisting>
2520 module A( f ) where
2521 f = True
2522
2523 module B where
2524 import A hiding( g ) -- A does not export g
2525 g = f
2526 </programlisting>
2527 The <literal>import A hiding( g )</literal> in module <literal>B</literal>
2528 is technically an error (<ulink url="http://www.haskell.org/onlinereport/haskell2010/haskellch5.html#x11-1020005.3.1">Haskell Report, 5.3.1</ulink>)
2529 because <literal>A</literal> does not export <literal>g</literal>.
2530 However, GHC allows it, in the interests of supporting backward compatibility; for example, a newer version of
2531 <literal>A</literal> might export <literal>g</literal>, and you want <literal>B</literal> to work
2532 in either case.
2533 </para>
2534 <para>
2535 The warning <literal>-fwarn-dodgy-imports</literal>, which is off by default but included with <literal>-W</literal>,
2536 warns if you hide something that the imported module does not export.
2537 </para>
2538 </sect3>
2539
2540 <sect3>
2541 <title id="package-qualified-imports">Package-qualified imports</title>
2542
2543 <para>With the <option>-XPackageImports</option> flag, GHC allows
2544 import declarations to be qualified by the package name that the
2545 module is intended to be imported from. For example:</para>
2546
2547 <programlisting>
2548 import "network" Network.Socket
2549 </programlisting>
2550
2551 <para>would import the module <literal>Network.Socket</literal> from
2552 the package <literal>network</literal> (any version). This may
2553 be used to disambiguate an import when the same module is
2554 available from multiple packages, or is present in both the
2555 current package being built and an external package.</para>
2556
2557 <para>The special package name <literal>this</literal> can be used to
2558 refer to the current package being built.</para>
2559
2560 <para>Note: you probably don't need to use this feature; it was
2561 added mainly so that we can build backwards-compatible versions of
2562 packages when APIs change. It can lead to fragile dependencies in
2563 the common case: modules occasionally move from one package to
2564 another, rendering any package-qualified imports broken.
2565 See also <xref linkend="package-thinning-and-renaming" /> for
2566 an alternative way of disambiguating between module names.</para>
2567 </sect3>
2568
2569 <sect3 id="safe-imports-ext">
2570 <title>Safe imports</title>
2571
2572 <para>With the <option>-XSafe</option>, <option>-XTrustworthy</option>
2573 and <option>-XUnsafe</option> language flags, GHC extends
2574 the import declaration syntax to take an optional <literal>safe</literal>
2575 keyword after the <literal>import</literal> keyword. This feature
2576 is part of the Safe Haskell GHC extension. For example:</para>
2577
2578 <programlisting>
2579 import safe qualified Network.Socket as NS
2580 </programlisting>
2581
2582 <para>would import the module <literal>Network.Socket</literal>
2583 with compilation only succeeding if <literal>Network.Socket</literal> can be
2584 safely imported. For a description of when an import is
2585 considered safe, see <xref linkend="safe-haskell"/>.</para>
2586
2587 </sect3>
2588
2589 <sect3 id="explicit-namespaces">
2590 <title>Explicit namespaces in import/export</title>
2591
2592 <para> In an import or export list, such as
2593 <programlisting>
2594 module M( f, (++) ) where ...
2595 import N( f, (++) )
2596 ...
2597 </programlisting>
2598 the entities <literal>f</literal> and <literal>(++)</literal> are <emphasis>values</emphasis>.
2599 However, with type operators (<xref linkend="type-operators"/>) it becomes possible
2600 to declare <literal>(++)</literal> as a <emphasis>type constructor</emphasis>. In that
2601 case, how would you export or import it?
2602 </para>
2603 <para>
2604 The <option>-XExplicitNamespaces</option> extension allows you to prefix the name of
2605 a type constructor in an import or export list with "<literal>type</literal>" to
2606 disambiguate this case, thus:
2607 <programlisting>
2608 module M( f, type (++) ) where ...
2609 import N( f, type (++) )
2610 ...
2611 module N( f, type (++) ) where
2612 data a ++ b = L a | R b
2613 </programlisting>
2614 The extension <option>-XExplicitNamespaces</option>
2615 is implied by <option>-XTypeOperators</option> and (for some reason) by <option>-XTypeFamilies</option>.
2616 </para>
2617 <para>
2618 In addition, with <option>-XPatternSynonyms</option> you can prefix the name of
2619 a data constructor in an import or export list with the keyword <literal>pattern</literal>,
2620 to allow the import or export of a data constructor without its parent type constructor
2621 (see <xref linkend="patsyn-impexp"/>).
2622 </para>
2623 </sect3>
2624
2625 </sect2>
2626
2627 <sect2 id="syntax-stolen">
2628 <title>Summary of stolen syntax</title>
2629
2630 <para>Turning on an option that enables special syntax
2631 <emphasis>might</emphasis> cause working Haskell 98 code to fail
2632 to compile, perhaps because it uses a variable name which has
2633 become a reserved word. This section lists the syntax that is
2634 "stolen" by language extensions.
2635 We use
2636 notation and nonterminal names from the Haskell 98 lexical syntax
2637 (see the Haskell 98 Report).
2638 We only list syntax changes here that might affect
2639 existing working programs (i.e. "stolen" syntax). Many of these
2640 extensions will also enable new context-free syntax, but in all
2641 cases programs written to use the new syntax would not be
2642 compilable without the option enabled.</para>
2643
2644 <para>There are two classes of special
2645 syntax:
2646
2647 <itemizedlist>
2648 <listitem>
2649 <para>New reserved words and symbols: character sequences
2650 which are no longer available for use as identifiers in the
2651 program.</para>
2652 </listitem>
2653 <listitem>
2654 <para>Other special syntax: sequences of characters that have
2655 a different meaning when this particular option is turned
2656 on.</para>
2657 </listitem>
2658 </itemizedlist>
2659
2660 The following syntax is stolen:
2661
2662 <variablelist>
2663 <varlistentry>
2664 <term>
2665 <literal>forall</literal>
2666 <indexterm><primary><literal>forall</literal></primary></indexterm>
2667 </term>
2668 <listitem><para>
2669 Stolen (in types) by: <option>-XExplicitForAll</option>, and hence by
2670 <option>-XScopedTypeVariables</option>,
2671 <option>-XLiberalTypeSynonyms</option>,
2672 <option>-XRankNTypes</option>,
2673 <option>-XExistentialQuantification</option>
2674 </para></listitem>
2675 </varlistentry>
2676
2677 <varlistentry>
2678 <term>
2679 <literal>mdo</literal>
2680 <indexterm><primary><literal>mdo</literal></primary></indexterm>
2681 </term>
2682 <listitem><para>
2683 Stolen by: <option>-XRecursiveDo</option>
2684 </para></listitem>
2685 </varlistentry>
2686
2687 <varlistentry>
2688 <term>
2689 <literal>foreign</literal>
2690 <indexterm><primary><literal>foreign</literal></primary></indexterm>
2691 </term>
2692 <listitem><para>
2693 Stolen by: <option>-XForeignFunctionInterface</option>
2694 </para></listitem>
2695 </varlistentry>
2696
2697 <varlistentry>
2698 <term>
2699 <literal>rec</literal>,
2700 <literal>proc</literal>, <literal>-&lt;</literal>,
2701 <literal>&gt;-</literal>, <literal>-&lt;&lt;</literal>,
2702 <literal>&gt;&gt;-</literal>, and <literal>(|</literal>,
2703 <literal>|)</literal> brackets
2704 <indexterm><primary><literal>proc</literal></primary></indexterm>
2705 </term>
2706 <listitem><para>
2707 Stolen by: <option>-XArrows</option>
2708 </para></listitem>
2709 </varlistentry>
2710
2711 <varlistentry>
2712 <term>
2713 <literal>?<replaceable>varid</replaceable></literal>
2714 <indexterm><primary>implicit parameters</primary></indexterm>
2715 </term>
2716 <listitem><para>
2717 Stolen by: <option>-XImplicitParams</option>
2718 </para></listitem>
2719 </varlistentry>
2720
2721 <varlistentry>
2722 <term>
2723 <literal>[|</literal>,
2724 <literal>[e|</literal>, <literal>[p|</literal>,
2725 <literal>[d|</literal>, <literal>[t|</literal>,
2726 <literal>$(</literal>,
2727 <literal>$$(</literal>,
2728 <literal>[||</literal>,
2729 <literal>[e||</literal>,
2730 <literal>$<replaceable>varid</replaceable></literal>,
2731 <literal>$$<replaceable>varid</replaceable></literal>
2732 <indexterm><primary>Template Haskell</primary></indexterm>
2733 </term>
2734 <listitem><para>
2735 Stolen by: <option>-XTemplateHaskell</option>
2736 </para></listitem>
2737 </varlistentry>
2738
2739 <varlistentry>
2740 <term>
2741 <literal>[<replaceable>varid</replaceable>|</literal>
2742 <indexterm><primary>quasi-quotation</primary></indexterm>
2743 </term>
2744 <listitem><para>
2745 Stolen by: <option>-XQuasiQuotes</option>
2746 </para></listitem>
2747 </varlistentry>
2748
2749 <varlistentry>
2750 <term>
2751 <replaceable>varid</replaceable>{<literal>&num;</literal>},
2752 <replaceable>char</replaceable><literal>&num;</literal>,
2753 <replaceable>string</replaceable><literal>&num;</literal>,
2754 <replaceable>integer</replaceable><literal>&num;</literal>,
2755 <replaceable>float</replaceable><literal>&num;</literal>,
2756 <replaceable>float</replaceable><literal>&num;&num;</literal>
2757 </term>
2758 <listitem><para>
2759 Stolen by: <option>-XMagicHash</option>
2760 </para></listitem>
2761 </varlistentry>
2762
2763 <varlistentry>
2764 <term>
2765 <literal>(&num;</literal>, <literal>&num;)</literal>
2766 </term>
2767 <listitem><para>
2768 Stolen by: <option>-XUnboxedTuples</option>
2769 </para></listitem>
2770 </varlistentry>
2771
2772 <varlistentry>
2773 <term>
2774 <replaceable>varid</replaceable><literal>!</literal><replaceable>varid</replaceable>
2775 </term>
2776 <listitem><para>
2777 Stolen by: <option>-XBangPatterns</option>
2778 </para></listitem>
2779 </varlistentry>
2780
2781 <varlistentry>
2782 <term>
2783 <literal>pattern</literal>
2784 </term>
2785 <listitem><para>
2786 Stolen by: <option>-XPatternSynonyms</option>
2787 </para></listitem>
2788 </varlistentry>
2789 </variablelist>
2790 </para>
2791 </sect2>
2792 </sect1>
2793
2794
2795 <!-- TYPE SYSTEM EXTENSIONS -->
2796 <sect1 id="data-type-extensions">
2797 <title>Extensions to data types and type synonyms</title>
2798
2799 <sect2 id="nullary-types">
2800 <title>Data types with no constructors</title>
2801
2802 <para>With the <option>-XEmptyDataDecls</option> flag (or equivalent LANGUAGE pragma),
2803 GHC lets you declare a data type with no constructors. For example:</para>
2804
2805 <programlisting>
2806 data S -- S :: *
2807 data T a -- T :: * -> *
2808 </programlisting>
2809
2810 <para>Syntactically, the declaration lacks the "= constrs" part. The
2811 type can be parameterised over types of any kind, but if the kind is
2812 not <literal>*</literal> then an explicit kind annotation must be used
2813 (see <xref linkend="kinding"/>).</para>
2814
2815 <para>Such data types have only one value, namely bottom.
2816 Nevertheless, they can be useful when defining "phantom types".</para>
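<para>
As a sketch of the phantom-type use mentioned above, the following program tags
a numeric wrapper with constructor-less types. All names here
(<literal>Metres</literal>, <literal>Feet</literal>,
<literal>Distance</literal>, <literal>add</literal>) are hypothetical.
<programlisting>
{-# LANGUAGE EmptyDataDecls #-}
module Main where

-- Constructor-less types used only as compile-time tags.
data Metres
data Feet

-- The unit parameter is a phantom: it does not appear on the right-hand side.
newtype Distance unit = Distance Double

-- Both arguments must carry the same tag.
add :: Distance u -> Distance u -> Distance u
add (Distance x) (Distance y) = Distance (x + y)

main :: IO ()
main =
  case add (Distance 1.5 :: Distance Metres) (Distance 2) of
    Distance d -> print d   -- prints 3.5
</programlisting>
Applying <literal>add</literal> to a <literal>Distance Metres</literal> and a
<literal>Distance Feet</literal> would be rejected by the type checker, even
though the two values have the same runtime representation.
</para>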
2817 </sect2>
2818
2819 <sect2 id="datatype-contexts">
2820 <title>Data type contexts</title>
2821
2822 <para>Haskell allows datatypes to be given contexts, e.g.</para>
2823
2824 <programlisting>
2825 data Eq a => Set a = NilSet | ConsSet a (Set a)
2826 </programlisting>
2827
2828 <para>This declaration gives the constructors the following types:</para>
2829
2830 <programlisting>
2831 NilSet :: Set a
2832 ConsSet :: Eq a => a -> Set a -> Set a
2833 </programlisting>
2834
2835 <para>This is widely considered a misfeature, and is going to be removed from
2836 the language. In GHC, it is controlled by the deprecated extension
2837 <literal>DatatypeContexts</literal>.</para>
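<para>
A common alternative, offered here as an assumption of this sketch rather than
something the text above prescribes, is to leave the context off the declaration
and state the constraint on the functions that need it. The function
<literal>member</literal> is a hypothetical example.
<programlisting>
module Main where

-- No context on the declaration itself.
data Set a = NilSet | ConsSet a (Set a)

-- The Eq constraint appears only where equality is actually used.
member :: Eq a => a -> Set a -> Bool
member _ NilSet         = False
member x (ConsSet y ys) = x == y || member x ys

main :: IO ()
main = print (member 2 (ConsSet 1 (ConsSet (2 :: Int) NilSet)))  -- prints True
</programlisting>
</para>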
2838 </sect2>
2839
2840 <sect2 id="infix-tycons">
2841 <title>Infix type constructors, classes, and type variables</title>
2842
2843 <para>
2844 GHC allows type constructors, classes, and type variables to be operators, and
2845 to be written infix, very much like expressions. More specifically:
2846 <itemizedlist>
2847 <listitem><para>
2848 A type constructor or class can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
2849 The lexical syntax is the same as that for data constructors.
2850 </para></listitem>
2851 <listitem><para>
2852 Data type and type-synonym declarations can be written infix, parenthesised
2853 if you want further arguments. E.g.
2854 <screen>
2855 data a :*: b = Foo a b
2856 type a :+: b = Either a b
2857 class a :=: b where ...
2858
2859 data (a :**: b) x = Baz a b x
2860 type (a :++: b) y = Either (a,b) y
2861 </screen>
2862 </para></listitem>
2863 <listitem><para>
2864 Types, and class constraints, can be written infix. For example
2865 <screen>
2866 x :: Int :*: Bool
2867 f :: (a :=: b) => a -> b
2868 </screen>
2869 </para></listitem>
2870 <listitem><para>
2871 Back-quotes work
2872 as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or
2873 <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>.
2874 </para></listitem>
2875 <listitem><para>
2876 Fixities may be declared for type constructors, or classes, just as for data constructors. However,
2877 one cannot distinguish between the two in a fixity declaration; a fixity declaration
2878 sets the fixity for a data constructor and the corresponding type constructor. For example:
2879 <screen>
2880 infixl 7 T, :*:
2881 </screen>
2882 sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>,
2883 and similarly for <literal>:*:</literal>.
2885 </para></listitem>
2886 <listitem><para>
2887 Function arrow is <literal>infixr</literal> with fixity 0. (This might change; I'm not sure what it should be.)
2888 </para></listitem>
2889
2890 </itemizedlist>
2891 </para>
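<para>
The points above can be combined into one small program. This is only a sketch:
the functions <literal>swap</literal> and <literal>describe</literal> are
hypothetical, and the <option>-XTypeOperators</option> extension is assumed to
be enabled.
<programlisting>
{-# LANGUAGE TypeOperators #-}
module Main where

infixr 6 :*:   -- one declaration covers the type and the data constructor

data a :*: b = a :*: b
type a :+: b = Either a b

-- The operator binds more tightly than the function arrow,
-- so this reads as (a :*: b) -> (b :*: a).
swap :: a :*: b -> b :*: a
swap (x :*: y) = y :*: x

describe :: Int :+: Bool -> String
describe (Left n)  = "number " ++ show n
describe (Right b) = "flag " ++ show b

main :: IO ()
main = do
  case swap ((1 :: Int) :*: True) of
    b :*: n -> print (b, n)        -- prints (True,1)
  putStrLn (describe (Left 3))     -- prints: number 3
</programlisting>
</para>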
2892 </sect2>
2893
2894 <sect2 id="type-operators">
2895 <title>Type operators</title>
2896 <para>
2897 In types, an operator symbol like <literal>(+)</literal> is normally treated as a type
2898 <emphasis>variable</emphasis>, just like <literal>a</literal>. Thus in Haskell 98 you can say
2899 <programlisting>
2900 type T (+) = ((+), (+))
2901 -- Just like: type T a = (a,a)
2902
2903 f :: T Int -> Int
2904 f (x,y)= x
2905 </programlisting>
2906 As you can see, using operators in this way is not very useful, and Haskell 98 does not even
2907 allow you to write them infix.
2908 </para>
2909 <para>
2910 The language extension <option>-XTypeOperators</option> changes this behaviour:
2911 <itemizedlist>
2912 <listitem><para>
2913 Operator symbols become type <emphasis>constructors</emphasis> rather than
2914 type <emphasis>variables</emphasis>.
2915 </para></listitem>
2916 <listitem><para>
2917 Operator symbols in types can be written infix, both in definitions and uses.
2918 For example:
2919 <programlisting>
2920 data a + b = Plus a b
2921 type Foo = Int + Bool
2922 </programlisting>
2923 </para></listitem>
2924 <listitem><para>
2925 There is now some potential ambiguity in import and export lists; for example
2926 if you write <literal>import M( (+) )</literal> do you mean the
2927 <emphasis>function</emphasis> <literal>(+)</literal> or the
2928 <emphasis>type constructor</emphasis> <literal>(+)</literal>?
2929 The default is the former, but with <option>-XExplicitNamespaces</option> (which is implied
2930 by <option>-XTypeOperators</option>) GHC allows you to specify the latter
2931 by preceding it with the keyword <literal>type</literal>, thus:
2932 <programlisting>
2933 import M( type (+) )
2934 </programlisting>
2935 See <xref linkend="explicit-namespaces"/>.
2936 </para></listitem>
2937 <listitem><para>
2938 The fixity of a type operator may be set using the usual fixity declarations
2939 but, as in <xref linkend="infix-tycons"/>, the function and type constructor share
2940 a single fixity.
2941 </para></listitem>
2942 </itemizedlist>
2943 </para>
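<para>
The following sketch shows a type-level <literal>+</literal> living alongside
the ordinary value-level <literal>(+)</literal> from the Prelude. The function
<literal>firstOf</literal> is a hypothetical helper.
<programlisting>
{-# LANGUAGE TypeOperators #-}
module Main where

-- With TypeOperators, + in a type is a type constructor.
data a + b = Plus a b
type Foo = Int + Bool

firstOf :: a + b -> a
firstOf (Plus x _) = x

main :: IO ()
main = print (firstOf (Plus 3 True :: Foo) + 1)   -- the + here is the Prelude function; prints 4
</programlisting>
The two uses of <literal>+</literal> do not clash, because types and values
occupy separate namespaces.
</para>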
2944 </sect2>
2945
2946 <sect2 id="type-synonyms">
2947 <title>Liberalised type synonyms</title>
2948
2949 <para>
2950 Type synonyms are like macros at the type level, but Haskell 98 imposes many rules
2951 on individual synonym declarations.
2952 With the <option>-XLiberalTypeSynonyms</option> extension,
2953 GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>.
2954 That means that GHC can be very much more liberal about type synonyms than Haskell 98.
2955
2956 <itemizedlist>
2957 <listitem> <para>You can write a <literal>forall</literal> (including overloading)
2958 in a type synonym, thus:
2959 <programlisting>
2960 type Discard a = forall b. Show b => a -> b -> (a, String)
2961
2962 f :: Discard a
2963 f x y = (x, show y)
2964
2965 g :: Discard Int -> (Int,String) -- A rank-2 type
2966 g f = f 3 True
2967 </programlisting>
2968 </para>
2969 </listitem>
2970
2971 <listitem><para>
2972 If you also use <option>-XUnboxedTuples</option>,
2973 you can write an unboxed tuple in a type synonym:
2974 <programlisting>
2975 type Pr = (# Int, Int #)
2976
2977 h :: Int -> Pr
2978 h x = (# x, x #)
2979 </programlisting>
2980 </para></listitem>
2981
2982 <listitem><para>
2983 You can apply a type synonym to a forall type:
2984 <programlisting>
2985 type Foo a = a -> a -> Bool
2986
2987 f :: Foo (forall b. b->b)
2988 </programlisting>
2989 After expanding the synonym, <literal>f</literal> has the legal (in GHC) type:
2990 <programlisting>
2991 f :: (forall b. b->b) -> (forall b. b->b) -> Bool
2992 </programlisting>
2993 </para></listitem>
2994
2995 <listitem><para>
2996 You can apply a type synonym to a partially applied type synonym:
2997 <programlisting>
2998 type Generic i o = forall x. i x -> o x
2999 type Id x = x
3000
3001 foo :: Generic Id []
3002 </programlisting>
3003 After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
3004 <programlisting>
3005 foo :: forall x. x -> [x]
3006 </programlisting>
3007 </para></listitem>
3008
3009 </itemizedlist>
3010 </para>
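<para>
As a sketch, the partially applied synonym example from the list above can be
completed into a whole program. The name <literal>singleton</literal> is
hypothetical, and the sketch assumes that <option>-XRankNTypes</option> is
enabled alongside <option>-XLiberalTypeSynonyms</option> so that
<literal>forall</literal> is accepted in the synonym.
<programlisting>
{-# LANGUAGE RankNTypes, LiberalTypeSynonyms #-}
module Main where

type Generic i o = forall x. i x -> o x
type Id x = x

-- Id is partially applied here. After expansion the signature is
--   singleton :: forall x. x -> [x]
singleton :: Generic Id []
singleton x = [x]

main :: IO ()
main = print (singleton 'a', singleton True)   -- prints ("a",[True])
</programlisting>
</para>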
3011
3012 <para>
3013 GHC currently does kind checking before expanding synonyms (though even that
3014 could be changed).
3015 </para>
3016 <para>
3017 After expanding type synonyms, GHC does validity checking on types, looking for
3018 the following mal-formedness which isn't detected simply by kind checking:
3019 <itemizedlist>
3020 <listitem><para>
3021 Type constructor applied to a type involving for-alls (if <option>-XImpredicativeTypes</option>
3022 is off)
3023 </para></listitem>
3024 <listitem><para>
3025 Partially-applied type synonym.
3026 </para></listitem>
3027 </itemizedlist>
3028 So, for example, this will be rejected:
3029 <programlisting>
3030 type Pr = forall a. a
3031
3032 h :: [Pr]
3033 h = ...
3034 </programlisting>
3035 because GHC does not allow type constructors applied to for-all types.
3036 </para>
3037 </sect2>
3038
3039
3040 <sect2 id="existential-quantification">
3041 <title>Existentially quantified data constructors
3042 </title>
3043
3044 <para>
3045 The idea of using existential quantification in data type declarations
3046 was suggested by Perry, and implemented in Hope+ (Nigel Perry, <emphasis>The Implementation
3047 of Practical Functional Programming Languages</emphasis>, PhD Thesis, University of
3048 London, 1991). It was later formalised by Laufer and Odersky
3049 (<emphasis>Polymorphic type inference and abstract data types</emphasis>,
3050 TOPLAS, 16(5), pp1411-1430, 1994).
3051 It's been in Lennart
3052 Augustsson's <command>hbc</command> Haskell compiler for several years, and
3053 proved very useful. Here's the idea. Consider the declaration:
3054 </para>
3055
3056 <para>
3057
3058 <programlisting>
3059 data Foo = forall a. MkFoo a (a -> Bool)
3060 | Nil
3061 </programlisting>
3062
3063 </para>
3064
3065 <para>
3066 The data type <literal>Foo</literal> has two constructors with types:
3067 </para>
3068
3069 <para>
3070
3071 <programlisting>
3072 MkFoo :: forall a. a -> (a -> Bool) -> Foo
3073 Nil :: Foo
3074 </programlisting>
3075
3076 </para>
3077
3078 <para>
3079 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
3080 does not appear in the data type itself, which is plain <literal>Foo</literal>.
3081 For example, the following expression is fine:
3082 </para>
3083
3084 <para>
3085
3086 <programlisting>
3087 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
3088 </programlisting>
3089
3090 </para>
3091
3092 <para>
3093 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
3094 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
3095 isUpper</function> packages a character with a compatible function. These
3096 two things are each of type <literal>Foo</literal> and can be put in a list.
3097 </para>
3098
3099 <para>
3100 What can we do with a value of type <literal>Foo</literal>? In particular,
3101 what happens when we pattern-match on <function>MkFoo</function>?
3102 </para>
3103
3104 <para>
3105
3106 <programlisting>
3107 f (MkFoo val fn) = ???
3108 </programlisting>
3109
3110 </para>
3111
3112 <para>
3113 Since all we know about <literal>val</literal> and <function>fn</function> is that they
3114 are compatible, the only (useful) thing we can do with them is to
3115 apply <function>fn</function> to <literal>val</literal> to get a boolean. For example:
3116 </para>
3117
3118 <para>
3119
3120 <programlisting>
3121 f :: Foo -> Bool
3122 f (MkFoo val fn) = fn val
3123 </programlisting>
3124
3125 </para>
3126
3127 <para>
3128 What this allows us to do is to package heterogeneous values
3129 together with a bunch of functions that manipulate them, and then treat
3130 that collection of packages in a uniform manner. You can express
3131 quite a bit of object-oriented-like programming this way.
3132 </para>
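<para>
Put together, the fragments above form the following small program. It is a
sketch only: the clause for <literal>Nil</literal> is an assumption added to
make <function>f</function> total, and the sample values differ slightly from
those used earlier.
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

-- The only useful thing to do with the packaged pair is to apply fn to val.
f :: Foo -> Bool
f (MkFoo val fn) = fn val
f Nil            = False   -- assumed default for this sketch

main :: IO ()
main = print (map f [MkFoo (4 :: Int) even, MkFoo 'C' isUpper, Nil])
-- prints [True,True,False]
</programlisting>
</para>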
3133
3134 <sect3 id="existential">
3135 <title>Why existential?
3136 </title>
3137
3138 <para>
3139 What has this to do with <emphasis>existential</emphasis> quantification?
3140 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
3141 </para>
3142
3143 <para>
3144
3145 <programlisting>
3146 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
3147 </programlisting>
3148
3149 </para>
3150
3151 <para>
3152 But Haskell programmers can safely think of the ordinary
3153 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
3154 adding a new existential quantification construct.
3155 </para>
3156
3157 </sect3>
3158
3159 <sect3 id="existential-with-context">
3160 <title>Existentials and type classes</title>
3161
3162 <para>
3163 An easy extension is to allow
3164 arbitrary contexts before the constructor. For example:
3165 </para>
3166
3167 <para>
3168
3169 <programlisting>
3170 data Baz = forall a. Eq a => Baz1 a a
3171 | forall b. Show b => Baz2 b (b -> b)
3172 </programlisting>
3173
3174 </para>
3175
3176 <para>
3177 The two constructors have the types you'd expect:
3178 </para>
3179
3180 <para>
3181
3182 <programlisting>
3183 Baz1 :: forall a. Eq a => a -> a -> Baz
3184 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
3185 </programlisting>
3186
3187 </para>
3188
3189 <para>
3190 But when pattern matching on <function>Baz1</function> the matched values can be compared
3191 for equality, and when pattern matching on <function>Baz2</function> the first matched
3192 value can be converted to a string (as well as applying the function to it).
3193 So this program is legal:
3194 </para>
3195
3196 <para>
3197
3198 <programlisting>
3199 f :: Baz -> String
3200 f (Baz1 p q) | p == q = "Yes"
3201 | otherwise = "No"
3202 f (Baz2 v fn) = show (fn v)
3203 </programlisting>
3204
3205 </para>
3206
3207 <para>
3208 Operationally, in a dictionary-passing implementation, the
3209 constructors <function>Baz1</function> and <function>Baz2</function> must store the
3210 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
3211 extract them on pattern matching.
3212 </para>
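<para>
For reference, the declarations above can be assembled into a complete program.
The sample values in <function>main</function> are chosen for illustration only.
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

-- Matching on Baz1 brings Eq a into scope; matching on Baz2 brings Show b.
f :: Baz -> String
f (Baz1 p q) | p == q    = "Yes"
             | otherwise = "No"
f (Baz2 v fn)            = show (fn v)

main :: IO ()
main = mapM_ (putStrLn . f)
  [ Baz1 'x' 'x'              -- Yes
  , Baz1 (1 :: Int) 2         -- No
  , Baz2 (3 :: Int) (+ 1)     -- 4
  ]
</programlisting>
</para>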
3213
3214 </sect3>
3215
3216 <sect3 id="existential-records">
3217 <title>Record Constructors</title>
3218
3219 <para>
3220 GHC allows existentials to be used with record syntax as well. For example:
3221
3222 <programlisting>
3223 data Counter a = forall self. NewCounter
3224 { _this :: self
3225 , _inc :: self -> self
3226 , _display :: self -> IO ()
3227 , tag :: a
3228 }
3229 </programlisting>
3230 Here <literal>tag</literal> is a public field, with a well-typed selector
3231 function <literal>tag :: Counter a -> a</literal>. The <literal>self</literal>
3232 type is hidden from the outside; any attempt to apply <literal>_this</literal>,
3233 <literal>_inc</literal> or <literal>_display</literal> as functions will raise a
3234 compile-time error. In other words, <emphasis>GHC defines a record selector function
3235 only for fields whose type does not mention the existentially-quantified variables</emphasis>.
3236 (This example used an underscore in the fields for which record selectors
3237 will not be defined, but that is only programming style; GHC ignores them.)
3238 </para>
3239
3240 <para>
3241 To make use of these hidden fields, we need to create some helper functions:
3242
3243 <programlisting>
3244 inc :: Counter a -> Counter a
3245 inc (NewCounter x i d t) = NewCounter
3246 { _this = i x, _inc = i, _display = d, tag = t }
3247
3248 display :: Counter a -> IO ()
3249 display NewCounter{ _this = x, _display = d } = d x
3250 </programlisting>
3251
3252 Now we can define counters with different underlying implementations:
3253
3254 <programlisting>
3255 counterA :: Counter String
3256 counterA = NewCounter
3257 { _this = 0, _inc = (1+), _display = print, tag = "A" }
3258
3259 counterB :: Counter String
3260 counterB = NewCounter
3261 { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }
3262
3263 main = do
3264 display (inc counterA) -- prints "1"
3265 display (inc (inc counterB)) -- prints "##"
3266 </programlisting>
3267
3268 Record update syntax is supported for existentials (and GADTs):
3269 <programlisting>
3270 setTag :: Counter a -> a -> Counter a
3271 setTag obj t = obj{ tag = t }
3272 </programlisting>
3273 The rule for record update is this: <emphasis>
3274 the types of the updated fields may
3275 mention only the universally-quantified type variables
3276 of the data constructor. For GADTs, the field may mention only types
3277 that appear as a simple type-variable argument in the constructor's result
3278 type</emphasis>. For example:
3279 <programlisting>
3280 data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential
3281 upd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' b
3282 upd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is
3283 -- existentially quantified)
3284
3285 data G a b where { G1 { g1::a, g2::c } :: G a [c] }
3286 upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c b
3287 upd4 g x = g { g2=x } -- BAD (g2's type mentions c, which is not a simple
3288 -- type-variable argument in G1's result type)
3289 </programlisting>
3290 </para>
3291
3292 </sect3>
3293
3294
3295 <sect3>
3296 <title>Restrictions</title>
3297
3298 <para>
3299 There are several restrictions on the ways in which existentially-quantified
3300 constructors can be used.
3301 </para>
3302
3303 <para>
3304
3305 <itemizedlist>
3306 <listitem>
3307
3308 <para>
3309 When pattern matching, each pattern match introduces a new,
3310 distinct, type for each existential type variable. These types cannot
3311 be unified with any other type, nor can they escape from the scope of
3312 the pattern match. For example, these fragments are incorrect:
3313
3314
3315 <programlisting>
3316 f1 (MkFoo a f) = a
3317 </programlisting>
3318
3319
3320 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
3321 is the result of <function>f1</function>. One way to see why this is wrong is to
3322 ask what type <function>f1</function> has:
3323
3324
3325 <programlisting>
3326 f1 :: Foo -> a -- Weird!
3327 </programlisting>
3328
3329
3330 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
3331 this:
3332
3333
3334 <programlisting>
3335 f1 :: forall a. Foo -> a -- Wrong!
3336 </programlisting>
3337
3338
3339 The original program is just plain wrong. Here's another sort of error:
3340
3341
3342 <programlisting>
3343 f2 (Baz1 a b) (Baz1 p q) = a==q
3344 </programlisting>
3345
3346
3347 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
3348 <literal>a==q</literal> is wrong because it equates the two distinct types arising
3349 from the two <function>Baz1</function> constructors.
3350
3351
3352 </para>
3353 </listitem>
3354 <listitem>
3355
3356 <para>
3357 You can't pattern-match on an existentially quantified
3358 constructor in a <literal>let</literal> or <literal>where</literal> group of
3359 bindings. So this is illegal:
3360
3361
3362 <programlisting>
3363 f3 x = a==b where { Baz1 a b = x }
3364 </programlisting>
3365
3366 Instead, use a <literal>case</literal> expression:
3367
3368 <programlisting>
3369 f3 x = case x of Baz1 a b -> a==b
3370 </programlisting>
3371
3372 In general, you can only pattern-match
3373 on an existentially-quantified constructor in a <literal>case</literal> expression or
3374 in the patterns of a function definition.
3375
3376 The reason for this restriction is really an implementation one.
3377 Type-checking binding groups is already a nightmare without
3378 existentials complicating the picture. Also an existential pattern
3379 binding at the top level of a module doesn't make sense, because it's
3380 not clear how to prevent the existentially-quantified type "escaping".
3381 So for now, there's a simple-to-state restriction. We'll see how
3382 annoying it is.
3383
3384 </para>
3385 </listitem>
3386 <listitem>
3387
3388 <para>
3389 You can't use existential quantification for <literal>newtype</literal>
3390 declarations. So this is illegal:
3391
3392
3393 <programlisting>
3394 newtype T = forall a. Ord a => MkT a
3395 </programlisting>
3396
3397
3398 Reason: a value of type <literal>T</literal> must be represented as a
3399 pair of a dictionary for <literal>Ord a</literal> and a value of type
3400 <literal>a</literal>. That contradicts the idea that
3401 <literal>newtype</literal> should have no concrete representation.
3402 You can get just the same efficiency and effect by using
3403 <literal>data</literal> instead of <literal>newtype</literal>. If
3404 there is no overloading involved, then there is more of a case for
3405 allowing an existentially-quantified <literal>newtype</literal>,
3406 because the <literal>data</literal> version does carry an
3407 implementation cost, but single-field existentially quantified
3408 constructors aren't much use. So the simple restriction (no
3409 existential stuff on <literal>newtype</literal>) stands, unless there
3410 are convincing reasons to change it.
3411
3412
3413 </para>
3414 </listitem>
3415 <listitem>
3416
3417 <para>
3418 You can't use <literal>deriving</literal> to define instances of a
3419 data type with existentially quantified data constructors.
3420
3421 Reason: in most cases it would not make sense. For example:
3422
3423 <programlisting>
3424 data T = forall a. MkT [a] deriving( Eq )
3425 </programlisting>
3426
3427 To derive <literal>Eq</literal> in the standard way we would need to have equality
3428 between the single component of two <function>MkT</function> constructors:
3429
3430 <programlisting>
3431 instance Eq T where
3432 (MkT a) == (MkT b) = ???
3433 </programlisting>
3434
3435 But <varname>a</varname> and <varname>b</varname> have distinct types, and so can't be compared.
3436 It's just about possible to imagine examples in which the derived instance
3437 would make sense, but it seems altogether simpler simply to prohibit such
3438 declarations. Define your own instances!
3439 </para>
3440 </listitem>
3441
3442 </itemizedlist>
3443
3444 </para>
3445
3446 </sect3>
3447 </sect2>
3448
3449 <!-- ====================== Generalised algebraic data types ======================= -->
3450
3451 <sect2 id="gadt-style">
3452 <title>Declaring data types with explicit constructor signatures</title>
3453
3454 <para>When the <literal>GADTSyntax</literal> extension is enabled,
3455 GHC allows you to declare an algebraic data type by
3456 giving the type signatures of constructors explicitly. For example:
3457 <programlisting>
3458 data Maybe a where
3459 Nothing :: Maybe a
3460 Just :: a -> Maybe a
3461 </programlisting>
3462 The form is called a "GADT-style declaration"
3463 because Generalised Algebraic Data Types, described in <xref linkend="gadt"/>,
3464 can only be declared using this form.</para>
3465 <para>Notice that GADT-style syntax generalises existential types (<xref linkend="existential-quantification"/>).
3466 For example, these two declarations are equivalent:
3467 <programlisting>
3468 data Foo = forall a. MkFoo a (a -> Bool)
3469 data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' }
3470 </programlisting>
3471 </para>
3472 <para>Any data type that can be declared in standard Haskell-98 syntax
3473 can also be declared using GADT-style syntax.
3474 The choice is largely stylistic, but GADT-style declarations differ in one important respect:
3475 they treat class constraints on the data constructors differently.
3476 Specifically, if the constructor is given a type-class context, that
3477 context is made available by pattern matching. For example:
3478 <programlisting>
3479 data Set a where
3480 MkSet :: Eq a => [a] -> Set a
3481
3482 makeSet :: Eq a => [a] -> Set a
3483 makeSet xs = MkSet (nub xs)
3484
3485 insert :: a -> Set a -> Set a
3486 insert a (MkSet as) | a `elem` as = MkSet as
3487 | otherwise = MkSet (a:as)
3488 </programlisting>
3489 A use of <literal>MkSet</literal> as a constructor (e.g. in the definition of <literal>makeSet</literal>)
3490 gives rise to a <literal>(Eq a)</literal>
3491 constraint, as you would expect. The new feature is that pattern-matching on <literal>MkSet</literal>
3492 (as in the definition of <literal>insert</literal>) makes <emphasis>available</emphasis> an <literal>(Eq a)</literal>
3493 context. In implementation terms, the <literal>MkSet</literal> constructor has a hidden field that stores
3494 the <literal>(Eq a)</literal> dictionary that is passed to <literal>MkSet</literal>; so
3495 when pattern-matching that dictionary becomes available for the right-hand side of the match.
3496 In the example, the equality dictionary is used to satisfy the equality constraint
3497 generated by the call to <literal>elem</literal>, so that the type of
3498 <literal>insert</literal> itself has no <literal>Eq</literal> constraint.
3499 </para>
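<para>
The <literal>Set</literal> example above can be written out as a small module that loads in GHCi.
This is a sketch under some assumptions: the <literal>GADTs</literal> extension is switched on
because the constructor carries a context, the import of <literal>nub</literal> is spelled out,
and the helpers <literal>member</literal> and <literal>elems</literal> are hypothetical additions
that do not appear in the text above.

```haskell
{-# LANGUAGE GADTs #-}

import Data.List (nub)

-- The (Eq a) dictionary is captured by MkSet when a set is built.
data Set a where
  MkSet :: Eq a => [a] -> Set a

-- Constructing with MkSet requires (Eq a), as usual.
makeSet :: Eq a => [a] -> Set a
makeSet xs = MkSet (nub xs)

-- No Eq constraint in the signature: matching on MkSet makes (Eq a) available.
insert :: a -> Set a -> Set a
insert x (MkSet xs)
  | x `elem` xs = MkSet xs
  | otherwise   = MkSet (x : xs)

-- Hypothetical helper: membership test, again without an Eq constraint.
member :: a -> Set a -> Bool
member x (MkSet xs) = x `elem` xs

-- Hypothetical helper: expose the underlying list.
elems :: Set a -> [a]
elems (MkSet xs) = xs
```

Note that neither <literal>insert</literal> nor <literal>member</literal> mentions
<literal>Eq</literal> in its type; the constraint needed by <literal>elem</literal> is
supplied by the pattern match.
</para>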
3500 <para>
3501 For example, one possible application is to reify dictionaries:
3502 <programlisting>
3503 data NumInst a where
3504 MkNumInst :: Num a => NumInst a
3505
3506 intInst :: NumInst Int
3507 intInst = MkNumInst
3508
3509 plus :: NumInst a -> a -> a -> a
3510 plus MkNumInst p q = p + q
3511 </programlisting>
3512 Here, a value of type <literal>NumInst a</literal> is equivalent
3513 to an explicit <literal>(Num a)</literal> dictionary.
3514 </para>
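<para>
As a rough illustration of how a reified dictionary might be passed around, the sketch below
extends the <literal>NumInst</literal> example. The names <literal>doubleInst</literal> and
<literal>sumWith</literal> are hypothetical and are not part of the example above.

```haskell
{-# LANGUAGE GADTs #-}

data NumInst a where
  MkNumInst :: Num a => NumInst a

intInst :: NumInst Int
intInst = MkNumInst

-- Hypothetical second dictionary value, for Double.
doubleInst :: NumInst Double
doubleInst = MkNumInst

plus :: NumInst a -> a -> a -> a
plus MkNumInst p q = p + q

-- Hypothetical helper: sums a list using only the dictionary held in the argument.
-- The signature has no Num constraint; the match on MkNumInst provides it.
sumWith :: NumInst a -> [a] -> a
sumWith MkNumInst = foldr (+) 0
```

In GHCi, <literal>sumWith intInst [1,2,3]</literal> evaluates to <literal>6</literal>.
</para>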
3515 <para>
3516 All this applies to constructors declared using the syntax of <xref linkend="existential-with-context"/>.
3517 For example, the <literal>NumInst</literal> data type above could equivalently be declared
3518 like this:
3519 <programlisting>
3520 data NumInst a
3521 = Num a => MkNumInst
3522 </programlisting>
3523 Notice that, unlike the situation when declaring an existential, there is
3524 no <literal>forall</literal>, because the <literal>Num</literal> constrains the
3525 data type's universally quantified type variable <literal>a</literal>.
3526 A constructor may have both universal and existential type variables: for example,
3527 the following two declarations are equivalent:
3528 <programlisting>
3529 data T1 a
3530 = forall b. (Num a, Eq b) => MkT1 a b
3531 data T2 a where
3532 MkT2 :: (Num a, Eq b) => a -> b -> T2 a
3533 </programlisting>
3534 </para>
3535 <para>All this behaviour contrasts with Haskell 98's peculiar treatment of
3536 contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report).
3537 In Haskell 98 the definition
3538 <programlisting>
3539 data Eq a => Set' a = MkSet' [a]
3540 </programlisting>
3541 gives <literal>MkSet'</literal> the same type as <literal>MkSet</literal> above. But instead of
3542 <emphasis>making available</emphasis> an <literal>(Eq a)</literal> constraint, pattern-matching
3543 on <literal>MkSet'</literal> <emphasis>requires</emphasis> an <literal>(Eq a)</literal> constraint!
3544 GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,
3545 GHC's behaviour is much more useful, as well as much more intuitive.
3546 </para>
3547
3548 <para>
3549 The rest of this section gives further details about GADT-style data
3550 type declarations.
3551
3552 <itemizedlist>
3553 <listitem><para>
3554 The result type of each data constructor must begin with the type constructor being defined.
3555 If the result type of all constructors
3556 has the form <literal>T a1 ... an</literal>, where <literal>a1 ... an</literal>
3557 are distinct type variables, then the data type is <emphasis>ordinary</emphasis>;
3558 otherwise it is a <emphasis>generalised</emphasis> data type (<xref linkend="gadt"/>).
3559 </para></listitem>
3560
3561 <listitem><para>
3562 As with other type signatures, you can give a single signature for several data constructors.
3563 In this example we give a single signature for <literal>T1</literal> and <literal>T2</literal>:
3564 <programlisting>
3565 data T a where
3566 T1,T2 :: a -> T a
3567 T3 :: T a
3568 </programlisting>
3569 </para></listitem>
3570
3571 <listitem><para>
3572 The type signature of
3573 each constructor is independent, and is implicitly universally quantified as usual.
3574 In particular, the type variable(s) in the "<literal>data T a where</literal>" header
3575 have no scope, and different constructors may have different universally-quantified type variables:
3576 <programlisting>
3577 data T a where -- The 'a' has no scope
3578 T1,T2 :: b -> T b -- Means forall b. b -> T b
3579 T3 :: T a -- Means forall a. T a
3580 </programlisting>
3581 </para></listitem>
3582
3583 <listitem><para>
3584 A constructor signature may mention type class constraints, which can differ for
3585 different constructors. For example, this is fine:
3586 <programlisting>
3587 data T a where
3588 T1 :: Eq b => b -> b -> T b
3589 T2 :: (Show c, Ix c) => c -> [c] -> T c
3590 </programlisting>
3591 When pattern matching, these constraints are made available to discharge constraints
3592 in the body of the match. For example:
3593 <programlisting>
3594 f :: T a -> String
3595 f (T1 x y) | x==y = "yes"
3596 | otherwise = "no"
3597 f (T2 a b) = show a
3598 </programlisting>
3599 Note that <literal>f</literal> is not overloaded; the <literal>Eq</literal> constraint arising
3600 from the use of <literal>==</literal> is discharged by the pattern match on <literal>T1</literal>
3601 and similarly the <literal>Show</literal> constraint arising from the use of <literal>show</literal>.
3602 </para></listitem>
3603
3604 <listitem><para>
3605 Unlike a Haskell-98-style
3606 data type declaration, the type variable(s) in the "<literal>data Set a where</literal>" header
3607 have no scope. Indeed, one can write a kind signature instead:
3608 <programlisting>
3609 data Set :: * -> * where ...
3610 </programlisting>
3611 or even a mixture of the two:
3612 <programlisting>
3613 data Bar a :: (* -> *) -> * where ...
3614 </programlisting>
3615 The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Bar</literal>
3616 like this:
3617 <programlisting>
3618 data Bar a (b :: * -> *) where ...
3619 </programlisting>
3620 </para></listitem>
3621
3622
3623 <listitem><para>
3624 You can use strictness annotations, in the obvious places
3625 in the constructor type:
3626 <programlisting>
3627 data Term a where
3628 Lit :: !Int -> Term Int
3629 If :: Term Bool -> !(Term a) -> !(Term a) -> Term a
3630 Pair :: Term a -> Term b -> Term (a,b)
3631 </programlisting>
3632 </para></listitem>
3633
3634 <listitem><para>
3635 You can use a <literal>deriving</literal> clause on a GADT-style data type
3636 declaration. For example, these two declarations are equivalent:
3637 <programlisting>
3638 data Maybe1 a where {
3639 Nothing1 :: Maybe1 a ;
3640 Just1 :: a -> Maybe1 a
3641 } deriving( Eq, Ord )
3642
3643 data Maybe2 a = Nothing2 | Just2 a
3644 deriving( Eq, Ord )
3645 </programlisting>
3646 </para></listitem>
3647
3648 <listitem><para>
3649 The type signature may have quantified type variables that do not appear
3650 in the result type:
3651 <programlisting>
3652 data Foo where
3653 MkFoo :: a -> (a->Bool) -> Foo
3654 Nil :: Foo
3655 </programlisting>
3656 Here the type variable <literal>a</literal> does not appear in the result type
3657 of either constructor.
3658 Although it is universally quantified in the type of the constructor, such
3659 a type variable is often called "existential".
3660 Indeed, the above declaration declares precisely the same type as
3661 the <literal>data Foo</literal> in <xref linkend="existential-quantification"/>.
3662 </para><para>
3663 The type may contain a class context too, of course:
3664 <programlisting>
3665 data Showable where
3666 MkShowable :: Show a => a -> Showable
3667 </programlisting>
3668 </para></listitem>
3669
3670 <listitem><para>
3671 You can use record syntax on a GADT-style data type declaration:
3672
3673 <programlisting>
3674 data Person where
3675 Adult :: { name :: String, children :: [Person] } -> Person
3676 Child :: Show a => { name :: !String, funny :: a } -> Person
3677 </programlisting>
3678 As usual, for every constructor that has a field <literal>f</literal>, the type of
3679 field <literal>f</literal> must be the same (modulo alpha conversion).
3680 The <literal>Child</literal> constructor above shows that the signature
3681 may have a context, existentially-quantified variables, and strictness annotations,
3682 just as in the non-record case. (NB: the "type" that follows the double-colon
3683 is not really a type, because of the record syntax and strictness annotations.
3684 A "type" of this form can appear only in a constructor signature.)
3685 </para></listitem>
3686
3687 <listitem><para>
3688 Record updates are allowed with GADT-style declarations,
3689 but only for fields that have the following property: the type of the field
3690 mentions no existential type variables.
3691 </para></listitem>
3692
3693 <listitem><para>
3694 As in the case of existentials declared using the Haskell-98-like record syntax
3695 (<xref linkend="existential-records"/>),
3696 record-selector functions are generated only for those fields that have well-typed
3697 selectors.
3698 Here is the example of that section, in GADT-style syntax:
3699 <programlisting>
3700 data Counter a where
3701 NewCounter :: { _this :: self
3702 , _inc :: self -> self
3703 , _display :: self -> IO ()
3704 , tag :: a
3705 } -> Counter a
3706 </programlisting>
3707 As before, only one selector function is generated here, that for <literal>tag</literal>.
3708 Nevertheless, you can still use all the field names in pattern matching and record construction.
3709 </para></listitem>
3710
3711 <listitem><para>
3712 In a GADT-style data type declaration there is no obvious way to specify that a data constructor
3713 should be infix, which makes a difference if you derive <literal>Show</literal> for the type.
3714 (Data constructors declared infix are displayed infix by the derived <literal>show</literal>.)
3715 So GHC implements the following design: a data constructor declared in a GADT-style data type
3716 declaration is displayed infix by <literal>Show</literal> iff (a) it is an operator symbol,
3717 (b) it has two arguments, and (c) it has a programmer-supplied fixity declaration. For example:
3718 <programlisting>
3719 infix 6 :--:
3720 data T a where
3721 (:--:) :: Int -> Bool -> T Int
3722 </programlisting>
3723 </para></listitem>
3724 </itemizedlist></para>
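<para>
To illustrate the point about class contexts on constructors with existential type variables,
here is a minimal sketch built on the <literal>Showable</literal> declaration from the list above.
The functions <literal>showOne</literal> and <literal>showAll</literal> are hypothetical names
introduced only for this illustration, and the <literal>GADTs</literal> extension is assumed.

```haskell
{-# LANGUAGE GADTs #-}

-- The type variable a does not appear in the result type, so it is "existential".
data Showable where
  MkShowable :: Show a => a -> Showable

-- Matching on MkShowable makes the stored (Show a) dictionary available.
showOne :: Showable -> String
showOne (MkShowable x) = show x

-- A list of Showable values may wrap values of different underlying types.
showAll :: [Showable] -> [String]
showAll = map showOne
```

The only thing a consumer can do with the wrapped value is use the operations provided by
the context, here <literal>show</literal>.
</para>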
3725 </sect2>
3726
3727 <sect2 id="gadt">
3728 <title>Generalised Algebraic Data Types (GADTs)</title>
3729
3730 <para>Generalised Algebraic Data Types generalise ordinary algebraic data types
3731 by allowing constructors to have richer return types. Here is an example:
3732 <programlisting>
3733 data Term a where
3734 Lit :: Int -> Term Int
3735 Succ :: Term Int -> Term Int
3736 IsZero :: Term Int -> Term Bool
3737 If :: Term Bool -> Term a -> Term a -> Term a
3738 Pair :: Term a -> Term b -> Term (a,b)
3739 </programlisting>
3740 Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the
3741 case with ordinary data types. This generality allows us to
3742 write a well-typed <literal>eval</literal> function
3743 for these <literal>Terms</literal>:
3744 <programlisting>
3745 eval :: Term a -> a
3746 eval (Lit i) = i
3747 eval (Succ t) = 1 + eval t
3748 eval (IsZero t) = eval t == 0
3749 eval (If b e1 e2) = if eval b then eval e1 else eval e2
3750 eval (Pair e1 e2) = (eval e1, eval e2)
3751 </programlisting>
3752 The key point about GADTs is that <emphasis>pattern matching causes type refinement</emphasis>.
3753 For example, in the right hand side of the equation
3754 <programlisting>
3755 eval :: Term a -> a
3756 eval (Lit i) = ...
3757 </programlisting>
3758 the type <literal>a</literal> is refined to <literal>Int</literal>. That's the whole point!
3759 A precise specification of the type rules is beyond what this user manual aspires to,
3760 but the design closely follows that described in
3761 the paper <ulink
3762 url="http://research.microsoft.com/%7Esimonpj/papers/gadt/">Simple
3763 unification-based type inference for GADTs</ulink>,
3764 (ICFP 2006).
3765 The general principle is this: <emphasis>type refinement is only carried out
3766 based on user-supplied type annotations</emphasis>.
3767 So if no type signature is supplied for <literal>eval</literal>, no type refinement happens,
3768 and lots of obscure error messages will
3769 occur. However, the refinement is quite general. For example, if we had:
3770 <programlisting>
3771 eval :: Term a -> a -> a
3772 eval (Lit i) j = i+j
3773 </programlisting>
3774 the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type
3775 of the constructor <literal>Lit</literal>), and that refinement also applies to the type of <literal>j</literal>, and
3776 the result type of the right-hand side. Hence the addition <literal>i+j</literal> is legal.
3777 </para>
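<para>
Putting the <literal>Term</literal> declaration and <literal>eval</literal> together gives a
module that can be loaded into GHCi. The value <literal>example</literal> is a hypothetical
sample term added for illustration; everything else is taken from the text above.

```haskell
{-# LANGUAGE GADTs #-}

data Term a where
  Lit    :: Int -> Term Int
  Succ   :: Term Int -> Term Int
  IsZero :: Term Int -> Term Bool
  If     :: Term Bool -> Term a -> Term a -> Term a
  Pair   :: Term a -> Term b -> Term (a, b)

-- The signature is required: type refinement relies on it.
eval :: Term a -> a
eval (Lit i)      = i                 -- here a is refined to Int
eval (Succ t)     = 1 + eval t
eval (IsZero t)   = eval t == 0       -- here a is refined to Bool
eval (If b e1 e2) = if eval b then eval e1 else eval e2
eval (Pair e1 e2) = (eval e1, eval e2)

-- Hypothetical sample term.
example :: Term (Int, Bool)
example = Pair (If (IsZero (Lit 0)) (Succ (Lit 1)) (Lit 5)) (IsZero (Lit 3))
```

In GHCi, <literal>eval example</literal> evaluates to <literal>(2,False)</literal>.
</para>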
3778 <para>
3779 These and many other examples are given in papers by Hongwei Xi, and
3780 Tim Sheard. There is a longer introduction
3781 <ulink url="http://www.haskell.org/haskellwiki/GADT">on the wiki</ulink>,
3782 and Ralf Hinze's
3783 <ulink url="http://www.cs.ox.ac.uk/ralf.hinze/publications/With.pdf">Fun with phantom types</ulink> also has a number of examples. Note that papers
3784 may use different notation to that implemented in GHC.
3785 </para>
3786 <para>
3787 The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with
3788 <option>-XGADTs</option>. The <option>-XGADTs</option> flag also sets <option>-XGADTSyntax</option>
3789 and <option>-XMonoLocalBinds</option>.
3790 <itemizedlist>
3791 <listitem><para>
3792 A GADT can only be declared using GADT-style syntax (<xref linkend="gadt-style"/>);
3793 the old Haskell-98 syntax for data declarations always declares an ordinary data type.
3794 The result type of each constructor must begin with the type constructor being defined,
3795 but for a GADT the arguments to the type constructor can be arbitrary monotypes.
3796 For example, in the <literal>Term</literal> data
3797 type above, the type of each constructor must end with <literal>Term ty</literal>, but
3798 the <literal>ty</literal> need not be a type variable (e.g. the <literal>Lit</literal>
3799 constructor).
3800 </para></listitem>
3801
3802 <listitem><para>
3803 It is permitted to declare an ordinary algebraic data type using GADT-style syntax.
3804 What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors
3805 whose result type is not just <literal>T a b</literal>.
3806 </para></listitem>
3807
3808 <listitem><para>
3809 You cannot use a <literal>deriving</literal> clause for a GADT; only for
3810 an ordinary data type.
3811 </para></listitem>
3812
3813 <listitem><para>
3814 As mentioned in <xref linkend="gadt-style"/>, record syntax is supported.
3815 For example:
3816 <programlisting>
3817 data Term a where
3818 Lit :: { val :: Int } -> Term Int
3819 Succ :: { num :: Term Int } -> Term Int
3820 Pred :: { num :: Term Int } -> Term Int
3821 IsZero :: { arg :: Term Int } -> Term Bool
3822 Pair :: { arg1 :: Term a
3823 , arg2 :: Term b
3824 } -> Term (a,b)
3825 If :: { cnd :: Term Bool
3826 , tru :: Term a
3827 , fls :: Term a
3828 } -> Term a
3829 </programlisting>
3830 However, for GADTs there is the following additional constraint:
3831 every constructor that has a field <literal>f</literal> must have
3832 the same result type (modulo alpha conversion).
3833 Hence, in the above example, we cannot merge the <literal>num</literal>
3834 and <literal>arg</literal> fields above into a
3835 single name. Although their field types are both <literal>Term Int</literal>,
3836 their selector functions actually have different types:
3837
3838 <programlisting>
3839 num :: Term Int -> Term Int
3840 arg :: Term Bool -> Term Int
3841 </programlisting>
3842 </para></listitem>
3843
3844 <listitem><para>
3845 When pattern-matching against data constructors drawn from a GADT,
3846 for example in a <literal>case</literal> expression, the following rules apply:
3847 <itemizedlist>
3848 <listitem><para>The type of the scrutinee must be rigid.</para></listitem>
3849 <listitem><para>The type of the entire <literal>case</literal> expression must be rigid.</para></listitem>
3850 <listitem><para>The type of any free variable mentioned in any of
3851 the <literal>case</literal> alternatives must be rigid.</para></listitem>
3852 </itemizedlist>
3853 A type is "rigid" if it is completely known to the compiler at its binding site. The easiest
3854 way to ensure that a variable has a rigid type is to give it a type signature.
3855 For more precise details see <ulink url="http://research.microsoft.com/%7Esimonpj/papers/gadt">
3856 Simple unification-based type inference for GADTs
3857 </ulink>. The criteria implemented by GHC are given in the Appendix.
3858
3859 </para></listitem>
3860
3861 </itemizedlist>
3862 </para>
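<para>
The record-syntax rule described in the list above can be tried out with a cut-down version of
the <literal>Term</literal> type. This is a sketch only: the <literal>Pair</literal> and
<literal>If</literal> constructors are omitted for brevity, and <literal>three</literal> is a
hypothetical sample value.

```haskell
{-# LANGUAGE GADTs #-}

data Term a where
  Lit    :: { val :: Int }      -> Term Int
  Succ   :: { num :: Term Int } -> Term Int
  Pred   :: { num :: Term Int } -> Term Int
  IsZero :: { arg :: Term Int } -> Term Bool

-- Succ and Pred may share the field num because both have result type Term Int.
-- The generated selectors have these types:
--   val :: Term Int  -> Int
--   num :: Term Int  -> Term Int
--   arg :: Term Bool -> Term Int

eval :: Term a -> a
eval (Lit i)    = i
eval (Succ t)   = eval t + 1
eval (Pred t)   = eval t - 1
eval (IsZero t) = eval t == 0

-- Hypothetical sample value, built with record construction syntax.
three :: Term Int
three = Succ { num = Succ { num = Lit { val = 1 } } }
```

Here <literal>eval three</literal> is <literal>3</literal>, and <literal>eval (num three)</literal>
is <literal>2</literal>, since <literal>num</literal> strips one <literal>Succ</literal>.
</para>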
3863
3864 </sect2>
3865 </sect1>
3866
3867 <!-- ====================== End of Generalised algebraic data types ======================= -->
3868
3869 <sect1 id="deriving">
3870 <title>Extensions to the "deriving" mechanism</title>
3871
3872 <sect2 id="deriving-inferred">
3873 <title>Inferred context for deriving clauses</title>
3874
3875 <para>
3876 The Haskell Report is vague about exactly when a <literal>deriving</literal> clause is
3877 legal. For example:
3878 <programlisting>
3879 data T0 f a = MkT0 a deriving( Eq )
3880 data T1 f a = MkT1 (f a) deriving( Eq )
3881 data T2 f a = MkT2 (f (f a)) deriving( Eq )
3882 </programlisting>
3883 The natural generated <literal>Eq</literal> code would result in these instance declarations:
3884 <programlisting>
3885 instance Eq a => Eq (T0 f a) where ...
3886 instance Eq (f a) => Eq (T1 f a) where ...
3887 instance Eq (f (f a)) => Eq (T2 f a) where ...
3888 </programlisting>
3889 The first of these is obviously fine. The second is still fine, although less obviously.
3890 The third is not Haskell 98, and risks losing termination of instances.
3891 </para>
3892 <para>
3893 GHC takes a conservative position: it accepts the first two, but not the third. The rule is this:
3894 each constraint in the inferred instance context must consist only of type variables,
3895 with no repetitions.
3896 </para>
3897 <para>
3898 This rule is applied regardless of flags. If you want a more exotic context, you can write
3899 it yourself, using the <link linkend="stand-alone-deriving">standalone deriving mechanism</link>.
3900 </para>
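<para>
The following sketch shows the three declarations side by side, with the third instance written
by hand using a standalone deriving declaration. The set of extensions is an assumption:
<literal>FlexibleContexts</literal> and <literal>UndecidableInstances</literal> are enabled
because the context <literal>Eq (f (f a))</literal> is not of the simple form that is accepted
by default.

```haskell
{-# LANGUAGE StandaloneDeriving, FlexibleContexts, UndecidableInstances #-}

-- Accepted: the inferred context is (Eq a).
data T0 f a = MkT0 a deriving (Eq)

-- Accepted: the inferred context is (Eq (f a)).
data T1 f a = MkT1 (f a) deriving (Eq)

-- A deriving clause would be rejected here, so the context is supplied explicitly.
data T2 f a = MkT2 (f (f a))

deriving instance Eq (f (f a)) => Eq (T2 f a)
```

With these declarations, <literal>MkT2 (Just (Just 'x')) == MkT2 (Just (Just 'x'))</literal>
evaluates to <literal>True</literal>, using the <literal>Eq (Maybe (Maybe Char))</literal> instance.
</para>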
3901 </sect2>
3902
3903 <sect2 id="stand-alone-deriving">
3904 <title>Stand-alone deriving declarations</title>
3905
3906 <para>
3907 GHC now allows stand-alone <literal>deriving</literal> declarations, enabled by <literal>-XStandaloneDeriving</literal>:
3908 <programlisting>
3909 data Foo a = Bar a | Baz String
3910
3911 deriving instance Eq a => Eq (Foo a)
3912 </programlisting>
3913 The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword
3914 <literal>deriving</literal>, and (b) the absence of the <literal>where</literal> part.
3915 </para>
3916 <para>
3917 However, standalone deriving differs from a <literal>deriving</literal> clause in a number
3918 of important ways:
3919 <itemizedlist>
3920 <listitem><para>The standalone deriving declaration does not need to be in the
3921 same module as the data type declaration. (But be aware of the dangers of
3922 orphan instances (<xref linkend="orphan-modules"/>).)
3923 </para></listitem>
3924
3925 <listitem><para>
3926 You must supply an explicit context (in the example the context is <literal>(Eq a)</literal>),
3927 exactly as you would in an ordinary instance declaration.
3928 (In contrast, in a <literal>deriving</literal> clause
3929 attached to a data type declaration, the context is inferred.)
3930 </para></listitem>
3931
3932 <listitem><para>
3933 Unlike a <literal>deriving</literal>
3934 declaration attached to a <literal>data</literal> declaration, the instance can be more specific
3935 than the data type (assuming you also use
3936 <literal>-XFlexibleInstances</literal>, <xref linkend="instance-rules"/>). Consider
3937 for example
3938 <programlisting>
3939 data Foo a = Bar a | Baz String
3940
3941 deriving instance Eq a => Eq (Foo [a])
3942 deriving instance Eq a => Eq (Foo (Maybe a))
3943 </programlisting>
3944 This will generate a derived instance for <literal>(Foo [a])</literal> and <literal>(Foo (Maybe a))</literal>,
3945 but other types such as <literal>(Foo (Int,Bool))</literal> will not be an instance of <literal>Eq</literal>.
3946 </para></listitem>
3947
3948 <listitem><para>
3949 Unlike a <literal>deriving</literal>
3950 declaration attached to a <literal>data</literal> declaration,
3951 GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate
3952 boilerplate code for the specified class, and typechecks it. If there is a type error, it is
3953 your problem. (GHC will show you the offending code if it has a type error.)
3954 </para>
3955 <para>
3956 The merit of this is that you can derive instances for GADTs and other exotic
3957 data types, providing only that the boilerplate code does indeed typecheck. For example:
3958 <programlisting>
3959 data T a where
3960 T1 :: T Int
3961 T2 :: T Bool
3962
3963 deriving instance Show (T a)
3964 </programlisting>
3965 In this example, you cannot say <literal>... deriving( Show )</literal> on the
3966 data type declaration for <literal>T</literal>,
3967 because <literal>T</literal> is a GADT, but you <emphasis>can</emphasis> generate
3968 the instance declaration using stand-alone deriving.
3969 </para>
3970 <para>
3971 The down-side is that,
3972 if the boilerplate code fails to typecheck, you will get an error message about that
3973 code, which you did not write. With a <literal>deriving</literal> clause, by contrast,
3974 the side-conditions are necessarily more conservative, but any error message
3975 may be more comprehensible.
3976 </para>
3977 </listitem>
3978 </itemizedlist></para>
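<para>
Two of the differences listed above, deriving for a GADT and deriving a more specific instance,
can be combined in one small module. This is a sketch; the choice of extensions is an assumption
about what is needed to compile it.

```haskell
{-# LANGUAGE GADTs, StandaloneDeriving, FlexibleInstances #-}

-- A GADT: a deriving clause on the declaration itself is not allowed.
data T a where
  T1 :: T Int
  T2 :: T Bool

-- The generated Show code typechecks, so the standalone form is accepted.
deriving instance Show (T a)

data Foo a = Bar a | Baz String

-- An instance that is more specific than the data type: only Foo [a] gets Eq.
deriving instance Eq a => Eq (Foo [a])
```

Here <literal>show T1</literal> yields <literal>"T1"</literal>, and
<literal>Bar [1] == Bar [1]</literal> is accepted, whereas comparing two values of type
<literal>Foo Int</literal> is a type error because no such instance exists.
</para>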
3979
3980 <para>
3981 In other ways, however, a standalone deriving obeys the same rules as ordinary deriving:
3982 <itemizedlist>
3983 <listitem><para>
3984 A <literal>deriving instance</literal> declaration
3985 must obey the same rules concerning form and termination as ordinary instance declarations,
3986 controlled by the same flags; see <xref linkend="instance-decls"/>.
3987 </para></listitem>
3988
3989 <listitem>
3990 <para>The stand-alone syntax is generalised for newtypes in exactly the same
3991 way that ordinary <literal>deriving</literal> clauses are generalised (<xref linkend="newtype-deriving"/>).
3992 For example:
3993 <programlisting>
3994 newtype Foo a = MkFoo (State Int a)
3995
3996 deriving instance MonadState Int Foo
3997 </programlisting>
3998 GHC always treats the <emphasis>last</emphasis> parameter of the instance
3999 (<literal>Foo</literal> in this example) as the type whose instance is being derived.
4000 </para></listitem>
4001 </itemizedlist></para>
4002
4003 </sect2>
4004
4005 <sect2 id="deriving-extra">
4006 <title>Deriving instances of extra classes (<literal>Data</literal>, etc)</title>
4007
4008 <para>
4009 Haskell 98 allows the programmer to add "<literal>deriving( Eq, Ord )</literal>" to a data type
4010 declaration, to generate a standard instance declaration for classes specified in the <literal>deriving</literal> clause.
4011 In Haskell 98, the only classes that may appear in the <literal>deriving</literal> clause are the standard
4012 classes <literal>Eq</literal>, <literal>Ord</literal>,
4013 <literal>Enum</literal>, <literal>Ix</literal>, <literal>Bounded</literal>, <literal>Read</literal>, and <literal>Show</literal>.
4014 </para>
4015 <para>
4016 GHC extends this list with several more classes that may be automatically derived:
4017 <itemizedlist>
4018 <listitem><para> With <option>-XDeriveGeneric</option>, you can derive
4019 instances of the classes <literal>Generic</literal> and
4020 <literal>Generic1</literal>, defined in <literal>GHC.Generics</literal>.
4021 You can use these to define generic functions,
4022 as described in <xref linkend="generic-programming"/>.
4023 </para></listitem>
4024
4025 <listitem><para> With <option>-XDeriveFunctor</option>, you can derive instances of
4026 the class <literal>Functor</literal>,
4027 defined in <literal>GHC.Base</literal>. See <xref linkend="deriving-functor"/>.
4028 </para></listitem>
4029
4030 <listitem><para> With <option>-XDeriveDataTypeable</option>, you can derive instances of
4031 the class <literal>Data</literal>,
4032 defined in <literal>Data.Data</literal>. See <xref linkend="deriving-typeable"/> for
4033 deriving <literal>Typeable</literal>.
4034 </para></listitem>
4035
4036 <listitem><para> With <option>-XDeriveFoldable</option>, you can derive instances of
4037 the class <literal>Foldable</literal>,
4038 defined in <literal>Data.Foldable</literal>. See <xref linkend="deriving-foldable"/>.
4039 </para></listitem>
4040
4041 <listitem><para> With <option>-XDeriveTraversable</option>, you can derive instances of
4042 the class <literal>Traversable</literal>,
4043 defined in <literal>Data.Traversable</literal>. Since the <literal>Traversable</literal>
4044 instance dictates the instances of <literal>Functor</literal> and
4045 <literal>Foldable</literal>, you'll probably want to derive them too, so
4046 <option>-XDeriveTraversable</option> implies
4047 <option>-XDeriveFunctor</option> and <option>-XDeriveFoldable</option>.
4048 See <xref linkend="deriving-traversable"/>.
4049 </para></listitem>
4050 </itemizedlist>
4051 You can also use a standalone deriving declaration instead
4052 (see <xref linkend="stand-alone-deriving"/>).
4053 </para>
4054 <para>
4055 In each case the appropriate class must be in scope before it
4056 can be mentioned in the <literal>deriving</literal> clause.
4057 </para>
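<para>
As a brief illustration of several of these flags used together, the sketch below derives
<literal>Functor</literal>, <literal>Foldable</literal> and <literal>Traversable</literal> for a
binary tree. The type <literal>Tree</literal>, the value <literal>sample</literal> and the function
<literal>halveAll</literal> are hypothetical names chosen for this example.

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

import Data.Foldable (toList)

data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Eq, Functor, Foldable, Traversable)

-- Hypothetical sample tree holding 1, 2, 3 from left to right.
sample :: Tree Int
sample = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf)

-- Hypothetical helper: halves every element, failing if any element is odd.
halveAll :: Tree Int -> Maybe (Tree Int)
halveAll = traverse half
  where
    half n = if even n then Just (n `div` 2) else Nothing
```

The derived instances visit constructor fields from left to right, so
<literal>toList sample</literal> is <literal>[1,2,3]</literal> and
<literal>sum sample</literal> is <literal>6</literal>.
</para>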
4058 </sect2>
4059
4060 <sect2 id="deriving-functor">
4061 <title>Deriving <literal>Functor</literal> instances</title>
4062
4063 <para>With <option>-XDeriveFunctor</option>, one can derive
4064 <literal>Functor</literal> instances for data types of kind
4065 <literal>* -> *</literal>. For example, this declaration:
4066
4067 <programlisting>
4068 data Example a = Ex a Char (Example a) (Example Char)
4069 deriving Functor
4070 </programlisting>
4071
4072 would generate the following instance:
4073
4074 <programlisting>
4075 instance Functor Example where
4076 fmap f (Ex a1 a2 a3 a4) = Ex (f a1) a2 (fmap f a3) a4
4077 </programlisting>
4078 </para>
4079
4080 <para>The basic algorithm for <option>-XDeriveFunctor</option> walks the
4081 arguments of each constructor of a data type, applying a mapping function
4082 depending on the type of each argument. Suppose we are deriving
4083 <literal>Functor</literal> for a data type whose last type parameter is
4084 <literal>a</literal>. Then we write the derivation of <literal>fmap</literal>
4085 code over the type variable <literal>a</literal> for type
4086 <literal>b</literal> as <literal>$(fmap 'a 'b)</literal>.
4087