Improve documentation of syntax for promoted lists
1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <para>
3 <indexterm><primary>language, GHC</primary></indexterm>
4 <indexterm><primary>extensions, GHC</primary></indexterm>
5 As with all known Haskell systems, GHC implements some extensions to
6 the language. They can all be enabled or disabled by command line flags
7 or language pragmas. By default GHC understands the most recent Haskell
8 version it supports, plus a handful of extensions.
9 </para>
10
11 <para>
12 Some of the Glasgow extensions serve to give you access to the
13 underlying facilities with which we implement Haskell. Thus, you can
14 get at the Raw Iron, if you are willing to write some non-portable
15 code at a more primitive level. You need not be &ldquo;stuck&rdquo;
16 on performance because of the implementation costs of Haskell's
17 &ldquo;high-level&rdquo; features&mdash;you can always code
18 &ldquo;under&rdquo; them. In an extreme case, you can write all your
19 time-critical code in C, and then just glue it together with Haskell!
20 </para>
21
22 <para>
23 Before you get too carried away working at the lowest level (e.g.,
24 sloshing <literal>MutableByteArray&num;</literal>s around your
25 program), you may wish to check if there are libraries that provide a
26 &ldquo;Haskellised veneer&rdquo; over the features you want. The
27 separate <ulink url="../libraries/index.html">libraries
28 documentation</ulink> describes all the libraries that come with GHC.
29 </para>
30
31 <!-- LANGUAGE OPTIONS -->
32 <sect1 id="options-language">
33 <title>Language options</title>
34
35 <indexterm><primary>language</primary><secondary>option</secondary>
36 </indexterm>
37 <indexterm><primary>options</primary><secondary>language</secondary>
38 </indexterm>
39 <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
40 </indexterm>
41
42 <para>The language option flags control which variations of the language are
43 permitted.</para>
44
45 <para>Language options can be controlled in two ways:
46 <itemizedlist>
47 <listitem><para>Every language option can be switched on by a command-line flag "<option>-X...</option>"
48 (e.g. <option>-XTemplateHaskell</option>), and switched off by the flag "<option>-XNo...</option>"
49 (e.g. <option>-XNoTemplateHaskell</option>).</para></listitem>
50 <listitem><para>
51 Language options recognised by Cabal can also be enabled using the <literal>LANGUAGE</literal> pragma,
52 thus <literal>{-# LANGUAGE TemplateHaskell #-}</literal> (see <xref linkend="language-pragma"/>). </para>
53 </listitem>
54 </itemizedlist></para>
55
56 <para>The flag <option>-fglasgow-exts</option>
57 <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
58 is equivalent to enabling the following extensions:
59 &what_glasgow_exts_does;
60 Enabling these options is the <emphasis>only</emphasis>
61 effect of <option>-fglasgow-exts</option>.
62 We are trying to move away from this portmanteau flag,
63 and towards enabling features individually.</para>
64
65 </sect1>
66
67 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
68 <sect1 id="primitives">
69 <title>Unboxed types and primitive operations</title>
70
71 <para>GHC is built on a raft of primitive data types and operations;
72 "primitive" in the sense that they cannot be defined in Haskell itself.
73 While you really can use this stuff to write fast code,
74 we generally find it a lot less painful, and more satisfying in the
75 long run, to use higher-level language features and libraries. With
76 any luck, the code you write will be optimised to the efficient
77 unboxed version in any case. And if it isn't, we'd like to know
78 about it.</para>
79
80 <para>All these primitive data types and operations are exported by the
81 library <literal>GHC.Prim</literal>, for which there is
82 <ulink url="&libraryGhcPrimLocation;/GHC-Prim.html">detailed online documentation</ulink>.
83 (This documentation is generated from the file <filename>compiler/prelude/primops.txt.pp</filename>.)
84 </para>
85
86 <para>
87 If you want to mention any of the primitive data types or operations in your
88 program, you must first import <literal>GHC.Prim</literal> to bring them
89 into scope. Many of them have names ending in "&num;", and to mention such
90 names you need the <option>-XMagicHash</option> extension (<xref linkend="magic-hash"/>).
91 </para>
92
93 <para>The primops make extensive use of <link linkend="glasgow-unboxed">unboxed types</link>
94 and <link linkend="unboxed-tuples">unboxed tuples</link>, which
95 we briefly summarise here. </para>
96
97 <sect2 id="glasgow-unboxed">
98 <title>Unboxed types</title>
99
100 <para>
101 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
102 </para>
103
104 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
105 that values of that type are represented by a pointer to a heap
106 object. The representation of a Haskell <literal>Int</literal>, for
107 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
108 type, however, is represented by the value itself; no pointers or heap
109 allocation are involved.
110 </para>
111
112 <para>
113 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
114 would use in C: <literal>Int&num;</literal> (long int),
115 <literal>Double&num;</literal> (double), <literal>Addr&num;</literal>
116 (void *), etc. The <emphasis>primitive operations</emphasis>
117 (PrimOps) on these types are what you might expect; e.g.,
118 <literal>(+&num;)</literal> is addition on
119 <literal>Int&num;</literal>s, and is the machine-addition that we all
120 know and love&mdash;usually one instruction.
121 </para>
122
123 <para>
124 Primitive (unboxed) types cannot be defined in Haskell, and are
125 therefore built into the language and compiler. Primitive types are
126 always unlifted; that is, a value of a primitive type cannot be
127 bottom. We use the convention (but it is only a convention)
128 that primitive types, values, and
129 operations have a <literal>&num;</literal> suffix (see <xref linkend="magic-hash"/>).
130 For some primitive types we have special syntax for literals, also
131 described in the <link linkend="magic-hash">same section</link>.
132 </para>
133
134 <para>
135 Primitive values are often represented by a simple bit-pattern, such
136 as <literal>Int&num;</literal>, <literal>Float&num;</literal>,
137 <literal>Double&num;</literal>. But this is not necessarily the case:
138 a primitive value might be represented by a pointer to a
139 heap-allocated object. Examples include
140 <literal>Array&num;</literal>, the type of primitive arrays. A
141 primitive array is heap-allocated because it is too big a value to fit
142 in a register, and would be too expensive to copy around; in a sense,
143 it is accidental that it is represented by a pointer. If a pointer
144 represents a primitive value, then it really does point to that value:
145 no unevaluated thunks, no indirections&hellip;nothing can be at the
146 other end of the pointer than the primitive value.
147 A numerically-intensive program using unboxed types can
148 go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
149 counterpart&mdash;we saw a threefold speedup on one example.
150 </para>
151
152 <para>
153 There are some restrictions on the use of primitive types:
154 <itemizedlist>
155 <listitem><para>The main restriction
156 is that you can't pass a primitive value to a polymorphic
157 function or store one in a polymorphic data type. This rules out
158 things like <literal>[Int&num;]</literal> (i.e. lists of primitive
159 integers). The reason for this restriction is that polymorphic
160 arguments and constructor fields are assumed to be pointers: if an
161 unboxed integer is stored in one of these, the garbage collector would
162 attempt to follow it, leading to unpredictable space leaks. Or a
163 <function>seq</function> operation on the polymorphic component may
164 attempt to dereference the pointer, with disastrous results. Even
165 worse, the unboxed value might be larger than a pointer
166 (<literal>Double&num;</literal> for instance).
167 </para>
168 </listitem>
169 <listitem><para> You cannot define a newtype whose representation type
170 (the argument type of the data constructor) is an unboxed type. Thus,
171 this is illegal:
172 <programlisting>
173 newtype A = MkA Int#
174 </programlisting>
175 </para></listitem>
176 <listitem><para> You cannot bind a variable with an unboxed type
177 in a <emphasis>top-level</emphasis> binding.
178 </para></listitem>
179 <listitem><para> You cannot bind a variable with an unboxed type
180 in a <emphasis>recursive</emphasis> binding.
181 </para></listitem>
182 <listitem><para> You may bind unboxed variables in a (non-recursive,
183 non-top-level) pattern binding, but you must make any such pattern-match
184 strict. For example, rather than:
185 <programlisting>
186 data Foo = Foo Int Int#
187
188 f x = let (Foo a b, w) = ..rhs.. in ..body..
189 </programlisting>
190 you must write:
191 <programlisting>
192 data Foo = Foo Int Int#
193
194 f x = let !(Foo a b, w) = ..rhs.. in ..body..
195 </programlisting>
196 since <literal>b</literal> has type <literal>Int#</literal>.
197 </para>
198 </listitem>
199 </itemizedlist>
200 </para>
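<para>
As a small illustration of the points above, the following sketch unwraps two
boxed <literal>Int</literal>s, does the arithmetic on the underlying
<literal>Int#</literal> values, and re-boxes the result. The name
<literal>sumSquares</literal> is invented for this example, and the primitives
are imported from <literal>GHC.Exts</literal> (which re-exports them) rather
than from <literal>GHC.Prim</literal> directly; treat it as a sketch, not as
recommended style.
</para>
<programlisting>
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), (+#), (*#))

-- Pattern matching on I# exposes the unboxed Int# inside each Int.
-- The primops (*#) and (+#) then work directly on the raw values.
sumSquares :: Int -> Int -> Int
sumSquares (I# x) (I# y) = I# ((x *# x) +# (y *# y))

main :: IO ()
main = print (sumSquares 3 4)
</programlisting>
<para>
Note that <literal>x</literal> and <literal>y</literal> are bound by
(strict) constructor patterns in a function definition, not at top level,
which is consistent with the restrictions listed above.
</para>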
201
202 </sect2>
203
204 <sect2 id="unboxed-tuples">
205 <title>Unboxed tuples</title>
206
207 <para>
208 Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>;
209 they are a syntactic extension enabled by the language flag <option>-XUnboxedTuples</option>. An
210 unboxed tuple looks like this:
211 </para>
212
213 <para>
214
215 <programlisting>
216 (# e_1, ..., e_n #)
217 </programlisting>
218
219 </para>
220
221 <para>
222 where <literal>e&lowbar;1..e&lowbar;n</literal> are expressions of any
223 type (primitive or non-primitive). The type of an unboxed tuple looks
224 the same.
225 </para>
226
227 <para>
228 Note that when unboxed tuples are enabled,
229 <literal>(#</literal> is a single lexeme, so for example when using
230 operators like <literal>#</literal> and <literal>#-</literal> you need
231 to write <literal>( # )</literal> and <literal>( #- )</literal> rather than
232 <literal>(#)</literal> and <literal>(#-)</literal>.
233 </para>
234
235 <para>
236 Unboxed tuples are used for functions that need to return multiple
237 values, but they avoid the heap allocation normally associated with
238 using fully-fledged tuples. When an unboxed tuple is returned, the
239 components are put directly into registers or on the stack; the
240 unboxed tuple itself does not have a composite representation. Many
241 of the primitive operations listed in <literal>primops.txt.pp</literal> return unboxed
242 tuples.
243 In particular, the <literal>IO</literal> and <literal>ST</literal> monads use unboxed
244 tuples to avoid unnecessary allocation during sequences of operations.
245 </para>
246
247 <para>
248 There are some restrictions on the use of unboxed tuples:
249 <itemizedlist>
250
251 <listitem>
252 <para>
253 Values of unboxed tuple types are subject to the same restrictions as
254 other unboxed types; i.e. they may not be stored in polymorphic data
255 structures or passed to polymorphic functions.
256 </para>
257 </listitem>
258
259 <listitem>
260 <para>
261 The typical use of unboxed tuples is simply to return multiple values,
262 binding those multiple results with a <literal>case</literal> expression, thus:
263 <programlisting>
264 f x y = (# x+1, y-1 #)
265 g x = case f x x of { (# a, b #) -&#62; a + b }
266 </programlisting>
267 You can have an unboxed tuple in a pattern binding, thus
268 <programlisting>
269 f x = let (# p,q #) = h x in ..body..
270 </programlisting>
271 If the types of <literal>p</literal> and <literal>q</literal> are not unboxed,
272 the resulting binding is lazy like any other Haskell pattern binding. The
273 above example desugars like this:
274 <programlisting>
275 f x = let t = case h x of { (# p,q #) -> (p,q) }
276           p = fst t
277           q = snd t
278       in ..body..
279 </programlisting>
280 Indeed, the bindings can even be recursive.
281 </para>
282 </listitem>
283 </itemizedlist>
284
285 </para>
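<para>
The typical "return several results" use can be sketched as a complete
program. The function name <literal>quotRemPair</literal> is hypothetical and
chosen only for this example:
</para>
<programlisting>
{-# LANGUAGE UnboxedTuples #-}
module Main where

-- Return quotient and remainder together as an unboxed tuple,
-- so that no tuple constructor needs to be allocated for the result.
quotRemPair :: Int -> Int -> (# Int, Int #)
quotRemPair n d = (# n `quot` d, n `rem` d #)

-- Bind the two results with a case expression and re-box them
-- as an ordinary pair so that they can be shown.
boxed :: Int -> Int -> (Int, Int)
boxed n d = case quotRemPair n d of
  (# q, r #) -> (q, r)

main :: IO ()
main = print (boxed 17 5)
</programlisting>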
286
287 </sect2>
288 </sect1>
289
290
291 <!-- ====================== SYNTACTIC EXTENSIONS ======================= -->
292
293 <sect1 id="syntax-extns">
294 <title>Syntactic extensions</title>
295
296 <sect2 id="unicode-syntax">
297 <title>Unicode syntax</title>
298 <para>The language
299 extension <option>-XUnicodeSyntax</option><indexterm><primary><option>-XUnicodeSyntax</option></primary></indexterm>
300 enables Unicode characters to be used to stand for certain ASCII
301 character sequences. The following alternatives are provided:</para>
302
303 <informaltable>
304 <tgroup cols="4" align="left" colsep="1" rowsep="1">
305 <thead>
306 <row>
307 <entry>ASCII</entry>
308 <entry>Unicode alternative</entry>
309 <entry>Code point</entry>
310 <entry>Name</entry>
311 </row>
312 </thead>
313
314 <!--
315 to find the DocBook entities for these characters, find
316 the Unicode code point (e.g. 0x2237), and grep for it in
317 /usr/share/sgml/docbook/xml-dtd-*/ent/* (or equivalent on
318 your system). Some of these Unicode code points don't have
319 equivalent DocBook entities.
320 -->
321
322 <tbody>
323 <row>
324 <entry><literal>::</literal></entry>
325 <entry>::</entry> <!-- no special char, apparently -->
326 <entry>0x2237</entry>
327 <entry>PROPORTION</entry>
328 </row>
329 </tbody>
330 <tbody>
331 <row>
332 <entry><literal>=&gt;</literal></entry>
333 <entry>&rArr;</entry>
334 <entry>0x21D2</entry>
335 <entry>RIGHTWARDS DOUBLE ARROW</entry>
336 </row>
337 </tbody>
338 <tbody>
339 <row>
340 <entry><literal>forall</literal></entry>
341 <entry>&forall;</entry>
342 <entry>0x2200</entry>
343 <entry>FOR ALL</entry>
344 </row>
345 </tbody>
346 <tbody>
347 <row>
348 <entry><literal>-&gt;</literal></entry>
349 <entry>&rarr;</entry>
350 <entry>0x2192</entry>
351 <entry>RIGHTWARDS ARROW</entry>
352 </row>
353 </tbody>
354 <tbody>
355 <row>
356 <entry><literal>&lt;-</literal></entry>
357 <entry>&larr;</entry>
358 <entry>0x2190</entry>
359 <entry>LEFTWARDS ARROW</entry>
360 </row>
361 </tbody>
362
363 <tbody>
364 <row>
365 <entry>-&lt;</entry>
366 <entry>&larrtl;</entry>
367 <entry>0x2919</entry>
368 <entry>LEFTWARDS ARROW-TAIL</entry>
369 </row>
370 </tbody>
371
372 <tbody>
373 <row>
374 <entry>&gt;-</entry>
375 <entry>&rarrtl;</entry>
376 <entry>0x291A</entry>
377 <entry>RIGHTWARDS ARROW-TAIL</entry>
378 </row>
379 </tbody>
380
381 <tbody>
382 <row>
383 <entry>-&lt;&lt;</entry>
384 <entry></entry>
385 <entry>0x291B</entry>
386 <entry>LEFTWARDS DOUBLE ARROW-TAIL</entry>
387 </row>
388 </tbody>
389
390 <tbody>
391 <row>
392 <entry>&gt;&gt;-</entry>
393 <entry></entry>
394 <entry>0x291C</entry>
395 <entry>RIGHTWARDS DOUBLE ARROW-TAIL</entry>
396 </row>
397 </tbody>
398
399 <tbody>
400 <row>
401 <entry>*</entry>
402 <entry>&starf;</entry>
403 <entry>0x2605</entry>
404 <entry>BLACK STAR</entry>
405 </row>
406 </tbody>
407
408 </tgroup>
409 </informaltable>
410 </sect2>
411
412 <sect2 id="magic-hash">
413 <title>The magic hash</title>
414 <para>The language extension <option>-XMagicHash</option> allows "&num;" as a
415 postfix modifier to identifiers. Thus, "x&num;" is a valid variable, and "T&num;" is
416 a valid type constructor or data constructor.</para>
417
418 <para>The hash sign does not change semantics at all. We tend to use variable
419 names ending in "&num;" for unboxed values or types (e.g. <literal>Int&num;</literal>),
420 but there is no requirement to do so; they are just plain ordinary variables.
421 Nor does the <option>-XMagicHash</option> extension bring anything into scope.
422 For example, to bring <literal>Int&num;</literal> into scope you must
423 import <literal>GHC.Prim</literal> (see <xref linkend="primitives"/>);
424 the <option>-XMagicHash</option> extension
425 then allows you to <emphasis>refer</emphasis> to the <literal>Int&num;</literal>
426 that is now in scope. Note that with this option, the meaning of <literal>x&num;y = 0</literal>
427 is changed: it defines a function <literal>x&num;</literal> taking a single argument <literal>y</literal>;
428 to define the operator <literal>&num;</literal>, put a space: <literal>x &num; y = 0</literal>.
429
430 </para>
431 <para> The <option>-XMagicHash</option> extension also enables some new forms of literals (see <xref linkend="glasgow-unboxed"/>):
432 <itemizedlist>
433 <listitem><para> <literal>'x'&num;</literal> has type <literal>Char&num;</literal></para> </listitem>
434 <listitem><para> <literal>&quot;foo&quot;&num;</literal> has type <literal>Addr&num;</literal></para> </listitem>
435 <listitem><para> <literal>3&num;</literal> has type <literal>Int&num;</literal>. In general,
436 any Haskell integer lexeme followed by a <literal>&num;</literal> is an <literal>Int&num;</literal> literal, e.g.
437 <literal>-0x3A&num;</literal> as well as <literal>32&num;</literal>.</para></listitem>
438 <listitem><para> <literal>3&num;&num;</literal> has type <literal>Word&num;</literal>. In general,
439 any non-negative Haskell integer lexeme followed by <literal>&num;&num;</literal>
440 is a <literal>Word&num;</literal>. </para> </listitem>
441 <listitem><para> <literal>3.2&num;</literal> has type <literal>Float&num;</literal>.</para> </listitem>
442 <listitem><para> <literal>3.2&num;&num;</literal> has type <literal>Double&num;</literal></para> </listitem>
443 </itemizedlist>
444 </para>
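<para>
The literal forms above can be tried out by wrapping each one in the
corresponding boxing constructor. This is only a sketch: the constructors
<literal>C#</literal>, <literal>I#</literal>, <literal>W#</literal>,
<literal>F#</literal> and <literal>D#</literal> are imported from
<literal>GHC.Exts</literal>, and the name <literal>boxedLiterals</literal> is
invented for the example.
</para>
<programlisting>
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Char (C#), Double (D#), Float (F#), Int (I#), Word (W#))

-- Each component boxes one unboxed literal:
--   'x'#  :: Char#     3#    :: Int#      3##   :: Word#
--   3.2#  :: Float#    3.2## :: Double#
boxedLiterals :: (Char, Int, Word, Float, Double)
boxedLiterals = (C# 'x'#, I# 3#, W# 3##, F# 3.2#, D# 3.2##)

main :: IO ()
main = print boxedLiterals
</programlisting>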
445 </sect2>
446
447 <sect2 id="negative-literals">
448 <title>Negative literals</title>
449 <para>
450 The literal <literal>-123</literal> is, according to
451 Haskell 98 and Haskell 2010, desugared as
452 <literal>negate (fromInteger 123)</literal>.
453 The language extension <option>-XNegativeLiterals</option>
454 means that it is instead desugared as
455 <literal>fromInteger (-123)</literal>.
456 </para>
457
458 <para>
459 This can make a difference when the positive and negative range of
460 a numeric data type don't match up. For example,
461 in 8-bit arithmetic -128 is representable, but +128 is not.
462 So <literal>negate (fromInteger 128)</literal> will elicit an
463 unexpected integer-literal-overflow message.
464 </para>
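<para>
A minimal sketch of the 8-bit case, using <literal>Int8</literal> from
<literal>Data.Int</literal> as the example type (the name
<literal>lowest</literal> is hypothetical):
</para>
<programlisting>
{-# LANGUAGE NegativeLiterals #-}
module Main where

import Data.Int (Int8)

-- With NegativeLiterals, -128 is a single literal, desugared as
-- fromInteger (-128), so the out-of-range value 128 is never formed.
lowest :: Int8
lowest = -128

main :: IO ()
main = print lowest
</programlisting>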
465 </sect2>
466
467 <sect2 id="num-decimals">
468 <title>Fractional looking integer literals</title>
469 <para>
470 Haskell 2010 and Haskell 98 define floating literals with
471 the syntax <literal>1.2e6</literal>. These literals have the
472 type <literal>Fractional a => a</literal>.
473 </para>
474
475 <para>
476 The language extension <option>-XNumDecimals</option> allows
477 you to also use the floating literal syntax for instances of
478 <literal>Integral</literal>, and have values like
479 <literal>(1.2e6 :: Num a => a)</literal>.
480 </para>
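<para>
For example, the following sketch gives an <literal>Integer</literal> binding
using exponent notation; the name <literal>population</literal> is made up for
the illustration:
</para>
<programlisting>
{-# LANGUAGE NumDecimals #-}
module Main where

-- 1.2e6 denotes a whole number, so with NumDecimals it can be used
-- at an integral type instead of requiring a Fractional instance.
population :: Integer
population = 1.2e6

main :: IO ()
main = print population
</programlisting>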
481 </sect2>
482
483 <sect2 id="binary-literals">
484 <title>Binary integer literals</title>
485 <para>
486 Haskell 2010 and Haskell 98 allow integer literals to
487 be given in decimal, octal (prefixed by
488 <literal>0o</literal> or <literal>0O</literal>), or
489 hexadecimal notation (prefixed by <literal>0x</literal> or
490 <literal>0X</literal>).
491 </para>
492
493 <para>
494 The language extension <option>-XBinaryLiterals</option>
495 adds support for expressing integer literals in binary
496 notation with the prefix <literal>0b</literal> or
497 <literal>0B</literal>. For instance, the binary integer
498 literal <literal>0b11001001</literal> will be desugared into
499 <literal>fromInteger 201</literal> when
500 <option>-XBinaryLiterals</option> is enabled.
501 </para>
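<para>
A short sketch of the literal mentioned above (the name
<literal>flags</literal> is hypothetical):
</para>
<programlisting>
{-# LANGUAGE BinaryLiterals #-}
module Main where

-- 0b11001001 = 128 + 64 + 8 + 1
flags :: Int
flags = 0b11001001

main :: IO ()
main = print flags
</programlisting>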
502 </sect2>
503
504 <!-- ====================== HIERARCHICAL MODULES ======================= -->
505
506
507 <sect2 id="hierarchical-modules">
508 <title>Hierarchical Modules</title>
509
510 <para>GHC supports a small extension to the syntax of module
511 names: a module name is allowed to contain a dot
512 <literal>&lsquo;.&rsquo;</literal>. This is also known as the
513 &ldquo;hierarchical module namespace&rdquo; extension, because
514 it extends the normally flat Haskell module namespace into a
515 more flexible hierarchy of modules.</para>
516
517 <para>This extension has very little impact on the language
518 itself; module names are <emphasis>always</emphasis> fully
519 qualified, so you can just think of the fully qualified module
520 name as <quote>the module name</quote>. In particular, this
521 means that the full module name must be given after the
522 <literal>module</literal> keyword at the beginning of the
523 module; for example, the module <literal>A.B.C</literal> must
524 begin</para>
525
526 <programlisting>module A.B.C</programlisting>
527
528
529 <para>It is a common strategy to use the <literal>as</literal>
530 keyword to save some typing when using qualified names with
531 hierarchical modules. For example:</para>
532
533 <programlisting>
534 import qualified Control.Monad.ST.Strict as ST
535 </programlisting>
536
537 <para>For details on how GHC searches for source and interface
538 files in the presence of hierarchical modules, see <xref
539 linkend="search-path"/>.</para>
540
541 <para>GHC comes with a large collection of libraries arranged
542 hierarchically; see the accompanying <ulink
543 url="../libraries/index.html">library
544 documentation</ulink>. More libraries to install are available
545 from <ulink
546 url="http://hackage.haskell.org/packages/hackage.html">HackageDB</ulink>.</para>
547 </sect2>
548
549 <!-- ====================== PATTERN GUARDS ======================= -->
550
551 <sect2 id="pattern-guards">
552 <title>Pattern guards</title>
553
554 <para>
555 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
556 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ulink url="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ulink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
557 </para>
558
559 <para>
560 Suppose we have an abstract data type of finite maps, with a
561 lookup operation:
562
563 <programlisting>
564 lookup :: FiniteMap -> Int -> Maybe Int
565 </programlisting>
566
567 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
568 where <varname>v</varname> is the value that the key maps to. Now consider the following definition:
569 </para>
570
571 <programlisting>
572 clunky env var1 var2 | ok1 &amp;&amp; ok2 = val1 + val2
573                      | otherwise  = var1 + var2
574   where
575     m1 = lookup env var1
576     m2 = lookup env var2
577     ok1 = maybeToBool m1
578     ok2 = maybeToBool m2
579     val1 = expectJust m1
580     val2 = expectJust m2
581 </programlisting>
582
583 <para>
584 The auxiliary functions are
585 </para>
586
587 <programlisting>
588 maybeToBool :: Maybe a -&gt; Bool
589 maybeToBool (Just x) = True
590 maybeToBool Nothing = False
591
592 expectJust :: Maybe a -&gt; a
593 expectJust (Just x) = x
594 expectJust Nothing = error "Unexpected Nothing"
595 </programlisting>
596
597 <para>
598 What is <function>clunky</function> doing? The guard <literal>ok1 &amp;&amp;
599 ok2</literal> checks that both lookups succeed, using
600 <function>maybeToBool</function> to convert the <function>Maybe</function>
601 types to booleans. The (lazily evaluated) <function>expectJust</function>
602 calls extract the values from the results of the lookups, and bind the
603 returned values to <varname>val1</varname> and <varname>val2</varname>
604 respectively. If either lookup fails, then clunky takes the
605 <literal>otherwise</literal> case and returns the sum of its arguments.
606 </para>
607
608 <para>
609 This is certainly legal Haskell, but it is a tremendously verbose and
610 un-obvious way to achieve the desired effect. Arguably, a more direct way
611 to write clunky would be to use case expressions:
612 </para>
613
614 <programlisting>
615 clunky env var1 var2 = case lookup env var1 of
616     Nothing -&gt; fail
617     Just val1 -&gt; case lookup env var2 of
618         Nothing -&gt; fail
619         Just val2 -&gt; val1 + val2
620   where
621     fail = var1 + var2
622 </programlisting>
623
624 <para>
625 This is a bit shorter, but hardly better. Of course, we can rewrite any set
626 of pattern-matching, guarded equations as case expressions; that is
627 precisely what the compiler does when compiling equations! The reason that
628 Haskell provides guarded equations is because they allow us to write down
629 the cases we want to consider, one at a time, independently of each other.
630 This structure is hidden in the case version. Two of the right-hand sides
631 are really the same (<function>fail</function>), and the whole expression
632 tends to become more and more indented.
633 </para>
634
635 <para>
636 Here is how I would write clunky:
637 </para>
638
639 <programlisting>
640 clunky env var1 var2
641   | Just val1 &lt;- lookup env var1
642   , Just val2 &lt;- lookup env var2
643   = val1 + val2
644 ...other equations for clunky...
645 </programlisting>
646
647 <para>
648 The semantics should be clear enough. The qualifiers are matched in order.
649 For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
650 right hand side is evaluated and matched against the pattern on the left.
651 If the match fails then the whole guard fails and the next equation is
652 tried. If it succeeds, then the appropriate binding takes place, and the
653 next qualifier is matched, in the augmented environment. Unlike list
654 comprehensions, however, the type of the expression to the right of the
655 <literal>&lt;-</literal> is the same as the type of the pattern to its
656 left. The bindings introduced by pattern guards scope over all the
657 remaining guard qualifiers, and over the right hand side of the equation.
658 </para>
659
660 <para>
661 Just as with list comprehensions, boolean expressions can be freely mixed
662 among the pattern guards. For example:
663 </para>
664
665 <programlisting>
666 f x | [y] &lt;- x
667     , y > 3
668     , Just z &lt;- h y
669     = ...
670 </programlisting>
671
672 <para>
673 Haskell's current guards therefore emerge as a special case, in which the
674 qualifier list has just one element, a boolean expression.
675 </para>
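<para>
To try the pattern-guard version of <function>clunky</function> as a complete
program, one possible sketch is given here. It makes some assumptions that are
not part of the discussion above: <literal>Data.Map</literal> (from the
<literal>containers</literal> package) stands in for the abstract
<literal>FiniteMap</literal>, so the lookup takes the key first, and the
"other equations" are collapsed into a single <literal>otherwise</literal>
guard.
</para>
<programlisting>
module Main where

import qualified Data.Map as Map

-- Both lookups must succeed for the first alternative to be chosen;
-- val1 is in scope in the second qualifier and in the right-hand side.
clunky :: Map.Map Int Int -> Int -> Int -> Int
clunky env var1 var2
  | Just val1 &lt;- Map.lookup var1 env
  , Just val2 &lt;- Map.lookup var2 env
  = val1 + val2
  | otherwise = var1 + var2

main :: IO ()
main = do
  let env = Map.fromList [(1, 10), (2, 20)]
  print (clunky env 1 2)  -- both keys present
  print (clunky env 1 3)  -- second lookup fails, falls through
</programlisting>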
676 </sect2>
677
678 <!-- ===================== View patterns =================== -->
679
680 <sect2 id="view-patterns">
681 <title>View patterns
682 </title>
683
684 <para>
685 View patterns are enabled by the flag <literal>-XViewPatterns</literal>.
686 More information and examples of view patterns can be found on the
687 <ulink url="http://ghc.haskell.org/trac/ghc/wiki/ViewPatterns">Wiki
688 page</ulink>.
689 </para>
690
691 <para>
692 View patterns are somewhat like pattern guards that can be nested inside
693 of other patterns. They are a convenient way of pattern-matching
694 against values of abstract types. For example, in a programming language
695 implementation, we might represent the syntax of the types of the
696 language as follows:
697
698 <programlisting>
699 type Typ
700
701 data TypView = Unit
702              | Arrow Typ Typ
703
704 view :: Typ -> TypView
705
706 -- additional operations for constructing Typ's ...
707 </programlisting>
708
709 The representation of Typ is held abstract, permitting implementations
710 to use a fancy representation (e.g., hash-consing to manage sharing).
711
712 Without view patterns, using this signature is a little inconvenient:
713 <programlisting>
714 size :: Typ -> Integer
715 size t = case view t of
716   Unit -> 1
717   Arrow t1 t2 -> size t1 + size t2
718 </programlisting>
719
720 It is necessary to iterate the case, rather than using an equational
721 function definition. And the situation is even worse when the matching
722 against <literal>t</literal> is buried deep inside another pattern.
723 </para>
724
725 <para>
726 View patterns permit calling the view function inside the pattern and
727 matching against the result:
728 <programlisting>
729 size (view -> Unit) = 1
730 size (view -> Arrow t1 t2) = size t1 + size t2
731 </programlisting>
732
733 That is, we add a new form of pattern, written
734 <replaceable>expression</replaceable> <literal>-></literal>
735 <replaceable>pattern</replaceable> that means "apply the expression to
736 whatever we're trying to match against, and then match the result of
737 that application against the pattern". The expression can be any Haskell
738 expression of function type, and view patterns can be used wherever
739 patterns are used.
740 </para>
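<para>
A self-contained sketch of the <literal>size</literal> example follows. The
text keeps <literal>Typ</literal> abstract; purely so that the program can be
compiled, this sketch assumes the simplest possible representation, a
<literal>newtype</literal> around <literal>TypView</literal>. The value
<literal>sample</literal> is likewise invented for the example.
</para>
<programlisting>
{-# LANGUAGE ViewPatterns #-}
module Main where

-- Assumed representation; a real implementation would hide this.
newtype Typ = Typ TypView

data TypView = Unit
             | Arrow Typ Typ

view :: Typ -> TypView
view (Typ v) = v

-- Each equation applies view to the argument and matches on the result.
size :: Typ -> Integer
size (view -> Unit) = 1
size (view -> Arrow t1 t2) = size t1 + size t2

-- Unit -> (Unit -> Unit), which contains three Unit leaves.
sample :: Typ
sample = Typ (Arrow (Typ Unit) (Typ (Arrow (Typ Unit) (Typ Unit))))

main :: IO ()
main = print (size sample)
</programlisting>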
741
742 <para>
743 The semantics of a pattern <literal>(</literal>
744 <replaceable>exp</replaceable> <literal>-></literal>
745 <replaceable>pat</replaceable> <literal>)</literal> are as follows:
746
747 <itemizedlist>
748
749 <listitem> Scoping:
750
751 <para>The variables bound by the view pattern are the variables bound by
752 <replaceable>pat</replaceable>.
753 </para>
754
755 <para>
756 Any variables in <replaceable>exp</replaceable> are bound occurrences,
757 but variables bound "to the left" in a pattern are in scope. This
758 feature permits, for example, one argument to a function to be used in
759 the view of another argument. For example, the function
760 <literal>clunky</literal> from <xref linkend="pattern-guards" /> can be
761 written using view patterns as follows:
762
763 <programlisting>
764 clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2
765 ...other equations for clunky...
766 </programlisting>
767 </para>
768
769 <para>
770 More precisely, the scoping rules are:
771 <itemizedlist>
772 <listitem>
773 <para>
774 In a single pattern, variables bound by patterns to the left of a view
775 pattern expression are in scope. For example:
776 <programlisting>
777 example :: Maybe ((String -> Integer,Integer), String) -> Bool
778 example (Just ((f,_), f -> 4)) = True
779 </programlisting>
780
781 Additionally, in function definitions, variables bound by matching earlier curried
782 arguments may be used in view pattern expressions in later arguments:
783 <programlisting>
784 example :: (String -> Integer) -> String -> Bool
785 example f (f -> 4) = True
786 </programlisting>
787 That is, the scoping is the same as it would be if the curried arguments
788 were collected into a tuple.
789 </para>
790 </listitem>
791
792 <listitem>
793 <para>
794 In mutually recursive bindings, such as <literal>let</literal>,
795 <literal>where</literal>, or the top level, view patterns in one
796 declaration may not mention variables bound by other declarations. That
797 is, each declaration must be self-contained. For example, the following
798 program is not allowed:
799 <programlisting>
800 let {(x -> y) = e1 ;
801 (y -> x) = e2 } in x
802 </programlisting>
803
804 (For some amplification on this design choice see
805 <ulink url="http://ghc.haskell.org/trac/ghc/ticket/4061">Trac #4061</ulink>.)
806
807 </para>
808 </listitem>
809 </itemizedlist>
810
811 </para>
812 </listitem>
813
814 <listitem><para> Typing: If <replaceable>exp</replaceable> has type
815 <replaceable>T1</replaceable> <literal>-></literal>
816 <replaceable>T2</replaceable> and <replaceable>pat</replaceable> matches
817 a <replaceable>T2</replaceable>, then the whole view pattern matches a
818 <replaceable>T1</replaceable>.
819 </para></listitem>
820
821 <listitem><para> Matching: To the equations in Section 3.17.3 of the
822 <ulink url="http://www.haskell.org/onlinereport/">Haskell 98
823 Report</ulink>, add the following:
824 <programlisting>
825 case v of { (e -> p) -> e1 ; _ -> e2 }
826 =
827 case (e v) of { p -> e1 ; _ -> e2 }
828 </programlisting>
829 That is, to match a variable <replaceable>v</replaceable> against a pattern
830 <literal>(</literal> <replaceable>exp</replaceable>
831 <literal>-></literal> <replaceable>pat</replaceable>
832 <literal>)</literal>, evaluate <literal>(</literal>
833 <replaceable>exp</replaceable> <replaceable> v</replaceable>
834 <literal>)</literal> and match the result against
835 <replaceable>pat</replaceable>.
836 </para></listitem>
837
838 <listitem><para> Efficiency: When the same view function is applied in
839 multiple branches of a function definition or a case expression (e.g.,
840 in <literal>size</literal> above), GHC makes an attempt to collect these
841 applications into a single nested case expression, so that the view
842 function is only applied once. Pattern compilation in GHC follows the
843 matrix algorithm described in Chapter 4 of <ulink
844 url="http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/">The
845 Implementation of Functional Programming Languages</ulink>. When the
846 top rows of the first column of a matrix are all view patterns with the
847 "same" expression, these patterns are transformed into a single nested
848 case. This includes, for example, adjacent view patterns that line up
849 in a tuple, as in
850 <programlisting>
851 f ((view -> A, p1), p2) = e1
852 f ((view -> B, p3), p4) = e2
853 </programlisting>
854 </para>
855
856 <para> The current notion of when two view pattern expressions are "the
857 same" is very restricted: it is not even full syntactic equality.
858 However, it does include variables, literals, applications, and tuples;
859 e.g., two instances of <literal>view ("hi", "there")</literal> will be
860 collected. However, the current implementation does not compare up to
861 alpha-equivalence, so two instances of <literal>(x, view x ->
862 y)</literal> will not be coalesced.
863 </para>
864
865 </listitem>
866
867 </itemizedlist>
868 </para>
869
870 </sect2>
871
872 <!-- ===================== Pattern synonyms =================== -->
873
874 <sect2 id="pattern-synonyms">
875 <title>Pattern synonyms
876 </title>
877
878 <para>
879 Pattern synonyms are enabled by the flag
880 <literal>-XPatternSynonyms</literal>, which is required for defining
881 them, but <emphasis>not</emphasis> for using them. More information
882 and examples of pattern synonyms can be found on the <ulink
883 url="http://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms">Wiki
884 page</ulink>.
885 </para>
886
887 <para>
888 Pattern synonyms enable giving names to parametrized pattern
889 schemes. They can also be thought of as abstract constructors that
890 don't have a bearing on data representation. For example, in a
891 programming language implementation, we might represent types of the
892 language as follows:
893 </para>
894
895 <programlisting>
896 data Type = App String [Type]
897 </programlisting>
898
899 <para>
900 Here are some examples of using said representation.
901 Consider a few types of the <literal>Type</literal> universe encoded
902 like this:
903 </para>
904
905 <programlisting>
906 App "->" [t1, t2] -- t1 -> t2
907 App "Int" [] -- Int
908 App "Maybe" [App "Int" []] -- Maybe Int
909 </programlisting>
910
911 <para>
912 This representation is very generic in that no types are given special
913 treatment. However, some functions might need to handle some known
914 types specially, for example the following two functions collect all
915 argument types of (nested) arrow types, and recognize the
916 <literal>Int</literal> type, respectively:
917 </para>
918
919 <programlisting>
920 collectArgs :: Type -> [Type]
921 collectArgs (App "->" [t1, t2]) = t1 : collectArgs t2
922 collectArgs _ = []
923
924 isInt :: Type -> Bool
925 isInt (App "Int" []) = True
926 isInt _ = False
927 </programlisting>
928
929 <para>
930 Matching on <literal>App</literal> directly is both hard to read and
931 error-prone to write. The situation is even worse when the
932 matching is nested:
933 </para>
934
935 <programlisting>
936 isIntEndo :: Type -> Bool
937 isIntEndo (App "->" [App "Int" [], App "Int" []]) = True
938 isIntEndo _ = False
939 </programlisting>
940
941 <para>
942 Pattern synonyms permit abstracting from the representation to expose
943 matchers that behave in a constructor-like manner with respect to
944 pattern matching. We can create pattern synonyms for the known types
945 we care about, without committing the representation to them (note
946 that these don't have to be defined in the same module as the
947 <literal>Type</literal> type):
948 </para>
949
950 <programlisting>
951 pattern Arrow t1 t2 = App "->" [t1, t2]
952 pattern Int = App "Int" []
953 pattern Maybe t = App "Maybe" [t]
954 </programlisting>
955
956 <para>
957 These synonyms let us rewrite our functions in a much cleaner style:
958 </para>
959
960 <programlisting>
961 collectArgs :: Type -> [Type]
962 collectArgs (Arrow t1 t2) = t1 : collectArgs t2
963 collectArgs _ = []
964
965 isInt :: Type -> Bool
966 isInt Int = True
967 isInt _ = False
968
969 isIntEndo :: Type -> Bool
970 isIntEndo (Arrow Int Int) = True
971 isIntEndo _ = False
972 </programlisting>
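<para>
The definitions above can be assembled into a single module, roughly as
sketched below. The value <literal>intToIntToInt</literal> is a hypothetical
addition: it builds a <literal>Type</literal> by using the synonyms as
expressions, which is possible here because both synonyms are defined
with <literal>=</literal>.
</para>

```haskell
{-# LANGUAGE PatternSynonyms #-}

data Type = App String [Type]

-- Implicitly bidirectional synonyms, as defined in the text.
pattern Arrow t1 t2 = App "->" [t1, t2]
pattern Int         = App "Int" []

collectArgs :: Type -> [Type]
collectArgs (Arrow t1 t2) = t1 : collectArgs t2
collectArgs _             = []

isInt :: Type -> Bool
isInt Int = True
isInt _   = False

-- Hypothetical value for illustration: encodes Int -> Int -> Int,
-- built by using the synonyms as expressions.
intToIntToInt :: Type
intToIntToInt = Arrow Int (Arrow Int Int)
```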
973
974 <para>
975 Note that in this example, the pattern synonyms
976 <literal>Int</literal> and <literal>Arrow</literal> can also be used
977 as expressions (they are <emphasis>bidirectional</emphasis>). This
978 is not necessarily the case: <emphasis>unidirectional</emphasis>
979 pattern synonyms can also be declared with the following syntax:
980 </para>
981
982 <programlisting>
983 pattern Head x &lt;- x:xs
984 </programlisting>
985
986 <para>
987 In this case, <literal>Head</literal> <replaceable>x</replaceable>
988 cannot be used in expressions, only patterns, since it wouldn't
989 specify a value for the <replaceable>xs</replaceable> on the
990 right-hand side. We can give an explicit inversion of a pattern
991 synonym using the following syntax:
992 </para>
993
994 <programlisting>
995 pattern Head x &lt;- x:xs where
996   Head x = [x]
997 </programlisting>
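<para>
As a rough illustration of an explicitly bidirectional synonym in use, the
sketch below matches with <literal>Head</literal> and also constructs with
it. The helpers <literal>firstOr</literal> and <literal>singleton</literal>
are hypothetical names, and the tail is matched with a wildcard rather
than <literal>xs</literal> because it is never used.
</para>

```haskell
{-# LANGUAGE PatternSynonyms #-}

-- Explicitly bidirectional: the pattern ignores the tail of the list,
-- while the builder after 'where' constructs a singleton list.
pattern Head x <- x:_ where
  Head x = [x]

-- firstOr and singleton are hypothetical helpers for illustration.
firstOr :: a -> [a] -> a
firstOr _ (Head x) = x
firstOr d _        = d

singleton :: a -> [a]
singleton x = Head x
```

<para>
Note that matching and construction are not inverses here:
<literal>Head</literal> matches any non-empty list, but always builds a
one-element list.
</para>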
998
999 <para>
1000 The syntax and semantics of pattern synonyms are elaborated in the
1001 following subsections.
1002 See the <ulink
1003 url="http://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms">Wiki
1004 page</ulink> for more details.
1005 </para>
1006
1007 <sect3> <title>Syntax and scoping of pattern synonyms</title>
1008 <para>
1009 A pattern synonym declaration can be either unidirectional or
1010 bidirectional. The syntax for unidirectional pattern synonyms is:
1011 <programlisting>
1012 pattern Name args &lt;- pat
1013 </programlisting>
1014 and the syntax for bidirectional pattern synonyms is:
1015 <programlisting>
1016 pattern Name args = pat
1017 </programlisting> or
1018 <programlisting>
1019 pattern Name args &lt;- pat where
1020   Name args = expr
1021 </programlisting>
1022 Either prefix or infix syntax can be
1023 used.
1024 </para>
1025 <para>
1026 Pattern synonym declarations can only occur in the top level of a
1027 module. In particular, they are not allowed as local
1028 definitions. Currently, they also don't work in GHCi, but that is a
1029 technical restriction that will be lifted in later versions.
1030 </para>
1031 <para>
1032 The variables in the left-hand side of the definition are bound by
1033 the pattern on the right-hand side. For implicitly bidirectional
1034 pattern synonyms, all the variables of the right-hand side must also
1035 occur on the left-hand side; also, wildcard patterns and view
1036 patterns are not allowed. For unidirectional and
1037 explicitly-bidirectional pattern synonyms, there is no restriction
1038 on the right-hand side pattern.
1039 </para>
1040
1041 <para>
1042 Pattern synonyms cannot be defined recursively.
1043 </para>
1044 </sect3>
1045
1046 <sect3 id="patsyn-impexp"> <title>Import and export of pattern synonyms</title>
1047
1048 <para>
1049 The name of the pattern synonym itself is in the same namespace as
1050 proper data constructors. In an export or import specification,
1051 you must prefix pattern
1052 names with the <literal>pattern</literal> keyword, e.g.:
1053 <programlisting>
1054 module Example (pattern Single) where
1055 pattern Single x = [x]
1056 </programlisting>
1057 Without the <literal>pattern</literal> prefix, <literal>Single</literal> would
1058 be interpreted as a type constructor in the export list.
1059 </para>
1060 <para>
1061 You may also use the <literal>pattern</literal> keyword in an import/export
1062 specification to import or export an ordinary data constructor. For example:
1063 <programlisting>
1064 import Data.Maybe( pattern Just )
1065 </programlisting>
1066 would bring into scope the data constructor <literal>Just</literal> from the
1067 <literal>Maybe</literal> type, without also bringing the type constructor
1068 <literal>Maybe</literal> into scope.
1069 </para>
1070 </sect3>
1071
1072 <sect3> <title>Typing of pattern synonyms</title>
1073
1074 <para>
1075 Given a pattern synonym definition of the form
1076 </para>
1077 <programlisting>
1078 pattern P var1 var2 ... varN &lt;- pat
1079 </programlisting>
1080 <para>
1081 it is assigned a <emphasis>pattern type</emphasis> of the form
1082 </para>
1083 <programlisting>
1084 pattern P :: CProv => CReq => t1 -> t2 -> ... -> tN -> t
1085 </programlisting>
1086 <para>
1087 where <replaceable>CProv</replaceable> and
1088 <replaceable>CReq</replaceable> are type contexts, and
1089 <replaceable>t1</replaceable>, <replaceable>t2</replaceable>, ...,
1090 <replaceable>tN</replaceable> and <replaceable>t</replaceable> are
1091 types.
1092 </para>
1093
1094 <para>
1095 A pattern synonym of this type can be used in a pattern if the
1096 instantiated (monomorphic) type satisfies the constraints of
1097 <replaceable>CReq</replaceable>. In this case, it extends the context
1098 available in the right-hand side of the match with
1099 <replaceable>CProv</replaceable>, just like how an existentially-typed
1100 data constructor can extend the context.
1101 </para>
1102
1103 <para>
1104 For example, in the following program:
1105 </para>
1106 <programlisting>
1107 {-# LANGUAGE PatternSynonyms, GADTs #-}
1108 module ShouldCompile where
1109
1110 data T a where
1111   MkT :: (Show b) => a -> b -> T a
1112
1113 pattern ExNumPat x = MkT 42 x
1114 </programlisting>
1115
1116 <para>
1117 the inferred pattern type of <literal>ExNumPat</literal> is
1118 </para>
1119
1120 <programlisting>
1121 pattern (Show b) => ExNumPat b :: (Num a, Eq a) => T a
1122 </programlisting>
1123
1124 <para>
1125 and so can be used in a function definition like the following:
1126 </para>
1127
1128 <programlisting>
1129 f :: (Num t, Eq t) => T t -> String
1130 f (ExNumPat x) = show x
1131 </programlisting>
1132
1133 <para>
1134 For bidirectional pattern synonyms, uses as expressions have the type
1135 </para>
1136 <programlisting>
1137 (CProv, CReq) => t1 -> t2 -> ... -> tN -> t
1138 </programlisting>
1139
1140 <para>
1141 So in the previous example, <literal>ExNumPat</literal>,
1142 when used in an expression, has type
1143 </para>
1144 <programlisting>
1145 ExNumPat :: (Show b, Num a, Eq a) => b -> T a
1146 </programlisting>
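<para>
The two uses can be seen side by side in the sketch below, which relies on
the inferred pattern type rather than a signature. The value
<literal>sample</literal> and the fall-through clause of
<literal>f</literal> are additions made for illustration.
</para>

```haskell
{-# LANGUAGE PatternSynonyms, GADTs #-}

data T a where
  MkT :: (Show b) => a -> b -> T a

pattern ExNumPat x = MkT 42 x

-- As a pattern: requires (Num t, Eq t); the match provides Show for x.
f :: (Num t, Eq t) => T t -> String
f (ExNumPat x) = show x
f _            = "no match"

-- As an expression: sample is a hypothetical value, equal to MkT 42 'c'.
sample :: T Int
sample = ExNumPat 'c'
```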
1147
1148 <para>
1149 Pattern synonyms can also be given a type signature in the source
1150 program, e.g.:
1151 </para>
1152
1153 <programlisting>
1154 -- Inferred type would be 'a -> [a]'
1155 pattern SinglePair :: (a, a) -> [(a, a)]
1156 pattern SinglePair x = [x]
1157 </programlisting>
1158 </sect3>
1159
1160 <sect3><title>Matching of pattern synonyms</title>
1161
1162 <para>
1163 A pattern synonym occurrence in a pattern is evaluated by first
1164 matching against the pattern synonym itself, and then against the argument
1165 patterns. For example, in the following program, <literal>f</literal>
1166 and <literal>f'</literal> are equivalent:
1167 </para>
1168
1169 <programlisting>
1170 pattern Pair x y &lt;- [x, y]
1171
1172 f (Pair True True) = True
1173 f _ = False
1174
1175 f' [x, y] | True &lt;- x, True &lt;- y = True
1176 f' _ = False
1177 </programlisting>
1178
1179 <para>
1180 Note that the strictness of <literal>f</literal> differs from that
1181 of <literal>g</literal> defined below:
1182 <programlisting>
1183 g [True, True] = True
1184 g _ = False
1185
1186 *Main> f (False:undefined)
1187 *** Exception: Prelude.undefined
1188 *Main> g (False:undefined)
1189 False
1190 </programlisting>
1191 </para>
1192 </sect3>
1193
1194 </sect2>
1195
1196 <!-- ===================== n+k patterns =================== -->
1197
1198 <sect2 id="n-k-patterns">
1199 <title>n+k patterns</title>
1200 <indexterm><primary><option>-XNPlusKPatterns</option></primary></indexterm>
1201
1202 <para>
1203 <literal>n+k</literal> pattern support is disabled by default. To enable
1204 it, you can use the <option>-XNPlusKPatterns</option> flag.
1205 </para>
1206
1207 </sect2>
1208
1209 <!-- ===================== Traditional record syntax =================== -->
1210
1211 <sect2 id="traditional-record-syntax">
1212 <title>Traditional record syntax</title>
1213 <indexterm><primary><option>-XNoTraditionalRecordSyntax</option></primary></indexterm>
1214
1215 <para>
1216 Traditional record syntax, such as <literal>C {f = x}</literal>, is enabled by default.
1217 To disable it, you can use the <option>-XNoTraditionalRecordSyntax</option> flag.
1218 </para>
1219
1220 </sect2>
1221
1222 <!-- ===================== Recursive do-notation =================== -->
1223
1224 <sect2 id="recursive-do-notation">
1225 <title>The recursive do-notation
1226 </title>
1227
1228 <para>
1229 The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>,
1230 that is, the variables bound in a do-expression are visible only in the textually following
1231 code block. Compare this to a let-expression, where bound variables are visible in the entire binding
1232 group.
1233 </para>
1234
1235 <para>
1236 It turns out that such recursive bindings do indeed make sense for a variety of monads, but
1237 not all. In particular, recursion in this sense requires a fixed-point operator for the underlying
1238 monad, captured by the <literal>mfix</literal> method of the <literal>MonadFix</literal> class, defined in <literal>Control.Monad.Fix</literal> as follows:
1239 <programlisting>
1240 class Monad m => MonadFix m where
1241   mfix :: (a -> m a) -> m a
1242 </programlisting>
1243 Haskell's
1244 <literal>Maybe</literal>, <literal>[]</literal> (list), <literal>ST</literal> (both strict and lazy versions),
1245 <literal>IO</literal>, and many other monads have <literal>MonadFix</literal> instances. On the negative
1246 side, the continuation monad, with the signature <literal>(a -> r) -> r</literal>, does not.
1247 </para>
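<para>
To make the role of <literal>mfix</literal> concrete, here is a small sketch
that calls it directly in the <literal>Maybe</literal> monad, without any
special syntax. The names <literal>ones</literal> and
<literal>firstThree</literal> are hypothetical.
</para>

```haskell
import Control.Monad.Fix (mfix)

-- Tie a recursive knot in the Maybe monad by calling mfix directly:
-- the list xs is defined in terms of itself.
ones :: Maybe [Int]
ones = mfix (\xs -> Just (1 : xs))

-- Laziness lets us observe a finite prefix of the infinite list.
firstThree :: Maybe [Int]
firstThree = fmap (take 3) ones
```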
1248
1249 <para>
1250 For monads that do belong to the <literal>MonadFix</literal> class, GHC provides
1251 an extended version of the do-notation that allows recursive bindings.
1252 The <option>-XRecursiveDo</option> flag (language pragma: <literal>RecursiveDo</literal>)
1253 provides the necessary syntactic support, introducing the keywords <literal>mdo</literal> and
1254 <literal>rec</literal> for higher and lower levels of the notation respectively. Unlike
1255 bindings in a <literal>do</literal> expression, those introduced by <literal>mdo</literal> and <literal>rec</literal>
1256 are recursively defined, much like in an ordinary let-expression. Due to the new
1257 keyword <literal>mdo</literal>, we also call this notation the <emphasis>mdo-notation</emphasis>.
1258 </para>
1259
1260 <para>
1261 Here is a simple (albeit contrived) example:
1262 <programlisting>
1263 {-# LANGUAGE RecursiveDo #-}
1264 justOnes = mdo { xs &lt;- Just (1:xs)
1265                ; return (map negate xs) }
1266 </programlisting>
1267 or equivalently
1268 <programlisting>
1269 {-# LANGUAGE RecursiveDo #-}
1270 justOnes = do { rec { xs &lt;- Just (1:xs) }
1271               ; return (map negate xs) }
1272 </programlisting>
1273 As you can guess, <literal>justOnes</literal> will evaluate to <literal>Just [-1,-1,-1,...</literal>.
1274 </para>
1275
1276 <para>
1277 GHC's implementation of the mdo-notation closely follows the original translation as described in the paper
1278 <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for Haskell</ulink>, which
1279 in turn is based on the work <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion
1280 in Monadic Computations</ulink>. Furthermore, GHC extends the syntax described in the former paper
1281 with a lower level syntax flagged by the <literal>rec</literal> keyword, as we describe next.
1282 </para>
1283
1284 <sect3>
1285 <title>Recursive binding groups</title>
1286
1287 <para>
1288 The flag <option>-XRecursiveDo</option> also introduces a new keyword <literal>rec</literal>, which wraps a
1289 mutually-recursive group of monadic statements inside a <literal>do</literal> expression, producing a single statement.
1290 Similar to a <literal>let</literal> statement inside a <literal>do</literal>, variables bound in
1291 the <literal>rec</literal> are visible throughout the <literal>rec</literal> group, and below it. For example, compare
1292 <programlisting>
1293 do { a &lt;- getChar             do { a &lt;- getChar
1294    ; let { r1 = f a r2           ; rec { r1 &lt;- f a r2
1295    ;     ; r2 = g r1 }           ;     ; r2 &lt;- g r1 }
1296    ; return (r1 ++ r2) }         ; return (r1 ++ r2) }
1297 </programlisting>
1298 In both cases, <literal>r1</literal> and <literal>r2</literal> are available both throughout
1299 the <literal>let</literal> or <literal>rec</literal> block, and in the statements that follow it.
1300 The difference is that <literal>let</literal> is non-monadic, while <literal>rec</literal> is monadic.
1301 (In Haskell <literal>let</literal> is really <literal>letrec</literal>, of course.)
1302 </para>
1303
1304 <para>
1305 The semantics of <literal>rec</literal> is fairly straightforward. Whenever GHC finds a <literal>rec</literal>
1306 group, it will compute its set of bound variables, and will introduce an appropriate call
1307 to the underlying monadic value-recursion operator <literal>mfix</literal>, belonging to the
1308 <literal>MonadFix</literal> class. Here is an example:
1309 <programlisting>
1310 rec { b &lt;- f a c     ===>    (b,c) &lt;- mfix (\ ~(b,c) -> do { b &lt;- f a c
1311     ; c &lt;- f b a }                                         ; c &lt;- f b a
1312                                                            ; return (b,c) })
1313 </programlisting>
1314 As usual, the meta-variables <literal>b</literal>, <literal>c</literal> etc., can be arbitrary patterns.
1315 In general, the statement <literal>rec <replaceable>ss</replaceable></literal> is desugared to the statement
1316 <programlisting>
1317 <replaceable>vs</replaceable> &lt;- mfix (\ ~<replaceable>vs</replaceable> -&gt; do { <replaceable>ss</replaceable>; return <replaceable>vs</replaceable> })
1318 </programlisting>
1319 where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>.
1320 </para>
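<para>
The desugaring can be checked on a small case. The sketch below, using the
<literal>Maybe</literal> monad, writes the same computation once with
<literal>rec</literal> and once with the knot tied by hand through
<literal>mfix</literal>. All names are hypothetical, and the hand-written
version uses fresh inner names (<literal>xs1</literal>,
<literal>ys1</literal>) instead of shadowing, purely for readability.
</para>

```haskell
{-# LANGUAGE RecursiveDo #-}
import Control.Monad.Fix (mfix)

-- A rec block binding two mutually dependent lists.
withRec :: Maybe ([Int], [Int])
withRec = do
  rec xs <- Just (1 : ys)
      ys <- Just (2 : xs)
  return (take 3 xs, take 3 ys)

-- The same computation with the call to mfix written out by hand.
-- The lazy pattern (~) is essential: the tuple must not be forced
-- before the body has produced it.
byHand :: Maybe ([Int], [Int])
byHand = do
  (xs, ys) <- mfix (\ ~(_, ys0) -> do
    xs1 <- Just (1 : ys0)
    ys1 <- Just (2 : xs1)
    return (xs1, ys1))
  return (take 3 xs, take 3 ys)
```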
1321
1322 <para>
1323 Note in particular that the translation for a <literal>rec</literal> block only involves wrapping a call
1324 to <literal>mfix</literal>: it performs no other analysis on the bindings. The latter is the task
1325 for the <literal>mdo</literal> notation, which is described next.
1326 </para>
1327 </sect3>
1328
1329 <sect3>
1330 <title>The <literal>mdo</literal> notation</title>
1331
1332 <para>
1333 A <literal>rec</literal>-block tells the compiler where precisely the recursive knot should be tied. It turns out that
1334 the placement of the recursive knots can be rather delicate: in particular, we would like the knots to be wrapped
1335 around groups that are as small as possible. This process is known as <emphasis>segmentation</emphasis>, and is described
1336 in detail in Section 3.2 of <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for
1337 Haskell</ulink>. Segmentation improves polymorphism and reduces the size of the recursive knot. Most importantly, it avoids
1338 unnecessary interference caused by a fundamental issue with the so-called <emphasis>right-shrinking</emphasis>
1339 axiom for monadic recursion. In brief, most monads of interest (IO, strict state, etc.) do <emphasis>not</emphasis>
1340 have recursion operators that satisfy this axiom, and thus not performing segmentation can cause unnecessary
1341 interference, changing the termination behavior of the resulting translation.
1342 (Details can be found in Sections 3.1 and 7.2.2 of
1343 <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.)
1344 </para>
1345
1346 <para>
1347 The <literal>mdo</literal> notation removes the burden of placing
1348 explicit <literal>rec</literal> blocks in the code. Unlike an
1349 ordinary <literal>do</literal> expression, in which variables bound by
1350 statements are only in scope for later statements, variables bound in
1351 an <literal>mdo</literal> expression are in scope for all statements
1352 of the expression. The compiler then automatically identifies minimal
1353 mutually recursively dependent segments of statements, treating them as
1354 if the user had wrapped a <literal>rec</literal> qualifier around them.
1355 </para>
1356
1357 <para>
1358 The definition is syntactic:
1359 </para>
1360 <itemizedlist>
1361 <listitem>
1362 <para>
1363 A generator <replaceable>g</replaceable>
1364 <emphasis>depends</emphasis> on a textually following generator
1365 <replaceable>g'</replaceable>, if
1366 </para>
1367 <itemizedlist>
1368 <listitem>
1369 <para>
1370 <replaceable>g'</replaceable> defines a variable that
1371 is used by <replaceable>g</replaceable>, or
1372 </para>
1373 </listitem>
1374 <listitem>
1375 <para>
1376 <replaceable>g'</replaceable> textually appears between
1377 <replaceable>g</replaceable> and
1378 <replaceable>g''</replaceable>, where <replaceable>g</replaceable>
1379 depends on <replaceable>g''</replaceable>.
1380 </para>
1381 </listitem>
1382 </itemizedlist>
1383 </listitem>
1384 <listitem>
1385 <para>
1386 A <emphasis>segment</emphasis> of a given
1387 <literal>mdo</literal>-expression is a minimal sequence of generators
1388 such that no generator of the sequence depends on an outside
1389 generator. As a special case, although it is not a generator,
1390 the final expression in an <literal>mdo</literal>-expression is
1391 considered to form a segment by itself.
1392 </para>
1393 </listitem>
1394 </itemizedlist>
1395 <para>
1396 Segments in this sense are
1397 related to <emphasis>strongly-connected components</emphasis> analysis,
1398 with the exception that bindings in a segment cannot be reordered and
1399 must be contiguous.
1400 </para>
1401
1402 <para>
1403 Here is an example <literal>mdo</literal>-expression, and its translation to <literal>rec</literal> blocks:
1404 <programlisting>
1405 mdo { a &lt;- getChar      ===> do { a &lt;- getChar
1406     ; b &lt;- f a c                ; rec { b &lt;- f a c
1407     ; c &lt;- f b a                ;     ; c &lt;- f b a }
1408     ; z &lt;- h a b                ; z &lt;- h a b
1409     ; d &lt;- g d e                ; rec { d &lt;- g d e
1410     ; e &lt;- g a z                ;     ; e &lt;- g a z }
1411     ; putChar c }               ; putChar c }
1412 </programlisting>
1413 Note that a given <literal>mdo</literal> expression can cause the creation of multiple <literal>rec</literal> blocks.
1414 If there are no recursive dependencies, <literal>mdo</literal> will introduce no <literal>rec</literal> blocks. In this
1415 latter case an <literal>mdo</literal> expression is precisely the same as a <literal>do</literal> expression, as one
1416 would expect.
1417 </para>
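<para>
A small runnable sketch of the same idea in the <literal>Maybe</literal>
monad, with hypothetical names. Under the segmentation rules described
above, the binding of <literal>a</literal> is not recursive, while
<literal>xs</literal> and <literal>ys</literal> depend on each other, so
the <literal>mdo</literal> version is expected to behave like the version
with an explicit <literal>rec</literal>:
</para>

```haskell
{-# LANGUAGE RecursiveDo #-}

-- mdo: the compiler finds the recursive segment (xs, ys) itself.
viaMdo :: Maybe [Int]
viaMdo = mdo
  a  <- Just 1
  xs <- Just (a : ys)
  ys <- Just (2 : xs)
  return (take 4 xs)

-- do with an explicit rec block around the same two bindings.
viaRec :: Maybe [Int]
viaRec = do
  a <- Just 1
  rec xs <- Just (a : ys)
      ys <- Just (2 : xs)
  return (take 4 xs)
```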
1418
1419 <para>
1420 In summary, given an <literal>mdo</literal> expression, GHC first performs segmentation, introducing
1421 <literal>rec</literal> blocks to wrap over minimal recursive groups. Then, each resulting
1422 <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal> as described
1423 in the previous section. The original <literal>mdo</literal>-expression typechecks exactly when the desugared
1424 version would do so.
1425 </para>
1426
1427 <para>
1428 Here are some other important points in using the recursive-do notation:
1429
1430 <itemizedlist>
1431 <listitem>
1432 <para>
1433 It is enabled with the flag <literal>-XRecursiveDo</literal>, or the <literal>LANGUAGE RecursiveDo</literal>
1434 pragma. (The same flag enables both <literal>mdo</literal>-notation, and the use of <literal>rec</literal>
1435 blocks inside <literal>do</literal> expressions.)
1436 </para>
1437 </listitem>
1438 <listitem>
1439 <para>
1440 <literal>rec</literal> blocks can also be used inside <literal>mdo</literal>-expressions, which will be
1441 treated as a single statement. However, it is good style to use either <literal>mdo</literal> or
1442 <literal>rec</literal> blocks in a single expression, not both.
1443 </para>
1444 </listitem>
1445 <listitem>
1446 <para>
1447 If recursive bindings are required for a monad, then that monad must be declared an instance of
1448 the <literal>MonadFix</literal> class.
1449 </para>
1450 </listitem>
1451 <listitem>
1452 <para>
1453 The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
1454 Furthermore, the <literal>Control.Monad.ST</literal> and <literal>Control.Monad.ST.Lazy</literal>
1455 modules provide the instances of the <literal>MonadFix</literal> class for Haskell's internal
1456 state monad (strict and lazy, respectively).
1457 </para>
1458 </listitem>
1459 <listitem>
1460 <para>
1461 Like <literal>let</literal> and <literal>where</literal> bindings, name shadowing is not allowed within
1462 an <literal>mdo</literal>-expression or a <literal>rec</literal>-block; that is, all the names bound in
1463 a single <literal>rec</literal> must be distinct. (GHC will complain if this is not the case.)
1464 </para>
1465 </listitem>
1466 </itemizedlist>
1467 </para>
1468 </sect3>
1469
1470
1471 </sect2>
1472
1473
1474 <!-- ===================== PARALLEL LIST COMPREHENSIONS =================== -->
1475
1476 <sect2 id="parallel-list-comprehensions">
1477 <title>Parallel List Comprehensions</title>
1478 <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
1479 </indexterm>
1480 <indexterm><primary>parallel list comprehensions</primary>
1481 </indexterm>
1482
1483 <para>Parallel list comprehensions are a natural extension to list
1484 comprehensions. List comprehensions can be thought of as a nice
1485 syntax for writing maps and filters. Parallel comprehensions
1486 extend this to include the zipWith family.</para>
1487
1488 <para>A parallel list comprehension has multiple independent
1489 branches of qualifier lists, separated by the <literal>|</literal> symbol. For
1490 example, the following zips together two lists:</para>
1491
1492 <programlisting>
1493 [ (x, y) | x &lt;- xs | y &lt;- ys ]
1494 </programlisting>
1495
1496 <para>The behaviour of parallel list comprehensions follows that of
1497 zip, in that the resulting list will have the same length as the
1498 shortest branch.</para>
1499
1500 <para>We can define parallel list comprehensions by translation to
1501 regular comprehensions. Here's the basic idea:</para>
1502
1503 <para>Given a parallel comprehension of the form: </para>
1504
1505 <programlisting>
1506 [ e | p1 &lt;- e11, p2 &lt;- e12, ...
1507 | q1 &lt;- e21, q2 &lt;- e22, ...
1508 ...
1509 ]
1510 </programlisting>
1511
1512 <para>This will be translated to: </para>
1513
1514 <programlisting>
1515 [ e | ((p1,p2), (q1,q2), ...) &lt;- zipN [(p1,p2) | p1 &lt;- e11, p2 &lt;- e12, ...]
1516 [(q1,q2) | q1 &lt;- e21, q2 &lt;- e22, ...]
1517 ...
1518 ]
1519 </programlisting>
1520
1521 <para>where <literal>zipN</literal> is the appropriate zip for the given number of
1522 branches.</para>
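<para>
The translation can be tried out on a concrete case. The sketch below
assumes the <literal>ParallelListComp</literal> language pragma and uses
hypothetical names; with two branches, the appropriate zip is simply
<literal>zip</literal>.
</para>

```haskell
{-# LANGUAGE ParallelListComp #-}

-- Two branches; the first branch has two generators.
parallel :: [(Int, Int, Char)]
parallel = [ (x, y, c) | x <- [1, 2], y <- [10, 20] | c <- "abc" ]

-- The same list written via the translation, with zip as the zipN.
translated :: [(Int, Int, Char)]
translated =
  [ (x, y, c)
  | ((x, y), c) <- zip [ (x', y') | x' <- [1, 2], y' <- [10, 20] ] "abc" ]
```

<para>
The first branch produces four pairs and the second only three characters,
so both lists have three elements, in line with the behaviour of zip
described above.
</para>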
1523
1524 </sect2>
1525
1526 <!-- ===================== TRANSFORM LIST COMPREHENSIONS =================== -->
1527
1528 <sect2 id="generalised-list-comprehensions">
1529 <title>Generalised (SQL-Like) List Comprehensions</title>
1530 <indexterm><primary>list comprehensions</primary><secondary>generalised</secondary>
1531 </indexterm>
1532 <indexterm><primary>extended list comprehensions</primary>
1533 </indexterm>
1534 <indexterm><primary>group</primary></indexterm>
1535 <indexterm><primary>sql</primary></indexterm>
1536
1537
1538 <para>Generalised list comprehensions are a further enhancement to the
1539 list comprehension syntactic sugar to allow operations such as sorting
1540 and grouping which are familiar from SQL. They are fully described in the
1541 paper <ulink url="http://research.microsoft.com/~simonpj/papers/list-comp">
1542 Comprehensive comprehensions: comprehensions with "order by" and "group by"</ulink>,
1543 except that the syntax we use differs slightly from the paper.</para>
1544 <para>The extension is enabled with the flag <option>-XTransformListComp</option>.</para>
1545 <para>Here is an example:
1546 <programlisting>
1547 employees = [ ("Simon", "MS", 80)
1548             , ("Erik", "MS", 100)
1549             , ("Phil", "Ed", 40)
1550             , ("Gordon", "Ed", 45)
1551             , ("Paul", "Yale", 60)]
1552
1553 output = [ (the dept, sum salary)
1554          | (name, dept, salary) &lt;- employees
1555          , then group by dept using groupWith
1556          , then sortWith by (sum salary)
1557          , then take 5 ]
1558 </programlisting>
1559 In this example, the list <literal>output</literal> would take on
1560 the value:
1561
1562 <programlisting>
1563 [("Yale", 60), ("Ed", 85), ("MS", 180)]
1564 </programlisting>
1565 </para>
1566 <para>There are three new keywords: <literal>group</literal>, <literal>by</literal>, and <literal>using</literal>.
1567 (The functions <literal>sortWith</literal> and <literal>groupWith</literal> are not keywords; they are ordinary
1568 functions that are exported by <literal>GHC.Exts</literal>, as is the function <literal>the</literal> used in the example above.)</para>
1569
1570 <para>There are four new forms of comprehension qualifier,
1571 all introduced by the (existing) keyword <literal>then</literal>:
1572 <itemizedlist>
1573 <listitem>
1574
1575 <programlisting>
1576 then f
1577 </programlisting>
1578
1579 <para>This statement requires that <literal>f</literal> have the type
1580 <literal>forall a. [a] -> [a]</literal>. You can see an example of its use in the
1581 motivating example, as this form is used to apply <literal>take 5</literal>.</para>
1582
1583 </listitem>
1584
1585
1586 <listitem>
1587 <para>
1588 <programlisting>
1589 then f by e
1590 </programlisting>
1591
1592 This form is similar to the previous one, but allows you to create a function
1593 which will be passed as the first argument to f. As a consequence f must have
1594 the type <literal>forall a. (a -> t) -> [a] -> [a]</literal>. As you can see
1595 from the type, this function lets f &quot;project out&quot; some information
1596 from the elements of the list it is transforming.</para>
1597
1598 <para>An example is shown in the opening example, where <literal>sortWith</literal>
1599 is supplied with a function that lets it find out the <literal>sum salary</literal>
1600 for any item in the list comprehension it transforms.</para>
1601
1602 </listitem>
1603
1604
1605 <listitem>
1606
1607 <programlisting>
1608 then group by e using f
1609 </programlisting>
1610
1611 <para>This is the most general of the grouping-type statements. In this form,
1612 f is required to have type <literal>forall a. (a -> t) -> [a] -> [[a]]</literal>.
1613 As with the <literal>then f by e</literal> case above, the first argument
1614 is a function supplied to f by the compiler which lets it compute e on every
1615 element of the list being transformed. However, unlike the non-grouping case,
1616 f additionally partitions the list into a number of sublists: this means that
1617 at every point after this statement, binders occurring before it in the comprehension
1618 refer to <emphasis>lists</emphasis> of possible values, not single values. To help understand
1619 this, let's look at an example:</para>
1620
1621 <programlisting>
1622 -- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first
1623 groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
1624 groupRuns f = groupBy (\x y -> f x == f y)
1625
1626 output = [ (the x, y)
1627          | x &lt;- ([1..3] ++ [1..2])
1628          , y &lt;- [4..6]
1629          , then group by x using groupRuns ]
1630 </programlisting>
1631
1632 <para>This results in the variable <literal>output</literal> taking on the value below:</para>
1633
1634 <programlisting>
1635 [(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])]
1636 </programlisting>
1637
1638 <para>Note that we have used the <literal>the</literal> function to change the type
1639 of x from a list to its original numeric type. The variable y, in contrast, is left
1640 unchanged from the list form introduced by the grouping.</para>
1641
1642 </listitem>
1643
1644 <listitem>
1645
1646 <programlisting>
1647 then group using f
1648 </programlisting>
1649
1650 <para>With this form of the group statement, f is required to simply have the type
1651 <literal>forall a. [a] -> [[a]]</literal>, which will be used to group up the
1652 comprehension so far directly. An example of this form is as follows:</para>
1653
1654 <programlisting>
1655 output = [ x
1656 | y &lt;- [1..5]
1657 , x &lt;- "hello"
1658 , then group using inits]
1659 </programlisting>
1660
<para>This yields every prefix of the string formed by writing "hello" five times in a row:</para>
1662
1663 <programlisting>
1664 ["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...]
1665 </programlisting>
1666
1667 </listitem>
1668 </itemizedlist>
1669 </para>
1670 </sect2>
1671
1672 <!-- ===================== MONAD COMPREHENSIONS ===================== -->
1673
1674 <sect2 id="monad-comprehensions">
1675 <title>Monad comprehensions</title>
1676 <indexterm><primary>monad comprehensions</primary></indexterm>
1677
1678 <para>
1679 Monad comprehensions generalise the list comprehension notation,
1680 including parallel comprehensions
1681 (<xref linkend="parallel-list-comprehensions"/>) and
1682 transform comprehensions (<xref linkend="generalised-list-comprehensions"/>)
1683 to work for any monad.
1684 </para>
1685
1686 <para>Monad comprehensions support:</para>
1687
1688 <itemizedlist>
1689 <listitem>
1690 <para>
1691 Bindings:
1692 </para>
1693
1694 <programlisting>
1695 [ x + y | x &lt;- Just 1, y &lt;- Just 2 ]
1696 </programlisting>
1697
1698 <para>
1699 Bindings are translated with the <literal>(&gt;&gt;=)</literal> and
1700 <literal>return</literal> functions to the usual do-notation:
1701 </para>
1702
1703 <programlisting>
1704 do x &lt;- Just 1
1705 y &lt;- Just 2
1706 return (x+y)
1707 </programlisting>
1708
1709 </listitem>
1710 <listitem>
1711 <para>
1712 Guards:
1713 </para>
1714
1715 <programlisting>
1716 [ x | x &lt;- [1..10], x &lt;= 5 ]
1717 </programlisting>
1718
1719 <para>
1720 Guards are translated with the <literal>guard</literal> function,
1721 which requires a <literal>MonadPlus</literal> instance:
1722 </para>
1723
1724 <programlisting>
1725 do x &lt;- [1..10]
1726 guard (x &lt;= 5)
1727 return x
1728 </programlisting>
1729
1730 </listitem>
1731 <listitem>
1732 <para>
1733 Transform statements (as with <literal>-XTransformListComp</literal>):
1734 </para>
1735
1736 <programlisting>
1737 [ x+y | x &lt;- [1..10], y &lt;- [1..x], then take 2 ]
1738 </programlisting>
1739
1740 <para>
1741 This translates to:
1742 </para>
1743
1744 <programlisting>
1745 do (x,y) &lt;- take 2 (do x &lt;- [1..10]
1746 y &lt;- [1..x]
1747 return (x,y))
1748 return (x+y)
1749 </programlisting>
1750
1751 </listitem>
1752 <listitem>
1753 <para>
1754 Group statements (as with <literal>-XTransformListComp</literal>):
1755 </para>
1756
1757 <programlisting>
1758 [ x | x &lt;- [1,1,2,2,3], then group by x using GHC.Exts.groupWith ]
1759 [ x | x &lt;- [1,1,2,2,3], then group using myGroup ]
1760 </programlisting>
1761
1762 </listitem>
1763 <listitem>
1764 <para>
1765 Parallel statements (as with <literal>-XParallelListComp</literal>):
1766 </para>
1767
1768 <programlisting>
1769 [ (x+y) | x &lt;- [1..10]
1770 | y &lt;- [11..20]
1771 ]
1772 </programlisting>
1773
1774 <para>
1775 Parallel statements are translated using the
1776 <literal>mzip</literal> function, which requires a
1777 <literal>MonadZip</literal> instance defined in
1778 <ulink url="&libraryBaseLocation;/Control-Monad-Zip.html"><literal>Control.Monad.Zip</literal></ulink>:
1779 </para>
1780
1781 <programlisting>
1782 do (x,y) &lt;- mzip (do x &lt;- [1..10]
1783 return x)
1784 (do y &lt;- [11..20]
1785 return y)
1786 return (x+y)
1787 </programlisting>
1788
1789 </listitem>
1790 </itemizedlist>
1791
1792 <para>
All these features are available when the
<literal>MonadComprehensions</literal> extension is enabled. The types
1795 and more detailed examples on how to use comprehensions are explained
in the preceding sections <xref
1797 linkend="generalised-list-comprehensions"/> and <xref
1798 linkend="parallel-list-comprehensions"/>. In general you just have
1799 to replace the type <literal>[a]</literal> with the type
1800 <literal>Monad m => m a</literal> for monad comprehensions.
1801 </para>
1802
1803 <para>
1804 Note: Even though most of these examples are using the list monad,
1805 monad comprehensions work for any monad.
1806 The <literal>base</literal> package offers all necessary instances for
1807 lists, which make <literal>MonadComprehensions</literal> backward
compatible with built-in, transform and parallel list comprehensions.
1809 </para>
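<para>To illustrate that the notation is not tied to lists, here is a small
sketch in the <literal>Maybe</literal> monad. The names <literal>safeDiv</literal>
and <literal>ratio</literal> are hypothetical helpers introduced only for this
example; the guard works because <literal>Maybe</literal> supports
<literal>guard</literal>.</para>

<programlisting>
{-# LANGUAGE MonadComprehensions #-}
module Main where

-- Hypothetical helper: integer division that fails on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv n d = Just (n `div` d)

-- Two bindings and a guard, all in the Maybe monad.
ratio :: Int -> Int -> Int -> Maybe Int
ratio a b c = [ q + r | q &lt;- safeDiv a b, r &lt;- safeDiv a c, q >= r ]

main :: IO ()
main = do
  print (ratio 12 3 4)   -- Just 7
  print (ratio 12 0 4)   -- Nothing (first binding fails)
  print (ratio 12 4 3)   -- Nothing (guard fails)
</programlisting>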
1810 <para> More formally, the desugaring is as follows. We write <literal>D[ e | Q]</literal>
1811 to mean the desugaring of the monad comprehension <literal>[ e | Q]</literal>:
1812 <programlisting>
1813 Expressions: e
1814 Declarations: d
1815 Lists of qualifiers: Q,R,S
1816
1817 -- Basic forms
1818 D[ e | ] = return e
1819 D[ e | p &lt;- e, Q ] = e &gt;&gt;= \p -&gt; D[ e | Q ]
D[ e | e, Q ] = guard e &gt;&gt; D[ e | Q ]
1821 D[ e | let d, Q ] = let d in D[ e | Q ]
1822
1823 -- Parallel comprehensions (iterate for multiple parallel branches)
1824 D[ e | (Q | R), S ] = mzip D[ Qv | Q ] D[ Rv | R ] &gt;&gt;= \(Qv,Rv) -&gt; D[ e | S ]
1825
1826 -- Transform comprehensions
1827 D[ e | Q then f, R ] = f D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]
1828
1829 D[ e | Q then f by b, R ] = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]
1830
1831 D[ e | Q then group using f, R ] = f D[ Qv | Q ] &gt;&gt;= \ys -&gt;
1832 case (fmap selQv1 ys, ..., fmap selQvn ys) of
1833 Qv -&gt; D[ e | R ]
1834
1835 D[ e | Q then group by b using f, R ] = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \ys -&gt;
1836 case (fmap selQv1 ys, ..., fmap selQvn ys) of
1837 Qv -&gt; D[ e | R ]
1838
1839 where Qv is the tuple of variables bound by Q (and used subsequently)
1840 selQvi is a selector mapping Qv to the ith component of Qv
1841
1842 Operator Standard binding Expected type
1843 --------------------------------------------------------------------
1844 return GHC.Base t1 -&gt; m t2
1845 (&gt;&gt;=) GHC.Base m1 t1 -&gt; (t2 -&gt; m2 t3) -&gt; m3 t3
1846 (&gt;&gt;) GHC.Base m1 t1 -&gt; m2 t2 -&gt; m3 t3
1847 guard Control.Monad t1 -&gt; m t2
1848 fmap GHC.Base forall a b. (a-&gt;b) -&gt; n a -&gt; n b
1849 mzip Control.Monad.Zip forall a b. m a -&gt; m b -&gt; m (a,b)
1850 </programlisting>
1851 The comprehension should typecheck when its desugaring would typecheck.
1852 </para>
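<para>The rules can be checked by hand on small cases. The sketch below, which
uses the list monad, writes two comprehensions and their manual expansions side
by side; the definition names are hypothetical, and both
<literal>MonadComprehensions</literal> and <literal>ParallelListComp</literal>
are switched on explicitly to be on the safe side.</para>

<programlisting>
{-# LANGUAGE MonadComprehensions, ParallelListComp #-}
module Main where

import Control.Monad (guard)
import Control.Monad.Zip (mzip)

-- A binding, a guard and a let, written as a comprehension ...
sugared :: [Int]
sugared = [ x * y | x &lt;- [1..4], even x, let y = x + 1 ]

-- ... and expanded by hand following the basic forms.
desugared :: [Int]
desugared =
  [1..4] &gt;&gt;= \x -> guard (even x) &gt;&gt; (let y = x + 1 in return (x * y))

-- Two parallel branches ...
parSugared :: [(Int, Char)]
parSugared = [ (x, c) | x &lt;- [1..3] | c &lt;- "ab" ]

-- ... and the corresponding mzip expansion.
parDesugared :: [(Int, Char)]
parDesugared =
  mzip ([1..3] &gt;&gt;= \x -> return x) ("ab" &gt;&gt;= \c -> return c)
    &gt;&gt;= \(x, c) -> return (x, c)

main :: IO ()
main = do
  print (sugared, desugared)
  print (parSugared, parDesugared)
</programlisting>

<para>For lists, <literal>mzip</literal> behaves like <literal>zip</literal>, so
the parallel comprehension stops at the end of the shorter branch.</para>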
1853 <para>
1854 Monad comprehensions support rebindable syntax (<xref linkend="rebindable-syntax"/>).
1855 Without rebindable
1856 syntax, the operators from the "standard binding" module are used; with
1857 rebindable syntax, the operators are looked up in the current lexical scope.
1858 For example, parallel comprehensions will be typechecked and desugared
1859 using whatever "<literal>mzip</literal>" is in scope.
1860 </para>
1861 <para>
1862 The rebindable operators must have the "Expected type" given in the
1863 table above. These types are surprisingly general. For example, you can
1864 use a bind operator with the type
1865 <programlisting>
1866 (>>=) :: T x y a -> (a -> T y z b) -> T x z b
1867 </programlisting>
In the case of transform comprehensions, notice that the groups are
parameterised over some arbitrary type <literal>n</literal> (provided it
has an <literal>fmap</literal>), and that the comprehension itself ranges
over an arbitrary monad.
1872 </para>
1873 </sect2>
1874
1875 <!-- ===================== REBINDABLE SYNTAX =================== -->
1876
1877 <sect2 id="rebindable-syntax">
1878 <title>Rebindable syntax and the implicit Prelude import</title>
1879
1880 <para><indexterm><primary>-XNoImplicitPrelude
1881 option</primary></indexterm> GHC normally imports
1882 <filename>Prelude.hi</filename> files for you. If you'd
1883 rather it didn't, then give it a
1884 <option>-XNoImplicitPrelude</option> option. The idea is
1885 that you can then import a Prelude of your own. (But don't
1886 call it <literal>Prelude</literal>; the Haskell module
1887 namespace is flat, and you must not conflict with any
1888 Prelude module.)</para>
1889
1890 <para>Suppose you are importing a Prelude of your own
1891 in order to define your own numeric class
1892 hierarchy. It completely defeats that purpose if the
1893 literal "1" means "<literal>Prelude.fromInteger
1894 1</literal>", which is what the Haskell Report specifies.
1895 So the <option>-XRebindableSyntax</option>
1896 flag causes
1897 the following pieces of built-in syntax to refer to
1898 <emphasis>whatever is in scope</emphasis>, not the Prelude
1899 versions:
1900 <itemizedlist>
1901 <listitem>
1902 <para>An integer literal <literal>368</literal> means
1903 "<literal>fromInteger (368::Integer)</literal>", rather than
1904 "<literal>Prelude.fromInteger (368::Integer)</literal>".
1905 </para> </listitem>
1906
<listitem><para>Fractional literals are handled in just the same way,
1908 except that the translation is
1909 <literal>fromRational (3.68::Rational)</literal>.
1910 </para> </listitem>
1911
1912 <listitem><para>The equality test in an overloaded numeric pattern
1913 uses whatever <literal>(==)</literal> is in scope.
1914 </para> </listitem>
1915
1916 <listitem><para>The subtraction operation, and the
1917 greater-than-or-equal test, in <literal>n+k</literal> patterns
1918 use whatever <literal>(-)</literal> and <literal>(>=)</literal> are in scope.
1919 </para></listitem>
1920
1921 <listitem>
1922 <para>Negation (e.g. "<literal>- (f x)</literal>")
1923 means "<literal>negate (f x)</literal>", both in numeric
1924 patterns, and expressions.
1925 </para></listitem>
1926
1927 <listitem>
1928 <para>Conditionals (e.g. "<literal>if</literal> e1 <literal>then</literal> e2 <literal>else</literal> e3")
mean "<literal>ifThenElse</literal> e1 e2 e3". However, <literal>case</literal> expressions are unaffected.
1930 </para></listitem>
1931
1932 <listitem>
1933 <para>"Do" notation is translated using whatever
1934 functions <literal>(>>=)</literal>,
1935 <literal>(>>)</literal>, and <literal>fail</literal>,
1936 are in scope (not the Prelude
1937 versions). List comprehensions, <literal>mdo</literal>
1938 (<xref linkend="recursive-do-notation"/>), and parallel array
1939 comprehensions, are unaffected. </para></listitem>
1940
1941 <listitem>
1942 <para>Arrow
1943 notation (see <xref linkend="arrow-notation"/>)
1944 uses whatever <literal>arr</literal>,
1945 <literal>(>>>)</literal>, <literal>first</literal>,
1946 <literal>app</literal>, <literal>(|||)</literal> and
1947 <literal>loop</literal> functions are in scope. But unlike the
1948 other constructs, the types of these functions must match the
1949 Prelude types very closely. Details are in flux; if you want
1950 to use this, ask!
1951 </para></listitem>
1952 </itemizedlist>
1953 <option>-XRebindableSyntax</option> implies <option>-XNoImplicitPrelude</option>.
1954 </para>
1955 <para>
1956 In all cases (apart from arrow notation), the static semantics should be that of the desugared form,
1957 even if that is a little unexpected. For example, the
1958 static semantics of the literal <literal>368</literal>
1959 is exactly that of <literal>fromInteger (368::Integer)</literal>; it's fine for
1960 <literal>fromInteger</literal> to have any of the types:
1961 <programlisting>
1962 fromInteger :: Integer -> Integer
1963 fromInteger :: forall a. Foo a => Integer -> a
1964 fromInteger :: Num a => a -> Integer
1965 fromInteger :: Integer -> Bool -> Bool
1966 </programlisting>
1967 </para>
1968
1969 <para>Be warned: this is an experimental facility, with
1970 fewer checks than usual. Use <literal>-dcore-lint</literal>
1971 to typecheck the desugared program. If Core Lint is happy
1972 you should be all right.</para>
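<para>As a minimal sketch of rebinding conditionals, the module below supplies
its own <literal>ifThenElse</literal> that tests an <literal>Int</literal>
against zero. The function <literal>describe</literal> is a hypothetical name
used only here. Because <option>-XRebindableSyntax</option> implies
<option>-XNoImplicitPrelude</option>, the Prelude is imported explicitly.</para>

<programlisting>
{-# LANGUAGE RebindableSyntax #-}
module Main where

import Prelude

-- Every "if" in this module is translated to a call of this function,
-- so the condition is an Int rather than a Bool.
ifThenElse :: Int -> a -> a -> a
ifThenElse 0 _ whenZero    = whenZero
ifThenElse _ whenNonZero _ = whenNonZero

describe :: Int -> String
describe n = if n then "non-zero" else "zero"

main :: IO ()
main = do
  putStrLn (describe 0)    -- zero
  putStrLn (describe 42)   -- non-zero
</programlisting>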
1973
1974 </sect2>
1975
1976 <sect2 id="postfix-operators">
1977 <title>Postfix operators</title>
1978
1979 <para>
1980 The <option>-XPostfixOperators</option> flag enables a small
1981 extension to the syntax of left operator sections, which allows you to
1982 define postfix operators. The extension is this: the left section
1983 <programlisting>
1984 (e !)
1985 </programlisting>
1986 is equivalent (from the point of view of both type checking and execution) to the expression
1987 <programlisting>
1988 ((!) e)
1989 </programlisting>
(for any expression <literal>e</literal> and operator <literal>(!)</literal>).
1991 The strict Haskell 98 interpretation is that the section is equivalent to
1992 <programlisting>
1993 (\y -> (!) e y)
1994 </programlisting>
1995 That is, the operator must be a function of two arguments. GHC allows it to
1996 take only one argument, and that in turn allows you to write the function
1997 postfix.
1998 </para>
1999 <para>The extension does not extend to the left-hand side of function
2000 definitions; you must define such a function in prefix form.</para>
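<para>A small sketch of the idea, assuming a one-argument factorial operator
(the choice of factorial is purely illustrative). Note that the operator is
defined in prefix form, as required above.</para>

<programlisting>
{-# LANGUAGE PostfixOperators #-}
module Main where

-- A one-argument operator, defined in prefix form.
(!) :: Integer -> Integer
(!) n = product [1..n]

main :: IO ()
main = print (5 !)   -- the section (5 !) means ((!) 5), i.e. 120
</programlisting>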
2001
2002 </sect2>
2003
2004 <sect2 id="tuple-sections">
2005 <title>Tuple sections</title>
2006
2007 <para>
The <option>-XTupleSections</option> flag enables Python-style partially applied
tuple constructors. For example, the expression
<programlisting>
(, True)
</programlisting>
is an alternative notation for the more unwieldy
2014 <programlisting>
2015 \x -> (x, True)
2016 </programlisting>
2017 You can omit any combination of arguments to the tuple, as in the following
2018 <programlisting>
2019 (, "I", , , "Love", , 1337)
2020 </programlisting>
2021 which translates to
2022 <programlisting>
2023 \a b c d -> (a, "I", b, c, "Love", d, 1337)
2024 </programlisting>
2025 </para>
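<para>Tuple sections are most convenient as arguments to higher-order
functions. The following sketch uses hypothetical definitions
<literal>tagged</literal> and <literal>triples</literal> to show a section with
one and with two omitted components.</para>

<programlisting>
{-# LANGUAGE TupleSections #-}
module Main where

-- (, True) is \x -> (x, True)
tagged :: [(Int, Bool)]
tagged = map (, True) [1, 2, 3]

-- (, 0, ) is \a b -> (a, 0, b)
triples :: [(Char, Int, String)]
triples = zipWith (, 0, ) "ab" ["x", "y"]

main :: IO ()
main = do
  print tagged    -- [(1,True),(2,True),(3,True)]
  print triples   -- [('a',0,"x"),('b',0,"y")]
</programlisting>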
2026
2027 <para>
2028 If you have <link linkend="unboxed-tuples">unboxed tuples</link> enabled, tuple sections
2029 will also be available for them, like so
2030 <programlisting>
2031 (# , True #)
2032 </programlisting>
2033 Because there is no unboxed unit tuple, the following expression
2034 <programlisting>
2035 (# #)
2036 </programlisting>
2037 continues to stand for the unboxed singleton tuple data constructor.
2038 </para>
2039
2040 </sect2>
2041
2042 <sect2 id="lambda-case">
2043 <title>Lambda-case</title>
2044 <para>
2045 The <option>-XLambdaCase</option> flag enables expressions of the form
2046 <programlisting>
2047 \case { p1 -> e1; ...; pN -> eN }
2048 </programlisting>
2049 which is equivalent to
2050 <programlisting>
2051 \freshName -> case freshName of { p1 -> e1; ...; pN -> eN }
2052 </programlisting>
2053 Note that <literal>\case</literal> starts a layout, so you can write
2054 <programlisting>
2055 \case
2056 p1 -> e1
2057 ...
2058 pN -> eN
2059 </programlisting>
2060 </para>
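<para>For instance, a function that immediately scrutinises its argument can
be written without naming that argument. The function
<literal>describe</literal> below is a hypothetical example; it also shows that
alternatives may carry guards, as in an ordinary <literal>case</literal>.</para>

<programlisting>
{-# LANGUAGE LambdaCase #-}
module Main where

describe :: Maybe Int -> String
describe = \case
  Nothing            -> "nothing"
  Just n | n &lt; 0     -> "negative"
         | otherwise -> "non-negative"

main :: IO ()
main = mapM_ (putStrLn . describe) [Nothing, Just (-3), Just 7]
</programlisting>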
2061 </sect2>
2062
2063 <sect2 id="empty-case">
2064 <title>Empty case alternatives</title>
2065 <para>
2066 The <option>-XEmptyCase</option> flag enables
2067 case expressions, or lambda-case expressions, that have no alternatives,
2068 thus:
2069 <programlisting>
2070 case e of { } -- No alternatives
2071 or
2072 \case { } -- -XLambdaCase is also required
2073 </programlisting>
2074 This can be useful when you know that the expression being scrutinised
2075 has no non-bottom values. For example:
2076 <programlisting>
2077 data Void
2078 f :: Void -> Int
2079 f x = case x of { }
2080 </programlisting>
2081 With dependently-typed features it is more useful
2082 (see <ulink url="http://ghc.haskell.org/trac/ghc/ticket/2431">Trac</ulink>).
2083 For example, consider these two candidate definitions of <literal>absurd</literal>:
2084 <programlisting>
data a :~: b where
  Refl :: a :~: a
2087
2088 absurd :: True :~: False -> a
2089 absurd x = error "absurd" -- (A)
2090 absurd x = case x of {} -- (B)
2091 </programlisting>
2092 We much prefer (B). Why? Because GHC can figure out that <literal>(True :~: False)</literal>
2093 is an empty type. So (B) has no partiality and GHC should be able to compile with
2094 <option>-fwarn-incomplete-patterns</option>. (Though the pattern match checking is not
2095 yet clever enough to do that.)
2096 On the other hand (A) looks dangerous, and GHC doesn't check to make
2097 sure that, in fact, the function can never get called.
2098 </para>
2099 </sect2>
2100
2101 <sect2 id="multi-way-if">
2102 <title>Multi-way if-expressions</title>
2103 <para>
With the <option>-XMultiWayIf</option> flag, GHC accepts conditional expressions
2105 with multiple branches:
2106 <programlisting>
2107 if | guard1 -> expr1
2108 | ...
2109 | guardN -> exprN
2110 </programlisting>
2111 which is roughly equivalent to
2112 <programlisting>
2113 case () of
2114 _ | guard1 -> expr1
2115 ...
2116 _ | guardN -> exprN
2117 </programlisting>
2118 </para>
2119
2120 <para>Multi-way if expressions introduce a new layout context. So the
2121 example above is equivalent to:
2122 <programlisting>
2123 if { | guard1 -> expr1
2124 ; | ...
2125 ; | guardN -> exprN
2126 }
2127 </programlisting>
2128 The following behaves as expected:
2129 <programlisting>
2130 if | guard1 -> if | guard2 -> expr2
2131 | guard3 -> expr3
2132 | guard4 -> expr4
2133 </programlisting>
2134 because layout translates it as
2135 <programlisting>
2136 if { | guard1 -> if { | guard2 -> expr2
2137 ; | guard3 -> expr3
2138 }
2139 ; | guard4 -> expr4
2140 }
2141 </programlisting>
2142 Layout with multi-way if works in the same way as other layout
2143 contexts, except that the semi-colons between guards in a multi-way if
2144 are optional. So it is not necessary to line up all the guards at the
2145 same column; this is consistent with the way guards work in function
2146 definitions and case expressions.
2147 </para>
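<para>A concrete instance of the schematic forms above, as a minimal sketch;
<literal>classify</literal> is a hypothetical function chosen for
illustration.</para>

<programlisting>
{-# LANGUAGE MultiWayIf #-}
module Main where

classify :: Int -> String
classify n = if | n &lt; 0     -> "negative"
                | n == 0    -> "zero"
                | otherwise -> "positive"

main :: IO ()
main = mapM_ (putStrLn . classify) [-5, 0, 5]
</programlisting>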
2148 </sect2>
2149
2150 <sect2 id="disambiguate-fields">
2151 <title>Record field disambiguation</title>
2152 <para>
2153 In record construction and record pattern matching
2154 it is entirely unambiguous which field is referred to, even if there are two different
2155 data types in scope with a common field name. For example:
2156 <programlisting>
2157 module M where
2158 data S = MkS { x :: Int, y :: Bool }
2159
2160 module Foo where
2161 import M
2162
2163 data T = MkT { x :: Int }
2164
2165 ok1 (MkS { x = n }) = n+1 -- Unambiguous
2166 ok2 n = MkT { x = n+1 } -- Unambiguous
2167
2168 bad1 k = k { x = 3 } -- Ambiguous
2169 bad2 k = x k -- Ambiguous
2170 </programlisting>
2171 Even though there are two <literal>x</literal>'s in scope,
2172 it is clear that the <literal>x</literal> in the pattern in the
2173 definition of <literal>ok1</literal> can only mean the field
2174 <literal>x</literal> from type <literal>S</literal>. Similarly for
2175 the function <literal>ok2</literal>. However, in the record update
2176 in <literal>bad1</literal> and the record selection in <literal>bad2</literal>
2177 it is not clear which of the two types is intended.
2178 </para>
2179 <para>
2180 Haskell 98 regards all four as ambiguous, but with the
2181 <option>-XDisambiguateRecordFields</option> flag, GHC will accept
2182 the former two. The rules are precisely the same as those for instance
2183 declarations in Haskell 98, where the method names on the left-hand side
2184 of the method bindings in an instance declaration refer unambiguously
2185 to the method of that class (provided they are in scope at all), even
2186 if there are other variables in scope with the same name.
2187 This reduces the clutter of qualified names when you import two
2188 records from different modules that use the same field name.
2189 </para>
2190 <para>
2191 Some details:
2192 <itemizedlist>
2193 <listitem><para>
2194 Field disambiguation can be combined with punning (see <xref linkend="record-puns"/>). For example:
2195 <programlisting>
2196 module Foo where
2197 import M
2198 x=True
2199 ok3 (MkS { x }) = x+1 -- Uses both disambiguation and punning
2200 </programlisting>
2201 </para></listitem>
2202
2203 <listitem><para>
2204 With <option>-XDisambiguateRecordFields</option> you can use <emphasis>unqualified</emphasis>
field names even if the corresponding selector is only in scope <emphasis>qualified</emphasis>.
2206 For example, assuming the same module <literal>M</literal> as in our earlier example, this is legal:
2207 <programlisting>
2208 module Foo where
2209 import qualified M -- Note qualified
2210
2211 ok4 (M.MkS { x = n }) = n+1 -- Unambiguous
2212 </programlisting>
2213 Since the constructor <literal>MkS</literal> is only in scope qualified, you must
2214 name it <literal>M.MkS</literal>, but the field <literal>x</literal> does not need
2215 to be qualified even though <literal>M.x</literal> is in scope but <literal>x</literal>
2216 is not. (In effect, it is qualified by the constructor.)
2217 </para></listitem>
2218 </itemizedlist>
2219 </para>
2220
2221 </sect2>
2222
2223 <!-- ===================== Record puns =================== -->
2224
2225 <sect2 id="record-puns">
2226 <title>Record puns
2227 </title>
2228
2229 <para>
2230 Record puns are enabled by the flag <literal>-XNamedFieldPuns</literal>.
2231 </para>
2232
2233 <para>
2234 When using records, it is common to write a pattern that binds a
2235 variable with the same name as a record field, such as:
2236
2237 <programlisting>
2238 data C = C {a :: Int}
2239 f (C {a = a}) = a
2240 </programlisting>
2241 </para>
2242
2243 <para>
2244 Record punning permits the variable name to be elided, so one can simply
2245 write
2246
2247 <programlisting>
2248 f (C {a}) = a
2249 </programlisting>
2250
2251 to mean the same pattern as above. That is, in a record pattern, the
2252 pattern <literal>a</literal> expands into the pattern <literal>a =
2253 a</literal> for the same name <literal>a</literal>.
2254 </para>
2255
2256 <para>
2257 Note that:
2258 <itemizedlist>
2259 <listitem><para>
2260 Record punning can also be used in an expression, writing, for example,
2261 <programlisting>
2262 let a = 1 in C {a}
2263 </programlisting>
2264 instead of
2265 <programlisting>
2266 let a = 1 in C {a = a}
2267 </programlisting>
2268 The expansion is purely syntactic, so the expanded right-hand side
2269 expression refers to the nearest enclosing variable that is spelled the
2270 same as the field name.
2271 </para></listitem>
2272
2273 <listitem><para>
2274 Puns and other patterns can be mixed in the same record:
2275 <programlisting>
2276 data C = C {a :: Int, b :: Int}
2277 f (C {a, b = 4}) = a
2278 </programlisting>
2279 </para></listitem>
2280
2281 <listitem><para>
2282 Puns can be used wherever record patterns occur (e.g. in
2283 <literal>let</literal> bindings or at the top-level).
2284 </para></listitem>
2285
2286 <listitem><para>
2287 A pun on a qualified field name is expanded by stripping off the module qualifier.
2288 For example:
2289 <programlisting>
f (M.C {M.a}) = a
2291 </programlisting>
2292 means
2293 <programlisting>
2294 f (M.C {M.a = a}) = a
2295 </programlisting>
2296 (This is useful if the field selector <literal>a</literal> for constructor <literal>M.C</literal>
2297 is only in scope in qualified form.)
2298 </para></listitem>
2299 </itemizedlist>
2300 </para>
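<para>The following sketch gathers the single-module cases into one runnable
file. The <literal>Point</literal> type and the functions that use it are
hypothetical and exist only to illustrate the notation.</para>

<programlisting>
{-# LANGUAGE NamedFieldPuns #-}
module Main where

data Point = Point { px :: Int, py :: Int } deriving (Show, Eq)

-- Pun in a pattern: binds the variables px and py to the field values.
manhattan :: Point -> Int
manhattan (Point {px, py}) = abs px + abs py

-- Pun in an expression: each field takes the value of the like-named argument.
mkPoint :: Int -> Int -> Point
mkPoint px py = Point {px, py}

-- A pun mixed with an ordinary field pattern.
onXAxis :: Point -> Maybe Int
onXAxis (Point {px, py = 0}) = Just px
onXAxis _                    = Nothing

main :: IO ()
main = do
  print (manhattan (mkPoint 3 (-4)))   -- 7
  print (onXAxis (mkPoint 5 0))        -- Just 5
  print (onXAxis (mkPoint 5 1))        -- Nothing
</programlisting>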
2301
2302
2303 </sect2>
2304
2305 <!-- ===================== Record wildcards =================== -->
2306
2307 <sect2 id="record-wildcards">
2308 <title>Record wildcards
2309 </title>
2310
2311 <para>
2312 Record wildcards are enabled by the flag <literal>-XRecordWildCards</literal>.
2313 This flag implies <literal>-XDisambiguateRecordFields</literal>.
2314 </para>
2315
2316 <para>
2317 For records with many fields, it can be tiresome to write out each field
2318 individually in a record pattern, as in
2319 <programlisting>
2320 data C = C {a :: Int, b :: Int, c :: Int, d :: Int}
2321 f (C {a = 1, b = b, c = c, d = d}) = b + c + d
2322 </programlisting>
2323 </para>
2324
2325 <para>
2326 Record wildcard syntax permits a "<literal>..</literal>" in a record
2327 pattern, where each elided field <literal>f</literal> is replaced by the
2328 pattern <literal>f = f</literal>. For example, the above pattern can be
2329 written as
2330 <programlisting>
2331 f (C {a = 1, ..}) = b + c + d
2332 </programlisting>
2333 </para>
2334
2335 <para>
2336 More details:
2337 <itemizedlist>
2338 <listitem><para>
2339 Record wildcards in patterns can be mixed with other patterns, including puns
2340 (<xref linkend="record-puns"/>); for example, in a pattern <literal>(C {a
2341 = 1, b, ..})</literal>. Additionally, record wildcards can be used
2342 wherever record patterns occur, including in <literal>let</literal>
2343 bindings and at the top-level. For example, the top-level binding
2344 <programlisting>
2345 C {a = 1, ..} = e
2346 </programlisting>
2347 defines <literal>b</literal>, <literal>c</literal>, and
2348 <literal>d</literal>.
2349 </para></listitem>
2350
2351 <listitem><para>
2352 Record wildcards can also be used in an expression, when constructing a record. For example,
2353 <programlisting>
2354 let {a = 1; b = 2; c = 3; d = 4} in C {..}
2355 </programlisting>
2356 in place of
2357 <programlisting>
2358 let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}
2359 </programlisting>
2360 The expansion is purely syntactic, so the record wildcard
2361 expression refers to the nearest enclosing variables that are spelled
2362 the same as the omitted field names.
2363 </para></listitem>
2364
2365 <listitem><para>
2366 Record wildcards may <emphasis>not</emphasis> be used in record <emphasis>updates</emphasis>. For example this
2367 is illegal:
2368 <programlisting>
2369 f r = r { x = 3, .. }
2370 </programlisting>
2371 </para></listitem>
2372
2373 <listitem><para>
2374 For both pattern and expression wildcards, the "<literal>..</literal>" expands to the missing
2375 <emphasis>in-scope</emphasis> record fields.
2376 Specifically the expansion of "<literal>C {..}</literal>" includes
2377 <literal>f</literal> if and only if:
2378 <itemizedlist>
2379 <listitem><para>
2380 <literal>f</literal> is a record field of constructor <literal>C</literal>.
2381 </para></listitem>
2382 <listitem><para>
2383 The record field <literal>f</literal> is in scope somehow (either qualified or unqualified).
2384 </para></listitem>
2385 <listitem><para>
2386 In the case of expressions (but not patterns),
2387 the variable <literal>f</literal> is in scope unqualified,
2388 apart from the binding of the record selector itself.
2389 </para></listitem>
2390 </itemizedlist>
2391 These rules restrict record wildcards to the situations in which the user
2392 could have written the expanded version.
2393 For example
2394 <programlisting>
2395 module M where
2396 data R = R { a,b,c :: Int }
2397 module X where
import M( R(R,a,c) )
f a b = R { .. }
2400 </programlisting>
2401 The <literal>R{..}</literal> expands to <literal>R{M.a=a}</literal>,
2402 omitting <literal>b</literal> since the record field is not in scope,
2403 and omitting <literal>c</literal> since the variable <literal>c</literal>
2404 is not in scope (apart from the binding of the
2405 record selector <literal>c</literal>, of course).
2406 </para></listitem>
2407
2408 <listitem><para>
2409 Record wildcards cannot be used (a) in a record update construct, and (b) for data
2410 constructors that are not declared with record fields. For example:
2411 <programlisting>
2412 f x = x { v=True, .. } -- Illegal (a)
2413
2414 data T = MkT Int Bool
2415 g = MkT { .. } -- Illegal (b)
2416 h (MkT { .. }) = True -- Illegal (b)
2417 </programlisting>
2418 </para></listitem>
2419 </itemizedlist>
2420 </para>
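<para>As a minimal sketch of wildcards in both a pattern and an expression,
consider the module below. The <literal>Config</literal> record and its field
values are hypothetical. In <literal>defaultConfig</literal> the wildcard picks
up the variables bound in the <literal>where</literal> clause, which shadow the
record selectors of the same names.</para>

<programlisting>
{-# LANGUAGE RecordWildCards #-}
module Main where

data Config = Config { host :: String, port :: Int, debug :: Bool }
  deriving Show

-- Pattern wildcard: binds host, port and debug.
render :: Config -> String
render Config{..} =
  host ++ ":" ++ show port ++ (if debug then " [debug]" else "")

-- Expression wildcard: expands to Config {host = host, port = port, debug = debug}.
defaultConfig :: Config
defaultConfig = Config{..}
  where
    host  = "localhost"
    port  = 8080
    debug = False

main :: IO ()
main = putStrLn (render defaultConfig)   -- localhost:8080
</programlisting>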
2421
2422 </sect2>
2423
2424 <!-- ===================== Local fixity declarations =================== -->
2425
2426 <sect2 id="local-fixity-declarations">
2427 <title>Local Fixity Declarations
2428 </title>
2429
2430 <para>A careful reading of the Haskell 98 Report reveals that fixity
2431 declarations (<literal>infix</literal>, <literal>infixl</literal>, and
2432 <literal>infixr</literal>) are permitted to appear inside local bindings
such as those introduced by <literal>let</literal> and
2434 <literal>where</literal>. However, the Haskell Report does not specify
2435 the semantics of such bindings very precisely.
2436 </para>
2437
2438 <para>In GHC, a fixity declaration may accompany a local binding:
2439 <programlisting>
2440 let f = ...
2441 infixr 3 `f`
2442 in
2443 ...
2444 </programlisting>
2445 and the fixity declaration applies wherever the binding is in scope.
2446 For example, in a <literal>let</literal>, it applies in the right-hand
2447 sides of other <literal>let</literal>-bindings and the body of the
<literal>let</literal>. Or, in recursive <literal>do</literal>
2449 expressions (<xref linkend="recursive-do-notation"/>), the local fixity
2450 declarations of a <literal>let</literal> statement scope over other
2451 statements in the group, just as the bound name does.
2452 </para>
2453
2454 <para>
Moreover, a local fixity declaration <emphasis>must</emphasis> accompany a local binding of
that name: it is not possible to revise the fixity of a name bound
2457 elsewhere, as in
2458 <programlisting>
2459 let infixr 9 $ in ...
2460 </programlisting>
2461
2462 Because local fixity declarations are technically Haskell 98, no flag is
2463 necessary to enable them.
2464 </para>
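<para>The effect of a local fixity declaration can be observed with a
non-associative operation. In this sketch the locally bound operator
<literal>|-|</literal> (a hypothetical name) is plain subtraction, declared
right-associative in one binding group and left-associative in another.</para>

<programlisting>
module Main where

rightAssoc :: Int
rightAssoc =
  let infixr 6 |-|
      a |-| b = a - b
  in 10 |-| 4 |-| 1      -- parsed as 10 |-| (4 |-| 1)

leftAssoc :: Int
leftAssoc =
  let infixl 6 |-|
      a |-| b = a - b
  in 10 |-| 4 |-| 1      -- parsed as (10 |-| 4) |-| 1

main :: IO ()
main = print (rightAssoc, leftAssoc)   -- (7,5)
</programlisting>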
2465 </sect2>
2466
2467 <sect2 id="package-imports">
2468 <title>Import and export extensions</title>
2469
2470 <sect3>
2471 <title>Hiding things the imported module doesn't export</title>
2472
2473 <para>
2474 Technically in Haskell 2010 this is illegal:
2475 <programlisting>
2476 module A( f ) where
2477 f = True
2478
2479 module B where
2480 import A hiding( g ) -- A does not export g
2481 g = f
2482 </programlisting>
2483 The <literal>import A hiding( g )</literal> in module <literal>B</literal>
2484 is technically an error (<ulink url="http://www.haskell.org/onlinereport/haskell2010/haskellch5.html#x11-1020005.3.1">Haskell Report, 5.3.1</ulink>)
2485 because <literal>A</literal> does not export <literal>g</literal>.
2486 However GHC allows it, in the interests of supporting backward compatibility; for example, a newer version of
2487 <literal>A</literal> might export <literal>g</literal>, and you want <literal>B</literal> to work
2488 in either case.
2489 </para>
2490 <para>
2491 The warning <literal>-fwarn-dodgy-imports</literal>, which is off by default but included with <literal>-W</literal>,
2492 warns if you hide something that the imported module does not export.
2493 </para>
2494 </sect3>
2495
2496 <sect3>
2497 <title id="package-qualified-imports">Package-qualified imports</title>
2498
2499 <para>With the <option>-XPackageImports</option> flag, GHC allows
2500 import declarations to be qualified by the package name that the
2501 module is intended to be imported from. For example:</para>
2502
2503 <programlisting>
2504 import "network" Network.Socket
2505 </programlisting>
2506
2507 <para>would import the module <literal>Network.Socket</literal> from
2508 the package <literal>network</literal> (any version). This may
2509 be used to disambiguate an import when the same module is
2510 available from multiple packages, or is present in both the
2511 current package being built and an external package.</para>
2512
2513 <para>The special package name <literal>this</literal> can be used to
2514 refer to the current package being built.</para>
2515
<para>Note: you probably don't need to use this feature; it was
2517 added mainly so that we can build backwards-compatible versions of
2518 packages when APIs change. It can lead to fragile dependencies in
2519 the common case: modules occasionally move from one package to
2520 another, rendering any package-qualified imports broken.
2521 See also <xref linkend="package-thinning-and-renaming" /> for
2522 an alternative way of disambiguating between module names.</para>
2523 </sect3>
2524
2525 <sect3 id="safe-imports-ext">
2526 <title>Safe imports</title>
2527
2528 <para>With the <option>-XSafe</option>, <option>-XTrustworthy</option>
2529 and <option>-XUnsafe</option> language flags, GHC extends
2530 the import declaration syntax to take an optional <literal>safe</literal>
2531 keyword after the <literal>import</literal> keyword. This feature
2532 is part of the Safe Haskell GHC extension. For example:</para>
2533
2534 <programlisting>
2535 import safe qualified Network.Socket as NS
2536 </programlisting>
2537
2538 <para>would import the module <literal>Network.Socket</literal>
with compilation succeeding only if <literal>Network.Socket</literal> can be
safely imported. For a description of when an import is
considered safe, see <xref linkend="safe-haskell"/>.</para>
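<para>A small self-contained sketch of a safe import follows. It assumes that
<literal>Data.List</literal> from <literal>base</literal> is considered safe in
your installation; the module choice and <literal>main</literal> are
illustrative only:</para>

<programlisting>
{-# LANGUAGE Trustworthy #-}
module Main where

-- The 'safe' keyword asks GHC to reject this import unless
-- Data.List can be safely imported.
import safe Data.List (sort)

main :: IO ()
main = print (sort [3, 1, 2 :: Int])
</programlisting>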
2542
2543 </sect3>
2544
2545 <sect3 id="explicit-namespaces">
2546 <title>Explicit namespaces in import/export</title>
2547
2548 <para> In an import or export list, such as
2549 <programlisting>
2550 module M( f, (++) ) where ...
2551 import N( f, (++) )
2552 ...
2553 </programlisting>
2554 the entities <literal>f</literal> and <literal>(++)</literal> are <emphasis>values</emphasis>.
2555 However, with type operators (<xref linkend="type-operators"/>) it becomes possible
2556 to declare <literal>(++)</literal> as a <emphasis>type constructor</emphasis>. In that
2557 case, how would you export or import it?
2558 </para>
2559 <para>
2560 The <option>-XExplicitNamespaces</option> extension allows you to prefix the name of
2561 a type constructor in an import or export list with "<literal>type</literal>" to
2562 disambiguate this case, thus:
2563 <programlisting>
2564 module M( f, type (++) ) where ...
2565 import N( f, type (++) )
2566 ...
2567 module N( f, type (++) ) where
  data a ++ b = L a | R b
2569 </programlisting>
2570 The extension <option>-XExplicitNamespaces</option>
2571 is implied by <option>-XTypeOperators</option> and (for some reason) by <option>-XTypeFamilies</option>.
2572 </para>
2573 <para>
2574 In addition, with <option>-XPatternSynonyms</option> you can prefix the name of
2575 a data constructor in an import or export list with the keyword <literal>pattern</literal>,
2576 to allow the import or export of a data constructor without its parent type constructor
2577 (see <xref linkend="patsyn-impexp"/>).
2578 </para>
2579 </sect3>
2580
2581 </sect2>
2582
2583 <sect2 id="syntax-stolen">
2584 <title>Summary of stolen syntax</title>
2585
2586 <para>Turning on an option that enables special syntax
2587 <emphasis>might</emphasis> cause working Haskell 98 code to fail
2588 to compile, perhaps because it uses a variable name which has
2589 become a reserved word. This section lists the syntax that is
2590 "stolen" by language extensions.
2591 We use
2592 notation and nonterminal names from the Haskell 98 lexical syntax
2593 (see the Haskell 98 Report).
2594 We only list syntax changes here that might affect
2595 existing working programs (i.e. "stolen" syntax). Many of these
2596 extensions will also enable new context-free syntax, but in all
2597 cases programs written to use the new syntax would not be
2598 compilable without the option enabled.</para>
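<para>To illustrate what "stolen" means, the following hypothetical Haskell 98
module uses <literal>proc</literal> and <literal>mdo</literal> as ordinary
variable names. It should compile with no extensions enabled, but is expected
to be rejected once <option>-XArrows</option> or <option>-XRecursiveDo</option>
is turned on, because those flags turn the two names into reserved words:</para>

<programlisting>
module Main where

-- In Haskell 98, 'proc' and 'mdo' are ordinary identifiers.
proc :: Int -> Int
proc n = n + 1

mdo :: Int
mdo = proc 41

main :: IO ()
main = print mdo
</programlisting>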
2599
2600 <para>There are two classes of special
2601 syntax:
2602
2603 <itemizedlist>
2604 <listitem>
2605 <para>New reserved words and symbols: character sequences
2606 which are no longer available for use as identifiers in the
2607 program.</para>
2608 </listitem>
2609 <listitem>
2610 <para>Other special syntax: sequences of characters that have
2611 a different meaning when this particular option is turned
2612 on.</para>
2613 </listitem>
2614 </itemizedlist>
2615
2616 The following syntax is stolen:
2617
2618 <variablelist>
2619 <varlistentry>
2620 <term>
2621 <literal>forall</literal>
2622 <indexterm><primary><literal>forall</literal></primary></indexterm>
2623 </term>
2624 <listitem><para>
2625 Stolen (in types) by: <option>-XExplicitForAll</option>, and hence by
2626 <option>-XScopedTypeVariables</option>,
2627 <option>-XLiberalTypeSynonyms</option>,
2628 <option>-XRankNTypes</option>,
2629 <option>-XExistentialQuantification</option>
2630 </para></listitem>
2631 </varlistentry>
2632
2633 <varlistentry>
2634 <term>
2635 <literal>mdo</literal>
2636 <indexterm><primary><literal>mdo</literal></primary></indexterm>
2637 </term>
2638 <listitem><para>
2639 Stolen by: <option>-XRecursiveDo</option>
2640 </para></listitem>
2641 </varlistentry>
2642
2643 <varlistentry>
2644 <term>
2645 <literal>foreign</literal>
2646 <indexterm><primary><literal>foreign</literal></primary></indexterm>
2647 </term>
2648 <listitem><para>
2649 Stolen by: <option>-XForeignFunctionInterface</option>
2650 </para></listitem>
2651 </varlistentry>
2652
2653 <varlistentry>
2654 <term>
2655 <literal>rec</literal>,
2656 <literal>proc</literal>, <literal>-&lt;</literal>,
2657 <literal>&gt;-</literal>, <literal>-&lt;&lt;</literal>,
2658 <literal>&gt;&gt;-</literal>, and <literal>(|</literal>,
2659 <literal>|)</literal> brackets
2660 <indexterm><primary><literal>proc</literal></primary></indexterm>
2661 </term>
2662 <listitem><para>
2663 Stolen by: <option>-XArrows</option>
2664 </para></listitem>
2665 </varlistentry>
2666
2667 <varlistentry>
2668 <term>
2669 <literal>?<replaceable>varid</replaceable></literal>
2670 <indexterm><primary>implicit parameters</primary></indexterm>
2671 </term>
2672 <listitem><para>
2673 Stolen by: <option>-XImplicitParams</option>
2674 </para></listitem>
2675 </varlistentry>
2676
2677 <varlistentry>
2678 <term>
2679 <literal>[|</literal>,
2680 <literal>[e|</literal>, <literal>[p|</literal>,
2681 <literal>[d|</literal>, <literal>[t|</literal>,
2682 <literal>$(</literal>,
2683 <literal>$$(</literal>,
2684 <literal>[||</literal>,
2685 <literal>[e||</literal>,
2686 <literal>$<replaceable>varid</replaceable></literal>,
2687 <literal>$$<replaceable>varid</replaceable></literal>
2688 <indexterm><primary>Template Haskell</primary></indexterm>
2689 </term>
2690 <listitem><para>
2691 Stolen by: <option>-XTemplateHaskell</option>
2692 </para></listitem>
2693 </varlistentry>
2694
2695 <varlistentry>
2696 <term>
2697 <literal>[<replaceable>varid</replaceable>|</literal>
2698 <indexterm><primary>quasi-quotation</primary></indexterm>
2699 </term>
2700 <listitem><para>
2701 Stolen by: <option>-XQuasiQuotes</option>
2702 </para></listitem>
2703 </varlistentry>
2704
2705 <varlistentry>
2706 <term>
2707 <replaceable>varid</replaceable>{<literal>&num;</literal>},
2708 <replaceable>char</replaceable><literal>&num;</literal>,
2709 <replaceable>string</replaceable><literal>&num;</literal>,
2710 <replaceable>integer</replaceable><literal>&num;</literal>,
2711 <replaceable>float</replaceable><literal>&num;</literal>,
2712 <replaceable>float</replaceable><literal>&num;&num;</literal>
2713 </term>
2714 <listitem><para>
2715 Stolen by: <option>-XMagicHash</option>
2716 </para></listitem>
2717 </varlistentry>
2718
2719 <varlistentry>
2720 <term>
2721 <literal>(&num;</literal>, <literal>&num;)</literal>
2722 </term>
2723 <listitem><para>
2724 Stolen by: <option>-XUnboxedTuples</option>
2725 </para></listitem>
2726 </varlistentry>
2727
2728 <varlistentry>
2729 <term>
2730 <replaceable>varid</replaceable><literal>!</literal><replaceable>varid</replaceable>
2731 </term>
2732 <listitem><para>
2733 Stolen by: <option>-XBangPatterns</option>
2734 </para></listitem>
2735 </varlistentry>
2736
2737 <varlistentry>
2738 <term>
2739 <literal>pattern</literal>
2740 </term>
2741 <listitem><para>
2742 Stolen by: <option>-XPatternSynonyms</option>
2743 </para></listitem>
2744 </varlistentry>
2745 </variablelist>
2746 </para>
2747 </sect2>
2748 </sect1>
2749
2750
2751 <!-- TYPE SYSTEM EXTENSIONS -->
2752 <sect1 id="data-type-extensions">
2753 <title>Extensions to data types and type synonyms</title>
2754
2755 <sect2 id="nullary-types">
2756 <title>Data types with no constructors</title>
2757
2758 <para>With the <option>-XEmptyDataDecls</option> flag (or equivalent LANGUAGE pragma),
2759 GHC lets you declare a data type with no constructors. For example:</para>
2760
2761 <programlisting>
2762 data S -- S :: *
2763 data T a -- T :: * -> *
2764 </programlisting>
2765
2766 <para>Syntactically, the declaration lacks the "= constrs" part. The
2767 type can be parameterised over types of any kind, but if the kind is
2768 not <literal>*</literal> then an explicit kind annotation must be used
2769 (see <xref linkend="kinding"/>).</para>
2770
2771 <para>Such data types have only one value, namely bottom.
2772 Nevertheless, they can be useful when defining "phantom types".</para>
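<para>As a sketch of the phantom-type idiom, the empty types below serve only
as compile-time tags. All names here (<literal>Metres</literal>,
<literal>Feet</literal>, <literal>Distance</literal>, <literal>addD</literal>)
are hypothetical and chosen for illustration:</para>

<programlisting>
{-# LANGUAGE EmptyDataDecls #-}
module Main where

-- Two types with no constructors, used only as tags.
data Metres
data Feet

-- 'unit' is a phantom parameter: it does not appear on the right-hand side.
newtype Distance unit = Distance Double
  deriving Show

-- Both arguments must carry the same tag.
addD :: Distance u -> Distance u -> Distance u
addD (Distance x) (Distance y) = Distance (x + y)

main :: IO ()
main = print (addD (Distance 1.5 :: Distance Metres) (Distance 2))
</programlisting>

<para>Passing a <literal>Distance Metres</literal> and a
<literal>Distance Feet</literal> to <literal>addD</literal> would be a type
error, even though both wrap a <literal>Double</literal>.</para>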
2773 </sect2>
2774
2775 <sect2 id="datatype-contexts">
2776 <title>Data type contexts</title>
2777
2778 <para>Haskell allows datatypes to be given contexts, e.g.</para>
2779
2780 <programlisting>
2781 data Eq a => Set a = NilSet | ConsSet a (Set a)
2782 </programlisting>
2783
<para>This declaration gives the constructors the following types:</para>
2785
2786 <programlisting>
2787 NilSet :: Set a
2788 ConsSet :: Eq a => a -> Set a -> Set a
2789 </programlisting>
2790
2791 <para>This is widely considered a misfeature, and is going to be removed from
2792 the language. In GHC, it is controlled by the deprecated extension
2793 <literal>DatatypeContexts</literal>.</para>
2794 </sect2>
2795
2796 <sect2 id="infix-tycons">
2797 <title>Infix type constructors, classes, and type variables</title>
2798
2799 <para>
2800 GHC allows type constructors, classes, and type variables to be operators, and
2801 to be written infix, very much like expressions. More specifically:
2802 <itemizedlist>
2803 <listitem><para>
2804 A type constructor or class can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
2805 The lexical syntax is the same as that for data constructors.
2806 </para></listitem>
2807 <listitem><para>
2808 Data type and type-synonym declarations can be written infix, parenthesised
2809 if you want further arguments. E.g.
2810 <screen>
2811 data a :*: b = Foo a b
2812 type a :+: b = Either a b
2813 class a :=: b where ...
2814
2815 data (a :**: b) x = Baz a b x
2816 type (a :++: b) y = Either (a,b) y
2817 </screen>
2818 </para></listitem>
2819 <listitem><para>
2820 Types, and class constraints, can be written infix. For example
2821 <screen>
2822 x :: Int :*: Bool
2823 f :: (a :=: b) => a -> b
2824 </screen>
2825 </para></listitem>
2826 <listitem><para>
2827 Back-quotes work
2828 as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or
2829 <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>.
2830 </para></listitem>
2831 <listitem><para>
2832 Fixities may be declared for type constructors, or classes, just as for data constructors. However,
2833 one cannot distinguish between the two in a fixity declaration; a fixity declaration
2834 sets the fixity for a data constructor and the corresponding type constructor. For example:
2835 <screen>
2836 infixl 7 T, :*:
2837 </screen>
2838 sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>,
2839 and similarly for <literal>:*:</literal>.
2841 </para></listitem>
2842 <listitem><para>
2843 Function arrow is <literal>infixr</literal> with fixity 0. (This might change; I'm not sure what it should be.)
2844 </para></listitem>
2845
2846 </itemizedlist>
2847 </para>
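<para>The points above can be combined in one small program. This is only a
sketch: the operators <literal>:*:</literal> and <literal>:+:</literal> follow
the examples in this section, while <literal>nested</literal>,
<literal>third</literal> and <literal>pick</literal> are hypothetical names.
A single fixity declaration covers both the type constructor and the data
constructor <literal>:*:</literal>:</para>

<programlisting>
{-# LANGUAGE TypeOperators #-}
module Main where

-- One declaration sets the fixity of the type constructor (:*:)
-- and of the data constructor (:*:).
infixr 6 :*:

data a :*: b = a :*: b
type a :+: b = Either a b

-- Parsed as Int :*: (Bool :*: Char) because of infixr.
nested :: Int :*: Bool :*: Char
nested = 1 :*: True :*: 'x'

third :: (a :*: b :*: c) -> c
third (_ :*: _ :*: z) = z

pick :: (Int :+: Bool) -> String
pick (Left n)  = "number " ++ show n
pick (Right b) = "flag " ++ show b

main :: IO ()
main = do
  print (third nested)
  putStrLn (pick (Left 7))
</programlisting>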
2848 </sect2>
2849
2850 <sect2 id="type-operators">
2851 <title>Type operators</title>
2852 <para>
2853 In types, an operator symbol like <literal>(+)</literal> is normally treated as a type
2854 <emphasis>variable</emphasis>, just like <literal>a</literal>. Thus in Haskell 98 you can say
2855 <programlisting>
2856 type T (+) = ((+), (+))
2857 -- Just like: type T a = (a,a)
2858
2859 f :: T Int -> Int
2860 f (x,y)= x
2861 </programlisting>
2862 As you can see, using operators in this way is not very useful, and Haskell 98 does not even
2863 allow you to write them infix.
2864 </para>
2865 <para>
The language extension <option>-XTypeOperators</option> changes this behaviour:
2867 <itemizedlist>
2868 <listitem><para>
2869 Operator symbols become type <emphasis>constructors</emphasis> rather than
2870 type <emphasis>variables</emphasis>.
2871 </para></listitem>
2872 <listitem><para>
Operator symbols in types can be written infix, both in definitions and uses.
For example:
2875 <programlisting>
2876 data a + b = Plus a b
2877 type Foo = Int + Bool
2878 </programlisting>
2879 </para></listitem>
2880 <listitem><para>
2881 There is now some potential ambiguity in import and export lists; for example
2882 if you write <literal>import M( (+) )</literal> do you mean the
2883 <emphasis>function</emphasis> <literal>(+)</literal> or the
2884 <emphasis>type constructor</emphasis> <literal>(+)</literal>?
The default is the former, but with <option>-XExplicitNamespaces</option> (which is implied
by <option>-XTypeOperators</option>) GHC allows you to specify the latter
2887 by preceding it with the keyword <literal>type</literal>, thus:
2888 <programlisting>
2889 import M( type (+) )
2890 </programlisting>
2891 See <xref linkend="explicit-namespaces"/>.
2892 </para></listitem>
2893 <listitem><para>
2894 The fixity of a type operator may be set using the usual fixity declarations
2895 but, as in <xref linkend="infix-tycons"/>, the function and type constructor share
2896 a single fixity.
2897 </para></listitem>
2898 </itemizedlist>
2899 </para>
2900 </sect2>
2901
2902 <sect2 id="type-synonyms">
2903 <title>Liberalised type synonyms</title>
2904
2905 <para>
2906 Type synonyms are like macros at the type level, but Haskell 98 imposes many rules
2907 on individual synonym declarations.
2908 With the <option>-XLiberalTypeSynonyms</option> extension,
2909 GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>.
2910 That means that GHC can be very much more liberal about type synonyms than Haskell 98.
2911
2912 <itemizedlist>
2913 <listitem> <para>You can write a <literal>forall</literal> (including overloading)
2914 in a type synonym, thus:
2915 <programlisting>
2916 type Discard a = forall b. Show b => a -> b -> (a, String)
2917
2918 f :: Discard a
2919 f x y = (x, show y)
2920
2921 g :: Discard Int -> (Int,String) -- A rank-2 type
2922 g f = f 3 True
2923 </programlisting>
2924 </para>
2925 </listitem>
2926
2927 <listitem><para>
2928 If you also use <option>-XUnboxedTuples</option>,
2929 you can write an unboxed tuple in a type synonym:
2930 <programlisting>
2931 type Pr = (# Int, Int #)
2932
2933 h :: Int -> Pr
2934 h x = (# x, x #)
2935 </programlisting>
2936 </para></listitem>
2937
2938 <listitem><para>
2939 You can apply a type synonym to a forall type:
2940 <programlisting>
2941 type Foo a = a -> a -> Bool
2942
2943 f :: Foo (forall b. b->b)
2944 </programlisting>
2945 After expanding the synonym, <literal>f</literal> has the legal (in GHC) type:
2946 <programlisting>
2947 f :: (forall b. b->b) -> (forall b. b->b) -> Bool
2948 </programlisting>
2949 </para></listitem>
2950
2951 <listitem><para>
2952 You can apply a type synonym to a partially applied type synonym:
2953 <programlisting>
2954 type Generic i o = forall x. i x -> o x
2955 type Id x = x
2956
2957 foo :: Generic Id []
2958 </programlisting>
2959 After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
2960 <programlisting>
2961 foo :: forall x. x -> [x]
2962 </programlisting>
2963 </para></listitem>
2964
2965 </itemizedlist>
2966 </para>
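<para>As a sketch of the last case in a complete module, the following program
gives <literal>foo</literal> a definition and uses it at two types. The body
of <literal>foo</literal> and the <literal>main</literal> function are
assumptions added for illustration; <option>-XRankNTypes</option> is enabled
here only to be explicit about permitting the <literal>forall</literal>:</para>

<programlisting>
{-# LANGUAGE LiberalTypeSynonyms, RankNTypes #-}
module Main where

type Generic i o = forall x. i x -> o x
type Id x = x

-- Expands to: foo :: forall x. x -> [x]
foo :: Generic Id []
foo x = [x]

main :: IO ()
main = print (foo 'a', foo True)
</programlisting>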
2967
2968 <para>
GHC currently does kind checking before expanding synonyms (though even that
could be changed).
2971 </para>
2972 <para>
2973 After expanding type synonyms, GHC does validity checking on types, looking for
2974 the following mal-formedness which isn't detected simply by kind checking:
2975 <itemizedlist>
2976 <listitem><para>
Type constructor applied to a type involving for-alls (if <option>-XImpredicativeTypes</option>
is off)
2979 </para></listitem>
2980 <listitem><para>
2981 Partially-applied type synonym.
2982 </para></listitem>
2983 </itemizedlist>
2984 So, for example, this will be rejected:
2985 <programlisting>
2986 type Pr = forall a. a
2987
2988 h :: [Pr]
2989 h = ...
2990 </programlisting>
2991 because GHC does not allow type constructors applied to for-all types.
2992 </para>
2993 </sect2>
2994
2995
2996 <sect2 id="existential-quantification">
2997 <title>Existentially quantified data constructors
2998 </title>
2999
3000 <para>
3001 The idea of using existential quantification in data type declarations
3002 was suggested by Perry, and implemented in Hope+ (Nigel Perry, <emphasis>The Implementation
3003 of Practical Functional Programming Languages</emphasis>, PhD Thesis, University of
3004 London, 1991). It was later formalised by Laufer and Odersky
3005 (<emphasis>Polymorphic type inference and abstract data types</emphasis>,
3006 TOPLAS, 16(5), pp1411-1430, 1994).
3007 It's been in Lennart
3008 Augustsson's <command>hbc</command> Haskell compiler for several years, and
3009 proved very useful. Here's the idea. Consider the declaration:
3010 </para>
3011
3012 <para>
3013
3014 <programlisting>
  data Foo = forall a. MkFoo a (a -> Bool)
           | Nil
3017 </programlisting>
3018
3019 </para>
3020
3021 <para>
3022 The data type <literal>Foo</literal> has two constructors with types:
3023 </para>
3024
3025 <para>
3026
3027 <programlisting>
3028 MkFoo :: forall a. a -> (a -> Bool) -> Foo
3029 Nil :: Foo
3030 </programlisting>
3031
3032 </para>
3033
3034 <para>
3035 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
3036 does not appear in the data type itself, which is plain <literal>Foo</literal>.
3037 For example, the following expression is fine:
3038 </para>
3039
3040 <para>
3041
3042 <programlisting>
3043 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
3044 </programlisting>
3045
3046 </para>
3047
3048 <para>
3049 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
3050 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
3051 isUpper</function> packages a character with a compatible function. These
3052 two things are each of type <literal>Foo</literal> and can be put in a list.
3053 </para>
3054
3055 <para>
What can we do with a value of type <literal>Foo</literal>? In particular,
3057 what happens when we pattern-match on <function>MkFoo</function>?
3058 </para>
3059
3060 <para>
3061
3062 <programlisting>
3063 f (MkFoo val fn) = ???
3064 </programlisting>
3065
3066 </para>
3067
3068 <para>
3069 Since all we know about <literal>val</literal> and <function>fn</function> is that they
3070 are compatible, the only (useful) thing we can do with them is to
3071 apply <function>fn</function> to <literal>val</literal> to get a boolean. For example:
3072 </para>
3073
3074 <para>
3075
3076 <programlisting>
3077 f :: Foo -> Bool
3078 f (MkFoo val fn) = fn val
3079 </programlisting>
3080
3081 </para>
3082
3083 <para>
3084 What this allows us to do is to package heterogeneous values
3085 together with a bunch of functions that manipulate them, and then treat
3086 that collection of packages in a uniform manner. You can express
3087 quite a bit of object-oriented-like programming this way.
3088 </para>
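<para>Putting the pieces of this section together gives the following sketch
of a complete program. The clause for <literal>Nil</literal>, the list
<literal>xs</literal> and the <literal>main</literal> function are assumptions
added so that the example is total and runnable:</para>

<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

-- The only useful thing to do with the packaged pair is to apply fn to val.
f :: Foo -> Bool
f (MkFoo val fn) = fn val
f Nil            = False   -- assumption: treat the empty case as False

-- Values of different underlying types live in one list.
xs :: [Foo]
xs = [MkFoo (4 :: Int) even, MkFoo 'c' isUpper, Nil]

main :: IO ()
main = print (map f xs)
</programlisting>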
3089
3090 <sect3 id="existential">
3091 <title>Why existential?
3092 </title>
3093
3094 <para>
3095 What has this to do with <emphasis>existential</emphasis> quantification?
3096 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
3097 </para>
3098
3099 <para>
3100
3101 <programlisting>
3102 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
3103 </programlisting>
3104
3105 </para>
3106
3107 <para>
3108 But Haskell programmers can safely think of the ordinary
3109 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
3110 adding a new existential quantification construct.
3111 </para>
3112
3113 </sect3>
3114
3115 <sect3 id="existential-with-context">
3116 <title>Existentials and type classes</title>
3117
3118 <para>
3119 An easy extension is to allow
3120 arbitrary contexts before the constructor. For example:
3121 </para>
3122
3123 <para>
3124
3125 <programlisting>
  data Baz = forall a. Eq a => Baz1 a a
           | forall b. Show b => Baz2 b (b -> b)
3128 </programlisting>
3129
3130 </para>
3131
3132 <para>
3133 The two constructors have the types you'd expect:
3134 </para>
3135
3136 <para>
3137
3138 <programlisting>
3139 Baz1 :: forall a. Eq a => a -> a -> Baz
3140 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
3141 </programlisting>
3142
3143 </para>
3144
3145 <para>
3146 But when pattern matching on <function>Baz1</function> the matched values can be compared
3147 for equality, and when pattern matching on <function>Baz2</function> the first matched
3148 value can be converted to a string (as well as applying the function to it).
3149 So this program is legal:
3150 </para>
3151
3152 <para>
3153
3154 <programlisting>
  f :: Baz -> String
  f (Baz1 p q) | p == q    = "Yes"
               | otherwise = "No"
  f (Baz2 v fn)            = show (fn v)
3159 </programlisting>
3160
3161 </para>
3162
3163 <para>
3164 Operationally, in a dictionary-passing implementation, the
3165 constructors <function>Baz1</function> and <function>Baz2</function> must store the
3166 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
extract them on pattern matching.
3168 </para>
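<para>As a sketch, the <literal>Baz</literal> example can be run as a complete
program. The sample values and the <literal>main</literal> function are
illustrative assumptions. Each constructor carries its own class dictionary, so
the list below mixes underlying types freely:</para>

<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

f :: Baz -> String
f (Baz1 p q) | p == q    = "Yes"
             | otherwise = "No"
f (Baz2 v fn)            = show (fn v)

main :: IO ()
main = mapM_ (putStrLn . f)
  [ Baz1 'x' 'x'              -- uses the Eq Char dictionary
  , Baz1 (1 :: Int) 2         -- uses the Eq Int dictionary
  , Baz2 (10 :: Int) (+ 1)    -- uses the Show Int dictionary
  ]
</programlisting>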
3169
3170 </sect3>
3171
3172 <sect3 id="existential-records">
3173 <title>Record Constructors</title>
3174
3175 <para>
GHC allows existentials to be used with record syntax as well.  For example:
3177
3178 <programlisting>
data Counter a = forall self. NewCounter
    { _this    :: self
    , _inc     :: self -> self
    , _display :: self -> IO ()
    , tag      :: a
    }
3185 </programlisting>
3186 Here <literal>tag</literal> is a public field, with a well-typed selector
3187 function <literal>tag :: Counter a -> a</literal>. The <literal>self</literal>
3188 type is hidden from the outside; any attempt to apply <literal>_this</literal>,
3189 <literal>_inc</literal> or <literal>_display</literal> as functions will raise a
3190 compile-time error. In other words, <emphasis>GHC defines a record selector function
3191 only for fields whose type does not mention the existentially-quantified variables</emphasis>.
3192 (This example used an underscore in the fields for which record selectors
3193 will not be defined, but that is only programming style; GHC ignores them.)
3194 </para>
3195
3196 <para>
3197 To make use of these hidden fields, we need to create some helper functions:
3198
3199 <programlisting>
3200 inc :: Counter a -> Counter a
inc (NewCounter x i d t) = NewCounter
    { _this = i x, _inc = i, _display = d, tag = t }
3203
3204 display :: Counter a -> IO ()
3205 display NewCounter{ _this = x, _display = d } = d x
3206 </programlisting>
3207
3208 Now we can define counters with different underlying implementations:
3209
3210 <programlisting>
counterA :: Counter String
counterA = NewCounter
    { _this = 0, _inc = (1+), _display = print, tag = "A" }

counterB :: Counter String
counterB = NewCounter
    { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }

main = do
    display (inc counterA)         -- prints "1"
    display (inc (inc counterB))   -- prints "##"
3222 </programlisting>
3223
3224 Record update syntax is supported for existentials (and GADTs):
3225 <programlisting>
3226 setTag :: Counter a -> a -> Counter a
3227 setTag obj t = obj{ tag = t }
3228 </programlisting>
3229 The rule for record update is this: <emphasis>
3230 the types of the updated fields may
3231 mention only the universally-quantified type variables
3232 of the data constructor. For GADTs, the field may mention only types
3233 that appear as a simple type-variable argument in the constructor's result
3234 type</emphasis>. For example:
3235 <programlisting>
data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential
upd1 t x = t { f1=x }   -- OK:   upd1 :: T a b -> a' -> T a' b
upd2 t x = t { f3=x }   -- BAD   (f3's type mentions c, which is
                        --        existentially quantified)

data G a b where { G1 { g1::a, g2::c } :: G a [c] }
upd3 g x = g { g1=x }   -- OK:   upd3 :: G a b -> c -> G c b
upd4 g x = g { g2=x }   -- BAD   (g2's type mentions c, which is not a simple
                        --        type-variable argument in G1's result type)
3245 </programlisting>
3246 </para>
3247
3248 </sect3>
3249
3250
3251 <sect3>
3252 <title>Restrictions</title>
3253
3254 <para>
3255 There are several restrictions on the ways in which existentially-quantified
constructors can be used.
3257 </para>
3258
3259 <para>
3260
3261 <itemizedlist>
3262 <listitem>
3263
3264 <para>
3265 When pattern matching, each pattern match introduces a new,
3266 distinct, type for each existential type variable. These types cannot
3267 be unified with any other type, nor can they escape from the scope of
3268 the pattern match. For example, these fragments are incorrect:
3269
3270
3271 <programlisting>
3272 f1 (MkFoo a f) = a
3273 </programlisting>
3274
3275
3276 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
3277 is the result of <function>f1</function>. One way to see why this is wrong is to
3278 ask what type <function>f1</function> has:
3279
3280
3281 <programlisting>
3282 f1 :: Foo -> a -- Weird!
3283 </programlisting>
3284
3285
3286 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
3287 this:
3288
3289
3290 <programlisting>
3291 f1 :: forall a. Foo -> a -- Wrong!
3292 </programlisting>
3293
3294
The original program is just plain wrong.  Here's another sort of error:
3296
3297
3298 <programlisting>
3299 f2 (Baz1 a b) (Baz1 p q) = a==q
3300 </programlisting>
3301
3302
3303 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
3304 <literal>a==q</literal> is wrong because it equates the two distinct types arising
3305 from the two <function>Baz1</function> constructors.
3306
3307
3308 </para>
3309 </listitem>
3310 <listitem>
3311
3312 <para>
3313 You can't pattern-match on an existentially quantified
3314 constructor in a <literal>let</literal> or <literal>where</literal> group of
3315 bindings. So this is illegal:
3316
3317
3318 <programlisting>
3319 f3 x = a==b where { Baz1 a b = x }
3320 </programlisting>
3321
3322 Instead, use a <literal>case</literal> expression:
3323
3324 <programlisting>
3325 f3 x = case x of Baz1 a b -> a==b
3326 </programlisting>
3327
3328 In general, you can only pattern-match
3329 on an existentially-quantified constructor in a <literal>case</literal> expression or
3330 in the patterns of a function definition.
3331
3332 The reason for this restriction is really an implementation one.
3333 Type-checking binding groups is already a nightmare without
3334 existentials complicating the picture. Also an existential pattern
3335 binding at the top level of a module doesn't make sense, because it's
3336 not clear how to prevent the existentially-quantified type "escaping".
3337 So for now, there's a simple-to-state restriction. We'll see how
3338 annoying it is.
3339
3340 </para>
3341 </listitem>
3342 <listitem>
3343
3344 <para>
3345 You can't use existential quantification for <literal>newtype</literal>
3346 declarations. So this is illegal:
3347
3348
3349 <programlisting>
3350 newtype T = forall a. Ord a => MkT a
3351 </programlisting>
3352
3353
Reason: a value of type <literal>T</literal> must be represented as a
pair of a dictionary for <literal>Ord a</literal> and a value of type
<literal>a</literal>.  That contradicts the idea that
3357 <literal>newtype</literal> should have no concrete representation.
3358 You can get just the same efficiency and effect by using
3359 <literal>data</literal> instead of <literal>newtype</literal>. If
3360 there is no overloading involved, then there is more of a case for
3361 allowing an existentially-quantified <literal>newtype</literal>,
3362 because the <literal>data</literal> version does carry an
3363 implementation cost, but single-field existentially quantified
3364 constructors aren't much use. So the simple restriction (no
3365 existential stuff on <literal>newtype</literal>) stands, unless there
3366 are convincing reasons to change it.
3367
3368
3369 </para>
3370 </listitem>
3371 <listitem>
3372
3373 <para>
3374 You can't use <literal>deriving</literal> to define instances of a
3375 data type with existentially quantified data constructors.
3376
Reason: in most cases it would not make sense.  For example:
3378
3379 <programlisting>
3380 data T = forall a. MkT [a] deriving( Eq )
3381 </programlisting>
3382
3383 To derive <literal>Eq</literal> in the standard way we would need to have equality
3384 between the single component of two <function>MkT</function> constructors:
3385
3386 <programlisting>
  instance Eq T where
    (MkT a) == (MkT b) = ???
3389 </programlisting>
3390
3391 But <varname>a</varname> and <varname>b</varname> have distinct types, and so can't be compared.
3392 It's just about possible to imagine examples in which the derived instance
3393 would make sense, but it seems altogether simpler simply to prohibit such
3394 declarations. Define your own instances!
3395 </para>
3396 </listitem>
3397
3398 </itemizedlist>
3399
3400 </para>
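<para>As a sketch of defining your own instance, the variant of
<literal>T</literal> below adds a <literal>Show a</literal> context to the
constructor (an assumption made for this example; it is not part of the
declaration discussed above). Pattern matching then makes the
<literal>Show</literal> dictionary available, so a hand-written instance is
straightforward:</para>

<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

-- Hypothetical variant of T whose constructor stores a Show dictionary.
data T = forall a. Show a => MkT [a]

-- Written by hand, using the dictionary unpacked by the pattern match.
instance Show T where
  show (MkT xs) = "MkT " ++ show xs

main :: IO ()
main = print [MkT [1, 2, 3 :: Int], MkT "abc"]
</programlisting>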
3401
3402 </sect3>
3403 </sect2>
3404
3405 <!-- ====================== Generalised algebraic data types ======================= -->
3406
3407 <sect2 id="gadt-style">
3408 <title>Declaring data types with explicit constructor signatures</title>
3409
3410 <para>When the <literal>GADTSyntax</literal> extension is enabled,
3411 GHC allows you to declare an algebraic data type by
3412 giving the type signatures of constructors explicitly. For example:
3413 <programlisting>
  data Maybe a where
      Nothing :: Maybe a
      Just    :: a -> Maybe a
3417 </programlisting>
3418 The form is called a "GADT-style declaration"
3419 because Generalised Algebraic Data Types, described in <xref linkend="gadt"/>,
3420 can only be declared using this form.</para>
3421 <para>Notice that GADT-style syntax generalises existential types (<xref linkend="existential-quantification"/>).
3422 For example, these two declarations are equivalent:
3423 <programlisting>
  data Foo = forall a. MkFoo a (a -> Bool)
  data Foo' where { MkFoo' :: a -> (a->Bool) -> Foo' }
3426 </programlisting>
3427 </para>
3428 <para>Any data type that can be declared in standard Haskell-98 syntax
3429 can also be declared using GADT-style syntax.
3430 The choice is largely stylistic, but GADT-style declarations differ in one important respect:
3431 they treat class constraints on the data constructors differently.
3432 Specifically, if the constructor is given a type-class context, that
3433 context is made available by pattern matching. For example:
3434 <programlisting>
  data Set a where
    MkSet :: Eq a => [a] -> Set a

  makeSet :: Eq a => [a] -> Set a
  makeSet xs = MkSet (nub xs)

  insert :: a -> Set a -> Set a
  insert a (MkSet as) | a `elem` as = MkSet as
                      | otherwise   = MkSet (a:as)
3444 </programlisting>
3445 A use of <literal>MkSet</literal> as a constructor (e.g. in the definition of <literal>makeSet</literal>)
3446 gives rise to a <literal>(Eq a)</literal>
3447 constraint, as you would expect. The new feature is that pattern-matching on <literal>MkSet</literal>
3448 (as in the definition of <literal>insert</literal>) makes <emphasis>available</emphasis> an <literal>(Eq a)</literal>
3449 context. In implementation terms, the <literal>MkSet</literal> constructor has a hidden field that stores
3450 the <literal>(Eq a)</literal> dictionary that is passed to <literal>MkSet</literal>; so
3451 when pattern-matching that dictionary becomes available for the right-hand side of the match.
3452 In the example, the equality dictionary is used to satisfy the equality constraint
3453 generated by the call to <literal>elem</literal>, so that the type of
3454 <literal>insert</literal> itself has no <literal>Eq</literal> constraint.
3455 </para>
3456 <para>
3457 For example, one possible application is to reify dictionaries:
3458 <programlisting>
3459 data NumInst a where
3460 MkNumInst :: Num a => NumInst a
3461
3462 intInst :: NumInst Int
3463 intInst = MkNumInst
3464
3465 plus :: NumInst a -> a -> a -> a
3466 plus MkNumInst p q = p + q
3467 </programlisting>
3468 Here, a value of type <literal>NumInst a</literal> is equivalent
3469 to an explicit <literal>(Num a)</literal> dictionary.
3470 </para>
3471 <para>
3472 All this applies to constructors declared using the syntax of <xref linkend="existential-with-context"/>.
3473 For example, the <literal>NumInst</literal> data type above could equivalently be declared
3474 like this:
3475 <programlisting>
3476 data NumInst a
3477 = Num a => MkNumInst
3478 </programlisting>
3479 Notice that, unlike the situation when declaring an existential, there is
3480 no <literal>forall</literal>, because the <literal>Num</literal> constrains the
3481 data type's universally quantified type variable <literal>a</literal>.
3482 A constructor may have both universal and existential type variables: for example,
3483 the following two declarations are equivalent:
3484 <programlisting>
3485 data T1 a
3486 = forall b. (Num a, Eq b) => MkT1 a b
3487 data T2 a where
3488 MkT2 :: (Num a, Eq b) => a -> b -> T2 a
3489 </programlisting>
3490 </para>
3491 <para>All this behaviour contrasts with Haskell 98's peculiar treatment of
3492 contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report).
3493 In Haskell 98 the definition
3494 <programlisting>
3495 data Eq a => Set' a = MkSet' [a]
3496 </programlisting>
3497 gives <literal>MkSet'</literal> the same type as <literal>MkSet</literal> above. But instead of
3498 <emphasis>making available</emphasis> an <literal>(Eq a)</literal> constraint, pattern-matching
3499 on <literal>MkSet'</literal> <emphasis>requires</emphasis> an <literal>(Eq a)</literal> constraint!
3500 GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,
3501 GHC's behaviour is much more useful, as well as much more intuitive.
3502 </para>
3503
3504 <para>
3505 The rest of this section gives further details about GADT-style data
3506 type declarations.
3507
3508 <itemizedlist>
3509 <listitem><para>
3510 The result type of each data constructor must begin with the type constructor being defined.
3511 If the result type of all constructors
3512 has the form <literal>T a1 ... an</literal>, where <literal>a1 ... an</literal>
3513 are distinct type variables, then the data type is <emphasis>ordinary</emphasis>;
3514 otherwise it is a <emphasis>generalised</emphasis> data type (<xref linkend="gadt"/>).
3515 </para></listitem>
3516
3517 <listitem><para>
3518 As with other type signatures, you can give a single signature for several data constructors.
3519 In this example we give a single signature for <literal>T1</literal> and <literal>T2</literal>:
3520 <programlisting>
3521 data T a where
3522 T1,T2 :: a -> T a
3523 T3 :: T a
3524 </programlisting>
3525 </para></listitem>
3526
3527 <listitem><para>
3528 The type signature of
3529 each constructor is independent, and is implicitly universally quantified as usual.
3530 In particular, the type variable(s) in the "<literal>data T a where</literal>" header
3531 have no scope, and different constructors may have different universally-quantified type variables:
3532 <programlisting>
3533 data T a where -- The 'a' has no scope
3534 T1,T2 :: b -> T b -- Means forall b. b -> T b
3535 T3 :: T a -- Means forall a. T a
3536 </programlisting>
3537 </para></listitem>
3538
3539 <listitem><para>
3540 A constructor signature may mention type class constraints, which can differ for
3541 different constructors. For example, this is fine:
3542 <programlisting>
3543 data T a where
3544 T1 :: Eq b => b -> b -> T b
3545 T2 :: (Show c, Ix c) => c -> [c] -> T c
3546 </programlisting>
3547 When pattern matching, these constraints are made available to discharge constraints
3548 in the body of the match. For example:
3549 <programlisting>
3550 f :: T a -> String
3551 f (T1 x y) | x==y = "yes"
3552 | otherwise = "no"
3553 f (T2 a b) = show a
3554 </programlisting>
3555 Note that <literal>f</literal> is not overloaded; the <literal>Eq</literal> constraint arising
3556 from the use of <literal>==</literal> is discharged by the pattern match on <literal>T1</literal>
3557 and similarly the <literal>Show</literal> constraint arising from the use of <literal>show</literal>.
3558 </para></listitem>
3559
3560 <listitem><para>
3561 Unlike a Haskell-98-style
3562 data type declaration, the type variable(s) in the "<literal>data Set a where</literal>" header
3563 have no scope. Indeed, one can write a kind signature instead:
3564 <programlisting>
3565 data Set :: * -> * where ...
3566 </programlisting>
3567 or even a mixture of the two:
3568 <programlisting>
3569 data Bar a :: (* -> *) -> * where ...
3570 </programlisting>
3571 The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Bar</literal>
3572 like this:
3573 <programlisting>
3574 data Bar a (b :: * -> *) where ...
3575 </programlisting>
3576 </para></listitem>
3577
3578
3579 <listitem><para>
3580 You can use strictness annotations, in the obvious places
3581 in the constructor type:
3582 <programlisting>
3583 data Term a where
3584 Lit :: !Int -> Term Int
3585 If :: Term Bool -> !(Term a) -> !(Term a) -> Term a
3586 Pair :: Term a -> Term b -> Term (a,b)
3587 </programlisting>
3588 </para></listitem>
3589
3590 <listitem><para>
3591 You can use a <literal>deriving</literal> clause on a GADT-style data type
3592 declaration. For example, these two declarations are equivalent:
3593 <programlisting>
3594 data Maybe1 a where {
3595 Nothing1 :: Maybe1 a ;
3596 Just1 :: a -> Maybe1 a
3597 } deriving( Eq, Ord )
3598
3599 data Maybe2 a = Nothing2 | Just2 a
3600 deriving( Eq, Ord )
3601 </programlisting>
3602 </para></listitem>
3603
3604 <listitem><para>
3605 The type signature may have quantified type variables that do not appear
3606 in the result type:
3607 <programlisting>
3608 data Foo where
3609 MkFoo :: a -> (a->Bool) -> Foo
3610 Nil :: Foo
3611 </programlisting>
3612 Here the type variable <literal>a</literal> does not appear in the result type
3613 of either constructor.
3614 Although it is universally quantified in the type of the constructor, such
3615 a type variable is often called "existential".
3616 Indeed, the above declaration declares precisely the same type as
3617 the <literal>data Foo</literal> in <xref linkend="existential-quantification"/>.
3618 </para><para>
3619 The type may contain a class context too, of course:
3620 <programlisting>
3621 data Showable where
3622 MkShowable :: Show a => a -> Showable
3623 </programlisting>
3624 </para></listitem>
3625
3626 <listitem><para>
3627 You can use record syntax on a GADT-style data type declaration:
3628
3629 <programlisting>
3630 data Person where
3631 Adult :: { name :: String, children :: [Person] } -> Person
3632 Child :: Show a => { name :: !String, funny :: a } -> Person
3633 </programlisting>
3634 As usual, for every constructor that has a field <literal>f</literal>, the type of
3635 field <literal>f</literal> must be the same (modulo alpha conversion).
3636 The <literal>Child</literal> constructor above shows that the signature
3637 may have a context, existentially-quantified variables, and strictness annotations,
3638 just as in the non-record case. (NB: the "type" that follows the double-colon
3639 is not really a type, because of the record syntax and strictness annotations.
3640 A "type" of this form can appear only in a constructor signature.)
3641 </para></listitem>
3642
3643 <listitem><para>
3644 Record updates are allowed with GADT-style declarations,
3645 but only for fields that have the following property: the type of the field
3646 mentions no existential type variables.
3647 </para></listitem>
3648
3649 <listitem><para>
3650 As in the case of existentials declared using the Haskell-98-like record syntax
3651 (<xref linkend="existential-records"/>),
3652 record-selector functions are generated only for those fields that have well-typed
3653 selectors.
3654 Here is the example of that section, in GADT-style syntax:
3655 <programlisting>
3656 data Counter a where
3657 NewCounter :: { _this :: self
3658 , _inc :: self -> self
3659 , _display :: self -> IO ()
3660 , tag :: a
3661 } -> Counter a
3662 </programlisting>
3663 As before, only one selector function is generated here, that for <literal>tag</literal>.
3664 Nevertheless, you can still use all the field names in pattern matching and record construction.
3665 </para></listitem>
3666
3667 <listitem><para>
3668 In a GADT-style data type declaration there is no obvious way to specify that a data constructor
3669 should be infix, which makes a difference if you derive <literal>Show</literal> for the type.
3670 (Data constructors declared infix are displayed infix by the derived <literal>show</literal>.)
3671 So GHC implements the following design: a data constructor declared in a GADT-style data type
3672 declaration is displayed infix by <literal>Show</literal> iff (a) it is an operator symbol,
3673 (b) it has two arguments, and (c) it has a programmer-supplied fixity declaration. For example:
3674 <programlisting>
3675 infix 6 :--:
3676 data T a where
3677 (:--:) :: Int -> Bool -> T Int
3678 </programlisting>
3679 </para></listitem>
3680 </itemizedlist></para>
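<para>
The record-update rule above can be illustrated with a minimal sketch based on the
<literal>Counter</literal> example. It assumes <option>-XGADTs</option> is enabled so that the
existential variable <literal>self</literal> is accepted; the names <literal>setTag</literal>,
<literal>display</literal> and <literal>counter</literal> are illustrative, and
<literal>_display</literal> is given the result type <literal>String</literal> (rather than
<literal>IO ()</literal>) only to keep the example pure:
<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

-- Counter as in the text, except that _display returns a String
-- (an assumption made here so that the example is pure).
data Counter a where
  NewCounter :: { _this    :: self
                , _inc     :: self -> self
                , _display :: self -> String
                , tag      :: a
                } -> Counter a

-- Allowed: the type of 'tag' mentions only 'a', not the existential 'self'.
setTag :: Counter a -> a -> Counter a
setTag obj t = obj { tag = t }

-- The existential fields have no selectors, but can be used via pattern matching.
display :: Counter a -> String
display (NewCounter { _this = s, _inc = f, _display = d }) = d (f s)

-- Record construction may use all the field names.
counter :: Counter String
counter = NewCounter { _this = 0 :: Int, _inc = (+ 1), _display = show, tag = "A" }

main :: IO ()
main = do
  putStrLn (tag (setTag counter "B"))
  putStrLn (display counter)
</programlisting>
An update of <literal>_this</literal> or <literal>_inc</literal> would be expected to be rejected,
because the types of those fields mention the existential variable <literal>self</literal>.
</para>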
3681 </sect2>
3682
3683 <sect2 id="gadt">
3684 <title>Generalised Algebraic Data Types (GADTs)</title>
3685
3686 <para>Generalised Algebraic Data Types generalise ordinary algebraic data types
3687 by allowing constructors to have richer return types. Here is an example:
3688 <programlisting>
3689 data Term a where
3690 Lit :: Int -> Term Int
3691 Succ :: Term Int -> Term Int
3692 IsZero :: Term Int -> Term Bool
3693 If :: Term Bool -> Term a -> Term a -> Term a
3694 Pair :: Term a -> Term b -> Term (a,b)
3695 </programlisting>
3696 Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the
3697 case with ordinary data types. This generality allows us to
3698 write a well-typed <literal>eval</literal> function
3699 for these <literal>Terms</literal>:
3700 <programlisting>
3701 eval :: Term a -> a
3702 eval (Lit i) = i
3703 eval (Succ t) = 1 + eval t
3704 eval (IsZero t) = eval t == 0
3705 eval (If b e1 e2) = if eval b then eval e1 else eval e2
3706 eval (Pair e1 e2) = (eval e1, eval e2)
3707 </programlisting>
3708 The key point about GADTs is that <emphasis>pattern matching causes type refinement</emphasis>.
3709 For example, in the right hand side of the equation
3710 <programlisting>
3711 eval :: Term a -> a
3712 eval (Lit i) = ...
3713 </programlisting>
3714 the type <literal>a</literal> is refined to <literal>Int</literal>. That's the whole point!
3715 A precise specification of the type rules is beyond what this user manual aspires to,
3716 but the design closely follows that described in
3717 the paper <ulink
3718 url="http://research.microsoft.com/%7Esimonpj/papers/gadt/">Simple
3719 unification-based type inference for GADTs</ulink>
3720 (ICFP 2006).
3721 The general principle is this: <emphasis>type refinement is only carried out
3722 based on user-supplied type annotations</emphasis>.
3723 So if no type signature is supplied for <literal>eval</literal>, no type refinement happens,
3724 and GHC is likely to report obscure type errors.
3725 However, the refinement is quite general. For example, if we had:
3726 <programlisting>
3727 eval :: Term a -> a -> a
3728 eval (Lit i) j = i+j
3729 </programlisting>
3730 the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type
3731 of the constructor <literal>Lit</literal>), and that refinement also applies to the type of <literal>j</literal>, and
3732 the result type of the <literal>case</literal> expression. Hence the addition <literal>i+j</literal> is legal.
3733 </para>
3734 <para>
3735 These and many other examples are given in papers by Hongwei Xi, and
3736 Tim Sheard. There is a longer introduction
3737 <ulink url="http://www.haskell.org/haskellwiki/GADT">on the wiki</ulink>,
3738 and Ralf Hinze's
3739 <ulink url="http://www.cs.ox.ac.uk/ralf.hinze/publications/With.pdf">Fun with phantom types</ulink> also has a number of examples. Note that papers
3740 may use different notation to that implemented in GHC.
3741 </para>
3742 <para>
3743 The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with
3744 <option>-XGADTs</option>. The <option>-XGADTs</option> flag also sets <option>-XGADTSyntax</option>
3745 and <option>-XMonoLocalBinds</option>.
3746 <itemizedlist>
3747 <listitem><para>
3748 A GADT can only be declared using GADT-style syntax (<xref linkend="gadt-style"/>);
3749 the old Haskell-98 syntax for data declarations always declares an ordinary data type.
3750 The result type of each constructor must begin with the type constructor being defined,
3751 but for a GADT the arguments to the type constructor can be arbitrary monotypes.
3752 For example, in the <literal>Term</literal> data
3753 type above, the type of each constructor must end with <literal>Term ty</literal>, but
3754 the <literal>ty</literal> need not be a type variable (e.g. the <literal>Lit</literal>
3755 constructor).
3756 </para></listitem>
3757
3758 <listitem><para>
3759 It is permitted to declare an ordinary algebraic data type using GADT-style syntax.
3760 What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors
3761 whose result type is not just <literal>T a b</literal>.
3762 </para></listitem>
3763
3764 <listitem><para>
3765 You cannot use a <literal>deriving</literal> clause for a GADT; only for
3766 an ordinary data type.
3767 </para></listitem>
3768
3769 <listitem><para>
3770 As mentioned in <xref linkend="gadt-style"/>, record syntax is supported.
3771 For example:
3772 <programlisting>
3773 data Term a where
3774 Lit :: { val :: Int } -> Term Int
3775 Succ :: { num :: Term Int } -> Term Int
3776 Pred :: { num :: Term Int } -> Term Int
3777 IsZero :: { arg :: Term Int } -> Term Bool
3778 Pair :: { arg1 :: Term a
3779 , arg2 :: Term b
3780 } -> Term (a,b)
3781 If :: { cnd :: Term Bool
3782 , tru :: Term a
3783 , fls :: Term a
3784 } -> Term a
3785 </programlisting>
3786 However, for GADTs there is the following additional constraint:
3787 every constructor that has a field <literal>f</literal> must have
3788 the same result type (modulo alpha conversion).
3789 Hence, in the above example, we cannot merge the <literal>num</literal>
3790 and <literal>arg</literal> fields above into a
3791 single name. Although their field types are both <literal>Term Int</literal>,
3792 their selector functions actually have different types:
3793
3794 <programlisting>
3795 num :: Term Int -> Term Int
3796 arg :: Term Bool -> Term Int
3797 </programlisting>
3798 </para></listitem>
3799
3800 <listitem><para>
3801 When pattern-matching against data constructors drawn from a GADT,
3802 for example in a <literal>case</literal> expression, the following rules apply:
3803 <itemizedlist>
3804 <listitem><para>The type of the scrutinee must be rigid.</para></listitem>
3805 <listitem><para>The type of the entire <literal>case</literal> expression must be rigid.</para></listitem>
3806 <listitem><para>The type of any free variable mentioned in any of
3807 the <literal>case</literal> alternatives must be rigid.</para></listitem>
3808 </itemizedlist>
3809 A type is "rigid" if it is completely known to the compiler at its binding site. The easiest
3810 way to ensure that a variable has a rigid type is to give it a type signature.
3811 For more precise details see <ulink url="http://research.microsoft.com/%7Esimonpj/papers/gadt">
3812 Simple unification-based type inference for GADTs
3813 </ulink>. The criteria implemented by GHC are given in the Appendix.
3814
3815 </para></listitem>
3816
3817 </itemizedlist>
3818 </para>
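<para>
As a minimal sketch of the rigidity rules for <literal>case</literal> expressions, the following
module uses a cut-down version of the <literal>Term</literal> type from above (three constructors
only, chosen here for brevity). The type signature on <literal>eval</literal> is what makes both the
scrutinee type and the result type rigid:
<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

data Term a where
  Lit    :: Int -> Term Int
  IsZero :: Term Int -> Term Bool
  If     :: Term Bool -> Term a -> Term a -> Term a

-- The signature makes the scrutinee type (Term a) and the result type (a)
-- rigid, so each alternative may refine 'a'.
eval :: Term a -> a
eval t = case t of
  Lit i      -> i                  -- here 'a' is refined to Int
  IsZero e   -> eval e == 0        -- here 'a' is refined to Bool
  If c e1 e2 -> if eval c then eval e1 else eval e2

main :: IO ()
main = print (eval (If (IsZero (Lit 0)) (Lit 1) (Lit 2)))
</programlisting>
Removing the signature for <literal>eval</literal> leaves the types non-rigid, so the
definition would be expected to fail to typecheck.
</para>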
3819
3820 </sect2>
3821 </sect1>
3822
3823 <!-- ====================== End of Generalised algebraic data types ======================= -->
3824
3825 <sect1 id="deriving">
3826 <title>Extensions to the "deriving" mechanism</title>
3827
3828 <sect2 id="deriving-inferred">
3829 <title>Inferred context for deriving clauses</title>
3830
3831 <para>
3832 The Haskell Report is vague about exactly when a <literal>deriving</literal> clause is
3833 legal. For example:
3834 <programlisting>
3835 data T0 f a = MkT0 a deriving( Eq )
3836 data T1 f a = MkT1 (f a) deriving( Eq )
3837 data T2 f a = MkT2 (f (f a)) deriving( Eq )
3838 </programlisting>
3839 The natural generated <literal>Eq</literal> code would result in these instance declarations:
3840 <programlisting>
3841 instance Eq a => Eq (T0 f a) where ...
3842 instance Eq (f a) => Eq (T1 f a) where ...
3843 instance Eq (f (f a)) => Eq (T2 f a) where ...
3844 </programlisting>
3845 The first of these is obviously fine. The second is still fine, although less obviously.
3846 The third is not Haskell 98, and risks losing termination of instances.
3847 </para>
3848 <para>
3849 GHC takes a conservative position: it accepts the first two, but not the third. The rule is this:
3850 each constraint in the inferred instance context must consist only of type variables,
3851 with no repetitions.
3852 </para>
3853 <para>
3854 This rule is applied regardless of flags. If you want a more exotic context, you can write
3855 it yourself, using the <link linkend="stand-alone-deriving">standalone deriving mechanism</link>.
3856 </para>
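<para>
For instance, the instance for <literal>T2</literal> that a <literal>deriving</literal> clause
declines to infer can be requested explicitly. This is a sketch only: the set of extensions shown
is an assumption about what the hand-written context requires (a non-type-variable constraint, and
a context that does not satisfy the usual termination conditions), and
<literal>sameNested</literal> is an illustrative name:
<programlisting>
{-# LANGUAGE StandaloneDeriving, FlexibleContexts, UndecidableInstances #-}
module Main where

data T2 f a = MkT2 (f (f a))

-- The context that a deriving clause would refuse to infer, written by hand.
deriving instance Eq (f (f a)) => Eq (T2 f a)

-- Here f is the list type constructor, so the context becomes Eq [[Int]].
sameNested :: Bool
sameNested = MkT2 [[1 :: Int]] == MkT2 [[1]]

main :: IO ()
main = print sameNested
</programlisting>
</para>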
3857 </sect2>
3858
3859 <sect2 id="stand-alone-deriving">
3860 <title>Stand-alone deriving declarations</title>
3861
3862 <para>
3863 GHC now allows stand-alone <literal>deriving</literal> declarations, enabled by <literal>-XStandaloneDeriving</literal>:
3864 <programlisting>
3865 data Foo a = Bar a | Baz String
3866
3867 deriving instance Eq a => Eq (Foo a)
3868 </programlisting>
3869 The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword
3870 <literal>deriving</literal>, and (b) the absence of the <literal>where</literal> part.
3871 </para>
3872 <para>
3873 However, standalone deriving differs from a <literal>deriving</literal> clause in a number
3874 of important ways:
3875 <itemizedlist>
3876 <listitem><para>The standalone deriving declaration does not need to be in the
3877 same module as the data type declaration. (But be aware of the dangers of
3878 orphan instances (<xref linkend="orphan-modules"/>).)
3879 </para></listitem>
3880
3881 <listitem><para>
3882 You must supply an explicit context (in the example the context is <literal>(Eq a)</literal>),
3883 exactly as you would in an ordinary instance declaration.
3884 (In contrast, in a <literal>deriving</literal> clause
3885 attached to a data type declaration, the context is inferred.)
3886 </para></listitem>
3887
3888 <listitem><para>
3889 Unlike a <literal>deriving</literal>
3890 clause attached to a <literal>data</literal> declaration, the instance can be more specific
3891 than the data type (assuming you also use
3892 <literal>-XFlexibleInstances</literal>, <xref linkend="instance-rules"/>). Consider
3893 for example
3894 <programlisting>
3895 data Foo a = Bar a | Baz String
3896
3897 deriving instance Eq a => Eq (Foo [a])
3898 deriving instance Eq a => Eq (Foo (Maybe a))
3899 </programlisting>
3900 This will generate a derived instance for <literal>(Foo [a])</literal> and <literal>(Foo (Maybe a))</literal>,
3901 but other types such as <literal>(Foo (Int,Bool))</literal> will not be an instance of <literal>Eq</literal>.
3902 </para></listitem>
3903
3904 <listitem><para>
3905 Unlike a <literal>deriving</literal>
3906 clause attached to a <literal>data</literal> declaration,
3907 GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate
3908 boilerplate code for the specified class, and typechecks it. If there is a type error, it is
3909 your problem. (GHC will show you the offending code if it has a type error.)
3910 </para>
3911 <para>
3912 The merit of this is that you can derive instances for GADTs and other exotic
3913 data types, providing only that the boilerplate code does indeed typecheck. For example:
3914 <programlisting>
3915 data T a where
3916 T1 :: T Int
3917 T2 :: T Bool
3918
3919 deriving instance Show (T a)
3920 </programlisting>
3921 In this example, you cannot say <literal>... deriving( Show )</literal> on the
3922 data type declaration for <literal>T</literal>,
3923 because <literal>T</literal> is a GADT, but you <emphasis>can</emphasis> generate
3924 the instance declaration using stand-alone deriving.
3925 </para>
3926 <para>
3927 The down-side is that,
3928 if the boilerplate code fails to typecheck, you will get an error message about that
3929 code, which you did not write. By contrast, with a <literal>deriving</literal> clause
3930 the side-conditions are necessarily more conservative, but any error message
3931 may be more comprehensible.
3932 </para>
3933 </listitem>
3934 </itemizedlist></para>
3935
3936 <para>
3937 In other ways, however, a standalone deriving obeys the same rules as ordinary deriving:
3938 <itemizedlist>
3939 <listitem><para>
3940 A <literal>deriving instance</literal> declaration
3941 must obey the same rules concerning form and termination as ordinary instance declarations,
3942 controlled by the same flags; see <xref linkend="instance-decls"/>.
3943 </para></listitem>
3944
3945 <listitem>
3946 <para>The stand-alone syntax is generalised for newtypes in exactly the same
3947 way that ordinary <literal>deriving</literal> clauses are generalised (<xref linkend="newtype-deriving"/>).
3948 For example:
3949 <programlisting>
3950 newtype Foo a = MkFoo (State Int a)
3951
3952 deriving instance MonadState Int Foo
3953 </programlisting>
3954 GHC always treats the <emphasis>last</emphasis> parameter of the instance
3955 (<literal>Foo</literal> in this example) as the type whose instance is being derived.
3956 </para></listitem>
3957 </itemizedlist></para>
3958
3959 </sect2>
3960
3961 <sect2 id="deriving-extra">
3962 <title>Deriving instances of extra classes (<literal>Data</literal>, etc)</title>
3963
3964 <para>
3965 Haskell 98 allows the programmer to add "<literal>deriving( Eq, Ord )</literal>" to a data type
3966 declaration, to generate a standard instance declaration for classes specified in the <literal>deriving</literal> clause.
3967 In Haskell 98, the only classes that may appear in the <literal>deriving</literal> clause are the standard
3968 classes <literal>Eq</literal>, <literal>Ord</literal>,
3969 <literal>Enum</literal>, <literal>Ix</literal>, <literal>Bounded</literal>, <literal>Read</literal>, and <literal>Show</literal>.
3970 </para>
3971 <para>
3972 GHC extends this list with several more classes that may be automatically derived:
3973 <itemizedlist>
3974 <listitem><para> With <option>-XDeriveGeneric</option>, you can derive
3975 instances of the classes <literal>Generic</literal> and
3976 <literal>Generic1</literal>, defined in <literal>GHC.Generics</literal>.
3977 You can use these to define generic functions,
3978 as described in <xref linkend="generic-programming"/>.
3979 </para></listitem>
3980
3981 <listitem><para> With <option>-XDeriveFunctor</option>, you can derive instances of
3982 the class <literal>Functor</literal>,
3983 defined in <literal>GHC.Base</literal>.
3984 </para></listitem>
3985
3986 <listitem><para> With <option>-XDeriveDataTypeable</option>, you can derive instances of
3987 the class <literal>Data</literal>,
3988 defined in <literal>Data.Data</literal>. See <xref linkend="deriving-typeable"/> for
3989 deriving <literal>Typeable</literal>.
3990 </para></listitem>
3991
3992 <listitem><para> With <option>-XDeriveFoldable</option>, you can derive instances of
3993 the class <literal>Foldable</literal>,
3994 defined in <literal>Data.Foldable</literal>.
3995 </para></listitem>
3996
3997 <listitem><para> With <option>-XDeriveTraversable</option>, you can derive instances of
3998 the class <literal>Traversable</literal>,
3999 defined in <literal>Data.Traversable</literal>. Since the <literal>Traversable</literal>
4000 instance dictates the instances of <literal>Functor</literal> and
4001 <literal>Foldable</literal>, you'll probably want to derive them too, so
4002 <option>-XDeriveTraversable</option> implies
4003 <option>-XDeriveFunctor</option> and <option>-XDeriveFoldable</option>.
4004 </para></listitem>
4005 </itemizedlist>
4006 You can also use a standalone deriving declaration instead
4007 (see <xref linkend="stand-alone-deriving"/>).
4008 </para>
4009 <para>
4010 In each case the appropriate class must be in scope before it
4011 can be mentioned in the <literal>deriving</literal> clause.
4012 </para>
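<para>
As a small sketch of these flags in use, the following module derives
<literal>Functor</literal>, <literal>Foldable</literal> and <literal>Traversable</literal>
for a binary tree. The <literal>Tree</literal> type and the names <literal>tree</literal> and
<literal>checkPositive</literal> are illustrative assumptions; the classes are imported
explicitly so that they are in scope whichever <literal>Prelude</literal> is in use:
<programlisting>
{-# LANGUAGE DeriveTraversable #-}
module Main where

import Data.Foldable (Foldable, toList)
import Data.Traversable (Traversable, traverse)

data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Functor, Foldable, Traversable)

tree :: Tree Int
tree = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf)

-- Succeeds only for strictly positive numbers.
checkPositive :: Int -> Maybe Int
checkPositive x = if x > 0 then Just x else Nothing

main :: IO ()
main = do
  print (fmap (* 2) tree)                             -- derived Functor
  print (toList tree)                                 -- derived Foldable
  print (fmap toList (traverse checkPositive tree))   -- derived Traversable
</programlisting>
The derived instances visit the constructor fields from left to right, so
<literal>toList</literal> here yields the elements in in-order sequence.
</para>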
4013 </sect2>
4014
4015 <sect2 id="deriving-typeable">
4016 <title>Deriving <literal>Typeable</literal> instances</title>
4017
4018 <para>The class <literal>Typeable</literal> is very special:
4019 <itemizedlist>
4020 <listitem><para>
4021 <literal>Typeable</literal> is kind-polymorphic (see
4022 <xref linkend="kind-polymorphism"/>).
4023 </para></listitem>
4024
4025 <listitem><para>
4026 Only derived instances of <literal>Typeable</literal> are allowed;
4027 i.e. handwritten instances are forbidden. This ensures that the
4028 programmer cannot subvert the type system by writing bogus instances.
4029 </para></listitem>
4030
4031 <listitem><para>
4032 With <option>-XDeriveDataTypeable</option>
4033 GHC allows you to derive instances of <literal>Typeable</literal> for data types or newtypes,
4034 using a <literal>deriving</literal> clause, or using
4035 a standalone deriving declaration (<xref linkend="stand-alone-deriving"/>).
4036 </para></listitem>
4037
4038 <listitem><para>
4039 With <option>-XDataKinds</option>, deriving <literal>Typeable</literal> for a data
4040 type (whether via a deriving clause or standalone deriving)
4041 also derives <literal>Typeable</literal> for the promoted data constructors (<xref linkend="promotion"/>).
4042 </para></listitem>
4043
4044 <listitem><para>
4045 However, using standalone deriving, you can <emphasis>also</emphasis> derive
4046 a <literal>Typeable</literal> instance for a data family.
4047 You may not add a <literal>deriving(Typeable)</literal> clause to a
4048 <literal>data instance</literal> declaration; instead you must use a
4049 standalone deriving declaration for the data family.
4050 </para></listitem>
4051
4052 <listitem><para>
4053 Using standalone deriving, you can <emphasis>also</emphasis> derive
4054 a <literal>Typeable</literal> instance for a type class.
4055 </para></listitem>
4056
4057 <listitem><para>
4058 The flag <option>-XAutoDeriveTypeable</option> triggers the generation
4059 of derived <literal>Typeable</literal> instances for every datatype, data family,
4060 and type class declaration in the module in which it is used, unless a manually-specified one is
4061 already provided.
4062 This flag implies <option>-XDeriveDataTypeable</option>.
4063 </para></listitem>
4064 </itemizedlist>
4065
4066 </para>
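<para>
A minimal sketch of deriving <literal>Typeable</literal> with a <literal>deriving</literal>
clause follows. The <literal>Colour</literal> type is an illustrative assumption, and
<literal>typeOf</literal> from <literal>Data.Typeable</literal> is used only to observe
the derived instance:
<programlisting>
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Data.Typeable (Typeable, typeOf)

data Colour = Red | Green
  deriving (Typeable)

main :: IO ()
main = do
  print (typeOf Red)      -- the type representation of Colour
  print (typeOf [Green])  -- instances compose with existing ones, e.g. lists
</programlisting>
</para>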
4067
4068 </sect2>
4069
4070 <sect2 id="newtype-deriving">
4071 <title>Generalised derived instances for newtypes</title>
4072
4073 <para>
4074 When you define an abstract type using <literal>newtype</literal>, you may want
4075 the new type to inherit some instances from its representation. In
4076 Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>,
4077 <literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any
4078 other classes you have to write an explicit instance declaration. For
4079 example, if you define
4080
4081 <programlisting>
4082 newtype Dollars = Dollars Int
4083 </programlisting>
4084
4085 and you want to use arithmetic on <literal>Dollars</literal>, you have to
4086 explicitly define an instance of <literal>Num</literal>:
4087
4088 <programlisting>
4089 instance Num Dollars where
4090 Dollars a + Dollars b = Dollars (a+b)
4091 ...
4092 </programlisting>
4093 All the instance does is apply and remove the <literal>newtype</literal>
4094 constructor. It is particularly galling that, since the constructor
4095 doesn't appear at run-time, this instance declaration defines a
4096 dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal>
4097 dictionary, only slower!
4098 </para>
4099
4100
4101 <sect3 id="generalized-newtype-deriving"> <title> Generalising the deriving clause </title>
4102 <para>
4103 GHC now permits such instances to be derived instead,
4104 using the flag <option>-XGeneralizedNewtypeDeriving</option>,
4105 so one can write