1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <para>
3 <indexterm><primary>language, GHC</primary></indexterm>
4 <indexterm><primary>extensions, GHC</primary></indexterm>
5 As with all known Haskell systems, GHC implements some extensions to
6 the language. They can all be enabled or disabled by command-line flags
7 or language pragmas. By default GHC understands the most recent Haskell
8 version it supports, plus a handful of extensions.
9 </para>
10
11 <para>
12 Some of the Glasgow extensions serve to give you access to the
13 underlying facilities with which we implement Haskell. Thus, you can
14 get at the Raw Iron, if you are willing to write some non-portable
15 code at a more primitive level. You need not be &ldquo;stuck&rdquo;
16 on performance because of the implementation costs of Haskell's
17 &ldquo;high-level&rdquo; features&mdash;you can always code
18 &ldquo;under&rdquo; them. In an extreme case, you can write all your
19 time-critical code in C, and then just glue it together with Haskell!
20 </para>
21
22 <para>
23 Before you get too carried away working at the lowest level (e.g.,
24 sloshing <literal>MutableByteArray&num;</literal>s around your
25 program), you may wish to check if there are libraries that provide a
26 &ldquo;Haskellised veneer&rdquo; over the features you want. The
27 separate <ulink url="../libraries/index.html">libraries
28 documentation</ulink> describes all the libraries that come with GHC.
29 </para>
30
31 <!-- LANGUAGE OPTIONS -->
32 <sect1 id="options-language">
33 <title>Language options</title>
34
35 <indexterm><primary>language</primary><secondary>option</secondary>
36 </indexterm>
37 <indexterm><primary>options</primary><secondary>language</secondary>
38 </indexterm>
39 <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
40 </indexterm>
41
42 <para>The language option flags control which variations of the language are
43 permitted.</para>
44
45 <para>Language options can be controlled in two ways:
46 <itemizedlist>
47 <listitem><para>Every language option can be switched on by a command-line flag "<option>-X...</option>"
48 (e.g. <option>-XTemplateHaskell</option>), and switched off by the flag "<option>-XNo...</option>"
49 (e.g. <option>-XNoTemplateHaskell</option>).</para></listitem>
50 <listitem><para>
51 Language options recognised by Cabal can also be enabled using the <literal>LANGUAGE</literal> pragma,
52 thus <literal>{-# LANGUAGE TemplateHaskell #-}</literal> (see <xref linkend="language-pragma"/>). </para>
53 </listitem>
54 </itemizedlist></para>
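<para>As a minimal sketch of the second route, the following module enables one
extension with a <literal>LANGUAGE</literal> pragma at the top of the file. The
extension (<literal>TupleSections</literal>), the module name and the function are
arbitrary choices for illustration; compiling with <option>-XTupleSections</option>
instead of writing the pragma should have the same effect.</para>

<programlisting>
{-# LANGUAGE TupleSections #-}
module TagAll where

-- With TupleSections, (k,) is shorthand for \v -> (k, v).
tagAll :: k -> [v] -> [(k, v)]
tagAll k = map (k,)
</programlisting>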
55
56 <para>The flag <option>-fglasgow-exts</option>
57 <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
58 is equivalent to enabling the following extensions:
59 &what_glasgow_exts_does;
60 Enabling these options is the <emphasis>only</emphasis>
61 effect of <option>-fglasgow-exts</option>.
62 We are trying to move away from this portmanteau flag,
63 and towards enabling features individually.</para>
64
65 </sect1>
66
67 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
68 <sect1 id="primitives">
69 <title>Unboxed types and primitive operations</title>
70
71 <para>GHC is built on a raft of primitive data types and operations;
72 "primitive" in the sense that they cannot be defined in Haskell itself.
73 While you really can use this stuff to write fast code,
74 we generally find it a lot less painful, and more satisfying in the
75 long run, to use higher-level language features and libraries. With
76 any luck, the code you write will be optimised to the efficient
77 unboxed version in any case. And if it isn't, we'd like to know
78 about it.</para>
79
80 <para>All these primitive data types and operations are exported by the
81 library <literal>GHC.Prim</literal>, for which there is
82 <ulink url="&libraryGhcPrimLocation;/GHC-Prim.html">detailed online documentation</ulink>.
83 (This documentation is generated from the file <filename>compiler/prelude/primops.txt.pp</filename>.)
84 </para>
85
86 <para>
87 If you want to mention any of the primitive data types or operations in your
88 program, you must first import <literal>GHC.Prim</literal> to bring them
89 into scope. Many of them have names ending in "&num;", and to mention such
90 names you need the <option>-XMagicHash</option> extension (<xref linkend="magic-hash"/>).
91 </para>
92
93 <para>The primops make extensive use of <link linkend="glasgow-unboxed">unboxed types</link>
94 and <link linkend="unboxed-tuples">unboxed tuples</link>, which
95 we briefly summarise here. </para>
96
97 <sect2 id="glasgow-unboxed">
98 <title>Unboxed types</title>
99
100 <para>
101 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
102 </para>
103
104 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
105 that values of that type are represented by a pointer to a heap
106 object. The representation of a Haskell <literal>Int</literal>, for
107 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
108 type, however, is represented by the value itself; no pointers or heap
109 allocation are involved.
110 </para>
111
112 <para>
113 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
114 would use in C: <literal>Int&num;</literal> (long int),
115 <literal>Double&num;</literal> (double), <literal>Addr&num;</literal>
116 (void *), etc. The <emphasis>primitive operations</emphasis>
117 (PrimOps) on these types are what you might expect; e.g.,
118 <literal>(+&num;)</literal> is addition on
119 <literal>Int&num;</literal>s, and is the machine-addition that we all
120 know and love&mdash;usually one instruction.
121 </para>
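<para>A small sketch of what this looks like in practice, assuming the primitives
are imported from <literal>GHC.Exts</literal> (which re-exports them) and that
<option>-XMagicHash</option> is enabled. The module and function names are
hypothetical.</para>

<programlisting>
{-# LANGUAGE MagicHash #-}
module UnboxedAdd where

import GHC.Exts (Int (I#), (+#))

-- Unwrap two boxed Ints, add the raw Int# values with the (+#)
-- primop, and box the result again with the I# constructor.
plusInt :: Int -> Int -> Int
plusInt (I# x) (I# y) = I# (x +# y)
</programlisting>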
122
123 <para>
124 Primitive (unboxed) types cannot be defined in Haskell, and are
125 therefore built into the language and compiler. Primitive types are
126 always unlifted; that is, a value of a primitive type cannot be
127 bottom. We use the convention (but it is only a convention)
128 that primitive types, values, and
129 operations have a <literal>&num;</literal> suffix (see <xref linkend="magic-hash"/>).
130 For some primitive types we have special syntax for literals, also
131 described in the <link linkend="magic-hash">same section</link>.
132 </para>
133
134 <para>
135 Primitive values are often represented by a simple bit-pattern, such
136 as <literal>Int&num;</literal>, <literal>Float&num;</literal>,
137 <literal>Double&num;</literal>. But this is not necessarily the case:
138 a primitive value might be represented by a pointer to a
139 heap-allocated object. Examples include
140 <literal>Array&num;</literal>, the type of primitive arrays. A
141 primitive array is heap-allocated because it is too big a value to fit
142 in a register, and would be too expensive to copy around; in a sense,
143 it is accidental that it is represented by a pointer. If a pointer
144 represents a primitive value, then it really does point to that value:
145 no unevaluated thunks, no indirections&hellip;nothing can be at the
146 other end of the pointer than the primitive value.
147 A numerically-intensive program using unboxed types can
148 go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
149 counterpart&mdash;we saw a threefold speedup on one example.
150 </para>
151
152 <para>
153 There are some restrictions on the use of primitive types:
154 <itemizedlist>
155 <listitem><para>The main restriction
156 is that you can't pass a primitive value to a polymorphic
157 function or store one in a polymorphic data type. This rules out
158 things like <literal>[Int&num;]</literal> (i.e. lists of primitive
159 integers). The reason for this restriction is that polymorphic
160 arguments and constructor fields are assumed to be pointers: if an
161 unboxed integer is stored in one of these, the garbage collector would
162 attempt to follow it, leading to unpredictable space leaks. Or a
163 <function>seq</function> operation on the polymorphic component may
164 attempt to dereference the pointer, with disastrous results. Even
165 worse, the unboxed value might be larger than a pointer
166 (<literal>Double&num;</literal> for instance).
167 </para>
168 </listitem>
169 <listitem><para> You cannot define a newtype whose representation type
170 (the argument type of the data constructor) is an unboxed type. Thus,
171 this is illegal:
172 <programlisting>
173 newtype A = MkA Int#
174 </programlisting>
175 </para></listitem>
176 <listitem><para> You cannot bind a variable with an unboxed type
177 in a <emphasis>top-level</emphasis> binding.
178 </para></listitem>
179 <listitem><para> You cannot bind a variable with an unboxed type
180 in a <emphasis>recursive</emphasis> binding.
181 </para></listitem>
182 <listitem><para> You may bind unboxed variables in a (non-recursive,
183 non-top-level) pattern binding, but you must make any such pattern-match
184 strict. For example, rather than:
185 <programlisting>
186 data Foo = Foo Int Int#
187
188 f x = let (Foo a b, w) = ..rhs.. in ..body..
189 </programlisting>
190 you must write:
191 <programlisting>
192 data Foo = Foo Int Int#
193
194 f x = let !(Foo a b, w) = ..rhs.. in ..body..
195 </programlisting>
196 since <literal>b</literal> has type <literal>Int#</literal>.
197 </para>
198 </listitem>
199 </itemizedlist>
200 </para>
201
202 </sect2>
203
204 <sect2 id="unboxed-tuples">
205 <title>Unboxed tuples</title>
206
207 <para>
208 Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>;
209 they are a syntactic extension enabled by the language flag <option>-XUnboxedTuples</option>. An
210 unboxed tuple looks like this:
211 </para>
212
213 <para>
214
215 <programlisting>
216 (# e_1, ..., e_n #)
217 </programlisting>
218
219 </para>
220
221 <para>
222 where <literal>e&lowbar;1..e&lowbar;n</literal> are expressions of any
223 type (primitive or non-primitive). The type of an unboxed tuple looks
224 the same.
225 </para>
226
227 <para>
228 Note that when unboxed tuples are enabled,
229 <literal>(#</literal> is a single lexeme, so for example when using
230 operators like <literal>#</literal> and <literal>#-</literal> you need
231 to write <literal>( # )</literal> and <literal>( #- )</literal> rather than
232 <literal>(#)</literal> and <literal>(#-)</literal>.
233 </para>
234
235 <para>
236 Unboxed tuples are used for functions that need to return multiple
237 values, but they avoid the heap allocation normally associated with
238 using fully-fledged tuples. When an unboxed tuple is returned, the
239 components are put directly into registers or on the stack; the
240 unboxed tuple itself does not have a composite representation. Many
241 of the primitive operations listed in <literal>primops.txt.pp</literal> return unboxed
242 tuples.
243 In particular, the <literal>IO</literal> and <literal>ST</literal> monads use unboxed
244 tuples to avoid unnecessary allocation during sequences of operations.
245 </para>
246
247 <para>
248 There are some restrictions on the use of unboxed tuples:
249 <itemizedlist>
250
251 <listitem>
252 <para>
253 Values of unboxed tuple types are subject to the same restrictions as
254 other unboxed types; i.e. they may not be stored in polymorphic data
255 structures or passed to polymorphic functions.
256 </para>
257 </listitem>
258
259 <listitem>
260 <para>
261 The typical use of unboxed tuples is simply to return multiple values,
262 binding those multiple results with a <literal>case</literal> expression, thus:
263 <programlisting>
264 f x y = (# x+1, y-1 #)
265 g x = case f x x of { (# a, b #) -&#62; a + b }
266 </programlisting>
267 You can have an unboxed tuple in a pattern binding, thus
268 <programlisting>
269 f x = let (# p,q #) = h x in ..body..
270 </programlisting>
271 If the types of <literal>p</literal> and <literal>q</literal> are not unboxed,
272 the resulting binding is lazy like any other Haskell pattern binding. The
273 above example desugars like this:
274 <programlisting>
275 f x = let t = case h x of { (# p,q #) -> (p,q) }
276           p = fst t
277           q = snd t
278       in ..body..
279 </programlisting>
280 Indeed, the bindings can even be recursive.
281 </para>
282 </listitem>
283 </itemizedlist>
284
285 </para>
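<para>The typical use described above can be put together as a complete module.
This is only a sketch: the names <literal>quotRemU</literal> and
<literal>sumParts</literal> are hypothetical, and ordinary boxed
<literal>Int</literal> components are used to keep the example short.</para>

<programlisting>
{-# LANGUAGE UnboxedTuples #-}
module QuotRemPair where

-- Return quotient and remainder together without building a boxed pair.
quotRemU :: Int -> Int -> (# Int, Int #)
quotRemU n d = (# n `quot` d, n `rem` d #)

-- Bind both results with a case expression.
sumParts :: Int -> Int -> Int
sumParts n d = case quotRemU n d of
  (# q, r #) -> q + r
</programlisting>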
286
287 </sect2>
288 </sect1>
289
290
291 <!-- ====================== SYNTACTIC EXTENSIONS ======================= -->
292
293 <sect1 id="syntax-extns">
294 <title>Syntactic extensions</title>
295
296 <sect2 id="unicode-syntax">
297 <title>Unicode syntax</title>
298 <para>The language
299 extension <option>-XUnicodeSyntax</option><indexterm><primary><option>-XUnicodeSyntax</option></primary></indexterm>
300 enables Unicode characters to be used to stand for certain ASCII
301 character sequences. The following alternatives are provided:</para>
302
303 <informaltable>
304 <tgroup cols="4" align="left" colsep="1" rowsep="1">
305 <thead>
306 <row>
307 <entry>ASCII</entry>
308 <entry>Unicode alternative</entry>
309 <entry>Code point</entry>
310 <entry>Name</entry>
311 </row>
312 </thead>
313
314 <!--
315 to find the DocBook entities for these characters, find
316 the Unicode code point (e.g. 0x2237), and grep for it in
317 /usr/share/sgml/docbook/xml-dtd-*/ent/* (or equivalent on
318 your system). Some of these Unicode code points don't have
319 equivalent DocBook entities.
320 -->
321
322 <tbody>
323 <row>
324 <entry><literal>::</literal></entry>
325 <entry>::</entry> <!-- no special char, apparently -->
326 <entry>0x2237</entry>
327 <entry>PROPORTION</entry>
328 </row>
329 </tbody>
330 <tbody>
331 <row>
332 <entry><literal>=&gt;</literal></entry>
333 <entry>&rArr;</entry>
334 <entry>0x21D2</entry>
335 <entry>RIGHTWARDS DOUBLE ARROW</entry>
336 </row>
337 </tbody>
338 <tbody>
339 <row>
340 <entry><literal>forall</literal></entry>
341 <entry>&forall;</entry>
342 <entry>0x2200</entry>
343 <entry>FOR ALL</entry>
344 </row>
345 </tbody>
346 <tbody>
347 <row>
348 <entry><literal>-&gt;</literal></entry>
349 <entry>&rarr;</entry>
350 <entry>0x2192</entry>
351 <entry>RIGHTWARDS ARROW</entry>
352 </row>
353 </tbody>
354 <tbody>
355 <row>
356 <entry><literal>&lt;-</literal></entry>
357 <entry>&larr;</entry>
358 <entry>0x2190</entry>
359 <entry>LEFTWARDS ARROW</entry>
360 </row>
361 </tbody>
362
363 <tbody>
364 <row>
365 <entry>-&lt;</entry>
366 <entry>&larrtl;</entry>
367 <entry>0x2919</entry>
368 <entry>LEFTWARDS ARROW-TAIL</entry>
369 </row>
370 </tbody>
371
372 <tbody>
373 <row>
374 <entry>&gt;-</entry>
375 <entry>&rarrtl;</entry>
376 <entry>0x291A</entry>
377 <entry>RIGHTWARDS ARROW-TAIL</entry>
378 </row>
379 </tbody>
380
381 <tbody>
382 <row>
383 <entry>-&lt;&lt;</entry>
384 <entry></entry>
385 <entry>0x291B</entry>
386 <entry>LEFTWARDS DOUBLE ARROW-TAIL</entry>
387 </row>
388 </tbody>
389
390 <tbody>
391 <row>
392 <entry>&gt;&gt;-</entry>
393 <entry></entry>
394 <entry>0x291C</entry>
395 <entry>RIGHTWARDS DOUBLE ARROW-TAIL</entry>
396 </row>
397 </tbody>
398
399 <tbody>
400 <row>
401 <entry>*</entry>
402 <entry>&starf;</entry>
403 <entry>0x2605</entry>
404 <entry>BLACK STAR</entry>
405 </row>
406 </tbody>
407
408 </tgroup>
409 </informaltable>
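<para>For illustration, a sketch of a module written with some of these
alternatives. The names are hypothetical, and the Haskell source file is assumed
to be UTF-8 encoded.</para>

<programlisting>
{-# LANGUAGE UnicodeSyntax #-}
module UnicodeDemo where

-- Same as: twice :: (a -> a) -> a -> a
twice &#x2237; (a &rarr; a) &rarr; a &rarr; a
twice f = f . f

-- In do-notation, the leftwards arrow stands for the ASCII "&lt;-".
pairUp &#x2237; Maybe a &rarr; Maybe b &rarr; Maybe (a, b)
pairUp mx my = do
  x &larr; mx
  y &larr; my
  return (x, y)
</programlisting>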
410 </sect2>
411
412 <sect2 id="magic-hash">
413 <title>The magic hash</title>
414 <para>The language extension <option>-XMagicHash</option> allows "&num;" as a
415 postfix modifier to identifiers. Thus, "x&num;" is a valid variable, and "T&num;" is
416 a valid type constructor or data constructor.</para>
417
418 <para>The hash sign does not change semantics at all. We tend to use variable
419 names ending in "&num;" for unboxed values or types (e.g. <literal>Int&num;</literal>),
420 but there is no requirement to do so; they are just plain ordinary variables.
421 Nor does the <option>-XMagicHash</option> extension bring anything into scope.
422 For example, to bring <literal>Int&num;</literal> into scope you must
423 import <literal>GHC.Prim</literal> (see <xref linkend="primitives"/>);
424 the <option>-XMagicHash</option> extension
425 then allows you to <emphasis>refer</emphasis> to the <literal>Int&num;</literal>
426 that is now in scope. Note that with this option, the meaning of <literal>x&num;y = 0</literal>
427 is changed: it defines a function <literal>x&num;</literal> taking a single argument <literal>y</literal>;
428 to define the operator <literal>&num;</literal>, put a space: <literal>x &num; y = 0</literal>.
429
430 </para>
431 <para> The <option>-XMagicHash</option> extension also enables some new forms of literals (see <xref linkend="glasgow-unboxed"/>):
432 <itemizedlist>
433 <listitem><para> <literal>'x'&num;</literal> has type <literal>Char&num;</literal></para> </listitem>
434 <listitem><para> <literal>&quot;foo&quot;&num;</literal> has type <literal>Addr&num;</literal></para> </listitem>
435 <listitem><para> <literal>3&num;</literal> has type <literal>Int&num;</literal>. In general,
436 any Haskell integer lexeme followed by a <literal>&num;</literal> is an <literal>Int&num;</literal> literal, e.g.
437 <literal>-0x3A&num;</literal> as well as <literal>32&num;</literal>.</para></listitem>
438 <listitem><para> <literal>3&num;&num;</literal> has type <literal>Word&num;</literal>. In general,
439 any non-negative Haskell integer lexeme followed by <literal>&num;&num;</literal>
440 is a <literal>Word&num;</literal>. </para> </listitem>
441 <listitem><para> <literal>3.2&num;</literal> has type <literal>Float&num;</literal>.</para> </listitem>
442 <listitem><para> <literal>3.2&num;&num;</literal> has type <literal>Double&num;</literal></para> </listitem>
443 </itemizedlist>
444 </para>
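<para>A sketch showing several of these literal forms in use. Each unboxed
literal is wrapped in the matching boxing constructor, here assumed to be
imported from <literal>GHC.Exts</literal>, so that the result can be bound at
top level. All binding names are hypothetical.</para>

<programlisting>
{-# LANGUAGE MagicHash #-}
module HashLiterals where

import GHC.Exts (Char (C#), Double (D#), Float (F#), Int (I#), Word (W#))

boxedInt :: Int
boxedInt = I# 32#          -- 32# has type Int#

boxedWord :: Word
boxedWord = W# 3##         -- 3## has type Word#

boxedChar :: Char
boxedChar = C# 'x'#        -- 'x'# has type Char#

boxedFloat :: Float
boxedFloat = F# 3.2#       -- 3.2# has type Float#

boxedDouble :: Double
boxedDouble = D# 3.2##     -- 3.2## has type Double#
</programlisting>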
445 </sect2>
446
447 <sect2 id="negative-literals">
448 <title>Negative literals</title>
449 <para>
450 The literal <literal>-123</literal> is, according to
451 Haskell98 and Haskell 2010, desugared as
452 <literal>negate (fromInteger 123)</literal>.
453 The language extension <option>-XNegativeLiterals</option>
454 means that it is instead desugared as
455 <literal>fromInteger (-123)</literal>.
456 </para>
457
458 <para>
459 This can make a difference when the positive and negative range of
460 a numeric data type don't match up. For example,
461 in 8-bit arithmetic -128 is representable, but +128 is not.
462 So <literal>negate (fromInteger 128)</literal> will elicit an
463 unexpected integer-literal-overflow message.
464 </para>
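<para>A sketch of the 8-bit case, assuming <literal>Int8</literal> from
<literal>Data.Int</literal> as the 8-bit type; the names are hypothetical. With
the extension on, the literal reaches <literal>fromInteger</literal> already
negated, so no out-of-range intermediate value is involved.</para>

<programlisting>
{-# LANGUAGE NegativeLiterals #-}
module Lowest where

import Data.Int (Int8)

-- Desugars to fromInteger (-128), which fits in Int8.
lowest :: Int8
lowest = -128
</programlisting>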
465 </sect2>
466
467 <sect2 id="num-decimals">
468 <title>Fractional looking integer literals</title>
469 <para>
470 Haskell 2010 and Haskell 98 define floating literals with
471 the syntax <literal>1.2e6</literal>. These literals have the
472 type <literal>Fractional a => a</literal>.
473 </para>
474
475 <para>
476 The language extension <option>-XNumDecimals</option> also allows
477 the floating literal syntax to be used at instances of
478 <literal>Integral</literal>, giving values like
479 <literal>(1.2e6 :: Num a => a)</literal>.
480 </para>
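<para>For example, the following sketch (hypothetical names) uses exponent
notation for an <literal>Int</literal> constant. It relies on the literal
denoting a whole number.</para>

<programlisting>
{-# LANGUAGE NumDecimals #-}
module Budget where

-- 1.2e6 denotes the whole number 1200000, so it can be used at Int.
budget :: Int
budget = 1.2e6
</programlisting>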
481 </sect2>
482
483 <sect2 id="binary-literals">
484 <title>Binary integer literals</title>
485 <para>
486 Haskell 2010 and Haskell 98 allows for integer literals to
487 be given in decimal, octal (prefixed by
488 <literal>0o</literal> or <literal>0O</literal>), or
489 hexadecimal notation (prefixed by <literal>0x</literal> or
490 <literal>0X</literal>).
491 </para>
492
493 <para>
494 The language extension <option>-XBinaryLiterals</option>
495 adds support for expressing integer literals in binary
496 notation with the prefix <literal>0b</literal> or
497 <literal>0B</literal>. For instance, the binary integer
498 literal <literal>0b11001001</literal> will be desugared into
499 <literal>fromInteger 201</literal> when
500 <option>-XBinaryLiterals</option> is enabled.
501 </para>
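<para>A short sketch (hypothetical names) that writes the same value in binary,
decimal and hexadecimal notation:</para>

<programlisting>
{-# LANGUAGE BinaryLiterals #-}
module Mask where

mask :: Int
mask = 0b11001001

-- The binary literal denotes the same number as 201 and 0xC9.
sameValue :: Bool
sameValue = mask == 201 &amp;&amp; mask == 0xC9
</programlisting>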
502 </sect2>
503
504 <!-- ====================== HIERARCHICAL MODULES ======================= -->
505
506
507 <sect2 id="hierarchical-modules">
508 <title>Hierarchical Modules</title>
509
510 <para>GHC supports a small extension to the syntax of module
511 names: a module name is allowed to contain a dot
512 <literal>&lsquo;.&rsquo;</literal>. This is also known as the
513 &ldquo;hierarchical module namespace&rdquo; extension, because
514 it extends the normally flat Haskell module namespace into a
515 more flexible hierarchy of modules.</para>
516
517 <para>This extension has very little impact on the language
518 itself; modules names are <emphasis>always</emphasis> fully
519 qualified, so you can just think of the fully qualified module
520 name as <quote>the module name</quote>. In particular, this
521 means that the full module name must be given after the
522 <literal>module</literal> keyword at the beginning of the
523 module; for example, the module <literal>A.B.C</literal> must
524 begin</para>
525
526 <programlisting>module A.B.C</programlisting>
527
528
529 <para>It is a common strategy to use the <literal>as</literal>
530 keyword to save some typing when using qualified names with
531 hierarchical modules. For example:</para>
532
533 <programlisting>
534 import qualified Control.Monad.ST.Strict as ST
535 </programlisting>
536
537 <para>For details on how GHC searches for source and interface
538 files in the presence of hierarchical modules, see <xref
539 linkend="search-path"/>.</para>
540
541 <para>GHC comes with a large collection of libraries arranged
542 hierarchically; see the accompanying <ulink
543 url="../libraries/index.html">library
544 documentation</ulink>. More libraries to install are available
545 from <ulink
546 url="http://hackage.haskell.org/packages/hackage.html">HackageDB</ulink>.</para>
547 </sect2>
548
549 <!-- ====================== PATTERN GUARDS ======================= -->
550
551 <sect2 id="pattern-guards">
552 <title>Pattern guards</title>
553
554 <para>
555 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
556 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ulink url="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ulink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
557 </para>
558
559 <para>
560 Suppose we have an abstract data type of finite maps, with a
561 lookup operation:
562
563 <programlisting>
564 lookup :: FiniteMap -> Int -> Maybe Int
565 </programlisting>
566
567 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
568 where <varname>v</varname> is the value that the key maps to. Now consider the following definition:
569 </para>
570
571 <programlisting>
572 clunky env var1 var2 | ok1 &amp;&amp; ok2 = val1 + val2
573                      | otherwise  = var1 + var2
574   where
575     m1 = lookup env var1
576     m2 = lookup env var2
577     ok1 = maybeToBool m1
578     ok2 = maybeToBool m2
579     val1 = expectJust m1
580     val2 = expectJust m2
581 </programlisting>
582
583 <para>
584 The auxiliary functions are
585 </para>
586
587 <programlisting>
588 maybeToBool :: Maybe a -&gt; Bool
589 maybeToBool (Just x) = True
590 maybeToBool Nothing = False
591
592 expectJust :: Maybe a -&gt; a
593 expectJust (Just x) = x
594 expectJust Nothing = error "Unexpected Nothing"
595 </programlisting>
596
597 <para>
598 What is <function>clunky</function> doing? The guard <literal>ok1 &amp;&amp;
599 ok2</literal> checks that both lookups succeed, using
600 <function>maybeToBool</function> to convert the <function>Maybe</function>
601 types to booleans. The (lazily evaluated) <function>expectJust</function>
602 calls extract the values from the results of the lookups, and bind the
603 returned values to <varname>val1</varname> and <varname>val2</varname>
604 respectively. If either lookup fails, then clunky takes the
605 <literal>otherwise</literal> case and returns the sum of its arguments.
606 </para>
607
608 <para>
609 This is certainly legal Haskell, but it is a tremendously verbose and
610 un-obvious way to achieve the desired effect. Arguably, a more direct way
611 to write clunky would be to use case expressions:
612 </para>
613
614 <programlisting>
615 clunky env var1 var2 = case lookup env var1 of
616     Nothing -&gt; fail
617     Just val1 -&gt; case lookup env var2 of
618         Nothing -&gt; fail
619         Just val2 -&gt; val1 + val2
620   where
621     fail = var1 + var2
622 </programlisting>
623
624 <para>
625 This is a bit shorter, but hardly better. Of course, we can rewrite any set
626 of pattern-matching, guarded equations as case expressions; that is
627 precisely what the compiler does when compiling equations! The reason that
628 Haskell provides guarded equations is that they allow us to write down
629 the cases we want to consider, one at a time, independently of each other.
630 This structure is hidden in the case version. Two of the right-hand sides
631 are really the same (<function>fail</function>), and the whole expression
632 tends to become more and more indented.
633 </para>
634
635 <para>
636 Here is how I would write clunky:
637 </para>
638
639 <programlisting>
640 clunky env var1 var2
641   | Just val1 &lt;- lookup env var1
642   , Just val2 &lt;- lookup env var2
643   = val1 + val2
644 ...other equations for clunky...
645 </programlisting>
646
647 <para>
648 The semantics should be clear enough. The qualifiers are matched in order.
649 For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
650 right hand side is evaluated and matched against the pattern on the left.
651 If the match fails then the whole guard fails and the next equation is
652 tried. If it succeeds, then the appropriate binding takes place, and the
653 next qualifier is matched, in the augmented environment. Unlike list
654 comprehensions, however, the type of the expression to the right of the
655 <literal>&lt;-</literal> is the same as the type of the pattern to its
656 left. The bindings introduced by pattern guards scope over all the
657 remaining guard qualifiers, and over the right hand side of the equation.
658 </para>
659
660 <para>
661 Just as with list comprehensions, boolean expressions can be freely mixed
662 among the pattern guards. For example:
663 </para>
664
665 <programlisting>
666 f x | [y] &lt;- x
667     , y > 3
668     , Just z &lt;- h y
669     = ...
670 </programlisting>
671
672 <para>
673 Haskell's current guards therefore emerge as a special case, in which the
674 qualifier list has just one element, a boolean expression.
675 </para>
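<para>For reference, here is a self-contained version of
<function>clunky</function> that compiles as written. It is a sketch under two
assumptions: an association list with the Prelude's <function>lookup</function>
stands in for the abstract <literal>FiniteMap</literal> (note that the Prelude
function takes the key first), and an <literal>otherwise</literal> guard stands
in for the elided equations.</para>

<programlisting>
module Clunky where

-- Assumed stand-in for the abstract finite map type.
type Env = [(Int, Int)]

clunky :: Env -> Int -> Int -> Int
clunky env var1 var2
  | Just val1 &lt;- lookup var1 env
  , Just val2 &lt;- lookup var2 env
  = val1 + val2
  | otherwise = var1 + var2
</programlisting>

<para>With this definition, <literal>clunky [(1,10),(2,20)] 1 2</literal> takes
the first branch and sums the two looked-up values, whereas
<literal>clunky [(1,10)] 1 2</literal> falls through to the
<literal>otherwise</literal> branch and sums the keys.</para>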
676 </sect2>
677
678 <!-- ===================== View patterns =================== -->
679
680 <sect2 id="view-patterns">
681 <title>View patterns
682 </title>
683
684 <para>
685 View patterns are enabled by the flag <literal>-XViewPatterns</literal>.
686 More information and examples of view patterns can be found on the
687 <ulink url="http://ghc.haskell.org/trac/ghc/wiki/ViewPatterns">Wiki
688 page</ulink>.
689 </para>
690
691 <para>
692 View patterns are somewhat like pattern guards that can be nested inside
693 of other patterns. They are a convenient way of pattern-matching
694 against values of abstract types. For example, in a programming language
695 implementation, we might represent the syntax of the types of the
696 language as follows:
697
698 <programlisting>
699 type Typ
700
701 data TypView = Unit
702              | Arrow Typ Typ
703
704 view :: Typ -> TypView
705
706 -- additional operations for constructing Typ's ...
707 </programlisting>
708
709 The representation of Typ is held abstract, permitting implementations
710 to use a fancy representation (e.g., hash-consing to manage sharing).
711
712 Without view patterns, using this signature is a little inconvenient:
713 <programlisting>
714 size :: Typ -> Integer
715 size t = case view t of
716   Unit -> 1
717   Arrow t1 t2 -> size t1 + size t2
718 </programlisting>
719
720 It is necessary to iterate the case, rather than using an equational
721 function definition. And the situation is even worse when the matching
722 against <literal>t</literal> is buried deep inside another pattern.
723 </para>
724
725 <para>
726 View patterns permit calling the view function inside the pattern and
727 matching against the result:
728 <programlisting>
729 size (view -> Unit) = 1
730 size (view -> Arrow t1 t2) = size t1 + size t2
731 </programlisting>
732
733 That is, we add a new form of pattern, written
734 <replaceable>expression</replaceable> <literal>-></literal>
735 <replaceable>pattern</replaceable> that means "apply the expression to
736 whatever we're trying to match against, and then match the result of
737 that application against the pattern". The expression can be any Haskell
738 expression of function type, and view patterns can be used wherever
739 patterns are used.
740 </para>
741
742 <para>
743 The semantics of a pattern <literal>(</literal>
744 <replaceable>exp</replaceable> <literal>-></literal>
745 <replaceable>pat</replaceable> <literal>)</literal> are as follows:
746
747 <itemizedlist>
748
749 <listitem> Scoping:
750
751 <para>The variables bound by the view pattern are the variables bound by
752 <replaceable>pat</replaceable>.
753 </para>
754
755 <para>
756 Any variables in <replaceable>exp</replaceable> are bound occurrences,
757 but variables bound "to the left" in a pattern are in scope. This
758 feature permits, for example, one argument to a function to be used in
759 the view of another argument. For example, the function
760 <literal>clunky</literal> from <xref linkend="pattern-guards" /> can be
761 written using view patterns as follows:
762
763 <programlisting>
764 clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2
765 ...other equations for clunky...
766 </programlisting>
767 </para>
768
769 <para>
770 More precisely, the scoping rules are:
771 <itemizedlist>
772 <listitem>
773 <para>
774 In a single pattern, variables bound by patterns to the left of a view
775 pattern expression are in scope. For example:
776 <programlisting>
777 example :: Maybe ((String -> Integer,Integer), String) -> Bool
778 example (Just ((f,_), f -> 4)) = True
779 </programlisting>
780
781 Additionally, in function definitions, variables bound by matching earlier curried
782 arguments may be used in view pattern expressions in later arguments:
783 <programlisting>
784 example :: (String -> Integer) -> String -> Bool
785 example f (f -> 4) = True
786 </programlisting>
787 That is, the scoping is the same as it would be if the curried arguments
788 were collected into a tuple.
789 </para>
790 </listitem>
791
792 <listitem>
793 <para>
794 In mutually recursive bindings, such as <literal>let</literal>,
795 <literal>where</literal>, or the top level, view patterns in one
796 declaration may not mention variables bound by other declarations. That
797 is, each declaration must be self-contained. For example, the following
798 program is not allowed:
799 <programlisting>
800 let {(x -> y) = e1 ;
801 (y -> x) = e2 } in x
802 </programlisting>
803
804 (For some amplification on this design choice see
805 <ulink url="http://ghc.haskell.org/trac/ghc/ticket/4061">Trac #4061</ulink>.)
806
807 </para>
808 </listitem>
809 </itemizedlist>
810
811 </para>
812 </listitem>
813
814 <listitem><para> Typing: If <replaceable>exp</replaceable> has type
815 <replaceable>T1</replaceable> <literal>-></literal>
816 <replaceable>T2</replaceable> and <replaceable>pat</replaceable> matches
817 a <replaceable>T2</replaceable>, then the whole view pattern matches a
818 <replaceable>T1</replaceable>.
819 </para></listitem>
820
821 <listitem><para> Matching: To the equations in Section 3.17.3 of the
822 <ulink url="http://www.haskell.org/onlinereport/">Haskell 98
823 Report</ulink>, add the following:
824 <programlisting>
825 case v of { (e -> p) -> e1 ; _ -> e2 }
826 =
827 case (e v) of { p -> e1 ; _ -> e2 }
828 </programlisting>
829 That is, to match a variable <replaceable>v</replaceable> against a pattern
830 <literal>(</literal> <replaceable>exp</replaceable>
831 <literal>-></literal> <replaceable>pat</replaceable>
832 <literal>)</literal>, evaluate <literal>(</literal>
833 <replaceable>exp</replaceable> <replaceable> v</replaceable>
834 <literal>)</literal> and match the result against
835 <replaceable>pat</replaceable>.
836 </para></listitem>
837
838 <listitem><para> Efficiency: When the same view function is applied in
839 multiple branches of a function definition or a case expression (e.g.,
840 in <literal>size</literal> above), GHC makes an attempt to collect these
841 applications into a single nested case expression, so that the view
842 function is only applied once. Pattern compilation in GHC follows the
843 matrix algorithm described in Chapter 4 of <ulink
844 url="http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/">The
845 Implementation of Functional Programming Languages</ulink>. When the
846 top rows of the first column of a matrix are all view patterns with the
847 "same" expression, these patterns are transformed into a single nested
848 case. This includes, for example, adjacent view patterns that line up
849 in a tuple, as in
850 <programlisting>
851 f ((view -> A, p1), p2) = e1
852 f ((view -> B, p3), p4) = e2
853 </programlisting>
854 </para>
855
856 <para> The current notion of when two view pattern expressions are "the
857 same" is very restricted: it is not even full syntactic equality.
858 However, it does include variables, literals, applications, and tuples;
859 e.g., two instances of <literal>view ("hi", "there")</literal> will be
860 collected. However, the current implementation does not compare up to
861 alpha-equivalence, so two instances of <literal>(x, view x ->
862 y)</literal> will not be coalesced.
863 </para>
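<para> As a small illustration of the situation described above, the
following sketch uses the same view expression in two adjacent equations,
so it is a candidate for being collected into a single application. The
names <literal>lower</literal> and <literal>classify</literal> are
hypothetical and are not part of any library:
<programlisting>
{-# LANGUAGE ViewPatterns #-}
module Main where

import Data.Char (toLower)

-- The view function shared by the first two equations.
lower :: String -> String
lower = map toLower

classify :: String -> String
classify (lower -> "yes") = "affirmative"
classify (lower -> "no")  = "negative"
classify _                = "unknown"

main :: IO ()
main = mapM_ (putStrLn . classify) ["YES", "No", "maybe"]
</programlisting>
Whether the two applications of <literal>lower</literal> are actually
shared is an optimisation decision; the result of the program does not
depend on it.
</para>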
864
865 </listitem>
866
867 </itemizedlist>
868 </para>
869
870 </sect2>
871
872 <!-- ===================== Pattern synonyms =================== -->
873
874 <sect2 id="pattern-synonyms">
875 <title>Pattern synonyms
876 </title>
877
878 <para>
879 Pattern synonyms are enabled by the flag
880 <literal>-XPatternSynonyms</literal>, which is required for both
881 defining them <emphasis>and</emphasis> using them. More information
882 and examples of pattern synonyms can be found on the <ulink
883 url="http://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms">Wiki
884 page</ulink>.
885 </para>
886
887 <para>
888 Pattern synonyms enable giving names to parametrized pattern
889 schemes. They can also be thought of as abstract constructors that
890 don't have a bearing on data representation. For example, in a
891 programming language implementation, we might represent types of the
892 language as follows:
893 </para>
894
895 <programlisting>
896 data Type = App String [Type]
897 </programlisting>
898
899 <para>
900 To see this representation in use,
901 consider a few types of the <literal>Type</literal> universe encoded
902 like this:
903 </para>
904
905 <programlisting>
906 App "->" [t1, t2] -- t1 -> t2
907 App "Int" [] -- Int
908 App "Maybe" [App "Int" []] -- Maybe Int
909 </programlisting>
910
911 <para>
912 This representation is very generic in that no types are given special
913 treatment. However, some functions might need to handle some known
914 types specially, for example the following two functions collect all
915 argument types of (nested) arrow types, and recognize the
916 <literal>Int</literal> type, respectively:
917 </para>
918
919 <programlisting>
920 collectArgs :: Type -> [Type]
921 collectArgs (App "->" [t1, t2]) = t1 : collectArgs t2
922 collectArgs _ = []
923
924 isInt :: Type -> Bool
925 isInt (App "Int" []) = True
926 isInt _ = False
927 </programlisting>
928
929 <para>
930 Matching on <literal>App</literal> directly is both hard to read and
931 error prone to write. And the situation is even worse when the
932 matching is nested:
933 </para>
934
935 <programlisting>
936 isIntEndo :: Type -> Bool
937 isIntEndo (App "->" [App "Int" [], App "Int" []]) = True
938 isIntEndo _ = False
939 </programlisting>
940
941 <para>
942 Pattern synonyms permit abstracting from the representation to expose
943 matchers that behave in a constructor-like manner with respect to
944 pattern matching. We can create pattern synonyms for the known types
945 we care about, without committing the representation to them (note
946 that these don't have to be defined in the same module as the
947 <literal>Type</literal> type):
948 </para>
949
950 <programlisting>
951 pattern Arrow t1 t2 = App "->" [t1, t2]
952 pattern Int = App "Int" []
953 pattern Maybe t = App "Maybe" [t]
954 </programlisting>
955
956 <para>
957 These synonyms let us rewrite our functions in a much cleaner style:
958 </para>
959
960 <programlisting>
961 collectArgs :: Type -> [Type]
962 collectArgs (Arrow t1 t2) = t1 : collectArgs t2
963 collectArgs _ = []
964
965 isInt :: Type -> Bool
966 isInt Int = True
967 isInt _ = False
968
969 isIntEndo :: Type -> Bool
970 isIntEndo (Arrow Int Int) = True
971 isIntEndo _ = False
972 </programlisting>
973
974 <para>
975 Note that in this example, the pattern synonyms
976 <literal>Int</literal> and <literal>Arrow</literal> can also be used
977 as expressions (they are <emphasis>bidirectional</emphasis>). This
978 is not necessarily the case: <emphasis>unidirectional</emphasis>
979 pattern synonyms can also be declared with the following syntax:
980 </para>
981
982 <programlisting>
983 pattern Head x &lt;- x:xs
984 </programlisting>
985
986 <para>
987 In this case, <literal>Head</literal> <replaceable>x</replaceable>
988 cannot be used in expressions, only patterns, since it wouldn't
989 specify a value for the <replaceable>xs</replaceable> on the
990 right-hand side.
991 </para>
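
<para>
The declarations above can be put together into one small module. The
following is only a sketch: <literal>intEndo</literal>,
<literal>firstOr</literal> and the derived instances are hypothetical
additions made for the sake of the demonstration.
</para>

<programlisting>
{-# LANGUAGE PatternSynonyms #-}
module Main where

data Type = App String [Type]
  deriving (Eq, Show)   -- instances added only for this demonstration

-- Bidirectional synonyms: usable in patterns and in expressions.
pattern Arrow t1 t2 = App "->" [t1, t2]
pattern Int         = App "Int" []

-- Unidirectional synonym: usable in patterns only.
pattern Head x &lt;- x:xs

-- Expression use: builds App "->" [App "Int" [], App "Int" []].
intEndo :: Type
intEndo = Arrow Int Int

-- Pattern use of the unidirectional synonym.
firstOr :: a -> [a] -> a
firstOr _   (Head x) = x
firstOr def _        = def

main :: IO ()
main = do
  print (intEndo == App "->" [App "Int" [], App "Int" []])  -- True
  print (firstOr 0 [7, 8, 9 :: Integer])                    -- 7
  print (firstOr 0 ([] :: [Integer]))                       -- 0
</programlisting>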
992
993 <para>
994 The syntax and semantics of pattern synonyms are elaborated in the
995 following subsections.
996 See the <ulink
997 url="http://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms">Wiki
998 page</ulink> for more details.
999 </para>
1000
1001 <sect3> <title>Syntax and scoping of pattern synonyms</title>
1002 <para>
1003 A pattern synonym declaration can be either unidirectional or
1004 bidirectional. The syntax for unidirectional pattern synonyms is:
1005 <programlisting>
1006 pattern Name args &lt;- pat
1007 </programlisting>
1008 and the syntax for bidirectional pattern synonyms is:
1009 <programlisting>
1010 pattern Name args = pat
1011 </programlisting>
1012 Either prefix or infix syntax can be
1013 used.
1014 </para>
1015 <para>
1016 Pattern synonym declarations can only occur in the top level of a
1017 module. In particular, they are not allowed as local
1018 definitions. Currently, they also don't work in GHCi, but that is a
1019 technical restriction that will be lifted in later versions.
1020 </para>
1021 <para>
1022 The variables in the left-hand side of the definition are bound by
1023 the pattern on the right-hand side. For bidirectional pattern
1024 synonyms, all the variables of the right-hand side must also occur
1025 on the left-hand side; also, wildcard patterns and view patterns are
1026 not allowed. For unidirectional pattern synonyms, there is no
1027 restriction on the right-hand side pattern.
1028 </para>
1029
1030 <para>
1031 Pattern synonyms cannot be defined recursively.
1032 </para>
1033 </sect3>
1034
1035 <sect3 id="patsyn-impexp"> <title>Import and export of pattern synonyms</title>
1036
1037 <para>
1038 The name of the pattern synonym itself is in the same namespace as
1039 proper data constructors. In an export or import specification,
1040 you must prefix pattern
1041 names with the <literal>pattern</literal> keyword, e.g.:
1042 <programlisting>
1043 module Example (pattern Single) where
1044 pattern Single x = [x]
1045 </programlisting>
1046 Without the <literal>pattern</literal> prefix, <literal>Single</literal> would
1047 be interpreted as a type constructor in the export list.
1048 </para>
1049 <para>
1050 You may also use the <literal>pattern</literal> keyword in an import/export
1051 specification to import or export an ordinary data constructor. For example:
1052 <programlisting>
1053 import Data.Maybe( pattern Just )
1054 </programlisting>
1055 would bring into scope the data constructor <literal>Just</literal> from the
1056 <literal>Maybe</literal> type, without also bringing the type constructor
1057 <literal>Maybe</literal> into scope.
1058 </para>
1059 </sect3>
1060
1061 <sect3> <title>Typing of pattern synonyms</title>
1062
1063 <para>
1064 Given a pattern synonym definition of the form
1065 </para>
1066 <programlisting>
1067 pattern P var1 var2 ... varN &lt;- pat
1068 </programlisting>
1069 <para>
1070 it is assigned a <emphasis>pattern type</emphasis> of the form
1071 </para>
1072 <programlisting>
1073 pattern CProv => P t1 t2 ... tN :: CReq => t
1074 </programlisting>
1075 <para>
1076 where <replaceable>CProv</replaceable> and
1077 <replaceable>CReq</replaceable> are type contexts, and
1078 <replaceable>t1</replaceable>, <replaceable>t2</replaceable>, ...,
1079 <replaceable>tN</replaceable> and <replaceable>t</replaceable> are
1080 types.
1081 </para>
1082
1083 <para>
1084 A pattern synonym of this type can be used in a pattern if the
1085 instantiated (monomorphic) type satisfies the constraints of
1086 <replaceable>CReq</replaceable>. In this case, it extends the context
1087 available in the right-hand side of the match with
1088 <replaceable>CProv</replaceable>, just like how an existentially-typed
1089 data constructor can extend the context.
1090 </para>
1091
1092 <para>
1093 For example, in the following program:
1094 </para>
1095 <programlisting>
1096 {-# LANGUAGE PatternSynonyms, GADTs #-}
1097 module ShouldCompile where
1098
1099 data T a where
1100   MkT :: (Show b) => a -> b -> T a
1101
1102 pattern ExNumPat x = MkT 42 x
1103 </programlisting>
1104
1105 <para>
1106 the pattern type of <literal>ExNumPat</literal> is
1107 </para>
1108
1109 <programlisting>
1110 pattern (Show b) => ExNumPat b :: (Num a, Eq a) => T a
1111 </programlisting>
1112
1113 <para>
1114 and so can be used in a function definition like the following:
1115 </para>
1116
1117 <programlisting>
1118 f :: (Num t, Eq t) => T t -> String
1119 f (ExNumPat x) = show x
1120 </programlisting>
1121
1122 <para>
1123 For bidirectional pattern synonyms, uses as expressions have the type
1124 </para>
1125 <programlisting>
1126 (CProv, CReq) => t1 -> t2 -> ... -> tN -> t
1127 </programlisting>
1128
1129 <para>
1130 So in the previous example, <literal>ExNumPat</literal>,
1131 when used in an expression, has type
1132 </para>
1133 <programlisting>
1134 ExNumPat :: (Show b, Num a, Eq a) => b -> T a
1135 </programlisting>
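
<para>
To see the pattern use and the expression use side by side, here is a
minimal sketch built on the definitions above.
<literal>describe</literal> and <literal>main</literal> are hypothetical
names introduced only for illustration.
</para>
<programlisting>
{-# LANGUAGE PatternSynonyms, GADTs #-}
module Main where

data T a where
  MkT :: (Show b) => a -> b -> T a

pattern ExNumPat x = MkT 42 x

-- Pattern use: the required context (Num t, Eq t) is needed to compare
-- the first field with 42; the provided context (Show b) lets us show x.
describe :: (Num t, Eq t) => T t -> String
describe (ExNumPat x) = show x
describe _            = "first field is not 42"

main :: IO ()
main = do
  -- Expression use: ExNumPat "hi" constructs MkT 42 "hi".
  putStrLn (describe (ExNumPat "hi" :: T Int))
  -- The first field is 7, so the synonym does not match here.
  putStrLn (describe (MkT 7 'c' :: T Int))
</programlisting>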
1136 </sect3>
1137
1138 <sect3><title>Matching of pattern synonyms</title>
1139
1140 <para>
1141 A pattern synonym occurrence in a pattern is evaluated by first
1142 matching against the pattern synonym itself, and then on the argument
1143 patterns. For example, in the following program, <literal>f</literal>
1144 and <literal>f'</literal> are equivalent:
1145 </para>
1146
1147 <programlisting>
1148 pattern Pair x y &lt;- [x, y]
1149
1150 f (Pair True True) = True
1151 f _ = False
1152
1153 f' [x, y] | True &lt;- x, True &lt;- y = True
1154 f' _ = False
1155 </programlisting>
1156
1157 <para>
1158 Note that the strictness of <literal>f</literal> differs from that
1159 of <literal>g</literal> defined below:
1160 <programlisting>
1161 g [True, True] = True
1162 g _ = False
1163
1164 *Main> f (False:undefined)
1165 *** Exception: Prelude.undefined
1166 *Main> g (False:undefined)
1167 False
1168 </programlisting>
1169 </para>
1170 </sect3>
1171
1172 </sect2>
1173
1174 <!-- ===================== n+k patterns =================== -->
1175
1176 <sect2 id="n-k-patterns">
1177 <title>n+k patterns</title>
1178 <indexterm><primary><option>-XNPlusKPatterns</option></primary></indexterm>
1179
1180 <para>
1181 <literal>n+k</literal> pattern support is disabled by default. To enable
1182 it, you can use the <option>-XNPlusKPatterns</option> flag.
1183 </para>
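
<para>
For illustration, the following sketch defines factorial with an
<literal>n+k</literal> pattern (<literal>fact</literal> is a
hypothetical name). The pattern <literal>(n+1)</literal> matches only
arguments that are at least 1, binding <literal>n</literal> to the
argument minus one.
</para>
<programlisting>
{-# LANGUAGE NPlusKPatterns #-}
module Main where

fact :: Integer -> Integer
fact 0     = 1
fact (n+1) = (n+1) * fact n   -- n is bound to the argument minus 1

main :: IO ()
main = print (fact 5)   -- 120
</programlisting>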
1184
1185 </sect2>
1186
1187 <!-- ===================== Traditional record syntax =================== -->
1188
1189 <sect2 id="traditional-record-syntax">
1190 <title>Traditional record syntax</title>
1191 <indexterm><primary><option>-XNoTraditionalRecordSyntax</option></primary></indexterm>
1192
1193 <para>
1194 Traditional record syntax, such as <literal>C {f = x}</literal>, is enabled by default.
1195 To disable it, you can use the <option>-XNoTraditionalRecordSyntax</option> flag.
1196 </para>
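
<para>
As a reminder of the forms involved, this sketch uses traditional record
syntax for construction, update and pattern matching. The type
<literal>Point</literal>, its fields and the functions are hypothetical.
</para>
<programlisting>
module Main where

data Point = Point { px :: Int, py :: Int }
  deriving Show

origin :: Point
origin = Point { px = 0, py = 0 }   -- record construction

moved :: Point
moved = origin { px = 3 }           -- record update

xOf :: Point -> Int
xOf (Point { px = x }) = x          -- record pattern

main :: IO ()
main = print (moved, xOf moved)
</programlisting>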
1197
1198 </sect2>
1199
1200 <!-- ===================== Recursive do-notation =================== -->
1201
1202 <sect2 id="recursive-do-notation">
1203 <title>The recursive do-notation
1204 </title>
1205
1206 <para>
1207 The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>,
1208 that is, the variables bound in a do-expression are visible only in the textually following
1209 code block. Compare this to a let-expression, where bound variables are visible in the entire binding
1210 group.
1211 </para>
1212
1213 <para>
1214 It turns out that such recursive bindings do indeed make sense for a variety of monads, but
1215 not all. In particular, recursion in this sense requires a fixed-point operator for the underlying
1216 monad, captured by the <literal>mfix</literal> method of the <literal>MonadFix</literal> class, defined in <literal>Control.Monad.Fix</literal> as follows:
1217 <programlisting>
1218 class Monad m => MonadFix m where
1219   mfix :: (a -> m a) -> m a
1220 </programlisting>
1221 Haskell's
1222 <literal>Maybe</literal>, <literal>[]</literal> (list), <literal>ST</literal> (both strict and lazy versions),
1223 <literal>IO</literal>, and many other monads have <literal>MonadFix</literal> instances. On the negative
1224 side, the continuation monad, with the signature <literal>(a -> r) -> r</literal>, does not.
1225 </para>
1226
1227 <para>
1228 For monads that do belong to the <literal>MonadFix</literal> class, GHC provides
1229 an extended version of the do-notation that allows recursive bindings.
1230 The <option>-XRecursiveDo</option> flag (language pragma: <literal>RecursiveDo</literal>)
1231 provides the necessary syntactic support, introducing the keywords <literal>mdo</literal> and
1232 <literal>rec</literal> for higher and lower levels of the notation respectively. Unlike
1233 bindings in a <literal>do</literal> expression, those introduced by <literal>mdo</literal> and <literal>rec</literal>
1234 are recursively defined, much like in an ordinary let-expression. Due to the new
1235 keyword <literal>mdo</literal>, we also call this notation the <emphasis>mdo-notation</emphasis>.
1236 </para>
1237
1238 <para>
1239 Here is a simple (albeit contrived) example:
1240 <programlisting>
1241 {-# LANGUAGE RecursiveDo #-}
1242 justOnes = mdo { xs &lt;- Just (1:xs)
1243                ; return (map negate xs) }
1244 </programlisting>
1245 or equivalently
1246 <programlisting>
1247 {-# LANGUAGE RecursiveDo #-}
1248 justOnes = do { rec { xs &lt;- Just (1:xs) }
1249               ; return (map negate xs) }
1250 </programlisting>
1251 As you can guess, <literal>justOnes</literal> will evaluate to <literal>Just [-1,-1,-1,...</literal>.
1252 </para>
1253
1254 <para>
1255 GHC's implementation of the mdo-notation closely follows the original translation as described in the paper
1256 <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for Haskell</ulink>, which
1257 in turn is based on the work <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion
1258 in Monadic Computations</ulink>. Furthermore, GHC extends the syntax described in the former paper
1259 with a lower level syntax flagged by the <literal>rec</literal> keyword, as we describe next.
1260 </para>
1261
1262 <sect3>
1263 <title>Recursive binding groups</title>
1264
1265 <para>
1266 The flag <option>-XRecursiveDo</option> also introduces a new keyword <literal>rec</literal>, which wraps a
1267 mutually-recursive group of monadic statements inside a <literal>do</literal> expression, producing a single statement.
1268 Similar to a <literal>let</literal> statement inside a <literal>do</literal>, variables bound in
1269 the <literal>rec</literal> are visible throughout the <literal>rec</literal> group, and below it. For example, compare
1270 <programlisting>
1271 do { a &lt;- getChar              do { a &lt;- getChar
1272    ; let { r1 = f a r2            ; rec { r1 &lt;- f a r2
1273    ;     ; r2 = g r1 }            ;     ; r2 &lt;- g r1 }
1274    ; return (r1 ++ r2) }          ; return (r1 ++ r2) }
1275 </programlisting>
1276 In both cases, <literal>r1</literal> and <literal>r2</literal> are available both throughout
1277 the <literal>let</literal> or <literal>rec</literal> block, and in the statements that follow it.
1278 The difference is that <literal>let</literal> is non-monadic, while <literal>rec</literal> is monadic.
1279 (In Haskell <literal>let</literal> is really <literal>letrec</literal>, of course.)
1280 </para>
1281
1282 <para>
1283 The semantics of <literal>rec</literal> is fairly straightforward. Whenever GHC finds a <literal>rec</literal>
1284 group, it will compute its set of bound variables, and will introduce an appropriate call
1285 to the underlying monadic value-recursion operator <literal>mfix</literal>, belonging to the
1286 <literal>MonadFix</literal> class. Here is an example:
1287 <programlisting>
1288 rec { b &lt;- f a c     ===>    (b,c) &lt;- mfix (\ ~(b,c) -> do { b &lt;- f a c
1289     ; c &lt;- f b a }                                         ; c &lt;- f b a
1290                                                            ; return (b,c) })
1291 </programlisting>
1292 As usual, the meta-variables <literal>b</literal>, <literal>c</literal> etc., can be arbitrary patterns.
1293 In general, the statement <literal>rec <replaceable>ss</replaceable></literal> is desugared to the statement
1294 <programlisting>
1295 <replaceable>vs</replaceable> &lt;- mfix (\ ~<replaceable>vs</replaceable> -&gt; do { <replaceable>ss</replaceable>; return <replaceable>vs</replaceable> })
1296 </programlisting>
1297 where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>.
1298 </para>
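
<para>
The translation can be checked on a small case. In the sketch below
(the names <literal>withRec</literal> and <literal>byHand</literal> are
hypothetical), the second definition is the first one desugared by hand
following the rule above, in the <literal>Maybe</literal> monad:
</para>
<programlisting>
{-# LANGUAGE RecursiveDo #-}
module Main where

import Control.Monad.Fix (mfix)

withRec :: Maybe [Int]
withRec = do
  rec xs &lt;- Just (1 : ys)
      ys &lt;- Just (2 : xs)
  return (take 4 xs)

byHand :: Maybe [Int]
byHand = do
  -- _xs0 and ys0 are the lazily matched results fed back by mfix;
  -- xs1 and ys1 are the values bound inside the group.
  (xs, _ys) &lt;- mfix (\ ~(_xs0, ys0) -> do
                    xs1 &lt;- Just (1 : ys0)
                    ys1 &lt;- Just (2 : xs1)
                    return (xs1, ys1))
  return (take 4 xs)

main :: IO ()
main = do
  print withRec   -- Just [1,2,1,2]
  print byHand    -- Just [1,2,1,2]
</programlisting>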
1299
1300 <para>
1301 Note in particular that the translation for a <literal>rec</literal> block only involves wrapping a call
1302 to <literal>mfix</literal>: it performs no other analysis on the bindings. The latter is the task
1303 for the <literal>mdo</literal> notation, which is described next.
1304 </para>
1305 </sect3>
1306
1307 <sect3>
1308 <title>The <literal>mdo</literal> notation</title>
1309
1310 <para>
1311 A <literal>rec</literal>-block tells the compiler where precisely the recursive knot should be tied. It turns out that
1312 the placement of the recursive knots can be rather delicate: in particular, we would like the knots to be wrapped
1313 around as minimal groups as possible. This process is known as <emphasis>segmentation</emphasis>, and is described
1314 in detail in Section 3.2 of <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for
1315 Haskell</ulink>. Segmentation improves polymorphism and reduces the size of the recursive knot. Most importantly, it avoids
1316 unnecessary interference caused by a fundamental issue with the so-called <emphasis>right-shrinking</emphasis>
1317 axiom for monadic recursion. In brief, most monads of interest (IO, strict state, etc.) do <emphasis>not</emphasis>
1318 have recursion operators that satisfy this axiom, and thus not performing segmentation can cause unnecessary
1319 interference, changing the termination behavior of the resulting translation.
1320 (Details can be found in Sections 3.1 and 7.2.2 of
1321 <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.)
1322 </para>
1323
1324 <para>
1325 The <literal>mdo</literal> notation removes the burden of placing
1326 explicit <literal>rec</literal> blocks in the code. Unlike an
1327 ordinary <literal>do</literal> expression, in which variables bound by
1328 statements are only in scope for later statements, variables bound in
1329 an <literal>mdo</literal> expression are in scope for all statements
1330 of the expression. The compiler then automatically identifies minimal
1331 mutually recursively dependent segments of statements, treating them as
1332 if the user had wrapped a <literal>rec</literal> qualifier around them.
1333 </para>
1334
1335 <para>
1336 The definition is syntactic:
1337 </para>
1338 <itemizedlist>
1339 <listitem>
1340 <para>
1341 A generator <replaceable>g</replaceable>
1342 <emphasis>depends</emphasis> on a textually following generator
1343 <replaceable>g'</replaceable>, if
1344 </para>
1345 <itemizedlist>
1346 <listitem>
1347 <para>
1348 <replaceable>g'</replaceable> defines a variable that
1349 is used by <replaceable>g</replaceable>, or
1350 </para>
1351 </listitem>
1352 <listitem>
1353 <para>
1354 <replaceable>g'</replaceable> textually appears between
1355 <replaceable>g</replaceable> and
1356 <replaceable>g''</replaceable>, where <replaceable>g</replaceable>
1357 depends on <replaceable>g''</replaceable>.
1358 </para>
1359 </listitem>
1360 </itemizedlist>
1361 </listitem>
1362 <listitem>
1363 <para>
1364 A <emphasis>segment</emphasis> of a given
1365 <literal>mdo</literal>-expression is a minimal sequence of generators
1366 such that no generator of the sequence depends on an outside
1367 generator. As a special case, although it is not a generator,
1368 the final expression in an <literal>mdo</literal>-expression is
1369 considered to form a segment by itself.
1370 </para>
1371 </listitem>
1372 </itemizedlist>
1373 <para>
1374 Segments in this sense are
1375 related to <emphasis>strongly-connected components</emphasis> analysis,
1376 with the exception that bindings in a segment cannot be reordered and
1377 must be contiguous.
1378 </para>
1379
1380 <para>
1381 Here is an example <literal>mdo</literal>-expression, and its translation to <literal>rec</literal> blocks:
1382 <programlisting>
1383 mdo { a &lt;- getChar      ===> do { a &lt;- getChar
1384     ; b &lt;- f a c                ; rec { b &lt;- f a c
1385     ; c &lt;- f b a                ;     ; c &lt;- f b a }
1386     ; z &lt;- h a b                ; z &lt;- h a b
1387     ; d &lt;- g d e                ; rec { d &lt;- g d e
1388     ; e &lt;- g a z                ;     ; e &lt;- g a z }
1389     ; putChar c }               ; putChar c }
1390 </programlisting>
1391 Note that a given <literal>mdo</literal> expression can cause the creation of multiple <literal>rec</literal> blocks.
1392 If there are no recursive dependencies, <literal>mdo</literal> will introduce no <literal>rec</literal> blocks. In this
1393 latter case an <literal>mdo</literal> expression is precisely the same as a <literal>do</literal> expression, as one
1394 would expect.
1395 </para>
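
<para>
A runnable counterpart in the <literal>Maybe</literal> monad may make
the scoping rule more concrete. This is a sketch only;
<literal>evensOdds</literal> is a hypothetical name. The first generator
refers to <literal>odds</literal>, which is bound by the second, so the
two form one recursive segment:
</para>
<programlisting>
{-# LANGUAGE RecursiveDo #-}
module Main where

evensOdds :: Maybe ([Int], [Int])
evensOdds = mdo
  evens &lt;- Just (0 : map (+1) odds)   -- uses odds, which is bound below
  odds  &lt;- Just (map (+1) evens)
  return (take 3 evens, take 3 odds)

main :: IO ()
main = print evensOdds   -- Just ([0,2,4],[1,3,5])
</programlisting>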
1396
1397 <para>
1398 In summary, given an <literal>mdo</literal> expression, GHC first performs segmentation, introducing
1399 <literal>rec</literal> blocks to wrap over minimal recursive groups. Then, each resulting
1400 <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal> as described
1401 in the previous section. The original <literal>mdo</literal>-expression typechecks exactly when the desugared
1402 version would do so.
1403 </para>
1404
1405 <para>
1406 Here are some other important points in using the recursive-do notation:
1407
1408 <itemizedlist>
1409 <listitem>
1410 <para>
1411 It is enabled with the flag <literal>-XRecursiveDo</literal>, or the <literal>LANGUAGE RecursiveDo</literal>
1412 pragma. (The same flag enables both <literal>mdo</literal>-notation, and the use of <literal>rec</literal>
1413 blocks inside <literal>do</literal> expressions.)
1414 </para>
1415 </listitem>
1416 <listitem>
1417 <para>
1418 <literal>rec</literal> blocks can also be used inside <literal>mdo</literal>-expressions, which will be
1419 treated as a single statement. However, it is good style to use either <literal>mdo</literal> or
1420 <literal>rec</literal> blocks in a single expression, not both.
1421 </para>
1422 </listitem>
1423 <listitem>
1424 <para>
1425 If recursive bindings are required for a monad, then that monad must be declared an instance of
1426 the <literal>MonadFix</literal> class.
1427 </para>
1428 </listitem>
1429 <listitem>
1430 <para>
1431 The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
1432 Furthermore, the <literal>Control.Monad.ST</literal> and <literal>Control.Monad.ST.Lazy</literal>
1433 modules provide the instances of the <literal>MonadFix</literal> class for Haskell's internal
1434 state monad (strict and lazy, respectively).
1435 </para>
1436 </listitem>
1437 <listitem>
1438 <para>
1439 Like <literal>let</literal> and <literal>where</literal> bindings, name shadowing is not allowed within
1440 an <literal>mdo</literal>-expression or a <literal>rec</literal>-block; that is, all the names bound in
1441 a single <literal>rec</literal> must be distinct. (GHC will complain if this is not the case.)
1442 </para>
1443 </listitem>
1444 </itemizedlist>
1445 </para>
1446 </sect3>
1447
1448
1449 </sect2>
1450
1451
1452 <!-- ===================== PARALLEL LIST COMPREHENSIONS =================== -->
1453
1454 <sect2 id="parallel-list-comprehensions">
1455 <title>Parallel List Comprehensions</title>
1456 <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
1457 </indexterm>
1458 <indexterm><primary>parallel list comprehensions</primary>
1459 </indexterm>
1460
1461 <para>Parallel list comprehensions are a natural extension to list
1462 comprehensions. List comprehensions can be thought of as a nice
1463 syntax for writing maps and filters. Parallel comprehensions
1464 extend this to include the zipWith family.</para>
1465
1466 <para>A parallel list comprehension has multiple independent
1467 branches of qualifier lists, each separated by a `|' symbol. For
1468 example, the following zips together two lists:</para>
1469
1470 <programlisting>
1471 [ (x, y) | x &lt;- xs | y &lt;- ys ]
1472 </programlisting>
1473
1474 <para>The behaviour of parallel list comprehensions follows that of
1475 zip, in that the resulting list will have the same length as the
1476 shortest branch.</para>
1477
1478 <para>We can define parallel list comprehensions by translation to
1479 regular comprehensions. Here's the basic idea:</para>
1480
1481 <para>Given a parallel comprehension of the form: </para>
1482
1483 <programlisting>
1484 [ e | p1 &lt;- e11, p2 &lt;- e12, ...
1485     | q1 &lt;- e21, q2 &lt;- e22, ...
1486     ...
1487 ]
1488 </programlisting>
1489
1490 <para>This will be translated to: </para>
1491
1492 <programlisting>
1493 [ e | ((p1,p2), (q1,q2), ...) &lt;- zipN [(p1,p2) | p1 &lt;- e11, p2 &lt;- e12, ...]
1494                                       [(q1,q2) | q1 &lt;- e21, q2 &lt;- e22, ...]
1495                                       ...
1496 ]
1497 </programlisting>
1498
1499 <para>where `zipN' is the appropriate zip for the given number of
1500 branches.</para>
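
<para>
As a concrete check of the zipping behaviour, the following sketch
assumes the extension is switched on with the
<literal>ParallelListComp</literal> language pragma. The names
<literal>pairs</literal> and <literal>triples</literal> are hypothetical.
</para>
<programlisting>
{-# LANGUAGE ParallelListComp #-}
module Main where

-- Two branches of unequal length: the result is as long as the shorter.
pairs :: [(Int, Char)]
pairs = [ (x, y) | x &lt;- [1, 2, 3] | y &lt;- "ab" ]

-- The first branch has two generators and yields four (x, y) pairs,
-- which are then zipped with the three elements of the second branch.
triples :: [(Int, Char, Bool)]
triples = [ (x, y, z) | x &lt;- [1, 2], y &lt;- "ab"
                      | z &lt;- [True, False, True] ]

main :: IO ()
main = do
  print pairs     -- [(1,'a'),(2,'b')]
  print triples   -- [(1,'a',True),(1,'b',False),(2,'a',True)]
</programlisting>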
1501
1502 </sect2>
1503
1504 <!-- ===================== TRANSFORM LIST COMPREHENSIONS =================== -->
1505
1506 <sect2 id="generalised-list-comprehensions">
1507 <title>Generalised (SQL-Like) List Comprehensions</title>
1508 <indexterm><primary>list comprehensions</primary><secondary>generalised</secondary>
1509 </indexterm>
1510 <indexterm><primary>extended list comprehensions</primary>
1511 </indexterm>
1512 <indexterm><primary>group</primary></indexterm>
1513 <indexterm><primary>sql</primary></indexterm>
1514
1515
1516 <para>Generalised list comprehensions are a further enhancement to the
1517 list comprehension syntactic sugar to allow operations such as sorting
1518 and grouping which are familiar from SQL. They are fully described in the
1519 paper <ulink url="http://research.microsoft.com/~simonpj/papers/list-comp">
1520 Comprehensive comprehensions: comprehensions with "order by" and "group by"</ulink>,
1521 except that the syntax we use differs slightly from the paper.</para>
1522 <para>The extension is enabled with the flag <option>-XTransformListComp</option>.</para>
1523 <para>Here is an example:
1524 <programlisting>
1525 employees = [ ("Simon", "MS", 80)
1526             , ("Erik", "MS", 100)
1527             , ("Phil", "Ed", 40)
1528             , ("Gordon", "Ed", 45)
1529             , ("Paul", "Yale", 60)]
1530
1531 output = [ (the dept, sum salary)
1532          | (name, dept, salary) &lt;- employees
1533          , then group by dept using groupWith
1534          , then sortWith by (sum salary)
1535          , then take 5 ]
1536 </programlisting>
1537 In this example, the list <literal>output</literal> would take on
1538 the value:
1539
1540 <programlisting>
1541 [("Yale", 60), ("Ed", 85), ("MS", 180)]
1542 </programlisting>
1543 </para>
1544 <para>There are three new keywords: <literal>group</literal>, <literal>by</literal>, and <literal>using</literal>.
1545 (The functions <literal>sortWith</literal> and <literal>groupWith</literal> are not keywords; they are ordinary
1546 functions that are exported by <literal>GHC.Exts</literal>.)</para>
1547
1548 <para>There are four new forms of comprehension qualifier,
1549 all introduced by the (existing) keyword <literal>then</literal>:
1550 <itemizedlist>
1551 <listitem>
1552
1553 <programlisting>
1554 then f
1555 </programlisting>
1556
1557 <para>This statement requires that <literal>f</literal> have the type
1558 <literal>forall a. [a] -> [a]</literal>. You can see an example of its use in the
1559 motivating example, as this form is used to apply <literal>take 5</literal>.</para>
1560
1561 </listitem>
1562
1563
1564 <listitem>
1565 <para>
1566 <programlisting>
1567 then f by e
1568 </programlisting>
1569
1570 This form is similar to the previous one, but allows you to create a function
1571 which will be passed as the first argument to f. As a consequence f must have
1572 the type <literal>forall a. (a -> t) -> [a] -> [a]</literal>. As you can see
1573 from the type, this function lets f &quot;project out&quot; some information
1574 from the elements of the list it is transforming.</para>
1575
1576 <para>An example is shown in the opening example, where <literal>sortWith</literal>
1577 is supplied with a function that lets it find out the <literal>sum salary</literal>
1578 for any item in the list comprehension it transforms.</para>
1579
1580 </listitem>
1581
1582
1583 <listitem>
1584
1585 <programlisting>
1586 then group by e using f
1587 </programlisting>
1588
1589 <para>This is the most general of the grouping-type statements. In this form,
1590 f is required to have type <literal>forall a. (a -> t) -> [a] -> [[a]]</literal>.
1591 As with the <literal>then f by e</literal> case above, the first argument
1592 is a function supplied to f by the compiler which lets it compute e on every
1593 element of the list being transformed. However, unlike the non-grouping case,
1594 f additionally partitions the list into a number of sublists: this means that
1595 at every point after this statement, binders occurring before it in the comprehension
1596 refer to <emphasis>lists</emphasis> of possible values, not single values. To help understand
1597 this, let's look at an example:</para>
1598
1599 <programlisting>
1600 -- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first
1601 groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
1602 groupRuns f = groupBy (\x y -> f x == f y)
1603
1604 output = [ (the x, y)
1605          | x &lt;- ([1..3] ++ [1..2])
1606          , y &lt;- [4..6]
1607          , then group by x using groupRuns ]
1608 </programlisting>
1609
1610 <para>This results in the variable <literal>output</literal> taking on the value below:</para>
1611
1612 <programlisting>
1613 [(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])]
1614 </programlisting>
1615
1616 <para>Note that we have used the <literal>the</literal> function to change the type
1617 of x from a list to its original numeric type. The variable y, in contrast, is left
1618 unchanged from the list form introduced by the grouping.</para>
1619
1620 </listitem>
1621
1622 <listitem>
1623
1624 <programlisting>
1625 then group using f
1626 </programlisting>
1627
1628 <para>With this form of the group statement, f is required to simply have the type
1629 <literal>forall a. [a] -> [[a]]</literal>, which will be used to group up the
1630 comprehension so far directly. An example of this form is as follows:</para>
1631
1632 <programlisting>
output = [ x
         | y &lt;- [1..5]
         , x &lt;- "hello"
         , then group using inits ]
1637 </programlisting>
1638
1639 <para>This will yield a list containing every prefix of the word "hello" written out 5 times:</para>
1640
1641 <programlisting>
1642 ["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...]
1643 </programlisting>
1644
1645 </listitem>
1646 </itemizedlist>
1647 </para>
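<para>
The two grouping forms above can be tried together in one file. The following is a
minimal sketch, assuming a GHC with <option>-XTransformListComp</option>; the names
<literal>byRuns</literal> and <literal>prefixes</literal> are chosen only for this
illustration, and the imports are the ones the earlier snippets leave implicit.
<programlisting>
{-# LANGUAGE TransformListComp #-}

import Data.List (groupBy, inits)
import GHC.Exts (the)

-- Same helper as in the text: groups adjacent elements with equal keys.
groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
groupRuns f = groupBy (\x y -> f x == f y)

-- "then group by e using f": after the grouping, x and y are both lists.
byRuns :: [(Int, [Int])]
byRuns = [ (the x, y)
         | x &lt;- [1..3] ++ [1..2]
         , y &lt;- [4..6]
         , then group by x using groupRuns ]

-- "then group using f": the comprehension so far is handed to inits.
prefixes :: [String]
prefixes = [ x
           | _ &lt;- [1 .. 5 :: Int]
           , x &lt;- "hello"
           , then group using inits ]

main :: IO ()
main = do
  print byRuns
  print (take 4 prefixes)
</programlisting>
</para>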
1648 </sect2>
1649
1650 <!-- ===================== MONAD COMPREHENSIONS ===================== -->
1651
1652 <sect2 id="monad-comprehensions">
1653 <title>Monad comprehensions</title>
1654 <indexterm><primary>monad comprehensions</primary></indexterm>
1655
1656 <para>
1657 Monad comprehensions generalise the list comprehension notation,
1658 including parallel comprehensions
1659 (<xref linkend="parallel-list-comprehensions"/>) and
1660 transform comprehensions (<xref linkend="generalised-list-comprehensions"/>)
1661 to work for any monad.
1662 </para>
1663
1664 <para>Monad comprehensions support:</para>
1665
1666 <itemizedlist>
1667 <listitem>
1668 <para>
1669 Bindings:
1670 </para>
1671
1672 <programlisting>
1673 [ x + y | x &lt;- Just 1, y &lt;- Just 2 ]
1674 </programlisting>
1675
1676 <para>
1677 Bindings are translated with the <literal>(&gt;&gt;=)</literal> and
1678 <literal>return</literal> functions to the usual do-notation:
1679 </para>
1680
1681 <programlisting>
do x &lt;- Just 1
   y &lt;- Just 2
   return (x+y)
1685 </programlisting>
1686
1687 </listitem>
1688 <listitem>
1689 <para>
1690 Guards:
1691 </para>
1692
1693 <programlisting>
1694 [ x | x &lt;- [1..10], x &lt;= 5 ]
1695 </programlisting>
1696
1697 <para>
1698 Guards are translated with the <literal>guard</literal> function,
1699 which requires a <literal>MonadPlus</literal> instance:
1700 </para>
1701
1702 <programlisting>
do x &lt;- [1..10]
   guard (x &lt;= 5)
   return x
1706 </programlisting>
1707
1708 </listitem>
1709 <listitem>
1710 <para>
1711 Transform statements (as with <literal>-XTransformListComp</literal>):
1712 </para>
1713
1714 <programlisting>
1715 [ x+y | x &lt;- [1..10], y &lt;- [1..x], then take 2 ]
1716 </programlisting>
1717
1718 <para>
1719 This translates to:
1720 </para>
1721
1722 <programlisting>
do (x,y) &lt;- take 2 (do x &lt;- [1..10]
                       y &lt;- [1..x]
                       return (x,y))
   return (x+y)
1727 </programlisting>
1728
1729 </listitem>
1730 <listitem>
1731 <para>
1732 Group statements (as with <literal>-XTransformListComp</literal>):
1733 </para>
1734
1735 <programlisting>
1736 [ x | x &lt;- [1,1,2,2,3], then group by x using GHC.Exts.groupWith ]
1737 [ x | x &lt;- [1,1,2,2,3], then group using myGroup ]
1738 </programlisting>
1739
1740 </listitem>
1741 <listitem>
1742 <para>
1743 Parallel statements (as with <literal>-XParallelListComp</literal>):
1744 </para>
1745
1746 <programlisting>
[ (x+y) | x &lt;- [1..10]
        | y &lt;- [11..20]
        ]
1750 </programlisting>
1751
1752 <para>
1753 Parallel statements are translated using the
1754 <literal>mzip</literal> function, which requires a
1755 <literal>MonadZip</literal> instance defined in
1756 <ulink url="&libraryBaseLocation;/Control-Monad-Zip.html"><literal>Control.Monad.Zip</literal></ulink>:
1757 </para>
1758
1759 <programlisting>
do (x,y) &lt;- mzip (do x &lt;- [1..10]
                     return x)
                 (do y &lt;- [11..20]
                     return y)
   return (x+y)
1765 </programlisting>
1766
1767 </listitem>
1768 </itemizedlist>
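<para>
To see bindings and guards at work in a monad other than lists, here is a small
sketch in the <literal>Maybe</literal> monad. The function names
<literal>safeDiv</literal> and <literal>sumQuotients</literal> are hypothetical and
exist only for this example.
<programlisting>
{-# LANGUAGE MonadComprehensions #-}

-- A guard on its own: the result is Nothing when the divisor is zero.
safeDiv :: Int -> Int -> Maybe Int
safeDiv n d = [ n `div` d | d /= 0 ]

-- Bindings in the Maybe monad: Nothing if either division fails.
sumQuotients :: Int -> Int -> Int -> Maybe Int
sumQuotients a b d = [ x + y | x &lt;- safeDiv a d, y &lt;- safeDiv b d ]

main :: IO ()
main = do
  print (sumQuotients 6 9 3)   -- Just 5
  print (sumQuotients 6 9 0)   -- Nothing
</programlisting>
</para>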
1769
1770 <para>
All these features are enabled when the
<literal>MonadComprehensions</literal> extension is enabled. The types
and more detailed examples of how to use comprehensions are given
in the earlier sections <xref
linkend="generalised-list-comprehensions"/> and <xref
linkend="parallel-list-comprehensions"/>. In general you just have
to replace the type <literal>[a]</literal> with the type
<literal>Monad m => m a</literal> for monad comprehensions.
1779 </para>
1780
1781 <para>
1782 Note: Even though most of these examples are using the list monad,
1783 monad comprehensions work for any monad.
The <literal>base</literal> package offers all necessary instances for
lists, which makes <literal>MonadComprehensions</literal> backward
compatible with built-in, transform and parallel list comprehensions.
1787 </para>
1788 <para> More formally, the desugaring is as follows. We write <literal>D[ e | Q]</literal>
1789 to mean the desugaring of the monad comprehension <literal>[ e | Q]</literal>:
1790 <programlisting>
Expressions: e
Declarations: d
Lists of qualifiers: Q,R,S

-- Basic forms
D[ e | ]               = return e
D[ e | p &lt;- e, Q ]     = e &gt;&gt;= \p -&gt; D[ e | Q ]
D[ e | e, Q ]          = guard e &gt;&gt; D[ e | Q ]
D[ e | let d, Q ]      = let d in D[ e | Q ]

-- Parallel comprehensions (iterate for multiple parallel branches)
D[ e | (Q | R), S ]    = mzip D[ Qv | Q ] D[ Rv | R ] &gt;&gt;= \(Qv,Rv) -&gt; D[ e | S ]

-- Transform comprehensions
D[ e | Q then f, R ]                  = f D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]

D[ e | Q then f by b, R ]             = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]

D[ e | Q then group using f, R ]      = f D[ Qv | Q ] &gt;&gt;= \ys -&gt;
                                        case (fmap selQv1 ys, ..., fmap selQvn ys) of
                                          Qv -&gt; D[ e | R ]

D[ e | Q then group by b using f, R ] = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \ys -&gt;
                                        case (fmap selQv1 ys, ..., fmap selQvn ys) of
                                          Qv -&gt; D[ e | R ]

where  Qv is the tuple of variables bound by Q (and used subsequently)
       selQvi is a selector mapping Qv to the ith component of Qv

Operator     Standard binding       Expected type
--------------------------------------------------------------------
return       GHC.Base               t1 -&gt; m t2
(&gt;&gt;=)        GHC.Base               m1 t1 -&gt; (t2 -&gt; m2 t3) -&gt; m3 t3
(&gt;&gt;)         GHC.Base               m1 t1 -&gt; m2 t2 -&gt; m3 t3
guard        Control.Monad          t1 -&gt; m t2
fmap         GHC.Base               forall a b. (a-&gt;b) -&gt; n a -&gt; n b
mzip         Control.Monad.Zip      forall a b. m a -&gt; m b -&gt; m (a,b)
1828 </programlisting>
1829 The comprehension should typecheck when its desugaring would typecheck.
1830 </para>
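<para>
As a concrete check of the rules for bindings, guards and parallel branches, the
sketch below writes one comprehension twice: once with the sugar and once expanded
by hand. The names <literal>sugared</literal> and <literal>byHand</literal> are
illustrative only. The pragma also names <literal>ParallelListComp</literal> as a
precaution; whether <literal>MonadComprehensions</literal> alone accepts the
parallel syntax may depend on the GHC version.
<programlisting>
{-# LANGUAGE MonadComprehensions, ParallelListComp #-}

import Control.Monad (guard)
import Control.Monad.Zip (mzip)

-- One branch with a binding and a guard, zipped with a second branch.
sugared :: [Int]
sugared = [ x + y | x &lt;- [1..10], even x | y &lt;- [11..20] ]

-- The same computation, expanded following the D[ ... ] rules by hand.
byHand :: [Int]
byHand =
  mzip ([1..10] >>= \x -> guard (even x) >> return x)
       ([11..20] >>= \y -> return y)
    >>= \(x, y) -> return (x + y)

main :: IO ()
main = do
  print sugared
  print (sugared == byHand)
</programlisting>
For lists, <literal>mzip</literal> behaves like <literal>zip</literal>, so the
shorter branch determines the length of the result.
</para>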
1831 <para>
1832 Monad comprehensions support rebindable syntax (<xref linkend="rebindable-syntax"/>).
1833 Without rebindable
1834 syntax, the operators from the "standard binding" module are used; with
1835 rebindable syntax, the operators are looked up in the current lexical scope.
1836 For example, parallel comprehensions will be typechecked and desugared
1837 using whatever "<literal>mzip</literal>" is in scope.
1838 </para>
1839 <para>
1840 The rebindable operators must have the "Expected type" given in the
1841 table above. These types are surprisingly general. For example, you can
1842 use a bind operator with the type
1843 <programlisting>
1844 (>>=) :: T x y a -> (a -> T y z b) -> T x z b
1845 </programlisting>
In the case of transform comprehensions, notice that the groups are
parameterised over some arbitrary type <literal>n</literal> (provided it
has an <literal>fmap</literal>), as well as
the comprehension being over an arbitrary monad.
1850 </para>
1851 </sect2>
1852
1853 <!-- ===================== REBINDABLE SYNTAX =================== -->
1854
1855 <sect2 id="rebindable-syntax">
1856 <title>Rebindable syntax and the implicit Prelude import</title>
1857
1858 <para><indexterm><primary>-XNoImplicitPrelude
1859 option</primary></indexterm> GHC normally imports
1860 <filename>Prelude.hi</filename> files for you. If you'd
1861 rather it didn't, then give it a
1862 <option>-XNoImplicitPrelude</option> option. The idea is
1863 that you can then import a Prelude of your own. (But don't
1864 call it <literal>Prelude</literal>; the Haskell module
1865 namespace is flat, and you must not conflict with any
1866 Prelude module.)</para>
1867
1868 <para>Suppose you are importing a Prelude of your own
1869 in order to define your own numeric class
1870 hierarchy. It completely defeats that purpose if the
1871 literal "1" means "<literal>Prelude.fromInteger
1872 1</literal>", which is what the Haskell Report specifies.
1873 So the <option>-XRebindableSyntax</option>
1874 flag causes
1875 the following pieces of built-in syntax to refer to
1876 <emphasis>whatever is in scope</emphasis>, not the Prelude
1877 versions:
1878 <itemizedlist>
1879 <listitem>
1880 <para>An integer literal <literal>368</literal> means
1881 "<literal>fromInteger (368::Integer)</literal>", rather than
1882 "<literal>Prelude.fromInteger (368::Integer)</literal>".
1883 </para> </listitem>
1884
<listitem><para>Fractional literals are handled in just the same way,
1886 except that the translation is
1887 <literal>fromRational (3.68::Rational)</literal>.
1888 </para> </listitem>
1889
1890 <listitem><para>The equality test in an overloaded numeric pattern
1891 uses whatever <literal>(==)</literal> is in scope.
1892 </para> </listitem>
1893
1894 <listitem><para>The subtraction operation, and the
1895 greater-than-or-equal test, in <literal>n+k</literal> patterns
1896 use whatever <literal>(-)</literal> and <literal>(>=)</literal> are in scope.
1897 </para></listitem>
1898
1899 <listitem>
1900 <para>Negation (e.g. "<literal>- (f x)</literal>")
1901 means "<literal>negate (f x)</literal>", both in numeric
1902 patterns, and expressions.
1903 </para></listitem>
1904
1905 <listitem>
<para>A conditional (e.g. "<literal>if</literal> e1 <literal>then</literal> e2 <literal>else</literal> e3")
means "<literal>ifThenElse</literal> e1 e2 e3".  However, <literal>case</literal> expressions are unaffected.
1908 </para></listitem>
1909
1910 <listitem>
1911 <para>"Do" notation is translated using whatever
1912 functions <literal>(>>=)</literal>,
1913 <literal>(>>)</literal>, and <literal>fail</literal>,
1914 are in scope (not the Prelude
1915 versions). List comprehensions, <literal>mdo</literal>
1916 (<xref linkend="recursive-do-notation"/>), and parallel array
1917 comprehensions, are unaffected. </para></listitem>
1918
1919 <listitem>
1920 <para>Arrow
1921 notation (see <xref linkend="arrow-notation"/>)
1922 uses whatever <literal>arr</literal>,
1923 <literal>(>>>)</literal>, <literal>first</literal>,
1924 <literal>app</literal>, <literal>(|||)</literal> and
1925 <literal>loop</literal> functions are in scope. But unlike the
1926 other constructs, the types of these functions must match the
1927 Prelude types very closely. Details are in flux; if you want
1928 to use this, ask!
1929 </para></listitem>
1930 </itemizedlist>
1931 <option>-XRebindableSyntax</option> implies <option>-XNoImplicitPrelude</option>.
1932 </para>
1933 <para>
1934 In all cases (apart from arrow notation), the static semantics should be that of the desugared form,
1935 even if that is a little unexpected. For example, the
1936 static semantics of the literal <literal>368</literal>
1937 is exactly that of <literal>fromInteger (368::Integer)</literal>; it's fine for
1938 <literal>fromInteger</literal> to have any of the types:
1939 <programlisting>
1940 fromInteger :: Integer -> Integer
1941 fromInteger :: forall a. Foo a => Integer -> a
1942 fromInteger :: Num a => a -> Integer
1943 fromInteger :: Integer -> Bool -> Bool
1944 </programlisting>
1945 </para>
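<para>
The following is a small sketch of rebinding literals and conditionals. The type
<literal>Age</literal> and the functions <literal>years</literal> and
<literal>threshold</literal> are hypothetical; only <literal>fromInteger</literal>
and <literal>ifThenElse</literal> are names that the extension looks up.
<programlisting>
{-# LANGUAGE RebindableSyntax #-}

import Prelude hiding (fromInteger)
import qualified Prelude as P

-- A hypothetical type used only for this illustration.
newtype Age = Age Int

years :: Age -> Int
years (Age n) = n

-- Every integer literal in this module now means "fromInteger (n :: Integer)"
-- using this function, so literals have type Age.
fromInteger :: Integer -> Age
fromInteger n = Age (P.fromInteger n)

-- "if c then t else e" now means "ifThenElse c t e".
ifThenElse :: Bool -> a -> a -> a
ifThenElse True  t _ = t
ifThenElse False _ e = e

threshold :: Bool -> Age
threshold strict = if strict then 21 else 18

main :: IO ()
main = do
  print (years (threshold True))
  print (years (threshold False))
</programlisting>
Because <option>-XRebindableSyntax</option> implies
<option>-XNoImplicitPrelude</option>, the sketch imports the Prelude explicitly.
</para>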
1946
1947 <para>Be warned: this is an experimental facility, with
1948 fewer checks than usual. Use <literal>-dcore-lint</literal>
1949 to typecheck the desugared program. If Core Lint is happy
1950 you should be all right.</para>
1951
1952 </sect2>
1953
1954 <sect2 id="postfix-operators">
1955 <title>Postfix operators</title>
1956
1957 <para>
1958 The <option>-XPostfixOperators</option> flag enables a small
1959 extension to the syntax of left operator sections, which allows you to
1960 define postfix operators. The extension is this: the left section
1961 <programlisting>
1962 (e !)
1963 </programlisting>
1964 is equivalent (from the point of view of both type checking and execution) to the expression
1965 <programlisting>
1966 ((!) e)
1967 </programlisting>
(for any expression <literal>e</literal> and operator <literal>(!)</literal>).
1969 The strict Haskell 98 interpretation is that the section is equivalent to
1970 <programlisting>
1971 (\y -> (!) e y)
1972 </programlisting>
1973 That is, the operator must be a function of two arguments. GHC allows it to
1974 take only one argument, and that in turn allows you to write the function
1975 postfix.
1976 </para>
1977 <para>The extension does not extend to the left-hand side of function
1978 definitions; you must define such a function in prefix form.</para>
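<para>
A minimal sketch of a postfix factorial, assuming the operator name
<literal>(!)</literal> is free in the module; <literal>fact5</literal> is a name
chosen for this example.
<programlisting>
{-# LANGUAGE PostfixOperators #-}

-- Defined in prefix form, since the extension does not cover left-hand sides.
(!) :: Integer -> Integer
(!) n = product [1..n]

-- The left section (5 !) is treated as ((!) 5).
fact5 :: Integer
fact5 = (5 !)

main :: IO ()
main = print fact5   -- 120
</programlisting>
</para>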
1979
1980 </sect2>
1981
1982 <sect2 id="tuple-sections">
1983 <title>Tuple sections</title>
1984
1985 <para>
1986 The <option>-XTupleSections</option> flag enables Python-style partially applied
tuple constructors. For example, the following expression
<programlisting>
(, True)
</programlisting>
is considered to be an alternative notation for the more unwieldy
1992 <programlisting>
1993 \x -> (x, True)
1994 </programlisting>
1995 You can omit any combination of arguments to the tuple, as in the following
1996 <programlisting>
1997 (, "I", , , "Love", , 1337)
1998 </programlisting>
1999 which translates to
2000 <programlisting>
2001 \a b c d -> (a, "I", b, c, "Love", d, 1337)
2002 </programlisting>
2003 </para>
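<para>
Tuple sections are mostly used as arguments to higher-order functions. A short
sketch, with the illustrative names <literal>tagAll</literal> and
<literal>around</literal>:
<programlisting>
{-# LANGUAGE TupleSections #-}

-- (, True) is \x -> (x, True)
tagAll :: [Int] -> [(Int, Bool)]
tagAll = map (, True)

-- (, "mid", ) is \a b -> (a, "mid", b)
around :: Int -> Char -> (Int, String, Char)
around = (, "mid", )

main :: IO ()
main = do
  print (tagAll [1, 2])   -- [(1,True),(2,True)]
  print (around 1 'c')    -- (1,"mid",'c')
</programlisting>
</para>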
2004
2005 <para>
2006 If you have <link linkend="unboxed-tuples">unboxed tuples</link> enabled, tuple sections
2007 will also be available for them, like so
2008 <programlisting>
2009 (# , True #)
2010 </programlisting>
2011 Because there is no unboxed unit tuple, the following expression
2012 <programlisting>
2013 (# #)
2014 </programlisting>
2015 continues to stand for the unboxed singleton tuple data constructor.
2016 </para>
2017
2018 </sect2>
2019
2020 <sect2 id="lambda-case">
2021 <title>Lambda-case</title>
2022 <para>
2023 The <option>-XLambdaCase</option> flag enables expressions of the form
2024 <programlisting>
2025 \case { p1 -> e1; ...; pN -> eN }
2026 </programlisting>
2027 which is equivalent to
2028 <programlisting>
2029 \freshName -> case freshName of { p1 -> e1; ...; pN -> eN }
2030 </programlisting>
2031 Note that <literal>\case</literal> starts a layout, so you can write
2032 <programlisting>
\case
  p1 -> e1
  ...
  pN -> eN
2037 </programlisting>
2038 </para>
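<para>
A small sketch using the layout form, including guards on an alternative;
<literal>describe</literal> is a hypothetical function.
<programlisting>
{-# LANGUAGE LambdaCase #-}

describe :: Maybe Int -> String
describe = \case
  Nothing            -> "nothing"
  Just n | n &lt; 0     -> "negative"
         | otherwise -> "non-negative"

main :: IO ()
main = mapM_ (putStrLn . describe) [Nothing, Just (-3), Just 7]
</programlisting>
</para>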
2039 </sect2>
2040
2041 <sect2 id="empty-case">
2042 <title>Empty case alternatives</title>
2043 <para>
2044 The <option>-XEmptyCase</option> flag enables
2045 case expressions, or lambda-case expressions, that have no alternatives,
2046 thus:
2047 <programlisting>
    case e of { }   -- No alternatives
or
    \case { }       -- -XLambdaCase is also required
2051 </programlisting>
2052 This can be useful when you know that the expression being scrutinised
2053 has no non-bottom values. For example:
2054 <programlisting>
2055 data Void
2056 f :: Void -> Int
2057 f x = case x of { }
2058 </programlisting>
2059 With dependently-typed features it is more useful
2060 (see <ulink url="http://ghc.haskell.org/trac/ghc/ticket/2431">Trac</ulink>).
2061 For example, consider these two candidate definitions of <literal>absurd</literal>:
2062 <programlisting>
data a :~: b where
  Refl :: a :~: a

absurd :: True :~: False -> a
absurd x = error "absurd"    -- (A)
absurd x = case x of {}      -- (B)
2069 </programlisting>
2070 We much prefer (B). Why? Because GHC can figure out that <literal>(True :~: False)</literal>
2071 is an empty type. So (B) has no partiality and GHC should be able to compile with
2072 <option>-fwarn-incomplete-patterns</option>. (Though the pattern match checking is not
2073 yet clever enough to do that.)
2074 On the other hand (A) looks dangerous, and GHC doesn't check to make
2075 sure that, in fact, the function can never get called.
2076 </para>
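<para>
A self-contained sketch of the simpler, non-dependent use: an empty case lets a
function discharge a branch that can never be reached. The type
<literal>Void</literal> is declared locally here, and
<literal>stripVoid</literal> is a name invented for the example.
<programlisting>
{-# LANGUAGE EmptyCase #-}

-- A type with no constructors.
data Void

absurd :: Void -> a
absurd v = case v of {}

-- No non-bottom value of type Void exists, so only the Left branch can
-- produce a result.
stripVoid :: Either a Void -> a
stripVoid (Left a)  = a
stripVoid (Right v) = absurd v

main :: IO ()
main = putStrLn (stripVoid (Left "ok"))
</programlisting>
</para>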
2077 </sect2>
2078
2079 <sect2 id="multi-way-if">
2080 <title>Multi-way if-expressions</title>
2081 <para>
With the <option>-XMultiWayIf</option> flag, GHC accepts conditional expressions
with multiple branches:
<programlisting>
if | guard1 -> expr1
   | ...
   | guardN -> exprN
</programlisting>
which is roughly equivalent to
<programlisting>
case () of
  _ | guard1 -> expr1
  ...
  _ | guardN -> exprN
</programlisting>
2096 </para>
2097
2098 <para>Multi-way if expressions introduce a new layout context. So the
2099 example above is equivalent to:
<programlisting>
if { | guard1 -> expr1
   ; | ...
   ; | guardN -> exprN
   }
</programlisting>
The following behaves as expected:
<programlisting>
if | guard1 -> if | guard2 -> expr2
                  | guard3 -> expr3
   | guard4 -> expr4
</programlisting>
because layout translates it as
<programlisting>
if { | guard1 -> if { | guard2 -> expr2
                    ; | guard3 -> expr3
                    }
   ; | guard4 -> expr4
   }
</programlisting>
2120 Layout with multi-way if works in the same way as other layout
2121 contexts, except that the semi-colons between guards in a multi-way if
2122 are optional. So it is not necessary to line up all the guards at the
2123 same column; this is consistent with the way guards work in function
2124 definitions and case expressions.
2125 </para>
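<para>
A short sketch of a multi-way if in a function body; <literal>classify</literal>
is a hypothetical function.
<programlisting>
{-# LANGUAGE MultiWayIf #-}

classify :: Int -> String
classify n = if | n &lt; 0     -> "negative"
                | n == 0    -> "zero"
                | n &lt; 10    -> "small"
                | otherwise -> "large"

main :: IO ()
main = mapM_ (putStrLn . classify) [-5, 0, 7, 42]
</programlisting>
The guards are tried in order, so the final <literal>otherwise</literal> branch
acts as the default.
</para>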
2126 </sect2>
2127
2128 <sect2 id="disambiguate-fields">
2129 <title>Record field disambiguation</title>
2130 <para>
2131 In record construction and record pattern matching
2132 it is entirely unambiguous which field is referred to, even if there are two different
2133 data types in scope with a common field name. For example:
2134 <programlisting>
module M where
  data S = MkS { x :: Int, y :: Bool }

module Foo where
  import M

  data T = MkT { x :: Int }

  ok1 (MkS { x = n }) = n+1   -- Unambiguous
  ok2 n = MkT { x = n+1 }     -- Unambiguous

  bad1 k = k { x = 3 }        -- Ambiguous
  bad2 k = x k                -- Ambiguous
2148 </programlisting>
2149 Even though there are two <literal>x</literal>'s in scope,
2150 it is clear that the <literal>x</literal> in the pattern in the
2151 definition of <literal>ok1</literal> can only mean the field
2152 <literal>x</literal> from type <literal>S</literal>. Similarly for
2153 the function <literal>ok2</literal>. However, in the record update
2154 in <literal>bad1</literal> and the record selection in <literal>bad2</literal>
2155 it is not clear which of the two types is intended.
2156 </para>
2157 <para>
2158 Haskell 98 regards all four as ambiguous, but with the
2159 <option>-XDisambiguateRecordFields</option> flag, GHC will accept
2160 the former two. The rules are precisely the same as those for instance
2161 declarations in Haskell 98, where the method names on the left-hand side
2162 of the method bindings in an instance declaration refer unambiguously
2163 to the method of that class (provided they are in scope at all), even
2164 if there are other variables in scope with the same name.
2165 This reduces the clutter of qualified names when you import two
2166 records from different modules that use the same field name.
2167 </para>
2168 <para>
2169 Some details:
2170 <itemizedlist>
2171 <listitem><para>
2172 Field disambiguation can be combined with punning (see <xref linkend="record-puns"/>). For example:
2173 <programlisting>
module Foo where
  import M
  x=True
  ok3 (MkS { x }) = x+1   -- Uses both disambiguation and punning
2178 </programlisting>
2179 </para></listitem>
2180
2181 <listitem><para>
With <option>-XDisambiguateRecordFields</option> you can use <emphasis>unqualified</emphasis>
field names even if the corresponding selector is only in scope <emphasis>qualified</emphasis>.
For example, assuming the same module <literal>M</literal> as in our earlier example, this is legal:
<programlisting>
module Foo where
  import qualified M    -- Note qualified

  ok4 (M.MkS { x = n }) = n+1   -- Unambiguous
</programlisting>
2191 Since the constructor <literal>MkS</literal> is only in scope qualified, you must
2192 name it <literal>M.MkS</literal>, but the field <literal>x</literal> does not need
2193 to be qualified even though <literal>M.x</literal> is in scope but <literal>x</literal>
2194 is not. (In effect, it is qualified by the constructor.)
2195 </para></listitem>
2196 </itemizedlist>
2197 </para>
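<para>
The examples above need two modules. A single-file sketch of the same idea can be
made by clashing with a field from a library module; here a hypothetical local
record <literal>Total</literal> reuses the field name <literal>getSum</literal>
that <literal>Data.Monoid</literal> exports for <literal>Sum</literal>.
<programlisting>
{-# LANGUAGE DisambiguateRecordFields #-}

import Data.Monoid (Sum(..))

-- A hypothetical local record whose field clashes with Data.Monoid's getSum.
data Total = MkTotal { getSum :: Int }

-- The constructor Sum determines which getSum is meant.
bump :: Sum Int -> Int
bump (Sum { getSum = n }) = n + 1

-- The constructor MkTotal determines which getSum is meant.
mkTotal :: Int -> Total
mkTotal n = MkTotal { getSum = n }

totalValue :: Total -> Int
totalValue (MkTotal { getSum = n }) = n

main :: IO ()
main = do
  print (bump (Sum 3))
  print (totalValue (mkTotal 10))
</programlisting>
An unqualified use of <literal>getSum</literal> as a selector function would still
be ambiguous in this module, exactly as with <literal>bad2</literal> above.
</para>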
2198
2199 </sect2>
2200
2201 <!-- ===================== Record puns =================== -->
2202
2203 <sect2 id="record-puns">
2204 <title>Record puns
2205 </title>
2206
2207 <para>
2208 Record puns are enabled by the flag <literal>-XNamedFieldPuns</literal>.
2209 </para>
2210
2211 <para>
2212 When using records, it is common to write a pattern that binds a
2213 variable with the same name as a record field, such as:
2214
2215 <programlisting>
2216 data C = C {a :: Int}
2217 f (C {a = a}) = a
2218 </programlisting>
2219 </para>
2220
2221 <para>
2222 Record punning permits the variable name to be elided, so one can simply
2223 write
2224
2225 <programlisting>
2226 f (C {a}) = a
2227 </programlisting>
2228
2229 to mean the same pattern as above. That is, in a record pattern, the
2230 pattern <literal>a</literal> expands into the pattern <literal>a =
2231 a</literal> for the same name <literal>a</literal>.
2232 </para>
2233
2234 <para>
2235 Note that:
2236 <itemizedlist>
2237 <listitem><para>
2238 Record punning can also be used in an expression, writing, for example,
2239 <programlisting>
2240 let a = 1 in C {a}
2241 </programlisting>
2242 instead of
2243 <programlisting>
2244 let a = 1 in C {a = a}
2245 </programlisting>
2246 The expansion is purely syntactic, so the expanded right-hand side
2247 expression refers to the nearest enclosing variable that is spelled the
2248 same as the field name.
2249 </para></listitem>
2250
2251 <listitem><para>
2252 Puns and other patterns can be mixed in the same record:
2253 <programlisting>
2254 data C = C {a :: Int, b :: Int}
2255 f (C {a, b = 4}) = a
2256 </programlisting>
2257 </para></listitem>
2258
2259 <listitem><para>
2260 Puns can be used wherever record patterns occur (e.g. in
2261 <literal>let</literal> bindings or at the top-level).
2262 </para></listitem>
2263
2264 <listitem><para>
2265 A pun on a qualified field name is expanded by stripping off the module qualifier.
2266 For example:
2267 <programlisting>
f (M.C {M.a}) = a
2269 </programlisting>
2270 means
2271 <programlisting>
2272 f (M.C {M.a = a}) = a
2273 </programlisting>
2274 (This is useful if the field selector <literal>a</literal> for constructor <literal>M.C</literal>
2275 is only in scope in qualified form.)
2276 </para></listitem>
2277 </itemizedlist>
2278 </para>
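<para>
The points above can be combined in one small sketch. The record
<literal>Point</literal> and the functions around it are hypothetical.
<programlisting>
{-# LANGUAGE NamedFieldPuns #-}

data Point = Point { px :: Int, py :: Int }

-- Puns in a pattern: binds variables px and py to the fields.
norm1 :: Point -> Int
norm1 (Point {px, py}) = abs px + abs py

-- Puns in an expression: the fields are filled from the local variables.
diagonal :: Int -> Point
diagonal n = let { px = n; py = n } in Point {px, py}

-- A pun mixed with an ordinary field pattern.
onXAxis :: Point -> Maybe Int
onXAxis (Point {px, py = 0}) = Just px
onXAxis _                    = Nothing

main :: IO ()
main = do
  print (norm1 (diagonal (-2)))   -- 4
  print (onXAxis (Point 5 0))     -- Just 5
</programlisting>
</para>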
2279
2280
2281 </sect2>
2282
2283 <!-- ===================== Record wildcards =================== -->
2284
2285 <sect2 id="record-wildcards">
2286 <title>Record wildcards
2287 </title>
2288
2289 <para>
2290 Record wildcards are enabled by the flag <literal>-XRecordWildCards</literal>.
2291 This flag implies <literal>-XDisambiguateRecordFields</literal>.
2292 </para>
2293
2294 <para>
2295 For records with many fields, it can be tiresome to write out each field
2296 individually in a record pattern, as in
2297 <programlisting>
2298 data C = C {a :: Int, b :: Int, c :: Int, d :: Int}
2299 f (C {a = 1, b = b, c = c, d = d}) = b + c + d
2300 </programlisting>
2301 </para>
2302
2303 <para>
2304 Record wildcard syntax permits a "<literal>..</literal>" in a record
2305 pattern, where each elided field <literal>f</literal> is replaced by the
2306 pattern <literal>f = f</literal>. For example, the above pattern can be
2307 written as
2308 <programlisting>
2309 f (C {a = 1, ..}) = b + c + d
2310 </programlisting>
2311 </para>
2312
2313 <para>
2314 More details:
2315 <itemizedlist>
2316 <listitem><para>
2317 Wildcards can be mixed with other patterns, including puns
(<xref linkend="record-puns"/>); for example, in a pattern
<literal>(C {a = 1, b, ..})</literal>. Additionally, record wildcards can be used
2320 wherever record patterns occur, including in <literal>let</literal>
2321 bindings and at the top-level. For example, the top-level binding
2322 <programlisting>
2323 C {a = 1, ..} = e
2324 </programlisting>
2325 defines <literal>b</literal>, <literal>c</literal>, and
2326 <literal>d</literal>.
2327 </para></listitem>
2328
2329 <listitem><para>
2330 Record wildcards can also be used in expressions, writing, for example,
2331 <programlisting>
2332 let {a = 1; b = 2; c = 3; d = 4} in C {..}
2333 </programlisting>
2334 in place of
2335 <programlisting>
2336 let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}
2337 </programlisting>
2338 The expansion is purely syntactic, so the record wildcard
2339 expression refers to the nearest enclosing variables that are spelled
2340 the same as the omitted field names.
2341 </para></listitem>
2342
2343 <listitem><para>
2344 The "<literal>..</literal>" expands to the missing
2345 <emphasis>in-scope</emphasis> record fields.
2346 Specifically the expansion of "<literal>C {..}</literal>" includes
2347 <literal>f</literal> if and only if:
2348 <itemizedlist>
2349 <listitem><para>
2350 <literal>f</literal> is a record field of constructor <literal>C</literal>.
2351 </para></listitem>
2352 <listitem><para>
2353 The record field <literal>f</literal> is in scope somehow (either qualified or unqualified).
2354 </para></listitem>
2355 <listitem><para>
2356 In the case of expressions (but not patterns),
2357 the variable <literal>f</literal> is in scope unqualified,
2358 apart from the binding of the record selector itself.
2359 </para></listitem>
2360 </itemizedlist>
2361 For example
<programlisting>
module M where
  data R = R { a,b,c :: Int }
module X where
  import M( R(R,a,c) )
  f a b = R { .. }
</programlisting>
The <literal>R{..}</literal> expands to <literal>R{a=a}</literal>,
omitting <literal>b</literal> since the record field is not in scope,
and omitting <literal>c</literal> since the variable <literal>c</literal>
is not in scope (apart from the binding of the
record selector <literal>c</literal>, of course).
2374 </para></listitem>
2375 </itemizedlist>
2376 </para>
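<para>
A compact sketch showing a wildcard in a pattern and in an expression. The record
<literal>Config</literal> and its fields are invented for the example.
<programlisting>
{-# LANGUAGE RecordWildCards #-}

data Config = Config { host :: String, port :: Int, debug :: Bool }

-- In a pattern, {..} binds host, port and debug.
render :: Config -> String
render (Config {..}) =
  host ++ ":" ++ show port ++ (if debug then " [debug]" else "")

-- In an expression, {..} picks up the local variables of the same names.
defaultConfig :: Config
defaultConfig =
  let host  = "localhost"
      port  = 8080
      debug = False
  in Config {..}

main :: IO ()
main = putStrLn (render defaultConfig)   -- localhost:8080
</programlisting>
</para>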
2377
2378 </sect2>
2379
2380 <!-- ===================== Local fixity declarations =================== -->
2381
2382 <sect2 id="local-fixity-declarations">
2383 <title>Local Fixity Declarations
2384 </title>
2385
2386 <para>A careful reading of the Haskell 98 Report reveals that fixity
2387 declarations (<literal>infix</literal>, <literal>infixl</literal>, and
2388 <literal>infixr</literal>) are permitted to appear inside local bindings
such as those introduced by <literal>let</literal> and
2390 <literal>where</literal>. However, the Haskell Report does not specify
2391 the semantics of such bindings very precisely.
2392 </para>
2393
2394 <para>In GHC, a fixity declaration may accompany a local binding:
2395 <programlisting>
let f = ...
    infixr 3 `f`
in
    ...
2400 </programlisting>
2401 and the fixity declaration applies wherever the binding is in scope.
2402 For example, in a <literal>let</literal>, it applies in the right-hand
2403 sides of other <literal>let</literal>-bindings and the body of the
<literal>let</literal>. Or, in recursive <literal>do</literal>
2405 expressions (<xref linkend="recursive-do-notation"/>), the local fixity
2406 declarations of a <literal>let</literal> statement scope over other
2407 statements in the group, just as the bound name does.
2408 </para>
2409
2410 <para>
Moreover, a local fixity declaration <emphasis>must</emphasis> accompany a local binding of
that name: it is not possible to revise the fixity of a name bound
2413 elsewhere, as in
2414 <programlisting>
2415 let infixr 9 $ in ...
2416 </programlisting>
2417
2418 Because local fixity declarations are technically Haskell 98, no flag is
2419 necessary to enable them.
2420 </para>
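<para>
The effect of a local fixity declaration can be observed by evaluating the same
expression with and without it. In this sketch the operators
<literal>|+|</literal> and <literal>|*|</literal> are hypothetical local names.
<programlisting>
withFixity :: Int
withFixity =
  let infixl 6 |+|
      infixl 7 |*|
      a |+| b = a + b
      a |*| b = a * b
  in 1 |+| 2 |*| 3        -- parsed as 1 |+| (2 |*| 3)

withoutFixity :: Int
withoutFixity =
  let a |+| b = a + b
      a |*| b = a * b
  in 1 |+| 2 |*| 3        -- both default to infixl 9: (1 |+| 2) |*| 3

main :: IO ()
main = print (withFixity, withoutFixity)   -- (7,9)
</programlisting>
</para>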
2421 </sect2>
2422
2423 <sect2 id="package-imports">
2424 <title>Import and export extensions</title>
2425
2426 <sect3>
2427 <title>Hiding things the imported module doesn't export</title>
2428
2429 <para>
2430 Technically in Haskell 2010 this is illegal:
2431 <programlisting>
module A( f ) where
  f = True

module B where
  import A hiding( g )  -- A does not export g
  g = f
2438 </programlisting>
2439 The <literal>import A hiding( g )</literal> in module <literal>B</literal>
2440 is technically an error (<ulink url="http://www.haskell.org/onlinereport/haskell2010/haskellch5.html#x11-1020005.3.1">Haskell Report, 5.3.1</ulink>)
2441 because <literal>A</literal> does not export <literal>g</literal>.
2442 However GHC allows it, in the interests of supporting backward compatibility; for example, a newer version of
2443 <literal>A</literal> might export <literal>g</literal>, and you want <literal>B</literal> to work
2444 in either case.
2445 </para>
2446 <para>
2447 The warning <literal>-fwarn-dodgy-imports</literal>, which is off by default but included with <literal>-W</literal>,
2448 warns if you hide something that the imported module does not export.
2449 </para>
2450 </sect3>
2451
2452 <sect3>
2453 <title>Package-qualified imports</title>
2454
2455 <para>With the <option>-XPackageImports</option> flag, GHC allows
2456 import declarations to be qualified by the package name that the
2457 module is intended to be imported from. For example:</para>
2458
2459 <programlisting>
2460 import "network" Network.Socket
2461 </programlisting>
2462
2463 <para>would import the module <literal>Network.Socket</literal> from
2464 the package <literal>network</literal> (any version). This may
2465 be used to disambiguate an import when the same module is
2466 available from multiple packages, or is present in both the
2467 current package being built and an external package.</para>
2468
2469 <para>The special package name <literal>this</literal> can be used to
2470 refer to the current package being built.</para>
2471
<para>Note: you probably don't need to use this feature; it was
2473 added mainly so that we can build backwards-compatible versions of
2474 packages when APIs change. It can lead to fragile dependencies in
2475 the common case: modules occasionally move from one package to
2476 another, rendering any package-qualified imports broken.</para>
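<para>
A minimal sketch that needs no extra packages, assuming the
<literal>base</literal> package is visible to the compiler as usual:
<programlisting>
{-# LANGUAGE PackageImports #-}

-- Ask for Data.List specifically from the base package.
import "base" Data.List (sort)

sorted :: [Int]
sorted = sort [3, 1, 2]

main :: IO ()
main = print sorted   -- [1,2,3]
</programlisting>
</para>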
2477 </sect3>
2478
2479 <sect3 id="safe-imports-ext">
2480 <title>Safe imports</title>
2481
2482 <para>With the <option>-XSafe</option>, <option>-XTrustworthy</option>
2483 and <option>-XUnsafe</option> language flags, GHC extends
2484 the import declaration syntax to take an optional <literal>safe</literal>
2485 keyword after the <literal>import</literal> keyword. This feature
2486 is part of the Safe Haskell GHC extension. For example:</para>
2487
2488 <programlisting>
2489 import safe qualified Network.Socket as NS
2490 </programlisting>
2491
2492 <para>would import the module <literal>Network.Socket</literal>
with compilation only succeeding if <literal>Network.Socket</literal> can be
safely imported. For a description of when an import is
considered safe, see <xref linkend="safe-haskell"/>.</para>
2496
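<para>
A small sketch of the syntax, using a module from <literal>base</literal> rather
than <literal>Network.Socket</literal> so that no extra package is needed. It
assumes that <literal>Data.List</literal> can be safely imported in your
installation; the name <literal>sortedLetters</literal> is illustrative.
<programlisting>
{-# LANGUAGE Trustworthy #-}

-- The import is checked for safety even though this module is only
-- marked Trustworthy.
import safe Data.List (sort)

sortedLetters :: String
sortedLetters = sort "safe"

main :: IO ()
main = putStrLn sortedLetters
</programlisting>
</para>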
2497 </sect3>
2498
2499 <sect3 id="explicit-namespaces">
2500 <title>Explicit namespaces in import/export</title>
2501
2502 <para> In an import or export list, such as
2503 <programlisting>
2504 module M( f, (++) ) where ...
2505 import N( f, (++) )
2506 ...
2507 </programlisting>
2508 the entities <literal>f</literal> and <literal>(++)</literal> are <emphasis>values</emphasis>.
2509 However, with type operators (<xref linkend="type-operators"/>) it becomes possible
2510 to declare <literal>(++)</literal> as a <emphasis>type constructor</emphasis>. In that
2511 case, how would you export or import it?
2512 </para>
2513 <para>
2514 The <option>-XExplicitNamespaces</option> extension allows you to prefix the name of
2515 a type constructor in an import or export list with "<literal>type</literal>" to
2516 disambiguate this case, thus:
2517 <programlisting>
2518 module M( f, type (++) ) where ...
2519 import N( f, type (++) )
2520 ...
2521 module N( f, type (++) ) where
2522 data a ++ b = L a | R b
2523 </programlisting>
2524 The extension <option>-XExplicitNamespaces</option>
2525 is implied by <option>-XTypeOperators</option> and (for some reason) by <option>-XTypeFamilies</option>.
2526 </para>
2527 <para>
2528 In addition, with <option>-XPatternSynonyms</option> you can prefix the name of
2529 a data constructor in an import or export list with the keyword <literal>pattern</literal>,
2530 to allow the import or export of a data constructor without its parent type constructor
2531 (see <xref linkend="patsyn-impexp"/>).
2532 </para>
2533 </sect3>
2534
2535 </sect2>
2536
2537 <sect2 id="syntax-stolen">
2538 <title>Summary of stolen syntax</title>
2539
2540 <para>Turning on an option that enables special syntax
2541 <emphasis>might</emphasis> cause working Haskell 98 code to fail
2542 to compile, perhaps because it uses a variable name which has
2543 become a reserved word. This section lists the syntax that is
2544 "stolen" by language extensions.
2545 We use
2546 notation and nonterminal names from the Haskell 98 lexical syntax
2547 (see the Haskell 98 Report).
2548 We only list syntax changes here that might affect
2549 existing working programs (i.e. "stolen" syntax). Many of these
2550 extensions will also enable new context-free syntax, but in all
2551 cases programs written to use the new syntax would not be
2552 compilable without the option enabled.</para>
2553
2554 <para>There are two classes of special
2555 syntax:
2556
2557 <itemizedlist>
2558 <listitem>
2559 <para>New reserved words and symbols: character sequences
2560 which are no longer available for use as identifiers in the
2561 program.</para>
2562 </listitem>
2563 <listitem>
2564 <para>Other special syntax: sequences of characters that have
2565 a different meaning when this particular option is turned
2566 on.</para>
2567 </listitem>
2568 </itemizedlist>
2569
2570 The following syntax is stolen:
2571
2572 <variablelist>
2573 <varlistentry>
2574 <term>
2575 <literal>forall</literal>
2576 <indexterm><primary><literal>forall</literal></primary></indexterm>
2577 </term>
2578 <listitem><para>
2579 Stolen (in types) by: <option>-XExplicitForAll</option>, and hence by
2580 <option>-XScopedTypeVariables</option>,
2581 <option>-XLiberalTypeSynonyms</option>,
2582 <option>-XRankNTypes</option>,
2583 <option>-XExistentialQuantification</option>
2584 </para></listitem>
2585 </varlistentry>
2586
2587 <varlistentry>
2588 <term>
2589 <literal>mdo</literal>
2590 <indexterm><primary><literal>mdo</literal></primary></indexterm>
2591 </term>
2592 <listitem><para>
2593 Stolen by: <option>-XRecursiveDo</option>
2594 </para></listitem>
2595 </varlistentry>
2596
2597 <varlistentry>
2598 <term>
2599 <literal>foreign</literal>
2600 <indexterm><primary><literal>foreign</literal></primary></indexterm>
2601 </term>
2602 <listitem><para>
2603 Stolen by: <option>-XForeignFunctionInterface</option>
2604 </para></listitem>
2605 </varlistentry>
2606
2607 <varlistentry>
2608 <term>
2609 <literal>rec</literal>,
2610 <literal>proc</literal>, <literal>-&lt;</literal>,
2611 <literal>&gt;-</literal>, <literal>-&lt;&lt;</literal>,
2612 <literal>&gt;&gt;-</literal>, and <literal>(|</literal>,
2613 <literal>|)</literal> brackets
2614 <indexterm><primary><literal>proc</literal></primary></indexterm>
2615 </term>
2616 <listitem><para>
2617 Stolen by: <option>-XArrows</option>
2618 </para></listitem>
2619 </varlistentry>
2620
2621 <varlistentry>
2622 <term>
2623 <literal>?<replaceable>varid</replaceable></literal>
2624 <indexterm><primary>implicit parameters</primary></indexterm>
2625 </term>
2626 <listitem><para>
2627 Stolen by: <option>-XImplicitParams</option>
2628 </para></listitem>
2629 </varlistentry>
2630
2631 <varlistentry>
2632 <term>
2633 <literal>[|</literal>,
2634 <literal>[e|</literal>, <literal>[p|</literal>,
2635 <literal>[d|</literal>, <literal>[t|</literal>,
2636 <literal>$(</literal>,
2637 <literal>$$(</literal>,
2638 <literal>[||</literal>,
2639 <literal>[e||</literal>,
2640 <literal>$<replaceable>varid</replaceable></literal>,
2641 <literal>$$<replaceable>varid</replaceable></literal>
2642 <indexterm><primary>Template Haskell</primary></indexterm>
2643 </term>
2644 <listitem><para>
2645 Stolen by: <option>-XTemplateHaskell</option>
2646 </para></listitem>
2647 </varlistentry>
2648
2649 <varlistentry>
2650 <term>
2651 <literal>[<replaceable>varid</replaceable>|</literal>
2652 <indexterm><primary>quasi-quotation</primary></indexterm>
2653 </term>
2654 <listitem><para>
2655 Stolen by: <option>-XQuasiQuotes</option>
2656 </para></listitem>
2657 </varlistentry>
2658
2659 <varlistentry>
2660 <term>
2661 <replaceable>varid</replaceable>{<literal>&num;</literal>},
2662 <replaceable>char</replaceable><literal>&num;</literal>,
2663 <replaceable>string</replaceable><literal>&num;</literal>,
2664 <replaceable>integer</replaceable><literal>&num;</literal>,
2665 <replaceable>float</replaceable><literal>&num;</literal>,
2666 <replaceable>float</replaceable><literal>&num;&num;</literal>
2667 </term>
2668 <listitem><para>
2669 Stolen by: <option>-XMagicHash</option>
2670 </para></listitem>
2671 </varlistentry>
2672
2673 <varlistentry>
2674 <term>
2675 <literal>(&num;</literal>, <literal>&num;)</literal>
2676 </term>
2677 <listitem><para>
2678 Stolen by: <option>-XUnboxedTuples</option>
2679 </para></listitem>
2680 </varlistentry>
2681
2682 <varlistentry>
2683 <term>
2684 <replaceable>varid</replaceable><literal>!</literal><replaceable>varid</replaceable>
2685 </term>
2686 <listitem><para>
2687 Stolen by: <option>-XBangPatterns</option>
2688 </para></listitem>
2689 </varlistentry>
2690
2691 <varlistentry>
2692 <term>
2693 <literal>pattern</literal>
2694 </term>
2695 <listitem><para>
2696 Stolen by: <option>-XPatternSynonyms</option>
2697 </para></listitem>
2698 </varlistentry>
2699 </variablelist>
2700 </para>
2701 </sect2>
2702 </sect1>
2703
2704
2705 <!-- TYPE SYSTEM EXTENSIONS -->
2706 <sect1 id="data-type-extensions">
2707 <title>Extensions to data types and type synonyms</title>
2708
2709 <sect2 id="nullary-types">
2710 <title>Data types with no constructors</title>
2711
2712 <para>With the <option>-XEmptyDataDecls</option> flag (or equivalent LANGUAGE pragma),
2713 GHC lets you declare a data type with no constructors. For example:</para>
2714
2715 <programlisting>
2716 data S -- S :: *
2717 data T a -- T :: * -> *
2718 </programlisting>
2719
2720 <para>Syntactically, the declaration lacks the "= constrs" part. The
2721 type can be parameterised over types of any kind, but if the kind is
2722 not <literal>*</literal> then an explicit kind annotation must be used
2723 (see <xref linkend="kinding"/>).</para>
2724
2725 <para>Such data types have only one value, namely bottom.
2726 Nevertheless, they can be useful when defining "phantom types".</para>
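<para>As a hedged illustration of the phantom-type use mentioned above, the
following sketch tags a numeric wrapper with empty data types. The names
<literal>Metres</literal>, <literal>Feet</literal>, <literal>Distance</literal>
and <literal>addDistance</literal> are invented for this example and are not
part of any library.</para>

<programlisting>
{-# LANGUAGE EmptyDataDecls #-}
module Main where

-- Empty data types: no constructors, used only as type-level tags.
data Metres
data Feet

-- The parameter 'unit' is a phantom: it does not occur on the right-hand side.
newtype Distance unit = Distance Double
  deriving Show

-- Only distances carrying the same tag can be added.
addDistance :: Distance unit -> Distance unit -> Distance unit
addDistance (Distance x) (Distance y) = Distance (x + y)

main :: IO ()
main = do
  let a = Distance 1.5 :: Distance Metres
      b = Distance 2.0 :: Distance Metres
  print (addDistance a b)
  -- addDistance a (Distance 3 :: Distance Feet) would be a type error.
</programlisting>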
2727 </sect2>
2728
2729 <sect2 id="datatype-contexts">
2730 <title>Data type contexts</title>
2731
2732 <para>Haskell allows datatypes to be given contexts. For example, the declaration</para>
2733
2734 <programlisting>
2735 data Eq a => Set a = NilSet | ConsSet a (Set a)
2736 </programlisting>
2737
2738 <para>gives the constructors the following types:</para>
2739
2740 <programlisting>
2741 NilSet :: Set a
2742 ConsSet :: Eq a => a -> Set a -> Set a
2743 </programlisting>
2744
2745 <para>This is widely considered a misfeature, and is going to be removed from
2746 the language. In GHC, it is controlled by the deprecated extension
2747 <literal>DatatypeContexts</literal>.</para>
2748 </sect2>
2749
2750 <sect2 id="infix-tycons">
2751 <title>Infix type constructors, classes, and type variables</title>
2752
2753 <para>
2754 GHC allows type constructors, classes, and type variables to be operators, and
2755 to be written infix, very much like expressions. More specifically:
2756 <itemizedlist>
2757 <listitem><para>
2758 A type constructor or class can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
2759 The lexical syntax is the same as that for data constructors.
2760 </para></listitem>
2761 <listitem><para>
2762 Data type and type-synonym declarations can be written infix, parenthesised
2763 if you want further arguments. E.g.
2764 <screen>
2765 data a :*: b = Foo a b
2766 type a :+: b = Either a b
2767 class a :=: b where ...
2768
2769 data (a :**: b) x = Baz a b x
2770 type (a :++: b) y = Either (a,b) y
2771 </screen>
2772 </para></listitem>
2773 <listitem><para>
2774 Types, and class constraints, can be written infix. For example
2775 <screen>
2776 x :: Int :*: Bool
2777 f :: (a :=: b) => a -> b
2778 </screen>
2779 </para></listitem>
2780 <listitem><para>
2781 Back-quotes work
2782 as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or
2783 <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>.
2784 </para></listitem>
2785 <listitem><para>
2786 Fixities may be declared for type constructors, or classes, just as for data constructors. However,
2787 one cannot distinguish between the two in a fixity declaration; a fixity declaration
2788 sets the fixity for a data constructor and the corresponding type constructor. For example:
2789 <screen>
2790 infixl 7 T, :*:
2791 </screen>
2792 sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>,
2793 and similarly for <literal>:*:</literal>.
2795 </para></listitem>
2796 <listitem><para>
2797 Function arrow is <literal>infixr</literal> with fixity 0. (This might change; I'm not sure what it should be.)
2798 </para></listitem>
2799
2800 </itemizedlist>
2801 </para>
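<para>The points above can be combined in one small module. This is only a
sketch: the names <literal>swapPair</literal> and <literal>describe</literal>
are invented for illustration, and the <literal>TypeOperators</literal> pragma
is included on the assumption that the compiler in use requires it for infix
type constructors.</para>

<programlisting>
{-# LANGUAGE TypeOperators #-}
module Main where

-- One fixity declaration covers both the type constructor and the
-- data constructor named :*:.
infixr 6 :*:
data a :*: b = a :*: b
  deriving Show

-- An infix type synonym.
type a :+: b = Either a b

-- Types written infix in a signature.
swapPair :: a :*: b -> b :*: a
swapPair (x :*: y) = y :*: x

describe :: Int :+: Bool -> String
describe (Left n)  = "number " ++ show n
describe (Right b) = "flag " ++ show b

main :: IO ()
main = do
  print (swapPair ((1 :: Int) :*: True))
  putStrLn (describe (Left 42))
</programlisting>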
2802 </sect2>
2803
2804 <sect2 id="type-operators">
2805 <title>Type operators</title>
2806 <para>
2807 In types, an operator symbol like <literal>(+)</literal> is normally treated as a type
2808 <emphasis>variable</emphasis>, just like <literal>a</literal>. Thus in Haskell 98 you can say
2809 <programlisting>
2810 type T (+) = ((+), (+))
2811 -- Just like: type T a = (a,a)
2812
2813 f :: T Int -> Int
2814 f (x,y)= x
2815 </programlisting>
2816 As you can see, using operators in this way is not very useful, and Haskell 98 does not even
2817 allow you to write them infix.
2818 </para>
2819 <para>
2820 The language extension <option>-XTypeOperators</option> changes this behaviour:
2821 <itemizedlist>
2822 <listitem><para>
2823 Operator symbols become type <emphasis>constructors</emphasis> rather than
2824 type <emphasis>variables</emphasis>.
2825 </para></listitem>
2826 <listitem><para>
2827 Operator symbols in types can be written infix, both in definitions and uses.
2828 For example:
2829 <programlisting>
2830 data a + b = Plus a b
2831 type Foo = Int + Bool
2832 </programlisting>
2833 </para></listitem>
2834 <listitem><para>
2835 There is now some potential ambiguity in import and export lists; for example
2836 if you write <literal>import M( (+) )</literal> do you mean the
2837 <emphasis>function</emphasis> <literal>(+)</literal> or the
2838 <emphasis>type constructor</emphasis> <literal>(+)</literal>?
2839 The default is the former, but with <option>-XExplicitNamespaces</option> (which is implied
2840 by <option>-XTypeOperators</option>) GHC allows you to specify the latter
2841 by preceding it with the keyword <literal>type</literal>, thus:
2842 <programlisting>
2843 import M( type (+) )
2844 </programlisting>
2845 See <xref linkend="explicit-namespaces"/>.
2846 </para></listitem>
2847 <listitem><para>
2848 The fixity of a type operator may be set using the usual fixity declarations
2849 but, as in <xref linkend="infix-tycons"/>, the function and type constructor share
2850 a single fixity.
2851 </para></listitem>
2852 </itemizedlist>
2853 </para>
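<para>A minimal, self-contained sketch of the behaviour described above. The
function <literal>mkFoo</literal> is a hypothetical helper added only so that
the module has something to run.</para>

<programlisting>
{-# LANGUAGE TypeOperators #-}
module Main where

-- With TypeOperators, (+) in a type is a type constructor, not a type variable.
data a + b = Plus a b
  deriving Show

type Foo = Int + Bool

-- Pair a number with a flag saying whether it is even.
mkFoo :: Int -> Foo
mkFoo n = Plus n (even n)

main :: IO ()
main = print (mkFoo 4)   -- prints: Plus 4 True
</programlisting>

<para>The value-level <literal>(+)</literal> from the Prelude is unaffected,
because types and values live in separate namespaces.</para>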
2854 </sect2>
2855
2856 <sect2 id="type-synonyms">
2857 <title>Liberalised type synonyms</title>
2858
2859 <para>
2860 Type synonyms are like macros at the type level, but Haskell 98 imposes many rules
2861 on individual synonym declarations.
2862 With the <option>-XLiberalTypeSynonyms</option> extension,
2863 GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>.
2864 That means that GHC can be very much more liberal about type synonyms than Haskell 98.
2865
2866 <itemizedlist>
2867 <listitem> <para>You can write a <literal>forall</literal> (including overloading)
2868 in a type synonym, thus:
2869 <programlisting>
2870 type Discard a = forall b. Show b => a -> b -> (a, String)
2871
2872 f :: Discard a
2873 f x y = (x, show y)
2874
2875 g :: Discard Int -> (Int,String) -- A rank-2 type
2876 g f = f 3 True
2877 </programlisting>
2878 </para>
2879 </listitem>
2880
2881 <listitem><para>
2882 If you also use <option>-XUnboxedTuples</option>,
2883 you can write an unboxed tuple in a type synonym:
2884 <programlisting>
2885 type Pr = (# Int, Int #)
2886
2887 h :: Int -> Pr
2888 h x = (# x, x #)
2889 </programlisting>
2890 </para></listitem>
2891
2892 <listitem><para>
2893 You can apply a type synonym to a forall type:
2894 <programlisting>
2895 type Foo a = a -> a -> Bool
2896
2897 f :: Foo (forall b. b->b)
2898 </programlisting>
2899 After expanding the synonym, <literal>f</literal> has the legal (in GHC) type:
2900 <programlisting>
2901 f :: (forall b. b->b) -> (forall b. b->b) -> Bool
2902 </programlisting>
2903 </para></listitem>
2904
2905 <listitem><para>
2906 You can apply a type synonym to a partially applied type synonym:
2907 <programlisting>
2908 type Generic i o = forall x. i x -> o x
2909 type Id x = x
2910
2911 foo :: Generic Id []
2912 </programlisting>
2913 After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
2914 <programlisting>
2915 foo :: forall x. x -> [x]
2916 </programlisting>
2917 </para></listitem>
2918
2919 </itemizedlist>
2920 </para>
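<para>The fragments above can be assembled into a module that compiles as a
whole. This is a sketch under the assumption that
<literal>RankNTypes</literal> is enabled alongside
<literal>LiberalTypeSynonyms</literal> so that <literal>forall</literal> is
accepted; the names <literal>singleton</literal> and
<literal>keepFirst</literal> are invented for the example.</para>

<programlisting>
{-# LANGUAGE RankNTypes, LiberalTypeSynonyms #-}
module Main where

type Generic i o = forall x. i x -> o x
type Id x = x

-- Id is partially applied here; after expansion the signature reads
--   singleton :: forall x. x -> [x]
singleton :: Generic Id []
singleton x = [x]

-- A forall (with a class constraint) inside a type synonym.
type Discard a = forall b. Show b => a -> b -> (a, String)

keepFirst :: Discard a
keepFirst x y = (x, show y)

main :: IO ()
main = do
  print (singleton 'a')
  print (keepFirst (1 :: Int) True)
</programlisting>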
2921
2922 <para>
2923 GHC currently does kind checking before expanding synonyms (though even that
2924 could be changed).
2925 </para>
2926 <para>
2927 After expanding type synonyms, GHC does validity checking on types, looking for
2928 the following mal-formedness which isn't detected simply by kind checking:
2929 <itemizedlist>
2930 <listitem><para>
2931 Type constructor applied to a type involving for-alls (if <option>-XImpredicativeTypes</option>
2932 is off)
2933 </para></listitem>
2934 <listitem><para>
2935 Partially-applied type synonym.
2936 </para></listitem>
2937 </itemizedlist>
2938 So, for example, this will be rejected:
2939 <programlisting>
2940 type Pr = forall a. a
2941
2942 h :: [Pr]
2943 h = ...
2944 </programlisting>
2945 because GHC does not allow type constructors applied to for-all types.
2946 </para>
2947 </sect2>
2948
2949
2950 <sect2 id="existential-quantification">
2951 <title>Existentially quantified data constructors
2952 </title>
2953
2954 <para>
2955 The idea of using existential quantification in data type declarations
2956 was suggested by Perry, and implemented in Hope+ (Nigel Perry, <emphasis>The Implementation
2957 of Practical Functional Programming Languages</emphasis>, PhD Thesis, University of
2958 London, 1991). It was later formalised by Laufer and Odersky
2959 (<emphasis>Polymorphic type inference and abstract data types</emphasis>,
2960 TOPLAS, 16(5), pp1411-1430, 1994).
2961 It's been in Lennart
2962 Augustsson's <command>hbc</command> Haskell compiler for several years, and
2963 proved very useful. Here's the idea. Consider the declaration:
2964 </para>
2965
2966 <para>
2967
2968 <programlisting>
2969 data Foo = forall a. MkFoo a (a -> Bool)
2970 | Nil
2971 </programlisting>
2972
2973 </para>
2974
2975 <para>
2976 The data type <literal>Foo</literal> has two constructors with types:
2977 </para>
2978
2979 <para>
2980
2981 <programlisting>
2982 MkFoo :: forall a. a -> (a -> Bool) -> Foo
2983 Nil :: Foo
2984 </programlisting>
2985
2986 </para>
2987
2988 <para>
2989 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
2990 does not appear in the data type itself, which is plain <literal>Foo</literal>.
2991 For example, the following expression is fine:
2992 </para>
2993
2994 <para>
2995
2996 <programlisting>
2997 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
2998 </programlisting>
2999
3000 </para>
3001
3002 <para>
3003 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
3004 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
3005 isUpper</function> packages a character with a compatible function. These
3006 two things are each of type <literal>Foo</literal> and can be put in a list.
3007 </para>
3008
3009 <para>
3010 What can we do with a value of type <literal>Foo</literal>? In particular,
3011 what happens when we pattern-match on <function>MkFoo</function>?
3012 </para>
3013
3014 <para>
3015
3016 <programlisting>
3017 f (MkFoo val fn) = ???
3018 </programlisting>
3019
3020 </para>
3021
3022 <para>
3023 Since all we know about <literal>val</literal> and <function>fn</function> is that they
3024 are compatible, the only (useful) thing we can do with them is to
3025 apply <function>fn</function> to <literal>val</literal> to get a boolean. For example:
3026 </para>
3027
3028 <para>
3029
3030 <programlisting>
3031 f :: Foo -> Bool
3032 f (MkFoo val fn) = fn val
3033 </programlisting>
3034
3035 </para>
3036
3037 <para>
3038 What this allows us to do is to package heterogeneous values
3039 together with a bunch of functions that manipulate them, and then treat
3040 that collection of packages in a uniform manner. You can express
3041 quite a bit of object-oriented-like programming this way.
3042 </para>
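<para>Putting the pieces of this section together gives the following runnable
sketch. The name <literal>applyFoo</literal> and the decision to map
<literal>Nil</literal> to <literal>False</literal> are assumptions made for
the example.</para>

<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

-- The only useful thing to do with the packaged value is to apply the
-- packaged function to it.
applyFoo :: Foo -> Bool
applyFoo (MkFoo val fn) = fn val
applyFoo Nil            = False

-- A list whose elements hide different types.
foos :: [Foo]
foos = [MkFoo (4 :: Int) even, MkFoo 'c' isUpper, Nil]

main :: IO ()
main = print (map applyFoo foos)   -- prints: [True,False,False]
</programlisting>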
3043
3044 <sect3 id="existential">
3045 <title>Why existential?
3046 </title>
3047
3048 <para>
3049 What has this to do with <emphasis>existential</emphasis> quantification?
3050 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
3051 </para>
3052
3053 <para>
3054
3055 <programlisting>
3056 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
3057 </programlisting>
3058
3059 </para>
3060
3061 <para>
3062 But Haskell programmers can safely think of the ordinary
3063 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
3064 adding a new existential quantification construct.
3065 </para>
3066
3067 </sect3>
3068
3069 <sect3 id="existential-with-context">
3070 <title>Existentials and type classes</title>
3071
3072 <para>
3073 An easy extension is to allow
3074 arbitrary contexts before the constructor. For example:
3075 </para>
3076
3077 <para>
3078
3079 <programlisting>
3080 data Baz = forall a. Eq a => Baz1 a a
3081 | forall b. Show b => Baz2 b (b -> b)
3082 </programlisting>
3083
3084 </para>
3085
3086 <para>
3087 The two constructors have the types you'd expect:
3088 </para>
3089
3090 <para>
3091
3092 <programlisting>
3093 Baz1 :: forall a. Eq a => a -> a -> Baz
3094 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
3095 </programlisting>
3096
3097 </para>
3098
3099 <para>
3100 But when pattern matching on <function>Baz1</function> the matched values can be compared
3101 for equality, and when pattern matching on <function>Baz2</function> the first matched
3102 value can be converted to a string (as well as applying the function to it).
3103 So this program is legal:
3104 </para>
3105
3106 <para>
3107
3108 <programlisting>
3109 f :: Baz -> String
3110 f (Baz1 p q) | p == q = "Yes"
3111 | otherwise = "No"
3112 f (Baz2 v fn) = show (fn v)
3113 </programlisting>
3114
3115 </para>
3116
3117 <para>
3118 Operationally, in a dictionary-passing implementation, the
3119 constructors <function>Baz1</function> and <function>Baz2</function> must store the
3120 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
3121 extract them on pattern matching.
3122 </para>
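<para>A common use of a class context on an existential constructor is a list
of values that share only a class. The sketch below assumes a hypothetical
wrapper type <literal>Showable</literal>; the <literal>Show</literal>
dictionary stored by the constructor is what makes <literal>show x</literal>
well typed after the pattern match.</para>

<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

-- The constructor stores a value together with its Show dictionary.
data Showable = forall a. Show a => MkShowable a

-- Pattern matching makes the stored dictionary available.
instance Show Showable where
  show (MkShowable x) = show x

items :: [Showable]
items = [MkShowable (1 :: Int), MkShowable "two", MkShowable True]

main :: IO ()
main = print items   -- prints: [1,"two",True]
</programlisting>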
3123
3124 </sect3>
3125
3126 <sect3 id="existential-records">
3127 <title>Record Constructors</title>
3128
3129 <para>
3130 GHC allows existentials to be used with record syntax as well. For example:
3131
3132 <programlisting>
3133 data Counter a = forall self. NewCounter
3134 { _this :: self
3135 , _inc :: self -> self
3136 , _display :: self -> IO ()
3137 , tag :: a
3138 }
3139 </programlisting>
3140 Here <literal>tag</literal> is a public field, with a well-typed selector
3141 function <literal>tag :: Counter a -> a</literal>. The <literal>self</literal>
3142 type is hidden from the outside; any attempt to apply <literal>_this</literal>,
3143 <literal>_inc</literal> or <literal>_display</literal> as functions will raise a
3144 compile-time error. In other words, <emphasis>GHC defines a record selector function
3145 only for fields whose type does not mention the existentially-quantified variables</emphasis>.
3146 (This example used an underscore in the fields for which record selectors
3147 will not be defined, but that is only programming style; GHC ignores them.)
3148 </para>
3149
3150 <para>
3151 To make use of these hidden fields, we need to create some helper functions:
3152
3153 <programlisting>
3154 inc :: Counter a -> Counter a
3155 inc (NewCounter x i d t) = NewCounter
3156 { _this = i x, _inc = i, _display = d, tag = t }
3157
3158 display :: Counter a -> IO ()
3159 display NewCounter{ _this = x, _display = d } = d x
3160 </programlisting>
3161
3162 Now we can define counters with different underlying implementations:
3163
3164 <programlisting>
3165 counterA :: Counter String
3166 counterA = NewCounter
3167 { _this = 0, _inc = (1+), _display = print, tag = "A" }
3168
3169 counterB :: Counter String
3170 counterB = NewCounter
3171 { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }
3172
3173 main = do
3174 display (inc counterA) -- prints "1"
3175 display (inc (inc counterB)) -- prints "##"
3176 </programlisting>
3177
3178 Record update syntax is supported for existentials (and GADTs):
3179 <programlisting>
3180 setTag :: Counter a -> a -> Counter a
3181 setTag obj t = obj{ tag = t }
3182 </programlisting>
3183 The rule for record update is this: <emphasis>
3184 the types of the updated fields may
3185 mention only the universally-quantified type variables
3186 of the data constructor. For GADTs, the field may mention only types
3187 that appear as a simple type-variable argument in the constructor's result
3188 type</emphasis>. For example:
3189 <programlisting>
3190 data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential
3191 upd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' b
3192 upd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is
3193 -- existentially quantified)
3194
3195 data G a b where { G1 { g1::a, g2::c } :: G a [c] }
3196 upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c b
3197 upd4 g x = g { g2=x } -- BAD (g2's type mentions c, which is not a simple
3198 -- type-variable argument in G1's result type)
3199 </programlisting>
3200 </para>
3201
3202 </sect3>
3203
3204
3205 <sect3>
3206 <title>Restrictions</title>
3207
3208 <para>
3209 There are several restrictions on the ways in which existentially-quantified
3210 constructors can be used.
3211 </para>
3212
3213 <para>
3214
3215 <itemizedlist>
3216 <listitem>
3217
3218 <para>
3219 When pattern matching, each pattern match introduces a new,
3220 distinct, type for each existential type variable. These types cannot
3221 be unified with any other type, nor can they escape from the scope of
3222 the pattern match. For example, these fragments are incorrect:
3223
3224
3225 <programlisting>
3226 f1 (MkFoo a f) = a
3227 </programlisting>
3228
3229
3230 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
3231 is the result of <function>f1</function>. One way to see why this is wrong is to
3232 ask what type <function>f1</function> has:
3233
3234
3235 <programlisting>
3236 f1 :: Foo -> a -- Weird!
3237 </programlisting>
3238
3239
3240 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
3241 this:
3242
3243
3244 <programlisting>
3245 f1 :: forall a. Foo -> a -- Wrong!
3246 </programlisting>
3247
3248
3249 The original program is just plain wrong. Here's another sort of error:
3250
3251
3252 <programlisting>
3253 f2 (Baz1 a b) (Baz1 p q) = a==q
3254 </programlisting>
3255
3256
3257 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
3258 <literal>a==q</literal> is wrong because it equates the two distinct types arising
3259 from the two <function>Baz1</function> constructors.
3260
3261
3262 </para>
3263 </listitem>
3264 <listitem>
3265
3266 <para>
3267 You can't pattern-match on an existentially quantified
3268 constructor in a <literal>let</literal> or <literal>where</literal> group of
3269 bindings. So this is illegal:
3270
3271
3272 <programlisting>
3273 f3 x = a==b where { Baz1 a b = x }
3274 </programlisting>
3275
3276 Instead, use a <literal>case</literal> expression:
3277
3278 <programlisting>
3279 f3 x = case x of Baz1 a b -> a==b
3280 </programlisting>
3281
3282 In general, you can only pattern-match
3283 on an existentially-quantified constructor in a <literal>case</literal> expression or
3284 in the patterns of a function definition.
3285
3286 The reason for this restriction is really an implementation one.
3287 Type-checking binding groups is already a nightmare without
3288 existentials complicating the picture. Also an existential pattern
3289 binding at the top level of a module doesn't make sense, because it's
3290 not clear how to prevent the existentially-quantified type "escaping".
3291 So for now, there's a simple-to-state restriction. We'll see how
3292 annoying it is.
3293
3294 </para>
3295 </listitem>
3296 <listitem>
3297
3298 <para>
3299 You can't use existential quantification for <literal>newtype</literal>
3300 declarations. So this is illegal:
3301
3302
3303 <programlisting>
3304 newtype T = forall a. Ord a => MkT a
3305 </programlisting>
3306
3307
3308 Reason: a value of type <literal>T</literal> must be represented as a
3309 pair of a dictionary for <literal>Ord a</literal> and a value of type
3310 <literal>a</literal>. That contradicts the idea that
3311 <literal>newtype</literal> should have no concrete representation.
3312 You can get just the same efficiency and effect by using
3313 <literal>data</literal> instead of <literal>newtype</literal>. If
3314 there is no overloading involved, then there is more of a case for
3315 allowing an existentially-quantified <literal>newtype</literal>,
3316 because the <literal>data</literal> version does carry an
3317 implementation cost, but single-field existentially quantified
3318 constructors aren't much use. So the simple restriction (no
3319 existential stuff on <literal>newtype</literal>) stands, unless there
3320 are convincing reasons to change it.
3321
3322
3323 </para>
3324 </listitem>
3325 <listitem>
3326
3327 <para>
3328 You can't use <literal>deriving</literal> to define instances of a
3329 data type with existentially quantified data constructors.
3330
3331 Reason: in most cases it would not make sense. For example:
3332
3333 <programlisting>
3334 data T = forall a. MkT [a] deriving( Eq )
3335 </programlisting>
3336
3337 To derive <literal>Eq</literal> in the standard way we would need to have equality
3338 between the single component of two <function>MkT</function> constructors:
3339
3340 <programlisting>
3341 instance Eq T where
3342 (MkT a) == (MkT b) = ???
3343 </programlisting>
3344
3345 But <varname>a</varname> and <varname>b</varname> have distinct types, and so can't be compared.
3346 It's just about possible to imagine examples in which the derived instance
3347 would make sense, but it seems altogether simpler simply to prohibit such
3348 declarations. Define your own instances!
3349 </para>
3350 </listitem>
3351
3352 </itemizedlist>
3353
3354 </para>
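<para>Since <literal>deriving</literal> is not available, an instance has to
be written by hand. One possible approach, sketched here as an assumption
rather than a recommendation, is to store a <literal>Typeable</literal>
dictionary next to the <literal>Eq</literal> dictionary and use
<literal>cast</literal> from <literal>Data.Typeable</literal> to check whether
the two hidden types coincide. The type <literal>AnyEq</literal> is
hypothetical.</para>

<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Data.Typeable (Typeable, cast)

-- Each value carries Eq and Typeable dictionaries for its hidden type.
data AnyEq = forall a. (Eq a, Typeable a) => MkAnyEq a

-- Two packaged values are equal only if their hidden types are the same
-- and the values compare equal at that type.
instance Eq AnyEq where
  MkAnyEq x == MkAnyEq y =
    case cast y of
      Just y' -> x == y'
      Nothing -> False

main :: IO ()
main = do
  print (MkAnyEq (1 :: Int) == MkAnyEq (1 :: Int))   -- prints: True
  print (MkAnyEq (1 :: Int) == MkAnyEq 'a')          -- prints: False
</programlisting>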
3355
3356 </sect3>
3357 </sect2>
3358
3359 <!-- ====================== Generalised algebraic data types ======================= -->
3360
3361 <sect2 id="gadt-style">
3362 <title>Declaring data types with explicit constructor signatures</title>
3363
3364 <para>When the <literal>GADTSyntax</literal> extension is enabled,
3365 GHC allows you to declare an algebraic data type by
3366 giving the type signatures of constructors explicitly. For example:
3367 <programlisting>
3368 data Maybe a where
3369 Nothing :: Maybe a
3370 Just :: a -> Maybe a
3371 </programlisting>
3372 The form is called a "GADT-style declaration"
3373 because Generalised Algebraic Data Types, described in <xref linkend="gadt"/>,
3374 can only be declared using this form.</para>
3375 <para>Notice that GADT-style syntax generalises existential types (<xref linkend="existential-quantification"/>).
3376 For example, these two declarations are equivalent:
3377 <programlisting>
3378 data Foo = forall a. MkFoo a (a -> Bool)
3379 data Foo' where { MkFoo' :: a -> (a->Bool) -> Foo' }
3380 </programlisting>
3381 </para>
3382 <para>Any data type that can be declared in standard Haskell-98 syntax
3383 can also be declared using GADT-style syntax.
3384 The choice is largely stylistic, but GADT-style declarations differ in one important respect:
3385 they treat class constraints on the data constructors differently.
3386 Specifically, if the constructor is given a type-class context, that
3387 context is made available by pattern matching. For example:
3388 <programlisting>
3389 data Set a where
3390 MkSet :: Eq a => [a] -> Set a
3391
3392 makeSet :: Eq a => [a] -> Set a
3393 makeSet xs = MkSet (nub xs)
3394
3395 insert :: a -> Set a -> Set a
3396 insert a (MkSet as) | a `elem` as = MkSet as
3397 | otherwise = MkSet (a:as)
3398 </programlisting>
3399 A use of <literal>MkSet</literal> as a constructor (e.g. in the definition of <literal>makeSet</literal>)
3400 gives rise to a <literal>(Eq a)</literal>
3401 constraint, as you would expect. The new feature is that pattern-matching on <literal>MkSet</literal>
3402 (as in the definition of <literal>insert</literal>) makes <emphasis>available</emphasis> an <literal>(Eq a)</literal>
3403 context. In implementation terms, the <literal>MkSet</literal> constructor has a hidden field that stores
3404 the <literal>(Eq a)</literal> dictionary that is passed to <literal>MkSet</literal>; so
3405 when pattern-matching that dictionary becomes available for the right-hand side of the match.
3406 In the example, the equality dictionary is used to satisfy the equality constraint
3407 generated by the call to <literal>elem</literal>, so that the type of
3408 <literal>insert</literal> itself has no <literal>Eq</literal> constraint.
3409 </para>
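<para>For reference, here is the <literal>Set</literal> example as a complete
module. It is a sketch: the <literal>GADTs</literal> pragma is an assumption
about which extension is used to accept the constructor context, and
<literal>size</literal> is a helper added for the demonstration.</para>

<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

import Data.List (nub)

data Set a where
  MkSet :: Eq a => [a] -> Set a

-- Constructing a set requires an Eq dictionary.
makeSet :: Eq a => [a] -> Set a
makeSet xs = MkSet (nub xs)

-- No Eq constraint here: matching on MkSet supplies it.
insert :: a -> Set a -> Set a
insert a (MkSet as)
  | a `elem` as = MkSet as
  | otherwise   = MkSet (a : as)

size :: Set a -> Int
size (MkSet as) = length as

main :: IO ()
main = print (size (insert 3 (insert 1 (makeSet [1, 2, 2 :: Int]))))   -- prints: 3
</programlisting>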
3410 <para>
3411 For example, one possible application is to reify dictionaries:
3412 <programlisting>
3413 data NumInst a where
3414 MkNumInst :: Num a => NumInst a
3415
3416 intInst :: NumInst Int
3417 intInst = MkNumInst
3418
3419 plus :: NumInst a -> a -> a -> a
3420 plus MkNumInst p q = p + q
3421 </programlisting>
3422 Here, a value of type <literal>NumInst a</literal> is equivalent
3423 to an explicit <literal>(Num a)</literal> dictionary.
3424 </para>
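<para>A short usage sketch of the reified dictionary follows. The function
<literal>sumWith</literal> is hypothetical; it shows that a function taking a
<literal>NumInst a</literal> argument needs no <literal>Num a</literal>
constraint of its own.</para>

<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

data NumInst a where
  MkNumInst :: Num a => NumInst a

intInst :: NumInst Int
intInst = MkNumInst

-- Matching on MkNumInst brings the Num a dictionary into scope,
-- so (+) and the literal 0 can be used at type a.
sumWith :: NumInst a -> [a] -> a
sumWith MkNumInst xs = foldr (+) 0 xs

main :: IO ()
main = print (sumWith intInst [1, 2, 3])   -- prints: 6
</programlisting>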
3425 <para>
3426 All this applies to constructors declared using the syntax of <xref linkend="existential-with-context"/>.
3427 For example, the <literal>NumInst</literal> data type above could equivalently be declared
3428 like this:
3429 <programlisting>
3430 data NumInst a
3431 = Num a => MkNumInst
3432 </programlisting>
3433 Notice that, unlike the situation when declaring an existential, there is
3434 no <literal>forall</literal>, because the <literal>Num</literal> constrains the
3435 data type's universally quantified type variable <literal>a</literal>.
3436 A constructor may have both universal and existential type variables: for example,
3437 the following two declarations are equivalent:
3438 <programlisting>
data T1 a
  = forall b. (Num a, Eq b) => MkT1 a b
data T2 a where
  MkT2 :: (Num a, Eq b) => a -> b -> T2 a
3443 </programlisting>
3444 </para>
3445 <para>All this behaviour contrasts with Haskell 98's peculiar treatment of
3446 contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report).
3447 In Haskell 98 the definition
3448 <programlisting>
3449 data Eq a => Set' a = MkSet' [a]
3450 </programlisting>
3451 gives <literal>MkSet'</literal> the same type as <literal>MkSet</literal> above. But instead of
3452 <emphasis>making available</emphasis> an <literal>(Eq a)</literal> constraint, pattern-matching
3453 on <literal>MkSet'</literal> <emphasis>requires</emphasis> an <literal>(Eq a)</literal> constraint!
3454 GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,
3455 GHC's behaviour is much more useful, as well as much more intuitive.
3456 </para>
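<para>
The contrast can be sketched side by side. This is only an illustration: it
assumes a GHC in which Haskell 98 data type contexts must be switched on with
<option>-XDatatypeContexts</option>, and the functions <literal>elems</literal>
and <literal>elems'</literal> are hypothetical names introduced here.
<programlisting>
{-# LANGUAGE GADTs, DatatypeContexts #-}
module Main where

-- GADT-style: matching on MkSet makes (Eq a) available.
data Set a where
  MkSet :: Eq a => [a] -> Set a

-- Haskell 98 style: matching on MkSet' requires (Eq a).
data Eq a => Set' a = MkSet' [a]

insert :: a -> Set a -> Set a
insert x (MkSet xs)
  | x `elem` xs = MkSet xs
  | otherwise   = MkSet (x : xs)

insert' :: Eq a => a -> Set' a -> Set' a
insert' x (MkSet' xs)
  | x `elem` xs = MkSet' xs
  | otherwise   = MkSet' (x : xs)

-- No constraint is needed to take a Set apart ...
elems :: Set a -> [a]
elems (MkSet xs) = xs

-- ... but (Eq a) is demanded here even though equality is never used.
elems' :: Eq a => Set' a -> [a]
elems' (MkSet' xs) = xs

main :: IO ()
main = do
  print (elems  (insert  (1 :: Int) (MkSet  [2, 3])))   -- prints [1,2,3]
  print (elems' (insert' (1 :: Int) (MkSet' [2, 3])))   -- prints [1,2,3]
</programlisting>
Dropping the <literal>Eq a</literal> context from the signature of
<literal>elems'</literal> should make the Haskell 98 version fail to typecheck,
whereas <literal>elems</literal> needs no context at all.
</para>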
3457
3458 <para>
3459 The rest of this section gives further details about GADT-style data
3460 type declarations.
3461
3462 <itemizedlist>
3463 <listitem><para>
3464 The result type of each data constructor must begin with the type constructor being defined.
3465 If the result type of all constructors
3466 has the form <literal>T a1 ... an</literal>, where <literal>a1 ... an</literal>
3467 are distinct type variables, then the data type is <emphasis>ordinary</emphasis>;
otherwise it is a <emphasis>generalised</emphasis> data type (<xref linkend="gadt"/>).
3469 </para></listitem>
3470
3471 <listitem><para>
3472 As with other type signatures, you can give a single signature for several data constructors.
3473 In this example we give a single signature for <literal>T1</literal> and <literal>T2</literal>:
3474 <programlisting>
data T a where
  T1,T2 :: a -> T a
  T3 :: T a
3478 </programlisting>
3479 </para></listitem>
3480
3481 <listitem><para>
3482 The type signature of
3483 each constructor is independent, and is implicitly universally quantified as usual.
3484 In particular, the type variable(s) in the "<literal>data T a where</literal>" header
3485 have no scope, and different constructors may have different universally-quantified type variables:
3486 <programlisting>
data T a where        -- The 'a' has no scope
  T1,T2 :: b -> T b   -- Means forall b. b -> T b
  T3 :: T a           -- Means forall a. T a
3490 </programlisting>
3491 </para></listitem>
3492
3493 <listitem><para>
3494 A constructor signature may mention type class constraints, which can differ for
3495 different constructors. For example, this is fine:
3496 <programlisting>
data T a where
  T1 :: Eq b => b -> b -> T b
  T2 :: (Show c, Ix c) => c -> [c] -> T c
3500 </programlisting>
3501 When pattern matching, these constraints are made available to discharge constraints
3502 in the body of the match. For example:
3503 <programlisting>
f :: T a -> String
f (T1 x y) | x==y      = "yes"
           | otherwise = "no"
f (T2 a b)             = show a
3508 </programlisting>
3509 Note that <literal>f</literal> is not overloaded; the <literal>Eq</literal> constraint arising
3510 from the use of <literal>==</literal> is discharged by the pattern match on <literal>T1</literal>
3511 and similarly the <literal>Show</literal> constraint arising from the use of <literal>show</literal>.
3512 </para></listitem>
3513
3514 <listitem><para>
3515 Unlike a Haskell-98-style
3516 data type declaration, the type variable(s) in the "<literal>data Set a where</literal>" header
3517 have no scope. Indeed, one can write a kind signature instead:
3518 <programlisting>
3519 data Set :: * -> * where ...
3520 </programlisting>
3521 or even a mixture of the two:
3522 <programlisting>
3523 data Bar a :: (* -> *) -> * where ...
3524 </programlisting>
The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Bar</literal>
like this:
3527 <programlisting>
3528 data Bar a (b :: * -> *) where ...
3529 </programlisting>
3530 </para></listitem>
3531
3532
3533 <listitem><para>
3534 You can use strictness annotations, in the obvious places
3535 in the constructor type:
3536 <programlisting>
data Term a where
    Lit    :: !Int -> Term Int
    If     :: Term Bool -> !(Term a) -> !(Term a) -> Term a
    Pair   :: Term a -> Term b -> Term (a,b)
3541 </programlisting>
3542 </para></listitem>
3543
3544 <listitem><para>
3545 You can use a <literal>deriving</literal> clause on a GADT-style data type
declaration. For example, these two declarations are equivalent:
3547 <programlisting>
data Maybe1 a where {
    Nothing1 :: Maybe1 a ;
    Just1    :: a -> Maybe1 a
  } deriving( Eq, Ord )

data Maybe2 a = Nothing2 | Just2 a
     deriving( Eq, Ord )
3555 </programlisting>
3556 </para></listitem>
3557
3558 <listitem><para>
3559 The type signature may have quantified type variables that do not appear
3560 in the result type:
3561 <programlisting>
data Foo where
  MkFoo :: a -> (a->Bool) -> Foo
  Nil   :: Foo
3565 </programlisting>
3566 Here the type variable <literal>a</literal> does not appear in the result type
3567 of either constructor.
3568 Although it is universally quantified in the type of the constructor, such
3569 a type variable is often called "existential".
3570 Indeed, the above declaration declares precisely the same type as
3571 the <literal>data Foo</literal> in <xref linkend="existential-quantification"/>.
3572 </para><para>
3573 The type may contain a class context too, of course:
3574 <programlisting>
data Showable where
  MkShowable :: Show a => a -> Showable
3577 </programlisting>
3578 </para></listitem>
3579
3580 <listitem><para>
3581 You can use record syntax on a GADT-style data type declaration:
3582
3583 <programlisting>
data Person where
    Adult :: { name :: String, children :: [Person] } -> Person
    Child :: Show a => { name :: !String, funny :: a } -> Person
3587 </programlisting>
3588 As usual, for every constructor that has a field <literal>f</literal>, the type of
3589 field <literal>f</literal> must be the same (modulo alpha conversion).
3590 The <literal>Child</literal> constructor above shows that the signature
3591 may have a context, existentially-quantified variables, and strictness annotations,
3592 just as in the non-record case. (NB: the "type" that follows the double-colon
3593 is not really a type, because of the record syntax and strictness annotations.
3594 A "type" of this form can appear only in a constructor signature.)
3595 </para></listitem>
3596
3597 <listitem><para>
Record updates are allowed with GADT-style declarations,
but only for fields that have the following property: the type of the field
mentions no existential type variables.
3601 </para></listitem>
3602
3603 <listitem><para>
3604 As in the case of existentials declared using the Haskell-98-like record syntax
3605 (<xref linkend="existential-records"/>),
3606 record-selector functions are generated only for those fields that have well-typed
3607 selectors.
3608 Here is the example of that section, in GADT-style syntax:
3609 <programlisting>
data Counter a where
    NewCounter :: { _this    :: self
                  , _inc     :: self -> self
                  , _display :: self -> IO ()
                  , tag      :: a
                  } -> Counter a
3616 </programlisting>
3617 As before, only one selector function is generated here, that for <literal>tag</literal>.
3618 Nevertheless, you can still use all the field names in pattern matching and record construction.
3619 </para></listitem>
3620
3621 <listitem><para>
3622 In a GADT-style data type declaration there is no obvious way to specify that a data constructor
3623 should be infix, which makes a difference if you derive <literal>Show</literal> for the type.
3624 (Data constructors declared infix are displayed infix by the derived <literal>show</literal>.)
3625 So GHC implements the following design: a data constructor declared in a GADT-style data type
3626 declaration is displayed infix by <literal>Show</literal> iff (a) it is an operator symbol,
3627 (b) it has two arguments, (c) it has a programmer-supplied fixity declaration. For example
3628 <programlisting>
infix 6 :--:
data T a where
  (:--:) :: Int -> Bool -> T Int
3632 </programlisting>
3633 </para></listitem>
3634 </itemizedlist></para>
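<para>
The infix rule for <literal>Show</literal> can be tried out with a small
program along the following lines. This is a sketch rather than a definitive
test: the prefix constructor <literal>MkT</literal> is a hypothetical addition
for comparison, and the <literal>Show</literal> instance is obtained with
stand-alone deriving (<xref linkend="stand-alone-deriving"/>) because
<literal>T</literal> is a generalised data type.
<programlisting>
{-# LANGUAGE GADTs, StandaloneDeriving #-}
module Main where

-- (c) a programmer-supplied fixity declaration
infix 6 :--:

data T a where
  -- (a) an operator symbol with (b) two arguments
  (:--:) :: Int -> Bool -> T Int
  -- Hypothetical prefix constructor, added only for comparison
  MkT    :: Int -> Bool -> T Int

deriving instance Show (T a)

main :: IO ()
main = do
  print (3 :--: True)   -- expected to be displayed infix
  print (MkT 3 True)    -- displayed prefix
</programlisting>
Removing the fixity declaration should, by the rule above, cause
<literal>(:--:)</literal> to be displayed prefix as well.
</para>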
3635 </sect2>
3636
3637 <sect2 id="gadt">
3638 <title>Generalised Algebraic Data Types (GADTs)</title>
3639
3640 <para>Generalised Algebraic Data Types generalise ordinary algebraic data types
3641 by allowing constructors to have richer return types. Here is an example:
3642 <programlisting>
data Term a where
    Lit    :: Int -> Term Int
    Succ   :: Term Int -> Term Int
    IsZero :: Term Int -> Term Bool
    If     :: Term Bool -> Term a -> Term a -> Term a
    Pair   :: Term a -> Term b -> Term (a,b)
3649 </programlisting>
3650 Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the
3651 case with ordinary data types. This generality allows us to
3652 write a well-typed <literal>eval</literal> function
3653 for these <literal>Terms</literal>:
3654 <programlisting>
eval :: Term a -> a
eval (Lit i)      = i
eval (Succ t)     = 1 + eval t
eval (IsZero t)   = eval t == 0
eval (If b e1 e2) = if eval b then eval e1 else eval e2
eval (Pair e1 e2) = (eval e1, eval e2)
3661 </programlisting>
3662 The key point about GADTs is that <emphasis>pattern matching causes type refinement</emphasis>.
3663 For example, in the right hand side of the equation
3664 <programlisting>
3665 eval :: Term a -> a
3666 eval (Lit i) = ...
3667 </programlisting>
3668 the type <literal>a</literal> is refined to <literal>Int</literal>. That's the whole point!
3669 A precise specification of the type rules is beyond what this user manual aspires to,
3670 but the design closely follows that described in
3671 the paper <ulink
3672 url="http://research.microsoft.com/%7Esimonpj/papers/gadt/">Simple
unification-based type inference for GADTs</ulink>
(ICFP 2006).
3675 The general principle is this: <emphasis>type refinement is only carried out
3676 based on user-supplied type annotations</emphasis>.
3677 So if no type signature is supplied for <literal>eval</literal>, no type refinement happens,
3678 and lots of obscure error messages will
3679 occur. However, the refinement is quite general. For example, if we had:
3680 <programlisting>
3681 eval :: Term a -> a -> a
3682 eval (Lit i) j = i+j
3683 </programlisting>
3684 the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type
3685 of the constructor <literal>Lit</literal>), and that refinement also applies to the type of <literal>j</literal>, and
3686 the result type of the <literal>case</literal> expression. Hence the addition <literal>i+j</literal> is legal.
3687 </para>
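<para>
Putting the pieces together, a complete program might look like the sketch
below. The declarations of <literal>Term</literal> and <literal>eval</literal>
are those given above; the value <literal>sample</literal> and the
<literal>main</literal> function are hypothetical additions for illustration.
<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

data Term a where
    Lit    :: Int -> Term Int
    Succ   :: Term Int -> Term Int
    IsZero :: Term Int -> Term Bool
    If     :: Term Bool -> Term a -> Term a -> Term a
    Pair   :: Term a -> Term b -> Term (a,b)

-- The type signature is what allows 'a' to be refined in each equation.
eval :: Term a -> a
eval (Lit i)      = i
eval (Succ t)     = 1 + eval t
eval (IsZero t)   = eval t == 0
eval (If b e1 e2) = if eval b then eval e1 else eval e2
eval (Pair e1 e2) = (eval e1, eval e2)

-- Hypothetical sample term. An ill-typed term such as
-- Succ (IsZero (Lit 0)) would be rejected by the type checker.
sample :: Term (Int, Bool)
sample = Pair (If (IsZero (Lit 0)) (Succ (Lit 41)) (Lit 0))
              (IsZero (Succ (Lit 0)))

main :: IO ()
main = print (eval sample)   -- prints (42,False)
</programlisting>
</para>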
3688 <para>
3689 These and many other examples are given in papers by Hongwei Xi, and
3690 Tim Sheard. There is a longer introduction
3691 <ulink url="http://www.haskell.org/haskellwiki/GADT">on the wiki</ulink>,
3692 and Ralf Hinze's
3693 <ulink url="http://www.informatik.uni-bonn.de/~ralf/publications/With.pdf">Fun with phantom types</ulink> also has a number of examples. Note that papers
3694 may use different notation to that implemented in GHC.
3695 </para>
3696 <para>
3697 The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with
3698 <option>-XGADTs</option>. The <option>-XGADTs</option> flag also sets <option>-XRelaxedPolyRec</option>.
3699 <itemizedlist>
3700 <listitem><para>
3701 A GADT can only be declared using GADT-style syntax (<xref linkend="gadt-style"/>);
3702 the old Haskell-98 syntax for data declarations always declares an ordinary data type.
3703 The result type of each constructor must begin with the type constructor being defined,
3704 but for a GADT the arguments to the type constructor can be arbitrary monotypes.
3705 For example, in the <literal>Term</literal> data
3706 type above, the type of each constructor must end with <literal>Term ty</literal>, but
3707 the <literal>ty</literal> need not be a type variable (e.g. the <literal>Lit</literal>
3708 constructor).
3709 </para></listitem>
3710
3711 <listitem><para>
3712 It is permitted to declare an ordinary algebraic data type using GADT-style syntax.
3713 What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors
3714 whose result type is not just <literal>T a b</literal>.
3715 </para></listitem>
3716
3717 <listitem><para>
3718 You cannot use a <literal>deriving</literal> clause for a GADT; only for
3719 an ordinary data type.
3720 </para></listitem>
3721
3722 <listitem><para>
3723 As mentioned in <xref linkend="gadt-style"/>, record syntax is supported.
3724 For example:
3725 <programlisting>
data Term a where
    Lit    :: { val  :: Int }      -> Term Int
    Succ   :: { num  :: Term Int } -> Term Int
    Pred   :: { num  :: Term Int } -> Term Int
    IsZero :: { arg  :: Term Int } -> Term Bool
    Pair   :: { arg1 :: Term a
              , arg2 :: Term b
              }                    -> Term (a,b)
    If     :: { cnd  :: Term Bool
              , tru  :: Term a
              , fls  :: Term a
              }                    -> Term a
3738 </programlisting>
3739 However, for GADTs there is the following additional constraint:
3740 every constructor that has a field <literal>f</literal> must have
the same result type (modulo alpha conversion).
3742 Hence, in the above example, we cannot merge the <literal>num</literal>
3743 and <literal>arg</literal> fields above into a
3744 single name. Although their field types are both <literal>Term Int</literal>,
3745 their selector functions actually have different types:
3746
3747 <programlisting>
3748 num :: Term Int -> Term Int
3749 arg :: Term Bool -> Term Int
3750 </programlisting>
3751 </para></listitem>
3752
3753 <listitem><para>
3754 When pattern-matching against data constructors drawn from a GADT,
3755 for example in a <literal>case</literal> expression, the following rules apply:
3756 <itemizedlist>
3757 <listitem><para>The type of the scrutinee must be rigid.</para></listitem>
3758 <listitem><para>The type of the entire <literal>case</literal> expression must be rigid.</para></listitem>
3759 <listitem><para>The type of any free variable mentioned in any of
3760 the <literal>case</literal> alternatives must be rigid.</para></listitem>
3761 </itemizedlist>
3762 A type is "rigid" if it is completely known to the compiler at its binding site. The easiest
way to ensure that a variable has a rigid type is to give it a type signature.
3764 For more precise details see <ulink url="http://research.microsoft.com/%7Esimonpj/papers/gadt">
3765 Simple unification-based type inference for GADTs
</ulink>. The criteria implemented by GHC are given in the Appendix of that paper.
3767
3768 </para></listitem>
3769
3770 </itemizedlist>
3771 </para>
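<para>
The record rule for GADTs can be seen at work in the following sketch, which
uses a cut-down version of the record-style <literal>Term</literal> type. The
<literal>main</literal> function is a hypothetical addition; the comments give
the selector types that follow from the rule that all constructors sharing a
field must have the same result type.
<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

data Term a where
    Lit    :: { val :: Int }      -> Term Int
    Succ   :: { num :: Term Int } -> Term Int
    Pred   :: { num :: Term Int } -> Term Int
    IsZero :: { arg :: Term Int } -> Term Bool

-- Selector types:
--   val :: Term Int  -> Int
--   num :: Term Int  -> Term Int   (shared by Succ and Pred)
--   arg :: Term Bool -> Term Int   (so it cannot share a name with num)

main :: IO ()
main = do
  print (val (num (Succ (Lit 7))))     -- prints 7
  print (val (arg (IsZero (Lit 0))))   -- prints 0
</programlisting>
As with ordinary records, a selector is partial when not every constructor of
the type has the field; for instance <literal>val</literal> fails at run time
when applied to a <literal>Succ</literal>.
</para>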
3772
3773 </sect2>
3774 </sect1>
3775
3776 <!-- ====================== End of Generalised algebraic data types ======================= -->
3777
3778 <sect1 id="deriving">
3779 <title>Extensions to the "deriving" mechanism</title>
3780
3781 <sect2 id="deriving-inferred">
3782 <title>Inferred context for deriving clauses</title>
3783
3784 <para>
3785 The Haskell Report is vague about exactly when a <literal>deriving</literal> clause is
3786 legal. For example:
3787 <programlisting>
3788 data T0 f a = MkT0 a deriving( Eq )
3789 data T1 f a = MkT1 (f a) deriving( Eq )
3790 data T2 f a = MkT2 (f (f a)) deriving( Eq )
3791 </programlisting>
3792 The natural generated <literal>Eq</literal> code would result in these instance declarations:
3793 <programlisting>
3794 instance Eq a => Eq (T0 f a) where ...
3795 instance Eq (f a) => Eq (T1 f a) where ...
3796 instance Eq (f (f a)) => Eq (T2 f a) where ...
3797 </programlisting>
3798 The first of these is obviously fine. The second is still fine, although less obviously.
3799 The third is not Haskell 98, and risks losing termination of instances.
3800 </para>
3801 <para>
3802 GHC takes a conservative position: it accepts the first two, but not the third. The rule is this:
3803 each constraint in the inferred instance context must consist only of type variables,
3804 with no repetitions.
3805 </para>
3806 <para>
3807 This rule is applied regardless of flags. If you want a more exotic context, you can write
3808 it yourself, using the <link linkend="stand-alone-deriving">standalone deriving mechanism</link>.
3809 </para>
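<para>
For instance, the instance for <literal>T2</literal> that a
<literal>deriving</literal> clause rejects can be requested explicitly, roughly
as in the sketch below. The set of extensions shown is an assumption about what
a typical GHC needs here: <option>-XUndecidableInstances</option> is included
because <literal>f</literal> occurs more often in the context than in the
instance head, and <option>-XFlexibleContexts</option> is included as a
precaution.
<programlisting>
{-# LANGUAGE StandaloneDeriving, FlexibleContexts, UndecidableInstances #-}
module Main where

data T0 f a = MkT0 a         deriving( Eq )   -- inferred context: Eq a
data T1 f a = MkT1 (f a)     deriving( Eq )   -- inferred context: Eq (f a)
data T2 f a = MkT2 (f (f a))                  -- no deriving clause here

-- The context is written by hand, so GHC does not have to infer it.
deriving instance Eq (f (f a)) => Eq (T2 f a)

main :: IO ()
main = print (MkT2 (Just (Just 'x')) == MkT2 (Just (Just 'x')))   -- prints True
</programlisting>
</para>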
3810 </sect2>
3811
3812 <sect2 id="stand-alone-deriving">
3813 <title>Stand-alone deriving declarations</title>
3814
3815 <para>
3816 GHC now allows stand-alone <literal>deriving</literal> declarations, enabled by <literal>-XStandaloneDeriving</literal>:
3817 <programlisting>
3818 data Foo a = Bar a | Baz String
3819
3820 deriving instance Eq a => Eq (Foo a)
3821 </programlisting>
3822 The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword
3823 <literal>deriving</literal>, and (b) the absence of the <literal>where</literal> part.
3824 </para>
3825 <para>
3826 However, standalone deriving differs from a <literal>deriving</literal> clause in a number
3827 of important ways:
3828 <itemizedlist>
3829 <listitem><para>The standalone deriving declaration does not need to be in the
3830 same module as the data type declaration. (But be aware of the dangers of
orphan instances (<xref linkend="orphan-modules"/>).)
3832 </para></listitem>
3833
3834 <listitem><para>
3835 You must supply an explicit context (in the example the context is <literal>(Eq a)</literal>),
3836 exactly as you would in an ordinary instance declaration.
3837 (In contrast, in a <literal>deriving</literal> clause
3838 attached to a data type declaration, the context is inferred.)
3839 </para></listitem>
3840
3841 <listitem><para>
3842 Unlike a <literal>deriving</literal>
3843 declaration attached to a <literal>data</literal> declaration, the instance can be more specific
3844 than the data type (assuming you also use
3845 <literal>-XFlexibleInstances</literal>, <xref linkend="instance-rules"/>). Consider
3846 for example
3847 <programlisting>
3848 data Foo a = Bar a | Baz String
3849
3850 deriving instance Eq a => Eq (Foo [a])
3851 deriving instance Eq a => Eq (Foo (Maybe a))
3852 </programlisting>
3853 This will generate a derived instance for <literal>(Foo [a])</literal> and <literal>(Foo (Maybe a))</literal>,
3854 but other types such as <literal>(Foo (Int,Bool))</literal> will not be an instance of <literal>Eq</literal>.
3855 </para></listitem>
3856
3857 <listitem><para>
3858 Unlike a <literal>deriving</literal>
3859 declaration attached to a <literal>data</literal> declaration,
3860 GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate
3861 boilerplate code for the specified class, and typechecks it. If there is a type error, it is
3862 your problem. (GHC will show you the offending code if it has a type error.)
3863 </para>
3864 <para>
3865 The merit of this is that you can derive instances for GADTs and other exotic
3866 data types, providing only that the boilerplate code does indeed typecheck. For example:
3867 <programlisting>
data T a where
   T1 :: T Int
   T2 :: T Bool

deriving instance Show (T a)
3873 </programlisting>
3874 In this example, you cannot say <literal>... deriving( Show )</literal> on the
3875 data type declaration for <literal>T</literal>,
3876 because <literal>T</literal> is a GADT, but you <emphasis>can</emphasis> generate
3877 the instance declaration using stand-alone deriving.
3878 </para>
3879 <para>
3880 The down-side is that,
3881 if the boilerplate code fails to typecheck, you will get an error message about that
code, which you did not write. With a <literal>deriving</literal> clause, by contrast,
the side-conditions are necessarily more conservative, but any error message
may be more comprehensible.
3885 </para>
3886 </listitem>
3887 </itemizedlist></para>
3888
3889 <para>
3890 In other ways, however, a standalone deriving obeys the same rules as ordinary deriving:
3891 <itemizedlist>
3892 <listitem><para>
3893 A <literal>deriving instance</literal> declaration
3894 must obey the same rules concerning form and termination as ordinary instance declarations,
3895 controlled by the same flags; see <xref linkend="instance-decls"/>.
3896 </para></listitem>
3897
3898 <listitem>
3899 <para>The stand-alone syntax is generalised for newtypes in exactly the same
3900 way that ordinary <literal>deriving</literal> clauses are generalised (<xref linkend="newtype-deriving"/>).
3901 For example:
3902 <programlisting>
3903 newtype Foo a = MkFoo (State Int a)
3904
3905 deriving instance MonadState Int Foo
3906 </programlisting>
3907 GHC always treats the <emphasis>last</emphasis> parameter of the instance
3908 (<literal>Foo</literal> in this example) as the type whose instance is being derived.
3909 </para></listitem>
3910 </itemizedlist></para>
3911
3912 </sect2>
3913
3914 <sect2 id="deriving-extra">
3915 <title>Deriving instances of extra classes (<literal>Data</literal>, etc)</title>
3916
3917 <para>
3918 Haskell 98 allows the programmer to add "<literal>deriving( Eq, Ord )</literal>" to a data type
3919 declaration, to generate a standard instance declaration for classes specified in the <literal>deriving</literal> clause.
3920 In Haskell 98, the only classes that may appear in the <literal>deriving</literal> clause are the standard
3921 classes <literal>Eq</literal>, <literal>Ord</literal>,
3922 <literal>Enum</literal>, <literal>Ix</literal>, <literal>Bounded</literal>, <literal>Read</literal>, and <literal>Show</literal>.
3923 </para>
3924 <para>
3925 GHC extends this list with several more classes that may be automatically derived:
3926 <itemizedlist>
3927 <listitem><para> With <option>-XDeriveGeneric</option>, you can derive
3928 instances of the classes <literal>Generic</literal> and
3929 <literal>Generic1</literal>, defined in <literal>GHC.Generics</literal>.
3930 You can use these to define generic functions,
3931 as described in <xref linkend="generic-programming"/>.
3932 </para></listitem>
3933
3934 <listitem><para> With <option>-XDeriveFunctor</option>, you can derive instances of
3935 the class <literal>Functor</literal>,
3936 defined in <literal>GHC.Base</literal>.
3937 </para></listitem>
3938
3939 <listitem><para> With <option>-XDeriveDataTypeable</option>, you can derive instances of
3940 the class <literal>Data</literal>,
3941 defined in <literal>Data.Data</literal>. See <xref linkend="deriving-typeable"/> for
3942 deriving <literal>Typeable</literal>.
3943 </para></listitem>
3944
3945 <listitem><para> With <option>-XDeriveFoldable</option>, you can derive instances of
3946 the class <literal>Foldable</literal>,
3947 defined in <literal>Data.Foldable</literal>.
3948 </para></listitem>
3949
3950 <listitem><para> With <option>-XDeriveTraversable</option>, you can derive instances of
3951 the class <literal>Traversable</literal>,
3952 defined in <literal>Data.Traversable</literal>. Since the <literal>Traversable</literal>
3953 instance dictates the instances of <literal>Functor</literal> and
3954 <literal>Foldable</literal>, you'll probably want to derive them too, so
3955 <option>-XDeriveTraversable</option> implies
3956 <option>-XDeriveFunctor</option> and <option>-XDeriveFoldable</option>.
3957 </para></listitem>
3958 </itemizedlist>
3959 You can also use a standalone deriving declaration instead
3960 (see <xref linkend="stand-alone-deriving"/>).
3961 </para>
3962 <para>
3963 In each case the appropriate class must be in scope before it
3964 can be mentioned in the <literal>deriving</literal> clause.
3965 </para>
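<para>
As a small illustration, the sketch below derives <literal>Functor</literal>,
<literal>Foldable</literal> and <literal>Traversable</literal> for a
hypothetical <literal>Tree</literal> type. All three flags are listed
explicitly, and the classes are imported by name so that they are in scope
even where the Prelude does not export them.
<programlisting>
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}
module Main where

import Data.Foldable    (Foldable, toList)
import Data.Traversable (Traversable, traverse)

-- Hypothetical example type
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving( Show, Functor, Foldable, Traversable )

tree :: Tree Int
tree = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf)

main :: IO ()
main = do
  print (fmap (* 2) tree)   -- derived Functor
  print (toList tree)       -- derived Foldable; prints [1,2,3]
  -- derived Traversable: succeeds only if every element is positive
  print (traverse (\x -> if x > 0 then Just x else Nothing) tree)
</programlisting>
</para>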
3966 </sect2>
3967
3968 <sect2 id="deriving-typeable">
3969 <title>Deriving <literal>Typeable</literal> instances</title>
3970
3971 <para>The class <literal>Typeable</literal> is very special:
3972 <itemizedlist>
3973 <listitem><para>
3974 <literal>Typeable</literal> is kind-polymorphic (see
3975 <xref linkend="kind-polymorphism"/>).
3976 </para></listitem>
3977
3978 <listitem><para>
3979 Only derived instances of <literal>Typeable</literal> are allowed;
3980 i.e. handwritten instances are forbidden. This ensures that the
programmer cannot subvert the type system by writing bogus instances.
3982 </para></listitem>
3983
3984 <listitem><para>
3985 With <option>-XDeriveDataTypeable</option>
3986 GHC allows you to derive instances of <literal>Typeable</literal> for data types or newtypes,
3987 using a <literal>deriving</literal> clause, or using
3988 a standalone deriving declaration (<xref linkend="stand-alone-deriving"/>).
3989 </para></listitem>
3990
3991 <listitem><para>
3992 With <option>-XDataKinds</option>, deriving <literal>Typeable</literal> for a data
3993 type (whether via a deriving clause or standalone deriving)
3994 also derives <literal>Typeable</literal> for the promoted data constructors (<xref linkend="promotion"/>).
3995 </para></listitem>
3996
3997 <listitem><para>
3998 However, using standalone deriving, you can <emphasis>also</emphasis> derive
3999 a <literal>Typeable</literal> instance for a data family.
4000 You may not add a <literal>deriving(Typeable)</literal> clause to a
4001 <literal>data instance</literal> declaration; instead you must use a
4002 standalone deriving declaration for the data family.
4003 </para></listitem>
4004
4005 <listitem><para>
4006 Using standalone deriving, you can <emphasis>also</emphasis> derive
4007 a <literal>Typeable</literal> instance for a type class.
4008 </para></listitem>
4009
4010 <listitem><para>
4011 The flag <option>-XAutoDeriveTypeable</option> triggers the generation
4012 of derived <literal>Typeable</literal> instances for every datatype, data family,
and type class declaration in the module in which it is used, unless a manually-specified one is
4014 already provided.
4015 This flag implies <option>-XDeriveDataTypeable</option>.
4016 </para></listitem>
4017 </itemizedlist>
4018
4019 </para>
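<para>
A minimal sketch of a derived <literal>Typeable</literal> instance in use
follows. The type <literal>Colour</literal> is a hypothetical example;
<literal>typeOf</literal> and <literal>cast</literal> are the functions
exported by <literal>Data.Typeable</literal>.
<programlisting>
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Data.Typeable (Typeable, typeOf, cast)

-- Hypothetical example type
data Colour = Red | Green
  deriving( Show, Typeable )

main :: IO ()
main = do
  print (typeOf Red)                 -- prints Colour
  print (cast Red :: Maybe Colour)   -- prints Just Red
  print (cast Red :: Maybe Int)      -- prints Nothing
</programlisting>
</para>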
4020
4021 </sect2>
4022
4023 <sect2 id="newtype-deriving">
4024 <title>Generalised derived instances for newtypes</title>
4025
4026 <para>
4027 When you define an abstract type using <literal>newtype</literal>, you may want
4028 the new type to inherit some instances from its representation. In
4029 Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>,
4030 <literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any
4031 other classes you have to write an explicit instance declaration. For
4032 example, if you define
4033
4034 <programlisting>
4035 newtype Dollars = Dollars Int
4036 </programlisting>
4037
4038 and you want to use arithmetic on <literal>Dollars</literal>, you have to
4039 explicitly define an instance of <literal>Num</literal>:
4040
4041 <programlisting>
instance Num Dollars where
  Dollars a + Dollars b = Dollars (a+b)
  ...
4045 </programlisting>
4046 All the instance does is apply and remove the <literal>newtype</literal>
4047 constructor. It is particularly galling that, since the constructor
4048 doesn't appear at run-time, this instance declaration defines a
4049 dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal>
4050 dictionary, only slower!
4051 </para>
4052
4053
4054 <sect3 id="generalized-newtype-deriving"> <title> Generalising the deriving clause </title>
4055 <para>
4056 GHC now permits such instances to be derived instead,
4057 using the flag <option>-XGeneralizedNewtypeDeriving</option>,
4058 so one can write
4059 <programlisting>
4060 newtype Dollars = Dollars Int deriving (Eq,Show,Num)
4061 </programlisting>
4062
4063 and the implementation uses the <emphasis>same</emphasis> <literal>Num</literal> dictionary
4064 for <literal>Dollars</literal> as for <literal>Int</literal>. Notionally, the compiler
4065 derives an instance declaration of the form
4066
4067 <programlisting>
4068 instance Num Int => Num Dollars
4069 </programlisting>
4070
4071 which just adds or removes the <literal>newtype</literal> constructor according to the type.
4072 </para>
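<para>
A complete program using the derived instance might look like the following
sketch; the <literal>main</literal> function is a hypothetical addition for
illustration.
<programlisting>
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Main where

newtype Dollars = Dollars Int deriving (Eq,Show,Num)

main :: IO ()
main = do
  print (Dollars 3 + Dollars 4)        -- prints Dollars 7
  -- Numeric literals work too, via the derived fromInteger
  print (sum [Dollars 1, 2, 3] == 6)   -- prints True
</programlisting>
Here <literal>Eq</literal> and <literal>Show</literal> are derived in the
ordinary way, while <literal>Num</literal> is inherited from
<literal>Int</literal>.
</para>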
4073 <para>
4074
4075 We can also derive instances of constructor classes in a similar
4076 way. For example, suppose we have implemented state and failure monad
4077 transformers, such that
4078
4079 <programlisting>
4080 instance Monad m => Monad (State s m)
4081 instance Monad m => Monad (Failure m)
4082 </programlisting>
4083 In Haskell 98, we can define a parsing monad by
4084 <programlisting>
4085 type Parser tok m a = State [tok] (Failure m) a
4086 </programlisting>
4087
4088 which is automatically a monad thanks to the instance declarations
4089 above. With the extension, we can make the parser type abstract,
4090 without needing to write an instance of class <literal>Monad</literal>, via
4091
4092 <programlisting>
4093 newtype Parser tok m a = Parser (State [tok]