1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <para>
3 <indexterm><primary>language, GHC</primary></indexterm>
4 <indexterm><primary>extensions, GHC</primary></indexterm>
5 As with all known Haskell systems, GHC implements some extensions to
6 the language. They can all be enabled or disabled by command-line flags
7 or language pragmas. By default GHC understands the most recent Haskell
8 version it supports, plus a handful of extensions.
9 </para>
10
11 <para>
12 Some of the Glasgow extensions serve to give you access to the
13 underlying facilities with which we implement Haskell. Thus, you can
14 get at the Raw Iron, if you are willing to write some non-portable
15 code at a more primitive level. You need not be &ldquo;stuck&rdquo;
16 on performance because of the implementation costs of Haskell's
17 &ldquo;high-level&rdquo; features&mdash;you can always code
18 &ldquo;under&rdquo; them. In an extreme case, you can write all your
19 time-critical code in C, and then just glue it together with Haskell!
20 </para>
21
22 <para>
23 Before you get too carried away working at the lowest level (e.g.,
24 sloshing <literal>MutableByteArray&num;</literal>s around your
25 program), you may wish to check if there are libraries that provide a
26 &ldquo;Haskellised veneer&rdquo; over the features you want. The
27 separate <ulink url="../libraries/index.html">libraries
28 documentation</ulink> describes all the libraries that come with GHC.
29 </para>
30
31 <!-- LANGUAGE OPTIONS -->
32 <sect1 id="options-language">
33 <title>Language options</title>
34
35 <indexterm><primary>language</primary><secondary>option</secondary>
36 </indexterm>
37 <indexterm><primary>options</primary><secondary>language</secondary>
38 </indexterm>
39 <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
40 </indexterm>
41
42 <para>The language option flags control which variations of the language are
43 permitted.</para>
44
45 <para>Language options can be controlled in two ways:
46 <itemizedlist>
47 <listitem><para>Every language option can be switched on by a command-line flag "<option>-X...</option>"
48 (e.g. <option>-XTemplateHaskell</option>), and switched off by the flag "<option>-XNo...</option>"
49 (e.g. <option>-XNoTemplateHaskell</option>).</para></listitem>
50 <listitem><para>
51 Language options recognised by Cabal can also be enabled using the <literal>LANGUAGE</literal> pragma,
52 thus <literal>{-# LANGUAGE TemplateHaskell #-}</literal> (see <xref linkend="language-pragma"/>). </para>
53 </listitem>
54 </itemizedlist></para>
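
<para>As a minimal sketch of the pragma route (the module below is our own
illustration, not part of GHC), the following program enables
<option>-XTemplateHaskell</option> for a single module and splices a quoted
expression into <literal>main</literal> at compile time. Passing
<option>-XTemplateHaskell</option> on the command line instead of using the
pragma should have the same effect for this module.</para>

<programlisting>
{-# LANGUAGE TemplateHaskell #-}
module Main where

-- The quotation [| ... |] builds an expression at compile time and the
-- splice $( ... ) inserts it here; both forms need TemplateHaskell.
main :: IO ()
main = print $( [| 1 + 2 :: Int |] )
</programlisting>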
55
56 <para>The flag <option>-fglasgow-exts</option>
57 <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
58 is equivalent to enabling the following extensions:
59 &what_glasgow_exts_does;
60 Enabling these options is the <emphasis>only</emphasis>
61 effect of <option>-fglasgow-exts</option>.
62 We are trying to move away from this portmanteau flag,
63 and towards enabling features individually.</para>
64
65 </sect1>
66
67 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
68 <sect1 id="primitives">
69 <title>Unboxed types and primitive operations</title>
70
71 <para>GHC is built on a raft of primitive data types and operations;
72 "primitive" in the sense that they cannot be defined in Haskell itself.
73 While you really can use this stuff to write fast code,
74 we generally find it a lot less painful, and more satisfying in the
75 long run, to use higher-level language features and libraries. With
76 any luck, the code you write will be optimised to the efficient
77 unboxed version in any case. And if it isn't, we'd like to know
78 about it.</para>
79
80 <para>All these primitive data types and operations are exported by the
81 library <literal>GHC.Prim</literal>, for which there is
82 <ulink url="&libraryGhcPrimLocation;/GHC-Prim.html">detailed online documentation</ulink>.
83 (This documentation is generated from the file <filename>compiler/prelude/primops.txt.pp</filename>.)
84 </para>
85
86 <para>
87 If you want to mention any of the primitive data types or operations in your
88 program, you must first import <literal>GHC.Prim</literal> to bring them
89 into scope. Many of them have names ending in "&num;", and to mention such
90 names you need the <option>-XMagicHash</option> extension (<xref linkend="magic-hash"/>).
91 </para>
92
93 <para>The primops make extensive use of <link linkend="glasgow-unboxed">unboxed types</link>
94 and <link linkend="unboxed-tuples">unboxed tuples</link>, which
95 we briefly summarise here. </para>
96
97 <sect2 id="glasgow-unboxed">
98 <title>Unboxed types</title>
99
100 <para>
101 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
102 </para>
103
104 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
105 that values of that type are represented by a pointer to a heap
106 object. The representation of a Haskell <literal>Int</literal>, for
107 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
108 type, however, is represented by the value itself; no pointers or heap
109 allocation are involved.
110 </para>
111
112 <para>
113 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
114 would use in C: <literal>Int&num;</literal> (long int),
115 <literal>Double&num;</literal> (double), <literal>Addr&num;</literal>
116 (void *), etc. The <emphasis>primitive operations</emphasis>
117 (PrimOps) on these types are what you might expect; e.g.,
118 <literal>(+&num;)</literal> is addition on
119 <literal>Int&num;</literal>s, and is the machine-addition that we all
120 know and love&mdash;usually one instruction.
121 </para>
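
<para>As an illustrative sketch (the function name <literal>addInt</literal>
is ours), the following module unpacks two boxed <literal>Int</literal>s, adds
the underlying <literal>Int&num;</literal> values with
<literal>(+&num;)</literal>, and reboxes the result. It assumes the import is
taken from <literal>GHC.Exts</literal>, which re-exports the primitives
together with the <literal>I&num;</literal> constructor.</para>

<programlisting>
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), (+#))

-- I# is the data constructor that boxes an Int#.
addInt :: Int -> Int -> Int
addInt (I# x) (I# y) = I# (x +# y)

main :: IO ()
main = print (addInt 2 3)   -- prints 5
</programlisting>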
122
123 <para>
124 Primitive (unboxed) types cannot be defined in Haskell, and are
125 therefore built into the language and compiler. Primitive types are
126 always unlifted; that is, a value of a primitive type cannot be
127 bottom. We use the convention (but it is only a convention)
128 that primitive types, values, and
129 operations have a <literal>&num;</literal> suffix (see <xref linkend="magic-hash"/>).
130 For some primitive types we have special syntax for literals, also
131 described in the <link linkend="magic-hash">same section</link>.
132 </para>
133
134 <para>
135 Primitive values are often represented by a simple bit-pattern, such
136 as <literal>Int&num;</literal>, <literal>Float&num;</literal>,
137 <literal>Double&num;</literal>. But this is not necessarily the case:
138 a primitive value might be represented by a pointer to a
139 heap-allocated object. Examples include
140 <literal>Array&num;</literal>, the type of primitive arrays. A
141 primitive array is heap-allocated because it is too big a value to fit
142 in a register, and would be too expensive to copy around; in a sense,
143 it is accidental that it is represented by a pointer. If a pointer
144 represents a primitive value, then it really does point to that value:
145 no unevaluated thunks, no indirections&hellip;nothing can be at the
146 other end of the pointer than the primitive value.
147 A numerically-intensive program using unboxed types can
148 go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
149 counterpart&mdash;we saw a threefold speedup on one example.
150 </para>
151
152 <para>
153 There are some restrictions on the use of primitive types:
154 <itemizedlist>
155 <listitem><para>The main restriction
156 is that you can't pass a primitive value to a polymorphic
157 function or store one in a polymorphic data type. This rules out
158 things like <literal>[Int&num;]</literal> (i.e. lists of primitive
159 integers). The reason for this restriction is that polymorphic
160 arguments and constructor fields are assumed to be pointers: if an
161 unboxed integer is stored in one of these, the garbage collector would
162 attempt to follow it, leading to unpredictable space leaks. Or a
163 <function>seq</function> operation on the polymorphic component may
164 attempt to dereference the pointer, with disastrous results. Even
165 worse, the unboxed value might be larger than a pointer
166 (<literal>Double&num;</literal> for instance).
167 </para>
168 </listitem>
169 <listitem><para> You cannot define a newtype whose representation type
170 (the argument type of the data constructor) is an unboxed type. Thus,
171 this is illegal:
172 <programlisting>
173 newtype A = MkA Int#
174 </programlisting>
175 </para></listitem>
176 <listitem><para> You cannot bind a variable with an unboxed type
177 in a <emphasis>top-level</emphasis> binding.
178 </para></listitem>
179 <listitem><para> You cannot bind a variable with an unboxed type
180 in a <emphasis>recursive</emphasis> binding.
181 </para></listitem>
182 <listitem><para> You may bind unboxed variables in a (non-recursive,
183 non-top-level) pattern binding, but you must make any such pattern-match
184 strict. For example, rather than:
185 <programlisting>
186 data Foo = Foo Int Int#
187
188 f x = let (Foo a b, w) = ..rhs.. in ..body..
189 </programlisting>
190 you must write:
191 <programlisting>
192 data Foo = Foo Int Int#
193
194 f x = let !(Foo a b, w) = ..rhs.. in ..body..
195 </programlisting>
196 since <literal>b</literal> has type <literal>Int#</literal>.
197 </para>
198 </listitem>
199 </itemizedlist>
200 </para>
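
<para>The strict pattern binding from the last restriction can be tried out
with a small program such as the one below. It is only a sketch: the function
<literal>total</literal> and the test value are made up for illustration, and
the imports are taken from <literal>GHC.Exts</literal>.</para>

<programlisting>
{-# LANGUAGE MagicHash, BangPatterns #-}
module Main where

import GHC.Exts (Int (I#), Int#)

data Foo = Foo Int Int#

-- The bang makes the pattern binding strict, which is required here
-- because b has the unboxed type Int#.
total :: (Foo, Int) -> Int
total pair = let !(Foo a b, w) = pair in a + I# b + w

main :: IO ()
main = print (total (Foo 1 2#, 3))   -- prints 6
</programlisting>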
201
202 </sect2>
203
204 <sect2 id="unboxed-tuples">
205 <title>Unboxed tuples</title>
206
207 <para>
208 Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>;
209 they are a syntactic extension enabled by the language flag <option>-XUnboxedTuples</option>. An
210 unboxed tuple looks like this:
211 </para>
212
213 <para>
214
215 <programlisting>
216 (# e_1, ..., e_n #)
217 </programlisting>
218
219 </para>
220
221 <para>
222 where <literal>e&lowbar;1..e&lowbar;n</literal> are expressions of any
223 type (primitive or non-primitive). The type of an unboxed tuple looks
224 the same.
225 </para>
226
227 <para>
228 Note that when unboxed tuples are enabled,
229 <literal>(#</literal> is a single lexeme, so for example when using
230 operators like <literal>#</literal> and <literal>#-</literal> you need
231 to write <literal>( # )</literal> and <literal>( #- )</literal> rather than
232 <literal>(#)</literal> and <literal>(#-)</literal>.
233 </para>
234
235 <para>
236 Unboxed tuples are used for functions that need to return multiple
237 values, but they avoid the heap allocation normally associated with
238 using fully-fledged tuples. When an unboxed tuple is returned, the
239 components are put directly into registers or on the stack; the
240 unboxed tuple itself does not have a composite representation. Many
241 of the primitive operations listed in <literal>primops.txt.pp</literal> return unboxed
242 tuples.
243 In particular, the <literal>IO</literal> and <literal>ST</literal> monads use unboxed
244 tuples to avoid unnecessary allocation during sequences of operations.
245 </para>
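
<para>A minimal sketch of this use, with a hypothetical function
<literal>quotRemPair</literal> that returns two results in an unboxed tuple
and a caller that takes the tuple apart with <literal>case</literal>:</para>

<programlisting>
{-# LANGUAGE UnboxedTuples #-}
module Main where

-- Returns the quotient and the remainder as an unboxed tuple, so no
-- tuple constructor needs to be allocated for the pair itself.
quotRemPair :: Int -> Int -> (# Int, Int #)
quotRemPair n d = (# n `quot` d, n `rem` d #)

main :: IO ()
main = case quotRemPair 17 5 of
  (# q, r #) -> print (q, r)   -- prints (3,2)
</programlisting>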
246
247 <para>
248 There are some restrictions on the use of unboxed tuples:
249 <itemizedlist>
250
251 <listitem>
252 <para>
253 Values of unboxed tuple types are subject to the same restrictions as
254 other unboxed types; i.e. they may not be stored in polymorphic data
255 structures or passed to polymorphic functions.
256 </para>
257 </listitem>
258
259 <listitem>
260 <para>
261 The typical use of unboxed tuples is simply to return multiple values,
262 binding those multiple results with a <literal>case</literal> expression, thus:
263 <programlisting>
264 f x y = (# x+1, y-1 #)
265 g x = case f x x of { (# a, b #) -&#62; a + b }
266 </programlisting>
267 You can have an unboxed tuple in a pattern binding, thus
268 <programlisting>
269 f x = let (# p,q #) = h x in ..body..
270 </programlisting>
271 If the types of <literal>p</literal> and <literal>q</literal> are not unboxed,
272 the resulting binding is lazy like any other Haskell pattern binding. The
273 above example desugars like this:
274 <programlisting>
275 f x = let t = case h x of { (# p,q #) -> (p,q) }
276 p = fst t
277 q = snd t
278 in ..body..
279 </programlisting>
280 Indeed, the bindings can even be recursive.
281 </para>
282 </listitem>
283 </itemizedlist>
284
285 </para>
286
287 </sect2>
288 </sect1>
289
290
291 <!-- ====================== SYNTACTIC EXTENSIONS ======================= -->
292
293 <sect1 id="syntax-extns">
294 <title>Syntactic extensions</title>
295
296 <sect2 id="unicode-syntax">
297 <title>Unicode syntax</title>
298 <para>The language
299 extension <option>-XUnicodeSyntax</option><indexterm><primary><option>-XUnicodeSyntax</option></primary></indexterm>
300 enables Unicode characters to be used to stand for certain ASCII
301 character sequences. The following alternatives are provided:</para>
302
303 <informaltable>
304 <tgroup cols="4" align="left" colsep="1" rowsep="1">
305 <thead>
306 <row>
307 <entry>ASCII</entry>
308 <entry>Unicode alternative</entry>
309 <entry>Code point</entry>
310 <entry>Name</entry>
311 </row>
312 </thead>
313
314 <!--
315 to find the DocBook entities for these characters, find
316 the Unicode code point (e.g. 0x2237), and grep for it in
317 /usr/share/sgml/docbook/xml-dtd-*/ent/* (or equivalent on
318 your system). Some of these Unicode code points don't have
319 equivalent DocBook entities.
320 -->
321
322 <tbody>
323 <row>
324 <entry><literal>::</literal></entry>
325 <entry>::</entry> <!-- no special char, apparently -->
326 <entry>0x2237</entry>
327 <entry>PROPORTION</entry>
328 </row>
329 </tbody>
330 <tbody>
331 <row>
332 <entry><literal>=&gt;</literal></entry>
333 <entry>&rArr;</entry>
334 <entry>0x21D2</entry>
335 <entry>RIGHTWARDS DOUBLE ARROW</entry>
336 </row>
337 </tbody>
338 <tbody>
339 <row>
340 <entry><literal>forall</literal></entry>
341 <entry>&forall;</entry>
342 <entry>0x2200</entry>
343 <entry>FOR ALL</entry>
344 </row>
345 </tbody>
346 <tbody>
347 <row>
348 <entry><literal>-&gt;</literal></entry>
349 <entry>&rarr;</entry>
350 <entry>0x2192</entry>
351 <entry>RIGHTWARDS ARROW</entry>
352 </row>
353 </tbody>
354 <tbody>
355 <row>
356 <entry><literal>&lt;-</literal></entry>
357 <entry>&larr;</entry>
358 <entry>0x2190</entry>
359 <entry>LEFTWARDS ARROW</entry>
360 </row>
361 </tbody>
362
363 <tbody>
364 <row>
365 <entry>-&lt;</entry>
366 <entry>&larrtl;</entry>
367 <entry>0x2919</entry>
368 <entry>LEFTWARDS ARROW-TAIL</entry>
369 </row>
370 </tbody>
371
372 <tbody>
373 <row>
374 <entry>&gt;-</entry>
375 <entry>&rarrtl;</entry>
376 <entry>0x291A</entry>
377 <entry>RIGHTWARDS ARROW-TAIL</entry>
378 </row>
379 </tbody>
380
381 <tbody>
382 <row>
383 <entry>-&lt;&lt;</entry>
384 <entry></entry>
385 <entry>0x291B</entry>
386 <entry>LEFTWARDS DOUBLE ARROW-TAIL</entry>
387 </row>
388 </tbody>
389
390 <tbody>
391 <row>
392 <entry>&gt;&gt;-</entry>
393 <entry></entry>
394 <entry>0x291C</entry>
395 <entry>RIGHTWARDS DOUBLE ARROW-TAIL</entry>
396 </row>
397 </tbody>
398
399 <tbody>
400 <row>
401 <entry>*</entry>
402 <entry>&starf;</entry>
403 <entry>0x2605</entry>
404 <entry>BLACK STAR</entry>
405 </row>
406 </tbody>
407
408 </tgroup>
409 </informaltable>
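
<para>As a small sketch of how these alternatives read in practice, the
following module uses the Unicode forms of <literal>::</literal>,
<literal>=&gt;</literal>, <literal>-&gt;</literal> and
<literal>&lt;-</literal>. The function <literal>showTwice</literal> is made up
for illustration. The characters are written as entities in this XML source;
in a Haskell source file they would appear as the corresponding UTF-8 encoded
characters.</para>

<programlisting>
{-# LANGUAGE UnicodeSyntax #-}
module Main where

-- Same meaning as: showTwice :: Show a =&gt; a -&gt; String
showTwice &#x2237; Show a &rArr; a &rarr; String
showTwice x = show x ++ show x

main &#x2237; IO ()
main = do
  line &larr; return (showTwice (7 &#x2237; Int))
  putStrLn line   -- prints 77
</programlisting>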
410 </sect2>
411
412 <sect2 id="magic-hash">
413 <title>The magic hash</title>
414 <para>The language extension <option>-XMagicHash</option> allows "&num;" as a
415 postfix modifier to identifiers. Thus, "x&num;" is a valid variable, and "T&num;" is
416 a valid type constructor or data constructor.</para>
417
418 <para>The hash sign does not change semantics at all. We tend to use variable
419 names ending in "&num;" for unboxed values or types (e.g. <literal>Int&num;</literal>),
420 but there is no requirement to do so; they are just plain ordinary variables.
421 Nor does the <option>-XMagicHash</option> extension bring anything into scope.
422 For example, to bring <literal>Int&num;</literal> into scope you must
423 import <literal>GHC.Prim</literal> (see <xref linkend="primitives"/>);
424 the <option>-XMagicHash</option> extension
425 then allows you to <emphasis>refer</emphasis> to the <literal>Int&num;</literal>
426 that is now in scope. Note that with this option, the meaning of <literal>x&num;y = 0</literal>
427 is changed: it defines a function <literal>x&num;</literal> taking a single argument <literal>y</literal>;
428 to define the operator <literal>&num;</literal>, put a space: <literal>x &num; y = 0</literal>.
429
430 </para>
431 <para> The <option>-XMagicHash</option> extension also enables some new forms of literals (see <xref linkend="glasgow-unboxed"/>):
432 <itemizedlist>
433 <listitem><para> <literal>'x'&num;</literal> has type <literal>Char&num;</literal></para> </listitem>
434 <listitem><para> <literal>&quot;foo&quot;&num;</literal> has type <literal>Addr&num;</literal></para> </listitem>
435 <listitem><para> <literal>3&num;</literal> has type <literal>Int&num;</literal>. In general,
436 any Haskell integer lexeme followed by a <literal>&num;</literal> is an <literal>Int&num;</literal> literal, e.g.
437 <literal>-0x3A&num;</literal> as well as <literal>32&num;</literal>.</para></listitem>
438 <listitem><para> <literal>3&num;&num;</literal> has type <literal>Word&num;</literal>. In general,
439 any non-negative Haskell integer lexeme followed by <literal>&num;&num;</literal>
440 is a <literal>Word&num;</literal>. </para> </listitem>
441 <listitem><para> <literal>3.2&num;</literal> has type <literal>Float&num;</literal>.</para> </listitem>
442 <listitem><para> <literal>3.2&num;&num;</literal> has type <literal>Double&num;</literal></para> </listitem>
443 </itemizedlist>
444 </para>
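
<para>A short sketch exercising some of these literal forms. It assumes the
boxing constructors <literal>I&num;</literal>, <literal>D&num;</literal> and
<literal>C&num;</literal> and the primitive operators are imported from
<literal>GHC.Exts</literal>; each literal is boxed again so that it can be
printed.</para>

<programlisting>
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Char (C#), Double (D#), Int (I#), (+#), (*##))

main :: IO ()
main = do
  print (I# (3# +# 4#))          -- two Int# literals, added and reboxed
  print (D# (3.2## *## 2.0##))   -- two Double# literals, multiplied and reboxed
  print (C# 'x'#)                -- a Char# literal, reboxed
</programlisting>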
445 </sect2>
446
447 <sect2 id="negative-literals">
448 <title>Negative literals</title>
449 <para>
450 The literal <literal>-123</literal> is, according to
451 Haskell 98 and Haskell 2010, desugared as
452 <literal>negate (fromInteger 123)</literal>.
453 The language extension <option>-XNegativeLiterals</option>
454 means that it is instead desugared as
455 <literal>fromInteger (-123)</literal>.
456 </para>
457
458 <para>
459 This can make a difference when the positive and negative range of
460 a numeric data type don't match up. For example,
461 in 8-bit arithmetic -128 is representable, but +128 is not.
462 So <literal>negate (fromInteger 128)</literal> will elicit an
463 unexpected integer-literal-overflow message.
464 </para>
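
<para>A minimal sketch of the 8-bit case, using <literal>Int8</literal> from
<literal>Data.Int</literal> (the name <literal>lowest</literal> is ours):</para>

<programlisting>
{-# LANGUAGE NegativeLiterals #-}
module Main where

import Data.Int (Int8)

-- With NegativeLiterals, -128 is read as the single literal
-- fromInteger (-128), so the out-of-range value 128 never arises.
lowest :: Int8
lowest = -128

main :: IO ()
main = print lowest   -- prints -128
</programlisting>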
465 </sect2>
466
467 <sect2 id="num-decimals">
468 <title>Fractional-looking integer literals</title>
469 <para>
470 Haskell 2010 and Haskell 98 define floating literals with
471 the syntax <literal>1.2e6</literal>. These literals have the
472 type <literal>Fractional a => a</literal>.
473 </para>
474
475 <para>
476 The language extension <option>-XNumDecimals</option> allows
477 you to also use the floating literal syntax for instances of
478 <literal>Integral</literal>, and have values like
479 <literal>(1.2e6 :: Num a => a)</literal>.
480 </para>
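
<para>For instance, in the following sketch (the binding
<literal>million</literal> is made up for illustration) the literal is used
at type <literal>Integer</literal>, which has no
<literal>Fractional</literal> instance and so would be rejected without the
extension:</para>

<programlisting>
{-# LANGUAGE NumDecimals #-}
module Main where

-- 1.2e6 denotes a whole number, so NumDecimals lets it be an Integer.
million :: Integer
million = 1.2e6

main :: IO ()
main = print million   -- prints 1200000
</programlisting>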
481 </sect2>
482
483
484 <!-- ====================== HIERARCHICAL MODULES ======================= -->
485
486
487 <sect2 id="hierarchical-modules">
488 <title>Hierarchical Modules</title>
489
490 <para>GHC supports a small extension to the syntax of module
491 names: a module name is allowed to contain a dot
492 <literal>&lsquo;.&rsquo;</literal>. This is also known as the
493 &ldquo;hierarchical module namespace&rdquo; extension, because
494 it extends the normally flat Haskell module namespace into a
495 more flexible hierarchy of modules.</para>
496
497 <para>This extension has very little impact on the language
498 itself; module names are <emphasis>always</emphasis> fully
499 qualified, so you can just think of the fully qualified module
500 name as <quote>the module name</quote>. In particular, this
501 means that the full module name must be given after the
502 <literal>module</literal> keyword at the beginning of the
503 module; for example, the module <literal>A.B.C</literal> must
504 begin</para>
505
506 <programlisting>module A.B.C</programlisting>
507
508
509 <para>It is a common strategy to use the <literal>as</literal>
510 keyword to save some typing when using qualified names with
511 hierarchical modules. For example:</para>
512
513 <programlisting>
514 import qualified Control.Monad.ST.Strict as ST
515 </programlisting>
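
<para>A slightly fuller sketch of the same idiom follows. The function
<literal>sumST</literal> is our own example; the second import,
<literal>Data.STRef</literal>, is another hierarchical module from the base
library, abbreviated here to <literal>Ref</literal>.</para>

<programlisting>
module Main where

import qualified Control.Monad.ST.Strict as ST
import qualified Data.STRef as Ref

-- Sums a list using a mutable reference inside the ST monad; every
-- imported name is reached through its short qualifier.
sumST :: [Int] -> Int
sumST xs = ST.runST $ do
  ref &lt;- Ref.newSTRef 0
  mapM_ (\x -> Ref.modifySTRef' ref (+ x)) xs
  Ref.readSTRef ref

main :: IO ()
main = print (sumST [1 .. 10])   -- prints 55
</programlisting>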
516
517 <para>For details on how GHC searches for source and interface
518 files in the presence of hierarchical modules, see <xref
519 linkend="search-path"/>.</para>
520
521 <para>GHC comes with a large collection of libraries arranged
522 hierarchically; see the accompanying <ulink
523 url="../libraries/index.html">library
524 documentation</ulink>. More libraries to install are available
525 from <ulink
526 url="http://hackage.haskell.org/packages/hackage.html">HackageDB</ulink>.</para>
527 </sect2>
528
529 <!-- ====================== PATTERN GUARDS ======================= -->
530
531 <sect2 id="pattern-guards">
532 <title>Pattern guards</title>
533
534 <para>
535 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
536 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ulink url="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ulink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
537 </para>
538
539 <para>
540 Suppose we have an abstract data type of finite maps, with a
541 lookup operation:
542
543 <programlisting>
544 lookup :: FiniteMap -> Int -> Maybe Int
545 </programlisting>
546
547 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
548 where <varname>v</varname> is the value that the key maps to. Now consider the following definition:
549 </para>
550
551 <programlisting>
552 clunky env var1 var2 | ok1 &amp;&amp; ok2 = val1 + val2
553 | otherwise = var1 + var2
554 where
555 m1 = lookup env var1
556 m2 = lookup env var2
557 ok1 = maybeToBool m1
558 ok2 = maybeToBool m2
559 val1 = expectJust m1
560 val2 = expectJust m2
561 </programlisting>
562
563 <para>
564 The auxiliary functions are
565 </para>
566
567 <programlisting>
568 maybeToBool :: Maybe a -&gt; Bool
569 maybeToBool (Just x) = True
570 maybeToBool Nothing = False
571
572 expectJust :: Maybe a -&gt; a
573 expectJust (Just x) = x
574 expectJust Nothing = error "Unexpected Nothing"
575 </programlisting>
576
577 <para>
578 What is <function>clunky</function> doing? The guard <literal>ok1 &amp;&amp;
579 ok2</literal> checks that both lookups succeed, using
580 <function>maybeToBool</function> to convert the <function>Maybe</function>
581 types to booleans. The (lazily evaluated) <function>expectJust</function>
582 calls extract the values from the results of the lookups, and bind the
583 returned values to <varname>val1</varname> and <varname>val2</varname>
584 respectively. If either lookup fails, then clunky takes the
585 <literal>otherwise</literal> case and returns the sum of its arguments.
586 </para>
587
588 <para>
589 This is certainly legal Haskell, but it is a tremendously verbose and
590 un-obvious way to achieve the desired effect. Arguably, a more direct way
591 to write clunky would be to use case expressions:
592 </para>
593
594 <programlisting>
595 clunky env var1 var2 = case lookup env var1 of
596 Nothing -&gt; fail
597 Just val1 -&gt; case lookup env var2 of
598 Nothing -&gt; fail
599 Just val2 -&gt; val1 + val2
600 where
601 fail = var1 + var2
602 </programlisting>
603
604 <para>
605 This is a bit shorter, but hardly better. Of course, we can rewrite any set
606 of pattern-matching, guarded equations as case expressions; that is
607 precisely what the compiler does when compiling equations! The reason that
608 Haskell provides guarded equations is that they allow us to write down
609 the cases we want to consider, one at a time, independently of each other.
610 This structure is hidden in the case version. Two of the right-hand sides
611 are really the same (<function>fail</function>), and the whole expression
612 tends to become more and more indented.
613 </para>
614
615 <para>
616 Here is how I would write clunky:
617 </para>
618
619 <programlisting>
620 clunky env var1 var2
621 | Just val1 &lt;- lookup env var1
622 , Just val2 &lt;- lookup env var2
623 = val1 + val2
624 ...other equations for clunky...
625 </programlisting>
626
627 <para>
628 The semantics should be clear enough. The qualifiers are matched in order.
629 For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
630 right hand side is evaluated and matched against the pattern on the left.
631 If the match fails then the whole guard fails and the next equation is
632 tried. If it succeeds, then the appropriate binding takes place, and the
633 next qualifier is matched, in the augmented environment. Unlike list
634 comprehensions, however, the type of the expression to the right of the
635 <literal>&lt;-</literal> is the same as the type of the pattern to its
636 left. The bindings introduced by pattern guards scope over all the
637 remaining guard qualifiers, and over the right hand side of the equation.
638 </para>
639
640 <para>
641 Just as with list comprehensions, boolean expressions can be freely mixed
642 with the pattern guards. For example:
643 </para>
644
645 <programlisting>
646 f x | [y] &lt;- x
647 , y > 3
648 , Just z &lt;- h y
649 = ...
650 </programlisting>
651
652 <para>
653 Haskell's current guards therefore emerge as a special case, in which the
654 qualifier list has just one element, a boolean expression.
655 </para>
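
<para>To experiment with the pattern-guard version of
<function>clunky</function>, a complete program along the following lines can
be used. It is only a sketch: the abstract finite map is replaced by an
association list, and <literal>lookupFM</literal> is a hypothetical stand-in
for the <literal>lookup</literal> operation described above.</para>

<programlisting>
module Main where

-- A stand-in for the abstract finite map: an association list.
type FiniteMap = [(Int, Int)]

lookupFM :: FiniteMap -> Int -> Maybe Int
lookupFM env key = lookup key env

clunky :: FiniteMap -> Int -> Int -> Int
clunky env var1 var2
  | Just val1 &lt;- lookupFM env var1
  , Just val2 &lt;- lookupFM env var2
  = val1 + val2
-- Reached when either pattern guard fails.
clunky _ var1 var2 = var1 + var2

main :: IO ()
main = do
  let env = [(1, 10), (2, 20)]
  print (clunky env 1 2)   -- both lookups succeed: prints 30
  print (clunky env 1 3)   -- second lookup fails: prints 4
</programlisting>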
656 </sect2>
657
658 <!-- ===================== View patterns =================== -->
659
660 <sect2 id="view-patterns">
661 <title>View patterns
662 </title>
663
664 <para>
665 View patterns are enabled by the flag <literal>-XViewPatterns</literal>.
666 More information and examples of view patterns can be found on the
667 <ulink url="http://ghc.haskell.org/trac/ghc/wiki/ViewPatterns">Wiki
668 page</ulink>.
669 </para>
670
671 <para>
672 View patterns are somewhat like pattern guards that can be nested inside
673 of other patterns. They are a convenient way of pattern-matching
674 against values of abstract types. For example, in a programming language
675 implementation, we might represent the syntax of the types of the
676 language as follows:
677
678 <programlisting>
679 type Typ
680
681 data TypView = Unit
682 | Arrow Typ Typ
683
684 view :: Typ -> TypView
685
686 -- additional operations for constructing Typ's ...
687 </programlisting>
688
689 The representation of Typ is held abstract, permitting implementations
690 to use a fancy representation (e.g., hash-consing to manage sharing).
691
692 Without view patterns, using this signature is a little inconvenient:
693 <programlisting>
694 size :: Typ -> Integer
695 size t = case view t of
696 Unit -> 1
697 Arrow t1 t2 -> size t1 + size t2
698 </programlisting>
699
700 It is necessary to iterate the case, rather than using an equational
701 function definition. And the situation is even worse when the matching
702 against <literal>t</literal> is buried deep inside another pattern.
703 </para>
704
705 <para>
706 View patterns permit calling the view function inside the pattern and
707 matching against the result:
708 <programlisting>
709 size (view -> Unit) = 1
710 size (view -> Arrow t1 t2) = size t1 + size t2
711 </programlisting>
712
713 That is, we add a new form of pattern, written
714 <replaceable>expression</replaceable> <literal>-></literal>
715 <replaceable>pattern</replaceable> that means "apply the expression to
716 whatever we're trying to match against, and then match the result of
717 that application against the pattern". The expression can be any Haskell
718 expression of function type, and view patterns can be used wherever
719 patterns are used.
720 </para>
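
<para>A self-contained sketch of the <literal>size</literal> example follows.
Since <literal>Typ</literal> is abstract in the text, the concrete
representation used here (the constructors <literal>TUnit</literal> and
<literal>TArrow</literal>) is purely an assumption made so that the program
can be compiled.</para>

<programlisting>
{-# LANGUAGE ViewPatterns #-}
module Main where

-- Assumed concrete representation; a real implementation would hide it.
data Typ = TUnit | TArrow Typ Typ

data TypView = Unit
             | Arrow Typ Typ

view :: Typ -> TypView
view TUnit          = Unit
view (TArrow t1 t2) = Arrow t1 t2

-- Each equation applies view to the argument and matches on the result.
size :: Typ -> Integer
size (view -> Unit)        = 1
size (view -> Arrow t1 t2) = size t1 + size t2

main :: IO ()
main = print (size (TArrow TUnit (TArrow TUnit TUnit)))   -- prints 3
</programlisting>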
721
722 <para>
723 The semantics of a pattern <literal>(</literal>
724 <replaceable>exp</replaceable> <literal>-></literal>
725 <replaceable>pat</replaceable> <literal>)</literal> are as follows:
726
727 <itemizedlist>
728
729 <listitem><para>Scoping:</para>
730
731 <para>The variables bound by the view pattern are the variables bound by
732 <replaceable>pat</replaceable>.
733 </para>
734
735 <para>
736 Any variables in <replaceable>exp</replaceable> are bound occurrences,
737 but variables bound "to the left" in a pattern are in scope. This
738 feature permits, for example, one argument to a function to be used in
739 the view of another argument. For example, the function
740 <literal>clunky</literal> from <xref linkend="pattern-guards" /> can be
741 written using view patterns as follows:
742
743 <programlisting>
744 clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2
745 ...other equations for clunky...
746 </programlisting>
747 </para>
748
749 <para>
750 More precisely, the scoping rules are:
751 <itemizedlist>
752 <listitem>
753 <para>
754 In a single pattern, variables bound by patterns to the left of a view
755 pattern expression are in scope. For example:
756 <programlisting>
757 example :: Maybe ((String -> Integer,Integer), String) -> Bool
758 example (Just ((f,_), f -> 4)) = True
759 </programlisting>
760
761 Additionally, in function definitions, variables bound by matching earlier curried
762 arguments may be used in view pattern expressions in later arguments:
763 <programlisting>
764 example :: (String -> Integer) -> String -> Bool
765 example f (f -> 4) = True
766 </programlisting>
767 That is, the scoping is the same as it would be if the curried arguments
768 were collected into a tuple.
769 </para>
770 </listitem>
771
772 <listitem>
773 <para>
774 In mutually recursive bindings, such as <literal>let</literal>,
775 <literal>where</literal>, or the top level, view patterns in one
776 declaration may not mention variables bound by other declarations. That
777 is, each declaration must be self-contained. For example, the following
778 program is not allowed:
779 <programlisting>
780 let {(x -> y) = e1 ;
781 (y -> x) = e2 } in x
782 </programlisting>
783
784 (For some amplification on this design choice see
785 <ulink url="http://ghc.haskell.org/trac/ghc/ticket/4061">Trac #4061</ulink>.)
786
787 </para>
788 </listitem>
789 </itemizedlist>
790
791 </para>
792 </listitem>
793
794 <listitem><para> Typing: If <replaceable>exp</replaceable> has type
795 <replaceable>T1</replaceable> <literal>-></literal>
796 <replaceable>T2</replaceable> and <replaceable>pat</replaceable> matches
797 a <replaceable>T2</replaceable>, then the whole view pattern matches a
798 <replaceable>T1</replaceable>.
799 </para></listitem>
800
801 <listitem><para> Matching: To the equations in Section 3.17.3 of the
802 <ulink url="http://www.haskell.org/onlinereport/">Haskell 98
803 Report</ulink>, add the following:
804 <programlisting>
805 case v of { (e -> p) -> e1 ; _ -> e2 }
806 =
807 case (e v) of { p -> e1 ; _ -> e2 }
808 </programlisting>
809 That is, to match a variable <replaceable>v</replaceable> against a pattern
810 <literal>(</literal> <replaceable>exp</replaceable>
811 <literal>-></literal> <replaceable>pat</replaceable>
812 <literal>)</literal>, evaluate <literal>(</literal>
813 <replaceable>exp</replaceable> <replaceable> v</replaceable>
814 <literal>)</literal> and match the result against
815 <replaceable>pat</replaceable>.
816 </para></listitem>
817
818 <listitem><para> Efficiency: When the same view function is applied in
819 multiple branches of a function definition or a case expression (e.g.,
820 in <literal>size</literal> above), GHC makes an attempt to collect these
821 applications into a single nested case expression, so that the view
822 function is only applied once. Pattern compilation in GHC follows the
823 matrix algorithm described in Chapter 4 of <ulink
824 url="http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/">The
825 Implementation of Functional Programming Languages</ulink>. When the
826 top rows of the first column of a matrix are all view patterns with the
827 "same" expression, these patterns are transformed into a single nested
828 case. This includes, for example, adjacent view patterns that line up
829 in a tuple, as in
830 <programlisting>
831 f ((view -> A, p1), p2) = e1
832 f ((view -> B, p3), p4) = e2
833 </programlisting>
834 </para>
835
836 <para> The current notion of when two view pattern expressions are "the
837 same" is very restricted: it is not even full syntactic equality.
838 However, it does include variables, literals, applications, and tuples;
839 e.g., two instances of <literal>view ("hi", "there")</literal> will be
840 collected. However, the current implementation does not compare up to
841 alpha-equivalence, so two instances of <literal>(x, view x ->
842 y)</literal> will not be coalesced.
843 </para>
844
845 </listitem>
846
847 </itemizedlist>
848 </para>
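
<para>
As a small illustration of the matching rule above, the following sketch
defines a function with a view pattern next to a hand-written version that
applies the view function and then matches on the result. The names
<literal>isYes</literal> and <literal>isYes'</literal> are invented for
this example and are not part of any library.
</para>

<programlisting>
{-# LANGUAGE ViewPatterns #-}
module Main where

import Data.Char (toLower)

-- Match the argument through the view function (map toLower).
isYes :: String -> Bool
isYes (map toLower -> "yes") = True
isYes _                      = False

-- The same function written by hand, following the equation
--   case v of { (e -> p) -> e1 ; _ -> e2 }
--     = case (e v) of { p -> e1 ; _ -> e2 }
isYes' :: String -> Bool
isYes' v = case map toLower v of
  "yes" -> True
  _     -> False

main :: IO ()
main = do
  print (map isYes  ["YES", "Yes", "no"])
  print (map isYes' ["YES", "Yes", "no"])
</programlisting>

<para>
Both definitions should agree on every input, since the second is simply
the translation of the first.
</para>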
849
850 </sect2>
851
852 <!-- ===================== Pattern synonyms =================== -->
853
854 <sect2 id="pattern-synonyms">
855 <title>Pattern synonyms
856 </title>
857
858 <para>
859 Pattern synonyms are enabled by the flag
860 <literal>-XPatternSynonyms</literal>, which is required for both
861 defining them <emphasis>and</emphasis> using them. More information
862 and examples of pattern synonyms can be found on the <ulink
863 url="http://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms">Wiki
864 page</ulink>.
865 </para>
866
867 <para>
868 Pattern synonyms enable giving names to parametrized pattern
869 schemes. They can also be thought of as abstract constructors that
870 don't have a bearing on data representation. For example, in a
871 programming language implementation, we might represent types of the
872 language as follows:
873 </para>
874
875 <programlisting>
876 data Type = App String [Type]
877 </programlisting>
878
879 <para>
880 Here are some examples of using said representation.
881 Consider a few types of the <literal>Type</literal> universe encoded
882 like this:
883 </para>
884
885 <programlisting>
886 App "->" [t1, t2] -- t1 -> t2
887 App "Int" [] -- Int
888 App "Maybe" [App "Int" []] -- Maybe Int
889 </programlisting>
890
891 <para>
892 This representation is very generic in that no types are given special
893 treatment. However, some functions might need to handle some known
894 types specially, for example the following two functions collect all
895 argument types of (nested) arrow types, and recognize the
896 <literal>Int</literal> type, respectively:
897 </para>
898
899 <programlisting>
900 collectArgs :: Type -> [Type]
901 collectArgs (App "->" [t1, t2]) = t1 : collectArgs t2
902 collectArgs _ = []
903
904 isInt :: Type -> Bool
905 isInt (App "Int" []) = True
906 isInt _ = False
907 </programlisting>
908
909 <para>
910 Matching on <literal>App</literal> directly is both hard to read and
911 error prone to write. And the situation is even worse when the
912 matching is nested:
913 </para>
914
915 <programlisting>
916 isIntEndo :: Type -> Bool
917 isIntEndo (App "->" [App "Int" [], App "Int" []]) = True
918 isIntEndo _ = False
919 </programlisting>
920
921 <para>
922 Pattern synonyms permit abstracting from the representation to expose
923 matchers that behave in a constructor-like manner with respect to
924 pattern matching. We can create pattern synonyms for the known types
925 we care about, without committing the representation to them (note
926 that these don't have to be defined in the same module as the
927 <literal>Type</literal> type):
928 </para>
929
930 <programlisting>
931 pattern Arrow t1 t2 = App "->" [t1, t2]
932 pattern Int = App "Int" []
933 pattern Maybe t = App "Maybe" [t]
934 </programlisting>
935
936 <para>
937 Which enables us to rewrite our functions in a much cleaner style:
938 </para>
939
940 <programlisting>
941 collectArgs :: Type -> [Type]
942 collectArgs (Arrow t1 t2) = t1 : collectArgs t2
943 collectArgs _ = []
944
945 isInt :: Type -> Bool
946 isInt Int = True
947 isInt _ = False
948
949 isIntEndo :: Type -> Bool
950 isIntEndo (Arrow Int Int) = True
951 isIntEndo _ = False
952 </programlisting>
953
954 <para>
955 Note that in this example, the pattern synonyms
956 <literal>Int</literal> and <literal>Arrow</literal> can also be used
957 as expressions (they are <emphasis>bidirectional</emphasis>). This
958 is not necessarily the case: <emphasis>unidirectional</emphasis>
959 pattern synonyms can also be declared with the following syntax:
960 </para>
961
962 <programlisting>
963 pattern Head x &lt;- x:xs
964 </programlisting>
965
966 <para>
967 In this case, <literal>Head</literal> <replaceable>x</replaceable>
968 cannot be used in expressions, only patterns, since it wouldn't
969 specify a value for the <replaceable>xs</replaceable> on the
970 right-hand side.
971 </para>
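
<para>
The difference between the two kinds of synonym can be seen in a small
self-contained module. This is only a sketch: the helper
<literal>firstOr</literal> and the value <literal>intEndo</literal> are
hypothetical names, and the unidirectional synonym is written with a
wildcard in place of <replaceable>xs</replaceable>.
</para>

<programlisting>
{-# LANGUAGE PatternSynonyms #-}
module Main where

data Type = App String [Type]
  deriving (Eq, Show)

-- Bidirectional synonyms: usable in patterns and in expressions.
pattern Arrow t1 t2 = App "->" [t1, t2]
pattern Int         = App "Int" []

-- Unidirectional synonym: usable in patterns only.
pattern Head x &lt;- x : _

-- Arrow and Int used as expressions to build a value.
intEndo :: Type
intEndo = Arrow Int Int

-- Head used as a pattern (firstOr is a hypothetical helper).
firstOr :: a -> [a] -> a
firstOr _ (Head x) = x
firstOr d _        = d

main :: IO ()
main = do
  print intEndo
  print (firstOr 0 [7, 8, 9 :: Integer])
  print (firstOr 0 ([] :: [Integer]))
</programlisting>

<para>
Writing <literal>Head 7</literal> as an expression in this module would be
rejected, because the declaration gives no way to construct the tail of
the list.
</para>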
972
973 <para>
974 The syntax and semantics of pattern synonym declarations and their
975 usage are as follows:
976
977 <itemizedlist>
978
979 <listitem> Syntax:
980 <para>
981 A pattern synonym declaration can be either unidirectional or
982 bidirectional. The syntax for unidirectional pattern synonyms is:
983 </para>
984 <programlisting>
985 pattern Name args &lt;- pat
986 </programlisting>
987 <para>
988 and the syntax for bidirectional pattern synonyms is:
989 </para>
990 <programlisting>
991 pattern Name args = pat
992 </programlisting>
993 <para>
994 Pattern synonym declarations can only occur in the top level of a
995 module. In particular, they are not allowed as local
996 definitions. Currently, they also don't work in GHCi, but that is a
997 technical restriction that will be lifted in later versions.
998 </para>
999 <para>
1000 The name of the pattern synonym itself is in the same namespace as
1001 proper data constructors. Either prefix or infix syntax can be
1002 used. In export/import specifications, you have to prefix pattern
1003 names with the <literal>pattern</literal> keyword, e.g.:
1004 </para>
1005 <programlisting>
1006 module Example (pattern Single) where
1007 pattern Single x = [x]
1008 </programlisting>
1009 </listitem>
1010
1011 <listitem> Scoping:
1012
1013 <para>
1014 The variables in the left-hand side of the definition are bound by
1015 the pattern on the right-hand side. For bidirectional pattern
1016 synonyms, all the variables of the right-hand side must also occur
1017 on the left-hand side; also, wildcard patterns and view patterns are
1018 not allowed. For unidirectional pattern synonyms, there is no
1019 restriction on the right-hand side pattern.
1020 </para>
1021
1022 <para>
1023 Pattern synonyms cannot be defined recursively.
1024 </para>
1025
1026 </listitem>
1027
1028 <listitem> Typing:
1029
1030 <para>
1031 Given a pattern synonym definition of the form
1032 </para>
1033 <programlisting>
1034 pattern P var1 var2 ... varN &lt;- pat
1035 </programlisting>
1036 <para>
1037 it is assigned a <emphasis>pattern type</emphasis> of the form
1038 </para>
1039 <programlisting>
1040 pattern CProv => P t1 t2 ... tN :: CReq => t
1041 </programlisting>
1042 <para>
1043 where <replaceable>CProv</replaceable> and
1044 <replaceable>CReq</replaceable> are type contexts, and
1045 <replaceable>t1</replaceable>, <replaceable>t2</replaceable>, ...,
1046 <replaceable>tN</replaceable> and <replaceable>t</replaceable> are
1047 types.
1048 </para>
1049
1050 <para>
1051 A pattern synonym of this type can be used in a pattern if the
1052 instantiated (monomorphic) type satisfies the constraints of
1053 <replaceable>CReq</replaceable>. In this case, it extends the context
1054 available in the right-hand side of the match with
1055 <replaceable>CProv</replaceable>, just like how an existentially-typed
1056 data constructor can extend the context.
1057 </para>
1058
1059 <para>
1060 For example, in the following program:
1061 </para>
1062 <programlisting>
1063 {-# LANGUAGE PatternSynonyms, GADTs #-}
1064 module ShouldCompile where
1065
1066 data T a where
1067     MkT :: (Show b) => a -> b -> T a
1068
1069 pattern ExNumPat x = MkT 42 x
1070 </programlisting>
1071
1072 <para>
1073 the pattern type of <literal>ExNumPat</literal> is
1074 </para>
1075
1076 <programlisting>
1077 pattern (Show b) => ExNumPat b :: (Num a, Eq a) => T a
1078 </programlisting>
1079
1080 <para>
1081 and so can be used in a function definition like the following:
1082 </para>
1083
1084 <programlisting>
1085 f :: (Num t, Eq t) => T t -> String
1086 f (ExNumPat x) = show x
1087 </programlisting>
1088
1089 <para>
1090 For bidirectional pattern synonyms, uses as expressions have the type
1091 </para>
1092 <programlisting>
1093 (CProv, CReq) => t1 -> t2 -> ... -> tN -> t
1094 </programlisting>
1095
1096 <para>
1097 So in the previous example, <literal>ExNumPat</literal>,
1098 when used in an expression, has type
1099 </para>
1100 <programlisting>
1101 ExNumPat :: (Show b, Num a, Eq a) => b -> T a
1102 </programlisting>
1103
1104 </listitem>
1105
1106 <listitem> Matching:
1107
1108 <para>
1109 A pattern synonym occurrence in a pattern is evaluated by first
1110 matching against the pattern synonym itself, and then on the argument
1111 patterns. For example, in the following program, <literal>f</literal>
1112 and <literal>f'</literal> are equivalent:
1113 </para>
1114
1115 <programlisting>
1116 pattern Pair x y &lt;- [x, y]
1117
1118 f (Pair True True) = True
1119 f _ = False
1120
1121 f' [x, y] | True &lt;- x, True &lt;- y = True
1122 f' _ = False
1123 </programlisting>
1124
1125 <para>
1126 Note that the strictness of <literal>f</literal> differs from that
1127 of <literal>g</literal> defined below:
1128 </para>
1129
1130 <programlisting>
1131 g [True, True] = True
1132 g _ = False
1133
1134 *Main> f (False:undefined)
1135 *** Exception: Prelude.undefined
1136 *Main> g (False:undefined)
1137 False
1138 </programlisting>
1139 </listitem>
1140 </itemizedlist>
1141 </para>
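
<para>
The strictness difference described under Matching can be observed from a
compiled program as well as from GHCi. The following sketch repeats
<literal>f</literal> and <literal>g</literal> and wraps each call in a
hypothetical helper, <literal>probe</literal>, that reports whether
evaluation raised an exception.
</para>

<programlisting>
{-# LANGUAGE PatternSynonyms #-}
module Main where

import Control.Exception (SomeException, evaluate, try)

pattern Pair x y &lt;- [x, y]

f :: [Bool] -> Bool
f (Pair True True) = True
f _                = False

g :: [Bool] -> Bool
g [True, True] = True
g _            = False

-- Evaluate a Bool and report whether evaluation raised an exception.
-- (probe is a hypothetical helper for this example.)
probe :: Bool -> IO String
probe b = do
  r &lt;- try (evaluate b) :: IO (Either SomeException Bool)
  return (either (const "exception") show r)

main :: IO ()
main = do
  -- f matches the whole two-element list shape before looking at x and y
  probe (f (False : undefined)) >>= putStrLn
  -- g can reject the argument as soon as the first element fails to match
  probe (g (False : undefined)) >>= putStrLn
</programlisting>

<para>
In line with the GHCi session above, the first call should report an
exception and the second should report <literal>False</literal>.
</para>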
1142
1143 </sect2>
1144
1145 <!-- ===================== n+k patterns =================== -->
1146
1147 <sect2 id="n-k-patterns">
1148 <title>n+k patterns</title>
1149 <indexterm><primary><option>-XNPlusKPatterns</option></primary></indexterm>
1150
1151 <para>
1152 <literal>n+k</literal> pattern support is disabled by default. To enable
1153 it, you can use the <option>-XNPlusKPatterns</option> flag.
1154 </para>
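
<para>
For readers who have not met them, here is a minimal sketch of a definition
that uses an <literal>n+k</literal> pattern. The function name
<literal>factorial</literal> is chosen for this example only.
</para>

<programlisting>
{-# LANGUAGE NPlusKPatterns #-}
module Main where

-- (n + 1) matches any argument that is at least 1 and binds n to the
-- argument minus 1.
factorial :: Integer -> Integer
factorial 0       = 1
factorial (n + 1) = (n + 1) * factorial n

main :: IO ()
main = print (factorial 5)
</programlisting>

<para>
Without the flag, GHC rejects the second equation because
<literal>(n + 1)</literal> is not accepted as a pattern.
</para>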
1155
1156 </sect2>
1157
1158 <!-- ===================== Traditional record syntax =================== -->
1159
1160 <sect2 id="traditional-record-syntax">
1161 <title>Traditional record syntax</title>
1162 <indexterm><primary><option>-XNoTraditionalRecordSyntax</option></primary></indexterm>
1163
1164 <para>
1165 Traditional record syntax, such as <literal>C {f = x}</literal>, is enabled by default.
1166 To disable it, you can use the <option>-XNoTraditionalRecordSyntax</option> flag.
1167 </para>
1168
1169 </sect2>
1170
1171 <!-- ===================== Recursive do-notation =================== -->
1172
1173 <sect2 id="recursive-do-notation">
1174 <title>The recursive do-notation
1175 </title>
1176
1177 <para>
1178 The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>,
1179 that is, the variables bound in a do-expression are visible only in the textually following
1180 code block. Compare this to a let-expression, where bound variables are visible in the entire binding
1181 group.
1182 </para>
1183
1184 <para>
1185 It turns out that such recursive bindings do indeed make sense for a variety of monads, but
1186 not all. In particular, recursion in this sense requires a fixed-point operator for the underlying
1187 monad, captured by the <literal>mfix</literal> method of the <literal>MonadFix</literal> class, defined in <literal>Control.Monad.Fix</literal> as follows:
1188 <programlisting>
1189 class Monad m => MonadFix m where
1190    mfix :: (a -> m a) -> m a
1191 </programlisting>
1192 Haskell's
1193 <literal>Maybe</literal>, <literal>[]</literal> (list), <literal>ST</literal> (both strict and lazy versions),
1194 <literal>IO</literal>, and many other monads have <literal>MonadFix</literal> instances. On the negative
1195 side, the continuation monad, with the signature <literal>(a -> r) -> r</literal>, does not.
1196 </para>
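
<para>
To see what <literal>mfix</literal> provides on its own, before any
syntactic support is involved, consider the following sketch for the
<literal>Maybe</literal> monad. The name <literal>ones</literal> is
invented for the example.
</para>

<programlisting>
module Main where

import Control.Monad.Fix (mfix)

-- xs is defined in terms of itself through the Maybe monad.
ones :: Maybe [Integer]
ones = mfix (\xs -> Just (1 : xs))

main :: IO ()
main = print (fmap (take 3) ones)
</programlisting>

<para>
Because the list is infinite, the example only inspects a finite prefix
of it.
</para>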
1197
1198 <para>
1199 For monads that do belong to the <literal>MonadFix</literal> class, GHC provides
1200 an extended version of the do-notation that allows recursive bindings.
1201 The <option>-XRecursiveDo</option> (language pragma: <literal>RecursiveDo</literal>)
1202 provides the necessary syntactic support, introducing the keywords <literal>mdo</literal> and
1203 <literal>rec</literal> for higher and lower levels of the notation respectively. Unlike
1204 bindings in a <literal>do</literal> expression, those introduced by <literal>mdo</literal> and <literal>rec</literal>
1205 are recursively defined, much like in an ordinary let-expression. Due to the new
1206 keyword <literal>mdo</literal>, we also call this notation the <emphasis>mdo-notation</emphasis>.
1207 </para>
1208
1209 <para>
1210 Here is a simple (albeit contrived) example:
1211 <programlisting>
1212 {-# LANGUAGE RecursiveDo #-}
1213 justOnes = mdo { xs &lt;- Just (1:xs)
1214                ; return (map negate xs) }
1215 </programlisting>
1216 or equivalently
1217 <programlisting>
1218 {-# LANGUAGE RecursiveDo #-}
1219 justOnes = do { rec { xs &lt;- Just (1:xs) }
1220               ; return (map negate xs) }
1221 </programlisting>
1222 As you can guess <literal>justOnes</literal> will evaluate to <literal>Just [-1,-1,-1,...</literal>.
1223 </para>
1224
1225 <para>
1226 GHC's implementation of the mdo-notation closely follows the original translation as described in the paper
1227 <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for Haskell</ulink>, which
1228 in turn is based on the work <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion
1229 in Monadic Computations</ulink>. Furthermore, GHC extends the syntax described in the former paper
1230 with a lower level syntax flagged by the <literal>rec</literal> keyword, as we describe next.
1231 </para>
1232
1233 <sect3>
1234 <title>Recursive binding groups</title>
1235
1236 <para>
1237 The flag <option>-XRecursiveDo</option> also introduces a new keyword <literal>rec</literal>, which wraps a
1238 mutually-recursive group of monadic statements inside a <literal>do</literal> expression, producing a single statement.
1239 Similar to a <literal>let</literal> statement inside a <literal>do</literal>, variables bound in
1240 the <literal>rec</literal> are visible throughout the <literal>rec</literal> group, and below it. For example, compare
1241 <programlisting>
1242 do { a &lt;- getChar              do { a &lt;- getChar
1243    ; let { r1 = f a r2            ; rec { r1 &lt;- f a r2
1244    ;     ; r2 = g r1 }            ;     ; r2 &lt;- g r1 }
1245    ; return (r1 ++ r2) }          ; return (r1 ++ r2) }
1246 </programlisting>
1247 In both cases, <literal>r1</literal> and <literal>r2</literal> are available both throughout
1248 the <literal>let</literal> or <literal>rec</literal> block, and in the statements that follow it.
1249 The difference is that <literal>let</literal> is non-monadic, while <literal>rec</literal> is monadic.
1250 (In Haskell <literal>let</literal> is really <literal>letrec</literal>, of course.)
1251 </para>
1252
1253 <para>
1254 The semantics of <literal>rec</literal> is fairly straightforward. Whenever GHC finds a <literal>rec</literal>
1255 group, it will compute its set of bound variables, and will introduce an appropriate call
1256 to the underlying monadic value-recursion operator <literal>mfix</literal>, belonging to the
1257 <literal>MonadFix</literal> class. Here is an example:
1258 <programlisting>
1259 rec { b &lt;- f a c     ===>    (b,c) &lt;- mfix (\ ~(b,c) -> do { b &lt;- f a c
1260     ; c &lt;- f b a }                                         ; c &lt;- f b a
1261                                                            ; return (b,c) })
1262 </programlisting>
1263 As usual, the meta-variables <literal>b</literal>, <literal>c</literal> etc., can be arbitrary patterns.
1264 In general, the statement <literal>rec <replaceable>ss</replaceable></literal> is desugared to the statement
1265 <programlisting>
1266 <replaceable>vs</replaceable> &lt;- mfix (\ ~<replaceable>vs</replaceable> -&gt; do { <replaceable>ss</replaceable>; return <replaceable>vs</replaceable> })
1267 </programlisting>
1268 where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>.
1269 </para>
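
<para>
The translation can be checked on a concrete case. The sketch below
defines the same computation twice in the <literal>Maybe</literal> monad:
once with a <literal>rec</literal> block, and once with the call to
<literal>mfix</literal> written by hand. The names
<literal>viaRec</literal>, <literal>viaMfix</literal> and
<literal>ysLater</literal> are hypothetical, and the hand-written version
renames the lazily bound variable to avoid shadowing, which the schematic
translation above does not do.
</para>

<programlisting>
{-# LANGUAGE RecursiveDo #-}
module Main where

import Control.Monad.Fix (mfix)

-- Two mutually recursive lists bound in a rec block.
viaRec :: Maybe [Integer]
viaRec = do
  rec { xs &lt;- Just (1 : ys)
      ; ys &lt;- Just (2 : xs) }
  return (take 4 xs)

-- The same computation with the call to mfix written by hand.
-- The lazy pattern (~) matters: a strict tuple pattern would force
-- the result of the block before it has been produced.
viaMfix :: Maybe [Integer]
viaMfix = do
  (xs, _) &lt;- mfix (\ ~(_, ysLater) -> do
                     xs &lt;- Just (1 : ysLater)
                     ys &lt;- Just (2 : xs)
                     return (xs, ys))
  return (take 4 xs)

main :: IO ()
main = do
  print viaRec
  print viaMfix
</programlisting>

<para>
Both definitions should print the same value, a finite prefix of the
alternating list that the two bindings describe.
</para>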
1270
1271 <para>
1272 Note in particular that the translation for a <literal>rec</literal> block only involves wrapping a call
1273 to <literal>mfix</literal>: it performs no other analysis on the bindings. The latter is the task
1274 for the <literal>mdo</literal> notation, which is described next.
1275 </para>
1276 </sect3>
1277
1278 <sect3>
1279 <title>The <literal>mdo</literal> notation</title>
1280
1281 <para>
1282 A <literal>rec</literal>-block tells the compiler where precisely the recursive knot should be tied. It turns out that
1283 the placement of the recursive knots can be rather delicate: in particular, we would like the knots to be wrapped
1284 around as minimal groups as possible. This process is known as <emphasis>segmentation</emphasis>, and is described
1285 in detail in Section 3.2 of <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for
1286 Haskell</ulink>. Segmentation improves polymorphism and reduces the size of the recursive knot. Most importantly, it avoids
1287 unnecessary interference caused by a fundamental issue with the so-called <emphasis>right-shrinking</emphasis>
1288 axiom for monadic recursion. In brief, most monads of interest (IO, strict state, etc.) do <emphasis>not</emphasis>
1289 have recursion operators that satisfy this axiom, and thus not performing segmentation can cause unnecessary
1290 interference, changing the termination behavior of the resulting translation.
1291 (Details can be found in Sections 3.1 and 7.2.2 of
1292 <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.)
1293 </para>
1294
1295 <para>
1296 The <literal>mdo</literal> notation removes the burden of placing
1297 explicit <literal>rec</literal> blocks in the code. Unlike an
1298 ordinary <literal>do</literal> expression, in which variables bound by
1299 statements are only in scope for later statements, variables bound in
1300 an <literal>mdo</literal> expression are in scope for all statements
1301 of the expression. The compiler then automatically identifies minimal
1302 mutually recursively dependent segments of statements, treating them as
1303 if the user had wrapped a <literal>rec</literal> qualifier around them.
1304 </para>
1305
1306 <para>
1307 The definition is syntactic:
1308 </para>
1309 <itemizedlist>
1310 <listitem>
1311 <para>
1312 A generator <replaceable>g</replaceable>
1313 <emphasis>depends</emphasis> on a textually following generator
1314 <replaceable>g'</replaceable>, if
1315 </para>
1316 <itemizedlist>
1317 <listitem>
1318 <para>
1319 <replaceable>g'</replaceable> defines a variable that
1320 is used by <replaceable>g</replaceable>, or
1321 </para>
1322 </listitem>
1323 <listitem>
1324 <para>
1325 <replaceable>g'</replaceable> textually appears between
1326 <replaceable>g</replaceable> and
1327 <replaceable>g''</replaceable>, where <replaceable>g</replaceable>
1328 depends on <replaceable>g''</replaceable>.
1329 </para>
1330 </listitem>
1331 </itemizedlist>
1332 </listitem>
1333 <listitem>
1334 <para>
1335 A <emphasis>segment</emphasis> of a given
1336 <literal>mdo</literal>-expression is a minimal sequence of generators
1337 such that no generator of the sequence depends on an outside
1338 generator. As a special case, although it is not a generator,
1339 the final expression in an <literal>mdo</literal>-expression is
1340 considered to form a segment by itself.
1341 </para>
1342 </listitem>
1343 </itemizedlist>
1344 <para>
1345 Segments in this sense are
1346 related to <emphasis>strongly-connected components</emphasis> analysis,
1347 with the exception that bindings in a segment cannot be reordered and
1348 must be contiguous.
1349 </para>
1350
1351 <para>
1352 Here is an example <literal>mdo</literal>-expression, and its translation to <literal>rec</literal> blocks:
1353 <programlisting>
1354 mdo { a &lt;- getChar      ===> do { a &lt;- getChar
1355     ; b &lt;- f a c                ; rec { b &lt;- f a c
1356     ; c &lt;- f b a                ;     ; c &lt;- f b a }
1357     ; z &lt;- h a b                ; z &lt;- h a b
1358     ; d &lt;- g d e                ; rec { d &lt;- g d e
1359     ; e &lt;- g a z                ;     ; e &lt;- g a z }
1360     ; putChar c }               ; putChar c }
1361 </programlisting>
1362 Note that a given <literal>mdo</literal> expression can cause the creation of multiple <literal>rec</literal> blocks.
1363 If there are no recursive dependencies, <literal>mdo</literal> will introduce no <literal>rec</literal> blocks. In this
1364 latter case an <literal>mdo</literal> expression is precisely the same as a <literal>do</literal> expression, as one
1365 would expect.
1366 </para>
1367
1368 <para>
1369 In summary, given an <literal>mdo</literal> expression, GHC first performs segmentation, introducing
1370 <literal>rec</literal> blocks to wrap over minimal recursive groups. Then, each resulting
1371 <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal> as described
1372 in the previous section. The original <literal>mdo</literal>-expression typechecks exactly when the desugared
1373 version would do so.
1374 </para>
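
<para>
A typical use of <literal>mdo</literal> is to build a cyclic structure out
of mutable references. The following is a minimal sketch in the
<literal>IO</literal> monad; the type <literal>Node</literal> and the
function <literal>twoCycle</literal> are invented for this example.
</para>

<programlisting>
{-# LANGUAGE RecursiveDo #-}
module Main where

import Data.IORef (IORef, newIORef, readIORef)

-- A node of a mutable circular list: a value and a reference to the
-- next node.
data Node = Node Int (IORef Node)

-- Build a two-node cycle.  The first statement mentions b, which is
-- bound by the second, so segmentation places both in one rec group.
twoCycle :: IO (IORef Node)
twoCycle = mdo
  a &lt;- newIORef (Node 1 b)
  b &lt;- newIORef (Node 2 a)
  return a

main :: IO ()
main = do
  start         &lt;- twoCycle
  Node v1 next1 &lt;- readIORef start
  Node v2 next2 &lt;- readIORef next1
  Node v3 _     &lt;- readIORef next2
  print [v1, v2, v3]
</programlisting>

<para>
Following the references from the first node visits the second node and
then returns to the first. The example works because
<literal>newIORef</literal> stores its argument without evaluating it, so
<literal>b</literal> is not demanded before it has been bound.
</para>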
1375
1376 <para>
1377 Here are some other important points in using the recursive-do notation:
1378
1379 <itemizedlist>
1380 <listitem>
1381 <para>
1382 It is enabled with the flag <literal>-XRecursiveDo</literal>, or the <literal>LANGUAGE RecursiveDo</literal>
1383 pragma. (The same flag enables both <literal>mdo</literal>-notation, and the use of <literal>rec</literal>
1384 blocks inside <literal>do</literal> expressions.)
1385 </para>
1386 </listitem>
1387 <listitem>
1388 <para>
1389 <literal>rec</literal> blocks can also be used inside <literal>mdo</literal>-expressions, which will be
1390 treated as a single statement. However, it is good style to use either <literal>mdo</literal> or
1391 <literal>rec</literal> blocks in a single expression, not both.
1392 </para>
1393 </listitem>
1394 <listitem>
1395 <para>
1396 If recursive bindings are required for a monad, then that monad must be declared an instance of
1397 the <literal>MonadFix</literal> class.
1398 </para>
1399 </listitem>
1400 <listitem>
1401 <para>
1402 The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
1403 Furthermore, the <literal>Control.Monad.ST</literal> and <literal>Control.Monad.ST.Lazy</literal>
1404 modules provide the instances of the <literal>MonadFix</literal> class for Haskell's internal
1405 state monad (strict and lazy, respectively).
1406 </para>
1407 </listitem>
1408 <listitem>
1409 <para>
1410 Like <literal>let</literal> and <literal>where</literal> bindings, name shadowing is not allowed within
1411 an <literal>mdo</literal>-expression or a <literal>rec</literal>-block; that is, all the names bound in
1412 a single <literal>rec</literal> must be distinct. (GHC will complain if this is not the case.)
1413 </para>
1414 </listitem>
1415 </itemizedlist>
1416 </para>
1417 </sect3>
1418
1419
1420 </sect2>
1421
1422
1423 <!-- ===================== PARALLEL LIST COMPREHENSIONS =================== -->
1424
1425 <sect2 id="parallel-list-comprehensions">
1426 <title>Parallel List Comprehensions</title>
1427 <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
1428 </indexterm>
1429 <indexterm><primary>parallel list comprehensions</primary>
1430 </indexterm>
1431
1432 <para>Parallel list comprehensions are a natural extension to list
1433 comprehensions. List comprehensions can be thought of as a nice
1434 syntax for writing maps and filters. Parallel comprehensions
1435 extend this to include the zipWith family.</para>
1436
1437 <para>A parallel list comprehension has multiple independent
1438 branches of qualifier lists, each separated by a `|' symbol. For
1439 example, the following zips together two lists:</para>
1440
1441 <programlisting>
1442 [ (x, y) | x &lt;- xs | y &lt;- ys ]
1443 </programlisting>
1444
1445 <para>The behaviour of parallel list comprehensions follows that of
1446 zip, in that the resulting list will have the same length as the
1447 shortest branch.</para>
1448
1449 <para>We can define parallel list comprehensions by translation to
1450 regular comprehensions. Here's the basic idea:</para>
1451
1452 <para>Given a parallel comprehension of the form: </para>
1453
1454 <programlisting>
1455 [ e | p1 &lt;- e11, p2 &lt;- e12, ...
1456     | q1 &lt;- e21, q2 &lt;- e22, ...
1457     ...
1458 ]
1459 </programlisting>
1460
1461 <para>This will be translated to: </para>
1462
1463 <programlisting>
1464 [ e | ((p1,p2), (q1,q2), ...) &lt;- zipN [(p1,p2) | p1 &lt;- e11, p2 &lt;- e12, ...]
1465                                       [(q1,q2) | q1 &lt;- e21, q2 &lt;- e22, ...]
1466                                       ...
1467 ]
1468 </programlisting>
1469
1470 <para>where `zipN' is the appropriate zip for the given number of
1471 branches.</para>
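
<para>
The translation can be tried out on a two-branch comprehension, for which
<literal>zipN</literal> is simply <literal>zip</literal>. The following is
a sketch only; the names <literal>pairs</literal>,
<literal>sums</literal> and <literal>sums'</literal> are invented, and the
module enables the extension with the
<literal>ParallelListComp</literal> language pragma.
</para>

<programlisting>
{-# LANGUAGE ParallelListComp #-}
module Main where

-- Two branches: the result is as long as the shorter one.
pairs :: [(Int, Char)]
pairs = [ (x, y) | x &lt;- [1, 2, 3] | y &lt;- "ab" ]

-- Each branch is an ordinary qualifier list, so it may contain guards.
sums :: [Int]
sums = [ x + y | x &lt;- [1 .. 4], even x | y &lt;- [10, 20, 30] ]

-- The translation of sums, using zip as the two-branch zipN.
sums' :: [Int]
sums' = [ x + y | (x, y) &lt;- zip [ a | a &lt;- [1 .. 4], even a ]
                                [ b | b &lt;- [10, 20, 30] ] ]

main :: IO ()
main = do
  print pairs
  print sums
  print sums'
</programlisting>

<para>
The last two definitions should produce the same list, since the second
is the translation of the first.
</para>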
1472
1473 </sect2>
1474
1475 <!-- ===================== TRANSFORM LIST COMPREHENSIONS =================== -->
1476
1477 <sect2 id="generalised-list-comprehensions">
1478 <title>Generalised (SQL-Like) List Comprehensions</title>
1479 <indexterm><primary>list comprehensions</primary><secondary>generalised</secondary>
1480 </indexterm>
1481 <indexterm><primary>extended list comprehensions</primary>
1482 </indexterm>
1483 <indexterm><primary>group</primary></indexterm>
1484 <indexterm><primary>sql</primary></indexterm>
1485
1486
1487 <para>Generalised list comprehensions are a further enhancement to the
1488 list comprehension syntactic sugar to allow operations such as sorting
1489 and grouping which are familiar from SQL. They are fully described in the
1490 paper <ulink url="http://research.microsoft.com/~simonpj/papers/list-comp">
1491 Comprehensive comprehensions: comprehensions with "order by" and "group by"</ulink>,
1492 except that the syntax we use differs slightly from the paper.</para>
1493 <para>The extension is enabled with the flag <option>-XTransformListComp</option>.</para>
1494 <para>Here is an example:
1495 <programlisting>
1496 employees = [ ("Simon", "MS", 80)
1497             , ("Erik", "MS", 100)
1498             , ("Phil", "Ed", 40)
1499             , ("Gordon", "Ed", 45)
1500             , ("Paul", "Yale", 60)]
1501
1502 output = [ (the dept, sum salary)
1503          | (name, dept, salary) &lt;- employees
1504          , then group by dept using groupWith
1505          , then sortWith by (sum salary)
1506          , then take 5 ]
1507 </programlisting>
1508 In this example, the list <literal>output</literal> would take on
1509 the value:
1510
1511 <programlisting>
1512 [("Yale", 60), ("Ed", 85), ("MS", 180)]
1513 </programlisting>
1514 </para>
1515 <para>There are three new keywords: <literal>group</literal>, <literal>by</literal>, and <literal>using</literal>.
1516 (The functions <literal>sortWith</literal> and <literal>groupWith</literal> are not keywords; they are ordinary
1517 functions that are exported by <literal>GHC.Exts</literal>.)</para>
1518
1519 <para>There are four new forms of comprehension qualifier,
1520 all introduced by the (existing) keyword <literal>then</literal>:
1521 <itemizedlist>
1522 <listitem>
1523
1524 <programlisting>
1525 then f
1526 </programlisting>
1527
1528 This statement requires that <literal>f</literal> have the type <literal>
1529 forall a. [a] -> [a]</literal>. You can see an example of its use in the
1530 motivating example, as this form is used to apply <literal>take 5</literal>.
1531
1532 </listitem>
1533
1534
1535 <listitem>
1536 <para>
1537 <programlisting>
1538 then f by e
1539 </programlisting>
1540
1541 This form is similar to the previous one, but allows you to create a function
1542 which will be passed as the first argument to f. As a consequence f must have
1543 the type <literal>forall a. (a -> t) -> [a] -> [a]</literal>. As you can see
1544 from the type, this function lets f &quot;project out&quot; some information
1545 from the elements of the list it is transforming.</para>
1546
1547 <para>An example is shown in the opening example, where <literal>sortWith</literal>
1548 is supplied with a function that lets it find out the <literal>sum salary</literal>
1549 for any item in the list comprehension it transforms.</para>
1550
1551 </listitem>
1552
1553
1554 <listitem>
1555
1556 <programlisting>
1557 then group by e using f
1558 </programlisting>
1559
1560 <para>This is the most general of the grouping-type statements. In this form,
1561 f is required to have type <literal>forall a. (a -> t) -> [a] -> [[a]]</literal>.
1562 As with the <literal>then f by e</literal> case above, the first argument
1563 is a function supplied to f by the compiler which lets it compute e on every
1564 element of the list being transformed. However, unlike the non-grouping case,
1565 f additionally partitions the list into a number of sublists: this means that
1566 at every point after this statement, binders occurring before it in the comprehension
1567 refer to <emphasis>lists</emphasis> of possible values, not single values. To help understand
1568 this, let's look at an example:</para>
1569
1570 <programlisting>
1571 -- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first
1572 groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
1573 groupRuns f = groupBy (\x y -> f x == f y)
1574
1575 output = [ (the x, y)
1576          | x &lt;- ([1..3] ++ [1..2])
1577          , y &lt;- [4..6]
1578          , then group by x using groupRuns ]
1579 </programlisting>
1580
1581 <para>This results in the variable <literal>output</literal> taking on the value below:</para>
1582
1583 <programlisting>
1584 [(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])]
1585 </programlisting>
1586
1587 <para>Note that we have used the <literal>the</literal> function to change the type
1588 of x from a list to its original numeric type. The variable y, in contrast, is left
1589 unchanged from the list form introduced by the grouping.</para>
1590
1591 </listitem>
1592
1593 <listitem>
1594
1595 <programlisting>
1596 then group using f
1597 </programlisting>
1598
1599 <para>With this form of the group statement, f is required to simply have the type
1600 <literal>forall a. [a] -> [[a]]</literal>, which will be used to group up the
1601 comprehension so far directly. An example of this form is as follows:</para>
1602
1603 <programlisting>
1604 output = [ x
1605          | y &lt;- [1..5]
1606          , x &lt;- "hello"
1607          , then group using inits]
1608 </programlisting>
1609
1610 <para>This will yield a list containing every prefix of the word "hello" written out 5 times:</para>
1611
1612 <programlisting>
1613 ["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...]
1614 </programlisting>
1615
1616 </listitem>
1617 </itemizedlist>
1618 </para>
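
<para>
The forms listed above can be compared side by side in one small module.
This is a sketch under the assumption that <literal>the</literal>,
<literal>groupWith</literal> and <literal>sortWith</literal> are imported
from <literal>GHC.Exts</literal>, as described earlier; the top-level
names are invented for the example.
</para>

<programlisting>
{-# LANGUAGE TransformListComp #-}
module Main where

import Data.List (inits)
import GHC.Exts (groupWith, sortWith, the)

-- then f
lastTwo :: [Int]
lastTwo = [ x | x &lt;- [1 .. 5], then reverse, then take 2 ]

-- then f by e
descending :: [Int]
descending = [ x | x &lt;- [3, 1, 2], then sortWith by (negate x) ]

-- then group by e using f
-- After the group statement, p and x both refer to lists.
byParity :: [(Bool, [Int])]
byParity = [ (the p, x) | (p, x) &lt;- [ (even n, n) | n &lt;- [1 .. 6] ]
                        , then group by p using groupWith ]

-- then group using f
-- After the group statement, c refers to a list of characters.
prefixes :: [String]
prefixes = [ c | c &lt;- "abc", then group using inits ]

main :: IO ()
main = do
  print lastTwo
  print descending
  print byParity
  print prefixes
</programlisting>

<para>
Note how the type of each binder changes after a grouping statement:
<literal>x</literal> in <literal>byParity</literal> and
<literal>c</literal> in <literal>prefixes</literal> are used as lists in
the result expression.
</para>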
1619 </sect2>
1620
1621 <!-- ===================== MONAD COMPREHENSIONS ===================== -->
1622
1623 <sect2 id="monad-comprehensions">
1624 <title>Monad comprehensions</title>
1625 <indexterm><primary>monad comprehensions</primary></indexterm>
1626
1627 <para>
1628 Monad comprehensions generalise the list comprehension notation,
1629 including parallel comprehensions
1630 (<xref linkend="parallel-list-comprehensions"/>) and
1631 transform comprehensions (<xref linkend="generalised-list-comprehensions"/>)
1632 to work for any monad.
1633 </para>
1634
1635 <para>Monad comprehensions support:</para>
1636
1637 <itemizedlist>
1638 <listitem>
1639 <para>
1640 Bindings:
1641 </para>
1642
1643 <programlisting>
1644 [ x + y | x &lt;- Just 1, y &lt;- Just 2 ]
1645 </programlisting>
1646
1647 <para>
1648 Bindings are translated with the <literal>(&gt;&gt;=)</literal> and
1649 <literal>return</literal> functions to the usual do-notation:
1650 </para>
1651
1652 <programlisting>
1653 do x &lt;- Just 1
1654    y &lt;- Just 2
1655    return (x+y)
1656 </programlisting>
1657
1658 </listitem>
1659 <listitem>
1660 <para>
1661 Guards:
1662 </para>
1663
1664 <programlisting>
1665 [ x | x &lt;- [1..10], x &lt;= 5 ]
1666 </programlisting>
1667
1668 <para>
1669 Guards are translated with the <literal>guard</literal> function,
1670 which requires a <literal>MonadPlus</literal> instance:
1671 </para>
1672
1673 <programlisting>
1674 do x &lt;- [1..10]
1675 guard (x &lt;= 5)
1676 return x
1677 </programlisting>
1678
1679 </listitem>
1680 <listitem>
1681 <para>
1682 Transform statements (as with <literal>-XTransformListComp</literal>):
1683 </para>
1684
1685 <programlisting>
1686 [ x+y | x &lt;- [1..10], y &lt;- [1..x], then take 2 ]
1687 </programlisting>
1688
1689 <para>
1690 This translates to:
1691 </para>
1692
1693 <programlisting>
1694 do (x,y) &lt;- take 2 (do x &lt;- [1..10]
1695 y &lt;- [1..x]
1696 return (x,y))
1697 return (x+y)
1698 </programlisting>
1699
1700 </listitem>
1701 <listitem>
1702 <para>
1703 Group statements (as with <literal>-XTransformListComp</literal>):
1704 </para>
1705
1706 <programlisting>
1707 [ x | x &lt;- [1,1,2,2,3], then group by x using GHC.Exts.groupWith ]
1708 [ x | x &lt;- [1,1,2,2,3], then group using myGroup ]
1709 </programlisting>
1710
1711 </listitem>
1712 <listitem>
1713 <para>
1714 Parallel statements (as with <literal>-XParallelListComp</literal>):
1715 </para>
1716
1717 <programlisting>
1718 [ (x+y) | x &lt;- [1..10]
1719 | y &lt;- [11..20]
1720 ]
1721 </programlisting>
1722
1723 <para>
1724 Parallel statements are translated using the
1725 <literal>mzip</literal> function, which requires a
1726 <literal>MonadZip</literal> instance defined in
1727 <ulink url="&libraryBaseLocation;/Control-Monad-Zip.html"><literal>Control.Monad.Zip</literal></ulink>:
1728 </para>
1729
1730 <programlisting>
1731 do (x,y) &lt;- mzip (do x &lt;- [1..10]
1732 return x)
1733 (do y &lt;- [11..20]
1734 return y)
1735 return (x+y)
1736 </programlisting>
1737
1738 </listitem>
1739 </itemizedlist>
1740
1741 <para>
1742 All these features are enabled by default if the
1743 <literal>MonadComprehensions</literal> extension is enabled. The types
1744 and more detailed examples on how to use comprehensions are explained
1745 in the previous sections <xref
1746 linkend="generalised-list-comprehensions"/> and <xref
1747 linkend="parallel-list-comprehensions"/>. In general you just have
1748 to replace the type <literal>[a]</literal> with the type
1749 <literal>Monad m => m a</literal> for monad comprehensions.
1750 </para>
1751
1752 <para>
1753 Note: Even though most of these examples are using the list monad,
1754 monad comprehensions work for any monad.
1755 The <literal>base</literal> package offers all necessary instances for
1756 lists, which make <literal>MonadComprehensions</literal> backward
1757 compatible with built-in, transform and parallel list comprehensions.
1758 </para>
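<para>
To illustrate that comprehensions are not tied to lists, here is a small
sketch over <literal>Maybe</literal>, assuming only the
<literal>MonadComprehensions</literal> extension. The function names
<literal>addMaybes</literal> and <literal>safeDiv</literal> are hypothetical.
</para>
<programlisting><![CDATA[
```haskell
{-# LANGUAGE MonadComprehensions #-}

-- Binding form: behaves like  mx >>= \x -> my >>= \y -> return (x + y).
addMaybes :: Maybe Int -> Maybe Int -> Maybe Int
addMaybes mx my = [ x + y | x <- mx, y <- my ]

-- Guard form: behaves like  guard (d /= 0) >> return (n `div` d).
safeDiv :: Int -> Int -> Maybe Int
safeDiv n d = [ n `div` d | d /= 0 ]
```
]]></programlisting>
<para>
Here <literal>addMaybes (Just 1) (Just 2)</literal> is
<literal>Just 3</literal>, and <literal>safeDiv 1 0</literal> is
<literal>Nothing</literal> because the guard fails.
</para>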
1759 <para> More formally, the desugaring is as follows. We write <literal>D[ e | Q]</literal>
1760 to mean the desugaring of the monad comprehension <literal>[ e | Q]</literal>:
1761 <programlisting>
1762 Expressions: e
1763 Declarations: d
1764 Lists of qualifiers: Q,R,S
1765
1766 -- Basic forms
1767 D[ e | ] = return e
1768 D[ e | p &lt;- e, Q ] = e &gt;&gt;= \p -&gt; D[ e | Q ]
1769 D[ e | e, Q ] = guard e &gt;&gt; D[ e | Q ]
1770 D[ e | let d, Q ] = let d in D[ e | Q ]
1771
1772 -- Parallel comprehensions (iterate for multiple parallel branches)
1773 D[ e | (Q | R), S ] = mzip D[ Qv | Q ] D[ Rv | R ] &gt;&gt;= \(Qv,Rv) -&gt; D[ e | S ]
1774
1775 -- Transform comprehensions
1776 D[ e | Q then f, R ] = f D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]
1777
1778 D[ e | Q then f by b, R ] = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]
1779
1780 D[ e | Q then group using f, R ] = f D[ Qv | Q ] &gt;&gt;= \ys -&gt;
1781 case (fmap selQv1 ys, ..., fmap selQvn ys) of
1782 Qv -&gt; D[ e | R ]
1783
1784 D[ e | Q then group by b using f, R ] = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \ys -&gt;
1785 case (fmap selQv1 ys, ..., fmap selQvn ys) of
1786 Qv -&gt; D[ e | R ]
1787
1788 where Qv is the tuple of variables bound by Q (and used subsequently)
1789 selQvi is a selector mapping Qv to the ith component of Qv
1790
1791 Operator Standard binding Expected type
1792 --------------------------------------------------------------------
1793 return GHC.Base t1 -&gt; m t2
1794 (&gt;&gt;=) GHC.Base m1 t1 -&gt; (t2 -&gt; m2 t3) -&gt; m3 t3
1795 (&gt;&gt;) GHC.Base m1 t1 -&gt; m2 t2 -&gt; m3 t3
1796 guard Control.Monad t1 -&gt; m t2
1797 fmap GHC.Base forall a b. (a-&gt;b) -&gt; n a -&gt; n b
1798 mzip Control.Monad.Zip forall a b. m a -&gt; m b -&gt; m (a,b)
1799 </programlisting>
1800 The comprehension should typecheck when its desugaring would typecheck.
1801 </para>
1802 <para>
1803 Monad comprehensions support rebindable syntax (<xref linkend="rebindable-syntax"/>).
1804 Without rebindable
1805 syntax, the operators from the "standard binding" module are used; with
1806 rebindable syntax, the operators are looked up in the current lexical scope.
1807 For example, parallel comprehensions will be typechecked and desugared
1808 using whatever "<literal>mzip</literal>" is in scope.
1809 </para>
1810 <para>
1811 The rebindable operators must have the "Expected type" given in the
1812 table above. These types are surprisingly general. For example, you can
1813 use a bind operator with the type
1814 <programlisting>
1815 (>>=) :: T x y a -> (a -> T y z b) -> T x z b
1816 </programlisting>
1817 In the case of transform comprehensions, notice that the groups are
1818 parameterised over some arbitrary type <literal>n</literal> (provided it
1819 has an <literal>fmap</literal>), as well as
1820 the comprehension being over an arbitrary monad.
1821 </para>
1822 </sect2>
1823
1824 <!-- ===================== REBINDABLE SYNTAX =================== -->
1825
1826 <sect2 id="rebindable-syntax">
1827 <title>Rebindable syntax and the implicit Prelude import</title>
1828
1829 <para><indexterm><primary>-XNoImplicitPrelude
1830 option</primary></indexterm> GHC normally imports
1831 <filename>Prelude.hi</filename> files for you. If you'd
1832 rather it didn't, then give it a
1833 <option>-XNoImplicitPrelude</option> option. The idea is
1834 that you can then import a Prelude of your own. (But don't
1835 call it <literal>Prelude</literal>; the Haskell module
1836 namespace is flat, and you must not conflict with any
1837 Prelude module.)</para>
1838
1839 <para>Suppose you are importing a Prelude of your own
1840 in order to define your own numeric class
1841 hierarchy. It completely defeats that purpose if the
1842 literal "1" means "<literal>Prelude.fromInteger
1843 1</literal>", which is what the Haskell Report specifies.
1844 So the <option>-XRebindableSyntax</option>
1845 flag causes
1846 the following pieces of built-in syntax to refer to
1847 <emphasis>whatever is in scope</emphasis>, not the Prelude
1848 versions:
1849 <itemizedlist>
1850 <listitem>
1851 <para>An integer literal <literal>368</literal> means
1852 "<literal>fromInteger (368::Integer)</literal>", rather than
1853 "<literal>Prelude.fromInteger (368::Integer)</literal>".
1854 </para> </listitem>
1855
1856 <listitem><para>Fractional literals are handled in just the same way,
1857 except that the translation is
1858 <literal>fromRational (3.68::Rational)</literal>.
1859 </para> </listitem>
1860
1861 <listitem><para>The equality test in an overloaded numeric pattern
1862 uses whatever <literal>(==)</literal> is in scope.
1863 </para> </listitem>
1864
1865 <listitem><para>The subtraction operation, and the
1866 greater-than-or-equal test, in <literal>n+k</literal> patterns
1867 use whatever <literal>(-)</literal> and <literal>(>=)</literal> are in scope.
1868 </para></listitem>
1869
1870 <listitem>
1871 <para>Negation (e.g. "<literal>- (f x)</literal>")
1872 means "<literal>negate (f x)</literal>", both in numeric
1873 patterns, and expressions.
1874 </para></listitem>
1875
1876 <listitem>
1877 <para>Conditionals (e.g. "<literal>if</literal> e1 <literal>then</literal> e2 <literal>else</literal> e3")
1878 mean "<literal>ifThenElse</literal> e1 e2 e3". However, <literal>case</literal> expressions are unaffected.
1879 </para></listitem>
1880
1881 <listitem>
1882 <para>"Do" notation is translated using whatever
1883 functions <literal>(>>=)</literal>,
1884 <literal>(>>)</literal>, and <literal>fail</literal>,
1885 are in scope (not the Prelude
1886 versions). List comprehensions, mdo (<xref linkend="recursive-do-notation"/>), and parallel array
1887 comprehensions, are unaffected. </para></listitem>
1888
1889 <listitem>
1890 <para>Arrow
1891 notation (see <xref linkend="arrow-notation"/>)
1892 uses whatever <literal>arr</literal>,
1893 <literal>(>>>)</literal>, <literal>first</literal>,
1894 <literal>app</literal>, <literal>(|||)</literal> and
1895 <literal>loop</literal> functions are in scope. But unlike the
1896 other constructs, the types of these functions must match the
1897 Prelude types very closely. Details are in flux; if you want
1898 to use this, ask!
1899 </para></listitem>
1900 </itemizedlist>
1901 <option>-XRebindableSyntax</option> implies <option>-XNoImplicitPrelude</option>.
1902 </para>
1903 <para>
1904 In all cases (apart from arrow notation), the static semantics should be that of the desugared form,
1905 even if that is a little unexpected. For example, the
1906 static semantics of the literal <literal>368</literal>
1907 is exactly that of <literal>fromInteger (368::Integer)</literal>; it's fine for
1908 <literal>fromInteger</literal> to have any of the types:
1909 <programlisting>
1910 fromInteger :: Integer -> Integer
1911 fromInteger :: forall a. Foo a => Integer -> a
1912 fromInteger :: Num a => a -> Integer
1913 fromInteger :: Integer -> Bool -> Bool
1914 </programlisting>
1915 </para>
1916
1917 <para>Be warned: this is an experimental facility, with
1918 fewer checks than usual. Use <literal>-dcore-lint</literal>
1919 to typecheck the desugared program. If Core Lint is happy
1920 you should be all right.</para>
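<para>
The following sketch shows one way the conditional rule might be used. It
assumes <option>-XRebindableSyntax</option> with the standard
<literal>Prelude</literal> imported explicitly; the class
<literal>Truthy</literal> and the function <literal>describe</literal> are
hypothetical names introduced only for this illustration.
</para>
<programlisting><![CDATA[
```haskell
{-# LANGUAGE RebindableSyntax #-}
import Prelude  -- needed: RebindableSyntax implies NoImplicitPrelude

-- Hypothetical class of values that may be used as a condition.
class Truthy b where
  truthy :: b -> Bool

instance Truthy Bool where
  truthy = id

instance Truthy Int where
  truthy = (/= 0)

-- "if c then t else e" now means "ifThenElse c t e", using whichever
-- ifThenElse is in scope; here, this one.
ifThenElse :: Truthy b => b -> a -> a -> a
ifThenElse c t e = case truthy c of
  True  -> t
  False -> e

-- An Int is accepted as the condition.
describe :: Int -> String
describe n = if n then "nonzero" else "zero"
```
]]></programlisting>
<para>
Note that <literal>ifThenElse</literal> is itself written with
<literal>case</literal>, which is unaffected by the flag.
</para>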
1921
1922 </sect2>
1923
1924 <sect2 id="postfix-operators">
1925 <title>Postfix operators</title>
1926
1927 <para>
1928 The <option>-XPostfixOperators</option> flag enables a small
1929 extension to the syntax of left operator sections, which allows you to
1930 define postfix operators. The extension is this: the left section
1931 <programlisting>
1932 (e !)
1933 </programlisting>
1934 is equivalent (from the point of view of both type checking and execution) to the expression
1935 <programlisting>
1936 ((!) e)
1937 </programlisting>
1938 (for any expression <literal>e</literal> and operator <literal>(!)</literal>).
1939 The strict Haskell 98 interpretation is that the section is equivalent to
1940 <programlisting>
1941 (\y -> (!) e y)
1942 </programlisting>
1943 That is, the operator must be a function of two arguments. GHC allows it to
1944 take only one argument, and that in turn allows you to write the function
1945 postfix.
1946 </para>
1947 <para>The extension does not extend to the left-hand side of function
1948 definitions; you must define such a function in prefix form.</para>
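<para>
A minimal sketch, assuming <option>-XPostfixOperators</option>: a
one-argument factorial operator defined in prefix form and then used postfix.
The name <literal>fiveFactorial</literal> is illustrative only.
</para>
<programlisting><![CDATA[
```haskell
{-# LANGUAGE PostfixOperators #-}

-- Defined in prefix form; the operator takes a single argument.
(!) :: Integer -> Integer
(!) n = product [1 .. n]

-- The left section (5 !) is treated as ((!) 5).
fiveFactorial :: Integer
fiveFactorial = (5 !)
```
]]></programlisting>
<para>
Here <literal>fiveFactorial</literal> evaluates to <literal>120</literal>.
</para>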
1949
1950 </sect2>
1951
1952 <sect2 id="tuple-sections">
1953 <title>Tuple sections</title>
1954
1955 <para>
1956 The <option>-XTupleSections</option> flag enables Python-style partially applied
1957 tuple constructors. For example, the following expression
1958 <programlisting>
1959 (, True)
1960 </programlisting>
1961 is an alternative notation for the more unwieldy
1962 <programlisting>
1963 \x -> (x, True)
1964 </programlisting>
1965 You can omit any combination of arguments to the tuple, as in the following
1966 <programlisting>
1967 (, "I", , , "Love", , 1337)
1968 </programlisting>
1969 which translates to
1970 <programlisting>
1971 \a b c d -> (a, "I", b, c, "Love", d, 1337)
1972 </programlisting>
1973 </para>
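<para>
As a small sketch of how tuple sections are typically used, assuming
<option>-XTupleSections</option> (the names <literal>tagAll</literal> and
<literal>bracket</literal> are hypothetical):
</para>
<programlisting><![CDATA[
```haskell
{-# LANGUAGE TupleSections #-}

-- (, True) is \x -> (x, True).
tagAll :: [Int] -> [(Int, Bool)]
tagAll = map (, True)

-- Two omitted components give a two-argument function.
bracket :: Int -> Char -> (Int, String, Char)
bracket = (, "mid", )
```
]]></programlisting>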
1974
1975 <para>
1976 If you have <link linkend="unboxed-tuples">unboxed tuples</link> enabled, tuple sections
1977 will also be available for them, like so
1978 <programlisting>
1979 (# , True #)
1980 </programlisting>
1981 Because there is no unboxed unit tuple, the following expression
1982 <programlisting>
1983 (# #)
1984 </programlisting>
1985 continues to stand for the unboxed singleton tuple data constructor.
1986 </para>
1987
1988 </sect2>
1989
1990 <sect2 id="lambda-case">
1991 <title>Lambda-case</title>
1992 <para>
1993 The <option>-XLambdaCase</option> flag enables expressions of the form
1994 <programlisting>
1995 \case { p1 -> e1; ...; pN -> eN }
1996 </programlisting>
1997 which is equivalent to
1998 <programlisting>
1999 \freshName -> case freshName of { p1 -> e1; ...; pN -> eN }
2000 </programlisting>
2001 Note that <literal>\case</literal> starts a layout, so you can write
2002 <programlisting>
2003 \case
2004 p1 -> e1
2005 ...
2006 pN -> eN
2007 </programlisting>
2008 </para>
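<para>
A minimal sketch using the layout form, assuming
<option>-XLambdaCase</option>; the function <literal>describe</literal> is
illustrative only.
</para>
<programlisting><![CDATA[
```haskell
{-# LANGUAGE LambdaCase #-}

-- Equivalent to  \m -> case m of { Nothing -> ...; ... }.
describe :: Maybe Int -> String
describe = \case
  Nothing -> "nothing"
  Just 0  -> "zero"
  Just _  -> "something"
```
]]></programlisting>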
2009 </sect2>
2010
2011 <sect2 id="empty-case">
2012 <title>Empty case alternatives</title>
2013 <para>
2014 The <option>-XEmptyCase</option> flag enables
2015 case expressions, or lambda-case expressions, that have no alternatives,
2016 thus:
2017 <programlisting>
2018 case e of { } -- No alternatives
2019 or
2020 \case { } -- -XLambdaCase is also required
2021 </programlisting>
2022 This can be useful when you know that the expression being scrutinised
2023 has no non-bottom values. For example:
2024 <programlisting>
2025 data Void
2026 f :: Void -> Int
2027 f x = case x of { }
2028 </programlisting>
2029 With dependently-typed features it is more useful
2030 (see <ulink url="http://ghc.haskell.org/trac/ghc/ticket/2431">Trac</ulink>).
2031 For example, consider these two candidate definitions of <literal>absurd</literal>:
2032 <programlisting>
2033 data a :~: b where
2034 Refl :: a :~: a
2035
2036 absurd :: True :~: False -> a
2037 absurd x = error "absurd" -- (A)
2038 absurd x = case x of {} -- (B)
2039 </programlisting>
2040 We much prefer (B). Why? Because GHC can figure out that <literal>(True :~: False)</literal>
2041 is an empty type. So (B) has no partiality and GHC should be able to compile with
2042 <option>-fwarn-incomplete-patterns</option>. (Though the pattern match checking is not
2043 yet clever enough to do that.)
2044 On the other hand (A) looks dangerous, and GHC doesn't check to make
2045 sure that, in fact, the function can never get called.
2046 </para>
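<para>
The simpler, non-dependently-typed use can be sketched as follows, assuming
<option>-XEmptyCase</option>. The type <literal>Empty</literal> and both
function names are hypothetical, chosen to avoid clashing with any library
definitions.
</para>
<programlisting><![CDATA[
```haskell
{-# LANGUAGE EmptyCase #-}

-- A type with no constructors, so it has no non-bottom values.
data Empty

-- No alternatives are needed, and none are possible.
absurdEmpty :: Empty -> a
absurdEmpty x = case x of {}

-- The Right branch can never be reached with a defined value.
fromLeftOnly :: Either a Empty -> a
fromLeftOnly (Left a)  = a
fromLeftOnly (Right e) = absurdEmpty e
```
]]></programlisting>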
2047 </sect2>
2048
2049 <sect2 id="multi-way-if">
2050 <title>Multi-way if-expressions</title>
2051 <para>
2052 With the <option>-XMultiWayIf</option> flag, GHC accepts conditional expressions
2053 with multiple branches:
2054 <programlisting>
2055 if | guard1 -> expr1
2056 | ...
2057 | guardN -> exprN
2058 </programlisting>
2059 which is roughly equivalent to
2060 <programlisting>
2061 case () of
2062 _ | guard1 -> expr1
2063 ...
2064 _ | guardN -> exprN
2065 </programlisting>
2066 </para>
2067
2068 <para>Multi-way if expressions introduce a new layout context. So the
2069 example above is equivalent to:
2070 <programlisting>
2071 if { | guard1 -> expr1
2072 ; | ...
2073 ; | guardN -> exprN
2074 }
2075 </programlisting>
2076 The following behaves as expected:
2077 <programlisting>
2078 if | guard1 -> if | guard2 -> expr2
2079 | guard3 -> expr3
2080 | guard4 -> expr4
2081 </programlisting>
2082 because layout translates it as
2083 <programlisting>
2084 if { | guard1 -> if { | guard2 -> expr2
2085 ; | guard3 -> expr3
2086 }
2087 ; | guard4 -> expr4
2088 }
2089 </programlisting>
2090 Layout with multi-way if works in the same way as other layout
2091 contexts, except that the semi-colons between guards in a multi-way if
2092 are optional. So it is not necessary to line up all the guards at the
2093 same column; this is consistent with the way guards work in function
2094 definitions and case expressions.
2095 </para>
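<para>
A concrete instance of the schematic form above, assuming
<option>-XMultiWayIf</option>; the function <literal>classify</literal> is
illustrative only.
</para>
<programlisting><![CDATA[
```haskell
{-# LANGUAGE MultiWayIf #-}

-- Guards are tried top to bottom; otherwise catches everything else.
classify :: Int -> String
classify n = if | n < 0     -> "negative"
                | n == 0    -> "zero"
                | otherwise -> "positive"
```
]]></programlisting>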
2096 </sect2>
2097
2098 <sect2 id="disambiguate-fields">
2099 <title>Record field disambiguation</title>
2100 <para>
2101 In record construction and record pattern matching
2102 it is entirely unambiguous which field is referred to, even if there are two different
2103 data types in scope with a common field name. For example:
2104 <programlisting>
2105 module M where
2106 data S = MkS { x :: Int, y :: Bool }
2107
2108 module Foo where
2109 import M
2110
2111 data T = MkT { x :: Int }
2112
2113 ok1 (MkS { x = n }) = n+1 -- Unambiguous
2114 ok2 n = MkT { x = n+1 } -- Unambiguous
2115
2116 bad1 k = k { x = 3 } -- Ambiguous
2117 bad2 k = x k -- Ambiguous
2118 </programlisting>
2119 Even though there are two <literal>x</literal>'s in scope,
2120 it is clear that the <literal>x</literal> in the pattern in the
2121 definition of <literal>ok1</literal> can only mean the field
2122 <literal>x</literal> from type <literal>S</literal>. Similarly for
2123 the function <literal>ok2</literal>. However, in the record update
2124 in <literal>bad1</literal> and the record selection in <literal>bad2</literal>
2125 it is not clear which of the two types is intended.
2126 </para>
2127 <para>
2128 Haskell 98 regards all four as ambiguous, but with the
2129 <option>-XDisambiguateRecordFields</option> flag, GHC will accept
2130 the former two. The rules are precisely the same as those for instance
2131 declarations in Haskell 98, where the method names on the left-hand side
2132 of the method bindings in an instance declaration refer unambiguously
2133 to the method of that class (provided they are in scope at all), even
2134 if there are other variables in scope with the same name.
2135 This reduces the clutter of qualified names when you import two
2136 records from different modules that use the same field name.
2137 </para>
2138 <para>
2139 Some details:
2140 <itemizedlist>
2141 <listitem><para>
2142 Field disambiguation can be combined with punning (see <xref linkend="record-puns"/>). For example:
2143 <programlisting>
2144 module Foo where
2145 import M
2146 x=True
2147 ok3 (MkS { x }) = x+1 -- Uses both disambiguation and punning
2148 </programlisting>
2149 </para></listitem>
2150
2151 <listitem><para>
2152 With <option>-XDisambiguateRecordFields</option> you can use <emphasis>unqualified</emphasis>
2153 field names even if the corresponding selector is only in scope <emphasis>qualified</emphasis>.
2154 For example, assuming the same module <literal>M</literal> as in our earlier example, this is legal:
2155 <programlisting>
2156 module Foo where
2157 import qualified M -- Note qualified
2158
2159 ok4 (M.MkS { x = n }) = n+1 -- Unambiguous
2160 </programlisting>
2161 Since the constructor <literal>MkS</literal> is only in scope qualified, you must
2162 name it <literal>M.MkS</literal>, but the field <literal>x</literal> does not need
2163 to be qualified even though <literal>M.x</literal> is in scope but <literal>x</literal>
2164 is not. (In effect, it is qualified by the constructor.)
2165 </para></listitem>
2166 </itemizedlist>
2167 </para>
2168
2169 </sect2>
2170
2171 <!-- ===================== Record puns =================== -->
2172
2173 <sect2 id="record-puns">
2174 <title>Record puns
2175 </title>
2176
2177 <para>
2178 Record puns are enabled by the flag <literal>-XNamedFieldPuns</literal>.
2179 </para>
2180
2181 <para>
2182 When using records, it is common to write a pattern that binds a
2183 variable with the same name as a record field, such as:
2184
2185 <programlisting>
2186 data C = C {a :: Int}
2187 f (C {a = a}) = a
2188 </programlisting>
2189 </para>
2190
2191 <para>
2192 Record punning permits the variable name to be elided, so one can simply
2193 write
2194
2195 <programlisting>
2196 f (C {a}) = a
2197 </programlisting>
2198
2199 to mean the same pattern as above. That is, in a record pattern, the
2200 pattern <literal>a</literal> expands into the pattern <literal>a =
2201 a</literal> for the same name <literal>a</literal>.
2202 </para>
2203
2204 <para>
2205 Note that:
2206 <itemizedlist>
2207 <listitem><para>
2208 Record punning can also be used in an expression, writing, for example,
2209 <programlisting>
2210 let a = 1 in C {a}
2211 </programlisting>
2212 instead of
2213 <programlisting>
2214 let a = 1 in C {a = a}
2215 </programlisting>
2216 The expansion is purely syntactic, so the expanded right-hand side
2217 expression refers to the nearest enclosing variable that is spelled the
2218 same as the field name.
2219 </para></listitem>
2220
2221 <listitem><para>
2222 Puns and other patterns can be mixed in the same record:
2223 <programlisting>
2224 data C = C {a :: Int, b :: Int}
2225 f (C {a, b = 4}) = a
2226 </programlisting>
2227 </para></listitem>
2228
2229 <listitem><para>
2230 Puns can be used wherever record patterns occur (e.g. in
2231 <literal>let</literal> bindings or at the top-level).
2232 </para></listitem>
2233
2234 <listitem><para>
2235 A pun on a qualified field name is expanded by stripping off the module qualifier.
2236 For example:
2237 <programlisting>
2238 f (M.C {M.a}) = a
2239 </programlisting>
2240 means
2241 <programlisting>
2242 f (M.C {M.a = a}) = a
2243 </programlisting>
2244 (This is useful if the field selector <literal>a</literal> for constructor <literal>M.C</literal>
2245 is only in scope in qualified form.)
2246 </para></listitem>
2247 </itemizedlist>
2248 </para>
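<para>
The pattern and expression forms can be sketched together as follows,
assuming <option>-XNamedFieldPuns</option>. The type <literal>Point</literal>
and its fields are hypothetical.
</para>
<programlisting><![CDATA[
```haskell
{-# LANGUAGE NamedFieldPuns #-}

data Point = Point { px :: Int, py :: Int }
  deriving (Eq, Show)

-- Pun in a pattern: Point {px, py} means Point {px = px, py = py}.
norm1 :: Point -> Int
norm1 (Point {px, py}) = abs px + abs py

-- Pun in an expression: each field takes the value of the nearest
-- enclosing variable with the same name.
diagonal :: Int -> Point
diagonal n = let { px = n; py = n } in Point {px, py}
```
]]></programlisting>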
2249
2250
2251 </sect2>
2252
2253 <!-- ===================== Record wildcards =================== -->
2254
2255 <sect2 id="record-wildcards">
2256 <title>Record wildcards
2257 </title>
2258
2259 <para>
2260 Record wildcards are enabled by the flag <literal>-XRecordWildCards</literal>.
2261 This flag implies <literal>-XDisambiguateRecordFields</literal>.
2262 </para>
2263
2264 <para>
2265 For records with many fields, it can be tiresome to write out each field
2266 individually in a record pattern, as in
2267 <programlisting>
2268 data C = C {a :: Int, b :: Int, c :: Int, d :: Int}
2269 f (C {a = 1, b = b, c = c, d = d}) = b + c + d
2270 </programlisting>
2271 </para>
2272
2273 <para>
2274 Record wildcard syntax permits a "<literal>..</literal>" in a record
2275 pattern, where each elided field <literal>f</literal> is replaced by the
2276 pattern <literal>f = f</literal>. For example, the above pattern can be
2277 written as
2278 <programlisting>
2279 f (C {a = 1, ..}) = b + c + d
2280 </programlisting>
2281 </para>
2282
2283 <para>
2284 More details:
2285 <itemizedlist>
2286 <listitem><para>
2287 Wildcards can be mixed with other patterns, including puns
2288 (<xref linkend="record-puns"/>); for example, in a pattern <literal>C {a
2289 = 1, b, ..}</literal>. Additionally, record wildcards can be used
2290 wherever record patterns occur, including in <literal>let</literal>
2291 bindings and at the top-level. For example, the top-level binding
2292 <programlisting>
2293 C {a = 1, ..} = e
2294 </programlisting>
2295 defines <literal>b</literal>, <literal>c</literal>, and
2296 <literal>d</literal>.
2297 </para></listitem>
2298
2299 <listitem><para>
2300 Record wildcards can also be used in expressions, writing, for example,
2301 <programlisting>
2302 let {a = 1; b = 2; c = 3; d = 4} in C {..}
2303 </programlisting>
2304 in place of
2305 <programlisting>
2306 let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}
2307 </programlisting>
2308 The expansion is purely syntactic, so the record wildcard
2309 expression refers to the nearest enclosing variables that are spelled
2310 the same as the omitted field names.
2311 </para></listitem>
2312
2313 <listitem><para>
2314 The "<literal>..</literal>" expands to the missing
2315 <emphasis>in-scope</emphasis> record fields.
2316 Specifically the expansion of "<literal>C {..}</literal>" includes
2317 <literal>f</literal> if and only if:
2318 <itemizedlist>
2319 <listitem><para>
2320 <literal>f</literal> is a record field of constructor <literal>C</literal>.
2321 </para></listitem>
2322 <listitem><para>
2323 The record field <literal>f</literal> is in scope somehow (either qualified or unqualified).
2324 </para></listitem>
2325 <listitem><para>
2326 In the case of expressions (but not patterns),
2327 the variable <literal>f</literal> is in scope unqualified,
2328 apart from the binding of the record selector itself.
2329 </para></listitem>
2330 </itemizedlist>
2331 For example
2332 <programlisting>
2333 module M where
2334 data R = R { a,b,c :: Int }
2335 module X where
2336 import M( R(a,c) )
2337 f b = R { .. }
2338 </programlisting>
2339 The <literal>R{..}</literal> expands to <literal>R{M.a=a}</literal>,
2340 omitting <literal>b</literal> since the record field is not in scope,
2341 and omitting <literal>c</literal> since the variable <literal>c</literal>
2342 is not in scope (apart from the binding of the
2343 record selector <literal>c</literal>, of course).
2344 </para></listitem>
2345 </itemizedlist>
2346 </para>
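<para>
A short sketch of both wildcard forms in one module, assuming
<option>-XRecordWildCards</option>. The type <literal>Config</literal> and
its fields are hypothetical.
</para>
<programlisting><![CDATA[
```haskell
{-# LANGUAGE RecordWildCards #-}

data Config = Config { host :: String, port :: Int }
  deriving (Eq, Show)

-- Pattern wildcard: binds host and port.
render :: Config -> String
render (Config {..}) = host ++ ":" ++ show port

-- Expression wildcard: picks up the let-bound host and port.
defaultConfig :: Config
defaultConfig = let { host = "localhost"; port = 8080 } in Config {..}
```
]]></programlisting>
<para>
Here <literal>render defaultConfig</literal> is
<literal>"localhost:8080"</literal>.
</para>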
2347
2348 </sect2>
2349
2350 <!-- ===================== Local fixity declarations =================== -->
2351
2352 <sect2 id="local-fixity-declarations">
2353 <title>Local Fixity Declarations
2354 </title>
2355
2356 <para>A careful reading of the Haskell 98 Report reveals that fixity
2357 declarations (<literal>infix</literal>, <literal>infixl</literal>, and
2358 <literal>infixr</literal>) are permitted to appear inside local bindings
2359 such as those introduced by <literal>let</literal> and
2360 <literal>where</literal>. However, the Haskell Report does not specify
2361 the semantics of such bindings very precisely.
2362 </para>
2363
2364 <para>In GHC, a fixity declaration may accompany a local binding:
2365 <programlisting>
2366 let f = ...
2367 infixr 3 `f`
2368 in
2369 ...
2370 </programlisting>
2371 and the fixity declaration applies wherever the binding is in scope.
2372 For example, in a <literal>let</literal>, it applies in the right-hand
2373 sides of other <literal>let</literal>-bindings and the body of the
2374 <literal>let</literal>. Or, in recursive <literal>do</literal>
2375 expressions (<xref linkend="recursive-do-notation"/>), the local fixity
2376 declarations of a <literal>let</literal> statement scope over other
2377 statements in the group, just as the bound name does.
2378 </para>
2379
2380 <para>
2381 Moreover, a local fixity declaration <emphasis>must</emphasis> accompany a local binding of
2382 that name: it is not possible to revise the fixity of a name bound
2383 elsewhere, as in
2384 <programlisting>
2385 let infixr 9 $ in ...
2386 </programlisting>
2387
2388 Because local fixity declarations are technically Haskell 98, no flag is
2389 necessary to enable them.
2390 </para>
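<para>
The effect of a local fixity declaration can be seen in the following sketch,
which needs no language flag. The names <literal>rightNested</literal>,
<literal>leftNested</literal> and <literal>minus</literal> are illustrative
only.
</para>
<programlisting><![CDATA[
```haskell
-- With a local fixity declaration accompanying the local binding.
rightNested :: Int
rightNested =
  let x `minus` y = x - y
      infixr 6 `minus`
  in 10 `minus` 3 `minus` 2   -- parsed as 10 `minus` (3 `minus` 2)

-- Without one, the default fixity (infixl 9) applies.
leftNested :: Int
leftNested =
  let x `minus` y = x - y
  in 10 `minus` 3 `minus` 2   -- parsed as (10 `minus` 3) `minus` 2
```
]]></programlisting>
<para>
Here <literal>rightNested</literal> is <literal>9</literal> and
<literal>leftNested</literal> is <literal>5</literal>.
</para>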
2391 </sect2>
2392
2393 <sect2 id="package-imports">
2394 <title>Package-qualified imports</title>
2395
2396 <para>With the <option>-XPackageImports</option> flag, GHC allows
2397 import declarations to be qualified by the package name that the
2398 module is intended to be imported from. For example:</para>
2399
2400 <programlisting>
2401 import "network" Network.Socket
2402 </programlisting>
2403
2404 <para>would import the module <literal>Network.Socket</literal> from
2405 the package <literal>network</literal> (any version). This may
2406 be used to disambiguate an import when the same module is
2407 available from multiple packages, or is present in both the
2408 current package being built and an external package.</para>
2409
2410 <para>The special package name <literal>this</literal> can be used to
2411 refer to the current package being built.</para>
2412
2413 <para>Note: you probably don't need to use this feature; it was
2414 added mainly so that we can build backwards-compatible versions of
2415 packages when APIs change. It can lead to fragile dependencies in
2416 the common case: modules occasionally move from one package to
2417 another, rendering any package-qualified imports broken.</para>
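<para>
A minimal sketch, assuming <option>-XPackageImports</option> and a standard
GHC installation in which <literal>Data.List</literal> is provided by the
<literal>base</literal> package; the name <literal>sortedSample</literal> is
illustrative only.
</para>
<programlisting><![CDATA[
```haskell
{-# LANGUAGE PackageImports #-}

-- The string names the package the module must come from.
import "base" Data.List (sort)

sortedSample :: [Int]
sortedSample = sort [3, 1, 2]
```
]]></programlisting>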
2418 </sect2>
2419
2420 <sect2 id="safe-imports-ext">
2421 <title>Safe imports</title>
2422
2423 <para>With the <option>-XSafe</option>, <option>-XTrustworthy</option>
2424 and <option>-XUnsafe</option> language flags, GHC extends
2425 the import declaration syntax to take an optional <literal>safe</literal>
2426 keyword after the <literal>import</literal> keyword. This feature
2427 is part of the Safe Haskell GHC extension. For example:</para>
2428
2429 <programlisting>
2430 import safe qualified Network.Socket as NS
2431 </programlisting>
2432
2433 <para>would import the module <literal>Network.Socket</literal>
2434 with compilation only succeeding if <literal>Network.Socket</literal> can be
2435 safely imported. For a description of when an import is
2436 considered safe, see <xref linkend="safe-haskell"/>.</para>
2437
2438 </sect2>
2439
2440 <sect2 id="explicit-namespaces">
2441 <title>Explicit namespaces in import/export</title>
2442
2443 <para> In an import or export list, such as
2444 <programlisting>
2445 module M( f, (++) ) where ...
2446 import N( f, (++) )
2447 ...
2448 </programlisting>
2449 the entities <literal>f</literal> and <literal>(++)</literal> are <emphasis>values</emphasis>.
2450 However, with type operators (<xref linkend="type-operators"/>) it becomes possible
2451 to declare <literal>(++)</literal> as a <emphasis>type constructor</emphasis>. In that
2452 case, how would you export or import it?
2453 </para>
2454 <para>
2455 The <option>-XExplicitNamespaces</option> extension allows you to prefix the name of
2456 a type constructor in an import or export list with "<literal>type</literal>" to
2457 disambiguate this case, thus:
2458 <programlisting>
2459 module M( f, type (++) ) where ...
2460 import N( f, type (++) )
2461 ...
2462 module N( f, type (++) ) where
2463 data a ++ b = L a | R b
2464 </programlisting>
2465 The extension <option>-XExplicitNamespaces</option>
2466 is implied by <option>-XTypeOperators</option> and (for some reason) by <option>-XTypeFamilies</option>.
2467 </para>
2468 </sect2>
2469
2470 <sect2 id="syntax-stolen">
2471 <title>Summary of stolen syntax</title>
2472
2473 <para>Turning on an option that enables special syntax
2474 <emphasis>might</emphasis> cause working Haskell 98 code to fail
2475 to compile, perhaps because it uses a variable name which has
2476 become a reserved word. This section lists the syntax that is
2477 "stolen" by language extensions.
2478 We use
2479 notation and nonterminal names from the Haskell 98 lexical syntax
2480 (see the Haskell 98 Report).
2481 We only list syntax changes here that might affect
2482 existing working programs (i.e. "stolen" syntax). Many of these
2483 extensions will also enable new context-free syntax, but in all
2484 cases programs written to use the new syntax would not be
2485 compilable without the option enabled.</para>
2486
2487 <para>There are two classes of special
2488 syntax:
2489
2490 <itemizedlist>
2491 <listitem>
2492 <para>New reserved words and symbols: character sequences
2493 which are no longer available for use as identifiers in the
2494 program.</para>
2495 </listitem>
2496 <listitem>
2497 <para>Other special syntax: sequences of characters that have
2498 a different meaning when this particular option is turned
2499 on.</para>
2500 </listitem>
2501 </itemizedlist>
2502
2503 The following syntax is stolen:
2504
2505 <variablelist>
2506 <varlistentry>
2507 <term>
2508 <literal>forall</literal>
2509 <indexterm><primary><literal>forall</literal></primary></indexterm>
2510 </term>
2511 <listitem><para>
2512 Stolen (in types) by: <option>-XExplicitForAll</option>, and hence by
2513 <option>-XScopedTypeVariables</option>,
2514 <option>-XLiberalTypeSynonyms</option>,
2515 <option>-XRankNTypes</option>,
2516 <option>-XExistentialQuantification</option>
2517 </para></listitem>
2518 </varlistentry>
2519
2520 <varlistentry>
2521 <term>
2522 <literal>mdo</literal>
2523 <indexterm><primary><literal>mdo</literal></primary></indexterm>
2524 </term>
2525 <listitem><para>
2526 Stolen by: <option>-XRecursiveDo</option>
2527 </para></listitem>
2528 </varlistentry>
2529
2530 <varlistentry>
2531 <term>
2532 <literal>foreign</literal>
2533 <indexterm><primary><literal>foreign</literal></primary></indexterm>
2534 </term>
2535 <listitem><para>
2536 Stolen by: <option>-XForeignFunctionInterface</option>
2537 </para></listitem>
2538 </varlistentry>
2539
2540 <varlistentry>
2541 <term>
2542 <literal>rec</literal>,
2543 <literal>proc</literal>, <literal>-&lt;</literal>,
2544 <literal>&gt;-</literal>, <literal>-&lt;&lt;</literal>,
2545 <literal>&gt;&gt;-</literal>, and <literal>(|</literal>,
2546 <literal>|)</literal> brackets
2547 <indexterm><primary><literal>proc</literal></primary></indexterm>
2548 </term>
2549 <listitem><para>
2550 Stolen by: <option>-XArrows</option>
2551 </para></listitem>
2552 </varlistentry>
2553
2554 <varlistentry>
2555 <term>
2556 <literal>?<replaceable>varid</replaceable></literal>
2557 <indexterm><primary>implicit parameters</primary></indexterm>
2558 </term>
2559 <listitem><para>
2560 Stolen by: <option>-XImplicitParams</option>
2561 </para></listitem>
2562 </varlistentry>
2563
2564 <varlistentry>
2565 <term>
2566 <literal>[|</literal>,
2567 <literal>[e|</literal>, <literal>[p|</literal>,
2568 <literal>[d|</literal>, <literal>[t|</literal>,
2569 <literal>$(</literal>,
2570 <literal>$$(</literal>,
2571 <literal>[||</literal>,
2572 <literal>[e||</literal>,
2573 <literal>$<replaceable>varid</replaceable></literal>,
2574 <literal>$$<replaceable>varid</replaceable></literal>
2575 <indexterm><primary>Template Haskell</primary></indexterm>
2576 </term>
2577 <listitem><para>
2578 Stolen by: <option>-XTemplateHaskell</option>
2579 </para></listitem>
2580 </varlistentry>
2581
2582 <varlistentry>
2583 <term>
2584 <literal>[<replaceable>varid</replaceable>|</literal>
2585 <indexterm><primary>quasi-quotation</primary></indexterm>
2586 </term>
2587 <listitem><para>
2588 Stolen by: <option>-XQuasiQuotes</option>
2589 </para></listitem>
2590 </varlistentry>
2591
2592 <varlistentry>
2593 <term>
2594 <replaceable>varid</replaceable>{<literal>&num;</literal>},
2595 <replaceable>char</replaceable><literal>&num;</literal>,
2596 <replaceable>string</replaceable><literal>&num;</literal>,
2597 <replaceable>integer</replaceable><literal>&num;</literal>,
2598 <replaceable>float</replaceable><literal>&num;</literal>,
2599 <replaceable>float</replaceable><literal>&num;&num;</literal>
2600 </term>
2601 <listitem><para>
2602 Stolen by: <option>-XMagicHash</option>
2603 </para></listitem>
2604 </varlistentry>
2605
2606 <varlistentry>
2607 <term>
2608 <literal>(&num;</literal>, <literal>&num;)</literal>
2609 </term>
2610 <listitem><para>
2611 Stolen by: <option>-XUnboxedTuples</option>
2612 </para></listitem>
2613 </varlistentry>
2614
2615 <varlistentry>
2616 <term>
2617 <replaceable>varid</replaceable><literal>!</literal><replaceable>varid</replaceable>
2618 </term>
2619 <listitem><para>
2620 Stolen by: <option>-XBangPatterns</option>
2621 </para></listitem>
2622 </varlistentry>
2623
2624 <varlistentry>
2625 <term>
2626 <literal>pattern</literal>
2627 </term>
2628 <listitem><para>
2629 Stolen by: <option>-XPatternSynonyms</option>
2630 </para></listitem>
2631 </varlistentry>
2632 </variablelist>
2633 </para>
2634 </sect2>
2635 </sect1>
2636
2637
2638 <!-- TYPE SYSTEM EXTENSIONS -->
2639 <sect1 id="data-type-extensions">
2640 <title>Extensions to data types and type synonyms</title>
2641
2642 <sect2 id="nullary-types">
2643 <title>Data types with no constructors</title>
2644
2645 <para>With the <option>-XEmptyDataDecls</option> flag (or equivalent LANGUAGE pragma),
2646 GHC lets you declare a data type with no constructors. For example:</para>
2647
2648 <programlisting>
2649 data S -- S :: *
2650 data T a -- T :: * -> *
2651 </programlisting>
2652
2653 <para>Syntactically, the declaration lacks the "= constrs" part. The
2654 type can be parameterised over types of any kind, but if the kind is
2655 not <literal>*</literal> then an explicit kind annotation must be used
2656 (see <xref linkend="kinding"/>).</para>
2657
2658 <para>Such data types have only one value, namely bottom.
2659 Nevertheless, they can be useful when defining "phantom types".</para>
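<para>As a rough illustration of the phantom-type idiom, the sketch below uses two
empty data types purely as type-level tags. All the names in it
(<literal>Validated</literal>, <literal>Unvalidated</literal>, <literal>Form</literal>,
<literal>mkForm</literal>, <literal>validate</literal>, <literal>submit</literal>) are
hypothetical and chosen only for this example; the validation rule is arbitrary.</para>

<programlisting>
{-# LANGUAGE EmptyDataDecls #-}

-- Hypothetical tag types: no constructors, used only at the type level.
data Validated
data Unvalidated

-- The 'tag' parameter is phantom: it does not appear on the right-hand side.
newtype Form tag = Form String

mkForm :: String -> Form Unvalidated
mkForm = Form

-- Only non-empty input is accepted (an arbitrary rule for illustration).
validate :: Form Unvalidated -> Maybe (Form Validated)
validate (Form s)
  | null s    = Nothing
  | otherwise = Just (Form s)

-- Accepts validated forms only; applying it to 'mkForm "x"' is a type error.
submit :: Form Validated -> String
submit (Form s) = "submitted: " ++ s
</programlisting>

<para>Here <literal>fmap submit (validate (mkForm "hello"))</literal> type-checks, whereas
<literal>submit (mkForm "hello")</literal> is rejected at compile time.</para>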
2660 </sect2>
2661
2662 <sect2 id="datatype-contexts">
2663 <title>Data type contexts</title>
2664
2665 <para>Haskell allows datatypes to be given contexts, e.g.</para>
2666
2667 <programlisting>
2668 data Eq a => Set a = NilSet | ConsSet a (Set a)
2669 </programlisting>
2670
2671 <para>This declaration gives the constructors the following types:</para>
2672
2673 <programlisting>
2674 NilSet :: Set a
2675 ConsSet :: Eq a => a -> Set a -> Set a
2676 </programlisting>
2677
2678 <para>This is widely considered a misfeature, and is going to be removed from
2679 the language. In GHC, it is controlled by the deprecated extension
2680 <literal>DatatypeContexts</literal>.</para>
2681 </sect2>
2682
2683 <sect2 id="infix-tycons">
2684 <title>Infix type constructors, classes, and type variables</title>
2685
2686 <para>
2687 GHC allows type constructors, classes, and type variables to be operators, and
2688 to be written infix, very much like expressions. More specifically:
2689 <itemizedlist>
2690 <listitem><para>
2691 A type constructor or class can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
2692 The lexical syntax is the same as that for data constructors.
2693 </para></listitem>
2694 <listitem><para>
2695 Data type and type-synonym declarations can be written infix, parenthesised
2696 if you want further arguments. E.g.
2697 <screen>
2698 data a :*: b = Foo a b
2699 type a :+: b = Either a b
2700 class a :=: b where ...
2701
2702 data (a :**: b) x = Baz a b x
2703 type (a :++: b) y = Either (a,b) y
2704 </screen>
2705 </para></listitem>
2706 <listitem><para>
2707 Types, and class constraints, can be written infix. For example
2708 <screen>
2709 x :: Int :*: Bool
2710 f :: (a :=: b) => a -> b
2711 </screen>
2712 </para></listitem>
2713 <listitem><para>
2714 Back-quotes work
2715 as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or
2716 <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>.
2717 </para></listitem>
2718 <listitem><para>
2719 Fixities may be declared for type constructors, or classes, just as for data constructors. However,
2720 one cannot distinguish between the two in a fixity declaration; a fixity declaration
2721 sets the fixity for a data constructor and the corresponding type constructor. For example:
2722 <screen>
2723 infixl 7 T, :*:
2724 </screen>
2725 sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>,
2726 and similarly for <literal>:*:</literal>.
2728 </para></listitem>
2729 <listitem><para>
2730 The function arrow <literal>-></literal> is <literal>infixr</literal> with fixity 0. (This may change.)
2731 </para></listitem>
2732
2733 </itemizedlist>
2734 </para>
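<para>The points above can be combined in a small sketch. It assumes that
<option>-XTypeOperators</option> is switched on, and the functions
<literal>swapP</literal>, <literal>firstOf</literal> and <literal>classify</literal>
are hypothetical names introduced only for illustration.</para>

<programlisting>
{-# LANGUAGE TypeOperators #-}

-- A pair type whose type constructor and data constructor are both ':*:'.
data a :*: b = a :*: b

-- One declaration sets the fixity of both the type and the data constructor.
infixr 6 :*:

-- An infix type synonym with its own fixity.
type a :+: b = Either a b
infixr 5 :+:

swapP :: a :*: b -> b :*: a
swapP (x :*: y) = y :*: x

-- Parsed as Int :*: (Bool :*: Char), because ':*:' is infixr.
firstOf :: Int :*: Bool :*: Char -> Int
firstOf (x :*: _) = x

-- Parsed as Either (Int :*: Bool) Char, because ':*:' binds tighter than ':+:'.
classify :: Int :*: Bool :+: Char -> String
classify (Left (n :*: b)) = show n ++ " " ++ show b
classify (Right c)        = [c]
</programlisting>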
2735 </sect2>
2736
2737 <sect2 id="type-operators">
2738 <title>Type operators</title>
2739 <para>
2740 In types, an operator symbol like <literal>(+)</literal> is normally treated as a type
2741 <emphasis>variable</emphasis>, just like <literal>a</literal>. Thus in Haskell 98 you can say
2742 <programlisting>
2743 type T (+) = ((+), (+))
2744 -- Just like: type T a = (a,a)
2745
2746 f :: T Int -> Int
2747 f (x,y)= x
2748 </programlisting>
2749 As you can see, using operators in this way is not very useful, and Haskell 98 does not even
2750 allow you to write them infix.
2751 </para>
2752 <para>
2753 The language extension <option>-XTypeOperators</option> changes this behaviour:
2754 <itemizedlist>
2755 <listitem><para>
2756 Operator symbols become type <emphasis>constructors</emphasis> rather than
2757 type <emphasis>variables</emphasis>.
2758 </para></listitem>
2759 <listitem><para>
2760 Operator symbols in types can be written infix, both in definitions and uses.
2761 For example:
2762 <programlisting>
2763 data a + b = Plus a b
2764 type Foo = Int + Bool
2765 </programlisting>
2766 </para></listitem>
2767 <listitem><para>
2768 There is now some potential ambiguity in import and export lists; for example
2769 if you write <literal>import M( (+) )</literal> do you mean the
2770 <emphasis>function</emphasis> <literal>(+)</literal> or the
2771 <emphasis>type constructor</emphasis> <literal>(+)</literal>?
2772 The default is the former, but with <option>-XExplicitNamespaces</option> (which is implied
2773 by <option>-XTypeOperators</option>) GHC allows you to specify the latter
2774 by preceding it with the keyword <literal>type</literal>, thus:
2775 <programlisting>
2776 import M( type (+) )
2777 </programlisting>
2778 See <xref linkend="explicit-namespaces"/>.
2779 </para></listitem>
2780 <listitem><para>
2781 The fixity of a type operator may be set using the usual fixity declarations
2782 but, as in <xref linkend="infix-tycons"/>, the function and type constructor share
2783 a single fixity.
2784 </para></listitem>
2785 </itemizedlist>
2786 </para>
2787 </sect2>
2788
2789 <sect2 id="type-synonyms">
2790 <title>Liberalised type synonyms</title>
2791
2792 <para>
2793 Type synonyms are like macros at the type level, but Haskell 98 imposes many rules
2794 on individual synonym declarations.
2795 With the <option>-XLiberalTypeSynonyms</option> extension,
2796 GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>.
2797 That means that GHC can be very much more liberal about type synonyms than Haskell 98.
2798
2799 <itemizedlist>
2800 <listitem> <para>You can write a <literal>forall</literal> (including overloading)
2801 in a type synonym, thus:
2802 <programlisting>
2803 type Discard a = forall b. Show b => a -> b -> (a, String)
2804
2805 f :: Discard a
2806 f x y = (x, show y)
2807
2808 g :: Discard Int -> (Int,String) -- A rank-2 type
2809 g f = f 3 True
2810 </programlisting>
2811 </para>
2812 </listitem>
2813
2814 <listitem><para>
2815 If you also use <option>-XUnboxedTuples</option>,
2816 you can write an unboxed tuple in a type synonym:
2817 <programlisting>
2818 type Pr = (# Int, Int #)
2819
2820 h :: Int -> Pr
2821 h x = (# x, x #)
2822 </programlisting>
2823 </para></listitem>
2824
2825 <listitem><para>
2826 You can apply a type synonym to a forall type:
2827 <programlisting>
2828 type Foo a = a -> a -> Bool
2829
2830 f :: Foo (forall b. b->b)
2831 </programlisting>
2832 After expanding the synonym, <literal>f</literal> has the legal (in GHC) type:
2833 <programlisting>
2834 f :: (forall b. b->b) -> (forall b. b->b) -> Bool
2835 </programlisting>
2836 </para></listitem>
2837
2838 <listitem><para>
2839 You can apply a type synonym to a partially applied type synonym:
2840 <programlisting>
2841 type Generic i o = forall x. i x -> o x
2842 type Id x = x
2843
2844 foo :: Generic Id []
2845 </programlisting>
2846 After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
2847 <programlisting>
2848 foo :: forall x. x -> [x]
2849 </programlisting>
2850 </para></listitem>
2851
2852 </itemizedlist>
2853 </para>
2854
2855 <para>
2856 GHC currently does kind checking before expanding synonyms (though even that
2857 could be changed).
2858 </para>
2859 <para>
2860 After expanding type synonyms, GHC does validity checking on types, looking for
2861 the following mal-formedness which isn't detected simply by kind checking:
2862 <itemizedlist>
2863 <listitem><para>
2864 Type constructor applied to a type involving for-alls (if <option>-XImpredicativeTypes</option>
2865 is off)
2866 </para></listitem>
2867 <listitem><para>
2868 Partially-applied type synonym.
2869 </para></listitem>
2870 </itemizedlist>
2871 So, for example, this will be rejected:
2872 <programlisting>
2873 type Pr = forall a. a
2874
2875 h :: [Pr]
2876 h = ...
2877 </programlisting>
2878 because GHC does not allow type constructors applied to for-all types.
2879 </para>
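<para>For reference, here is a self-contained sketch that gathers two of the synonyms
from this section into one module. It assumes <option>-XRankNTypes</option> is enabled
alongside <option>-XLiberalTypeSynonyms</option>; the function names
<literal>singleton</literal>, <literal>keep</literal> and <literal>useDiscard</literal>
are hypothetical.</para>

<programlisting>
{-# LANGUAGE LiberalTypeSynonyms, RankNTypes #-}

type Generic i o = forall x. i x -> o x
type Id x = x

-- Expands to: singleton :: forall x. x -> [x]
singleton :: Generic Id []
singleton x = [x]

type Discard a = forall b. Show b => a -> b -> (a, String)

-- Expands to: keep :: forall a b. Show b => a -> b -> (a, String)
keep :: Discard a
keep x y = (x, show y)

-- A rank-2 argument: 'f' is used at b = Bool here.
useDiscard :: Discard Int -> (Int, String)
useDiscard f = f 3 True
</programlisting>

<para>With these definitions, <literal>useDiscard keep</literal> evaluates to
<literal>(3,"True")</literal>.</para>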
2880 </sect2>
2881
2882
2883 <sect2 id="existential-quantification">
2884 <title>Existentially quantified data constructors
2885 </title>
2886
2887 <para>
2888 The idea of using existential quantification in data type declarations
2889 was suggested by Perry, and implemented in Hope+ (Nigel Perry, <emphasis>The Implementation
2890 of Practical Functional Programming Languages</emphasis>, PhD Thesis, University of
2891 London, 1991). It was later formalised by Laufer and Odersky
2892 (<emphasis>Polymorphic type inference and abstract data types</emphasis>,
2893 TOPLAS, 16(5), pp1411-1430, 1994).
2894 It's been in Lennart
2895 Augustsson's <command>hbc</command> Haskell compiler for several years, and
2896 proved very useful. Here's the idea. Consider the declaration:
2897 </para>
2898
2899 <para>
2900
2901 <programlisting>
2902 data Foo = forall a. MkFoo a (a -> Bool)
2903 | Nil
2904 </programlisting>
2905
2906 </para>
2907
2908 <para>
2909 The data type <literal>Foo</literal> has two constructors with types:
2910 </para>
2911
2912 <para>
2913
2914 <programlisting>
2915 MkFoo :: forall a. a -> (a -> Bool) -> Foo
2916 Nil :: Foo
2917 </programlisting>
2918
2919 </para>
2920
2921 <para>
2922 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
2923 does not appear in the data type itself, which is plain <literal>Foo</literal>.
2924 For example, the following expression is fine:
2925 </para>
2926
2927 <para>
2928
2929 <programlisting>
2930 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
2931 </programlisting>
2932
2933 </para>
2934
2935 <para>
2936 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
2937 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
2938 isUpper</function> packages a character with a compatible function. These
2939 two things are each of type <literal>Foo</literal> and can be put in a list.
2940 </para>
2941
2942 <para>
2943 What can we do with a value of type <literal>Foo</literal>? In particular,
2944 what happens when we pattern-match on <function>MkFoo</function>?
2945 </para>
2946
2947 <para>
2948
2949 <programlisting>
2950 f (MkFoo val fn) = ???
2951 </programlisting>
2952
2953 </para>
2954
2955 <para>
2956 Since all we know about <literal>val</literal> and <function>fn</function> is that they
2957 are compatible, the only (useful) thing we can do with them is to
2958 apply <function>fn</function> to <literal>val</literal> to get a boolean. For example:
2959 </para>
2960
2961 <para>
2962
2963 <programlisting>
2964 f :: Foo -> Bool
2965 f (MkFoo val fn) = fn val
2966 </programlisting>
2967
2968 </para>
2969
2970 <para>
2971 What this allows us to do is to package heterogeneous values
2972 together with a bunch of functions that manipulate them, and then treat
2973 that collection of packages in a uniform manner. You can express
2974 quite a bit of object-oriented-like programming this way.
2975 </para>
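<para>Putting the fragments above together, a minimal self-contained sketch might look
like this. The names <literal>foos</literal>, <literal>applyFoo</literal> and
<literal>results</literal> are hypothetical, and treating <literal>Nil</literal> as
<literal>False</literal> is a choice made only for the example.</para>

<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

-- Each MkFoo element hides a different type for 'a'.
foos :: [Foo]
foos = [MkFoo (4 :: Int) even, MkFoo 'C' isUpper, Nil]

-- The only useful thing to do with the hidden value is to apply the
-- packaged function to it.
applyFoo :: Foo -> Bool
applyFoo (MkFoo val fn) = fn val
applyFoo Nil            = False   -- arbitrary choice for this sketch

results :: [Bool]
results = map applyFoo foos       -- [True,True,False]
</programlisting>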
2976
2977 <sect3 id="existential">
2978 <title>Why existential?
2979 </title>
2980
2981 <para>
2982 What has this to do with <emphasis>existential</emphasis> quantification?
2983 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
2984 </para>
2985
2986 <para>
2987
2988 <programlisting>
2989 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
2990 </programlisting>
2991
2992 </para>
2993
2994 <para>
2995 But Haskell programmers can safely think of the ordinary
2996 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
2997 adding a new existential quantification construct.
2998 </para>
2999
3000 </sect3>
3001
3002 <sect3 id="existential-with-context">
3003 <title>Existentials and type classes</title>
3004
3005 <para>
3006 An easy extension is to allow
3007 arbitrary contexts before the constructor. For example:
3008 </para>
3009
3010 <para>
3011
3012 <programlisting>
3013 data Baz = forall a. Eq a => Baz1 a a
3014 | forall b. Show b => Baz2 b (b -> b)
3015 </programlisting>
3016
3017 </para>
3018
3019 <para>
3020 The two constructors have the types you'd expect:
3021 </para>
3022
3023 <para>
3024
3025 <programlisting>
3026 Baz1 :: forall a. Eq a => a -> a -> Baz
3027 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
3028 </programlisting>
3029
3030 </para>
3031
3032 <para>
3033 But when pattern matching on <function>Baz1</function> the matched values can be compared
3034 for equality, and when pattern matching on <function>Baz2</function> the first matched
3035 value can be converted to a string (as well as applying the function to it).
3036 So this program is legal:
3037 </para>
3038
3039 <para>
3040
3041 <programlisting>
3042 f :: Baz -> String
3043 f (Baz1 p q) | p == q = "Yes"
3044 | otherwise = "No"
3045 f (Baz2 v fn) = show (fn v)
3046 </programlisting>
3047
3048 </para>
3049
3050 <para>
3051 Operationally, in a dictionary-passing implementation, the
3052 constructors <function>Baz1</function> and <function>Baz2</function> must store the
3053 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
3054 extract them on pattern matching.
3055 </para>
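<para>A common use of a context on an existential constructor is a heterogeneous list
whose elements share only a class. The sketch below is one way to write it; the type
<literal>Showable</literal> and the names <literal>render</literal> and
<literal>items</literal> are hypothetical.</para>

<programlisting>
{-# LANGUAGE ExistentialQuantification #-}

-- The constructor stores the Show dictionary next to the hidden value.
data Showable = forall a. Show a => MkShowable a

-- Pattern matching makes (Show a) available, so 'show' can be used.
render :: Showable -> String
render (MkShowable x) = show x

items :: [Showable]
items = [MkShowable (1 :: Int), MkShowable "two", MkShowable True]

rendered :: [String]
rendered = map render items       -- ["1","\"two\"","True"]
</programlisting>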
3056
3057 </sect3>
3058
3059 <sect3 id="existential-records">
3060 <title>Record Constructors</title>
3061
3062 <para>
3063 GHC allows existentials to be used with record syntax as well. For example:
3064
3065 <programlisting>
3066 data Counter a = forall self. NewCounter
3067 { _this :: self
3068 , _inc :: self -> self
3069 , _display :: self -> IO ()
3070 , tag :: a
3071 }
3072 </programlisting>
3073 Here <literal>tag</literal> is a public field, with a well-typed selector
3074 function <literal>tag :: Counter a -> a</literal>. The <literal>self</literal>
3075 type is hidden from the outside; any attempt to apply <literal>_this</literal>,
3076 <literal>_inc</literal> or <literal>_display</literal> as functions will raise a
3077 compile-time error. In other words, <emphasis>GHC defines a record selector function
3078 only for fields whose type does not mention the existentially-quantified variables</emphasis>.
3079 (This example used an underscore in the fields for which record selectors
3080 will not be defined, but that is only programming style; GHC ignores them.)
3081 </para>
3082
3083 <para>
3084 To make use of these hidden fields, we need to create some helper functions:
3085
3086 <programlisting>
3087 inc :: Counter a -> Counter a
3088 inc (NewCounter x i d t) = NewCounter
3089 { _this = i x, _inc = i, _display = d, tag = t }
3090
3091 display :: Counter a -> IO ()
3092 display NewCounter{ _this = x, _display = d } = d x
3093 </programlisting>
3094
3095 Now we can define counters with different underlying implementations:
3096
3097 <programlisting>
3098 counterA :: Counter String
3099 counterA = NewCounter
3100 { _this = 0, _inc = (1+), _display = print, tag = "A" }
3101
3102 counterB :: Counter String
3103 counterB = NewCounter
3104 { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }
3105
3106 main = do
3107 display (inc counterA) -- prints "1"
3108 display (inc (inc counterB)) -- prints "##"
3109 </programlisting>
3110
3111 Record update syntax is supported for existentials (and GADTs):
3112 <programlisting>
3113 setTag :: Counter a -> a -> Counter a
3114 setTag obj t = obj{ tag = t }
3115 </programlisting>
3116 The rule for record update is this: <emphasis>
3117 the types of the updated fields may
3118 mention only the universally-quantified type variables
3119 of the data constructor. For GADTs, the field may mention only types
3120 that appear as a simple type-variable argument in the constructor's result
3121 type</emphasis>. For example:
3122 <programlisting>
3123 data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential
3124 upd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' b
3125 upd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is
3126 -- existentially quantified)
3127
3128 data G a b where { G1 { g1::a, g2::c } :: G a [c] }
3129 upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c b
3130 upd4 g x = g { g2=x } -- BAD (g2's type mentions c, which is not a simple
3131 -- type-variable argument in G1's result type)
3132 </programlisting>
3133 </para>
3134
3135 </sect3>
3136
3137
3138 <sect3>
3139 <title>Restrictions</title>
3140
3141 <para>
3142 There are several restrictions on the ways in which existentially-quantified
3143 constructors can be used.
3144 </para>
3145
3146 <para>
3147
3148 <itemizedlist>
3149 <listitem>
3150
3151 <para>
3152 When pattern matching, each pattern match introduces a new,
3153 distinct, type for each existential type variable. These types cannot
3154 be unified with any other type, nor can they escape from the scope of
3155 the pattern match. For example, these fragments are incorrect:
3156
3157
3158 <programlisting>
3159 f1 (MkFoo a f) = a
3160 </programlisting>
3161
3162
3163 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
3164 is the result of <function>f1</function>. One way to see why this is wrong is to
3165 ask what type <function>f1</function> has:
3166
3167
3168 <programlisting>
3169 f1 :: Foo -> a -- Weird!
3170 </programlisting>
3171
3172
3173 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
3174 this:
3175
3176
3177 <programlisting>
3178 f1 :: forall a. Foo -> a -- Wrong!
3179 </programlisting>
3180
3181
3182 The original program is just plain wrong. Here's another sort of error:
3183
3184
3185 <programlisting>
3186 f2 (Baz1 a b) (Baz1 p q) = a==q
3187 </programlisting>
3188
3189
3190 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
3191 <literal>a==q</literal> is wrong because it equates the two distinct types arising
3192 from the two <function>Baz1</function> constructors.
3193
3194
3195 </para>
3196 </listitem>
3197 <listitem>
3198
3199 <para>
3200 You can't pattern-match on an existentially quantified
3201 constructor in a <literal>let</literal> or <literal>where</literal> group of
3202 bindings. So this is illegal:
3203
3204
3205 <programlisting>
3206 f3 x = a==b where { Baz1 a b = x }
3207 </programlisting>
3208
3209 Instead, use a <literal>case</literal> expression:
3210
3211 <programlisting>
3212 f3 x = case x of Baz1 a b -> a==b
3213 </programlisting>
3214
3215 In general, you can only pattern-match
3216 on an existentially-quantified constructor in a <literal>case</literal> expression or
3217 in the patterns of a function definition.
3218
3219 The reason for this restriction is really an implementation one.
3220 Type-checking binding groups is already a nightmare without
3221 existentials complicating the picture. Also an existential pattern
3222 binding at the top level of a module doesn't make sense, because it's
3223 not clear how to prevent the existentially-quantified type "escaping".
3224 So for now, there's a simple-to-state restriction. We'll see how
3225 annoying it is.
3226
3227 </para>
3228 </listitem>
3229 <listitem>
3230
3231 <para>
3232 You can't use existential quantification for <literal>newtype</literal>
3233 declarations. So this is illegal:
3234
3235
3236 <programlisting>
3237 newtype T = forall a. Ord a => MkT a
3238 </programlisting>
3239
3240
3241 Reason: a value of type <literal>T</literal> must be represented as a
3242 pair of a dictionary for <literal>Ord a</literal> and a value of type
3243 <literal>a</literal>. That contradicts the idea that
3244 <literal>newtype</literal> should have no concrete representation.
3245 You can get just the same efficiency and effect by using
3246 <literal>data</literal> instead of <literal>newtype</literal>. If
3247 there is no overloading involved, then there is more of a case for
3248 allowing an existentially-quantified <literal>newtype</literal>,
3249 because the <literal>data</literal> version does carry an
3250 implementation cost, but single-field existentially quantified
3251 constructors aren't much use. So the simple restriction (no
3252 existential stuff on <literal>newtype</literal>) stands, unless there
3253 are convincing reasons to change it.
3254
3255
3256 </para>
3257 </listitem>
3258 <listitem>
3259
3260 <para>
3261 You can't use <literal>deriving</literal> to define instances of a
3262 data type with existentially quantified data constructors.
3263
3264 Reason: in most cases it would not make sense. For example:
3265
3266 <programlisting>
3267 data T = forall a. MkT [a] deriving( Eq )
3268 </programlisting>
3269
3270 To derive <literal>Eq</literal> in the standard way we would need to have equality
3271 between the single component of two <function>MkT</function> constructors:
3272
3273 <programlisting>
3274 instance Eq T where
3275 (MkT a) == (MkT b) = ???
3276 </programlisting>
3277
3278 But <varname>a</varname> and <varname>b</varname> have distinct types, and so can't be compared.
3279 It's just about possible to imagine examples in which the derived instance
3280 would make sense, but it seems altogether simpler simply to prohibit such
3281 declarations. Define your own instances!
3282 </para>
3283 </listitem>
3284
3285 </itemizedlist>
3286
3287 </para>
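<para>The workarounds mentioned in the list above can be sketched as follows. This is
only an illustration: <literal>bothEqual</literal> is a hypothetical name, and the
hand-written <literal>Eq</literal> instance compares list lengths, which is an arbitrary
notion of equality picked for the example.</para>

<programlisting>
{-# LANGUAGE ExistentialQuantification #-}

data Baz = forall a. Eq a => Baz1 a a

-- A case expression (or a function clause) may match on Baz1;
-- a let or where pattern binding may not.
bothEqual :: Baz -> Bool
bothEqual x = case x of
  Baz1 a b -> a == b

data T = forall a. MkT [a]

-- A hand-written instance in place of 'deriving'. The element types of the
-- two lists differ, so only type-independent properties can be compared.
instance Eq T where
  MkT xs == MkT ys = length xs == length ys
</programlisting>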
3288
3289 </sect3>
3290 </sect2>
3291
3292 <!-- ====================== Generalised algebraic data types ======================= -->
3293
3294 <sect2 id="gadt-style">
3295 <title>Declaring data types with explicit constructor signatures</title>
3296
3297 <para>When the <literal>GADTSyntax</literal> extension is enabled,
3298 GHC allows you to declare an algebraic data type by
3299 giving the type signatures of constructors explicitly. For example:
3300 <programlisting>
3301 data Maybe a where
3302 Nothing :: Maybe a
3303 Just :: a -> Maybe a
3304 </programlisting>
3305 The form is called a "GADT-style declaration"
3306 because Generalised Algebraic Data Types, described in <xref linkend="gadt"/>,
3307 can only be declared using this form.</para>
3308 <para>Notice that GADT-style syntax generalises existential types (<xref linkend="existential-quantification"/>).
3309 For example, these two declarations are equivalent:
3310 <programlisting>
3311 data Foo = forall a. MkFoo a (a -> Bool)
3312 data Foo' where { MkFoo' :: a -> (a->Bool) -> Foo' }
3313 </programlisting>
3314 </para>
3315 <para>Any data type that can be declared in standard Haskell-98 syntax
3316 can also be declared using GADT-style syntax.
3317 The choice is largely stylistic, but GADT-style declarations differ in one important respect:
3318 they treat class constraints on the data constructors differently.
3319 Specifically, if the constructor is given a type-class context, that
3320 context is made available by pattern matching. For example:
3321 <programlisting>
3322 data Set a where
3323 MkSet :: Eq a => [a] -> Set a
3324
3325 makeSet :: Eq a => [a] -> Set a
3326 makeSet xs = MkSet (nub xs)
3327
3328 insert :: a -> Set a -> Set a
3329 insert a (MkSet as) | a `elem` as = MkSet as
3330 | otherwise = MkSet (a:as)
3331 </programlisting>
3332 A use of <literal>MkSet</literal> as a constructor (e.g. in the definition of <literal>makeSet</literal>)
3333 gives rise to a <literal>(Eq a)</literal>
3334 constraint, as you would expect. The new feature is that pattern-matching on <literal>MkSet</literal>
3335 (as in the definition of <literal>insert</literal>) makes <emphasis>available</emphasis> an <literal>(Eq a)</literal>
3336 context. In implementation terms, the <literal>MkSet</literal> constructor has a hidden field that stores
3337 the <literal>(Eq a)</literal> dictionary that is passed to <literal>MkSet</literal>; so
3338 when pattern-matching that dictionary becomes available for the right-hand side of the match.
3339 In the example, the equality dictionary is used to satisfy the equality constraint
3340 generated by the call to <literal>elem</literal>, so that the type of
3341 <literal>insert</literal> itself has no <literal>Eq</literal> constraint.
3342 </para>
3343 <para>
3344 For example, one possible application is to reify dictionaries:
3345 <programlisting>
3346 data NumInst a where
3347 MkNumInst :: Num a => NumInst a
3348
3349 intInst :: NumInst Int
3350 intInst = MkNumInst
3351
3352 plus :: NumInst a -> a -> a -> a
3353 plus MkNumInst p q = p + q
3354 </programlisting>
3355 Here, a value of type <literal>NumInst a</literal> is equivalent
3356 to an explicit <literal>(Num a)</literal> dictionary.
3357 </para>
3358 <para>
3359 All this applies to constructors declared using the syntax of <xref linkend="existential-with-context"/>.
3360 For example, the <literal>NumInst</literal> data type above could equivalently be declared
3361 like this:
3362 <programlisting>
3363 data NumInst a
3364 = Num a => MkNumInst
3365 </programlisting>
3366 Notice that, unlike the situation when declaring an existential, there is
3367 no <literal>forall</literal>, because the <literal>Num</literal> constrains the
3368 data type's universally quantified type variable <literal>a</literal>.
3369 A constructor may have both universal and existential type variables: for example,
3370 the following two declarations are equivalent:
3371 <programlisting>
3372 data T1 a
3373 = forall b. (Num a, Eq b) => MkT1 a b
3374 data T2 a where
3375 MkT2 :: (Num a, Eq b) => a -> b -> T2 a
3376 </programlisting>
3377 </para>
3378 <para>All this behaviour contrasts with Haskell 98's peculiar treatment of
3379 contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report).
3380 In Haskell 98 the definition
3381 <programlisting>
3382 data Eq a => Set' a = MkSet' [a]
3383 </programlisting>
3384 gives <literal>MkSet'</literal> the same type as <literal>MkSet</literal> above. But instead of
3385 <emphasis>making available</emphasis> an <literal>(Eq a)</literal> constraint, pattern-matching
3386 on <literal>MkSet'</literal> <emphasis>requires</emphasis> an <literal>(Eq a)</literal> constraint!
3387 GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,
3388 GHC's behaviour is much more useful, as well as much more intuitive.
3389 </para>
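<para>The contrast can be seen side by side in the following sketch. It assumes both
<option>-XGADTs</option> and the deprecated <option>-XDatatypeContexts</option> are
enabled (GHC is expected to warn about the latter); the functions
<literal>member</literal> and <literal>size'</literal> are hypothetical names.</para>

<programlisting>
{-# LANGUAGE GADTs, DatatypeContexts #-}

data Set a where
  MkSet :: Eq a => [a] -> Set a

-- No Eq constraint in the signature: matching on MkSet supplies it.
member :: a -> Set a -> Bool
member x (MkSet xs) = x `elem` xs

data Eq a => Set' a = MkSet' [a]

-- Matching on MkSet' requires (Eq a), so the signature has to state it,
-- even though the body never compares elements.
size' :: Eq a => Set' a -> Int
size' (MkSet' xs) = length xs
</programlisting>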
3390
3391 <para>
3392 The rest of this section gives further details about GADT-style data
3393 type declarations.
3394
3395 <itemizedlist>
3396 <listitem><para>
3397 The result type of each data constructor must begin with the type constructor being defined.
3398 If the result type of all constructors
3399 has the form <literal>T a1 ... an</literal>, where <literal>a1 ... an</literal>
3400 are distinct type variables, then the data type is <emphasis>ordinary</emphasis>;
3401 otherwise it is a <emphasis>generalised</emphasis> data type (<xref linkend="gadt"/>).
3402 </para></listitem>
3403
3404 <listitem><para>
3405 As with other type signatures, you can give a single signature for several data constructors.
3406 In this example we give a single signature for <literal>T1</literal> and <literal>T2</literal>:
3407 <programlisting>
3408 data T a where
3409 T1,T2 :: a -> T a
3410 T3 :: T a
3411 </programlisting>
3412 </para></listitem>
3413
3414 <listitem><para>
3415 The type signature of
3416 each constructor is independent, and is implicitly universally quantified as usual.
3417 In particular, the type variable(s) in the "<literal>data T a where</literal>" header
3418 have no scope, and different constructors may have different universally-quantified type variables:
3419 <programlisting>
3420 data T a where -- The 'a' has no scope
3421 T1,T2 :: b -> T b -- Means forall b. b -> T b
3422 T3 :: T a -- Means forall a. T a
3423 </programlisting>
3424 </para></listitem>
3425
3426 <listitem><para>
3427 A constructor signature may mention type class constraints, which can differ for
3428 different constructors. For example, this is fine:
3429 <programlisting>
3430 data T a where
3431 T1 :: Eq b => b -> b -> T b
3432 T2 :: (Show c, Ix c) => c -> [c] -> T c
3433 </programlisting>
3434 When pattern matching, these constraints are made available to discharge constraints
3435 in the body of the match. For example:
3436 <programlisting>
3437 f :: T a -> String
3438 f (T1 x y) | x==y = "yes"
3439 | otherwise = "no"
3440 f (T2 a b) = show a
3441 </programlisting>
3442 Note that <literal>f</literal> is not overloaded; the <literal>Eq</literal> constraint arising
3443 from the use of <literal>==</literal> is discharged by the pattern match on <literal>T1</literal>
3444 and similarly the <literal>Show</literal> constraint arising from the use of <literal>show</literal>.
3445 </para></listitem>
3446
3447 <listitem><para>
3448 Unlike a Haskell-98-style
3449 data type declaration, the type variable(s) in the "<literal>data Set a where</literal>" header
3450 have no scope. Indeed, one can write a kind signature instead:
3451 <programlisting>
3452 data Set :: * -> * where ...
3453 </programlisting>
3454 or even a mixture of the two:
3455 <programlisting>
3456 data Bar a :: (* -> *) -> * where ...
3457 </programlisting>
The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Bar</literal>
like this:
3460 <programlisting>
3461 data Bar a (b :: * -> *) where ...
3462 </programlisting>
3463 </para></listitem>
3464
3465
3466 <listitem><para>
3467 You can use strictness annotations, in the obvious places
3468 in the constructor type:
3469 <programlisting>
data Term a where
  Lit  :: !Int -> Term Int
  If   :: Term Bool -> !(Term a) -> !(Term a) -> Term a
  Pair :: Term a -> Term b -> Term (a,b)
3474 </programlisting>
3475 </para></listitem>
3476
3477 <listitem><para>
3478 You can use a <literal>deriving</literal> clause on a GADT-style data type
declaration. For example, these two declarations are equivalent:
<programlisting>
data Maybe1 a where {
    Nothing1 :: Maybe1 a ;
    Just1    :: a -> Maybe1 a
  } deriving( Eq, Ord )

data Maybe2 a = Nothing2 | Just2 a
     deriving( Eq, Ord )
3488 </programlisting>
3489 </para></listitem>
3490
3491 <listitem><para>
3492 The type signature may have quantified type variables that do not appear
3493 in the result type:
3494 <programlisting>
data Foo where
  MkFoo :: a -> (a->Bool) -> Foo
  Nil   :: Foo
3498 </programlisting>
3499 Here the type variable <literal>a</literal> does not appear in the result type
3500 of either constructor.
3501 Although it is universally quantified in the type of the constructor, such
3502 a type variable is often called "existential".
3503 Indeed, the above declaration declares precisely the same type as
3504 the <literal>data Foo</literal> in <xref linkend="existential-quantification"/>.
3505 </para><para>
3506 The type may contain a class context too, of course:
3507 <programlisting>
data Showable where
  MkShowable :: Show a => a -> Showable
3510 </programlisting>
3511 </para></listitem>
3512
3513 <listitem><para>
3514 You can use record syntax on a GADT-style data type declaration:
3515
3516 <programlisting>
data Person where
  Adult :: { name :: String, children :: [Person] } -> Person
  Child :: Show a => { name :: !String, funny :: a } -> Person
3520 </programlisting>
3521 As usual, for every constructor that has a field <literal>f</literal>, the type of
3522 field <literal>f</literal> must be the same (modulo alpha conversion).
3523 The <literal>Child</literal> constructor above shows that the signature
3524 may have a context, existentially-quantified variables, and strictness annotations,
3525 just as in the non-record case. (NB: the "type" that follows the double-colon
3526 is not really a type, because of the record syntax and strictness annotations.
3527 A "type" of this form can appear only in a constructor signature.)
3528 </para></listitem>
3529
3530 <listitem><para>
Record updates are allowed with GADT-style declarations,
but only for fields with the following property: the type of the field
mentions no existential type variables.
3534 </para></listitem>
3535
3536 <listitem><para>
3537 As in the case of existentials declared using the Haskell-98-like record syntax
3538 (<xref linkend="existential-records"/>),
3539 record-selector functions are generated only for those fields that have well-typed
3540 selectors.
3541 Here is the example of that section, in GADT-style syntax:
3542 <programlisting>
data Counter a where
    NewCounter :: { _this    :: self
                  , _inc     :: self -> self
                  , _display :: self -> IO ()
                  , tag      :: a
                  } -> Counter a
3549 </programlisting>
3550 As before, only one selector function is generated here, that for <literal>tag</literal>.
3551 Nevertheless, you can still use all the field names in pattern matching and record construction.
3552 </para></listitem>
3553
3554 <listitem><para>
3555 In a GADT-style data type declaration there is no obvious way to specify that a data constructor
3556 should be infix, which makes a difference if you derive <literal>Show</literal> for the type.
3557 (Data constructors declared infix are displayed infix by the derived <literal>show</literal>.)
3558 So GHC implements the following design: a data constructor declared in a GADT-style data type
3559 declaration is displayed infix by <literal>Show</literal> iff (a) it is an operator symbol,
(b) it has two arguments, (c) it has a programmer-supplied fixity declaration. For example:
<programlisting>
infix 6 :--:
data T a where
  (:--:) :: Int -> Bool -> T Int
3565 </programlisting>
3566 </para></listitem>
3567 </itemizedlist></para>
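<para>
The record rules above can be seen together in a small sketch. It reuses the
<literal>Person</literal> declaration from this section and assumes that
<option>-XGADTs</option> is enabled; the functions <literal>rename</literal> and
<literal>describe</literal> are hypothetical names introduced only for illustration.
<programlisting>
{-# LANGUAGE GADTs #-}

data Person where
  Adult :: { name :: String, children :: [Person] } -> Person
  Child :: Show a => { name :: !String, funny :: a } -> Person

-- Updating 'name' is allowed: its type mentions no existential variable.
rename :: String -> Person -> Person
rename new p = p { name = new }

-- 'funny' has no well-typed selector, but the field name can still be
-- used in pattern matching (and in record construction).
describe :: Person -> String
describe (Adult { name = n, children = cs }) =
  n ++ " has " ++ show (length cs) ++ " children"
describe (Child { name = n, funny = x }) =
  n ++ " finds " ++ show x ++ " funny"
</programlisting>
By the rule above, an update of the <literal>funny</literal> field would be rejected,
because its type mentions the existential variable <literal>a</literal>.
</para>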
3568 </sect2>
3569
3570 <sect2 id="gadt">
3571 <title>Generalised Algebraic Data Types (GADTs)</title>
3572
3573 <para>Generalised Algebraic Data Types generalise ordinary algebraic data types
3574 by allowing constructors to have richer return types. Here is an example:
3575 <programlisting>
data Term a where
    Lit    :: Int -> Term Int
    Succ   :: Term Int -> Term Int
    IsZero :: Term Int -> Term Bool
    If     :: Term Bool -> Term a -> Term a -> Term a
    Pair   :: Term a -> Term b -> Term (a,b)
3582 </programlisting>
3583 Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the
3584 case with ordinary data types. This generality allows us to
3585 write a well-typed <literal>eval</literal> function
3586 for these <literal>Terms</literal>:
3587 <programlisting>
3588 eval :: Term a -> a
3589 eval (Lit i) = i
3590 eval (Succ t) = 1 + eval t
3591 eval (IsZero t) = eval t == 0
3592 eval (If b e1 e2) = if eval b then eval e1 else eval e2
3593 eval (Pair e1 e2) = (eval e1, eval e2)
3594 </programlisting>
3595 The key point about GADTs is that <emphasis>pattern matching causes type refinement</emphasis>.
3596 For example, in the right hand side of the equation
3597 <programlisting>
3598 eval :: Term a -> a
3599 eval (Lit i) = ...
3600 </programlisting>
3601 the type <literal>a</literal> is refined to <literal>Int</literal>. That's the whole point!
3602 A precise specification of the type rules is beyond what this user manual aspires to,
3603 but the design closely follows that described in
3604 the paper <ulink
3605 url="http://research.microsoft.com/%7Esimonpj/papers/gadt/">Simple
unification-based type inference for GADTs</ulink>
(ICFP 2006).
3608 The general principle is this: <emphasis>type refinement is only carried out
3609 based on user-supplied type annotations</emphasis>.
So if no type signature is supplied for <literal>eval</literal>, no type refinement happens,
and the definition is rejected with type errors that can be hard to interpret.
However, the refinement is quite general. For example, if we had:
3613 <programlisting>
3614 eval :: Term a -> a -> a
3615 eval (Lit i) j = i+j
3616 </programlisting>
3617 the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type
3618 of the constructor <literal>Lit</literal>), and that refinement also applies to the type of <literal>j</literal>, and
3619 the result type of the <literal>case</literal> expression. Hence the addition <literal>i+j</literal> is legal.
3620 </para>
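<para>
The same refinement principle can be sketched with a different data type. The type
<literal>Rep</literal> and the function <literal>defaultOf</literal> below are hypothetical
and are not part of GHC or its libraries; the sketch assumes only that <option>-XGADTs</option>
is enabled.
<programlisting>
{-# LANGUAGE GADTs #-}

-- A hypothetical GADT of type representations.
data Rep a where
  RInt  :: Rep Int
  RBool :: Rep Bool
  RPair :: Rep a -> Rep b -> Rep (a, b)

-- The signature is what drives refinement: matching on RInt refines
-- 'a' to Int, matching on RBool refines it to Bool, and so on.
defaultOf :: Rep a -> a
defaultOf RInt        = 0
defaultOf RBool       = False
defaultOf (RPair x y) = (defaultOf x, defaultOf y)
</programlisting>
Each equation returns a value of a different type, which is well-typed only because the
user-supplied signature lets the pattern match refine <literal>a</literal>.
</para>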
3621 <para>
These and many other examples are given in papers by Hongwei Xi and
Tim Sheard. There is a longer introduction
3624 <ulink url="http://www.haskell.org/haskellwiki/GADT">on the wiki</ulink>,
3625 and Ralf Hinze's
3626 <ulink url="http://www.informatik.uni-bonn.de/~ralf/publications/With.pdf">Fun with phantom types</ulink> also has a number of examples. Note that papers
3627 may use different notation to that implemented in GHC.
3628 </para>
3629 <para>
3630 The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with
3631 <option>-XGADTs</option>. The <option>-XGADTs</option> flag also sets <option>-XRelaxedPolyRec</option>.
3632 <itemizedlist>
3633 <listitem><para>
3634 A GADT can only be declared using GADT-style syntax (<xref linkend="gadt-style"/>);
3635 the old Haskell-98 syntax for data declarations always declares an ordinary data type.
3636 The result type of each constructor must begin with the type constructor being defined,
3637 but for a GADT the arguments to the type constructor can be arbitrary monotypes.
3638 For example, in the <literal>Term</literal> data
3639 type above, the type of each constructor must end with <literal>Term ty</literal>, but
3640 the <literal>ty</literal> need not be a type variable (e.g. the <literal>Lit</literal>
3641 constructor).
3642 </para></listitem>
3643
3644 <listitem><para>
3645 It is permitted to declare an ordinary algebraic data type using GADT-style syntax.
3646 What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors
3647 whose result type is not just <literal>T a b</literal>.
3648 </para></listitem>
3649
3650 <listitem><para>
3651 You cannot use a <literal>deriving</literal> clause for a GADT; only for
3652 an ordinary data type.
3653 </para></listitem>
3654
3655 <listitem><para>
3656 As mentioned in <xref linkend="gadt-style"/>, record syntax is supported.
3657 For example:
3658 <programlisting>
3659 data Term a where
      Lit    :: { val  :: Int }      -> Term Int
      Succ   :: { num  :: Term Int } -> Term Int
      Pred   :: { num  :: Term Int } -> Term Int
      IsZero :: { arg  :: Term Int } -> Term Bool
      Pair   :: { arg1 :: Term a
                , arg2 :: Term b
                }                    -> Term (a,b)
      If     :: { cnd  :: Term Bool
                , tru  :: Term a
                , fls  :: Term a
                }                    -> Term a
3671 </programlisting>
3672 However, for GADTs there is the following additional constraint:
3673 every constructor that has a field <literal>f</literal> must have
the same result type (modulo alpha conversion).
Hence, in the above example, we cannot merge the <literal>num</literal>
and <literal>arg</literal> fields into a
3677 single name. Although their field types are both <literal>Term Int</literal>,
3678 their selector functions actually have different types:
3679
3680 <programlisting>
3681 num :: Term Int -> Term Int
3682 arg :: Term Bool -> Term Int
3683 </programlisting>
3684 </para></listitem>
3685
3686 <listitem><para>
3687 When pattern-matching against data constructors drawn from a GADT,
3688 for example in a <literal>case</literal> expression, the following rules apply:
3689 <itemizedlist>
3690 <listitem><para>The type of the scrutinee must be rigid.</para></listitem>
3691 <listitem><para>The type of the entire <literal>case</literal> expression must be rigid.</para></listitem>
3692 <listitem><para>The type of any free variable mentioned in any of
3693 the <literal>case</literal> alternatives must be rigid.</para></listitem>
3694 </itemizedlist>
3695 A type is "rigid" if it is completely known to the compiler at its binding site. The easiest
way to ensure that a variable has a rigid type is to give it a type signature.
3697 For more precise details see <ulink url="http://research.microsoft.com/%7Esimonpj/papers/gadt">
3698 Simple unification-based type inference for GADTs
3699 </ulink>. The criteria implemented by GHC are given in the Appendix.
3700
3701 </para></listitem>
3702
3703 </itemizedlist>
3704 </para>
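<para>
The rigidity rules can be illustrated with a minimal sketch, assuming <option>-XGADTs</option>
and a cut-down version of <literal>Term</literal>. The names <literal>evalTerm</literal>,
<literal>evalAll</literal> and <literal>go</literal> are hypothetical. In both functions it is
the explicit type signature that makes the scrutinee and result types rigid.
<programlisting>
{-# LANGUAGE GADTs #-}

data Term a where
  Lit    :: Int -> Term Int
  IsZero :: Term Int -> Term Bool

-- The signature fixes the scrutinee type (Term a) and the type of
-- the whole case expression (a), so both are rigid.
evalTerm :: Term a -> a
evalTerm t = case t of
  Lit i    -> i
  IsZero u -> evalTerm u == 0

-- A local helper that matches on GADT constructors gets its own
-- signature for the same reason.
evalAll :: [Term Int] -> [Int]
evalAll ts = map go ts
  where
    go :: Term b -> b
    go (Lit i)    = i
    go (IsZero u) = go u == 0
</programlisting>
</para>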
3705
3706 </sect2>
3707 </sect1>
3708
3709 <!-- ====================== End of Generalised algebraic data types ======================= -->
3710
3711 <sect1 id="deriving">
3712 <title>Extensions to the "deriving" mechanism</title>
3713
3714 <sect2 id="deriving-inferred">
3715 <title>Inferred context for deriving clauses</title>
3716
3717 <para>
3718 The Haskell Report is vague about exactly when a <literal>deriving</literal> clause is
3719 legal. For example:
3720 <programlisting>
3721 data T0 f a = MkT0 a deriving( Eq )
3722 data T1 f a = MkT1 (f a) deriving( Eq )
3723 data T2 f a = MkT2 (f (f a)) deriving( Eq )
3724 </programlisting>
3725 The natural generated <literal>Eq</literal> code would result in these instance declarations:
3726 <programlisting>
3727 instance Eq a => Eq (T0 f a) where ...
3728 instance Eq (f a) => Eq (T1 f a) where ...
3729 instance Eq (f (f a)) => Eq (T2 f a) where ...
3730 </programlisting>
3731 The first of these is obviously fine. The second is still fine, although less obviously.
3732 The third is not Haskell 98, and risks losing termination of instances.
3733 </para>
3734 <para>
3735 GHC takes a conservative position: it accepts the first two, but not the third. The rule is this:
3736 each constraint in the inferred instance context must consist only of type variables,
3737 with no repetitions.
3738 </para>
3739 <para>
3740 This rule is applied regardless of flags. If you want a more exotic context, you can write
3741 it yourself, using the <link linkend="stand-alone-deriving">standalone deriving mechanism</link>.
3742 </para>
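<para>
As a sketch of writing the context yourself, the declaration of <literal>T2</literal> above
can be given its <literal>Eq</literal> instance by standalone deriving. The example assumes
that <option>-XFlexibleContexts</option> and <option>-XUndecidableInstances</option> are
acceptable in your module, since the context <literal>Eq (f (f a))</literal> is not of the
simple form required otherwise; the name <literal>sameT2</literal> is hypothetical.
<programlisting>
{-# LANGUAGE StandaloneDeriving, FlexibleContexts, UndecidableInstances #-}

data T2 f a = MkT2 (f (f a))

-- The context is supplied by the programmer rather than inferred.
deriving instance Eq (f (f a)) => Eq (T2 f a)

-- Usage at f = Maybe, a = Char: needs Eq (Maybe (Maybe Char)).
sameT2 :: Bool
sameT2 = MkT2 (Just (Just 'x')) == MkT2 (Just (Just 'x'))
</programlisting>
</para>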
3743 </sect2>
3744
3745 <sect2 id="stand-alone-deriving">
3746 <title>Stand-alone deriving declarations</title>
3747
3748 <para>
3749 GHC now allows stand-alone <literal>deriving</literal> declarations, enabled by <literal>-XStandaloneDeriving</literal>:
3750 <programlisting>
3751 data Foo a = Bar a | Baz String
3752
3753 deriving instance Eq a => Eq (Foo a)
3754 </programlisting>
3755 The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword
3756 <literal>deriving</literal>, and (b) the absence of the <literal>where</literal> part.
3757 </para>
3758 <para>
3759 However, standalone deriving differs from a <literal>deriving</literal> clause in a number
3760 of important ways:
3761 <itemizedlist>
3762 <listitem><para>The standalone deriving declaration does not need to be in the
same module as the data type declaration. (But be aware of the dangers of
orphan instances; see <xref linkend="orphan-modules"/>.)
3765 </para></listitem>
3766
3767 <listitem><para>
3768 You must supply an explicit context (in the example the context is <literal>(Eq a)</literal>),
3769 exactly as you would in an ordinary instance declaration.
3770 (In contrast, in a <literal>deriving</literal> clause
3771 attached to a data type declaration, the context is inferred.)
3772 </para></listitem>
3773
3774 <listitem><para>
3775 Unlike a <literal>deriving</literal>
3776 declaration attached to a <literal>data</literal> declaration, the instance can be more specific
3777 than the data type (assuming you also use
3778 <literal>-XFlexibleInstances</literal>, <xref linkend="instance-rules"/>). Consider
3779 for example
3780 <programlisting>
3781 data Foo a = Bar a | Baz String
3782
3783 deriving instance Eq a => Eq (Foo [a])
3784 deriving instance Eq a => Eq (Foo (Maybe a))
3785 </programlisting>
3786 This will generate a derived instance for <literal>(Foo [a])</literal> and <literal>(Foo (Maybe a))</literal>,
3787 but other types such as <literal>(Foo (Int,Bool))</literal> will not be an instance of <literal>Eq</literal>.
3788 </para></listitem>
3789
3790 <listitem><para>
3791 Unlike a <literal>deriving</literal>
3792 declaration attached to a <literal>data</literal> declaration,
3793 GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate
3794 boilerplate code for the specified class, and typechecks it. If there is a type error, it is
3795 your problem. (GHC will show you the offending code if it has a type error.)
3796 </para>
3797 <para>
3798 The merit of this is that you can derive instances for GADTs and other exotic
3799 data types, providing only that the boilerplate code does indeed typecheck. For example:
3800 <programlisting>
data T a where
   T1 :: T Int
   T2 :: T Bool

deriving instance Show (T a)
3806 </programlisting>
3807 In this example, you cannot say <literal>... deriving( Show )</literal> on the
3808 data type declaration for <literal>T</literal>,
3809 because <literal>T</literal> is a GADT, but you <emphasis>can</emphasis> generate
3810 the instance declaration using stand-alone deriving.
3811 </para>
3812 <para>
The down-side is that,
if the boilerplate code fails to typecheck, you will get an error message about that
code, which you did not write. With a <literal>deriving</literal> clause, by contrast,
the side-conditions are necessarily more conservative, but any error message
may be more comprehensible.
3818 </para>
3819 </listitem>
3820 </itemizedlist></para>
3821
3822 <para>
3823 In other ways, however, a standalone deriving obeys the same rules as ordinary deriving:
3824 <itemizedlist>
3825 <listitem><para>
3826 A <literal>deriving instance</literal> declaration
3827 must obey the same rules concerning form and termination as ordinary instance declarations,
3828 controlled by the same flags; see <xref linkend="instance-decls"/>.
3829 </para></listitem>
3830
3831 <listitem>
3832 <para>The stand-alone syntax is generalised for newtypes in exactly the same
3833 way that ordinary <literal>deriving</literal> clauses are generalised (<xref linkend="newtype-deriving"/>).
3834 For example:
3835 <programlisting>
3836 newtype Foo a = MkFoo (State Int a)
3837
3838 deriving instance MonadState Int Foo
3839 </programlisting>
3840 GHC always treats the <emphasis>last</emphasis> parameter of the instance
3841 (<literal>Foo</literal> in this example) as the type whose instance is being derived.
3842 </para></listitem>
3843 </itemizedlist></para>
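<para>
The points above can be combined into one small module. This is only a sketch: it repeats the
<literal>Foo</literal> and <literal>T</literal> declarations used in this section, and the
names <literal>sameFoo</literal> and <literal>shownT</literal> are hypothetical.
<programlisting>
{-# LANGUAGE GADTs, StandaloneDeriving, FlexibleInstances #-}

data Foo a = Bar a | Baz String

-- More specific than the data type: only Foo [a] gets an Eq instance.
deriving instance Eq a => Eq (Foo [a])

data T a where
  T1 :: T Int
  T2 :: T Bool

-- T is a GADT, so a deriving clause is not available, but the
-- generated Show code typechecks and is accepted here.
deriving instance Show (T a)

sameFoo :: Bool
sameFoo = Bar "ab" == Bar "ab"    -- compared at type Foo [Char]

shownT :: String
shownT = show T1
</programlisting>
</para>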
3844
3845 </sect2>
3846
3847 <sect2 id="deriving-extra">
3848 <title>Deriving instances of extra classes (<literal>Data</literal>, etc)</title>
3849
3850 <para>
3851 Haskell 98 allows the programmer to add "<literal>deriving( Eq, Ord )</literal>" to a data type
3852 declaration, to generate a standard instance declaration for classes specified in the <literal>deriving</literal> clause.
3853 In Haskell 98, the only classes that may appear in the <literal>deriving</literal> clause are the standard
3854 classes <literal>Eq</literal>, <literal>Ord</literal>,
3855 <literal>Enum</literal>, <literal>Ix</literal>, <literal>Bounded</literal>, <literal>Read</literal>, and <literal>Show</literal>.
3856 </para>
3857 <para>
3858 GHC extends this list with several more classes that may be automatically derived:
3859 <itemizedlist>
3860 <listitem><para> With <option>-XDeriveGeneric</option>, you can derive
3861 instances of the classes <literal>Generic</literal> and
3862 <literal>Generic1</literal>, defined in <literal>GHC.Generics</literal>.
3863 You can use these to define generic functions,
3864 as described in <xref linkend="generic-programming"/>.
3865 </para></listitem>
3866
3867 <listitem><para> With <option>-XDeriveFunctor</option>, you can derive instances of
3868 the class <literal>Functor</literal>,
3869 defined in <literal>GHC.Base</literal>.
3870 </para></listitem>
3871
3872 <listitem><para> With <option>-XDeriveDataTypeable</option>, you can derive instances of
3873 the class <literal>Data</literal>,
3874 defined in <literal>Data.Data</literal>. See <xref linkend="deriving-typeable"/> for
3875 deriving <literal>Typeable</literal>.
3876 </para></listitem>
3877
3878 <listitem><para> With <option>-XDeriveFoldable</option>, you can derive instances of
3879 the class <literal>Foldable</literal>,
3880 defined in <literal>Data.Foldable</literal>.
3881 </para></listitem>
3882
3883 <listitem><para> With <option>-XDeriveTraversable</option>, you can derive instances of
3884 the class <literal>Traversable</literal>,
3885 defined in <literal>Data.Traversable</literal>.
3886 </para></listitem>
3887 </itemizedlist>
3888 You can also use a standalone deriving declaration instead
3889 (see <xref linkend="stand-alone-deriving"/>).
3890 </para>
3891 <para>
3892 In each case the appropriate class must be in scope before it
3893 can be mentioned in the <literal>deriving</literal> clause.
3894 </para>
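<para>
A minimal sketch of several of these flags used together follows. The
<literal>Tree</literal> type and the names <literal>sample</literal>,
<literal>incremented</literal>, <literal>elements</literal> and <literal>checked</literal>
are hypothetical; the imports bring each class into scope, as required above.
<programlisting>
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable, DeriveGeneric #-}

import Data.Foldable (Foldable, toList)
import Data.Traversable (Traversable, traverse)
import GHC.Generics (Generic)

-- A hypothetical binary tree with elements at the nodes.
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Functor, Foldable, Traversable, Generic)

sample :: Tree Int
sample = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf)

incremented :: Tree Int
incremented = fmap (+ 1) sample        -- uses the derived Functor

elements :: [Int]
elements = toList sample               -- uses the derived Foldable

checked :: Maybe (Tree Int)
checked = traverse positive sample     -- uses the derived Traversable
  where positive n = if n > 0 then Just n else Nothing
</programlisting>
</para>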
3895 </sect2>
3896
3897 <sect2 id="deriving-typeable">
3898 <title>Deriving <literal>Typeable</literal> instances</title>
3899
3900 <para>The class <literal>Typeable</literal> is very special:
3901 <itemizedlist>
3902 <listitem><para>
3903 <literal>Typeable</literal> is kind-polymorphic (see
3904 <xref linkend="kind-polymorphism"/>).
3905 </para></listitem>
3906
3907 <listitem><para>
3908 Only derived instances of <literal>Typeable</literal> are allowed;
3909 i.e. handwritten instances are forbidden. This ensures that the
programmer cannot subvert the type system by writing bogus instances.
3911 </para></listitem>
3912
3913 <listitem><para>
3914 With <option>-XDeriveDataTypeable</option>
3915 GHC allows you to derive instances of <literal>Typeable</literal> for data types or newtypes,
3916 using a <literal>deriving</literal> clause, or using
3917 a standalone deriving declaration (<xref linkend="stand-alone-deriving"/>).
3918 </para></listitem>
3919
3920 <listitem><para>
3921 With <option>-XDataKinds</option>, deriving <literal>Typeable</literal> for a data
3922 type (whether via a deriving clause or standalone deriving)
3923 also derives <literal>Typeable</literal> for the promoted data constructors (<xref linkend="promotion"/>).
3924 </para></listitem>
3925
3926 <listitem><para>
3927 However, using standalone deriving, you can <emphasis>also</emphasis> derive
3928 a <literal>Typeable</literal> instance for a data family.
3929 You may not add a <literal>deriving(Typeable)</literal> clause to a
3930 <literal>data instance</literal> declaration; instead you must use a
3931 standalone deriving declaration for the data family.
3932 </para></listitem>
3933
3934 <listitem><para>
3935 Using standalone deriving, you can <emphasis>also</emphasis> derive
3936 a <literal>Typeable</literal> instance for a type class.
3937 </para></listitem>
3938
3939 <listitem><para>
3940 The flag <option>-XAutoDeriveTypeable</option> triggers the generation
3941 of derived <literal>Typeable</literal> instances for every datatype, data family,
and type class declaration in the module in which it is used, unless a manually-specified one is
3943 already provided.
3944 This flag implies <option>-XDeriveDataTypeable</option>.
3945 </para></listitem>
3946 </itemizedlist>
3947
3948 </para>
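<para>
As a minimal sketch, assuming <option>-XDeriveDataTypeable</option> is enabled, a derived
<literal>Typeable</literal> instance can be used through the functions
<literal>typeOf</literal> and <literal>cast</literal> from <literal>Data.Typeable</literal>.
The type <literal>Colour</literal> and the names <literal>colourType</literal> and
<literal>asColour</literal> are hypothetical.
<programlisting>
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Typeable (Typeable, cast, typeOf)

data Colour = Red | Green
  deriving (Show, Typeable)

-- typeOf returns a run-time representation of the value's type.
colourType :: String
colourType = show (typeOf Red)

-- cast succeeds only when the argument really is a Colour.
asColour :: Typeable a => a -> Maybe Colour
asColour = cast
</programlisting>
</para>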
3949
3950 </sect2>
3951
3952 <sect2 id="newtype-deriving">
3953 <title>Generalised derived instances for newtypes</title>
3954
3955 <para>
3956 When you define an abstract type using <literal>newtype</literal>, you may want
3957 the new type to inherit some instances from its representation. In
3958 Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>,
3959 <literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any
3960 other classes you have to write an explicit instance declaration. For
3961 example, if you define
3962
3963 <programlisting>
3964 newtype Dollars = Dollars Int
3965 </programlisting>
3966
3967 and you want to use arithmetic on <literal>Dollars</literal>, you have to
3968 explicitly define an instance of <literal>Num</literal>:
3969
3970 <programlisting>
instance Num Dollars where
  Dollars a + Dollars b = Dollars (a+b)
  ...
3974 </programlisting>
3975 All the instance does is apply and remove the <literal>newtype</literal>
3976 constructor. It is particularly galling that, since the constructor
3977 doesn't appear at run-time, this instance declaration defines a
3978 dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal>
3979 dictionary, only slower!
3980 </para>
3981
3982
3983 <sect3 id="generalized-newtype-deriving"> <title> Generalising the deriving clause </title>
3984 <para>
3985 GHC now permits such instances to be derived instead,
3986 using the flag <option>-XGeneralizedNewtypeDeriving</option>,
3987 so one can write
3988 <programlisting>
3989 newtype Dollars = Dollars Int deriving (Eq,Show,Num)
3990 </programlisting>
3991
3992 and the implementation uses the <emphasis>same</emphasis> <literal>Num</literal> dictionary
3993 for <literal>Dollars</literal> as for <literal>Int</literal>. Notionally, the compiler
3994 derives an instance declaration of the form
3995
3996 <programlisting>
3997 instance Num Int => Num Dollars
3998 </programlisting>
3999
4000 which just adds or removes the <literal>newtype</literal> constructor according to the type.
4001 </para>
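<para>
A complete module using this declaration might look as follows. This is a sketch only; the
name <literal>total</literal> is hypothetical.
<programlisting>
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

newtype Dollars = Dollars Int deriving (Eq, Show, Num)

-- Arithmetic and numeric literals work through the derived Num
-- instance; the literal 2 is converted with fromInteger.
total :: Dollars
total = Dollars 3 + Dollars 4 * 2
</programlisting>
No handwritten <literal>Num</literal> instance is needed, while <literal>Eq</literal> and
<literal>Show</literal> are requested in the same clause.
</para>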
4002 <para>
4003
4004 We can also derive instances of constructor classes in a similar
4005 way. For example, suppose we have implemented state and failure monad
4006 transformers, such that
4007
4008 <programlisting>
4009 instance Monad m => Monad (State s m)
4010 instance Monad m => Monad (Failure m)
4011 </programlisting>
4012 In Haskell 98, we can define a parsing monad by
4013 <programlisting>
4014 type Parser tok m a = State [tok] (Failure m) a
4015 </programlisting>
4016
4017 which is automatically a monad thanks to the instance declarations
4018 above. With the extension, we can make the parser type abstract,
4019 without needing to write an instance of class <literal>Monad</literal>, via
4020
4021 <programlisting>
newtype Parser tok m a = Parser (State [tok] (Failure m) a)
                       deriving Monad
4024 </programlisting>
4025 In this case the derived instance declaration is of the form
4026 <programlisting>
4027 instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)
4028 </programlisting>
4029
4030 Notice that, since <literal>Monad</literal> is a constructor class, the
4031 instance is a <emphasis>partial application</emphasis> of the new type, not the
4032 entire left hand side. We can imagine that the type declaration is
4033 "eta-converted" to generate the context of the instance
4034 declaration.
4035 </para>
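<para>
The monad transformers above are assumed rather than defined here, so the following
self-contained sketch substitutes <literal>Maybe</literal> as the representation type to show
the same idea of deriving a constructor class. The names <literal>Attempt</literal>,
<literal>runAttempt</literal>, <literal>safeDiv</literal> and <literal>calc</literal> are
hypothetical. <literal>Functor</literal> and <literal>Applicative</literal> are derived as
well, on the assumption that your version of the libraries makes them superclasses of
<literal>Monad</literal>.
<programlisting>
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Applicative (Applicative)

-- An abstract wrapper around Maybe that inherits its monad structure.
newtype Attempt a = Attempt (Maybe a)
  deriving (Functor, Applicative, Monad)

runAttempt :: Attempt a -> Maybe a
runAttempt (Attempt m) = m

safeDiv :: Int -> Int -> Attempt Int
safeDiv _ 0 = Attempt Nothing
safeDiv x y = Attempt (Just (x `div` y))

-- Sequencing uses the derived Monad instance for Attempt.
calc :: Int -> Attempt Int
calc d =
  safeDiv 100 d >>= \q ->
  safeDiv q 2   >>= \r ->
  return (q + r)
</programlisting>
As in the <literal>Parser</literal> example, the instance is for the partial application
<literal>Attempt</literal>, not for <literal>Attempt a</literal>.
</para>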
4036 <para>
4037
4038 We can even derive instances of multi-parameter classes, provided the
newtype is the last class parameter. In this case, a "partial
application" of the class appears in the <literal>deriving</literal>
4041 clause. For example, given the class
4042
4043 <programlisting>
4044 class StateMonad s m | m -> s where ...
4045 instance Monad m => StateMonad s (State s m) where ...
4046 </programlisting>
4047 then we can derive an instance of <literal>StateMonad</literal> for <literal>Parser</literal>s by
4048 <programlisting>
newtype Parser tok m a = Parser (State [tok] (Failure m) a)
                       deriving (Monad, StateMonad [tok])
4051 </programlisting>
4052
4053 The derived instance is obtained by completing the application of the
4054 class to the new type:
4055
4056 <programlisting>
instance StateMonad [tok] (State [tok] (Failure m)) =>
         StateMonad [tok] (Parser tok m)
4059 </programlisting>
4060 </para>
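<para>
Since <literal>StateMonad</literal> and the transformers are not defined here, the following
self-contained sketch uses a hypothetical two-parameter class <literal>Convert</literal> to
show the same mechanism. The names <literal>Width</literal> and <literal>width</literal> are
also hypothetical, and the sketch assumes <option>-XMultiParamTypeClasses</option> and
<option>-XFlexibleInstances</option> in addition to
<option>-XGeneralizedNewtypeDeriving</option>.
<programlisting>
{-# LANGUAGE GeneralizedNewtypeDeriving, MultiParamTypeClasses, FlexibleInstances #-}

-- A hypothetical two-parameter class; the newtype will be its last parameter.
class Convert a b where
  convert :: a -> b

instance Convert String Int where
  convert = length

-- The deriving clause names the partial application (Convert String);
-- the derived instance is Convert String Width.
newtype Width = Width Int
  deriving (Eq, Show, Convert String)

width :: Width
width = convert "hello"
</programlisting>
</para>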
4061 <para>
4062
4063 As a result of this extension, all derived instances in newtype
4064 declarations are treated uniformly (and implemented just by reusing
4065 the dictionary for the representation type), <emphasis>except</emphasis>
4066 <literal>Show</literal> and <literal>Read</literal>, which really behave differently for
4067 the newtype and its representation.
4068 </para>
4069 </sect3>
4070
4071 <sect3> <title> A more precise specification </title>
4072 <para>
A derived instance is derived only for declarations of these forms (after expansion of any type synonyms):
4074
4075 <programlisting>
4076 newtype T v1..vn = MkT (t vk+1..vn) deriving (C t1..tj)
4077 newtype instance T s1..sk vk+1..vn = MkT (t vk+1..vn) deriving (C t1..tj)
4078 </programlisting>
4079 where
4080 <itemizedlist>
4081 <listitem><para>
4082 <literal>v1..vn</literal> are type variables, and <literal>t</literal>,
4083 <literal>s1..sk</literal>, <literal>t1..tj</literal> are types.
4084 </para></listitem>
4085 <listitem><para>
The <literal>(C t1..tj)</literal> is a partial application of the class <literal>C</literal>,
4087 where the arity of <literal>C</literal>
4088 is exactly <literal>j+1</literal>. That is, <literal>C</literal> lacks exactly one type argument.
4089 </para></listitem>
4090 <listitem><para>
4091 <literal>k</literal> is chosen so that <literal>C t1..tj (T v1...vk)</literal> is well-kinded.
4092 (Or, in the case of a <literal>data instance</literal>, so that <literal>C t1..tj (T s1..sk)</literal> is
well-kinded.)
4094 </para></listitem>
4095 <listitem><para>
4096 The type <literal>t</literal> is an arbitrary type.
4097 </para></listitem>
4098 <listitem><para>
4099 The type variables <literal>vk+1...vn</literal> do not occur in the types <literal>t</literal>,
4100 <literal>s1..sk</literal>, or <literal>t1..tj</literal>.