1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <para>
3 <indexterm><primary>language, GHC</primary></indexterm>
4 <indexterm><primary>extensions, GHC</primary></indexterm>
5 As with all known Haskell systems, GHC implements some extensions to
6 the language. They can all be enabled or disabled by command-line flags
7 or language pragmas. By default GHC understands the most recent Haskell
8 version it supports, plus a handful of extensions.
9 </para>
10
11 <para>
12 Some of the Glasgow extensions serve to give you access to the
13 underlying facilities with which we implement Haskell. Thus, you can
14 get at the Raw Iron, if you are willing to write some non-portable
15 code at a more primitive level. You need not be &ldquo;stuck&rdquo;
16 on performance because of the implementation costs of Haskell's
17 &ldquo;high-level&rdquo; features&mdash;you can always code
18 &ldquo;under&rdquo; them. In an extreme case, you can write all your
19 time-critical code in C, and then just glue it together with Haskell!
20 </para>
21
22 <para>
23 Before you get too carried away working at the lowest level (e.g.,
24 sloshing <literal>MutableByteArray&num;</literal>s around your
25 program), you may wish to check if there are libraries that provide a
26 &ldquo;Haskellised veneer&rdquo; over the features you want. The
27 separate <ulink url="../libraries/index.html">libraries
28 documentation</ulink> describes all the libraries that come with GHC.
29 </para>
30
31 <!-- LANGUAGE OPTIONS -->
32 <sect1 id="options-language">
33 <title>Language options</title>
34
35 <indexterm><primary>language</primary><secondary>option</secondary>
36 </indexterm>
37 <indexterm><primary>options</primary><secondary>language</secondary>
38 </indexterm>
39 <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
40 </indexterm>
41
42 <para>The language option flags control which variations of the language are
43 permitted.</para>
44
45 <para>Language options can be controlled in two ways:
46 <itemizedlist>
47 <listitem><para>Every language option can be switched on by a command-line flag "<option>-X...</option>"
48 (e.g. <option>-XTemplateHaskell</option>), and switched off by the flag "<option>-XNo...</option>"
49 (e.g. <option>-XNoTemplateHaskell</option>).</para></listitem>
50 <listitem><para>
51 Language options recognised by Cabal can also be enabled using the <literal>LANGUAGE</literal> pragma,
52 thus <literal>{-# LANGUAGE TemplateHaskell #-}</literal> (see <xref linkend="language-pragma"/>). </para>
53 </listitem>
54 </itemizedlist></para>
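<para>As a minimal sketch of the two mechanisms (the choice of
<option>-XTupleSections</option> and the file name <filename>Main.hs</filename>
are ours, purely for illustration), the module below enables an extension with
a <literal>LANGUAGE</literal> pragma. Dropping the pragma and compiling with
<literal>ghc -XTupleSections Main.hs</literal> should have the same effect:</para>
<programlisting>
{-# LANGUAGE TupleSections #-}
module Main where

-- (,True) is a tuple section; it is accepted only when TupleSections is on.
main :: IO ()
main = print (map (,True) "ab")
</programlisting>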
55
56 <para>The flag <option>-fglasgow-exts</option>
57 <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
58 is equivalent to enabling the following extensions:
59 &what_glasgow_exts_does;
60 Enabling these options is the <emphasis>only</emphasis>
61 effect of <option>-fglasgow-exts</option>.
62 We are trying to move away from this portmanteau flag,
63 and towards enabling features individually.</para>
64
65 </sect1>
66
67 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
68 <sect1 id="primitives">
69 <title>Unboxed types and primitive operations</title>
70
71 <para>GHC is built on a raft of primitive data types and operations;
72 "primitive" in the sense that they cannot be defined in Haskell itself.
73 While you really can use this stuff to write fast code,
74 we generally find it a lot less painful, and more satisfying in the
75 long run, to use higher-level language features and libraries. With
76 any luck, the code you write will be optimised to the efficient
77 unboxed version in any case. And if it isn't, we'd like to know
78 about it.</para>
79
80 <para>All these primitive data types and operations are exported by the
81 library <literal>GHC.Prim</literal>, for which there is
82 <ulink url="&libraryGhcPrimLocation;/GHC-Prim.html">detailed online documentation</ulink>.
83 (This documentation is generated from the file <filename>compiler/prelude/primops.txt.pp</filename>.)
84 </para>
85
86 <para>
87 If you want to mention any of the primitive data types or operations in your
88 program, you must first import <literal>GHC.Prim</literal> to bring them
89 into scope. Many of them have names ending in "&num;", and to mention such
90 names you need the <option>-XMagicHash</option> extension (<xref linkend="magic-hash"/>).
91 </para>
92
93 <para>The primops make extensive use of <link linkend="glasgow-unboxed">unboxed types</link>
94 and <link linkend="unboxed-tuples">unboxed tuples</link>, which
95 we briefly summarise here. </para>
96
97 <sect2 id="glasgow-unboxed">
98 <title>Unboxed types</title>
99
100 <para>
101 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
102 </para>
103
104 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
105 that values of that type are represented by a pointer to a heap
106 object. The representation of a Haskell <literal>Int</literal>, for
107 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
108 type, however, is represented by the value itself; no pointers or heap
109 allocation are involved.
110 </para>
111
112 <para>
113 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
114 would use in C: <literal>Int&num;</literal> (long int),
115 <literal>Double&num;</literal> (double), <literal>Addr&num;</literal>
116 (void *), etc. The <emphasis>primitive operations</emphasis>
117 (PrimOps) on these types are what you might expect; e.g.,
118 <literal>(+&num;)</literal> is addition on
119 <literal>Int&num;</literal>s, and is the machine-addition that we all
120 know and love&mdash;usually one instruction.
121 </para>
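<para>The following is a small sketch of arithmetic on
<literal>Int#</literal>, not code taken from GHC itself. The names
<literal>sumSquares#</literal> and <literal>sumSquares</literal> are
hypothetical. The example imports from <literal>GHC.Exts</literal>, which
re-exports the primitives together with the boxed constructor
<literal>I#</literal>:</para>
<programlisting>
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), Int#, (+#), (*#))

-- Sum of squares computed directly on raw machine integers.
sumSquares# :: Int# -> Int# -> Int#
sumSquares# x y = (x *# x) +# (y *# y)

-- Boxed wrapper: unpack the I# constructor, work unboxed, then re-box.
sumSquares :: Int -> Int -> Int
sumSquares (I# x) (I# y) = I# (sumSquares# x y)

main :: IO ()
main = print (sumSquares 3 4)   -- 9 + 16 = 25
</programlisting>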
122
123 <para>
124 Primitive (unboxed) types cannot be defined in Haskell, and are
125 therefore built into the language and compiler. Primitive types are
126 always unlifted; that is, a value of a primitive type cannot be
127 bottom. We use the convention (but it is only a convention)
128 that primitive types, values, and
129 operations have a <literal>&num;</literal> suffix (see <xref linkend="magic-hash"/>).
130 For some primitive types we have special syntax for literals, also
131 described in the <link linkend="magic-hash">same section</link>.
132 </para>
133
134 <para>
135 Primitive values are often represented by a simple bit-pattern, such
136 as <literal>Int&num;</literal>, <literal>Float&num;</literal>,
137 <literal>Double&num;</literal>. But this is not necessarily the case:
138 a primitive value might be represented by a pointer to a
139 heap-allocated object. Examples include
140 <literal>Array&num;</literal>, the type of primitive arrays. A
141 primitive array is heap-allocated because it is too big a value to fit
142 in a register, and would be too expensive to copy around; in a sense,
143 it is accidental that it is represented by a pointer. If a pointer
144 represents a primitive value, then it really does point to that value:
145 no unevaluated thunks, no indirections&hellip;nothing can be at the
146 other end of the pointer than the primitive value.
147 A numerically-intensive program using unboxed types can
148 go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
149 counterpart&mdash;we saw a threefold speedup on one example.
150 </para>
151
152 <para>
153 There are some restrictions on the use of primitive types:
154 <itemizedlist>
155 <listitem><para>The main restriction
156 is that you can't pass a primitive value to a polymorphic
157 function or store one in a polymorphic data type. This rules out
158 things like <literal>[Int&num;]</literal> (i.e. lists of primitive
159 integers). The reason for this restriction is that polymorphic
160 arguments and constructor fields are assumed to be pointers: if an
161 unboxed integer is stored in one of these, the garbage collector would
162 attempt to follow it, leading to unpredictable space leaks. Or a
163 <function>seq</function> operation on the polymorphic component may
164 attempt to dereference the pointer, with disastrous results. Even
165 worse, the unboxed value might be larger than a pointer
166 (<literal>Double&num;</literal> for instance).
167 </para>
168 </listitem>
169 <listitem><para> You cannot define a newtype whose representation type
170 (the argument type of the data constructor) is an unboxed type. Thus,
171 this is illegal:
172 <programlisting>
173 newtype A = MkA Int#
174 </programlisting>
175 </para></listitem>
176 <listitem><para> You cannot bind a variable with an unboxed type
177 in a <emphasis>top-level</emphasis> binding.
178 </para></listitem>
179 <listitem><para> You cannot bind a variable with an unboxed type
180 in a <emphasis>recursive</emphasis> binding.
181 </para></listitem>
182 <listitem><para> You may bind unboxed variables in a (non-recursive,
183 non-top-level) pattern binding, but you must make any such pattern-match
184 strict. For example, rather than:
185 <programlisting>
186 data Foo = Foo Int Int#
187
188 f x = let (Foo a b, w) = ..rhs.. in ..body..
189 </programlisting>
190 you must write:
191 <programlisting>
192 data Foo = Foo Int Int#
193
194 f x = let !(Foo a b, w) = ..rhs.. in ..body..
195 </programlisting>
196 since <literal>b</literal> has type <literal>Int#</literal>.
197 </para>
198 </listitem>
199 </itemizedlist>
200 </para>
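<para>The last restriction can be illustrated with a complete module. This is
only a sketch: the helper <literal>mkPair</literal> is hypothetical and stands
in for the elided right-hand side, and we assume
<option>-XBangPatterns</option> is used to write the strict binding:</para>
<programlisting>
{-# LANGUAGE BangPatterns, MagicHash #-}
module Main where

import GHC.Exts (Int (I#), Int#)

data Foo = Foo Int Int#

-- Hypothetical helper that builds the pair being matched.
mkPair :: Int -> (Foo, Int)
mkPair (I# n) = (Foo 1 n, 2)

f :: Int -> Int
f x = let !(Foo a b, w) = mkPair x   -- b :: Int#, so the binding is made strict
      in a + I# b + w

main :: IO ()
main = print (f 5)   -- 1 + 5 + 2 = 8
</programlisting>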
201
202 </sect2>
203
204 <sect2 id="unboxed-tuples">
205 <title>Unboxed tuples</title>
206
207 <para>
208 Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>;
209 they are a syntactic extension enabled by the language flag <option>-XUnboxedTuples</option>. An
210 unboxed tuple looks like this:
211 </para>
212
213 <para>
214
215 <programlisting>
216 (# e_1, ..., e_n #)
217 </programlisting>
218
219 </para>
220
221 <para>
222 where <literal>e&lowbar;1..e&lowbar;n</literal> are expressions of any
223 type (primitive or non-primitive). The type of an unboxed tuple looks
224 the same.
225 </para>
226
227 <para>
228 Note that when unboxed tuples are enabled,
229 <literal>(#</literal> is a single lexeme, so for example when using
230 operators like <literal>#</literal> and <literal>#-</literal> you need
231 to write <literal>( # )</literal> and <literal>( #- )</literal> rather than
232 <literal>(#)</literal> and <literal>(#-)</literal>.
233 </para>
234
235 <para>
236 Unboxed tuples are used for functions that need to return multiple
237 values, but they avoid the heap allocation normally associated with
238 using fully-fledged tuples. When an unboxed tuple is returned, the
239 components are put directly into registers or on the stack; the
240 unboxed tuple itself does not have a composite representation. Many
241 of the primitive operations listed in <literal>primops.txt.pp</literal> return unboxed
242 tuples.
243 In particular, the <literal>IO</literal> and <literal>ST</literal> monads use unboxed
244 tuples to avoid unnecessary allocation during sequences of operations.
245 </para>
246
247 <para>
248 There are some restrictions on the use of unboxed tuples:
249 <itemizedlist>
250
251 <listitem>
252 <para>
253 Values of unboxed tuple types are subject to the same restrictions as
254 other unboxed types; i.e. they may not be stored in polymorphic data
255 structures or passed to polymorphic functions.
256 </para>
257 </listitem>
258
259 <listitem>
260 <para>
261 The typical use of unboxed tuples is simply to return multiple values,
262 binding those multiple results with a <literal>case</literal> expression, thus:
263 <programlisting>
264 f x y = (# x+1, y-1 #)
265 g x = case f x x of { (# a, b #) -&#62; a + b }
266 </programlisting>
267 You can have an unboxed tuple in a pattern binding, thus
268 <programlisting>
269 f x = let (# p,q #) = h x in ..body..
270 </programlisting>
271 If the types of <literal>p</literal> and <literal>q</literal> are not unboxed,
272 the resulting binding is lazy like any other Haskell pattern binding. The
273 above example desugars like this:
274 <programlisting>
275 f x = let t = case h x of { (# p,q #) -> (p,q) }
276 p = fst t
277 q = snd t
278 in ..body..
279 </programlisting>
280 Indeed, the bindings can even be recursive.
281 </para>
282 </listitem>
283 </itemizedlist>
284
285 </para>
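<para>Putting these pieces together, here is a minimal self-contained sketch of
returning two results through an unboxed tuple. The function name
<literal>quotRem'</literal> is hypothetical and chosen only for this
example:</para>
<programlisting>
{-# LANGUAGE UnboxedTuples #-}
module Main where

-- Quotient and remainder returned together, without building a boxed tuple.
quotRem' :: Int -> Int -> (# Int, Int #)
quotRem' n d = (# n `quot` d, n `rem` d #)

main :: IO ()
main = case quotRem' 17 5 of
  (# q, r #) -> print (q, r)   -- (3,2)
</programlisting>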
286
287 </sect2>
288 </sect1>
289
290
291 <!-- ====================== SYNTACTIC EXTENSIONS ======================= -->
292
293 <sect1 id="syntax-extns">
294 <title>Syntactic extensions</title>
295
296 <sect2 id="unicode-syntax">
297 <title>Unicode syntax</title>
298 <para>The language
299 extension <option>-XUnicodeSyntax</option><indexterm><primary><option>-XUnicodeSyntax</option></primary></indexterm>
300 enables Unicode characters to be used to stand for certain ASCII
301 character sequences. The following alternatives are provided:</para>
302
303 <informaltable>
304 <tgroup cols="4" align="left" colsep="1" rowsep="1">
305 <thead>
306 <row>
307 <entry>ASCII</entry>
308 <entry>Unicode alternative</entry>
309 <entry>Code point</entry>
310 <entry>Name</entry>
311 </row>
312 </thead>
313
314 <!--
315 to find the DocBook entities for these characters, find
316 the Unicode code point (e.g. 0x2237), and grep for it in
317 /usr/share/sgml/docbook/xml-dtd-*/ent/* (or equivalent on
318 your system). Some of these Unicode code points don't have
319 equivalent DocBook entities.
320 -->
321
322 <tbody>
323 <row>
324 <entry><literal>::</literal></entry>
325 <entry>&#x2237;</entry> <!-- no DocBook entity, so a numeric character reference is used -->
326 <entry>0x2237</entry>
327 <entry>PROPORTION</entry>
328 </row>
329 </tbody>
330 <tbody>
331 <row>
332 <entry><literal>=&gt;</literal></entry>
333 <entry>&rArr;</entry>
334 <entry>0x21D2</entry>
335 <entry>RIGHTWARDS DOUBLE ARROW</entry>
336 </row>
337 </tbody>
338 <tbody>
339 <row>
340 <entry><literal>forall</literal></entry>
341 <entry>&forall;</entry>
342 <entry>0x2200</entry>
343 <entry>FOR ALL</entry>
344 </row>
345 </tbody>
346 <tbody>
347 <row>
348 <entry><literal>-&gt;</literal></entry>
349 <entry>&rarr;</entry>
350 <entry>0x2192</entry>
351 <entry>RIGHTWARDS ARROW</entry>
352 </row>
353 </tbody>
354 <tbody>
355 <row>
356 <entry><literal>&lt;-</literal></entry>
357 <entry>&larr;</entry>
358 <entry>0x2190</entry>
359 <entry>LEFTWARDS ARROW</entry>
360 </row>
361 </tbody>
362
363 <tbody>
364 <row>
365 <entry>-&lt;</entry>
366 <entry>&larrtl;</entry>
367 <entry>0x2919</entry>
368 <entry>LEFTWARDS ARROW-TAIL</entry>
369 </row>
370 </tbody>
371
372 <tbody>
373 <row>
374 <entry>&gt;-</entry>
375 <entry>&rarrtl;</entry>
376 <entry>0x291A</entry>
377 <entry>RIGHTWARDS ARROW-TAIL</entry>
378 </row>
379 </tbody>
380
381 <tbody>
382 <row>
383 <entry>-&lt;&lt;</entry>
384 <entry>&#x291B;</entry>
385 <entry>0x291B</entry>
386 <entry>LEFTWARDS DOUBLE ARROW-TAIL</entry>
387 </row>
388 </tbody>
389
390 <tbody>
391 <row>
392 <entry>&gt;&gt;-</entry>
393 <entry>&#x291C;</entry>
394 <entry>0x291C</entry>
395 <entry>RIGHTWARDS DOUBLE ARROW-TAIL</entry>
396 </row>
397 </tbody>
398
399 <tbody>
400 <row>
401 <entry>*</entry>
402 <entry>&starf;</entry>
403 <entry>0x2605</entry>
404 <entry>BLACK STAR</entry>
405 </row>
406 </tbody>
407
408 </tgroup>
409 </informaltable>
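<para>As a short sketch (the function <literal>swap</literal> is just an
illustration, and we assume the source file is saved as UTF-8), a module using
three of the alternatives above might look like this:</para>
<programlisting>
{-# LANGUAGE UnicodeSyntax #-}
module Main where

-- Uses the Unicode alternatives for "::" and "->".
swap &#x2237; (a, b) &#x2192; (b, a)
swap (x, y) = (y, x)

main &#x2237; IO ()
main = do
  s &#x2190; pure "ok"                 -- the Unicode alternative for the do-notation arrow
  print (swap (1 &#x2237; Int, s))      -- ("ok",1)
</programlisting>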
410 </sect2>
411
412 <sect2 id="magic-hash">
413 <title>The magic hash</title>
414 <para>The language extension <option>-XMagicHash</option> allows "&num;" as a
415 postfix modifier to identifiers. Thus, "x&num;" is a valid variable, and "T&num;" is
416 a valid type constructor or data constructor.</para>
417
418 <para>The hash sign does not change semantics at all. We tend to use variable
419 names ending in "&num;" for unboxed values or types (e.g. <literal>Int&num;</literal>),
420 but there is no requirement to do so; they are just plain ordinary variables.
421 Nor does the <option>-XMagicHash</option> extension bring anything into scope.
422 For example, to bring <literal>Int&num;</literal> into scope you must
423 import <literal>GHC.Prim</literal> (see <xref linkend="primitives"/>);
424 the <option>-XMagicHash</option> extension
425 then allows you to <emphasis>refer</emphasis> to the <literal>Int&num;</literal>
426 that is now in scope. Note that with this option, the meaning of <literal>x&num;y = 0</literal>
427 is changed: it defines a function <literal>x&num;</literal> taking a single argument <literal>y</literal>;
428 to define the operator <literal>&num;</literal>, put a space: <literal>x &num; y = 0</literal>.
429
430 </para>
431 <para> The <option>-XMagicHash</option> extension also enables some new forms of literals (see <xref linkend="glasgow-unboxed"/>):
432 <itemizedlist>
433 <listitem><para> <literal>'x'&num;</literal> has type <literal>Char&num;</literal></para> </listitem>
434 <listitem><para> <literal>&quot;foo&quot;&num;</literal> has type <literal>Addr&num;</literal></para> </listitem>
435 <listitem><para> <literal>3&num;</literal> has type <literal>Int&num;</literal>. In general,
436 any Haskell integer lexeme followed by a <literal>&num;</literal> is an <literal>Int&num;</literal> literal, e.g.
437 <literal>-0x3A&num;</literal> as well as <literal>32&num;</literal>.</para></listitem>
438 <listitem><para> <literal>3&num;&num;</literal> has type <literal>Word&num;</literal>. In general,
439 any non-negative Haskell integer lexeme followed by <literal>&num;&num;</literal>
440 is a <literal>Word&num;</literal> literal.</para> </listitem>
441 <listitem><para> <literal>3.2&num;</literal> has type <literal>Float&num;</literal>.</para> </listitem>
442 <listitem><para> <literal>3.2&num;&num;</literal> has type <literal>Double&num;</literal></para> </listitem>
443 </itemizedlist>
444 </para>
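<para>The literal forms can be tried out with a sketch like the one below. It
imports the boxed constructors and a few primops from
<literal>GHC.Exts</literal>; the particular operations used are our choice for
illustration only:</para>
<programlisting>
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Char (C#), Double (D#), Int (I#), Word (W#),
                 plusWord#, (+#), (*##))

main :: IO ()
main = do
  print (I# (3# +# 0x10#))           -- Int# literals: 19
  print (W# (3## `plusWord#` 4##))   -- Word# literals: 7
  print (D# (1.5## *## 2.0##))       -- Double# literals: 3.0
  print (C# 'x'#)                    -- Char# literal: 'x'
</programlisting>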
445 </sect2>
446
447 <sect2 id="negative-literals">
448 <title>Negative literals</title>
449 <para>
450 The literal <literal>-123</literal> is, according to
451 Haskell 98 and Haskell 2010, desugared as
452 <literal>negate (fromInteger 123)</literal>.
453 The language extension <option>-XNegativeLiterals</option>
454 means that it is instead desugared as
455 <literal>fromInteger (-123)</literal>.
456 </para>
457
458 <para>
459 This can make a difference when the positive and negative range of
460 a numeric data type don't match up. For example,
461 in 8-bit arithmetic -128 is representable, but +128 is not.
462 So <literal>negate (fromInteger 128)</literal> will elicit an
463 unexpected integer-literal-overflow message.
464 </para>
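<para>A minimal sketch of the 8-bit case, assuming the type
<literal>Int8</literal> from <literal>Data.Int</literal> (the name
<literal>lowest</literal> is hypothetical):</para>
<programlisting>
{-# LANGUAGE NegativeLiterals #-}
module Main where

import Data.Int (Int8)

-- With NegativeLiterals this is fromInteger (-128), which fits in Int8;
-- the intermediate value +128 is never constructed.
lowest :: Int8
lowest = -128

main :: IO ()
main = print (lowest == minBound)   -- True
</programlisting>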
465 </sect2>
466
467 <sect2 id="num-decimals">
468 <title>Fractional looking integer literals</title>
469 <para>
470 Haskell 2010 and Haskell 98 define floating literals with
471 the syntax <literal>1.2e6</literal>. These literals have the
472 type <literal>Fractional a => a</literal>.
473 </para>
474
475 <para>
476 The language extension <option>-XNumDecimals</option> allows
477 you to also use the floating literal syntax for instances of
478 <literal>Integral</literal>, and have values like
479 <literal>(1.2e6 :: Num a => a)</literal>.
480 </para>
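<para>For instance, the following sketch (the name <literal>million</literal>
is ours) uses the floating literal syntax at type
<literal>Integer</literal>:</para>
<programlisting>
{-# LANGUAGE NumDecimals #-}
module Main where

-- 1.2e6 denotes a whole number, so with NumDecimals it may be an Integer.
million :: Integer
million = 1.2e6

main :: IO ()
main = print million   -- 1200000
</programlisting>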
481 </sect2>
482
483
484 <!-- ====================== HIERARCHICAL MODULES ======================= -->
485
486
487 <sect2 id="hierarchical-modules">
488 <title>Hierarchical Modules</title>
489
490 <para>GHC supports a small extension to the syntax of module
491 names: a module name is allowed to contain a dot
492 <literal>&lsquo;.&rsquo;</literal>. This is also known as the
493 &ldquo;hierarchical module namespace&rdquo; extension, because
494 it extends the normally flat Haskell module namespace into a
495 more flexible hierarchy of modules.</para>
496
497 <para>This extension has very little impact on the language
498 itself; module names are <emphasis>always</emphasis> fully
499 qualified, so you can just think of the fully qualified module
500 name as <quote>the module name</quote>. In particular, this
501 means that the full module name must be given after the
502 <literal>module</literal> keyword at the beginning of the
503 module; for example, the module <literal>A.B.C</literal> must
504 begin</para>
505
506 <programlisting>module A.B.C</programlisting>
507
508
509 <para>It is a common strategy to use the <literal>as</literal>
510 keyword to save some typing when using qualified names with
511 hierarchical modules. For example:</para>
512
513 <programlisting>
514 import qualified Control.Monad.ST.Strict as ST
515 </programlisting>
516
517 <para>For details on how GHC searches for source and interface
518 files in the presence of hierarchical modules, see <xref
519 linkend="search-path"/>.</para>
520
521 <para>GHC comes with a large collection of libraries arranged
522 hierarchically; see the accompanying <ulink
523 url="../libraries/index.html">library
524 documentation</ulink>. More libraries to install are available
525 from <ulink
526 url="http://hackage.haskell.org/packages/hackage.html">HackageDB</ulink>.</para>
527 </sect2>
528
529 <!-- ====================== PATTERN GUARDS ======================= -->
530
531 <sect2 id="pattern-guards">
532 <title>Pattern guards</title>
533
534 <para>
535 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
536 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ulink url="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ulink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
537 </para>
538
539 <para>
540 Suppose we have an abstract data type of finite maps, with a
541 lookup operation:
542
543 <programlisting>
544 lookup :: FiniteMap -> Int -> Maybe Int
545 </programlisting>
546
547 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
548 where <varname>v</varname> is the value that the key maps to. Now consider the following definition:
549 </para>
550
551 <programlisting>
552 clunky env var1 var2 | ok1 &amp;&amp; ok2 = val1 + val2
553 | otherwise = var1 + var2
554 where
555 m1 = lookup env var1
556 m2 = lookup env var2
557 ok1 = maybeToBool m1
558 ok2 = maybeToBool m2
559 val1 = expectJust m1
560 val2 = expectJust m2
561 </programlisting>
562
563 <para>
564 The auxiliary functions are
565 </para>
566
567 <programlisting>
568 maybeToBool :: Maybe a -&gt; Bool
569 maybeToBool (Just x) = True
570 maybeToBool Nothing = False
571
572 expectJust :: Maybe a -&gt; a
573 expectJust (Just x) = x
574 expectJust Nothing = error "Unexpected Nothing"
575 </programlisting>
576
577 <para>
578 What is <function>clunky</function> doing? The guard <literal>ok1 &amp;&amp;
579 ok2</literal> checks that both lookups succeed, using
580 <function>maybeToBool</function> to convert the <function>Maybe</function>
581 types to booleans. The (lazily evaluated) <function>expectJust</function>
582 calls extract the values from the results of the lookups, and bind the
583 returned values to <varname>val1</varname> and <varname>val2</varname>
584 respectively. If either lookup fails, then clunky takes the
585 <literal>otherwise</literal> case and returns the sum of its arguments.
586 </para>
587
588 <para>
589 This is certainly legal Haskell, but it is a tremendously verbose and
590 un-obvious way to achieve the desired effect. Arguably, a more direct way
591 to write clunky would be to use case expressions:
592 </para>
593
594 <programlisting>
595 clunky env var1 var2 = case lookup env var1 of
596 Nothing -&gt; fail
597 Just val1 -&gt; case lookup env var2 of
598 Nothing -&gt; fail
599 Just val2 -&gt; val1 + val2
600 where
601 fail = var1 + var2
602 </programlisting>
603
604 <para>
605 This is a bit shorter, but hardly better. Of course, we can rewrite any set
606 of pattern-matching, guarded equations as case expressions; that is
607 precisely what the compiler does when compiling equations! The reason that
608 Haskell provides guarded equations is that they allow us to write down
609 the cases we want to consider, one at a time, independently of each other.
610 This structure is hidden in the case version. Two of the right-hand sides
611 are really the same (<function>fail</function>), and the whole expression
612 tends to become more and more indented.
613 </para>
614
615 <para>
616 Here is how I would write clunky:
617 </para>
618
619 <programlisting>
620 clunky env var1 var2
621 | Just val1 &lt;- lookup env var1
622 , Just val2 &lt;- lookup env var2
623 = val1 + val2
624 ...other equations for clunky...
625 </programlisting>
626
627 <para>
628 The semantics should be clear enough. The qualifiers are matched in order.
629 For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
630 right hand side is evaluated and matched against the pattern on the left.
631 If the match fails then the whole guard fails and the next equation is
632 tried. If it succeeds, then the appropriate binding takes place, and the
633 next qualifier is matched, in the augmented environment. Unlike list
634 comprehensions, however, the type of the expression to the right of the
635 <literal>&lt;-</literal> is the same as the type of the pattern to its
636 left. The bindings introduced by pattern guards scope over all the
637 remaining guard qualifiers, and over the right hand side of the equation.
638 </para>
639
640 <para>
641 Just as with list comprehensions, boolean expressions can be freely mixed
642 among the pattern guards. For example:
643 </para>
644
645 <programlisting>
646 f x | [y] &lt;- x
647 , y > 3
648 , Just z &lt;- h y
649 = ...
650 </programlisting>
651
652 <para>
653 Haskell's current guards therefore emerge as a special case, in which the
654 qualifier list has just one element, a boolean expression.
655 </para>
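<para>To make the example runnable, here is one possible complete version. It
is a sketch under an assumption of ours: the abstract finite map is replaced by
an association list and the Prelude's <function>lookup</function>, which takes
the key first, and the elided <quote>other equations</quote> are filled in with
the <literal>otherwise</literal> behaviour of the original definition:</para>
<programlisting>
module Main where

-- Assumption: an association list stands in for the abstract FiniteMap.
type Env = [(Int, Int)]

clunky :: Env -> Int -> Int -> Int
clunky env var1 var2
  | Just val1 &lt;- lookup var1 env
  , Just val2 &lt;- lookup var2 env
  = val1 + val2
-- Fallback equation: taken when either pattern guard fails.
clunky _ var1 var2 = var1 + var2

main :: IO ()
main = do
  let env = [(1, 10), (2, 20)]
  print (clunky env 1 2)   -- both lookups succeed: 30
  print (clunky env 1 3)   -- second lookup fails: 1 + 3 = 4
</programlisting>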
656 </sect2>
657
658 <!-- ===================== View patterns =================== -->
659
660 <sect2 id="view-patterns">
661 <title>View patterns
662 </title>
663
664 <para>
665 View patterns are enabled by the flag <literal>-XViewPatterns</literal>.
666 More information and examples of view patterns can be found on the
667 <ulink url="http://ghc.haskell.org/trac/ghc/wiki/ViewPatterns">Wiki
668 page</ulink>.
669 </para>
670
671 <para>
672 View patterns are somewhat like pattern guards that can be nested inside
673 of other patterns. They are a convenient way of pattern-matching
674 against values of abstract types. For example, in a programming language
675 implementation, we might represent the syntax of the types of the
676 language as follows:
677
678 <programlisting>
679 type Typ
680
681 data TypView = Unit
682 | Arrow Typ Typ
683
684 view :: Typ -> TypView
685
686 -- additional operations for constructing Typ's ...
687 </programlisting>
688
689 The representation of Typ is held abstract, permitting implementations
690 to use a fancy representation (e.g., hash-consing to manage sharing).
691
692 Without view patterns, using this signature is a little inconvenient:
693 <programlisting>
694 size :: Typ -> Integer
695 size t = case view t of
696 Unit -> 1
697 Arrow t1 t2 -> size t1 + size t2
698 </programlisting>
699
700 It is necessary to iterate the case, rather than using an equational
701 function definition. And the situation is even worse when the matching
702 against <literal>t</literal> is buried deep inside another pattern.
703 </para>
704
705 <para>
706 View patterns permit calling the view function inside the pattern and
707 matching against the result:
708 <programlisting>
709 size (view -> Unit) = 1
710 size (view -> Arrow t1 t2) = size t1 + size t2
711 </programlisting>
712
713 That is, we add a new form of pattern, written
714 <replaceable>expression</replaceable> <literal>-></literal>
715 <replaceable>pattern</replaceable> that means "apply the expression to
716 whatever we're trying to match against, and then match the result of
717 that application against the pattern". The expression can be any Haskell
718 expression of function type, and view patterns can be used wherever
719 patterns are used.
720 </para>
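<para>The <literal>size</literal> example can be turned into a complete program
once <literal>Typ</literal> is given some representation. The
<literal>newtype</literal> below is a hypothetical, deliberately trivial
choice; the point of the abstraction is that clients of
<literal>view</literal> need not know it:</para>
<programlisting>
{-# LANGUAGE ViewPatterns #-}
module Main where

-- Hypothetical representation; a real implementation could be fancier.
newtype Typ = Typ TypView

data TypView = Unit
             | Arrow Typ Typ

view :: Typ -> TypView
view (Typ v) = v

-- Each equation applies view to the argument and matches on the result.
size :: Typ -> Integer
size (view -> Unit)        = 1
size (view -> Arrow t1 t2) = size t1 + size t2

main :: IO ()
main = print (size (Typ (Arrow (Typ Unit) (Typ (Arrow (Typ Unit) (Typ Unit))))))
-- 1 + (1 + 1) = 3
</programlisting>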
721
722 <para>
723 The semantics of a pattern <literal>(</literal>
724 <replaceable>exp</replaceable> <literal>-></literal>
725 <replaceable>pat</replaceable> <literal>)</literal> are as follows:
726
727 <itemizedlist>
728
729 <listitem> Scoping:
730
731 <para>The variables bound by the view pattern are the variables bound by
732 <replaceable>pat</replaceable>.
733 </para>
734
735 <para>
736 Any variables in <replaceable>exp</replaceable> are bound occurrences,
737 but variables bound "to the left" in a pattern are in scope. This
738 feature permits, for example, one argument to a function to be used in
739 the view of another argument. For example, the function
740 <literal>clunky</literal> from <xref linkend="pattern-guards" /> can be
741 written using view patterns as follows:
742
743 <programlisting>
744 clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2
745 ...other equations for clunky...
746 </programlisting>
747 </para>
748
749 <para>
750 More precisely, the scoping rules are:
751 <itemizedlist>
752 <listitem>
753 <para>
754 In a single pattern, variables bound by patterns to the left of a view
755 pattern expression are in scope. For example:
756 <programlisting>
757 example :: Maybe ((String -> Integer,Integer), String) -> Bool
758 example (Just ((f,_), f -> 4)) = True
759 </programlisting>
760
761 Additionally, in function definitions, variables bound by matching earlier curried
762 arguments may be used in view pattern expressions in later arguments:
763 <programlisting>
764 example :: (String -> Integer) -> String -> Bool
765 example f (f -> 4) = True
766 </programlisting>
767 That is, the scoping is the same as it would be if the curried arguments
768 were collected into a tuple.
769 </para>
770 </listitem>
771
772 <listitem>
773 <para>
774 In mutually recursive bindings, such as <literal>let</literal>,
775 <literal>where</literal>, or the top level, view patterns in one
776 declaration may not mention variables bound by other declarations. That
777 is, each declaration must be self-contained. For example, the following
778 program is not allowed:
779 <programlisting>
780 let {(x -> y) = e1 ;
781 (y -> x) = e2 } in x
782 </programlisting>
783
784 (For some amplification on this design choice see
785 <ulink url="http://ghc.haskell.org/trac/ghc/ticket/4061">Trac #4061</ulink>.)
786
787 </para>
788 </listitem>
789 </itemizedlist>
790
791 </para>
792 </listitem>
793
794 <listitem><para> Typing: If <replaceable>exp</replaceable> has type
795 <replaceable>T1</replaceable> <literal>-></literal>
796 <replaceable>T2</replaceable> and <replaceable>pat</replaceable> matches
797 a <replaceable>T2</replaceable>, then the whole view pattern matches a
798 <replaceable>T1</replaceable>.
799 </para></listitem>
800
801 <listitem><para> Matching: To the equations in Section 3.17.3 of the
802 <ulink url="http://www.haskell.org/onlinereport/">Haskell 98
803 Report</ulink>, add the following:
804 <programlisting>
805 case v of { (e -> p) -> e1 ; _ -> e2 }
806 =
807 case (e v) of { p -> e1 ; _ -> e2 }
808 </programlisting>
809 That is, to match a variable <replaceable>v</replaceable> against a pattern
810 <literal>(</literal> <replaceable>exp</replaceable>
811 <literal>-></literal> <replaceable>pat</replaceable>
812 <literal>)</literal>, evaluate <literal>(</literal>
813 <replaceable>exp</replaceable> <replaceable> v</replaceable>
814 <literal>)</literal> and match the result against
815 <replaceable>pat</replaceable>.
816 </para></listitem>
817
818 <listitem><para> Efficiency: When the same view function is applied in
819 multiple branches of a function definition or a case expression (e.g.,
820 in <literal>size</literal> above), GHC makes an attempt to collect these
821 applications into a single nested case expression, so that the view
822 function is only applied once. Pattern compilation in GHC follows the
823 matrix algorithm described in Chapter 4 of <ulink
824 url="http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/">The
825 Implementation of Functional Programming Languages</ulink>. When the
826 top rows of the first column of a matrix are all view patterns with the
827 "same" expression, these patterns are transformed into a single nested
828 case. This includes, for example, adjacent view patterns that line up
829 in a tuple, as in
830 <programlisting>
831 f ((view -> A, p1), p2) = e1
832 f ((view -> B, p3), p4) = e2
833 </programlisting>
834 </para>
835
836 <para> The current notion of when two view pattern expressions are "the
837 same" is very restricted: it is not even full syntactic equality.
838 However, it does include variables, literals, applications, and tuples;
839 e.g., two instances of <literal>view ("hi", "there")</literal> will be
840 collected. However, the current implementation does not compare up to
841 alpha-equivalence, so two instances of <literal>(x, view x ->
842 y)</literal> will not be coalesced.
843 </para>
844
845 </listitem>
846
847 </itemizedlist>
848 </para>
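<para>
As a minimal sketch of the matching rule above, the following two definitions
should behave identically; the second spells out the case translation by hand.
The module name and the functions <literal>isYes</literal> and
<literal>isYes'</literal> are illustrative assumptions and do not come from
this guide.
</para>

<programlisting>
{-# LANGUAGE ViewPatterns #-}
module ViewPatternSketch where

import Data.Char (toLower)

-- Apply the view function (map toLower) to the argument, then match
-- the result against the literal pattern "yes".
isYes :: String -> Bool
isYes (map toLower -> "yes") = True
isYes _                      = False

-- Hand-written equivalent, following the case translation given above.
isYes' :: String -> Bool
isYes' v = case map toLower v of { "yes" -> True ; _ -> False }
</programlisting>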
849
850 </sect2>
851
852 <!-- ===================== Pattern synonyms =================== -->
853
854 <sect2 id="pattern-synonyms">
855 <title>Pattern synonyms
856 </title>
857
858 <para>
859 Pattern synonyms are enabled by the flag <literal>-XPatternSynonyms</literal>.
860 More information and examples of pattern synonyms can be found on the
861 <ulink url="http://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms">Wiki
862 page</ulink>.
863 </para>
864
865 <para>
866 Pattern synonyms enable giving names to parametrized pattern
867 schemes. They can also be thought of as abstract constructors that
868 don't have a bearing on data representation. For example, in a
869 programming language implementation, we might represent types of the
870 language as follows:
871 </para>
872
873 <programlisting>
874 data Type = App String [Type]
875 </programlisting>
876
877 <para>
878 Here are some examples of using said representation.
879 Consider a few types of the <literal>Type</literal> universe encoded
880 like this:
881 </para>
882
883 <programlisting>
884 App "->" [t1, t2] -- t1 -> t2
885 App "Int" [] -- Int
886 App "Maybe" [App "Int" []] -- Maybe Int
887 </programlisting>
888
889 <para>
890 This representation is very generic in that no types are given special
891 treatment. However, some functions might need to handle some known
892 types specially. For example, the following two functions collect all
893 argument types of (nested) arrow types, and recognize the
894 <literal>Int</literal> type, respectively:
895 </para>
896
897 <programlisting>
898 collectArgs :: Type -> [Type]
899 collectArgs (App "->" [t1, t2]) = t1 : collectArgs t2
900 collectArgs _ = []
901
902 isInt :: Type -> Bool
903 isInt (App "Int" []) = True
904 isInt _ = False
905 </programlisting>
906
907 <para>
908 Matching on <literal>App</literal> directly is both hard to read and
909 error prone to write. And the situation is even worse when the
910 matching is nested:
911 </para>
912
913 <programlisting>
914 isIntEndo :: Type -> Bool
915 isIntEndo (App "->" [App "Int" [], App "Int" []]) = True
916 isIntEndo _ = False
917 </programlisting>
918
919 <para>
920 Pattern synonyms permit abstracting from the representation to expose
921 matchers that behave in a constructor-like manner with respect to
922 pattern matching. We can create pattern synonyms for the known types
923 we care about, without committing the representation to them (note
924 that these don't have to be defined in the same module as the
925 <literal>Type</literal> type):
926 </para>
927
928 <programlisting>
929 pattern Arrow t1 t2 = App "->" [t1, t2]
930 pattern Int = App "Int" []
931 pattern Maybe t = App "Maybe" [t]
932 </programlisting>
933
934 <para>
935 These synonyms let us rewrite our functions in a much cleaner style:
936 </para>
937
938 <programlisting>
939 collectArgs :: Type -> [Type]
940 collectArgs (Arrow t1 t2) = t1 : collectArgs t2
941 collectArgs _ = []
942
943 isInt :: Type -> Bool
944 isInt Int = True
945 isInt _ = False
946
947 isIntEndo :: Type -> Bool
948 isIntEndo (Arrow Int Int) = True
949 isIntEndo _ = False
950 </programlisting>
951
952 <para>
953 Note that in this example, the pattern synonyms
954 <literal>Int</literal> and <literal>Arrow</literal> can also be used
955 as expressions (they are <emphasis>bidirectional</emphasis>). This
956 is not necessarily the case: <emphasis>unidirectional</emphasis>
957 pattern synonyms can also be declared with the following syntax:
958 </para>
959
960 <programlisting>
961 pattern Head x &lt;- x:xs
962 </programlisting>
963
964 <para>
965 In this case, <literal>Head</literal> <replaceable>x</replaceable>
966 cannot be used in expressions, only patterns, since it wouldn't
967 specify a value for the <replaceable>xs</replaceable> on the
968 right-hand side.
969 </para>
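<para>
The following self-contained sketch puts both kinds of synonym in one module.
It reuses the <literal>Type</literal> representation from above; the module
name and the names <literal>intEndo</literal> and <literal>firstOr</literal>
are hypothetical, and <literal>Head</literal> is written here with a wildcard
for the tail, which should be acceptable because the synonym is
unidirectional.
</para>

<programlisting>
{-# LANGUAGE PatternSynonyms #-}
module PatternSynonymSketch where

data Type = App String [Type]
  deriving (Eq, Show)

-- Bidirectional synonyms: usable in patterns and in expressions.
pattern Arrow t1 t2 = App "->" [t1, t2]
pattern Int = App "Int" []

-- Used as an expression, this builds App "->" [App "Int" [], App "Int" []].
intEndo :: Type
intEndo = Arrow Int Int

-- Unidirectional synonym: usable in patterns only.
pattern Head x &lt;- x:_

-- Return the first element of a list, or a default if the list is empty.
firstOr :: a -> [a] -> a
firstOr _ (Head x) = x
firstOr d _        = d
</programlisting>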
970
971 <para>
972 The syntax and semantics of pattern synonym declarations and their
973 usage are as follows:
974
975 <itemizedlist>
976
977 <listitem> Syntax:
978 <para>
979 A pattern synonym declaration can be either unidirectional or
980 bidirectional. The syntax for unidirectional pattern synonyms is:
981 </para>
982 <programlisting>
983 pattern Name args &lt;- pat
984 </programlisting>
985 <para>
986 and the syntax for bidirectional pattern synonyms is:
987 </para>
988 <programlisting>
989 pattern Name args = pat
990 </programlisting>
991 <para>
992 Pattern synonym declarations can only occur in the top level of a
993 module. In particular, they are not allowed as local
994 definitions. Currently, they also don't work in GHCi, but that is a
995 technical restriction that will be lifted in later versions.
996 </para>
997 <para>
998 The name of the pattern synonym itself is in the same namespace as
999 proper data constructors. Either prefix or infix syntax can be
1000 used. In export/import specifications, you have to prefix pattern
1001 names with the <literal>pattern</literal> keyword, e.g.:
1002 </para>
1003 <programlisting>
1004 module Example (pattern Single) where
1005 pattern Single x = [x]
1006 </programlisting>
1007 </listitem>
1008
1009 <listitem> Scoping:
1010
1011 <para>
1012 The variables in the left-hand side of the definition are bound by
1013 the pattern on the right-hand side. For bidirectional pattern
1014 synonyms, all the variables of the right-hand side must also occur
1015 on the left-hand side; also, wildcard patterns and view patterns are
1016 not allowed. For unidirectional pattern synonyms, there is no
1017 restriction on the right-hand side pattern.
1018 </para>
1019
1020 <para>
1021 Pattern synonyms cannot be defined recursively.
1022 </para>
1023
1024 </listitem>
1025
1026 <listitem> Typing:
1027
1028 <para>
1029 Given a pattern synonym definition of the form
1030 </para>
1031 <programlisting>
1032 pattern P var1 var2 ... varN &lt;- pat
1033 </programlisting>
1034 <para>
1035 it is assigned a <emphasis>pattern type</emphasis> of the form
1036 </para>
1037 <programlisting>
1038 pattern CProv => P t1 t2 ... tN :: CReq => t
1039 </programlisting>
1040 <para>
1041 where <replaceable>CProv</replaceable> and
1042 <replaceable>CReq</replaceable> are type contexts, and
1043 <replaceable>t1</replaceable>, <replaceable>t2</replaceable>, ...,
1044 <replaceable>tN</replaceable> and <replaceable>t</replaceable> are
1045 types.
1046 </para>
1047
1048 <para>
1049 A pattern synonym of this type can be used in a pattern if the
1050 instantiated (monomorphic) type satisfies the constraints of
1051 <replaceable>CReq</replaceable>. In this case, it extends the context
1052 available in the right-hand side of the match with
1053 <replaceable>CProv</replaceable>, just like how an existentially-typed
1054 data constructor can extend the context.
1055 </para>
1056
1057 <para>
1058 For example, in the following program:
1059 </para>
1060 <programlisting>
1061 {-# LANGUAGE PatternSynonyms, GADTs #-}
1062 module ShouldCompile where
1063
1064 data T a where
1065 MkT :: (Show b) => a -> b -> T a
1066
1067 pattern ExNumPat x = MkT 42 x
1068 </programlisting>
1069
1070 <para>
1071 the pattern type of <literal>ExNumPat</literal> is
1072 </para>
1073
1074 <programlisting>
1075 pattern (Show b) => ExNumPat b :: (Num a, Eq a) => T a
1076 </programlisting>
1077
1078 <para>
1079 and so can be used in a function definition like the following:
1080 </para>
1081
1082 <programlisting>
1083 f :: (Num t, Eq t) => T t -> String
1084 f (ExNumPat x) = show x
1085 </programlisting>
1086
1087 <para>
1088 For bidirectional pattern synonyms, uses as expressions have the type
1089 </para>
1090 <programlisting>
1091 (CProv, CReq) => t1 -> t2 -> ... -> tN -> t
1092 </programlisting>
1093
1094 <para>
1095 So in the previous example, <literal>ExNumPat</literal>,
1096 when used in an expression, has type
1097 </para>
1098 <programlisting>
1099 ExNumPat :: (Show b, Num a, Eq a) => b -> T a
1100 </programlisting>
1101
1102 </listitem>
1103
1104 <listitem> Matching:
1105
1106 <para>
1107 A pattern synonym occurrence in a pattern is evaluated by first
1108 matching against the pattern synonym itself, and then on the argument
1109 patterns. For example, in the following program, <literal>f</literal>
1110 and <literal>f'</literal> are equivalent:
1111 </para>
1112
1113 <programlisting>
1114 pattern Pair x y &lt;- [x, y]
1115
1116 f (Pair True True) = True
1117 f _ = False
1118
1119 f' [x, y] | True &lt;- x, True &lt;- y = True
1120 f' _ = False
1121 </programlisting>
1122
1123 <para>
1124 Note that the strictness of <literal>f</literal> differs from that
1125 of <literal>g</literal> defined below:
1126 </para>
1127
1128 <programlisting>
1129 g [True, True] = True
1130 g _ = False
1131
1132 *Main> f (False:undefined)
1133 *** Exception: Prelude.undefined
1134 *Main> g (False:undefined)
1135 False
1136 </programlisting>
1137 </listitem>
1138 </itemizedlist>
1139 </para>
1140
1141 </sect2>
1142
1143 <!-- ===================== n+k patterns =================== -->
1144
1145 <sect2 id="n-k-patterns">
1146 <title>n+k patterns</title>
1147 <indexterm><primary><option>-XNPlusKPatterns</option></primary></indexterm>
1148
1149 <para>
1150 <literal>n+k</literal> pattern support is disabled by default. To enable
1151 it, you can use the <option>-XNPlusKPatterns</option> flag.
1152 </para>
1153
1154 </sect2>
1155
1156 <!-- ===================== Traditional record syntax =================== -->
1157
1158 <sect2 id="traditional-record-syntax">
1159 <title>Traditional record syntax</title>
1160 <indexterm><primary><option>-XNoTraditionalRecordSyntax</option></primary></indexterm>
1161
1162 <para>
1163 Traditional record syntax, such as <literal>C {f = x}</literal>, is enabled by default.
1164 To disable it, you can use the <option>-XNoTraditionalRecordSyntax</option> flag.
1165 </para>
1166
1167 </sect2>
1168
1169 <!-- ===================== Recursive do-notation =================== -->
1170
1171 <sect2 id="recursive-do-notation">
1172 <title>The recursive do-notation
1173 </title>
1174
1175 <para>
1176 The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>,
1177 that is, the variables bound in a do-expression are visible only in the textually following
1178 code block. Compare this to a let-expression, where bound variables are visible in the entire binding
1179 group.
1180 </para>
1181
1182 <para>
1183 It turns out that such recursive bindings do indeed make sense for a variety of monads, but
1184 not all. In particular, recursion in this sense requires a fixed-point operator for the underlying
1185 monad, captured by the <literal>mfix</literal> method of the <literal>MonadFix</literal> class, defined in <literal>Control.Monad.Fix</literal> as follows:
1186 <programlisting>
1187 class Monad m => MonadFix m where
1188 mfix :: (a -> m a) -> m a
1189 </programlisting>
1190 Haskell's
1191 <literal>Maybe</literal>, <literal>[]</literal> (list), <literal>ST</literal> (both strict and lazy versions),
1192 <literal>IO</literal>, and many other monads have <literal>MonadFix</literal> instances. On the negative
1193 side, the continuation monad, with the signature <literal>(a -> r) -> r</literal>, does not.
1194 </para>
1195
1196 <para>
1197 For monads that do belong to the <literal>MonadFix</literal> class, GHC provides
1198 an extended version of the do-notation that allows recursive bindings.
1199 The <option>-XRecursiveDo</option> flag (language pragma: <literal>RecursiveDo</literal>)
1200 provides the necessary syntactic support, introducing the keywords <literal>mdo</literal> and
1201 <literal>rec</literal> for higher and lower levels of the notation respectively. Unlike
1202 bindings in a <literal>do</literal> expression, those introduced by <literal>mdo</literal> and <literal>rec</literal>
1203 are recursively defined, much like in an ordinary let-expression. Due to the new
1204 keyword <literal>mdo</literal>, we also call this notation the <emphasis>mdo-notation</emphasis>.
1205 </para>
1206
1207 <para>
1208 Here is a simple (albeit contrived) example:
1209 <programlisting>
1210 {-# LANGUAGE RecursiveDo #-}
1211 justOnes = mdo { xs &lt;- Just (1:xs)
1212 ; return (map negate xs) }
1213 </programlisting>
1214 or equivalently
1215 <programlisting>
1216 {-# LANGUAGE RecursiveDo #-}
1217 justOnes = do { rec { xs &lt;- Just (1:xs) }
1218 ; return (map negate xs) }
1219 </programlisting>
1220 As you can guess, <literal>justOnes</literal> will evaluate to <literal>Just [-1,-1,-1,...</literal>.
1221 </para>
1222
1223 <para>
1224 GHC's implementation of the mdo-notation closely follows the original translation as described in the paper
1225 <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for Haskell</ulink>, which
1226 in turn is based on the work <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion
1227 in Monadic Computations</ulink>. Furthermore, GHC extends the syntax described in the former paper
1228 with a lower level syntax flagged by the <literal>rec</literal> keyword, as we describe next.
1229 </para>
1230
1231 <sect3>
1232 <title>Recursive binding groups</title>
1233
1234 <para>
1235 The flag <option>-XRecursiveDo</option> also introduces a new keyword <literal>rec</literal>, which wraps a
1236 mutually-recursive group of monadic statements inside a <literal>do</literal> expression, producing a single statement.
1237 Similar to a <literal>let</literal> statement inside a <literal>do</literal>, variables bound in
1238 the <literal>rec</literal> are visible throughout the <literal>rec</literal> group, and below it. For example, compare
1239 <programlisting>
1240 do { a &lt;- getChar do { a &lt;- getChar
1241 ; let { r1 = f a r2 ; rec { r1 &lt;- f a r2
1242 ; ; r2 = g r1 } ; ; r2 &lt;- g r1 }
1243 ; return (r1 ++ r2) } ; return (r1 ++ r2) }
1244 </programlisting>
1245 In both cases, <literal>r1</literal> and <literal>r2</literal> are available both throughout
1246 the <literal>let</literal> or <literal>rec</literal> block, and in the statements that follow it.
1247 The difference is that <literal>let</literal> is non-monadic, while <literal>rec</literal> is monadic.
1248 (In Haskell <literal>let</literal> is really <literal>letrec</literal>, of course.)
1249 </para>
1250
1251 <para>
1252 The semantics of <literal>rec</literal> is fairly straightforward. Whenever GHC finds a <literal>rec</literal>
1253 group, it will compute its set of bound variables, and will introduce an appropriate call
1254 to the underlying monadic value-recursion operator <literal>mfix</literal>, belonging to the
1255 <literal>MonadFix</literal> class. Here is an example:
1256 <programlisting>
1257 rec { b &lt;- f a c ===> (b,c) &lt;- mfix (\ ~(b,c) -> do { b &lt;- f a c
1258 ; c &lt;- f b a } ; c &lt;- f b a
1259 ; return (b,c) })
1260 </programlisting>
1261 As usual, the meta-variables <literal>b</literal>, <literal>c</literal> etc., can be arbitrary patterns.
1262 In general, the statement <literal>rec <replaceable>ss</replaceable></literal> is desugared to the statement
1263 <programlisting>
1264 <replaceable>vs</replaceable> &lt;- mfix (\ ~<replaceable>vs</replaceable> -&gt; do { <replaceable>ss</replaceable>; return <replaceable>vs</replaceable> })
1265 </programlisting>
1266 where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>.
1267 </para>
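<para>
To make the desugaring concrete, here is a small sketch in the
<literal>Maybe</literal> monad that writes the same computation once with
<literal>rec</literal> and once with an explicit call to
<literal>mfix</literal>. The module name and the names
<literal>onesRec</literal> and <literal>onesFix</literal> are illustrative
only. Since a single variable is bound, the tuple
<replaceable>vs</replaceable> is just that variable.
</para>

<programlisting>
{-# LANGUAGE RecursiveDo #-}
module RecSketch where

import Control.Monad.Fix (mfix)

-- A rec block: xs is in scope inside its own binding.
onesRec :: Maybe [Int]
onesRec = do { rec { xs &lt;- Just (1 : xs) }
             ; return (take 3 xs) }

-- The same computation with the mfix call written out by hand.
-- The inner xs shadows the lambda-bound xs, as in the general scheme.
onesFix :: Maybe [Int]
onesFix = do { xs &lt;- mfix (\ ~xs -> do { xs &lt;- Just (1 : xs); return xs })
             ; return (take 3 xs) }
</programlisting>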
1268
1269 <para>
1270 Note in particular that the translation for a <literal>rec</literal> block only involves wrapping a call
1271 to <literal>mfix</literal>: it performs no other analysis on the bindings. The latter is the task
1272 for the <literal>mdo</literal> notation, which is described next.
1273 </para>
1274 </sect3>
1275
1276 <sect3>
1277 <title>The <literal>mdo</literal> notation</title>
1278
1279 <para>
1280 A <literal>rec</literal>-block tells the compiler where precisely the recursive knot should be tied. It turns out that
1281 the placement of the recursive knots can be rather delicate: in particular, we would like the knots to be wrapped
1282 around as minimal groups as possible. This process is known as <emphasis>segmentation</emphasis>, and is described
1283 in detail in Section 3.2 of <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for
1284 Haskell</ulink>. Segmentation improves polymorphism and reduces the size of the recursive knot. Most importantly, it avoids
1285 unnecessary interference caused by a fundamental issue with the so-called <emphasis>right-shrinking</emphasis>
1286 axiom for monadic recursion. In brief, most monads of interest (IO, strict state, etc.) do <emphasis>not</emphasis>
1287 have recursion operators that satisfy this axiom, and thus not performing segmentation can cause unnecessary
1288 interference, changing the termination behavior of the resulting translation.
1289 (Details can be found in Sections 3.1 and 7.2.2 of
1290 <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.)
1291 </para>
1292
1293 <para>
1294 The <literal>mdo</literal> notation removes the burden of placing
1295 explicit <literal>rec</literal> blocks in the code. Unlike an
1296 ordinary <literal>do</literal> expression, in which variables bound by
1297 statements are only in scope for later statements, variables bound in
1298 an <literal>mdo</literal> expression are in scope for all statements
1299 of the expression. The compiler then automatically identifies minimal
1300 mutually recursively dependent segments of statements, treating them as
1301 if the user had wrapped a <literal>rec</literal> qualifier around them.
1302 </para>
1303
1304 <para>
1305 The definition is syntactic:
1306 </para>
1307 <itemizedlist>
1308 <listitem>
1309 <para>
1310 A generator <replaceable>g</replaceable>
1311 <emphasis>depends</emphasis> on a textually following generator
1312 <replaceable>g'</replaceable>, if
1313 </para>
1314 <itemizedlist>
1315 <listitem>
1316 <para>
1317 <replaceable>g'</replaceable> defines a variable that
1318 is used by <replaceable>g</replaceable>, or
1319 </para>
1320 </listitem>
1321 <listitem>
1322 <para>
1323 <replaceable>g'</replaceable> textually appears between
1324 <replaceable>g</replaceable> and
1325 <replaceable>g''</replaceable>, where <replaceable>g</replaceable>
1326 depends on <replaceable>g''</replaceable>.
1327 </para>
1328 </listitem>
1329 </itemizedlist>
1330 </listitem>
1331 <listitem>
1332 <para>
1333 A <emphasis>segment</emphasis> of a given
1334 <literal>mdo</literal>-expression is a minimal sequence of generators
1335 such that no generator of the sequence depends on an outside
1336 generator. As a special case, although it is not a generator,
1337 the final expression in an <literal>mdo</literal>-expression is
1338 considered to form a segment by itself.
1339 </para>
1340 </listitem>
1341 </itemizedlist>
1342 <para>
1343 Segments in this sense are
1344 related to <emphasis>strongly-connected components</emphasis> analysis,
1345 with the exception that bindings in a segment cannot be reordered and
1346 must be contiguous.
1347 </para>
1348
1349 <para>
1350 Here is an example <literal>mdo</literal>-expression, and its translation to <literal>rec</literal> blocks:
1351 <programlisting>
1352 mdo { a &lt;- getChar ===> do { a &lt;- getChar
1353 ; b &lt;- f a c ; rec { b &lt;- f a c
1354 ; c &lt;- f b a ; ; c &lt;- f b a }
1355 ; z &lt;- h a b ; z &lt;- h a b
1356 ; d &lt;- g d e ; rec { d &lt;- g d e
1357 ; e &lt;- g a z ; ; e &lt;- g a z }
1358 ; putChar c } ; putChar c }
1359 </programlisting>
1360 Note that a given <literal>mdo</literal> expression can cause the creation of multiple <literal>rec</literal> blocks.
1361 If there are no recursive dependencies, <literal>mdo</literal> will introduce no <literal>rec</literal> blocks. In this
1362 latter case an <literal>mdo</literal> expression is precisely the same as a <literal>do</literal> expression, as one
1363 would expect.
1364 </para>
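<para>
A runnable sketch of the same idea, again in the <literal>Maybe</literal>
monad: the binding for <literal>n</literal> has no recursive dependency,
while <literal>evens</literal> and <literal>odds</literal> depend on each
other and so should end up in one segment. The second definition writes that
segment as an explicit <literal>rec</literal> block. All names here are
hypothetical.
</para>

<programlisting>
{-# LANGUAGE RecursiveDo #-}
module MdoSketch where

-- mdo: GHC is expected to wrap evens and odds in a single rec group.
evensOdds :: Maybe ([Int], [Int])
evensOdds = mdo
  n     &lt;- Just 3
  evens &lt;- Just (0 : map (+ 1) odds)
  odds  &lt;- Just (map (+ 1) evens)
  return (take n evens, take n odds)

-- The same computation with the recursive segment marked by hand.
evensOdds' :: Maybe ([Int], [Int])
evensOdds' = do
  n &lt;- Just 3
  rec evens &lt;- Just (0 : map (+ 1) odds)
      odds  &lt;- Just (map (+ 1) evens)
  return (take n evens, take n odds)
</programlisting>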
1365
1366 <para>
1367 In summary, given an <literal>mdo</literal> expression, GHC first performs segmentation, introducing
1368 <literal>rec</literal> blocks to wrap over minimal recursive groups. Then, each resulting
1369 <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal> as described
1370 in the previous section. The original <literal>mdo</literal>-expression typechecks exactly when the desugared
1371 version would do so.
1372 </para>
1373
1374 <para>
1375 Here are some other important points in using the recursive-do notation:
1376
1377 <itemizedlist>
1378 <listitem>
1379 <para>
1380 It is enabled with the flag <literal>-XRecursiveDo</literal>, or the <literal>LANGUAGE RecursiveDo</literal>
1381 pragma. (The same flag enables both <literal>mdo</literal>-notation, and the use of <literal>rec</literal>
1382 blocks inside <literal>do</literal> expressions.)
1383 </para>
1384 </listitem>
1385 <listitem>
1386 <para>
1387 <literal>rec</literal> blocks can also be used inside <literal>mdo</literal>-expressions, which will be
1388 treated as a single statement. However, it is good style to use either <literal>mdo</literal> or
1389 <literal>rec</literal> blocks, but not both, in a single expression.
1390 </para>
1391 </listitem>
1392 <listitem>
1393 <para>
1394 If recursive bindings are required for a monad, then that monad must be declared an instance of
1395 the <literal>MonadFix</literal> class.
1396 </para>
1397 </listitem>
1398 <listitem>
1399 <para>
1400 The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
1401 Furthermore, the <literal>Control.Monad.ST</literal> and <literal>Control.Monad.ST.Lazy</literal>
1402 modules provide the instances of the <literal>MonadFix</literal> class for Haskell's internal
1403 state monad (strict and lazy, respectively).
1404 </para>
1405 </listitem>
1406 <listitem>
1407 <para>
1408 Like <literal>let</literal> and <literal>where</literal> bindings, name shadowing is not allowed within
1409 an <literal>mdo</literal>-expression or a <literal>rec</literal>-block; that is, all the names bound in
1410 a single <literal>rec</literal> must be distinct. (GHC will complain if this is not the case.)
1411 </para>
1412 </listitem>
1413 </itemizedlist>
1414 </para>
1415 </sect3>
1416
1417
1418 </sect2>
1419
1420
1421 <!-- ===================== PARALLEL LIST COMPREHENSIONS =================== -->
1422
1423 <sect2 id="parallel-list-comprehensions">
1424 <title>Parallel List Comprehensions</title>
1425 <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
1426 </indexterm>
1427 <indexterm><primary>parallel list comprehensions</primary>
1428 </indexterm>
1429
1430 <para>Parallel list comprehensions are a natural extension to list
1431 comprehensions. List comprehensions can be thought of as a nice
1432 syntax for writing maps and filters. Parallel comprehensions
1433 extend this to include the zipWith family.</para>
1434
1435 <para>A parallel list comprehension has multiple independent
1436 branches of qualifier lists, each separated by a `|' symbol. For
1437 example, the following zips together two lists:</para>
1438
1439 <programlisting>
1440 [ (x, y) | x &lt;- xs | y &lt;- ys ]
1441 </programlisting>
1442
1443 <para>The behaviour of parallel list comprehensions follows that of
1444 zip, in that the resulting list will have the same length as the
1445 shortest branch.</para>
1446
1447 <para>We can define parallel list comprehensions by translation to
1448 regular comprehensions. Here's the basic idea:</para>
1449
1450 <para>Given a parallel comprehension of the form: </para>
1451
1452 <programlisting>
1453 [ e | p1 &lt;- e11, p2 &lt;- e12, ...
1454 | q1 &lt;- e21, q2 &lt;- e22, ...
1455 ...
1456 ]
1457 </programlisting>
1458
1459 <para>This will be translated to: </para>
1460
1461 <programlisting>
1462 [ e | ((p1,p2), (q1,q2), ...) &lt;- zipN [(p1,p2) | p1 &lt;- e11, p2 &lt;- e12, ...]
1463 [(q1,q2) | q1 &lt;- e21, q2 &lt;- e22, ...]
1464 ...
1465 ]
1466 </programlisting>
1467
1468 <para>where `zipN' is the appropriate zip for the given number of
1469 branches.</para>
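<para>
For the two-branch case the translation can be checked directly. In this
sketch (the module and binding names are illustrative) the parallel
comprehension and its hand-written <literal>zip</literal> form should
produce the same list, truncated to the length of the shorter branch:
</para>

<programlisting>
{-# LANGUAGE ParallelListComp #-}
module ParallelCompSketch where

-- Two branches separated by '|' are zipped together.
pairsPar :: [(Int, Char)]
pairsPar = [ (x, y) | x &lt;- [1, 2, 3] | y &lt;- "ab" ]

-- The same list written as an ordinary comprehension over zip.
pairsZip :: [(Int, Char)]
pairsZip = [ (x, y) | (x, y) &lt;- zip [1, 2, 3] "ab" ]
</programlisting>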
1470
1471 </sect2>
1472
1473 <!-- ===================== TRANSFORM LIST COMPREHENSIONS =================== -->
1474
1475 <sect2 id="generalised-list-comprehensions">
1476 <title>Generalised (SQL-Like) List Comprehensions</title>
1477 <indexterm><primary>list comprehensions</primary><secondary>generalised</secondary>
1478 </indexterm>
1479 <indexterm><primary>extended list comprehensions</primary>
1480 </indexterm>
1481 <indexterm><primary>group</primary></indexterm>
1482 <indexterm><primary>sql</primary></indexterm>
1483
1484
1485 <para>Generalised list comprehensions are a further enhancement to the
1486 list comprehension syntactic sugar to allow operations such as sorting
1487 and grouping which are familiar from SQL. They are fully described in the
1488 paper <ulink url="http://research.microsoft.com/~simonpj/papers/list-comp">
1489 Comprehensive comprehensions: comprehensions with "order by" and "group by"</ulink>,
1490 except that the syntax we use differs slightly from the paper.</para>
1491 <para>The extension is enabled with the flag <option>-XTransformListComp</option>.</para>
1492 <para>Here is an example:
1493 <programlisting>
1494 employees = [ ("Simon", "MS", 80)
1495 , ("Erik", "MS", 100)
1496 , ("Phil", "Ed", 40)
1497 , ("Gordon", "Ed", 45)
1498 , ("Paul", "Yale", 60)]
1499
1500 output = [ (the dept, sum salary)
1501 | (name, dept, salary) &lt;- employees
1502 , then group by dept using groupWith
1503 , then sortWith by (sum salary)
1504 , then take 5 ]
1505 </programlisting>
1506 In this example, the list <literal>output</literal> would take on
1507 the value:
1508
1509 <programlisting>
1510 [("Yale", 60), ("Ed", 85), ("MS", 180)]
1511 </programlisting>
1512 </para>
1513 <para>There are three new keywords: <literal>group</literal>, <literal>by</literal>, and <literal>using</literal>.
1514 (The functions <literal>sortWith</literal> and <literal>groupWith</literal> are not keywords; they are ordinary
1515 functions that are exported by <literal>GHC.Exts</literal>.)</para>
1516
1517 <para>There are four new forms of comprehension qualifier,
1518 all introduced by the (existing) keyword <literal>then</literal>:
1519 <itemizedlist>
1520 <listitem>
1521
1522 <programlisting>
1523 then f
1524 </programlisting>
1525
1526 <para>This statement requires that <literal>f</literal> have the type
1527 <literal>forall a. [a] -> [a]</literal>. You can see an example of its use in the
1528 motivating example, as this form is used to apply <literal>take 5</literal>.</para>
1529
1530 </listitem>
1531
1532
1533 <listitem>
1534 <para>
1535 <programlisting>
1536 then f by e
1537 </programlisting>
1538
1539 This form is similar to the previous one, but allows you to create a function
1540 which will be passed as the first argument to f. As a consequence f must have
1541 the type <literal>forall a. (a -> t) -> [a] -> [a]</literal>. As you can see
1542 from the type, this function lets f &quot;project out&quot; some information
1543 from the elements of the list it is transforming.</para>
1544
1545 <para>An example is shown in the opening example, where <literal>sortWith</literal>
1546 is supplied with a function that lets it find out the <literal>sum salary</literal>
1547 for any item in the list comprehension it transforms.</para>
1548
1549 </listitem>
1550
1551
1552 <listitem>
1553
1554 <programlisting>
1555 then group by e using f
1556 </programlisting>
1557
1558 <para>This is the most general of the grouping-type statements. In this form,
1559 f is required to have type <literal>forall a. (a -> t) -> [a] -> [[a]]</literal>.
1560 As with the <literal>then f by e</literal> case above, the first argument
1561 is a function supplied to f by the compiler which lets it compute e on every
1562 element of the list being transformed. However, unlike the non-grouping case,
1563 f additionally partitions the list into a number of sublists: this means that
1564 at every point after this statement, binders occurring before it in the comprehension
1565 refer to <emphasis>lists</emphasis> of possible values, not single values. To help understand
1566 this, let's look at an example:</para>
1567
1568 <programlisting>
1569 -- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first
1570 groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
1571 groupRuns f = groupBy (\x y -> f x == f y)
1572
1573 output = [ (the x, y)
1574 | x &lt;- ([1..3] ++ [1..2])
1575 , y &lt;- [4..6]
1576 , then group by x using groupRuns ]
1577 </programlisting>
1578
1579 <para>This results in the variable <literal>output</literal> taking on the value below:</para>
1580
1581 <programlisting>
1582 [(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])]
1583 </programlisting>
1584
1585 <para>Note that we have used the <literal>the</literal> function to change the type
1586 of x from a list to its original numeric type. The variable y, in contrast, is left
1587 unchanged from the list form introduced by the grouping.</para>
1588
1589 </listitem>
1590
1591 <listitem>
1592
1593 <programlisting>
1594 then group using f
1595 </programlisting>
1596
1597 <para>With this form of the group statement, f is required to simply have the type
1598 <literal>forall a. [a] -> [[a]]</literal>, which will be used to group up the
1599 comprehension so far directly. An example of this form is as follows:</para>
1600
1601 <programlisting>
1602 output = [ x
1603 | y &lt;- [1..5]
1604 , x &lt;- "hello"
1605 , then group using inits]
1606 </programlisting>
1607
1608 <para>This will yield every prefix of the string formed by writing the word "hello" out 5 times:</para>
1609
1610 <programlisting>
1611 ["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...]
1612 </programlisting>
1613
1614 </listitem>
1615 </itemizedlist>
1616 </para>
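<para>
The non-grouping forms can be tried in isolation. The following is a small
self-contained sketch, with a hypothetical module name and binding, that
combines <literal>then f by e</literal> and <literal>then f</literal>; it
imports <literal>sortWith</literal> from <literal>GHC.Exts</literal> as
described above.
</para>

<programlisting>
{-# LANGUAGE TransformListComp #-}
module TransformSketch where

import GHC.Exts (sortWith)

-- Sort the words by length, then keep the two shortest.
shortestTwo :: [String]
shortestTwo = [ w | w &lt;- ["pear", "fig", "banana"]
                  , then sortWith by length w
                  , then take 2 ]
</programlisting>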
1617 </sect2>
1618
1619 <!-- ===================== MONAD COMPREHENSIONS ===================== -->
1620
1621 <sect2 id="monad-comprehensions">
1622 <title>Monad comprehensions</title>
1623 <indexterm><primary>monad comprehensions</primary></indexterm>
1624
1625 <para>
1626 Monad comprehensions generalise the list comprehension notation,
1627 including parallel comprehensions
1628 (<xref linkend="parallel-list-comprehensions"/>) and
1629 transform comprehensions (<xref linkend="generalised-list-comprehensions"/>)
1630 to work for any monad.
1631 </para>
1632
1633 <para>Monad comprehensions support:</para>
1634
1635 <itemizedlist>
1636 <listitem>
1637 <para>
1638 Bindings:
1639 </para>
1640
1641 <programlisting>
1642 [ x + y | x &lt;- Just 1, y &lt;- Just 2 ]
1643 </programlisting>
1644
1645 <para>
1646 Bindings are translated with the <literal>(&gt;&gt;=)</literal> and
1647 <literal>return</literal> functions to the usual do-notation:
1648 </para>
1649
1650 <programlisting>
1651 do x &lt;- Just 1
1652 y &lt;- Just 2
1653 return (x+y)
1654 </programlisting>
1655
1656 </listitem>
1657 <listitem>
1658 <para>
1659 Guards:
1660 </para>
1661
1662 <programlisting>
1663 [ x | x &lt;- [1..10], x &lt;= 5 ]
1664 </programlisting>
1665
1666 <para>
1667 Guards are translated with the <literal>guard</literal> function,
1668 which requires a <literal>MonadPlus</literal> instance:
1669 </para>
1670
1671 <programlisting>
1672 do x &lt;- [1..10]
1673 guard (x &lt;= 5)
1674 return x
1675 </programlisting>
1676
1677 </listitem>
1678 <listitem>
1679 <para>
1680 Transform statements (as with <literal>-XTransformListComp</literal>):
1681 </para>
1682
1683 <programlisting>
1684 [ x+y | x &lt;- [1..10], y &lt;- [1..x], then take 2 ]
1685 </programlisting>
1686
1687 <para>
1688 This translates to:
1689 </para>
1690
1691 <programlisting>
1692 do (x,y) &lt;- take 2 (do x &lt;- [1..10]
1693 y &lt;- [1..x]
1694 return (x,y))
1695 return (x+y)
1696 </programlisting>
1697
1698 </listitem>
1699 <listitem>
1700 <para>
1701 Group statements (as with <literal>-XTransformListComp</literal>):
1702 </para>
1703
1704 <programlisting>
1705 [ x | x &lt;- [1,1,2,2,3], then group by x using GHC.Exts.groupWith ]
1706 [ x | x &lt;- [1,1,2,2,3], then group using myGroup ]
1707 </programlisting>
1708
1709 </listitem>
1710 <listitem>
1711 <para>
1712 Parallel statements (as with <literal>-XParallelListComp</literal>):
1713 </para>
1714
1715 <programlisting>
1716 [ (x+y) | x &lt;- [1..10]
1717 | y &lt;- [11..20]
1718 ]
1719 </programlisting>
1720
1721 <para>
1722 Parallel statements are translated using the
1723 <literal>mzip</literal> function, which requires a
1724 <literal>MonadZip</literal> instance defined in
1725 <ulink url="&libraryBaseLocation;/Control-Monad-Zip.html"><literal>Control.Monad.Zip</literal></ulink>:
1726 </para>
1727
1728 <programlisting>
1729 do (x,y) &lt;- mzip (do x &lt;- [1..10]
1730 return x)
1731 (do y &lt;- [11..20]
1732 return y)
1733 return (x+y)
1734 </programlisting>
1735
1736 </listitem>
1737 </itemizedlist>
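
<para>
As a small illustration of a comprehension over a monad other than lists,
the following sketch uses <literal>Maybe</literal>. The names
<literal>safeDiv</literal> and <literal>sumQuot</literal> are made up for
this example; the guard relies on the <literal>MonadPlus</literal>
instance that <literal>base</literal> provides for <literal>Maybe</literal>:
</para>

<programlisting>
{-# LANGUAGE MonadComprehensions #-}
module Main where

-- Hypothetical helper: a guard in a Maybe comprehension yields Nothing
-- when the condition fails.
safeDiv :: Int -> Int -> Maybe Int
safeDiv x y = [ x `div` y | y /= 0 ]

-- Bindings in the Maybe monad; any Nothing short-circuits the result.
sumQuot :: Maybe Int
sumQuot = [ a + b | a &lt;- safeDiv 10 2, b &lt;- safeDiv 9 3 ]

main :: IO ()
main = do
  print sumQuot        -- Just 8
  print (safeDiv 1 0)  -- Nothing
</programlisting>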
1738
1739 <para>
1740 All these features are enabled by default if the
1741 <literal>MonadComprehensions</literal> extension is enabled. The types
1742 and more detailed examples on how to use comprehensions are explained
1743 in the previous sections <xref
1744 linkend="generalised-list-comprehensions"/> and <xref
1745 linkend="parallel-list-comprehensions"/>. In general you just have
1746 to replace the type <literal>[a]</literal> with the type
1747 <literal>Monad m => m a</literal> for monad comprehensions.
1748 </para>
1749
1750 <para>
1751 Note: Even though most of these examples are using the list monad,
1752 monad comprehensions work for any monad.
1753 The <literal>base</literal> package offers all necessary instances for
1754 lists, which make <literal>MonadComprehensions</literal> backward
1755 compatible with built-in, transform and parallel list comprehensions.
1756 </para>
1757 <para> More formally, the desugaring is as follows. We write <literal>D[ e | Q]</literal>
1758 to mean the desugaring of the monad comprehension <literal>[ e | Q]</literal>:
1759 <programlisting>
1760 Expressions: e
1761 Declarations: d
1762 Lists of qualifiers: Q,R,S
1763
1764 -- Basic forms
1765 D[ e | ] = return e
1766 D[ e | p &lt;- e, Q ] = e &gt;&gt;= \p -&gt; D[ e | Q ]
1767 D[ e | e, Q ] = guard e &gt;&gt; D[ e | Q ]
1768 D[ e | let d, Q ] = let d in D[ e | Q ]
1769
1770 -- Parallel comprehensions (iterate for multiple parallel branches)
1771 D[ e | (Q | R), S ] = mzip D[ Qv | Q ] D[ Rv | R ] &gt;&gt;= \(Qv,Rv) -&gt; D[ e | S ]
1772
1773 -- Transform comprehensions
1774 D[ e | Q then f, R ] = f D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]
1775
1776 D[ e | Q then f by b, R ] = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]
1777
1778 D[ e | Q then group using f, R ] = f D[ Qv | Q ] &gt;&gt;= \ys -&gt;
1779 case (fmap selQv1 ys, ..., fmap selQvn ys) of
1780 Qv -&gt; D[ e | R ]
1781
1782 D[ e | Q then group by b using f, R ] = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \ys -&gt;
1783 case (fmap selQv1 ys, ..., fmap selQvn ys) of
1784 Qv -&gt; D[ e | R ]
1785
1786 where Qv is the tuple of variables bound by Q (and used subsequently)
1787 selQvi is a selector mapping Qv to the ith component of Qv
1788
1789 Operator   Standard binding    Expected type
1790 --------------------------------------------------------------------
1791 return     GHC.Base            t1 -&gt; m t2
1792 (&gt;&gt;=)      GHC.Base            m1 t1 -&gt; (t2 -&gt; m2 t3) -&gt; m3 t3
1793 (&gt;&gt;)       GHC.Base            m1 t1 -&gt; m2 t2 -&gt; m3 t3
1794 guard      Control.Monad       t1 -&gt; m t2
1795 fmap       GHC.Base            forall a b. (a-&gt;b) -&gt; n a -&gt; n b
1796 mzip       Control.Monad.Zip   forall a b. m a -&gt; m b -&gt; m (a,b)
1797 </programlisting>
1798 The comprehension should typecheck when its desugaring would typecheck.
1799 </para>
1800 <para>
1801 Monad comprehensions support rebindable syntax (<xref linkend="rebindable-syntax"/>).
1802 Without rebindable
1803 syntax, the operators from the "standard binding" module are used; with
1804 rebindable syntax, the operators are looked up in the current lexical scope.
1805 For example, parallel comprehensions will be typechecked and desugared
1806 using whatever "<literal>mzip</literal>" is in scope.
1807 </para>
1808 <para>
1809 The rebindable operators must have the "Expected type" given in the
1810 table above. These types are surprisingly general. For example, you can
1811 use a bind operator with the type
1812 <programlisting>
1813 (>>=) :: T x y a -> (a -> T y z b) -> T x z b
1814 </programlisting>
1815 In the case of transform comprehensions, notice that the groups are
1816 parameterised over some arbitrary type <literal>n</literal> (provided it
1817 has an <literal>fmap</literal>), as well as
1818 the comprehension being over an arbitrary monad.
1819 </para>
1820 </sect2>
1821
1822 <!-- ===================== REBINDABLE SYNTAX =================== -->
1823
1824 <sect2 id="rebindable-syntax">
1825 <title>Rebindable syntax and the implicit Prelude import</title>
1826
1827 <para><indexterm><primary>-XNoImplicitPrelude
1828 option</primary></indexterm> GHC normally imports
1829 <filename>Prelude.hi</filename> files for you. If you'd
1830 rather it didn't, then give it a
1831 <option>-XNoImplicitPrelude</option> option. The idea is
1832 that you can then import a Prelude of your own. (But don't
1833 call it <literal>Prelude</literal>; the Haskell module
1834 namespace is flat, and you must not conflict with any
1835 Prelude module.)</para>
1836
1837 <para>Suppose you are importing a Prelude of your own
1838 in order to define your own numeric class
1839 hierarchy. It completely defeats that purpose if the
1840 literal "1" means "<literal>Prelude.fromInteger
1841 1</literal>", which is what the Haskell Report specifies.
1842 So the <option>-XRebindableSyntax</option>
1843 flag causes
1844 the following pieces of built-in syntax to refer to
1845 <emphasis>whatever is in scope</emphasis>, not the Prelude
1846 versions:
1847 <itemizedlist>
1848 <listitem>
1849 <para>An integer literal <literal>368</literal> means
1850 "<literal>fromInteger (368::Integer)</literal>", rather than
1851 "<literal>Prelude.fromInteger (368::Integer)</literal>".
1852 </para> </listitem>
1853
1854 <listitem><para>Fractional literals are handled in just the same way,
1855 except that the translation is
1856 <literal>fromRational (3.68::Rational)</literal>.
1857 </para> </listitem>
1858
1859 <listitem><para>The equality test in an overloaded numeric pattern
1860 uses whatever <literal>(==)</literal> is in scope.
1861 </para> </listitem>
1862
1863 <listitem><para>The subtraction operation, and the
1864 greater-than-or-equal test, in <literal>n+k</literal> patterns
1865 use whatever <literal>(-)</literal> and <literal>(>=)</literal> are in scope.
1866 </para></listitem>
1867
1868 <listitem>
1869 <para>Negation (e.g. "<literal>- (f x)</literal>")
1870 means "<literal>negate (f x)</literal>", both in numeric
1871 patterns, and expressions.
1872 </para></listitem>
1873
1874 <listitem>
1875 <para>Conditionals (e.g. "<literal>if</literal> e1 <literal>then</literal> e2 <literal>else</literal> e3")
1876 mean "<literal>ifThenElse</literal> e1 e2 e3". However, <literal>case</literal> expressions are unaffected.
1877 </para></listitem>
1878
1879 <listitem>
1880 <para>"Do" notation is translated using whatever
1881 functions <literal>(>>=)</literal>,
1882 <literal>(>>)</literal>, and <literal>fail</literal>,
1883 are in scope (not the Prelude
1884 versions). List comprehensions, mdo (<xref linkend="recursive-do-notation"/>), and parallel array
1885 comprehensions, are unaffected. </para></listitem>
1886
1887 <listitem>
1888 <para>Arrow
1889 notation (see <xref linkend="arrow-notation"/>)
1890 uses whatever <literal>arr</literal>,
1891 <literal>(>>>)</literal>, <literal>first</literal>,
1892 <literal>app</literal>, <literal>(|||)</literal> and
1893 <literal>loop</literal> functions are in scope. But unlike the
1894 other constructs, the types of these functions must match the
1895 Prelude types very closely. Details are in flux; if you want
1896 to use this, ask!
1897 </para></listitem>
1898 </itemizedlist>
1899 <option>-XRebindableSyntax</option> implies <option>-XNoImplicitPrelude</option>.
1900 </para>
1901 <para>
1902 In all cases (apart from arrow notation), the static semantics should be that of the desugared form,
1903 even if that is a little unexpected. For example, the
1904 static semantics of the literal <literal>368</literal>
1905 is exactly that of <literal>fromInteger (368::Integer)</literal>; it's fine for
1906 <literal>fromInteger</literal> to have any of the types:
1907 <programlisting>
1908 fromInteger :: Integer -> Integer
1909 fromInteger :: forall a. Foo a => Integer -> a
1910 fromInteger :: Num a => a -> Integer
1911 fromInteger :: Integer -> Bool -> Bool
1912 </programlisting>
1913 </para>
1914
1915 <para>Be warned: this is an experimental facility, with
1916 fewer checks than usual. Use <literal>-dcore-lint</literal>
1917 to typecheck the desugared program. If Core Lint is happy
1918 you should be all right.</para>
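
<para>A minimal sketch of rebinding conditionals, assuming a
user-defined <literal>ifThenElse</literal> (the <literal>Int</literal>-based
condition is purely illustrative and not part of any library). Because
<option>-XRebindableSyntax</option> implies
<option>-XNoImplicitPrelude</option>, the Prelude is imported
explicitly so that the remaining operators keep their usual meaning:</para>

<programlisting>
{-# LANGUAGE RebindableSyntax #-}
module Main where

import Prelude

-- Illustrative only: treat 0 as false and any other Int as true.
-- "if c then t else e" now means "ifThenElse c t e".
ifThenElse :: Int -> a -> a -> a
ifThenElse 0 _ e = e
ifThenElse _ t _ = t

main :: IO ()
main = do
  putStrLn (if (0 :: Int) then "non-zero" else "zero")
  putStrLn (if (7 :: Int) then "non-zero" else "zero")
</programlisting>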
1919
1920 </sect2>
1921
1922 <sect2 id="postfix-operators">
1923 <title>Postfix operators</title>
1924
1925 <para>
1926 The <option>-XPostfixOperators</option> flag enables a small
1927 extension to the syntax of left operator sections, which allows you to
1928 define postfix operators. The extension is this: the left section
1929 <programlisting>
1930 (e !)
1931 </programlisting>
1932 is equivalent (from the point of view of both type checking and execution) to the expression
1933 <programlisting>
1934 ((!) e)
1935 </programlisting>
1936 (for any expression <literal>e</literal> and operator <literal>(!)</literal>).
1937 The strict Haskell 98 interpretation is that the section is equivalent to
1938 <programlisting>
1939 (\y -> (!) e y)
1940 </programlisting>
1941 That is, the operator must be a function of two arguments. GHC allows it to
1942 take only one argument, and that in turn allows you to write the function
1943 postfix.
1944 </para>
1945 <para>The extension does not extend to the left-hand side of function
1946 definitions; you must define such a function in prefix form.</para>
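
<para>For instance, a factorial operator could be used postfix as in
the sketch below. The operator <literal>(!)</literal> here is a local
definition made up for illustration, not a library function:</para>

<programlisting>
{-# LANGUAGE PostfixOperators #-}
module Main where

-- Defined in prefix form, as required; it takes a single argument.
(!) :: Integer -> Integer
(!) n = product [1..n]

main :: IO ()
main = print (5 !)   -- the section (5 !) means ((!) 5)
</programlisting>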
1947
1948 </sect2>
1949
1950 <sect2 id="tuple-sections">
1951 <title>Tuple sections</title>
1952
1953 <para>
1954 The <option>-XTupleSections</option> flag enables Python-style partially applied
1955 tuple constructors. For example, the following expression
1956 <programlisting>
1957 (, True)
1958 </programlisting>
1959 is an alternative notation for the more unwieldy
1960 <programlisting>
1961 \x -> (x, True)
1962 </programlisting>
1963 You can omit any combination of arguments to the tuple, as in the following
1964 <programlisting>
1965 (, "I", , , "Love", , 1337)
1966 </programlisting>
1967 which translates to
1968 <programlisting>
1969 \a b c d -> (a, "I", b, c, "Love", d, 1337)
1970 </programlisting>
1971 </para>
1972
1973 <para>
1974 If you have <link linkend="unboxed-tuples">unboxed tuples</link> enabled, tuple sections
1975 will also be available for them, like so
1976 <programlisting>
1977 (# , True #)
1978 </programlisting>
1979 Because there is no unboxed unit tuple, the following expression
1980 <programlisting>
1981 (# #)
1982 </programlisting>
1983 continues to stand for the unboxed singleton tuple data constructor.
1984 </para>
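
<para>As a rough usage sketch for ordinary (boxed) tuples, tuple sections
are convenient as arguments to higher-order functions. The data below is
arbitrary and chosen only for illustration:</para>

<programlisting>
{-# LANGUAGE TupleSections #-}
module Main where

main :: IO ()
main = do
  -- (, True) is \x -> (x, True)
  print (map (, True) [1, 2, 3 :: Int])
  -- (, "mid", ) is \a b -> (a, "mid", b)
  print (zipWith (, "mid", ) "ab" [1, 2 :: Int])
</programlisting>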
1985
1986 </sect2>
1987
1988 <sect2 id="lambda-case">
1989 <title>Lambda-case</title>
1990 <para>
1991 The <option>-XLambdaCase</option> flag enables expressions of the form
1992 <programlisting>
1993 \case { p1 -> e1; ...; pN -> eN }
1994 </programlisting>
1995 which is equivalent to
1996 <programlisting>
1997 \freshName -> case freshName of { p1 -> e1; ...; pN -> eN }
1998 </programlisting>
1999 Note that <literal>\case</literal> starts a layout, so you can write
2000 <programlisting>
2001 \case
2002 p1 -> e1
2003 ...
2004 pN -> eN
2005 </programlisting>
2006 </para>
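
<para>A small sketch of the layout form, using a made-up function
<literal>describe</literal>. Alternatives may carry guards just as in an
ordinary <literal>case</literal> expression:</para>

<programlisting>
{-# LANGUAGE LambdaCase #-}
module Main where

-- Hypothetical example function; \case avoids naming the argument.
describe :: Maybe Int -> String
describe = \case
  Nothing -> "nothing"
  Just n
    | n &lt; 0     -> "negative"
    | otherwise -> "non-negative"

main :: IO ()
main = mapM_ (putStrLn . describe) [Nothing, Just (-1), Just 3]
</programlisting>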
2007 </sect2>
2008
2009 <sect2 id="empty-case">
2010 <title>Empty case alternatives</title>
2011 <para>
2012 The <option>-XEmptyCase</option> flag enables
2013 case expressions, or lambda-case expressions, that have no alternatives,
2014 thus:
2015 <programlisting>
2016 case e of { } -- No alternatives
2017 or
2018 \case { } -- -XLambdaCase is also required
2019 </programlisting>
2020 This can be useful when you know that the expression being scrutinised
2021 has no non-bottom values. For example:
2022 <programlisting>
2023 data Void
2024 f :: Void -> Int
2025 f x = case x of { }
2026 </programlisting>
2027 With dependently-typed features it is more useful
2028 (see <ulink url="http://ghc.haskell.org/trac/ghc/ticket/2431">Trac</ulink>).
2029 For example, consider these two candidate definitions of <literal>absurd</literal>:
2030 <programlisting>
2031 data a :~: b where
2032   Refl :: a :~: a
2033
2034 absurd :: True :~: False -> a
2035 absurd x = error "absurd" -- (A)
2036 absurd x = case x of {} -- (B)
2037 </programlisting>
2038 We much prefer (B). Why? Because GHC can figure out that <literal>(True :~: False)</literal>
2039 is an empty type. So (B) has no partiality and GHC should be able to compile it with
2040 <option>-fwarn-incomplete-patterns</option> without warning. (Though the pattern match checking is not
2041 yet clever enough to do that.)
2042 On the other hand (A) looks dangerous, and GHC doesn't check to make
2043 sure that, in fact, the function can never get called.
2044 </para>
2045 </sect2>
2046
2047 <sect2 id="multi-way-if">
2048 <title>Multi-way if-expressions</title>
2049 <para>
2050 With the <option>-XMultiWayIf</option> flag, GHC accepts conditional expressions
2051 with multiple branches:
2052 <programlisting>
2053 if | guard1 -> expr1
2054 | ...
2055 | guardN -> exprN
2056 </programlisting>
2057 which is roughly equivalent to
2058 <programlisting>
2059 case () of
2060 _ | guard1 -> expr1
2061 ...
2062 _ | guardN -> exprN
2063 </programlisting>
2064 </para>
2065
2066 <para>Multi-way if expressions introduce a new layout context. So the
2067 example above is equivalent to:
2068 <programlisting>
2069 if { | guard1 -> expr1
2070 ; | ...
2071 ; | guardN -> exprN
2072 }
2073 </programlisting>
2074 The following behaves as expected:
2075 <programlisting>
2076 if | guard1 -> if | guard2 -> expr2
2077 | guard3 -> expr3
2078 | guard4 -> expr4
2079 </programlisting>
2080 because layout translates it as
2081 <programlisting>
2082 if { | guard1 -> if { | guard2 -> expr2
2083 ; | guard3 -> expr3
2084 }
2085 ; | guard4 -> expr4
2086 }
2087 </programlisting>
2088 Layout with multi-way if works in the same way as other layout
2089 contexts, except that the semi-colons between guards in a multi-way if
2090 are optional. So it is not necessary to line up all the guards at the
2091 same column; this is consistent with the way guards work in function
2092 definitions and case expressions.
2093 </para>
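
<para>A concrete sketch, with a made-up function
<literal>classify</literal>, of a multi-way if with three guards:</para>

<programlisting>
{-# LANGUAGE MultiWayIf #-}
module Main where

-- Hypothetical example: guards are tried top to bottom.
classify :: Int -> String
classify n = if | n &lt; 0     -> "negative"
                | n == 0    -> "zero"
                | otherwise -> "positive"

main :: IO ()
main = mapM_ (putStrLn . classify) [-5, 0, 5]
</programlisting>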
2094 </sect2>
2095
2096 <sect2 id="disambiguate-fields">
2097 <title>Record field disambiguation</title>
2098 <para>
2099 In record construction and record pattern matching
2100 it is entirely unambiguous which field is referred to, even if there are two different
2101 data types in scope with a common field name. For example:
2102 <programlisting>
2103 module M where
2104 data S = MkS { x :: Int, y :: Bool }
2105
2106 module Foo where
2107 import M
2108
2109 data T = MkT { x :: Int }
2110
2111 ok1 (MkS { x = n }) = n+1 -- Unambiguous
2112 ok2 n = MkT { x = n+1 } -- Unambiguous
2113
2114 bad1 k = k { x = 3 } -- Ambiguous
2115 bad2 k = x k -- Ambiguous
2116 </programlisting>
2117 Even though there are two <literal>x</literal>'s in scope,
2118 it is clear that the <literal>x</literal> in the pattern in the
2119 definition of <literal>ok1</literal> can only mean the field
2120 <literal>x</literal> from type <literal>S</literal>. Similarly for
2121 the function <literal>ok2</literal>. However, in the record update
2122 in <literal>bad1</literal> and the record selection in <literal>bad2</literal>
2123 it is not clear which of the two types is intended.
2124 </para>
2125 <para>
2126 Haskell 98 regards all four as ambiguous, but with the
2127 <option>-XDisambiguateRecordFields</option> flag, GHC will accept
2128 the former two. The rules are precisely the same as those for instance
2129 declarations in Haskell 98, where the method names on the left-hand side
2130 of the method bindings in an instance declaration refer unambiguously
2131 to the method of that class (provided they are in scope at all), even
2132 if there are other variables in scope with the same name.
2133 This reduces the clutter of qualified names when you import two
2134 records from different modules that use the same field name.
2135 </para>
2136 <para>
2137 Some details:
2138 <itemizedlist>
2139 <listitem><para>
2140 Field disambiguation can be combined with punning (see <xref linkend="record-puns"/>). For example:
2141 <programlisting>
2142 module Foo where
2143 import M
2144 x=True
2145 ok3 (MkS { x }) = x+1 -- Uses both disambiguation and punning
2146 </programlisting>
2147 </para></listitem>
2148
2149 <listitem><para>
2150 With <option>-XDisambiguateRecordFields</option> you can use <emphasis>unqualified</emphasis>
2151 field names even if the corresponding selector is only in scope <emphasis>qualified</emphasis>.
2152 For example, assuming the same module <literal>M</literal> as in our earlier example, this is legal:
2153 <programlisting>
2154 module Foo where
2155 import qualified M -- Note qualified
2156
2157 ok4 (M.MkS { x = n }) = n+1 -- Unambiguous
2158 </programlisting>
2159 Since the constructor <literal>MkS</literal> is only in scope qualified, you must
2160 name it <literal>M.MkS</literal>, but the field <literal>x</literal> does not need
2161 to be qualified even though <literal>M.x</literal> is in scope but <literal>x</literal>
2162 is not. (In effect, it is qualified by the constructor.)
2163 </para></listitem>
2164 </itemizedlist>
2165 </para>
2166
2167 </sect2>
2168
2169 <!-- ===================== Record puns =================== -->
2170
2171 <sect2 id="record-puns">
2172 <title>Record puns
2173 </title>
2174
2175 <para>
2176 Record puns are enabled by the flag <literal>-XNamedFieldPuns</literal>.
2177 </para>
2178
2179 <para>
2180 When using records, it is common to write a pattern that binds a
2181 variable with the same name as a record field, such as:
2182
2183 <programlisting>
2184 data C = C {a :: Int}
2185 f (C {a = a}) = a
2186 </programlisting>
2187 </para>
2188
2189 <para>
2190 Record punning permits the variable name to be elided, so one can simply
2191 write
2192
2193 <programlisting>
2194 f (C {a}) = a
2195 </programlisting>
2196
2197 to mean the same pattern as above. That is, in a record pattern, the
2198 pattern <literal>a</literal> expands into the pattern <literal>a =
2199 a</literal> for the same name <literal>a</literal>.
2200 </para>
2201
2202 <para>
2203 Note that:
2204 <itemizedlist>
2205 <listitem><para>
2206 Record punning can also be used in an expression, writing, for example,
2207 <programlisting>
2208 let a = 1 in C {a}
2209 </programlisting>
2210 instead of
2211 <programlisting>
2212 let a = 1 in C {a = a}
2213 </programlisting>
2214 The expansion is purely syntactic, so the expanded right-hand side
2215 expression refers to the nearest enclosing variable that is spelled the
2216 same as the field name.
2217 </para></listitem>
2218
2219 <listitem><para>
2220 Puns and other patterns can be mixed in the same record:
2221 <programlisting>
2222 data C = C {a :: Int, b :: Int}
2223 f (C {a, b = 4}) = a
2224 </programlisting>
2225 </para></listitem>
2226
2227 <listitem><para>
2228 Puns can be used wherever record patterns occur (e.g. in
2229 <literal>let</literal> bindings or at the top-level).
2230 </para></listitem>
2231
2232 <listitem><para>
2233 A pun on a qualified field name is expanded by stripping off the module qualifier.
2234 For example:
2235 <programlisting>
2236 f (M.C {M.a}) = a
2237 </programlisting>
2238 means
2239 <programlisting>
2240 f (M.C {M.a = a}) = a
2241 </programlisting>
2242 (This is useful if the field selector <literal>a</literal> for constructor <literal>M.C</literal>
2243 is only in scope in qualified form.)
2244 </para></listitem>
2245 </itemizedlist>
2246 </para>
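
<para>The following self-contained sketch combines puns in a pattern
with a pun in an expression. The type <literal>Point</literal> and the
functions <literal>norm1</literal> and <literal>shiftX</literal> are
invented for this example:</para>

<programlisting>
{-# LANGUAGE NamedFieldPuns #-}
module Main where

data Point = Point { px :: Int, py :: Int } deriving Show

-- Puns in a pattern: bind the variables px and py.
norm1 :: Point -> Int
norm1 (Point {px, py}) = abs px + abs py

-- A pun mixed with an ordinary field in an expression:
-- the bare py stands for py = py.
shiftX :: Int -> Point -> Point
shiftX d (Point {px, py}) = Point {px = px + d, py}

main :: IO ()
main = do
  print (norm1 (Point 3 (-4)))
  print (shiftX 2 (Point 3 (-4)))
</programlisting>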
2247
2248
2249 </sect2>
2250
2251 <!-- ===================== Record wildcards =================== -->
2252
2253 <sect2 id="record-wildcards">
2254 <title>Record wildcards
2255 </title>
2256
2257 <para>
2258 Record wildcards are enabled by the flag <literal>-XRecordWildCards</literal>.
2259 This flag implies <literal>-XDisambiguateRecordFields</literal>.
2260 </para>
2261
2262 <para>
2263 For records with many fields, it can be tiresome to write out each field
2264 individually in a record pattern, as in
2265 <programlisting>
2266 data C = C {a :: Int, b :: Int, c :: Int, d :: Int}
2267 f (C {a = 1, b = b, c = c, d = d}) = b + c + d
2268 </programlisting>
2269 </para>
2270
2271 <para>
2272 Record wildcard syntax permits a "<literal>..</literal>" in a record
2273 pattern, where each elided field <literal>f</literal> is replaced by the
2274 pattern <literal>f = f</literal>. For example, the above pattern can be
2275 written as
2276 <programlisting>
2277 f (C {a = 1, ..}) = b + c + d
2278 </programlisting>
2279 </para>
2280
2281 <para>
2282 More details:
2283 <itemizedlist>
2284 <listitem><para>
2285 Wildcards can be mixed with other patterns, including puns
2286 (<xref linkend="record-puns"/>); for example, in a pattern <literal>(C {a
2287 = 1, b, ..})</literal>. Additionally, record wildcards can be used
2288 wherever record patterns occur, including in <literal>let</literal>
2289 bindings and at the top-level. For example, the top-level binding
2290 <programlisting>
2291 C {a = 1, ..} = e
2292 </programlisting>
2293 defines <literal>b</literal>, <literal>c</literal>, and
2294 <literal>d</literal>.
2295 </para></listitem>
2296
2297 <listitem><para>
2298 Record wildcards can also be used in expressions, writing, for example,
2299 <programlisting>
2300 let {a = 1; b = 2; c = 3; d = 4} in C {..}
2301 </programlisting>
2302 in place of
2303 <programlisting>
2304 let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}
2305 </programlisting>
2306 The expansion is purely syntactic, so the record wildcard
2307 expression refers to the nearest enclosing variables that are spelled
2308 the same as the omitted field names.
2309 </para></listitem>
2310
2311 <listitem><para>
2312 The "<literal>..</literal>" expands to the missing
2313 <emphasis>in-scope</emphasis> record fields.
2314 Specifically the expansion of "<literal>C {..}</literal>" includes
2315 <literal>f</literal> if and only if:
2316 <itemizedlist>
2317 <listitem><para>
2318 <literal>f</literal> is a record field of constructor <literal>C</literal>.
2319 </para></listitem>
2320 <listitem><para>
2321 The record field <literal>f</literal> is in scope somehow (either qualified or unqualified).
2322 </para></listitem>
2323 <listitem><para>
2324 In the case of expressions (but not patterns),
2325 the variable <literal>f</literal> is in scope unqualified,
2326 apart from the binding of the record selector itself.
2327 </para></listitem>
2328 </itemizedlist>
2329 For example
2330 <programlisting>
2331 module M where
2332 data R = R { a,b,c :: Int }
2333 module X where
2334 import M( R(R,a,c) )
2335 f a b = R { .. }
2336 </programlisting>
2337 The <literal>R{..}</literal> expands to <literal>R{a=a}</literal>,
2338 omitting <literal>b</literal> since the record field is not in scope,
2339 and omitting <literal>c</literal> since the variable <literal>c</literal>
2340 is not in scope (apart from the binding of the
2341 record selector <literal>c</literal>, of course).
2342 </para></listitem>
2343 </itemizedlist>
2344 </para>
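
<para>A single-module sketch of wildcards in both a pattern and an
expression. The record type <literal>Config</literal> and its fields are
hypothetical, chosen only to illustrate the expansion:</para>

<programlisting>
{-# LANGUAGE RecordWildCards #-}
module Main where

data Config = Config { host :: String, port :: Int } deriving Show

-- Wildcard in a pattern: binds host and port.
render :: Config -> String
render Config{..} = host ++ ":" ++ show port

-- Wildcard in an expression: picks up the let-bound host and port.
defaultConfig :: Config
defaultConfig = let { host = "localhost"; port = 8080 } in Config{..}

main :: IO ()
main = putStrLn (render defaultConfig)
</programlisting>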
2345
2346 </sect2>
2347
2348 <!-- ===================== Local fixity declarations =================== -->
2349
2350 <sect2 id="local-fixity-declarations">
2351 <title>Local Fixity Declarations
2352 </title>
2353
2354 <para>A careful reading of the Haskell 98 Report reveals that fixity
2355 declarations (<literal>infix</literal>, <literal>infixl</literal>, and
2356 <literal>infixr</literal>) are permitted to appear inside local bindings
2357 such as those introduced by <literal>let</literal> and
2358 <literal>where</literal>. However, the Haskell Report does not specify
2359 the semantics of such bindings very precisely.
2360 </para>
2361
2362 <para>In GHC, a fixity declaration may accompany a local binding:
2363 <programlisting>
2364 let f = ...
2365 infixr 3 `f`
2366 in
2367 ...
2368 </programlisting>
2369 and the fixity declaration applies wherever the binding is in scope.
2370 For example, in a <literal>let</literal>, it applies in the right-hand
2371 sides of other <literal>let</literal>-bindings and the body of the
2372 <literal>let</literal>. Or, in recursive <literal>do</literal>
2373 expressions (<xref linkend="recursive-do-notation"/>), the local fixity
2374 declarations of a <literal>let</literal> statement scope over other
2375 statements in the group, just as the bound name does.
2376 </para>
2377
2378 <para>
2379 Moreover, a local fixity declaration <emphasis>must</emphasis> accompany a local binding of
2380 that name: it is not possible to revise the fixity of a name bound
2381 elsewhere, as in
2382 <programlisting>
2383 let infixr 9 $ in ...
2384 </programlisting>
2385
2386 Because local fixity declarations are technically Haskell 98, no flag is
2387 necessary to enable them.
2388 </para>
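
<para>A small sketch of a local fixity declaration that changes how an
expression parses. The operator <literal>|-|</literal> is made up for
this example:</para>

<programlisting>
module Main where

result :: Int
result =
  let infixr 6 |-|          -- fixity accompanies the local binding below
      a |-| b = a - b
  in 10 |-| 3 |-| 2         -- parses as 10 |-| (3 |-| 2)

main :: IO ()
main = print result
</programlisting>

<para>Without the <literal>infixr</literal> declaration the operator
would default to <literal>infixl 9</literal>, and the expression would
instead parse as <literal>(10 |-| 3) |-| 2</literal>.</para>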
2389 </sect2>
2390
2391 <sect2 id="package-imports">
2392 <title>Package-qualified imports</title>
2393
2394 <para>With the <option>-XPackageImports</option> flag, GHC allows
2395 import declarations to be qualified by the package name that the
2396 module is intended to be imported from. For example:</para>
2397
2398 <programlisting>
2399 import "network" Network.Socket
2400 </programlisting>
2401
2402 <para>would import the module <literal>Network.Socket</literal> from
2403 the package <literal>network</literal> (any version). This may
2404 be used to disambiguate an import when the same module is
2405 available from multiple packages, or is present in both the
2406 current package being built and an external package.</para>
2407
2408 <para>The special package name <literal>this</literal> can be used to
2409 refer to the current package being built.</para>
2410
2411 <para>Note: you probably don't need to use this feature; it was
2412 added mainly so that we can build backwards-compatible versions of
2413 packages when APIs change. It can lead to fragile dependencies in
2414 the common case: modules occasionally move from one package to
2415 another, rendering any package-qualified imports broken.</para>
2416 </sect2>
2417
2418 <sect2 id="safe-imports-ext">
2419 <title>Safe imports</title>
2420
2421 <para>With the <option>-XSafe</option>, <option>-XTrustworthy</option>
2422 and <option>-XUnsafe</option> language flags, GHC extends
2423 the import declaration syntax to take an optional <literal>safe</literal>
2424 keyword after the <literal>import</literal> keyword. This feature
2425 is part of the Safe Haskell GHC extension. For example:</para>
2426
2427 <programlisting>
2428 import safe qualified Network.Socket as NS
2429 </programlisting>
2430
2431 <para>would import the module <literal>Network.Socket</literal>
2432 with compilation only succeeding if <literal>Network.Socket</literal> can be
2433 safely imported. For a description of when an import is
2434 considered safe, see <xref linkend="safe-haskell"/>.</para>
2435
2436 </sect2>
2437
2438 <sect2 id="explicit-namespaces">
2439 <title>Explicit namespaces in import/export</title>
2440
2441 <para> In an import or export list, such as
2442 <programlisting>
2443 module M( f, (++) ) where ...
2444 import N( f, (++) )
2445 ...
2446 </programlisting>
2447 the entities <literal>f</literal> and <literal>(++)</literal> are <emphasis>values</emphasis>.
2448 However, with type operators (<xref linkend="type-operators"/>) it becomes possible
2449 to declare <literal>(++)</literal> as a <emphasis>type constructor</emphasis>. In that
2450 case, how would you export or import it?
2451 </para>
2452 <para>
2453 The <option>-XExplicitNamespaces</option> extension allows you to prefix the name of
2454 a type constructor in an import or export list with "<literal>type</literal>" to
2455 disambiguate this case, thus:
2456 <programlisting>
2457 module M( f, type (++) ) where ...
2458 import N( f, type (++) )
2459 ...
2460 module N( f, type (++) ) where
2461 data a ++ b = L a | R b
2462 </programlisting>
2463 The extension <option>-XExplicitNamespaces</option>
2464 is implied by <option>-XTypeOperators</option> and (for some reason) by <option>-XTypeFamilies</option>.
2465 </para>
2466 </sect2>
2467
2468 <sect2 id="syntax-stolen">
2469 <title>Summary of stolen syntax</title>
2470
2471 <para>Turning on an option that enables special syntax
2472 <emphasis>might</emphasis> cause working Haskell 98 code to fail
2473 to compile, perhaps because it uses a variable name which has
2474 become a reserved word. This section lists the syntax that is
2475 "stolen" by language extensions.
2476 We use
2477 notation and nonterminal names from the Haskell 98 lexical syntax
2478 (see the Haskell 98 Report).
2479 We only list syntax changes here that might affect
2480 existing working programs (i.e. "stolen" syntax). Many of these
2481 extensions will also enable new context-free syntax, but in all
2482 cases programs written to use the new syntax would not be
2483 compilable without the option enabled.</para>
2484
2485 <para>There are two classes of special
2486 syntax:
2487
2488 <itemizedlist>
2489 <listitem>
2490 <para>New reserved words and symbols: character sequences
2491 which are no longer available for use as identifiers in the
2492 program.</para>
2493 </listitem>
2494 <listitem>
2495 <para>Other special syntax: sequences of characters that have
2496 a different meaning when this particular option is turned
2497 on.</para>
2498 </listitem>
2499 </itemizedlist>
2500
2501 The following syntax is stolen:
2502
2503 <variablelist>
2504 <varlistentry>
2505 <term>
2506 <literal>forall</literal>
2507 <indexterm><primary><literal>forall</literal></primary></indexterm>
2508 </term>
2509 <listitem><para>
2510 Stolen (in types) by: <option>-XExplicitForAll</option>, and hence by
2511 <option>-XScopedTypeVariables</option>,
2512 <option>-XLiberalTypeSynonyms</option>,
2513 <option>-XRankNTypes</option>,
2514 <option>-XExistentialQuantification</option>
2515 </para></listitem>
2516 </varlistentry>
2517
2518 <varlistentry>
2519 <term>
2520 <literal>mdo</literal>
2521 <indexterm><primary><literal>mdo</literal></primary></indexterm>
2522 </term>
2523 <listitem><para>
2524 Stolen by: <option>-XRecursiveDo</option>
2525 </para></listitem>
2526 </varlistentry>
2527
2528 <varlistentry>
2529 <term>
2530 <literal>foreign</literal>
2531 <indexterm><primary><literal>foreign</literal></primary></indexterm>
2532 </term>
2533 <listitem><para>
2534 Stolen by: <option>-XForeignFunctionInterface</option>
2535 </para></listitem>
2536 </varlistentry>
2537
2538 <varlistentry>
2539 <term>
2540 <literal>rec</literal>,
2541 <literal>proc</literal>, <literal>-&lt;</literal>,
2542 <literal>&gt;-</literal>, <literal>-&lt;&lt;</literal>,
2543 <literal>&gt;&gt;-</literal>, and <literal>(|</literal>,
2544 <literal>|)</literal> brackets
2545 <indexterm><primary><literal>proc</literal></primary></indexterm>
2546 </term>
2547 <listitem><para>
2548 Stolen by: <option>-XArrows</option>
2549 </para></listitem>
2550 </varlistentry>
2551
2552 <varlistentry>
2553 <term>
2554 <literal>?<replaceable>varid</replaceable></literal>
2555 <indexterm><primary>implicit parameters</primary></indexterm>
2556 </term>
2557 <listitem><para>
2558 Stolen by: <option>-XImplicitParams</option>
2559 </para></listitem>
2560 </varlistentry>
2561
2562 <varlistentry>
2563 <term>
2564 <literal>[|</literal>,
2565 <literal>[e|</literal>, <literal>[p|</literal>,
2566 <literal>[d|</literal>, <literal>[t|</literal>,
2567 <literal>$(</literal>,
2568 <literal>$$(</literal>,
2569 <literal>[||</literal>,
2570 <literal>[e||</literal>,
2571 <literal>$<replaceable>varid</replaceable></literal>,
2572 <literal>$$<replaceable>varid</replaceable></literal>
2573 <indexterm><primary>Template Haskell</primary></indexterm>
2574 </term>
2575 <listitem><para>
2576 Stolen by: <option>-XTemplateHaskell</option>
2577 </para></listitem>
2578 </varlistentry>
2579
2580 <varlistentry>
2581 <term>
2582 <literal>[<replaceable>varid</replaceable>|</literal>
2583 <indexterm><primary>quasi-quotation</primary></indexterm>
2584 </term>
2585 <listitem><para>
2586 Stolen by: <option>-XQuasiQuotes</option>
2587 </para></listitem>
2588 </varlistentry>
2589
2590 <varlistentry>
2591 <term>
2592 <replaceable>varid</replaceable>{<literal>&num;</literal>},
2593 <replaceable>char</replaceable><literal>&num;</literal>,
2594 <replaceable>string</replaceable><literal>&num;</literal>,
2595 <replaceable>integer</replaceable><literal>&num;</literal>,
2596 <replaceable>float</replaceable><literal>&num;</literal>,
2597 <replaceable>float</replaceable><literal>&num;&num;</literal>
2598 </term>
2599 <listitem><para>
2600 Stolen by: <option>-XMagicHash</option>
2601 </para></listitem>
2602 </varlistentry>
2603
2604 <varlistentry>
2605 <term>
2606 <literal>(&num;</literal>, <literal>&num;)</literal>
2607 </term>
2608 <listitem><para>
2609 Stolen by: <option>-XUnboxedTuples</option>
2610 </para></listitem>
2611 </varlistentry>
2612
2613 <varlistentry>
2614 <term>
2615 <replaceable>varid</replaceable><literal>!</literal><replaceable>varid</replaceable>
2616 </term>
2617 <listitem><para>
2618 Stolen by: <option>-XBangPatterns</option>
2619 </para></listitem>
2620 </varlistentry>
2621
2622 <varlistentry>
2623 <term>
2624 <literal>pattern</literal>
2625 </term>
2626 <listitem><para>
2627 Stolen by: <option>-XPatternSynonyms</option>
2628 </para></listitem>
2629 </varlistentry>
2630 </variablelist>
2631 </para>
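<para>
As a small illustration of what "stolen" means, the following sketch compiles
only while the stealing extensions are off. It assumes a module in which neither
<option>-XRecursiveDo</option> nor <option>-XArrows</option> is enabled; the
definitions themselves are made up for this example.
<programlisting>
module Main where

-- Neither -XRecursiveDo nor -XArrows is enabled in this module,
-- so mdo, proc and rec are ordinary identifiers here.
mdo :: Int
mdo = 1

proc :: Int -> Int
proc rec = rec + mdo

main :: IO ()
main = print (proc 41)  -- prints 42
</programlisting>
Turning on either extension for this module should make it fail to parse,
because the corresponding words become keywords.
</para>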
2632 </sect2>
2633 </sect1>
2634
2635
2636 <!-- TYPE SYSTEM EXTENSIONS -->
2637 <sect1 id="data-type-extensions">
2638 <title>Extensions to data types and type synonyms</title>
2639
2640 <sect2 id="nullary-types">
2641 <title>Data types with no constructors</title>
2642
2643 <para>With the <option>-XEmptyDataDecls</option> flag (or equivalent LANGUAGE pragma),
2644 GHC lets you declare a data type with no constructors. For example:</para>
2645
2646 <programlisting>
2647 data S -- S :: *
2648 data T a -- T :: * -> *
2649 </programlisting>
2650
2651 <para>Syntactically, the declaration lacks the "= constrs" part. The
2652 type can be parameterised over types of any kind, but if the kind is
2653 not <literal>*</literal> then an explicit kind annotation must be used
2654 (see <xref linkend="kinding"/>).</para>
2655
2656 <para>Such data types have only one value, namely bottom.
2657 Nevertheless, they can be useful when defining "phantom types".</para>
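<para>
A minimal sketch of the phantom-type idiom follows. The names
<literal>Metres</literal>, <literal>Feet</literal> and <literal>Distance</literal>
are hypothetical and chosen only for this example.
<programlisting>
{-# LANGUAGE EmptyDataDecls #-}
module Main where

-- Two constructor-less types, used only as type-level tags.
data Metres
data Feet

-- 'unit' is a phantom parameter: it does not occur on the right-hand side.
newtype Distance unit = Distance Double

metres :: Double -> Distance Metres
metres = Distance

feet :: Double -> Distance Feet
feet = Distance

-- Both arguments must carry the same tag.
addDistance :: Distance unit -> Distance unit -> Distance unit
addDistance (Distance x) (Distance y) = Distance (x + y)

main :: IO ()
main = case addDistance (metres 1.5) (metres 2.5) of
  Distance d -> print d  -- prints 4.0
</programlisting>
An expression such as <literal>addDistance (metres 1) (feet 1)</literal> is
rejected by the type checker, even though both values are represented by a
<literal>Double</literal>.
</para>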
2658 </sect2>
2659
2660 <sect2 id="datatype-contexts">
2661 <title>Data type contexts</title>
2662
2663 <para>Haskell allows datatypes to be given contexts, e.g.</para>
2664
2665 <programlisting>
2666 data Eq a => Set a = NilSet | ConsSet a (Set a)
2667 </programlisting>
2668
2669 <para>This declaration gives the constructors the following types:</para>
2670
2671 <programlisting>
2672 NilSet :: Set a
2673 ConsSet :: Eq a => a -> Set a -> Set a
2674 </programlisting>
2675
2676 <para>This is widely considered a misfeature, and is going to be removed from
2677 the language. In GHC, it is controlled by the deprecated extension
2678 <literal>DatatypeContexts</literal>.</para>
2679 </sect2>
2680
2681 <sect2 id="infix-tycons">
2682 <title>Infix type constructors, classes, and type variables</title>
2683
2684 <para>
2685 GHC allows type constructors, classes, and type variables to be operators, and
2686 to be written infix, very much like expressions. More specifically:
2687 <itemizedlist>
2688 <listitem><para>
2689 A type constructor or class can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
2690 The lexical syntax is the same as that for data constructors.
2691 </para></listitem>
2692 <listitem><para>
2693 Data type and type-synonym declarations can be written infix, parenthesised
2694 if you want further arguments. E.g.
2695 <screen>
2696 data a :*: b = Foo a b
2697 type a :+: b = Either a b
2698 class a :=: b where ...
2699
2700 data (a :**: b) x = Baz a b x
2701 type (a :++: b) y = Either (a,b) y
2702 </screen>
2703 </para></listitem>
2704 <listitem><para>
2705 Types, and class constraints, can be written infix. For example
2706 <screen>
2707 x :: Int :*: Bool
2708 f :: (a :=: b) => a -> b
2709 </screen>
2710 </para></listitem>
2711 <listitem><para>
2712 Back-quotes work
2713 as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or
2714 <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>.
2715 </para></listitem>
2716 <listitem><para>
2717 Fixities may be declared for type constructors, or classes, just as for data constructors. However,
2718 one cannot distinguish between the two in a fixity declaration; a fixity declaration
2719 sets the fixity for a data constructor and the corresponding type constructor. For example:
2720 <screen>
2721 infixl 7 T, :*:
2722 </screen>
2723 sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>,
2724 and similarly for <literal>:*:</literal>.
2726 </para></listitem>
2727 <listitem><para>
2728 The function arrow <literal>-></literal> is <literal>infixr</literal> with fixity 0. (This might change; I'm not sure what it should be.)
2729 </para></listitem>
2730
2731 </itemizedlist>
2732 </para>
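<para>
The points above can be combined in one small module. This is only a sketch:
it assumes <option>-XTypeOperators</option> is enabled, and the fixities and
the functions <literal>example</literal> and <literal>describe</literal> are
invented for illustration.
<programlisting>
{-# LANGUAGE TypeOperators #-}
module Main where

-- One fixity declaration covers both the type constructor and the
-- data constructor named :*:.
infixr 6 :*:
data a :*: b = a :*: b

infixr 5 :+:
type a :+: b = Either a b

-- With the fixities above this parses as (Int :*: Bool) :+: Char.
example :: Int :*: Bool :+: Char
example = Left (3 :*: True)

describe :: Int :*: Bool :+: Char -> String
describe (Left (n :*: b)) = show n ++ " and " ++ show b
describe (Right c)        = [c]

main :: IO ()
main = putStrLn (describe example)  -- prints "3 and True"
</programlisting>
</para>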
2733 </sect2>
2734
2735 <sect2 id="type-operators">
2736 <title>Type operators</title>
2737 <para>
2738 In types, an operator symbol like <literal>(+)</literal> is normally treated as a type
2739 <emphasis>variable</emphasis>, just like <literal>a</literal>. Thus in Haskell 98 you can say
2740 <programlisting>
2741 type T (+) = ((+), (+))
2742 -- Just like: type T a = (a,a)
2743
2744 f :: T Int -> Int
2745 f (x,y)= x
2746 </programlisting>
2747 As you can see, using operators in this way is not very useful, and Haskell 98 does not even
2748 allow you to write them infix.
2749 </para>
2750 <para>
2751 The language extension <option>-XTypeOperators</option> changes this behaviour:
2752 <itemizedlist>
2753 <listitem><para>
2754 Operator symbols become type <emphasis>constructors</emphasis> rather than
2755 type <emphasis>variables</emphasis>.
2756 </para></listitem>
2757 <listitem><para>
2758 Operator symbols in types can be written infix, both in definitions and uses.
2759 For example:
2760 <programlisting>
2761 data a + b = Plus a b
2762 type Foo = Int + Bool
2763 </programlisting>
2764 </para></listitem>
2765 <listitem><para>
2766 There is now some potential ambiguity in import and export lists; for example
2767 if you write <literal>import M( (+) )</literal> do you mean the
2768 <emphasis>function</emphasis> <literal>(+)</literal> or the
2769 <emphasis>type constructor</emphasis> <literal>(+)</literal>?
2770 The default is the former, but with <option>-XExplicitNamespaces</option> (which is implied
2771 by <option>-XTypeOperators</option>) GHC allows you to specify the latter
2772 by preceding it with the keyword <literal>type</literal>, thus:
2773 <programlisting>
2774 import M( type (+) )
2775 </programlisting>
2776 See <xref linkend="explicit-namespaces"/>.
2777 </para></listitem>
2778 <listitem><para>
2779 The fixity of a type operator may be set using the usual fixity declarations
2780 but, as in <xref linkend="infix-tycons"/>, the function and type constructor share
2781 a single fixity.
2782 </para></listitem>
2783 </itemizedlist>
2784 </para>
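<para>
As a hedged sketch of the first two points, the module below defines
<literal>+</literal> as a type constructor and still uses the Prelude's
value-level <literal>(+)</literal>. The function <literal>total</literal> is
hypothetical.
<programlisting>
{-# LANGUAGE TypeOperators #-}
module Main where

-- '+' is a type constructor here, written infix in its own definition.
data a + b = Plus a b

type Foo = Int + Bool

-- The value-level (+) is unaffected: types and values live in
-- separate namespaces.
total :: Foo -> Int
total (Plus n b) = n + (if b then 1 else 0)

main :: IO ()
main = print (total (Plus 41 True))  -- prints 42
</programlisting>
</para>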
2785 </sect2>
2786
2787 <sect2 id="type-synonyms">
2788 <title>Liberalised type synonyms</title>
2789
2790 <para>
2791 Type synonyms are like macros at the type level, but Haskell 98 imposes many rules
2792 on individual synonym declarations.
2793 With the <option>-XLiberalTypeSynonyms</option> extension,
2794 GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>.
2795 That means that GHC can be very much more liberal about type synonyms than Haskell 98.
2796
2797 <itemizedlist>
2798 <listitem> <para>You can write a <literal>forall</literal> (including overloading)
2799 in a type synonym, thus:
2800 <programlisting>
2801 type Discard a = forall b. Show b => a -> b -> (a, String)
2802
2803 f :: Discard a
2804 f x y = (x, show y)
2805
2806 g :: Discard Int -> (Int,String) -- A rank-2 type
2807 g f = f 3 True
2808 </programlisting>
2809 </para>
2810 </listitem>
2811
2812 <listitem><para>
2813 If you also use <option>-XUnboxedTuples</option>,
2814 you can write an unboxed tuple in a type synonym:
2815 <programlisting>
2816 type Pr = (# Int, Int #)
2817
2818 h :: Int -> Pr
2819 h x = (# x, x #)
2820 </programlisting>
2821 </para></listitem>
2822
2823 <listitem><para>
2824 You can apply a type synonym to a forall type:
2825 <programlisting>
2826 type Foo a = a -> a -> Bool
2827
2828 f :: Foo (forall b. b->b)
2829 </programlisting>
2830 After expanding the synonym, <literal>f</literal> has the legal (in GHC) type:
2831 <programlisting>
2832 f :: (forall b. b->b) -> (forall b. b->b) -> Bool
2833 </programlisting>
2834 </para></listitem>
2835
2836 <listitem><para>
2837 You can apply a type synonym to a partially applied type synonym:
2838 <programlisting>
2839 type Generic i o = forall x. i x -> o x
2840 type Id x = x
2841
2842 foo :: Generic Id []
2843 </programlisting>
2844 After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
2845 <programlisting>
2846 foo :: forall x. x -> [x]
2847 </programlisting>
2848 </para></listitem>
2849
2850 </itemizedlist>
2851 </para>
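<para>
Two of the cases above can be put together into a complete module. This is a
sketch rather than a definitive recipe: it assumes <option>-XRankNTypes</option>
is enabled alongside <option>-XLiberalTypeSynonyms</option>, and the body given
to <literal>foo</literal> is just one possible definition.
<programlisting>
{-# LANGUAGE RankNTypes, LiberalTypeSynonyms #-}
module Main where

-- A forall (with a class context) inside a synonym.
type Discard a = forall b. Show b => a -> b -> (a, String)

f :: Discard a
f x y = (x, show y)

g :: Discard Int -> (Int, String)  -- a rank-2 type after expansion
g k = k 3 True

-- A synonym applied to a partially applied synonym.
type Generic i o = forall x. i x -> o x
type Id x = x

foo :: Generic Id []               -- expands to: forall x. x -> [x]
foo x = [x]

main :: IO ()
main = do
  print (g f)      -- prints (3,"True")
  print (foo 'c')  -- prints "c"
</programlisting>
</para>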
2852
2853 <para>
2854 GHC currently does kind checking before expanding synonyms (though even that
2855 could be changed).
2856 </para>
2857 <para>
2858 After expanding type synonyms, GHC does validity checking on types, looking for
2859 the following mal-formedness which isn't detected simply by kind checking:
2860 <itemizedlist>
2861 <listitem><para>
2862 Type constructor applied to a type involving for-alls (if <option>-XImpredicativeTypes</option>
2863 is off)
2864 </para></listitem>
2865 <listitem><para>
2866 Partially-applied type synonym.
2867 </para></listitem>
2868 </itemizedlist>
2869 So, for example, this will be rejected:
2870 <programlisting>
2871 type Pr = forall a. a
2872
2873 h :: [Pr]
2874 h = ...
2875 </programlisting>
2876 because GHC does not allow type constructors applied to for-all types.
2877 </para>
2878 </sect2>
2879
2880
2881 <sect2 id="existential-quantification">
2882 <title>Existentially quantified data constructors
2883 </title>
2884
2885 <para>
2886 The idea of using existential quantification in data type declarations
2887 was suggested by Perry, and implemented in Hope+ (Nigel Perry, <emphasis>The Implementation
2888 of Practical Functional Programming Languages</emphasis>, PhD Thesis, University of
2889 London, 1991). It was later formalised by Laufer and Odersky
2890 (<emphasis>Polymorphic type inference and abstract data types</emphasis>,
2891 TOPLAS, 16(5), pp1411-1430, 1994).
2892 It's been in Lennart
2893 Augustsson's <command>hbc</command> Haskell compiler for several years, and
2894 proved very useful. Here's the idea. Consider the declaration:
2895 </para>
2896
2897 <para>
2898
2899 <programlisting>
2900 data Foo = forall a. MkFoo a (a -> Bool)
2901 | Nil
2902 </programlisting>
2903
2904 </para>
2905
2906 <para>
2907 The data type <literal>Foo</literal> has two constructors with types:
2908 </para>
2909
2910 <para>
2911
2912 <programlisting>
2913 MkFoo :: forall a. a -> (a -> Bool) -> Foo
2914 Nil :: Foo
2915 </programlisting>
2916
2917 </para>
2918
2919 <para>
2920 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
2921 does not appear in the data type itself, which is plain <literal>Foo</literal>.
2922 For example, the following expression is fine:
2923 </para>
2924
2925 <para>
2926
2927 <programlisting>
2928 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
2929 </programlisting>
2930
2931 </para>
2932
2933 <para>
2934 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
2935 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
2936 isUpper</function> packages a character with a compatible function. These
2937 two things are each of type <literal>Foo</literal> and can be put in a list.
2938 </para>
2939
2940 <para>
2941 What can we do with a value of type <literal>Foo</literal>? In particular,
2942 what happens when we pattern-match on <function>MkFoo</function>?
2943 </para>
2944
2945 <para>
2946
2947 <programlisting>
2948 f (MkFoo val fn) = ???
2949 </programlisting>
2950
2951 </para>
2952
2953 <para>
2954 Since all we know about <literal>val</literal> and <function>fn</function> is that they
2955 are compatible, the only (useful) thing we can do with them is to
2956 apply <function>fn</function> to <literal>val</literal> to get a boolean. For example:
2957 </para>
2958
2959 <para>
2960
2961 <programlisting>
2962 f :: Foo -> Bool
2963 f (MkFoo val fn) = fn val
2964 </programlisting>
2965
2966 </para>
2967
2968 <para>
2969 What this allows us to do is to package heterogeneous values
2970 together with a bunch of functions that manipulate them, and then treat
2971 that collection of packages in a uniform manner. You can express
2972 quite a bit of object-oriented-like programming this way.
2973 </para>
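<para>
Putting the fragments of this section together gives the following complete
module. It is a minimal sketch that assumes
<option>-XExistentialQuantification</option> is enabled; the names
<literal>apply</literal> and <literal>foos</literal> are chosen for
illustration.
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

-- The only useful thing to do with the packaged pair is to apply
-- the function to the value.
apply :: Foo -> Bool
apply (MkFoo val fn) = fn val
apply Nil            = False

-- A heterogeneous list: each element hides a different type.
foos :: [Foo]
foos = [MkFoo (4 :: Int) even, MkFoo 'c' isUpper, Nil]

main :: IO ()
main = print (map apply foos)  -- prints [True,False,False]
</programlisting>
</para>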
2974
2975 <sect3 id="existential">
2976 <title>Why existential?
2977 </title>
2978
2979 <para>
2980 What has this to do with <emphasis>existential</emphasis> quantification?
2981 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
2982 </para>
2983
2984 <para>
2985
2986 <programlisting>
2987 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
2988 </programlisting>
2989
2990 </para>
2991
2992 <para>
2993 But Haskell programmers can safely think of the ordinary
2994 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
2995 adding a new existential quantification construct.
2996 </para>
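<para>
One way to make the correspondence concrete, without an
<literal>exists</literal> construct, is to encode the existential as a rank-2
function. The sketch below assumes <option>-XRankNTypes</option>; the helpers
<literal>withFoo</literal> and <literal>packFoo</literal> are hypothetical and
not part of any library.
<programlisting>
{-# LANGUAGE ExistentialQuantification, RankNTypes #-}
module Main where

data Foo = forall a. MkFoo a (a -> Bool)

-- Eliminating a Foo: the consumer must work for every possible hidden type.
withFoo :: Foo -> (forall a. a -> (a -> Bool) -> r) -> r
withFoo (MkFoo v fn) k = k v fn

-- Going back: anything offering that interface can be packed into a Foo.
packFoo :: (forall r. (forall a. a -> (a -> Bool) -> r) -> r) -> Foo
packFoo k = k MkFoo

main :: IO ()
main = do
  let foo = MkFoo (4 :: Int) even
  print (withFoo foo (\v fn -> fn v))                      -- prints True
  print (withFoo (packFoo (withFoo foo)) (\v fn -> fn v))  -- prints True
</programlisting>
</para>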
2997
2998 </sect3>
2999
3000 <sect3 id="existential-with-context">
3001 <title>Existentials and type classes</title>
3002
3003 <para>
3004 An easy extension is to allow
3005 arbitrary contexts before the constructor. For example:
3006 </para>
3007
3008 <para>
3009
3010 <programlisting>
3011 data Baz = forall a. Eq a => Baz1 a a
3012 | forall b. Show b => Baz2 b (b -> b)
3013 </programlisting>
3014
3015 </para>
3016
3017 <para>
3018 The two constructors have the types you'd expect:
3019 </para>
3020
3021 <para>
3022
3023 <programlisting>
3024 Baz1 :: forall a. Eq a => a -> a -> Baz
3025 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
3026 </programlisting>
3027
3028 </para>
3029
3030 <para>
3031 But when pattern matching on <function>Baz1</function> the matched values can be compared
3032 for equality, and when pattern matching on <function>Baz2</function> the first matched
3033 value can be converted to a string (as well as applying the function to it).
3034 So this program is legal:
3035 </para>
3036
3037 <para>
3038
3039 <programlisting>
3040 f :: Baz -> String
3041 f (Baz1 p q) | p == q = "Yes"
3042 | otherwise = "No"
3043 f (Baz2 v fn) = show (fn v)
3044 </programlisting>
3045
3046 </para>
3047
3048 <para>
3049 Operationally, in a dictionary-passing implementation, the
3050 constructors <function>Baz1</function> and <function>Baz2</function> must store the
3051 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
3052 extract them on pattern matching.
3053 </para>
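<para>
The declarations in this section assemble into the following runnable sketch.
The sample values in <literal>main</literal> are made up for illustration.
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

-- Matching on Baz1 brings Eq into scope; matching on Baz2 brings Show.
f :: Baz -> String
f (Baz1 p q) | p == q    = "Yes"
             | otherwise = "No"
f (Baz2 v fn)            = show (fn v)

main :: IO ()
main = mapM_ (putStrLn . f)
  [ Baz1 'x' 'x'            -- prints Yes
  , Baz1 True False         -- prints No
  , Baz2 (20 :: Int) (* 2)  -- prints 40
  ]
</programlisting>
Note that <literal>f</literal> itself has no class constraint: the
dictionaries come from the constructors.
</para>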
3054
3055 </sect3>
3056
3057 <sect3 id="existential-records">
3058 <title>Record Constructors</title>
3059
3060 <para>
3061 GHC allows existentials to be used with record syntax as well. For example:
3062
3063 <programlisting>
3064 data Counter a = forall self. NewCounter
3065 { _this :: self
3066 , _inc :: self -> self
3067 , _display :: self -> IO ()
3068 , tag :: a
3069 }
3070 </programlisting>
3071 Here <literal>tag</literal> is a public field, with a well-typed selector
3072 function <literal>tag :: Counter a -> a</literal>. The <literal>self</literal>
3073 type is hidden from the outside; any attempt to apply <literal>_this</literal>,
3074 <literal>_inc</literal> or <literal>_display</literal> as functions will raise a
3075 compile-time error. In other words, <emphasis>GHC defines a record selector function
3076 only for fields whose type does not mention the existentially-quantified variables</emphasis>.
3077 (This example used an underscore in the fields for which record selectors
3078 will not be defined, but that is only programming style; GHC ignores them.)
3079 </para>
3080
3081 <para>
3082 To make use of these hidden fields, we need to create some helper functions:
3083
3084 <programlisting>
3085 inc :: Counter a -> Counter a
3086 inc (NewCounter x i d t) = NewCounter
3087 { _this = i x, _inc = i, _display = d, tag = t }
3088
3089 display :: Counter a -> IO ()
3090 display NewCounter{ _this = x, _display = d } = d x
3091 </programlisting>
3092
3093 Now we can define counters with different underlying implementations:
3094
3095 <programlisting>
3096 counterA :: Counter String
3097 counterA = NewCounter
3098 { _this = 0, _inc = (1+), _display = print, tag = "A" }
3099
3100 counterB :: Counter String
3101 counterB = NewCounter
3102 { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }
3103
3104 main = do
3105 display (inc counterA) -- prints "1"
3106 display (inc (inc counterB)) -- prints "##"
3107 </programlisting>
3108
3109 Record update syntax is supported for existentials (and GADTs):
3110 <programlisting>
3111 setTag :: Counter a -> a -> Counter a
3112 setTag obj t = obj{ tag = t }
3113 </programlisting>
3114 The rule for record update is this: <emphasis>
3115 the types of the updated fields may
3116 mention only the universally-quantified type variables
3117 of the data constructor. For GADTs, the field may mention only types
3118 that appear as a simple type-variable argument in the constructor's result
3119 type</emphasis>. For example:
3120 <programlisting>
3121 data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential
3122 upd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' b
3123 upd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is
3124 -- existentially quantified)
3125
3126 data G a b where { G1 { g1::a, g2::c } :: G a [c] }
3127 upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c b
3128 upd4 g x = g { g2=x } -- BAD (g2's type mentions c, which is not a simple
3129 -- type-variable argument in G1's result type)
3130 </programlisting>
3131 </para>
3132
3133 </sect3>
3134
3135
3136 <sect3>
3137 <title>Restrictions</title>
3138
3139 <para>
3140 There are several restrictions on the ways in which existentially-quantified
3141 constructors can be used.
3142 </para>
3143
3144 <para>
3145
3146 <itemizedlist>
3147 <listitem>
3148
3149 <para>
3150 When pattern matching, each pattern match introduces a new,
3151 distinct, type for each existential type variable. These types cannot
3152 be unified with any other type, nor can they escape from the scope of
3153 the pattern match. For example, these fragments are incorrect:
3154
3155
3156 <programlisting>
3157 f1 (MkFoo a f) = a
3158 </programlisting>
3159
3160
3161 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
3162 is the result of <function>f1</function>. One way to see why this is wrong is to
3163 ask what type <function>f1</function> has:
3164
3165
3166 <programlisting>
3167 f1 :: Foo -> a -- Weird!
3168 </programlisting>
3169
3170
3171 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
3172 this:
3173
3174
3175 <programlisting>
3176 f1 :: forall a. Foo -> a -- Wrong!
3177 </programlisting>
3178
3179
3180 The original program is just plain wrong. Here's another sort of error:
3181
3182
3183 <programlisting>
3184 f2 (Baz1 a b) (Baz1 p q) = a==q
3185 </programlisting>
3186
3187
3188 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
3189 <literal>a==q</literal> is wrong because it equates the two distinct types arising
3190 from the two <function>Baz1</function> constructors.
3191
3192
3193 </para>
3194 </listitem>
3195 <listitem>
3196
3197 <para>
3198 You can't pattern-match on an existentially quantified
3199 constructor in a <literal>let</literal> or <literal>where</literal> group of
3200 bindings. So this is illegal:
3201
3202
3203 <programlisting>
3204 f3 x = a==b where { Baz1 a b = x }
3205 </programlisting>
3206
3207 Instead, use a <literal>case</literal> expression:
3208
3209 <programlisting>
3210 f3 x = case x of Baz1 a b -> a==b
3211 </programlisting>
3212
3213 In general, you can only pattern-match
3214 on an existentially-quantified constructor in a <literal>case</literal> expression or
3215 in the patterns of a function definition.
3216
3217 The reason for this restriction is really an implementation one.
3218 Type-checking binding groups is already a nightmare without
3219 existentials complicating the picture. Also an existential pattern
3220 binding at the top level of a module doesn't make sense, because it's
3221 not clear how to prevent the existentially-quantified type "escaping".
3222 So for now, there's a simple-to-state restriction. We'll see how
3223 annoying it is.
3224
3225 </para>
3226 </listitem>
3227 <listitem>
3228
3229 <para>
3230 You can't use existential quantification for <literal>newtype</literal>
3231 declarations. So this is illegal:
3232
3233
3234 <programlisting>
3235 newtype T = forall a. Ord a => MkT a
3236 </programlisting>
3237
3238
3239 Reason: a value of type <literal>T</literal> must be represented as a
3240 pair of a dictionary for <literal>Ord a</literal> and a value of type
3241 <literal>a</literal>. That contradicts the idea that
3242 <literal>newtype</literal> should have no concrete representation.
3243 You can get just the same efficiency and effect by using
3244 <literal>data</literal> instead of <literal>newtype</literal>. If
3245 there is no overloading involved, then there is more of a case for
3246 allowing an existentially-quantified <literal>newtype</literal>,
3247 because the <literal>data</literal> version does carry an
3248 implementation cost, but single-field existentially quantified
3249 constructors aren't much use. So the simple restriction (no
3250 existential stuff on <literal>newtype</literal>) stands, unless there
3251 are convincing reasons to change it.
3252
3253
3254 </para>
3255 </listitem>
3256 <listitem>
3257
3258 <para>
3259 You can't use <literal>deriving</literal> to define instances of a
3260 data type with existentially quantified data constructors.
3261
3262 Reason: in most cases it would not make sense. For example:
3263
3264 <programlisting>
3265 data T = forall a. MkT [a] deriving( Eq )
3266 </programlisting>
3267
3268 To derive <literal>Eq</literal> in the standard way we would need to have equality
3269 between the single component of two <function>MkT</function> constructors:
3270
3271 <programlisting>
3272 instance Eq T where
3273 (MkT a) == (MkT b) = ???
3274 </programlisting>
3275
3276 But <varname>a</varname> and <varname>b</varname> have distinct types, and so can't be compared.
3277 It's just about possible to imagine examples in which the derived instance
3278 would make sense, but it seems altogether simpler simply to prohibit such
3279 declarations. Define your own instances!
3280 </para>
3281 </listitem>
3282
3283 </itemizedlist>
3284
3285 </para>
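<para>
As a sketch of working within these restrictions, the module below writes an
instance by hand instead of using <literal>deriving</literal>, and unpacks the
constructor with <literal>case</literal> rather than a <literal>let</literal>
or <literal>where</literal> binding. The type <literal>Showable</literal> and
the function <literal>render</literal> are hypothetical names.
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Showable = forall a. Show a => MkShowable a

-- A hand-written instance; no deriving clause is involved.
instance Show Showable where
  showsPrec d (MkShowable a) =
    showParen (d > 10) (showString "MkShowable " . showsPrec 11 a)

-- Unpack with case: the hidden type stays inside the alternative.
render :: Showable -> String
render s = case s of MkShowable a -> show a

main :: IO ()
main = do
  print [MkShowable (1 :: Int), MkShowable "two"]  -- prints [MkShowable 1,MkShowable "two"]
  putStrLn (render (MkShowable True))              -- prints True
</programlisting>
</para>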
3286
3287 </sect3>
3288 </sect2>
3289
3290 <!-- ====================== Generalised algebraic data types ======================= -->
3291
3292 <sect2 id="gadt-style">
3293 <title>Declaring data types with explicit constructor signatures</title>
3294
3295 <para>When the <literal>GADTSyntax</literal> extension is enabled,
3296 GHC allows you to declare an algebraic data type by
3297 giving the type signatures of constructors explicitly. For example:
3298 <programlisting>
3299 data Maybe a where
3300 Nothing :: Maybe a
3301 Just :: a -> Maybe a
3302 </programlisting>
3303 The form is called a "GADT-style declaration"
3304 because Generalised Algebraic Data Types, described in <xref linkend="gadt"/>,
3305 can only be declared using this form.</para>
3306 <para>Notice that GADT-style syntax generalises existential types (<xref linkend="existential-quantification"/>).
3307 For example, these two declarations are equivalent:
3308 <programlisting>
3309 data Foo = forall a. MkFoo a (a -> Bool)
3310 data Foo' where { MkFoo' :: a -> (a->Bool) -> Foo' }
3311 </programlisting>
3312 </para>
3313 <para>Any data type that can be declared in standard Haskell-98 syntax
3314 can also be declared using GADT-style syntax.
3315 The choice is largely stylistic, but GADT-style declarations differ in one important respect:
3316 they treat class constraints on the data constructors differently.
3317 Specifically, if the constructor is given a type-class context, that
3318 context is made available by pattern matching. For example:
3319 <programlisting>
3320 data Set a where
3321 MkSet :: Eq a => [a] -> Set a
3322
3323 makeSet :: Eq a => [a] -> Set a
3324 makeSet xs = MkSet (nub xs)
3325
3326 insert :: a -> Set a -> Set a
3327 insert a (MkSet as) | a `elem` as = MkSet as
3328 | otherwise = MkSet (a:as)
3329 </programlisting>
3330 A use of <literal>MkSet</literal> as a constructor (e.g. in the definition of <literal>makeSet</literal>)
3331 gives rise to a <literal>(Eq a)</literal>
3332 constraint, as you would expect. The new feature is that pattern-matching on <literal>MkSet</literal>
3333 (as in the definition of <literal>insert</literal>) makes <emphasis>available</emphasis> an <literal>(Eq a)</literal>
3334 context. In implementation terms, the <literal>MkSet</literal> constructor has a hidden field that stores
3335 the <literal>(Eq a)</literal> dictionary that is passed to <literal>MkSet</literal>; so
3336 when pattern-matching that dictionary becomes available for the right-hand side of the match.
3337 In the example, the equality dictionary is used to satisfy the equality constraint
3338 generated by the call to <literal>elem</literal>, so that the type of
3339 <literal>insert</literal> itself has no <literal>Eq</literal> constraint.
3340 </para>
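<para>
A complete version of the <literal>Set</literal> example might look like the
sketch below. It assumes <option>-XGADTs</option> is enabled so that the
constructor may carry a context, and it adds a hypothetical helper
<literal>toList</literal> so that the result can be printed.
<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

import Data.List (nub)

data Set a where
  MkSet :: Eq a => [a] -> Set a

makeSet :: Eq a => [a] -> Set a
makeSet xs = MkSet (nub xs)

-- No Eq constraint here: the dictionary comes from the MkSet pattern.
insert :: a -> Set a -> Set a
insert a (MkSet as) | a `elem` as = MkSet as
                    | otherwise   = MkSet (a:as)

toList :: Set a -> [a]
toList (MkSet as) = as

main :: IO ()
main = print (toList (insert 3 (insert 1 (makeSet [1, 2, 2 :: Int]))))  -- prints [3,1,2]
</programlisting>
</para>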
3341 <para>
3342 For example, one possible application is to reify dictionaries:
3343 <programlisting>
3344 data NumInst a where
3345 MkNumInst :: Num a => NumInst a
3346
3347 intInst :: NumInst Int
3348 intInst = MkNumInst
3349
3350 plus :: NumInst a -> a -> a -> a
3351 plus MkNumInst p q = p + q
3352 </programlisting>
3353 Here, a value of type <literal>NumInst a</literal> is equivalent
3354 to an explicit <literal>(Num a)</literal> dictionary.
3355 </para>
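<para>
To show the reified dictionary being passed around explicitly, here is a short
sketch that assumes <option>-XGADTs</option>; the function
<literal>sumWith</literal> is invented for this example.
<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

data NumInst a where
  MkNumInst :: Num a => NumInst a

intInst :: NumInst Int
intInst = MkNumInst

-- The Num dictionary is supplied by the argument, not by a class constraint.
sumWith :: NumInst a -> [a] -> a
sumWith MkNumInst = foldr (+) 0

main :: IO ()
main = print (sumWith intInst [1, 2, 3])  -- prints 6
</programlisting>
</para>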
3356 <para>
3357 All this applies to constructors declared using the syntax of <xref linkend="existential-with-context"/>.
3358 For example, the <literal>NumInst</literal> data type above could equivalently be declared
3359 like this:
3360 <programlisting>
3361 data NumInst a
3362 = Num a => MkNumInst
3363 </programlisting>
3364 Notice that, unlike the situation when declaring an existential, there is
3365 no <literal>forall</literal>, because the <literal>Num</literal> constrains the
3366 data type's universally quantified type variable <literal>a</literal>.
3367 A constructor may have both universal and existential type variables: for example,
3368 the following two declarations are equivalent:
3369 <programlisting>
3370 data T1 a
3371 = forall b. (Num a, Eq b) => MkT1 a b
3372 data T2 a where
3373 MkT2 :: (Num a, Eq b) => a -> b -> T2 a
3374 </programlisting>
3375 </para>
3376 <para>All this behaviour contrasts with Haskell 98's peculiar treatment of
3377 contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report).
3378 In Haskell 98 the definition
3379 <programlisting>
3380 data Eq a => Set' a = MkSet' [a]
3381 </programlisting>
3382 gives <literal>MkSet'</literal> the same type as <literal>MkSet</literal> above. But instead of
3383 <emphasis>making available</emphasis> an <literal>(Eq a)</literal> constraint, pattern-matching
3384 on <literal>MkSet'</literal> <emphasis>requires</emphasis> an <literal>(Eq a)</literal> constraint!
3385 GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,
3386 GHC's behaviour is much more useful, as well as much more intuitive.
3387 </para>
3388
3389 <para>
3390 The rest of this section gives further details about GADT-style data
3391 type declarations.
3392
3393 <itemizedlist>
3394 <listitem><para>
3395 The result type of each data constructor must begin with the type constructor being defined.
3396 If the result type of all constructors
3397 has the form <literal>T a1 ... an</literal>, where <literal>a1 ... an</literal>
3398 are distinct type variables, then the data type is <emphasis>ordinary</emphasis>;
3399 otherwise it is a <emphasis>generalised</emphasis> data type (<xref linkend="gadt"/>).
3400 </para></listitem>
3401
3402 <listitem><para>
3403 As with other type signatures, you can give a single signature for several data constructors.
3404 In this example we give a single signature for <literal>T1</literal> and <literal>T2</literal>:
3405 <programlisting>
3406 data T a where
3407 T1,T2 :: a -> T a
3408 T3 :: T a
3409 </programlisting>
3410 </para></listitem>
3411
3412 <listitem><para>
3413 The type signature of
3414 each constructor is independent, and is implicitly universally quantified as usual.
3415 In particular, the type variable(s) in the "<literal>data T a where</literal>" header
3416 have no scope, and different constructors may have different universally-quantified type variables:
3417 <programlisting>
3418 data T a where -- The 'a' has no scope
3419 T1,T2 :: b -> T b -- Means forall b. b -> T b
3420 T3 :: T a -- Means forall a. T a
3421 </programlisting>
3422 </para></listitem>
3423
3424 <listitem><para>
3425 A constructor signature may mention type class constraints, which can differ for
3426 different constructors. For example, this is fine:
3427 <programlisting>
3428 data T a where
3429 T1 :: Eq b => b -> b -> T b
3430 T2 :: (Show c, Ix c) => c -> [c] -> T c
3431 </programlisting>
3432 When pattern matching, these constraints are made available to discharge constraints
3433 in the body of the match. For example:
3434 <programlisting>
3435 f :: T a -> String
3436 f (T1 x y) | x==y = "yes"
3437 | otherwise = "no"
3438 f (T2 a b) = show a
3439 </programlisting>
3440 Note that <literal>f</literal> is not overloaded; the <literal>Eq</literal> constraint arising
3441 from the use of <literal>==</literal> is discharged by the pattern match on <literal>T1</literal>
3442 and similarly the <literal>Show</literal> constraint arising from the use of <literal>show</literal>.
3443 </para></listitem>
3444
3445 <listitem><para>
3446 Unlike a Haskell-98-style
3447 data type declaration, the type variable(s) in the "<literal>data Set a where</literal>" header
3448 have no scope. Indeed, one can write a kind signature instead:
3449 <programlisting>
3450 data Set :: * -> * where ...
3451 </programlisting>
3452 or even a mixture of the two:
3453 <programlisting>
3454 data Bar a :: (* -> *) -> * where ...
3455 </programlisting>
The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Bar</literal>
like this:
3458 <programlisting>
3459 data Bar a (b :: * -> *) where ...
3460 </programlisting>
3461 </para></listitem>
3462
3463
3464 <listitem><para>
3465 You can use strictness annotations, in the obvious places
3466 in the constructor type:
3467 <programlisting>
  data Term a where
      Lit    :: !Int -> Term Int
      If     :: Term Bool -> !(Term a) -> !(Term a) -> Term a
      Pair   :: Term a -> Term b -> Term (a,b)
3472 </programlisting>
3473 </para></listitem>
3474
3475 <listitem><para>
3476 You can use a <literal>deriving</literal> clause on a GADT-style data type
declaration. For example, these two declarations are equivalent:
3478 <programlisting>
  data Maybe1 a where {
      Nothing1 :: Maybe1 a ;
      Just1    :: a -> Maybe1 a
    } deriving( Eq, Ord )

  data Maybe2 a = Nothing2 | Just2 a
       deriving( Eq, Ord )
3486 </programlisting>
3487 </para></listitem>
3488
3489 <listitem><para>
3490 The type signature may have quantified type variables that do not appear
3491 in the result type:
3492 <programlisting>
  data Foo where
    MkFoo :: a -> (a->Bool) -> Foo
    Nil   :: Foo
3496 </programlisting>
3497 Here the type variable <literal>a</literal> does not appear in the result type
3498 of either constructor.
3499 Although it is universally quantified in the type of the constructor, such
3500 a type variable is often called "existential".
3501 Indeed, the above declaration declares precisely the same type as
3502 the <literal>data Foo</literal> in <xref linkend="existential-quantification"/>.
3503 </para><para>
3504 The type may contain a class context too, of course:
3505 <programlisting>
  data Showable where
    MkShowable :: Show a => a -> Showable
3508 </programlisting>
3509 </para></listitem>
3510
3511 <listitem><para>
3512 You can use record syntax on a GADT-style data type declaration:
3513
3514 <programlisting>
  data Person where
      Adult :: { name :: String, children :: [Person] } -> Person
      Child :: Show a => { name :: !String, funny :: a } -> Person
3518 </programlisting>
3519 As usual, for every constructor that has a field <literal>f</literal>, the type of
3520 field <literal>f</literal> must be the same (modulo alpha conversion).
3521 The <literal>Child</literal> constructor above shows that the signature
3522 may have a context, existentially-quantified variables, and strictness annotations,
3523 just as in the non-record case. (NB: the "type" that follows the double-colon
3524 is not really a type, because of the record syntax and strictness annotations.
3525 A "type" of this form can appear only in a constructor signature.)
3526 </para></listitem>
3527
3528 <listitem><para>
Record updates are allowed with GADT-style declarations,
but only for fields that have the following property: the type of the field
mentions no existential type variables.
3532 </para></listitem>
3533
3534 <listitem><para>
3535 As in the case of existentials declared using the Haskell-98-like record syntax
3536 (<xref linkend="existential-records"/>),
3537 record-selector functions are generated only for those fields that have well-typed
3538 selectors.
3539 Here is the example of that section, in GADT-style syntax:
3540 <programlisting>
  data Counter a where
      NewCounter :: { _this    :: self
                    , _inc     :: self -> self
                    , _display :: self -> IO ()
                    , tag      :: a
                    } -> Counter a
3547 </programlisting>
3548 As before, only one selector function is generated here, that for <literal>tag</literal>.
3549 Nevertheless, you can still use all the field names in pattern matching and record construction.
3550 </para></listitem>
3551
3552 <listitem><para>
3553 In a GADT-style data type declaration there is no obvious way to specify that a data constructor
3554 should be infix, which makes a difference if you derive <literal>Show</literal> for the type.
3555 (Data constructors declared infix are displayed infix by the derived <literal>show</literal>.)
3556 So GHC implements the following design: a data constructor declared in a GADT-style data type
3557 declaration is displayed infix by <literal>Show</literal> iff (a) it is an operator symbol,
(b) it has two arguments, (c) it has a programmer-supplied fixity declaration.  For example:
<programlisting>
  infix 6 :--:
  data T a where
    (:--:) :: Int -> Bool -> T Int
</programlisting>
3564 </para></listitem>
3565 </itemizedlist></para>
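<para>
As a minimal sketch of several of the points above, the following module declares
two types in GADT-style syntax and pattern-matches on them. The names
<literal>Box</literal>, <literal>EqBox</literal>, <literal>ShowBox</literal>,
<literal>describe</literal> and <literal>render</literal> are invented for this
illustration; <literal>Showable</literal> is the type shown earlier.
<programlisting>
  {-# LANGUAGE GADTs #-}
  module Main where

  -- Hypothetical type in GADT-style syntax.  The 'a' in the header has no
  -- scope; each constructor signature is quantified independently.
  data Box a where
    EqBox   :: Eq b   => b -> b -> Box b   -- packs an Eq dictionary
    ShowBox :: Show c => c      -> Box c   -- packs a Show dictionary

  -- 'describe' is not overloaded: matching on a constructor brings that
  -- constructor's context into scope for the right-hand side.
  describe :: Box a -> String
  describe (EqBox x y)
    | x == y    = "equal"
    | otherwise = "different"
  describe (ShowBox v) = show v

  -- The variable 'a' does not appear in the result type, so it is existential.
  data Showable where
    MkShowable :: Show a => a -> Showable

  render :: Showable -> String
  render (MkShowable v) = show v

  main :: IO ()
  main = do
    putStrLn (describe (EqBox 'x' 'x'))
    putStrLn (describe (ShowBox (3 :: Int)))
    mapM_ (putStrLn . render) [MkShowable True, MkShowable "hi"]
</programlisting>
The list passed to <literal>mapM_</literal> mixes a <literal>Bool</literal> and a
<literal>String</literal>; this is accepted because both are wrapped in the same
existential constructor.
</para>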
3566 </sect2>
3567
3568 <sect2 id="gadt">
3569 <title>Generalised Algebraic Data Types (GADTs)</title>
3570
3571 <para>Generalised Algebraic Data Types generalise ordinary algebraic data types
3572 by allowing constructors to have richer return types. Here is an example:
3573 <programlisting>
  data Term a where
      Lit    :: Int -> Term Int
      Succ   :: Term Int -> Term Int
      IsZero :: Term Int -> Term Bool
      If     :: Term Bool -> Term a -> Term a -> Term a
      Pair   :: Term a -> Term b -> Term (a,b)
3580 </programlisting>
3581 Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the
3582 case with ordinary data types. This generality allows us to
3583 write a well-typed <literal>eval</literal> function
3584 for these <literal>Terms</literal>:
3585 <programlisting>
  eval :: Term a -> a
  eval (Lit i)      = i
  eval (Succ t)     = 1 + eval t
  eval (IsZero t)   = eval t == 0
  eval (If b e1 e2) = if eval b then eval e1 else eval e2
  eval (Pair e1 e2) = (eval e1, eval e2)
3592 </programlisting>
3593 The key point about GADTs is that <emphasis>pattern matching causes type refinement</emphasis>.
3594 For example, in the right hand side of the equation
3595 <programlisting>
3596 eval :: Term a -> a
3597 eval (Lit i) = ...
3598 </programlisting>
3599 the type <literal>a</literal> is refined to <literal>Int</literal>. That's the whole point!
A precise specification of the type rules is beyond what this user manual aspires to,
but the design closely follows that described in
the paper <ulink
url="http://research.microsoft.com/%7Esimonpj/papers/gadt/">Simple
unification-based type inference for GADTs</ulink>
(ICFP 2006).
3606 The general principle is this: <emphasis>type refinement is only carried out
3607 based on user-supplied type annotations</emphasis>.
3608 So if no type signature is supplied for <literal>eval</literal>, no type refinement happens,
3609 and lots of obscure error messages will
3610 occur. However, the refinement is quite general. For example, if we had:
3611 <programlisting>
3612 eval :: Term a -> a -> a
3613 eval (Lit i) j = i+j
3614 </programlisting>
the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type
of the constructor <literal>Lit</literal>), and that refinement also applies to the type of <literal>j</literal>, and
to the result type of the equation (in effect, the result of a <literal>case</literal> on the first argument). Hence the addition <literal>i+j</literal> is legal.
3618 </para>
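<para>
The following self-contained sketch shows the refinement applying to a second
argument as well as to the result. It uses a cut-down <literal>Term</literal> with
only two constructors; <literal>value</literal> and <literal>combine</literal> are
hypothetical names chosen for this illustration.
<programlisting>
  {-# LANGUAGE GADTs #-}
  module Main where

  data Term a where
    Lit    :: Int -> Term Int
    IsZero :: Term Int -> Term Bool

  value :: Term a -> a
  value (Lit i)    = i
  value (IsZero t) = value t == 0

  -- The signature makes 'a' rigid, so each match can refine it.
  combine :: Term a -> a -> a
  combine (Lit i)    j = i + j                -- here a is Int, so j :: Int
  combine (IsZero t) b = b || value t == 0    -- here a is Bool, so b :: Bool

  main :: IO ()
  main = do
    print (combine (Lit 2) 40)
    print (combine (IsZero (Lit 0)) False)
</programlisting>
Running this should print <literal>42</literal> and then <literal>True</literal>.
If the signature for <literal>combine</literal> is removed, no refinement takes
place and the definition is rejected.
</para>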
3619 <para>
3620 These and many other examples are given in papers by Hongwei Xi, and
3621 Tim Sheard. There is a longer introduction
3622 <ulink url="http://www.haskell.org/haskellwiki/GADT">on the wiki</ulink>,
3623 and Ralf Hinze's
3624 <ulink url="http://www.informatik.uni-bonn.de/~ralf/publications/With.pdf">Fun with phantom types</ulink> also has a number of examples. Note that papers
3625 may use different notation to that implemented in GHC.
3626 </para>
3627 <para>
3628 The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with
3629 <option>-XGADTs</option>. The <option>-XGADTs</option> flag also sets <option>-XRelaxedPolyRec</option>.
3630 <itemizedlist>
3631 <listitem><para>
3632 A GADT can only be declared using GADT-style syntax (<xref linkend="gadt-style"/>);
3633 the old Haskell-98 syntax for data declarations always declares an ordinary data type.
3634 The result type of each constructor must begin with the type constructor being defined,
3635 but for a GADT the arguments to the type constructor can be arbitrary monotypes.
3636 For example, in the <literal>Term</literal> data
3637 type above, the type of each constructor must end with <literal>Term ty</literal>, but
3638 the <literal>ty</literal> need not be a type variable (e.g. the <literal>Lit</literal>
3639 constructor).
3640 </para></listitem>
3641
3642 <listitem><para>
3643 It is permitted to declare an ordinary algebraic data type using GADT-style syntax.
3644 What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors
3645 whose result type is not just <literal>T a b</literal>.
3646 </para></listitem>
3647
3648 <listitem><para>
3649 You cannot use a <literal>deriving</literal> clause for a GADT; only for
3650 an ordinary data type.
3651 </para></listitem>
3652
3653 <listitem><para>
3654 As mentioned in <xref linkend="gadt-style"/>, record syntax is supported.
3655 For example:
3656 <programlisting>
  data Term a where
      Lit    :: { val  :: Int }      -> Term Int
      Succ   :: { num  :: Term Int } -> Term Int
      Pred   :: { num  :: Term Int } -> Term Int
      IsZero :: { arg  :: Term Int } -> Term Bool
      Pair   :: { arg1 :: Term a
                , arg2 :: Term b
                }                    -> Term (a,b)
      If     :: { cnd  :: Term Bool
                , tru  :: Term a
                , fls  :: Term a
                }                    -> Term a
3669 </programlisting>
However, for GADTs there is the following additional constraint:
every constructor that has a field <literal>f</literal> must have
the same result type (modulo alpha conversion).
Hence, in the above example, we cannot merge the <literal>num</literal>
and <literal>arg</literal> fields into a
single name.  Although their field types are both <literal>Term Int</literal>,
their selector functions actually have different types:
3677
3678 <programlisting>
3679 num :: Term Int -> Term Int
3680 arg :: Term Bool -> Term Int
3681 </programlisting>
3682 </para></listitem>
3683
3684 <listitem><para>
3685 When pattern-matching against data constructors drawn from a GADT,
3686 for example in a <literal>case</literal> expression, the following rules apply:
3687 <itemizedlist>
3688 <listitem><para>The type of the scrutinee must be rigid.</para></listitem>
3689 <listitem><para>The type of the entire <literal>case</literal> expression must be rigid.</para></listitem>
3690 <listitem><para>The type of any free variable mentioned in any of
3691 the <literal>case</literal> alternatives must be rigid.</para></listitem>
3692 </itemizedlist>
3693 A type is "rigid" if it is completely known to the compiler at its binding site. The easiest
way to ensure that a variable has a rigid type is to give it a type signature.
3695 For more precise details see <ulink url="http://research.microsoft.com/%7Esimonpj/papers/gadt">
3696 Simple unification-based type inference for GADTs
3697 </ulink>. The criteria implemented by GHC are given in the Appendix.
3698
3699 </para></listitem>
3700
3701 </itemizedlist>
3702 </para>
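<para>
As a small usage sketch of record syntax on a GADT, the module below keeps only
four of the constructors from the record example and applies the generated
selectors. The choice of constructors and the <literal>main</literal> function are
ours, for illustration only.
<programlisting>
  {-# LANGUAGE GADTs #-}
  module Main where

  data Term a where
    Lit    :: { val :: Int }      -> Term Int
    Succ   :: { num :: Term Int } -> Term Int
    Pred   :: { num :: Term Int } -> Term Int
    IsZero :: { arg :: Term Int } -> Term Bool

  main :: IO ()
  main = do
    print (val (num (Succ (Lit 3))))          -- num :: Term Int  -> Term Int
    print (val (arg (IsZero (Lit 7))))        -- arg :: Term Bool -> Term Int
    print (val (num (Pred { num = Lit 5 })))  -- record construction syntax
</programlisting>
The field <literal>num</literal> can be shared by <literal>Succ</literal> and
<literal>Pred</literal> because both constructors have result type
<literal>Term Int</literal>. Note that the selectors are partial: applying
<literal>val</literal> to a <literal>Succ</literal> term fails at run time.
</para>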
3703
3704 </sect2>
3705 </sect1>
3706
3707 <!-- ====================== End of Generalised algebraic data types ======================= -->
3708
3709 <sect1 id="deriving">
3710 <title>Extensions to the "deriving" mechanism</title>
3711
3712 <sect2 id="deriving-inferred">
3713 <title>Inferred context for deriving clauses</title>
3714
3715 <para>
3716 The Haskell Report is vague about exactly when a <literal>deriving</literal> clause is
3717 legal. For example:
3718 <programlisting>
3719 data T0 f a = MkT0 a deriving( Eq )
3720 data T1 f a = MkT1 (f a) deriving( Eq )
3721 data T2 f a = MkT2 (f (f a)) deriving( Eq )
3722 </programlisting>
3723 The natural generated <literal>Eq</literal> code would result in these instance declarations:
3724 <programlisting>
3725 instance Eq a => Eq (T0 f a) where ...
3726 instance Eq (f a) => Eq (T1 f a) where ...
3727 instance Eq (f (f a)) => Eq (T2 f a) where ...
3728 </programlisting>
3729 The first of these is obviously fine. The second is still fine, although less obviously.
3730 The third is not Haskell 98, and risks losing termination of instances.
3731 </para>
3732 <para>
3733 GHC takes a conservative position: it accepts the first two, but not the third. The rule is this:
3734 each constraint in the inferred instance context must consist only of type variables,
3735 with no repetitions.
3736 </para>
3737 <para>
3738 This rule is applied regardless of flags. If you want a more exotic context, you can write
3739 it yourself, using the <link linkend="stand-alone-deriving">standalone deriving mechanism</link>.
3740 </para>
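<para>
A sketch of how the two mechanisms combine: <literal>T1</literal> uses an ordinary
<literal>deriving</literal> clause with an inferred context, while the context
rejected for <literal>T2</literal> is written by hand in a standalone declaration.
We assume here that <option>-XFlexibleContexts</option> and
<option>-XUndecidableInstances</option> are needed to accept the hand-written context.
<programlisting>
  {-# LANGUAGE StandaloneDeriving, FlexibleContexts, UndecidableInstances #-}
  module Main where

  -- Inferred context: Eq (f a)
  data T1 f a = MkT1 (f a) deriving (Eq)

  -- The inferred context Eq (f (f a)) would be rejected, so we supply it.
  data T2 f a = MkT2 (f (f a))
  deriving instance Eq (f (f a)) => Eq (T2 f a)

  main :: IO ()
  main = do
    print (MkT1 (Just 'x') == MkT1 (Just 'x'))
    print (MkT2 [[1 :: Int]] == MkT2 [[2]])
</programlisting>
</para>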
3741 </sect2>
3742
3743 <sect2 id="stand-alone-deriving">
3744 <title>Stand-alone deriving declarations</title>
3745
3746 <para>
3747 GHC now allows stand-alone <literal>deriving</literal> declarations, enabled by <literal>-XStandaloneDeriving</literal>:
3748 <programlisting>
3749 data Foo a = Bar a | Baz String
3750
3751 deriving instance Eq a => Eq (Foo a)
3752 </programlisting>
3753 The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword
3754 <literal>deriving</literal>, and (b) the absence of the <literal>where</literal> part.
3755 Note the following points:
3756 <itemizedlist>
3757 <listitem><para>
3758 You must supply an explicit context (in the example the context is <literal>(Eq a)</literal>),
3759 exactly as you would in an ordinary instance declaration.
3760 (In contrast, in a <literal>deriving</literal> clause
3761 attached to a data type declaration, the context is inferred.)
3762 </para></listitem>
3763
3764 <listitem><para>
3765 A <literal>deriving instance</literal> declaration
3766 must obey the same rules concerning form and termination as ordinary instance declarations,
3767 controlled by the same flags; see <xref linkend="instance-decls"/>.
3768 </para></listitem>
3769
3770 <listitem><para>
3771 Unlike a <literal>deriving</literal>
3772 declaration attached to a <literal>data</literal> declaration, the instance can be more specific
3773 than the data type (assuming you also use
3774 <literal>-XFlexibleInstances</literal>, <xref linkend="instance-rules"/>). Consider
3775 for example
3776 <programlisting>
3777 data Foo a = Bar a | Baz String
3778
3779 deriving instance Eq a => Eq (Foo [a])
3780 deriving instance Eq a => Eq (Foo (Maybe a))
3781 </programlisting>
3782 This will generate a derived instance for <literal>(Foo [a])</literal> and <literal>(Foo (Maybe a))</literal>,
3783 but other types such as <literal>(Foo (Int,Bool))</literal> will not be an instance of <literal>Eq</literal>.
3784 </para></listitem>
3785
3786 <listitem><para>
3787 Unlike a <literal>deriving</literal>
3788 declaration attached to a <literal>data</literal> declaration,
3789 GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate
3790 boilerplate code for the specified class, and typechecks it. If there is a type error, it is
3791 your problem. (GHC will show you the offending code if it has a type error.)
3792 The merit of this is that you can derive instances for GADTs and other exotic
3793 data types, providing only that the boilerplate code does indeed typecheck. For example:
3794 <programlisting>
  data T a where
     T1 :: T Int
     T2 :: T Bool

  deriving instance Show (T a)
3800 </programlisting>
3801 In this example, you cannot say <literal>... deriving( Show )</literal> on the
3802 data type declaration for <literal>T</literal>,
3803 because <literal>T</literal> is a GADT, but you <emphasis>can</emphasis> generate
3804 the instance declaration using stand-alone deriving.
3805 </para>
3806 </listitem>
3807
3808 <listitem>
3809 <para>The stand-alone syntax is generalised for newtypes in exactly the same
3810 way that ordinary <literal>deriving</literal> clauses are generalised (<xref linkend="newtype-deriving"/>).
3811 For example:
3812 <programlisting>
3813 newtype Foo a = MkFoo (State Int a)
3814
3815 deriving instance MonadState Int Foo
3816 </programlisting>
3817 GHC always treats the <emphasis>last</emphasis> parameter of the instance
3818 (<literal>Foo</literal> in this example) as the type whose instance is being derived.
3819 </para></listitem>
3820 </itemizedlist></para>
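<para>
The points above can be tried out with a module along the following lines. It is
only a sketch: it reuses the <literal>T</literal> and <literal>Foo</literal>
declarations from this section and adds a <literal>main</literal> of our own.
<programlisting>
  {-# LANGUAGE GADTs, StandaloneDeriving, FlexibleInstances #-}
  module Main where

  data T a where
    T1 :: T Int
    T2 :: T Bool

  -- Not expressible as a deriving clause, because T is a GADT.
  deriving instance Show (T a)

  data Foo a = Bar a | Baz String

  -- More specific than the data type: only Foo [a] gets an Eq instance.
  deriving instance Eq a => Eq (Foo [a])

  main :: IO ()
  main = do
    print T1
    print (Bar "x" == Baz "x")
</programlisting>
Comparing two values of type <literal>Foo Int</literal> with
<literal>==</literal> would be a type error here, since no instance covers that type.
</para>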
3821
3822 </sect2>
3823
3824
3825 <sect2 id="deriving-typeable">
3826 <title>Deriving clause for extra classes (<literal>Typeable</literal>, <literal>Data</literal>, etc)</title>
3827
3828 <para>
3829 Haskell 98 allows the programmer to add "<literal>deriving( Eq, Ord )</literal>" to a data type
3830 declaration, to generate a standard instance declaration for classes specified in the <literal>deriving</literal> clause.
3831 In Haskell 98, the only classes that may appear in the <literal>deriving</literal> clause are the standard
3832 classes <literal>Eq</literal>, <literal>Ord</literal>,
3833 <literal>Enum</literal>, <literal>Ix</literal>, <literal>Bounded</literal>, <literal>Read</literal>, and <literal>Show</literal>.
3834 </para>
3835 <para>
3836 GHC extends this list with several more classes that may be automatically derived:
3837 <itemizedlist>
3838 <listitem><para> With <option>-XDeriveDataTypeable</option>, you can derive instances of the classes
<literal>Typeable</literal> and <literal>Data</literal>, defined in the library
3840 modules <literal>Data.Typeable</literal> and <literal>Data.Data</literal> respectively.
3841 </para>
3842 <para>Since GHC 7.8.1, <literal>Typeable</literal> is kind-polymorphic (see
3843 <xref linkend="kind-polymorphism"/>) and can be derived for any datatype and
3844 type class. Instances for datatypes can be derived by attaching a
3845 <literal>deriving Typeable</literal> clause to the datatype declaration, or by
3846 using standalone deriving (see <xref linkend="stand-alone-deriving"/>).
3847 Instances for type classes can only be derived using standalone deriving.
3848 For data families, <literal>Typeable</literal> should only be derived for the
3849 uninstantiated family type; each instance will then automatically have a
3850 <literal>Typeable</literal> instance too.
3851 See also <xref linkend="auto-derive-typeable"/>.
3852 </para>
3853 <para>
Also since GHC 7.8.1, handwritten (i.e. not derived) instances of
3855 <literal>Typeable</literal> are forbidden, and will result in an error.
3856 </para>
3857 </listitem>
3858
3859 <listitem><para> With <option>-XDeriveGeneric</option>, you can derive
3860 instances of the classes <literal>Generic</literal> and
3861 <literal>Generic1</literal>, defined in <literal>GHC.Generics</literal>.
3862 You can use these to define generic functions,
3863 as described in <xref linkend="generic-programming"/>.
3864 </para></listitem>
3865
3866 <listitem><para> With <option>-XDeriveFunctor</option>, you can derive instances of
3867 the class <literal>Functor</literal>,
3868 defined in <literal>GHC.Base</literal>.
3869 </para></listitem>
3870
3871 <listitem><para> With <option>-XDeriveFoldable</option>, you can derive instances of
3872 the class <literal>Foldable</literal>,
3873 defined in <literal>Data.Foldable</literal>.
3874 </para></listitem>
3875
3876 <listitem><para> With <option>-XDeriveTraversable</option>, you can derive instances of
3877 the class <literal>Traversable</literal>,
3878 defined in <literal>Data.Traversable</literal>.
3879 </para></listitem>
3880 </itemizedlist>
3881 In each case the appropriate class must be in scope before it
3882 can be mentioned in the <literal>deriving</literal> clause.
3883 </para>
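<para>
For instance, the following sketch derives three of these classes for a small
binary tree. The <literal>Tree</literal> type and the helper
<literal>positive</literal> are hypothetical; the explicit imports bring the
classes into scope, as required above.
<programlisting>
  {-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}
  module Main where

  import Data.Foldable    (Foldable, toList)
  import Data.Traversable (Traversable, traverse)

  data Tree a = Leaf | Node (Tree a) a (Tree a)
    deriving (Show, Functor, Foldable, Traversable)

  -- Hypothetical check used to exercise 'traverse'.
  positive :: Int -> Maybe Int
  positive x = if x > 0 then Just x else Nothing

  main :: IO ()
  main = do
    let t = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf) :: Tree Int
    print (fmap (* 2) t)         -- uses the derived Functor instance
    print (toList t)             -- uses the derived Foldable instance
    print (traverse positive t)  -- uses the derived Traversable instance
</programlisting>
</para>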
3884 </sect2>
3885
3886 <sect2 id="auto-derive-typeable">
3887 <title>Automatically deriving <literal>Typeable</literal> instances</title>
3888
3889 <para>
The flag <option>-XAutoDeriveTypeable</option> triggers the generation
of derived <literal>Typeable</literal> instances for every datatype and type
class declaration in the module in which it is used. It will also generate
3893 <literal>Typeable</literal> instances for any promoted data constructors
3894 (<xref linkend="promotion"/>). This flag implies
3895 <option>-XDeriveDataTypeable</option> (<xref linkend="deriving-typeable"/>).
3896 </para>
3897
3898 </sect2>
3899
3900 <sect2 id="newtype-deriving">
3901 <title>Generalised derived instances for newtypes</title>
3902
3903 <para>
3904 When you define an abstract type using <literal>newtype</literal>, you may want
3905 the new type to inherit some instances from its representation. In
3906 Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>,
3907 <literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any
3908 other classes you have to write an explicit instance declaration. For
3909 example, if you define
3910
3911 <programlisting>
3912 newtype Dollars = Dollars Int
3913 </programlisting>
3914
3915 and you want to use arithmetic on <literal>Dollars</literal>, you have to
3916 explicitly define an instance of <literal>Num</literal>:
3917
3918 <programlisting>
  instance Num Dollars where
    Dollars a + Dollars b = Dollars (a+b)
    ...
3922 </programlisting>
3923 All the instance does is apply and remove the <literal>newtype</literal>
3924 constructor. It is particularly galling that, since the constructor
3925 doesn't appear at run-time, this instance declaration defines a
3926 dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal>
3927 dictionary, only slower!
3928 </para>
3929
3930
3931 <sect3 id="generalized-newtype-deriving"> <title> Generalising the deriving clause </title>
3932 <para>
3933 GHC now permits such instances to be derived instead,
3934 using the flag <option>-XGeneralizedNewtypeDeriving</option>,
3935 so one can write
3936 <programlisting>
3937 newtype Dollars = Dollars Int deriving (Eq,Show,Num)
3938 </programlisting>
3939
3940 and the implementation uses the <emphasis>same</emphasis> <literal>Num</literal> dictionary
3941 for <literal>Dollars</literal> as for <literal>Int</literal>. Notionally, the compiler
3942 derives an instance declaration of the form
3943
3944 <programlisting>
3945 instance Num Int => Num Dollars
3946 </programlisting>
3947
3948 which just adds or removes the <literal>newtype</literal> constructor according to the type.
3949 </para>
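<para>
A complete module using the derived instance might look like this; the
<literal>main</literal> function is our addition, for illustration only.
<programlisting>
  {-# LANGUAGE GeneralizedNewtypeDeriving #-}
  module Main where

  -- Eq and Show are derived in the usual way; Num reuses the Int dictionary.
  newtype Dollars = Dollars Int deriving (Eq, Show, Num)

  main :: IO ()
  main = do
    print (Dollars 3 + Dollars 4)
    print (Dollars 2 * 5 == Dollars 10)  -- the literal 5 goes through fromInteger
</programlisting>
This should print <literal>Dollars 7</literal> followed by <literal>True</literal>.
</para>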
3950 <para>
3951
3952 We can also derive instances of constructor classes in a similar
3953 way. For example, suppose we have implemented state and failure monad
3954 transformers, such that
3955
3956 <programlisting>
3957 instance Monad m => Monad (State s m)
3958 instance Monad m => Monad (Failure m)
3959 </programlisting>
3960 In Haskell 98, we can define a parsing monad by
3961 <programlisting>
3962 type Parser tok m a = State [tok] (Failure m) a
3963 </programlisting>
3964
3965 which is automatically a monad thanks to the instance declarations
3966 above. With the extension, we can make the parser type abstract,
3967 without needing to write an instance of class <literal>Monad</literal>, via
3968
3969 <programlisting>
  newtype Parser tok m a = Parser (State [tok] (Failure m) a)
                         deriving Monad
3972 </programlisting>
3973 In this case the derived instance declaration is of the form
3974 <programlisting>
3975 instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)
3976 </programlisting>
3977
3978 Notice that, since <literal>Monad</literal> is a constructor class, the
3979 instance is a <emphasis>partial application</emphasis> of the new type, not the
3980 entire left hand side. We can imagine that the type declaration is
3981 "eta-converted" to generate the context of the instance
3982 declaration.
3983 </para>
3984 <para>
3985
We can even derive instances of multi-parameter classes, provided the
newtype is the last class parameter. In this case, a "partial
application" of the class appears in the <literal>deriving</literal>
3989 clause. For example, given the class
3990
3991 <programlisting>
3992 class StateMonad s m | m -> s where ...
3993 instance Monad m => StateMonad s (State s m) where ...
3994 </programlisting>
3995 then we can derive an instance of <literal>StateMonad</literal> for <literal>Parser</literal>s by
3996 <programlisting>
  newtype Parser tok m a = Parser (State [tok] (Failure m) a)
                         deriving (Monad, StateMonad [tok])
3999 </programlisting>
4000
4001 The derived instance is obtained by completing the application of the
4002 class to the new type:
4003
4004 <programlisting>
  instance StateMonad [tok] (State [tok] (Failure m)) =>
           StateMonad [tok] (Parser tok m)
4007 </programlisting>
4008 </para>
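<para>
The <literal>Parser</literal> example depends on monad transformers that are not
defined here, so the following smaller sketch shows the same idea with a
two-parameter class of our own. The names <literal>Scale</literal>,
<literal>scale</literal> and <literal>Metres</literal> are hypothetical.
<programlisting>
  {-# LANGUAGE GeneralizedNewtypeDeriving, MultiParamTypeClasses, FlexibleInstances #-}
  module Main where

  -- The newtype will be supplied as the last class parameter, 'v'.
  class Scale k v where
    scale :: k -> v -> v

  instance Scale Int Int where
    scale k v = k * v

  -- 'Scale Int' is a partial application of the class; the derived
  -- instance is Scale Int Metres, reusing the Scale Int Int dictionary.
  newtype Metres = Metres Int deriving (Show, Scale Int)

  main :: IO ()
  main = print (scale (3 :: Int) (Metres 4))
</programlisting>
Had the class been declared as <literal>class Scale v k</literal>, with the
newtype in the first position, this <literal>deriving</literal> clause could not
be written.
</para>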
4009 <para>
4010
4011 As a result of this extension, all derived instances in newtype
4012 declarations are treated uniformly (and implemented just by reusing
4013 the dictionary for the representation type), <emphasis>except</emphasis>
4014 <literal>Show</literal> and <literal>Read</literal>, which really behave differently for
4015 the newtype and its representation.
4016 </para>
4017 </sect3>
4018
4019 <sect3> <title> A more precise specification </title>
4020 <para>
4021 Derived instance declarations are constructed as follows. Consider the
4022 declaration (after expansion of any type synonyms)
4023
4024 <programlisting>
4025 newtype T v1...vn = T' (t vk+1...vn) deriving (c1...cm)
4026 </programlisting>
4027
4028 where
4029 <itemizedlist>
4030 <listitem><para>
4031 The <literal>ci</literal> are partial applications of
4032 classes of the form <literal>C t1'...tj'</literal>, where the arity of <literal>C</literal>
4033 is exactly <literal>j+1</literal>. That is, <literal>C</literal> lacks exactly one type argument.
4034 </para></listitem>
4035 <listitem><para>
4036 The <literal>k</literal> is chosen so that <literal>ci (T v1...vk)</literal> is well-kinded.
4037 </para></listitem>
4038 <listitem><para>
4039 The type <literal>t</literal> is an arbitrary type.
4040 </para></listitem>
4041 <listitem><para>
The type variables <literal>vk+1...vn</literal> do not occur in <literal>t</literal>,
nor in the <literal>ci</literal>.
4044 </para></listitem>
4045 <listitem><para>
4046 None of the <literal>ci</literal> is <literal>Read</literal>, <literal>Show</literal>,
4047 <literal>Typeable</literal>, or <literal>Data</literal>. These classes
4048 should not "look through" the type or its constructor. You can still
4049 derive these classes for a newtype, but it happens in the usual way, not
4050 via this new mechanism.
4051 </para></listitem>
4052 <listitem><para>
4053 It is safe to coerce each of the methods of <literal>ci</literal>. That is,
4054 the missing last argument to each of the <literal>ci</literal> is not used
4055 at a nominal role in any of the <literal>ci</literal>'s methods.
4056 (See <xref linkend="roles"/>.)</para></listitem>
4057 </itemizedlist>
4058 Then, for each <literal>ci</literal>, the derived instance
4059 declaration is:
4060 <programlisting>
4061 instance ci t => ci (T v1...vk)
4062 </programlisting>
4063 As an example which does <emphasis>not</emphasis> work, consider
4064 <programlisting>
4065 newtype NonMonad m s = NonMonad (State s m s) deriving Monad
4066 </programlisting>
4067 Here we cannot derive the instance
4068 <programlisting>
4069 instance Monad (State s m) => Monad (NonMonad m)
4070 </programlisting>
4071
4072 because the type variable <literal>s</literal> occurs in <literal>State s m</literal>,
4073 and so cannot be "eta-converted" away. It is a good thing that this
4074 <literal>deriving</literal> clause is rejected, because <literal>NonMonad m</literal> is
4075 not, in fact, a monad --- for the same reason. Try defining
4076 <literal>>>=</literal> with the correct type: you won't be able to.
4077 </para>
4078 <para>
4079
4080 Notice also that the <emphasis>order</emphasis> of class parameters becomes
4081 important, since we can only derive instances for the last one. If the
4082 <literal>StateMonad</literal> class above were instead defined as
4083
4084 <programlisting>
4085 class StateMonad m s | m -> s where ...
4086 </programlisting>
4087
4088 then we would not have been able to derive an instance for the
4089 <literal>Parser</literal> type above. We hypothesise that multi-parameter
4090 classes usually have one "main" parameter for which deriving new
4091 instances is most interesting.
4092 </para>
4093 <para>Lastly, all of this applies only for classes other than
4094 <literal>Read</literal>, <literal>Show</literal>, <literal>Typeable</literal>,
and <literal>Data</literal>, for which the built-in derivation applies (section
4.3.3 of the Haskell Report).
4097 (For the standard classes <literal>Eq</literal>, <literal>Ord</literal>,
4098 <literal>Ix</literal>, and <literal>Bounded</literal> it is immaterial whether
4099 the standard method is used or the one described here.)
4100 </para>
4101 </sect3>
4102 </sect2>
4103 </sect1>
4104
4105
4106 <!-- TYPE SYSTEM EXTENSIONS -->
4107 <sect1 id="type-class-extensions">
<title>Class and instance declarations</title>