1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <para>
3 <indexterm><primary>language, GHC</primary></indexterm>
4 <indexterm><primary>extensions, GHC</primary></indexterm>
5 As with all known Haskell systems, GHC implements some extensions to
6 the language. They can all be enabled or disabled by commandline flags
7 or language pragmas. By default GHC understands the most recent Haskell
8 version it supports, plus a handful of extensions.
9 </para>
10
11 <para>
12 Some of the Glasgow extensions serve to give you access to the
13 underlying facilities with which we implement Haskell. Thus, you can
14 get at the Raw Iron, if you are willing to write some non-portable
15 code at a more primitive level. You need not be &ldquo;stuck&rdquo;
16 on performance because of the implementation costs of Haskell's
17 &ldquo;high-level&rdquo; features&mdash;you can always code
18 &ldquo;under&rdquo; them. In an extreme case, you can write all your
19 time-critical code in C, and then just glue it together with Haskell!
20 </para>
21
22 <para>
23 Before you get too carried away working at the lowest level (e.g.,
24 sloshing <literal>MutableByteArray&num;</literal>s around your
25 program), you may wish to check if there are libraries that provide a
26 &ldquo;Haskellised veneer&rdquo; over the features you want. The
27 separate <ulink url="../libraries/index.html">libraries
28 documentation</ulink> describes all the libraries that come with GHC.
29 </para>
30
31 <!-- LANGUAGE OPTIONS -->
32 <sect1 id="options-language">
33 <title>Language options</title>
34
35 <indexterm><primary>language</primary><secondary>option</secondary>
36 </indexterm>
37 <indexterm><primary>options</primary><secondary>language</secondary>
38 </indexterm>
39 <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
40 </indexterm>
41
42 <para>The language option flags control which variations of the language are
43 permitted.</para>
44
45 <para>Language options can be controlled in two ways:
46 <itemizedlist>
47 <listitem><para>Every language option can be switched on by a command-line flag "<option>-X...</option>"
48 (e.g. <option>-XTemplateHaskell</option>), and switched off by the flag "<option>-XNo...</option>"
49 (e.g. <option>-XNoTemplateHaskell</option>).</para></listitem>
50 <listitem><para>
51 Language options recognised by Cabal can also be enabled using the <literal>LANGUAGE</literal> pragma,
52 thus <literal>{-# LANGUAGE TemplateHaskell #-}</literal> (see <xref linkend="language-pragma"/>). </para>
53 </listitem>
54 </itemizedlist></para>
55
56 <para>The flag <option>-fglasgow-exts</option>
57 <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
58 is equivalent to enabling the following extensions:
59 &what_glasgow_exts_does;
60 Enabling these options is the <emphasis>only</emphasis>
61 effect of <option>-fglasgow-exts</option>.
62 We are trying to move away from this portmanteau flag,
63 and towards enabling features individually.</para>
64
65 </sect1>
66
67 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
68 <sect1 id="primitives">
69 <title>Unboxed types and primitive operations</title>
70
71 <para>GHC is built on a raft of primitive data types and operations;
72 "primitive" in the sense that they cannot be defined in Haskell itself.
73 While you really can use this stuff to write fast code,
74 we generally find it a lot less painful, and more satisfying in the
75 long run, to use higher-level language features and libraries. With
76 any luck, the code you write will be optimised to the efficient
77 unboxed version in any case. And if it isn't, we'd like to know
78 about it.</para>
79
80 <para>All these primitive data types and operations are exported by the
81 library <literal>GHC.Prim</literal>, for which there is
82 <ulink url="&libraryGhcPrimLocation;/GHC-Prim.html">detailed online documentation</ulink>.
83 (This documentation is generated from the file <filename>compiler/prelude/primops.txt.pp</filename>.)
84 </para>
85
86 <para>
87 If you want to mention any of the primitive data types or operations in your
88 program, you must first import <literal>GHC.Prim</literal> to bring them
89 into scope. Many of them have names ending in "&num;", and to mention such
90 names you need the <option>-XMagicHash</option> extension (<xref linkend="magic-hash"/>).
91 </para>
92
93 <para>The primops make extensive use of <link linkend="glasgow-unboxed">unboxed types</link>
94 and <link linkend="unboxed-tuples">unboxed tuples</link>, which
95 we briefly summarise here. </para>
96
97 <sect2 id="glasgow-unboxed">
98 <title>Unboxed types</title>
99
100 <para>
101 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
102 </para>
103
104 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
105 that values of that type are represented by a pointer to a heap
106 object. The representation of a Haskell <literal>Int</literal>, for
107 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
108 type, however, is represented by the value itself; no pointers or heap
109 allocation are involved.
110 </para>
111
112 <para>
113 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
114 would use in C: <literal>Int&num;</literal> (long int),
115 <literal>Double&num;</literal> (double), <literal>Addr&num;</literal>
116 (void *), etc. The <emphasis>primitive operations</emphasis>
117 (PrimOps) on these types are what you might expect; e.g.,
118 <literal>(+&num;)</literal> is addition on
119 <literal>Int&num;</literal>s, and is the machine-addition that we all
120 know and love&mdash;usually one instruction.
121 </para>
122
123 <para>
124 Primitive (unboxed) types cannot be defined in Haskell, and are
125 therefore built into the language and compiler. Primitive types are
126 always unlifted; that is, a value of a primitive type cannot be
127 bottom. We use the convention (but it is only a convention)
128 that primitive types, values, and
129 operations have a <literal>&num;</literal> suffix (see <xref linkend="magic-hash"/>).
130 For some primitive types we have special syntax for literals, also
131 described in the <link linkend="magic-hash">same section</link>.
132 </para>
133
134 <para>
135 Primitive values are often represented by a simple bit-pattern, such
136 as <literal>Int&num;</literal>, <literal>Float&num;</literal>,
137 <literal>Double&num;</literal>. But this is not necessarily the case:
138 a primitive value might be represented by a pointer to a
139 heap-allocated object. Examples include
140 <literal>Array&num;</literal>, the type of primitive arrays. A
141 primitive array is heap-allocated because it is too big a value to fit
142 in a register, and would be too expensive to copy around; in a sense,
143 it is accidental that it is represented by a pointer. If a pointer
144 represents a primitive value, then it really does point to that value:
145 no unevaluated thunks, no indirections&hellip;nothing but the primitive
146 value can be at the other end of the pointer.
147 A numerically-intensive program using unboxed types can
148 go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
149 counterpart&mdash;we saw a threefold speedup on one example.
150 </para>
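<para>
As a minimal sketch of the ideas above, the following module does arithmetic
directly on <literal>Int#</literal> values and boxes the result only at the
end. The function names <literal>sumSquares#</literal> and
<literal>sumSquares</literal> are illustrative assumptions, not library
functions; the primitives are imported here from <literal>GHC.Exts</literal>,
which re-exports them.
</para>

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), Int#, (*#), (+#))

-- Hypothetical helper: works purely on unboxed machine integers.
sumSquares# :: Int# -> Int# -> Int#
sumSquares# a b = (a *# a) +# (b *# b)

-- Boxed wrapper: unwrap the arguments, compute, and box the result again.
sumSquares :: Int -> Int -> Int
sumSquares (I# a) (I# b) = I# (sumSquares# a b)

main :: IO ()
main = print (sumSquares 3 4) -- prints 25
```

<para>
In practice GHC's optimiser will often derive the unboxed worker from the
ordinary boxed definition, so hand-written code of this kind is rarely needed.
</para>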
151
152 <para>
153 There are some restrictions on the use of primitive types:
154 <itemizedlist>
155 <listitem><para>The main restriction
156 is that you can't pass a primitive value to a polymorphic
157 function or store one in a polymorphic data type. This rules out
158 things like <literal>[Int&num;]</literal> (i.e. lists of primitive
159 integers). The reason for this restriction is that polymorphic
160 arguments and constructor fields are assumed to be pointers: if an
161 unboxed integer is stored in one of these, the garbage collector would
162 attempt to follow it, leading to unpredictable space leaks. Or a
163 <function>seq</function> operation on the polymorphic component may
164 attempt to dereference the pointer, with disastrous results. Even
165 worse, the unboxed value might be larger than a pointer
166 (<literal>Double&num;</literal> for instance).
167 </para>
168 </listitem>
169 <listitem><para> You cannot define a newtype whose representation type
170 (the argument type of the data constructor) is an unboxed type. Thus,
171 this is illegal:
172 <programlisting>
173 newtype A = MkA Int#
174 </programlisting>
175 </para></listitem>
176 <listitem><para> You cannot bind a variable with an unboxed type
177 in a <emphasis>top-level</emphasis> binding.
178 </para></listitem>
179 <listitem><para> You cannot bind a variable with an unboxed type
180 in a <emphasis>recursive</emphasis> binding.
181 </para></listitem>
182 <listitem><para> You may bind unboxed variables in a (non-recursive,
183 non-top-level) pattern binding, but you must make any such pattern-match
184 strict. For example, rather than:
185 <programlisting>
186 data Foo = Foo Int Int#
187
188 f x = let (Foo a b, w) = ..rhs.. in ..body..
189 </programlisting>
190 you must write:
191 <programlisting>
192 data Foo = Foo Int Int#
193
194 f x = let !(Foo a b, w) = ..rhs.. in ..body..
195 </programlisting>
196 since <literal>b</literal> has type <literal>Int#</literal>.
197 </para>
198 </listitem>
199 </itemizedlist>
200 </para>
201
202 </sect2>
203
204 <sect2 id="unboxed-tuples">
205 <title>Unboxed tuples</title>
206
207 <para>
208 Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>;
209 they are a syntactic extension enabled by the language flag <option>-XUnboxedTuples</option>. An
210 unboxed tuple looks like this:
211 </para>
212
213 <para>
214
215 <programlisting>
216 (# e_1, ..., e_n #)
217 </programlisting>
218
219 </para>
220
221 <para>
222 where <literal>e&lowbar;1..e&lowbar;n</literal> are expressions of any
223 type (primitive or non-primitive). The type of an unboxed tuple looks
224 the same.
225 </para>
226
227 <para>
228 Note that when unboxed tuples are enabled,
229 <literal>(#</literal> is a single lexeme, so for example when using
230 operators like <literal>#</literal> and <literal>#-</literal> you need
231 to write <literal>( # )</literal> and <literal>( #- )</literal> rather than
232 <literal>(#)</literal> and <literal>(#-)</literal>.
233 </para>
234
235 <para>
236 Unboxed tuples are used for functions that need to return multiple
237 values, but they avoid the heap allocation normally associated with
238 using fully-fledged tuples. When an unboxed tuple is returned, the
239 components are put directly into registers or on the stack; the
240 unboxed tuple itself does not have a composite representation. Many
241 of the primitive operations listed in <literal>primops.txt.pp</literal> return unboxed
242 tuples.
243 In particular, the <literal>IO</literal> and <literal>ST</literal> monads use unboxed
244 tuples to avoid unnecessary allocation during sequences of operations.
245 </para>
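<para>
A small sketch of returning two results through an unboxed tuple follows. The
function name <literal>quotRem'</literal> is an assumption made for this
example only.
</para>

```haskell
{-# LANGUAGE UnboxedTuples #-}
module Main where

-- Hypothetical example function: quotient and remainder returned together
-- as an unboxed tuple rather than as an ordinary (boxed) pair.
quotRem' :: Int -> Int -> (# Int, Int #)
quotRem' n d = (# quot n d, rem n d #)

main :: IO ()
main =
  -- The components are bound with a case expression.
  case quotRem' 17 5 of
    (# q, r #) -> print (q, r) -- prints (3,2)
```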
246
247 <para>
248 There are some restrictions on the use of unboxed tuples:
249 <itemizedlist>
250
251 <listitem>
252 <para>
253 Values of unboxed tuple types are subject to the same restrictions as
254 other unboxed types; i.e. they may not be stored in polymorphic data
255 structures or passed to polymorphic functions.
256 </para>
257 </listitem>
258
259 <listitem>
260 <para>
261 The typical use of unboxed tuples is simply to return multiple values,
262 binding those multiple results with a <literal>case</literal> expression, thus:
263 <programlisting>
264 f x y = (# x+1, y-1 #)
265 g x = case f x x of { (# a, b #) -&#62; a + b }
266 </programlisting>
267 You can have an unboxed tuple in a pattern binding, thus
268 <programlisting>
269 f x = let (# p,q #) = h x in ..body..
270 </programlisting>
271 If the types of <literal>p</literal> and <literal>q</literal> are not unboxed,
272 the resulting binding is lazy like any other Haskell pattern binding. The
273 above example desugars like this:
274 <programlisting>
275 f x = let t = case h x of { (# p,q #) -> (p,q) }
276 p = fst t
277 q = snd t
278 in ..body..
279 </programlisting>
280 Indeed, the bindings can even be recursive.
281 </para>
282 </listitem>
283 </itemizedlist>
284
285 </para>
286
287 </sect2>
288 </sect1>
289
290
291 <!-- ====================== SYNTACTIC EXTENSIONS ======================= -->
292
293 <sect1 id="syntax-extns">
294 <title>Syntactic extensions</title>
295
296 <sect2 id="unicode-syntax">
297 <title>Unicode syntax</title>
298 <para>The language
299 extension <option>-XUnicodeSyntax</option><indexterm><primary><option>-XUnicodeSyntax</option></primary></indexterm>
300 enables Unicode characters to be used to stand for certain ASCII
301 character sequences. The following alternatives are provided:</para>
302
303 <informaltable>
304 <tgroup cols="4" align="left" colsep="1" rowsep="1">
305 <thead>
306 <row>
307 <entry>ASCII</entry>
308 <entry>Unicode alternative</entry>
309 <entry>Code point</entry>
310 <entry>Name</entry>
311 </row>
312 </thead>
313
314 <!--
315 to find the DocBook entities for these characters, find
316 the Unicode code point (e.g. 0x2237), and grep for it in
317 /usr/share/sgml/docbook/xml-dtd-*/ent/* (or equivalent on
318 your system). Some of these Unicode code points don't have
319 equivalent DocBook entities.
320 -->
321
322 <tbody>
323 <row>
324 <entry><literal>::</literal></entry>
325 <entry>&#x2237;</entry> <!-- no DocBook entity, so a numeric character reference is used -->
326 <entry>0x2237</entry>
327 <entry>PROPORTION</entry>
328 </row>
329 </tbody>
330 <tbody>
331 <row>
332 <entry><literal>=&gt;</literal></entry>
333 <entry>&rArr;</entry>
334 <entry>0x21D2</entry>
335 <entry>RIGHTWARDS DOUBLE ARROW</entry>
336 </row>
337 </tbody>
338 <tbody>
339 <row>
340 <entry><literal>forall</literal></entry>
341 <entry>&forall;</entry>
342 <entry>0x2200</entry>
343 <entry>FOR ALL</entry>
344 </row>
345 </tbody>
346 <tbody>
347 <row>
348 <entry><literal>-&gt;</literal></entry>
349 <entry>&rarr;</entry>
350 <entry>0x2192</entry>
351 <entry>RIGHTWARDS ARROW</entry>
352 </row>
353 </tbody>
354 <tbody>
355 <row>
356 <entry><literal>&lt;-</literal></entry>
357 <entry>&larr;</entry>
358 <entry>0x2190</entry>
359 <entry>LEFTWARDS ARROW</entry>
360 </row>
361 </tbody>
362
363 <tbody>
364 <row>
365 <entry>-&lt;</entry>
366 <entry>&larrtl;</entry>
367 <entry>0x2919</entry>
368 <entry>LEFTWARDS ARROW-TAIL</entry>
369 </row>
370 </tbody>
371
372 <tbody>
373 <row>
374 <entry>&gt;-</entry>
375 <entry>&rarrtl;</entry>
376 <entry>0x291A</entry>
377 <entry>RIGHTWARDS ARROW-TAIL</entry>
378 </row>
379 </tbody>
380
381 <tbody>
382 <row>
383 <entry>-&lt;&lt;</entry>
384 <entry>&#x291B;</entry>
385 <entry>0x291B</entry>
386 <entry>LEFTWARDS DOUBLE ARROW-TAIL</entry>
387 </row>
388 </tbody>
389
390 <tbody>
391 <row>
392 <entry>&gt;&gt;-</entry>
393 <entry>&#x291C;</entry>
394 <entry>0x291C</entry>
395 <entry>RIGHTWARDS DOUBLE ARROW-TAIL</entry>
396 </row>
397 </tbody>
398
399 <tbody>
400 <row>
401 <entry>*</entry>
402 <entry>&starf;</entry>
403 <entry>0x2605</entry>
404 <entry>BLACK STAR</entry>
405 </row>
406 </tbody>
407
408 </tgroup>
409 </informaltable>
410 </sect2>
411
412 <sect2 id="magic-hash">
413 <title>The magic hash</title>
414 <para>The language extension <option>-XMagicHash</option> allows "&num;" as a
415 postfix modifier to identifiers. Thus, "x&num;" is a valid variable, and "T&num;" is
416 a valid type constructor or data constructor.</para>
417
418 <para>The hash sign does not change semantics at all. We tend to use variable
419 names ending in "&num;" for unboxed values or types (e.g. <literal>Int&num;</literal>),
420 but there is no requirement to do so; they are just plain ordinary variables.
421 Nor does the <option>-XMagicHash</option> extension bring anything into scope.
422 For example, to bring <literal>Int&num;</literal> into scope you must
423 import <literal>GHC.Prim</literal> (see <xref linkend="primitives"/>);
424 the <option>-XMagicHash</option> extension
425 then allows you to <emphasis>refer</emphasis> to the <literal>Int&num;</literal>
426 that is now in scope. Note that with this option, the meaning of <literal>x&num;y = 0</literal>
427 is changed: it defines a function <literal>x&num;</literal> taking a single argument <literal>y</literal>;
428 to define the operator <literal>&num;</literal>, put a space: <literal>x &num; y = 0</literal>.
429
430 </para>
431 <para> The <option>-XMagicHash</option> extension also enables some new forms of literals (see <xref linkend="glasgow-unboxed"/>):
432 <itemizedlist>
433 <listitem><para> <literal>'x'&num;</literal> has type <literal>Char&num;</literal></para> </listitem>
434 <listitem><para> <literal>&quot;foo&quot;&num;</literal> has type <literal>Addr&num;</literal></para> </listitem>
435 <listitem><para> <literal>3&num;</literal> has type <literal>Int&num;</literal>. In general,
436 any Haskell integer lexeme followed by a <literal>&num;</literal> is an <literal>Int&num;</literal> literal, e.g.
437 <literal>-0x3A&num;</literal> as well as <literal>32&num;</literal>.</para></listitem>
438 <listitem><para> <literal>3&num;&num;</literal> has type <literal>Word&num;</literal>. In general,
439 any non-negative Haskell integer lexeme followed by <literal>&num;&num;</literal>
440 is a <literal>Word&num;</literal>. </para> </listitem>
441 <listitem><para> <literal>3.2&num;</literal> has type <literal>Float&num;</literal>.</para> </listitem>
442 <listitem><para> <literal>3.2&num;&num;</literal> has type <literal>Double&num;</literal></para> </listitem>
443 </itemizedlist>
444 </para>
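<para>
The literal forms listed above can be tried out with a sketch like the
following, which wraps each unboxed literal in the matching boxed constructor
so that it can be printed. The constructors are imported from
<literal>GHC.Exts</literal>; the negative literal is parenthesised here as a
precaution.
</para>

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Char (C#), Double (D#), Float (F#), Int (I#), Word (W#))

main :: IO ()
main = do
  print (C# 'x'#)     -- Char# literal
  print (I# 3#)       -- Int# literal
  print (I# (-0x3A#)) -- negative hexadecimal Int# literal
  print (W# 3##)      -- Word# literal
  print (F# 3.2#)     -- Float# literal
  print (D# 3.2##)    -- Double# literal
```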
445 </sect2>
446
447 <sect2 id="negative-literals">
448 <title>Negative literals</title>
449 <para>
450 The literal <literal>-123</literal> is, according to
451 Haskell 98 and Haskell 2010, desugared as
452 <literal>negate (fromInteger 123)</literal>.
453 The language extension <option>-XNegativeLiterals</option>
454 means that it is instead desugared as
455 <literal>fromInteger (-123)</literal>.
456 </para>
457
458 <para>
459 This can make a difference when the positive and negative range of
460 a numeric data type don't match up. For example,
461 in 8-bit arithmetic -128 is representable, but +128 is not.
462 So <literal>negate (fromInteger 128)</literal> will elicit an
463 unexpected integer-literal-overflow message.
464 </para>
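<para>
A minimal sketch of the 8-bit case, using <literal>Int8</literal> from
<literal>Data.Int</literal>. The name <literal>lowest</literal> is an
assumption for illustration. With the extension enabled, the literal is
passed to <literal>fromInteger</literal> as -128 directly, so the
out-of-range value 128 is never constructed.
</para>

```haskell
{-# LANGUAGE NegativeLiterals #-}
module Main where

import Data.Int (Int8)

-- Hypothetical constant: the smallest Int8, written as a single literal.
lowest :: Int8
lowest = -128

main :: IO ()
main = print (lowest == minBound) -- prints True
```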
465 </sect2>
466
467 <sect2 id="num-decimals">
468 <title>Fractional-looking integer literals</title>
469 <para>
470 Haskell 2010 and Haskell 98 define floating literals with
471 the syntax <literal>1.2e6</literal>. These literals have the
472 type <literal>Fractional a => a</literal>.
473 </para>
474
475 <para>
476 The language extension <option>-XNumDecimals</option> allows
477 you to also use the floating literal syntax for instances of
478 <literal>Integral</literal>, and have values like
479 <literal>(1.2e6 :: Num a => a)</literal>.
480 </para>
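<para>
For instance, the following sketch gives an exponent-style literal the type
<literal>Integer</literal>. The name <literal>population</literal> is
illustrative only.
</para>

```haskell
{-# LANGUAGE NumDecimals #-}
module Main where

-- Hypothetical constant: exponent notation used at an integral type.
population :: Integer
population = 1.2e6

main :: IO ()
main = print population -- prints 1200000
```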
481 </sect2>
482
483
484 <!-- ====================== HIERARCHICAL MODULES ======================= -->
485
486
487 <sect2 id="hierarchical-modules">
488 <title>Hierarchical Modules</title>
489
490 <para>GHC supports a small extension to the syntax of module
491 names: a module name is allowed to contain a dot
492 <literal>&lsquo;.&rsquo;</literal>. This is also known as the
493 &ldquo;hierarchical module namespace&rdquo; extension, because
494 it extends the normally flat Haskell module namespace into a
495 more flexible hierarchy of modules.</para>
496
497 <para>This extension has very little impact on the language
498 itself; module names are <emphasis>always</emphasis> fully
499 qualified, so you can just think of the fully qualified module
500 name as <quote>the module name</quote>. In particular, this
501 means that the full module name must be given after the
502 <literal>module</literal> keyword at the beginning of the
503 module; for example, the module <literal>A.B.C</literal> must
504 begin</para>
505
506 <programlisting>module A.B.C</programlisting>
507
508
509 <para>It is a common strategy to use the <literal>as</literal>
510 keyword to save some typing when using qualified names with
511 hierarchical modules. For example:</para>
512
513 <programlisting>
514 import qualified Control.Monad.ST.Strict as ST
515 </programlisting>
516
517 <para>For details on how GHC searches for source and interface
518 files in the presence of hierarchical modules, see <xref
519 linkend="search-path"/>.</para>
520
521 <para>GHC comes with a large collection of libraries arranged
522 hierarchically; see the accompanying <ulink
523 url="../libraries/index.html">library
524 documentation</ulink>. More libraries to install are available
525 from <ulink
526 url="http://hackage.haskell.org/packages/hackage.html">HackageDB</ulink>.</para>
527 </sect2>
528
529 <!-- ====================== PATTERN GUARDS ======================= -->
530
531 <sect2 id="pattern-guards">
532 <title>Pattern guards</title>
533
534 <para>
535 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
536 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ulink url="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ulink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
537 </para>
538
539 <para>
540 Suppose we have an abstract data type of finite maps, with a
541 lookup operation:
542
543 <programlisting>
544 lookup :: FiniteMap -> Int -> Maybe Int
545 </programlisting>
546
547 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
548 where <varname>v</varname> is the value that the key maps to. Now consider the following definition:
549 </para>
550
551 <programlisting>
552 clunky env var1 var2 | ok1 &amp;&amp; ok2 = val1 + val2
553 | otherwise = var1 + var2
554 where
555 m1 = lookup env var1
556 m2 = lookup env var2
557 ok1 = maybeToBool m1
558 ok2 = maybeToBool m2
559 val1 = expectJust m1
560 val2 = expectJust m2
561 </programlisting>
562
563 <para>
564 The auxiliary functions are
565 </para>
566
567 <programlisting>
568 maybeToBool :: Maybe a -&gt; Bool
569 maybeToBool (Just x) = True
570 maybeToBool Nothing = False
571
572 expectJust :: Maybe a -&gt; a
573 expectJust (Just x) = x
574 expectJust Nothing = error "Unexpected Nothing"
575 </programlisting>
576
577 <para>
578 What is <function>clunky</function> doing? The guard <literal>ok1 &amp;&amp;
579 ok2</literal> checks that both lookups succeed, using
580 <function>maybeToBool</function> to convert the <function>Maybe</function>
581 types to booleans. The (lazily evaluated) <function>expectJust</function>
582 calls extract the values from the results of the lookups, and bind the
583 returned values to <varname>val1</varname> and <varname>val2</varname>
584 respectively. If either lookup fails, then clunky takes the
585 <literal>otherwise</literal> case and returns the sum of its arguments.
586 </para>
587
588 <para>
589 This is certainly legal Haskell, but it is a tremendously verbose and
590 un-obvious way to achieve the desired effect. Arguably, a more direct way
591 to write clunky would be to use case expressions:
592 </para>
593
594 <programlisting>
595 clunky env var1 var2 = case lookup env var1 of
596 Nothing -&gt; fail
597 Just val1 -&gt; case lookup env var2 of
598 Nothing -&gt; fail
599 Just val2 -&gt; val1 + val2
600 where
601 fail = var1 + var2
602 </programlisting>
603
604 <para>
605 This is a bit shorter, but hardly better. Of course, we can rewrite any set
606 of pattern-matching, guarded equations as case expressions; that is
607 precisely what the compiler does when compiling equations! The reason that
608 Haskell provides guarded equations is that they allow us to write down
609 the cases we want to consider, one at a time, independently of each other.
610 This structure is hidden in the case version. Two of the right-hand sides
611 are really the same (<function>fail</function>), and the whole expression
612 tends to become more and more indented.
613 </para>
614
615 <para>
616 Here is how I would write clunky:
617 </para>
618
619 <programlisting>
620 clunky env var1 var2
621 | Just val1 &lt;- lookup env var1
622 , Just val2 &lt;- lookup env var2
623 = val1 + val2
624 ...other equations for clunky...
625 </programlisting>
626
627 <para>
628 The semantics should be clear enough. The qualifiers are matched in order.
629 For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
630 right hand side is evaluated and matched against the pattern on the left.
631 If the match fails then the whole guard fails and the next equation is
632 tried. If it succeeds, then the appropriate binding takes place, and the
633 next qualifier is matched, in the augmented environment. Unlike list
634 comprehensions, however, the type of the expression to the right of the
635 <literal>&lt;-</literal> is the same as the type of the pattern to its
636 left. The bindings introduced by pattern guards scope over all the
637 remaining guard qualifiers, and over the right hand side of the equation.
638 </para>
639
640 <para>
641 Just as with list comprehensions, boolean expressions can be freely mixed
642 among the pattern guards. For example:
643 </para>
644
645 <programlisting>
646 f x | [y] &lt;- x
647 , y > 3
648 , Just z &lt;- h y
649 = ...
650 </programlisting>
651
652 <para>
653 Haskell's current guards therefore emerge as a special case, in which the
654 qualifier list has just one element, a boolean expression.
655 </para>
656 </sect2>
657
658 <!-- ===================== View patterns =================== -->
659
660 <sect2 id="view-patterns">
661 <title>View patterns
662 </title>
663
664 <para>
665 View patterns are enabled by the flag <literal>-XViewPatterns</literal>.
666 More information and examples of view patterns can be found on the
667 <ulink url="http://ghc.haskell.org/trac/ghc/wiki/ViewPatterns">Wiki
668 page</ulink>.
669 </para>
670
671 <para>
672 View patterns are somewhat like pattern guards that can be nested inside
673 of other patterns. They are a convenient way of pattern-matching
674 against values of abstract types. For example, in a programming language
675 implementation, we might represent the syntax of the types of the
676 language as follows:
677
678 <programlisting>
679 type Typ
680
681 data TypView = Unit
682 | Arrow Typ Typ
683
684 view :: Typ -> TypView
685
686 -- additional operations for constructing Typ's ...
687 </programlisting>
688
689 The representation of Typ is held abstract, permitting implementations
690 to use a fancy representation (e.g., hash-consing to manage sharing).
691
692 Without view patterns, using this signature is a little inconvenient:
693 <programlisting>
694 size :: Typ -> Integer
695 size t = case view t of
696 Unit -> 1
697 Arrow t1 t2 -> size t1 + size t2
698 </programlisting>
699
700 It is necessary to iterate the case, rather than using an equational
701 function definition. And the situation is even worse when the matching
702 against <literal>t</literal> is buried deep inside another pattern.
703 </para>
704
705 <para>
706 View patterns permit calling the view function inside the pattern and
707 matching against the result:
708 <programlisting>
709 size (view -> Unit) = 1
710 size (view -> Arrow t1 t2) = size t1 + size t2
711 </programlisting>
712
713 That is, we add a new form of pattern, written
714 <replaceable>expression</replaceable> <literal>-></literal>
715 <replaceable>pattern</replaceable> that means "apply the expression to
716 whatever we're trying to match against, and then match the result of
717 that application against the pattern". The expression can be any Haskell
718 expression of function type, and view patterns can be used wherever
719 patterns are used.
720 </para>
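<para>
The <literal>size</literal> example can be made runnable with a sketch like
the one below. The concrete representation of <literal>Typ</literal> (the
constructors <literal>MkUnit</literal> and <literal>MkArrow</literal>) is an
assumption made only so that the module compiles on its own; a real
implementation would keep it abstract and export just
<literal>view</literal>.
</para>

```haskell
{-# LANGUAGE ViewPatterns #-}
module Main where

-- Hypothetical concrete representation; a real library would hide it.
data Typ = MkUnit | MkArrow Typ Typ

data TypView = Unit
             | Arrow Typ Typ

view :: Typ -> TypView
view MkUnit        = Unit
view (MkArrow a b) = Arrow a b

-- Equational definition that matches through the view function.
size :: Typ -> Integer
size (view -> Unit)        = 1
size (view -> Arrow t1 t2) = size t1 + size t2

main :: IO ()
main = print (size (MkArrow MkUnit (MkArrow MkUnit MkUnit))) -- prints 3
```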
721
722 <para>
723 The semantics of a pattern <literal>(</literal>
724 <replaceable>exp</replaceable> <literal>-></literal>
725 <replaceable>pat</replaceable> <literal>)</literal> are as follows:
726
727 <itemizedlist>
728
729 <listitem> Scoping:
730
731 <para>The variables bound by the view pattern are the variables bound by
732 <replaceable>pat</replaceable>.
733 </para>
734
735 <para>
736 Any variables in <replaceable>exp</replaceable> are bound occurrences,
737 but variables bound "to the left" in a pattern are in scope. This
738 feature permits, for example, one argument to a function to be used in
739 the view of another argument. For example, the function
740 <literal>clunky</literal> from <xref linkend="pattern-guards" /> can be
741 written using view patterns as follows:
742
743 <programlisting>
744 clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2
745 ...other equations for clunky...
746 </programlisting>
747 </para>
748
749 <para>
750 More precisely, the scoping rules are:
751 <itemizedlist>
752 <listitem>
753 <para>
754 In a single pattern, variables bound by patterns to the left of a view
755 pattern expression are in scope. For example:
756 <programlisting>
757 example :: Maybe ((String -> Integer,Integer), String) -> Bool
758 example (Just ((f,_), f -> 4)) = True
759 </programlisting>
760
761 Additionally, in function definitions, variables bound by matching earlier curried
762 arguments may be used in view pattern expressions in later arguments:
763 <programlisting>
764 example :: (String -> Integer) -> String -> Bool
765 example f (f -> 4) = True
766 </programlisting>
767 That is, the scoping is the same as it would be if the curried arguments
768 were collected into a tuple.
769 </para>
770 </listitem>
771
772 <listitem>
773 <para>
774 In mutually recursive bindings, such as <literal>let</literal>,
775 <literal>where</literal>, or the top level, view patterns in one
776 declaration may not mention variables bound by other declarations. That
777 is, each declaration must be self-contained. For example, the following
778 program is not allowed:
779 <programlisting>
780 let {(x -> y) = e1 ;
781 (y -> x) = e2 } in x
782 </programlisting>
783
784 (For some amplification on this design choice see
785 <ulink url="http://ghc.haskell.org/trac/ghc/ticket/4061">Trac #4061</ulink>.)
786
787 </para>
788 </listitem>
789 </itemizedlist>
790
791 </para>
792 </listitem>
793
794 <listitem><para> Typing: If <replaceable>exp</replaceable> has type
795 <replaceable>T1</replaceable> <literal>-></literal>
796 <replaceable>T2</replaceable> and <replaceable>pat</replaceable> matches
797 a <replaceable>T2</replaceable>, then the whole view pattern matches a
798 <replaceable>T1</replaceable>.
799 </para></listitem>
800
801 <listitem><para> Matching: To the equations in Section 3.17.3 of the
802 <ulink url="http://www.haskell.org/onlinereport/">Haskell 98
803 Report</ulink>, add the following:
804 <programlisting>
805 case v of { (e -> p) -> e1 ; _ -> e2 }
806 =
807 case (e v) of { p -> e1 ; _ -> e2 }
808 </programlisting>
809 That is, to match a variable <replaceable>v</replaceable> against a pattern
810 <literal>(</literal> <replaceable>exp</replaceable>
811 <literal>-></literal> <replaceable>pat</replaceable>
812 <literal>)</literal>, evaluate <literal>(</literal>
813 <replaceable>exp</replaceable> <replaceable> v</replaceable>
814 <literal>)</literal> and match the result against
815 <replaceable>pat</replaceable>.
816 </para></listitem>
817
818 <listitem><para> Efficiency: When the same view function is applied in
819 multiple branches of a function definition or a case expression (e.g.,
820 in <literal>size</literal> above), GHC makes an attempt to collect these
821 applications into a single nested case expression, so that the view
822 function is only applied once. Pattern compilation in GHC follows the
823 matrix algorithm described in Chapter 4 of <ulink
824 url="http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/">The
825 Implementation of Functional Programming Languages</ulink>. When the
826 top rows of the first column of a matrix are all view patterns with the
827 "same" expression, these patterns are transformed into a single nested
828 case. This includes, for example, adjacent view patterns that line up
829 in a tuple, as in
830 <programlisting>
831 f ((view -> A, p1), p2) = e1
832 f ((view -> B, p3), p4) = e2
833 </programlisting>
834 </para>
835
836 <para> The current notion of when two view pattern expressions are "the
837 same" is very restricted: it is not even full syntactic equality.
838 However, it does include variables, literals, applications, and tuples;
839 e.g., two instances of <literal>view ("hi", "there")</literal> will be
840 collected. However, the current implementation does not compare up to
841 alpha-equivalence, so two instances of <literal>(x, view x ->
842 y)</literal> will not be coalesced.
843 </para>
844
845 </listitem>
846
847 </itemizedlist>
848 </para>
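<para>
The typing and matching rules above can be illustrated with a small sketch.
The names <literal>parseBool</literal>, <literal>describe</literal> and
<literal>describe'</literal> are hypothetical and exist only for this example.
<literal>describe</literal> uses view patterns, while <literal>describe'</literal>
is the same function written by hand according to the matching equation.
Because both view patterns in <literal>describe</literal> apply the same
expression, they are candidates for the coalescing described under Efficiency.
</para>
<programlisting>
{-# LANGUAGE ViewPatterns #-}

import Data.Char (toLower)

-- Hypothetical view function: it has type T1 -> T2 with
-- T1 = String and T2 = Maybe Bool.
parseBool :: String -> Maybe Bool
parseBool s = case map toLower s of
  "yes" -> Just True
  "no"  -> Just False
  _     -> Nothing

-- Each view pattern matches a String (T1), because the inner
-- pattern matches a Maybe Bool (T2).
describe :: String -> String
describe (parseBool -> Just True)  = "affirmative"
describe (parseBool -> Just False) = "negative"
describe _                         = "unrecognised"

-- The same function after applying
--   case v of { (e -> p) -> e1 ; _ -> e2 }  =  case (e v) of { p -> e1 ; _ -> e2 }
-- by hand, with the view function applied once.
describe' :: String -> String
describe' v = case parseBool v of
  Just True  -> "affirmative"
  Just False -> "negative"
  _          -> "unrecognised"
</programlisting>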
849
850 </sect2>
851
852 <!-- ===================== n+k patterns =================== -->
853
854 <sect2 id="n-k-patterns">
855 <title>n+k patterns</title>
856 <indexterm><primary><option>-XNPlusKPatterns</option></primary></indexterm>
857
858 <para>
859 <literal>n+k</literal> pattern support is disabled by default. To enable
860 it, you can use the <option>-XNPlusKPatterns</option> flag.
861 </para>
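<para>
As a minimal sketch of what the flag enables, the following definition uses an
<literal>n+k</literal> pattern. The function name <literal>predecessor</literal>
is hypothetical. The pattern <literal>(n+1)</literal> matches an argument
<literal>v</literal> only when <literal>v >= 1</literal>, and then binds
<literal>n</literal> to <literal>v - 1</literal>.
</para>
<programlisting>
{-# LANGUAGE NPlusKPatterns #-}

-- Hypothetical example: the first equation applies only to arguments >= 1.
predecessor :: Int -> Maybe Int
predecessor (n+1) = Just n
predecessor _     = Nothing
</programlisting>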
862
863 </sect2>
864
865 <!-- ===================== Traditional record syntax =================== -->
866
867 <sect2 id="traditional-record-syntax">
868 <title>Traditional record syntax</title>
869 <indexterm><primary><option>-XNoTraditionalRecordSyntax</option></primary></indexterm>
870
871 <para>
872 Traditional record syntax, such as <literal>C {f = x}</literal>, is enabled by default.
873 To disable it, you can use the <option>-XNoTraditionalRecordSyntax</option> flag.
874 </para>
875
876 </sect2>
877
878 <!-- ===================== Recursive do-notation =================== -->
879
880 <sect2 id="recursive-do-notation">
881 <title>The recursive do-notation
882 </title>
883
884 <para>
885 The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>,
886 that is, the variables bound in a do-expression are visible only in the textually following
887 code block. Compare this to a let-expression, where bound variables are visible in the entire binding
888 group.
889 </para>
890
891 <para>
892 It turns out that such recursive bindings do indeed make sense for a variety of monads, but
893 not all. In particular, recursion in this sense requires a fixed-point operator for the underlying
894 monad, captured by the <literal>mfix</literal> method of the <literal>MonadFix</literal> class, defined in <literal>Control.Monad.Fix</literal> as follows:
895 <programlisting>
class Monad m => MonadFix m where
   mfix :: (a -> m a) -> m a
898 </programlisting>
899 Haskell's
900 <literal>Maybe</literal>, <literal>[]</literal> (list), <literal>ST</literal> (both strict and lazy versions),
901 <literal>IO</literal>, and many other monads have <literal>MonadFix</literal> instances. On the negative
902 side, the continuation monad, with the signature <literal>(a -> r) -> r</literal>, does not.
903 </para>
904
905 <para>
906 For monads that do belong to the <literal>MonadFix</literal> class, GHC provides
907 an extended version of the do-notation that allows recursive bindings.
The <option>-XRecursiveDo</option> flag (language pragma: <literal>RecursiveDo</literal>)
909 provides the necessary syntactic support, introducing the keywords <literal>mdo</literal> and
910 <literal>rec</literal> for higher and lower levels of the notation respectively. Unlike
911 bindings in a <literal>do</literal> expression, those introduced by <literal>mdo</literal> and <literal>rec</literal>
912 are recursively defined, much like in an ordinary let-expression. Due to the new
913 keyword <literal>mdo</literal>, we also call this notation the <emphasis>mdo-notation</emphasis>.
914 </para>
915
916 <para>
917 Here is a simple (albeit contrived) example:
918 <programlisting>
919 {-# LANGUAGE RecursiveDo #-}
justOnes = mdo { xs &lt;- Just (1:xs)
               ; return (map negate xs) }
922 </programlisting>
923 or equivalently
924 <programlisting>
925 {-# LANGUAGE RecursiveDo #-}
justOnes = do { rec { xs &lt;- Just (1:xs) }
              ; return (map negate xs) }
928 </programlisting>
929 As you can guess <literal>justOnes</literal> will evaluate to <literal>Just [-1,-1,-1,...</literal>.
930 </para>
931
932 <para>
GHC's implementation of the mdo-notation closely follows the original translation as described in the paper
934 <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for Haskell</ulink>, which
935 in turn is based on the work <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion
936 in Monadic Computations</ulink>. Furthermore, GHC extends the syntax described in the former paper
937 with a lower level syntax flagged by the <literal>rec</literal> keyword, as we describe next.
938 </para>
939
940 <sect3>
941 <title>Recursive binding groups</title>
942
943 <para>
944 The flag <option>-XRecursiveDo</option> also introduces a new keyword <literal>rec</literal>, which wraps a
945 mutually-recursive group of monadic statements inside a <literal>do</literal> expression, producing a single statement.
946 Similar to a <literal>let</literal> statement inside a <literal>do</literal>, variables bound in
947 the <literal>rec</literal> are visible throughout the <literal>rec</literal> group, and below it. For example, compare
948 <programlisting>
do { a &lt;- getChar              do { a &lt;- getChar
   ; let { r1 = f a r2            ; rec { r1 &lt;- f a r2
         ; r2 = g r1 }                  ; r2 &lt;- g r1 }
   ; return (r1 ++ r2) }          ; return (r1 ++ r2) }
953 </programlisting>
954 In both cases, <literal>r1</literal> and <literal>r2</literal> are available both throughout
955 the <literal>let</literal> or <literal>rec</literal> block, and in the statements that follow it.
956 The difference is that <literal>let</literal> is non-monadic, while <literal>rec</literal> is monadic.
957 (In Haskell <literal>let</literal> is really <literal>letrec</literal>, of course.)
958 </para>
959
960 <para>
961 The semantics of <literal>rec</literal> is fairly straightforward. Whenever GHC finds a <literal>rec</literal>
962 group, it will compute its set of bound variables, and will introduce an appropriate call
963 to the underlying monadic value-recursion operator <literal>mfix</literal>, belonging to the
964 <literal>MonadFix</literal> class. Here is an example:
965 <programlisting>
rec { b &lt;- f a c     ===>    (b,c) &lt;- mfix (\ ~(b,c) -> do { b &lt;- f a c
    ; c &lt;- f b a }                                         ; c &lt;- f b a
                                                           ; return (b,c) })
969 </programlisting>
970 As usual, the meta-variables <literal>b</literal>, <literal>c</literal> etc., can be arbitrary patterns.
971 In general, the statement <literal>rec <replaceable>ss</replaceable></literal> is desugared to the statement
972 <programlisting>
973 <replaceable>vs</replaceable> &lt;- mfix (\ ~<replaceable>vs</replaceable> -&gt; do { <replaceable>ss</replaceable>; return <replaceable>vs</replaceable> })
974 </programlisting>
975 where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>.
976 </para>
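<para>
The desugaring scheme above can be checked on a small case. The following sketch,
which uses the hypothetical names <literal>viaRec</literal> and
<literal>viaMfix</literal>, writes the same computation in the
<literal>Maybe</literal> monad once with a <literal>rec</literal> block and once
with the call to <literal>mfix</literal> spelled out by hand, with
<replaceable>vs</replaceable> being the tuple <literal>(xs, ys)</literal>.
Both are expected to produce the same value.
</para>
<programlisting>
{-# LANGUAGE RecursiveDo #-}

import Control.Monad.Fix (mfix)

-- A rec block in the Maybe monad: xs and ys refer to each other.
viaRec :: Maybe ([Int], [Int])
viaRec = do
  rec { xs &lt;- Just (1 : ys)
      ; ys &lt;- Just (2 : xs) }
  return (take 3 xs, take 3 ys)

-- The same block desugared by hand, following
--   vs &lt;- mfix (\ ~vs -> do { ss; return vs })
-- The lazy pattern ~(xs, ys) matters: mfix passes in a result that has
-- not been computed yet. The inner bindings shadow the lambda-bound names.
viaMfix :: Maybe ([Int], [Int])
viaMfix = do
  (xs, ys) &lt;- mfix (\ ~(xs, ys) -> do { xs &lt;- Just (1 : ys)
                                      ; ys &lt;- Just (2 : xs)
                                      ; return (xs, ys) })
  return (take 3 xs, take 3 ys)
</programlisting>
<para>
Both definitions evaluate to <literal>Just ([1,2,1],[2,1,2])</literal>.
</para>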
977
978 <para>
979 Note in particular that the translation for a <literal>rec</literal> block only involves wrapping a call
980 to <literal>mfix</literal>: it performs no other analysis on the bindings. The latter is the task
981 for the <literal>mdo</literal> notation, which is described next.
982 </para>
983 </sect3>
984
985 <sect3>
986 <title>The <literal>mdo</literal> notation</title>
987
988 <para>
989 A <literal>rec</literal>-block tells the compiler where precisely the recursive knot should be tied. It turns out that
990 the placement of the recursive knots can be rather delicate: in particular, we would like the knots to be wrapped
991 around as minimal groups as possible. This process is known as <emphasis>segmentation</emphasis>, and is described
in detail in Section 3.2 of <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for
993 Haskell</ulink>. Segmentation improves polymorphism and reduces the size of the recursive knot. Most importantly, it avoids
994 unnecessary interference caused by a fundamental issue with the so-called <emphasis>right-shrinking</emphasis>
995 axiom for monadic recursion. In brief, most monads of interest (IO, strict state, etc.) do <emphasis>not</emphasis>
996 have recursion operators that satisfy this axiom, and thus not performing segmentation can cause unnecessary
997 interference, changing the termination behavior of the resulting translation.
998 (Details can be found in Sections 3.1 and 7.2.2 of
999 <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.)
1000 </para>
1001
1002 <para>
1003 The <literal>mdo</literal> notation removes the burden of placing
1004 explicit <literal>rec</literal> blocks in the code. Unlike an
1005 ordinary <literal>do</literal> expression, in which variables bound by
1006 statements are only in scope for later statements, variables bound in
1007 an <literal>mdo</literal> expression are in scope for all statements
1008 of the expression. The compiler then automatically identifies minimal
1009 mutually recursively dependent segments of statements, treating them as
1010 if the user had wrapped a <literal>rec</literal> qualifier around them.
1011 </para>
1012
1013 <para>
1014 The definition is syntactic:
1015 </para>
1016 <itemizedlist>
1017 <listitem>
1018 <para>
1019 A generator <replaceable>g</replaceable>
1020 <emphasis>depends</emphasis> on a textually following generator
1021 <replaceable>g'</replaceable>, if
1022 </para>
1023 <itemizedlist>
1024 <listitem>
1025 <para>
1026 <replaceable>g'</replaceable> defines a variable that
1027 is used by <replaceable>g</replaceable>, or
1028 </para>
1029 </listitem>
1030 <listitem>
1031 <para>
1032 <replaceable>g'</replaceable> textually appears between
1033 <replaceable>g</replaceable> and
1034 <replaceable>g''</replaceable>, where <replaceable>g</replaceable>
1035 depends on <replaceable>g''</replaceable>.
1036 </para>
1037 </listitem>
1038 </itemizedlist>
1039 </listitem>
1040 <listitem>
1041 <para>
1042 A <emphasis>segment</emphasis> of a given
1043 <literal>mdo</literal>-expression is a minimal sequence of generators
1044 such that no generator of the sequence depends on an outside
1045 generator. As a special case, although it is not a generator,
1046 the final expression in an <literal>mdo</literal>-expression is
1047 considered to form a segment by itself.
1048 </para>
1049 </listitem>
1050 </itemizedlist>
1051 <para>
1052 Segments in this sense are
1053 related to <emphasis>strongly-connected components</emphasis> analysis,
1054 with the exception that bindings in a segment cannot be reordered and
1055 must be contiguous.
1056 </para>
1057
1058 <para>
1059 Here is an example <literal>mdo</literal>-expression, and its translation to <literal>rec</literal> blocks:
1060 <programlisting>
mdo { a &lt;- getChar      ===> do { a &lt;- getChar
    ; b &lt;- f a c                ; rec { b &lt;- f a c
    ; c &lt;- f b a                      ; c &lt;- f b a }
    ; z &lt;- h a b                ; z &lt;- h a b
    ; d &lt;- g d e                ; rec { d &lt;- g d e
    ; e &lt;- g a z                      ; e &lt;- g a z }
    ; putChar c }               ; putChar c }
1068 </programlisting>
1069 Note that a given <literal>mdo</literal> expression can cause the creation of multiple <literal>rec</literal> blocks.
1070 If there are no recursive dependencies, <literal>mdo</literal> will introduce no <literal>rec</literal> blocks. In this
1071 latter case an <literal>mdo</literal> expression is precisely the same as a <literal>do</literal> expression, as one
1072 would expect.
1073 </para>
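<para>
As a runnable sketch of segmentation, the following pair of definitions
(hypothetical names <literal>withMdo</literal> and <literal>withRec</literal>)
shows an <literal>mdo</literal> expression in the <literal>Maybe</literal> monad
next to the version with the <literal>rec</literal> block placed by hand where
the segmentation rules would be expected to put it. Only <literal>xs</literal>
and <literal>ys</literal> depend on a textually later generator, so only they
form a recursive segment.
</para>
<programlisting>
{-# LANGUAGE RecursiveDo #-}

-- All bound variables are in scope in every statement of the mdo.
withMdo :: Maybe ([Int], Int)
withMdo = mdo
  n  &lt;- Just 3
  xs &lt;- Just (1 : ys)
  ys &lt;- Just (2 : xs)
  m  &lt;- Just (n + 1)
  return (take n xs, m)

-- The same computation with the recursive segment marked explicitly.
withRec :: Maybe ([Int], Int)
withRec = do
  n &lt;- Just 3
  rec { xs &lt;- Just (1 : ys)
      ; ys &lt;- Just (2 : xs) }
  m &lt;- Just (n + 1)
  return (take n xs, m)
</programlisting>
<para>
Both definitions evaluate to <literal>Just ([1,2,1],4)</literal>.
</para>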
1074
1075 <para>
1076 In summary, given an <literal>mdo</literal> expression, GHC first performs segmentation, introducing
1077 <literal>rec</literal> blocks to wrap over minimal recursive groups. Then, each resulting
1078 <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal> as described
1079 in the previous section. The original <literal>mdo</literal>-expression typechecks exactly when the desugared
1080 version would do so.
1081 </para>
1082
1083 <para>
1084 Here are some other important points in using the recursive-do notation:
1085
1086 <itemizedlist>
1087 <listitem>
1088 <para>
1089 It is enabled with the flag <literal>-XRecursiveDo</literal>, or the <literal>LANGUAGE RecursiveDo</literal>
1090 pragma. (The same flag enables both <literal>mdo</literal>-notation, and the use of <literal>rec</literal>
1091 blocks inside <literal>do</literal> expressions.)
1092 </para>
1093 </listitem>
1094 <listitem>
1095 <para>
<literal>rec</literal> blocks can also be used inside <literal>mdo</literal>-expressions, where each
is treated as a single statement. However, it is good style to use either <literal>mdo</literal> or
<literal>rec</literal> blocks, but not both, in a single expression.
1099 </para>
1100 </listitem>
1101 <listitem>
1102 <para>
1103 If recursive bindings are required for a monad, then that monad must be declared an instance of
1104 the <literal>MonadFix</literal> class.
1105 </para>
1106 </listitem>
1107 <listitem>
1108 <para>
1109 The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
1110 Furthermore, the <literal>Control.Monad.ST</literal> and <literal>Control.Monad.ST.Lazy</literal>
1111 modules provide the instances of the <literal>MonadFix</literal> class for Haskell's internal
1112 state monad (strict and lazy, respectively).
1113 </para>
1114 </listitem>
1115 <listitem>
1116 <para>
As with <literal>let</literal> and <literal>where</literal> bindings, name shadowing is not allowed within
an <literal>mdo</literal>-expression or a <literal>rec</literal>-block; that is, all the names bound in
a single <literal>rec</literal> must be distinct. (GHC will complain if this is not the case.)
1120 </para>
1121 </listitem>
1122 </itemizedlist>
1123 </para>
1124 </sect3>
1125
1126
1127 </sect2>
1128
1129
1130 <!-- ===================== PARALLEL LIST COMPREHENSIONS =================== -->
1131
1132 <sect2 id="parallel-list-comprehensions">
1133 <title>Parallel List Comprehensions</title>
1134 <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
1135 </indexterm>
1136 <indexterm><primary>parallel list comprehensions</primary>
1137 </indexterm>
1138
1139 <para>Parallel list comprehensions are a natural extension to list
1140 comprehensions. List comprehensions can be thought of as a nice
1141 syntax for writing maps and filters. Parallel comprehensions
1142 extend this to include the zipWith family.</para>
1143
1144 <para>A parallel list comprehension has multiple independent
1145 branches of qualifier lists, each separated by a `|' symbol. For
1146 example, the following zips together two lists:</para>
1147
1148 <programlisting>
1149 [ (x, y) | x &lt;- xs | y &lt;- ys ]
1150 </programlisting>
1151
1152 <para>The behaviour of parallel list comprehensions follows that of
1153 zip, in that the resulting list will have the same length as the
1154 shortest branch.</para>
1155
1156 <para>We can define parallel list comprehensions by translation to
1157 regular comprehensions. Here's the basic idea:</para>
1158
1159 <para>Given a parallel comprehension of the form: </para>
1160
1161 <programlisting>
[ e | p1 &lt;- e11, p2 &lt;- e12, ...
    | q1 &lt;- e21, q2 &lt;- e22, ...
    ...
]
1166 </programlisting>
1167
1168 <para>This will be translated to: </para>
1169
1170 <programlisting>
[ e | ((p1,p2), (q1,q2), ...) &lt;- zipN [(p1,p2) | p1 &lt;- e11, p2 &lt;- e12, ...]
                                      [(q1,q2) | q1 &lt;- e21, q2 &lt;- e22, ...]
                                      ...
]
1175 </programlisting>
1176
1177 <para>where `zipN' is the appropriate zip for the given number of
1178 branches.</para>
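<para>
The translation can be tried out on a concrete case. In the sketch below, the
names <literal>viaParallel</literal> and <literal>viaZip</literal> and the sample
lists are hypothetical. The first branch has two generators, so it contributes
pairs <literal>(x, y)</literal> to the zip, and the result is as long as the
shorter branch.
</para>
<programlisting>
{-# LANGUAGE ParallelListComp #-}

-- Two branches: the first yields four (x, y) pairs, the second three characters.
viaParallel :: [(Int, Int, Char)]
viaParallel = [ (x, y, c) | x &lt;- [1, 2], y &lt;- [10, 20]
                          | c &lt;- "abc" ]

-- The same list written out following the translation scheme, with zip
-- playing the role of zipN for two branches.
viaZip :: [(Int, Int, Char)]
viaZip = [ (x, y, c) | ((x, y), c) &lt;- zip [ (x, y) | x &lt;- [1, 2], y &lt;- [10, 20] ]
                                          [ c | c &lt;- "abc" ] ]
</programlisting>
<para>
Both definitions evaluate to <literal>[(1,10,'a'),(1,20,'b'),(2,10,'c')]</literal>.
</para>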
1179
1180 </sect2>
1181
1182 <!-- ===================== TRANSFORM LIST COMPREHENSIONS =================== -->
1183
1184 <sect2 id="generalised-list-comprehensions">
1185 <title>Generalised (SQL-Like) List Comprehensions</title>
1186 <indexterm><primary>list comprehensions</primary><secondary>generalised</secondary>
1187 </indexterm>
1188 <indexterm><primary>extended list comprehensions</primary>
1189 </indexterm>
1190 <indexterm><primary>group</primary></indexterm>
1191 <indexterm><primary>sql</primary></indexterm>
1192
1193
1194 <para>Generalised list comprehensions are a further enhancement to the
1195 list comprehension syntactic sugar to allow operations such as sorting
1196 and grouping which are familiar from SQL. They are fully described in the
1197 paper <ulink url="http://research.microsoft.com/~simonpj/papers/list-comp">
1198 Comprehensive comprehensions: comprehensions with "order by" and "group by"</ulink>,
1199 except that the syntax we use differs slightly from the paper.</para>
1200 <para>The extension is enabled with the flag <option>-XTransformListComp</option>.</para>
1201 <para>Here is an example:
1202 <programlisting>
employees = [ ("Simon", "MS", 80)
            , ("Erik", "MS", 100)
            , ("Phil", "Ed", 40)
            , ("Gordon", "Ed", 45)
            , ("Paul", "Yale", 60)]

output = [ (the dept, sum salary)
         | (name, dept, salary) &lt;- employees
         , then group by dept using groupWith
         , then sortWith by (sum salary)
         , then take 5 ]
1214 </programlisting>
1215 In this example, the list <literal>output</literal> would take on
1216 the value:
1217
1218 <programlisting>
1219 [("Yale", 60), ("Ed", 85), ("MS", 180)]
1220 </programlisting>
1221 </para>
1222 <para>There are three new keywords: <literal>group</literal>, <literal>by</literal>, and <literal>using</literal>.
1223 (The functions <literal>sortWith</literal> and <literal>groupWith</literal> are not keywords; they are ordinary
1224 functions that are exported by <literal>GHC.Exts</literal>.)</para>
1225
<para>There are four new forms of comprehension qualifier,
all introduced by the (existing) keyword <literal>then</literal>:
1228 <itemizedlist>
1229 <listitem>
1230
1231 <programlisting>
1232 then f
1233 </programlisting>
1234
<para>This statement requires that <literal>f</literal> have the type
<literal>forall a. [a] -> [a]</literal>. You can see an example of its use in the
opening example, where this form is used to apply <literal>take 5</literal>.</para>
1238
1239 </listitem>
1240
1241
1242 <listitem>
1243 <para>
1244 <programlisting>
1245 then f by e
1246 </programlisting>
1247
1248 This form is similar to the previous one, but allows you to create a function
1249 which will be passed as the first argument to f. As a consequence f must have
1250 the type <literal>forall a. (a -> t) -> [a] -> [a]</literal>. As you can see
1251 from the type, this function lets f &quot;project out&quot; some information
1252 from the elements of the list it is transforming.</para>
1253
1254 <para>An example is shown in the opening example, where <literal>sortWith</literal>
1255 is supplied with a function that lets it find out the <literal>sum salary</literal>
1256 for any item in the list comprehension it transforms.</para>
1257
1258 </listitem>
1259
1260
1261 <listitem>
1262
1263 <programlisting>
1264 then group by e using f
1265 </programlisting>
1266
1267 <para>This is the most general of the grouping-type statements. In this form,
1268 f is required to have type <literal>forall a. (a -> t) -> [a] -> [[a]]</literal>.
1269 As with the <literal>then f by e</literal> case above, the first argument
1270 is a function supplied to f by the compiler which lets it compute e on every
1271 element of the list being transformed. However, unlike the non-grouping case,
1272 f additionally partitions the list into a number of sublists: this means that
1273 at every point after this statement, binders occurring before it in the comprehension
1274 refer to <emphasis>lists</emphasis> of possible values, not single values. To help understand
1275 this, let's look at an example:</para>
1276
1277 <programlisting>
import Data.List (groupBy)
import GHC.Exts (the)

-- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first
groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
groupRuns f = groupBy (\x y -> f x == f y)

output = [ (the x, y)
         | x &lt;- ([1..3] ++ [1..2])
         , y &lt;- [4..6]
         , then group by x using groupRuns ]
1286 </programlisting>
1287
1288 <para>This results in the variable <literal>output</literal> taking on the value below:</para>
1289
1290 <programlisting>
1291 [(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])]
1292 </programlisting>
1293
1294 <para>Note that we have used the <literal>the</literal> function to change the type
1295 of x from a list to its original numeric type. The variable y, in contrast, is left
1296 unchanged from the list form introduced by the grouping.</para>
1297
1298 </listitem>
1299
1300 <listitem>
1301
1302 <programlisting>
1303 then group using f
1304 </programlisting>
1305
1306 <para>With this form of the group statement, f is required to simply have the type
1307 <literal>forall a. [a] -> [[a]]</literal>, which will be used to group up the
1308 comprehension so far directly. An example of this form is as follows:</para>
1309
1310 <programlisting>
output = [ x
         | y &lt;- [1..5]
         , x &lt;- "hello"
         , then group using inits]
1315 </programlisting>
1316
<para>This will yield a list containing every prefix of the string formed by writing the word "hello" five times:</para>
1318
1319 <programlisting>
1320 ["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...]
1321 </programlisting>
1322
1323 </listitem>
1324 </itemizedlist>
1325 </para>
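<para>
The qualifier forms above can be exercised with a short, self-contained sketch.
The sample data <literal>items</literal> and the names <literal>cheapest</literal>
and <literal>totals</literal> are hypothetical; <literal>sortWith</literal>,
<literal>groupWith</literal> and <literal>the</literal> are the ordinary functions
exported by <literal>GHC.Exts</literal>.
</para>
<programlisting>
{-# LANGUAGE TransformListComp #-}

import GHC.Exts (groupWith, sortWith, the)

-- Hypothetical sample data: (item, category, price).
items :: [(String, String, Int)]
items = [ ("tea", "drink", 3), ("bread", "food", 2)
        , ("coffee", "drink", 4), ("rice", "food", 1) ]

-- "then f by e" followed by "then f": names of the two cheapest items.
cheapest :: [String]
cheapest = [ name
           | (name, _, price) &lt;- items
           , then sortWith by price
           , then take 2 ]

-- "then group by e using f": after the grouping qualifier, price refers
-- to a list of prices and category to a list of equal categories,
-- which 'the' collapses to a single value.
totals :: [(String, Int)]
totals = [ (the category, sum price)
         | (_, category, price) &lt;- items
         , then group by category using groupWith ]
</programlisting>
<para>
Here <literal>cheapest</literal> evaluates to <literal>["rice","bread"]</literal> and
<literal>totals</literal> to <literal>[("drink",7),("food",3)]</literal>.
</para>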
1326 </sect2>
1327
1328 <!-- ===================== MONAD COMPREHENSIONS ===================== -->
1329
1330 <sect2 id="monad-comprehensions">
1331 <title>Monad comprehensions</title>
1332 <indexterm><primary>monad comprehensions</primary></indexterm>
1333
1334 <para>
Monad comprehensions generalise the list comprehension notation,
including parallel comprehensions
(<xref linkend="parallel-list-comprehensions"/>) and
transform comprehensions (<xref linkend="generalised-list-comprehensions"/>),
to work for any monad.
1340 </para>
1341
1342 <para>Monad comprehensions support:</para>
1343
1344 <itemizedlist>
1345 <listitem>
1346 <para>
1347 Bindings:
1348 </para>
1349
1350 <programlisting>
1351 [ x + y | x &lt;- Just 1, y &lt;- Just 2 ]
1352 </programlisting>
1353
1354 <para>
1355 Bindings are translated with the <literal>(&gt;&gt;=)</literal> and
1356 <literal>return</literal> functions to the usual do-notation:
1357 </para>
1358
1359 <programlisting>
do x &lt;- Just 1
   y &lt;- Just 2
   return (x+y)
1363 </programlisting>
1364
1365 </listitem>
1366 <listitem>
1367 <para>
1368 Guards:
1369 </para>
1370
1371 <programlisting>
1372 [ x | x &lt;- [1..10], x &lt;= 5 ]
1373 </programlisting>
1374
1375 <para>
1376 Guards are translated with the <literal>guard</literal> function,
1377 which requires a <literal>MonadPlus</literal> instance:
1378 </para>
1379
1380 <programlisting>
do x &lt;- [1..10]
   guard (x &lt;= 5)
   return x
1384 </programlisting>
1385
1386 </listitem>
1387 <listitem>
1388 <para>
1389 Transform statements (as with <literal>-XTransformListComp</literal>):
1390 </para>
1391
1392 <programlisting>
1393 [ x+y | x &lt;- [1..10], y &lt;- [1..x], then take 2 ]
1394 </programlisting>
1395
1396 <para>
1397 This translates to:
1398 </para>
1399
1400 <programlisting>
do (x,y) &lt;- take 2 (do x &lt;- [1..10]
                       y &lt;- [1..x]
                       return (x,y))
   return (x+y)
1405 </programlisting>
1406
1407 </listitem>
1408 <listitem>
1409 <para>
1410 Group statements (as with <literal>-XTransformListComp</literal>):
1411 </para>
1412
1413 <programlisting>
1414 [ x | x &lt;- [1,1,2,2,3], then group by x using GHC.Exts.groupWith ]
1415 [ x | x &lt;- [1,1,2,2,3], then group using myGroup ]
1416 </programlisting>
1417
1418 </listitem>
1419 <listitem>
1420 <para>
1421 Parallel statements (as with <literal>-XParallelListComp</literal>):
1422 </para>
1423
1424 <programlisting>
[ (x+y) | x &lt;- [1..10]
        | y &lt;- [11..20]
        ]
1428 </programlisting>
1429
1430 <para>
1431 Parallel statements are translated using the
1432 <literal>mzip</literal> function, which requires a
1433 <literal>MonadZip</literal> instance defined in
1434 <ulink url="&libraryBaseLocation;/Control-Monad-Zip.html"><literal>Control.Monad.Zip</literal></ulink>:
1435 </para>
1436
1437 <programlisting>
do (x,y) &lt;- mzip (do x &lt;- [1..10]
                     return x)
                 (do y &lt;- [11..20]
                     return y)
   return (x+y)
1443 </programlisting>
1444
1445 </listitem>
1446 </itemizedlist>
1447
1448 <para>
All these features are available once the
<literal>MonadComprehensions</literal> extension is enabled. The types
and more detailed examples of how to use comprehensions are explained
in the earlier sections <xref
linkend="generalised-list-comprehensions"/> and <xref
linkend="parallel-list-comprehensions"/>. In general you just have
to replace the type <literal>[a]</literal> with the type
<literal>Monad m => m a</literal> for monad comprehensions.
1457 </para>
1458
1459 <para>
1460 Note: Even though most of these examples are using the list monad,
1461 monad comprehensions work for any monad.
1462 The <literal>base</literal> package offers all necessary instances for
lists, which make <literal>MonadComprehensions</literal> backward
compatible with built-in, transform and parallel list comprehensions.
1465 </para>
1466 <para> More formally, the desugaring is as follows. We write <literal>D[ e | Q]</literal>
1467 to mean the desugaring of the monad comprehension <literal>[ e | Q]</literal>:
1468 <programlisting>
Expressions: e
Declarations: d
Lists of qualifiers: Q,R,S

-- Basic forms
D[ e | ]            = return e
D[ e | p &lt;- e, Q ]  = e &gt;&gt;= \p -&gt; D[ e | Q ]
D[ e | e, Q ]       = guard e &gt;&gt; D[ e | Q ]
D[ e | let d, Q ]   = let d in D[ e | Q ]

-- Parallel comprehensions (iterate for multiple parallel branches)
D[ e | (Q | R), S ] = mzip D[ Qv | Q ] D[ Rv | R ] &gt;&gt;= \(Qv,Rv) -&gt; D[ e | S ]

-- Transform comprehensions
D[ e | Q then f, R ]                  = f D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]

D[ e | Q then f by b, R ]             = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \Qv -&gt; D[ e | R ]

D[ e | Q then group using f, R ]      = f D[ Qv | Q ] &gt;&gt;= \ys -&gt;
                                        case (fmap selQv1 ys, ..., fmap selQvn ys) of
                                          Qv -&gt; D[ e | R ]

D[ e | Q then group by b using f, R ] = f (\Qv -&gt; b) D[ Qv | Q ] &gt;&gt;= \ys -&gt;
                                        case (fmap selQv1 ys, ..., fmap selQvn ys) of
                                          Qv -&gt; D[ e | R ]

where  Qv is the tuple of variables bound by Q (and used subsequently)
       selQvi is a selector mapping Qv to the ith component of Qv

Operator     Standard binding       Expected type
--------------------------------------------------------------------
return       GHC.Base               t1 -&gt; m t2
(&gt;&gt;=)        GHC.Base               m1 t1 -&gt; (t2 -&gt; m2 t3) -&gt; m3 t3
(&gt;&gt;)         GHC.Base               m1 t1 -&gt; m2 t2         -&gt; m3 t3
guard        Control.Monad          t1 -&gt; m t2
fmap         GHC.Base               forall a b. (a-&gt;b) -&gt; n a -&gt; n b
mzip         Control.Monad.Zip      forall a b. m a -&gt; m b -&gt; m (a,b)
1504 fmap GHC.Base forall a b. (a-&gt;b) -&gt; n a -&gt; n b
1505 mzip Control.Monad.Zip forall a b. m a -&gt; m b -&gt; m (a,b)
1506 </programlisting>
1507 The comprehension should typecheck when its desugaring would typecheck.
1508 </para>
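<para>
To make the basic forms of the desugaring concrete, here is a small sketch in the
<literal>Maybe</literal> monad. All function names are hypothetical. Each
comprehension is paired with a primed version in which the rules for guards and
bindings have been applied by hand; the two versions of each function are
expected to agree on all inputs.
</para>
<programlisting>
{-# LANGUAGE MonadComprehensions #-}

import Control.Monad (guard)

-- A guard in a Maybe comprehension: Nothing when the divisor is zero.
safeDiv :: Int -> Int -> Maybe Int
safeDiv x y = [ x `div` y | y /= 0 ]

-- The guard rule followed by the empty-qualifier rule, applied by hand.
safeDiv' :: Int -> Int -> Maybe Int
safeDiv' x y = guard (y /= 0) >> return (x `div` y)

-- Two bindings in a Maybe comprehension.
sumQuotients :: Int -> Int -> Int -> Maybe Int
sumQuotients a b c = [ q + r | q &lt;- safeDiv a c, r &lt;- safeDiv b c ]

-- The binding rule applied twice by hand.
sumQuotients' :: Int -> Int -> Int -> Maybe Int
sumQuotients' a b c =
  safeDiv' a c >>= \q -> safeDiv' b c >>= \r -> return (q + r)
</programlisting>
<para>
For instance, <literal>sumQuotients 6 9 3</literal> is <literal>Just 5</literal>,
while <literal>sumQuotients 1 2 0</literal> is <literal>Nothing</literal>.
</para>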
1509 <para>
1510 Monad comprehensions support rebindable syntax (<xref linkend="rebindable-syntax"/>).
1511 Without rebindable
1512 syntax, the operators from the "standard binding" module are used; with
1513 rebindable syntax, the operators are looked up in the current lexical scope.
1514 For example, parallel comprehensions will be typechecked and desugared
1515 using whatever "<literal>mzip</literal>" is in scope.
1516 </para>
1517 <para>
1518 The rebindable operators must have the "Expected type" given in the
1519 table above. These types are surprisingly general. For example, you can
1520 use a bind operator with the type
1521 <programlisting>
1522 (>>=) :: T x y a -> (a -> T y z b) -> T x z b
1523 </programlisting>
In the case of transform comprehensions, notice that the groups are
parameterised over some arbitrary type <literal>n</literal> (provided it
has an <literal>fmap</literal>), as well as
the comprehension being over an arbitrary monad.
1528 </para>
1529 </sect2>
1530
1531 <!-- ===================== REBINDABLE SYNTAX =================== -->
1532
1533 <sect2 id="rebindable-syntax">
1534 <title>Rebindable syntax and the implicit Prelude import</title>
1535
1536 <para><indexterm><primary>-XNoImplicitPrelude
1537 option</primary></indexterm> GHC normally imports
1538 <filename>Prelude.hi</filename> files for you. If you'd
1539 rather it didn't, then give it a
1540 <option>-XNoImplicitPrelude</option> option. The idea is
1541 that you can then import a Prelude of your own. (But don't
1542 call it <literal>Prelude</literal>; the Haskell module
1543 namespace is flat, and you must not conflict with any
1544 Prelude module.)</para>
1545
1546 <para>Suppose you are importing a Prelude of your own
1547 in order to define your own numeric class
1548 hierarchy. It completely defeats that purpose if the
1549 literal "1" means "<literal>Prelude.fromInteger
1550 1</literal>", which is what the Haskell Report specifies.
1551 So the <option>-XRebindableSyntax</option>
1552 flag causes
1553 the following pieces of built-in syntax to refer to
1554 <emphasis>whatever is in scope</emphasis>, not the Prelude
1555 versions:
1556 <itemizedlist>
1557 <listitem>
1558 <para>An integer literal <literal>368</literal> means
1559 "<literal>fromInteger (368::Integer)</literal>", rather than
1560 "<literal>Prelude.fromInteger (368::Integer)</literal>".
1561 </para> </listitem>
1562
<listitem><para>Fractional literals are handled in just the same way,
1564 except that the translation is
1565 <literal>fromRational (3.68::Rational)</literal>.
1566 </para> </listitem>
1567
1568 <listitem><para>The equality test in an overloaded numeric pattern
1569 uses whatever <literal>(==)</literal> is in scope.
1570 </para> </listitem>
1571
1572 <listitem><para>The subtraction operation, and the
1573 greater-than-or-equal test, in <literal>n+k</literal> patterns
1574 use whatever <literal>(-)</literal> and <literal>(>=)</literal> are in scope.
1575 </para></listitem>
1576
1577 <listitem>
1578 <para>Negation (e.g. "<literal>- (f x)</literal>")
1579 means "<literal>negate (f x)</literal>", both in numeric
1580 patterns, and expressions.
1581 </para></listitem>
1582
1583 <listitem>
<para>Conditionals (e.g. "<literal>if</literal> e1 <literal>then</literal> e2 <literal>else</literal> e3")
mean "<literal>ifThenElse</literal> e1 e2 e3". However, <literal>case</literal> expressions are unaffected.
1586 </para></listitem>
1587
1588 <listitem>
1589 <para>"Do" notation is translated using whatever
1590 functions <literal>(>>=)</literal>,
1591 <literal>(>>)</literal>, and <literal>fail</literal>,
1592 are in scope (not the Prelude
1593 versions). List comprehensions, mdo (<xref linkend="recursive-do-notation"/>), and parallel array
1594 comprehensions, are unaffected. </para></listitem>
1595
1596 <listitem>
1597 <para>Arrow
1598 notation (see <xref linkend="arrow-notation"/>)
1599 uses whatever <literal>arr</literal>,
1600 <literal>(>>>)</literal>, <literal>first</literal>,
1601 <literal>app</literal>, <literal>(|||)</literal> and
1602 <literal>loop</literal> functions are in scope. But unlike the
1603 other constructs, the types of these functions must match the
1604 Prelude types very closely. Details are in flux; if you want
1605 to use this, ask!
1606 </para></listitem>
1607 </itemizedlist>
1608 <option>-XRebindableSyntax</option> implies <option>-XNoImplicitPrelude</option>.
1609 </para>
1610 <para>
1611 In all cases (apart from arrow notation), the static semantics should be that of the desugared form,
1612 even if that is a little unexpected. For example, the
1613 static semantics of the literal <literal>368</literal>
1614 is exactly that of <literal>fromInteger (368::Integer)</literal>; it's fine for
1615 <literal>fromInteger</literal> to have any of the types:
1616 <programlisting>
1617 fromInteger :: Integer -> Integer
1618 fromInteger :: forall a. Foo a => Integer -> a
1619 fromInteger :: Num a => a -> Integer
1620 fromInteger :: Integer -> Bool -> Bool
1621 </programlisting>
1622 </para>
1623
1624 <para>Be warned: this is an experimental facility, with
1625 fewer checks than usual. Use <literal>-dcore-lint</literal>
1626 to typecheck the desugared program. If Core Lint is happy
1627 you should be all right.</para>
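<para>As an illustration of the rule for conditionals, here is a minimal sketch
(not taken from GHC itself) in which <literal>if</literal> is rebound to accept an
<literal>Int</literal> condition. The function <literal>describe</literal> and this
particular <literal>ifThenElse</literal> are assumptions made for the example. The
Prelude is imported explicitly because <option>-XRebindableSyntax</option> implies
<option>-XNoImplicitPrelude</option>.</para>
<programlisting>
{-# LANGUAGE RebindableSyntax #-}
module Main where

import Prelude

-- Hypothetical rebinding: zero counts as "false", anything else as "true".
ifThenElse :: Int -> a -> a -> a
ifThenElse 0 _ e = e
ifThenElse _ t _ = t

-- "if n then .. else .." now desugars to "ifThenElse n .. ..".
describe :: Int -> String
describe n = if n then "non-zero" else "zero"

main :: IO ()
main = do
  putStrLn (describe 0)   -- zero
  putStrLn (describe 5)   -- non-zero
</programlisting>
<para>The literals and the <literal>do</literal> block in this sketch still use the
Prelude's <literal>fromInteger</literal>, <literal>(==)</literal> and
<literal>(>>)</literal>, simply because those are the versions in scope.</para>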
1628
1629 </sect2>
1630
1631 <sect2 id="postfix-operators">
1632 <title>Postfix operators</title>
1633
1634 <para>
1635 The <option>-XPostfixOperators</option> flag enables a small
1636 extension to the syntax of left operator sections, which allows you to
1637 define postfix operators. The extension is this: the left section
1638 <programlisting>
1639 (e !)
1640 </programlisting>
1641 is equivalent (from the point of view of both type checking and execution) to the expression
1642 <programlisting>
1643 ((!) e)
1644 </programlisting>
1645 (for any expression <literal>e</literal> and operator <literal>(!)</literal>).
1646 The strict Haskell 98 interpretation is that the section is equivalent to
1647 <programlisting>
1648 (\y -> (!) e y)
1649 </programlisting>
1650 That is, the operator must be a function of two arguments. GHC allows it to
1651 take only one argument, and that in turn allows you to write the function
1652 postfix.
1653 </para>
1654 <para>The extension does not extend to the left-hand side of function
1655 definitions; you must define such a function in prefix form.</para>
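<para>A small sketch of the extension in use, assuming a hypothetical factorial
operator <literal>(!)</literal> that takes a single argument. As noted above, the
operator is defined in prefix form.</para>
<programlisting>
{-# LANGUAGE PostfixOperators #-}
module Main where

-- A one-argument operator, defined in prefix form.
(!) :: Integer -> Integer
(!) n = product [1 .. n]

main :: IO ()
main = print (5 !)   -- the left section (5 !) means ((!) 5); prints 120
</programlisting>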
1656
1657 </sect2>
1658
1659 <sect2 id="tuple-sections">
1660 <title>Tuple sections</title>
1661
1662 <para>
1663 The <option>-XTupleSections</option> flag enables Python-style partially applied
1664 tuple constructors. For example, the following program
1665 <programlisting>
1666 (, True)
1667 </programlisting>
1668 is considered to be an alternative notation for the more unwieldy
1669 <programlisting>
1670 \x -> (x, True)
1671 </programlisting>
1672 You can omit any combination of arguments to the tuple, as in the following
1673 <programlisting>
1674 (, "I", , , "Love", , 1337)
1675 </programlisting>
1676 which translates to
1677 <programlisting>
1678 \a b c d -> (a, "I", b, c, "Love", d, 1337)
1679 </programlisting>
1680 </para>
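<para>The following sketch shows both forms in a complete module. The names
<literal>tagAll</literal> and <literal>sandwich</literal> are made up for the
example.</para>
<programlisting>
{-# LANGUAGE TupleSections #-}
module Main where

-- (, True) is shorthand for \x -> (x, True).
tagAll :: [Int] -> [(Int, Bool)]
tagAll = map (, True)

-- Two omitted components give a two-argument function:
-- (, "mid", ) is shorthand for \a b -> (a, "mid", b).
sandwich :: Char -> Bool -> (Char, String, Bool)
sandwich = (, "mid", )

main :: IO ()
main = do
  print (tagAll [1, 2])        -- [(1,True),(2,True)]
  print (sandwich 'x' False)   -- ('x',"mid",False)
</programlisting>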
1681
1682 <para>
1683 If you have <link linkend="unboxed-tuples">unboxed tuples</link> enabled, tuple sections
1684 will also be available for them, like so
1685 <programlisting>
1686 (# , True #)
1687 </programlisting>
1688 Because there is no unboxed unit tuple, the following expression
1689 <programlisting>
1690 (# #)
1691 </programlisting>
1692 continues to stand for the unboxed singleton tuple data constructor.
1693 </para>
1694
1695 </sect2>
1696
1697 <sect2 id="lambda-case">
1698 <title>Lambda-case</title>
1699 <para>
1700 The <option>-XLambdaCase</option> flag enables expressions of the form
1701 <programlisting>
1702 \case { p1 -> e1; ...; pN -> eN }
1703 </programlisting>
1704 which is equivalent to
1705 <programlisting>
1706 \freshName -> case freshName of { p1 -> e1; ...; pN -> eN }
1707 </programlisting>
1708 Note that <literal>\case</literal> starts a layout, so you can write
1709 <programlisting>
1710 \case
1711 p1 -> e1
1712 ...
1713 pN -> eN
1714 </programlisting>
1715 </para>
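<para>For instance, a function that would otherwise need a named argument only to
scrutinise it immediately can be sketched as follows (the function
<literal>describe</literal> is a hypothetical example):</para>
<programlisting>
{-# LANGUAGE LambdaCase #-}
module Main where

describe :: Maybe Int -> String
describe = \case
  Nothing -> "nothing"
  Just n
    | even n    -> "even"
    | otherwise -> "odd"

main :: IO ()
main = mapM_ (putStrLn . describe) [Nothing, Just 2, Just 3]
</programlisting>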
1716 </sect2>
1717
1718 <sect2 id="empty-case">
1719 <title>Empty case alternatives</title>
1720 <para>
1721 The <option>-XEmptyCase</option> flag enables
1722 case expressions, or lambda-case expressions, that have no alternatives,
1723 thus:
1724 <programlisting>
1725 case e of { } -- No alternatives
1726 or
1727 \case { } -- -XLambdaCase is also required
1728 </programlisting>
1729 This can be useful when you know that the expression being scrutinised
1730 has no non-bottom values. For example:
1731 <programlisting>
1732 data Void
1733 f :: Void -> Int
1734 f x = case x of { }
1735 </programlisting>
1736 With dependently-typed features it is more useful
1737 (see <ulink url="http://ghc.haskell.org/trac/ghc/ticket/2431">Trac</ulink>).
1738 For example, consider these two candidate definitions of <literal>absurd</literal>:
1739 <programlisting>
1740 data a :~: b where
1741 Refl :: a :~: a
1742
1743 absurd :: True :~: False -> a
1744 absurd x = error "absurd" -- (A)
1745 absurd x = case x of {} -- (B)
1746 </programlisting>
1747 We much prefer (B). Why? Because GHC can figure out that <literal>(True :~: False)</literal>
1748 is an empty type. So (B) has no partiality and GHC should be able to compile with
1749 <option>-fwarn-incomplete-patterns</option>. (Though the pattern match checking is not
1750 yet clever enough to do that.)
1751 On the other hand (A) looks dangerous, and GHC doesn't check to make
1752 sure that, in fact, the function can never get called.
1753 </para>
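<para>A self-contained sketch of the simpler <literal>Void</literal> case, showing how
an empty case discharges a branch that can never be taken. The helper
<literal>unwrap</literal> is an assumption made for this example.</para>
<programlisting>
{-# LANGUAGE EmptyCase, EmptyDataDecls #-}
module Main where

data Void

absurd :: Void -> a
absurd x = case x of { }

-- The Right branch can never hold a non-bottom value, so it is
-- discharged with absurd rather than with a call to error.
unwrap :: Either a Void -> a
unwrap (Left a)  = a
unwrap (Right v) = absurd v

main :: IO ()
main = putStrLn (unwrap (Left "only Left is possible"))
</programlisting>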
1754 </sect2>
1755
1756 <sect2 id="multi-way-if">
1757 <title>Multi-way if-expressions</title>
1758 <para>
1759 With the <option>-XMultiWayIf</option> flag, GHC accepts conditional expressions
1760 with multiple branches:
1761 <programlisting>
1762 if | guard1 -> expr1
1763 | ...
1764 | guardN -> exprN
1765 </programlisting>
1766 which is roughly equivalent to
1767 <programlisting>
1768 case () of
1769 _ | guard1 -> expr1
1770 ...
1771 _ | guardN -> exprN
1772 </programlisting>
1773 except that multi-way if-expressions do not alter the layout.
1774 </para>
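<para>A complete example, using a hypothetical <literal>classify</literal> function
whose guards are tried from top to bottom:</para>
<programlisting>
{-# LANGUAGE MultiWayIf #-}
module Main where

classify :: Int -> String
classify n = if | n == 0    -> "zero"
                | even n    -> "even"
                | otherwise -> "odd"

main :: IO ()
main = mapM_ (putStrLn . classify) [0, 4, 7]
</programlisting>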
1775 </sect2>
1776
1777 <sect2 id="disambiguate-fields">
1778 <title>Record field disambiguation</title>
1779 <para>
1780 In record construction and record pattern matching
1781 it is entirely unambiguous which field is referred to, even if there are two different
1782 data types in scope with a common field name. For example:
1783 <programlisting>
1784 module M where
1785 data S = MkS { x :: Int, y :: Bool }
1786
1787 module Foo where
1788 import M
1789
1790 data T = MkT { x :: Int }
1791
1792 ok1 (MkS { x = n }) = n+1 -- Unambiguous
1793 ok2 n = MkT { x = n+1 } -- Unambiguous
1794
1795 bad1 k = k { x = 3 } -- Ambiguous
1796 bad2 k = x k -- Ambiguous
1797 </programlisting>
1798 Even though there are two <literal>x</literal>'s in scope,
1799 it is clear that the <literal>x</literal> in the pattern in the
1800 definition of <literal>ok1</literal> can only mean the field
1801 <literal>x</literal> from type <literal>S</literal>. Similarly for
1802 the function <literal>ok2</literal>. However, in the record update
1803 in <literal>bad1</literal> and the record selection in <literal>bad2</literal>
1804 it is not clear which of the two types is intended.
1805 </para>
1806 <para>
1807 Haskell 98 regards all four as ambiguous, but with the
1808 <option>-XDisambiguateRecordFields</option> flag, GHC will accept
1809 the former two. The rules are precisely the same as those for instance
1810 declarations in Haskell 98, where the method names on the left-hand side
1811 of the method bindings in an instance declaration refer unambiguously
1812 to the method of that class (provided they are in scope at all), even
1813 if there are other variables in scope with the same name.
1814 This reduces the clutter of qualified names when you import two
1815 records from different modules that use the same field name.
1816 </para>
1817 <para>
1818 Some details:
1819 <itemizedlist>
1820 <listitem><para>
1821 Field disambiguation can be combined with punning (see <xref linkend="record-puns"/>). For example:
1822 <programlisting>
1823 module Foo where
1824 import M
1825 x=True
1826 ok3 (MkS { x }) = x+1 -- Uses both disambiguation and punning
1827 </programlisting>
1828 </para></listitem>
1829
1830 <listitem><para>
1831 With <option>-XDisambiguateRecordFields</option> you can use <emphasis>unqualified</emphasis>
1832 field names even if the corresponding selector is only in scope <emphasis>qualified</emphasis>.
1833 For example, assuming the same module <literal>M</literal> as in our earlier example, this is legal:
1834 <programlisting>
1835 module Foo where
1836 import qualified M -- Note qualified
1837
1838 ok4 (M.MkS { x = n }) = n+1 -- Unambiguous
1839 </programlisting>
1840 Since the constructor <literal>MkS</literal> is only in scope qualified, you must
1841 name it <literal>M.MkS</literal>, but the field <literal>x</literal> does not need
1842 to be qualified even though <literal>M.x</literal> is in scope but <literal>x</literal>
1843 is not. (In effect, it is qualified by the constructor.)
1844 </para></listitem>
1845 </itemizedlist>
1846 </para>
1847
1848 </sect2>
1849
1850 <!-- ===================== Record puns =================== -->
1851
1852 <sect2 id="record-puns">
1853 <title>Record puns
1854 </title>
1855
1856 <para>
1857 Record puns are enabled by the flag <literal>-XNamedFieldPuns</literal>.
1858 </para>
1859
1860 <para>
1861 When using records, it is common to write a pattern that binds a
1862 variable with the same name as a record field, such as:
1863
1864 <programlisting>
1865 data C = C {a :: Int}
1866 f (C {a = a}) = a
1867 </programlisting>
1868 </para>
1869
1870 <para>
1871 Record punning permits the variable name to be elided, so one can simply
1872 write
1873
1874 <programlisting>
1875 f (C {a}) = a
1876 </programlisting>
1877
1878 to mean the same pattern as above. That is, in a record pattern, the
1879 pattern <literal>a</literal> expands into the pattern <literal>a =
1880 a</literal> for the same name <literal>a</literal>.
1881 </para>
1882
1883 <para>
1884 Note that:
1885 <itemizedlist>
1886 <listitem><para>
1887 Record punning can also be used in an expression, writing, for example,
1888 <programlisting>
1889 let a = 1 in C {a}
1890 </programlisting>
1891 instead of
1892 <programlisting>
1893 let a = 1 in C {a = a}
1894 </programlisting>
1895 The expansion is purely syntactic, so the expanded right-hand side
1896 expression refers to the nearest enclosing variable that is spelled the
1897 same as the field name.
1898 </para></listitem>
1899
1900 <listitem><para>
1901 Puns and other patterns can be mixed in the same record:
1902 <programlisting>
1903 data C = C {a :: Int, b :: Int}
1904 f (C {a, b = 4}) = a
1905 </programlisting>
1906 </para></listitem>
1907
1908 <listitem><para>
1909 Puns can be used wherever record patterns occur (e.g. in
1910 <literal>let</literal> bindings or at the top-level).
1911 </para></listitem>
1912
1913 <listitem><para>
1914 A pun on a qualified field name is expanded by stripping off the module qualifier.
1915 For example:
1916 <programlisting>
1917 f (M.C {M.a}) = a
1918 </programlisting>
1919 means
1920 <programlisting>
1921 f (M.C {M.a = a}) = a
1922 </programlisting>
1923 (This is useful if the field selector <literal>a</literal> for constructor <literal>M.C</literal>
1924 is only in scope in qualified form.)
1925 </para></listitem>
1926 </itemizedlist>
1927 </para>
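<para>The pattern and expression forms can be seen together in the following sketch.
The type <literal>Point</literal> and the functions <literal>manhattan</literal> and
<literal>diagonal</literal> are invented for the example.</para>
<programlisting>
{-# LANGUAGE NamedFieldPuns #-}
module Main where

data Point = Point { px :: Int, py :: Int } deriving Show

-- Pattern pun: Point {px, py} means Point {px = px, py = py}.
manhattan :: Point -> Int
manhattan (Point {px, py}) = abs px + abs py

-- Expression pun: the fields are filled from the local variables px and py.
diagonal :: Int -> Point
diagonal n = let { px = n; py = negate n } in Point {px, py}

main :: IO ()
main = do
  print (diagonal 3)               -- Point {px = 3, py = -3}
  print (manhattan (diagonal 3))   -- 6
</programlisting>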
1928
1929
1930 </sect2>
1931
1932 <!-- ===================== Record wildcards =================== -->
1933
1934 <sect2 id="record-wildcards">
1935 <title>Record wildcards
1936 </title>
1937
1938 <para>
1939 Record wildcards are enabled by the flag <literal>-XRecordWildCards</literal>.
1940 This flag implies <literal>-XDisambiguateRecordFields</literal>.
1941 </para>
1942
1943 <para>
1944 For records with many fields, it can be tiresome to write out each field
1945 individually in a record pattern, as in
1946 <programlisting>
1947 data C = C {a :: Int, b :: Int, c :: Int, d :: Int}
1948 f (C {a = 1, b = b, c = c, d = d}) = b + c + d
1949 </programlisting>
1950 </para>
1951
1952 <para>
1953 Record wildcard syntax permits a "<literal>..</literal>" in a record
1954 pattern, where each elided field <literal>f</literal> is replaced by the
1955 pattern <literal>f = f</literal>. For example, the above pattern can be
1956 written as
1957 <programlisting>
1958 f (C {a = 1, ..}) = b + c + d
1959 </programlisting>
1960 </para>
1961
1962 <para>
1963 More details:
1964 <itemizedlist>
1965 <listitem><para>
1966 Wildcards can be mixed with other patterns, including puns
1967 (<xref linkend="record-puns"/>); for example, in a pattern <literal>(C {a
1968 = 1, b, ..})</literal>. Additionally, record wildcards can be used
1969 wherever record patterns occur, including in <literal>let</literal>
1970 bindings and at the top-level. For example, the top-level binding
1971 <programlisting>
1972 C {a = 1, ..} = e
1973 </programlisting>
1974 defines <literal>b</literal>, <literal>c</literal>, and
1975 <literal>d</literal>.
1976 </para></listitem>
1977
1978 <listitem><para>
1979 Record wildcards can also be used in expressions, writing, for example,
1980 <programlisting>
1981 let {a = 1; b = 2; c = 3; d = 4} in C {..}
1982 </programlisting>
1983 in place of
1984 <programlisting>
1985 let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}
1986 </programlisting>
1987 The expansion is purely syntactic, so the record wildcard
1988 expression refers to the nearest enclosing variables that are spelled
1989 the same as the omitted field names.
1990 </para></listitem>
1991
1992 <listitem><para>
1993 The "<literal>..</literal>" expands to the missing
1994 <emphasis>in-scope</emphasis> record fields.
1995 Specifically the expansion of "<literal>C {..}</literal>" includes
1996 <literal>f</literal> if and only if:
1997 <itemizedlist>
1998 <listitem><para>
1999 <literal>f</literal> is a record field of constructor <literal>C</literal>.
2000 </para></listitem>
2001 <listitem><para>
2002 The record field <literal>f</literal> is in scope somehow (either qualified or unqualified).
2003 </para></listitem>
2004 <listitem><para>
2005 In the case of expressions (but not patterns),
2006 the variable <literal>f</literal> is in scope unqualified,
2007 apart from the binding of the record selector itself.
2008 </para></listitem>
2009 </itemizedlist>
2010 For example
2011 <programlisting>
2012 module M where
2013 data R = R { a,b,c :: Int }
2014 module X where
2015 import M( R(R,a,c) )
2016 f a b = R { .. }
2017 </programlisting>
2018 The <literal>R{..}</literal> expands to <literal>R{M.a=a}</literal>,
2019 omitting <literal>b</literal> since the record field is not in scope,
2020 and omitting <literal>c</literal> since the variable <literal>c</literal>
2021 is not in scope (apart from the binding of the
2022 record selector <literal>c</literal>, of course).
2023 </para></listitem>
2024 </itemizedlist>
2025 </para>
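<para>A single-module sketch of wildcards in patterns and in expressions, reusing the
type <literal>C</literal> from above. The function <literal>mk</literal> and the second
clause of <literal>f</literal> are additions made for the example.</para>
<programlisting>
{-# LANGUAGE RecordWildCards #-}
module Main where

data C = C { a :: Int, b :: Int, c :: Int, d :: Int } deriving Show

-- In a pattern, ".." binds every field that is not mentioned explicitly.
f :: C -> Int
f (C {a = 1, ..}) = b + c + d
f (C {..})        = a

-- In an expression, ".." picks up the local variables a, b, c and d.
mk :: Int -> C
mk n = let { a = 1; b = n; c = n; d = n } in C {..}

main :: IO ()
main = do
  print (mk 2)            -- C {a = 1, b = 2, c = 2, d = 2}
  print (f (mk 2))        -- 6
  print (f (C 5 0 0 0))   -- 5
</programlisting>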
2026
2027 </sect2>
2028
2029 <!-- ===================== Local fixity declarations =================== -->
2030
2031 <sect2 id="local-fixity-declarations">
2032 <title>Local Fixity Declarations
2033 </title>
2034
2035 <para>A careful reading of the Haskell 98 Report reveals that fixity
2036 declarations (<literal>infix</literal>, <literal>infixl</literal>, and
2037 <literal>infixr</literal>) are permitted to appear inside local bindings
2038 such as those introduced by <literal>let</literal> and
2039 <literal>where</literal>. However, the Haskell Report does not specify
2040 the semantics of such bindings very precisely.
2041 </para>
2042
2043 <para>In GHC, a fixity declaration may accompany a local binding:
2044 <programlisting>
2045 let f = ...
2046 infixr 3 `f`
2047 in
2048 ...
2049 </programlisting>
2050 and the fixity declaration applies wherever the binding is in scope.
2051 For example, in a <literal>let</literal>, it applies in the right-hand
2052 sides of other <literal>let</literal>-bindings and the body of the
2053 <literal>let</literal>. Or, in recursive <literal>do</literal>
2054 expressions (<xref linkend="recursive-do-notation"/>), the local fixity
2055 declarations of a <literal>let</literal> statement scope over other
2056 statements in the group, just as the bound name does.
2057 </para>
2058
2059 <para>
2060 Moreover, a local fixity declaration <emphasis>must</emphasis> accompany a local binding of
2061 that name: it is not possible to revise the fixity of a name bound
2062 elsewhere, as in
2063 <programlisting>
2064 let infixr 9 $ in ...
2065 </programlisting>
2066
2067 Because local fixity declarations are technically Haskell 98, no flag is
2068 necessary to enable them.
2069 </para>
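<para>As a concrete sketch, the following module binds a hypothetical
reverse-application operator <literal>|></literal> locally and gives it a fixity in
the same <literal>let</literal>:</para>
<programlisting>
module Main where

result :: Int
result =
  let x |> g = g x      -- local operator: reverse application
      infixl 1 |>       -- its fixity, declared alongside the binding
  in  3 |> (+ 1) |> (* 2)

main :: IO ()
main = print result     -- ((3 |> (+ 1)) |> (* 2)) = 8
</programlisting>
<para>Because the operator is declared <literal>infixl</literal>, the chain groups to
the left, so each function is applied to the result of the previous step.</para>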
2070 </sect2>
2071
2072 <sect2 id="package-imports">
2073 <title>Package-qualified imports</title>
2074
2075 <para>With the <option>-XPackageImports</option> flag, GHC allows
2076 import declarations to be qualified by the package name that the
2077 module is intended to be imported from. For example:</para>
2078
2079 <programlisting>
2080 import "network" Network.Socket
2081 </programlisting>
2082
2083 <para>would import the module <literal>Network.Socket</literal> from
2084 the package <literal>network</literal> (any version). This may
2085 be used to disambiguate an import when the same module is
2086 available from multiple packages, or is present in both the
2087 current package being built and an external package.</para>
2088
2089 <para>The special package name <literal>this</literal> can be used to
2090 refer to the current package being built.</para>
2091
2092 <para>Note: you probably don't need to use this feature; it was
2093 added mainly so that we can build backwards-compatible versions of
2094 packages when APIs change. It can lead to fragile dependencies in
2095 the common case: modules occasionally move from one package to
2096 another, rendering any package-qualified imports broken.</para>
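<para>A minimal complete module, using the <literal>base</literal> package only
because it is always available; in practice the qualifier matters when two packages
expose a module of the same name.</para>
<programlisting>
{-# LANGUAGE PackageImports #-}
module Main where

-- Take Data.List from the "base" package specifically.
import "base" Data.List (sort)

main :: IO ()
main = print (sort [3, 1, 2 :: Int])   -- [1,2,3]
</programlisting>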
2097 </sect2>
2098
2099 <sect2 id="safe-imports-ext">
2100 <title>Safe imports</title>
2101
2102 <para>With the <option>-XSafe</option>, <option>-XTrustworthy</option>
2103 and <option>-XUnsafe</option> language flags, GHC extends
2104 the import declaration syntax to take an optional <literal>safe</literal>
2105 keyword after the <literal>import</literal> keyword. This feature
2106 is part of the Safe Haskell GHC extension. For example:</para>
2107
2108 <programlisting>
2109 import safe qualified Network.Socket as NS
2110 </programlisting>
2111
2112 <para>would import the module <literal>Network.Socket</literal>
2113 with compilation only succeeding if <literal>Network.Socket</literal> can be
2114 safely imported. For a description of when an import is
2115 considered safe, see <xref linkend="safe-haskell"/>.</para>
2116
2117 </sect2>
2118
2119 <sect2 id="explicit-namespaces">
2120 <title>Explicit namespaces in import/export</title>
2121
2122 <para> In an import or export list, such as
2123 <programlisting>
2124 module M( f, (++) ) where ...
2125 import N( f, (++) )
2126 ...
2127 </programlisting>
2128 the entities <literal>f</literal> and <literal>(++)</literal> are <emphasis>values</emphasis>.
2129 However, with type operators (<xref linkend="type-operators"/>) it becomes possible
2130 to declare <literal>(++)</literal> as a <emphasis>type constructor</emphasis>. In that
2131 case, how would you export or import it?
2132 </para>
2133 <para>
2134 The <option>-XExplicitNamespaces</option> extension allows you to prefix the name of
2135 a type constructor in an import or export list with "<literal>type</literal>" to
2136 disambiguate this case, thus:
2137 <programlisting>
2138 module M( f, type (++) ) where ...
2139 import N( f, type (++) )
2140 ...
2141 module N( f, type (++) ) where
2142 data a ++ b = L a | R b
2143 </programlisting>
2144 The extension <option>-XExplicitNamespaces</option>
2145 is implied by <option>-XTypeOperators</option> and (for some reason) by <option>-XTypeFamilies</option>.
2146 </para>
2147 </sect2>
2148
2149 <sect2 id="syntax-stolen">
2150 <title>Summary of stolen syntax</title>
2151
2152 <para>Turning on an option that enables special syntax
2153 <emphasis>might</emphasis> cause working Haskell 98 code to fail
2154 to compile, perhaps because it uses a variable name which has
2155 become a reserved word. This section lists the syntax that is
2156 "stolen" by language extensions.
2157 We use
2158 notation and nonterminal names from the Haskell 98 lexical syntax
2159 (see the Haskell 98 Report).
2160 We only list syntax changes here that might affect
2161 existing working programs (i.e. "stolen" syntax). Many of these
2162 extensions will also enable new context-free syntax, but in all
2163 cases programs written to use the new syntax would not be
2164 compilable without the option enabled.</para>
2165
2166 <para>There are two classes of special
2167 syntax:
2168
2169 <itemizedlist>
2170 <listitem>
2171 <para>New reserved words and symbols: character sequences
2172 which are no longer available for use as identifiers in the
2173 program.</para>
2174 </listitem>
2175 <listitem>
2176 <para>Other special syntax: sequences of characters that have
2177 a different meaning when this particular option is turned
2178 on.</para>
2179 </listitem>
2180 </itemizedlist>
2181
2182 The following syntax is stolen:
2183
2184 <variablelist>
2185 <varlistentry>
2186 <term>
2187 <literal>forall</literal>
2188 <indexterm><primary><literal>forall</literal></primary></indexterm>
2189 </term>
2190 <listitem><para>
2191 Stolen (in types) by: <option>-XExplicitForAll</option>, and hence by
2192 <option>-XScopedTypeVariables</option>,
2193 <option>-XLiberalTypeSynonyms</option>,
2194 <option>-XRankNTypes</option>,
2195 <option>-XExistentialQuantification</option>
2196 </para></listitem>
2197 </varlistentry>
2198
2199 <varlistentry>
2200 <term>
2201 <literal>mdo</literal>
2202 <indexterm><primary><literal>mdo</literal></primary></indexterm>
2203 </term>
2204 <listitem><para>
2205 Stolen by: <option>-XRecursiveDo</option>
2206 </para></listitem>
2207 </varlistentry>
2208
2209 <varlistentry>
2210 <term>
2211 <literal>foreign</literal>
2212 <indexterm><primary><literal>foreign</literal></primary></indexterm>
2213 </term>
2214 <listitem><para>
2215 Stolen by: <option>-XForeignFunctionInterface</option>
2216 </para></listitem>
2217 </varlistentry>
2218
2219 <varlistentry>
2220 <term>
2221 <literal>rec</literal>,
2222 <literal>proc</literal>, <literal>-&lt;</literal>,
2223 <literal>&gt;-</literal>, <literal>-&lt;&lt;</literal>,
2224 <literal>&gt;&gt;-</literal>, and <literal>(|</literal>,
2225 <literal>|)</literal> brackets
2226 <indexterm><primary><literal>proc</literal></primary></indexterm>
2227 </term>
2228 <listitem><para>
2229 Stolen by: <option>-XArrows</option>
2230 </para></listitem>
2231 </varlistentry>
2232
2233 <varlistentry>
2234 <term>
2235 <literal>?<replaceable>varid</replaceable></literal>,
2236 <literal>%<replaceable>varid</replaceable></literal>
2237 <indexterm><primary>implicit parameters</primary></indexterm>
2238 </term>
2239 <listitem><para>
2240 Stolen by: <option>-XImplicitParams</option>
2241 </para></listitem>
2242 </varlistentry>
2243
2244 <varlistentry>
2245 <term>
2246 <literal>[|</literal>,
2247 <literal>[e|</literal>, <literal>[p|</literal>,
2248 <literal>[d|</literal>, <literal>[t|</literal>,
2249 <literal>$(</literal>,
2250 <literal>$<replaceable>varid</replaceable></literal>
2251 <indexterm><primary>Template Haskell</primary></indexterm>
2252 </term>
2253 <listitem><para>
2254 Stolen by: <option>-XTemplateHaskell</option>
2255 </para></listitem>
2256 </varlistentry>
2257
2258 <varlistentry>
2259 <term>
2260 <literal>[<replaceable>varid</replaceable>|</literal>
2261 <indexterm><primary>quasi-quotation</primary></indexterm>
2262 </term>
2263 <listitem><para>
2264 Stolen by: <option>-XQuasiQuotes</option>
2265 </para></listitem>
2266 </varlistentry>
2267
2268 <varlistentry>
2269 <term>
2270 <replaceable>varid</replaceable>{<literal>&num;</literal>},
2271 <replaceable>char</replaceable><literal>&num;</literal>,
2272 <replaceable>string</replaceable><literal>&num;</literal>,
2273 <replaceable>integer</replaceable><literal>&num;</literal>,
2274 <replaceable>float</replaceable><literal>&num;</literal>,
2275 <replaceable>float</replaceable><literal>&num;&num;</literal>
2276 </term>
2277 <listitem><para>
2278 Stolen by: <option>-XMagicHash</option>
2279 </para></listitem>
2280 </varlistentry>
2281
2282 <varlistentry>
2283 <term>
2284 <literal>(&num;</literal>, <literal>&num;)</literal>
2285 </term>
2286 <listitem><para>
2287 Stolen by: <option>-XUnboxedTuples</option>
2288 </para></listitem>
2289 </varlistentry>
2290
2291 <varlistentry>
2292 <term>
2293 <replaceable>varid</replaceable><literal>!</literal><replaceable>varid</replaceable>
2294 </term>
2295 <listitem><para>
2296 Stolen by: <option>-XBangPatterns</option>
2297 </para></listitem>
2298 </varlistentry>
2299 </variablelist>
2300 </para>
2301 </sect2>
2302 </sect1>
2303
2304
2305 <!-- TYPE SYSTEM EXTENSIONS -->
2306 <sect1 id="data-type-extensions">
2307 <title>Extensions to data types and type synonyms</title>
2308
2309 <sect2 id="nullary-types">
2310 <title>Data types with no constructors</title>
2311
2312 <para>With the <option>-XEmptyDataDecls</option> flag (or equivalent LANGUAGE pragma),
2313 GHC lets you declare a data type with no constructors. For example:</para>
2314
2315 <programlisting>
2316 data S -- S :: *
2317 data T a -- T :: * -> *
2318 </programlisting>
2319
2320 <para>Syntactically, the declaration lacks the "= constrs" part. The
2321 type can be parameterised over types of any kind, but if the kind is
2322 not <literal>*</literal> then an explicit kind annotation must be used
2323 (see <xref linkend="kinding"/>).</para>
2324
2325 <para>Such data types have only one value, namely bottom.
2326 Nevertheless, they can be useful when defining "phantom types".</para>
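<para>A sketch of the phantom-type idiom: the empty types below are used only as tags
that the type checker keeps apart. All names in it (<literal>Metres</literal>,
<literal>Feet</literal>, <literal>Distance</literal>, <literal>addDistance</literal>)
are invented for the example.</para>
<programlisting>
{-# LANGUAGE EmptyDataDecls #-}
module Main where

data Metres
data Feet

-- The parameter "unit" does not appear on the right-hand side: it is a phantom.
newtype Distance unit = Distance Double deriving Show

addDistance :: Distance u -> Distance u -> Distance u
addDistance (Distance x) (Distance y) = Distance (x + y)

main :: IO ()
main = do
  let m1 = Distance 1.5 :: Distance Metres
      m2 = Distance 2.5 :: Distance Metres
  print (addDistance m1 m2)   -- Distance 4.0
  -- addDistance m1 (Distance 3 :: Distance Feet) would be rejected
  -- by the type checker, because Metres and Feet do not unify.
</programlisting>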
2327 </sect2>
2328
2329 <sect2 id="datatype-contexts">
2330 <title>Data type contexts</title>
2331
2332 <para>Haskell allows datatypes to be given contexts, e.g.</para>
2333
2334 <programlisting>
2335 data Eq a => Set a = NilSet | ConsSet a (Set a)
2336 </programlisting>
2337
2338 <para>This gives the constructors the following types:</para>
2339
2340 <programlisting>
2341 NilSet :: Set a
2342 ConsSet :: Eq a => a -> Set a -> Set a
2343 </programlisting>
2344
2345 <para>This is widely considered a misfeature, and is going to be removed from
2346 the language. In GHC, it is controlled by the deprecated extension
2347 <literal>DatatypeContexts</literal>.</para>
2348 </sect2>
2349
2350 <sect2 id="infix-tycons">
2351 <title>Infix type constructors, classes, and type variables</title>
2352
2353 <para>
2354 GHC allows type constructors, classes, and type variables to be operators, and
2355 to be written infix, very much like expressions. More specifically:
2356 <itemizedlist>
2357 <listitem><para>
2358 A type constructor or class can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
2359 The lexical syntax is the same as that for data constructors.
2360 </para></listitem>
2361 <listitem><para>
2362 Data type and type-synonym declarations can be written infix, parenthesised
2363 if you want further arguments. E.g.
2364 <screen>
2365 data a :*: b = Foo a b
2366 type a :+: b = Either a b
2367 class a :=: b where ...
2368
2369 data (a :**: b) x = Baz a b x
2370 type (a :++: b) y = Either (a,b) y
2371 </screen>
2372 </para></listitem>
2373 <listitem><para>
2374 Types, and class constraints, can be written infix. For example
2375 <screen>
2376 x :: Int :*: Bool
2377 f :: (a :=: b) => a -> b
2378 </screen>
2379 </para></listitem>
2380 <listitem><para>
2381 Back-quotes work
2382 as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or
2383 <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>.
2384 </para></listitem>
2385 <listitem><para>
2386 Fixities may be declared for type constructors, or classes, just as for data constructors. However,
2387 one cannot distinguish between the two in a fixity declaration; a fixity declaration
2388 sets the fixity for a data constructor and the corresponding type constructor. For example:
2389 <screen>
2390 infixl 7 T, :*:
2391 </screen>
2392 sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>,
2393 and similarly for <literal>:*:</literal>.
2395 </para></listitem>
2396 <listitem><para>
2397 Function arrow is <literal>infixr</literal> with fixity 0. (This might change; I'm not sure what it should be.)
2398 </para></listitem>
2399
2400 </itemizedlist>
2401 </para>
2402 </sect2>
2403
2404 <sect2 id="type-operators">
2405 <title>Type operators</title>
2406 <para>
2407 In types, an operator symbol like <literal>(+)</literal> is normally treated as a type
2408 <emphasis>variable</emphasis>, just like <literal>a</literal>. Thus in Haskell 98 you can say
2409 <programlisting>
2410 type T (+) = ((+), (+))
2411 -- Just like: type T a = (a,a)
2412
2413 f :: T Int -> Int
2414 f (x,y)= x
2415 </programlisting>
2416 As you can see, using operators in this way is not very useful, and Haskell 98 does not even
2417 allow you to write them infix.
2418 </para>
2419 <para>
2420 The language extension <option>-XTypeOperators</option> changes this behaviour:
2421 <itemizedlist>
2422 <listitem><para>
2423 Operator symbols become type <emphasis>constructors</emphasis> rather than
2424 type <emphasis>variables</emphasis>.
2425 </para></listitem>
2426 <listitem><para>
2427 Operator symbols in types can be written infix, both in definitions and uses.
2428 For example:
2429 <programlisting>
2430 data a + b = Plus a b
2431 type Foo = Int + Bool
2432 </programlisting>
2433 </para></listitem>
2434 <listitem><para>
2435 There is now some potential ambiguity in import and export lists; for example
2436 if you write <literal>import M( (+) )</literal> do you mean the
2437 <emphasis>function</emphasis> <literal>(+)</literal> or the
2438 <emphasis>type constructor</emphasis> <literal>(+)</literal>?
2439 The default is the former, but with <option>-XExplicitNamespaces</option> (which is implied
2440 by <option>-XTypeOperators</option>) GHC allows you to specify the latter
2441 by preceding it with the keyword <literal>type</literal>, thus:
2442 <programlisting>
2443 import M( type (+) )
2444 </programlisting>
2445 See <xref linkend="explicit-namespaces"/>.
2446 </para></listitem>
2447 <listitem><para>
2448 The fixity of a type operator may be set using the usual fixity declarations
2449 but, as in <xref linkend="infix-tycons"/>, the function and type constructor share
2450 a single fixity.
2451 </para></listitem>
2452 </itemizedlist>
2453 </para>
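<para>The declarations above can be put into a complete module as follows. The function
<literal>swapPlus</literal> is a hypothetical addition, and the parentheses in its
signature are there only for readability.</para>
<programlisting>
{-# LANGUAGE TypeOperators #-}
module Main where

-- With TypeOperators, (+) in a type is a type constructor.
data a + b = Plus a b deriving Show

type Foo = Int + Bool

swapPlus :: (a + b) -> (b + a)
swapPlus (Plus x y) = Plus y x

main :: IO ()
main = print (swapPlus (Plus 1 True :: Foo))   -- Plus True 1
</programlisting>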
2454 </sect2>
2455
2456 <sect2 id="type-synonyms">
2457 <title>Liberalised type synonyms</title>
2458
2459 <para>
2460 Type synonyms are like macros at the type level, but Haskell 98 imposes many rules
2461 on individual synonym declarations.
2462 With the <option>-XLiberalTypeSynonyms</option> extension,
2463 GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>.
2464 That means that GHC can be very much more liberal about type synonyms than Haskell 98.
2465
2466 <itemizedlist>
2467 <listitem> <para>You can write a <literal>forall</literal> (including overloading)
2468 in a type synonym, thus:
2469 <programlisting>
2470 type Discard a = forall b. Show b => a -> b -> (a, String)
2471
2472 f :: Discard a
2473 f x y = (x, show y)
2474
2475 g :: Discard Int -> (Int,String) -- A rank-2 type
2476 g f = f 3 True
2477 </programlisting>
2478 </para>
2479 </listitem>
2480
2481 <listitem><para>
2482 If you also use <option>-XUnboxedTuples</option>,
2483 you can write an unboxed tuple in a type synonym:
2484 <programlisting>
2485 type Pr = (# Int, Int #)
2486
2487 h :: Int -> Pr
2488 h x = (# x, x #)
2489 </programlisting>
2490 </para></listitem>
2491
2492 <listitem><para>
2493 You can apply a type synonym to a forall type:
2494 <programlisting>
2495 type Foo a = a -> a -> Bool
2496
2497 f :: Foo (forall b. b->b)
2498 </programlisting>
2499 After expanding the synonym, <literal>f</literal> has the legal (in GHC) type:
2500 <programlisting>
2501 f :: (forall b. b->b) -> (forall b. b->b) -> Bool
2502 </programlisting>
2503 </para></listitem>
2504
2505 <listitem><para>
2506 You can apply a type synonym to a partially applied type synonym:
2507 <programlisting>
2508 type Generic i o = forall x. i x -> o x
2509 type Id x = x
2510
2511 foo :: Generic Id []
2512 </programlisting>
2513 After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
2514 <programlisting>
2515 foo :: forall x. x -> [x]
2516 </programlisting>
2517 </para></listitem>
2518
2519 </itemizedlist>
2520 </para>
2521
2522 <para>
GHC currently does kind checking before expanding synonyms (though even that
could be changed).
2525 </para>
2526 <para>
2527 After expanding type synonyms, GHC does validity checking on types, looking for
2528 the following mal-formedness which isn't detected simply by kind checking:
2529 <itemizedlist>
2530 <listitem><para>
2531 Type constructor applied to a type involving for-alls.
2532 </para></listitem>
2533 <listitem><para>
2534 Unboxed tuple on left of an arrow.
2535 </para></listitem>
2536 <listitem><para>
2537 Partially-applied type synonym.
2538 </para></listitem>
2539 </itemizedlist>
2540 So, for example,
2541 this will be rejected:
2542 <programlisting>
2543 type Pr = (# Int, Int #)
2544
2545 h :: Pr -> Int
2546 h x = ...
2547 </programlisting>
2548 because GHC does not allow unboxed tuples on the left of a function arrow.
2549 </para>
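<para>
The accepted cases above can be collected into one small module. This is only a
sketch: the names <literal>keep</literal>, <literal>apply</literal> and
<literal>singleton</literal> are illustrative and do not come from the text, and
<option>-XRankNTypes</option> is assumed to be on so that <literal>forall</literal>
is accepted in the synonyms.
<programlisting>
{-# LANGUAGE RankNTypes, LiberalTypeSynonyms #-}
module Main where

-- A synonym whose right-hand side contains a forall and a class context.
type Discard a = forall b. Show b => a -> b -> (a, String)

keep :: Discard a
keep x y = (x, show y)

-- After expansion the argument is polymorphic, so this is a rank-2 type.
apply :: Discard Int -> (Int, String)
apply f = f 3 True

-- A synonym applied to a partially applied synonym.
type Generic i o = forall x. i x -> o x
type Id x = x

singleton :: Generic Id []      -- expands to: forall x. x -> [x]
singleton x = [x]

main :: IO ()
main = do
  print (apply keep)            -- prints (3,"True")
  print (singleton (7 :: Int))  -- prints [7]
</programlisting>
Each signature is checked for validity only after the synonyms have been expanded,
which is why the partial application of <literal>Id</literal> is accepted here.
</para>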
2550 </sect2>
2551
2552
2553 <sect2 id="existential-quantification">
2554 <title>Existentially quantified data constructors
2555 </title>
2556
2557 <para>
2558 The idea of using existential quantification in data type declarations
2559 was suggested by Perry, and implemented in Hope+ (Nigel Perry, <emphasis>The Implementation
2560 of Practical Functional Programming Languages</emphasis>, PhD Thesis, University of
2561 London, 1991). It was later formalised by Laufer and Odersky
2562 (<emphasis>Polymorphic type inference and abstract data types</emphasis>,
2563 TOPLAS, 16(5), pp1411-1430, 1994).
2564 It's been in Lennart
2565 Augustsson's <command>hbc</command> Haskell compiler for several years, and
2566 proved very useful. Here's the idea. Consider the declaration:
2567 </para>
2568
2569 <para>
2570
2571 <programlisting>
data Foo = forall a. MkFoo a (a -> Bool)
         | Nil
2574 </programlisting>
2575
2576 </para>
2577
2578 <para>
2579 The data type <literal>Foo</literal> has two constructors with types:
2580 </para>
2581
2582 <para>
2583
2584 <programlisting>
2585 MkFoo :: forall a. a -> (a -> Bool) -> Foo
2586 Nil :: Foo
2587 </programlisting>
2588
2589 </para>
2590
2591 <para>
2592 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
2593 does not appear in the data type itself, which is plain <literal>Foo</literal>.
2594 For example, the following expression is fine:
2595 </para>
2596
2597 <para>
2598
2599 <programlisting>
2600 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
2601 </programlisting>
2602
2603 </para>
2604
2605 <para>
2606 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
2607 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
2608 isUpper</function> packages a character with a compatible function. These
2609 two things are each of type <literal>Foo</literal> and can be put in a list.
2610 </para>
2611
2612 <para>
What can we do with a value of type <literal>Foo</literal>? In particular,
2614 what happens when we pattern-match on <function>MkFoo</function>?
2615 </para>
2616
2617 <para>
2618
2619 <programlisting>
2620 f (MkFoo val fn) = ???
2621 </programlisting>
2622
2623 </para>
2624
2625 <para>
2626 Since all we know about <literal>val</literal> and <function>fn</function> is that they
2627 are compatible, the only (useful) thing we can do with them is to
2628 apply <function>fn</function> to <literal>val</literal> to get a boolean. For example:
2629 </para>
2630
2631 <para>
2632
2633 <programlisting>
2634 f :: Foo -> Bool
2635 f (MkFoo val fn) = fn val
2636 </programlisting>
2637
2638 </para>
2639
2640 <para>
2641 What this allows us to do is to package heterogeneous values
2642 together with a bunch of functions that manipulate them, and then treat
2643 that collection of packages in a uniform manner. You can express
2644 quite a bit of object-oriented-like programming this way.
2645 </para>
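<para>
As a minimal sketch of that uniform treatment, the module below puts values of
different underlying types into one list and consumes them with a single function.
The name <literal>test</literal>, and the decision to map <literal>Nil</literal>
to <literal>False</literal>, are assumptions made for the example.
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

-- The only useful thing to do with the packaged pair is to apply fn to val.
test :: Foo -> Bool
test (MkFoo val fn) = fn val
test Nil            = False    -- arbitrary choice for this sketch

main :: IO ()
main = print (map test [MkFoo (4 :: Int) even, MkFoo 'C' isUpper, Nil])
-- prints [True,True,False]
</programlisting>
</para>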
2646
2647 <sect3 id="existential">
2648 <title>Why existential?
2649 </title>
2650
2651 <para>
2652 What has this to do with <emphasis>existential</emphasis> quantification?
2653 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
2654 </para>
2655
2656 <para>
2657
2658 <programlisting>
2659 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
2660 </programlisting>
2661
2662 </para>
2663
2664 <para>
2665 But Haskell programmers can safely think of the ordinary
2666 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
2667 adding a new existential quantification construct.
2668 </para>
2669
2670 </sect3>
2671
2672 <sect3 id="existential-with-context">
2673 <title>Existentials and type classes</title>
2674
2675 <para>
2676 An easy extension is to allow
2677 arbitrary contexts before the constructor. For example:
2678 </para>
2679
2680 <para>
2681
2682 <programlisting>
data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)
2685 </programlisting>
2686
2687 </para>
2688
2689 <para>
2690 The two constructors have the types you'd expect:
2691 </para>
2692
2693 <para>
2694
2695 <programlisting>
2696 Baz1 :: forall a. Eq a => a -> a -> Baz
2697 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
2698 </programlisting>
2699
2700 </para>
2701
2702 <para>
2703 But when pattern matching on <function>Baz1</function> the matched values can be compared
2704 for equality, and when pattern matching on <function>Baz2</function> the first matched
2705 value can be converted to a string (as well as applying the function to it).
2706 So this program is legal:
2707 </para>
2708
2709 <para>
2710
2711 <programlisting>
f :: Baz -> String
f (Baz1 p q) | p == q    = "Yes"
             | otherwise = "No"
f (Baz2 v fn)            = show (fn v)
2716 </programlisting>
2717
2718 </para>
2719
2720 <para>
2721 Operationally, in a dictionary-passing implementation, the
2722 constructors <function>Baz1</function> and <function>Baz2</function> must store the
2723 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
extract them on pattern matching.
2725 </para>
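<para>
The following self-contained sketch runs the <literal>Baz</literal> example end to
end. The function is called <literal>describe</literal> here purely for
illustration. Note that its signature has no class constraint, because the
dictionaries come from the constructors themselves.
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a   => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

-- No Eq or Show constraint appears in this signature: matching on
-- Baz1 or Baz2 brings the stored dictionary into scope.
describe :: Baz -> String
describe (Baz1 p q) | p == q    = "Yes"
                    | otherwise = "No"
describe (Baz2 v fn)            = show (fn v)

main :: IO ()
main = mapM_ (putStrLn . describe)
  [ Baz1 'x' 'x'             -- prints Yes
  , Baz1 (1 :: Int) 2        -- prints No
  , Baz2 (10 :: Int) (+ 1)   -- prints 11
  ]
</programlisting>
</para>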
2726
2727 </sect3>
2728
2729 <sect3 id="existential-records">
2730 <title>Record Constructors</title>
2731
2732 <para>
GHC allows existentials to be used with record syntax as well. For example:
2734
2735 <programlisting>
data Counter a = forall self. NewCounter
    { _this    :: self
    , _inc     :: self -> self
    , _display :: self -> IO ()
    , tag      :: a
    }
2742 </programlisting>
2743 Here <literal>tag</literal> is a public field, with a well-typed selector
2744 function <literal>tag :: Counter a -> a</literal>. The <literal>self</literal>
2745 type is hidden from the outside; any attempt to apply <literal>_this</literal>,
2746 <literal>_inc</literal> or <literal>_display</literal> as functions will raise a
2747 compile-time error. In other words, <emphasis>GHC defines a record selector function
2748 only for fields whose type does not mention the existentially-quantified variables</emphasis>.
2749 (This example used an underscore in the fields for which record selectors
2750 will not be defined, but that is only programming style; GHC ignores them.)
2751 </para>
2752
2753 <para>
2754 To make use of these hidden fields, we need to create some helper functions:
2755
2756 <programlisting>
inc :: Counter a -> Counter a
inc (NewCounter x i d t) = NewCounter
    { _this = i x, _inc = i, _display = d, tag = t }

display :: Counter a -> IO ()
display NewCounter{ _this = x, _display = d } = d x
2763 </programlisting>
2764
2765 Now we can define counters with different underlying implementations:
2766
2767 <programlisting>
counterA :: Counter String
counterA = NewCounter
    { _this = 0, _inc = (1+), _display = print, tag = "A" }

counterB :: Counter String
counterB = NewCounter
    { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }

main = do
    display (inc counterA)         -- prints "1"
    display (inc (inc counterB))   -- prints "##"
2779 </programlisting>
2780
2781 Record update syntax is supported for existentials (and GADTs):
2782 <programlisting>
2783 setTag :: Counter a -> a -> Counter a
2784 setTag obj t = obj{ tag = t }
2785 </programlisting>
2786 The rule for record update is this: <emphasis>
2787 the types of the updated fields may
2788 mention only the universally-quantified type variables
2789 of the data constructor. For GADTs, the field may mention only types
2790 that appear as a simple type-variable argument in the constructor's result
2791 type</emphasis>. For example:
2792 <programlisting>
data T a b where { T1 :: { f1::a, f2::b, f3::(b,c) } -> T a b } -- c is existential
upd1 t x = t { f1=x }   -- OK:   upd1 :: T a b -> a' -> T a' b
upd2 t x = t { f3=x }   -- BAD   (f3's type mentions c, which is
                        --        existentially quantified)

data G a b where { G1 :: { g1::a, g2::c } -> G a [c] }
upd3 g x = g { g1=x }   -- OK:   upd3 :: G a b -> c -> G c b
upd4 g x = g { g2=x }   -- BAD   (g2's type mentions c, which is not a simple
                        --        type-variable argument in G1's result type)
2802 </programlisting>
2803 </para>
2804
2805 </sect3>
2806
2807
2808 <sect3>
2809 <title>Restrictions</title>
2810
2811 <para>
2812 There are several restrictions on the ways in which existentially-quantified
constructors can be used.
2814 </para>
2815
2816 <para>
2817
2818 <itemizedlist>
2819 <listitem>
2820
2821 <para>
2822 When pattern matching, each pattern match introduces a new,
2823 distinct, type for each existential type variable. These types cannot
2824 be unified with any other type, nor can they escape from the scope of
2825 the pattern match. For example, these fragments are incorrect:
2826
2827
2828 <programlisting>
2829 f1 (MkFoo a f) = a
2830 </programlisting>
2831
2832
2833 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
2834 is the result of <function>f1</function>. One way to see why this is wrong is to
2835 ask what type <function>f1</function> has:
2836
2837
2838 <programlisting>
2839 f1 :: Foo -> a -- Weird!
2840 </programlisting>
2841
2842
2843 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
2844 this:
2845
2846
2847 <programlisting>
2848 f1 :: forall a. Foo -> a -- Wrong!
2849 </programlisting>
2850
2851
The original program is just plain wrong. Here's another sort of error:
2853
2854
2855 <programlisting>
2856 f2 (Baz1 a b) (Baz1 p q) = a==q
2857 </programlisting>
2858
2859
2860 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
2861 <literal>a==q</literal> is wrong because it equates the two distinct types arising
2862 from the two <function>Baz1</function> constructors.
2863
2864
2865 </para>
2866 </listitem>
2867 <listitem>
2868
2869 <para>
2870 You can't pattern-match on an existentially quantified
2871 constructor in a <literal>let</literal> or <literal>where</literal> group of
2872 bindings. So this is illegal:
2873
2874
2875 <programlisting>
2876 f3 x = a==b where { Baz1 a b = x }
2877 </programlisting>
2878
2879 Instead, use a <literal>case</literal> expression:
2880
2881 <programlisting>
2882 f3 x = case x of Baz1 a b -> a==b
2883 </programlisting>
2884
2885 In general, you can only pattern-match
2886 on an existentially-quantified constructor in a <literal>case</literal> expression or
2887 in the patterns of a function definition.
2888
2889 The reason for this restriction is really an implementation one.
2890 Type-checking binding groups is already a nightmare without
2891 existentials complicating the picture. Also an existential pattern
2892 binding at the top level of a module doesn't make sense, because it's
2893 not clear how to prevent the existentially-quantified type "escaping".
2894 So for now, there's a simple-to-state restriction. We'll see how
2895 annoying it is.
2896
2897 </para>
2898 </listitem>
2899 <listitem>
2900
2901 <para>
2902 You can't use existential quantification for <literal>newtype</literal>
2903 declarations. So this is illegal:
2904
2905
2906 <programlisting>
2907 newtype T = forall a. Ord a => MkT a
2908 </programlisting>
2909
2910
2911 Reason: a value of type <literal>T</literal> must be represented as a
2912 pair of a dictionary for <literal>Ord t</literal> and a value of type
2913 <literal>t</literal>. That contradicts the idea that
2914 <literal>newtype</literal> should have no concrete representation.
2915 You can get just the same efficiency and effect by using
2916 <literal>data</literal> instead of <literal>newtype</literal>. If
2917 there is no overloading involved, then there is more of a case for
2918 allowing an existentially-quantified <literal>newtype</literal>,
2919 because the <literal>data</literal> version does carry an
2920 implementation cost, but single-field existentially quantified
2921 constructors aren't much use. So the simple restriction (no
2922 existential stuff on <literal>newtype</literal>) stands, unless there
2923 are convincing reasons to change it.
2924
2925
2926 </para>
2927 </listitem>
2928 <listitem>
2929
2930 <para>
2931 You can't use <literal>deriving</literal> to define instances of a
2932 data type with existentially quantified data constructors.
2933
Reason: in most cases it would not make sense. For example:
2935
2936 <programlisting>
2937 data T = forall a. MkT [a] deriving( Eq )
2938 </programlisting>
2939
2940 To derive <literal>Eq</literal> in the standard way we would need to have equality
2941 between the single component of two <function>MkT</function> constructors:
2942
2943 <programlisting>
instance Eq T where
  (MkT a) == (MkT b) = ???
2946 </programlisting>
2947
2948 But <varname>a</varname> and <varname>b</varname> have distinct types, and so can't be compared.
2949 It's just about possible to imagine examples in which the derived instance
2950 would make sense, but it seems altogether simpler simply to prohibit such
2951 declarations. Define your own instances!
2952 </para>
2953 </listitem>
2954
2955 </itemizedlist>
2956
2957 </para>
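<para>
Two of these restrictions have straightforward workarounds, sketched below:
writing an instance by hand instead of using <literal>deriving</literal>, and
unpacking with <literal>case</literal> instead of a <literal>let</literal> or
<literal>where</literal> binding. The type <literal>Showable</literal> and the
function <literal>render</literal> are hypothetical names chosen for the example.
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Showable = forall a. Show a => MkShowable a

-- Written by hand, since a deriving clause is not available here.
-- The Show dictionary stored in the constructor makes this possible.
instance Show Showable where
  showsPrec d (MkShowable x) =
    showParen (d > 10) (showString "MkShowable " . showsPrec 11 x)

-- Unpack with case; a where-bound pattern (MkShowable x = s) is rejected.
render :: Showable -> String
render s = case s of
  MkShowable x -> show x

main :: IO ()
main = do
  print [MkShowable (1 :: Int), MkShowable "two"]
  -- prints [MkShowable 1,MkShowable "two"]
  putStrLn (render (MkShowable 'c'))
  -- prints 'c'
</programlisting>
</para>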
2958
2959 </sect3>
2960 </sect2>
2961
2962 <!-- ====================== Generalised algebraic data types ======================= -->
2963
2964 <sect2 id="gadt-style">
2965 <title>Declaring data types with explicit constructor signatures</title>
2966
2967 <para>When the <literal>GADTSyntax</literal> extension is enabled,
2968 GHC allows you to declare an algebraic data type by
2969 giving the type signatures of constructors explicitly. For example:
2970 <programlisting>
data Maybe a where
    Nothing :: Maybe a
    Just    :: a -> Maybe a
2974 </programlisting>
2975 The form is called a "GADT-style declaration"
2976 because Generalised Algebraic Data Types, described in <xref linkend="gadt"/>,
2977 can only be declared using this form.</para>
2978 <para>Notice that GADT-style syntax generalises existential types (<xref linkend="existential-quantification"/>).
2979 For example, these two declarations are equivalent:
2980 <programlisting>
data Foo = forall a. MkFoo a (a -> Bool)
data Foo' where { MkFoo' :: a -> (a->Bool) -> Foo' }
2983 </programlisting>
2984 </para>
2985 <para>Any data type that can be declared in standard Haskell-98 syntax
2986 can also be declared using GADT-style syntax.
2987 The choice is largely stylistic, but GADT-style declarations differ in one important respect:
2988 they treat class constraints on the data constructors differently.
2989 Specifically, if the constructor is given a type-class context, that
2990 context is made available by pattern matching. For example:
2991 <programlisting>
data Set a where
  MkSet :: Eq a => [a] -> Set a

makeSet :: Eq a => [a] -> Set a
makeSet xs = MkSet (nub xs)

insert :: a -> Set a -> Set a
insert a (MkSet as) | a `elem` as = MkSet as
                    | otherwise   = MkSet (a:as)
3001 </programlisting>
3002 A use of <literal>MkSet</literal> as a constructor (e.g. in the definition of <literal>makeSet</literal>)
3003 gives rise to a <literal>(Eq a)</literal>
3004 constraint, as you would expect. The new feature is that pattern-matching on <literal>MkSet</literal>
3005 (as in the definition of <literal>insert</literal>) makes <emphasis>available</emphasis> an <literal>(Eq a)</literal>
3006 context. In implementation terms, the <literal>MkSet</literal> constructor has a hidden field that stores
3007 the <literal>(Eq a)</literal> dictionary that is passed to <literal>MkSet</literal>; so
3008 when pattern-matching that dictionary becomes available for the right-hand side of the match.
3009 In the example, the equality dictionary is used to satisfy the equality constraint
3010 generated by the call to <literal>elem</literal>, so that the type of
3011 <literal>insert</literal> itself has no <literal>Eq</literal> constraint.
3012 </para>
3013 <para>
3014 For example, one possible application is to reify dictionaries:
3015 <programlisting>
data NumInst a where
  MkNumInst :: Num a => NumInst a

intInst :: NumInst Int
intInst = MkNumInst

plus :: NumInst a -> a -> a -> a
plus MkNumInst p q = p + q
3024 </programlisting>
3025 Here, a value of type <literal>NumInst a</literal> is equivalent
3026 to an explicit <literal>(Num a)</literal> dictionary.
3027 </para>
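<para>
A runnable sketch of this reified-dictionary idea follows. The helper
<literal>sumWith</literal> is not part of the text above; it is added here to show
that numeric literals as well as <literal>(+)</literal> are resolved through the
dictionary obtained from the pattern match.
<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

data NumInst a where
  MkNumInst :: Num a => NumInst a

plus :: NumInst a -> a -> a -> a
plus MkNumInst p q = p + q

-- Hypothetical helper: both (+) and the literal 0 use the Num dictionary
-- that matching on MkNumInst makes available.
sumWith :: NumInst a -> [a] -> a
sumWith MkNumInst xs = foldr (+) 0 xs

main :: IO ()
main = do
  print (plus (MkNumInst :: NumInst Int) 2 3)                -- prints 5
  print (sumWith (MkNumInst :: NumInst Double) [1.5, 2.5])   -- prints 4.0
</programlisting>
Neither function carries a <literal>Num</literal> constraint in its signature; the
caller supplies the evidence as an ordinary value.
</para>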
3028 <para>
3029 All this applies to constructors declared using the syntax of <xref linkend="existential-with-context"/>.
3030 For example, the <literal>NumInst</literal> data type above could equivalently be declared
3031 like this:
3032 <programlisting>
data NumInst a
   = Num a => MkNumInst
3035 </programlisting>
3036 Notice that, unlike the situation when declaring an existential, there is
3037 no <literal>forall</literal>, because the <literal>Num</literal> constrains the
3038 data type's universally quantified type variable <literal>a</literal>.
3039 A constructor may have both universal and existential type variables: for example,
3040 the following two declarations are equivalent:
3041 <programlisting>
data T1 a
  = forall b. (Num a, Eq b) => MkT1 a b
data T2 a where
  MkT2 :: (Num a, Eq b) => a -> b -> T2 a
3046 </programlisting>
3047 </para>
3048 <para>All this behaviour contrasts with Haskell 98's peculiar treatment of
3049 contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report).
3050 In Haskell 98 the definition
3051 <programlisting>
3052 data Eq a => Set' a = MkSet' [a]
3053 </programlisting>
3054 gives <literal>MkSet'</literal> the same type as <literal>MkSet</literal> above. But instead of
3055 <emphasis>making available</emphasis> an <literal>(Eq a)</literal> constraint, pattern-matching
3056 on <literal>MkSet'</literal> <emphasis>requires</emphasis> an <literal>(Eq a)</literal> constraint!
3057 GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,
3058 GHC's behaviour is much more useful, as well as much more intuitive.
3059 </para>
3060
3061 <para>
3062 The rest of this section gives further details about GADT-style data
3063 type declarations.
3064
3065 <itemizedlist>
3066 <listitem><para>
3067 The result type of each data constructor must begin with the type constructor being defined.
3068 If the result type of all constructors
3069 has the form <literal>T a1 ... an</literal>, where <literal>a1 ... an</literal>
3070 are distinct type variables, then the data type is <emphasis>ordinary</emphasis>;
otherwise it is a <emphasis>generalised</emphasis> data type (<xref linkend="gadt"/>).
3072 </para></listitem>
3073
3074 <listitem><para>
3075 As with other type signatures, you can give a single signature for several data constructors.
3076 In this example we give a single signature for <literal>T1</literal> and <literal>T2</literal>:
3077 <programlisting>
data T a where
  T1,T2 :: a -> T a
  T3 :: T a
3081 </programlisting>
3082 </para></listitem>
3083
3084 <listitem><para>
3085 The type signature of
3086 each constructor is independent, and is implicitly universally quantified as usual.
3087 In particular, the type variable(s) in the "<literal>data T a where</literal>" header
3088 have no scope, and different constructors may have different universally-quantified type variables:
3089 <programlisting>
data T a where        -- The 'a' has no scope
  T1,T2 :: b -> T b   -- Means forall b. b -> T b
  T3 :: T a           -- Means forall a. T a
3093 </programlisting>
3094 </para></listitem>
3095
3096 <listitem><para>
3097 A constructor signature may mention type class constraints, which can differ for
3098 different constructors. For example, this is fine:
3099 <programlisting>
data T a where
  T1 :: Eq b => b -> b -> T b
  T2 :: (Show c, Ix c) => c -> [c] -> T c
3103 </programlisting>
3104 When pattern matching, these constraints are made available to discharge constraints
3105 in the body of the match. For example:
3106 <programlisting>
f :: T a -> String
f (T1 x y) | x==y      = "yes"
           | otherwise = "no"
f (T2 a b)             = show a
3111 </programlisting>
3112 Note that <literal>f</literal> is not overloaded; the <literal>Eq</literal> constraint arising
3113 from the use of <literal>==</literal> is discharged by the pattern match on <literal>T1</literal>
3114 and similarly the <literal>Show</literal> constraint arising from the use of <literal>show</literal>.
3115 </para></listitem>
3116
3117 <listitem><para>
3118 Unlike a Haskell-98-style
3119 data type declaration, the type variable(s) in the "<literal>data Set a where</literal>" header
3120 have no scope. Indeed, one can write a kind signature instead:
3121 <programlisting>
3122 data Set :: * -> * where ...
3123 </programlisting>
3124 or even a mixture of the two:
3125 <programlisting>
3126 data Bar a :: (* -> *) -> * where ...
3127 </programlisting>
The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Bar</literal>
3129 like this:
3130 <programlisting>
3131 data Bar a (b :: * -> *) where ...
3132 </programlisting>
3133 </para></listitem>
3134
3135
3136 <listitem><para>
3137 You can use strictness annotations, in the obvious places
3138 in the constructor type:
3139 <programlisting>
data Term a where
    Lit    :: !Int -> Term Int
    If     :: Term Bool -> !(Term a) -> !(Term a) -> Term a
    Pair   :: Term a -> Term b -> Term (a,b)
3144 </programlisting>
3145 </para></listitem>
3146
3147 <listitem><para>
3148 You can use a <literal>deriving</literal> clause on a GADT-style data type
declaration. For example, these two declarations are equivalent:
<programlisting>
data Maybe1 a where {
    Nothing1 :: Maybe1 a ;
    Just1    :: a -> Maybe1 a
  } deriving( Eq, Ord )

data Maybe2 a = Nothing2 | Just2 a
     deriving( Eq, Ord )
3158 </programlisting>
3159 </para></listitem>
3160
3161 <listitem><para>
3162 The type signature may have quantified type variables that do not appear
3163 in the result type:
3164 <programlisting>
data Foo where
   MkFoo :: a -> (a->Bool) -> Foo
   Nil   :: Foo
3168 </programlisting>
3169 Here the type variable <literal>a</literal> does not appear in the result type
3170 of either constructor.
3171 Although it is universally quantified in the type of the constructor, such
3172 a type variable is often called "existential".
3173 Indeed, the above declaration declares precisely the same type as
3174 the <literal>data Foo</literal> in <xref linkend="existential-quantification"/>.
3175 </para><para>
3176 The type may contain a class context too, of course:
3177 <programlisting>
data Showable where
  MkShowable :: Show a => a -> Showable
3180 </programlisting>
3181 </para></listitem>
3182
3183 <listitem><para>
3184 You can use record syntax on a GADT-style data type declaration:
3185
3186 <programlisting>
data Person where
    Adult :: { name :: String, children :: [Person] } -> Person
    Child :: Show a => { name :: !String, funny :: a } -> Person
3190 </programlisting>
3191 As usual, for every constructor that has a field <literal>f</literal>, the type of
3192 field <literal>f</literal> must be the same (modulo alpha conversion).
3193 The <literal>Child</literal> constructor above shows that the signature
3194 may have a context, existentially-quantified variables, and strictness annotations,
3195 just as in the non-record case. (NB: the "type" that follows the double-colon
3196 is not really a type, because of the record syntax and strictness annotations.
3197 A "type" of this form can appear only in a constructor signature.)
3198 </para></listitem>
3199
3200 <listitem><para>
Record updates are allowed with GADT-style declarations,
but only for fields that have the following property: the type of the field
mentions no existential type variables.
3204 </para></listitem>
3205
3206 <listitem><para>
3207 As in the case of existentials declared using the Haskell-98-like record syntax
3208 (<xref linkend="existential-records"/>),
3209 record-selector functions are generated only for those fields that have well-typed
3210 selectors.
3211 Here is the example of that section, in GADT-style syntax:
3212 <programlisting>
data Counter a where
    NewCounter :: { _this    :: self
                  , _inc     :: self -> self
                  , _display :: self -> IO ()
                  , tag      :: a
                  } -> Counter a
3219 </programlisting>
3220 As before, only one selector function is generated here, that for <literal>tag</literal>.
3221 Nevertheless, you can still use all the field names in pattern matching and record construction.
3222 </para></listitem>
3223
3224 <listitem><para>
3225 In a GADT-style data type declaration there is no obvious way to specify that a data constructor
3226 should be infix, which makes a difference if you derive <literal>Show</literal> for the type.
3227 (Data constructors declared infix are displayed infix by the derived <literal>show</literal>.)
3228 So GHC implements the following design: a data constructor declared in a GADT-style data type
3229 declaration is displayed infix by <literal>Show</literal> iff (a) it is an operator symbol,
(b) it has two arguments, (c) it has a programmer-supplied fixity declaration. For example:
<programlisting>
   infix 6 :--:
   data T a where
     (:--:) :: Int -> Bool -> T Int
3235 </programlisting>
3236 </para></listitem>
3237 </itemizedlist></para>
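<para>
The infix-display rule in the last item can be checked with a short program. This
is a sketch under stated assumptions: the type name <literal>Pairing</literal> is
made up for the example, and the data type is kept ordinary (not a GADT) so that a
plain <literal>deriving</literal> clause can be used.
<programlisting>
{-# LANGUAGE GADTSyntax #-}
module Main where

-- The fixity declaration is what satisfies condition (c) of the rule.
infix 6 :--:

data Pairing a where
  (:--:) :: a -> Bool -> Pairing a
  deriving Show

main :: IO ()
main = print ((3 :: Int) :--: True)
-- prints 3 :--: True
</programlisting>
Without the <literal>infix</literal> declaration the same value would be displayed
in prefix form, with the operator in parentheses.
</para>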
3238 </sect2>
3239
3240 <sect2 id="gadt">
3241 <title>Generalised Algebraic Data Types (GADTs)</title>
3242
3243 <para>Generalised Algebraic Data Types generalise ordinary algebraic data types
3244 by allowing constructors to have richer return types. Here is an example:
3245 <programlisting>
data Term a where
    Lit    :: Int -> Term Int
    Succ   :: Term Int -> Term Int
    IsZero :: Term Int -> Term Bool
    If     :: Term Bool -> Term a -> Term a -> Term a
    Pair   :: Term a -> Term b -> Term (a,b)
3252 </programlisting>
3253 Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the
3254 case with ordinary data types. This generality allows us to
3255 write a well-typed <literal>eval</literal> function
3256 for these <literal>Terms</literal>:
3257 <programlisting>
3258 eval :: Term a -> a
3259 eval (Lit i) = i
3260 eval (Succ t) = 1 + eval t
3261 eval (IsZero t) = eval t == 0
3262 eval (If b e1 e2) = if eval b then eval e1 else eval e2
3263 eval (Pair e1 e2) = (eval e1, eval e2)
3264 </programlisting>
3265 The key point about GADTs is that <emphasis>pattern matching causes type refinement</emphasis>.
3266 For example, in the right hand side of the equation
3267 <programlisting>
3268 eval :: Term a -> a
3269 eval (Lit i) = ...
3270 </programlisting>
3271 the type <literal>a</literal> is refined to <literal>Int</literal>. That's the whole point!
3272 A precise specification of the type rules is beyond what this user manual aspires to,
3273 but the design closely follows that described in
3274 the paper <ulink
3275 url="http://research.microsoft.com/%7Esimonpj/papers/gadt/">Simple
3276 unification-based type inference for GADTs</ulink>,
3277 (ICFP 2006).
3278 The general principle is this: <emphasis>type refinement is only carried out
3279 based on user-supplied type annotations</emphasis>.
3280 So if no type signature is supplied for <literal>eval</literal>, no type refinement happens,
3281 and lots of obscure error messages will
3282 occur. However, the refinement is quite general. For example, if we had:
3283 <programlisting>
3284 eval :: Term a -> a -> a
3285 eval (Lit i) j = i+j
3286 </programlisting>
3287 the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type
3288 of the constructor <literal>Lit</literal>), and that refinement also applies to the type of <literal>j</literal>, and
3289 the result type of the <literal>case</literal> expression. Hence the addition <literal>i+j</literal> is legal.
3290 </para>
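<para>
Putting the declaration and the evaluator together gives the following complete
module, offered as a sketch. The expression evaluated in <literal>main</literal>
is made up for the example. Each equation of <literal>eval</literal> type-checks
only because matching on the constructor refines <literal>a</literal>.
<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

data Term a where
  Lit    :: Int -> Term Int
  Succ   :: Term Int -> Term Int
  IsZero :: Term Int -> Term Bool
  If     :: Term Bool -> Term a -> Term a -> Term a
  Pair   :: Term a -> Term b -> Term (a,b)

-- The signature is required: refinement relies on user-supplied types.
eval :: Term a -> a
eval (Lit i)      = i                    -- here a is refined to Int
eval (Succ t)     = 1 + eval t
eval (IsZero t)   = eval t == 0          -- here a is refined to Bool
eval (If b e1 e2) = if eval b then eval e1 else eval e2
eval (Pair e1 e2) = (eval e1, eval e2)   -- here a is refined to a pair type

main :: IO ()
main = print (eval (If (IsZero (Lit 0))
                       (Pair (Lit 1) (Succ (Lit 1)))
                       (Pair (Lit 0) (Lit 0))))
-- prints (1,2)
</programlisting>
</para>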
3291 <para>
3292 These and many other examples are given in papers by Hongwei Xi, and
3293 Tim Sheard. There is a longer introduction
3294 <ulink url="http://www.haskell.org/haskellwiki/GADT">on the wiki</ulink>,
3295 and Ralf Hinze's
3296 <ulink url="http://www.informatik.uni-bonn.de/~ralf/publications/With.pdf">Fun with phantom types</ulink> also has a number of examples. Note that papers
3297 may use different notation to that implemented in GHC.
3298 </para>
3299 <para>
3300 The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with
3301 <option>-XGADTs</option>. The <option>-XGADTs</option> flag also sets <option>-XRelaxedPolyRec</option>.
3302 <itemizedlist>
3303 <listitem><para>
3304 A GADT can only be declared using GADT-style syntax (<xref linkend="gadt-style"/>);
3305 the old Haskell-98 syntax for data declarations always declares an ordinary data type.
3306 The result type of each constructor must begin with the type constructor being defined,
3307 but for a GADT the arguments to the type constructor can be arbitrary monotypes.
3308 For example, in the <literal>Term</literal> data
3309 type above, the type of each constructor must end with <literal>Term ty</literal>, but
3310 the <literal>ty</literal> need not be a type variable (e.g. the <literal>Lit</literal>
3311 constructor).
3312 </para></listitem>
3313
3314 <listitem><para>
3315 It is permitted to declare an ordinary algebraic data type using GADT-style syntax.
3316 What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors
3317 whose result type is not just <literal>T a b</literal>.
3318 </para></listitem>
3319
3320 <listitem><para>
3321 You cannot use a <literal>deriving</literal> clause for a GADT; only for
3322 an ordinary data type.
3323 </para></listitem>
3324
3325 <listitem><para>
3326 As mentioned in <xref linkend="gadt-style"/>, record syntax is supported.
3327 For example:
3328 <programlisting>
data Term a where
    Lit    :: { val  :: Int }      -> Term Int
    Succ   :: { num  :: Term Int } -> Term Int
    Pred   :: { num  :: Term Int } -> Term Int
    IsZero :: { arg  :: Term Int } -> Term Bool
    Pair   :: { arg1 :: Term a
              , arg2 :: Term b
              }                    -> Term (a,b)
    If     :: { cnd  :: Term Bool
              , tru  :: Term a
              , fls  :: Term a
              }                    -> Term a
3341 </programlisting>
However, for GADTs there is the following additional constraint:
every constructor that has a field <literal>f</literal> must have
the same result type (modulo alpha conversion).
3345 Hence, in the above example, we cannot merge the <literal>num</literal>
3346 and <literal>arg</literal> fields above into a
3347 single name. Although their field types are both <literal>Term Int</literal>,
3348 their selector functions actually have different types:
3349
3350 <programlisting>
3351 num :: Term Int -> Term Int
3352 arg :: Term Bool -> Term Int
3353 </programlisting>
3354 </para></listitem>
3355
3356 <listitem><para>
3357 When pattern-matching against data constructors drawn from a GADT,
3358 for example in a <literal>case</literal> expression, the following rules apply:
3359 <itemizedlist>
3360 <listitem><para>The type of the scrutinee must be rigid.</para></listitem>
3361 <listitem><para>The type of the entire <literal>case</literal> expression must be rigid.</para></listitem>
3362 <listitem><para>The type of any free variable mentioned in any of
3363 the <literal>case</literal> alternatives must be rigid.</para></listitem>
3364 </itemizedlist>
A type is "rigid" if it is completely known to the compiler at its binding site. The easiest
way to ensure that a variable has a rigid type is to give it a type signature.
3367 For more precise details see <ulink url="http://research.microsoft.com/%7Esimonpj/papers/gadt">
3368 Simple unification-based type inference for GADTs
3369 </ulink>. The criteria implemented by GHC are given in the Appendix.
3370
3371 </para></listitem>
3372
3373 </itemizedlist>
3374 </para>
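<para>
The matching rules above can be illustrated with a minimal sketch of an evaluator
for a cut-down <literal>Term</literal> type. The selection of constructors and the
function name <literal>eval</literal> are assumptions made for this example only.
The type signature on <literal>eval</literal> is what gives both the scrutinee and
the result a rigid type:
<programlisting>
{-# LANGUAGE GADTs #-}

-- A cut-down Term type, declared with GADT-style syntax.
data Term a where
  Lit    :: Int -> Term Int
  Succ   :: Term Int -> Term Int
  IsZero :: Term Int -> Term Bool
  If     :: Term Bool -> Term a -> Term a -> Term a
  Pair   :: Term a -> Term b -> Term (a, b)

-- The signature makes the scrutinee type (Term a) and the result type (a) rigid.
eval :: Term a -> a
eval (Lit i)    = i
eval (Succ t)   = 1 + eval t
eval (IsZero t) = eval t == 0
eval (If c t e) = if eval c then eval t else eval e
eval (Pair x y) = (eval x, eval y)
</programlisting>
If the signature is removed, the types are no longer rigid and the definition would
typically be rejected.
</para>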
3375
3376 </sect2>
3377 </sect1>
3378
3379 <!-- ====================== End of Generalised algebraic data types ======================= -->
3380
3381 <sect1 id="deriving">
3382 <title>Extensions to the "deriving" mechanism</title>
3383
3384 <sect2 id="deriving-inferred">
3385 <title>Inferred context for deriving clauses</title>
3386
3387 <para>
3388 The Haskell Report is vague about exactly when a <literal>deriving</literal> clause is
3389 legal. For example:
3390 <programlisting>
3391 data T0 f a = MkT0 a deriving( Eq )
3392 data T1 f a = MkT1 (f a) deriving( Eq )
3393 data T2 f a = MkT2 (f (f a)) deriving( Eq )
3394 </programlisting>
3395 The natural generated <literal>Eq</literal> code would result in these instance declarations:
3396 <programlisting>
3397 instance Eq a => Eq (T0 f a) where ...
3398 instance Eq (f a) => Eq (T1 f a) where ...
3399 instance Eq (f (f a)) => Eq (T2 f a) where ...
3400 </programlisting>
3401 The first of these is obviously fine. The second is still fine, although less obviously.
3402 The third is not Haskell 98, and risks losing termination of instances.
3403 </para>
3404 <para>
3405 GHC takes a conservative position: it accepts the first two, but not the third. The rule is this:
3406 each constraint in the inferred instance context must consist only of type variables,
3407 with no repetitions.
3408 </para>
3409 <para>
3410 This rule is applied regardless of flags. If you want a more exotic context, you can write
3411 it yourself, using the <link linkend="stand-alone-deriving">standalone deriving mechanism</link>.
3412 </para>
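<para>
As a sketch of the escape route just mentioned, the following module derives
<literal>Eq</literal> for <literal>T1</literal> with an ordinary clause and supplies
the context for <literal>T2</literal> by hand. The set of flags shown is an assumption
about what such a hand-written context needs; other GHC versions may require a
different set.
<programlisting>
{-# LANGUAGE StandaloneDeriving, FlexibleContexts, UndecidableInstances #-}

-- Accepted: the inferred context is Eq (f a).
data T1 f a = MkT1 (f a) deriving (Eq)

-- A deriving clause here would need the context Eq (f (f a)),
-- so the context is written explicitly in a standalone declaration.
data T2 f a = MkT2 (f (f a))

deriving instance Eq (f (f a)) => Eq (T2 f a)
</programlisting>
</para>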
3413 </sect2>
3414
3415 <sect2 id="stand-alone-deriving">
3416 <title>Stand-alone deriving declarations</title>
3417
3418 <para>
3419 GHC now allows stand-alone <literal>deriving</literal> declarations, enabled by <literal>-XStandaloneDeriving</literal>:
3420 <programlisting>
3421 data Foo a = Bar a | Baz String
3422
3423 deriving instance Eq a => Eq (Foo a)
3424 </programlisting>
3425 The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword
3426 <literal>deriving</literal>, and (b) the absence of the <literal>where</literal> part.
3427 Note the following points:
3428 <itemizedlist>
3429 <listitem><para>
3430 You must supply an explicit context (in the example the context is <literal>(Eq a)</literal>),
3431 exactly as you would in an ordinary instance declaration.
3432 (In contrast, in a <literal>deriving</literal> clause
3433 attached to a data type declaration, the context is inferred.)
3434 </para></listitem>
3435
3436 <listitem><para>
3437 A <literal>deriving instance</literal> declaration
3438 must obey the same rules concerning form and termination as ordinary instance declarations,
3439 controlled by the same flags; see <xref linkend="instance-decls"/>.
3440 </para></listitem>
3441
3442 <listitem><para>
3443 Unlike a <literal>deriving</literal>
3444 declaration attached to a <literal>data</literal> declaration, the instance can be more specific
3445 than the data type (assuming you also use
3446 <literal>-XFlexibleInstances</literal>, <xref linkend="instance-rules"/>). Consider
3447 for example
3448 <programlisting>
3449 data Foo a = Bar a | Baz String
3450
3451 deriving instance Eq a => Eq (Foo [a])
3452 deriving instance Eq a => Eq (Foo (Maybe a))
3453 </programlisting>
3454 This will generate a derived instance for <literal>(Foo [a])</literal> and <literal>(Foo (Maybe a))</literal>,
3455 but other types such as <literal>(Foo (Int,Bool))</literal> will not be an instance of <literal>Eq</literal>.
3456 </para></listitem>
3457
3458 <listitem><para>
3459 Unlike a <literal>deriving</literal>
3460 declaration attached to a <literal>data</literal> declaration,
3461 GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate
3462 boilerplate code for the specified class, and typechecks it. If there is a type error, it is
3463 your problem. (GHC will show you the offending code if it has a type error.)
3464 The merit of this is that you can derive instances for GADTs and other exotic
3465 data types, providing only that the boilerplate code does indeed typecheck. For example:
3466 <programlisting>
3467 data T a where
3468 T1 :: T Int
3469 T2 :: T Bool
3470
3471 deriving instance Show (T a)
3472 </programlisting>
3473 In this example, you cannot say <literal>... deriving( Show )</literal> on the
3474 data type declaration for <literal>T</literal>,
3475 because <literal>T</literal> is a GADT, but you <emphasis>can</emphasis> generate
3476 the instance declaration using stand-alone deriving.
3477 </para>
3478 </listitem>
3479
3480 <listitem>
3481 <para>The stand-alone syntax is generalised for newtypes in exactly the same
3482 way that ordinary <literal>deriving</literal> clauses are generalised (<xref linkend="newtype-deriving"/>).
3483 For example:
3484 <programlisting>
3485 newtype Foo a = MkFoo (State Int a)
3486
3487 deriving instance MonadState Int Foo
3488 </programlisting>
3489 GHC always treats the <emphasis>last</emphasis> parameter of the instance
3490 (<literal>Foo</literal> in this example) as the type whose instance is being derived.
3491 </para></listitem>
3492 </itemizedlist></para>
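<para>
The points above can be combined into one small module. This is only a sketch that
reuses the types from the examples in this section; the comparisons mentioned
afterwards are illustrative.
<programlisting>
{-# LANGUAGE GADTs, StandaloneDeriving, FlexibleInstances #-}

data Foo a = Bar a | Baz String

-- More specific than the data type: only Foo [a] gets an Eq instance.
deriving instance Eq a => Eq (Foo [a])

data T a where
  T1 :: T Int
  T2 :: T Bool

-- The context is empty here, but it must still be written explicitly.
deriving instance Show (T a)
</programlisting>
With these declarations <literal>show T1</literal> produces the string
<literal>"T1"</literal>, and values of type <literal>Foo [Int]</literal> can be
compared with <literal>==</literal>, whereas <literal>Foo Int</literal> has no
<literal>Eq</literal> instance.
</para>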
3493
3494 </sect2>
3495
3496
3497 <sect2 id="deriving-typeable">
3498 <title>Deriving clause for extra classes (<literal>Typeable</literal>, <literal>Data</literal>, etc)</title>
3499
3500 <para>
3501 Haskell 98 allows the programmer to add "<literal>deriving( Eq, Ord )</literal>" to a data type
3502 declaration, to generate a standard instance declaration for classes specified in the <literal>deriving</literal> clause.
3503 In Haskell 98, the only classes that may appear in the <literal>deriving</literal> clause are the standard
3504 classes <literal>Eq</literal>, <literal>Ord</literal>,
3505 <literal>Enum</literal>, <literal>Ix</literal>, <literal>Bounded</literal>, <literal>Read</literal>, and <literal>Show</literal>.
3506 </para>
3507 <para>
3508 GHC extends this list with several more classes that may be automatically derived:
3509 <itemizedlist>
3510 <listitem><para> With <option>-XDeriveDataTypeable</option>, you can derive instances of the classes
<literal>Typeable</literal> and <literal>Data</literal>, defined in the library
3512 modules <literal>Data.Typeable</literal> and <literal>Data.Data</literal> respectively.
3513 </para>
3514 <para>Since GHC 7.8.1, <literal>Typeable</literal> is kind-polymorphic (see
3515 <xref linkend="kind-polymorphism"/>) and can be derived for any datatype and
3516 type class. Instances for datatypes can be derived by attaching a
3517 <literal>deriving Typeable</literal> clause to the datatype declaration, or by
3518 using standalone deriving (see <xref linkend="stand-alone-deriving"/>).
3519 Instances for type classes can only be derived using standalone deriving.
3520 For data families, <literal>Typeable</literal> should only be derived for the
3521 uninstantiated family type; each instance will then automatically have a
3522 <literal>Typeable</literal> instance too.
3523 See also <xref linkend="auto-derive-typeable"/>.
3524 </para>
3525 <para>
Also since GHC 7.8.1, handwritten (i.e. not derived) instances of
3527 <literal>Typeable</literal> are forbidden, and will result in an error.
3528 </para>
3529 </listitem>
3530
3531 <listitem><para> With <option>-XDeriveGeneric</option>, you can derive
3532 instances of the classes <literal>Generic</literal> and
3533 <literal>Generic1</literal>, defined in <literal>GHC.Generics</literal>.
3534 You can use these to define generic functions,
3535 as described in <xref linkend="generic-programming"/>.
3536 </para></listitem>
3537
3538 <listitem><para> With <option>-XDeriveFunctor</option>, you can derive instances of
3539 the class <literal>Functor</literal>,
3540 defined in <literal>GHC.Base</literal>.
3541 </para></listitem>
3542
3543 <listitem><para> With <option>-XDeriveFoldable</option>, you can derive instances of
3544 the class <literal>Foldable</literal>,
3545 defined in <literal>Data.Foldable</literal>.
3546 </para></listitem>
3547
3548 <listitem><para> With <option>-XDeriveTraversable</option>, you can derive instances of
3549 the class <literal>Traversable</literal>,
3550 defined in <literal>Data.Traversable</literal>.
3551 </para></listitem>
3552 </itemizedlist>
3553 In each case the appropriate class must be in scope before it
3554 can be mentioned in the <literal>deriving</literal> clause.
3555 </para>
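<para>
A minimal sketch of several of these flags used together is shown below. The
<literal>Tree</literal> type and the names <literal>sample</literal>,
<literal>scaled</literal> and <literal>checked</literal> are hypothetical, chosen
only for illustration.
<programlisting>
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable, DeriveGeneric #-}

import Data.Foldable (toList)
import Data.Traversable (traverse)
import GHC.Generics (Generic)

data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Functor, Foldable, Traversable, Generic)

sample :: Tree Int
sample = Node (Node Leaf 1 Leaf) 2 Leaf

-- fmap comes from the derived Functor instance, toList from Foldable.
scaled :: [Int]
scaled = toList (fmap (* 10) sample)

-- traverse comes from the derived Traversable instance.
checked :: Maybe (Tree Int)
checked = traverse (\x -> if x > 0 then Just x else Nothing) sample
</programlisting>
Note that each class is imported, or is already in scope from the Prelude, before
it is named in the <literal>deriving</literal> clause.
</para>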
3556 </sect2>
3557
3558 <sect2 id="auto-derive-typeable">
3559 <title>Automatically deriving <literal>Typeable</literal> instances</title>
3560
3561 <para>
3562 The flag <option>-XAutoDeriveTypeable</option> triggers the generation
3563 of derived <literal>Typeable</literal> instances for every datatype and type
class declaration in the module in which it is used. It will also generate
3565 <literal>Typeable</literal> instances for any promoted data constructors
3566 (<xref linkend="promotion"/>). This flag implies
3567 <option>-XDeriveDataTypeable</option> (<xref linkend="deriving-typeable"/>).
3568 </para>
3569
3570 </sect2>
3571
3572 <sect2 id="newtype-deriving">
3573 <title>Generalised derived instances for newtypes</title>
3574
3575 <para>
3576 When you define an abstract type using <literal>newtype</literal>, you may want
3577 the new type to inherit some instances from its representation. In
3578 Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>,
3579 <literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any
3580 other classes you have to write an explicit instance declaration. For
3581 example, if you define
3582
3583 <programlisting>
3584 newtype Dollars = Dollars Int
3585 </programlisting>
3586
3587 and you want to use arithmetic on <literal>Dollars</literal>, you have to
3588 explicitly define an instance of <literal>Num</literal>:
3589
3590 <programlisting>
3591 instance Num Dollars where
3592 Dollars a + Dollars b = Dollars (a+b)
3593 ...
3594 </programlisting>
3595 All the instance does is apply and remove the <literal>newtype</literal>
3596 constructor. It is particularly galling that, since the constructor
3597 doesn't appear at run-time, this instance declaration defines a
3598 dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal>
3599 dictionary, only slower!
3600 </para>
3601
3602
3603 <sect3 id="generalized-newtype-deriving"> <title> Generalising the deriving clause </title>
3604 <para>
3605 GHC now permits such instances to be derived instead,
3606 using the flag <option>-XGeneralizedNewtypeDeriving</option>,
3607 so one can write
3608 <programlisting>
3609 newtype Dollars = Dollars Int deriving (Eq,Show,Num)
3610 </programlisting>
3611
3612 and the implementation uses the <emphasis>same</emphasis> <literal>Num</literal> dictionary
3613 for <literal>Dollars</literal> as for <literal>Int</literal>. Notionally, the compiler
3614 derives an instance declaration of the form
3615
3616 <programlisting>
3617 instance Num Int => Num Dollars
3618 </programlisting>
3619
3620 which just adds or removes the <literal>newtype</literal> constructor according to the type.
3621 </para>
3622 <para>
3623
3624 We can also derive instances of constructor classes in a similar
3625 way. For example, suppose we have implemented state and failure monad
3626 transformers, such that
3627
3628 <programlisting>
3629 instance Monad m => Monad (State s m)
3630 instance Monad m => Monad (Failure m)
3631 </programlisting>
3632 In Haskell 98, we can define a parsing monad by
3633 <programlisting>
3634 type Parser tok m a = State [tok] (Failure m) a
3635 </programlisting>
3636
3637 which is automatically a monad thanks to the instance declarations
3638 above. With the extension, we can make the parser type abstract,
3639 without needing to write an instance of class <literal>Monad</literal>, via
3640
3641 <programlisting>
3642 newtype Parser tok m a = Parser (State [tok] (Failure m) a)
3643 deriving Monad
3644 </programlisting>
3645 In this case the derived instance declaration is of the form
3646 <programlisting>
3647 instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)
3648 </programlisting>
3649
3650 Notice that, since <literal>Monad</literal> is a constructor class, the
3651 instance is a <emphasis>partial application</emphasis> of the new type, not the
3652 entire left hand side. We can imagine that the type declaration is
3653 "eta-converted" to generate the context of the instance
3654 declaration.
3655 </para>
3656 <para>
3657
3658 We can even derive instances of multi-parameter classes, provided the
newtype is the last class parameter. In this case, a &ldquo;partial
application&rdquo; of the class appears in the <literal>deriving</literal>
3661 clause. For example, given the class
3662
3663 <programlisting>
3664 class StateMonad s m | m -> s where ...
3665 instance Monad m => StateMonad s (State s m) where ...
3666 </programlisting>
3667 then we can derive an instance of <literal>StateMonad</literal> for <literal>Parser</literal>s by
3668 <programlisting>
3669 newtype Parser tok m a = Parser (State [tok] (Failure m) a)
3670 deriving (Monad, StateMonad [tok])
3671 </programlisting>
3672
3673 The derived instance is obtained by completing the application of the
3674 class to the new type:
3675
3676 <programlisting>
3677 instance StateMonad [tok] (State [tok] (Failure m)) =>
3678 StateMonad [tok] (Parser tok m)
3679 </programlisting>
3680 </para>
3681 <para>
3682
3683 As a result of this extension, all derived instances in newtype
3684 declarations are treated uniformly (and implemented just by reusing
3685 the dictionary for the representation type), <emphasis>except</emphasis>
3686 <literal>Show</literal> and <literal>Read</literal>, which really behave differently for
3687 the newtype and its representation.
3688 </para>
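<para>
A small sketch of the <literal>Dollars</literal> example in use follows; the name
<literal>total</literal> is hypothetical. <literal>Num</literal> is reused from
<literal>Int</literal>, while <literal>Show</literal> is derived in the usual way
and therefore mentions the constructor.
<programlisting>
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

newtype Dollars = Dollars Int
  deriving (Eq, Ord, Show, Num)

-- The literal 5 is converted by the inherited fromInteger;
-- sum and (+) use the inherited Int arithmetic.
total :: Dollars
total = sum [Dollars 3, Dollars 4] + 5
</programlisting>
Here <literal>total</literal> equals <literal>Dollars 12</literal>, and
<literal>show total</literal> yields <literal>"Dollars 12"</literal> rather than
<literal>"12"</literal>.
</para>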
3689 </sect3>
3690
3691 <sect3> <title> A more precise specification </title>
3692 <para>
3693 Derived instance declarations are constructed as follows. Consider the
3694 declaration (after expansion of any type synonyms)
3695
3696 <programlisting>
3697 newtype T v1...vn = T' (t vk+1...vn) deriving (c1...cm)
3698 </programlisting>
3699
3700 where
3701 <itemizedlist>
3702 <listitem><para>
3703 The <literal>ci</literal> are partial applications of
3704 classes of the form <literal>C t1'...tj'</literal>, where the arity of <literal>C</literal>
3705 is exactly <literal>j+1</literal>. That is, <literal>C</literal> lacks exactly one type argument.
3706 </para></listitem>
3707 <listitem><para>
3708 The <literal>k</literal> is chosen so that <literal>ci (T v1...vk)</literal> is well-kinded.
3709 </para></listitem>
3710 <listitem><para>
3711 The type <literal>t</literal> is an arbitrary type.
3712 </para></listitem>
3713 <listitem><para>
3714 The type variables <literal>vk+1...vn</literal> do not occur in <literal>t</literal>,
nor in the <literal>ci</literal>.
3716 </para></listitem>
3717 <listitem><para>
3718 None of the <literal>ci</literal> is <literal>Read</literal>, <literal>Show</literal>,
3719 <literal>Typeable</literal>, or <literal>Data</literal>. These classes
3720 should not "look through" the type or its constructor. You can still
3721 derive these classes for a newtype, but it happens in the usual way, not
3722 via this new mechanism.
3723 </para></listitem>
3724 <listitem><para>
3725 The role of the last parameter of each of the <literal>ci</literal> is <emphasis>not</emphasis> <literal>nominal</literal>. (See <xref linkend="roles"/>.)</para></listitem>
3726 </itemizedlist>
3727 Then, for each <literal>ci</literal>, the derived instance
3728 declaration is:
3729 <programlisting>
3730 instance ci t => ci (T v1...vk)
3731 </programlisting>
3732 As an example which does <emphasis>not</emphasis> work, consider
3733 <programlisting>
3734 newtype NonMonad m s = NonMonad (State s m s) deriving Monad
3735 </programlisting>
3736 Here we cannot derive the instance
3737 <programlisting>
3738 instance Monad (State s m) => Monad (NonMonad m)
3739 </programlisting>
3740
3741 because the type variable <literal>s</literal> occurs in <literal>State s m</literal>,
3742 and so cannot be "eta-converted" away. It is a good thing that this
3743 <literal>deriving</literal> clause is rejected, because <literal>NonMonad m</literal> is
3744 not, in fact, a monad --- for the same reason. Try defining
3745 <literal>>>=</literal> with the correct type: you won't be able to.
3746 </para>
3747 <para>
3748
3749 Notice also that the <emphasis>order</emphasis> of class parameters becomes
3750 important, since we can only derive instances for the last one. If the
3751 <literal>StateMonad</literal> class above were instead defined as
3752
3753 <programlisting>
3754 class StateMonad m s | m -> s where ...
3755 </programlisting>
3756
3757 then we would not have been able to derive an instance for the
3758 <literal>Parser</literal> type above. We hypothesise that multi-parameter
3759 classes usually have one "main" parameter for which deriving new
3760 instances is most interesting.
3761 </para>
3762 <para>Lastly, all of this applies only for classes other than
3763 <literal>Read</literal>, <literal>Show</literal>, <literal>Typeable</literal>,
and <literal>Data</literal>, for which the built-in derivation applies (section
4.3.3 of the Haskell Report).
3766 (For the standard classes <literal>Eq</literal>, <literal>Ord</literal>,
3767 <literal>Ix</literal>, and <literal>Bounded</literal> it is immaterial whether
3768 the standard method is used or the one described here.)
3769 </para>
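<para>
The specification can be exercised with a self-contained sketch. The class
<literal>Scale</literal> and the newtype <literal>Metres</literal> are hypothetical
names invented for this example. The class is written so that the type being
wrapped is its last parameter, and <literal>Scale Int</literal> in the
<literal>deriving</literal> clause lacks exactly one argument.
<programlisting>
{-# LANGUAGE GeneralizedNewtypeDeriving, MultiParamTypeClasses #-}

-- Hypothetical two-parameter class; the scaled type is the LAST parameter.
class Scale k v where
  scale :: k -> v -> v

instance Scale Int Int where
  scale k v = k * v

-- "Scale Int" lacks exactly one argument, which the newtype supplies.
-- Notionally: instance Scale Int Int => Scale Int Metres
newtype Metres = Metres Int
  deriving (Eq, Show, Scale Int)
</programlisting>
Had the class been declared as <literal>class Scale v k</literal>, the same
<literal>deriving</literal> clause could not be written, for the reason given above.
</para>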
3770 </sect3>
3771 </sect2>
3772 </sect1>
3773
3774
3775 <!-- TYPE SYSTEM EXTENSIONS -->
3776 <sect1 id="type-class-extensions">
<title>Class and instance declarations</title>
3778
3779 <sect2 id="multi-param-type-classes">
3780 <title>Class declarations</title>
3781
3782 <para>
This section, and the next one, document GHC's type-class extensions.
3784 There's lots of background in the paper <ulink
3785 url="http://research.microsoft.com/~simonpj/Papers/type-class-design-space/">Type
3786 classes: exploring the design space</ulink> (Simon Peyton Jones, Mark
3787 Jones, Erik Meijer).
3788 </para>
3789
3790 <sect3>
3791 <title>Multi-parameter type classes</title>
3792 <para>
3793 Multi-parameter type classes are permitted, with flag <option>-XMultiParamTypeClasses</option>.
3794 For example:
3795
3796
3797 <programlisting>
3798 class Collection c a where
3799 union :: c a -> c a -> c a
3800 ...etc.
3801 </programlisting>
3802
3803 </para>
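<para>
A possible instance of the <literal>Collection</literal> class is sketched below.
The list instance and the use of <option>-XFlexibleInstances</option> (needed here
because the second instance argument is a bare type variable) are assumptions made
for this example.
<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

import qualified Data.List as List

class Collection c a where
  union :: c a -> c a -> c a

-- Lists of any equality type form a collection.
instance Eq a => Collection [] a where
  union = List.union
</programlisting>
</para>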
3804 </sect3>
3805
3806 <sect3 id="superclass-rules">
3807 <title>The superclasses of a class declaration</title>
3808
3809 <para>
3810 In Haskell 98 the context of a class declaration (which introduces superclasses)
3811 must be simple; that is, each predicate must consist of a class applied to
3812 type variables. The flag <option>-XFlexibleContexts</option>
3813 (<xref linkend="flexible-contexts"/>)
3814 lifts this restriction,
3815 so that the only restriction on the context in a class declaration is
3816 that the class hierarchy must be acyclic. So these class declarations are OK:
3817
3818
3819 <programlisting>
3820 class Functor (m k) => FiniteMap m k where
3821 ...
3822
3823 class (Monad m, Monad (t m)) => Transform t m where
3824 lift :: m a -> (t m) a
3825 </programlisting>
3826
3827
3828 </para>
3829 <para>
As in Haskell 98, the class hierarchy must be acyclic. However, the definition
3831 of "acyclic" involves only the superclass relationships. For example,
3832 this is OK:
3833
3834
3835 <programlisting>
3836 class C a where {
3837 op :: D b => a -> b -> b
3838 }
3839
3840 class C a => D a where { ... }
3841 </programlisting>
3842
3843
3844 Here, <literal>C</literal> is a superclass of <literal>D</literal>, but it's OK for a
3845 class operation <literal>op</literal> of <literal>C</literal> to mention <literal>D</literal>. (It
3846 would not be OK for <literal>D</literal> to be a superclass of <literal>C</literal>.)
3847 </para>
3848 <para>
3849 With the extension that adds a <link linkend="constraint-kind">kind of constraints</link>,
3850 you can write more exotic superclass definitions. The superclass cycle check is even more
liberal in this case. For example, this is OK:
3852
3853 <programlisting>
3854 class A cls c where
3855 meth :: cls c => c -> c
3856
3857 class A B c => B c where
3858 </programlisting>
3859
3860 A superclass context for a class <literal>C</literal> is allowed if, after expanding
3861 type synonyms to their right-hand-sides, and uses of classes (other than <literal>C</literal>)
3862 to their superclasses, <literal>C</literal> does not occur syntactically in the context.
3863 </para>
3864 </sect3>
3865
3866
3867
3868
3869 <sect3 id="class-method-types">
3870 <title>Class method types</title>
3871
3872 <para>
Haskell 98 prohibits class method types from mentioning constraints on the
class type variable, thus:
3875 <programlisting>
3876 class Seq s a where
3877 fromList :: [a] -> s a
3878 elem :: Eq a => a -> s a -> Bool
3879 </programlisting>
The type of <literal>elem</literal> is illegal in Haskell 98, because it
contains the constraint <literal>Eq a</literal>, which constrains only the
class type variable (in this case <literal>a</literal>).
3883 GHC lifts this restriction (flag <option>-XConstrainedClassMethods</option>).
3884 </para>
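<para>
The following sketch shows such a class compiling and being used. The method is
renamed to <literal>member</literal> to avoid a clash with the Prelude's
<literal>elem</literal>, and the list instance is an assumption added for
illustration.
<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, ConstrainedClassMethods #-}

class Seq s a where
  fromList :: [a] -> s a
  -- Eq a constrains only the class variable a; Haskell 98 rejects this.
  member   :: Eq a => a -> s a -> Bool

instance Seq [] a where
  fromList = id
  member   = elem
</programlisting>
The benefit of constraining the method rather than the class is that
<literal>fromList</literal> remains usable at element types that have no
<literal>Eq</literal> instance.
</para>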
3885
3886
3887 </sect3>
3888
3889
3890 <sect3 id="class-default-signatures">
3891 <title>Default method signatures</title>
3892
3893 <para>
3894 Haskell 98 allows you to define a default implementation when declaring a class:
3895 <programlisting>
3896 class Enum a where
3897 enum :: [a]
3898 enum = []
3899 </programlisting>
3900 The type of the <literal>enum</literal> method is <literal>[a]</literal>, and
3901 this is also the type of the default method. You can lift this restriction
3902 and give another type to the default method using the flag
3903 <option>-XDefaultSignatures</option>. For instance, if you have written a
3904 generic implementation of enumeration in a class <literal>GEnum</literal>
3905 with method <literal>genum</literal> in terms of <literal>GHC.Generics</literal>,
3906 you can specify a default method that uses that generic implementation:
3907 <programlisting>
3908 class Enum a where
3909 enum :: [a]
3910 default enum :: (Generic a, GEnum (Rep a)) => [a]
3911 enum = map to genum
3912 </programlisting>
3913 We reuse the keyword <literal>default</literal> to signal that a signature
3914 applies to the default method only; when defining instances of the
3915 <literal>Enum</literal> class, the original type <literal>[a]</literal> of
3916 <literal>enum</literal> still applies. When giving an empty instance, however,
the default implementation <literal>map to genum</literal> is filled in,
3918 and type-checked with the type
3919 <literal>(Generic a, GEnum (Rep a)) => [a]</literal>.
3920 </para>
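<para>
The mechanism does not depend on generics. As a smaller sketch, the hypothetical
class <literal>Describe</literal> below gives its default method a
<literal>Show</literal> constraint that the method itself does not have:
<programlisting>
{-# LANGUAGE DefaultSignatures #-}

class Describe a where
  describe :: a -> String
  -- This signature applies to the default implementation only.
  default describe :: Show a => a -> String
  describe x = "value " ++ show x

-- Empty instance: the default is filled in and needs Show Bool.
instance Describe Bool

-- Explicit instance: only the original type a -> String applies.
instance Describe Int where
  describe n = "int " ++ show n
</programlisting>
An empty instance for a type without a <literal>Show</literal> instance would be
rejected, because the default implementation would not typecheck at that type.
</para>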
3921
3922 <para>
3923 We use default signatures to simplify generic programming in GHC
3924 (<xref linkend="generic-programming"/>).
3925 </para>
3926
3927
3928 </sect3>
3929
3930 <sect3 id="nullary-type-classes">
3931 <title>Nullary type classes</title>
<para>
Nullary (no parameter) type classes are enabled with <option>-XNullaryTypeClasses</option>.
Since there are no available parameters, there can be at most one instance
of a nullary class. A nullary type class might be used to document some assumption
in a type signature (such as reliance on the Riemann hypothesis) or add some
globally configurable settings in a program. For example,

<programlisting>
class RiemannHypothesis where
  assumeRH :: a -> a

-- Deterministic version of the Miller test
-- correctness depends on the generalized Riemann hypothesis
isPrime :: RiemannHypothesis => Integer -> Bool
isPrime n = assumeRH (...)
</programlisting>

The type signature of <literal>isPrime</literal> informs users that its correctness
depends on an unproven conjecture. If the function is used, the user has
to acknowledge the dependence with:

<programlisting>
instance RiemannHypothesis where
  assumeRH = id
</programlisting>
</para>
3956
3957 </sect3>
3958 </sect2>
3959
3960 <sect2 id="functional-dependencies">
3961 <title>Functional dependencies
3962 </title>
3963
<para> Functional dependencies are implemented as described by Mark Jones
in &ldquo;<ulink url="http://citeseer.ist.psu.edu/jones00type.html">Type Classes with Functional Dependencies</ulink>&rdquo;, Mark P. Jones,
in Proceedings of the 9th European Symposium on Programming,
ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782.
3969 </para>
3970 <para>
3971 Functional dependencies are introduced by a vertical bar in the syntax of a
3972 class declaration; e.g.
3973 <programlisting>
3974 class (Monad m) => MonadState s m | m -> s where ...
3975
3976 class Foo a b c | a b -> c where ...
3977 </programlisting>
3978 There should be more documentation, but there isn't (yet). Yell if you need it.
3979 </para>
3980
3981 <sect3><title>Rules for functional dependencies </title>
3982 <para>
3983 In a class declaration, all of the class type variables must be reachable (in the sense
3984 mentioned in <xref linkend="flexible-contexts"/>)
3985 from the free variables of each method type.
3986 For example:
3987
3988 <programlisting>
3989 class Coll s a where
3990 empty :: s
3991 insert :: s -> a -> s
3992 </programlisting>
3993
3994 is not OK, because the type of <literal>empty</literal> doesn't mention
3995 <literal>a</literal>. Functional dependencies can make the type variable
3996 reachable:
3997 <programlisting>
3998 class Coll s a | s -> a where
3999 empty :: s
4000 insert :: s -> a -> s
4001 </programlisting>
4002
4003 Alternatively <literal>Coll</literal> might be rewritten
4004
4005 <programlisting>
4006 class Coll s a where
4007 empty :: s a
4008 insert :: s a -> a -> s a
4009 </programlisting>
4010
4011
4012 which makes the connection between the type of a collection of
4013 <literal>a</literal>'s (namely <literal>(s a)</literal>) and the element type <literal>a</literal>.
4014 Occasionally this really doesn't work, in which case you can split the
4015 class like this:
4016
4017
4018 <programlisting>
4019 class CollE s where
4020 empty :: s
4021
4022 class CollE s => Coll s a where
4023 insert :: s -> a -> s
4024 </programlisting>
4025 </para>
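<para>
As a sketch of the functional-dependency variant in use, the list instance below
is an assumption added for illustration. It needs
<option>-XFlexibleInstances</option> because the type variable
<literal>a</literal> appears twice in the instance head.
<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}

class Coll s a | s -> a where
  empty  :: s
  insert :: s -> a -> s

-- The collection type [a] determines the element type a.
instance Coll [a] a where
  empty       = []
  insert xs x = x : xs
</programlisting>
Because of the dependency <literal>s -> a</literal>, an expression such as
<literal>insert (insert empty 'a') 'b'</literal> used at type
<literal>String</literal> needs no further annotation: fixing the collection type
fixes the element type.
</para>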
4026 </sect3>
4027
4028
4029 <sect3>
4030 <title>Background on functional dependencies</title>
4031
4032 <para>The following description of the motivation and use of functional dependencies is taken
4033 from the Hugs user manual, reproduced here (with minor changes) by kind
4034 permission of Mark Jones.
4035 </para>
4036 <para>
4037 Consider the following class, intended as part of a
4038 library for collection types:
4039 <programlisting>
4040 class Collects e ce where
4041 empty :: ce
4042 insert :: e -> ce -> ce
4043 member :: e -> ce -> Bool
4044 </programlisting>
4045 The type variable e used here represents the element type, while ce is the type
4046 of the container itself. Within this framework, we might want to define
4047 instances of this class for lists or characteristic functions (both of which
4048 can be used to represent collections of any equality type), bit sets (which can
4049 be used to represent collections of characters), or hash tables (which can be
4050 used to represent any collection whose elements have a hash function). Omitting
4051 standard implementation details, this would lead to the following declarations:
<programlisting>
instance Eq e => Collects e [e] where ...
instance Eq e => Collects e (e -> Bool) where ...
instance Collects Char BitSet where ...
instance (Hashable e, Collects e ce)
           => Collects e (Array Int ce) where ...
</programlisting>
4059 All this looks quite promising; we have a class and a range of interesting
4060 implementations. Unfortunately, there are some serious problems with the class
4061 declaration. First, the empty function has an ambiguous type:
4062 <programlisting>
4063 empty :: Collects e ce => ce
4064 </programlisting>
4065 By "ambiguous" we mean that there is a type variable e that appears on the left
4066 of the <literal>=&gt;</literal> symbol, but not on the right. The problem with
4067 this is that, according to the theoretical foundations of Haskell overloading,
4068 we cannot guarantee a well-defined semantics for any term with an ambiguous
4069 type.
4070 </para>
4071 <para>
4072 We can sidestep this specific problem by removing the empty member from the
4073 class declaration. However, although the remaining members, insert and member,
4074 do not have ambiguous types, we still run into problems when we try to use
4075 them. For example, consider the following two functions:
4076 <programlisting>
4077 f x y = insert x . insert y
4078 g = f True 'a'
4079 </programlisting>
4080 for which GHC infers the following types:
4081 <programlisting>
4082 f :: (Collects a c, Collects b c) => a -> b -> c -> c
4083 g :: (Collects Bool c, Collects Char c) => c -> c
4084 </programlisting>
4085 Notice that the type for f allows the two parameters x and y to be assigned
4086 different types, even though it attempts to insert each of the two values, one
4087 after the other, into the same collection. If we're trying to model collections
4088 that contain only one type of value, then this is clearly an inaccurate
4089 type. Worse still, the definition for g is accepted, without causing a type
4090 error. As a result, the error in this code will not be flagged at the point
4091 where it appears. Instead, it will show up only when we try to use g, which
4092 might even be in a different module.
4093 </para>
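<para>
The problem can be reproduced with a short module. This is a sketch that declares
the class without <literal>empty</literal> and writes out the types quoted above
as signatures; <option>-XFlexibleContexts</option> is assumed to be needed for the
signature of <literal>g</literal>.
<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FlexibleContexts #-}

class Collects e ce where
  insert :: e -> ce -> ce
  member :: e -> ce -> Bool

-- x and y may have different types, yet both go into the same collection.
f :: (Collects a c, Collects b c) => a -> b -> c -> c
f x y = insert x . insert y

-- Accepted without complaint, although it mixes a Bool and a Char.
g :: (Collects Bool c, Collects Char c) => c -> c
g = f True 'a'
</programlisting>
The module compiles as it stands; an error can only arise later, at a use of
<literal>g</literal> for which the two required instances do not both exist.
</para>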
4094