Update the generics docs; pointed out by Christian Maeder
1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <para>
3 <indexterm><primary>language, GHC</primary></indexterm>
4 <indexterm><primary>extensions, GHC</primary></indexterm>
5 As with all known Haskell systems, GHC implements some extensions to
6 the language. They can all be enabled or disabled by command-line flags
7 or language pragmas. By default GHC understands the most recent Haskell
8 version it supports, plus a handful of extensions.
9 </para>
10
11 <para>
12 Some of the Glasgow extensions serve to give you access to the
13 underlying facilities with which we implement Haskell. Thus, you can
14 get at the Raw Iron, if you are willing to write some non-portable
15 code at a more primitive level. You need not be &ldquo;stuck&rdquo;
16 on performance because of the implementation costs of Haskell's
17 &ldquo;high-level&rdquo; features&mdash;you can always code
18 &ldquo;under&rdquo; them. In an extreme case, you can write all your
19 time-critical code in C, and then just glue it together with Haskell!
20 </para>
21
22 <para>
23 Before you get too carried away working at the lowest level (e.g.,
24 sloshing <literal>MutableByteArray&num;</literal>s around your
25 program), you may wish to check if there are libraries that provide a
26 &ldquo;Haskellised veneer&rdquo; over the features you want. The
27 separate <ulink url="../libraries/index.html">libraries
28 documentation</ulink> describes all the libraries that come with GHC.
29 </para>
30
31 <!-- LANGUAGE OPTIONS -->
32 <sect1 id="options-language">
33 <title>Language options</title>
34
35 <indexterm><primary>language</primary><secondary>option</secondary>
36 </indexterm>
37 <indexterm><primary>options</primary><secondary>language</secondary>
38 </indexterm>
39 <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
40 </indexterm>
41
42 <para>The language option flags control which variations of the language are
43 permitted.</para>
44
45 <para>Language options can be controlled in two ways:
46 <itemizedlist>
47 <listitem><para>Every language option can be switched on by a command-line flag "<option>-X...</option>"
48 (e.g. <option>-XTemplateHaskell</option>), and switched off by the flag "<option>-XNo...</option>"
49 (e.g. <option>-XNoTemplateHaskell</option>).</para></listitem>
50 <listitem><para>
51 Language options recognised by Cabal can also be enabled using the <literal>LANGUAGE</literal> pragma,
52 thus <literal>{-# LANGUAGE TemplateHaskell #-}</literal> (see <xref linkend="language-pragma"/>). </para>
53 </listitem>
54 </itemizedlist></para>
55
56 <para>The flag <option>-fglasgow-exts</option>
57 <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
58 is equivalent to enabling the following extensions:
59 &what_glasgow_exts_does;
60 Enabling these options is the <emphasis>only</emphasis>
61 effect of <option>-fglasgow-exts</option>.
62 We are trying to move away from this portmanteau flag,
63 and towards enabling features individually.</para>
64
65 </sect1>
66
67 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
68 <sect1 id="primitives">
69 <title>Unboxed types and primitive operations</title>
70
71 <para>GHC is built on a raft of primitive data types and operations;
72 "primitive" in the sense that they cannot be defined in Haskell itself.
73 While you really can use this stuff to write fast code,
74 we generally find it a lot less painful, and more satisfying in the
75 long run, to use higher-level language features and libraries. With
76 any luck, the code you write will be optimised to the efficient
77 unboxed version in any case. And if it isn't, we'd like to know
78 about it.</para>
79
80 <para>All these primitive data types and operations are exported by the
81 library <literal>GHC.Prim</literal>, for which there is
82 <ulink url="&libraryGhcPrimLocation;/GHC-Prim.html">detailed online documentation</ulink>.
83 (This documentation is generated from the file <filename>compiler/prelude/primops.txt.pp</filename>.)
84 </para>
85 <para>
86 If you want to mention any of the primitive data types or operations in your
87 program, you must first import <literal>GHC.Prim</literal> to bring them
88 into scope. Many of them have names ending in "&num;", and to mention such
89 names you need the <option>-XMagicHash</option> extension (<xref linkend="magic-hash"/>).
90 </para>
91
92 <para>The primops make extensive use of <link linkend="glasgow-unboxed">unboxed types</link>
93 and <link linkend="unboxed-tuples">unboxed tuples</link>, which
94 we briefly summarise here. </para>
95
96 <sect2 id="glasgow-unboxed">
97 <title>Unboxed types
98 </title>
99
100 <para>
101 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
102 </para>
103
104 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
105 that values of that type are represented by a pointer to a heap
106 object. The representation of a Haskell <literal>Int</literal>, for
107 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
108 type, however, is represented by the value itself; no pointers or heap
109 allocation are involved.
110 </para>
111
112 <para>
113 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
114 would use in C: <literal>Int&num;</literal> (long int),
115 <literal>Double&num;</literal> (double), <literal>Addr&num;</literal>
116 (void *), etc. The <emphasis>primitive operations</emphasis>
117 (PrimOps) on these types are what you might expect; e.g.,
118 <literal>(+&num;)</literal> is addition on
119 <literal>Int&num;</literal>s, and is the machine-addition that we all
120 know and love&mdash;usually one instruction.
121 </para>
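<para>
As a rough illustration of the paragraph above, the following sketch adds and
multiplies raw <literal>Int#</literal> values and then wraps the result back
into an ordinary <literal>Int</literal>. It assumes that the primitives are
imported through <literal>GHC.Exts</literal> (which also exports the
<literal>I#</literal> constructor) rather than directly from
<literal>GHC.Prim</literal>; the names <literal>sumSquares#</literal> and
<literal>sumSquares</literal> are invented for this example.
</para>

<programlisting>
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, (+#), (*#))

-- Operates directly on unboxed machine integers.
sumSquares# :: Int# -&gt; Int# -&gt; Int#
sumSquares# a b = (a *# a) +# (b *# b)

-- Boxed wrapper: unpack with I#, compute, and box the result again.
sumSquares :: Int -&gt; Int -&gt; Int
sumSquares (I# a) (I# b) = I# (sumSquares# a b)
</programlisting>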
122
123 <para>
124 Primitive (unboxed) types cannot be defined in Haskell, and are
125 therefore built into the language and compiler. Primitive types are
126 always unlifted; that is, a value of a primitive type cannot be
127 bottom. We use the convention (but it is only a convention)
128 that primitive types, values, and
129 operations have a <literal>&num;</literal> suffix (see <xref linkend="magic-hash"/>).
130 For some primitive types we have special syntax for literals, also
131 described in the <link linkend="magic-hash">same section</link>.
132 </para>
133
134 <para>
135 Primitive values are often represented by a simple bit-pattern, such
136 as <literal>Int&num;</literal>, <literal>Float&num;</literal>,
137 <literal>Double&num;</literal>. But this is not necessarily the case:
138 a primitive value might be represented by a pointer to a
139 heap-allocated object. Examples include
140 <literal>Array&num;</literal>, the type of primitive arrays. A
141 primitive array is heap-allocated because it is too big a value to fit
142 in a register, and would be too expensive to copy around; in a sense,
143 it is accidental that it is represented by a pointer. If a pointer
144 represents a primitive value, then it really does point to that value:
145 no unevaluated thunks, no indirections&hellip;nothing other than the
146 primitive value can be at the other end of the pointer.
147 A numerically-intensive program using unboxed types can
148 go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
149 counterpart&mdash;we saw a threefold speedup on one example.
150 </para>
151
152 <para>
153 There are some restrictions on the use of primitive types:
154 <itemizedlist>
155 <listitem><para>The main restriction
156 is that you can't pass a primitive value to a polymorphic
157 function or store one in a polymorphic data type. This rules out
158 things like <literal>[Int&num;]</literal> (i.e. lists of primitive
159 integers). The reason for this restriction is that polymorphic
160 arguments and constructor fields are assumed to be pointers: if an
161 unboxed integer is stored in one of these, the garbage collector would
162 attempt to follow it, leading to unpredictable space leaks. Or a
163 <function>seq</function> operation on the polymorphic component may
164 attempt to dereference the pointer, with disastrous results. Even
165 worse, the unboxed value might be larger than a pointer
166 (<literal>Double&num;</literal> for instance).
167 </para>
168 </listitem>
169 <listitem><para> You cannot define a newtype whose representation type
170 (the argument type of the data constructor) is an unboxed type. Thus,
171 this is illegal:
172 <programlisting>
173 newtype A = MkA Int#
174 </programlisting>
175 </para></listitem>
176 <listitem><para> You cannot bind a variable with an unboxed type
177 in a <emphasis>top-level</emphasis> binding.
178 </para></listitem>
179 <listitem><para> You cannot bind a variable with an unboxed type
180 in a <emphasis>recursive</emphasis> binding.
181 </para></listitem>
182 <listitem><para> You may bind unboxed variables in a (non-recursive,
183 non-top-level) pattern binding, but you must make any such pattern-match
184 strict. For example, rather than:
185 <programlisting>
186 data Foo = Foo Int Int#
187
188 f x = let (Foo a b, w) = ..rhs.. in ..body..
189 </programlisting>
190 you must write:
191 <programlisting>
192 data Foo = Foo Int Int#
193
194 f x = let !(Foo a b, w) = ..rhs.. in ..body..
195 </programlisting>
196 since <literal>b</literal> has type <literal>Int#</literal>.
197 </para>
198 </listitem>
199 </itemizedlist>
200 </para>
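<para>
The last restriction can be sketched as a complete function. This is only an
illustration under the assumption that the <literal>BangPatterns</literal>
and <literal>MagicHash</literal> extensions are switched on by pragma; the
function name <literal>total</literal> is hypothetical.
</para>

<programlisting>
{-# LANGUAGE MagicHash, BangPatterns #-}
import GHC.Exts (Int (I#), Int#)

data Foo = Foo Int Int#

-- b has the unboxed type Int#, so the pattern binding must be strict.
total :: (Foo, Int) -&gt; Int
total pair =
  let !(Foo a b, w) = pair   -- the bang makes the binding strict
  in a + I# b + w
</programlisting>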
201
202 </sect2>
203
204 <sect2 id="unboxed-tuples">
205 <title>Unboxed Tuples
206 </title>
207
208 <para>
209 Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>;
210 they're available by default with <option>-fglasgow-exts</option>. An
211 unboxed tuple looks like this:
212 </para>
213
214 <para>
215
216 <programlisting>
217 (# e_1, ..., e_n #)
218 </programlisting>
219
220 </para>
221
222 <para>
223 where <literal>e&lowbar;1..e&lowbar;n</literal> are expressions of any
224 type (primitive or non-primitive). The type of an unboxed tuple looks
225 the same.
226 </para>
227
228 <para>
229 Unboxed tuples are used for functions that need to return multiple
230 values, but they avoid the heap allocation normally associated with
231 using fully-fledged tuples. When an unboxed tuple is returned, the
232 components are put directly into registers or on the stack; the
233 unboxed tuple itself does not have a composite representation. Many
234 of the primitive operations listed in <literal>primops.txt.pp</literal> return unboxed
235 tuples.
236 In particular, the <literal>IO</literal> and <literal>ST</literal> monads use unboxed
237 tuples to avoid unnecessary allocation during sequences of operations.
238 </para>
239
240 <para>
241 There are some pretty stringent restrictions on the use of unboxed tuples:
242 <itemizedlist>
243 <listitem>
244
245 <para>
246 Values of unboxed tuple types are subject to the same restrictions as
247 other unboxed types; i.e. they may not be stored in polymorphic data
248 structures or passed to polymorphic functions.
249
250 </para>
251 </listitem>
252 <listitem>
253
254 <para>
255 No variable can have an unboxed tuple type, nor may a constructor or function
256 argument have an unboxed tuple type. The following are all illegal:
257
258
259 <programlisting>
260 data Foo = Foo (# Int, Int #)
261
262 f :: (# Int, Int #) -&#62; (# Int, Int #)
263 f x = x
264
265 g :: (# Int, Int #) -&#62; Int
266 g (# a,b #) = a
267
268 h x = let y = (# x,x #) in ...
269 </programlisting>
270 </para>
271 </listitem>
272 </itemizedlist>
273 </para>
274 <para>
275 The typical use of unboxed tuples is simply to return multiple values,
276 binding those multiple results with a <literal>case</literal> expression, thus:
277 <programlisting>
278 f x y = (# x+1, y-1 #)
279 g x = case f x x of { (# a, b #) -&#62; a + b }
280 </programlisting>
281 You can have an unboxed tuple in a pattern binding, thus
282 <programlisting>
283 f x = let (# p,q #) = h x in ..body..
284 </programlisting>
285 If the types of <literal>p</literal> and <literal>q</literal> are not unboxed,
286 the resulting binding is lazy like any other Haskell pattern binding. The
287 above example desugars like this:
288 <programlisting>
289 f x = let t = case h x of { (# p,q #) -> (p,q) }
290 p = fst t
291 q = snd t
292 in ..body..
293 </programlisting>
294 Indeed, the bindings can even be recursive.
295 </para>
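<para>
A small self-contained sketch of the "return multiple values" idiom follows.
It assumes the <literal>UnboxedTuples</literal> language pragma is used to
enable the syntax; <literal>splitDigit</literal> and
<literal>digitSum</literal> are names made up for the illustration.
</para>

<programlisting>
{-# LANGUAGE UnboxedTuples #-}

-- Return quotient and remainder together, without building a boxed pair.
splitDigit :: Int -&gt; (# Int, Int #)
splitDigit n = (# n `quot` 10, n `rem` 10 #)

-- Bind both results with a case expression, as described above.
digitSum :: Int -&gt; Int
digitSum n
  | n &lt; 10    = n
  | otherwise = case splitDigit n of
      (# q, r #) -&gt; r + digitSum q
</programlisting>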
296
297 </sect2>
298 </sect1>
299
300
301 <!-- ====================== SYNTACTIC EXTENSIONS ======================= -->
302
303 <sect1 id="syntax-extns">
304 <title>Syntactic extensions</title>
305
306 <sect2 id="unicode-syntax">
307 <title>Unicode syntax</title>
308 <para>The language
309 extension <option>-XUnicodeSyntax</option><indexterm><primary><option>-XUnicodeSyntax</option></primary></indexterm>
310 enables Unicode characters to be used to stand for certain ASCII
311 character sequences. The following alternatives are provided:</para>
312
313 <informaltable>
314 <tgroup cols="4" align="left" colsep="1" rowsep="1">
315 <thead>
316 <row>
317 <entry>ASCII</entry>
318 <entry>Unicode alternative</entry>
319 <entry>Code point</entry>
320 <entry>Name</entry>
321 </row>
322 </thead>
323
324 <!--
325 to find the DocBook entities for these characters, find
326 the Unicode code point (e.g. 0x2237), and grep for it in
327 /usr/share/sgml/docbook/xml-dtd-*/ent/* (or equivalent on
328 your system). Some of these Unicode code points don't have
329 equivalent DocBook entities.
330 -->
331
332 <tbody>
333 <row>
334 <entry><literal>::</literal></entry>
335 <entry>::</entry> <!-- no special char, apparently -->
336 <entry>0x2237</entry>
337 <entry>PROPORTION</entry>
338 </row>
339 </tbody>
340 <tbody>
341 <row>
342 <entry><literal>=&gt;</literal></entry>
343 <entry>&rArr;</entry>
344 <entry>0x21D2</entry>
345 <entry>RIGHTWARDS DOUBLE ARROW</entry>
346 </row>
347 </tbody>
348 <tbody>
349 <row>
350 <entry><literal>forall</literal></entry>
351 <entry>&forall;</entry>
352 <entry>0x2200</entry>
353 <entry>FOR ALL</entry>
354 </row>
355 </tbody>
356 <tbody>
357 <row>
358 <entry><literal>-&gt;</literal></entry>
359 <entry>&rarr;</entry>
360 <entry>0x2192</entry>
361 <entry>RIGHTWARDS ARROW</entry>
362 </row>
363 </tbody>
364 <tbody>
365 <row>
366 <entry><literal>&lt;-</literal></entry>
367 <entry>&larr;</entry>
368 <entry>0x2190</entry>
369 <entry>LEFTWARDS ARROW</entry>
370 </row>
371 </tbody>
372
373 <tbody>
374 <row>
375 <entry>-&lt;</entry>
376 <entry>&larrtl;</entry>
377 <entry>0x2919</entry>
378 <entry>LEFTWARDS ARROW-TAIL</entry>
379 </row>
380 </tbody>
381
382 <tbody>
383 <row>
384 <entry>&gt;-</entry>
385 <entry>&rarrtl;</entry>
386 <entry>0x291A</entry>
387 <entry>RIGHTWARDS ARROW-TAIL</entry>
388 </row>
389 </tbody>
390
391 <tbody>
392 <row>
393 <entry>-&lt;&lt;</entry>
394 <entry></entry>
395 <entry>0x291B</entry>
396 <entry>LEFTWARDS DOUBLE ARROW-TAIL</entry>
397 </row>
398 </tbody>
399
400 <tbody>
401 <row>
402 <entry>&gt;&gt;-</entry>
403 <entry></entry>
404 <entry>0x291C</entry>
405 <entry>RIGHTWARDS DOUBLE ARROW-TAIL</entry>
406 </row>
407 </tbody>
408
409 <tbody>
410 <row>
411 <entry>*</entry>
412 <entry>&starf;</entry>
413 <entry>0x2605</entry>
414 <entry>BLACK STAR</entry>
415 </row>
416 </tbody>
417
418 </tgroup>
419 </informaltable>
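<para>
For illustration, here is a short sketch that uses two of the alternatives
from the table (the rightwards and leftwards arrows). It assumes the source
file is saved as UTF-8; the function names are hypothetical.
</para>

<programlisting>
{-# LANGUAGE UnicodeSyntax #-}

-- Division that fails on a zero divisor.
safeDiv :: Int &rarr; Int &rarr; Maybe Int
safeDiv _ 0 = Nothing
safeDiv n d = Just (n `div` d)

-- The leftwards arrow stands for the ASCII generator arrow.
halveAll :: [Int] &rarr; [Int]
halveAll xs = [ q | x &larr; xs, Just q &larr; [safeDiv x 2] ]
</programlisting>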
420 </sect2>
421
422 <sect2 id="magic-hash">
423 <title>The magic hash</title>
424 <para>The language extension <option>-XMagicHash</option> allows "&num;" as a
425 postfix modifier to identifiers. Thus, "x&num;" is a valid variable, and "T&num;" is
426 a valid type constructor or data constructor.</para>
427
428 <para>The hash sign does not change semantics at all. We tend to use variable
429 names ending in "&num;" for unboxed values or types (e.g. <literal>Int&num;</literal>),
430 but there is no requirement to do so; they are just plain ordinary variables.
431 Nor does the <option>-XMagicHash</option> extension bring anything into scope.
432 For example, to bring <literal>Int&num;</literal> into scope you must
433 import <literal>GHC.Prim</literal> (see <xref linkend="primitives"/>);
434 the <option>-XMagicHash</option> extension
435 then allows you to <emphasis>refer</emphasis> to the <literal>Int&num;</literal>
436 that is now in scope.</para>
437 <para> The <option>-XMagicHash</option> extension also enables some new forms of literals (see <xref linkend="glasgow-unboxed"/>):
438 <itemizedlist>
439 <listitem><para> <literal>'x'&num;</literal> has type <literal>Char&num;</literal></para> </listitem>
440 <listitem><para> <literal>&quot;foo&quot;&num;</literal> has type <literal>Addr&num;</literal></para> </listitem>
441 <listitem><para> <literal>3&num;</literal> has type <literal>Int&num;</literal>. In general,
442 any Haskell integer lexeme followed by a <literal>&num;</literal> is an <literal>Int&num;</literal> literal, e.g.
443 <literal>-0x3A&num;</literal> as well as <literal>32&num;</literal>.</para></listitem>
444 <listitem><para> <literal>3&num;&num;</literal> has type <literal>Word&num;</literal>. In general,
445 any non-negative Haskell integer lexeme followed by <literal>&num;&num;</literal>
446 is a <literal>Word&num;</literal>. </para> </listitem>
447 <listitem><para> <literal>3.2&num;</literal> has type <literal>Float&num;</literal>.</para> </listitem>
448 <listitem><para> <literal>3.2&num;&num;</literal> has type <literal>Double&num;</literal></para> </listitem>
449 </itemizedlist>
450 </para>
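<para>
The literal forms listed above can be tried out with a sketch like the
following. It assumes that the boxing constructors (<literal>I#</literal>,
<literal>C#</literal>, <literal>W#</literal>, <literal>D#</literal>) and the
primitive operations are imported from <literal>GHC.Exts</literal>; the
top-level names are invented, and each is boxed because unboxed values
cannot be bound at top level.
</para>

<programlisting>
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Char (C#), Double (D#), Int (I#), Word (W#),
                 (+#), (*##), plusWord#)

seven :: Int
seven = I# (3# +# 4#)               -- Int# literals

letter :: Char
letter = C# 'x'#                    -- Char# literal

ten :: Word
ten = W# (3## `plusWord#` 7##)      -- Word# literals

area :: Double
area = D# (3.0## *## 2.0##)         -- Double# literals
</programlisting>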
451 </sect2>
452
453 <!-- ====================== HIERARCHICAL MODULES ======================= -->
454
455
456 <sect2 id="hierarchical-modules">
457 <title>Hierarchical Modules</title>
458
459 <para>GHC supports a small extension to the syntax of module
460 names: a module name is allowed to contain a dot
461 <literal>&lsquo;.&rsquo;</literal>. This is also known as the
462 &ldquo;hierarchical module namespace&rdquo; extension, because
463 it extends the normally flat Haskell module namespace into a
464 more flexible hierarchy of modules.</para>
465
466 <para>This extension has very little impact on the language
467 itself; module names are <emphasis>always</emphasis> fully
468 qualified, so you can just think of the fully qualified module
469 name as <quote>the module name</quote>. In particular, this
470 means that the full module name must be given after the
471 <literal>module</literal> keyword at the beginning of the
472 module; for example, the module <literal>A.B.C</literal> must
473 begin</para>
474
475 <programlisting>module A.B.C</programlisting>
476
477
478 <para>It is a common strategy to use the <literal>as</literal>
479 keyword to save some typing when using qualified names with
480 hierarchical modules. For example:</para>
481
482 <programlisting>
483 import qualified Control.Monad.ST.Strict as ST
484 </programlisting>
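<para>
A slightly longer sketch shows the abbreviated qualifier in use. It assumes
the standard <literal>Control.Monad.ST.Strict</literal> and
<literal>Data.STRef</literal> modules; the function <literal>sumST</literal>
and the alias <literal>Ref</literal> are made up for the example.
</para>

<programlisting>
import qualified Control.Monad.ST.Strict as ST
import qualified Data.STRef as Ref

-- Sum a list using a mutable accumulator; every imported name is
-- reached through its short qualifier.
sumST :: [Int] -&gt; Int
sumST xs = ST.runST (do
  acc &lt;- Ref.newSTRef 0
  mapM_ (\x -&gt; Ref.modifySTRef' acc (+ x)) xs
  Ref.readSTRef acc)
</programlisting>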
485
486 <para>For details on how GHC searches for source and interface
487 files in the presence of hierarchical modules, see <xref
488 linkend="search-path"/>.</para>
489
490 <para>GHC comes with a large collection of libraries arranged
491 hierarchically; see the accompanying <ulink
492 url="../libraries/index.html">library
493 documentation</ulink>. More libraries to install are available
494 from <ulink
495 url="http://hackage.haskell.org/packages/hackage.html">HackageDB</ulink>.</para>
496 </sect2>
497
498 <!-- ====================== PATTERN GUARDS ======================= -->
499
500 <sect2 id="pattern-guards">
501 <title>Pattern guards</title>
502
503 <para>
504 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
505 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ulink url="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ulink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
506 </para>
507
508 <para>
509 Suppose we have an abstract data type of finite maps, with a
510 lookup operation:
511
512 <programlisting>
513 lookup :: FiniteMap -> Int -> Maybe Int
514 </programlisting>
515
516 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
517 where <varname>v</varname> is the value that the key maps to. Now consider the following definition:
518 </para>
519
520 <programlisting>
521 clunky env var1 var2 | ok1 &amp;&amp; ok2 = val1 + val2
522 | otherwise = var1 + var2
523 where
524 m1 = lookup env var1
525 m2 = lookup env var2
526 ok1 = maybeToBool m1
527 ok2 = maybeToBool m2
528 val1 = expectJust m1
529 val2 = expectJust m2
530 </programlisting>
531
532 <para>
533 The auxiliary functions are
534 </para>
535
536 <programlisting>
537 maybeToBool :: Maybe a -&gt; Bool
538 maybeToBool (Just x) = True
539 maybeToBool Nothing = False
540
541 expectJust :: Maybe a -&gt; a
542 expectJust (Just x) = x
543 expectJust Nothing = error "Unexpected Nothing"
544 </programlisting>
545
546 <para>
547 What is <function>clunky</function> doing? The guard <literal>ok1 &amp;&amp;
548 ok2</literal> checks that both lookups succeed, using
549 <function>maybeToBool</function> to convert the <function>Maybe</function>
550 types to booleans. The (lazily evaluated) <function>expectJust</function>
551 calls extract the values from the results of the lookups, and bind the
552 returned values to <varname>val1</varname> and <varname>val2</varname>
553 respectively. If either lookup fails, then clunky takes the
554 <literal>otherwise</literal> case and returns the sum of its arguments.
555 </para>
556
557 <para>
558 This is certainly legal Haskell, but it is a tremendously verbose and
559 un-obvious way to achieve the desired effect. Arguably, a more direct way
560 to write clunky would be to use case expressions:
561 </para>
562
563 <programlisting>
564 clunky env var1 var2 = case lookup env var1 of
565 Nothing -&gt; fail
566 Just val1 -&gt; case lookup env var2 of
567 Nothing -&gt; fail
568 Just val2 -&gt; val1 + val2
569 where
570 fail = var1 + var2
571 </programlisting>
572
573 <para>
574 This is a bit shorter, but hardly better. Of course, we can rewrite any set
575 of pattern-matching, guarded equations as case expressions; that is
576 precisely what the compiler does when compiling equations! The reason that
577 Haskell provides guarded equations is that they allow us to write down
578 the cases we want to consider, one at a time, independently of each other.
579 This structure is hidden in the case version. Two of the right-hand sides
580 are really the same (<function>fail</function>), and the whole expression
581 tends to become more and more indented.
582 </para>
583
584 <para>
585 Here is how I would write clunky:
586 </para>
587
588 <programlisting>
589 clunky env var1 var2
590 | Just val1 &lt;- lookup env var1
591 , Just val2 &lt;- lookup env var2
592 = val1 + val2
593 ...other equations for clunky...
594 </programlisting>
595
596 <para>
597 The semantics should be clear enough. The qualifiers are matched in order.
598 For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
599 right hand side is evaluated and matched against the pattern on the left.
600 If the match fails then the whole guard fails and the next equation is
601 tried. If it succeeds, then the appropriate binding takes place, and the
602 next qualifier is matched, in the augmented environment. Unlike list
603 comprehensions, however, the type of the expression to the right of the
604 <literal>&lt;-</literal> is the same as the type of the pattern to its
605 left. The bindings introduced by pattern guards scope over all the
606 remaining guard qualifiers, and over the right hand side of the equation.
607 </para>
608
609 <para>
610 Just as with list comprehensions, boolean expressions can be freely mixed
611 among the pattern guards. For example:
612 </para>
613
614 <programlisting>
615 f x | [y] &lt;- x
616 , y > 3
617 , Just z &lt;- h y
618 = ...
619 </programlisting>
620
621 <para>
622 Haskell's current guards therefore emerge as a special case, in which the
623 qualifier list has just one element, a boolean expression.
624 </para>
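<para>
To make the discussion concrete, here is a complete version of
<function>clunky</function> that can be loaded as it stands. It is a sketch
only: the finite map is assumed to be a plain association list, and
<literal>Env</literal> and <literal>lookupEnv</literal> are hypothetical
names standing in for the abstract type and its lookup operation.
</para>

<programlisting>
-- Stand-in for the abstract finite map: an association list.
type Env = [(Int, Int)]

lookupEnv :: Env -&gt; Int -&gt; Maybe Int
lookupEnv env key = lookup key env

clunky :: Env -&gt; Int -&gt; Int -&gt; Int
clunky env var1 var2
  | Just val1 &lt;- lookupEnv env var1
  , Just val2 &lt;- lookupEnv env var2
  = val1 + val2
-- Reached when either pattern guard fails.
clunky _ var1 var2 = var1 + var2
</programlisting>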
625 </sect2>
626
627 <!-- ===================== View patterns =================== -->
628
629 <sect2 id="view-patterns">
630 <title>View patterns
631 </title>
632
633 <para>
634 View patterns are enabled by the flag <literal>-XViewPatterns</literal>.
635 More information and examples of view patterns can be found on the
636 <ulink url="http://hackage.haskell.org/trac/ghc/wiki/ViewPatterns">Wiki
637 page</ulink>.
638 </para>
639
640 <para>
641 View patterns are somewhat like pattern guards that can be nested inside
642 of other patterns. They are a convenient way of pattern-matching
643 against values of abstract types. For example, in a programming language
644 implementation, we might represent the syntax of the types of the
645 language as follows:
646
647 <programlisting>
648 type Typ
649
650 data TypView = Unit
651 | Arrow Typ Typ
652
653 view :: Typ -> TypView
654
655 -- additional operations for constructing Typ's ...
656 </programlisting>
657
658 The representation of Typ is held abstract, permitting implementations
659 to use a fancy representation (e.g., hash-consing to manage sharing).
660
661 Without view patterns, using this signature is a little inconvenient:
662 <programlisting>
663 size :: Typ -> Integer
664 size t = case view t of
665 Unit -> 1
666 Arrow t1 t2 -> size t1 + size t2
667 </programlisting>
668
669 It is necessary to iterate the case, rather than using an equational
670 function definition. And the situation is even worse when the matching
671 against <literal>t</literal> is buried deep inside another pattern.
672 </para>
673
674 <para>
675 View patterns permit calling the view function inside the pattern and
676 matching against the result:
677 <programlisting>
678 size (view -> Unit) = 1
679 size (view -> Arrow t1 t2) = size t1 + size t2
680 </programlisting>
681
682 That is, we add a new form of pattern, written
683 <replaceable>expression</replaceable> <literal>-></literal>
684 <replaceable>pattern</replaceable> that means "apply the expression to
685 whatever we're trying to match against, and then match the result of
686 that application against the pattern". The expression can be any Haskell
687 expression of function type, and view patterns can be used wherever
688 patterns are used.
689 </para>
690
691 <para>
692 The semantics of a pattern <literal>(</literal>
693 <replaceable>exp</replaceable> <literal>-></literal>
694 <replaceable>pat</replaceable> <literal>)</literal> are as follows:
695
696 <itemizedlist>
697
698 <listitem> Scoping:
699
700 <para>The variables bound by the view pattern are the variables bound by
701 <replaceable>pat</replaceable>.
702 </para>
703
704 <para>
705 Any variables in <replaceable>exp</replaceable> are bound occurrences,
706 but variables bound "to the left" in a pattern are in scope. This
707 feature permits, for example, one argument to a function to be used in
708 the view of another argument. For example, the function
709 <literal>clunky</literal> from <xref linkend="pattern-guards" /> can be
710 written using view patterns as follows:
711
712 <programlisting>
713 clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2
714 ...other equations for clunky...
715 </programlisting>
716 </para>
717
718 <para>
719 More precisely, the scoping rules are:
720 <itemizedlist>
721 <listitem>
722 <para>
723 In a single pattern, variables bound by patterns to the left of a view
724 pattern expression are in scope. For example:
725 <programlisting>
726 example :: Maybe ((String -> Integer,Integer), String) -> Bool
727 example (Just ((f,_), f -> 4)) = True
728 </programlisting>
729
730 Additionally, in function definitions, variables bound by matching earlier curried
731 arguments may be used in view pattern expressions in later arguments:
732 <programlisting>
733 example :: (String -> Integer) -> String -> Bool
734 example f (f -> 4) = True
735 </programlisting>
736 That is, the scoping is the same as it would be if the curried arguments
737 were collected into a tuple.
738 </para>
739 </listitem>
740
741 <listitem>
742 <para>
743 In mutually recursive bindings, such as <literal>let</literal>,
744 <literal>where</literal>, or the top level, view patterns in one
745 declaration may not mention variables bound by other declarations. That
746 is, each declaration must be self-contained. For example, the following
747 program is not allowed:
748 <programlisting>
749 let {(x -> y) = e1 ;
750 (y -> x) = e2 } in x
751 </programlisting>
752
753 (For some amplification on this design choice see
754 <ulink url="http://hackage.haskell.org/trac/ghc/ticket/4061">Trac #4061</ulink>.)
755
756 </para>
757 </listitem>
758 </itemizedlist>
759
760 </para>
761 </listitem>
762
763 <listitem><para> Typing: If <replaceable>exp</replaceable> has type
764 <replaceable>T1</replaceable> <literal>-></literal>
765 <replaceable>T2</replaceable> and <replaceable>pat</replaceable> matches
766 a <replaceable>T2</replaceable>, then the whole view pattern matches a
767 <replaceable>T1</replaceable>.
768 </para></listitem>
769
770 <listitem><para> Matching: To the equations in Section 3.17.3 of the
771 <ulink url="http://www.haskell.org/onlinereport/">Haskell 98
772 Report</ulink>, add the following:
773 <programlisting>
774 case v of { (e -> p) -> e1 ; _ -> e2 }
775 =
776 case (e v) of { p -> e1 ; _ -> e2 }
777 </programlisting>
778 That is, to match a variable <replaceable>v</replaceable> against a pattern
779 <literal>(</literal> <replaceable>exp</replaceable>
780 <literal>-></literal> <replaceable>pat</replaceable>
781 <literal>)</literal>, evaluate <literal>(</literal>
782 <replaceable>exp</replaceable> <replaceable> v</replaceable>
783 <literal>)</literal> and match the result against
784 <replaceable>pat</replaceable>.
785 </para></listitem>
786
787 <listitem><para> Efficiency: When the same view function is applied in
788 multiple branches of a function definition or a case expression (e.g.,
789 in <literal>size</literal> above), GHC makes an attempt to collect these
790 applications into a single nested case expression, so that the view
791 function is only applied once. Pattern compilation in GHC follows the
792 matrix algorithm described in Chapter 4 of <ulink
793 url="http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/">The
794 Implementation of Functional Programming Languages</ulink>. When the
795 top rows of the first column of a matrix are all view patterns with the
796 "same" expression, these patterns are transformed into a single nested
797 case. This includes, for example, adjacent view patterns that line up
798 in a tuple, as in
799 <programlisting>
800 f ((view -> A, p1), p2) = e1
801 f ((view -> B, p3), p4) = e2
802 </programlisting>
803 </para>
804
805 <para> The current notion of when two view pattern expressions are "the
806 same" is very restricted: it is not even full syntactic equality.
807 However, it does include variables, literals, applications, and tuples;
808 e.g., two instances of <literal>view ("hi", "there")</literal> will be
809 collected. However, the current implementation does not compare up to
810 alpha-equivalence, so two instances of <literal>(x, view x ->
811 y)</literal> will not be coalesced.
812 </para>
813
814 </listitem>
815
816 </itemizedlist>
817 </para>
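<para>
The pieces of this section can be assembled into one small module. The
sketch below assumes a trivial concrete representation for
<literal>Typ</literal> (a newtype around its own view), which a real
implementation would keep hidden; the constructor <literal>MkTyp</literal>
and the helpers <literal>unit</literal> and <literal>arrow</literal> are
hypothetical.
</para>

<programlisting>
{-# LANGUAGE ViewPatterns #-}

-- Hypothetical concrete representation; clients would only see 'view'.
newtype Typ = MkTyp TypView

data TypView = Unit
             | Arrow Typ Typ

view :: Typ -&gt; TypView
view (MkTyp v) = v

unit :: Typ
unit = MkTyp Unit

arrow :: Typ -&gt; Typ -&gt; Typ
arrow t1 t2 = MkTyp (Arrow t1 t2)

-- The view function is applied inside the pattern.
size :: Typ -&gt; Integer
size (view -&gt; Unit) = 1
size (view -&gt; Arrow t1 t2) = size t1 + size t2

-- A view pattern may mention an earlier curried argument (here f).
example :: (String -&gt; Integer) -&gt; String -&gt; Bool
example f (f -&gt; 4) = True
example _ _ = False
</programlisting>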
818
819 </sect2>
820
821 <!-- ===================== n+k patterns =================== -->
822
823 <sect2 id="n-k-patterns">
824 <title>n+k patterns</title>
825 <indexterm><primary><option>-XNoNPlusKPatterns</option></primary></indexterm>
826
827 <para>
828 <literal>n+k</literal> pattern support is enabled by default. To disable
829 it, you can use the <option>-XNoNPlusKPatterns</option> flag.
830 </para>
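<para>
As a reminder of what the syntax looks like, here is a minimal sketch of a
function defined with an <literal>n+k</literal> pattern. The explicit
<literal>NPlusKPatterns</literal> pragma is included on the assumption that
the code may also be compiled in a setting where the feature is not on by
default; where it already is, the pragma is harmless.
</para>

<programlisting>
{-# LANGUAGE NPlusKPatterns #-}

-- The pattern (n+1) matches any argument that is at least 1,
-- binding n to the argument minus one.
fact :: Integer -&gt; Integer
fact 0     = 1
fact (n+1) = (n+1) * fact n
</programlisting>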
831
832 </sect2>
833
834 <!-- ===================== Recursive do-notation =================== -->
835
836 <sect2 id="recursive-do-notation">
837 <title>The recursive do-notation
838 </title>
839
840 <para>
841 The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>,
842 that is, the variables bound in a do-expression are visible only in the textually following
843 code block. Compare this to a let-expression, where bound variables are visible in the entire binding
844 group. It turns out that several applications can benefit from recursive bindings in
845 the do-notation. The <option>-XDoRec</option> flag provides the necessary syntactic support.
846 </para>
847 <para>
848 Here is a simple (albeit contrived) example:
849 <programlisting>
850 {-# LANGUAGE DoRec #-}
851 justOnes = do { rec { xs &lt;- Just (1:xs) }
852 ; return (map negate xs) }
853 </programlisting>
854 As you might expect, <literal>justOnes</literal> evaluates to <literal>Just [-1,-1,-1,...</literal>.
855 </para>
856 <para>
857 The background and motivation for recursive do-notation is described in
858 <ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>,
859 by Levent Erkok, John Launchbury,
860 Haskell Workshop 2002, pages: 29-37. Pittsburgh, Pennsylvania.
861 The theory behind monadic value recursion is explained further in Erkok's thesis
862 <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.
863 However, note that GHC uses a different syntax than the one described in these documents.
864 </para>
865
866 <sect3>
867 <title>Details of recursive do-notation</title>
868 <para>
869 The recursive do-notation is enabled with the flag <option>-XDoRec</option> or, equivalently,
870 the LANGUAGE pragma <option>DoRec</option>. It introduces the single new keyword "<literal>rec</literal>",
871 which wraps a mutually-recursive group of monadic statements,
872 producing a single statement.
873 </para>
874 <para>Similar to a <literal>let</literal>
875 statement, the variables bound in the <literal>rec</literal> are
876 visible throughout the <literal>rec</literal> group, and below it.
877 For example, compare
878 <programlisting>
879 do { a &lt;- getChar               do { a &lt;- getChar
880    ; let { r1 = f a r2              ; rec { r1 &lt;- f a r2
881          ; r2 = g r1 }                    ; r2 &lt;- g r1 }
882    ; return (r1 ++ r2) }            ; return (r1 ++ r2) }
883 </programlisting>
884 In both cases, <literal>r1</literal> and <literal>r2</literal> are
885 available both throughout the <literal>let</literal> or <literal>rec</literal> block, and
886 in the statements that follow it. The difference is that <literal>let</literal> is non-monadic,
887 while <literal>rec</literal> is monadic. (In Haskell <literal>let</literal> is
888 really <literal>letrec</literal>, of course.)
889 </para>
890 <para>
891 The static and dynamic semantics of <literal>rec</literal> can be described as follows:
892 <itemizedlist>
893 <listitem><para>
894 First,
895 similar to let-bindings, the <literal>rec</literal> is broken into
896 minimal recursive groups, a process known as <emphasis>segmentation</emphasis>.
897 For example:
898 <programlisting>
899 rec { a &lt;- getChar      ===>     a &lt;- getChar
900     ; b &lt;- f a c                 rec { b &lt;- f a c
901     ; c &lt;- f b a                     ; c &lt;- f b a }
902     ; putChar c }                putChar c
903 </programlisting>
904 The details of segmentation are described in Section 3.2 of
905 <ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>.
906 Segmentation improves polymorphism, reduces the size of the recursive "knot", and, as the paper
907 describes, also has a semantic effect (unless the monad satisfies the right-shrinking law).
908 </para></listitem>
909 <listitem><para>
910 Then each resulting <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal>.
911 For example, the <literal>rec</literal> group in the preceding example is desugared like this:
912 <programlisting>
913 rec { b &lt;- f a c     ===>    (b,c) &lt;- mfix (\~(b,c) -> do { b &lt;- f a c
914     ; c &lt;- f b a }                                        ; c &lt;- f b a
915                                                           ; return (b,c) })
916 </programlisting>
917 In general, the statement <literal>rec <replaceable>ss</replaceable></literal>
918 is desugared to the statement
919 <programlisting>
920 <replaceable>vs</replaceable> &lt;- mfix (\~<replaceable>vs</replaceable> -&gt; do { <replaceable>ss</replaceable>; return <replaceable>vs</replaceable> })
921 </programlisting>
922 where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>.
923 </para><para>
924 The original <literal>rec</literal> typechecks exactly
925 when the above desugared version would do so. For example, this means that
926 the variables <replaceable>vs</replaceable> are all monomorphic in the statements
927 following the <literal>rec</literal>, because they are bound by a lambda.
928 </para>
929 <para>
930 The <literal>mfix</literal> function is defined in the <literal>MonadFix</literal>
931 class, in <literal>Control.Monad.Fix</literal>, thus:
932 <programlisting>
933 class Monad m => MonadFix m where
934 mfix :: (a -> m a) -> m a
935 </programlisting>
936 </para>
937 </listitem>
938 </itemizedlist>
939 </para>
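<para>
The desugaring described above can be tried by hand, without enabling any
extension, by writing the <literal>mfix</literal> form directly. The sketch below
does this in the <literal>Maybe</literal> monad. The names <literal>ones</literal>
and <literal>alternating</literal> are hypothetical and exist only for this
illustration.
</para>
<programlisting>
import Control.Monad.Fix (mfix)

-- Hand-desugared form of:  do { rec { xs &lt;- Just (1:xs) }; return (take 3 xs) }
ones :: Maybe [Int]
ones = do
  xs &lt;- mfix (\xs' -> Just (1 : xs'))
  return (take 3 xs)

-- Hand-desugared form of:
--   do { rec { ps &lt;- Just (1:qs); qs &lt;- Just (2:ps) }
--      ; return (take 3 ps, take 3 qs) }
-- The lazy pattern delays matching on the tuple until its components are needed.
alternating :: Maybe ([Int], [Int])
alternating = do
  (ps, qs) &lt;- mfix (\ ~(_, qs0) -> do { ps1 &lt;- Just (1 : qs0)
                                      ; qs1 &lt;- Just (2 : ps1)
                                      ; return (ps1, qs1) })
  return (take 3 ps, take 3 qs)
</programlisting>
<para>
Here <literal>ones</literal> is <literal>Just [1,1,1]</literal> and
<literal>alternating</literal> is <literal>Just ([1,2,1],[2,1,2])</literal>.
</para>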
940 <para>
941 Here are some other important points in using the recursive-do notation:
942 <itemizedlist>
943 <listitem><para>
944 It is enabled with the flag <literal>-XDoRec</literal>, which is in turn implied by
945 <literal>-fglasgow-exts</literal>.
946 </para></listitem>
947
948 <listitem><para>
949 If recursive bindings are required for a monad,
950 then that monad must be declared an instance of the <literal>MonadFix</literal> class.
951 </para></listitem>
952
953 <listitem><para>
954 The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
955 Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class
956 for Haskell's internal state monad (strict and lazy, respectively).
957 </para></listitem>
958
959 <listitem><para>
960 As with <literal>let</literal> and <literal>where</literal> bindings,
961 name shadowing is not allowed within a <literal>rec</literal>;
962 that is, all the names bound in a single <literal>rec</literal> must
963 be distinct (Section 3.3 of the paper).
964 </para></listitem>
965 <listitem><para>
966 It supports rebindable syntax (see <xref linkend="rebindable-syntax"/>).
967 </para></listitem>
968 </itemizedlist>
969 </para>
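<para>
To illustrate the requirement that a monad be an instance of
<literal>MonadFix</literal>, here is a sketch for a hypothetical identity-like
monad <literal>Id</literal>, defined only for this example. The
<literal>Functor</literal> and <literal>Applicative</literal> instances are
included on the assumption that your version of the base library requires them
as superclasses of <literal>Monad</literal>.
</para>
<programlisting>
import Control.Monad.Fix (MonadFix (..))
import Data.Function (fix)

-- Hypothetical identity-like monad, defined only for this illustration.
newtype Id a = Id { runId :: a }

instance Functor Id where
  fmap f (Id x) = Id (f x)

instance Applicative Id where
  pure = Id
  Id f &lt;*> Id x = Id (f x)

instance Monad Id where
  Id x >>= k = k x

-- mfix ties the knot with the ordinary fixed-point combinator.
instance MonadFix Id where
  mfix f = Id (fix (runId . f))

-- Hand-desugared form of:  do { rec { xs &lt;- Id (1:xs) }; return (take 3 xs) }
firstOnes :: [Int]
firstOnes = runId (mfix (\xs -> Id (1 : xs)) >>= \xs -> return (take 3 xs))
</programlisting>
<para>
With these instances in scope, <literal>firstOnes</literal> is
<literal>[1,1,1]</literal>, and <literal>rec</literal> blocks can be used in the
<literal>Id</literal> monad.
</para>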
970 </sect3>
971
972 <sect3 id="mdo-notation"> <title> Mdo-notation (deprecated) </title>
973
974 <para> GHC used to support the flag <option>-XRecursiveDo</option>,
975 which enabled the keyword <literal>mdo</literal>, precisely as described in
976 <ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>,
977 but this is now deprecated. Instead of <literal>mdo { Q; e }</literal>, write
978 <literal>do { rec Q; e }</literal>.
979 </para>
980 <para>
981 Historical note: The old implementation of the mdo-notation (and most
982 of the existing documents) used the name
983 <literal>MonadRec</literal> for the class and the corresponding library.
984 This name is not supported by GHC.
985 </para>
986 </sect3>
987
988 </sect2>
989
990
991 <!-- ===================== PARALLEL LIST COMPREHENSIONS =================== -->
992
993 <sect2 id="parallel-list-comprehensions">
994 <title>Parallel List Comprehensions</title>
995 <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
996 </indexterm>
997 <indexterm><primary>parallel list comprehensions</primary>
998 </indexterm>
999
1000 <para>Parallel list comprehensions are a natural extension to list
1001 comprehensions. List comprehensions can be thought of as a nice
1002 syntax for writing maps and filters. Parallel comprehensions
1003 extend this to include the zipWith family.</para>
1004
1005 <para>A parallel list comprehension has multiple independent
1006 branches of qualifier lists, each separated by a `|' symbol. For
1007 example, the following zips together two lists:</para>
1008
1009 <programlisting>
1010 [ (x, y) | x &lt;- xs | y &lt;- ys ]
1011 </programlisting>
1012
1013 <para>The behavior of parallel list comprehensions follows that of
1014 zip, in that the resulting list will have the same length as the
1015 shortest branch.</para>
1016
1017 <para>We can define parallel list comprehensions by translation to
1018 regular comprehensions. Here's the basic idea:</para>
1019
1020 <para>Given a parallel comprehension of the form: </para>
1021
1022 <programlisting>
1023 [ e | p1 &lt;- e11, p2 &lt;- e12, ...
1024 | q1 &lt;- e21, q2 &lt;- e22, ...
1025 ...
1026 ]
1027 </programlisting>
1028
1029 <para>This will be translated to: </para>
1030
1031 <programlisting>
1032 [ e | ((p1,p2), (q1,q2), ...) &lt;- zipN [(p1,p2) | p1 &lt;- e11, p2 &lt;- e12, ...]
1033 [(q1,q2) | q1 &lt;- e21, q2 &lt;- e22, ...]
1034 ...
1035 ]
1036 </programlisting>
1037
1038 <para>where `zipN' is the appropriate zip for the given number of
1039 branches.</para>
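<para>
The translation can be checked on a small case. The sketch below assumes the
extension is enabled with the <literal>ParallelListComp</literal> LANGUAGE
pragma; the names <literal>pairsPar</literal>, <literal>pairsZip</literal> and
<literal>mixed</literal> are hypothetical.
</para>
<programlisting>
{-# LANGUAGE ParallelListComp #-}

-- Two branches, zipped together; the result is as long as the shorter branch.
pairsPar :: [(Int, Char)]
pairsPar = [ (x, y) | x &lt;- [1, 2, 3] | y &lt;- "ab" ]

-- The same list written with an explicit zip.
pairsZip :: [(Int, Char)]
pairsZip = [ (x, y) | (x, y) &lt;- zip [1, 2, 3] "ab" ]

-- The first branch has two qualifiers and yields four (x, y) pairs,
-- which are then zipped against the three characters of the second branch.
mixed :: [(Int, Char)]
mixed = [ (x + y, z) | x &lt;- [1, 2], y &lt;- [10, 20] | z &lt;- "abc" ]
</programlisting>
<para>
Both <literal>pairsPar</literal> and <literal>pairsZip</literal> are
<literal>[(1,'a'),(2,'b')]</literal>, and <literal>mixed</literal> is
<literal>[(11,'a'),(21,'b'),(12,'c')]</literal>.
</para>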
1040
1041 </sect2>
1042
1043 <!-- ===================== TRANSFORM LIST COMPREHENSIONS =================== -->
1044
1045 <sect2 id="generalised-list-comprehensions">
1046 <title>Generalised (SQL-Like) List Comprehensions</title>
1047 <indexterm><primary>list comprehensions</primary><secondary>generalised</secondary>
1048 </indexterm>
1049 <indexterm><primary>extended list comprehensions</primary>
1050 </indexterm>
1051 <indexterm><primary>group</primary></indexterm>
1052 <indexterm><primary>sql</primary></indexterm>
1053
1054
1055 <para>Generalised list comprehensions are a further enhancement to the
1056 list comprehension syntactic sugar to allow operations such as sorting
1057 and grouping which are familiar from SQL. They are fully described in the
1058 paper <ulink url="http://research.microsoft.com/~simonpj/papers/list-comp">
1059 Comprehensive comprehensions: comprehensions with "order by" and "group by"</ulink>,
1060 except that the syntax we use differs slightly from the paper.</para>
1061 <para>The extension is enabled with the flag <option>-XTransformListComp</option>.</para>
1062 <para>Here is an example:
1063 <programlisting>
1064 employees = [ ("Simon", "MS", 80)
1065 , ("Erik", "MS", 100)
1066 , ("Phil", "Ed", 40)
1067 , ("Gordon", "Ed", 45)
1068 , ("Paul", "Yale", 60)]
1069
1070 output = [ (the dept, sum salary)
1071 | (name, dept, salary) &lt;- employees
1072 , then group by dept
1073 , then sortWith by (sum salary)
1074 , then take 5 ]
1075 </programlisting>
1076 In this example, the list <literal>output</literal> would take on
1077 the value:
1078
1079 <programlisting>
1080 [("Yale", 60), ("Ed", 85), ("MS", 180)]
1081 </programlisting>
1082 </para>
1083 <para>There are three new keywords: <literal>group</literal>, <literal>by</literal>, and <literal>using</literal>.
1084 (The function <literal>sortWith</literal> is not a keyword; it is an ordinary
1085 function that is exported by <literal>GHC.Exts</literal>.)</para>
1086
1087 <para>There are five new forms of comprehension qualifier,
1088 all introduced by the (existing) keyword <literal>then</literal>:
1089 <itemizedlist>
1090 <listitem>
1091
1092 <programlisting>
1093 then f
1094 </programlisting>
1095
1096 <para>This statement requires that <literal>f</literal> have the type <literal>
1097 forall a. [a] -> [a]</literal>. The opening example uses this form
1098 to apply <literal>take 5</literal>.</para>
1099
1100 </listitem>
1101
1102
1103 <listitem>
1104 <para>
1105 <programlisting>
1106 then f by e
1107 </programlisting>
1108
1109 This form is similar to the previous one, but allows you to create a function
1110 which will be passed as the first argument to f. As a consequence f must have
1111 the type <literal>forall a. (a -> t) -> [a] -> [a]</literal>. As you can see
1112 from the type, this function lets f &quot;project out&quot; some information
1113 from the elements of the list it is transforming.</para>
1114
1115 <para>An example is shown in the opening example, where <literal>sortWith</literal>
1116 is supplied with a function that lets it find out the <literal>sum salary</literal>
1117 for any item in the list comprehension it transforms.</para>
1118
1119 </listitem>
1120
1121
1122 <listitem>
1123
1124 <programlisting>
1125 then group by e using f
1126 </programlisting>
1127
1128 <para>This is the most general of the grouping-type statements. In this form,
1129 f is required to have type <literal>forall a. (a -> t) -> [a] -> [[a]]</literal>.
1130 As with the <literal>then f by e</literal> case above, the first argument
1131 is a function supplied to f by the compiler which lets it compute e on every
1132 element of the list being transformed. However, unlike the non-grouping case,
1133 f additionally partitions the list into a number of sublists: this means that
1134 at every point after this statement, binders occurring before it in the comprehension
1135 refer to <emphasis>lists</emphasis> of possible values, not single values. To help understand
1136 this, let's look at an example:</para>
1137
1138 <programlisting>
1139 -- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first
1140 groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
1141 groupRuns f = groupBy (\x y -> f x == f y)
1142
1143 output = [ (the x, y)
1144 | x &lt;- ([1..3] ++ [1..2])
1145 , y &lt;- [4..6]
1146 , then group by x using groupRuns ]
1147 </programlisting>
1148
1149 <para>This results in the variable <literal>output</literal> taking on the value below:</para>
1150
1151 <programlisting>
1152 [(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])]
1153 </programlisting>
1154
1155 <para>Note that we have used the <literal>the</literal> function to change the type
1156 of x from a list to its original numeric type. The variable y, in contrast, is left
1157 unchanged from the list form introduced by the grouping.</para>
1158
1159 </listitem>
1160
1161 <listitem>
1162
1163 <programlisting>
1164 then group by e
1165 </programlisting>
1166
1167 <para>This form of grouping is essentially the same as the one described above. However,
1168 since no function to use for the grouping has been supplied it will fall back on the
1169 <literal>groupWith</literal> function defined in
1170 <ulink url="&libraryBaseLocation;/GHC-Exts.html"><literal>GHC.Exts</literal></ulink>. This
1171 is the form of the group statement that we made use of in the opening example.</para>
1172
1173 </listitem>
1174
1175
1176 <listitem>
1177
1178 <programlisting>
1179 then group using f
1180 </programlisting>
1181
1182 <para>With this form of the group statement, f is required to simply have the type
1183 <literal>forall a. [a] -> [[a]]</literal>, which will be used to group up the
1184 comprehension so far directly. An example of this form is as follows:</para>
1185
1186 <programlisting>
1187 output = [ x
1188 | y &lt;- [1..5]
1189 , x &lt;- "hello"
1190 , then group using inits]
1191 </programlisting>
1192
1193 <para>This will yield a list containing every prefix of the word "hello" written out 5 times:</para>
1194
1195 <programlisting>
1196 ["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...]
1197 </programlisting>
1198
1199 </listitem>
1200 </itemizedlist>
1201 </para>
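<para>
For reference, here is a self-contained sketch of the opening example with its
imports spelled out. It deliberately uses the explicit
<literal>then group by e using f</literal> form with <literal>groupWith</literal>,
rather than the shorter <literal>then group by e</literal> form, on the assumption
that the explicit form is the more widely accepted of the two.
</para>
<programlisting>
{-# LANGUAGE TransformListComp #-}
import GHC.Exts (groupWith, sortWith, the)

employees :: [(String, String, Int)]
employees = [ ("Simon",  "MS",   80)
            , ("Erik",   "MS",   100)
            , ("Phil",   "Ed",   40)
            , ("Gordon", "Ed",   45)
            , ("Paul",   "Yale", 60) ]

-- After the group statement, dept and salary are lists, one entry per employee
-- in the group; 'the' collapses the (all equal) department names to one value.
output :: [(String, Int)]
output = [ (the dept, sum salary)
         | (_, dept, salary) &lt;- employees
         , then group by dept using groupWith
         , then sortWith by (sum salary)
         , then take 5 ]
</programlisting>
<para>
As in the opening example, <literal>output</literal> is
<literal>[("Yale",60),("Ed",85),("MS",180)]</literal>.
</para>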
1202 </sect2>
1203
1204 <!-- ===================== REBINDABLE SYNTAX =================== -->
1205
1206 <sect2 id="rebindable-syntax">
1207 <title>Rebindable syntax and the implicit Prelude import</title>
1208
1209 <para><indexterm><primary>-XNoImplicitPrelude
1210 option</primary></indexterm> GHC normally imports
1211 <filename>Prelude.hi</filename> files for you. If you'd
1212 rather it didn't, then give it a
1213 <option>-XNoImplicitPrelude</option> option. The idea is
1214 that you can then import a Prelude of your own. (But don't
1215 call it <literal>Prelude</literal>; the Haskell module
1216 namespace is flat, and you must not conflict with any
1217 Prelude module.)</para>
1218
1219 <para>Suppose you are importing a Prelude of your own
1220 in order to define your own numeric class
1221 hierarchy. It completely defeats that purpose if the
1222 literal "1" means "<literal>Prelude.fromInteger
1223 1</literal>", which is what the Haskell Report specifies.
1224 So the <option>-XRebindableSyntax</option>
1225 flag causes
1226 the following pieces of built-in syntax to refer to
1227 <emphasis>whatever is in scope</emphasis>, not the Prelude
1228 versions:
1229 <itemizedlist>
1230 <listitem>
1231 <para>An integer literal <literal>368</literal> means
1232 "<literal>fromInteger (368::Integer)</literal>", rather than
1233 "<literal>Prelude.fromInteger (368::Integer)</literal>".
1234 </para> </listitem>
1235
1236 <listitem><para>Fractional literals are handled in just the same way,
1237 except that the translation is
1238 <literal>fromRational (3.68::Rational)</literal>.
1239 </para> </listitem>
1240
1241 <listitem><para>The equality test in an overloaded numeric pattern
1242 uses whatever <literal>(==)</literal> is in scope.
1243 </para> </listitem>
1244
1245 <listitem><para>The subtraction operation, and the
1246 greater-than-or-equal test, in <literal>n+k</literal> patterns
1247 use whatever <literal>(-)</literal> and <literal>(>=)</literal> are in scope.
1248 </para></listitem>
1249
1250 <listitem>
1251 <para>Negation (e.g. "<literal>- (f x)</literal>")
1252 means "<literal>negate (f x)</literal>", both in numeric
1253 patterns, and expressions.
1254 </para></listitem>
1255
1256 <listitem>
1257 <para>A conditional (e.g. "<literal>if</literal> e1 <literal>then</literal> e2 <literal>else</literal> e3")
1258 means "<literal>ifThenElse</literal> e1 e2 e3". However, <literal>case</literal> expressions are unaffected.
1259 </para></listitem>
1260
1261 <listitem>
1262 <para>"Do" notation is translated using whatever
1263 functions <literal>(>>=)</literal>,
1264 <literal>(>>)</literal>, and <literal>fail</literal>,
1265 are in scope (not the Prelude
1266 versions). List comprehensions, mdo (<xref linkend="mdo-notation"/>), and parallel array
1267 comprehensions, are unaffected. </para></listitem>
1268
1269 <listitem>
1270 <para>Arrow
1271 notation (see <xref linkend="arrow-notation"/>)
1272 uses whatever <literal>arr</literal>,
1273 <literal>(>>>)</literal>, <literal>first</literal>,
1274 <literal>app</literal>, <literal>(|||)</literal> and
1275 <literal>loop</literal> functions are in scope. But unlike the
1276 other constructs, the types of these functions must match the
1277 Prelude types very closely. Details are in flux; if you want
1278 to use this, ask!
1279 </para></listitem>
1280 </itemizedlist>
1281 <option>-XRebindableSyntax</option> implies <option>-XNoImplicitPrelude</option>.
1282 </para>
1283 <para>
1284 In all cases (apart from arrow notation), the static semantics should be that of the desugared form,
1285 even if that is a little unexpected. For example, the
1286 static semantics of the literal <literal>368</literal>
1287 is exactly that of <literal>fromInteger (368::Integer)</literal>; it's fine for
1288 <literal>fromInteger</literal> to have any of the types:
1289 <programlisting>
1290 fromInteger :: Integer -> Integer
1291 fromInteger :: forall a. Foo a => Integer -> a
1292 fromInteger :: Num a => a -> Integer
1293 fromInteger :: Integer -> Bool -> Bool
1294 </programlisting>
1295 </para>
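<para>
As a small sketch of the point about static semantics, the module below brings
its own <literal>fromInteger</literal> into scope with a deliberately unusual
type. The definition is hypothetical and only serves to show that an integer
literal takes whatever type the in-scope <literal>fromInteger</literal> returns.
</para>
<programlisting>
{-# LANGUAGE RebindableSyntax #-}
import Prelude hiding (fromInteger)

-- Hypothetical replacement: every integer literal in this module becomes a String.
fromInteger :: Integer -> String
fromInteger n = "#" ++ show n

label :: String
label = 368          -- desugars to  fromInteger (368 :: Integer)
</programlisting>
<para>
Here <literal>label</literal> is the string <literal>"#368"</literal>. Note that
<literal>if</literal> expressions in such a module would additionally need an
<literal>ifThenElse</literal> in scope.
</para>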
1296
1297 <para>Be warned: this is an experimental facility, with
1298 fewer checks than usual. Use <literal>-dcore-lint</literal>
1299 to typecheck the desugared program. If Core Lint is happy
1300 you should be all right.</para>
1301
1302 </sect2>
1303
1304 <sect2 id="postfix-operators">
1305 <title>Postfix operators</title>
1306
1307 <para>
1308 The <option>-XPostfixOperators</option> flag enables a small
1309 extension to the syntax of left operator sections, which allows you to
1310 define postfix operators. The extension is this: the left section
1311 <programlisting>
1312 (e !)
1313 </programlisting>
1314 is equivalent (from the point of view of both type checking and execution) to the expression
1315 <programlisting>
1316 ((!) e)
1317 </programlisting>
1318 (for any expression <literal>e</literal> and operator <literal>(!)</literal>).
1319 The strict Haskell 98 interpretation is that the section is equivalent to
1320 <programlisting>
1321 (\y -> (!) e y)
1322 </programlisting>
1323 That is, the operator must be a function of two arguments. GHC allows it to
1324 take only one argument, and that in turn allows you to write the function
1325 postfix.
1326 </para>
1327 <para>The extension does not extend to the left-hand side of function
1328 definitions; you must define such a function in prefix form.</para>
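<para>
A minimal sketch: the hypothetical one-argument operator <literal>(!)</literal>
below is defined in prefix form, as required, and then used postfix through a
left section.
</para>
<programlisting>
{-# LANGUAGE PostfixOperators #-}

-- Hypothetical factorial operator; it takes a single argument.
(!) :: Integer -> Integer
(!) n = product [1 .. n]

-- The left section (5 !) is treated as ((!) 5).
fact5 :: Integer
fact5 = (5 !)
</programlisting>
<para>
With the flag enabled, <literal>fact5</literal> is <literal>120</literal>.
</para>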
1329
1330 </sect2>
1331
1332 <sect2 id="tuple-sections">
1333 <title>Tuple sections</title>
1334
1335 <para>
1336 The <option>-XTupleSections</option> flag enables Python-style partially applied
1337 tuple constructors. For example, the expression
1338 <programlisting>
1339 (, True)
1340 </programlisting>
1341 is an alternative notation for the more unwieldy
1342 <programlisting>
1343 \x -> (x, True)
1344 </programlisting>
1345 You can omit any combination of arguments to the tuple, as in the following
1346 <programlisting>
1347 (, "I", , , "Love", , 1337)
1348 </programlisting>
1349 which translates to
1350 <programlisting>
1351 \a b c d -> (a, "I", b, c, "Love", d, 1337)
1352 </programlisting>
1353 </para>
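<para>
The following sketch shows tuple sections used as ordinary functions. The names
<literal>tagged</literal> and <literal>fill</literal> are hypothetical.
</para>
<programlisting>
{-# LANGUAGE TupleSections #-}

-- (, True) stands for \x -> (x, True)
tagged :: [(Int, Bool)]
tagged = map (, True) [1, 2, 3]

-- Two omitted components give a two-argument function:
-- (, "mid", ) stands for \a b -> (a, "mid", b)
fill :: Char -> Char -> (Char, String, Char)
fill = (, "mid", )
</programlisting>
<para>
Here <literal>tagged</literal> is <literal>[(1,True),(2,True),(3,True)]</literal>
and <literal>fill 'a' 'b'</literal> is <literal>('a',"mid",'b')</literal>.
</para>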
1354
1355 <para>
1356 If you have <link linkend="unboxed-tuples">unboxed tuples</link> enabled, tuple sections
1357 will also be available for them, like so
1358 <programlisting>
1359 (# , True #)
1360 </programlisting>
1361 Because there is no unboxed unit tuple, the following expression
1362 <programlisting>
1363 (# #)
1364 </programlisting>
1365 continues to stand for the unboxed singleton tuple data constructor.
1366 </para>
1367
1368 </sect2>
1369
1370 <sect2 id="disambiguate-fields">
1371 <title>Record field disambiguation</title>
1372 <para>
1373 In record construction and record pattern matching
1374 it is entirely unambiguous which field is referred to, even if there are two different
1375 data types in scope with a common field name. For example:
1376 <programlisting>
1377 module M where
1378 data S = MkS { x :: Int, y :: Bool }
1379
1380 module Foo where
1381 import M
1382
1383 data T = MkT { x :: Int }
1384
1385 ok1 (MkS { x = n }) = n+1 -- Unambiguous
1386 ok2 n = MkT { x = n+1 } -- Unambiguous
1387
1388 bad1 k = k { x = 3 } -- Ambiguous
1389 bad2 k = x k -- Ambiguous
1390 </programlisting>
1391 Even though there are two <literal>x</literal>'s in scope,
1392 it is clear that the <literal>x</literal> in the pattern in the
1393 definition of <literal>ok1</literal> can only mean the field
1394 <literal>x</literal> from type <literal>S</literal>. Similarly for
1395 the function <literal>ok2</literal>. However, in the record update
1396 in <literal>bad1</literal> and the record selection in <literal>bad2</literal>
1397 it is not clear which of the two types is intended.
1398 </para>
1399 <para>
1400 Haskell 98 regards all four as ambiguous, but with the
1401 <option>-XDisambiguateRecordFields</option> flag, GHC will accept
1402 the former two. The rules are precisely the same as those for instance
1403 declarations in Haskell 98, where the method names on the left-hand side
1404 of the method bindings in an instance declaration refer unambiguously
1405 to the method of that class (provided they are in scope at all), even
1406 if there are other variables in scope with the same name.
1407 This reduces the clutter of qualified names when you import two
1408 records from different modules that use the same field name.
1409 </para>
1410 <para>
1411 Some details:
1412 <itemizedlist>
1413 <listitem><para>
1414 Field disambiguation can be combined with punning (see <xref linkend="record-puns"/>). For example:
1415 <programlisting>
1416 module Foo where
1417 import M
1418 x=True
1419 ok3 (MkS { x }) = x+1 -- Uses both disambiguation and punning
1420 </programlisting>
1421 </para></listitem>
1422
1423 <listitem><para>
1424 With <option>-XDisambiguateRecordFields</option> you can use <emphasis>unqualified</emphasis>
1425 field names even if the corresponding selector is only in scope <emphasis>qualified</emphasis>.
1426 For example, assuming the same module <literal>M</literal> as in our earlier example, this is legal:
1427 <programlisting>
1428 module Foo where
1429 import qualified M -- Note qualified
1430
1431 ok4 (M.MkS { x = n }) = n+1 -- Unambiguous
1432 </programlisting>
1433 Since the constructor <literal>MkS</literal> is only in scope qualified, you must
1434 name it <literal>M.MkS</literal>, but the field <literal>x</literal> does not need
1435 to be qualified even though <literal>M.x</literal> is in scope but <literal>x</literal>
1436 is not. (In effect, it is qualified by the constructor.)
1437 </para></listitem>
1438 </itemizedlist>
1439 </para>
1440
1441 </sect2>
1442
1443 <!-- ===================== Record puns =================== -->
1444
1445 <sect2 id="record-puns">
1446 <title>Record puns
1447 </title>
1448
1449 <para>
1450 Record puns are enabled by the flag <literal>-XNamedFieldPuns</literal>.
1451 </para>
1452
1453 <para>
1454 When using records, it is common to write a pattern that binds a
1455 variable with the same name as a record field, such as:
1456
1457 <programlisting>
1458 data C = C {a :: Int}
1459 f (C {a = a}) = a
1460 </programlisting>
1461 </para>
1462
1463 <para>
1464 Record punning permits the variable name to be elided, so one can simply
1465 write
1466
1467 <programlisting>
1468 f (C {a}) = a
1469 </programlisting>
1470
1471 to mean the same pattern as above. That is, in a record pattern, the
1472 pattern <literal>a</literal> expands into the pattern <literal>a =
1473 a</literal> for the same name <literal>a</literal>.
1474 </para>
1475
1476 <para>
1477 Note that:
1478 <itemizedlist>
1479 <listitem><para>
1480 Record punning can also be used in an expression, writing, for example,
1481 <programlisting>
1482 let a = 1 in C {a}
1483 </programlisting>
1484 instead of
1485 <programlisting>
1486 let a = 1 in C {a = a}
1487 </programlisting>
1488 The expansion is purely syntactic, so the expanded right-hand side
1489 expression refers to the nearest enclosing variable that is spelled the
1490 same as the field name.
1491 </para></listitem>
1492
1493 <listitem><para>
1494 Puns and other patterns can be mixed in the same record:
1495 <programlisting>
1496 data C = C {a :: Int, b :: Int}
1497 f (C {a, b = 4}) = a
1498 </programlisting>
1499 </para></listitem>
1500
1501 <listitem><para>
1502 Puns can be used wherever record patterns occur (e.g. in
1503 <literal>let</literal> bindings or at the top-level).
1504 </para></listitem>
1505
1506 <listitem><para>
1507 A pun on a qualified field name is expanded by stripping off the module qualifier.
1508 For example:
1509 <programlisting>
1510 f (M.C {M.a}) = a
1511 </programlisting>
1512 means
1513 <programlisting>
1514 f (M.C {M.a = a}) = a
1515 </programlisting>
1516 (This is useful if the field selector <literal>a</literal> for constructor <literal>M.C</literal>
1517 is only in scope in qualified form.)
1518 </para></listitem>
1519 </itemizedlist>
1520 </para>
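<para>
The points above can be combined in one small sketch. The type
<literal>Point</literal> and the functions <literal>norm1</literal> and
<literal>mkDiag</literal> are hypothetical names chosen for this illustration.
</para>
<programlisting>
{-# LANGUAGE NamedFieldPuns #-}

data Point = Point { px :: Int, py :: Int } deriving (Eq, Show)

-- Pun in a pattern: {px, py} stands for {px = px, py = py}.
norm1 :: Point -> Int
norm1 (Point {px, py}) = abs px + abs py

-- Pun in an expression: each field takes the value of the nearest
-- enclosing variable with the same name (here, the let-bound ones).
mkDiag :: Int -> Point
mkDiag n = let { px = n; py = n } in Point {px, py}
</programlisting>
<para>
For instance, <literal>norm1 (Point 3 (-4))</literal> is <literal>7</literal>,
and <literal>mkDiag 2</literal> equals <literal>Point 2 2</literal>.
</para>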
1521
1522
1523 </sect2>
1524
1525 <!-- ===================== Record wildcards =================== -->
1526
1527 <sect2 id="record-wildcards">
1528 <title>Record wildcards
1529 </title>
1530
1531 <para>
1532 Record wildcards are enabled by the flag <literal>-XRecordWildCards</literal>.
1533 This flag implies <literal>-XDisambiguateRecordFields</literal>.
1534 </para>
1535
1536 <para>
1537 For records with many fields, it can be tiresome to write out each field
1538 individually in a record pattern, as in
1539 <programlisting>
1540 data C = C {a :: Int, b :: Int, c :: Int, d :: Int}
1541 f (C {a = 1, b = b, c = c, d = d}) = b + c + d
1542 </programlisting>
1543 </para>
1544
1545 <para>
1546 Record wildcard syntax permits a "<literal>..</literal>" in a record
1547 pattern, where each elided field <literal>f</literal> is replaced by the
1548 pattern <literal>f = f</literal>. For example, the above pattern can be
1549 written as
1550 <programlisting>
1551 f (C {a = 1, ..}) = b + c + d
1552 </programlisting>
1553 </para>
1554
1555 <para>
1556 More details:
1557 <itemizedlist>
1558 <listitem><para>
1559 Wildcards can be mixed with other patterns, including puns
1560 (<xref linkend="record-puns"/>); for example, in a pattern <literal>C {a
1561 = 1, b, ..}</literal>. Additionally, record wildcards can be used
1562 wherever record patterns occur, including in <literal>let</literal>
1563 bindings and at the top-level. For example, the top-level binding
1564 <programlisting>
1565 C {a = 1, ..} = e
1566 </programlisting>
1567 defines <literal>b</literal>, <literal>c</literal>, and
1568 <literal>d</literal>.
1569 </para></listitem>
1570
1571 <listitem><para>
1572 Record wildcards can also be used in expressions, writing, for example,
1573 <programlisting>
1574 let {a = 1; b = 2; c = 3; d = 4} in C {..}
1575 </programlisting>
1576 in place of
1577 <programlisting>
1578 let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}
1579 </programlisting>
1580 The expansion is purely syntactic, so the record wildcard
1581 expression refers to the nearest enclosing variables that are spelled
1582 the same as the omitted field names.
1583 </para></listitem>
1584
1585 <listitem><para>
1586 The "<literal>..</literal>" expands to the missing
1587 <emphasis>in-scope</emphasis> record fields, where "in scope"
1588 includes both unqualified and qualified-only.
1589 Any fields that are not in scope are not filled in. For example
1590 <programlisting>
1591 module M where
1592 data R = R { a,b,c :: Int }
1593 module X where
1594 import qualified M( R(R,a,b) )
1595 f a b = M.R { .. }
1596 </programlisting>
1597 The <literal>{..}</literal> expands to <literal>{M.a=a,M.b=b}</literal>,
1598 omitting <literal>c</literal> since it is not in scope at all.
1599 </para></listitem>
1600 </itemizedlist>
1601 </para>
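<para>
Here is a single-module sketch that uses wildcards in both patterns and
expressions. It reuses the record <literal>C</literal> from above; the function
<literal>mk</literal> is a hypothetical helper.
</para>
<programlisting>
{-# LANGUAGE RecordWildCards #-}

data C = C { a :: Int, b :: Int, c :: Int, d :: Int } deriving (Eq, Show)

-- In a pattern, ".." binds every field that is not mentioned explicitly.
f :: C -> Int
f (C {a = 1, ..}) = b + c + d
f (C {..})        = a + b + c + d

-- In an expression, ".." fills each field from the local variable of the
-- same name: a is the function argument; b, c and d are let-bound.
mk :: Int -> C
mk a = let { b = a + 1; c = a + 2; d = a + 3 } in C {..}
</programlisting>
<para>
With these definitions, <literal>mk 1</literal> equals <literal>C 1 2 3 4</literal>,
<literal>f (mk 1)</literal> is <literal>9</literal>, and
<literal>f (C 2 0 0 1)</literal> is <literal>3</literal>.
</para>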
1602
1603 </sect2>
1604
1605 <!-- ===================== Local fixity declarations =================== -->
1606
1607 <sect2 id="local-fixity-declarations">
1608 <title>Local Fixity Declarations
1609 </title>
1610
1611 <para>A careful reading of the Haskell 98 Report reveals that fixity
1612 declarations (<literal>infix</literal>, <literal>infixl</literal>, and
1613 <literal>infixr</literal>) are permitted to appear inside local bindings
1614 such as those introduced by <literal>let</literal> and
1615 <literal>where</literal>. However, the Haskell Report does not specify
1616 the semantics of such bindings very precisely.
1617 </para>
1618
1619 <para>In GHC, a fixity declaration may accompany a local binding:
1620 <programlisting>
1621 let f = ...
1622 infixr 3 `f`
1623 in
1624 ...
1625 </programlisting>
1626 and the fixity declaration applies wherever the binding is in scope.
1627 For example, in a <literal>let</literal>, it applies in the right-hand
1628 sides of other <literal>let</literal>-bindings and the body of the
1629 <literal>let</literal>. Or, in recursive <literal>do</literal>
1630 expressions (<xref linkend="recursive-do-notation"/>), the local fixity
1631 declarations of a <literal>let</literal> statement scope over other
1632 statements in the group, just as the bound name does.
1633 </para>
1634
1635 <para>
1636 Moreover, a local fixity declaration <emphasis>must</emphasis> accompany a local binding of
1637 that name: it is not possible to revise the fixity of a name bound
1638 elsewhere, as in
1639 <programlisting>
1640 let infixr 9 $ in ...
1641 </programlisting>
1642
1643 Because local fixity declarations are technically Haskell 98, no flag is
1644 necessary to enable them.
1645 </para>
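<para>
A small sketch of a local fixity declaration accompanying a local binding. The
names <literal>rightNested</literal> and <literal>pow</literal> are hypothetical.
</para>
<programlisting>
rightNested :: Int
rightNested =
  let infixr 9 `pow`            -- applies wherever the local pow is in scope
      pow :: Int -> Int -> Int
      pow x y = x ^ y
  in  2 `pow` 3 `pow` 2         -- parsed as  2 `pow` (3 `pow` 2)
</programlisting>
<para>
Here <literal>rightNested</literal> is <literal>512</literal>. Without the
declaration, <literal>pow</literal> would get the default fixity
<literal>infixl 9</literal>, and the same expression would denote
<literal>(2 `pow` 3) `pow` 2</literal>, that is, <literal>64</literal>.
</para>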
1646 </sect2>
1647
1648 <sect2 id="package-imports">
1649 <title>Package-qualified imports</title>
1650
1651 <para>With the <option>-XPackageImports</option> flag, GHC allows
1652 import declarations to be qualified by the package name that the
1653 module is intended to be imported from. For example:</para>
1654
1655 <programlisting>
1656 import "network" Network.Socket
1657 </programlisting>
1658
1659 <para>would import the module <literal>Network.Socket</literal> from
1660 the package <literal>network</literal> (any version). This may
1661 be used to disambiguate an import when the same module is
1662 available from multiple packages, or is present in both the
1663 current package being built and an external package.</para>
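<para>As a small self-contained sketch, the following module imports
<literal>Data.List</literal> explicitly from the <literal>base</literal>
package. The binding <literal>sorted</literal> is a hypothetical name used
only for illustration:</para>

<programlisting>
{-# LANGUAGE PackageImports #-}

-- Take Data.List from the "base" package, whichever version is installed.
import "base" Data.List (sort)

-- Hypothetical binding, used only to show that the import is usable.
sorted :: [Int]
sorted = sort [3, 1, 2]
</programlisting>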
1664
1665 <para>Note: you probably don't need to use this feature; it was
1666 added mainly so that we can build backwards-compatible versions of
1667 packages when APIs change. It can lead to fragile dependencies in
1668 the common case: modules occasionally move from one package to
1669 another, rendering any package-qualified imports broken.</para>
1670 </sect2>
1671
1672 <sect2 id="syntax-stolen">
1673 <title>Summary of stolen syntax</title>
1674
1675 <para>Turning on an option that enables special syntax
1676 <emphasis>might</emphasis> cause working Haskell 98 code to fail
1677 to compile, perhaps because it uses a variable name which has
1678 become a reserved word. This section lists the syntax that is
1679 "stolen" by language extensions.
1680 We use
1681 notation and nonterminal names from the Haskell 98 lexical syntax
1682 (see the Haskell 98 Report).
1683 We only list syntax changes here that might affect
1684 existing working programs (i.e. "stolen" syntax). Many of these
1685 extensions will also enable new context-free syntax, but in all
1686 cases programs written to use the new syntax would not be
1687 compilable without the option enabled.</para>
1688
1689 <para>There are two classes of special
1690 syntax:
1691
1692 <itemizedlist>
1693 <listitem>
1694 <para>New reserved words and symbols: character sequences
1695 which are no longer available for use as identifiers in the
1696 program.</para>
1697 </listitem>
1698 <listitem>
1699 <para>Other special syntax: sequences of characters that have
1700 a different meaning when this particular option is turned
1701 on.</para>
1702 </listitem>
1703 </itemizedlist>
1704
1705 The following syntax is stolen:
1706
1707 <variablelist>
1708 <varlistentry>
1709 <term>
1710 <literal>forall</literal>
1711 <indexterm><primary><literal>forall</literal></primary></indexterm>
1712 </term>
1713 <listitem><para>
1714 Stolen (in types) by: <option>-XExplicitForAll</option>, and hence by
1715 <option>-XScopedTypeVariables</option>,
1716 <option>-XLiberalTypeSynonyms</option>,
1717 <option>-XRank2Types</option>,
1718 <option>-XRankNTypes</option>,
1719 <option>-XPolymorphicComponents</option>,
1720 <option>-XExistentialQuantification</option>
1721 </para></listitem>
1722 </varlistentry>
1723
1724 <varlistentry>
1725 <term>
1726 <literal>mdo</literal>
1727 <indexterm><primary><literal>mdo</literal></primary></indexterm>
1728 </term>
1729 <listitem><para>
1730 Stolen by: <option>-XRecursiveDo</option>
1731 </para></listitem>
1732 </varlistentry>
1733
1734 <varlistentry>
1735 <term>
1736 <literal>foreign</literal>
1737 <indexterm><primary><literal>foreign</literal></primary></indexterm>
1738 </term>
1739 <listitem><para>
1740 Stolen by: <option>-XForeignFunctionInterface</option>
1741 </para></listitem>
1742 </varlistentry>
1743
1744 <varlistentry>
1745 <term>
1746 <literal>rec</literal>,
1747 <literal>proc</literal>, <literal>-&lt;</literal>,
1748 <literal>&gt;-</literal>, <literal>-&lt;&lt;</literal>,
1749 <literal>&gt;&gt;-</literal>, and <literal>(|</literal>,
1750 <literal>|)</literal> brackets
1751 <indexterm><primary><literal>proc</literal></primary></indexterm>
1752 </term>
1753 <listitem><para>
1754 Stolen by: <option>-XArrows</option>
1755 </para></listitem>
1756 </varlistentry>
1757
1758 <varlistentry>
1759 <term>
1760 <literal>?<replaceable>varid</replaceable></literal>,
1761 <literal>%<replaceable>varid</replaceable></literal>
1762 <indexterm><primary>implicit parameters</primary></indexterm>
1763 </term>
1764 <listitem><para>
1765 Stolen by: <option>-XImplicitParams</option>
1766 </para></listitem>
1767 </varlistentry>
1768
1769 <varlistentry>
1770 <term>
1771 <literal>[|</literal>,
1772 <literal>[e|</literal>, <literal>[p|</literal>,
1773 <literal>[d|</literal>, <literal>[t|</literal>,
1774 <literal>$(</literal>,
1775 <literal>$<replaceable>varid</replaceable></literal>
1776 <indexterm><primary>Template Haskell</primary></indexterm>
1777 </term>
1778 <listitem><para>
1779 Stolen by: <option>-XTemplateHaskell</option>
1780 </para></listitem>
1781 </varlistentry>
1782
1783 <varlistentry>
1784 <term>
1785 <literal>[:<replaceable>varid</replaceable>|</literal>
1786 <indexterm><primary>quasi-quotation</primary></indexterm>
1787 </term>
1788 <listitem><para>
1789 Stolen by: <option>-XQuasiQuotes</option>
1790 </para></listitem>
1791 </varlistentry>
1792
1793 <varlistentry>
1794 <term>
1795 <replaceable>varid</replaceable>{<literal>&num;</literal>},
1796 <replaceable>char</replaceable><literal>&num;</literal>,
1797 <replaceable>string</replaceable><literal>&num;</literal>,
1798 <replaceable>integer</replaceable><literal>&num;</literal>,
1799 <replaceable>float</replaceable><literal>&num;</literal>,
1800 <replaceable>float</replaceable><literal>&num;&num;</literal>,
1801 <literal>(&num;</literal>, <literal>&num;)</literal>
1802 </term>
1803 <listitem><para>
1804 Stolen by: <option>-XMagicHash</option>
1805 </para></listitem>
1806 </varlistentry>
1807 </variablelist>
1808 </para>
1809 </sect2>
1810 </sect1>
1811
1812
1813 <!-- TYPE SYSTEM EXTENSIONS -->
1814 <sect1 id="data-type-extensions">
1815 <title>Extensions to data types and type synonyms</title>
1816
1817 <sect2 id="nullary-types">
1818 <title>Data types with no constructors</title>
1819
1820 <para>With the <option>-fglasgow-exts</option> flag, GHC lets you declare
1821 a data type with no constructors. For example:</para>
1822
1823 <programlisting>
1824 data S -- S :: *
1825 data T a -- T :: * -> *
1826 </programlisting>
1827
1828 <para>Syntactically, the declaration lacks the "= constrs" part. The
1829 type can be parameterised over types of any kind, but if the kind is
1830 not <literal>*</literal> then an explicit kind annotation must be used
1831 (see <xref linkend="kinding"/>).</para>
1832
1833 <para>Such data types have only one value, namely bottom.
1834 Nevertheless, they can be useful when defining "phantom types".</para>
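<para>A minimal sketch of the phantom-type idiom follows. It assumes the
<literal>EmptyDataDecls</literal> language pragma is accepted for the empty
declarations; the names <literal>Metres</literal>, <literal>Feet</literal>,
<literal>Distance</literal> and <literal>addDistance</literal> are hypothetical:</para>

<programlisting>
{-# LANGUAGE EmptyDataDecls #-}

-- Hypothetical unit tags: no constructors, used only at the type level.
data Metres
data Feet

-- The parameter `unit` is a phantom: it does not occur on the right-hand side.
newtype Distance unit = Distance Double

-- Only distances carrying the same tag can be added.
addDistance :: Distance unit -> Distance unit -> Distance unit
addDistance (Distance x) (Distance y) = Distance (x + y)

walk :: Distance Metres
walk = addDistance (Distance 1.5) (Distance 2.5)

-- addDistance walk (Distance 3 :: Distance Feet)   -- rejected by the type checker
</programlisting>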
1835 </sect2>
1836
1837 <sect2 id="datatype-contexts">
1838 <title>Data type contexts</title>
1839
1840 <para>Haskell allows datatypes to be given contexts, e.g.</para>
1841
1842 <programlisting>
1843 data Eq a => Set a = NilSet | ConsSet a (Set a)
1844 </programlisting>
1845
1846 <para>This gives the constructors the following types:</para>
1847
1848 <programlisting>
1849 NilSet :: Set a
1850 ConsSet :: Eq a => a -> Set a -> Set a
1851 </programlisting>
1852
1853 <para>In GHC this feature is an extension called
1854 <literal>DatatypeContexts</literal>, and on by default.</para>
1855 </sect2>
1856
1857 <sect2 id="infix-tycons">
1858 <title>Infix type constructors, classes, and type variables</title>
1859
1860 <para>
1861 GHC allows type constructors, classes, and type variables to be operators, and
1862 to be written infix, very much like expressions. More specifically:
1863 <itemizedlist>
1864 <listitem><para>
1865 A type constructor or class can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
1866 The lexical syntax is the same as that for data constructors.
1867 </para></listitem>
1868 <listitem><para>
1869 Data type and type-synonym declarations can be written infix, parenthesised
1870 if you want further arguments. E.g.
1871 <screen>
1872 data a :*: b = Foo a b
1873 type a :+: b = Either a b
1874 class a :=: b where ...
1875
1876 data (a :**: b) x = Baz a b x
1877 type (a :++: b) y = Either (a,b) y
1878 </screen>
1879 </para></listitem>
1880 <listitem><para>
1881 Types, and class constraints, can be written infix. For example
1882 <screen>
1883 x :: Int :*: Bool
1884 f :: (a :=: b) => a -> b
1885 </screen>
1886 </para></listitem>
1887 <listitem><para>
1888 A type variable can be an (unqualified) operator e.g. <literal>+</literal>.
1889 The lexical syntax is the same as that for variable operators, excluding "(.)",
1890 "(!)", and "(*)". In a binding position, the operator must be
1891 parenthesised. For example:
1892 <programlisting>
1893 type T (+) = Int + Int
1894 f :: T Either
1895 f = Left 3
1896
1897 liftA2 :: Arrow (~>)
1898 => (a -> b -> c) -> (e ~> a) -> (e ~> b) -> (e ~> c)
1899 liftA2 = ...
1900 </programlisting>
1901 </para></listitem>
1902 <listitem><para>
1903 Back-quotes work
1904 as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or
1905 <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>.
1906 </para></listitem>
1907 <listitem><para>
1908 Fixities may be declared for type constructors, or classes, just as for data constructors. However,
1909 one cannot distinguish between the two in a fixity declaration; a fixity declaration
1910 sets the fixity for a data constructor and the corresponding type constructor. For example:
1911 <screen>
1912 infixl 7 T, :*:
1913 </screen>
1914 sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>,
1915 and similarly for <literal>:*:</literal>.
1917 </para></listitem>
1918 <listitem><para>
1919 The function arrow is <literal>infixr</literal> with fixity 0. (This might change; I'm not sure what it should be.)
1920 </para></listitem>
1921
1922 </itemizedlist>
1923 </para>
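<para>
The points about infix declarations and shared fixities can be put together in
a short sketch. It assumes the <literal>TypeOperators</literal> language pragma;
the names <literal>triple</literal> and <literal>firstOf</literal> are
hypothetical:
<programlisting>
{-# LANGUAGE TypeOperators #-}

-- One fixity declaration covers both the type constructor :*:
-- and the data constructor :*:.
data a :*: b = a :*: b
infixr 6 :*:

triple :: Int :*: Bool :*: Char      -- grouped as Int :*: (Bool :*: Char)
triple = 1 :*: True :*: 'x'          -- grouped the same way at the value level

firstOf :: (a :*: b) -> a
firstOf (x :*: _) = x
</programlisting>
</para>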
1924 </sect2>
1925
1926 <sect2 id="type-synonyms">
1927 <title>Liberalised type synonyms</title>
1928
1929 <para>
1930 Type synonyms are like macros at the type level, but Haskell 98 imposes many rules
1931 on individual synonym declarations.
1932 With the <option>-XLiberalTypeSynonyms</option> extension,
1933 GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>.
1934 That means that GHC can be very much more liberal about type synonyms than Haskell 98.
1935
1936 <itemizedlist>
1937 <listitem> <para>You can write a <literal>forall</literal> (including overloading)
1938 in a type synonym, thus:
1939 <programlisting>
1940 type Discard a = forall b. Show b => a -> b -> (a, String)
1941
1942 f :: Discard a
1943 f x y = (x, show y)
1944
1945 g :: Discard Int -> (Int,String) -- A rank-2 type
1946 g f = f 3 True
1947 </programlisting>
1948 </para>
1949 </listitem>
1950
1951 <listitem><para>
1952 If you also use <option>-XUnboxedTuples</option>,
1953 you can write an unboxed tuple in a type synonym:
1954 <programlisting>
1955 type Pr = (# Int, Int #)
1956
1957 h :: Int -> Pr
1958 h x = (# x, x #)
1959 </programlisting>
1960 </para></listitem>
1961
1962 <listitem><para>
1963 You can apply a type synonym to a forall type:
1964 <programlisting>
1965 type Foo a = a -> a -> Bool
1966
1967 f :: Foo (forall b. b->b)
1968 </programlisting>
1969 After expanding the synonym, <literal>f</literal> has the legal (in GHC) type:
1970 <programlisting>
1971 f :: (forall b. b->b) -> (forall b. b->b) -> Bool
1972 </programlisting>
1973 </para></listitem>
1974
1975 <listitem><para>
1976 You can apply a type synonym to a partially applied type synonym:
1977 <programlisting>
1978 type Generic i o = forall x. i x -> o x
1979 type Id x = x
1980
1981 foo :: Generic Id []
1982 </programlisting>
1983 After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
1984 <programlisting>
1985 foo :: forall x. x -> [x]
1986 </programlisting>
1987 </para></listitem>
1988
1989 </itemizedlist>
1990 </para>
1991
1992 <para>
1993 GHC currently does kind checking before expanding synonyms (though even that
1994 could be changed).
1995 </para>
1996 <para>
1997 After expanding type synonyms, GHC does validity checking on types, looking for
1998 the following mal-formedness which isn't detected simply by kind checking:
1999 <itemizedlist>
2000 <listitem><para>
2001 Type constructor applied to a type involving for-alls.
2002 </para></listitem>
2003 <listitem><para>
2004 Unboxed tuple on left of an arrow.
2005 </para></listitem>
2006 <listitem><para>
2007 Partially-applied type synonym.
2008 </para></listitem>
2009 </itemizedlist>
2010 So, for example,
2011 this will be rejected:
2012 <programlisting>
2013 type Pr = (# Int, Int #)
2014
2015 h :: Pr -> Int
2016 h x = ...
2017 </programlisting>
2018 because GHC does not allow unboxed tuples on the left of a function arrow.
2019 </para>
2020 </sect2>
2021
2022
2023 <sect2 id="existential-quantification">
2024 <title>Existentially quantified data constructors
2025 </title>
2026
2027 <para>
2028 The idea of using existential quantification in data type declarations
2029 was suggested by Perry, and implemented in Hope+ (Nigel Perry, <emphasis>The Implementation
2030 of Practical Functional Programming Languages</emphasis>, PhD Thesis, University of
2031 London, 1991). It was later formalised by Laufer and Odersky
2032 (<emphasis>Polymorphic type inference and abstract data types</emphasis>,
2033 TOPLAS, 16(5), pp1411-1430, 1994).
2034 It's been in Lennart
2035 Augustsson's <command>hbc</command> Haskell compiler for several years, and
2036 proved very useful. Here's the idea. Consider the declaration:
2037 </para>
2038
2039 <para>
2040
2041 <programlisting>
2042 data Foo = forall a. MkFoo a (a -> Bool)
2043 | Nil
2044 </programlisting>
2045
2046 </para>
2047
2048 <para>
2049 The data type <literal>Foo</literal> has two constructors with types:
2050 </para>
2051
2052 <para>
2053
2054 <programlisting>
2055 MkFoo :: forall a. a -> (a -> Bool) -> Foo
2056 Nil :: Foo
2057 </programlisting>
2058
2059 </para>
2060
2061 <para>
2062 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
2063 does not appear in the data type itself, which is plain <literal>Foo</literal>.
2064 For example, the following expression is fine:
2065 </para>
2066
2067 <para>
2068
2069 <programlisting>
2070 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
2071 </programlisting>
2072
2073 </para>
2074
2075 <para>
2076 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
2077 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
2078 isUpper</function> packages a character with a compatible function. These
2079 two things are each of type <literal>Foo</literal> and can be put in a list.
2080 </para>
2081
2082 <para>
2083 What can we do with a value of type <literal>Foo</literal>? In particular,
2084 what happens when we pattern-match on <function>MkFoo</function>?
2085 </para>
2086
2087 <para>
2088
2089 <programlisting>
2090 f (MkFoo val fn) = ???
2091 </programlisting>
2092
2093 </para>
2094
2095 <para>
2096 Since all we know about <literal>val</literal> and <function>fn</function> is that they
2097 are compatible, the only (useful) thing we can do with them is to
2098 apply <function>fn</function> to <literal>val</literal> to get a boolean. For example:
2099 </para>
2100
2101 <para>
2102
2103 <programlisting>
2104 f :: Foo -> Bool
2105 f (MkFoo val fn) = fn val
2106 </programlisting>
2107
2108 </para>
2109
2110 <para>
2111 What this allows us to do is to package heterogeneous values
2112 together with a bunch of functions that manipulate them, and then treat
2113 that collection of packages in a uniform manner. You can express
2114 quite a bit of object-oriented-like programming this way.
2115 </para>
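<para>
Putting the pieces of this section together gives the following sketch. It
assumes the <literal>ExistentialQuantification</literal> language pragma; the
<literal>Nil</literal> case of <function>f</function> and the list
<literal>results</literal> are additions made here for illustration:
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

f :: Foo -> Bool
f (MkFoo val fn) = fn val
f Nil            = False   -- assumption: the text leaves the Nil case open

-- A heterogeneous list, treated uniformly through f.
results :: [Bool]
results = map f [MkFoo (3 :: Int) even, MkFoo 'c' isUpper, MkFoo 'C' isUpper, Nil]
-- results == [False, False, True, False]
</programlisting>
</para>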
2116
2117 <sect3 id="existential">
2118 <title>Why existential?
2119 </title>
2120
2121 <para>
2122 What has this to do with <emphasis>existential</emphasis> quantification?
2123 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
2124 </para>
2125
2126 <para>
2127
2128 <programlisting>
2129 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
2130 </programlisting>
2131
2132 </para>
2133
2134 <para>
2135 But Haskell programmers can safely think of the ordinary
2136 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
2137 adding a new existential quantification construct.
2138 </para>
2139
2140 </sect3>
2141
2142 <sect3 id="existential-with-context">
2143 <title>Existentials and type classes</title>
2144
2145 <para>
2146 An easy extension is to allow
2147 arbitrary contexts before the constructor. For example:
2148 </para>
2149
2150 <para>
2151
2152 <programlisting>
2153 data Baz = forall a. Eq a => Baz1 a a
2154 | forall b. Show b => Baz2 b (b -> b)
2155 </programlisting>
2156
2157 </para>
2158
2159 <para>
2160 The two constructors have the types you'd expect:
2161 </para>
2162
2163 <para>
2164
2165 <programlisting>
2166 Baz1 :: forall a. Eq a => a -> a -> Baz
2167 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
2168 </programlisting>
2169
2170 </para>
2171
2172 <para>
2173 But when pattern matching on <function>Baz1</function> the matched values can be compared
2174 for equality, and when pattern matching on <function>Baz2</function> the first matched
2175 value can be converted to a string (as well as applying the function to it).
2176 So this program is legal:
2177 </para>
2178
2179 <para>
2180
2181 <programlisting>
2182 f :: Baz -> String
2183 f (Baz1 p q) | p == q = "Yes"
2184 | otherwise = "No"
2185 f (Baz2 v fn) = show (fn v)
2186 </programlisting>
2187
2188 </para>
2189
2190 <para>
2191 Operationally, in a dictionary-passing implementation, the
2192 constructors <function>Baz1</function> and <function>Baz2</function> must store the
2193 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
2194 extract them on pattern matching.
2195 </para>
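<para>
As a usage sketch, the program above can be applied to a list whose elements
hide different types. It assumes the
<literal>ExistentialQuantification</literal> language pragma; the list
<literal>answers</literal> is a hypothetical addition:
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

f :: Baz -> String
f (Baz1 p q) | p == q    = "Yes"
             | otherwise = "No"
f (Baz2 v fn)            = show (fn v)

-- Each constructor carries its own dictionary, so the elements may
-- hide different types.
answers :: [String]
answers = map f [Baz1 'a' 'a', Baz1 True False, Baz2 (3 :: Int) (+ 1)]
-- answers == ["Yes", "No", "4"]
</programlisting>
</para>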
2196
2197 </sect3>
2198
2199 <sect3 id="existential-records">
2200 <title>Record Constructors</title>
2201
2202 <para>
2203 GHC allows existentials to be used with record syntax as well. For example:
2204
2205 <programlisting>
2206 data Counter a = forall self. NewCounter
2207 { _this :: self
2208 , _inc :: self -> self
2209 , _display :: self -> IO ()
2210 , tag :: a
2211 }
2212 </programlisting>
2213 Here <literal>tag</literal> is a public field, with a well-typed selector
2214 function <literal>tag :: Counter a -> a</literal>. The <literal>self</literal>
2215 type is hidden from the outside; any attempt to apply <literal>_this</literal>,
2216 <literal>_inc</literal> or <literal>_display</literal> as functions will raise a
2217 compile-time error. In other words, <emphasis>GHC defines a record selector function
2218 only for fields whose type does not mention the existentially-quantified variables</emphasis>.
2219 (This example used an underscore in the fields for which record selectors
2220 will not be defined, but that is only programming style; GHC ignores them.)
2221 </para>
2222
2223 <para>
2224 To make use of these hidden fields, we need to create some helper functions:
2225
2226 <programlisting>
2227 inc :: Counter a -> Counter a
2228 inc (NewCounter x i d t) = NewCounter
2229 { _this = i x, _inc = i, _display = d, tag = t }
2230
2231 display :: Counter a -> IO ()
2232 display NewCounter{ _this = x, _display = d } = d x
2233 </programlisting>
2234
2235 Now we can define counters with different underlying implementations:
2236
2237 <programlisting>
2238 counterA :: Counter String
2239 counterA = NewCounter
2240 { _this = 0, _inc = (1+), _display = print, tag = "A" }
2241
2242 counterB :: Counter String
2243 counterB = NewCounter
2244 { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }
2245
2246 main = do
2247 display (inc counterA) -- prints "1"
2248 display (inc (inc counterB)) -- prints "##"
2249 </programlisting>
2250
2251 Record update syntax is supported for existentials (and GADTs):
2252 <programlisting>
2253 setTag :: Counter a -> a -> Counter a
2254 setTag obj t = obj{ tag = t }
2255 </programlisting>
2256 The rule for record update is this: <emphasis>
2257 the types of the updated fields may
2258 mention only the universally-quantified type variables
2259 of the data constructor. For GADTs, the field may mention only types
2260 that appear as a simple type-variable argument in the constructor's result
2261 type</emphasis>. For example:
2262 <programlisting>
2263 data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential
2264 upd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' b
2265 upd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is
2266 -- existentially quantified)
2267
2268 data G a b where { G1 { g1::a, g2::c } :: G a [c] }
2269 upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c b
2270 upd4 g x = g { g2=x } -- BAD (g2's type mentions c, which is not a simple
2271 -- type-variable argument in G1's result type)
2272 </programlisting>
2273 </para>
2274
2275 </sect3>
2276
2277
2278 <sect3>
2279 <title>Restrictions</title>
2280
2281 <para>
2282 There are several restrictions on the ways in which existentially-quantified
2283 constructors can be used.
2284 </para>
2285
2286 <para>
2287
2288 <itemizedlist>
2289 <listitem>
2290
2291 <para>
2292 When pattern matching, each pattern match introduces a new,
2293 distinct, type for each existential type variable. These types cannot
2294 be unified with any other type, nor can they escape from the scope of
2295 the pattern match. For example, these fragments are incorrect:
2296
2297
2298 <programlisting>
2299 f1 (MkFoo a f) = a
2300 </programlisting>
2301
2302
2303 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
2304 is the result of <function>f1</function>. One way to see why this is wrong is to
2305 ask what type <function>f1</function> has:
2306
2307
2308 <programlisting>
2309 f1 :: Foo -> a -- Weird!
2310 </programlisting>
2311
2312
2313 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
2314 this:
2315
2316
2317 <programlisting>
2318 f1 :: forall a. Foo -> a -- Wrong!
2319 </programlisting>
2320
2321
2322 The original program is just plain wrong. Here's another sort of error:
2323
2324
2325 <programlisting>
2326 f2 (Baz1 a b) (Baz1 p q) = a==q
2327 </programlisting>
2328
2329
2330 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
2331 <literal>a==q</literal> is wrong because it equates the two distinct types arising
2332 from the two <function>Baz1</function> constructors.
2333
2334
2335 </para>
2336 </listitem>
2337 <listitem>
2338
2339 <para>
2340 You can't pattern-match on an existentially quantified
2341 constructor in a <literal>let</literal> or <literal>where</literal> group of
2342 bindings. So this is illegal:
2343
2344
2345 <programlisting>
2346 f3 x = a==b where { Baz1 a b = x }
2347 </programlisting>
2348
2349 Instead, use a <literal>case</literal> expression:
2350
2351 <programlisting>
2352 f3 x = case x of Baz1 a b -> a==b
2353 </programlisting>
2354
2355 In general, you can only pattern-match
2356 on an existentially-quantified constructor in a <literal>case</literal> expression or
2357 in the patterns of a function definition.
2358
2359 The reason for this restriction is really an implementation one.
2360 Type-checking binding groups is already a nightmare without
2361 existentials complicating the picture. Also an existential pattern
2362 binding at the top level of a module doesn't make sense, because it's
2363 not clear how to prevent the existentially-quantified type "escaping".
2364 So for now, there's a simple-to-state restriction. We'll see how
2365 annoying it is.
2366
2367 </para>
2368 </listitem>
2369 <listitem>
2370
2371 <para>
2372 You can't use existential quantification for <literal>newtype</literal>
2373 declarations. So this is illegal:
2374
2375
2376 <programlisting>
2377 newtype T = forall a. Ord a => MkT a
2378 </programlisting>
2379
2380
2381 Reason: a value of type <literal>T</literal> must be represented as a
2382 pair of a dictionary for <literal>Ord t</literal> and a value of type
2383 <literal>t</literal>. That contradicts the idea that
2384 <literal>newtype</literal> should have no concrete representation.
2385 You can get just the same efficiency and effect by using
2386 <literal>data</literal> instead of <literal>newtype</literal>. If
2387 there is no overloading involved, then there is more of a case for
2388 allowing an existentially-quantified <literal>newtype</literal>,
2389 because the <literal>data</literal> version does carry an
2390 implementation cost, but single-field existentially quantified
2391 constructors aren't much use. So the simple restriction (no
2392 existential stuff on <literal>newtype</literal>) stands, unless there
2393 are convincing reasons to change it.
2394
2395
2396 </para>
2397 </listitem>
2398 <listitem>
2399
2400 <para>
2401 You can't use <literal>deriving</literal> to define instances of a
2402 data type with existentially quantified data constructors.
2403
2404 Reason: in most cases it would not make sense. For example:
2405
2406 <programlisting>
2407 data T = forall a. MkT [a] deriving( Eq )
2408 </programlisting>
2409
2410 To derive <literal>Eq</literal> in the standard way we would need to have equality
2411 between the single component of two <function>MkT</function> constructors:
2412
2413 <programlisting>
2414 instance Eq T where
2415 (MkT a) == (MkT b) = ???
2416 </programlisting>
2417
2418 But <varname>a</varname> and <varname>b</varname> have distinct types, and so can't be compared.
2419 It's just about possible to imagine examples in which the derived instance
2420 would make sense, but it seems altogether simpler simply to prohibit such
2421 declarations. Define your own instances!
2422 </para>
2423 </listitem>
2424
2425 </itemizedlist>
2426
2427 </para>
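<para>
Since <literal>deriving</literal> is not available here, instances are written
by hand. The following is a minimal sketch, assuming the
<literal>ExistentialQuantification</literal> language pragma; the type
<literal>Showable</literal> is hypothetical, and the instance works only
because the <literal>Show</literal> dictionary is stored in the constructor:
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}

-- Hypothetical type: the Show constraint is packaged with the value.
data Showable = forall a. Show a => MkShowable a

-- Hand-written instance, following the usual showsPrec conventions.
instance Show Showable where
  showsPrec d (MkShowable a) =
    showParen (d > 10) (showString "MkShowable " . showsPrec 11 a)

rendered :: String
rendered = show (MkShowable (Just (1 :: Int)))
-- rendered == "MkShowable (Just 1)"
</programlisting>
</para>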
2428
2429 </sect3>
2430 </sect2>
2431
2432 <!-- ====================== Generalised algebraic data types ======================= -->
2433
2434 <sect2 id="gadt-style">
2435 <title>Declaring data types with explicit constructor signatures</title>
2436
2437 <para>When the <literal>GADTSyntax</literal> extension is enabled,
2438 GHC allows you to declare an algebraic data type by
2439 giving the type signatures of constructors explicitly. For example:
2440 <programlisting>
2441 data Maybe a where
2442 Nothing :: Maybe a
2443 Just :: a -> Maybe a
2444 </programlisting>
2445 The form is called a "GADT-style declaration"
2446 because Generalised Algebraic Data Types, described in <xref linkend="gadt"/>,
2447 can only be declared using this form.</para>
2448 <para>Notice that GADT-style syntax generalises existential types (<xref linkend="existential-quantification"/>).
2449 For example, these two declarations are equivalent:
2450 <programlisting>
2451 data Foo = forall a. MkFoo a (a -> Bool)
2452 data Foo' where { MkFoo' :: a -> (a->Bool) -> Foo' }
2453 </programlisting>
2454 </para>
2455 <para>Any data type that can be declared in standard Haskell-98 syntax
2456 can also be declared using GADT-style syntax.
2457 The choice is largely stylistic, but GADT-style declarations differ in one important respect:
2458 they treat class constraints on the data constructors differently.
2459 Specifically, if the constructor is given a type-class context, that
2460 context is made available by pattern matching. For example:
2461 <programlisting>
2462 data Set a where
2463 MkSet :: Eq a => [a] -> Set a
2464
2465 makeSet :: Eq a => [a] -> Set a
2466 makeSet xs = MkSet (nub xs)
2467
2468 insert :: a -> Set a -> Set a
2469 insert a (MkSet as) | a `elem` as = MkSet as
2470 | otherwise = MkSet (a:as)
2471 </programlisting>
2472 A use of <literal>MkSet</literal> as a constructor (e.g. in the definition of <literal>makeSet</literal>)
2473 gives rise to a <literal>(Eq a)</literal>
2474 constraint, as you would expect. The new feature is that pattern-matching on <literal>MkSet</literal>
2475 (as in the definition of <literal>insert</literal>) makes <emphasis>available</emphasis> an <literal>(Eq a)</literal>
2476 context. In implementation terms, the <literal>MkSet</literal> constructor has a hidden field that stores
2477 the <literal>(Eq a)</literal> dictionary that is passed to <literal>MkSet</literal>; so
2478 when pattern-matching that dictionary becomes available for the right-hand side of the match.
2479 In the example, the equality dictionary is used to satisfy the equality constraint
2480 generated by the call to <literal>elem</literal>, so that the type of
2481 <literal>insert</literal> itself has no <literal>Eq</literal> constraint.
2482 </para>
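<para>
The <literal>Set</literal> example can be made into a complete module as
follows. This is a sketch that assumes the <literal>GADTs</literal> language
pragma (used here so that the constructor context is accepted); the helper
<literal>toList</literal> and the binding <literal>example</literal> are
hypothetical additions:
<programlisting>
{-# LANGUAGE GADTs #-}
import Data.List (nub)

data Set a where
  MkSet :: Eq a => [a] -> Set a

makeSet :: Eq a => [a] -> Set a
makeSet xs = MkSet (nub xs)

-- No Eq constraint here: matching on MkSet supplies the dictionary.
insert :: a -> Set a -> Set a
insert a (MkSet as) | a `elem` as = MkSet as
                    | otherwise   = MkSet (a:as)

-- Hypothetical helper for inspecting the contents.
toList :: Set a -> [a]
toList (MkSet as) = as

example :: [Int]
example = toList (insert 2 (insert 5 (makeSet [1, 2, 2, 3])))
-- example == [5, 1, 2, 3]
</programlisting>
</para>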
2483 <para>
2484 For example, one possible application is to reify dictionaries:
2485 <programlisting>
2486 data NumInst a where
2487 MkNumInst :: Num a => NumInst a
2488
2489 intInst :: NumInst Int
2490 intInst = MkNumInst
2491
2492 plus :: NumInst a -> a -> a -> a
2493 plus MkNumInst p q = p + q
2494 </programlisting>
2495 Here, a value of type <literal>NumInst a</literal> is equivalent
2496 to an explicit <literal>(Num a)</literal> dictionary.
2497 </para>
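<para>
A short usage sketch shows the reified dictionary being passed around as an
ordinary value. It assumes the <literal>GADTs</literal> language pragma; the
helper <literal>sumWith</literal> and the binding <literal>total</literal> are
hypothetical:
<programlisting>
{-# LANGUAGE GADTs #-}

data NumInst a where
  MkNumInst :: Num a => NumInst a

plus :: NumInst a -> a -> a -> a
plus MkNumInst p q = p + q

-- Hypothetical helper: the signature has no Num constraint, because
-- matching on MkNumInst brings (Num a) into scope.
sumWith :: NumInst a -> [a] -> a
sumWith MkNumInst = foldr (+) 0

total :: Int
total = sumWith MkNumInst [1, 2, 3]   -- 6
</programlisting>
</para>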
2498 <para>
2499 All this applies to constructors declared using the syntax of <xref linkend="existential-with-context"/>.
2500 For example, the <literal>NumInst</literal> data type above could equivalently be declared
2501 like this:
2502 <programlisting>
2503 data NumInst a
2504 = Num a => MkNumInst
2505 </programlisting>
2506 Notice that, unlike the situation when declaring an existential, there is
2507 no <literal>forall</literal>, because the <literal>Num</literal> constrains the
2508 data type's universally quantified type variable <literal>a</literal>.
2509 A constructor may have both universal and existential type variables: for example,
2510 the following two declarations are equivalent:
2511 <programlisting>
2512 data T1 a
2513 = forall b. (Num a, Eq b) => MkT1 a b
2514 data T2 a where
2515 MkT2 :: (Num a, Eq b) => a -> b -> T2 a
2516 </programlisting>
2517 </para>
2518 <para>All this behaviour contrasts with Haskell 98's peculiar treatment of
2519 contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report).
2520 In Haskell 98 the definition
2521 <programlisting>
2522 data Eq a => Set' a = MkSet' [a]
2523 </programlisting>
2524 gives <literal>MkSet'</literal> the same type as <literal>MkSet</literal> above. But instead of
2525 <emphasis>making available</emphasis> an <literal>(Eq a)</literal> constraint, pattern-matching
2526 on <literal>MkSet'</literal> <emphasis>requires</emphasis> an <literal>(Eq a)</literal> constraint!
2527 GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,
2528 GHC's behaviour is much more useful, as well as much more intuitive.
2529 </para>
2530
2531 <para>
2532 The rest of this section gives further details about GADT-style data
2533 type declarations.
2534
2535 <itemizedlist>
2536 <listitem><para>
2537 The result type of each data constructor must begin with the type constructor being defined.
2538 If the result type of all constructors
2539 has the form <literal>T a1 ... an</literal>, where <literal>a1 ... an</literal>
2540 are distinct type variables, then the data type is <emphasis>ordinary</emphasis>;
2541 otherwise it is a <emphasis>generalised</emphasis> data type (<xref linkend="gadt"/>).
2542 </para></listitem>
2543
2544 <listitem><para>
2545 As with other type signatures, you can give a single signature for several data constructors.
2546 In this example we give a single signature for <literal>T1</literal> and <literal>T2</literal>:
2547 <programlisting>
2548 data T a where
2549 T1,T2 :: a -> T a
2550 T3 :: T a
2551 </programlisting>
2552 </para></listitem>
2553
2554 <listitem><para>
2555 The type signature of
2556 each constructor is independent, and is implicitly universally quantified as usual.
2557 In particular, the type variable(s) in the "<literal>data T a where</literal>" header
2558 have no scope, and different constructors may have different universally-quantified type variables:
2559 <programlisting>
  data T a where        -- The 'a' has no scope
    T1,T2 :: b -> T b   -- Means forall b. b -> T b
    T3 :: T a           -- Means forall a. T a
2563 </programlisting>
2564 </para></listitem>
2565
2566 <listitem><para>
2567 A constructor signature may mention type class constraints, which can differ for
2568 different constructors. For example, this is fine:
2569 <programlisting>
  data T a where
    T1 :: Eq b => b -> b -> T b
    T2 :: (Show c, Ix c) => c -> [c] -> T c
2573 </programlisting>
When pattern matching, these constraints are made available to discharge constraints
2575 in the body of the match. For example:
2576 <programlisting>
  f :: T a -> String
  f (T1 x y) | x==y      = "yes"
             | otherwise = "no"
  f (T2 a b)             = show a
2581 </programlisting>
2582 Note that <literal>f</literal> is not overloaded; the <literal>Eq</literal> constraint arising
2583 from the use of <literal>==</literal> is discharged by the pattern match on <literal>T1</literal>
2584 and similarly the <literal>Show</literal> constraint arising from the use of <literal>show</literal>.
2585 </para></listitem>
2586
2587 <listitem><para>
2588 Unlike a Haskell-98-style
2589 data type declaration, the type variable(s) in the "<literal>data Set a where</literal>" header
2590 have no scope. Indeed, one can write a kind signature instead:
2591 <programlisting>
2592 data Set :: * -> * where ...
2593 </programlisting>
2594 or even a mixture of the two:
2595 <programlisting>
2596 data Bar a :: (* -> *) -> * where ...
2597 </programlisting>
The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Bar</literal>
2599 like this:
2600 <programlisting>
2601 data Bar a (b :: * -> *) where ...
2602 </programlisting>
2603 </para></listitem>
2604
2605
2606 <listitem><para>
2607 You can use strictness annotations, in the obvious places
2608 in the constructor type:
2609 <programlisting>
  data Term a where
      Lit    :: !Int -> Term Int
      If     :: Term Bool -> !(Term a) -> !(Term a) -> Term a
      Pair   :: Term a -> Term b -> Term (a,b)
2614 </programlisting>
2615 </para></listitem>
2616
2617 <listitem><para>
2618 You can use a <literal>deriving</literal> clause on a GADT-style data type
declaration. For example, these two declarations are equivalent:
2620 <programlisting>
  data Maybe1 a where {
      Nothing1 :: Maybe1 a ;
      Just1    :: a -> Maybe1 a
    } deriving( Eq, Ord )

  data Maybe2 a = Nothing2 | Just2 a
       deriving( Eq, Ord )
2628 </programlisting>
2629 </para></listitem>
2630
2631 <listitem><para>
2632 The type signature may have quantified type variables that do not appear
2633 in the result type:
2634 <programlisting>
  data Foo where
     MkFoo :: a -> (a->Bool) -> Foo
     Nil   :: Foo
2638 </programlisting>
2639 Here the type variable <literal>a</literal> does not appear in the result type
2640 of either constructor.
2641 Although it is universally quantified in the type of the constructor, such
2642 a type variable is often called "existential".
2643 Indeed, the above declaration declares precisely the same type as
2644 the <literal>data Foo</literal> in <xref linkend="existential-quantification"/>.
2645 </para><para>
2646 The type may contain a class context too, of course:
2647 <programlisting>
  data Showable where
    MkShowable :: Show a => a -> Showable
2650 </programlisting>
2651 </para></listitem>
2652
2653 <listitem><para>
2654 You can use record syntax on a GADT-style data type declaration:
2655
2656 <programlisting>
  data Person where
      Adult :: { name :: String, children :: [Person] } -> Person
      Child :: Show a => { name :: !String, funny :: a } -> Person
2660 </programlisting>
2661 As usual, for every constructor that has a field <literal>f</literal>, the type of
2662 field <literal>f</literal> must be the same (modulo alpha conversion).
2663 The <literal>Child</literal> constructor above shows that the signature
2664 may have a context, existentially-quantified variables, and strictness annotations,
2665 just as in the non-record case. (NB: the "type" that follows the double-colon
2666 is not really a type, because of the record syntax and strictness annotations.
2667 A "type" of this form can appear only in a constructor signature.)
2668 </para></listitem>
2669
2670 <listitem><para>
Record updates are allowed with GADT-style declarations,
but only for fields that have the following property: the type of the field
mentions no existential type variables.
2674 </para></listitem>
2675
2676 <listitem><para>
2677 As in the case of existentials declared using the Haskell-98-like record syntax
2678 (<xref linkend="existential-records"/>),
2679 record-selector functions are generated only for those fields that have well-typed
2680 selectors.
2681 Here is the example of that section, in GADT-style syntax:
2682 <programlisting>
  data Counter a where
      NewCounter :: { _this    :: self
                    , _inc     :: self -> self
                    , _display :: self -> IO ()
                    , tag      :: a
                    } -> Counter a
2690 </programlisting>
2691 As before, only one selector function is generated here, that for <literal>tag</literal>.
2692 Nevertheless, you can still use all the field names in pattern matching and record construction.
2693 </para></listitem>
2694 </itemizedlist></para>
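<para>
As a minimal sketch of the points above about class contexts and existential
type variables, the following module shows a pattern match on
<literal>MkShowable</literal> making the <literal>Show</literal> constraint available.
The module layout, the function <literal>describe</literal> and <literal>main</literal>
are illustrative assumptions, not part of GHC or its libraries:
<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

-- An existential type declared in GADT-style syntax, with a class context.
data Showable where
  MkShowable :: Show a => a -> Showable

-- Matching on MkShowable brings (Show a) into scope, so 'describe'
-- needs no Show constraint of its own.
describe :: Showable -> String
describe (MkShowable x) = show x

main :: IO ()
main = mapM_ (putStrLn . describe)
             [MkShowable (1 :: Int), MkShowable True, MkShowable "hi"]
</programlisting>
Each list element hides a different type, yet all of them can be shown,
because the <literal>Show</literal> dictionary is stored in the constructor.
</para>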
2695 </sect2>
2696
2697 <sect2 id="gadt">
2698 <title>Generalised Algebraic Data Types (GADTs)</title>
2699
2700 <para>Generalised Algebraic Data Types generalise ordinary algebraic data types
2701 by allowing constructors to have richer return types. Here is an example:
2702 <programlisting>
  data Term a where
      Lit    :: Int -> Term Int
      Succ   :: Term Int -> Term Int
      IsZero :: Term Int -> Term Bool
      If     :: Term Bool -> Term a -> Term a -> Term a
      Pair   :: Term a -> Term b -> Term (a,b)
2709 </programlisting>
2710 Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the
2711 case with ordinary data types. This generality allows us to
2712 write a well-typed <literal>eval</literal> function
2713 for these <literal>Terms</literal>:
2714 <programlisting>
2715 eval :: Term a -> a
2716 eval (Lit i) = i
2717 eval (Succ t) = 1 + eval t
2718 eval (IsZero t) = eval t == 0
2719 eval (If b e1 e2) = if eval b then eval e1 else eval e2
2720 eval (Pair e1 e2) = (eval e1, eval e2)
2721 </programlisting>
2722 The key point about GADTs is that <emphasis>pattern matching causes type refinement</emphasis>.
2723 For example, in the right hand side of the equation
2724 <programlisting>
2725 eval :: Term a -> a
2726 eval (Lit i) = ...
2727 </programlisting>
2728 the type <literal>a</literal> is refined to <literal>Int</literal>. That's the whole point!
2729 A precise specification of the type rules is beyond what this user manual aspires to,
2730 but the design closely follows that described in
2731 the paper <ulink
2732 url="http://research.microsoft.com/%7Esimonpj/papers/gadt/">Simple
unification-based type inference for GADTs</ulink>
(ICFP 2006).
2735 The general principle is this: <emphasis>type refinement is only carried out
2736 based on user-supplied type annotations</emphasis>.
2737 So if no type signature is supplied for <literal>eval</literal>, no type refinement happens,
2738 and lots of obscure error messages will
2739 occur. However, the refinement is quite general. For example, if we had:
2740 <programlisting>
2741 eval :: Term a -> a -> a
2742 eval (Lit i) j = i+j
2743 </programlisting>
2744 the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type
2745 of the constructor <literal>Lit</literal>), and that refinement also applies to the type of <literal>j</literal>, and
2746 the result type of the <literal>case</literal> expression. Hence the addition <literal>i+j</literal> is legal.
2747 </para>
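<para>
For readers who want to try this out, here is a sketch that assembles the
declarations above into one compilable module. The <literal>module</literal>
header, the <literal>LANGUAGE</literal> pragma and the <literal>main</literal>
function are additions made for illustration only:
<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

data Term a where
  Lit    :: Int -> Term Int
  Succ   :: Term Int -> Term Int
  IsZero :: Term Int -> Term Bool
  If     :: Term Bool -> Term a -> Term a -> Term a
  Pair   :: Term a -> Term b -> Term (a,b)

-- The signature matters: type refinement is driven by it.
eval :: Term a -> a
eval (Lit i)      = i                 -- here 'a' is refined to Int
eval (Succ t)     = 1 + eval t
eval (IsZero t)   = eval t == 0       -- here 'a' is refined to Bool
eval (If b e1 e2) = if eval b then eval e1 else eval e2
eval (Pair e1 e2) = (eval e1, eval e2)

main :: IO ()
main = print (eval (If (IsZero (Lit 0))
                       (Pair (Lit 1) (Succ (Lit 1)))
                       (Pair (Lit 0) (Lit 0))))
-- prints (1,2)
</programlisting>
Deleting the signature of <literal>eval</literal> is a quick way to see the kind of
error message that results when no refinement can take place.
</para>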
2748 <para>
2749 These and many other examples are given in papers by Hongwei Xi, and
2750 Tim Sheard. There is a longer introduction
2751 <ulink url="http://www.haskell.org/haskellwiki/GADT">on the wiki</ulink>,
2752 and Ralf Hinze's
2753 <ulink url="http://www.informatik.uni-bonn.de/~ralf/publications/With.pdf">Fun with phantom types</ulink> also has a number of examples. Note that papers
2754 may use different notation to that implemented in GHC.
2755 </para>
2756 <para>
2757 The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with
2758 <option>-XGADTs</option>. The <option>-XGADTs</option> flag also sets <option>-XRelaxedPolyRec</option>.
2759 <itemizedlist>
2760 <listitem><para>
2761 A GADT can only be declared using GADT-style syntax (<xref linkend="gadt-style"/>);
2762 the old Haskell-98 syntax for data declarations always declares an ordinary data type.
2763 The result type of each constructor must begin with the type constructor being defined,
2764 but for a GADT the arguments to the type constructor can be arbitrary monotypes.
2765 For example, in the <literal>Term</literal> data
2766 type above, the type of each constructor must end with <literal>Term ty</literal>, but
2767 the <literal>ty</literal> need not be a type variable (e.g. the <literal>Lit</literal>
2768 constructor).
2769 </para></listitem>
2770
2771 <listitem><para>
2772 It is permitted to declare an ordinary algebraic data type using GADT-style syntax.
2773 What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors
2774 whose result type is not just <literal>T a b</literal>.
2775 </para></listitem>
2776
2777 <listitem><para>
2778 You cannot use a <literal>deriving</literal> clause for a GADT; only for
2779 an ordinary data type.
2780 </para></listitem>
2781
2782 <listitem><para>
2783 As mentioned in <xref linkend="gadt-style"/>, record syntax is supported.
2784 For example:
2785 <programlisting>
  data Term a where
      Lit    :: { val  :: Int }      -> Term Int
      Succ   :: { num  :: Term Int } -> Term Int
      Pred   :: { num  :: Term Int } -> Term Int
      IsZero :: { arg  :: Term Int } -> Term Bool
      Pair   :: { arg1 :: Term a
                , arg2 :: Term b
                }                    -> Term (a,b)
      If     :: { cnd  :: Term Bool
                , tru  :: Term a
                , fls  :: Term a
                }                    -> Term a
2798 </programlisting>
2799 However, for GADTs there is the following additional constraint:
2800 every constructor that has a field <literal>f</literal> must have
the same result type (modulo alpha conversion).
2802 Hence, in the above example, we cannot merge the <literal>num</literal>
2803 and <literal>arg</literal> fields above into a
2804 single name. Although their field types are both <literal>Term Int</literal>,
2805 their selector functions actually have different types:
2806
2807 <programlisting>
2808 num :: Term Int -> Term Int
2809 arg :: Term Bool -> Term Int
2810 </programlisting>
2811 </para></listitem>
2812
2813 <listitem><para>
2814 When pattern-matching against data constructors drawn from a GADT,
2815 for example in a <literal>case</literal> expression, the following rules apply:
2816 <itemizedlist>
2817 <listitem><para>The type of the scrutinee must be rigid.</para></listitem>
2818 <listitem><para>The type of the entire <literal>case</literal> expression must be rigid.</para></listitem>
2819 <listitem><para>The type of any free variable mentioned in any of
2820 the <literal>case</literal> alternatives must be rigid.</para></listitem>
2821 </itemizedlist>
2822 A type is "rigid" if it is completely known to the compiler at its binding site. The easiest
way to ensure that a variable has a rigid type is to give it a type signature.
2824 For more precise details see <ulink url="http://research.microsoft.com/%7Esimonpj/papers/gadt">
2825 Simple unification-based type inference for GADTs
2826 </ulink>. The criteria implemented by GHC are given in the Appendix.
2827
2828 </para></listitem>
2829
2830 </itemizedlist>
2831 </para>
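<para>
The rigidity rules can be illustrated with a small sketch. Here the signature of
the (hypothetical) function <literal>render</literal> fixes both the type of the
scrutinee <literal>t</literal> and the type of the whole <literal>case</literal>
expression, so the match on GADT constructors is accepted. The cut-down
<literal>Term</literal> type and <literal>main</literal> are assumptions made to keep
the example short:
<programlisting>
{-# LANGUAGE GADTs #-}
module Main where

data Term a where
  Lit    :: Int -> Term Int
  IsZero :: Term Int -> Term Bool

-- The signature makes the scrutinee type (Term a) and the
-- result type (String) rigid.
render :: Term a -> String
render t = case t of
  Lit i    -> "Lit " ++ show i
  IsZero u -> "IsZero (" ++ render u ++ ")"

main :: IO ()
main = putStrLn (render (IsZero (Lit 3)))
-- prints: IsZero (Lit 3)
</programlisting>
</para>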
2832
2833 </sect2>
2834 </sect1>
2835
2836 <!-- ====================== End of Generalised algebraic data types ======================= -->
2837
2838 <sect1 id="deriving">
2839 <title>Extensions to the "deriving" mechanism</title>
2840
2841 <sect2 id="deriving-inferred">
2842 <title>Inferred context for deriving clauses</title>
2843
2844 <para>
2845 The Haskell Report is vague about exactly when a <literal>deriving</literal> clause is
2846 legal. For example:
2847 <programlisting>
2848 data T0 f a = MkT0 a deriving( Eq )
2849 data T1 f a = MkT1 (f a) deriving( Eq )
2850 data T2 f a = MkT2 (f (f a)) deriving( Eq )
2851 </programlisting>
2852 The natural generated <literal>Eq</literal> code would result in these instance declarations:
2853 <programlisting>
2854 instance Eq a => Eq (T0 f a) where ...
2855 instance Eq (f a) => Eq (T1 f a) where ...
2856 instance Eq (f (f a)) => Eq (T2 f a) where ...
2857 </programlisting>
2858 The first of these is obviously fine. The second is still fine, although less obviously.
2859 The third is not Haskell 98, and risks losing termination of instances.
2860 </para>
2861 <para>
2862 GHC takes a conservative position: it accepts the first two, but not the third. The rule is this:
2863 each constraint in the inferred instance context must consist only of type variables,
2864 with no repetitions.
2865 </para>
2866 <para>
2867 This rule is applied regardless of flags. If you want a more exotic context, you can write
2868 it yourself, using the <link linkend="stand-alone-deriving">standalone deriving mechanism</link>.
2869 </para>
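<para>
As a sketch of writing the more exotic context by hand, the third declaration
above can be given its instance with a stand-alone deriving declaration.
The set of language flags shown is what we believe this particular context needs
(the constraint mentions <literal>f</literal> more often than the instance head does);
treat it as an assumption and adjust it to your compiler version:
<programlisting>
{-# LANGUAGE StandaloneDeriving, FlexibleContexts, UndecidableInstances #-}
module Main where

data T2 f a = MkT2 (f (f a))

-- The context that the deriving clause refuses to infer, written explicitly.
deriving instance Eq (f (f a)) => Eq (T2 f a)

main :: IO ()
main = print (MkT2 [[1 :: Int]] == MkT2 [[1]])
-- prints True
</programlisting>
</para>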
2870 </sect2>
2871
2872 <sect2 id="stand-alone-deriving">
2873 <title>Stand-alone deriving declarations</title>
2874
2875 <para>
2876 GHC now allows stand-alone <literal>deriving</literal> declarations, enabled by <literal>-XStandaloneDeriving</literal>:
2877 <programlisting>
2878 data Foo a = Bar a | Baz String
2879
2880 deriving instance Eq a => Eq (Foo a)
2881 </programlisting>
2882 The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword
2883 <literal>deriving</literal>, and (b) the absence of the <literal>where</literal> part.
2884 Note the following points:
2885 <itemizedlist>
2886 <listitem><para>
2887 You must supply an explicit context (in the example the context is <literal>(Eq a)</literal>),
2888 exactly as you would in an ordinary instance declaration.
2889 (In contrast, in a <literal>deriving</literal> clause
2890 attached to a data type declaration, the context is inferred.)
2891 </para></listitem>
2892
2893 <listitem><para>
2894 A <literal>deriving instance</literal> declaration
2895 must obey the same rules concerning form and termination as ordinary instance declarations,
2896 controlled by the same flags; see <xref linkend="instance-decls"/>.
2897 </para></listitem>
2898
2899 <listitem><para>
2900 Unlike a <literal>deriving</literal>
2901 declaration attached to a <literal>data</literal> declaration, the instance can be more specific
2902 than the data type (assuming you also use
2903 <literal>-XFlexibleInstances</literal>, <xref linkend="instance-rules"/>). Consider
2904 for example
2905 <programlisting>
2906 data Foo a = Bar a | Baz String
2907
2908 deriving instance Eq a => Eq (Foo [a])
2909 deriving instance Eq a => Eq (Foo (Maybe a))
2910 </programlisting>
2911 This will generate a derived instance for <literal>(Foo [a])</literal> and <literal>(Foo (Maybe a))</literal>,
2912 but other types such as <literal>(Foo (Int,Bool))</literal> will not be an instance of <literal>Eq</literal>.
2913 </para></listitem>
2914
2915 <listitem><para>
2916 Unlike a <literal>deriving</literal>
2917 declaration attached to a <literal>data</literal> declaration,
2918 GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate
2919 boilerplate code for the specified class, and typechecks it. If there is a type error, it is
2920 your problem. (GHC will show you the offending code if it has a type error.)
2921 The merit of this is that you can derive instances for GADTs and other exotic
2922 data types, providing only that the boilerplate code does indeed typecheck. For example:
2923 <programlisting>
  data T a where
     T1 :: T Int
     T2 :: T Bool

  deriving instance Show (T a)
2929 </programlisting>
2930 In this example, you cannot say <literal>... deriving( Show )</literal> on the
2931 data type declaration for <literal>T</literal>,
2932 because <literal>T</literal> is a GADT, but you <emphasis>can</emphasis> generate
2933 the instance declaration using stand-alone deriving.
2934 </para>
2935 </listitem>
2936
2937 <listitem>
2938 <para>The stand-alone syntax is generalised for newtypes in exactly the same
2939 way that ordinary <literal>deriving</literal> clauses are generalised (<xref linkend="newtype-deriving"/>).
2940 For example:
2941 <programlisting>
2942 newtype Foo a = MkFoo (State Int a)
2943
2944 deriving instance MonadState Int Foo
2945 </programlisting>
2946 GHC always treats the <emphasis>last</emphasis> parameter of the instance
2947 (<literal>Foo</literal> in this example) as the type whose instance is being derived.
2948 </para></listitem>
2949 </itemizedlist></para>
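<para>
The following sketch combines two of the points above in one module: an instance
that is more specific than the data type, and a derived instance for a GADT.
The <literal>module</literal> header and <literal>main</literal> are illustrative
additions:
<programlisting>
{-# LANGUAGE GADTs, StandaloneDeriving, FlexibleInstances #-}
module Main where

data Foo a = Bar a | Baz String

-- Only lists get an Eq instance; (Foo Int), say, does not.
deriving instance Eq a => Eq (Foo [a])

data T a where
  T1 :: T Int
  T2 :: T Bool

-- Not possible with a deriving clause on the declaration of T.
deriving instance Show (T a)

main :: IO ()
main = do
  print (Bar [1 :: Int] == Bar [1])   -- True
  print T1                            -- T1
  print T2                            -- T2
</programlisting>
</para>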
2950
2951 </sect2>
2952
2953
2954 <sect2 id="deriving-typeable">
2955 <title>Deriving clause for extra classes (<literal>Typeable</literal>, <literal>Data</literal>, etc)</title>
2956
2957 <para>
2958 Haskell 98 allows the programmer to add "<literal>deriving( Eq, Ord )</literal>" to a data type
2959 declaration, to generate a standard instance declaration for classes specified in the <literal>deriving</literal> clause.
2960 In Haskell 98, the only classes that may appear in the <literal>deriving</literal> clause are the standard
2961 classes <literal>Eq</literal>, <literal>Ord</literal>,
2962 <literal>Enum</literal>, <literal>Ix</literal>, <literal>Bounded</literal>, <literal>Read</literal>, and <literal>Show</literal>.
2963 </para>
2964 <para>
2965 GHC extends this list with several more classes that may be automatically derived:
2966 <itemizedlist>
2967 <listitem><para> With <option>-XDeriveDataTypeable</option>, you can derive instances of the classes
2968 <literal>Typeable</literal>, and <literal>Data</literal>, defined in the library
2969 modules <literal>Data.Typeable</literal> and <literal>Data.Generics</literal> respectively.
2970 </para>
2971 <para>An instance of <literal>Typeable</literal> can only be derived if the
2972 data type has seven or fewer type parameters, all of kind <literal>*</literal>.
2973 The reason for this is that the <literal>Typeable</literal> class is derived using the scheme
2974 described in
2975 <ulink url="http://research.microsoft.com/%7Esimonpj/papers/hmap/gmap2.ps">
2976 Scrap More Boilerplate: Reflection, Zips, and Generalised Casts
2977 </ulink>.
2978 (Section 7.4 of the paper describes the multiple <literal>Typeable</literal> classes that
2979 are used, and only <literal>Typeable1</literal> up to
2980 <literal>Typeable7</literal> are provided in the library.)
In other cases, there is nothing to stop the programmer writing a <literal>TypeableX</literal>
2982 class, whose kind suits that of the data type constructor, and
2983 then writing the data type instance by hand.
2984 </para>
2985 </listitem>
2986
2987 <listitem><para> With <option>-XDeriveFunctor</option>, you can derive instances of
2988 the class <literal>Functor</literal>,
2989 defined in <literal>GHC.Base</literal>.
2990 </para></listitem>
2991
2992 <listitem><para> With <option>-XDeriveFoldable</option>, you can derive instances of
2993 the class <literal>Foldable</literal>,
2994 defined in <literal>Data.Foldable</literal>.
2995 </para></listitem>
2996
2997 <listitem><para> With <option>-XDeriveTraversable</option>, you can derive instances of
2998 the class <literal>Traversable</literal>,
2999 defined in <literal>Data.Traversable</literal>.
3000 </para></listitem>
3001 </itemizedlist>
3002 In each case the appropriate class must be in scope before it
3003 can be mentioned in the <literal>deriving</literal> clause.
3004 </para>
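<para>
A small sketch of the last three flags in use follows. The <literal>Tree</literal>
type and the helper <literal>positive</literal> are made up for this example; the
imports are there to bring the classes into scope, as required above:
<programlisting>
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}
module Main where

import Data.Foldable    (Foldable, toList)
import Data.Traversable (Traversable, traverse)

data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Functor, Foldable, Traversable)

-- Hypothetical helper: succeeds only for positive numbers.
positive :: Int -> Maybe Int
positive x = if x > 0 then Just x else Nothing

main :: IO ()
main = do
  let t = Node (Node Leaf 1 Leaf) 2 Leaf
  print (fmap (+1) t)          -- Node (Node Leaf 2 Leaf) 3 Leaf
  print (toList t)             -- [1,2]
  print (traverse positive t)  -- Just (Node (Node Leaf 1 Leaf) 2 Leaf)
</programlisting>
</para>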
3005 </sect2>
3006
3007 <sect2 id="newtype-deriving">
3008 <title>Generalised derived instances for newtypes</title>
3009
3010 <para>
3011 When you define an abstract type using <literal>newtype</literal>, you may want
3012 the new type to inherit some instances from its representation. In
3013 Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>,
3014 <literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any
3015 other classes you have to write an explicit instance declaration. For
3016 example, if you define
3017
3018 <programlisting>
3019 newtype Dollars = Dollars Int
3020 </programlisting>
3021
3022 and you want to use arithmetic on <literal>Dollars</literal>, you have to
3023 explicitly define an instance of <literal>Num</literal>:
3024
3025 <programlisting>
  instance Num Dollars where
    Dollars a + Dollars b = Dollars (a+b)
    ...
3029 </programlisting>
3030 All the instance does is apply and remove the <literal>newtype</literal>
3031 constructor. It is particularly galling that, since the constructor
3032 doesn't appear at run-time, this instance declaration defines a
3033 dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal>
3034 dictionary, only slower!
3035 </para>
3036
3037
3038 <sect3> <title> Generalising the deriving clause </title>
3039 <para>
3040 GHC now permits such instances to be derived instead,
3041 using the flag <option>-XGeneralizedNewtypeDeriving</option>,
3042 so one can write
3043 <programlisting>
3044 newtype Dollars = Dollars Int deriving (Eq,Show,Num)
3045 </programlisting>
3046
3047 and the implementation uses the <emphasis>same</emphasis> <literal>Num</literal> dictionary
3048 for <literal>Dollars</literal> as for <literal>Int</literal>. Notionally, the compiler
3049 derives an instance declaration of the form
3050
3051 <programlisting>
3052 instance Num Int => Num Dollars
3053 </programlisting>
3054
3055 which just adds or removes the <literal>newtype</literal> constructor according to the type.
3056 </para>
3057 <para>
3058
3059 We can also derive instances of constructor classes in a similar
3060 way. For example, suppose we have implemented state and failure monad
3061 transformers, such that
3062
3063 <programlisting>
3064 instance Monad m => Monad (State s m)
3065 instance Monad m => Monad (Failure m)
3066 </programlisting>
3067 In Haskell 98, we can define a parsing monad by
3068 <programlisting>
3069 type Parser tok m a = State [tok] (Failure m) a
3070 </programlisting>
3071
3072 which is automatically a monad thanks to the instance declarations
3073 above. With the extension, we can make the parser type abstract,
3074 without needing to write an instance of class <literal>Monad</literal>, via
3075
3076 <programlisting>
  newtype Parser tok m a = Parser (State [tok] (Failure m) a)
                         deriving Monad
3079 </programlisting>
3080 In this case the derived instance declaration is of the form
3081 <programlisting>
3082 instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)
3083 </programlisting>
3084
3085 Notice that, since <literal>Monad</literal> is a constructor class, the
3086 instance is a <emphasis>partial application</emphasis> of the new type, not the
3087 entire left hand side. We can imagine that the type declaration is
3088 "eta-converted" to generate the context of the instance
3089 declaration.
3090 </para>
3091 <para>
3092
3093 We can even derive instances of multi-parameter classes, provided the
3094 newtype is the last class parameter. In this case, a ``partial
3095 application'' of the class appears in the <literal>deriving</literal>
3096 clause. For example, given the class
3097
3098 <programlisting>
3099 class StateMonad s m | m -> s where ...
3100 instance Monad m => StateMonad s (State s m) where ...
3101 </programlisting>
3102 then we can derive an instance of <literal>StateMonad</literal> for <literal>Parser</literal>s by
3103 <programlisting>
  newtype Parser tok m a = Parser (State [tok] (Failure m) a)
                         deriving (Monad, StateMonad [tok])
3106 </programlisting>
3107
3108 The derived instance is obtained by completing the application of the
3109 class to the new type:
3110
3111 <programlisting>
  instance StateMonad [tok] (State [tok] (Failure m)) =>
           StateMonad [tok] (Parser tok m)
3114 </programlisting>
3115 </para>
3116 <para>
3117
3118 As a result of this extension, all derived instances in newtype
3119 declarations are treated uniformly (and implemented just by reusing
3120 the dictionary for the representation type), <emphasis>except</emphasis>
3121 <literal>Show</literal> and <literal>Read</literal>, which really behave differently for
3122 the newtype and its representation.
3123 </para>
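<para>
The <literal>Dollars</literal> example can be tried as a complete program. This is
only a sketch; the <literal>module</literal> header and <literal>main</literal> are
illustrative additions:
<programlisting>
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Main where

-- Eq and Num reuse the Int dictionaries; Show is derived in the usual way.
newtype Dollars = Dollars Int deriving (Eq, Show, Num)

main :: IO ()
main = print (Dollars 3 + Dollars 4 * 2)
-- prints Dollars 11
</programlisting>
The output mentions the constructor, which shows <literal>Show</literal> behaving
differently for the newtype and its representation, whereas the arithmetic
(including the literal <literal>2</literal>) goes through the <literal>Int</literal>
instance.
</para>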
3124 </sect3>
3125
3126 <sect3> <title> A more precise specification </title>
3127 <para>
3128 Derived instance declarations are constructed as follows. Consider the
3129 declaration (after expansion of any type synonyms)
3130
3131 <programlisting>
3132 newtype T v1...vn = T' (t vk+1...vn) deriving (c1...cm)
3133 </programlisting>
3134
3135 where
3136 <itemizedlist>
3137 <listitem><para>
3138 The <literal>ci</literal> are partial applications of
3139 classes of the form <literal>C t1'...tj'</literal>, where the arity of <literal>C</literal>
3140 is exactly <literal>j+1</literal>. That is, <literal>C</literal> lacks exactly one type argument.
3141 </para></listitem>
3142 <listitem><para>
3143 The <literal>k</literal> is chosen so that <literal>ci (T v1...vk)</literal> is well-kinded.
3144 </para></listitem>
3145 <listitem><para>
3146 The type <literal>t</literal> is an arbitrary type.
3147 </para></listitem>
3148 <listitem><para>
3149 The type variables <literal>vk+1...vn</literal> do not occur in <literal>t</literal>,
3150 nor in the <literal>ci</literal>, and
3151 </para></listitem>
3152 <listitem><para>
3153 None of the <literal>ci</literal> is <literal>Read</literal>, <literal>Show</literal>,
3154 <literal>Typeable</literal>, or <literal>Data</literal>. These classes
3155 should not "look through" the type or its constructor. You can still
3156 derive these classes for a newtype, but it happens in the usual way, not
3157 via this new mechanism.
3158 </para></listitem>
3159 </itemizedlist>
3160 Then, for each <literal>ci</literal>, the derived instance
3161 declaration is:
3162 <programlisting>
3163 instance ci t => ci (T v1...vk)
3164 </programlisting>
3165 As an example which does <emphasis>not</emphasis> work, consider
3166 <programlisting>
3167 newtype NonMonad m s = NonMonad (State s m s) deriving Monad
3168 </programlisting>
3169 Here we cannot derive the instance
3170 <programlisting>
3171 instance Monad (State s m) => Monad (NonMonad m)
3172 </programlisting>
3173
3174 because the type variable <literal>s</literal> occurs in <literal>State s m</literal>,
3175 and so cannot be "eta-converted" away. It is a good thing that this
3176 <literal>deriving</literal> clause is rejected, because <literal>NonMonad m</literal> is
3177 not, in fact, a monad --- for the same reason. Try defining
3178 <literal>>>=</literal> with the correct type: you won't be able to.
3179 </para>
3180 <para>
3181
3182 Notice also that the <emphasis>order</emphasis> of class parameters becomes
3183 important, since we can only derive instances for the last one. If the
3184 <literal>StateMonad</literal> class above were instead defined as
3185
3186 <programlisting>
3187 class StateMonad m s | m -> s where ...
3188 </programlisting>
3189
3190 then we would not have been able to derive an instance for the
3191 <literal>Parser</literal> type above. We hypothesise that multi-parameter
3192 classes usually have one "main" parameter for which deriving new
3193 instances is most interesting.
3194 </para>
3195 <para>Lastly, all of this applies only for classes other than
3196 <literal>Read</literal>, <literal>Show</literal>, <literal>Typeable</literal>,
3197 and <literal>Data</literal>, for which the built-in derivation applies (section
3198 4.3.3. of the Haskell Report).
3199 (For the standard classes <literal>Eq</literal>, <literal>Ord</literal>,
3200 <literal>Ix</literal>, and <literal>Bounded</literal> it is immaterial whether
3201 the standard method is used or the one described here.)
3202 </para>
3203 </sect3>
3204 </sect2>
3205 </sect1>
3206
3207
3208 <!-- TYPE SYSTEM EXTENSIONS -->
3209 <sect1 id="type-class-extensions">
<title>Class and instance declarations</title>
3211
3212 <sect2 id="multi-param-type-classes">
3213 <title>Class declarations</title>
3214
3215 <para>
This section, and the next one, document GHC's type-class extensions.
3217 There's lots of background in the paper <ulink
3218 url="http://research.microsoft.com/~simonpj/Papers/type-class-design-space/">Type
3219 classes: exploring the design space</ulink> (Simon Peyton Jones, Mark
3220 Jones, Erik Meijer).
3221 </para>
3222 <para>
3223 All the extensions are enabled by the <option>-fglasgow-exts</option> flag.
3224 </para>
3225
3226 <sect3>
3227 <title>Multi-parameter type classes</title>
3228 <para>
3229 Multi-parameter type classes are permitted, with flag <option>-XMultiParamTypeClasses</option>.
3230 For example:
3231
3232
3233 <programlisting>
  class Collection c a where
    union :: c a -> c a -> c a
    ...etc.
3237 </programlisting>
3238
3239 </para>
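<para>
As a sketch, here is one possible instance of a cut-down
<literal>Collection</literal> class for lists. The instance, the use of
<literal>Data.List.union</literal> and the extra flag
<option>-XFlexibleInstances</option> (needed here because the second instance
argument is a bare type variable) are choices made for this example:
<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
module Main where

import qualified Data.List as List

class Collection c a where
  union :: c a -> c a -> c a

-- Lists form a collection for any element type with equality.
instance Eq a => Collection [] a where
  union = List.union

main :: IO ()
main = print (union [1, 2, 3] [3, 4 :: Int])
-- prints [1,2,3,4]
</programlisting>
</para>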
3240 </sect3>
3241
3242 <sect3 id="superclass-rules">
3243 <title>The superclasses of a class declaration</title>
3244
3245 <para>
3246 In Haskell 98 the context of a class declaration (which introduces superclasses)
3247 must be simple; that is, each predicate must consist of a class applied to
3248 type variables. The flag <option>-XFlexibleContexts</option>
3249 (<xref linkend="flexible-contexts"/>)
3250 lifts this restriction,
3251 so that the only restriction on the context in a class declaration is
3252 that the class hierarchy must be acyclic. So these class declarations are OK:
3253
3254
3255 <programlisting>
  class Functor (m k) => FiniteMap m k where
    ...

  class (Monad m, Monad (t m)) => Transform t m where
    lift :: m a -> (t m) a
3261 </programlisting>
3262
3263
3264 </para>
3265 <para>
As in Haskell 98, the class hierarchy must be acyclic. However, the definition
3267 of "acyclic" involves only the superclass relationships. For example,
3268 this is OK:
3269
3270
3271 <programlisting>
  class C a where {
    op :: D b => a -> b -> b
  }

  class C a => D a where { ... }
3277 </programlisting>
3278
3279
3280 Here, <literal>C</literal> is a superclass of <literal>D</literal>, but it's OK for a
3281 class operation <literal>op</literal> of <literal>C</literal> to mention <literal>D</literal>. (It
3282 would not be OK for <literal>D</literal> to be a superclass of <literal>C</literal>.)
3283 </para>
3284 </sect3>
3285
3286
3287
3288
3289 <sect3 id="class-method-types">
3290 <title>Class method types</title>
3291
3292 <para>
Haskell 98 prohibits class method types from mentioning constraints on the
class type variable, thus:
3295 <programlisting>
  class Seq s a where
    fromList :: [a] -> s a
    elem     :: Eq a => a -> s a -> Bool
3299 </programlisting>
The type of <literal>elem</literal> is illegal in Haskell 98, because it
contains the constraint <literal>Eq a</literal>, which constrains only the
class type variable (in this case <literal>a</literal>).
3303 GHC lifts this restriction (flag <option>-XConstrainedClassMethods</option>).
3304 </para>
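<para>
A sketch of the <literal>Seq</literal> class in use is given below. The list
instance and <literal>main</literal> are assumptions made for illustration, and
the Prelude's own <literal>elem</literal> is hidden to avoid a name clash:
<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, ConstrainedClassMethods #-}
module Main where

import Prelude hiding (elem)
import qualified Data.List as List

class Seq s a where
  fromList :: [a] -> s a
  -- The constraint (Eq a) mentions only the class type variable 'a'.
  elem     :: Eq a => a -> s a -> Bool

instance Seq [] a where
  fromList = id
  elem     = List.elem

main :: IO ()
main = print (elem (2 :: Int) (fromList [1, 2, 3] :: [Int]))
-- prints True
</programlisting>
Note that the instance itself needs no <literal>Eq a</literal> context; the
constraint is demanded only where <literal>elem</literal> is called.
</para>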
3305
3306
3307 </sect3>
3308 </sect2>
3309
3310 <sect2 id="functional-dependencies">
3311 <title>Functional dependencies
3312 </title>
3313
3314 <para> Functional dependencies are implemented as described by Mark Jones
in &ldquo;<ulink url="http://citeseer.ist.psu.edu/jones00type.html">Type Classes with Functional Dependencies</ulink>&rdquo;,
in Proceedings of the 9th European Symposium on Programming,
ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782.
3319 </para>
3320 <para>
3321 Functional dependencies are introduced by a vertical bar in the syntax of a
3322 class declaration; e.g.
3323 <programlisting>
3324 class (Monad m) => MonadState s m | m -> s where ...
3325
3326 class Foo a b c | a b -> c where ...
3327 </programlisting>
3328 There should be more documentation, but there isn't (yet). Yell if you need it.
3329 </para>
3330
3331 <sect3><title>Rules for functional dependencies </title>
3332 <para>
3333 In a class declaration, all of the class type variables must be reachable (in the sense
3334 mentioned in <xref linkend="flexible-contexts"/>)
3335 from the free variables of each method type.
3336 For example:
3337
3338 <programlisting>
3339 class Coll s a where
3340 empty :: s
3341 insert :: s -> a -> s
3342 </programlisting>
3343
3344 is not OK, because the type of <literal>empty</literal> doesn't mention
3345 <literal>a</literal>. Functional dependencies can make the type variable
3346 reachable:
3347 <programlisting>
3348 class Coll s a | s -> a where
3349 empty :: s
3350 insert :: s -> a -> s
3351 </programlisting>
3352
3353 Alternatively <literal>Coll</literal> might be rewritten
3354
3355 <programlisting>
3356 class Coll s a where
3357 empty :: s a
3358 insert :: s a -> a -> s a
3359 </programlisting>
3360
3361
3362 which makes the connection between the type of a collection of
3363 <literal>a</literal>'s (namely <literal>(s a)</literal>) and the element type <literal>a</literal>.
3364 Occasionally this really doesn't work, in which case you can split the
3365 class like this:
3366
3367
3368 <programlisting>
3369 class CollE s where
3370 empty :: s
3371
3372 class CollE s => Coll s a where
3373 insert :: s -> a -> s
3374 </programlisting>
3375 </para>
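<para>
The functional-dependency version of <literal>Coll</literal> can be tried out
with a sketch along the following lines. The list instance and
<literal>main</literal> are assumptions made for illustration;
<option>-XFlexibleInstances</option> is needed because the instance head
repeats a type variable.
<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
module Main where

class Coll s a | s -> a where
  empty  :: s
  insert :: s -> a -> s

-- Hypothetical instance: a list of a's is a collection of a's
instance Coll [a] a where
  empty       = []
  insert xs x = x : xs

main :: IO ()
-- putStrLn fixes the collection type to String, and the
-- dependency s -> a then fixes the element type to Char
main = putStrLn (insert (insert empty 'a') 'b')
</programlisting>
Here the use of <literal>empty</literal> is resolved even though its type
mentions only <literal>s</literal>: once <literal>s</literal> is known to be
<literal>String</literal>, the dependency determines <literal>a</literal>.
</para>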
3376 </sect3>
3377
3378
3379 <sect3>
3380 <title>Background on functional dependencies</title>
3381
3382 <para>The following description of the motivation and use of functional dependencies is taken
3383 from the Hugs user manual, reproduced here (with minor changes) by kind
3384 permission of Mark Jones.
3385 </para>
3386 <para>
3387 Consider the following class, intended as part of a
3388 library for collection types:
3389 <programlisting>
3390 class Collects e ce where
3391 empty :: ce
3392 insert :: e -> ce -> ce
3393 member :: e -> ce -> Bool
3394 </programlisting>
3395 The type variable e used here represents the element type, while ce is the type
3396 of the container itself. Within this framework, we might want to define
3397 instances of this class for lists or characteristic functions (both of which
3398 can be used to represent collections of any equality type), bit sets (which can
3399 be used to represent collections of characters), or hash tables (which can be
3400 used to represent any collection whose elements have a hash function). Omitting
3401 standard implementation details, this would lead to the following declarations:
3402 <programlisting>
3403 instance Eq e => Collects e [e] where ...
3404 instance Eq e => Collects e (e -> Bool) where ...
3405 instance Collects Char BitSet where ...
3406 instance (Hashable e, Collects e ce)
3407 => Collects e (Array Int ce) where ...
3408 </programlisting>
3409 All this looks quite promising; we have a class and a range of interesting
3410 implementations. Unfortunately, there are some serious problems with the class
3411 declaration. First, the empty function has an ambiguous type:
3412 <programlisting>
3413 empty :: Collects e ce => ce
3414 </programlisting>
3415 By "ambiguous" we mean that there is a type variable e that appears on the left
3416 of the <literal>=&gt;</literal> symbol, but not on the right. The problem with
3417 this is that, according to the theoretical foundations of Haskell overloading,
3418 we cannot guarantee a well-defined semantics for any term with an ambiguous
3419 type.
3420 </para>
3421 <para>
3422 We can sidestep this specific problem by removing the empty member from the
3423 class declaration. However, although the remaining members, insert and member,
3424 do not have ambiguous types, we still run into problems when we try to use
3425 them. For example, consider the following two functions:
3426 <programlisting>
3427 f x y = insert x . insert y
3428 g = f True 'a'
3429 </programlisting>
3430 for which GHC infers the following types:
3431 <programlisting>
3432 f :: (Collects a c, Collects b c) => a -> b -> c -> c
3433 g :: (Collects Bool c, Collects Char c) => c -> c
3434 </programlisting>
3435 Notice that the type for f allows the two parameters x and y to be assigned
3436 different types, even though it attempts to insert each of the two values, one
3437 after the other, into the same collection. If we're trying to model collections
3438 that contain only one type of value, then this is clearly an inaccurate
3439 type. Worse still, the definition for g is accepted, without causing a type
3440 error. As a result, the error in this code will not be flagged at the point
3441 where it appears. Instead, it will show up only when we try to use g, which
3442 might even be in a different module.
3443 </para>
3444
3445 <sect4><title>An attempt to use constructor classes</title>
3446
3447 <para>
3448 Faced with the problems described above, some Haskell programmers might be
3449 tempted to use something like the following version of the class declaration:
3450 <programlisting>
3451 class Collects e c where
3452 empty :: c e
3453 insert :: e -> c e -> c e
3454 member :: e -> c e -> Bool
3455 </programlisting>
3456 The key difference here is that we abstract over the type constructor c that is
3457 used to form the collection type c e, and not over that collection type itself,
3458 represented by ce in the original class declaration. This avoids the immediate
3459 problems that we mentioned above: empty has type <literal>Collects e c => c
3460 e</literal>, which is not ambiguous.
3461 </para>
3462 <para>
3463 The function f from the previous section has a more accurate type:
3464 <programlisting>
3465 f :: (Collects e c) => e -> e -> c e -> c e
3466 </programlisting>
3467 The function g from the previous section is now rejected with a type error,
3468 as we would hope, because the type of f does not allow the two arguments to have
3469 different types.
3470 This, then, is an example of a multiple parameter class that does actually work
3471 quite well in practice, without ambiguity problems.
3472 There is, however, a catch. This version of the Collects class is nowhere near
3473 as general as the original class seemed to be: only one of the four instances
3474 for <literal>Collects</literal>
3475 given above can be used with this version of Collects because only one of
3476 them---the instance for lists---has a collection type that can be written in
3477 the form c e, for some type constructor c, and element type e.
3478 </para>
3479 </sect4>
3480
3481 <sect4><title>Adding functional dependencies</title>
3482
3483 <para>
3484 To get a more useful version of the Collects class, Hugs provides a mechanism
3485 that allows programmers to specify dependencies between the parameters of a
3486 multiple parameter class. (For readers with an interest in theoretical
3487 foundations and previous work: the use of dependency information can be seen
3488 either as a generalization of the proposal for `parametric type classes' that was
3489 put forward by Chen, Hudak, and Odersky, or as a special case of Mark Jones's
3490 later framework for "improvement" of qualified types. The
3491 underlying ideas are also discussed in a more theoretical and abstract setting
3492 in a manuscript [implparam], where they are identified as one point in a
3493 general design space for systems of implicit parameterization.)
3494
3495 To start with an abstract example, consider a declaration such as:
3496 <programlisting>
3497 class C a b where ...
3498 </programlisting>
3499 which tells us simply that C can be thought of as a binary relation on types
3500 (or type constructors, depending on the kinds of a and b). Extra clauses can be
3501 included in the definition of classes to add information about dependencies
3502 between parameters, as in the following examples:
3503 <programlisting>
3504 class D a b | a -> b where ...
3505 class E a b | a -> b, b -> a where ...
3506 </programlisting>
3507 The notation <literal>a -&gt; b</literal> used here between the | and where
3508 symbols --- not to be
3509 confused with a function type --- indicates that the a parameter uniquely
3510 determines the b parameter, and might be read as "a determines b." Thus D is
3511 not just a relation, but actually a (partial) function. Similarly, from the two
3512 dependencies that are included in the definition of E, we can see that E
3513 represents a (partial) one-one mapping between types.
3514 </para>
3515 <para>
3516 More generally, dependencies take the form <literal>x1 ... xn -&gt; y1 ... ym</literal>,
3517 where x1, ..., xn, and y1, ..., ym are type variables with n&gt;0 and
3518 m&gt;=0, meaning that the y parameters are uniquely determined by the x
3519 parameters. Spaces can be used as separators if more than one variable appears
3520 on any single side of a dependency, as in <literal>t -&gt; a b</literal>. Note that a class may be
3521 annotated with multiple dependencies using commas as separators, as in the
3522 definition of E above. Some dependencies that we can write in this notation are
3523 redundant, and will be rejected because they don't serve any useful
3524 purpose, and may instead indicate an error in the program. Examples of
3525 dependencies like this include <literal>a -&gt; a </literal>,
3526 <literal>a -&gt; a a </literal>,
3527 <literal>a -&gt; </literal>, etc. There can also be
3528 some redundancy if multiple dependencies are given, as in
3529 <literal>a-&gt;b</literal>,
3530 <literal>b-&gt;c</literal>, <literal>a-&gt;c</literal>,
3531 in which some subset implies the remaining dependencies. Examples like this are
3532 not treated as errors. Note that dependencies appear only in class
3533 declarations, and not in any other part of the language. In particular, the
3534 syntax for instance declarations, class constraints, and types is completely
3535 unchanged.
3536 </para>
3537 <para>
3538 By including dependencies in a class declaration, we provide a mechanism for
3539 the programmer to specify each multiple parameter class more precisely. The
3540 compiler, on the other hand, is responsible for ensuring that the set of
3541 instances that are in scope at any given point in the program is consistent
3542 with any declared dependencies. For example, the following pair of instance
3543 declarations cannot appear together in the same scope because they violate the
3544 dependency for D, even though either one on its own would be acceptable:
3545 <programlisting>
3546 instance D Bool Int where ...
3547 instance D Bool Char where ...
3548 </programlisting>
3549 Note also that the following declaration is not allowed, even by itself:
3550 <programlisting>
3551 instance D [a] b where ...
3552 </programlisting>
3553 The problem here is that this instance would allow one particular choice of [a]
3554 to be associated with more than one choice for b, which contradicts the
3555 dependency specified in the definition of D. More generally, this means that,
3556 in any instance of the form:
3557 <programlisting>
3558 instance D t s where ...
3559 </programlisting>
3560 for some particular types t and s, the only variables that can appear in s are
3561 the ones that appear in t, and hence, if the type t is known, then s will be
3562 uniquely determined.
3563 </para>
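<para>
The consistency requirement can be sketched with a small program. The method
<literal>convert</literal> and the instance bodies are assumptions added for
illustration; only the class header and the instance heads come from the text
above.
<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
module Main where

class D a b | a -> b where
  convert :: a -> b          -- hypothetical method

instance D Bool Int where
  convert b = if b then 1 else 0

-- Uncommenting this instance should make the compiler reject the
-- module, because Bool would then determine both Int and Char:
--
-- instance D Bool Char where
--   convert b = if b then 't' else 'f'

main :: IO ()
-- The result type of (convert True) is not written anywhere; the
-- dependency a -> b lets the type checker conclude that it is Int.
main = print (convert True + 1)
</programlisting>
Without the dependency, the result type of <literal>convert True</literal>
would be left undetermined and the call would be reported as ambiguous.
</para>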
3564 <para>
3565 The benefit of including dependency information is that it allows us to define
3566 more general multiple parameter classes, without ambiguity problems, and with
3567 the benefit of more accurate types. To illustrate this, we return to the
3568 collection class example, and annotate the original definition of <literal>Collects</literal>
3569 with a simple dependency:
3570 <programlisting>
3571 class Collects e ce | ce -> e where
3572 empty :: ce
3573 insert :: e -> ce -> ce
3574 member :: e -> ce -> Bool
3575 </programlisting>
3576 The dependency <literal>ce -&gt; e</literal> here specifies that the type e of elements is uniquely
3577 determined by the type of the collection ce. Note that both parameters of
3578 Collects are of kind *; there are no constructor classes here. Note too that
3579 all of the instances of Collects that we gave earlier can be used
3580 together with this new definition.
3581 </para>
3582 <para>
3583 What about the ambiguity problems that we encountered with the original
3584 definition? The empty function still has type Collects e ce => ce, but it is no
3585 longer necessary to regard that as an ambiguous type: Although the variable e
3586 does not appear on the right of the => symbol, the dependency for class
3587 Collects tells us that it is uniquely determined by ce, which does appear on
3588 the right of the => symbol. Hence the context in which empty is used can still
3589 give enough information to determine types for both ce and e, without
3590 ambiguity. More generally, we need only regard a type as ambiguous if it
3591 contains a variable on the left of the => that is not uniquely determined
3592 (either directly or indirectly) by the variables on the right.
3593 </para>
3594 <para>
3595 Dependencies also help to produce more accurate types for user defined
3596 functions, and hence to provide earlier detection of errors, and less cluttered
3597 types for programmers to work with. Recall the previous definition for a
3598 function f:
3599 <programlisting>
3600 f x y = insert x . insert y
3601 </programlisting>
3602 for which we originally obtained a type:
3603 <programlisting>
3604 f :: (Collects a c, Collects b c) => a -> b -> c -> c
3605 </programlisting>
3606 Given the dependency information that we have for Collects, however, we can
3607 deduce that a and b must be equal because they both appear as the first
3608 parameter in a Collects constraint with the same second parameter c. Hence we
3609 can infer a shorter and more accurate type for f:
3610 <programlisting>
3611 f :: (Collects a c) => a -> a -> c -> c
3612 </programlisting>
3613 In a similar way, the earlier definition of g will now be flagged as a type error.
3614 </para>
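<para>
The following sketch puts the pieces of this section together in one module.
Only the list instance is given, with assumed method bodies, and
<literal>main</literal> is added for illustration;
<option>-XFlexibleInstances</option> is needed for the instance head.
<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
module Main where

class Collects e ce | ce -> e where
  empty  :: ce
  insert :: e -> ce -> ce
  member :: e -> ce -> Bool

-- The list instance from the text, with assumed method bodies
instance Eq e => Collects e [e] where
  empty  = []
  insert = (:)
  member = elem

-- Both elements must have the same type, as the dependency demands
f :: Collects a c => a -> a -> c -> c
f x y = insert x . insert y

-- Rejected when uncommented: True and 'a' have different types
-- g = f True 'a'

main :: IO ()
main = do
  let xs = f 'a' 'b' empty :: String
  putStrLn xs
  print (member 'a' xs)
</programlisting>
The type signature for <literal>f</literal> is optional here; it is written
out to show the shorter type that the dependency makes possible.
</para>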
3615 <para>
3616 Although we have given only a few examples here, it should be clear that the
3617 addition of dependency information can help to make multiple parameter classes
3618 more useful in practice, avoiding ambiguity problems, and allowing more general
3619 sets of instance declarations.
3620 </para>
3621 </sect4>
3622 </sect3>
3623 </sect2>
3624
3625 <sect2 id="instance-decls">
3626 <title>Instance declarations</title>
3627
3628 <para>An instance declaration has the form
3629 <screen>
3630 instance ( <replaceable>assertion</replaceable><subscript>1</subscript>, ..., <replaceable>assertion</replaceable><subscript>n</subscript>) =&gt; <replaceable>class</replaceable> <replaceable>type</replaceable><subscript>1</subscript> ... <replaceable>type</replaceable><subscript>m</subscript> where ...
3631 </screen>
3632 The part before the "<literal>=&gt;</literal>" is the
3633 <emphasis>context</emphasis>, while the part after the
3634 "<literal>=&gt;</literal>" is the <emphasis>head</emphasis> of the instance declaration.
3635 </para>
3636
3637 <sect3 id="flexible-instance-head">
3638 <title>Relaxed rules for the instance head</title>
3639
3640 <para>
3641 In Haskell 98 the head of an instance declaration
3642 must be of the form <literal>C (T a1 ... an)</literal>, where
3643 <literal>C</literal> is the class, <literal>T</literal> is a data type constructor,
3644 and the <literal>a1 ... an</literal> are distinct type variables.
3645 GHC relaxes these rules in two ways.
3646 <itemizedlist>
3647 <listitem>
3648 <para>
3649 The <option>-XFlexibleInstances</option> flag allows the head of the instance
3650 declaration to mention arbitrary nested types.
3651 For example, this becomes a legal instance declaration
3652 <programlisting>
3653 instance C (Maybe Int) where ...
3654 </programlisting>
3655 See also the <link linkend="instance-overlap">rules on overlap</link>.
3656 </para></listitem>
3657 <listitem><para>
3658 With the <option>-XTypeSynonymInstances</option> flag, instance heads may use type
3659 synonyms. As always, using a type synonym is just shorthand for
3660 writing the RHS of the type synonym definition. For example:
3661
3662
3663 <programlisting>
3664 type Point = (Int,Int)
3665 instance C Point where ...
3666 instance C [Point] where ...
3667 </programlisting>
3668
3669
3670 is legal. However, if you added
3671
3672
3673 <programlisting>
3674 instance C (Int,Int) where ...
3675 </programlisting>
3676
3677
3678 as well, then the compiler will complain about the overlapping
3679 (actually, identical) instance declarations. As always, type synonyms
3680 must be fully applied. You cannot, for example, write:
3681
3682 <programlisting>
3683 type P a = [[a]]
3684 instance Monad P where ...
3685 </programlisting>
3686
3687 </para></listitem>
3688 </itemizedlist>
3689 </para>
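<para>
Both relaxations can be seen in a short, self-contained sketch. The class
<literal>Pretty</literal> and its method are hypothetical names chosen for
this illustration.
<programlisting>
{-# LANGUAGE FlexibleInstances, TypeSynonymInstances #-}
module Main where

class Pretty a where            -- hypothetical class
  pretty :: a -> String

type Point = (Int, Int)

-- A type synonym in the head; it is shorthand for (Int,Int)
instance Pretty Point where
  pretty (x, y) = "(" ++ show x ++ ", " ++ show y ++ ")"

-- A nested type in the head
instance Pretty (Maybe Int) where
  pretty Nothing  = "none"
  pretty (Just n) = "just " ++ show n

main :: IO ()
main = do
  putStrLn (pretty ((1, 2) :: Point))
  putStrLn (pretty (Just 3 :: Maybe Int))
</programlisting>
The type annotations in <literal>main</literal> are needed because numeric
literals are themselves overloaded, so the instance cannot be chosen until
their type is fixed.
</para>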
3690 </sect3>
3691
3692 <sect3 id="instance-rules">
3693 <title>Relaxed rules for instance contexts</title>
3694
3695 <para>In Haskell 98, the assertions in the context of the instance declaration
3696 must be of the form <literal>C a</literal> where <literal>a</literal>
3697 is a type variable that occurs in the head.
3698 </para>
3699
3700 <para>
3701 The <option>-XFlexibleContexts</option> flag relaxes this rule, as well
3702 as the corresponding rule for type signatures (see <xref linkend="flexible-contexts"/>).
3703 With this flag the context of the instance declaration can consist of arbitrary
3704 (well-kinded) assertions <literal>(C t1 ... tn)</literal> subject only to the
3705 following rules:
3706 <orderedlist>
3707 <listitem><para>
3708 The Paterson Conditions: for each assertion in the context
3709 <orderedlist>
3710 <listitem><para>No type variable has more occurrences in the assertion than in the head</para></listitem>
3711 <listitem><para>The assertion has fewer constructors and variables (taken together
3712 and counting repetitions) than the head</para></listitem>
3713 </orderedlist>
3714 </para></listitem>
3715
3716 <listitem><para>The Coverage Condition. For each functional dependency,
3717 <replaceable>tvs</replaceable><subscript>left</subscript> <literal>-&gt;</literal>
3718 <replaceable>tvs</replaceable><subscript>right</subscript>, of the class,
3719 every type variable in
3720 S(<replaceable>tvs</replaceable><subscript>right</subscript>) must appear in
3721 S(<replaceable>tvs</replaceable><subscript>left</subscript>), where S is the
3722 substitution mapping each type variable in the class declaration to the
3723 corresponding type in the instance declaration.
3724 </para></listitem>
3725 </orderedlist>
3726 These restrictions ensure that context reduction terminates: each reduction
3727 step makes the problem smaller by at least one
3728 constructor. Both the Paterson Conditions and the Coverage Condition are lifted
3729 if you give the <option>-XUndecidableInstances</option>
3730 flag (<xref linkend="undecidable-instances"/>).
3731 You can find lots of background material about the reason for these
3732 restrictions in the paper <ulink
3733 url="http://research.microsoft.com/%7Esimonpj/papers/fd%2Dchr/">
3734 Understanding functional dependencies via Constraint Handling Rules</ulink>.
3735 </para>
3736 <para>
3737 For example, these are OK:
3738 <programlisting>
3739 instance C Int [a] -- Multiple parameters
3740 instance Eq (S [a]) -- Structured type in head
3741
3742 -- Repeated type variable in head
3743 instance C4 a a => C4 [a] [a]
3744 instance Stateful (ST s) (MutVar s)
3745
3746 -- Head can consist of type variables only
3747 instance C a
3748 instance (Eq a, Show b) => C2 a b
3749
3750 -- Non-type variables in context
3751 instance Show (s a) => Show (Sized s a)
3752 instance C2 Int a => C3 Bool [a]
3753 instance C2 Int a => C3 [a] b
3754 </programlisting>
3755 But these are not:
3756 <programlisting>
3757 -- Context assertion no smaller than head
3758 instance C a => C a where ...
3759 -- (C b b) has more occurrences of b than the head
3760 instance C b b => Foo [b] where ...
3761 </programlisting>
3762 </para>
3763
3764 <para>
3765 The same restrictions apply to instances generated by
3766 <literal>deriving</literal> clauses. Thus the following is accepted:
3767 <programlisting>
3768 data MinHeap h a = H a (h a)
3769 deriving (Show)
3770 </programlisting>
3771 because the derived instance
3772 <programlisting>
3773 instance (Show a, Show (h a)) => Show (MinHeap h a)
3774 </programlisting>
3775 conforms to the above rules.
3776 </para>
3777
3778 <para>
3779 A useful idiom permitted by the above rules is as follows.
3780 If one allows overlapping instance declarations then it's quite
3781 convenient to have a "default instance" declaration that applies if
3782 something more specific does not:
3783 <programlisting>
3784 instance C a where
3785 op = ... -- Default
3786 </programlisting>
3787 </para>
3788 </sect3>
3789
3790 <sect3 id="undecidable-instances">
3791 <title>Undecidable instances</title>
3792
3793 <para>
3794 Sometimes even the rules of <xref linkend="instance-rules"/> are too onerous.
3795 For example, sometimes you might want to use the following to get the
3796 effect of a "class synonym":
3797 <programlisting>
3798 class (C1 a, C2 a, C3 a) => C a where { }
3799
3800 instance (C1 a, C2 a, C3 a) => C a where { }
3801 </programlisting>
3802 This allows you to write shorter signatures:
3803 <programlisting>
3804 f :: C a => ...
3805 </programlisting>
3806 instead of
3807 <programlisting>
3808 f :: (C1 a, C2 a, C3 a) => ...
3809 </programlisting>
3810 The restrictions on functional dependencies (<xref
3811 linkend="functional-dependencies"/>) are particularly troublesome.
3812 It is tempting to introduce type variables in the context that do not appear in
3813 the head, something that is excluded by the normal rules. For example:
3814 <programlisting>
3815 class HasConverter a b | a -> b where
3816 convert :: a -> b
3817
3818 data Foo a = MkFoo a
3819
3820 instance (HasConverter a b,Show b) => Show (Foo a) where
3821 show (MkFoo value) = show (convert value)
3822 </programlisting>
3823 This is dangerous territory, however. Here, for example, is a program that would make the
3824 typechecker loop:
3825 <programlisting>
3826 class D a
3827 class F a b | a->b
3828 instance F [a] [[a]]
3829 instance (D c, F a c) => D [a] -- 'c' is not mentioned in the head
3830 </programlisting>
3831 Similarly, it can be tempting to lift the coverage condition:
3832 <programlisting>
3833 class Mul a b c | a b -> c where
3834 (.*.) :: a -> b -> c
3835
3836 instance Mul Int Int Int where (.*.) = (*)
3837 instance Mul Int Float Float where x .*. y = fromIntegral x * y
3838 instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v
3839 </programlisting>
3840 The third instance declaration does not obey the coverage condition;
3841 and indeed the (somewhat strange) definition:
3842 <programlisting>
3843 f = \ b x y -> if b then x .*. [y] else y
3844 </programlisting>
3845 makes instance inference go into a loop, because it requires the constraint
3846 <literal>(Mul a [b] b)</literal>.
3847 </para>
3848 <para>
3849 Nevertheless, GHC allows you to experiment with more liberal rules. If you use
3850 the experimental flag <option>-XUndecidableInstances</option>
3851 <indexterm><primary>-XUndecidableInstances</primary></indexterm>,
3852 both the Paterson Conditions and the Coverage Condition
3853 (described in <xref linkend="instance-rules"/>) are lifted. Termination is ensured by having a
3854 fixed-depth recursion stack. If you exceed the stack depth you get a
3855 sort of backtrace, and the opportunity to increase the stack depth
3856 with <option>-fcontext-stack=</option><emphasis>N</emphasis>.
3857 </para>
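<para>
As a rough sketch of the "class synonym" idiom described in this section, the
following module combines two standard classes under one name. The class
<literal>ShowOrd</literal> and the function <literal>largest</literal> are
hypothetical; <option>-XFlexibleInstances</option> is needed as well, because
the instance head is a bare type variable.
<programlisting>
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}
module Main where

-- Hypothetical synonym for the pair of constraints (Show a, Ord a)
class (Show a, Ord a) => ShowOrd a

-- The context is no smaller than the head, so this instance
-- violates the Paterson Conditions
instance (Show a, Ord a) => ShowOrd a

largest :: ShowOrd a => [a] -> String
largest = show . maximum

main :: IO ()
main = putStrLn (largest [3, 1, 2 :: Int])
</programlisting>
The superclass context is what lets <literal>largest</literal> use
<literal>show</literal> and <literal>maximum</literal> given only a
<literal>ShowOrd a</literal> constraint.
</para>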
3858
3859 </sect3>
3860
3861
3862 <sect3 id="instance-overlap">
3863 <title>Overlapping instances</title>
3864 <para>
3865 In general, <emphasis>GHC requires that it be unambiguous which instance
3866 declaration
3867 should be used to resolve a type-class constraint</emphasis>. This behaviour
3868 can be modified by two flags: <option>-XOverlappingInstances</option>
3869 <indexterm><primary>-XOverlappingInstances
3870 </primary></indexterm>
3871 and <option>-XIncoherentInstances</option>
3872 <indexterm><primary>-XIncoherentInstances
3873 </primary></indexterm>, as this section discusses. Both these
3874 flags are dynamic flags, and can be set on a per-module basis, using
3875 an <literal>OPTIONS_GHC</literal> pragma if desired (<xref linkend="source-file-options"/>).</para>
3876 <para>
3877 When GHC tries to resolve, say, the constraint <literal>C Int Bool</literal>,
3878 it tries to match every instance declaration against the
3879 constraint,
3880 by instantiating the head of the instance declaration. For example, consider
3881 these declarations:
3882 <programlisting>
3883 instance context1 => C Int a where ... -- (A)
3884 instance context2 => C a Bool where ... -- (B)
3885 instance context3 => C Int [a] where ... -- (C)
3886 instance context4 => C Int [Int] where ... -- (D)
3887 </programlisting>
3888 The instances (A) and (B) match the constraint <literal>C Int Bool</literal>,
3889 but (C) and (D) do not. When matching, GHC takes
3890 no account of the context of the instance declaration
3891 (<literal>context1</literal> etc).
3892 GHC's default behaviour is that <emphasis>exactly one instance must match the
3893 constraint it is trying to resolve</emphasis>.
3894 It is fine for there to be a <emphasis>potential</emphasis> of overlap (by
3895 including both declarations (A) and (B), say); an error is only reported if a
3896 particular constraint matches more than one.
3897 </para>
3898
3899 <para>
3900 The <option>-XOverlappingInstances</option> flag instructs GHC to allow
3901 more than one instance to match, provided there is a most specific one. For
3902 example, the constraint <literal>C Int [Int]</literal> matches instances (A),
3903 (C) and (D), but the last is more specific, and hence is chosen. If there is no
3904 most-specific match, the program is rejected.
3905 </para>
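<para>
The choice of the most specific instance can be observed with a small
single-parameter sketch. The class <literal>Describe</literal> and all of its
instances are hypothetical. (More recent GHC releases deprecate the
<option>-XOverlappingInstances</option> flag in favour of per-instance
pragmas, so a newer compiler may emit a warning for this module.)
<programlisting>
{-# LANGUAGE FlexibleInstances, OverlappingInstances #-}
module Main where

class Describe a where          -- hypothetical class
  describe :: a -> String

-- Least specific: matches every type
instance Describe a where
  describe _ = "something"

instance Describe Int where
  describe n = "the Int " ++ show n

instance Describe [a] where
  describe _ = "a list"

-- Most specific instance for lists of Int
instance Describe [Int] where
  describe _ = "a list of Ints"

main :: IO ()
main = do
  putStrLn (describe True)            -- only the general instance matches
  putStrLn (describe (3 :: Int))      -- Int is more specific than a
  putStrLn (describe "abc")           -- [a] is more specific than a
  putStrLn (describe [1 :: Int, 2])   -- [Int] is the most specific
</programlisting>
Each call is made at a fully known type, so a single most specific instance
exists in every case.
</para>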
3906 <para>
3907 However, GHC is conservative about committing to an overlapping instance. For example:
3908 <programlisting>
3909 f :: [b] -> [b]
3910 f x = ...
3911 </programlisting>
3912 Suppose that from the RHS of <literal>f</literal> we get the constraint
3913 <literal>C Int [b]</literal>. But
3914 GHC does not commit to instance (C), because in a particular
3915 call of <literal>f</literal>, <literal>b</literal> might be instantiated
3916 to <literal>Int</literal>, in which case instance (D) would be more specific still.
3917 So GHC rejects the program.
3918 (If you add the flag <option>-XIncoherentInstances</option>,
3919 GHC will instead pick (C), without complaining about
3920 the problem of subsequent instantiations.)
3921 </para>
3922 <para>
3923 Notice that we gave a type signature to <literal>f</literal>, so GHC had to
3924 <emphasis>check</emphasis> that <literal>f</literal> has the specified type.
3925 Suppose instead we do not give a type signature, asking GHC to <emphasis>infer</emphasis>
3926 it instead. In this case, GHC will refrain from
3927 simplifying the constraint <literal>C Int [b]</literal> (for the same reason
3928 as before) but, rather than rejecting the program, it will infer the type
3929 <programlisting>
3930 f :: C Int [b] => [b] -> [b]
3931 </programlisting>
3932 That postpones the question of which instance to pick to the
3933 call site for <literal>f</literal>
3934 by which time more is known about the type <literal>b</literal>.
3935 You can write this type signature yourself if you use the
3936 <link linkend="flexible-contexts"><option>-XFlexibleContexts</option></link>
3937 flag.
3938 </para>
3939 <para>
3940 Exactly the same situation can arise in instance declarations themselves. Suppose we have
3941 <programlisting>
3942 class Foo a where
3943 f :: a -> a
3944 instance Foo [b] where
3945 f x = ...
3946 </programlisting>
3947 and, as before, the constraint <literal>C Int [b]</literal> arises from <literal>f</literal>'s
3948 right hand side. GHC will reject the instance, complaining as before that it does not know how to resolve
3949 the constraint <literal>C Int [b]</literal>, because it matches more than one instance
3950 declaration. The solution is to postpone the choice by adding the constraint to the context
3951 of the instance declaration, thus:
3952 <programlisting>
3953 instance C Int [b] => Foo [b] where
3954 f x = ...
3955 </programlisting>
3956 (You need <link linkend="instance-rules"><option>-XFlexibleContexts</option></link> to do this.)
3957 </para>
3958 <para>
3959 Warning: overlapping instances must be used with care. They
3960 can give rise to incoherence (i.e. different instance choices are made
3961 in different parts of the program) even without <option>-XIncoherentInstances</option>. Consider:
3962 <programlisting>
3963 {-# LANGUAGE OverlappingInstances #-}
3964 module Help where
3965
3966 class MyShow a where
3967 myshow :: a -> String
3968
3969 instance MyShow a => MyShow [a] where
3970 myshow xs = concatMap myshow xs
3971
3972 showHelp :: MyShow a => [a] -> String
3973 showHelp xs = myshow xs
3974
3975 {-# LANGUAGE FlexibleInstances, OverlappingInstances #-}
3976 module Main where
3977 import Help
3978
3979 data T = MkT
3980
3981 instance MyShow T where
3982 myshow x = "Used generic instance"
3983
3984 instance MyShow [T] where
3985 myshow xs = "Used more specific instance"
3986
3987 main = do { print (myshow [MkT]); print (showHelp [MkT]) }
3988 </programlisting>
3989 In function <literal>showHelp</literal> GHC sees no overlapping
3990 instances, and so uses the <literal>MyShow [a]</literal> instance
3991 without complaint. In the call to <literal>myshow</literal> in <literal>main</literal>,
3992 GHC resolves the <literal>MyShow [T]</literal> constraint using the overlapping
3993 instance declaration in module <literal>Main</literal>. As a result,
3994 the program prints
3995 <programlisting>
3996 "Used more specific instance"
3997 "Used generic instance"
3998 </programlisting>
3999 (An alternative possible behaviour, not currently implemented,
4000 would be to reject module <literal>Help</literal>
4001 on the grounds that a later instance declaration might overlap the local one.)
4002 </para>
4003 <para>
4004 The willingness to be overlapped or incoherent is a property of
4005 the <emphasis>instance declaration</emphasis> itself, controlled by the
4006 presence or otherwise of the <option>-XOverlappingInstances</option>
4007 and <option>-XIncoherentInstances</option> flags when the module containing it is
4008 compiled. Specifically, during the lookup process:
4009 <itemizedlist>
4010 <listitem><para>
4011 If the constraint being looked up matches two instance declarations IA and IB,
4012 and
4013 <itemizedlist>
4014 <listitem><para>IB is a substitution instance of IA (but not vice versa);
4015 that is, IB is strictly more specific than IA</para></listitem>
4016 <listitem><para>either IA or IB was compiled with <option>-XOverlappingInstances</option></para></listitem>
4017 </itemizedlist>
4018 then the less-specific instance IA is ignored.
4019 </para></listitem>
4020 <listitem><para>
4021 Suppose an instance declaration does not match the constraint being looked up, but
4022 does <emphasis>unify</emphasis> with it, so that it might match when the constraint is further
4023 instantiated. Usually GHC will regard this as a reason for not committing to
4024 some other instance. But if the instance declaration was compiled with
4025 <option>-XIncoherentInstances</option>, GHC will skip the "does-it-unify?"
4026 check for that declaration.
4027 </para></listitem>
4028 </itemizedlist>
4029 These rules make it possible for a library author to design a library that relies on
4030 overlapping instances without the library client having to know.
4031 </para>
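<para>
As a minimal sketch of the first rule, consider a module compiled with
<option>-XOverlappingInstances</option> in which one instance is a substitution
instance of the other. The class <literal>Describe</literal> and both of its
instances are hypothetical names invented for this illustration; they are not
part of any library.
<programlisting>
{-# LANGUAGE FlexibleInstances, OverlappingInstances #-}
module Main where

-- Hypothetical class, used only for this illustration
class Describe a where
    describe :: a -> String

-- IA: the general instance, matches any list
instance Describe [a] where
    describe _ = "some list"

-- IB: a substitution instance of IA, strictly more specific
instance Describe [Char] where
    describe _ = "a string"

main :: IO ()
main = do
    putStrLn (describe "abc")          -- matches IA and IB; IA is ignored
    putStrLn (describe [True, False])  -- matches IA only
</programlisting>
The constraint <literal>Describe [Char]</literal> matches both instances, so the
less specific one is ignored and the first call should print
<literal>a string</literal>. The constraint <literal>Describe [Bool]</literal>
matches only the general instance, so the second call should print
<literal>some list</literal>.
</para>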
4032 <para>The <option>-XIncoherentInstances</option> flag implies the
4033 <option>-XOverlappingInstances</option> flag, but not vice versa.
4034 </para>
4035 </sect3>
4036
4037
4038
4039 </sect2>
4040
4041 <sect2 id="overloaded-strings">
4042 <title>Overloaded string literals
4043 </title>
4044
4045 <para>
4046 GHC supports <emphasis>overloaded string literals</emphasis>. Normally a
4047 string literal has type <literal>String</literal>, but with overloaded string
4048 literals enabled (with <option>-XOverloadedStrings</option>)
4049 a string literal has type <literal>(IsString a) => a</literal>.
4050 </para>
4051 <para>
4052 This means that the usual string syntax can be used, e.g., for packed strings
4053 and other variations of string-like types. String literals behave very much
4054 like integer literals, i.e., they can be used in both expressions and patterns.
4055 If used in a pattern, the literal will be replaced by an equality test, in the same
4056 way as an integer literal is.
4057 </para>
4058 <para>
4059 The class <literal>IsString</literal> is defined as:
4060 <programlisting>
4061 class IsString a where
4062     fromString :: String -> a
4063 </programlisting>
4064 The only predefined instance is the obvious one to make strings work as usual:
4065 <programlisting>
4066 instance IsString [Char] where
4067     fromString cs = cs
4068 </programlisting>
4069 The class <literal>IsString</literal> is not in scope by default. If you want to mention
4070 it explicitly (for example, to give an instance declaration for it), you can import it
4071 from module <literal>GHC.Exts</literal>.
4072 </para>
4073 <para>
4074 Haskell's defaulting mechanism is extended to cover string literals when <option>-XOverloadedStrings</option> is specified.
4075 Specifically:
4076 <itemizedlist>
4077 <listitem><para>
4078 Each type in a default declaration must be an
4079 instance of <literal>Num</literal> <emphasis>or</emphasis> of <literal>IsString</literal>.
4080 </para></listitem>
4081
4082 <listitem><para>
4083 The standard defaulting rule (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.3.4">Haskell Report, Section 4.3.4</ulink>)
4084 is extended thus: defaulting applies when all the unresolved constraints involve standard classes
4085 <emphasis>or</emphasis> <literal>IsString</literal>; and at least one is a numeric class
4086 <emphasis>or</emphasis> <literal>IsString</literal>.
4087 </para></listitem>
4088 </itemizedlist>
4089 </para>
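<para>
The extended defaulting rules can be sketched as follows. This is an
illustration under stated assumptions, not a definitive account of the
type checker: the type <literal>MyString</literal> and the binding
<literal>shown</literal> are hypothetical names chosen for the example.
<programlisting>
{-# LANGUAGE OverloadedStrings #-}
module Main where

import GHC.Exts ( IsString(..) )

-- Hypothetical string-like type, used only for this illustration
newtype MyString = MyString String deriving (Eq, Show)
instance IsString MyString where
    fromString = MyString

-- MyString is an instance of IsString (though not of Num), so under
-- the extended rules it may appear in a default declaration.
default (MyString)

-- The literal's type is constrained only by Show and IsString,
-- so it has to be resolved by defaulting.
shown :: String
shown = show "hello"

main :: IO ()
main = putStrLn shown
</programlisting>
The unresolved constraints here are <literal>Show a</literal> and
<literal>IsString a</literal>: all are standard classes or
<literal>IsString</literal>, and at least one is <literal>IsString</literal>,
so the type variable should default to <literal>MyString</literal> and the
program should print <literal>MyString "hello"</literal>.
</para>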
4090 <para>
4091 A small example:
4092 <programlisting>
4093 module Main where
4094
4095 import GHC.Exts( IsString(..) )
4096
4097 newtype MyString = MyString String deriving (Eq, Show)
4098 instance IsString MyString where
4099     fromString = MyString
4100
4101 greet :: MyString -> MyString
4102 greet "hello" = "world"
4103 greet other = other
4104
4105 main = do
4106     print $ greet "hello"
4107     print $ greet "fool"
4108 </programlisting>
4109 </para>
4110 <para>
4111 Note that deriving <literal>Eq</literal> is necessary for the pattern match
4112 to work, since a string literal pattern is translated into an equality comparison.
4113 </para>
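<para>
The translation of a string literal pattern can be pictured roughly as the
hand-written version below. It is only an approximation of what the compiler
does, written without <option>-XOverloadedStrings</option> so that every call
to <literal>fromString</literal> is visible; it reuses the
<literal>MyString</literal> type from the example above.
<programlisting>
module Main where

import GHC.Exts ( IsString(..) )

newtype MyString = MyString String deriving (Eq, Show)
instance IsString MyString where
    fromString = MyString

-- Approximate meaning of the clause: greet "hello" = "world"
-- The literal pattern becomes a guard that uses (==) from Eq.
greet :: MyString -> MyString
greet s
    | s == fromString "hello" = fromString "world"
greet other = other

main :: IO ()
main = do
    print (greet (MyString "hello"))
    print (greet (MyString "fool"))
</programlisting>
The guard needs <literal>Eq MyString</literal>, which is why the
<literal>deriving</literal> clause cannot omit <literal>Eq</literal>. When the
guard fails, matching falls through to the second clause, so the program
should print <literal>MyString "world"</literal> followed by
<literal>MyString "fool"</literal>.
</para>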
4114 </sect2>
4115
4116 </sect1>
4117
4118 <sect1 id="type-families">
4119 <title>Type families</title>
4120
4121 <para>
4122 <firstterm>Indexed type families</firstterm> are a new GHC extension to
4123 facilitate type-level
4124 programming. Type families are a generalisation of <firstterm>associated
4125 data types</firstterm>
4126 (&ldquo;<ulink url="http://www.cse.unsw.edu.au/~chak/papers/CKPM05.html">Associated
4127 Types with Class</ulink>&rdquo;, M. Chakravarty, G. Keller, S. Peyton Jones,