1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <chapter id="bugs-and-infelicities">
3 <title>Known bugs and infelicities</title>
4
5 <sect1 id="vs-Haskell-defn">
6 <title>Haskell&nbsp;standards vs.&nbsp;Glasgow Haskell: language non-compliance
7 </title>
8
9 <indexterm><primary>GHC vs the Haskell standards</primary></indexterm>
10 <indexterm><primary>Haskell standards vs GHC</primary></indexterm>
11
12 <para>
13 This section lists Glasgow Haskell infelicities in its
14 implementation of Haskell&nbsp;98 and Haskell&nbsp;2010.
15 See also the &ldquo;when things go wrong&rdquo; section
16 (<xref linkend="wrong"/>) for information about crashes,
17 space leaks, and other undesirable phenomena.
18 </para>
19
20 <para>
21 The limitations here are listed in Haskell Report order
22 (roughly).
23 </para>
24
25 <sect2 id="haskell-standards-divergence">
26 <title>Divergence from Haskell&nbsp;98 and Haskell&nbsp;2010</title>
27
28 <para>
By default, GHC aims to behave (mostly) like a Haskell&nbsp;2010
30 compiler, although you can tell it to try to behave like a
31 particular version of the language with the
32 <literal>-XHaskell98</literal> and
33 <literal>-XHaskell2010</literal> flags. The known deviations
34 from the standards are described below. Unless otherwise stated,
35 the deviation applies in Haskell&nbsp;98, Haskell&nbsp;2010 and
36 the default modes.
37 </para>
38
39 <sect3 id="infelicities-lexical">
40 <title>Lexical syntax</title>
41
42 <itemizedlist>
43 <listitem>
44 <para>Certain lexical rules regarding qualified identifiers
45 are slightly different in GHC compared to the Haskell
46 report. When you have
47 <replaceable>module</replaceable><literal>.</literal><replaceable>reservedop</replaceable>,
48 such as <literal>M.\</literal>, GHC will interpret it as a
49 single qualified operator rather than the two lexemes
50 <literal>M</literal> and <literal>.\</literal>.</para>
51 </listitem>
52 </itemizedlist>
53 </sect3>
54
55 <sect3 id="infelicities-syntax">
56 <title>Context-free syntax</title>
57
58 <itemizedlist>
59 <listitem>
60 <para>In Haskell&nbsp;98 mode and by default (but not in
61 Haskell&nbsp;2010 mode), GHC is a little less strict about the
62 layout rule when used
63 in <literal>do</literal> expressions. Specifically, the
64 restriction that "a nested context must be indented further to
65 the right than the enclosing context" is relaxed to allow the
66 nested context to be at the same level as the enclosing context,
67 if the enclosing context is a <literal>do</literal>
68 expression.</para>
69
70 <para>For example, the following code is accepted by GHC:
71
72 <programlisting>
73 main = do args &lt;- getArgs
74 if null args then return [] else do
75 ps &lt;- mapM process args
76 mapM print ps</programlisting>
77
78 This behaviour is controlled by the
79 <literal>NondecreasingIndentation</literal> extension.
80 </para>
81 </listitem>
82
83 <listitem>
84 <para>GHC doesn't do the fixity resolution in expressions during
85 parsing as required by Haskell&nbsp;98 (but not by Haskell&nbsp;2010).
86 For example, according to the Haskell&nbsp;98 report, the
87 following expression is legal:
88 <programlisting>
89 let x = 42 in x == 42 == True</programlisting>
90 and parses as:
91 <programlisting>
92 (let x = 42 in x == 42) == True</programlisting>
93
94 because according to the report, the <literal>let</literal>
95 expression <quote>extends as far to the right as
96 possible</quote>. Since it can't extend past the second
97 equals sign without causing a parse error
(<literal>==</literal> is non-associative), the
99 <literal>let</literal>-expression must terminate there. GHC
100 simply gobbles up the whole expression, parsing like this:
101 <programlisting>
102 (let x = 42 in x == 42 == True)</programlisting></para>
103 </listitem>
104 </itemizedlist>
105 </sect3>
106
107 <sect3 id="infelicities-exprs-pats">
108 <title>Expressions and patterns</title>
109
110 <para>In its default mode, GHC makes some programs slightly more defined
111 than they should be. For example, consider
112 <programlisting>
113 f :: [a] -> b -> b
114 f [] = error "urk"
115 f (x:xs) = \v -> v
116
117 main = print (f [] `seq` True)
118 </programlisting>
119 This should call <literal>error</literal> but actually prints <literal>True</literal>.
120 Reason: GHC eta-expands <literal>f</literal> to
121 <programlisting>
122 f :: [a] -> b -> b
123 f [] v = error "urk"
124 f (x:xs) v = v
125 </programlisting>
126 This improves efficiency slightly but significantly for most programs, and
127 is bad for only a few. To suppress this bogus "optimisation" use <option>-fpedantic-bottoms</option>.
128 </para>
129
130 </sect3>
131
132 <sect3 id="infelicities-decls">
133 <title>Declarations and bindings</title>
134
135 <para>In its default mode, GHC does not accept datatype contexts,
136 as it has been decided to remove them from the next version of the
137 language standard. This behaviour can be controlled with the
138 <option>DatatypeContexts</option> extension.
139 See <xref linkend="datatype-contexts" />.</para>
140 </sect3>
141
142 <sect3 id="infelicities-Modules">
143 <title>Module system and interface files</title>
144
145 <para>GHC requires the use of <literal>hs-boot</literal>
146 files to cut the recursive loops among mutually recursive modules
as described in <xref linkend="mutual-recursion"/>. This is more of an infelicity
148 than a bug: the Haskell Report says
149 (<ulink url="http://haskell.org/onlinereport/modules.html#sect5.7">Section 5.7</ulink>) "Depending on the Haskell
150 implementation used, separate compilation of mutually
151 recursive modules may require that imported modules contain
152 additional information so that they may be referenced before
153 they are compiled. Explicit type signatures for all exported
154 values may be necessary to deal with mutual recursion. The
155 precise details of separate compilation are not defined by
156 this Report."
157
158 </para>
159
160 </sect3>
161
162 <sect3 id="infelicities-numbers">
163 <title>Numbers, basic types, and built-in classes</title>
164
165 <variablelist>
166 <varlistentry>
167 <term>Num superclasses</term>
168 <listitem>
169 <para>
170 The <literal>Num</literal> class does not have
171 <literal>Show</literal> or <literal>Eq</literal>
172 superclasses.
173 </para>
174
175 <para>
176 You can make code that works with both
177 Haskell98/Haskell2010 and GHC by:
178 <itemizedlist>
179 <listitem>
180 <para>
181 Whenever you make a <literal>Num</literal> instance
182 of a type, also make <literal>Show</literal> and
183 <literal>Eq</literal> instances, and
184 </para>
185 </listitem>
186 <listitem>
187 <para>
188 Whenever you give a function, instance or class a
189 <literal>Num t</literal> constraint, also give it
190 <literal>Show t</literal> and
191 <literal>Eq t</literal> constraints.
192 </para>
193 </listitem>
194 </itemizedlist>
195 </para>
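<para>
As a minimal sketch of the second rule (the function name
<literal>describe</literal> is hypothetical and chosen only for
illustration), a signature that spells out all three constraints
should be accepted both by GHC and by a compiler that follows the
Report's class hierarchy:
</para>
<programlisting>
-- 'describe' is an illustrative name, not part of any library.
-- Show and Eq are listed explicitly rather than assumed to follow from Num.
describe :: (Num t, Eq t, Show t) => t -> String
describe x
  | x == 0    = "zero"
  | otherwise = "non-zero: " ++ show x

main :: IO ()
main = mapM_ (putStrLn . describe) [0, 3 :: Int]
</programlisting>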
196 </listitem>
197 </varlistentry>
198
199 <varlistentry>
200 <term>Bits superclasses</term>
201 <listitem>
202 <para>
203 The <literal>Bits</literal> class does not have
a <literal>Num</literal> superclass. It therefore
205 does not have default methods for the
206 <literal>bit</literal>,
207 <literal>testBit</literal> and
208 <literal>popCount</literal> methods.
209 </para>
210
211 <para>
212 You can make code that works with both
213 Haskell2010 and GHC by:
214 <itemizedlist>
215 <listitem>
216 <para>
217 Whenever you make a <literal>Bits</literal> instance
218 of a type, also make a <literal>Num</literal>
219 instance, and
220 </para>
221 </listitem>
222 <listitem>
223 <para>
224 Whenever you give a function, instance or class a
225 <literal>Bits t</literal> constraint, also give it
226 a <literal>Num t</literal> constraint, and
227 </para>
228 </listitem>
229 <listitem>
230 <para>
231 Always define the <literal>bit</literal>,
232 <literal>testBit</literal> and
233 <literal>popCount</literal> methods in
234 <literal>Bits</literal> instances.
235 </para>
236 </listitem>
237 </itemizedlist>
238 </para>
239 </listitem>
240 </varlistentry>
241
242 <varlistentry>
243 <term>Extra instances</term>
244 <listitem>
245 <para>
246 The following extra instances are defined:
247 </para>
248 <programlisting>
249 instance Functor ((->) r)
250 instance Monad ((->) r)
251 instance Functor ((,) a)
252 instance Functor (Either a)
253 instance Monad (Either e)
254 </programlisting>
255 </listitem>
256 </varlistentry>
257
258 <varlistentry>
259 <term>Multiply-defined array elements&mdash;not checked:</term>
260 <listitem>
261 <para>This code fragment should
262 elicit a fatal error, but it does not:
263
264 <programlisting>
265 main = print (array (1,1) [(1,2), (1,3)])</programlisting>
266 GHC's implementation of <literal>array</literal> takes the value of an
267 array slot from the last (index,value) pair in the list, and does no
268 checking for duplicates. The reason for this is efficiency, pure and simple.
269 </para>
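<para>
A self-contained version of the fragment might look like the sketch
below. The type annotation and the name <literal>dup</literal> are
assumptions added here so that no defaulting is involved. Given the
behaviour described above, GHC is expected to keep the value from the
last pair:
</para>
<programlisting>
import Data.Array (Array, array, elems)

-- Two pairs target the same index 1; the Report treats this as an error.
dup :: Array Int Int
dup = array (1, 1) [(1, 2), (1, 3)]

main :: IO ()
main = print (elems dup)  -- with GHC: [3], and no error is raised
</programlisting>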
270 </listitem>
271 </varlistentry>
272 </variablelist>
273
274 </sect3>
275
276 <sect3 id="infelicities-Prelude">
277 <title>In <literal>Prelude</literal> support</title>
278
279 <variablelist>
280 <varlistentry>
281 <term>Arbitrary-sized tuples</term>
282 <listitem>
<para>Tuples are currently limited to size 100. However,
standard instances for tuples (<literal>Eq</literal>,
<literal>Ord</literal>, <literal>Bounded</literal>,
<literal>Ix</literal>, <literal>Read</literal>, and
287 <literal>Show</literal>) are available
288 <emphasis>only</emphasis> up to 16-tuples.</para>
289
290 <para>This limitation is easily subvertible, so please ask
291 if you get stuck on it.</para>
292 </listitem>
293 </varlistentry>
294 <varlistentry>
295 <term><literal>splitAt</literal> semantics</term>
<listitem>
<para><literal>Data.List.splitAt</literal> is stricter than specified in the
Report. Specifically, the Report specifies that
<programlisting>splitAt n xs = (take n xs, drop n xs)</programlisting>
which implies that
<programlisting>splitAt undefined undefined = (undefined, undefined)</programlisting>
but GHC's implementation is strict in its first argument, so
<programlisting>splitAt undefined [] = undefined</programlisting>
</para>
</listitem>
304 </varlistentry>
305 <varlistentry>
306 <term><literal>Read</literal>ing integers</term>
307 <listitem>
308 <para>GHC's implementation of the
309 <literal>Read</literal> class for integral types accepts
310 hexadecimal and octal literals (the code in the Haskell
311 98 report doesn't). So, for example,
312 <programlisting>read "0xf00" :: Int</programlisting>
313 works in GHC.</para>
314 <para>A possible reason for this is that <literal>readLitChar</literal> accepts hex and
315 octal escapes, so it seems inconsistent not to do so for integers too.</para>
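<para>
A short sketch of what this accepts. The octal input
<literal>"0o17"</literal> and the names <literal>hexValue</literal>
and <literal>octValue</literal> are extra illustrations that do not
appear in the text above:
</para>
<programlisting>
-- Both literals are parsed by GHC's Read instance for Int.
hexValue, octValue :: Int
hexValue = read "0xf00"  -- 3840
octValue = read "0o17"   -- 15

main :: IO ()
main = print (hexValue, octValue)
</programlisting>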
316 </listitem>
317 </varlistentry>
318
319 <varlistentry>
320 <term><literal>isAlpha</literal></term>
321 <listitem>
322 <para>The Haskell 98 definition of <literal>isAlpha</literal>
323 is:</para>
324
325 <programlisting>isAlpha c = isUpper c || isLower c</programlisting>
326
327 <para>GHC's implementation diverges from the Haskell 98
328 definition in the sense that Unicode alphabetic characters which
329 are neither upper nor lower case will still be identified as
330 alphabetic by <literal>isAlpha</literal>.</para>
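<para>
For instance, the Hebrew letter alef (U+05D0) is a letter with no
case. The sketch below uses that character as an assumed example, and
a hypothetical helper <literal>isAlpha98</literal>, to compare GHC's
<literal>isAlpha</literal> with the Haskell 98 definition:
</para>
<programlisting>
import Data.Char (isAlpha, isLower, isUpper)

-- U+05D0 HEBREW LETTER ALEF: alphabetic, but neither upper nor lower case.
alef :: Char
alef = '\x05D0'

-- The Haskell 98 definition, written out for comparison.
isAlpha98 :: Char -> Bool
isAlpha98 c = isUpper c || isLower c

main :: IO ()
main = print (isAlpha alef, isAlpha98 alef)  -- (True,False)
</programlisting>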
331 </listitem>
332 </varlistentry>
333
334 <varlistentry>
335 <term><literal>hGetContents</literal></term>
336 <listitem>
337 <para>
338 Lazy I/O throws an exception if an error is
339 encountered, in contrast to the Haskell 98 spec which
340 requires that errors are discarded (see Section 21.2.2
341 of the Haskell 98 report). The exception thrown is
342 the usual IO exception that would be thrown if the
343 failing IO operation was performed in the IO monad, and can
be caught by <literal>System.IO.Error.catchIOError</literal>
345 or <literal>Control.Exception.catch</literal>.
346 </para>
347 </listitem>
348 </varlistentry>
349 </variablelist>
350 </sect3>
351
352 <sect3 id="infelicities-ffi">
353 <title>The Foreign Function Interface</title>
354 <variablelist>
355 <varlistentry>
356 <term><literal>hs_init()</literal> not allowed
357 after <literal>hs_exit()</literal></term>
358 <listitem>
359 <para>The FFI spec requires the implementation to support
360 re-initialising itself after being shut down
361 with <literal>hs_exit()</literal>, but GHC does not
362 currently support that.</para>
363 </listitem>
364 </varlistentry>
365 </variablelist>
366 </sect3>
367
368 </sect2>
369
370 <sect2 id="haskell-98-2010-undefined">
371 <title>GHC's interpretation of undefined behaviour in
372 Haskell&nbsp;98 and Haskell&nbsp;2010</title>
373
374 <para>This section documents GHC's take on various issues that are
left undefined or implementation-specific in Haskell&nbsp;98 and Haskell&nbsp;2010.</para>
376
377 <variablelist>
378 <varlistentry>
379 <term>
380 The <literal>Char</literal> type
381 <indexterm><primary><literal>Char</literal></primary><secondary>size of</secondary></indexterm>
382 </term>
383 <listitem>
384 <para>Following the ISO-10646 standard,
385 <literal>maxBound :: Char</literal> in GHC is
386 <literal>0x10FFFF</literal>.</para>
387 </listitem>
388 </varlistentry>
389
390 <varlistentry>
391 <term>
392 Sized integral types
393 <indexterm><primary><literal>Int</literal></primary><secondary>size of</secondary></indexterm>
394 </term>
395 <listitem>
396 <para>In GHC the <literal>Int</literal> type follows the
397 size of an address on the host architecture; in other words
it holds 32 bits on a 32-bit machine, and 64 bits on a
399 64-bit machine.</para>
400
401 <para>Arithmetic on <literal>Int</literal> is unchecked for
402 overflow<indexterm><primary>overflow</primary><secondary><literal>Int</literal></secondary>
403 </indexterm>, so all operations on <literal>Int</literal> happen
404 modulo
405 2<superscript><replaceable>n</replaceable></superscript>
406 where <replaceable>n</replaceable> is the size in bits of
407 the <literal>Int</literal> type.</para>
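<para>
A small sketch of the wrap-around. It relies only on
<literal>minBound</literal> and <literal>maxBound</literal>, so it
should hold on both 32-bit and 64-bit machines; the name
<literal>wrapped</literal> is illustrative:
</para>
<programlisting>
-- Stepping past maxBound wraps round to minBound instead of raising an error.
wrapped :: Int
wrapped = maxBound + 1

main :: IO ()
main = print (wrapped == minBound)  -- True
</programlisting>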
408
409 <para>The <literal>fromInteger</literal><indexterm><primary><literal>fromInteger</literal></primary>
410 </indexterm> function (and hence
411 also <literal>fromIntegral</literal><indexterm><primary><literal>fromIntegral</literal></primary>
412 </indexterm>) is a special case when
413 converting to <literal>Int</literal>. The value of
414 <literal>fromIntegral x :: Int</literal> is given by taking
415 the lower <replaceable>n</replaceable> bits of <literal>(abs
416 x)</literal>, multiplied by the sign of <literal>x</literal>
417 (in 2's complement <replaceable>n</replaceable>-bit
418 arithmetic). This behaviour was chosen so that for example
419 writing <literal>0xffffffff :: Int</literal> preserves the
420 bit-pattern in the resulting <literal>Int</literal>.</para>
421
422
423 <para>Negative literals, such as <literal>-3</literal>, are
424 specified by (a careful reading of) the Haskell Report as
425 meaning <literal>Prelude.negate (Prelude.fromInteger 3)</literal>.
426 So <literal>-2147483648</literal> means <literal>negate (fromInteger 2147483648)</literal>.
Since <literal>fromInteger</literal> takes the lower 32 bits of the representation (on a 32-bit machine),
<literal>fromInteger (2147483648::Integer)</literal>, computed at type <literal>Int</literal>, is
429 <literal>-2147483648::Int</literal>. The <literal>negate</literal> operation then
430 overflows, but it is unchecked, so <literal>negate (-2147483648::Int)</literal> is just
431 <literal>-2147483648</literal>. In short, one can write <literal>minBound::Int</literal> as
432 a literal with the expected meaning (but that is not in general guaranteed).
433 </para>
434
435 <para>The <literal>fromIntegral</literal> function also
436 preserves bit-patterns when converting between the sized
437 integral types (<literal>Int8</literal>,
438 <literal>Int16</literal>, <literal>Int32</literal>,
439 <literal>Int64</literal> and the unsigned
440 <literal>Word</literal> variants), see the modules
441 <literal>Data.Int</literal> and <literal>Data.Word</literal>
442 in the library documentation.</para>
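<para>
The following sketch illustrates the bit-pattern behaviour for a few
conversions. The particular values and names are chosen only for
illustration:
</para>
<programlisting>
import Data.Int (Int8)
import Data.Word (Word8)

allOnes :: Int8
allOnes = fromIntegral (255 :: Word8)   -- bit pattern 11111111, read as -1

backAgain :: Word8
backAgain = fromIntegral allOnes        -- same bits, read as 255

truncated :: Word8
truncated = fromIntegral (300 :: Int)   -- keeps the low 8 bits: 44

main :: IO ()
main = print (allOnes, backAgain, truncated)  -- (-1,255,44)
</programlisting>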
443 </listitem>
444 </varlistentry>
445
446 <varlistentry>
447 <term>Unchecked float arithmetic</term>
448 <listitem>
449 <para>Operations on <literal>Float</literal> and
450 <literal>Double</literal> numbers are
451 <emphasis>unchecked</emphasis> for overflow, underflow, and
other sad occurrences. Note, however, that some
architectures trap floating-point overflow and
loss-of-precision and report a floating-point exception,
probably terminating the
program<indexterm><primary>floating-point
exceptions</primary></indexterm>.</para>
458 </listitem>
459 </varlistentry>
460 </variablelist>
461 </sect2>
462
463 </sect1>
464
465
466 <sect1 id="bugs">
467 <title>Known bugs or infelicities</title>
468
469 <para>The bug tracker lists bugs that have been reported in GHC but not
470 yet fixed: see the <ulink url="http://ghc.haskell.org/trac/ghc/">GHC Trac</ulink>. In addition to those, GHC also has the following known bugs
471 or infelicities. These bugs are more permanent; it is unlikely that
472 any of them will be fixed in the short term.</para>
473
474 <sect2 id="bugs-ghc">
475 <title>Bugs in GHC</title>
476
477 <itemizedlist>
478 <listitem>
479 <para> GHC can warn about non-exhaustive or overlapping
480 patterns (see <xref linkend="options-sanity"/>), and usually
481 does so correctly. But not always. It gets confused by
482 string patterns, and by guards, and can then emit bogus
483 warnings. The entire overlap-check code needs an overhaul
484 really.</para>
485 </listitem>
486
487 <listitem>
488 <para>GHC does not allow you to have a data type with a context
489 that mentions type variables that are not data type parameters.
490 For example:
491 <programlisting>
492 data C a b => T a = MkT a
493 </programlisting>
494 so that <literal>MkT</literal>'s type is
495 <programlisting>
496 MkT :: forall a b. C a b => a -> T a
497 </programlisting>
498 In principle, with a suitable class declaration with a functional dependency,
499 it's possible that this type is not ambiguous; but GHC nevertheless rejects
500 it. The type variables mentioned in the context of the data type declaration must
501 be among the type parameters of the data type.</para>
502 </listitem>
503
504 <listitem>
505 <para>GHC's inliner can be persuaded into non-termination
506 using the standard way to encode recursion via a data type:</para>
507 <programlisting>
508 data U = MkU (U -> Bool)
509
510 russel :: U -> Bool
511 russel u@(MkU p) = not $ p u
512
513 x :: Bool
514 x = russel (MkU russel)
515 </programlisting>
516
517 <para>We have never found another class of programs, other
518 than this contrived one, that makes GHC diverge, and fixing
519 the problem would impose an extra overhead on every
compilation. So the bug remains unfixed. There is more
521 background in <ulink
522 url="http://research.microsoft.com/~simonpj/Papers/inlining/">
523 Secrets of the GHC inliner</ulink>.</para>
524 </listitem>
525
526 <listitem>
527 <para>On 32-bit x86 platforms when using the native code
528 generator, the
529 <option>-fexcess-precision</option><indexterm><primary><option>-fexcess-precision</option></primary></indexterm> option
530 is always on. This means that floating-point calculations are
531 non-deterministic, because depending on how the program is
532 compiled (optimisation settings, for example), certain
533 calculations might be done at 80-bit precision instead of the
534 intended 32-bit or 64-bit precision. Floating-point results
535 may differ when optimisation is turned on. In the worst case,
536 referential transparency is violated, because for example
537 <literal>let x = E1 in E2</literal> can evaluate to a
538 different value than <literal>E2[E1/x]</literal>.</para>
539
540 <para>
541 One workaround is to use the
542 <option>-msse2</option><indexterm><primary><option>-msse2</option></primary></indexterm>
option (see <xref linkend="options-platform" />), which
544 generates code to use the SSE2 instruction set instead of
545 the x87 instruction set. SSE2 code uses the correct
546 precision for all floating-point operations, and so gives
547 deterministic results. However, note that this only works
548 with processors that support SSE2 (Intel Pentium 4 or AMD
549 Athlon 64 and later), which is why the option is not enabled
550 by default. The libraries that come with GHC are probably
551 built without this option, unless you built GHC yourself.
552 </para>
553 </listitem>
554
555 </itemizedlist>
556 </sect2>
557
558 <sect2 id="bugs-ghci">
559 <title>Bugs in GHCi (the interactive GHC)</title>
560 <itemizedlist>
561 <listitem>
562 <para>GHCi does not respect the <literal>default</literal>
563 declaration in the module whose scope you are in. Instead,
564 for expressions typed at the command line, you always get the
565 default default-type behaviour; that is,
<literal>default(Integer,Double)</literal>.</para>
567
568 <para>It would be better for GHCi to record what the default
569 settings in each module are, and use those of the 'current'
570 module (whatever that is).</para>
571 </listitem>
572
573 <listitem>
574 <para>On Windows, there's a GNU ld/BFD bug
575 whereby it emits bogus PE object files that have more than
576 0xffff relocations. When GHCi tries to load a package affected by this
577 bug, you get an error message of the form
578 <screen>
579 Loading package javavm ... linking ... WARNING: Overflown relocation field (# relocs found: 30765)
580 </screen>
581 The last time we looked, this bug still
582 wasn't fixed in the BFD codebase, and there wasn't any
583 noticeable interest in fixing it when we reported the bug
584 back in 2001 or so.
585 </para>
586 <para>The workaround is to split up the .o files that make up
587 your package into two or more .o's, along the lines of
588 how the "base" package does it.</para>
589 </listitem>
590 </itemizedlist>
591 </sect2>
592 </sect1>
593
594 </chapter>
595
596 <!-- Emacs stuff:
597 ;;; Local Variables: ***
598 ;;; sgml-parent-document: ("users_guide.xml" "book" "chapter") ***
599 ;;; End: ***
600 -->