1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
3
4 <sect1 id="primitives">
5 <title>Unboxed types and primitive operations</title>
6 <indexterm><primary>GHC.Exts module</primary></indexterm>
7
8 <para>This chapter defines all the types which are primitive in
9 Glasgow Haskell, and the operations provided for them. You bring
10 them into scope by importing module <literal>GHC.Exts</literal>.</para>
11
12 <para>Note: while you really can use this stuff to write fast code,
13 we generally find it a lot less painful, and more satisfying in the
14 long run, to use higher-level language features and libraries. With
15 any luck, the code you write will be optimised to the efficient
16 unboxed version in any case. And if it isn't, we'd like to know
17 about it.</para>
18
19 <sect2 id="glasgow-unboxed">
20 <title>Unboxed types
21 </title>
22
23 <para>
24 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
25 </para>
26
27 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
28 that values of that type are represented by a pointer to a heap
29 object. The representation of a Haskell <literal>Int</literal>, for
30 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
type, however, is represented by the value itself; no pointers or heap
allocation are involved.
33 </para>
34
35 <para>
36 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
37 would use in C: <literal>Int&num;</literal> (long int),
38 <literal>Double&num;</literal> (double), <literal>Addr&num;</literal>
39 (void *), etc. The <emphasis>primitive operations</emphasis>
40 (PrimOps) on these types are what you might expect; e.g.,
41 <literal>(+&num;)</literal> is addition on
42 <literal>Int&num;</literal>s, and is the machine-addition that we all
43 know and love&mdash;usually one instruction.
44 </para>
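
<para>
As a rough illustration of the boxed/unboxed distinction, the sketch
below adds two ordinary <literal>Int</literal>s by pattern matching on
the <literal>I&num;</literal> constructor and applying
<literal>(+&num;)</literal> to the raw values. The name
<function>plusInt</function> is made up for this example, and we
assume a compiler that accepts the <literal>MagicHash</literal>
language pragma; with older compilers the
<option>-fglasgow-exts</option> flag should play the same role.
</para>

<programlisting>
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Int (I#), (+#))

-- Unwrap two boxed Ints, add the raw machine integers with (+#),
-- and box the result again. (plusInt is a hypothetical name.)
plusInt :: Int -> Int -> Int
plusInt (I# x) (I# y) = I# (x +# y)

main :: IO ()
main = print (plusInt 2 3)   -- prints 5
</programlisting>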
45
46 <para>
47 Primitive (unboxed) types cannot be defined in Haskell, and are
48 therefore built into the language and compiler. Primitive types are
49 always unlifted; that is, a value of a primitive type cannot be
50 bottom. We use the convention that primitive types, values, and
51 operations have a <literal>&num;</literal> suffix.
52 </para>
53
54 <para>
55 Primitive values are often represented by a simple bit-pattern, such
56 as <literal>Int&num;</literal>, <literal>Float&num;</literal>,
57 <literal>Double&num;</literal>. But this is not necessarily the case:
58 a primitive value might be represented by a pointer to a
59 heap-allocated object. Examples include
60 <literal>Array&num;</literal>, the type of primitive arrays. A
61 primitive array is heap-allocated because it is too big a value to fit
62 in a register, and would be too expensive to copy around; in a sense,
63 it is accidental that it is represented by a pointer. If a pointer
64 represents a primitive value, then it really does point to that value:
65 no unevaluated thunks, no indirections&hellip;nothing can be at the
66 other end of the pointer than the primitive value.
67 </para>
68
69 <para>
70 There are some restrictions on the use of primitive types, the main
71 one being that you can't pass a primitive value to a polymorphic
72 function or store one in a polymorphic data type. This rules out
73 things like <literal>[Int&num;]</literal> (i.e. lists of primitive
74 integers). The reason for this restriction is that polymorphic
75 arguments and constructor fields are assumed to be pointers: if an
76 unboxed integer is stored in one of these, the garbage collector would
77 attempt to follow it, leading to unpredictable space leaks. Or a
78 <function>seq</function> operation on the polymorphic component may
79 attempt to dereference the pointer, with disastrous results. Even
80 worse, the unboxed value might be larger than a pointer
81 (<literal>Double&num;</literal> for instance).
82 </para>
83
84 <para>
Nevertheless, a numerically-intensive program using unboxed types can
86 go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
87 counterpart&mdash;we saw a threefold speedup on one example.
88 </para>
89
90 </sect2>
91
92 <sect2 id="unboxed-tuples">
93 <title>Unboxed Tuples
94 </title>
95
96 <para>
Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>;
they're available by default with <option>-fglasgow-exts</option>. An
unboxed tuple looks like this:
100 </para>
101
102 <para>
103
104 <programlisting>
105 (# e_1, ..., e_n #)
106 </programlisting>
107
108 </para>
109
110 <para>
111 where <literal>e&lowbar;1..e&lowbar;n</literal> are expressions of any
112 type (primitive or non-primitive). The type of an unboxed tuple looks
113 the same.
114 </para>
115
116 <para>
117 Unboxed tuples are used for functions that need to return multiple
118 values, but they avoid the heap allocation normally associated with
119 using fully-fledged tuples. When an unboxed tuple is returned, the
120 components are put directly into registers or on the stack; the
121 unboxed tuple itself does not have a composite representation. Many
122 of the primitive operations listed in this section return unboxed
123 tuples.
124 </para>
125
126 <para>
127 There are some pretty stringent restrictions on the use of unboxed tuples:
128 </para>
129
130 <para>
131
132 <itemizedlist>
133 <listitem>
134
135 <para>
136 Unboxed tuple types are subject to the same restrictions as
137 other unboxed types; i.e. they may not be stored in polymorphic data
138 structures or passed to polymorphic functions.
139
140 </para>
141 </listitem>
142 <listitem>
143
144 <para>
Unboxed tuples may only be constructed as the direct result of
a function, and may only be deconstructed with a <literal>case</literal> expression.
For example, the following are valid:
148
149
150 <programlisting>
151 f x y = (# x+1, y-1 #)
152 g x = case f x x of { (# a, b #) -&#62; a + b }
153 </programlisting>
154
155
156 but the following are invalid:
157
158
159 <programlisting>
160 f x y = g (# x, y #)
161 g (# x, y #) = x + y
162 </programlisting>
163
164
165 </para>
166 </listitem>
167 <listitem>
168
169 <para>
170 No variable can have an unboxed tuple type. This is illegal:
171
172
173 <programlisting>
174 f :: (# Int, Int #) -&#62; (# Int, Int #)
175 f x = x
176 </programlisting>
177
178
179 because <literal>x</literal> has an unboxed tuple type.
180
181 </para>
182 </listitem>
183
184 </itemizedlist>
185
186 </para>
187
188 <para>
189 Note: we may relax some of these restrictions in the future.
190 </para>
191
192 <para>
193 The <literal>IO</literal> and <literal>ST</literal> monads use unboxed
194 tuples to avoid unnecessary allocation during sequences of operations.
195 </para>
196
197 </sect2>
198
199 <sect2>
200 <title>Character and numeric types</title>
201
202 <indexterm><primary>character types, primitive</primary></indexterm>
203 <indexterm><primary>numeric types, primitive</primary></indexterm>
204 <indexterm><primary>integer types, primitive</primary></indexterm>
205 <indexterm><primary>floating point types, primitive</primary></indexterm>
206 <para>
207 There are the following obvious primitive types:
208 </para>
209
210 <programlisting>
211 type Char#
212 type Int#
213 type Word#
214 type Addr#
215 type Float#
216 type Double#
217 type Int64#
218 type Word64#
219 </programlisting>
220
221 <indexterm><primary><literal>Char&num;</literal></primary></indexterm>
222 <indexterm><primary><literal>Int&num;</literal></primary></indexterm>
223 <indexterm><primary><literal>Word&num;</literal></primary></indexterm>
224 <indexterm><primary><literal>Addr&num;</literal></primary></indexterm>
225 <indexterm><primary><literal>Float&num;</literal></primary></indexterm>
226 <indexterm><primary><literal>Double&num;</literal></primary></indexterm>
227 <indexterm><primary><literal>Int64&num;</literal></primary></indexterm>
228 <indexterm><primary><literal>Word64&num;</literal></primary></indexterm>
229
230 <para>
231 If you really want to know their exact equivalents in C, see
232 <filename>ghc/includes/StgTypes.h</filename> in the GHC source tree.
233 </para>
234
235 <para>
236 Literals for these types may be written as follows:
237 </para>
238
239 <para>
240
241 <programlisting>
242 1# an Int#
243 1.2# a Float#
244 1.34## a Double#
245 'a'# a Char#; for weird characters, use e.g. '\o&#60;octal&#62;'#
246 "a"# an Addr# (a `char *'); only characters '\0'..'\255' allowed
247 </programlisting>
248
249 <indexterm><primary>literals, primitive</primary></indexterm>
250 <indexterm><primary>constants, primitive</primary></indexterm>
251 <indexterm><primary>numbers, primitive</primary></indexterm>
252 </para>
253
254 </sect2>
255
256 <sect2>
257 <title>Comparison operations</title>
258
259 <para>
260 <indexterm><primary>comparisons, primitive</primary></indexterm>
261 <indexterm><primary>operators, comparison</primary></indexterm>
262 </para>
263
264 <para>
265
266 <programlisting>
267 {&#62;,&#62;=,==,/=,&#60;,&#60;=}# :: Int# -&#62; Int# -&#62; Bool
268
269 {gt,ge,eq,ne,lt,le}Char# :: Char# -&#62; Char# -&#62; Bool
270 -- ditto for Word# and Addr#
271 </programlisting>
272
273 <indexterm><primary><literal>&#62;&num;</literal></primary></indexterm>
274 <indexterm><primary><literal>&#62;=&num;</literal></primary></indexterm>
275 <indexterm><primary><literal>==&num;</literal></primary></indexterm>
276 <indexterm><primary><literal>/=&num;</literal></primary></indexterm>
277 <indexterm><primary><literal>&#60;&num;</literal></primary></indexterm>
278 <indexterm><primary><literal>&#60;=&num;</literal></primary></indexterm>
279 <indexterm><primary><literal>gt&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
280 <indexterm><primary><literal>ge&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
281 <indexterm><primary><literal>eq&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
282 <indexterm><primary><literal>ne&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
283 <indexterm><primary><literal>lt&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
284 <indexterm><primary><literal>le&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
285 </para>
286
287 </sect2>
288
289 <sect2>
290 <title>Primitive-character operations</title>
291
292 <para>
293 <indexterm><primary>characters, primitive operations</primary></indexterm>
294 <indexterm><primary>operators, primitive character</primary></indexterm>
295 </para>
296
297 <para>
298
299 <programlisting>
300 ord# :: Char# -&#62; Int#
301 chr# :: Int# -&#62; Char#
302 </programlisting>
303
304 <indexterm><primary><literal>ord&num;</literal></primary></indexterm>
305 <indexterm><primary><literal>chr&num;</literal></primary></indexterm>
306 </para>
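
<para>
A small sketch of how these two operations might be combined, under
the same assumptions as elsewhere (the <literal>MagicHash</literal>
pragma, and the <literal>C&num;</literal> and
<literal>I&num;</literal> constructors exported by
<literal>GHC.Exts</literal>). The function
<function>shiftChar</function> is invented for illustration only.
</para>

<programlisting>
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Char (C#), Int (I#), chr#, ord#, (+#))

-- Move a character forward by n code points, working on the raw values.
-- Like all primops, chr# performs no range check on its argument, so
-- the caller must keep the result within the valid character range.
shiftChar :: Int -> Char -> Char
shiftChar (I# n) (C# c) = C# (chr# (ord# c +# n))

main :: IO ()
main = print (shiftChar 1 'a')   -- prints 'b'
</programlisting>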
307
308 </sect2>
309
310 <sect2>
311 <title>Primitive-<literal>Int</literal> operations</title>
312
313 <para>
314 <indexterm><primary>integers, primitive operations</primary></indexterm>
315 <indexterm><primary>operators, primitive integer</primary></indexterm>
316 </para>
317
318 <para>
319
320 <programlisting>
321 {+,-,*,quotInt,remInt,gcdInt}# :: Int# -&#62; Int# -&#62; Int#
322 negateInt# :: Int# -&#62; Int#
323
324 iShiftL#, iShiftRA#, iShiftRL# :: Int# -&#62; Int# -&#62; Int#
325 -- shift left, right arithmetic, right logical
326
327 addIntC#, subIntC#, mulIntC# :: Int# -> Int# -> (# Int#, Int# #)
328 -- add, subtract, multiply with carry
329 </programlisting>
330
331 <indexterm><primary><literal>+&num;</literal></primary></indexterm>
332 <indexterm><primary><literal>-&num;</literal></primary></indexterm>
333 <indexterm><primary><literal>*&num;</literal></primary></indexterm>
334 <indexterm><primary><literal>quotInt&num;</literal></primary></indexterm>
335 <indexterm><primary><literal>remInt&num;</literal></primary></indexterm>
336 <indexterm><primary><literal>gcdInt&num;</literal></primary></indexterm>
337 <indexterm><primary><literal>iShiftL&num;</literal></primary></indexterm>
338 <indexterm><primary><literal>iShiftRA&num;</literal></primary></indexterm>
339 <indexterm><primary><literal>iShiftRL&num;</literal></primary></indexterm>
340 <indexterm><primary><literal>addIntC&num;</literal></primary></indexterm>
341 <indexterm><primary><literal>subIntC&num;</literal></primary></indexterm>
342 <indexterm><primary><literal>mulIntC&num;</literal></primary></indexterm>
343 <indexterm><primary>shift operations, integer</primary></indexterm>
344 </para>
345
346 <para>
347 <emphasis>Note:</emphasis> No error/overflow checking!
348 </para>
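
<para>
Because the plain arithmetic operations do no checking, one way to
detect overflow yourself is to inspect the second component returned
by <literal>addIntC&num;</literal>. The sketch below assumes that this
component is zero when the addition did not overflow and nonzero when
it did; <function>safeAdd</function> is a hypothetical helper, and the
<literal>MagicHash</literal> and <literal>UnboxedTuples</literal>
pragmas are assumed to be available.
</para>

<programlisting>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import GHC.Exts (Int (I#), addIntC#)

-- Add two Ints, returning Nothing if the machine addition overflowed.
-- addIntC# returns the (possibly wrapped) sum and a carry flag.
safeAdd :: Int -> Int -> Maybe Int
safeAdd (I# x) (I# y) =
  case addIntC# x y of
    (# r, c #) ->
      case c of
        0# -> Just (I# r)   -- no overflow: r is the true sum
        _  -> Nothing       -- overflow: r has wrapped around

main :: IO ()
main = do
  print (safeAdd 1 2)          -- prints Just 3
  print (safeAdd maxBound 1)   -- prints Nothing
</programlisting>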
349
350 </sect2>
351
352 <sect2>
353 <title>Primitive-<literal>Double</literal> and <literal>Float</literal> operations</title>
354
355 <para>
356 <indexterm><primary>floating point numbers, primitive</primary></indexterm>
357 <indexterm><primary>operators, primitive floating point</primary></indexterm>
358 </para>
359
360 <para>
361
362 <programlisting>
363 {+,-,*,/}## :: Double# -&#62; Double# -&#62; Double#
364 {&#60;,&#60;=,==,/=,&#62;=,&#62;}## :: Double# -&#62; Double# -&#62; Bool
365 negateDouble# :: Double# -&#62; Double#
366 double2Int# :: Double# -&#62; Int#
367 int2Double# :: Int# -&#62; Double#
368
369 {plus,minus,times,divide}Float# :: Float# -&#62; Float# -&#62; Float#
370 {gt,ge,eq,ne,lt,le}Float# :: Float# -&#62; Float# -&#62; Bool
371 negateFloat# :: Float# -&#62; Float#
372 float2Int# :: Float# -&#62; Int#
373 int2Float# :: Int# -&#62; Float#
374 </programlisting>
375
376 </para>
377
378 <para>
379 <indexterm><primary><literal>+&num;&num;</literal></primary></indexterm>
380 <indexterm><primary><literal>-&num;&num;</literal></primary></indexterm>
381 <indexterm><primary><literal>*&num;&num;</literal></primary></indexterm>
382 <indexterm><primary><literal>/&num;&num;</literal></primary></indexterm>
383 <indexterm><primary><literal>&#60;&num;&num;</literal></primary></indexterm>
384 <indexterm><primary><literal>&#60;=&num;&num;</literal></primary></indexterm>
385 <indexterm><primary><literal>==&num;&num;</literal></primary></indexterm>
<indexterm><primary><literal>/=&num;&num;</literal></primary></indexterm>
387 <indexterm><primary><literal>&#62;=&num;&num;</literal></primary></indexterm>
388 <indexterm><primary><literal>&#62;&num;&num;</literal></primary></indexterm>
389 <indexterm><primary><literal>negateDouble&num;</literal></primary></indexterm>
390 <indexterm><primary><literal>double2Int&num;</literal></primary></indexterm>
391 <indexterm><primary><literal>int2Double&num;</literal></primary></indexterm>
392 </para>
393
394 <para>
395 <indexterm><primary><literal>plusFloat&num;</literal></primary></indexterm>
396 <indexterm><primary><literal>minusFloat&num;</literal></primary></indexterm>
397 <indexterm><primary><literal>timesFloat&num;</literal></primary></indexterm>
398 <indexterm><primary><literal>divideFloat&num;</literal></primary></indexterm>
399 <indexterm><primary><literal>gtFloat&num;</literal></primary></indexterm>
400 <indexterm><primary><literal>geFloat&num;</literal></primary></indexterm>
401 <indexterm><primary><literal>eqFloat&num;</literal></primary></indexterm>
402 <indexterm><primary><literal>neFloat&num;</literal></primary></indexterm>
403 <indexterm><primary><literal>ltFloat&num;</literal></primary></indexterm>
404 <indexterm><primary><literal>leFloat&num;</literal></primary></indexterm>
405 <indexterm><primary><literal>negateFloat&num;</literal></primary></indexterm>
406 <indexterm><primary><literal>float2Int&num;</literal></primary></indexterm>
407 <indexterm><primary><literal>int2Float&num;</literal></primary></indexterm>
408 </para>
409
410 <para>
And a full complement of elementary and trigonometric functions:
412 </para>
413
414 <para>
415
416 <programlisting>
417 expDouble# :: Double# -&#62; Double#
418 logDouble# :: Double# -&#62; Double#
419 sqrtDouble# :: Double# -&#62; Double#
420 sinDouble# :: Double# -&#62; Double#
421 cosDouble# :: Double# -&#62; Double#
422 tanDouble# :: Double# -&#62; Double#
423 asinDouble# :: Double# -&#62; Double#
424 acosDouble# :: Double# -&#62; Double#
425 atanDouble# :: Double# -&#62; Double#
426 sinhDouble# :: Double# -&#62; Double#
427 coshDouble# :: Double# -&#62; Double#
428 tanhDouble# :: Double# -&#62; Double#
429 powerDouble# :: Double# -&#62; Double# -&#62; Double#
430 </programlisting>
431
432 <indexterm><primary>trigonometric functions, primitive</primary></indexterm>
433 </para>
434
435 <para>
436 similarly for <literal>Float&num;</literal>.
437 </para>
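
<para>
To show how the arithmetic and elementary-function primops fit
together, here is a minimal sketch that computes the length of a
hypotenuse on raw <literal>Double&num;</literal> values. It assumes
the <literal>D&num;</literal> constructor exported by
<literal>GHC.Exts</literal> and the <literal>MagicHash</literal>
pragma; <function>hypotenuse</function> is a name chosen for this
example.
</para>

<programlisting>
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Double (D#), sqrtDouble#, (*##), (+##))

-- sqrt (x*x + y*y), computed entirely on unboxed doubles.
-- Only the final result is boxed again with D#.
hypotenuse :: Double -> Double -> Double
hypotenuse (D# x) (D# y) = D# (sqrtDouble# ((x *## x) +## (y *## y)))

main :: IO ()
main = print (hypotenuse 3.0 4.0)   -- prints 5.0
</programlisting>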
438
439 <para>
440 There are two coercion functions for <literal>Float&num;</literal>/<literal>Double&num;</literal>:
441 </para>
442
443 <para>
444
445 <programlisting>
446 float2Double# :: Float# -&#62; Double#
447 double2Float# :: Double# -&#62; Float#
448 </programlisting>
449
450 <indexterm><primary><literal>float2Double&num;</literal></primary></indexterm>
451 <indexterm><primary><literal>double2Float&num;</literal></primary></indexterm>
452 </para>
453
454 <para>
455 The primitive version of <function>decodeDouble</function>
456 (<function>encodeDouble</function> is implemented as an external C
457 function):
458 </para>
459
460 <para>
461
462 <programlisting>
463 decodeDouble# :: Double# -&#62; PrelNum.ReturnIntAndGMP
464 </programlisting>
465
466 <indexterm><primary><literal>encodeDouble&num;</literal></primary></indexterm>
467 <indexterm><primary><literal>decodeDouble&num;</literal></primary></indexterm>
468 </para>
469
470 <para>
471 (And the same for <literal>Float&num;</literal>s.)
472 </para>
473
474 </sect2>
475
476 <sect2 id="integer-operations">
477 <title>Operations on/for <literal>Integers</literal> (interface to GMP)
478 </title>
479
480 <para>
481 <indexterm><primary>arbitrary precision integers</primary></indexterm>
482 <indexterm><primary>Integer, operations on</primary></indexterm>
483 </para>
484
485 <para>
486 We implement <literal>Integers</literal> (arbitrary-precision
487 integers) using the GNU multiple-precision (GMP) package (version
488 2.0.2).
489 </para>
490
491 <para>
492 The data type for <literal>Integer</literal> is either a small
493 integer, represented by an <literal>Int</literal>, or a large integer
494 represented using the pieces required by GMP's
495 <literal>MP&lowbar;INT</literal> in <filename>gmp.h</filename> (see
496 <filename>gmp.info</filename> in
497 <filename>ghc/includes/runtime/gmp</filename>). It comes out as:
498 </para>
499
500 <para>
501
502 <programlisting>
503 data Integer = S# Int# -- small integers
504 | J# Int# ByteArray# -- large integers
505 </programlisting>
506
507 <indexterm><primary>Integer type</primary></indexterm> The primitive
508 ops to support large <literal>Integers</literal> use the
509 &ldquo;pieces&rdquo; of the representation, and are as follows:
510 </para>
511
512 <para>
513
514 <programlisting>
515 negateInteger# :: Int# -&#62; ByteArray# -&#62; Integer
516
517 {plus,minus,times}Integer#, gcdInteger#,
518 quotInteger#, remInteger#, divExactInteger#
519 :: Int# -> ByteArray#
520 -> Int# -> ByteArray#
521 -> (# Int#, ByteArray# #)
522
523 cmpInteger#
524 :: Int# -> ByteArray#
525 -> Int# -> ByteArray#
526 -> Int# -- -1 for &#60;; 0 for ==; +1 for >
527
528 cmpIntegerInt#
529 :: Int# -> ByteArray#
530 -> Int#
531 -> Int# -- -1 for &#60;; 0 for ==; +1 for >
532
gcdIntegerInt#
534 :: Int# -> ByteArray#
535 -> Int#
536 -> Int#
537
538 divModInteger#, quotRemInteger#
539 :: Int# -> ByteArray#
540 -> Int# -> ByteArray#
541 -> (# Int#, ByteArray#,
542 Int#, ByteArray# #)
543
544 integer2Int# :: Int# -> ByteArray# -> Int#
545
546 int2Integer# :: Int# -> Integer -- NB: no error-checking on these two!
547 word2Integer# :: Word# -> Integer
548
549 addr2Integer# :: Addr# -> Integer
550 -- the Addr# is taken to be a `char *' string
551 -- to be converted into an Integer.
552 </programlisting>
553
554 <indexterm><primary><literal>negateInteger&num;</literal></primary></indexterm>
555 <indexterm><primary><literal>plusInteger&num;</literal></primary></indexterm>
556 <indexterm><primary><literal>minusInteger&num;</literal></primary></indexterm>
557 <indexterm><primary><literal>timesInteger&num;</literal></primary></indexterm>
558 <indexterm><primary><literal>quotInteger&num;</literal></primary></indexterm>
559 <indexterm><primary><literal>remInteger&num;</literal></primary></indexterm>
560 <indexterm><primary><literal>gcdInteger&num;</literal></primary></indexterm>
561 <indexterm><primary><literal>gcdIntegerInt&num;</literal></primary></indexterm>
562 <indexterm><primary><literal>divExactInteger&num;</literal></primary></indexterm>
563 <indexterm><primary><literal>cmpInteger&num;</literal></primary></indexterm>
564 <indexterm><primary><literal>divModInteger&num;</literal></primary></indexterm>
565 <indexterm><primary><literal>quotRemInteger&num;</literal></primary></indexterm>
566 <indexterm><primary><literal>integer2Int&num;</literal></primary></indexterm>
567 <indexterm><primary><literal>int2Integer&num;</literal></primary></indexterm>
568 <indexterm><primary><literal>word2Integer&num;</literal></primary></indexterm>
569 <indexterm><primary><literal>addr2Integer&num;</literal></primary></indexterm>
570 </para>
571
572 </sect2>
573
574 <sect2>
575 <title>Words and addresses</title>
576
577 <para>
578 <indexterm><primary>word, primitive type</primary></indexterm>
579 <indexterm><primary>address, primitive type</primary></indexterm>
580 <indexterm><primary>unsigned integer, primitive type</primary></indexterm>
581 <indexterm><primary>pointer, primitive type</primary></indexterm>
582 </para>
583
584 <para>
A <literal>Word&num;</literal> is used for bit-twiddling operations.
It is the same size as an <literal>Int&num;</literal>, but has neither
a sign nor the usual arithmetic operations.
588
589 <programlisting>
590 type Word# -- Same size/etc as Int# but *unsigned*
591 type Addr# -- A pointer from outside the "Haskell world" (from C, probably);
592 -- described under "arrays"
593 </programlisting>
594
595 <indexterm><primary><literal>Word&num;</literal></primary></indexterm>
596 <indexterm><primary><literal>Addr&num;</literal></primary></indexterm>
597 </para>
598
599 <para>
600 <literal>Word&num;</literal>s and <literal>Addr&num;</literal>s have
601 the usual comparison operations. Other
602 unboxed-<literal>Word</literal> ops (bit-twiddling and coercions):
603 </para>
604
605 <para>
606
607 <programlisting>
608 {gt,ge,eq,ne,lt,le}Word# :: Word# -> Word# -> Bool
609
610 and#, or#, xor# :: Word# -> Word# -> Word#
611 -- standard bit ops.
612
613 quotWord#, remWord# :: Word# -> Word# -> Word#
614 -- word (i.e. unsigned) versions are different from int
615 -- versions, so we have to provide these explicitly.
616
617 not# :: Word# -> Word#
618
619 shiftL#, shiftRL# :: Word# -> Int# -> Word#
620 -- shift left, right logical
621
622 int2Word# :: Int# -> Word# -- just a cast, really
623 word2Int# :: Word# -> Int#
624 </programlisting>
625
626 <indexterm><primary>bit operations, Word and Addr</primary></indexterm>
627 <indexterm><primary><literal>gtWord&num;</literal></primary></indexterm>
628 <indexterm><primary><literal>geWord&num;</literal></primary></indexterm>
629 <indexterm><primary><literal>eqWord&num;</literal></primary></indexterm>
630 <indexterm><primary><literal>neWord&num;</literal></primary></indexterm>
631 <indexterm><primary><literal>ltWord&num;</literal></primary></indexterm>
632 <indexterm><primary><literal>leWord&num;</literal></primary></indexterm>
633 <indexterm><primary><literal>and&num;</literal></primary></indexterm>
634 <indexterm><primary><literal>or&num;</literal></primary></indexterm>
635 <indexterm><primary><literal>xor&num;</literal></primary></indexterm>
636 <indexterm><primary><literal>not&num;</literal></primary></indexterm>
637 <indexterm><primary><literal>quotWord&num;</literal></primary></indexterm>
638 <indexterm><primary><literal>remWord&num;</literal></primary></indexterm>
639 <indexterm><primary><literal>shiftL&num;</literal></primary></indexterm>
640 <indexterm><primary><literal>shiftRA&num;</literal></primary></indexterm>
641 <indexterm><primary><literal>shiftRL&num;</literal></primary></indexterm>
642 <indexterm><primary><literal>int2Word&num;</literal></primary></indexterm>
643 <indexterm><primary><literal>word2Int&num;</literal></primary></indexterm>
644 </para>
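
<para>
The bit-twiddling operations above might be combined as in the
following sketch, which extracts a single byte from a
<literal>Word</literal>. The helper <function>byteAt</function> is
hypothetical; the <literal>W&num;</literal> constructor and the
<literal>MagicHash</literal> pragma are assumed as before. Note that
the shift is not checked: the caller must keep the shift distance
smaller than the word size.
</para>

<programlisting>
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Int (I#), Word (W#), and#, int2Word#, shiftRL#, (*#))

-- Extract byte number i of a Word, where byte 0 is the least
-- significant. The mask 255 is built with int2Word#, which is
-- just a cast from Int# to Word#.
byteAt :: Int -> Word -> Word
byteAt (I# i) (W# w) =
  W# ((w `shiftRL#` (i *# 8#)) `and#` int2Word# 255#)

main :: IO ()
main = print (byteAt 1 0xABCD)   -- prints 171 (that is, 0xAB)
</programlisting>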
645
646 <para>
647 Unboxed-<literal>Addr</literal> ops (C casts, really):
648
649 <programlisting>
650 {gt,ge,eq,ne,lt,le}Addr# :: Addr# -> Addr# -> Bool
651
652 int2Addr# :: Int# -> Addr#
653 addr2Int# :: Addr# -> Int#
654 addr2Integer# :: Addr# -> (# Int#, ByteArray# #)
655 </programlisting>
656
657 <indexterm><primary><literal>gtAddr&num;</literal></primary></indexterm>
658 <indexterm><primary><literal>geAddr&num;</literal></primary></indexterm>
659 <indexterm><primary><literal>eqAddr&num;</literal></primary></indexterm>
660 <indexterm><primary><literal>neAddr&num;</literal></primary></indexterm>
661 <indexterm><primary><literal>ltAddr&num;</literal></primary></indexterm>
662 <indexterm><primary><literal>leAddr&num;</literal></primary></indexterm>
663 <indexterm><primary><literal>int2Addr&num;</literal></primary></indexterm>
664 <indexterm><primary><literal>addr2Int&num;</literal></primary></indexterm>
665 <indexterm><primary><literal>addr2Integer&num;</literal></primary></indexterm>
666 </para>
667
668 <para>
669 The casts between <literal>Int&num;</literal>,
670 <literal>Word&num;</literal> and <literal>Addr&num;</literal>
671 correspond to null operations at the machine level, but are required
672 to keep the Haskell type checker happy.
673 </para>
674
675 <para>
676 Operations for indexing off of C pointers
677 (<literal>Addr&num;</literal>s) to snatch values are listed under
678 &ldquo;arrays&rdquo;.
679 </para>
680
681 </sect2>
682
683 <sect2>
684 <title>Arrays</title>
685
686 <para>
687 <indexterm><primary>arrays, primitive</primary></indexterm>
688 </para>
689
690 <para>
691 The type <literal>Array&num; elt</literal> is the type of primitive,
692 unpointed arrays of values of type <literal>elt</literal>.
693 </para>
694
695 <para>
696
697 <programlisting>
698 type Array# elt
699 </programlisting>
700
701 <indexterm><primary><literal>Array&num;</literal></primary></indexterm>
702 </para>
703
704 <para>
705 <literal>Array&num;</literal> is more primitive than a Haskell
706 array&mdash;indeed, the Haskell <literal>Array</literal> interface is
707 implemented using <literal>Array&num;</literal>&mdash;in that an
708 <literal>Array&num;</literal> is indexed only by
709 <literal>Int&num;</literal>s, starting at zero. It is also more
710 primitive by virtue of being unboxed. That doesn't mean that it isn't
711 a heap-allocated object&mdash;of course, it is. Rather, being unboxed
712 means that it is represented by a pointer to the array itself, and not
713 to a thunk which will evaluate to the array (or to bottom). The
714 components of an <literal>Array&num;</literal> are themselves boxed.
715 </para>
716
717 <para>
718 The type <literal>ByteArray&num;</literal> is similar to
719 <literal>Array&num;</literal>, except that it contains just a string
720 of (non-pointer) bytes.
721 </para>
722
723 <para>
724
725 <programlisting>
726 type ByteArray#
727 </programlisting>
728
729 <indexterm><primary><literal>ByteArray&num;</literal></primary></indexterm>
730 </para>
731
732 <para>
733 Arrays of these types are useful when a Haskell program wishes to
734 construct a value to pass to a C procedure. It is also possible to use
735 them to build (say) arrays of unboxed characters for internal use in a
736 Haskell program. Given these uses, <literal>ByteArray&num;</literal>
737 is deliberately a bit vague about the type of its components.
738 Operations are provided to extract values of type
739 <literal>Char&num;</literal>, <literal>Int&num;</literal>,
740 <literal>Float&num;</literal>, <literal>Double&num;</literal>, and
741 <literal>Addr&num;</literal> from arbitrary offsets within a
742 <literal>ByteArray&num;</literal>. (For type
<literal>Foo&num;</literal>, the <emphasis>i</emphasis>th offset gets you the <emphasis>i</emphasis>th
<literal>Foo&num;</literal>, not the <literal>Foo&num;</literal> at
byte-position <emphasis>i</emphasis>. Mumble.) (If you want a
746 <literal>Word&num;</literal>, grab an <literal>Int&num;</literal>,
747 then coerce it.)
748 </para>
749
750 <para>
751 Lastly, we have static byte-arrays, of type
<literal>Addr&num;</literal> &lsqb;mentioned previously&rsqb;. (Remember
the duality between arrays and pointers in C.) Arrays of this type
754 are represented by a pointer to an array in the world outside Haskell,
755 so this pointer is not followed by the garbage collector. In other
756 respects they are just like <literal>ByteArray&num;</literal>. They
757 are only needed in order to pass values from C to Haskell.
758 </para>
759
760 </sect2>
761
762 <sect2>
763 <title>Reading and writing</title>
764
765 <para>
766 Primitive arrays are linear, and indexed starting at zero.
767 </para>
768
769 <para>
770 The size and indices of a <literal>ByteArray&num;</literal>, <literal>Addr&num;</literal>, and
771 <literal>MutableByteArray&num;</literal> are all in bytes. It's up to the program to
772 calculate the correct byte offset from the start of the array. This
773 allows a <literal>ByteArray&num;</literal> to contain a mixture of values of different
774 type, which is often needed when preparing data for and unpicking
775 results from C. (Umm&hellip;not true of indices&hellip;WDP 95/09)
776 </para>
777
778 <para>
779 <emphasis>Should we provide some <literal>sizeOfDouble&num;</literal> constants?</emphasis>
780 </para>
781
782 <para>
783 Out-of-range errors on indexing should be caught by the code which
784 uses the primitive operation; the primitive operations themselves do
785 <emphasis>not</emphasis> check for out-of-range indexes. The intention is that the
786 primitive ops compile to one machine instruction or thereabouts.
787 </para>
788
789 <para>
790 We use the terms &ldquo;reading&rdquo; and &ldquo;writing&rdquo; to refer to accessing
<emphasis>mutable</emphasis> arrays (see <xref linkend="sect-mutable"/>), and
792 &ldquo;indexing&rdquo; to refer to reading a value from an <emphasis>immutable</emphasis>
793 array.
794 </para>
795
796 <para>
797 Immutable byte arrays are straightforward to index (all indices are in
798 units of the size of the object being read):
799
800 <programlisting>
801 indexCharArray# :: ByteArray# -> Int# -> Char#
802 indexIntArray# :: ByteArray# -> Int# -> Int#
803 indexAddrArray# :: ByteArray# -> Int# -> Addr#
804 indexFloatArray# :: ByteArray# -> Int# -> Float#
805 indexDoubleArray# :: ByteArray# -> Int# -> Double#
806
807 indexCharOffAddr# :: Addr# -> Int# -> Char#
808 indexIntOffAddr# :: Addr# -> Int# -> Int#
809 indexFloatOffAddr# :: Addr# -> Int# -> Float#
810 indexDoubleOffAddr# :: Addr# -> Int# -> Double#
811 indexAddrOffAddr# :: Addr# -> Int# -> Addr#
812 -- Get an Addr# from an Addr# offset
813 </programlisting>
814
815 <indexterm><primary><literal>indexCharArray&num;</literal></primary></indexterm>
816 <indexterm><primary><literal>indexIntArray&num;</literal></primary></indexterm>
817 <indexterm><primary><literal>indexAddrArray&num;</literal></primary></indexterm>
818 <indexterm><primary><literal>indexFloatArray&num;</literal></primary></indexterm>
819 <indexterm><primary><literal>indexDoubleArray&num;</literal></primary></indexterm>
820 <indexterm><primary><literal>indexCharOffAddr&num;</literal></primary></indexterm>
821 <indexterm><primary><literal>indexIntOffAddr&num;</literal></primary></indexterm>
822 <indexterm><primary><literal>indexFloatOffAddr&num;</literal></primary></indexterm>
823 <indexterm><primary><literal>indexDoubleOffAddr&num;</literal></primary></indexterm>
824 <indexterm><primary><literal>indexAddrOffAddr&num;</literal></primary></indexterm>
825 </para>
826
<para>
The last of these, <function>indexAddrOffAddr&num;</function>, extracts an <literal>Addr&num;</literal> using an offset
from another <literal>Addr&num;</literal>, thereby providing the ability to follow a chain of
C pointers.
</para>
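
<para>
As a minimal sketch of indexing off an address, the fragment below reads
single characters from a primitive string literal, which GHC gives the type
<literal>Addr&num;</literal> when the <literal>MagicHash</literal> extension is
enabled. The function name <literal>charAt</literal> is made up for
illustration, and the extension name assumes a reasonably recent GHC.
</para>

<para>

<programlisting>
{-# LANGUAGE MagicHash #-}
import GHC.Exts

-- Read the character at the given offset from a primitive string literal.
-- No bounds check is performed: the caller must stay inside the literal.
charAt :: Int -> Char
charAt (I# i) = C# (indexCharOffAddr# "hello"# i)
</programlisting>

</para>

<para>
The offset is counted in characters, in line with the rule that indices are
in units of the size of the object being read.
</para>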

<para>
Something a bit more interesting goes on when indexing arrays of boxed
objects, because the result is simply the boxed object. So presumably
it should be entered&mdash;we never usually return an unevaluated
object! This is a pain: primitive ops aren't supposed to do
complicated things like enter objects. The current solution is to
return a single-element unboxed tuple (see <xref linkend="unboxed-tuples"/>).
</para>

<para>

<programlisting>
indexArray# :: Array# elt -> Int# -> (# elt #)
</programlisting>

<indexterm><primary><literal>indexArray&num;</literal></primary></indexterm>
</para>
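
<para>
The effect of the single-element tuple can be sketched as follows. This is an
illustration only: the wrapper type <literal>Arr</literal> and the functions
<literal>replicate3</literal>, <literal>fetchOnly</literal> and
<literal>firstElem</literal> are hypothetical names, and the code assumes a
recent GHC in which <literal>GHC.ST</literal> exports the <literal>ST</literal>
constructor.
</para>

<para>

<programlisting>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.ST (ST (..), runST)

-- Boxed wrapper (hypothetical) so an unlifted Array# can be returned from ST.
data Arr a = Arr (Array# a)

-- Build a three-element array in which every slot holds the same value.
replicate3 :: a -> Arr a
replicate3 x = runST (ST (\s0 ->
  case newArray# 3# x s0 of
    (# s1, marr #) -> case unsafeFreezeArray# marr s1 of
      (# s2, arr #) -> (# s2, Arr arr #)))

-- Matching on the one-element tuple performs the lookup, but does not
-- evaluate the element that was found.
fetchOnly :: Arr a -> ()
fetchOnly (Arr arr) = case indexArray# arr 0# of
  (# _ #) -> ()

-- Returning the component hands the (possibly unevaluated) element back.
firstElem :: Arr a -> a
firstElem (Arr arr) = case indexArray# arr 0# of
  (# x #) -> x
</programlisting>

</para>

<para>
Because the lookup and the evaluation of the element are separate steps,
<literal>fetchOnly</literal> terminates even when the array holds undefined
elements.
</para>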

</sect2>

<sect2>
<title>The state type</title>

<para>
<indexterm><primary><literal>state, primitive type</literal></primary></indexterm>
<indexterm><primary><literal>State&num;</literal></primary></indexterm>
</para>

<para>
The primitive type <literal>State&num;</literal> represents the state of a state
transformer. It is parameterised on the desired type of state, which
serves to keep states from distinct threads distinct from one another.
But the <emphasis>only</emphasis> effect of this parameterisation is in the type
system: all values of type <literal>State&num;</literal> are represented in the same way.
Indeed, they are all represented by nothing at all! The code
generator &ldquo;knows&rdquo; to generate no code, and allocate no registers
etc., for primitive states.
</para>

<para>

<programlisting>
type State# s
</programlisting>

</para>

<para>
The type <literal>GHC.RealWorld</literal> is truly opaque: there are no values defined
of this type, and no operations over it. It is &ldquo;primitive&rdquo; in that
sense, but it is <emphasis>not unlifted!</emphasis> Its only role in life is to be
the type which distinguishes the <literal>IO</literal> state transformer.
</para>

<para>

<programlisting>
data RealWorld
</programlisting>

</para>
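
<para>
To make the phrase &ldquo;<literal>IO</literal> state transformer&rdquo; concrete, here
is a small sketch that sequences an <literal>IO</literal> action twice by threading
the <literal>State&num; RealWorld</literal> token by hand. It assumes a recent GHC in
which <literal>GHC.IO</literal> exports the <literal>IO</literal> constructor; the
names <literal>twice</literal> and <literal>ioRep</literal> are hypothetical.
</para>

<para>

<programlisting>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (State#, RealWorld)
import GHC.IO (IO (..))

-- The function wrapped by an IO action, at its primitive type.
ioRep :: IO a -> State# RealWorld -> (# State# RealWorld, a #)
ioRep (IO m) = m

-- Run the same action twice. Each step consumes one state token and
-- produces the next, which is what fixes the order of the two runs.
twice :: IO a -> IO (a, a)
twice act = IO (\s0 ->
  case ioRep act s0 of
    (# s1, x #) -> case ioRep act s1 of
      (# s2, y #) -> (# s2, (x, y) #))
</programlisting>

</para>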

</sect2>

<sect2>
<title>State of the world</title>

<para>
A single, primitive, value of type <literal>State&num; RealWorld</literal> is provided.
</para>

<para>

<programlisting>
realWorld# :: State# RealWorld
</programlisting>

<indexterm><primary>realWorld&num; state object</primary></indexterm>
</para>

<para>
(Note: in the compiler, this is not a <literal>PrimOp</literal>; it is just a mucho magic
<literal>Id</literal>. It is exported from <literal>GHC</literal>, though.)
</para>

</sect2>

<sect2 id="sect-mutable">
<title>Mutable arrays</title>

<para>
<indexterm><primary>mutable arrays</primary></indexterm>
<indexterm><primary>arrays, mutable</primary></indexterm>
Corresponding to <literal>Array&num;</literal> and <literal>ByteArray&num;</literal>, we have the types of
mutable versions of each. In each case, the representation is a
pointer to a suitable block of (mutable) heap-allocated storage.
</para>

<para>

<programlisting>
type MutableArray# s elt
type MutableByteArray# s
</programlisting>

<indexterm><primary><literal>MutableArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>MutableByteArray&num;</literal></primary></indexterm>
</para>

<sect3>
<title>Allocation</title>

<para>
<indexterm><primary>mutable arrays, allocation</primary></indexterm>
<indexterm><primary>arrays, allocation</primary></indexterm>
<indexterm><primary>allocation, of mutable arrays</primary></indexterm>
</para>

<para>
Mutable arrays can be allocated. Only pointer-arrays are initialised;
arrays of non-pointers are filled in by &ldquo;user code&rdquo; rather than by
the array-allocation primitive. Reason: only the pointer case has to
worry about GC striking with a partly-initialised array.
</para>

<para>

<programlisting>
newArray#       :: Int# -> elt -> State# s -> (# State# s, MutableArray# s elt #)

newCharArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newIntArray#    :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newAddrArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newFloatArray#  :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newDoubleArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
</programlisting>

<indexterm><primary><literal>newArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>newCharArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>newIntArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>newAddrArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>newFloatArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>newDoubleArray&num;</literal></primary></indexterm>
</para>

<para>
The size of a <literal>ByteArray&num;</literal> is given in bytes.
</para>

</sect3>

<sect3>
<title>Reading and writing</title>

<para>
<indexterm><primary>arrays, reading and writing</primary></indexterm>
</para>

<para>

<programlisting>
readArray#        :: MutableArray# s elt -> Int# -> State# s -> (# State# s, elt #)
readCharArray#    :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Char# #)
readIntArray#     :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
readAddrArray#    :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Addr# #)
readFloatArray#   :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Float# #)
readDoubleArray#  :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Double# #)

writeArray#       :: MutableArray# s elt -> Int# -> elt -> State# s -> State# s
writeCharArray#   :: MutableByteArray# s -> Int# -> Char# -> State# s -> State# s
writeIntArray#    :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
writeAddrArray#   :: MutableByteArray# s -> Int# -> Addr# -> State# s -> State# s
writeFloatArray#  :: MutableByteArray# s -> Int# -> Float# -> State# s -> State# s
writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
</programlisting>

<indexterm><primary><literal>readArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>readCharArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>readIntArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>readAddrArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>readFloatArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>readDoubleArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>writeArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>writeCharArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>writeIntArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>writeAddrArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>writeFloatArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>writeDoubleArray&num;</literal></primary></indexterm>
</para>
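
<para>
A minimal sketch of allocating, writing and reading a mutable byte array is
given below. It is an illustration under assumptions rather than a reference:
it targets a recent GHC, where the single allocator
<literal>newByteArray&num;</literal> (size in bytes) is assumed in place of the
per-type allocation primitives, and <literal>sumOfTwo</literal> is a
hypothetical name.
</para>

<para>

<programlisting>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.ST (ST (..), runST)
import Foreign.Storable (sizeOf)

-- Store two Int#s in a fresh byte array, read them back and add them.
-- The allocation size is in bytes; the read/write indices are in Int#s.
sumOfTwo :: Int -> Int -> Int
sumOfTwo (I# a) (I# b) =
  case sizeOf (0 :: Int) of
    I# w -> runST (ST (\s0 ->
      case newByteArray# (2# *# w) s0 of
        (# s1, mba #) -> case writeIntArray# mba 0# a s1 of
          s2 -> case writeIntArray# mba 1# b s2 of
            s3 -> case readIntArray# mba 0# s3 of
              (# s4, x #) -> case readIntArray# mba 1# s4 of
                (# s5, y #) -> (# s5, I# (x +# y) #)))
</programlisting>

</para>

<para>
Both slots are written before either is read, since the array-allocation
primitive does not initialise arrays of non-pointers.
</para>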

</sect3>

<sect3>
<title>Equality</title>

<para>
<indexterm><primary>arrays, testing for equality</primary></indexterm>
</para>

<para>
One can take &ldquo;equality&rdquo; of mutable arrays. What is compared is the
<emphasis>name</emphasis> or reference to the mutable array, not its contents.
</para>

<para>

<programlisting>
sameMutableArray#     :: MutableArray# s elt -> MutableArray# s elt -> Bool
sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
</programlisting>

<indexterm><primary><literal>sameMutableArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>sameMutableByteArray&num;</literal></primary></indexterm>
</para>
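
<para>
The difference between comparing references and comparing contents can be
sketched like this. The sketch assumes a recent GHC, in which
<literal>sameMutableArray&num;</literal> returns an <literal>Int&num;</literal> that is
converted with <literal>isTrue&num;</literal>; the name
<literal>compareRefs</literal> is hypothetical.
</para>

<para>

<programlisting>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.ST (ST (..), runST)

-- Allocate two arrays with identical contents, then compare the first
-- with itself and with the second.
compareRefs :: (Bool, Bool)
compareRefs = runST (ST (\s0 ->
  case newArray# 1# 'x' s0 of
    (# s1, a #) -> case newArray# 1# 'x' s1 of
      (# s2, b #) ->
        (# s2, ( isTrue# (sameMutableArray# a a)
               , isTrue# (sameMutableArray# a b) ) #)))
</programlisting>

</para>

<para>
The first component is <literal>True</literal> and the second is
<literal>False</literal>, even though both arrays hold the same element.
</para>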

</sect3>

<sect3>
<title>Freezing mutable arrays</title>

<para>
<indexterm><primary>arrays, freezing mutable</primary></indexterm>
<indexterm><primary>freezing mutable arrays</primary></indexterm>
<indexterm><primary>mutable arrays, freezing</primary></indexterm>
</para>

<para>
Only unsafe-freeze has a primitive. (Safe freeze is done directly in Haskell
by copying the array and then using <function>unsafeFreeze</function>.)
</para>

<para>

<programlisting>
unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> (# State# s, Array# elt #)
unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> (# State# s, ByteArray# #)
</programlisting>

<indexterm><primary><literal>unsafeFreezeArray&num;</literal></primary></indexterm>
<indexterm><primary><literal>unsafeFreezeByteArray&num;</literal></primary></indexterm>
</para>
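
<para>
The copy-then-freeze recipe might look roughly as follows. This is a sketch,
not the library's actual definition: it assumes a recent GHC that provides
<literal>cloneMutableArray&num;</literal> and
<literal>sizeofMutableArray&num;</literal> for the copying step, and the names
<literal>freezeCopy</literal> and <literal>frozenSurvivesWrite</literal> are
hypothetical.
</para>

<para>

<programlisting>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.ST (ST (..), runST)

-- Copy the whole mutable array, then freeze the private copy in place.
freezeCopy :: MutableArray# s a -> State# s -> (# State# s, Array# a #)
freezeCopy marr s0 =
  case cloneMutableArray# marr 0# (sizeofMutableArray# marr) s0 of
    (# s1, copy #) -> unsafeFreezeArray# copy s1

-- Freeze, then overwrite the original: the frozen copy is unaffected.
frozenSurvivesWrite :: Char
frozenSurvivesWrite = runST (ST (\s0 ->
  case newArray# 1# 'a' s0 of
    (# s1, marr #) -> case freezeCopy marr s1 of
      (# s2, arr #) -> case writeArray# marr 0# 'b' s2 of
        s3 -> case indexArray# arr 0# of
          (# x #) -> (# s3, x #)))
</programlisting>

</para>

<para>
Had <literal>unsafeFreezeArray&num;</literal> been applied to the original array
directly, the later write would have been visible through the supposedly
immutable result.
</para>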

</sect3>

</sect2>

<sect2>
<title>Synchronising variables (M-vars)</title>

<para>
<indexterm><primary>synchronising variables (M-vars)</primary></indexterm>
<indexterm><primary>M-Vars</primary></indexterm>
</para>

<para>
Synchronising variables are the primitive type used to implement
Concurrent Haskell's MVars (see the Concurrent Haskell paper for
the operational behaviour of these operations).
</para>

<para>

<programlisting>
type MVar# s elt        -- primitive

newMVar#  :: State# s -> (# State# s, MVar# s elt #)
takeMVar# :: MVar# s elt -> State# s -> (# State# s, elt #)
putMVar#  :: MVar# s elt -> elt -> State# s -> State# s
</programlisting>

<indexterm><primary><literal>MVar&num;</literal></primary></indexterm>
<indexterm><primary><literal>newMVar&num;</literal></primary></indexterm>
<indexterm><primary><literal>takeMVar&num;</literal></primary></indexterm>
<indexterm><primary><literal>putMVar&num;</literal></primary></indexterm>
</para>
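
<para>
As a small, hedged illustration of these operations, the sketch below creates
an empty M-var, fills it and empties it again inside <literal>IO</literal>. It
assumes a recent GHC, where <literal>putMVar&num;</literal> takes the value to
store as its second argument and <literal>GHC.IO</literal> exports the
<literal>IO</literal> constructor; <literal>roundTrip</literal> is a hypothetical
name.
</para>

<para>

<programlisting>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO (..))

-- Create an empty M-var, put a value into it, and take it out again.
-- The put cannot block because the variable starts out empty, and the
-- take cannot block because the put has just filled it.
roundTrip :: a -> IO a
roundTrip x = IO (\s0 ->
  case newMVar# s0 of
    (# s1, mv #) -> case putMVar# mv x s1 of
      s2 -> takeMVar# mv s2)
</programlisting>

</para>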

</sect2>

<sect2 id="glasgow-prim-arrays">
<title>Primitive arrays, mutable and otherwise</title>

<para>
<indexterm><primary>primitive arrays (Glasgow extension)</primary></indexterm>
<indexterm><primary>arrays, primitive (Glasgow extension)</primary></indexterm>
</para>

<para>
GHC knows about quite a few flavours of Large Swathes of Bytes.
</para>

<para>
First, GHC distinguishes between primitive arrays of (boxed) Haskell
objects (type <literal>Array&num; obj</literal>) and primitive arrays of bytes (type
<literal>ByteArray&num;</literal>).
</para>

<para>
Second, it distinguishes between&hellip;
<variablelist>

<varlistentry>
<term>Immutable:</term>
<listitem>
<para>
Arrays that do not change (as with &ldquo;standard&rdquo; Haskell arrays); you
can only read from them. Obviously, they do not need the care and
attention of the state-transformer monad.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Mutable:</term>
<listitem>
<para>
Arrays that may be changed or &ldquo;mutated.&rdquo; All the operations on them
live within the state-transformer monad and the updates happen
<emphasis>in-place</emphasis>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>&ldquo;Static&rdquo; (in C land):</term>
<listitem>
<para>
A C routine may pass an <literal>Addr&num;</literal> pointer back into Haskell land. There
are then primitive operations with which you may merrily grab values
over in C land, by indexing off the &ldquo;static&rdquo; pointer.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>&ldquo;Stable&rdquo; pointers:</term>
<listitem>
<para>
If, for some reason, you wish to hand a Haskell pointer (i.e.,
<emphasis>not</emphasis> an unboxed value) to a C routine, you first make the
pointer &ldquo;stable,&rdquo; so that the garbage collector won't forget that it
exists. That is, GHC provides a safe way to pass Haskell pointers to
C.
</para>

<para>
Please see the module <literal>Foreign.StablePtr</literal> in the
library documentation for more details.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>&ldquo;Foreign objects&rdquo;:</term>
<listitem>
<para>
A &ldquo;foreign object&rdquo; is a safe way to pass an external object (a
C-allocated pointer, say) to Haskell and have Haskell do the Right
Thing when it no longer references the object. So, for example, C
could pass a large bitmap over to Haskell and say &ldquo;please free this
memory when you're done with it.&rdquo;
</para>

<para>
Please see the module <literal>Foreign.ForeignPtr</literal> in the library
documentation for more details.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>

<para>
The library documentation gives more details on all these
&ldquo;primitive array&rdquo; types and the operations on them.
</para>

</sect2>

</sect1>

<!-- Emacs stuff:
     ;;; Local Variables: ***
     ;;; mode: xml ***
     ;;; sgml-parent-document: ("users_guide.xml" "book" "chapter" "sect1") ***
     ;;; End: ***
 -->