Big batch of Backpack documentation edits.
[ghc.git] / docs / backpack / backpack-manual.tex
1 \documentclass{article}
2
3 \usepackage{color}
4 \usepackage{fullpage}
5 \usepackage{hyperref}
6
7 \newcommand{\Red}[1]{{\color{red} #1}}
8
9 \title{The Backpack Manual}
10
11 \begin{document}
12
13 \maketitle
14
15 Backpack is a new module system for Haskell, intended to enable
16 ``separate modular development'', where application-writers depend on
17 libraries by programming against abstract interfaces instead of specific
18 implementations. Our goal is to reduce software coupling, and let the
19 type system and compiler assist developers when they are developing
20 large software systems. Backpack was originally described in a POPL'14
21 paper \url{http://plv.mpi-sws.org/backpack/}, but this document is
22 intended to describe the syntax of a language you might actually be able
23 to \emph{write}, i.e., from the perspective of a software developer.
24
25 \paragraph{Examples of Backpack ``in the large''}
26 In the standard practice of large-scale software development today,
27 users organize code into libraries. Here is a very simple example of
28 some Haskell code structured in this way:
29
30 \begin{verbatim}
31 unit p where
32 module A where
33 x = True
34 y = False
35
36 unit q where
37 include p
38 module B where
39 import A
40 b = x
41 \end{verbatim}
42
43 \verb|p| is a reusable unit (or package/library if you like) which
44 provides a single module \verb|A|. \verb|q| depends on \verb|p| by
45 directly including it, bringing all of the modules of \verb|p| into scope for
46 import.
47
48 There is a downside to writing code this way: without editing your
49 source code, you can't actually swap out \verb|p| with another version
50 (maybe \verb|p-2.0|, or perhaps even a version of \verb|p| written by someone else).
51 Traditionally, this is overcome by relying on an extralinguistic mechanism,
52 the \emph{package manager}, which indirects \verb|include p| so that
53 it can refer to some other piece of code.
54
55 Backpack offers a different, in-language way of expressing dependency
56 without committing to an implementation:
57
58 \begin{verbatim}
59 unit q where
60 require p
61 module B where
62 import A
63 b = x
64 \end{verbatim}
65
66 Or, equivalently:
67
68 \begin{verbatim}
69 unit p-interface where
70 signature A where
71 x :: Bool
72 y :: Bool
73 unit q where
74 include p-interface
75 module B where
76 import A
77 b = x
78 \end{verbatim}
79
80 With the \verb|require| keyword, Backpack automatically computes
81 \verb|p-interface|, which expresses the abstract interface of \verb|p|.
82 \verb|q| utilizes a subset of this interface (unused requirements
83 don't ``count''), resulting in this final version of \verb|q|:
84
85 \begin{verbatim}
86 unit q where
87 signature A where
88 x :: Bool
89 module B where
90 import A
91 b = x
92 \end{verbatim}
93
94 The technical innovation of Backpack is that the indefinite version of \verb|q|, which
95 does not depend on a specific implementation of \verb|p|, still composes
96 naturally and in the \emph{same} way that the definite version of
97 \verb|q|. That is to say, you can take a hierarchy of libraries that \verb|include| one
98 another (today's situation) and replace each \verb|include| with a
99 \verb|require|. The result is a reusable set of components with
100 explicitly defined requirements. If the inferred requirements are not
101 general enough, a signature can be written explicitly.
102
103 Finally, the package manager serves a new role: it can be used to select
104 a combination of components which fulfill all the requirements. Unlike
105 dependency resolution with version numbers, with interfaces the package
106 manager can do this \emph{completely} precisely.
107
108 \paragraph{Examples of Backpack ``in the small''}
109 Although Backpack was originally designed for library-scale development,
110 it can also be used for small-scale modular development,
111 similar to ML modules. Here is a simple example of writing an
112 associated list implementation that is parametrized over the key:
113
114 \begin{verbatim}
115 unit assoc-map where
116 signature H where
117 data T
118 eq :: T -> T -> Bool
119 module Assoc where
120 import H
121 mylookup :: T -> [(T, a)] -> Maybe a
122 mylookup x xs = fmap snd (find (eq x) xs)
123 unit main where
124 module AbsInt where
125 type T = Int
126 eq x y = abs x == abs y
127 include assoc-map (Assoc as AbsIntAssoc) requires (T as AbsInt)
128 module Main where
129 import AbsIntAssoc
130 import Prelude
131 main = print (mylookup -1 [(1,"yes"),(-2,"no")])
132 \end{verbatim}
133
134 For example, in Haskell there are many different kinds of ``string''-like
135 data types: \verb|String| (a linked list of characters), \verb|ByteString|
136 (an unstructured raw bytestring) and \verb|Text| (a UTF-8 encoded bytestring).
137 Many libraries, e.g., parsers, need to work on any string-like input which
138 a user might want to give them.
139
140 A more full example of a Backpack version of some real library
141 code can be found in Appendix~\ref{sec:tagstream-example}.
142
143
144
145 \section{The Backpack file}
146
147 \begin{verbatim}
148 packages ::= "{" package_0 ";" ... ";" package_n "}"
149 \end{verbatim}
150
151 A Backpack file consists of a list of named packages.
152 All packages in a Backpack file live in the global namespace.
153 A package defines a collection of modules, exporting some of
154 these modules so that other modules can make use of them via
155 \emph{include}. You can compile a definite package \verb|p| in a Backpack file \verb|foo.bkp|
156 with \verb|ghc --backpack foo.bkp p|; you can type-check an indefinite package by
157 adding \verb|-fno-code|.
158
159 \Red{ToDo: How do you import an external Backpack file?}
160
161 \Red{ToDo: Facility for private packages.}
162
163 \begin{verbatim}
164 package ::= "package" pkgname [pkgexports] "where" pkgbody
165 pkgname ::= /* package name, e.g. containers (no version!) */
166 pkgbody ::= "{" pkgdecl_0 ";" ... ";" pkgdecl_n "}"
167 \end{verbatim}
168
169 A package begins with the keyword \verb|package|, its name, and an
170 optional export specification (e.g., a list of modules to be exposed).
171 The header is followed a list of declarations which define modules,
172 signatures and include other packages.
173
174 \section{Declarations}
175
176 \begin{verbatim}
177 pkgdecl ::= "module" modid [exports] "where" body
178 | "signature" modid [exports] "where" body
179 | "include" pkgname ["as" pkgname] [inclspec]
180 \end{verbatim}
181
182 A package is made up of package declarations, which either introduce a
183 new module implementation, introduces a new module
184 signature, or includes a package from the package environment.
185 The declaration of modules and signatures is exactly as it is in Haskell98,
186 so we don't reprise the grammar here.
187
188 \Red{ToDo: Clarify whether order matters. Both are valid designs, but I think order-less is more user-friendly.}
189
190 We generally don't expect users to place their module source code
191 in package files; thus we provide the following forms to refer to
192 \verb|hs| and \verb|hsig| files living on the file-system:
193
194 \begin{verbatim}
195 pkgdecl ::= "module" modid_0 "=" path_0 "," ... "," modid_n "=" path_n
196 | "signature" modid_0 "=" path_0 "," ... "," modid_n "=" path_n
197 | "module" modid_0 "," ... "," modid_n
198 | "signature" modid_0 "," ... "," modid_n
199 \end{verbatim}
200
201 Thus, \verb|module A = "A.hs"| defines the body of \verb|A| based
202 on the contents of \verb|A.hs| in the package's source directory.
203 When the assignment is omitted, we implicitly refer to the file path
204 created by replacing periods with directory separators and adding
205 an appropriate file extension (thus, we can also write \verb|module A|).
206
207 \begin{verbatim}
208 pkgdecl ::= "source" path
209 \end{verbatim}
210
211 The \verb|source| keyword is another construct which allows us to
212 define modules by simply scanning the path in question. For example,
213 if \verb|src| contains two files, \verb|A.hs| and \verb|B.hsig|,
214 then ``\verb|source "src"|'' is equivalent to
215 ``\verb|module A = "src/A.hs"; signature B = "src/B.hsig"|''.
216
217 \Red{ToDo: Allow defining package-wide module imports, which propagate to all inline
218 modules and signatures.}
219
220 \Red{ToDo: Allow defining anonymous modules with bare type/expression declarations.}
221
222 \section{Signatures}
223
224 A signature, denoted with \verb|signature| in a Backpack file and a file
225 with the \verb|hsig| extension on the file system, represents a (type) signature for a
226 Haskell module. It can contain type signatures, data declarations, type
227 classes, type class instances and reexports, but it cannot contain any
228 value definitions.\footnote{Signatures are the backbone of the Backpack
229 module system. A signature can be used to type-check client code which
230 uses a module (without the module implementation), or to verify that an
231 implementation upholds some signature (without a client
232 implementation.)} Signatures are essentially
233 \texttt{hs-boot} modules which do not support mutual recursion but
234 have no runtime efficiency cost. Here is an example of a module signature
235 representing an abstract map type:
236
237 \begin{verbatim}
238 module Map where
239 type role Map nominal representational
240 data Map k v
241 instance Functor (Map k)
242 empty :: Map k a
243 \end{verbatim}
244
245 \section{Includes and exports}
246
247 \begin{verbatim}
248 pkgdecl ::= "include" pkgname ["as" pkgname] [inclspec]
249
250 inclspec ::= "(" renaming_0 "," ... "," renaming_n [","] ")"
251 [ "requires" "(" renaming_0 "," ... "," renaming_n [","] ")" ]
252
253 renaming ::= modid [ "as" modid ]
254 | "package" pkgname
255 \end{verbatim}
256
257 % Add me later:
258 % | "hiding" "(" modid_0 "," ... "," modid_n [","] ")"
259
260 An include brings the modules and signatures of a package into scope.
261 If these modules/signatures have the same names as other
262 modules/signatures in scope, \emph{mix-in linking} occurs.
263 In particular:
264
265 \begin{itemize}
266 \item Module + module = error (unless they really are the same!)
267 \item Module + signature = the signature is filled in, and is
268 no longer part of the requirements of the package.
269 \item Signature + signature = the signatures are merged together.
270 \end{itemize}
271
272 An include is associated with an optional \verb|inclspec|,
273 which can be to thin the provided modules and rename the provided and
274 required modules of an include. In its simplest mode of use,
275 an \verb|inclspec| is a list of modules to be brought into scope,
276 e.g. \verb|include p (A, B)|. Requirements cannot be hidden, but
277 they can be renamed to provide an implementation (or even to just
278 reexport the requirement under another name.) If a requirement is
279 not mentioned in an explicit requirements list, it is implicitly included
280 (thus, \verb|requires (Hole)| has only a purely documentary effect).
281 It is not valid to rename a provision to a requirement, or a requirement
282 to a provision.
283
284 \begin{verbatim}
285 pkgexports ::= inclspec
286 \end{verbatim}
287
288 An export, symmetrically, specifies what modules a package will bring
289 into scope if it is included without any \verb|inclspec|. Any module
290 which is omitted from an explicit export list is not exposed (however,
291 like before, requirements cannot be hidden.)
292
293 When an explicit export list is omitted, you can calculate the provides
294 and requires of a package as follows:
295
296 \begin{itemize}
297 \item A package provides any non-included modules and signatures.
298 (It only provides an included module/signature if it is explicitly
299 reexported.)
300 \item A package requires any transitively reachable signatures or
301 hole signatures which are not filled in with an implementation.
302 \end{itemize}
303
304 \Red{ToDo: Properly describe ``hole signatures'' in the declarations section}
305
306 \subsection{Requirements}
307
308 The fact that requirements are \emph{implicitly} propagated from package
309 to package can result in some spooky ``action at a distance''. However,
310 this implicit behavior is one of the key ingredients to making mix-in
311 modular development scale: you don't want to have to explicitly link
312 everything up, as you might have to do in a traditional ML module
313 system.
314
315 You cannot, however, import a requirement, unless it is also provided,
316 which helps increase encapsulation. If a package provides a
317 module, it can be imported:
318
319 \begin{verbatim}
320 package p (A) requires (A) where
321 signature A where
322 x :: Bool
323 package q (B) requires (A) where
324 include p
325 module B where
326 import A -- OK
327 \end{verbatim}
328
329 If it does not, it cannot be imported:
330 \Red{Alternately, the import is OK but doesn't result in any identifiers
331 being brought into scope.}
332
333 \begin{verbatim}
334 package p () requires (A) where -- yes, this is kind of pointless
335 signature A where
336 x :: Bool
337 package q (B) requires (A) where
338 include p
339 module B where
340 import A -- ERROR!
341 \end{verbatim}
342
343 This means that it is always safe for a package to remove requirements
344 or weaken holes; clients will always continue to compile.
345
346 Of course, if there is a different signature for the hole in scope, the
347 import is not an error; however, no declarations from \verb|p| are in scope:
348
349 \begin{verbatim}
350 package p () requires (A) where
351 signature A where
352 x :: Bool
353 package q (B) requires (A) where
354 include p
355 signature A where
356 y :: Bool
357 module B where
358 import A
359 x' = x -- ERROR!
360 y' = y -- OK
361 \end{verbatim}
362
363 To summarize, requirements are part of the interface of a package; however,
364 they provide no identifiers as far as imports are concerned. \Red{There is
365 some subtle interaction with requirements and shaping; see Shaping by example
366 for more details.}
367
368 \subsection{Package includes/exports}
369
370 A package export is easy enough to explain by analogy of module exports
371 in Haskell: a \verb|package p| in an export list explicitly reexports
372 the identifiers from that package; whereas even a default, wildcard export
373 list would only export locally defined identifiers/modules.
374
375 For example, this module exports the modules of both \verb|base|
376 and \verb|array|.
377
378 \begin{verbatim}
379 package combined(package base, package array) where
380 include base
381 include array
382 \end{verbatim}
383
384 However, in Backpack, a package may be included multiple times, making
385 such declarations ambiguous. Thus, a package can be included as a local
386 package name to disambiguate:
387
388 \begin{verbatim}
389 package p(package q1) where -- equivalent to B1
390 include impls (A1, A2)
391 include q as q1 (hole A as A1, B as B1)
392 include q as q2 (hole A as A2, B as B2)
393 \end{verbatim}
394
395 A package include, e.g. \verb|include a (package p)| is only valid if
396 \verb|a| exports the package \verb|p| explicitly.\footnote{It's probably
397 possible to use anonymous packages to allow easily dividing a package
398 into subpackages, but this is silly and you can always just put it
399 in an actual package.}
400
401 \section{(Transparent) signature ascription}
402
403 \begin{verbatim}
404 inclspec ::= ...
405 | "::" pkgexp
406
407 pkgexp ::= pkgname
408 | "package" [exports] "where" pkgbody
409 \end{verbatim}
410
411 Signature ascription subsumes thinning: it narrows the exports
412 of modules in a package to those specified by a signature
413 package. This package \verb|pkgexp| is specified with either
414 a reference to a named package or an \emph{anonymous package}
415 (in prior work, these have been referred to as \emph{units},
416 although here the distinction is not necessary as our system
417 is \emph{purely applicative}).
418
419 Ascription also imposes a \emph{requirement} on the package
420 being abscribed. Suppose you have \verb|p :: psig|, then:
421
422 \begin{itemize}
423 \item Everything provided \verb|psig| must also
424 be provided by \verb|p|.
425 \item Everything required by \verb|p| must also
426 be required by \verb|psig|.
427 \end{itemize}
428
429 \Red{Alternately, the second requirement is not necessary, and you
430 calculate the new requirements by taking the requirements of \texttt{psig},
431 removing the provides of \texttt{p}, and then adding the requirements of \texttt{p}.
432 This makes it possible to ascribe includes for \emph{adapter} packages, which
433 provide some modules given a different set of requirements.}
434
435 Semantically, ascription replaces the module with a signature,
436 type-checks the package against the signature, and then \emph{post
437 facto} links the signature against the implementation.
438 An ascribed include can be replaced with the signature
439 it is ascribed with, resulting in a package which still typechecks
440 but has more holes. \Red{You have to link at the VERY END, because
441 if you link immediately after processing the module with the
442 ascribed include, the module identities will leak. Of course, if
443 we're compiling we just link eagerly. But now this means that
444 if you have a definite package which uses ascription, even assuming
445 all packages in the environment type-check, you must still type-check
446 this package twice, once indefinitely and then with the actual
447 linking relationship.}
448
449 For example, ascription in the export specification thins out all
450 private identifiers from the package:
451
452 \begin{verbatim}
453 package psig where
454 signature A where
455 public :: Bool
456 package p :: psig where
457 module A.Internal where
458 not_exported = 0
459 module A where
460 public = True
461 private = False
462 \end{verbatim}
463
464 and, symmetrically, ascription in an include hides identifiers:
465
466 \begin{verbatim}
467 package psig where
468 signature A where
469 public :: Bool
470 package p where
471 module A where
472 public = True
473 private = False
474 package q where
475 include p :: psig
476 module B where
477 import A
478 ... public ... -- OK
479 ... private ... -- ERROR
480 \end{verbatim}
481
482 \Red{Observation: thinning is subsumed by transparent signature ascription, but NOT renaming. Thus RENAMING does not commute across signature ascription; you must do it either before or after. Syntax for this is tricky.}
483
484 \paragraph{Syntactic sugar for anonymous packages}
485
486 \begin{verbatim}
487 pkgexp ::= pkgbody
488 | path
489 \end{verbatim}
490
491 It may be useful to provide two forms of sugar for specifying anonymous packages:
492 \verb|pkgbody| is equivalent to \verb|package where pkgbody|; and \verb|"path"|
493 is equivalent to \verb|package where source "path"|.
494
495 \appendix
496 \section{Full grammar}
497
498 \begin{verbatim}
499 packages ::= "{" package_0 ";" ... ";" package_n "}"
500
501 package ::= "package" pkgname [pkgexports] "where" pkgbody
502 pkgname ::= /* package name, e.g. containers (no version!) */
503 pkgbody ::= "{" pkgdecl_0 ";" ... ";" pkgdecl_n "}"
504
505 pkgdecl ::= "module" modid [exports] "where" body
506 | "signature" modid [exports] "where" body
507 | "include" pkgname ["as" pkgname] [inclspec]
508 | "module" modid_0 "=" path_0 "," ... "," modid_n "=" path_n
509 | "signature" modid_0 "=" path_0 "," ... "," modid_n "=" path_n
510 | "module" modid_0 "," ... "," modid_n
511 | "signature" modid_0 "," ... "," modid_n
512 | "source" path
513
514 inclspec ::= "(" renaming_0 "," ... "," renaming_n [","] ")"
515 [ "requires" "(" renaming_0 "," ... "," renaming_n [","] ")" ]
516 | "::" pkgexp
517 pkgexports ::= inclspec
518
519 renaming ::= modid [ "as" modid ]
520 | "package" pkgname
521
522 pkgexp ::= pkgname
523 | "package" [exports] "where" pkgbody
524 | pkgbody
525 | path
526 \end{verbatim}
527
528 \newpage
529 \label{sec:tagstream-example}
530 \section{\texttt{tagstream-conduit} example}
531
532 When someone recently asked on Haskell Reddit what the ``precise problem Backpack
533 addresses'' was, Chris Doner offered a nice example from the
534 \verb|tagstream-conduit|. The existing implementation, at \url{http://hackage.haskell.org/package/tagstream-conduit-0.5.5.1/docs/Text-HTML-TagStream-Entities.html}, uses a data type
535 to define a ``module'' which is then used to implement two modules in the
536 library, a variant for \verb|ByteString| and a variant for \verb|Text|.
537
538 Here is how this would be done in Backpack:
539
540 \Red{This still contains some syntax which I haven't fully explained.}
541
542 \begin{verbatim}
543 -- | This is an ordinary module which defines some types
544 -- which are exported by the package.
545 module Text.HTML.TagStream.Types where
546 data Token' s = Text s
547 | ...
548
549 -- | This provides a 'Decoder' implementation for 'ByteString's.
550 -- We define 'Str' to be 'ByteString' and implement a few
551 -- functions manually, as well as reexport existing functions
552 -- from some libraries. We don't plan to publically export
553 -- these from the package.
554 module Decoder.ByteString (
555 module Decoder.ByteString,
556 Builder, break, drop, uncons
557 ) where
558 import Data.ByteString
559 import Data.ByteString.Builder
560 type Str = ByteString
561 toStr = toByteString
562 fromStr = fromByteString
563 decodeEntity = ...
564
565 -- | This provides a 'Decoder' implementation for 'Text', much
566 -- the same as 'Decoder.ByteString'.
567 module Decoder.Text (
568 module Decoder.Text,
569 Builder, break, drop, uncons
570 ) where
571 import Data.Text
572 import Data.Text.Lazy.Builder
573 type Str = Text
574 toStr = toText
575 fromStr = fromText
576 decodeEntity = ...
577
578 -- | This defines the "functor"; the module implementing entity
579 -- decoding, 'Entities', is parametrized by an abstract interface
580 -- 'Decoder' which describes two types, 'Str' and 'Builder', as
581 -- well as operations on them which are avaiable for the implementation.
582 unit decoder where
583 signature Decoder where
584 data Str
585 data Builder
586 toStr :: Builder -> Str
587 break :: (Char -> Bool) -> Str -> (Str, Str)
588 fromStr :: Str -> Builder
589 drop :: Int -> Str -> Str
590 decodeEntity :: Str -> Maybe Str
591 uncons :: Str -> Maybe (Char, Str)
592 module Entities where
593 import Text.HTML.TagStream.Types
594 import Data.Conduit
595 import Decoder
596 decodeEntities :: Monad m => Conduit (Token' Str) m (Token' Str)
597 decodeEntities = ...
598
599 -- | Finally, these two lines instantiate 'Entities' twice;
600 -- once with 'Decoder.ByteString', and once as 'Decoder.Text'.
601 include decoder (Entities as Text.HTML.TagStream.ByteString)
602 requires (Decoder as Decoder.ByteString)
603 include decoder (Entities as Text.HTML.TagStream.Text)
604 requires (Decoder as Decoder.Text)
605 \end{verbatim}
606
607 Without having the source-code inline, the out-of-line Backpack file
608 would look something like this:
609
610 \begin{verbatim}
611 module Text.HTML.TagStream.Types -- Text/HTML/TagStream/Types.hs
612 module Decoder.ByteString -- Decoder/ByteString.hs
613 module Decoder.Text -- Decoder/Text.hs
614 unit decoder where
615 signature Decoder -- decoder/Decoder.hsig
616 module Entities -- decoder/Entities.hs
617 include decoder (Entities as Text.HTML.TagStream.ByteString)
618 requires (Decoder as Decoder.ByteString)
619 include decoder (Entities as Text.HTML.TagStream.Text)
620 requires (Decoder as Decoder.Text)
621 \end{verbatim}
622
623 \end{document}