1 \documentclass{article}
3 \usepackage{supertabular}
4 \usepackage{amsmath}
5 \usepackage{amssymb}
6 \usepackage{stmaryrd}
7 \usepackage{xcolor}
8 \usepackage{fullpage}
9 \usepackage{multirow}
11 \newcommand{\ghcfile}[1]{\textsl{#1}}
12 \newcommand{\arraylabel}[1]{\multicolumn{2}{l}{\!\!\!\!\!\!\!\!\!\text{\underline{#1}:}}}
14 \input{CoreOtt}
16 % increase spacing between rules for ease of reading:
17 \renewcommand{\ottusedrule}[1]{$#1$\$1ex]} 19 \setlength{\parindent}{0in} 20 \setlength{\parskip}{1ex} 22 \newcommand{\gram}[1]{\ottgrammartabular{#1\ottinterrule}} 24 \begin{document} 26 \begin{center} 27 \LARGE 28 System FC, as implemented in GHC\footnote{This 29 document was originally prepared by Richard Eisenberg (\texttt{eir@cis.upenn.edu}), 30 but it should be maintained by anyone who edits the functions or data structures 31 mentioned in this file. Please feel free to contact Richard for more information.}\\ 32 \Large\today 33 \end{center} 35 \section{Introduction} 37 There are a number of details elided from this presentation. The goal of the 38 formalism is to aid in reasoning about type safety, and checks that do not 39 work toward this goal were omitted. For example, various scoping checks (other 40 than basic context inclusion) appear in the GHC code but not here. 42 \section{Grammar} 44 \subsection{Metavariables} 46 We will use the following metavariables: 48 \ottmetavars{}\\ 50 \subsection{Literals} 52 Literals do not play a major role, so we leave them abstract: 54 \gram{\ottlit} 56 We also leave abstract the function \coderef{basicTypes/Literal.lhs}{literalType} 57 and the judgment \coderef{coreSyn/CoreLint.lhs}{lintTyLit} (written [[G |-tylit lit : k]]). 59 \subsection{Variables} 61 GHC uses the same datatype to represent term-level variables and type-level 62 variables: 64 \gram{ 65 \ottz 66 } 67 foo 69 \gram{ 70 \ottn 71 } 73 \subsection{Expressions} 75 The datatype that represents expressions: 77 \gram{\otte} 79 There are a few key invariants about expressions: 80 \begin{itemize} 81 \item The right-hand sides of all top-level and recursive [[let]]s 82 must be of lifted type. 83 \item The right-hand side of a non-recursive [[let]] and the argument 84 of an application may be of unlifted type, but only if the expression 85 is ok-for-speculation. See \verb|#let_app_invariant#| in \ghcfile{coreSyn/CoreSyn.lhs}. 86 \item We allow a non-recursive [[let]] for bind a type variable. 87 \item The [[_]] case for a [[case]] must come first. 88 \item The list of case alternatives must be exhaustive. 89 \item Types and coercions can only appear on the right-hand-side of an application. 90 \end{itemize} 92 Bindings for [[let]] statements: 94 \gram{\ottbinding} 96 Case alternatives: 98 \gram{\ottalt} 100 Constructors as used in patterns: 102 \gram{\ottKp} 104 Notes that can be inserted into the AST. We leave these abstract: 106 \gram{\otttick} 108 A program is just a list of bindings: 110 \gram{\ottprogram} 112 \subsection{Types} 114 \gram{\ottt} 116 There are some invariants on types: 117 \begin{itemize} 118 \item The type [[t1]] in the form [[t1 t2]] must not be a type constructor 119 [[T]]. It should be another application or a type variable. 120 \item The form [[T </ ti // i /> ]] (\texttt{TyConApp}) 121 does \emph{not} need to be saturated. 122 \item A saturated application of [[(->) t1 t2]] should be represented as 123 [[t1 -> t2]]. This is a different point in the grammar, not just pretty-printing. 124 The constructor for a saturated [[(->)]] is \texttt{FunTy}. 125 \item A type-level literal is represented in GHC with a different datatype than 126 a term-level literal, but we are ignoring this distinction here. 127 \end{itemize} 129 \subsection{Coercions} 131 \gram{\ottg} 133 Invariants on coercions: 134 \begin{itemize} 135 \item [[<t1 t2>]] is used; never [[<t1> <t2>]]. 136 \item If [[<T>]] is applied to some coercions, at least one of which is not 137 reflexive, use [[T </ gi // i />]], never [[<T> g1 g2]] \ldots. 138 \item The [[T]] in [[T </gi//i/>]] is never a type synonym, though it could 139 be a type function. 140 \end{itemize} 142 Is it a left projection or a right projection? 144 \gram{\ottLorR} 146 Axioms: 148 \gram{ 149 \ottC\ottinterrule 150 \ottaxBranch 151 } 153 \subsection{Type constructors} 155 Type constructors in GHC contain \emph{lots} of information. We leave most of it out 156 for this formalism: 158 \gram{\ottT} 160 We include some representative primitive type constructors. There are many more in \ghcfile{prelude/TysPrim.lhs}. 162 \gram{\ottH} 164 \section{Contexts} 166 The functions in \ghcfile{coreSyn/CoreLint.lhs} use the \texttt{LintM} monad. 167 This monad contains a context with a set of bound variables [[G]]. The 168 formalism treats [[G]] as an ordered list, but GHC uses a set as its 169 representation. 171 \gram{ 172 \ottG 173 } 175 We assume the Barendregt variable convention that all new variables are 176 fresh in the context. In the implementation, of course, some work is done 177 to guarantee this freshness. In particular, adding a new type variable to 178 the context sometimes requires creating a new, fresh variable name and then 179 applying a substitution. We elide these details in this formalism, but 180 see \coderef{types/Type.lhs}{substTyVarBndr} for details. 182 \section{Judgments} 184 The following functions are used from GHC. Their names are descriptive, and they 185 are not formalized here: \coderef{types/TyCon.lhs}{tyConKind}, 186 \coderef{types/TyCon.lhs}{tyConArity}, \coderef{basicTypes/DataCon.lhs}{dataConTyCon}, \coderef{types/TyCon.lhs}{isNewTyCon}, \coderef{basicTypes/DataCon.lhs}{dataConRepType}. 188 \subsection{Program consistency} 190 Check the entire bindings list in a context including the whole list. We extract 191 the actual variables (with their types/kinds) from the bindings, check for duplicates, 192 and then check each binding. 194 \ottdefnlintCoreBindings{} 196 Here is the definition of [[vars_of]], taken from \coderef{coreSyn/CoreSyn.lhs}{bindersOf}: 198 \[ 199 \begin{array}{ll} 200 [[vars_of n = e]] &= [[n]] \\ 201 [[vars_of rec </ ni = ei // i />]] &= [[</ ni // i />]] 202 \end{array} 203$
205 \subsection{Binding consistency}
207 \ottdefnlintXXbind{}
209 \ottdefnlintSingleBinding{}
211 In the GHC source, this function contains a number of other checks, such
212 as for strictness and exportability. See the source code for further information.
214 \subsection{Expression typing}
216 \ottdefnlintCoreExpr{}
218 %\arraylabel{\coderef{coreSyn/CoreLint.lhs}{checkCaseAlts}} \\
219 %\multicolumn{2}{l}{[[checkCaseAlts]] \text{ checks ordering and exhaustivity constr%aints}} \\
221 %\end{array}
222 %\]
224 \begin{itemize}
225 \item Some explication of \ottdrulename{Tm\_LetRec} is helpful: The idea behind the
226 second premise ($[[</ G, G'i |-ty s'i : ki // i />]]$) is that we wish
227 to check each substituted type $[[s'i]]$ in a context containing all the types
228 that come before it in the list of bindings. The $[[G'i]]$ are contexts
229 containing the names and kinds of all type variables (and term variables,
230 for that matter) up to the $i$th binding. This logic is extracted from
231 \coderef{coreSyn/CoreLint.lhs}{lintAndScopeIds}.
233 \item There is one more case for $[[G |-tm e : t]]$, for type expressions.
234 This is included in the GHC code but is elided
235 here because the case is never used in practice. Type expressions
236 can only appear in arguments to functions, and these are handled
237 in \ottdrulename{Tm\_AppType}.
239 \item The GHC source code checks all arguments in an application expression
240 all at once using \coderef{coreSyn/CoreSyn.lhs}{collectArgs}
241 and \coderef{coreSyn/CoreLint.lhs}{lintCoreArgs}. The operation
242 has been unfolded for presentation here.
244 \item If a $[[tick]]$ contains breakpoints, the GHC source performs additional
245 (scoping) checks.
247 \item The rule for $[[case]]$ statements also checks to make sure that
248 the alternatives in the $[[case]]$ are well-formed with respect to the
249 invariants listed above. These invariants do not affect the type or
250 evaluation of the expression, so the check is omitted here.
252 \item The GHC source code for \ottdrulename{Tm\_Var} contains checks for
253 a dead id and for one-tuples. These checks are omitted here.
254 \end{itemize}
256 \subsection{Kinding}
258 \ottdefnlintType{}
260 \subsection{Kind validity}
262 \ottdefnlintKind{}
264 \subsection{Coercion typing}
266 \ottdefnlintCoercion{}
268 In \ottdrulename{Co\_AxiomInstCo}, the use of $[[inits]]$ creates substitutions from
269 the first $i$ mappings in $[[ </ [ni |-> si] // i /> ]]$. This has the effect of
270 folding the substitution over the kinds for kind-checking.
272 \subsection{Name consistency}
274 There are two very similar checks for names, one declared as a local function:
276 \ottdefnlintSingleBindingXXlintBinder{}
278 \ottdefnlintBinder{}
280 \subsection{Substitution consistency}
282 \ottdefncheckTyKind{}
284 \subsection{Case alternative consistency}
286 \ottdefnlintCoreAlt{}
288 \subsection{Telescope substitution}
290 \ottdefnapplyTys{}
292 \subsection{Case alternative binding consistency}
294 \ottdefnlintAltBinders{}
296 \subsection{Arrow kinding}
298 \ottdefnlintArrow{}
300 \subsection{Type application kinding}
302 \ottdefnlintXXapp{}
304 \subsection{Sub-kinding}
306 \ottdefnisSubKind{}
308 \subsection{Branched axiom conflict checking}
310 The following judgment is used within \ottdrulename{Co\_AxiomInstCo} to make
311 sure that a type family application cannot unify with any previous branch
312 in the axiom.
314 \ottdefncheckXXnoXXconflict{}
316 The judgment $[[apart]]$ checks to see whether two lists of types are surely apart.
317 It checks to see if \coderef{types/Unify.lhs}{tcApartTys} returns \texttt{SurelyApart}.
318 Two types are apart if neither type is a type family application and if they do not
319 unify.
321 \end{document}