1 \documentclass{article}
3 \usepackage{supertabular}
4 \usepackage{amsmath}
5 \usepackage{amssymb}
6 \usepackage{stmaryrd}
7 \usepackage{xcolor}
8 \usepackage{fullpage}
9 \usepackage{multirow}
11 \newcommand{\ghcfile}[1]{\textsl{#1}}
12 \newcommand{\arraylabel}[1]{\multicolumn{2}{l}{\!\!\!\!\!\!\!\!\!\text{\underline{#1}:}}}
14 \input{CoreOtt}
16 % increase spacing between rules for ease of reading:
17 \renewcommand{\ottusedrule}[1]{$#1$\$1ex]} 19 \setlength{\parindent}{0in} 20 \setlength{\parskip}{1ex} 22 \newcommand{\gram}[1]{\ottgrammartabular{#1\ottinterrule}} 24 \begin{document} 26 \begin{center} 27 \LARGE 28 System FC, as implemented in GHC\footnote{This 29 document was originally prepared by Richard Eisenberg (\texttt{eir@cis.upenn.edu}), 30 but it should be maintained by anyone who edits the functions or data structures 31 mentioned in this file. Please feel free to contact Richard for more information.}\\ 32 \Large\today 33 \end{center} 35 \section{Introduction} 37 There are a number of details elided from this presentation. The goal of the 38 formalism is to aid in reasoning about type safety, and checks that do not 39 work toward this goal were omitted. For example, various scoping checks (other 40 than basic context inclusion) appear in the GHC code but not here. 42 \section{Grammar} 44 \subsection{Metavariables} 46 We will use the following metavariables: 48 \ottmetavars{}\\ 50 \subsection{Literals} 52 Literals do not play a major role, so we leave them abstract: 54 \gram{\ottlit} 56 We also leave abstract the function \coderef{basicTypes/Literal.lhs}{literalType} 57 and the judgment \coderef{coreSyn/CoreLint.lhs}{lintTyLit} (written [[G |-tylit lit : k]]). 59 \subsection{Variables} 61 GHC uses the same datatype to represent term-level variables and type-level 62 variables: 64 \gram{ 65 \ottz 66 } 67 foo 69 \gram{ 70 \ottn 71 } 73 \subsection{Expressions} 75 The datatype that represents expressions: 77 \gram{\otte} 79 There are a few key invariants about expressions: 80 \begin{itemize} 81 \item The right-hand sides of all top-level and recursive [[let]]s 82 must be of lifted type. 83 \item The right-hand side of a non-recursive [[let]] and the argument 84 of an application may be of unlifted type, but only if the expression 85 is ok-for-speculation. See \verb|#let_app_invariant#| in \ghcfile{coreSyn/CoreSyn.lhs}. 86 \item We allow a non-recursive [[let]] for bind a type variable. 87 \item The [[_]] case for a [[case]] must come first. 88 \item The list of case alternatives must be exhaustive. 89 \item Types and coercions can only appear on the right-hand-side of an application. 90 \end{itemize} 92 Bindings for [[let]] statements: 94 \gram{\ottbinding} 96 Case alternatives: 98 \gram{\ottalt} 100 Constructors as used in patterns: 102 \gram{\ottKp} 104 Notes that can be inserted into the AST. We leave these abstract: 106 \gram{\otttick} 108 A program is just a list of bindings: 110 \gram{\ottprogram} 112 \subsection{Types} 114 \gram{\ottt} 116 There are some invariants on types: 117 \begin{itemize} 118 \item The type [[t1]] in the form [[t1 t2]] must not be a type constructor 119 [[T]]. It should be another application or a type variable. 120 \item The form [[T </ ti // i /> ]] (\texttt{TyConApp}) 121 does \emph{not} need to be saturated. 122 \item A saturated application of [[(->) t1 t2]] should be represented as 123 [[t1 -> t2]]. This is a different point in the grammar, not just pretty-printing. 124 The constructor for a saturated [[(->)]] is \texttt{FunTy}. 125 \item A type-level literal is represented in GHC with a different datatype than 126 a term-level literal, but we are ignoring this distinction here. 127 \end{itemize} 129 \subsection{Coercions} 131 \gram{\ottg} 133 Invariants on coercions: 134 \begin{itemize} 135 \item [[<t1 t2>]] is used; never [[<t1> <t2>]]. 136 \item If [[<T>]] is applied to some coercions, at least one of which is not 137 reflexive, use [[T </ gi // i />]], never [[<T> g1 g2]] \ldots. 138 \item The [[T]] in [[T </gi//i/>]] is never a type synonym, though it could 139 be a type function. 140 \end{itemize} 142 Is it a left projection or a right projection? 144 \gram{\ottLorR} 146 Axioms: 148 \gram{\ottC} 150 \subsection{Type constructors} 152 Type constructors in GHC contain \emph{lots} of information. We leave most of it out 153 for this formalism: 155 \gram{\ottT} 157 We include some representative primitive type constructors. There are many more in \ghcfile{prelude/TysPrim.lhs}. 159 \gram{\ottH} 161 \section{Contexts} 163 The functions in \ghcfile{coreSyn/CoreLint.lhs} use the \texttt{LintM} monad. 164 This monad contains a context with a set of bound variables [[G]]. The 165 formalism treats [[G]] as an ordered list, but GHC uses a set as its 166 representation. 168 \gram{ 169 \ottG 170 } 172 We assume the Barendregt variable convention that all new variables are 173 fresh in the context. In the implementation, of course, some work is done 174 to guarantee this freshness. In particular, adding a new type variable to 175 the context sometimes requires creating a new, fresh variable name and then 176 applying a substitution. We elide these details in this formalism, but 177 see \coderef{types/Type.lhs}{substTyVarBndr} for details. 179 \section{Judgments} 181 The following functions are used from GHC. Their names are descriptive, and they 182 are not formalized here: \coderef{types/TyCon.lhs}{tyConKind}, 183 \coderef{types/TyCon.lhs}{tyConArity}, \coderef{basicTypes/DataCon.lhs}{dataConTyCon}, \coderef{types/TyCon.lhs}{isNewTyCon}, \coderef{basicTypes/DataCon.lhs}{dataConRepType}. 185 \subsection{Program consistency} 187 Check the entire bindings list in a context including the whole list. We extract 188 the actual variables (with their types/kinds) from the bindings, check for duplicates, 189 and then check each binding. 191 \ottdefnlintCoreBindings{} 193 Here is the definition of [[vars_of]], taken from \coderef{coreSyn/CoreSyn.lhs}{bindersOf}: 195 \[ 196 \begin{array}{ll} 197 [[vars_of n = e]] &= [[n]] \\ 198 [[vars_of rec </ ni = ei // i />]] &= [[</ ni // i />]] 199 \end{array} 200$
202 \subsection{Binding consistency}
204 \ottdefnlintXXbind{}
206 \ottdefnlintSingleBinding{}
208 In the GHC source, this function contains a number of other checks, such
209 as for strictness and exportability. See the source code for further information.
211 \subsection{Expression typing}
213 \ottdefnlintCoreExpr{}
215 %\arraylabel{\coderef{coreSyn/CoreLint.lhs}{checkCaseAlts}} \\
216 %\multicolumn{2}{l}{[[checkCaseAlts]] \text{ checks ordering and exhaustivity constr%aints}} \\
218 %\end{array}
219 %\]
221 \begin{itemize}
222 \item Some explication of \ottdrulename{Tm\_LetRec} is helpful: The idea behind the
223 second premise ($[[</ G, G'i |-ty s'i : ki // i />]]$) is that we wish
224 to check each substituted type $[[s'i]]$ in a context containing all the types
225 that come before it in the list of bindings. The $[[G'i]]$ are contexts
226 containing the names and kinds of all type variables (and term variables,
227 for that matter) up to the $i$th binding. This logic is extracted from
228 \coderef{coreSyn/CoreLint.lhs}{lintAndScopeIds}.
230 \item There is one more case for $[[G |-tm e : t]]$, for type expressions.
231 This is included in the GHC code but is elided
232 here because the case is never used in practice. Type expressions
233 can only appear in arguments to functions, and these are handled
234 in \ottdrulename{Tm\_AppType}.
236 \item The GHC source code checks all arguments in an application expression
237 all at once using \coderef{coreSyn/CoreSyn.lhs}{collectArgs}
238 and \coderef{coreSyn/CoreLint.lhs}{lintCoreArgs}. The operation
239 has been unfolded for presentation here.
241 \item If a $[[tick]]$ contains breakpoints, the GHC source performs additional
242 (scoping) checks.
244 \item The rule for $[[case]]$ statements also checks to make sure that
245 the alternatives in the $[[case]]$ are well-formed with respect to the
246 invariants listed above. These invariants do not affect the type or
247 evaluation of the expression, so the check is omitted here.
249 \item The GHC source code for \ottdrulename{Tm\_Var} contains checks for
250 a dead id and for one-tuples. These checks are omitted here.
251 \end{itemize}
253 \subsection{Kinding}
255 \ottdefnlintType{}
257 \subsection{Kind validity}
259 \ottdefnlintKind{}
261 \subsection{Coercion typing}
263 \ottdefnlintCoercion{}
265 In \ottdrulename{Co\_AxiomInstCo}, the use of $[[inits]]$ creates substitutions from
266 the first $i$ mappings in $[[ </ [ni |-> si] // i /> ]]$. This has the effect of
267 folding the substitution over the kinds for kind-checking.
269 \subsection{Name consistency}
271 There are two very similar checks for names, one declared as a local function
272 within \coderef{coreSyn/CoreLint.lhs}{lintSingleBinding}:
274 \ottdefnlintSingleBindingXXlintBinder{}
276 \ottdefnlintBinder{}
278 \subsection{Substitution consistency}
280 \ottdefncheckTyKind{}
282 \subsection{Case alternative consistency}
284 \ottdefnlintCoreAlt{}
286 \subsection{Telescope substitution}
288 \ottdefnapplyTys{}
290 \subsection{Case alternative binding consistency}
292 \ottdefnlintAltBinders{}
294 \subsection{Arrow kinding}
296 \ottdefnlintArrow{}
298 \subsection{Type application kinding}
300 \ottdefnlintXXapp{}
302 \subsection{Sub-kinding}
304 \ottdefnisSubKind{}
306 \end{document}