{-# LANGUAGE DeriveDataTypeable #-}

module ApiAnnotation (
  getAnnotation, getAndRemoveAnnotation,
  getAnnotationComments,getAndRemoveAnnotationComments,
  ApiAnns,
  ApiAnnKey,
  AnnKeywordId(..),
  AnnotationComment(..),
  IsUnicodeSyntax(..),
  unicodeAnn,
  HasE(..),
  LRdrName -- Exists for haddocks only
  ) where

import RdrName
import Outputable
import SrcLoc
import qualified Data.Map as Map
import Data.Data


{-
Note [Api annotations]
~~~~~~~~~~~~~~~~~~~~~~
In order to do source to source conversions using the GHC API, the
locations of all elements of the original source need to be tracked.
This includes keywords such as 'let' / 'in' / 'do' etc as well as
punctuation such as commas and braces, and also comments.

These are captured in a structure separate from the parse tree, and
returned in the pm_annotations field of the ParsedModule type.

The non-comment annotations are stored indexed to the SrcSpan of the
AST element containing them, together with an AnnKeywordId value
identifying the specific keyword being captured.

> type ApiAnnKey = (SrcSpan,AnnKeywordId)
>
> Map.Map ApiAnnKey [SrcSpan]

So

> let x = 1 in 2 *x

would result in the AST element

  L span (HsLet (binds for x = 1) (2 * x))

and the annotations

  (span,AnnLet) having the location of the 'let' keyword
  (span,AnnIn)  having the location of the 'in' keyword
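
For illustration (a sketch, assuming `anns` is the ApiAnns value taken
from pm_annotations and `span` is the SrcSpan of the HsLet element
above), the two locations can be looked up with getAnnotation:

> getAnnotation anns span AnnLet   -- list holding the location of 'let'
> getAnnotation anns span AnnIn    -- list holding the location of 'in'

A key with no entry yields the empty list.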


The comments are indexed to the SrcSpan of the lowest AST element
enclosing them

> Map.Map SrcSpan [Located AnnotationComment]

So the full ApiAnns type is

> type ApiAnns = ( Map.Map ApiAnnKey [SrcSpan]
>                , Map.Map SrcSpan [Located AnnotationComment])


This is done in the lexer / parser as follows.


The PState variable in the lexer has the following fields added

> annotations :: [(ApiAnnKey,[SrcSpan])],
> comment_q :: [Located AnnotationComment],
> annotations_comments :: [(SrcSpan,[Located AnnotationComment])]

The first and last store the values that end up in the ApiAnns value
at the end via Map.fromList

The comment_q captures comments as they are seen in the token stream,
so that when they are ready to be allocated via the parser they are
available.

The parser interacts with the lexer using the function

> addAnnotation :: SrcSpan -> AnnKeywordId -> SrcSpan -> P ()

which takes the AST element SrcSpan, the annotation keyword and the
target SrcSpan.

This adds the annotation to the `annotations` field of `PState` and
transfers any comments in `comment_q` to the `annotations_comments`
field.

Parser
------

The parser implements a number of helper types and methods for the
capture of annotations

> type AddAnn = (SrcSpan -> P ())
>
> mj :: AnnKeywordId -> Located e -> (SrcSpan -> P ())
> mj a l = (\s -> addAnnotation s a (gl l))

AddAnn represents the addition of an annotation a to a provided
SrcSpan, and `mj` constructs an AddAnn value.

> ams :: Located a -> [AddAnn] -> P (Located a)
> ams a@(L l _) bs = (mapM_ (\a -> a l) bs) >> return a

So the production in Parser.y for the HsLet AST element is

        | 'let' binds 'in' exp    {% ams (sLL $1 $> $ HsLet (snd $ unLoc $2) $4)
                                         (mj AnnLet $1:mj AnnIn $3
                                           :(fst $ unLoc $2)) }

This adds an AnnLet annotation for 'let', an AnnIn for 'in', as well
as any annotations that may arise in the binds. This will include
opening and closing braces if they are used to delimit the let bindings.

The wiki page describing this feature is
https://ghc.haskell.org/trac/ghc/wiki/ApiAnnotations

-}
-- ---------------------------------------------------------------------

type ApiAnns = ( Map.Map ApiAnnKey [SrcSpan]
               , Map.Map SrcSpan [Located AnnotationComment])

type ApiAnnKey = (SrcSpan,AnnKeywordId)


-- | Retrieve a list of annotation 'SrcSpan's based on the 'SrcSpan'
-- of the annotated AST element, and the known type of the annotation.
getAnnotation :: ApiAnns -> SrcSpan -> AnnKeywordId -> [SrcSpan]
getAnnotation (anns,_) span ann
   = case Map.lookup (span,ann) anns of
       Nothing -> []
       Just ss -> ss

-- | Retrieve a list of annotation 'SrcSpan's based on the 'SrcSpan'
-- of the annotated AST element, and the known type of the annotation.
-- The list is removed from the annotations.
getAndRemoveAnnotation :: ApiAnns -> SrcSpan -> AnnKeywordId
                       -> ([SrcSpan],ApiAnns)
getAndRemoveAnnotation (anns,cs) span ann
   = case Map.lookup (span,ann) anns of
       Nothing -> ([],(anns,cs))
       Just ss -> (ss,(Map.delete (span,ann) anns,cs))

-- | Retrieve the comments allocated to the current 'SrcSpan'
--
-- Note: A given 'SrcSpan' may appear in multiple AST elements,
-- beware of duplicates
getAnnotationComments :: ApiAnns -> SrcSpan -> [Located AnnotationComment]
getAnnotationComments (_,anns) span =
  case Map.lookup span anns of
    Just cs -> cs
    Nothing -> []

-- | Retrieve the comments allocated to the current 'SrcSpan', and
-- remove them from the annotations
getAndRemoveAnnotationComments :: ApiAnns -> SrcSpan
                               -> ([Located AnnotationComment],ApiAnns)
getAndRemoveAnnotationComments (anns,canns) span =
  case Map.lookup span canns of
    Just cs -> (cs,(anns,Map.delete span canns))
    Nothing -> ([],(anns,canns))
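
-- A usage sketch for the accessors above (illustrative only; @anns@,
-- @span@ and the primed names are hypothetical bindings, not part of
-- this module). The removing variants return the remaining annotations,
-- which can be threaded into the next call so that nothing is handed
-- out twice:
--
-- > let (lets, anns')  = getAndRemoveAnnotation anns span AnnLet
-- >     (cs,   anns'') = getAndRemoveAnnotationComments anns' span
-- > in  (lets, cs, anns'')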

-- --------------------------------------------------------------------

-- | API Annotations exist so that tools can perform source to source
-- conversions of Haskell code. They are used to keep track of the
-- various syntactic keywords that are not captured in the existing
-- AST.
--
-- The annotations, together with original source comments are made
-- available in the @'pm_annotations'@ field of @'GHC.ParsedModule'@.
-- Comments are only retained if @'Opt_KeepRawTokenStream'@ is set in
-- @'DynFlags.DynFlags'@ before parsing.
--
-- The wiki page describing this feature is
-- https://ghc.haskell.org/trac/ghc/wiki/ApiAnnotations
--
-- Note: in general the names of these are taken from the
-- corresponding token, unless otherwise noted
-- See note [Api annotations] above for details of the usage
data AnnKeywordId
    = AnnAs
    | AnnAt
    | AnnBang -- ^ '!'
    | AnnBackquote -- ^ '`'
    | AnnBy
    | AnnCase -- ^ case or lambda case
    | AnnClass
    | AnnClose -- ^ '\#)' or '\#-}' etc
    | AnnCloseC -- ^ '}'
    | AnnCloseP -- ^ ')'
    | AnnCloseS -- ^ ']'
    | AnnColon
    | AnnComma -- ^ as a list separator
    | AnnCommaTuple -- ^ in a RdrName for a tuple
    | AnnDarrow -- ^ '=>'
    | AnnDarrowU -- ^ '=>', unicode variant
    | AnnData
    | AnnDcolon -- ^ '::'
    | AnnDcolonU -- ^ '::', unicode variant
    | AnnDefault
    | AnnDeriving
    | AnnDo
    | AnnDot -- ^ '.'
    | AnnDotdot -- ^ '..'
    | AnnElse
    | AnnEqual
    | AnnExport
    | AnnFamily
    | AnnForall
    | AnnForallU -- ^ Unicode variant
    | AnnForeign
    | AnnFunId -- ^ for function name in matches where there are
               -- multiple equations for the function.
    | AnnGroup
    | AnnHeader -- ^ for CType
    | AnnHiding
    | AnnIf
    | AnnImport
    | AnnIn
    | AnnInfix -- ^ 'infix' or 'infixl' or 'infixr'
    | AnnInstance
    | AnnLam
    | AnnLarrow -- ^ '<-'
    | AnnLarrowU -- ^ '<-', unicode variant
    | AnnLet
    | AnnMdo
    | AnnMinus -- ^ '-'
    | AnnModule
    | AnnNewtype
    | AnnName -- ^ where a name loses its location in the AST, this carries it
    | AnnOf
    | AnnOpen -- ^ '(\#' or '{-\# LANGUAGE' etc
    | AnnOpenC -- ^ '{'
    | AnnOpenE -- ^ '[e|' or '[e||'
    | AnnOpenP -- ^ '('
    | AnnOpenPE -- ^ '$('
    | AnnOpenPTE -- ^ '$$('
    | AnnOpenS -- ^ '['
    | AnnPackageName
    | AnnPattern
    | AnnProc
    | AnnQualified
    | AnnRarrow -- ^ '->'
    | AnnRarrowU -- ^ '->', unicode variant
    | AnnRec
    | AnnRole
    | AnnSafe
    | AnnSemi -- ^ ';'
    | AnnSimpleQuote -- ^ '''
    | AnnStatic -- ^ 'static'
    | AnnThen
    | AnnThIdSplice -- ^ '$'
    | AnnThIdTySplice -- ^ '$$'
    | AnnThTyQuote -- ^ double '''
    | AnnTilde -- ^ '~'
    | AnnTildehsh -- ^ '~#'
    | AnnType
    | AnnUnit -- ^ '()' for types
    | AnnUsing
    | AnnVal -- ^ e.g. INTEGER
    | AnnValStr -- ^ String value, will need quotes when output
    | AnnVbar -- ^ '|'
    | AnnWhere
    | Annlarrowtail -- ^ '-<'
    | AnnlarrowtailU -- ^ '-<', unicode variant
    | Annrarrowtail -- ^ '>-'
    | AnnrarrowtailU -- ^ '>-', unicode variant
    | AnnLarrowtail -- ^ '-<<'
    | AnnLarrowtailU -- ^ '-<<', unicode variant
    | AnnRarrowtail -- ^ '>>-'
    | AnnRarrowtailU -- ^ '>>-', unicode variant
    | AnnEofPos
    deriving (Eq, Ord, Data, Typeable, Show)

instance Outputable AnnKeywordId where
  ppr x = text (show x)

-- ---------------------------------------------------------------------

data AnnotationComment =
  -- Documentation annotations
    AnnDocCommentNext  String     -- ^ something beginning '-- |'
  | AnnDocCommentPrev  String     -- ^ something beginning '-- ^'
  | AnnDocCommentNamed String     -- ^ something beginning '-- $'
  | AnnDocSection      Int String -- ^ a section heading
  | AnnDocOptions      String     -- ^ doc options (prune, ignore-exports, etc)
  | AnnDocOptionsOld   String     -- ^ doc options declared "-- # ..."-style
  | AnnLineComment     String     -- ^ comment starting with "--"
  | AnnBlockComment    String     -- ^ comment in {- -}
    deriving (Eq, Ord, Data, Typeable, Show)
-- Note: these are based on the Token versions, but the Token type is
-- defined in Lexer.x and bringing it in here would create a loop

instance Outputable AnnotationComment where
  ppr x = text (show x)

-- | - 'ApiAnnotation.AnnKeywordId' : 'ApiAnnotation.AnnOpen',
--             'ApiAnnotation.AnnClose','ApiAnnotation.AnnComma',
--             'ApiAnnotation.AnnRarrow','ApiAnnotation.AnnTildehsh',
--             'ApiAnnotation.AnnTilde'
--   - May have 'ApiAnnotation.AnnComma' when in a list
type LRdrName = Located RdrName


-- | Certain tokens can have alternate representations when unicode syntax is
-- enabled. This flag is attached to those tokens in the lexer so that the
-- original source representation can be reproduced in the corresponding
-- 'ApiAnnotation'
data IsUnicodeSyntax = UnicodeSyntax | NormalSyntax
    deriving (Eq, Ord, Data, Typeable, Show)

-- | Convert a normal annotation into its unicode equivalent
unicodeAnn :: AnnKeywordId -> AnnKeywordId
unicodeAnn AnnForall     = AnnForallU
unicodeAnn AnnDcolon     = AnnDcolonU
unicodeAnn AnnLarrow     = AnnLarrowU
unicodeAnn AnnRarrow     = AnnRarrowU
unicodeAnn AnnDarrow     = AnnDarrowU
unicodeAnn Annlarrowtail = AnnlarrowtailU
unicodeAnn Annrarrowtail = AnnrarrowtailU
unicodeAnn AnnLarrowtail = AnnLarrowtailU
unicodeAnn AnnRarrowtail = AnnRarrowtailU
unicodeAnn ann           = ann
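
-- For example (a sketch read off the equations above): a keyword with a
-- unicode variant maps to it, and any other keyword is returned unchanged.
--
-- > unicodeAnn AnnRarrow == AnnRarrowU
-- > unicodeAnn AnnLet    == AnnLet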


-- | Some Template Haskell tokens have two variants, one with an `e` the other
-- not:
--
-- >  [| or [e|
-- >  [|| or [e||
--
-- This type indicates whether the 'e' is present or not.
data HasE = HasE | NoE
     deriving (Eq, Ord, Data, Typeable, Show)