[ghc.git] / compiler / main / Elf.hs
{-
-----------------------------------------------------------------------------
--
-- (c) The University of Glasgow 2015
--
-- ELF format tools
--
-----------------------------------------------------------------------------
-}

module Elf (
    readElfSectionByName,
    readElfNoteAsString,
    makeElfNote
  ) where

import GhcPrelude

import AsmUtils
import Exception
import DynFlags
import ErrUtils
import Maybes       (MaybeT(..),runMaybeT)
import Util         (charToC)
import Outputable   (text,hcat,SDoc)

import Control.Monad (when)
import Data.Binary.Get
import Data.Word
import Data.Char (ord)
import Data.ByteString.Lazy (ByteString)
import qualified Data.ByteString.Lazy as LBS
import qualified Data.ByteString.Lazy.Char8 as B8

{- Note [ELF specification]
   ~~~~~~~~~~~~~~~~~~~~~~~~

   ELF (Executable and Linking Format) is described in the System V Application
   Binary Interface (or ABI). The latter is composed of two parts: a generic
   part and a processor specific part. The generic ABI describes the parts of
   the interface that remain constant across all hardware implementations of
   System V.

   The latest release of the specification of the generic ABI is version
   4.1 from March 18, 1997:

     - http://www.sco.com/developers/devspecs/gabi41.pdf

   Since 1997, snapshots of the draft for the "next" version have been
   published:

     - http://www.sco.com/developers/gabi/

   Quoting the notice on the website: "There is more than one instance of these
   chapters to permit references to older instances to remain valid. All
   modifications to these chapters are forward-compatible, so that correct use
   of an older specification will not be invalidated by a newer instance.
   Approximately on a yearly basis, a new instance will be saved, as it reaches
   what appears to be a stable state."

   Nevertheless, as we will see, this has not held for Note sections since
   1998.

   Many ELF sections
   -----------------

   ELF-4.1: the normal section number fields in ELF are limited to 16 bits,
   which runs out of bits when you try to cram in more sections than that. Two
   fields are concerned: the one containing the number of sections and the
   one containing the index of the section that contains the section names.
   (The same thing applies to the field containing the number of segments, but
   we don't care about it here.)

   ELF-next: to solve this, these fields in the ELF header have an escape
   value (different for each case), and the actual section number is stashed
   into unused fields in the first section header.

   We support this extension as it is forward-compatible with ELF-4.1.
   Moreover, GHC may generate objects with a lot of sections with the
   "function-sections" feature (one section per function).
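
As a rough illustration of the escape mechanism (a standalone sketch, not part
of this module), the two header fields can be resolved as below. The names
Section0, realSectionCount and realNameIndex are hypothetical; the field names
e_shnum, e_shstrndx, sh_size and sh_link and the escape values 0 and 0xffff are
taken from the generic ABI. The sketch assumes the file actually has a section
table, since a count of 0 is also what a file without one would report.

```haskell
import Data.Word (Word16, Word32, Word64)

-- | The two fields of the first section header (index 0) that carry the
-- real values when the ELF header fields hold an escape value.
data Section0 = Section0
  { sh_size :: Word64   -- real number of sections when e_shnum is 0
  , sh_link :: Word32   -- real name-table index when e_shstrndx is 0xffff
  }

-- | Number of sections: e_shnum unless it holds the escape value 0.
realSectionCount :: Word16 -> Section0 -> Word64
realSectionCount e_shnum s0
  | e_shnum == 0 = sh_size s0
  | otherwise    = fromIntegral e_shnum

-- | Index of the section holding the section names: e_shstrndx unless it
-- holds the escape value 0xffff.
realNameIndex :: Word16 -> Section0 -> Word32
realNameIndex e_shstrndx s0
  | e_shstrndx == 0xffff = sh_link s0
  | otherwise            = fromIntegral e_shstrndx
```

The results are widened to Word64 and Word32 because the stashed values can
exceed what the 16-bit header fields are able to hold.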

   Note sections
   -------------

   Sections with type "note" (SHT_NOTE in the specification) are used to add
   arbitrary data into an ELF file. An entry in a note section is composed of a
   name, a type and a value.

   ELF-4.1: "The note information in sections and program header elements holds
   any number of entries, each of which is an array of 4-byte words in the
   format of the target processor." Each entry has the following format:
         | namesz |   Word32: size of the name string (including the ending \0)
         | descsz |   Word32: size of the value
         |  type  |   Word32: type of the note
         |  name  |   Name string (with \0 padding to ensure 4-byte alignment)
         |  ...   |
         |  desc  |   Value (with \0 padding to ensure 4-byte alignment)
         |  ...   |
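
To make the ELF-4.1 layout concrete, here is a minimal standalone sketch that
serialises one entry to a list of bytes. It is not how this module writes notes
(that is left to the assembler); the names word32le, pad4 and encodeNote are
hypothetical, and little-endian byte order is an assumption made only to keep
the example short, whereas the real order is that of the target.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Word (Word8, Word32)

-- | Encode a 32-bit word as 4 bytes, least significant byte first
-- (assumed byte order, see above).
word32le :: Word32 -> [Word8]
word32le w = [ fromIntegral ((w `shiftR` s) .&. 0xff) | s <- [0, 8, 16, 24] ]

-- | Append \0 bytes until the length is a multiple of 4.
pad4 :: [Word8] -> [Word8]
pad4 bs = bs ++ replicate ((4 - length bs `mod` 4) `mod` 4) 0

-- | Encode one note entry: three header words, then the padded name and
-- the padded value. Characters are assumed to fit in one byte.
encodeNote :: String -> Word32 -> String -> [Word8]
encodeNote name typ desc = concat
  [ word32le (fromIntegral (length name + 1))  -- namesz counts the ending \0
  , word32le (fromIntegral (length desc))      -- descsz excludes the padding
  , word32le typ
  , pad4 (map c2w name ++ [0])
  , pad4 (map c2w desc)
  ]
  where c2w = fromIntegral . ord
```

For instance, encodeNote "GHC" 1 "hello" yields 24 bytes: 12 for the three
header words, 4 for the name with its ending \0, and 8 for the 5-byte value
padded to the next 4-byte boundary.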

   ELF-next: "The note information in sections and program header elements
   holds a variable amount of entries. In 64-bit objects (files with
   e_ident[EI_CLASS] equal to ELFCLASS64), each entry is an array of 8-byte
   words in the format of the target processor. In 32-bit objects (files with
   e_ident[EI_CLASS] equal to ELFCLASS32), each entry is an array of 4-byte
   words in the format of the target processor." (from 1998-2015 snapshots)

   This is not forward-compatible with ELF-4.1. In practice, for almost all
   platforms the namesz, descsz and type fields are 4-byte words for both
   32-bit and 64-bit objects (see elf.h and the readelf source code).

   The only exception in the readelf source code is for IA_64 machines with
   the OpenVMS OS: "This OS has so many departures from the ELF standard that
   we test it at many places" (comment for is_ia64_vms() in readelf.c). In this
   case, the namesz, descsz and type fields are 8-byte words and the name and
   value fields are padded to ensure 8-byte alignment.

   We don't support this platform in the following code. Reading a note section
   could be done easily (by testing the Machine and OS fields in the ELF
   header). Writing a note section, however, requires that we generate
   different assembly code for GAS depending on the target platform, and this
   is a little more involved.

-}


-- | ELF header
--
-- The ELF header indicates the native word size (32-bit or 64-bit) and the
-- endianness of the target machine. We directly store getters for words of
-- different sizes as it is more convenient to use. We also store the word size
-- as it is useful to skip some uninteresting fields.
--
-- Other information such as the target machine and OS are left out as we don't
-- use them yet. We could add them in the future if we ever need them.
data ElfHeader = ElfHeader
   { gw16     :: Get Word16   -- ^ Get a Word16 with the correct endianness
   , gw32     :: Get Word32   -- ^ Get a Word32 with the correct endianness
   , gwN      :: Get Word64   -- ^ Get a Word with the correct word size
                              --   and endianness
   , wordSize :: Int          -- ^ Word size in bytes
   }


-- | Read the ELF header
readElfHeader :: DynFlags -> ByteString -> IO (Maybe ElfHeader)
readElfHeader dflags bs = runGetOrThrow getHeader bs `catchIO` \_ -> do
    debugTraceMsg dflags 3 $
      text ("Unable to read ELF header")
    return Nothing
  where
    getHeader = do
      magic    <- getWord32be
      ws       <- getWord8
      endian   <- getWord8
      version  <- getWord8
      skip 9  -- skip OSABI, ABI version and padding
      when (magic /= 0x7F454C46 || version /= 1) $ fail "Invalid ELF header"

      case (ws, endian) of
          -- ELF 32, little endian
          (1,1) -> return . Just $ ElfHeader
                           getWord16le
                           getWord32le
                           (fmap fromIntegral getWord32le) 4
          -- ELF 32, big endian
          (1,2) -> return . Just $ ElfHeader
                           getWord16be
                           getWord32be
                           (fmap fromIntegral getWord32be) 4
          -- ELF 64, little endian
          (2,1) -> return . Just $ ElfHeader
                           getWord16le
                           getWord32le
                           (fmap fromIntegral getWord64le) 8
          -- ELF 64, big endian
          (2,2) -> return . Just $ ElfHeader
                           getWord16be
                           getWord32be
                           (fmap fromIntegral getWord64be) 8
          _     -> fail "Invalid ELF header"


------------------
-- SECTIONS
------------------


-- | Description of the section table
data SectionTable = SectionTable
  { sectionTableOffset :: Word64  -- ^ offset of the table describing sections
  , sectionEntrySize   :: Word16  -- ^ size of an entry in the section table
  , sectionEntryCount  :: Word64  -- ^ number of sections
  , sectionNameIndex   :: Word32  -- ^ index of a special section which
                                  --   contains the section names
  }

-- | Read the ELF section table
readElfSectionTable :: DynFlags
                    -> ElfHeader
                    -> ByteString
                    -> IO (Maybe SectionTable)

readElfSectionTable dflags hdr bs = action `catchIO` \_ -> do
    debugTraceMsg dflags 3 $
      text ("Unable to read ELF section table")
    return Nothing
  where
    getSectionTable :: Get SectionTable
    getSectionTable = do
      -- skip e_ident, e_type, e_machine and e_version (24 bytes), then
      -- e_entry and e_phoff (one native word each)
      skip (24 + 2*wordSize hdr)
      secTableOffset <- gwN hdr
      skip 10 -- skip e_flags, e_ehsize, e_phentsize and e_phnum
      entrySize      <- gw16 hdr
      entryCount     <- gw16 hdr
      secNameIndex   <- gw16 hdr
      return (SectionTable secTableOffset entrySize
                           (fromIntegral entryCount)
                           (fromIntegral secNameIndex))

    action = do
      secTable <- runGetOrThrow getSectionTable bs
      -- In some cases, the number of entries and the index of the section
      -- containing the section names must be found in unused fields of the
      -- first section entry (see Note [ELF specification])
      let
        offSize0 = fromIntegral $ sectionTableOffset secTable + 8
                                  + 3 * fromIntegral (wordSize hdr)
        offLink0 = fromIntegral $ offSize0 + fromIntegral (wordSize hdr)

      entryCount'     <- if sectionEntryCount secTable /= 0
                          then return (sectionEntryCount secTable)
                          else runGetOrThrow (gwN hdr) (LBS.drop offSize0 bs)
      entryNameIndex' <- if sectionNameIndex secTable /= 0xffff
                          then return (sectionNameIndex secTable)
                          else runGetOrThrow (gw32 hdr) (LBS.drop offLink0 bs)
      return (Just $ secTable
        { sectionEntryCount = entryCount'
        , sectionNameIndex  = entryNameIndex'
        })


-- | A section
data Section = Section
  { entryName :: ByteString   -- ^ Name of the section
  , entryBS   :: ByteString   -- ^ Content of the section
  }

-- | Read an ELF section
readElfSectionByIndex :: DynFlags
                      -> ElfHeader
                      -> SectionTable
                      -> Word64
                      -> ByteString
                      -> IO (Maybe Section)

readElfSectionByIndex dflags hdr secTable i bs = action `catchIO` \_ -> do
    debugTraceMsg dflags 3 $
      text ("Unable to read ELF section")
    return Nothing
  where
    -- read an entry from the section table
    getEntry = do
      nameIndex <- gw32 hdr
      skip (4+2*wordSize hdr) -- skip the section type, flags and address
      offset    <- fmap fromIntegral $ gwN hdr
      size      <- fmap fromIntegral $ gwN hdr
      let bs' = LBS.take size (LBS.drop offset bs)
      return (nameIndex,bs')

    -- read the entry with the given index in the section table
    getEntryByIndex x = runGetOrThrow getEntry bs'
      where
        bs' = LBS.drop off bs
        off = fromIntegral $ sectionTableOffset secTable +
                             x * fromIntegral (sectionEntrySize secTable)

    -- Get the name of a section
    getEntryName nameIndex = do
      let idx = fromIntegral (sectionNameIndex secTable)
      (_,nameTable) <- getEntryByIndex idx
      let bs' = LBS.drop nameIndex nameTable
      runGetOrThrow getLazyByteStringNul bs'

    action = do
      (nameIndex,bs') <- getEntryByIndex (fromIntegral i)
      name            <- getEntryName (fromIntegral nameIndex)
      return (Just $ Section name bs')


-- | Find a section from its name. Return the section contents.
--
-- We do not perform any check on the section type.
findSectionFromName :: DynFlags
                    -> ElfHeader
                    -> SectionTable
                    -> String
                    -> ByteString
                    -> IO (Maybe ByteString)
findSectionFromName dflags hdr secTable name bs =
    rec [0..sectionEntryCount secTable - 1]
  where
    -- convert the required section name into a ByteString to perform
    -- ByteString comparison instead of String comparison
    name' = B8.pack name

    -- compare recursively each section name and return the contents of
    -- the matching one, if any
    rec []     = return Nothing
    rec (x:xs) = do
      me <- readElfSectionByIndex dflags hdr secTable x bs
      case me of
        Just e | entryName e == name' -> return (Just (entryBS e))
        _                             -> rec xs


-- | Given a section name, read its contents as a ByteString.
--
-- If the section isn't found or if there is any parsing error, we return
-- Nothing
readElfSectionByName :: DynFlags
                     -> ByteString
                     -> String
                     -> IO (Maybe LBS.ByteString)

readElfSectionByName dflags bs name = action `catchIO` \_ -> do
    debugTraceMsg dflags 3 $
      text ("Unable to read ELF section \"" ++ name ++ "\"")
    return Nothing
  where
    action = runMaybeT $ do
      hdr      <- MaybeT $ readElfHeader dflags bs
      secTable <- MaybeT $ readElfSectionTable dflags hdr bs
      MaybeT $ findSectionFromName dflags hdr secTable name bs

------------------
-- NOTE SECTIONS
------------------

-- | Read a Note as a ByteString
--
-- If you try to read a note from a section which does not support the Note
-- format, the parsing is likely to fail and Nothing will be returned
readElfNoteBS :: DynFlags
              -> ByteString
              -> String
              -> String
              -> IO (Maybe LBS.ByteString)

readElfNoteBS dflags bs sectionName noteId = action `catchIO` \_ -> do
    debugTraceMsg dflags 3 $
         text ("Unable to read ELF note \"" ++ noteId ++
               "\" in section \"" ++ sectionName ++ "\"")
    return Nothing
  where
    -- align the getter on n bytes
    align n = do
      m <- bytesRead
      if m `mod` n == 0
        then return ()
        else skip 1 >> align n

    -- noteId as a bytestring
    noteId' = B8.pack noteId

    -- read notes recursively until the one with the requested name is found;
    -- running out of input makes the getter fail
    findNote hdr = do
      align 4
      namesz <- gw32 hdr
      descsz <- gw32 hdr
      _      <- gw32 hdr -- we don't use the note type
      name   <- if namesz == 0
                  then return LBS.empty
                  else getLazyByteStringNul
      align 4
      desc   <- if descsz == 0
                  then return LBS.empty
                  else getLazyByteString (fromIntegral descsz)
      if name == noteId'
        then return $ Just desc
        else findNote hdr


    action = runMaybeT $ do
      hdr  <- MaybeT $ readElfHeader dflags bs
      sec  <- MaybeT $ readElfSectionByName dflags bs sectionName
      MaybeT $ runGetOrThrow (findNote hdr) sec

-- | Read a Note as a String
--
-- If you try to read a note from a section which does not support the Note
-- format, the parsing is likely to fail and Nothing will be returned
readElfNoteAsString :: DynFlags
                    -> FilePath
                    -> String
                    -> String
                    -> IO (Maybe String)

readElfNoteAsString dflags path sectionName noteId = action `catchIO` \_ -> do
    debugTraceMsg dflags 3 $
         text ("Unable to read ELF note \"" ++ noteId ++
               "\" in section \"" ++ sectionName ++ "\"")
    return Nothing
  where
    action = do
      bs   <- LBS.readFile path
      note <- readElfNoteBS dflags bs sectionName noteId
      return (fmap B8.unpack note)


-- | Generate the GAS code to create a Note section
--
-- Header fields for notes are 32 bits long (see Note [ELF specification]).
--
-- It seems there is no easy way to force GNU AS to generate a 32-bit word in
-- every case. Hence we use the .int directive to create them: however "The
-- byte order and bit size of the number depends on what kind of target the
-- assembly is for." (https://sourceware.org/binutils/docs/as/Int.html#Int)
--
-- If we add new target platforms, we need to check that the generated words
-- are 32 bits long, otherwise we need to use platform-specific directives to
-- force 32-bit .int in asWord32.
makeElfNote :: String -> String -> Word32 -> String -> SDoc
makeElfNote sectionName noteName typ contents = hcat [
    text "\t.section ",
    text sectionName,
    text ",\"\",",
    sectionType "note",
    text "\n",

    -- note name length (+ 1 for ending \0)
    asWord32 (length noteName + 1),

    -- note contents size
    asWord32 (length contents),

    -- note type
    asWord32 typ,

    -- note name (.asciz for \0 ending string) + padding
    text "\t.asciz \"",
    text noteName,
    text "\"\n",
    text "\t.align 4\n",

    -- note contents (.ascii to avoid ending \0) + padding
    text "\t.ascii \"",
    text (escape contents),
    text "\"\n",
    text "\t.align 4\n"]
  where
    escape :: String -> String
    escape = concatMap (charToC.fromIntegral.ord)

    asWord32 :: Show a => a -> SDoc
    asWord32 x = hcat [
      text "\t.int ",
      text (show x),
      text "\n"]


------------------
-- Helpers
------------------

-- | runGet in IO monad that throws an IOException on failure
runGetOrThrow :: Get a -> LBS.ByteString -> IO a
runGetOrThrow g bs = case runGetOrFail g bs of
  Left _        -> fail "Error while reading file"
  Right (_,_,a) -> return a