Fix PE linker wibbles
[ghc.git] / rts / linker / PEi386.c
1 /* --------------------------------------------------------------------------
2 * PEi386(+) specifics (Win32 targets)
3 * ------------------------------------------------------------------------*/
4
5 /* The information for this linker comes from
6 Microsoft Portable Executable
7 and Common Object File Format Specification
8 revision 8.3 February 2013
9
10 It can be found online at:
11
12 https://msdn.microsoft.com/en-us/windows/hardware/gg463119.aspx
13
14 Things move, so if that fails, try searching for it via
15
16 http://www.google.com/search?q=PE+COFF+specification
17
18 The ultimate reference for the PE format is the Winnt.h
19 header file that comes with the Platform SDKs; as always,
20 implementations will drift wrt their documentation.
21
22 A good background article on the PE format is Matt Pietrek's
23 March 1994 article in Microsoft System Journal (MSJ)
24 (Vol.9, No. 3): "Peering Inside the PE: A Tour of the
25 Win32 Portable Executable File Format." The info in there
26 has recently been updated in a two part article in
27 MSDN magazine, issues Feb and March 2002,
28 "Inside Windows: An In-Depth Look into the Win32 Portable
29 Executable File Format"
30
31 John Levine's book "Linkers and Loaders" contains useful
32 info on PE too.
33
34 The PE specification doesn't specify how to do the actual
35 relocations. For this reason, and because both PE and ELF are
36 based on COFF, the relocations for the PEi386+ code is based on
37 the ELF relocations for the equivalent relocation type.
38
39 The ELF ABI can be found at
40
41 http://www.x86-64.org/documentation/abi.pdf
42
43 The current code is based on version 0.99.6 - October 2013
44
45 The current GHCi linker supports the following four object file formats:
46
47 * PE/PE+ obj - The normal COFF_ANON_OBJ format which is generated by default
48 from Windows compilers
49
50 * PE/PE+ big-obj - The big object format COFF_ANON_BIG_OBJ which extends the
51 number of sections to 2^31 and the number of symbols in each section. This
52 requires a flag but all Windows compilers can produce it.
53
54 * PE Import format - The import library format defined in the PE standard
55 COFF_IMPORT_LIB and commonly has the file extension .lib
56
57 * GNU BFD import format - The import library format defined and used by GNU
58 tools. See note below.
59
60 Note [BFD import library]
61 ~~~~~~~~~~~~~~~~~~~~~~~~~
62
63 On Windows, compilers don't link directly to dynamic libraries.
64 The reason for this is that the exports are not always by symbol, the
65 Import Address Table (IAT) also allows exports by ordinal number
66 or raw addresses.
67
68 So to solve the linking issue, import libraries were added. Import libraries
69 can be seen as a specification of how to link implicitly against a dynamic
70 library. As a side note, import libraries are also the mechanism which
71 can be used to break mutual dependencies between shared libraries and to
72 implement delay loading or override the location of a shared library at
73 startup.
74
75 Linkers use these import libraries to populate the IAT of the resulting
76 binary. At startup the system dynamic loader processes the IAT entries
77 and populates the symbols with the correct addresses.
78
79 Anyway, the Windows PE format specifies a simple and efficient format for
80 this: It's essentially a list, saying these X symbols can be found in DLL y.
81 Commonly, y is a versioned name. e.g. liby_43.dll. This is an artifact of
82 the days when Windows did not support side-by-side assemblies. So the
83 solution was to version the DLLs by renaming them to include explicit
84 version numbers, and to then use the import libraries to point to the right
85 version, having the linker do the leg work.
86
87 The format in the PE specification is commonly named using the suffix .lib.
88 Unfortunately, GCC/binutils decided not to implement this format, and instead
89 have created their own format. This format is either named using the suffix
90 .dll.a or .a depending on the tool that makes them. This format is
91 undocumented. However the source of dlltool.c in binutils is pretty handy to
92 understant it.
93
94 To understand the implementation in GHC, this is what is important:
95
96 the .idata section group is used to hold this information. An import library
97 object file will always have these section groups, but the specific
98 configuration depends on what the purpose of the file is. They will also
99 never have a CODE or DATA section, though depending on the tool that creates
100 them they may have the section headers, which will mostly be empty.
101
102 You have to different possible configuration:
103
104 1) Those that define a redirection. In this case the .idata$7 section will
105 contain the name of the actual dll to load. This will be the only content
106 of the section. In the symbol table, the last symbol will be the name
107 used to refer to the dll in the relocation tables. This name will always
108 be in the format "symbol_name_iname", however when refered to, the format
109 "_head_symbol_name" is used.
110
111 We record this symbol early on during GetNames and load the dll and use
112 the module handle as the symbol address.
113
114 2) Symbol definitions. In this case .idata$6 will contain the symbol to load.
115 This is stored in the fixed format of 2-byte ordinals followed by a null
116 terminated string with the symbol name. The ordinal is to be used when
117 the dll does not export symbols by name. (NOTE: We don't currently
118 support this in the runtime linker, but it's easy to add should it be
119 needed). The last symbol in the symbol table of the section will contain
120 the name symbol which contains the dll name to use to resolve the
121 reference.
122
123 As a technicality, this also means that the GCC format will allow us to use
124 one library to store references to multiple dlls. This can't be produced by
125 dlltool, but it can be combined using ar. This is an important feature
126 required for dynamic linking support for GHC. So the runtime linker now
127 supports this too.
128
129 Note [Memory allocation]
130 ~~~~~~~~~~~~~~~~~~~~~~~~
131
132 Previously on Windows we would use VirtualAlloc to allocate enough space for
133 loading the entire object file into memory and keep it there for the duration
134 until the entire object file has been unloaded.
135
136 This has a couple of problems, first of, VirtualAlloc and the other Virtual
137 functions interact directly with the memory manager. Requesting memory from
138 VirtualAlloc will always return whole pages (32k), aligned on a 4k boundary.
139
140 This means for an object file of size N kbytes, we're always wasting 32-N
141 kbytes of memory. Nothing else can access this memory.
142
143 Because of this we're now using HeapAlloc and other heap function to create
144 a private heap. Another solution would have been to write our own memory
145 manager to keep track of where we have free memory, but the private heap
146 solution is simpler.
147
148 The private heap is created with full rights just as the pages we used to get
149 from VirtualAlloc (e.g. READ/WRITE/EXECUTE). In the end we end up using
150 memory much more efficiently than before. The downside is that heap memory
151 is always Allocated AND Committed, thus when the heap resizes the new size is
152 committed. It becomes harder to see how much we're actually using. This makes
153 it seem like for small programs that we're using more memory than before.
154 Certainly a clean GHCi startup will have a slightly higher commit count.
155
156 The second major change in how we allocate memory is that we no longer need
157 the entire object file. We now allocate the object file using normal malloc
158 and instead read bits from it. All tables are stored in the Object file info
159 table and are discarded as soon as they are no longer needed, e.g. after
160 relocation is finished. Only section data is kept around, but this data is
161 copied into the private heap.
162
163 The major knock on effect of this is that we have more memory to use in the
164 sub 2GB range, which means that Template Haskell should fail a lot less as we
165 will violate the small memory model much less than before.
166
167 Note [Section alignment]
168 ~~~~~~~~~~~~~~~~~~~~~~~~
169
170 The Windows linker aligns memory to it's section alignment requirement by
171 aligning it during the copying to the private heap. We also ensure that the
172 trampoline "region" we reserve is 8 bytes aligned.
173 */
174
175 #include "Rts.h"
176
177 #if defined(x86_64_HOST_ARCH)
178 #define USED_IF_x86_64_HOST_ARCH /* Nothing */
179 #else
180 #define USED_IF_x86_64_HOST_ARCH STG_UNUSED
181 #endif
182
183 #if defined(mingw32_HOST_OS)
184
185 #include "RtsUtils.h"
186 #include "RtsSymbolInfo.h"
187 #include "GetEnv.h"
188 #include "linker/PEi386.h"
189 #include "linker/PEi386Types.h"
190 #include "LinkerInternals.h"
191
192 #include <windows.h>
193 #include <shfolder.h> /* SHGetFolderPathW */
194 #include <math.h>
195 #include <wchar.h>
196 #include <stdbool.h>
197 #include <stdint.h>
198
199 #include <inttypes.h>
200 #include <dbghelp.h>
201 #include <stdlib.h>
202 #include <psapi.h>
203
204 #if defined(x86_64_HOST_ARCH)
205 static size_t makeSymbolExtra_PEi386(
206 ObjectCode* oc,
207 uint64_t index,
208 size_t s,
209 SymbolName* symbol);
210 #endif
211
212 static void addDLLHandle(
213 pathchar* dll_name,
214 HINSTANCE instance);
215
216 static bool verifyCOFFHeader(
217 uint16_t machine,
218 IMAGE_FILE_HEADER *hdr,
219 pathchar *fileName);
220
221 static bool checkIfDllLoaded(
222 HINSTANCE instance);
223
224 static uint32_t getSectionAlignment(
225 Section section);
226
227 static uint8_t* getAlignedMemory(
228 uint8_t* value,
229 Section section);
230
231 static size_t getAlignedValue(
232 size_t value,
233 Section section);
234
235 static void addCopySection(
236 ObjectCode *oc,
237 Section *s,
238 SectionKind kind,
239 SectionAlloc alloc,
240 void* start,
241 StgWord size);
242
243 static void releaseOcInfo(
244 ObjectCode* oc);
245
246 /* Add ld symbol for PE image base. */
247 #if defined(__GNUC__)
248 #define __ImageBase __MINGW_LSYMBOL(_image_base__)
249 #endif
250
251 /* Get the base of the module. */
252 /* This symbol is defined by ld. */
253 extern IMAGE_DOS_HEADER __ImageBase;
254 #define __image_base (void*)((HINSTANCE)&__ImageBase)
255
256 const Alignments pe_alignments[] = {
257 { IMAGE_SCN_ALIGN_1BYTES , 1 },
258 { IMAGE_SCN_ALIGN_2BYTES , 2 },
259 { IMAGE_SCN_ALIGN_4BYTES , 4 },
260 { IMAGE_SCN_ALIGN_8BYTES , 8 },
261 { IMAGE_SCN_ALIGN_16BYTES , 16 },
262 { IMAGE_SCN_ALIGN_32BYTES , 32 },
263 { IMAGE_SCN_ALIGN_64BYTES , 64 },
264 { IMAGE_SCN_ALIGN_128BYTES , 128 },
265 { IMAGE_SCN_ALIGN_256BYTES , 256 },
266 { IMAGE_SCN_ALIGN_512BYTES , 512 },
267 { IMAGE_SCN_ALIGN_1024BYTES, 1024},
268 { IMAGE_SCN_ALIGN_2048BYTES, 2048},
269 { IMAGE_SCN_ALIGN_4096BYTES, 4096},
270 { IMAGE_SCN_ALIGN_8192BYTES, 8192},
271 };
272
273 const int pe_alignments_cnt = sizeof (pe_alignments) / sizeof (Alignments);
274 const int default_alignment = 8;
275 const int initHeapSizeMB = 15;
276 static HANDLE code_heap = NULL;
277
278 /* Low Fragmentation Heap, try to prevent heap from increasing in size when
279 space can simply be reclaimed. These are enums missing from mingw-w64's
280 headers. */
281 #define HEAP_LFH 2
282 #define HeapOptimizeResources 3
283
284 void initLinker_PEi386()
285 {
286 if (!ghciInsertSymbolTable(WSTR("(GHCi/Ld special symbols)"),
287 symhash, "__image_base__", __image_base, HS_BOOL_TRUE, NULL)) {
288 barf("ghciInsertSymbolTable failed");
289 }
290
291 #if defined(mingw32_HOST_OS)
292 addDLLHandle(WSTR("*.exe"), GetModuleHandle(NULL));
293 /*
294 * Most of these are included by base, but GCC always includes them
295 * So lets make sure we always have them too.
296 *
297 * In most cases they would have been loaded by the
298 * addDLLHandle above.
299 */
300 addDLL(WSTR("msvcrt"));
301 addDLL(WSTR("kernel32"));
302 addDLL(WSTR("advapi32"));
303 addDLL(WSTR("shell32"));
304 addDLL(WSTR("user32"));
305 #endif
306
307 /* See Note [Memory allocation]. */
308 /* Create a private heap which we will use to store all code and data. */
309 SYSTEM_INFO sSysInfo;
310 GetSystemInfo(&sSysInfo);
311 code_heap = HeapCreate (HEAP_CREATE_ENABLE_EXECUTE,
312 initHeapSizeMB * sSysInfo.dwPageSize , 0);
313 if (!code_heap)
314 barf ("Could not create private heap during initialization. Aborting.");
315
316 /* Set some flags for the new code heap. */
317 HeapSetInformation(code_heap, HeapEnableTerminationOnCorruption, NULL, 0);
318 unsigned long HeapInformation = HEAP_LFH;
319 HeapSetInformation(code_heap, HeapEnableTerminationOnCorruption,
320 &HeapInformation, sizeof(HeapInformation));
321 HeapSetInformation(code_heap, HeapOptimizeResources, NULL, 0);
322 }
323
324 void exitLinker_PEi386()
325 {
326 /* See Note [Memory allocation]. */
327 if (code_heap) {
328 HeapDestroy (code_heap);
329 code_heap = NULL;
330 }
331 }
332
333 /* A list thereof. */
334 static OpenedDLL* opened_dlls = NULL;
335
336 /* A list thereof. */
337 static IndirectAddr* indirects = NULL;
338
339 /* Adds a DLL instance to the list of DLLs in which to search for symbols. */
340 static void addDLLHandle(pathchar* dll_name, HINSTANCE instance) {
341
342 /* At this point, we actually know what was loaded.
343 So bail out if it's already been loaded. */
344 if (checkIfDllLoaded(instance))
345 {
346 return;
347 }
348
349 OpenedDLL* o_dll;
350 o_dll = stgMallocBytes( sizeof(OpenedDLL), "addDLLHandle" );
351 o_dll->name = dll_name ? pathdup(dll_name) : NULL;
352 o_dll->instance = instance;
353 o_dll->next = opened_dlls;
354 opened_dlls = o_dll;
355
356 /* Now discover the dependencies of dll_name that were
357 just loaded in our process space. The reason is we have access to them
358 without the user having to explicitly specify them. */
359 PIMAGE_NT_HEADERS header =
360 (PIMAGE_NT_HEADERS)((BYTE *)instance +
361 ((PIMAGE_DOS_HEADER)instance)->e_lfanew);
362 PIMAGE_IMPORT_DESCRIPTOR imports =
363 (PIMAGE_IMPORT_DESCRIPTOR)((BYTE *)instance + header->
364 OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress);
365
366 bool importTableMissing =
367 header->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].Size == 0;
368
369 if (importTableMissing) {
370 return;
371 }
372
373 /* Ignore these compatibility shims. */
374 const pathchar* ms_dll = WSTR("api-ms-win-");
375 const int len = wcslen(ms_dll);
376
377 do {
378 pathchar* module = mkPath((char*)(BYTE *)instance + imports->Name);
379 HINSTANCE module_instance = GetModuleHandleW(module);
380 if (0 != wcsncmp(module, ms_dll, len)
381 && module_instance
382 && !checkIfDllLoaded(module_instance))
383 {
384 IF_DEBUG(linker, debugBelch("Loading dependency %" PATH_FMT " -> %" PATH_FMT ".\n", dll_name, module));
385 /* Now recursively load dependencies too. */
386 addDLLHandle(module, module_instance);
387 }
388 stgFree(module);
389 imports++;
390 } while (imports->Name);
391 }
392
393 static OpenedDLL* findLoadedDll(HINSTANCE instance)
394 {
395 for (OpenedDLL* o_dll = opened_dlls; o_dll != NULL; o_dll = o_dll->next) {
396 if (o_dll->instance == instance)
397 {
398 return o_dll;
399 }
400 }
401
402 return NULL;
403 }
404
405 static bool checkIfDllLoaded(HINSTANCE instance)
406 {
407 return findLoadedDll (instance) != NULL;
408 }
409
410 void freePreloadObjectFile_PEi386(ObjectCode *oc)
411 {
412 if (oc->image) {
413 stgFree (oc->image);
414 oc->image = NULL;
415 }
416
417 if (oc->info) {
418 if (oc->info->image) {
419 HeapFree(code_heap, 0, oc->info->image);
420 oc->info->image = NULL;
421 }
422 if (oc->info->ch_info)
423 stgFree (oc->info->ch_info);
424 stgFree (oc->info);
425 oc->info = NULL;
426 }
427
428 IndirectAddr *ia, *ia_next;
429 ia = indirects;
430 while (ia != NULL) {
431 ia_next = ia->next;
432 stgFree(ia);
433 ia = ia_next;
434 }
435 indirects = NULL;
436 }
437
438 static void releaseOcInfo(ObjectCode* oc) {
439 if (!oc) return;
440
441 if (oc->info) {
442 stgFree (oc->info->ch_info);
443 stgFree (oc->info->str_tab);
444 stgFree (oc->info->symbols);
445 stgFree (oc->info);
446 oc->info = NULL;
447 }
448 for (int i = 0; i < oc->n_sections; i++){
449 Section *section = &oc->sections[i];
450 if (section->info) {
451 stgFree (section->info->name);
452 if (section->info->relocs) {
453 stgFree (section->info->relocs);
454 section->info->relocs = NULL;
455 }
456 stgFree (section->info);
457 section->info = NULL;
458 }
459 }
460 }
461
462 /*************
463 * This function determines what kind of COFF image we are dealing with.
464 * This is needed in order to correctly load and verify objects and their
465 * sections.
466 *************/
467 COFF_OBJ_TYPE getObjectType ( char* image, pathchar* fileName )
468 {
469 /* {D1BAA1C7-BAEE-4ba9-AF20-FAF66AA4DCB8} */
470 static const char header_bigobj_classid[16] =
471 {
472 0xC7, 0xA1, 0xBA, 0xD1,
473 0xEE, 0xBA,
474 0xa9, 0x4b,
475 0xAF, 0x20,
476 0xFA, 0xF6, 0x6A, 0xA4, 0xDC, 0xB8
477 };
478
479 WORD machine;
480 COFF_OBJ_TYPE ret = COFF_UNKNOWN;
481 /* First check if we have an ANON_OBJECT_HEADER signature. */
482 ANON_OBJECT_HEADER* anon = (ANON_OBJECT_HEADER*)image;
483 if ( anon->Sig1 == IMAGE_FILE_MACHINE_UNKNOWN
484 && anon->Sig2 == IMPORT_OBJECT_HDR_SIG2)
485 {
486 machine = anon->Machine;
487 if (verifyCOFFHeader (machine, NULL, fileName))
488 {
489 switch (anon->Version)
490 {
491 case 0:
492 ret = COFF_IMPORT_LIB;
493 break;
494 case 1:
495 ret = COFF_ANON_OBJ;
496 break;
497 case 2:
498 if (memcmp (&anon->ClassID, header_bigobj_classid, 16) == 0)
499 ret = COFF_ANON_BIG_OBJ;
500 break;
501 default:
502 break;
503 }
504 }
505 } else {
506 /* If it's not an ANON_OBJECT then try an image file. */
507 IMAGE_FILE_HEADER* img = (IMAGE_FILE_HEADER*)image;
508 machine = img->Machine;
509 if (verifyCOFFHeader (machine, img, fileName))
510 ret = COFF_IMAGE;
511 }
512 return ret;
513 }
514
515 /*************
516 * Retrieve common header information
517 *************/
518 COFF_HEADER_INFO* getHeaderInfo ( ObjectCode* oc )
519 {
520 COFF_OBJ_TYPE coff_type = getObjectType (oc->image, oc->fileName);
521
522 COFF_HEADER_INFO* info
523 = stgMallocBytes (sizeof(COFF_HEADER_INFO), "getHeaderInfo");
524 memset (info, 0, sizeof(COFF_HEADER_INFO));
525 info->type = coff_type;
526 switch (coff_type)
527 {
528 case COFF_IMAGE:
529 {
530 IMAGE_FILE_HEADER* hdr = (IMAGE_FILE_HEADER*)oc->image;
531 info->sizeOfHeader = sizeof(IMAGE_FILE_HEADER);
532 info->sizeOfOptionalHeader = hdr->SizeOfOptionalHeader;
533 info->pointerToSymbolTable = hdr->PointerToSymbolTable;
534 info->numberOfSymbols = hdr->NumberOfSymbols;
535 info->numberOfSections = hdr->NumberOfSections;
536 }
537 break;
538 case COFF_ANON_BIG_OBJ:
539 {
540 ANON_OBJECT_HEADER_BIGOBJ* hdr = (ANON_OBJECT_HEADER_BIGOBJ*)oc->image;
541 info->sizeOfHeader = sizeof(ANON_OBJECT_HEADER_BIGOBJ);
542 info->sizeOfOptionalHeader = 0;
543 info->pointerToSymbolTable = hdr->PointerToSymbolTable;
544 info->numberOfSymbols = hdr->NumberOfSymbols;
545 info->numberOfSections = hdr->NumberOfSections;
546 }
547 break;
548 default:
549 {
550 stgFree (info);
551 info = NULL;
552 errorBelch ("Unknown COFF %d type in getHeaderInfo.", coff_type);
553 }
554 break;
555 }
556
557 return info;
558 }
559
560 /*************
561 * Symbol utility functions
562 *************/
563 __attribute__ ((always_inline)) inline
564 size_t getSymbolSize ( COFF_HEADER_INFO *info )
565 {
566 ASSERT(info);
567 switch (info->type)
568 {
569 case COFF_ANON_BIG_OBJ:
570 return sizeof_COFF_symbol_ex;
571 default:
572 return sizeof_COFF_symbol_og;
573 }
574 }
575
576 __attribute__ ((always_inline)) inline
577 int32_t getSymSectionNumber ( COFF_HEADER_INFO *info, COFF_symbol* sym )
578 {
579 ASSERT(info);
580 ASSERT(sym);
581 switch (info->type)
582 {
583 case COFF_ANON_BIG_OBJ:
584 return sym->ex.SectionNumber;
585 default:
586 return sym->og.SectionNumber;
587 }
588 }
589
590 __attribute__ ((always_inline)) inline
591 uint32_t getSymValue ( COFF_HEADER_INFO *info, COFF_symbol* sym )
592 {
593 ASSERT(info);
594 ASSERT(sym);
595 switch (info->type)
596 {
597 case COFF_ANON_BIG_OBJ:
598 return sym->ex.Value;
599 default:
600 return sym->og.Value;
601 }
602 }
603
604 __attribute__ ((always_inline)) inline
605 uint8_t getSymStorageClass ( COFF_HEADER_INFO *info, COFF_symbol* sym )
606 {
607 ASSERT(info);
608 ASSERT(sym);
609 switch (info->type)
610 {
611 case COFF_ANON_BIG_OBJ:
612 return sym->ex.StorageClass;
613 default:
614 return sym->og.StorageClass;
615 }
616 }
617
618 __attribute__ ((always_inline)) inline
619 uint8_t getSymNumberOfAuxSymbols ( COFF_HEADER_INFO *info, COFF_symbol* sym )
620 {
621 ASSERT(info);
622 ASSERT(sym);
623 switch (info->type)
624 {
625 case COFF_ANON_BIG_OBJ:
626 return sym->ex.NumberOfAuxSymbols;
627 default:
628 return sym->og.NumberOfAuxSymbols;
629 }
630 }
631
632 __attribute__ ((always_inline)) inline
633 uint16_t getSymType ( COFF_HEADER_INFO *info, COFF_symbol* sym )
634 {
635 ASSERT(info);
636 ASSERT(sym);
637 switch (info->type)
638 {
639 case COFF_ANON_BIG_OBJ:
640 return sym->ex.Type;
641 default:
642 return sym->og.Type;
643 }
644 }
645
646 __attribute__ ((always_inline)) inline
647 uint8_t* getSymShortName ( COFF_HEADER_INFO *info, COFF_symbol* sym )
648 {
649 ASSERT(info);
650 ASSERT(sym);
651 switch (info->type)
652 {
653 case COFF_ANON_BIG_OBJ:
654 return sym->ex.N.ShortName;
655 default:
656 return sym->og.N.ShortName;
657 }
658 }
659
660 const char *
661 addDLL_PEi386( pathchar *dll_name, HINSTANCE *loaded )
662 {
663 /* ------------------- Win32 DLL loader ------------------- */
664
665 pathchar* buf;
666 HINSTANCE instance;
667
668 IF_DEBUG(linker, debugBelch("addDLL; dll_name = `%" PATH_FMT "'\n", dll_name));
669
670 /* The file name has no suffix (yet) so that we can try
671 both foo.dll and foo.drv
672
673 The documentation for LoadLibrary says:
674 If no file name extension is specified in the lpFileName
675 parameter, the default library extension .dll is
676 appended. However, the file name string can include a trailing
677 point character (.) to indicate that the module name has no
678 extension. */
679
680 size_t bufsize = pathlen(dll_name) + 10;
681 buf = stgMallocBytes(bufsize * sizeof(wchar_t), "addDLL");
682
683 /* These are ordered by probability of success and order we'd like them. */
684 const wchar_t *formats[] = { L"%ls.DLL", L"%ls.DRV", L"lib%ls.DLL", L"%ls" };
685 const DWORD flags[] = { LOAD_LIBRARY_SEARCH_USER_DIRS | LOAD_LIBRARY_SEARCH_DEFAULT_DIRS, 0 };
686
687 int cFormat, cFlag;
688 int flags_start = 1; /* Assume we don't support the new API. */
689
690 /* Detect if newer API are available, if not, skip the first flags entry. */
691 if (GetProcAddress((HMODULE)LoadLibraryW(L"Kernel32.DLL"), "AddDllDirectory")) {
692 flags_start = 0;
693 }
694
695 /* Iterate through the possible flags and formats. */
696 for (cFlag = flags_start; cFlag < 2; cFlag++)
697 {
698 for (cFormat = 0; cFormat < 4; cFormat++)
699 {
700 snwprintf(buf, bufsize, formats[cFormat], dll_name);
701 instance = LoadLibraryExW(buf, NULL, flags[cFlag]);
702 if (instance == NULL) {
703 if (GetLastError() != ERROR_MOD_NOT_FOUND)
704 {
705 goto error;
706 }
707 }
708 else
709 {
710 break; /* We're done. DLL has been loaded. */
711 }
712 }
713 }
714
715 /* Check if we managed to load the DLL. */
716 if (instance == NULL) {
717 goto error;
718 }
719
720 addDLLHandle(buf, instance);
721 if (loaded) {
722 *loaded = instance;
723 }
724 stgFree(buf);
725
726 return NULL;
727
728 error:
729 stgFree(buf);
730
731 char* errormsg = malloc(sizeof(char) * 80);
732 snprintf(errormsg, 80, "addDLL: %" PATH_FMT " or dependencies not loaded. (Win32 error %lu)", dll_name, GetLastError());
733 /* LoadLibrary failed; return a ptr to the error msg. */
734 return errormsg;
735 }
736
737 pathchar* findSystemLibrary_PEi386( pathchar* dll_name )
738 {
739 const unsigned int init_buf_size = 1024;
740 unsigned int bufsize = init_buf_size;
741 wchar_t* result = malloc(sizeof(wchar_t) * bufsize);
742 DWORD wResult = SearchPathW(NULL, dll_name, NULL, bufsize, result, NULL);
743
744 if (wResult > bufsize) {
745 result = realloc(result, sizeof(wchar_t) * wResult);
746 wResult = SearchPathW(NULL, dll_name, NULL, wResult, result, NULL);
747 }
748
749
750 if (!wResult) {
751 free(result);
752 return NULL;
753 }
754
755 return result;
756 }
757
758 HsPtr addLibrarySearchPath_PEi386(pathchar* dll_path)
759 {
760 HINSTANCE hDLL = LoadLibraryW(L"Kernel32.DLL");
761 LPAddDLLDirectory AddDllDirectory = (LPAddDLLDirectory)GetProcAddress((HMODULE)hDLL, "AddDllDirectory");
762
763 HsPtr result = NULL;
764
765 const unsigned int init_buf_size = 4096;
766 int bufsize = init_buf_size;
767
768 // Make sure the path is an absolute path
769 WCHAR* abs_path = malloc(sizeof(WCHAR) * init_buf_size);
770 DWORD wResult = GetFullPathNameW(dll_path, bufsize, abs_path, NULL);
771 if (!wResult){
772 sysErrorBelch("addLibrarySearchPath[GetFullPathNameW]: %" PATH_FMT " (Win32 error %lu)", dll_path, GetLastError());
773 }
774 else if (wResult > init_buf_size) {
775 abs_path = realloc(abs_path, sizeof(WCHAR) * wResult);
776 if (!GetFullPathNameW(dll_path, bufsize, abs_path, NULL)) {
777 sysErrorBelch("addLibrarySearchPath[GetFullPathNameW]: %" PATH_FMT " (Win32 error %lu)", dll_path, GetLastError());
778 }
779 }
780
781 if (AddDllDirectory) {
782 result = AddDllDirectory(abs_path);
783 }
784 else
785 {
786 warnMissingKBLibraryPaths();
787 WCHAR* str = malloc(sizeof(WCHAR) * init_buf_size);
788 wResult = GetEnvironmentVariableW(L"PATH", str, bufsize);
789
790 if (wResult > init_buf_size) {
791 str = realloc(str, sizeof(WCHAR) * wResult);
792 bufsize = wResult;
793 wResult = GetEnvironmentVariableW(L"PATH", str, bufsize);
794 if (!wResult) {
795 sysErrorBelch("addLibrarySearchPath[GetEnvironmentVariableW]: %" PATH_FMT " (Win32 error %lu)", dll_path, GetLastError());
796 }
797 }
798
799 bufsize = wResult + 2 + pathlen(abs_path);
800 wchar_t* newPath = malloc(sizeof(wchar_t) * bufsize);
801
802 wcscpy(newPath, abs_path);
803 wcscat(newPath, L";");
804 wcscat(newPath, str);
805 if (!SetEnvironmentVariableW(L"PATH", (LPCWSTR)newPath)) {
806 sysErrorBelch("addLibrarySearchPath[SetEnvironmentVariableW]: %" PATH_FMT " (Win32 error %lu)", abs_path, GetLastError());
807 }
808
809 free(newPath);
810 free(abs_path);
811
812 return str;
813 }
814
815 if (!result) {
816 sysErrorBelch("addLibrarySearchPath: %" PATH_FMT " (Win32 error %lu)", abs_path, GetLastError());
817 free(abs_path);
818 return NULL;
819 }
820
821 free(abs_path);
822 return result;
823 }
824
825 bool removeLibrarySearchPath_PEi386(HsPtr dll_path_index)
826 {
827 bool result = false;
828
829 if (dll_path_index != NULL) {
830 HINSTANCE hDLL = LoadLibraryW(L"Kernel32.DLL");
831 LPRemoveDLLDirectory RemoveDllDirectory = (LPRemoveDLLDirectory)GetProcAddress((HMODULE)hDLL, "RemoveDllDirectory");
832
833 if (RemoveDllDirectory) {
834 result = RemoveDllDirectory(dll_path_index);
835 // dll_path_index is now invalid, do not use it after this point.
836 }
837 else
838 {
839 warnMissingKBLibraryPaths();
840 result = SetEnvironmentVariableW(L"PATH", (LPCWSTR)dll_path_index);
841 free(dll_path_index);
842 }
843
844 if (!result) {
845 sysErrorBelch("removeLibrarySearchPath: (Win32 error %lu)", GetLastError());
846 return false;
847 }
848 }
849
850 return !result;
851 }
852
853
854 /* We assume file pointer is right at the
855 beginning of COFF object.
856 */
857 static uint32_t getSectionAlignment(
858 Section section) {
859 uint32_t c = section.info->props;
860 for(int i = 0; i < pe_alignments_cnt; i++)
861 {
862 if ((c & 0xF00000) == pe_alignments[i].mask)
863 return pe_alignments[i].value;
864 }
865
866 /* No alignment flag found, assume 8-byte aligned. */
867 return default_alignment;
868 }
869
870 /* ----------------------
871 * return a memory location aligned to the section requirements
872 */
873 static uint8_t* getAlignedMemory(
874 uint8_t* value, Section section) {
875 uint32_t alignment = getSectionAlignment(section);
876 uintptr_t mask = (uintptr_t)alignment - 1;
877 return (uint8_t*)(((uintptr_t)value + mask) & ~mask);
878 }
879
880 /* ----------------------
881 * return a value aligned to the section requirements
882 */
883 static size_t getAlignedValue(
884 size_t value, Section section) {
885 uint32_t alignment = getSectionAlignment(section);
886 uint32_t mask = (uint32_t)alignment - 1;
887 return (size_t)((value + mask) & ~mask);
888 }
889
890 /* -----------------------
891 * This loads import libraries following Microsoft's official standard in the PE
892 * documentation. This is a smaller more efficient format which is just a list
893 * of symbol name => dll.
894 *
895 * This function must fail gracefully and if it does, the filestream needs to
896 * be reset to what it was when the function was called.
897 */
898 bool checkAndLoadImportLibrary( pathchar* arch_name, char* member_name, FILE* f )
899 {
900 char* image;
901 static bool load_dll_warn = false;
902
903 if (load_dll_warn) { return 0; }
904
905 /* Based on Import Library specification. PE Spec section 7.1 */
906
907 COFF_import_header hdr;
908 size_t n;
909
910 n = fread(&hdr, 1, sizeof_COFF_import_Header, f);
911 if (n != sizeof_COFF_import_Header) {
912 errorBelch("loadImportLibrary: error whilst reading `%s' header "
913 "in `%" PATH_FMT "'\n",
914 member_name, arch_name);
915 fseek(f, -(long int)sizeof_COFF_import_Header, SEEK_CUR);
916 return false;
917 }
918
919 if ( hdr.Sig1 != IMAGE_FILE_MACHINE_UNKNOWN
920 || hdr.Sig2 != IMPORT_OBJECT_HDR_SIG2
921 || getObjectType ((char*)&hdr, arch_name) != COFF_IMPORT_LIB) {
922 fseek(f, -(long int)sizeof_COFF_import_Header, SEEK_CUR);
923 IF_DEBUG(linker, debugBelch("loadArchive: Object `%s` is not an import lib. Skipping...\n", member_name));
924 return false;
925 }
926
927 IF_DEBUG(linker, debugBelch("loadArchive: reading %lu bytes at %ld\n", hdr.SizeOfData, ftell(f)));
928
929 image = stgMallocBytes(hdr.SizeOfData, "checkAndLoadImportLibrary(image)");
930 n = fread(image, 1, hdr.SizeOfData, f);
931 if (n != hdr.SizeOfData) {
932 errorBelch("loadArchive: error whilst reading `%s' header in `%" PATH_FMT "'. Did not read enough bytes.\n",
933 member_name, arch_name);
934 fseek(f, -(n + sizeof_COFF_import_Header), SEEK_CUR);
935 return false;
936 }
937
938 char* symbol = strtok(image, "\0");
939 int symLen = strlen(symbol) + 1;
940 int nameLen = n - symLen;
941 char* dllName = stgMallocBytes(sizeof(char) * nameLen,
942 "checkAndLoadImportLibrary(dllname)");
943 dllName = strncpy(dllName, image + symLen, nameLen);
944 pathchar* dll = stgMallocBytes(sizeof(wchar_t) * nameLen,
945 "checkAndLoadImportLibrary(dll)");
946 mbstowcs(dll, dllName, nameLen);
947 stgFree(dllName);
948
949 IF_DEBUG(linker, debugBelch("loadArchive: read symbol %s from lib `%" PATH_FMT "'\n", symbol, dll));
950 const char* result = addDLL(dll);
951
952 stgFree(image);
953
954 if (result != NULL) {
955 errorBelch("Could not load `%" PATH_FMT "'. Reason: %s\n", dll, result);
956 load_dll_warn = true;
957
958 stgFree(dll);
959 fseek(f, -(n + sizeof_COFF_import_Header), SEEK_CUR);
960 return false;
961 }
962
963 stgFree(dll);
964 return true;
965 }
966
967 static void
968 printName ( uint8_t* name, ObjectCode* oc )
969 {
970 if (name[0]==0 && name[1]==0 && name[2]==0 && name[3]==0) {
971 uint32_t strtab_offset = * (uint32_t*)(name + 4);
972 debugBelch("%s",
973 oc->info->str_tab + strtab_offset - PEi386_STRTAB_OFFSET);
974 } else {
975 int i;
976 for (i = 0; i < 8; i++) {
977 if (name[i] == 0) break;
978 debugBelch("%c", name[i] );
979 }
980 }
981 }
982
983
984 static void
985 copyName ( uint8_t* name, ObjectCode* oc, uint8_t* dst, int dstSize )
986 {
987 if (name[0]==0 && name[1]==0 && name[2]==0 && name[3]==0) {
988 uint32_t strtab_offset = * (uint32_t*)(name + 4);
989 strncpy ((char*)dst,
990 oc->info->str_tab + strtab_offset - PEi386_STRTAB_OFFSET,
991 dstSize);
992 dst[dstSize-1] = 0;
993 } else {
994 int i = 0;
995 while (1) {
996 if (i >= 8) break;
997 if (name[i] == 0) break;
998 dst[i] = name[i];
999 i++;
1000 }
1001 dst[i] = 0;
1002 }
1003 }
1004
1005
1006 char*
1007 get_sym_name ( uint8_t* name, ObjectCode* oc )
1008 {
1009 char* newstr;
1010 /* If the string is longer than 8 bytes, look in the
1011 string table for it -- this will be correctly zero terminated.
1012 */
1013 if (name[0]==0 && name[1]==0 && name[2]==0 && name[3]==0) {
1014 uint32_t strtab_offset = * (uint32_t*)(name + 4);
1015 return oc->info->str_tab + strtab_offset - PEi386_STRTAB_OFFSET;
1016 }
1017 /* Otherwise, if shorter than 8 bytes, return the original,
1018 which by defn is correctly terminated.
1019 */
1020 if (name[7]==0) return (char*)name;
1021 /* The annoying case: 8 bytes. Copy into a temporary
1022 (XXX which is never freed ...)
1023 */
1024 newstr = stgMallocBytes(9, "get_sym_name");
1025 ASSERT(newstr);
1026 strncpy (newstr, (char*)name,8);
1027 newstr[8] = 0;
1028 return newstr;
1029 }
1030
1031 /* Getting the name of a section is mildly tricky, so we make a
1032 function for it. Sadly, in one case we have to copy the string
1033 (when it is exactly 8 bytes long there's no trailing '\0'), so for
1034 consistency we *always* copy the string; the caller must free it
1035 */
1036 char *
1037 get_name_string (uint8_t* name, ObjectCode* oc)
1038 {
1039 char *newstr;
1040
1041 if (name[0]=='/') {
1042 int strtab_offset = strtol((char*)name+1,NULL,10)-PEi386_STRTAB_OFFSET;
1043 char* str = oc->info->str_tab + strtab_offset;
1044 int len = strlen(str);
1045
1046 newstr = stgMallocBytes(len + 1, "cstring_from_section_symbol_name");
1047 strncpy(newstr, str, len + 1);
1048 return newstr;
1049 }
1050 else
1051 {
1052 newstr = stgMallocBytes(9, "cstring_from_section_symbol_name");
1053 ASSERT(newstr);
1054 strncpy(newstr,(char*)name,8);
1055 newstr[8] = 0;
1056 return newstr;
1057 }
1058 }
1059
1060 /* See Note [mingw-w64 name decoration scheme] */
1061 #if !defined(x86_64_HOST_ARCH)
1062 static void
1063 zapTrailingAtSign ( SymbolName* sym )
1064 {
1065 char* lst = strrchr (sym, '@');
1066 if (lst) lst[0]='\0';
1067 }
1068 #endif
1069
1070 SymbolAddr*
1071 lookupSymbolInDLLs ( const SymbolName* lbl )
1072 {
1073 OpenedDLL* o_dll;
1074 SymbolAddr* sym;
1075
1076 for (o_dll = opened_dlls; o_dll != NULL; o_dll = o_dll->next) {
1077 /* debugBelch("look in %ls for %s\n", o_dll->name, lbl); */
1078
1079 sym = GetProcAddress(o_dll->instance, lbl+STRIP_LEADING_UNDERSCORE);
1080 if (sym != NULL) {
1081 /*debugBelch("found %s in %s\n", lbl+1,o_dll->name);*/
1082 return sym;
1083 }
1084
1085 /* Ticket #2283.
1086 Long description: http://support.microsoft.com/kb/132044
1087 tl;dr:
1088 If C/C++ compiler sees __declspec(dllimport) ... foo ...
1089 it generates call *__imp_foo, and __imp_foo here has exactly
1090 the same semantics as in __imp_foo = GetProcAddress(..., "foo")
1091 */
1092 if (sym == NULL && strncmp (lbl, "__imp_", 6) == 0) {
1093 sym = GetProcAddress(o_dll->instance,
1094 lbl + 6 + STRIP_LEADING_UNDERSCORE);
1095 if (sym != NULL) {
1096 IndirectAddr* ret;
1097 ret = stgMallocBytes( sizeof(IndirectAddr), "lookupSymbolInDLLs" );
1098 ret->addr = sym;
1099 ret->next = indirects;
1100 indirects = ret;
1101 IF_DEBUG(linker,
1102 debugBelch("warning: %s from %S is linked instead of %s\n",
1103 lbl+6+STRIP_LEADING_UNDERSCORE, o_dll->name, lbl));
1104 return (void*) & ret->addr;
1105 }
1106 }
1107
1108 sym = GetProcAddress(o_dll->instance, lbl);
1109 if (sym != NULL) {
1110 /*debugBelch("found %s in %s\n", lbl,o_dll->name);*/
1111 return sym;
1112 }
1113 }
1114 return NULL;
1115 }
1116
1117 static bool
1118 verifyCOFFHeader ( uint16_t machine, IMAGE_FILE_HEADER *hdr,
1119 pathchar *fileName )
1120 {
1121 #if defined(i386_HOST_ARCH)
1122 if (machine != IMAGE_FILE_MACHINE_I386) {
1123 errorBelch("%" PATH_FMT ": Not a x86 PE file.", fileName);
1124 return false;
1125 }
1126 #elif defined(x86_64_HOST_ARCH)
1127 if (machine != IMAGE_FILE_MACHINE_AMD64) {
1128 errorBelch("%" PATH_FMT ": Not a x86_64 PE+ file.", fileName);
1129 return false;
1130 }
1131 #else
1132 errorBelch("PE/PE+ not supported on this arch.");
1133 #endif
1134
1135 if (!hdr)
1136 return true;
1137
1138 if (hdr->SizeOfOptionalHeader != 0) {
1139 errorBelch("%" PATH_FMT ": PE/PE+ with nonempty optional header",
1140 fileName);
1141 return 0;
1142 }
1143 if ( (hdr->Characteristics & IMAGE_FILE_EXECUTABLE_IMAGE) ||
1144 (hdr->Characteristics & IMAGE_FILE_DLL ) ||
1145 (hdr->Characteristics & IMAGE_FILE_SYSTEM ) ) {
1146 errorBelch("%" PATH_FMT ": Not a PE/PE+ object file", fileName);
1147 return false;
1148 }
1149 if ( (hdr->Characteristics & IMAGE_FILE_BYTES_REVERSED_HI)) {
1150 errorBelch("%" PATH_FMT ": Invalid PE/PE+ word size or endianness: %d",
1151 fileName,
1152 (int)(hdr->Characteristics));
1153 return false;
1154 }
1155 return true;
1156 }
1157
1158 bool
1159 ocVerifyImage_PEi386 ( ObjectCode* oc )
1160 {
1161 COFF_HEADER_INFO *info = getHeaderInfo (oc);
1162
1163 /* If the header could not be read, then don't process the ObjectCode.
1164 This the case when the ObjectCode has been partially freed. */
1165 if (!info)
1166 return false;
1167
1168 uint32_t i, noRelocs;
1169 COFF_section* sectab;
1170 COFF_symbol* symtab;
1171 uint8_t* strtab;
1172
1173 sectab = (COFF_section*) (
1174 ((uint8_t*)(oc->image))
1175 + info->sizeOfHeader + info->sizeOfOptionalHeader
1176 );
1177 symtab = (COFF_symbol*) (
1178 ((uint8_t*)(oc->image))
1179 + info->pointerToSymbolTable
1180 );
1181 strtab = ((uint8_t*)symtab)
1182 + info->numberOfSymbols * getSymbolSize (info);
1183
1184 /* .BSS Section is initialized in ocGetNames_PEi386
1185 but we need the Sections array initialized here already. */
1186 Section *sections;
1187 sections = (Section*)stgCallocBytes(
1188 sizeof(Section),
1189 info->numberOfSections + 1, /* +1 for the global BSS section see ocGetNames_PEi386 */
1190 "ocVerifyImage_PEi386(sections)");
1191 oc->sections = sections;
1192 oc->n_sections = info->numberOfSections + 1;
1193 oc->info = stgCallocBytes (sizeof(struct ObjectCodeFormatInfo), 1,
1194 "ocVerifyImage_PEi386(info)");
1195 oc->info->secBytesTotal = 0;
1196 oc->info->secBytesUsed = 0;
1197 oc->info->init = NULL;
1198 oc->info->finit = NULL;
1199 oc->info->ch_info = info;
1200
1201 /* Copy the tables over from object-file. Copying these allows us to
1202 simplify the indexing and to release the object file immediately after
1203 this step as all information we need would be in available. After
1204 loading we can also release everything in the info structure as it won't
1205 be needed again further freeing up memory.
1206 COFF_symbol is a union type, so we have to "adjust" the array to be able
1207 to access it using normal subscript notation. This eliminates the complex
1208 indexing later on. */
1209 uint32_t s_symbols = info->numberOfSymbols * sizeof(COFF_symbol);
1210 uint32_t sym_size = getSymbolSize (info);
1211 oc->info->symbols
1212 = stgMallocBytes (s_symbols, "ocVerifyImage_PEi386(oc->info->symbols)");
1213 for (i = 0; i < info->numberOfSymbols; i++)
1214 memcpy (oc->info->symbols+i, (char*)symtab + sym_size * i, sym_size);
1215
1216 uint32_t n_strtab = (*(uint32_t*)strtab) - PEi386_STRTAB_OFFSET;
1217 oc->info->str_tab
1218 = stgMallocBytes (n_strtab, "ocVerifyImage_PEi386(oc->info->str_tab)");
1219 memcpy (oc->info->str_tab, strtab + PEi386_STRTAB_OFFSET, n_strtab);
1220
1221 /* Initialize the Sections */
1222 for (i = 0; i < info->numberOfSections; i++) {
1223 uint32_t relocs_offset;
1224 COFF_section* sectab_i
1225 = (COFF_section*)
1226 myindex(sizeof_COFF_section, sectab, i);
1227
1228 Section *section = &sections[i];
1229 /* Calculate the start of the section data. */
1230 section->start = oc->image + sectab_i->PointerToRawData;
1231 section->size = sectab_i->SizeOfRawData;
1232 section->info = stgCallocBytes (sizeof(struct SectionFormatInfo), 1,
1233 "ocVerifyImage_PEi386(section.info)");
1234 section->info->name = get_name_string (sectab_i->Name, oc);
1235 section->info->alignment = getSectionAlignment (*section);
1236 section->info->props = sectab_i->Characteristics;
1237 section->info->virtualSize = sectab_i->Misc.VirtualSize;
1238 section->info->virtualAddr = sectab_i->VirtualAddress;
1239
1240 COFF_reloc* reltab
1241 = (COFF_reloc*) (oc->image + sectab_i->PointerToRelocations);
1242
1243 if (section->info->props & IMAGE_SCN_LNK_NRELOC_OVFL ) {
1244 /* If the relocation field (a short) has overflowed, the
1245 * real count can be found in the first reloc entry.
1246 *
1247 * See Section 4.1 (last para) of the PE spec (rev6.0).
1248 */
1249 COFF_reloc* rel = (COFF_reloc*)
1250 myindex ( sizeof_COFF_reloc, reltab, 0 );
1251 noRelocs = rel->VirtualAddress;
1252 relocs_offset = 1;
1253 } else {
1254 noRelocs = sectab_i->NumberOfRelocations;
1255 relocs_offset = 0;
1256 }
1257
1258 section->info->noRelocs = noRelocs;
1259 section->info->relocs = NULL;
1260 if (noRelocs > 0) {
1261 section->info->relocs
1262 = stgMallocBytes (noRelocs * sizeof (COFF_reloc),
1263 "ocVerifyImage_PEi386(section->info->relocs)");
1264 memcpy (section->info->relocs, reltab + relocs_offset,
1265 noRelocs * sizeof (COFF_reloc));
1266 }
1267
1268 oc->info->secBytesTotal += getAlignedValue (section->size, *section);
1269 }
1270
1271 /* Initialize the last section's info field which contains the .bss
1272 section, it doesn't need an info so set it to NULL. */
1273 sections[info->numberOfSections].info = NULL;
1274
1275 /* Calculate space for trampolines nearby.
1276 We get back 8-byte aligned memory (is that guaranteed?), but
1277 the offsets to the sections within the file are all 4 mod 8
1278 (is that guaranteed?). We therefore need to offset the image
1279 by 4, so that all the pointers are 8-byte aligned, so that
1280 pointer tagging works. */
1281 /* For 32-bit case we don't need this, hence we use macro
1282 PEi386_IMAGE_OFFSET, which equals to 4 for 64-bit case and 0 for
1283 32-bit case. */
1284 /* We allocate trampolines area for all symbols right behind
1285 image data, aligned on 8. */
1286 oc->info->trampoline
1287 = (PEi386_IMAGE_OFFSET + 2 * default_alignment
1288 + oc->info->secBytesTotal) & ~0x7;
1289 oc->info->secBytesTotal
1290 = oc->info->trampoline + info->numberOfSymbols * sizeof(SymbolExtra);
1291
1292 /* No further verification after this point; only debug printing. */
1293 i = 0;
1294 IF_DEBUG(linker, i=1);
1295 if (i == 0) return true;
1296
1297 debugBelch("sectab offset = %" FMT_SizeT "\n",
1298 ((uint8_t*)sectab) - ((uint8_t*)oc->image) );
1299 debugBelch("symtab offset = %" FMT_SizeT "\n",
1300 ((uint8_t*)symtab) - ((uint8_t*)oc->image) );
1301 debugBelch("strtab offset = %" FMT_SizeT "\n",
1302 ((uint8_t*)strtab) - ((uint8_t*)oc->image) );
1303
1304 debugBelch("\n" );
1305 if (info->type == COFF_IMAGE)
1306 {
1307 IMAGE_FILE_HEADER* hdr = (IMAGE_FILE_HEADER*)oc->image;
1308 debugBelch( "COFF Type: IMAGE_FILE_HEADER\n");
1309 debugBelch( "Machine: 0x%x\n",
1310 (uint32_t)(hdr->Machine) );
1311 debugBelch( "# sections: %d\n",
1312 (uint32_t)(hdr->NumberOfSections) );
1313 debugBelch( "time/date: 0x%x\n",
1314 (uint32_t)(hdr->TimeDateStamp) );
1315 debugBelch( "symtab offset: %d\n",
1316 (uint32_t)(hdr->PointerToSymbolTable) );
1317 debugBelch( "# symbols: %d\n",
1318 (uint32_t)(hdr->NumberOfSymbols) );
1319 debugBelch( "sz of opt hdr: %d\n",
1320 (uint32_t)(hdr->SizeOfOptionalHeader) );
1321 debugBelch( "characteristics: 0x%x\n",
1322 (uint32_t)(hdr->Characteristics) );
1323 }
1324 else if (info->type == COFF_ANON_BIG_OBJ)
1325 {
1326 ANON_OBJECT_HEADER_BIGOBJ* hdr = (ANON_OBJECT_HEADER_BIGOBJ*)oc->image;
1327 debugBelch( "COFF Type: ANON_OBJECT_HEADER_BIGOBJ\n");
1328 debugBelch( "Machine: 0x%x\n",
1329 (uint32_t)(hdr->Machine) );
1330 debugBelch( "# sections: %d\n",
1331 (uint32_t)(hdr->NumberOfSections) );
1332 debugBelch( "time/date: 0x%x\n",
1333 (uint32_t)(hdr->TimeDateStamp) );
1334 debugBelch( "symtab offset: %d\n",
1335 (uint32_t)(hdr->PointerToSymbolTable) );
1336 debugBelch( "# symbols: %d\n",
1337 (uint32_t)(hdr->NumberOfSymbols) );
1338 }
1339 else
1340 {
1341 debugBelch( "COFF Type: UNKNOWN\n");
1342 return false;
1343 }
1344
1345 /* Print the section table. */
1346 debugBelch("\n" );
1347 for (i = 0; i < info->numberOfSections; i++) {
1348 COFF_section* sectab_i
1349 = (COFF_section*)
1350 myindex ( sizeof_COFF_section, sectab, i );
1351 Section section = sections[i];
1352 debugBelch(
1353 "\n"
1354 "section %d\n"
1355 " name `",
1356 i
1357 );
1358 printName (sectab_i->Name, oc);
1359 debugBelch(
1360 "'\n"
1361 " vsize %lu\n"
1362 " vaddr %lu\n"
1363 " data sz %lu\n"
1364 " data off 0x%p\n"
1365 " num rel %hu\n"
1366 " off rel %lu\n"
1367 " ptr raw 0x%lx\n"
1368 " align %u\n"
1369 " data adj %zu\n",
1370 sectab_i->Misc.VirtualSize,
1371 sectab_i->VirtualAddress,
1372 sectab_i->SizeOfRawData,
1373 section.start,
1374 sectab_i->NumberOfRelocations,
1375 sectab_i->PointerToRelocations,
1376 sectab_i->PointerToRawData,
1377 getSectionAlignment (section),
1378 getAlignedValue (section.size, section)
1379 );
1380
1381 noRelocs = section.info->noRelocs;
1382 for (uint32_t j = 0; j < noRelocs; j++) {
1383 COFF_reloc rel = section.info->relocs[j];
1384 debugBelch(
1385 " type 0x%-4x vaddr 0x%-8lx name `",
1386 rel.Type,
1387 rel.VirtualAddress );
1388 COFF_symbol sym = oc->info->symbols[rel.SymbolTableIndex];
1389 printName (getSymShortName (info, &sym), oc);
1390 debugBelch("'\n" );
1391 }
1392
1393 debugBelch("\n" );
1394 }
1395 debugBelch("\n" );
1396 debugBelch("string table has size 0x%x\n", n_strtab + PEi386_STRTAB_OFFSET);
1397 debugBelch("---START of string table---\n");
1398 for (i = 4; i < n_strtab; i++) {
1399 if (strtab[i] == 0)
1400 debugBelch("\n"); else
1401 debugBelch("%c", strtab[i] );
1402 }
1403 debugBelch("--- END of string table---\n");
1404
1405 debugBelch("\n" );
1406
1407 for (i = 0; i < info->numberOfSymbols; i++) {
1408 COFF_symbol* symtab_i = &oc->info->symbols[i];
1409 debugBelch(
1410 "symbol %d\n"
1411 " name `",
1412 i
1413 );
1414 printName (getSymShortName (info, symtab_i), oc);
1415 debugBelch(
1416 "'\n"
1417 " value 0x%x\n"
1418 " 1+sec# %d\n"
1419 " type 0x%x\n"
1420 " sclass 0x%x\n"
1421 " nAux %d\n",
1422 getSymValue (info, symtab_i),
1423 getSymSectionNumber (info, symtab_i),
1424 getSymType (info, symtab_i),
1425 getSymStorageClass (info, symtab_i),
1426 getSymNumberOfAuxSymbols (info, symtab_i)
1427 );
1428 i += getSymNumberOfAuxSymbols (info, symtab_i);
1429 }
1430
1431 debugBelch("\n" );
1432 return true;
1433 }
1434
1435 bool
1436 ocGetNames_PEi386 ( ObjectCode* oc )
1437 {
1438 bool has_code_section = false;
1439
1440 SymbolName* sname;
1441 SymbolAddr* addr;
1442 unsigned int i;
1443
1444 COFF_HEADER_INFO *info = oc->info->ch_info;
1445
1446 /* Copy section information into the ObjectCode. */
1447
1448 for (i = 0; i < info->numberOfSections; i++) {
1449 uint8_t* start;
1450 uint8_t* end;
1451 uint32_t sz;
1452
1453 /* By default consider all section as CODE or DATA,
1454 which means we want to load them. */
1455 SectionKind kind = SECTIONKIND_CODE_OR_RODATA;
1456 Section section = oc->sections[i];
1457
1458 IF_DEBUG(linker, debugBelch("section name = %s\n", section.info->name ));
1459
1460 /* The PE file section flag indicates whether the section
1461 contains code or data. */
1462 if (section.info->props & IMAGE_SCN_CNT_CODE) {
1463 has_code_section = has_code_section || section.size > 0;
1464 kind = SECTIONKIND_CODE_OR_RODATA;
1465 }
1466
1467 if (section.info->props & IMAGE_SCN_CNT_INITIALIZED_DATA)
1468 kind = SECTIONKIND_CODE_OR_RODATA;
1469
1470 /* Check next if it contains any uninitialized data */
1471 if (section.info->props & IMAGE_SCN_CNT_UNINITIALIZED_DATA)
1472 kind = SECTIONKIND_RWDATA;
1473
1474 /* Finally check if it can be discarded.
1475 This will also ignore .debug sections */
1476 if ( section.info->props & IMAGE_SCN_MEM_DISCARDABLE
1477 || section.info->props & IMAGE_SCN_LNK_REMOVE)
1478 kind = SECTIONKIND_OTHER;
1479
1480 if (0==strncmp(".ctors", section.info->name, 6)) {
1481 kind = SECTIONKIND_INIT_ARRAY;
1482 oc->info->init = &oc->sections[i];
1483 }
1484
1485 if (0==strncmp(".dtors", section.info->name, 6)) {
1486 kind = SECTIONKIND_FINIT_ARRAY;
1487 oc->info->finit = &oc->sections[i];
1488 }
1489
1490 if ( 0 == strncmp(".stab" , section.info->name, 5 )
1491 || 0 == strncmp(".stabstr" , section.info->name, 8 )
1492 || 0 == strncmp(".pdata" , section.info->name, 6 )
1493 || 0 == strncmp(".xdata" , section.info->name, 6 )
1494 || 0 == strncmp(".debug" , section.info->name, 6 )
1495 || 0 == strncmp(".rdata$zzz", section.info->name, 10))
1496 kind = SECTIONKIND_DEBUG;
1497
1498 if (0==strncmp(".idata", section.info->name, 6))
1499 kind = SECTIONKIND_IMPORT;
1500
1501 /* See Note [BFD import library]. */
1502 if (0==strncmp(".idata$7", section.info->name, 8))
1503 kind = SECTIONKIND_IMPORT_LIBRARY;
1504
1505 if (0==strncmp(".idata$6", section.info->name, 8)) {
1506 /* The first two bytes contain the ordinal of the function
1507 in the format of lowpart highpart. The two bytes combined
1508 for the total range of 16 bits which is the function export limit
1509 of DLLs. */
1510 sname = (SymbolName*)section.start+2;
1511 COFF_symbol* sym = &oc->info->symbols[info->numberOfSymbols-1];
1512 addr = get_sym_name( getSymShortName (info, sym), oc);
1513
1514 IF_DEBUG(linker,
1515 debugBelch("addImportSymbol `%s' => `%s'\n",
1516 sname, (char*)addr));
1517 /* We're going to free the any data associated with the import
1518 library without copying the sections. So we have to duplicate
1519 the symbol name and values before the pointers become invalid. */
1520 sname = strdup (sname);
1521 addr = strdup (addr);
1522 if (!ghciInsertSymbolTable(oc->fileName, symhash, sname,
1523 addr, false, oc)) {
1524 releaseOcInfo (oc);
1525 stgFree (oc->image);
1526 oc->image = NULL;
1527 return false;
1528 }
1529 setImportSymbol (oc, sname);
1530
1531 /* Don't process this oc any futher. Just exit. */
1532 oc->n_symbols = 0;
1533 oc->symbols = NULL;
1534 stgFree (oc->image);
1535 oc->image = NULL;
1536 releaseOcInfo (oc);
1537 oc->status = OBJECT_DONT_RESOLVE;
1538 return true;
1539 }
1540
1541 /* Allocate space for any (local, anonymous) .bss sections. */
1542 if (0==strncmp(".bss", section.info->name, 4)) {
1543 uint32_t bss_sz;
1544 uint8_t* zspace;
1545
1546 /* sof 10/05: the PE spec text isn't too clear regarding what
1547 * the SizeOfRawData field is supposed to hold for object
1548 * file sections containing just uninitialized data -- for executables,
1549 * it is supposed to be zero; unclear what it's supposed to be
1550 * for object files. However, VirtualSize is guaranteed to be
1551 * zero for object files, which definitely suggests that SizeOfRawData
1552 * will be non-zero (where else would the size of this .bss section be
1553 * stored?) Looking at the COFF_section info for incoming object files,
1554 * this certainly appears to be the case.
1555 *
1556 * => I suspect we've been incorrectly handling .bss sections in
1557 * (relocatable) object files up until now. This turned out to bite us
1558 * with ghc-6.4.1's use of gcc-3.4.x, which has started to emit
1559 * initially-zeroed-out local 'static' variable decls into the .bss
1560 * section. (The specific function in Q which triggered this is
1561 * libraries/base/cbits/dirUtils.c:__hscore_getFolderPath())
1562 *
1563 * TODO: check if this comment is still relevant.
1564 */
1565 if (section.info->virtualSize == 0 && section.size == 0) continue;
1566 /* This is a non-empty .bss section.
1567 Allocate zeroed space for it */
1568 bss_sz = section.info->virtualSize;
1569 if (bss_sz < section.size) { bss_sz = section.size; }
1570 bss_sz = section.info->alignment;
1571 zspace = stgCallocBytes(1, bss_sz, "ocGetNames_PEi386(anonymous bss)");
1572 oc->sections[i].start = getAlignedMemory(zspace, section);
1573 oc->sections[i].size = bss_sz;
1574 addProddableBlock(oc, zspace, bss_sz);
1575 /* debugBelch("BSS anon section at 0x%x\n", zspace); */
1576 }
1577
1578 /* Allocate space for the sections since we have a real oc.
1579 We initially mark it the region as non-accessible. But will adjust
1580 as we go along. */
1581 if (!oc->info->image) {
1582 /* See Note [Memory allocation]. */
1583 ASSERT(code_heap);
1584 oc->info->image
1585 = HeapAlloc (code_heap, HEAP_ZERO_MEMORY, oc->info->secBytesTotal);
1586 if (!oc->info->image)
1587 barf ("Could not allocate any heap memory from private heap.");
1588 }
1589
1590 ASSERT(section.size == 0 || section.info->virtualSize == 0);
1591 sz = section.size;
1592 if (sz < section.info->virtualSize) sz = section.info->virtualSize;
1593
1594 start = section.start;
1595 end = start + sz - 1;
1596
1597 if (kind != SECTIONKIND_OTHER && end >= start) {
1598 /* See Note [Section alignment]. */
1599 addCopySection(oc, &oc->sections[i], kind, SECTION_NOMEM, start, sz);
1600 addProddableBlock(oc, oc->sections[i].start, sz);
1601 }
1602 }
1603
1604 /* Copy exported symbols into the ObjectCode. */
1605
1606 oc->n_symbols = info->numberOfSymbols;
1607 oc->symbols = stgCallocBytes(sizeof(SymbolName*), oc->n_symbols,
1608 "ocGetNames_PEi386(oc->symbols)");
1609
1610 /* Work out the size of the global BSS section */
1611 StgWord globalBssSize = 0;
1612 for (i=0; i < info->numberOfSymbols; i++) {
1613 COFF_symbol* sym = &oc->info->symbols[i];
1614 if (getSymSectionNumber (info, sym) == IMAGE_SYM_UNDEFINED
1615 && getSymValue (info, sym) > 0
1616 && getSymStorageClass (info, sym) != IMAGE_SYM_CLASS_SECTION) {
1617 globalBssSize += getSymValue (info, sym);
1618 }
1619 i += getSymNumberOfAuxSymbols (info, sym);
1620 }
1621
1622 /* Allocate BSS space */
1623 SymbolAddr* bss = NULL;
1624 if (globalBssSize > 0) {
1625 bss = stgCallocBytes(1, globalBssSize,
1626 "ocGetNames_PEi386(non-anonymous bss)");
1627 addSection(&oc->sections[oc->n_sections-1],
1628 SECTIONKIND_RWDATA, SECTION_MALLOC,
1629 bss, globalBssSize, 0, 0, 0);
1630 IF_DEBUG(linker, debugBelch("bss @ %p %" FMT_Word "\n", bss, globalBssSize));
1631 addProddableBlock(oc, bss, globalBssSize);
1632 } else {
1633 addSection(&oc->sections[oc->n_sections-1],
1634 SECTIONKIND_OTHER, SECTION_NOMEM, NULL, 0, 0, 0, 0);
1635 }
1636
1637 /* At this point we're done with oc->image and all relevant memory have
1638 been copied. Release it to free up the memory. */
1639 stgFree (oc->image);
1640 oc->image = NULL;
1641
1642 for (i = 0; i < (uint32_t)oc->n_symbols; i++) {
1643 COFF_symbol* sym = &oc->info->symbols[i];
1644
1645 int32_t secNumber = getSymSectionNumber (info, sym);
1646 uint32_t symValue = getSymValue (info, sym);
1647 uint8_t symStorageClass = getSymStorageClass (info, sym);
1648
1649 addr = NULL;
1650 bool isWeak = false;
1651 sname = get_sym_name (getSymShortName (info, sym), oc);
1652 Section *section = secNumber > 0 ? &oc->sections[secNumber-1] : NULL;
1653
1654 if ( secNumber != IMAGE_SYM_UNDEFINED
1655 && secNumber > 0
1656 && section
1657 && section->kind != SECTIONKIND_IMPORT_LIBRARY) {
1658 /* This symbol is global and defined, viz, exported */
1659 /* for IMAGE_SYMCLASS_EXTERNAL
1660 && !IMAGE_SYM_UNDEFINED,
1661 the address of the symbol is:
1662 address of relevant section + offset in section
1663 */
1664 if (symStorageClass == IMAGE_SYM_CLASS_EXTERNAL
1665 || ( symStorageClass == IMAGE_SYM_CLASS_STATIC
1666 && section->info->props & IMAGE_SCN_LNK_COMDAT)
1667 ) {
1668 addr = (SymbolAddr*)((size_t)section->start + symValue);
1669 isWeak = section->info->props & IMAGE_SCN_LNK_COMDAT;
1670 }
1671 }
1672 else if (symStorageClass == IMAGE_SYM_CLASS_WEAK_EXTERNAL) {
1673 isWeak = true;
1674 }
1675 else if ( secNumber == IMAGE_SYM_UNDEFINED && symValue > 0) {
1676 /* This symbol isn't in any section at all, ie, global bss.
1677 Allocate zeroed space for it from the BSS section */
1678 addr = bss;
1679 bss = (SymbolAddr*)((StgWord)bss + (StgWord)symValue);
1680 IF_DEBUG(linker, debugBelch("bss symbol @ %p %u\n", addr, symValue));
1681 }
1682 else if (secNumber > 0
1683 && section
1684 && section->kind == SECTIONKIND_IMPORT_LIBRARY) {
1685 /* This is an import section. We should load the dll and lookup
1686 the symbols.
1687 See Note [BFD import library]. */
1688 char* dllName = section->start;
1689 if (strlen(dllName) == 0 || dllName[0] == 0 || has_code_section)
1690 continue;
1691
1692 pathchar* dirName = pathdir(oc->fileName);
1693 HsPtr token = addLibrarySearchPath(dirName);
1694 stgFree(dirName);
1695
1696 sym = &oc->info->symbols[oc->n_symbols-1];
1697 sname = get_sym_name (getSymShortName (info, sym), oc);
1698
1699 IF_DEBUG(linker,
1700 debugBelch("loading symbol `%s' from dll: '%ls' => `%s'\n",
1701 sname, oc->fileName, dllName));
1702
1703 pathchar* dll = mkPath(dllName);
1704 HINSTANCE dllInstance = 0;
1705 const char* result = addDLL_PEi386(dll, &dllInstance);
1706 removeLibrarySearchPath(token);
1707 stgFree(dll);
1708
1709 if (result != NULL || dllInstance == 0) {
1710 errorBelch("Could not load `%s'. Reason: %s\n",
1711 (char*)dllName, result);
1712 return false;
1713 }
1714
1715 /* Set the _dll_iname symbol to the dll's handle. */
1716 addr = (SymbolAddr*)dllInstance;
1717
1718 /* the symbols are named <name>_iname when defined, but are named
1719 _head_<name> when looked up. (Ugh. thanks GCC.) So correct it when
1720 stored so we don't have to correct it each time when retrieved. */
1721 int size = strlen(sname)+1;
1722 char *tmp = stgMallocBytes(size * sizeof(char),
1723 "ocGetNames_PEi386");
1724 strncpy (tmp, sname, size);
1725 char *pos = strstr(tmp, "_iname");
1726 /* drop anything after the name. There are some inconsistencies with
1727 whitespaces trailing the name. */
1728 if (pos) pos[0] = '\0';
1729 int start = 0;
1730
1731 /* msys2 project's import lib builder has some inconsistent name
1732 mangling. Their names start with _ or __ yet they drop this when
1733 making the _head_ symbol. So do the same. */
1734 while (tmp[start]=='_')
1735 start++;
1736
1737 snprintf (sname, size, "_head_%s", tmp+start);
1738 sname[size-start]='\0';
1739 stgFree(tmp);
1740 sname = strdup (sname);
1741 if (!ghciInsertSymbolTable(oc->fileName, symhash, sname,
1742 addr, false, oc))
1743 return false;
1744
1745 break;
1746 }
1747
1748 if ((addr != NULL || isWeak)
1749 && (!section || (section && section->kind != SECTIONKIND_IMPORT))) {
1750 /* debugBelch("addSymbol %p `%s' Weak:%lld \n", addr, sname, isWeak); */
1751 sname = strdup (sname);
1752 IF_DEBUG(linker, debugBelch("addSymbol %p `%s'\n", addr, sname));
1753 ASSERT(i < (uint32_t)oc->n_symbols);
1754 oc->symbols[i] = sname;
1755 if (isWeak) {
1756 setWeakSymbol(oc, sname);
1757 }
1758
1759 if (! ghciInsertSymbolTable(oc->fileName, symhash, sname, addr,
1760 isWeak, oc))
1761 return false;
1762 } else {
1763 /* We're skipping the symbol, but if we ever load this
1764 object file we'll want to skip it then too. */
1765 oc->symbols[i] = NULL;
1766 }
1767
1768 i += getSymNumberOfAuxSymbols (info, sym);
1769 }
1770
1771 return true;
1772 }
1773
1774 #if defined(x86_64_HOST_ARCH)
1775
1776 /* We've already reserved a room for symbol extras in loadObj,
1777 * so simply set correct pointer here.
1778 */
1779 bool
1780 ocAllocateSymbolExtras_PEi386 ( ObjectCode* oc )
1781 {
1782 /* If the ObjectCode was unloaded we don't need a trampoline, it's likely
1783 an import library so we're discarding it earlier. */
1784 if (!oc->info)
1785 return false;
1786
1787 const int mask = default_alignment - 1;
1788 size_t origin = oc->info->trampoline;
1789 oc->symbol_extras
1790 = (SymbolExtra*)((uintptr_t)(oc->info->image + origin + mask) & ~mask);
1791 oc->first_symbol_extra = 0;
1792 COFF_HEADER_INFO *info = oc->info->ch_info;
1793 oc->n_symbol_extras = info->numberOfSymbols;
1794
1795 return true;
1796 }
1797
1798 static size_t
1799 makeSymbolExtra_PEi386( ObjectCode* oc, uint64_t index, size_t s, char* symbol )
1800 {
1801 unsigned int curr_thunk;
1802 SymbolExtra *extra;
1803 curr_thunk = oc->first_symbol_extra + index;
1804 if (index >= oc->n_symbol_extras) {
1805 IF_DEBUG(linker, debugBelch("makeSymbolExtra first:%d, num:%lu, member:%s, index:%llu\n", curr_thunk, oc->n_symbol_extras, oc->archiveMemberName, index));
1806 barf("Can't allocate thunk for `%s' in `%" PATH_FMT "' with member `%s'", symbol, oc->fileName, oc->archiveMemberName);
1807 }
1808
1809 extra = oc->symbol_extras + curr_thunk;
1810
1811 if (!extra->addr)
1812 {
1813 // jmp *-14(%rip)
1814 static uint8_t jmp[] = { 0xFF, 0x25, 0xF2, 0xFF, 0xFF, 0xFF };
1815 extra->addr = (uint64_t)s;
1816 memcpy(extra->jumpIsland, jmp, 6);
1817 }
1818
1819 return (size_t)extra->jumpIsland;
1820 }
1821
1822 #endif /* x86_64_HOST_ARCH */
1823
1824 bool
1825 ocResolve_PEi386 ( ObjectCode* oc )
1826 {
1827 uint64_t A;
1828 size_t S;
1829 SymbolAddr* pP;
1830
1831 unsigned int i;
1832 uint32_t j, noRelocs;
1833
1834 /* ToDo: should be variable-sized? But is at least safe in the
1835 sense of buffer-overrun-proof. */
1836 uint8_t symbol[1000];
1837 /* debugBelch("resolving for %s\n", oc->fileName); */
1838
1839 /* Such libraries have been partially freed and can't be resolved. */
1840 if (oc->status == OBJECT_DONT_RESOLVE)
1841 return 1;
1842
1843 COFF_HEADER_INFO *info = oc->info->ch_info;
1844 uint32_t numberOfSections = info->numberOfSections;
1845
1846 for (i = 0; i < numberOfSections; i++) {
1847 Section section = oc->sections[i];
1848
1849 /* Ignore sections called which contain stabs debugging information. */
1850 if (section.kind == SECTIONKIND_DEBUG)
1851 continue;
1852
1853 noRelocs = section.info->noRelocs;
1854 for (j = 0; j < noRelocs; j++) {
1855 COFF_symbol* sym;
1856 COFF_reloc* reloc = &section.info->relocs[j];
1857
1858 /* the location to patch */
1859 pP = (SymbolAddr*)(
1860 (uintptr_t)section.start
1861 + (uintptr_t)reloc->VirtualAddress
1862 - (uintptr_t)section.info->virtualAddr
1863 );
1864 /* the existing contents of pP */
1865 A = *(uint32_t*)pP;
1866 /* the symbol to connect to */
1867 uint64_t symIndex = reloc->SymbolTableIndex;
1868 sym = &oc->info->symbols[symIndex];
1869
1870 IF_DEBUG(linker,
1871 debugBelch(
1872 "reloc sec %2d num %3d: type 0x%-4x "
1873 "vaddr 0x%-8lx name `",
1874 i, j,
1875 reloc->Type,
1876 reloc->VirtualAddress );
1877 printName (getSymShortName (info, sym), oc);
1878 debugBelch("'\n" ));
1879
1880 if (getSymStorageClass (info, sym) == IMAGE_SYM_CLASS_STATIC) {
1881 Section section = oc->sections[getSymSectionNumber (info, sym)-1];
1882 S = ((size_t)(section.start))
1883 + ((size_t)(getSymValue (info, sym)));
1884 } else {
1885 copyName ( getSymShortName (info, sym), oc, symbol,
1886 sizeof(symbol)-1 );
1887 S = (size_t) lookupSymbol_( (char*)symbol );
1888 if ((void*)S == NULL) {
1889 errorBelch(" | %" PATH_FMT ": unknown symbol `%s'", oc->fileName, symbol);
1890 releaseOcInfo (oc);
1891 return false;
1892 }
1893 }
1894 /* All supported relocations write at least 4 bytes */
1895 checkProddableBlock(oc, pP, 4);
1896 switch (reloc->Type) {
1897 #if defined(i386_HOST_ARCH)
1898 case IMAGE_REL_I386_DIR32:
1899 case IMAGE_REL_I386_DIR32NB:
1900 *(uint32_t *)pP = S + A;
1901 break;
1902 case IMAGE_REL_I386_REL32:
1903 /* Tricky. We have to insert a displacement at
1904 pP which, when added to the PC for the _next_
1905 insn, gives the address of the target (S).
1906 Problem is to know the address of the next insn
1907 when we only know pP. We assume that this
1908 literal field is always the last in the insn,
1909 so that the address of the next insn is pP+4
1910 -- hence the constant 4.
1911 Also I don't know if A should be added, but so
1912 far it has always been zero.
1913
1914 SOF 05/2005: 'A' (old contents of *pP) have been observed
1915 to contain values other than zero (the 'wx' object file
1916 that came with wxhaskell-0.9.4; dunno how it was compiled..).
1917 So, add displacement to old value instead of asserting
1918 A to be zero. Fixes wxhaskell-related crashes, and no other
1919 ill effects have been observed.
1920
1921 Update: the reason why we're seeing these more elaborate
1922 relocations is due to a switch in how the NCG compiles SRTs
1923 and offsets to them from info tables. SRTs live in .(ro)data,
1924 while info tables live in .text, causing GAS to emit REL32/DISP32
1925 relocations with non-zero values. Adding the displacement is
1926 the right thing to do.
1927 */
1928 *(uint32_t *)pP = ((uint32_t)S) + A - ((uint32_t)(size_t)pP) - 4;
1929 break;
1930 #elif defined(x86_64_HOST_ARCH)
1931 case 1: /* R_X86_64_64 (ELF constant 1) - IMAGE_REL_AMD64_ADDR64 (PE constant 1) */
1932 {
1933 uint64_t A;
1934 checkProddableBlock(oc, pP, 8);
1935 A = *(uint64_t*)pP;
1936 *(uint64_t *)pP = S + A;
1937 break;
1938 }
1939 case 2: /* R_X86_64_32 (ELF constant 10) - IMAGE_REL_AMD64_ADDR32 (PE constant 2) */
1940 case 3: /* R_X86_64_32S (ELF constant 11) - IMAGE_REL_AMD64_ADDR32NB (PE constant 3) */
1941 case 17: /* R_X86_64_32S ELF constant, no PE mapping. See note [ELF constant in PE file] */
1942 {
1943 uint64_t v;
1944 v = S + A;
1945 if (v >> 32) {
1946 copyName (getSymShortName (info, sym), oc,
1947 symbol, sizeof(symbol)-1);
1948 S = makeSymbolExtra_PEi386(oc, symIndex, S, (char *)symbol);
1949 /* And retry */
1950 v = S + A;
1951 if (v >> 32) {
1952 barf("IMAGE_REL_AMD64_ADDR32[NB]: High bits are set in %zx for %s",
1953 v, (char *)symbol);
1954 }
1955 }
1956 *(uint32_t *)pP = (uint32_t)v;
1957 break;
1958 }
1959 case 4: /* R_X86_64_PC32 (ELF constant 2) - IMAGE_REL_AMD64_REL32 (PE constant 4) */
1960 {
1961 intptr_t v;
1962 v = S + (int32_t)A - ((intptr_t)pP) - 4;
1963 if ((v >> 32) && ((-v) >> 32)) {
1964 /* Make the trampoline then */
1965 copyName (getSymShortName (info, sym),
1966 oc, symbol, sizeof(symbol)-1);
1967 S = makeSymbolExtra_PEi386(oc, symIndex, S, (char *)symbol);
1968 /* And retry */
1969 v = S + (int32_t)A - ((intptr_t)pP) - 4;
1970 if ((v >> 32) && ((-v) >> 32)) {
1971 barf("IMAGE_REL_AMD64_REL32: High bits are set in %zx for %s",
1972 v, (char *)symbol);
1973 }
1974 }
1975 *(uint32_t *)pP = (uint32_t)v;
1976 break;
1977 }
1978 #endif
1979 default:
1980 debugBelch("%" PATH_FMT ": unhandled PEi386 relocation type %d\n",
1981 oc->fileName, reloc->Type);
1982 releaseOcInfo (oc);
1983 return false;
1984 }
1985
1986 }
1987 }
1988
1989 IF_DEBUG(linker, debugBelch("completed %" PATH_FMT "\n", oc->fileName));
1990 return true;
1991 }
1992
1993 /*
1994 Note [ELF constant in PE file]
1995
1996 For some reason, the PE files produced by GHC contain a linux
1997 relocation constant 17 (0x11) in the object files. As far as I (Phyx-) can tell
1998 this constant doesn't seem like it's coming from GHC, or at least I could not find
1999 anything in the .s output that GHC produces which specifies the relocation type.
2000
2001 This leads me to believe that this is a bug in GAS. However because this constant is
2002 there we must deal with it. This is done by mapping it to the equivalent in behaviour PE
2003 relocation constant 0x03.
2004
2005 See #9907
2006 */
2007
2008 bool
2009 ocRunInit_PEi386 ( ObjectCode *oc )
2010 {
2011 if (!oc || !oc->info || !oc->info->init) {
2012 return true;
2013 }
2014
2015 int argc, envc;
2016 char **argv, **envv;
2017
2018 getProgArgv(&argc, &argv);
2019 getProgEnvv(&envc, &envv);
2020
2021 Section section = *oc->info->init;
2022 ASSERT(SECTIONKIND_INIT_ARRAY == section.kind);
2023
2024 uint8_t *init_startC = section.start;
2025 init_t *init_start = (init_t*)init_startC;
2026 init_t *init_end = (init_t*)(init_startC + section.size);
2027
2028 // ctors are run *backwards*!
2029 for (init_t *init = init_end - 1; init >= init_start; init--)
2030 (*init)(argc, argv, envv);
2031
2032 freeProgEnvv(envc, envv);
2033 releaseOcInfo (oc);
2034 return true;
2035 }
2036
2037 SymbolAddr *lookupSymbol_PEi386(SymbolName *lbl)
2038 {
2039 RtsSymbolInfo *pinfo;
2040
2041 if (!ghciLookupSymbolInfo(symhash, lbl, &pinfo)) {
2042 IF_DEBUG(linker, debugBelch("lookupSymbol: symbol '%s' not found\n", lbl));
2043
2044 SymbolAddr* sym;
2045
2046 /* See Note [mingw-w64 name decoration scheme] */
2047 #if !defined(x86_64_HOST_ARCH)
2048 zapTrailingAtSign ( lbl );
2049 #endif
2050 sym = lookupSymbolInDLLs(lbl);
2051 return sym; // might be NULL if not found
2052 } else {
2053 #if defined(mingw32_HOST_OS)
2054 // If Windows, perform initialization of uninitialized
2055 // Symbols from the C runtime which was loaded above.
2056 // We do this on lookup to prevent the hit when
2057 // The symbol isn't being used.
2058 if (pinfo->value == (void*)0xBAADF00D)
2059 {
2060 char symBuffer[50];
2061 sprintf(symBuffer, "_%s", lbl);
2062 static HMODULE msvcrt = NULL;
2063 if (!msvcrt) msvcrt = GetModuleHandle("msvcrt");
2064 pinfo->value = GetProcAddress(msvcrt, symBuffer);
2065 }
2066 else if (pinfo && pinfo->owner && isSymbolImport (pinfo->owner, lbl))
2067 {
2068 /* See Note [BFD import library]. */
2069 HINSTANCE dllInstance = (HINSTANCE)lookupSymbol(pinfo->value);
2070 if (!dllInstance && pinfo->value)
2071 return pinfo->value;
2072
2073 if (!dllInstance)
2074 {
2075 errorBelch("Unable to load import dll symbol `%s'. "
2076 "No _iname symbol.", lbl);
2077 return NULL;
2078 }
2079 IF_DEBUG(linker,
2080 debugBelch("indexing import %s => %s using dll instance %p\n",
2081 lbl, (char*)pinfo->value, dllInstance));
2082 pinfo->value = GetProcAddress((HMODULE)dllInstance, lbl);
2083 clearImportSymbol (pinfo->owner, lbl);
2084 return pinfo->value;
2085 }
2086 #endif
2087 return loadSymbol(lbl, pinfo);
2088 }
2089 }
2090
2091 /* -----------------------------------------------------------------------------
2092 * Section management.
2093 */
2094
2095 /* See Note [Section alignment]. */
2096 static void
2097 addCopySection (ObjectCode *oc, Section *s, SectionKind kind,
2098 SectionAlloc alloc, void* start, StgWord size) {
2099 char* pos = oc->info->image + oc->info->secBytesUsed;
2100 char* newStart = (char*)getAlignedMemory ((uint8_t*)pos, *s);
2101 memcpy (newStart, start, size);
2102 uintptr_t offset = (uintptr_t)newStart - (uintptr_t)oc->info->image;
2103 oc->info->secBytesUsed = (size_t)offset + size;
2104 start = newStart;
2105
2106 /* Initially I wanted to apply the right memory protection to the region and
2107 which would leaved the gaps in between the regions as inaccessible memory
2108 to prevent exploits.
2109 The problem is protection is always on page granularity, so we can use
2110 less memory and be insecure or use more memory and be secure.
2111 For now, I've chosen lower memory over secure as the first pass, this
2112 doesn't regress security over the current implementation. After this
2113 patch I will change to different implementation that will fix the mem
2114 protection and keep the memory size small. */
2115 addSection (s, kind, alloc, start, size, 0, 0, 0);
2116 }
2117
2118 /* -----------------------------------------------------------------------------
2119 * Debugging operations.
2120 */
2121
2122 pathchar*
2123 resolveSymbolAddr_PEi386 (pathchar* buffer, int size,
2124 SymbolAddr* symbol, uintptr_t* top ){
2125 SYMBOL_INFO sym;
2126 ZeroMemory (&sym, sizeof(SYMBOL_INFO));
2127 sym.MaxNameLen = sizeof(char) * 1024;
2128
2129 DWORD64 uDisplacement = 0;
2130 HANDLE hProcess = GetCurrentProcess();
2131 ObjectCode* obj = NULL;
2132 uintptr_t start, end;
2133 *top = 0;
2134
2135 pathprintf (buffer, size, WSTR("0x%" PRIxPTR), symbol);
2136
2137 if (SymFromAddr (hProcess, (uintptr_t)symbol, &uDisplacement, &sym))
2138 {
2139 /* Try using Windows symbols. */
2140 wcscat (buffer, WSTR(" "));
2141 pathchar* name = mkPath (sym.Name);
2142 wcscat (buffer, name);
2143 stgFree (name);
2144 if (uDisplacement != 0)
2145 {
2146 int64_t displacement = (int64_t)uDisplacement;
2147 pathchar s_disp[50];
2148 if (displacement < 0)
2149 pathprintf ((pathchar*)s_disp, 50, WSTR("-%ld"), -displacement);
2150 else
2151 pathprintf ((pathchar*)s_disp, 50, WSTR("+%ld"), displacement);
2152
2153 wcscat (buffer, s_disp);
2154 }
2155 }
2156 else
2157 {
2158 /* Try to calculate from information inside the rts. */
2159 uintptr_t loc = (uintptr_t)symbol;
2160 for (ObjectCode* oc = objects; oc; oc = oc->next) {
2161 for (int i = 0; i < oc->n_sections; i++) {
2162 Section section = oc->sections[i];
2163 start = (uintptr_t)section.start;
2164 end = start + section.size;
2165 if (loc > start && loc <= end)
2166 {
2167 wcscat (buffer, WSTR(" "));
2168 if (oc->archiveMemberName)
2169 {
2170 pathchar* name = mkPath (oc->archiveMemberName);
2171 wcscat (buffer, name);
2172 stgFree (name);
2173 }
2174 else
2175 {
2176 wcscat (buffer, oc->fileName);
2177 }
2178 pathchar s_disp[50];
2179 pathprintf (s_disp, 50, WSTR("+0x%" PRIxPTR), loc - start);
2180 wcscat (buffer, s_disp);
2181 obj = oc;
2182 goto exit_loop;
2183 }
2184 }
2185 }
2186
2187 /* If we managed to make it here, we must not have any symbols nor be
2188 dealing with code we've linked. The only thing left is an internal
2189 segfault or one in a dynamic library. So let's enumerate the module
2190 address space. */
2191 HMODULE *hMods = NULL;
2192 DWORD cbNeeded;
2193 EnumProcessModules (hProcess, hMods, 0, &cbNeeded);
2194 hMods = stgMallocBytes (cbNeeded, "resolveSymbolAddr_PEi386");
2195 if (EnumProcessModules (hProcess, hMods, cbNeeded, &cbNeeded))
2196 {
2197 uintptr_t loc = (uintptr_t)symbol;
2198 MODULEINFO info;
2199 for (uint32_t i = 0; i < cbNeeded / sizeof(HMODULE); i++) {
2200 ZeroMemory (&info, sizeof (MODULEINFO));
2201 if (GetModuleInformation (hProcess, hMods[i], &info,
2202 sizeof(MODULEINFO)))
2203 {
2204 uintptr_t start = (uintptr_t)info.lpBaseOfDll;
2205 uintptr_t end = start + info.SizeOfImage;
2206 if (loc >= start && loc < end)
2207 {
2208 /* Hoera, finally found some information. */
2209 pathchar tmp[MAX_PATH];
2210 if (GetModuleFileNameExW (hProcess, hMods[i], tmp, MAX_PATH))
2211 {
2212 wcscat (buffer, WSTR(" "));
2213 wcscat (buffer, tmp);
2214 pathprintf (tmp, MAX_PATH, WSTR("+0x%" PRIxPTR), loc - start);
2215 wcscat (buffer, tmp);
2216 }
2217 break;
2218 }
2219 }
2220 }
2221 }
2222
2223 stgFree(hMods);
2224 }
2225
2226 /* Finally any file/line number. */
2227 IMAGEHLP_LINE64 lineInfo = {0};
2228 DWORD dwDisplacement = 0;
2229 exit_loop:
2230 if (SymGetLineFromAddr64(hProcess, (uintptr_t)symbol, &dwDisplacement,
2231 &lineInfo))
2232 {
2233 /* Try using Windows symbols. */
2234 pathchar s_line[512];
2235 pathprintf ((pathchar*) s_line, 512, WSTR(" %ls (%lu)"),
2236 lineInfo.FileName, lineInfo.LineNumber);
2237 wcscat (buffer, s_line);
2238 if (dwDisplacement != 0)
2239 {
2240 pathprintf ((pathchar*) s_line, 512, WSTR(" +%lu byte%s"),
2241 dwDisplacement,
2242 (dwDisplacement == 1 ? WSTR("") : WSTR("s")));
2243 }
2244 wcscat (buffer, s_line);
2245 }
2246 else if (obj)
2247 {
2248 /* Try to calculate from information inside the rts. */
2249 typedef struct _SymX { SymbolName* name; uintptr_t loc; } SymX;
2250 SymX* locs = stgCallocBytes (sizeof(SymX), obj->n_symbols,
2251 "resolveSymbolAddr");
2252 int blanks = 0;
2253 for (int i = 0; i < obj->n_symbols; i++) {
2254 SymbolName* sym = obj->symbols[i];
2255 if (sym == NULL)
2256 {
2257 blanks++;
2258 continue;
2259 }
2260 RtsSymbolInfo* a = NULL;
2261 ghciLookupSymbolInfo(symhash, sym, &a);
2262 if (a) {
2263 SymX sx = {0};
2264 sx.name = sym;
2265 sx.loc = (uintptr_t)a->value;
2266 locs[i] = sx;
2267 }
2268 }
2269 int comp (const void * elem1, const void * elem2)
2270 {
2271 SymX f = *((SymX*)elem1);
2272 SymX s = *((SymX*)elem2);
2273 if (f.loc > s.loc) return 1;
2274 if (f.loc < s.loc) return -1;
2275 return 0;
2276 }
2277 qsort (locs, obj->n_symbols, sizeof (SymX), comp);
2278 uintptr_t key = (uintptr_t)symbol;
2279 SymX* res = NULL;
2280
2281 for (int x = blanks; x < obj->n_symbols; x++) {
2282 if (x < (obj->n_symbols -1)) {
2283 if (locs[x].loc >= key && key < locs[x+1].loc) {
2284 res = &locs[x];
2285 break;
2286 }
2287 }
2288 else
2289 {
2290 if (locs[x].loc >= key) {
2291 res = &locs[x];
2292 break;
2293 }
2294 }
2295 }
2296
2297 if (res) {
2298 pathchar s_disp[512];
2299 *top = (uintptr_t)res->loc;
2300 pathprintf ((pathchar*)s_disp, 512,
2301 WSTR("\n\t\t (%s+0x%" PRIxPTR ")"),
2302 res->name, res->loc - key);
2303 wcscat (buffer, s_disp);
2304 }
2305 stgFree (locs);
2306 }
2307
2308 return buffer;
2309 }
2310 #endif /* mingw32_HOST_OS */