NCG: New code layout algorithm.
author Andreas Klebinger <klebinger.andreas@gmx.at>
Sat, 17 Nov 2018 10:20:36 +0000 (11:20 +0100)
committer Andreas Klebinger <klebinger.andreas@gmx.at>
Sat, 17 Nov 2018 10:20:36 +0000 (11:20 +0100)
commit 912fd2b6ca0bc51076835b6e3d1f469b715e2760
tree ae1c96217e0eea77d0bfd53101d3fa868d45027d
parent 6ba9aa5dd0a539adf02690a9c71d1589f541b3c5

Summary:
This patch implements a new code layout algorithm.
It has been tested for x86 and is disabled on other platforms.

Performance varies slightly by CPU/machine but in general seems to be better
by around 2%.
Nofib shows only small differences of about +/- ~0.5% overall, depending on
flags/machine. Performance in other benchmarks improved significantly.

Other benchmarks include at least the benchmarks of: aeson, vector, megaparsec, attoparsec,
containers, text and xeno.

While the magnitude of the gains differed, three different CPUs were tested, with
all getting faster, although to differing degrees. I tested: Sandy Bridge (Xeon), Haswell,
Skylake.

* Library benchmark results summarized:
  * containers: ~1.5% faster
  * aeson: ~2% faster
  * megaparsec: ~2-5% faster
  * xml library benchmarks: 0.2%-1.1% faster
  * vector-benchmarks: 1-4% faster
  * text: 5.5% faster

On average GHC compile times go down, as the speedup from compiling GHC with the new
layout outweighs the overhead introduced by running the new layout algorithm.

Things this patch does:

* Move code responsible for block layout into its own module.
* Move the NcgImpl Class into the NCGMonad module.
* Extract a control flow graph (CFG) from the input Cmm.
* Update this CFG to keep it in sync with changes during
  asm codegen. This has been tested on x64 but should work on x86.
  Other platforms still use the old code layout.
* Assign weights to the edges in the CFG based on edge type and limited static
  analysis. These weights are then used for block layout.
* Once we have the final code layout, eliminate some redundant jumps.

  In particular, turn a sequence of:
      jne .foo
      jmp .bar
    .foo:
  into
      je .bar
    .foo:
      ..
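
The jump elimination above can be sketched as a small peephole pass. This is
only an illustration under stated assumptions: the Instr and Cond types and the
function dropRedundantJumps are hypothetical and much simpler than the
instruction types in the native code generator, and the sketch only handles
the je/jne pair.

```haskell
-- Hypothetical, simplified instruction model. Not GHC's X86 Instr type.
data Cond = E | NE
  deriving (Eq, Show)

data Instr
  = Label String       -- start of a block
  | JCond Cond String  -- conditional jump to a label
  | Jmp String         -- unconditional jump to a label
  | Other String       -- any other instruction
  deriving (Eq, Show)

-- Invert a condition so the conditional jump can target the other block.
invert :: Cond -> Cond
invert E  = NE
invert NE = E

-- Rewrite "jcc L1; jmp L2; L1:" into "j(not cc) L2; L1:".
-- The unconditional jump is redundant because L1 is the fall-through block.
dropRedundantJumps :: [Instr] -> [Instr]
dropRedundantJumps (JCond c t : Jmp u : Label l : rest)
  | t == l = JCond (invert c) u : dropRedundantJumps (Label l : rest)
dropRedundantJumps (i : rest) = i : dropRedundantJumps rest
dropRedundantJumps []         = []
```

Applied to the sequence jne foo, jmp bar, foo: this yields je bar, foo: and
leaves all other instructions untouched. The rewrite only makes sense once the
layout is final, since it depends on which block ends up as the fall-through.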

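To illustrate how edge weights can drive block layout, here is a minimal
greedy sketch. It is not the algorithm implemented in this patch: it simply
follows the heaviest outgoing edge to a block that has not been placed yet,
and falls back to the next unplaced block in the original order. The Block and
Edge types, the function layout and the weights in the usage note are
assumptions made for the example.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing, Down (..))

-- Hypothetical types: a block is identified by its label and an edge
-- carries an integer weight (higher means "more likely taken").
type Block = String
type Edge  = (Block, Block, Int)  -- (from, to, weight)

-- Greedy layout: the first block is the entry point. After placing a
-- block, place its heaviest not-yet-placed successor next so the hot
-- path becomes a fall-through.
layout :: [Block] -> [Edge] -> [Block]
layout []             _     = []
layout (entry : rest) edges = chain entry rest
  where
    chain b pending =
      b : case heaviest b pending of
            Just next -> chain next (filter (/= next) pending)
            Nothing   -> case pending of
                           []       -> []
                           (p : ps) -> chain p ps

    -- Heaviest edge from b to a block that is still pending, if any.
    heaviest b pending =
      case sortBy (comparing (\(_, _, w) -> Down w))
             [e | e@(from, to, _) <- edges, from == b, to `elem` pending] of
        ((_, to, _) : _) -> Just to
        []               -> Nothing
```

For blocks A, B, C, D with edges A->B (10), A->C (90), C->D (50) and
B->D (50), this sketch places C directly after A and D directly after C, so
the likely path A, C, D needs no taken jumps, and B is placed last.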
Test Plan: ci

Reviewers: bgamari, jmct, jrtc27, simonmar, simonpj, RyanGlScott

Reviewed By: RyanGlScott

Subscribers: RyanGlScott, trommler, jmct, carter, thomie, rwbarton

GHC Trac Issues: #15124

Differential Revision: https://phabricator.haskell.org/D4726
34 files changed:
compiler/cmm/CmmMachOp.hs
compiler/cmm/CmmNode.hs
compiler/cmm/CmmPipeline.hs
compiler/cmm/Hoopl/Collections.hs
compiler/cmm/Hoopl/Label.hs
compiler/ghc.cabal.in
compiler/main/DynFlags.hs
compiler/nativeGen/AsmCodeGen.hs
compiler/nativeGen/BlockLayout.hs [new file with mode: 0644]
compiler/nativeGen/CFG.hs [new file with mode: 0644]
compiler/nativeGen/NCGMonad.hs
compiler/nativeGen/PPC/CodeGen.hs
compiler/nativeGen/PPC/Instr.hs
compiler/nativeGen/PPC/RegInfo.hs
compiler/nativeGen/RegAlloc/Linear/Base.hs
compiler/nativeGen/RegAlloc/Linear/JoinToTargets.hs
compiler/nativeGen/RegAlloc/Linear/Main.hs
compiler/nativeGen/RegAlloc/Linear/State.hs
compiler/nativeGen/RegAlloc/Liveness.hs
compiler/nativeGen/SPARC/CodeGen.hs
compiler/nativeGen/SPARC/ShortcutJump.hs
compiler/nativeGen/X86/CodeGen.hs
compiler/nativeGen/X86/Cond.hs
compiler/nativeGen/X86/Instr.hs
compiler/nativeGen/X86/Regs.hs
compiler/utils/Digraph.hs
compiler/utils/OrdList.hs
compiler/utils/Util.hs
docs/users_guide/8.8.1-notes.rst
docs/users_guide/debugging.rst
docs/users_guide/using-optimisation.rst
testsuite/tests/cmm/should_compile/Makefile [new file with mode: 0644]
testsuite/tests/cmm/should_compile/all.T [new file with mode: 0644]
testsuite/tests/cmm/should_compile/selfloop.cmm [new file with mode: 0644]