Use multi-byte NOPs for code alignment on x86-64.
This is intended to speed up execution of the alignment padding, since
a few multi-byte NOPs replace a longer run of single-byte ones. It also
makes the disassembly output somewhat more readable. The multi-byte NOP
instructions are chosen according to the recommendations of both AMD and
Intel. All existing x86-64 processors should support the "0f 1f" opcode
used.
This adds the needed infrastructure to the backend-independent compiler
parts and uses it from x86-64. Backends not using this functionality are
left unchanged.
Extend EMIT-ALIGNMENT so that multi-byte NOPs can be specified for use
instead of repetitions of the single-byte NOP, and change the call in
EMIT-BLOCK-HEADER on x86-64 to request this. Extend EMIT-SKIP to call
EMIT-LONG-NOP in this case.
On x86-64, add EMIT-LONG-NOP as the instruction emitter and extend the
disassembler entry for NOP to understand the multi-byte forms, too.
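
For illustration only, the padding scheme can be sketched in Python
rather than in the backend's own Lisp. The byte sequences are the 1- to
9-byte NOP forms recommended in the Intel and AMD manuals; the names
LONG_NOPS, pad_with_long_nops and align_padding are hypothetical and do
not correspond to the actual emitter code, and the real emitter may pick
lengths differently.

```python
# Recommended NOP encodings by length in bytes (Intel/AMD manuals).
# From 3 bytes upward they all use the "0f 1f" opcode.
LONG_NOPS = {
    1: bytes.fromhex("90"),
    2: bytes.fromhex("6690"),
    3: bytes.fromhex("0f1f00"),
    4: bytes.fromhex("0f1f4000"),
    5: bytes.fromhex("0f1f440000"),
    6: bytes.fromhex("660f1f440000"),
    7: bytes.fromhex("0f1f8000000000"),
    8: bytes.fromhex("0f1f840000000000"),
    9: bytes.fromhex("660f1f840000000000"),
}
MAX_NOP = max(LONG_NOPS)  # longest single NOP in this table


def align_padding(offset, alignment):
    """Number of bytes needed to advance offset to the next boundary."""
    return -offset % alignment


def pad_with_long_nops(amount):
    """Fill `amount` bytes greedily with the longest NOPs available.

    Assumption: a greedy split is good enough for a sketch.
    """
    out = bytearray()
    while amount > 0:
        n = min(amount, MAX_NOP)
        out += LONG_NOPS[n]
        amount -= n
    return bytes(out)
```

With this table, padding from offset 13 to a 16-byte boundary needs a
single 3-byte NOP instead of three single-byte ones.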
Make EMIT-FILLER decide more carefully whether to join fillers that are
adjacent in the list of segment annotations: join them only if they are
also immediately adjacent in the segment. (Otherwise the joined filler
would cover the wrong parts of a shortened alignment sequence.)
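
The joining rule can be sketched as follows, again in Python and purely
as an illustration. The Filler class, its fields and add_filler are
hypothetical stand-ins for the segment annotations, not the actual
data structures.

```python
from dataclasses import dataclass


@dataclass
class Filler:
    index: int  # start offset of the filler within the segment
    size: int   # number of bytes the filler covers


def add_filler(annotations, index, size):
    """Record a filler, merging it into the previous one only if the two
    are contiguous in the segment, not merely neighbours in the list."""
    last = annotations[-1] if annotations else None
    if isinstance(last, Filler) and last.index + last.size == index:
        last.size += size  # contiguous: extend the existing filler
    else:
        annotations.append(Filler(index, size))  # gap: keep them separate
    return annotations
```

Two fillers at offsets 0 (4 bytes) and 4 (3 bytes) are merged into one
7-byte filler, while a later filler at offset 10 stays separate.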
In certain circumstances %EMIT-ALIGNMENT splits an alignment into two
parts. This may not be necessary but has not yet been changed, so
sometimes one more long NOP than needed is assembled.
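
A small Python calculation shows why a split can cost one extra NOP.
It assumes a maximum NOP length of 9 bytes, as in the AMD and Intel
recommendations; MAX_NOP_LEN and both function names are hypothetical.

```python
MAX_NOP_LEN = 9  # assumption: longest multi-byte NOP that is emitted


def nops_needed(nbytes):
    """Fewest NOPs that can cover nbytes (ceiling division)."""
    return -(-nbytes // MAX_NOP_LEN)


def nops_when_split(first, second):
    """NOPs needed when the alignment is emitted as two separate parts."""
    return nops_needed(first) + nops_needed(second)
```

For example, 9 bytes of padding fit into one NOP, but the same padding
split into 4 + 5 bytes takes two.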