Micro-optimize machine code for some register tests on x86[-64].
Replace all occurrences of (INST OR REG REG) with (INST TEST REG REG)
in VOPs and assembly routines. Both set the flags identically, but OR
also writes REG, so the next read of REG depends on it; TEST writes
only the flags. Removing that dependency allows more instruction-level
parallelism, so is potentially faster. Moreover, the next instruction
is usually a conditional branch, and processors that support macro-op
fusion can fuse the TEST (but not the OR) with that branch, reducing
the resources needed to decode and execute the pair, which again is
potentially faster.
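
As a sanity check on why the replacement is safe, here is a minimal
Python sketch that models the architectural effect of the two
instructions on a 64-bit register. It is an illustration only, not code
from this change: the names MASK64, flags_of, model_or and model_test
are hypothetical, and AF is left out because it is undefined for both
instructions.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit register width

def flags_of(result):
    """Flags that x86 logical instructions derive from their result."""
    result &= MASK64
    return {
        "ZF": result == 0,
        "SF": bool(result >> 63),
        # PF is set when the low byte has an even number of 1 bits.
        "PF": bin(result & 0xFF).count("1") % 2 == 0,
        # Logical instructions always clear CF and OF.
        "CF": False,
        "OF": False,
    }

def model_or(value):
    """OR REG, REG: flags from REG | REG, and REG is written back."""
    return flags_of(value | value), True   # True: REG is written

def model_test(value):
    """TEST REG, REG: flags from REG & REG, result discarded."""
    return flags_of(value & value), False  # False: REG is not written
```

For any value, both models return the same flags and differ only in
whether REG is written, which is the dependency the change removes.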