1.0.41.40: ppc: Shorten the gencgc allocation sequence.
* Rearrange the allocation sequence to avoid all branches,
relying on the runtime to manipulate the point at which
execution resumes from an allocation trap to compensate.
* Update the runtime to match the new allocation sequence.
* There is a further possible optimization here: The runtime
allocation trap handler can also accept an ADDI instruction
where the current sequence uses an ADD. In the case of a
fixed allocation size, this would save loading the temp
register with the size.
* Another optimization, along the same lines as the previous
one: With a fixed allocation size, adjusting the pointer to
point to the beginning of the data block and setting the lowtag
could be done in a single instruction.
* A third optimization, one which would entail modifying the
allocation trap handler slightly, and depends on at least the
first optimization above being in place: Once temp-tn is no
longer being used to hold the allocation size for fixed
allocations, it is available to hold the address of the
alloc region when threading is disabled, thus saving having to
reload it (two instructions).