Pierre Thierry [Sun, 28 Aug 2011 19:38:28 +0000 (21:38 +0200)]
Update URL of "Long, Painful History of Time"
Previous URL was not available anymore (404 error)
Also rewrapped the paragraph to the same width as the others
Stas Boukarev [Mon, 24 Jun 2013 10:28:30 +0000 (14:28 +0400)]
Simplify EMIT-VOP further.
EMIT-VOP is only ever used in conjunction with INSERT-VOP-SEQUENCE, by
returning two values: first and last VOPs, all linked together,
INSERT-VOP-SEQUENCE then inserts them into the block. But nowadays
EMIT-VOP always returns the same VOP as the second value.
* EMIT-VOP now returns one value, the emitted VOP.
* INSERT-VOP-SEQUENCE is renamed to INSERT-VOP, accepts only one VOP.
* A new function EMIT-AND-INSERT-VOP is added, which combines them,
and is used anywhere where EMIT-VOP was used.
This makes things less complicated, and reduces core size by 32KB, the
same as the previous commit, for a total of 64KB of savings
essentially for free.
(Also squeeze a couple of line-break fixes)
Stas Boukarev [Mon, 24 Jun 2013 09:50:35 +0000 (13:50 +0400)]
Simplify EMIT-GENERIC-VOP.
Since there's only one kind of templates now, there's no need for
indirection. Rename EMIT-GENERIC-VOP to EMIT-VOP, remove EMIT-FUNCTION
slot from TEMPLATE, call EMIT-VOP directly.
Stas Boukarev [Sat, 22 Jun 2013 15:37:18 +0000 (19:37 +0400)]
backtrace: don't cons large lists when RCX is overwritten inside XEPs.
To present a list with the actual number of passed arguments in the
backtrace, clean-xep used the arg-count register and added missing
arguments in the form of #<unknown>, but if the register is
overwritten by other code, it could cons very large lists, exhausting
heap. Do such arg-list clean up only upon INVALID-ARG-COUNT-ERROR.
Fixes lp#1192929.
Joshua Elsasser [Sun, 9 Jun 2013 04:36:48 +0000 (21:36 -0700)]
Hopefully fix the windows build to grovel time structures correctly.
It is a little misleading to say "correctly" since struct timespec
doesn't really exist on windows. Groveling the definition that we
define in our own pthreads wrapper seems the most consistent choice.
The grovel-headers.c changes have only been tested in isolation, not
with a real build. Thanks to Kyle Isom for testing, any resulting
build problems are entirely my fault.
Joshua Elsasser [Sun, 12 May 2013 15:36:11 +0000 (08:36 -0700)]
Grovel timeval and timespec struct definitions rather than hard-coding.
Paul Khuong [Tue, 18 Jun 2013 17:23:42 +0000 (13:23 -0400)]
Fix instruction encoding for XMM shifts with immediate count
x86 keeps getting more and more devious: the source/dest operand
is in the r/m field for these instructions, so REX.B must be set,
rather than REX.R, to access > xmm7. Intel's new documentation
seems clearer about these issues, at least.
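
As an illustration of the REX.B point above, here is a minimal Python sketch of how such an encoding could be assembled. It is not SBCL's assembler code: the helper names (rex, modrm, encode_psrldq) are hypothetical, and PSRLDQ (66 0F 73 /3 ib) is picked only as one example of an XMM shift with an immediate count.

```python
def rex(w=0, r=0, x=0, b=0):
    """Build a REX prefix byte, laid out as 0100WRXB."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b

def modrm(mod, reg, rm):
    """Build a ModRM byte; only the low 3 bits of reg and rm fit here."""
    return (mod << 6) | ((reg & 7) << 3) | (rm & 7)

def encode_psrldq(xmm, imm8):
    """Encode PSRLDQ xmm, imm8 (hypothetical helper, for illustration).

    The register lives in the r/m field (the reg field holds the /3
    opcode extension), so its fourth bit goes into REX.B, not REX.R.
    """
    out = [0x66]                      # mandatory legacy prefix, before REX
    if xmm >= 8:
        out.append(rex(b=1))          # REX.B extends the r/m field
    out += [0x0F, 0x73, modrm(0b11, 3, xmm), imm8 & 0xFF]
    return bytes(out)

print(encode_psrldq(9, 4).hex())
```

Setting REX.R instead would leave the r/m field pointing at xmm1 rather than xmm9, which is the kind of mis-encoding the commit describes.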
Stas Boukarev [Tue, 11 Jun 2013 11:20:10 +0000 (15:20 +0400)]
check-mod-fixnum: correct the test for power-of-two.
Testing for the power of two was performed on a fixnumized number,
causing the optimization for power-of-two to be never applied.
Stas Boukarev [Mon, 10 Jun 2013 18:54:55 +0000 (22:54 +0400)]
Add a missing :suppress-errors keyword for WRITE defknown.
Christophe Rhodes [Mon, 10 Jun 2013 12:26:27 +0000 (13:26 +0100)]
Note removal of post-receive-email in NEWS
Christophe Rhodes [Mon, 10 Jun 2013 12:20:14 +0000 (13:20 +0100)]
remove git/ directory
The scripts therein weren't directly relevant for SBCL development,
only for infrastructure to assist that development; additionally,
they are derived from git examples licensed under the GPL, and it
was difficult to explain the effects of this in a short paragraph
in the master COPYING file. A new sbcl-git-hooks repository is
available on SourceForge for these and any other infrastructure
customizations.
Lutz Euler [Mon, 10 Jun 2013 11:44:20 +0000 (13:44 +0200)]
Micro-optimize DOUBLE-FLOAT-LOW-BITS on x86-64.
Instead of loading a 64-bit register from memory and zeroing the upper
32 bits of it by the sequence SHL reg, 32; SHR reg, 32 simply load the
corresponding 32-bit register from memory, relying on the implicit
zero-extension to 64 bits this does. This is smaller and faster.
For example, if the input to the VOP is a descriptor register, the old
instruction sequence is:
MOV RDX, [RDX-7]
SHL RDX, 32
SHR RDX, 32
and the new one:
MOV EDX, [RDX-7]
Regarding store-to-load forwarding this change should make no
difference: Most current processors can forward a 64-bit store to a
32-bit load from the same address. The exception is Intel's Atom which
can forward only to a load of the same size as the store; but it also
supports this only between integer registers, and DOUBLE-FLOAT-LOW-BITS
mostly or even always acts on memory slots written from an XMM register
(of the three storage classes it supports as input, for the first it
does the store itself from an XMM register; for the other two I have
investigated some disassemblies and always found the prior store to be
from an XMM register).
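
The equivalence of the two instruction sequences can be checked with a small Python model. This is only a sketch: the function names are made up for the illustration, and Python integers stand in for a 64-bit register.

```python
import struct

MASK64 = (1 << 64) - 1

def double_bits(x):
    """Return the 64-bit IEEE-754 pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def low_bits_old(qword):
    """Model of MOV r64, [mem]; SHL r64, 32; SHR r64, 32."""
    shifted = (qword << 32) & MASK64   # SHL drops the upper 32 bits
    return shifted >> 32               # SHR brings the low half back down

def low_bits_new(qword):
    """Model of MOV r32, [mem]: a 32-bit load zero-extends to 64 bits."""
    return qword & 0xFFFFFFFF

for value in (0.0, 1.0, 0.1, -2.5, 1e300):
    bits = double_bits(value)
    assert low_bits_old(bits) == low_bits_new(bits)
```

Both models return the same low word for every input, which is why the single 32-bit load can replace the three-instruction sequence.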
Lutz Euler [Mon, 10 Jun 2013 10:37:22 +0000 (12:37 +0200)]
Make clean.sh clean up doc/internals, too.
For completeness and equal treatment with doc/manual.
Lutz Euler [Mon, 10 Jun 2013 10:37:22 +0000 (12:37 +0200)]
git: Add entries for the HTML manual to doc/internals/.gitignore.
These are the files and directories generated by "make html" in
doc/internals.
Lutz Euler [Sun, 9 Jun 2013 15:58:52 +0000 (17:58 +0200)]
git: New file doc/internals/.gitignore.
Ignore the files generated by building the internals manual.
Copied and adapted from doc/manual/.gitignore.
Paul Khuong [Sat, 8 Jun 2013 16:53:16 +0000 (12:53 -0400)]
Insert error traps after full calls inferred not to return
An explicit error trap after full calls to known functions helps
understand type derivation errors at runtime; it's certainly better
than executing arbitrary bytes.
Only do this when the return type was tightened to NIL via type
derivation; if a function is defknowned not to return, it really
shouldn't.
Paul Khuong [Sat, 8 Jun 2013 15:26:19 +0000 (11:26 -0400)]
Only use MASK-SIGNED-FIELD VOPs as last resorts
The MOVE hack usually leads to better code when it can be used.
Paul Khuong [Sat, 8 Jun 2013 05:39:10 +0000 (01:39 -0400)]
Handle (aref v (+ i k)), with i negative
* Update the fndb to allow negative index values for foo-ref-with-offset
and foo-set-with-offset.
* Adjust VOPs accordingly.
* Fix fold-index-addressing: only fold constant offsets if the resulting
index argument would be a fixnum, and compute the new offset correctly
for subtractions.
* Unmark the corresponding test as an expected failure, and add a test
to make sure VOPs for data-vector-{ref,set}-with-offset accept negative
index values (unless the element size is too small
to fold offsets in an EA).
* Un-package-qualify a few spurious test-util:with-test.
Paul Khuong [Sat, 8 Jun 2013 05:38:32 +0000 (01:38 -0400)]
Fix a typo in bignum--ref-with-offset
Paul Khuong [Sat, 8 Jun 2013 05:34:52 +0000 (01:34 -0400)]
Consistently force (double) rounding of foreign x87 values
SBCL always functions in 64 bit mode, but switches to 80 bit for
foreign calls. Return values might be unexpectedly precise.
Force a round-trip from the x87 unit and the stack to make sure
FP return values are rounded to the correct width.
Paul Khuong [Sat, 8 Jun 2013 05:31:22 +0000 (01:31 -0400)]
Look for left-over dead code when *check-consistency*
If ir1opt leaves dead code around, later parts of the compilation
pipeline can become seriously confused. Detect such issues earlier,
rather than as mysterious failures.
Paul Khuong [Sat, 8 Jun 2013 03:37:52 +0000 (23:37 -0400)]
Simplify RATIONAL/constant FLOAT and INTEGER/constant RATIO comparisons
The spec says that rationals and floats are compared by first calling
RATIONAL on the float. Constant fold the call to RATIONAL, and round
to an integer if applicable.
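
A rough Python sketch of the idea, assuming a finite float constant and an integer variable argument. fractions.Fraction plays the role of RATIONAL here, and the function names are hypothetical, not SBCL's.

```python
from fractions import Fraction
import math

def less_than_bound(c):
    """Return an integer b such that, for every integer x, x < c iff x < b."""
    return math.ceil(Fraction(c))    # Fraction(c) is the exact value of c

def greater_than_bound(c):
    """Return an integer b such that, for every integer x, x > c iff x > b."""
    return math.floor(Fraction(c))

# The float comparison is replaced by a comparison against an integer.
for c in (2.5, 3.0, -2.5, 0.1):
    for x in range(-10, 11):
        assert (x < Fraction(c)) == (x < less_than_bound(c))
        assert (x > Fraction(c)) == (x > greater_than_bound(c))
```

The rounding direction depends on the comparison: the bound for < rounds up and the bound for > rounds down.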
Paul Khuong [Sat, 8 Jun 2013 02:29:55 +0000 (22:29 -0400)]
Silence notes about being specialised EQ templates on x86oids
During LTN, we emit notes when the final chosen template costs
at least 6 more units than an inapplicable template. Adjust the
costs of EQ VOPs to avoid triggering this logic.
Paul Khuong [Sat, 8 Jun 2013 02:27:59 +0000 (22:27 -0400)]
Silence the transforms that detect rightward arithmetic shift
It seems most of the time, the notes complained about clear left
shifts.
Paul Khuong [Sat, 8 Jun 2013 02:26:59 +0000 (22:26 -0400)]
Mark DATA-VECTOR-REF[-WITH-OFFSET] as unsafely flushable
Unsafe code will be able to eliminate array reads as dead code.
Paul Khuong [Sat, 8 Jun 2013 02:25:57 +0000 (22:25 -0400)]
Adjust internal encoding for TN location for larger SC count limit
We need one more bit to encode values up to 62.
Paul Khuong [Sat, 8 Jun 2013 01:43:16 +0000 (21:43 -0400)]
New VOP for LOGAND of bignum and word-sized constant on x86-64
We can directly read the first bignum digit and AND it.
Paul Khuong [Sat, 8 Jun 2013 01:42:17 +0000 (21:42 -0400)]
MASK-SIGNED-FIELD VOPs on x86-64
Other platforms go through the MOVE hack like before.
Paul Khuong [Fri, 7 Jun 2013 23:36:13 +0000 (19:36 -0400)]
More identity folding for LOGAND and LOGIOR with constants
* Handle more complex cases than only powers of two:
compare the variant argument with a power-of-two-sized
prefix of the constant bit pattern.
* Add parallel logic for LOGIOR: if all the ones we're ORing in
are already set because the variant argument is a small enough
negative integer, we've got an identity.
* This is a bit hairy, so exhaustively check the logic with small
values.
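
The two identities can be sketched and exhaustively checked in Python, whose integers behave like infinite two's complement values under & and |. This is one reading of the commit message, not the actual transform: the predicates below assume the variant argument is known to lie in [0, 2**k) for LOGAND and in [-2**k, -1] for LOGIOR.

```python
def logand_is_identity(k, c):
    """x in [0, 2**k) and the low k bits of c all ones => x & c == x."""
    mask = (1 << k) - 1
    return c & mask == mask

def logior_is_identity(k, c):
    """x in [-2**k, -1] and the low k bits of c all zero => x | c == x."""
    mask = (1 << k) - 1
    return c & mask == 0

def check(max_k=6, constants=range(-64, 65)):
    """Exhaustively verify both predicates over small values."""
    for k in range(max_k):
        for c in constants:
            if logand_is_identity(k, c):
                assert all(x & c == x for x in range(0, 1 << k))
            if logior_is_identity(k, c):
                assert all(x | c == x for x in range(-(1 << k), 0))
    return True

check()
```

A negative x no smaller than -2**k has every bit from position k upward set, so ORing in a constant whose ones all sit at or above position k changes nothing.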
Paul Khuong [Fri, 7 Jun 2013 23:06:32 +0000 (19:06 -0400)]
More associativity-based constant-folding
Detect patterns like (+ (- x k1) k2) and (* (/ x k1) k2).
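
A toy Python sketch of this kind of folding over tuple-encoded expressions. The tuple representation and the fold function are hypothetical; exact rationals are assumed, since the rewrite is not valid for floating point in general.

```python
from fractions import Fraction

ADDITIVE = {"+": 1, "-": -1}
MULTIPLICATIVE = {"*", "/"}

def fold(expr):
    """Fold (op2 (op1 x k1) k2) into a single operation on x.

    expr is a tuple (op, arg, constant), where arg may itself be such
    a tuple. Division constants are assumed to be non-zero.
    """
    outer, inner, k2 = expr
    if not isinstance(inner, tuple):
        return expr
    inner_op, x, k1 = inner
    if outer in ADDITIVE and inner_op in ADDITIVE:
        # (+ (- x k1) k2) => (+ x (- k2 k1)), and similar sign variants
        return ("+", x, ADDITIVE[inner_op] * k1 + ADDITIVE[outer] * k2)
    if outer in MULTIPLICATIVE and inner_op in MULTIPLICATIVE:
        # (* (/ x k1) k2) => (* x (/ k2 k1)), exact for rationals only
        f1 = Fraction(k1) if inner_op == "*" else 1 / Fraction(k1)
        f2 = Fraction(k2) if outer == "*" else 1 / Fraction(k2)
        return ("*", x, f1 * f2)
    return expr

print(fold(("+", ("-", "x", 3), 5)))
print(fold(("*", ("/", "x", 4), 6)))
```

Mixed patterns such as (+ (* x k1) k2) are left untouched, since the two constants cannot be combined by associativity alone.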
Paul Khuong [Fri, 7 Jun 2013 23:03:44 +0000 (19:03 -0400)]
Enable signed modular arithmetic for LOGIOR
When the result of a bitwise or is known to be negative, we don't
need to compute the most significant bits (they're all ones).
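
A small Python sketch of why the high bits are not needed, under the assumption that the result is known to fit in `width` signed bits. The function names are illustrative only.

```python
def signed_low_bits(value, width):
    """Reinterpret the low `width` bits of value as two's complement."""
    low = value & ((1 << width) - 1)
    return low - (1 << width) if low >> (width - 1) else low

def modular_logior(x, y, width):
    """OR only the low `width` bits of each argument, then sign-extend.

    Correct whenever (x | y) is known to fit in `width` signed bits,
    even if x or y themselves are much wider.
    """
    mask = (1 << width) - 1
    return signed_low_bits((x & mask) | (y & mask), width)

# One argument is a bignum, but the result is a small negative
# integer, so 8 bits are enough to recover it.
x, y = 1 << 100, -5
assert modular_logior(x, y, 8) == (x | y) == -5
```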
Paul Khuong [Fri, 7 Jun 2013 23:01:09 +0000 (19:01 -0400)]
Enable more modular arithmetic
The rewrites now trigger when the result type for LOGAND or
MASK-SIGNED-FIELD is a union of integer types.
Paul Khuong [Fri, 7 Jun 2013 22:46:25 +0000 (18:46 -0400)]
Complete cut-to-width for modular arithmetic
For each modular argument, go through the nodes that provide its
value and try to narrow down their bitwidth. If we fail on any and
the result might be too wide, splice in an explicit call to LOGAND
or MASK-SIGNED-FIELD. Skip that last step if the value is an argument
to an equivalent LOGAND or MASK-SIGNED-FIELD.
Test case by Eric Marsden.
Paul Khuong [Fri, 7 Jun 2013 22:42:42 +0000 (18:42 -0400)]
Fix negation of SIMD-PACK types
type-union2 can punt and return NIL.
Lutz Euler [Thu, 6 Jun 2013 14:26:30 +0000 (16:26 +0200)]
Simplify getting the contents of assembler segments.
Extend FINALIZE-SEGMENT to compact the segment's buffer and provide an
exported function to get at this buffer. This resolves an old KLUDGE
noted at ON-SEGMENT-CONTENTS-VECTORLY, making this function unnecessary.
There are several benefits to this change: First, the consumers of
assembler segment's contents, like WRITE-SEGMENT-CONTENTS which is used
for example during FASL dumping, or MAKE-CORE-COMPONENT, now call
WRITE-SEQUENCE respectively COPY-BYTE-VECTOR-TO-SYSTEM-AREA only once
per segment and not once per the pieces of the segment's contents that
ON-SEGMENT-CONTENTS-VECTORLY provided, which makes for less overhead.
Second, this allows greatly simplifying the whole operation of
DISASSEMBLE-ASSEM-SEGMENT, in the course deleting several helpers of it.
So far this repartitioned the pieces of the segment's contents from
ON-SEGMENT-CONTENTS-VECTORLY, while caring not to split the contents
inside instructions, which needed a sizable amount of code. Now the
segment's contents are simply disassembled as a whole. Also, the old
code (specifically SEGMENT-OVERFLOW) didn't take prefix instructions
into account correctly which surfaced as the bug in lp#1085729.
Fixes lp#1085729.
Also, fix an unrelated typo in NEWS.
Stas Boukarev [Wed, 5 Jun 2013 23:11:42 +0000 (03:11 +0400)]
Stop exporting unused symbols.
Stas Boukarev [Wed, 5 Jun 2013 19:28:44 +0000 (23:28 +0400)]
Factor out read-var-integer into a function.
read-var-integer macro is used quite a number of times, expand the
macro into a SETF, which calls %read-var-integer, which does actual
reading. Reduces the core size by 65KB on x86-64.
Stas Boukarev [Wed, 5 Jun 2013 10:50:34 +0000 (18:50 +0800)]
sb-bsd-sockets: More robust inet-socket-bind test on Windows.
Nested unwind-protects aren't supported on Windows.
Stas Boukarev [Wed, 5 Jun 2013 18:24:54 +0000 (22:24 +0400)]
Get rid of vm-support-routines indirection.
VM routines were defined using two functions, one calling another
through structure slots. This is unnecessary, removing leads to a
~200KB core size reduction on x86-64.
Stas Boukarev [Wed, 5 Jun 2013 14:38:42 +0000 (18:38 +0400)]
Optimize (mod FIXNUM) type-checks on x86oids.
Instead of two (and (>= x 0) (< x FIXNUM)) comparisons, do one
unsigned.
(mod power-of-two) is further optimized by doing one mask test to
determine the range and fixnumness in one go.
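
Both tricks can be modelled in Python. This is a sketch under stated assumptions, not the VOP itself: a 64-bit word, and a fixnum tag consisting of a single zero low bit (as with 63-bit fixnums on x86-64). The function names are hypothetical.

```python
WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1

def in_range_two_compares(x, n):
    """The naive check: (and (>= x 0) (< x n))."""
    return x >= 0 and x < n

def in_range_one_unsigned_compare(x, n):
    """Reinterpret the signed word as unsigned: negatives become huge."""
    return (x & WORD_MASK) < n

def is_mod_power_of_two(word, k, tag_bits=1):
    """True iff word is a tagged fixnum whose value lies in [0, 2**k).

    Any set bit outside the allowed field means either a non-zero tag
    (not a fixnum) or a value that is negative or too large.
    """
    allowed = ((1 << k) - 1) << tag_bits
    return word & WORD_MASK & ~allowed == 0

for x in range(-20, 40):
    assert in_range_two_compares(x, 16) == in_range_one_unsigned_compare(x, 16)
    tagged = (x << 1) & WORD_MASK
    assert is_mod_power_of_two(tagged, 4) == (0 <= x < 16)
```

The mask test folds the tag check and the range check into a single AND against a constant.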
Stas Boukarev [Tue, 4 Jun 2013 20:58:55 +0000 (00:58 +0400)]
sb-bsd-socket tests: don't listen on a predefined port.
Listening on 1974 prevents building contribs in parallel.
Christophe Rhodes [Tue, 4 Jun 2013 12:00:50 +0000 (13:00 +0100)]
fix CL case conversions of characters involving iota subscript
Oh boy. Judging by the length of the web page explaining the issue
(at <http://www.tlg.uci.edu/~opoudjis/unicode/unicode_adscript.html>)
this is a bit of a minefield. I hope that this doesn't contribute
further to the trouble...
Although the combined _WITH_PROSGEGRAMMENI characters are of
general class "Lt" (i.e. titlecase), for CL purposes we treat them
as the uppercase equivalent of the lowercase _WITH_YPOGEGRAMMENI
characters (as directly specified by the case mapping data in
UnicodeData.txt). This is a little awkward, and involves a bit
of rearrangement in the indices of the misc table entries to make
the (CL) uppercase/lowercase tests efficient, but seems to be the
best of all possible worlds given that we must comply with CL's
character-to-character case mappings -- the alternative of not
providing an uppercase version of LOWERCASE_OMEGA_WITH_YPOGEGRAMMENI
seems even weirder.
The way this is done in ucd.lisp is a little bit kludgy, because we
have to avoid giving the same exception to the Serbian titlecase
digraphs (Dz and friends) which mustn't map to anything, or else
we'd break invertibility. (The lowercase dz and uppercase DZ are
already (CL) case mappings of each other). Probably the thing which
will confuse future readers is that some (Unicode) titlecase
characters are (CL) upper-case-p.
Paul Khuong [Mon, 3 Jun 2013 17:21:25 +0000 (13:21 -0400)]
Simpler and more precise type derivation for APPEND/NCONC
We can suppose that all but the last argument are lists when
deriving the return type... and the logic to compute the return
type can be much simpler: it's either a CONS, the last argument,
or we don't know which (yet).
Stas Boukarev [Mon, 3 Jun 2013 16:40:59 +0000 (20:40 +0400)]
Uninitialized type-error conditions can now be printed.
(print (make-condition 'simple-type-error)) signalled an unbound slot
error.
Reported by Eric Marsden, fixes lp#1184586.
Stas Boukarev [Mon, 3 Jun 2013 16:30:19 +0000 (20:30 +0400)]
sb-bsd-sockets: Fix type of canonname in addrinfo.
Should be c-string-pointer, not c-string.
Fixes lp#1187041, patch by Jerry James.
Stas Boukarev [Mon, 3 Jun 2013 14:25:19 +0000 (18:25 +0400)]
Fix APPEND/NCONC type derivation properly this time.
Use type-intersection for checking types, it's more robust than what
was there before.
And a slight improvement. When an argument in the middle can't be NIL,
then the end result is guaranteed to be a CONS. Previously, the
assumption was if the type is a CONS, but that doesn't work with types
like (or cons vector).
Christophe Rhodes [Mon, 3 Jun 2013 13:54:17 +0000 (14:54 +0100)]
fixes in EXPT type derivation
It was possible to construct mostly (but not completely) unobservable
bogus floating-point types when deriving the type of functions
returning the value of calls to EXPT. Noticed by Vsevolod Dyomkin,
who found a way to observe it by redefining methods.
Stas Boukarev [Mon, 3 Jun 2013 11:28:02 +0000 (15:28 +0400)]
Fix NCONC type derivation.
Properly check the types of arguments, instead of testing for subtypes
or supertypes of LIST, check for arguments to be subtypes of NULL or CONS.
Reported by Jerry James.
Stas Boukarev [Mon, 3 Jun 2013 10:46:30 +0000 (14:46 +0400)]
sleep: Add more precautions to avoid consing on x86.
Stas Boukarev [Mon, 3 Jun 2013 09:52:06 +0000 (13:52 +0400)]
Fix sleep on ratios, avoiding consing.
Enable sleep tests for some platforms.
Christophe Rhodes [Mon, 3 Jun 2013 09:28:02 +0000 (10:28 +0100)]
some tests of SLEEP with ratios
Christophe Rhodes [Mon, 3 Jun 2013 08:49:49 +0000 (09:49 +0100)]
fix sleep on most ratios
really really ensure that the second argument to nanosleep is an
integer
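
The intent can be sketched in Python with exact rationals. The helper name is hypothetical and this is not the SBCL code; it only shows that both values handed to a nanosleep-style call come out as integers, with the fractional part floored to whole nanoseconds.

```python
from fractions import Fraction

NS_PER_SECOND = 1_000_000_000

def split_seconds(seconds):
    """Split a non-negative rational into integer (seconds, nanoseconds)."""
    total_ns = Fraction(seconds) * NS_PER_SECOND
    whole_ns = total_ns.numerator // total_ns.denominator  # floor, an int
    return divmod(whole_ns, NS_PER_SECOND)

print(split_seconds(Fraction(1, 3)))
print(split_seconds(Fraction(7, 2)))
```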
Christophe Rhodes [Mon, 3 Jun 2013 08:48:55 +0000 (09:48 +0100)]
delete ye olde FIXME relating to unbound variable warnings
Testing the code in the real system gives a full warning
Christophe Rhodes [Mon, 3 Jun 2013 08:47:32 +0000 (09:47 +0100)]
fix (again) the handling of read errors in the debugger
Actually the read errors were doing what we wanted, but EOF was no
longer popping one debugger level. The control transfer is a bit
gnarly, so explicitly grab the restart we might want to use and pass
it as an argument to DEBUG-READ.
Stas Boukarev [Sun, 2 Jun 2013 20:15:33 +0000 (00:15 +0400)]
Avoid consing in SLEEP.
Try to compute seconds without consing, when the arguments are small
enough (in the fixnum range).
Add a transform to go directly to sb-unix:nanosleep when possible.
Stas Boukarev [Sun, 2 Jun 2013 19:18:20 +0000 (23:18 +0400)]
Make %coerce-callable-to-fun static on x86oids.
It's called a lot when doing funcall or apply.
Stas Boukarev [Sun, 2 Jun 2013 18:58:27 +0000 (22:58 +0400)]
Don't go through fdefn when referencing #'known-functions.
Known functions are known to be always present, save on indirection by
using them directly.
Christophe Rhodes [Sun, 2 Jun 2013 18:50:05 +0000 (19:50 +0100)]
slightly better handling of read errors in the debugger
wrap the read as well as the eval in the WITH-SIMPLE-RESTART ABORT
so that the user can return to the existing debugger level on read
errors.
Stas Boukarev [Sun, 2 Jun 2013 18:22:44 +0000 (22:22 +0400)]
Correct call-indirect for >32-bit addresses.
Stas Boukarev [Sun, 2 Jun 2013 17:25:21 +0000 (21:25 +0400)]
Better calls to static functions on x86-64.
Encode the calls to static functions as an immediate argument to the
CALL instruction when possible.
Stas Boukarev [Sun, 2 Jun 2013 17:14:15 +0000 (21:14 +0400)]
Better calls to static functions on x86-64.
Encode the calls to static functions as an immediate argument to the
CALL instruction when possible.
Stas Boukarev [Sun, 2 Jun 2013 16:33:54 +0000 (20:33 +0400)]
Better initialization of ir2-component-constants on x86-64.
x86 uses the first constant for the fixup vector, x86-64 doesn't.
Don't leave empty space in ir2-component-constants.
On the C side, call gencgc_apply_code_fixups only on x86, and not on
x86_64 as well, since its body is conditionalized for x86_64 anyway.
Stas Boukarev [Fri, 31 May 2013 16:08:19 +0000 (20:08 +0400)]
Remove unused variables in the compiler.
*code-vector* *next-location* *result-fixups* were last seen being
used on 24 Aug 1990.
Stas Boukarev [Sun, 2 Jun 2013 16:21:38 +0000 (20:21 +0400)]
disassemble: Better annotation of static functions and safepoints.
Static functions, like LENGTH and some others, weren't annotated in
the disassemble output.
Also add annotations for safepoints.
Christophe Rhodes [Sun, 2 Jun 2013 13:12:30 +0000 (14:12 +0100)]
1.1.8: will be tagged as "sbcl-1.1.8"
Lutz Euler [Sun, 2 Jun 2013 12:58:49 +0000 (14:58 +0200)]
Fix expected result for a character comparison test under non-unicode.
The test "character.pure.lisp / :case-insensitive-char-comparisons
:exhaustive" was marked as an expected failure as the unicode
implementation is not yet complete, but on a non-unicode build it
actually already succeeds.
Stas Boukarev [Fri, 31 May 2013 12:27:11 +0000 (16:27 +0400)]
Fix a regression in APPEND type derivation.
Do not just test the arguments for validity by being subtypes or
supertypes of LIST, but also if it's a subtype of NULL, since ATOM or
SYMBOL won't satisfy the LIST tests.
Regression since f32ee7d.."Better type derivation for APPEND, NCONC,
LIST.", reported by Eric Marsden.
Paul Khuong [Fri, 31 May 2013 01:49:55 +0000 (21:49 -0400)]
Stricter precondition when strength reducing variable right shifts
Looking at the node's derived type is safer than a result type
constraint, which seems to consider the LVAR's derived or
truly-declared type.
Remove a redundant AVER too. If people call %ash/right directly and
incorrectly, they're looking for trouble. Moreover, the call's
type will be derived only if the argument types are correct.
Reported by Eric Marsden on sbcl-devel; further reduced test cases
by Christophe Rhodes.
Paul Khuong [Mon, 27 May 2013 21:38:15 +0000 (17:38 -0400)]
Compute single-value-type correctly in the absence of required values
* For the longest time (at least 2003), we didn't take defaulting
into account and did not union the single-value type with NULL.
For some reason, the issue didn't manifest itself until we improved
code generation for EQ/EQL this month.
Thanks to Attila Lendvai for the reasonable test case.
* Also, fix a typo in a VOP name for EQ of fixnum values.
Christophe Rhodes [Mon, 27 May 2013 07:11:05 +0000 (08:11 +0100)]
fix CHAR-EQUALity of non-ascii caseful characters
Or at least mostly fix it. There are issues surrounding iota subscript and
titlecase, which aren't regressions in the current release and will require
some more investigation to fix.
Lutz Euler [Sat, 25 May 2013 14:39:28 +0000 (16:39 +0200)]
Skip unicode normalization tests on non-unicode builds.
This was accidentally missed when the tests were introduced.
Paul Khuong [Fri, 24 May 2013 20:45:57 +0000 (16:45 -0400)]
Revert "Fix (aref vector (+ i constant)) with i negative on x86oids"
This reverts commit
5d3a728a1d9a91e7218fe53f12f96ab63b846810.
The current code is still wrong, but better the bugs we've always had
than the ones that break currently-working code.
Kept the test case, and added the one we failed on.
Paul Khuong [Fri, 24 May 2013 17:13:12 +0000 (13:13 -0400)]
Even safer substitution of constants in CUT-TO-WIDTH
* Fix another aspect of the modular arithmetic bug that was only
partially fixed by ccd2a1d (Substitute constants with modular
equivalents more safely); detected by the previous fix not
working on !x86oids.
Paul Khuong [Fri, 24 May 2013 17:08:55 +0000 (13:08 -0400)]
Robustify specialised IF/IF conversion introduced in 729ce57
* When unlinking a node from its destination LVAR, always mark
the node as potentially up for dead code elimination. IR2 can
become really confused when converting dead code; a more
systematic cleanup pass might provide a useful safety net.
* The changes make a widely-used ir1-manipulation function safer,
so this might also fix some other obscure compiler bug.
* Reported by James Y Knight on IRC and Fila Kolodny on Launchpad
(fixes lp#1183496).
Paul Khuong [Fri, 24 May 2013 17:07:36 +0000 (13:07 -0400)]
Silence the compiler when (runtime-) compiling PCL innards
Paul Khuong [Fri, 24 May 2013 17:07:35 +0000 (13:07 -0400)]
Fix (aref vector (+ i constant)) with i negative on x86oids
* The VOPs for indexed access with constant offset take a fixnum
index. Adjust fndb entries to reflect that.
* Fix FOLD-INDEX-ADDRESSING: don't convert if the resulting index
would be wider than a fixnum, and compute the new offset correctly
for subtractions.
* Test case by Douglas Katzman.
Jingyi Hou [Thu, 23 May 2013 18:20:32 +0000 (02:20 +0800)]
tweak so that block-delete-p is findable by grep for 'def.*block-delete-p'
Jingyi Hou [Thu, 23 May 2013 18:18:23 +0000 (02:18 +0800)]
search_for_executable() fails to process last part of PATH if PATH does not end with ':'
Christophe Rhodes [Thu, 23 May 2013 19:31:59 +0000 (20:31 +0100)]
fix build with #!-SB-UNICODE
Don't build a hash table with high-codepoint characters as values under
those circumstances.
Paul Khuong [Wed, 22 May 2013 18:19:47 +0000 (14:19 -0400)]
Improved SIMD-PACK manipulation VOPs on x86-64
* Tighten naive (and, in one case, wrong-looking) lifetime
specifications to enable more register coalescing.
* Exploit the coalescing by replacing explicit move instructions
with MOVE, and specify the SC to get MOVAPS for float data.
* Microoptimise away some PSRLDQ by 0.
Paul Khuong [Wed, 22 May 2013 18:17:23 +0000 (14:17 -0400)]
Specialised VOPs for EQ of fixnum values on x86oids
Steal fixnum EQL VOPs to implement EQ, like we already do for
characters, words and signed-words: otherwise they're converted
as signed-word EQ.
Paul Khuong [Wed, 22 May 2013 18:12:17 +0000 (14:12 -0400)]
Preserve types when swapping constant arguments and commute LOGTEST
* Add a transform to ensure any constant argument to LOGTEST is in
second position.
* Commutative-arg-swap used to often cause suboptimal code: subsequent
transforms fire before constraint propagation has tightened types
back to their original value. Hack with TRULY-THE for now. A more
general fix (e.g. by declaring the type of arguments in spliced-in
lambda expressions) would be even better.
Paul Khuong [Wed, 22 May 2013 05:31:28 +0000 (01:31 -0400)]
Update/clarify the status of FUNCTIONP and COMPILED-FUNCTION-P
The two functions aren't collapsed anymore on #!+sb-eval builds.
Paul Khuong [Wed, 22 May 2013 04:50:01 +0000 (00:50 -0400)]
Optimize (- (* x constant)) into (* x (- constant))
Another marginal lvar-fun-name/splice-fun-arg hack.
Closes lp#1065770.
Paul Khuong [Wed, 22 May 2013 04:18:26 +0000 (00:18 -0400)]
More efficient move-from-signed on x86-64 with 63-bit fixnums
We can SHL instead of IMUL to check for overflow, and only have to
RCR the sign bit back in to recover the original value.
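
A Python model of the flag behaviour this relies on, assuming a 64-bit word and a fixnum tag made of a single zero low bit. The function names are made up for the sketch, and the real VOP of course works on registers rather than Python integers.

```python
MASK64 = (1 << 64) - 1

def shl1(word):
    """Model SHL r64, 1: return (result, carry flag, overflow flag)."""
    carry = (word >> 63) & 1                  # bit shifted out of the top
    result = (word << 1) & MASK64
    overflow = ((result >> 63) & 1) ^ carry   # sign changed => overflow
    return result, carry, overflow

def rcr1(word, carry):
    """Model RCR r64, 1: rotate right by one through the carry flag."""
    return ((carry << 63) | (word >> 1)) & MASK64

def move_from_signed(word):
    """Tag a signed word as a fixnum, or recover it for bignum boxing."""
    result, carry, overflow = shl1(word)
    if overflow:
        # The value needs a bignum; RCR restores the original word.
        return ("bignum", rcr1(result, carry))
    return ("fixnum", result)

print(move_from_signed(5))
print(move_from_signed(1 << 62))
```

The shift doubles as the tagging operation and the range check, so no separate multiply or compare is needed on the fast path.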
Paul Khuong [Wed, 22 May 2013 03:46:32 +0000 (23:46 -0400)]
Simpler word-sized variable right shifts on x86 and x86-64
* Known negative shifts are converted to another function that
only handles machine-friendly right shifts.
* The transforms and VOPs are conditionalised on ash-right-vops,
so other platforms aren't penalised.
* The new transforms trigger a lot of notes; this is suboptimal,
and one test had to be adjusted.
Paul Khuong [Wed, 22 May 2013 03:43:03 +0000 (23:43 -0400)]
Simplify (- (- x)) for rationals
The transform is trivial enough to execute without a real pattern
matching framework, but we're close to the limit.
Paul Khuong [Tue, 21 May 2013 23:49:19 +0000 (19:49 -0400)]
Evaluate global inline functions via their fdefinition
* When inlinable global functions are evaluated for value, emit
code to refer to their fdefinition, rather than to a bogus
entry point.
* Make sure we only generate code to refer to XEPs and fail early
otherwise, rather than after backpatching.
* Fixes lp#1035721.
Paul Khuong [Tue, 21 May 2013 21:57:04 +0000 (17:57 -0400)]
Truthful error reporting for complicated compile-time type mismatches
Type mismatches for multiple-use LVARs (i.e. resulting from conditional
expressions) can't be pinpointed to a single source for the value(s).
Such expressions used to be reported as type mismatches with the constant
NIL. Instead, switch to a more complex format with the lowest common source
form, if any (hopefully the conditional), and the nodes that may deliver
the form's value.
Do the same when warning about non-EQ-comparable CATCH tags.
Paul Khuong [Tue, 21 May 2013 20:20:40 +0000 (16:20 -0400)]
Implement EQ of unboxed characters and small integers on x86oids
More important now that we actually weaken EQL of EQ-comparable
types into EQ. We just have to re-purpose pre-existing EQL and
CHAR= templates (and adjust the generic test's cost to be less
attractive), but I can't test on !x86oids.
Spotted by Douglas Katzman.
Paul Khuong [Tue, 21 May 2013 19:12:58 +0000 (15:12 -0400)]
Complete SSE instruction definitions for x86-64
* New instruction formats:
- 2-byte instructions with GP/mem source and XMM destination.
- 1- and 2-byte instructions with XMM source and GP/mem destination.
- F3-escape instructions GP/mem source and GP destination.
- 2-byte instructions with GP/mem source and GP destination.
* Complete support for SSE instruction sets:
- SSE3
- SSSE3
- SSE4.1
- SSE4.2
* Fix definition of pblendvb, blendvps, blendvpd: These require a third operand,
implicitly in XMM0.
* PEXTRW has a new 2-byte encoding in SSE4.1 which allows a memory address as
the destination operand. The new encoding is only used when dst is a memory
address, otherwise the old backward-compatible encoding is used.
* Fix 64-bit popcnt (F3 still comes before REX.W), and make it check for operand sizes,
like the new CRC32.
* Slightly adapted from Jonathan Armond to work with Douglas Katzman's F3-specific
r, r/m instruction format.
Paul Khuong [Tue, 21 May 2013 19:12:46 +0000 (15:12 -0400)]
Export SB-SIMD-PACK symbols from SB-EXT
Export the SIMD-PACK type, the SIMD-PACK-P predicate,
%make-simd-pack-{ub32,ub64,single,double}, and
%simd-pack-{ub32s,ub64s,singles,doubles}.
These are far from useful yet, but at least future extensions
can work with SB-EXT instead of SB-KERNEL.
Also, say so in NEWS.
Paul Khuong [Tue, 21 May 2013 19:12:26 +0000 (15:12 -0400)]
SB-SIMD-PACK on x86-64
* Enable them by default on x86-64;
* And run some smoke tests, at least.
Paul Khuong [Tue, 21 May 2013 19:11:54 +0000 (15:11 -0400)]
Additional niceties and middle end support for short vector SIMD packs
* Allow FASL loading/dumping of (boxed) SIMD packs, and mark them as
trivially (i.e. without going through make-load-form) dumpable.
* SIMD packs print nicely, and take the element type into account while
doing so.
* (C)TYPE-OF is more accurate for SIMD packs; this enables IR2 conversion
to choose the right primitive type and storage class for constants.
The FASL code was kept on life support by Alexander Gavrilov for too many years,
and the printing logic is a very light adaptation of the output code he developed
for his branch.
Paul Khuong [Tue, 21 May 2013 19:11:26 +0000 (15:11 -0400)]
Back end work for short vector SIMD packs
* Platform-agnostic changes:
- Declare type testing/checking routines.
- Define three primitive types: simd-pack-double for packs
of doubles, simd-pack-single for packs of singles, and
simd-pack-int for packs of integer/unknown.
- Define a heap-representation for 128-bit SIMD packs,
along with reserving a widetag and filling the corresponding
entries in gencgc's tables.
- Make the simd-pack class definition fully concrete.
- Teach IR1 how to expand SIMD-PACK type checks.
- IR2-conversion maps SIMD-PACK types to the right primitive type.
- Increase the limit on the number of storage classes: SIMD packs
went way past the previous (arbitrary?) limit of 40.
* Platform-specific changes, in src/compiler/target/simd-pack:
- Create new storage classes (that are backed by the float-reg [i.e. SSE]
storage base): one for each of double, single and integer sse packs.
- Also create the corresponding immediate-constant and stack storage
classes.
- Teach the assembler and the inline constant code about this new kind
of registers/constants, and how to map constant SIMD-PACKs to which SC.
- Define movement/conversion VOPs for SSE packs, along with VOP routines
needed for basic creation/manipulation of SSE packs.
- The type-checking VOP in generic/late-type-vops is extremely
x86-64-specific... IIRC, there are ordering issues I do not
want to tangle with.
* Implementation idiosyncrasy: while type *tests* (i.e. TYPEP calls) consider
the element type, type *checks* (e.g. THE or DECLARE) only check for
SIMD-PACKness, without looking at the element type. This is allowed by the
standard, is similar to what Python does for FUNCTION types, and helps
code remain efficient even when type checks can't be fully elided.
The vast majority of the code is verbatim or heavily inspired by Alexander
Gavrilov's branch.
Paul Khuong [Tue, 21 May 2013 19:10:50 +0000 (15:10 -0400)]
Front end infrastructure for short vector SIMD packs
* new feature, sb-simd-pack.
* define a new IR1 type for SIMD packs:
- (SB!KERNEL:SIMD-PACK [eltype]), where [eltype] is a subtype
of the platform-specific SIMD element type universe, or * (default),
the union of all these possibilities;
- Element types are always upgraded to the platform's element type
(small) universe, so we can easily manipulate unions of SIMD-PACK
types by working in terms of the element types.
* immediately specify the universe of SIMD pack element types
(sb!kernel:*simd-pack-element-types*) for x86-64, to ensure
#!+sb-simd-pack buildability.
* declare basic functions to create/manipulate SIMD packs:
- simd-pack-p is the basic type predicate;
- %simd-pack-tag returns a fixnum tag associated with each SIMD-PACK;
currently, we suppose it only encodes the element type, as the
position of the element type in *simd-pack-element-types*;
- %make-simd-pack creates a 128-bit SIMD pack from a tag and two
64 bit integers;
- %make-simd-pack-double creates an appropriately-tagged pack from
two double floats;
- %make-simd-pack-single creates a tagged pack from four single
floats;
- %make-simd-pack-ub{32,64} creates a tagged pack from four 32 bit
or two 64 bit integers;
- %simd-pack-{low,high} returns the low/high integer half of a
128 bit pack;
- %simd-pack-ub{32,64}s returns the four integer quarters or two
integer halves of a 128 bit pack;
- %simd-pack-singles returns the four singles in a 128 bit pack;
- %simd-pack-doubles returns the two doubles in a 128 bit pack.
Alexander Gavrilov kept a branch alive for the last couple years. The
creation/manipulation primitives are largely taken from that branch,
or informed by the branch's usage.
Stas Boukarev [Tue, 21 May 2013 11:05:19 +0000 (15:05 +0400)]
Fix foreign-symbol-address transform on +sb-dynamic-core.
Badly placed ` was resulting in a wrong result.
Paul Khuong [Tue, 21 May 2013 00:02:04 +0000 (20:02 -0400)]
Make some instances of IF/IF conversion more direct
When faced with CFGs that look like (if (if ...) ...), we duplicate
the outer NULL test forward in the branches (and jump to the correct
branch, so very little code is duplicated). However, this transform
depends on later ir1 optimisation to handle patterns like
(if (if ... nil t) ...). Try and get them right with a specialised
rewrite to get good code even when ir1opt doesn't run until fixpoint.
Also, refactored the code a bit while working on it.
Paul Khuong [Mon, 20 May 2013 22:14:43 +0000 (18:14 -0400)]
Exploit specialised VOPs for EQL of anything/constant fixnum
By swapping constant arguments to the right ourselves before
strength reducing EQL into EQ, rather than erroneously using
commutative-arg-swap.
Spotted by Douglas Katzman.
Paul Khuong [Mon, 20 May 2013 21:38:19 +0000 (17:38 -0400)]
More efficient integer=>word conversion and fixnump tests on x86-64
* Special-case on 63-bit fixnums to detect non-zero fixnum tag bits
with a shift right when converting fixnum-or-bignum to ub64.
* In fixnump/unsigned-byte-64, use MOVE to avoid useless mov x, x.
* In fixnump/signed-byte-64, use the conversion's left shift to
detect overflows.
* Based on a patch by Douglas Katzman.