Stas Boukarev [Fri, 30 Aug 2013 00:40:31 +0000 (04:40 +0400)]
tests/run-compiler.sh: use gcc, not cc.
Stas Boukarev [Thu, 29 Aug 2013 23:56:36 +0000 (03:56 +0400)]
Fix floating point exceptions persisting on Solaris.
(/ 0d0 0d0)
(cosh 90)
signaled an invalid-operation exception, and not overflow, on COSH,
because the previous error wasn't cleared. Clear the exception flags
in the sigfpe handler.
Stas Boukarev [Thu, 29 Aug 2013 22:34:22 +0000 (02:34 +0400)]
PPRINT (setf . a) correctly.
It was printed as (setf a).
Reported by Douglas Katzman.
Stas Boukarev [Thu, 29 Aug 2013 20:21:28 +0000 (00:21 +0400)]
Fix building on Solaris x86-64.
sb-unix:unix-select used macros which expanded into many forms,
limited by sb-unix:fd-setsize, which on Solaris-x86-64 is 65536, as
opposed to 1024 on Linux. This resulted in long compile times which
were likely to exhaust the heap.
Use functions instead of macros.
Lutz Euler [Wed, 28 Aug 2013 21:09:05 +0000 (23:09 +0200)]
Improve the test float.impure.lisp / (RANGE-REDUCTION PRECISE-PI).
The way the test calculated its expected values was flawed and worked
correctly only accidentally due to the specific test values used and
to allowing a relatively large margin of error.
This commit corrects these calculations, removes some test values and
adds others and tightens the error margin. I do not expect this to cause
the test's outcome on any platform to change.
The flaw was to reduce the arguments by taking the remainder of
truncating modulo 2 pi. This allows precise calculations only of the
sine and the tangent of values slightly above even multiples of pi, but
not for example for the sine of an argument near an odd multiple of pi.
Instead the reduction is now done by taking the remainder of rounding
to the nearest multiple of pi/2 so that all arguments near the zeroes
of both sine and cosine reduce to values near zero.
This change was prompted when the test unexpectedly failed with some
values from gcc bug 43490 which I tried when investigating lp #1137924.
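The difference between the two reductions can be sketched with exact
rational arithmetic. This is a minimal illustration, not the code in
float.impure.lisp: the constant PI (pi to 50 decimal places) and the
names reduce_truncating and reduce_rounding are assumptions made for
this example.

```python
import math
from fractions import Fraction

# Pi to 50 decimal places as an exact rational (hypothetical constant
# chosen for this sketch).
PI = Fraction("3.14159265358979323846264338327950288419716939937510")


def reduce_truncating(x):
    """Old approach: remainder of truncating division by 2*pi (x >= 0)."""
    two_pi = 2 * PI
    return x - (x // two_pi) * two_pi


def reduce_rounding(x):
    """New approach: remainder of rounding to the nearest multiple of pi/2.

    Returns the remainder r (|r| <= pi/4) and the quadrant k mod 4.
    """
    half_pi = PI / 2
    k = round(x / half_pi)
    return x - k * half_pi, k % 4


# The double closest to pi lies just below an odd multiple of pi.
x = Fraction(math.pi)
old = reduce_truncating(x)        # still about 3.14: far from a zero of sine
r, quadrant = reduce_rounding(x)  # tiny remainder; sin(x) = -sin(r) here
print(float(old), float(r), quadrant)
```

With the rounding reduction the remainder is tiny, so the expected
sine can be computed accurately from it; the truncating reduction
leaves the argument unchanged.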
Stas Boukarev [Wed, 28 Aug 2013 14:46:30 +0000 (18:46 +0400)]
PROBE-FILE on symlinks to pipes inside /proc on Linux.
PROBE-FILE now can access symlinks to pipes and sockets in
/proc/pid/fd/ on Linux.
query-file-system already has code for handling broken symlinks by
resolving the directory part; use it on files for which realpath(3)
fails, which includes pipe and socket links in /proc.
Reported by Eric Schulte.
Stas Boukarev [Fri, 23 Aug 2013 22:32:00 +0000 (02:32 +0400)]
Remove debug-deinit, unused.
Its body consists of
;; Nothing to do right now. Once there was, maybe once there
;; will be again.
Once there is something to do, it can easily be put back.
Stas Boukarev [Fri, 23 Aug 2013 22:12:41 +0000 (02:12 +0400)]
Remove an unused variable, *unwind-to-frame-function*.
It hasn't been used for about three years.
Christophe Rhodes [Wed, 28 Aug 2013 13:16:56 +0000 (14:16 +0100)]
1.1.11: will be tagged as "sbcl-1.1.11"
Christophe Rhodes [Wed, 28 Aug 2013 13:16:29 +0000 (14:16 +0100)]
fix NEWS header
Stas Boukarev [Sat, 24 Aug 2013 21:37:17 +0000 (01:37 +0400)]
Revert "Clean up %more-arg-values."
This reverts commit
1e5296127f5b384a2171646747021ebeee73b801.
It breaks SLIME; a better solution will come in the next release cycle.
Christophe Rhodes [Thu, 22 Aug 2013 12:39:10 +0000 (13:39 +0100)]
Better support for NetBSD/current
Wrap more syscalls to defend against linker rewriting (patch from Robert
Swindells sbcl-devel 2013-07-12, encouragement from NetBSD users on #sbcl
IRC).
Stas Boukarev [Wed, 21 Aug 2013 22:05:02 +0000 (02:05 +0400)]
Fix OPEN when :if-exists/:if-does-not-exist are both NIL or :ERROR.
Such a combination results in OPEN never actually opening a file,
only either signalling an error or returning NIL.
Reported by Jan Moringen.
Stas Boukarev [Wed, 21 Aug 2013 12:52:26 +0000 (16:52 +0400)]
Don't hardcode the number of gencgc generations.
Use sb-vm:+pseudo-static-generation+.
Patch by Andreas Franke.
Paul Khuong [Wed, 21 Aug 2013 03:53:01 +0000 (23:53 -0400)]
Replace the Kitten of Death message with a warning in the banner
* Arguably, the Windows ports are now as (un)stable as the other
non-Linux/x86oid ports.
* Either way, the warning is now disabled by --noinform.
* Also, replace the lossage message when the initial thread returns
with a clearer description of the situation.
Stas Boukarev [Wed, 21 Aug 2013 01:27:35 +0000 (05:27 +0400)]
Flush streams more precisely.
The test for the space left in the stream buffer was too conservative,
leaving 1 byte unused.
Patch by Ken Olum.
Fixes lp#910213.
Stas Boukarev [Tue, 20 Aug 2013 23:06:28 +0000 (03:06 +0400)]
Fix thread-alloca test on Windows.
Invoke gcc in a more portable fashion.
Stas Boukarev [Tue, 20 Aug 2013 21:12:00 +0000 (01:12 +0400)]
Fix ROOM on Windows.
A bit-field inside the page struct is defined as "unsigned fields", on
Linux it's packed into 8 bits, but on 32-bit Windows into 32-bits. The
code in room expects the former. Defining it as "unsigned char fields"
solves the problem.
Stas Boukarev [Tue, 20 Aug 2013 17:06:05 +0000 (21:06 +0400)]
Clean up %more-arg-values.
The second argument to %more-arg-values is always 0. Remove it.
Stas Boukarev [Mon, 19 Aug 2013 23:20:04 +0000 (03:20 +0400)]
Clean up and micro-optimize list checking in some x86-64 VOPs.
In length/list and values-list, instead of manually checking for LIST,
call %test-lowtag, which produces more compact code.
Stas Boukarev [Mon, 19 Aug 2013 22:32:18 +0000 (02:32 +0400)]
Micro-optimize copy-more-arg on x86-64.
Instead of copying RCX into RBX, then modifying RCX and later
restoring RCX from RBX, modify RBX instead.
Stas Boukarev [Mon, 19 Aug 2013 22:29:01 +0000 (02:29 +0400)]
Clean up listify-rest-args VOP on x86-64.
It's no longer using loop instructions, remove STD and CLD.
Stas Boukarev [Mon, 19 Aug 2013 16:56:22 +0000 (20:56 +0400)]
Apply a recent optimization more widely.
FOREIGN-SYMBOL-SAP was missing changing
LEA REG, [#xADDRESS]
to
MOV REG, #xADDRESS
Stas Boukarev [Thu, 15 Aug 2013 18:02:54 +0000 (22:02 +0400)]
Add a memory barrier inside pseudo-atomic on PPC.
Solves problems with allocation and multiple threads.
Stas Boukarev [Thu, 15 Aug 2013 17:52:24 +0000 (21:52 +0400)]
Set up alien stack correctly on non-x86oids.
It's assumed that the C stack grows upward everywhere but X86oids,
which is not true. Define two new conditions,
ALIEN_STACK_GROWS_DOWNWARD and ALIEN_STACK_GROWS_UPWARD.
This fixes FFI issues on PPC.
Stas Boukarev [Thu, 15 Aug 2013 17:00:06 +0000 (21:00 +0400)]
create_os_thread: put pthread stack inside alien-stack.
On !LISP_FEATURE_C_STACK_IS_CONTROL_STACK set pthread stack to
alien_stack, not control_stack.
Stas Boukarev [Thu, 15 Aug 2013 14:40:51 +0000 (18:40 +0400)]
Warn when defining a setf-function together with a setf-expander.
Patch by Douglas Katzman.
Stas Boukarev [Thu, 15 Aug 2013 13:43:13 +0000 (17:43 +0400)]
Throw errors on malformed FUNCTION.
(funcall (function X junk)) didn't throw an error in the presence of a
compiler-macro for X.
Patch by Douglas Katzman.
Stas Boukarev [Thu, 15 Aug 2013 13:21:04 +0000 (17:21 +0400)]
Optimize calling asm routines and static foreign functions on x86-64.
Instead of loading the address using
LEA REG, [#xADDRESS]
use
MOV REG, #xADDRESS
which saves 2 bytes.
Stas Boukarev [Tue, 6 Aug 2013 17:11:16 +0000 (21:11 +0400)]
Fix undefined function errors on PPC and MIPS.
undefined_tramp hardcodes the register in which FDEFN resides, but the
format was recently changed (f69e89d..).
Other platforms can be susceptible to this.
A proper fix would avoid hardcoding this by exporting
sc-offset-scn-byte/sc-offset-offset-byte, and register offsets.
Thanks to the GCC Compile Farm project for providing machines for
testing and uncovering this.
Stas Boukarev [Thu, 1 Aug 2013 17:51:55 +0000 (21:51 +0400)]
Microoptimize (signed-byte 64) type test on x86-64.
Similar to the (unsigned-byte 64) one:
TEST CL, 3
MOV EAX, ECX
=>
MOV EAX, ECX
TEST AL, 3
Also add tests/run-tests-* to .gitignore.
Christophe Rhodes [Wed, 31 Jul 2013 13:06:43 +0000 (14:06 +0100)]
fix manual build under texinfo 5
Texinfo 5 is more assertive about its syntax: macros with
non-alphanumerics have never actually been allowed, but we used to be
able to get away with @& to escape an ampersand under @iftex, and
defining @&key macros under @ifnottex. Nuh-uh, not any more. (fixes
lp#1189146)
The details of the indexes, particularly in html format, differ slightly
under texinfo 4 and 5 (related to the trickery around hiding package
prefixes for decent alphabetization). It might be nice to sort this out
Once And For All, eventually.
Stas Boukarev [Sun, 28 Jul 2013 18:26:18 +0000 (22:26 +0400)]
Microoptimize comparisons with 0 on x86oids.
Implement the common idiom of using TEST REG, REG in place of CMP REG,
0, saving 1 byte, for fast-if->/< VOPs.
Stas Boukarev [Sun, 28 Jul 2013 16:41:58 +0000 (20:41 +0400)]
Optimize (unsigned-byte 32/64) type tests on x86oids.
Instead of doing
TEST CL, 3
MOV EAX, ECX
do
MOV EAX, ECX
TEST AL, 3
AL has shorter encoding and can save 1 byte with ECX or 4 bytes with
ESI, which doesn't have SIL on x86.
Also revert a part of the previous commit which used untagged
pointers, which can cause problems with the GC.
Stas Boukarev [Sun, 28 Jul 2013 15:42:41 +0000 (19:42 +0400)]
Microoptimize type-tests on x86oids.
On x86-64 in %test-lowtag instead of doing:
MOV EAX, ECX
AND AL, 15
CMP AL, 15
do
LEA EAX, [RCX-15]
TEST AL, 15
which saves one byte.
On x86 this optimization is already applied, but since LEA loads a
32-bit integer, EAX can be later used as an already untagged pointer
in %test-headers: MOV EAX, [ECX-7] => MOV EAX, [EAX], which takes one
byte less to encode.
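The equivalence behind the x86-64 rewrite can be checked in a few
lines. This is a sketch that models the two instruction sequences
with integer arithmetic; the function names are hypothetical, and
only the low four bits (mask 15) are considered, as in the sequences
quoted above.

```python
LOWTAG_MASK = 15


def has_lowtag_cmp(word, lowtag):
    """Models MOV EAX, ECX / AND AL, 15 / CMP AL, lowtag."""
    return (word & LOWTAG_MASK) == lowtag


def has_lowtag_lea(word, lowtag):
    """Models LEA EAX, [RCX-lowtag] / TEST AL, 15."""
    eax = (word - lowtag) & 0xFFFFFFFF  # a 32-bit LEA destination truncates
    return (eax & LOWTAG_MASK) == 0


# Both tests agree for every lowtag and every word in a small range.
agree = all(
    has_lowtag_cmp(w, t) == has_lowtag_lea(w, t)
    for t in range(16)
    for w in range(1 << 12)
)
print(agree)
```

Subtracting the expected lowtag leaves zeros in the low bits exactly
when the word carried that lowtag, which is why a single TEST can
replace the AND/CMP pair.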
Christophe Rhodes [Sun, 28 Jul 2013 14:14:11 +0000 (15:14 +0100)]
1.1.10: will be tagged as "sbcl-1.1.10"
Paul Khuong [Fri, 28 Jun 2013 06:36:13 +0000 (02:36 -0400)]
Modular integer %NEGATE on x86oids
Forms like (logand (- word) word) now compute the negation in modular
arithmetic, without consing an intermediate bignum, just like integer
addition, multiplication and subtraction.
The VOPs are trivial, and should be easily added on all other
platforms, I just don't have access to build hosts.
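Why the negation can be computed modulo the word size can be seen in
a short sketch. It assumes a 64-bit word; negate_mod_word and
lowest_set_bit are names invented for this example.

```python
WORD_BITS = 64  # assumed word size for this sketch
WORD_MASK = (1 << WORD_BITS) - 1


def negate_mod_word(x):
    """Negation modulo 2**64: the result always fits in one word."""
    return (-x) & WORD_MASK


def lowest_set_bit(word):
    """Equivalent of (logand (- word) word) for a word-sized argument."""
    return negate_mod_word(word) & word


print(bin(lowest_set_bit(0b101000)))
```

Because the outer LOGAND with a word discards every bit above the
word size, the truncated negation yields the same result as the
exact one, and no intermediate bignum is needed.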
Paul Khuong [Tue, 25 Jun 2013 03:24:05 +0000 (23:24 -0400)]
Pack (mostly) stack TNs according to lexical scope information
Packing TNs from shallow scopes before more deeply nested ones
is a perfect elimination order when the live ranges span the
full scope (the interference graph is a comparability graph).
Use that as a heuristic, and do that for TNs that are known
to have such simple live ranges before the rest: this ensures
that bad TNs don't mess everything up.
The result is much tighter stack allocation (most of the effect
comes from initialising stack frames at a smaller size, and growing
less aggressively), and fewer long-lived stray references.
Incidentally: fix catch block packing on win32, solving lp#1072739
Paul Khuong [Thu, 27 Jun 2013 22:59:57 +0000 (18:59 -0400)]
Grow regalloc datastructures geometrically for unbounded SCs
Paul Khuong [Thu, 27 Jun 2013 22:46:52 +0000 (18:46 -0400)]
Smaller stack frames on x86oids
Start at 4 slots (for some reason, it seems that 3 isn't really
the minimum), and grow by one slot at a time.
Paul Khuong [Thu, 27 Jun 2013 22:44:08 +0000 (18:44 -0400)]
Disentangle storage base initial size from growth increments
Before, an initial stack frame size of 8 meant that the stack frame
always grew in increments of 8. Not only is a large initial size bad
for GC (it leaves more dead references untouched), but a large increment
is even worse.
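The effect of separating the two parameters can be illustrated with
a toy model. This is only a sketch: frame_size is a hypothetical
helper, not the packer's actual growth logic, and the values 8, 4
and 1 are taken from the commit messages above.

```python
def frame_size(slots_needed, initial, increment):
    """Smallest size reachable from `initial` in steps of `increment`."""
    size = initial
    while size < slots_needed:
        size += increment
    return size


# A function that needs 9 stack slots:
before = frame_size(9, initial=8, increment=8)  # size tied to increment
after = frame_size(9, initial=4, increment=1)   # independent parameters
print(before, after)
```

In this model, every slot beyond what the function needs is a slot
that may hold a dead reference.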
Paul Khuong [Thu, 18 Jul 2013 19:03:21 +0000 (15:03 -0400)]
Insert explicit cut to width when needed
When modular arithmetic operations are replaced with specialised
modular variants, the result's bitwidth is determined by the variant,
and might be wider than expected. If necessary, insert an explicit
cut to the exact bitwidth before returning a value in a non-modular
context.
Spotted by pfdietz's random tester.
Fixes lp#1199428.
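A small sketch of the problem, assuming a 32-bit result computed by
a 64-bit modular variant; add_mod and cut_to_width are illustrative
names, not compiler functions.

```python
def add_mod(a, b, width):
    """Addition as a modular variant of the given bit width performs it."""
    return (a + b) & ((1 << width) - 1)


def cut_to_width(value, width):
    """Explicit cut: keep only the low `width` bits."""
    return value & ((1 << width) - 1)


a, b = 0xFFFFFFFF, 1
wide = add_mod(a, b, 64)        # the 64-bit variant keeps bit 32
exact = cut_to_width(wide, 32)  # cut before leaving the modular context
print(hex(wide), hex(exact))
```

The wide result is only safe while it stays inside modular code; a
non-modular consumer needs the value cut to the exact width.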
Paul Khuong [Thu, 18 Jul 2013 18:29:12 +0000 (14:29 -0400)]
Avoid uselessly re-scanning modular arithmetic expressions
When modular arithmetic transforms have already fired for a
subexpression, and that subexpression's width is at most as wide
as the bitwidth we're cutting to, there is no need to re-traverse
the subexpression.
There was already some code to detect that case. Make it more general,
and, more importantly, sound.
Paul Khuong [Tue, 9 Jul 2013 12:16:41 +0000 (08:16 -0400)]
No more destructive MERGE of shared data in best-modular-version
The old code worked by accident: few, if any, platforms implement
untagged signed modular arithmetic VOPs.
The new code handles that common case to avoid consing a fresh list
when the MERGE will be an identity.
Stas Boukarev [Tue, 16 Jul 2013 01:12:51 +0000 (05:12 +0400)]
Optimize TYPEP of (MOD X) on x86/x86-64.
Optimize type-tests in the same vein as type-checks previously, and
implement type-checks by means of type-tests. Further optimize it by
avoiding doing fixnum tests on known fixnums and boxing of
signed/unsigned numbers.
Paul Khuong [Mon, 8 Jul 2013 21:05:26 +0000 (17:05 -0400)]
Handle unbounded integer types in INTEGER-TYPE-NUMERIC-BOUNDS
If any of the integer types in the union lacks upper or lower bounds,
immediately abort with unknown bounds (rather than taking the MIN
of NIL and an integer).
Thanks to pfdietz for his random testing.
Fixes lp#1199127.
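The fixed behaviour can be sketched as follows. The representation
(pairs of bounds, with None standing in for NIL) and the name
union_bounds are assumptions for this example, not the compiler's
data structures.

```python
def union_bounds(ranges):
    """Bounds of a union of integer ranges given as (low, high) pairs.

    None marks a missing bound. If any member lacks a bound, give up
    and report unknown bounds instead of comparing None to an integer.
    """
    if any(low is None or high is None for low, high in ranges):
        return None, None
    return min(low for low, _ in ranges), max(high for _, high in ranges)


print(union_bounds([(0, 10), (-5, 3)]))
print(union_bounds([(0, 10), (20, None)]))
```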
Lutz Euler [Sun, 7 Jul 2013 13:06:47 +0000 (15:06 +0200)]
Add a regression test for lp#1194673.
Christophe Rhodes [Fri, 5 Jul 2013 16:24:44 +0000 (17:24 +0100)]
restore CLISP cross-compilability
s is the marker for short-float, not single-float. Use f instead.
Stas Boukarev [Thu, 4 Jul 2013 09:58:54 +0000 (13:58 +0400)]
Update ASDF to 3.0.2.
Christophe Rhodes [Thu, 4 Jul 2013 08:29:48 +0000 (09:29 +0100)]
1.1.9: will be tagged as "sbcl-1.1.9"
Christophe Rhodes [Wed, 3 Jul 2013 09:26:01 +0000 (10:26 +0100)]
fix typo in FFI chapter
noted by Michael Crouch (lp#1197129)
Paul Khuong [Sat, 29 Jun 2013 23:54:25 +0000 (19:54 -0400)]
Revert to binding *package* in bootstrappy code
I see no sane way to use sb!ext:print-symbol-with-prefix or
sb-ext:... in ~//. Bind *package* to :keyword instead, for
undefined function conditions.
Reported by adeth on #lisp.
James M. Lawrence [Tue, 25 Jun 2013 21:43:27 +0000 (17:43 -0400)]
Fix SLEEP on 32-bit platforms.
Paul Khuong [Fri, 28 Jun 2013 13:08:11 +0000 (09:08 -0400)]
Fix a typo in the block comment on encoding/decoding universal times
Might as well, while we're updating URLs.
Spotted by Luis Oliveira.
Paul Khuong [Fri, 28 Jun 2013 06:03:33 +0000 (02:03 -0400)]
s/32/n-word-bits/ in bignum-index
For some reason we missed that during all the work on x86-64. Then
again, our static inference is good enough that no runtime check
seems to be left to detect that the declaration is wrong.
Paul Khuong [Fri, 28 Jun 2013 06:01:06 +0000 (02:01 -0400)]
Store FP values from x87 to the heap outside pseudo-atomic
It suffices to get the header right, and that way we avoid signaling
FPEs in PA, when, as is bound to happen, a value is only truncated
when it's boxed.
Paul Khuong [Fri, 28 Jun 2013 05:45:49 +0000 (01:45 -0400)]
double->single float conversion isn't a no-op on x87 anymore
The conversion can result in overflow, so pass through a stack
temporary to force a truncation.
Test case by Peter Keller on sbcl-devel, 2013-06-26.
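That the conversion is a real operation can be seen outside the x87
as well. This sketch uses Python's struct module to round a double
to single precision; to_single is a name chosen for this example,
and Python's OverflowError merely stands in for the floating point
overflow exception.

```python
import struct


def to_single(x):
    """Round a Python float (a double) to single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


print(to_single(1.5))  # exactly representable, unchanged
print(to_single(0.1))  # loses precision
try:
    to_single(1e300)   # finite as a double, too large for a single
except OverflowError as error:
    print("overflow:", error)
```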
Paul Khuong [Fri, 28 Jun 2013 03:15:38 +0000 (23:15 -0400)]
New contrib: SB-GMP
This contrib was developed by Stephan Frank to replace some of our
bignum and rational arithmetic code with calls to libgmp. Simply
loading the contrib will transparently accelerate arithmetic on
large rationals when libgmp is available; if libgmp cannot be found,
the contrib should change nothing.
The contrib also wraps additional functions in GNU MP, so that they
accept and return SBCL-native integers or ratios. See GNU MP's manual
for more information.
Paul Khuong [Fri, 28 Jun 2013 02:03:24 +0000 (22:03 -0400)]
Defer some sanity checks to after testing for value reference to inline functions
The functional corresponding to an inline function can be marked as dead when
there remain references in for-value contexts. Detect such references before
making sure the function is still live.
Reported with a reduced test case by Teemu Likonen to sbcl-devel on 2013-06-24.
Jan Moringen [Sun, 23 Jun 2013 17:13:14 +0000 (19:13 +0200)]
In MAKE-THREAD, use WITH-SYSTEM-MUTEX for locking *MAKE-THREAD-LOCK*
Otherwise MAKE-THREAD could be interrupted after having
locked *MAKE-THREAD-LOCK*. If the interrupting code also called
MAKE-THREAD, a recursive lock attempt for *MAKE-THREAD-LOCK* would
occur.
The problem could be easily triggered by
(MAKE-TIMER ... :THREAD <T or a thread>)
Also move let bindings of SETUP-SEM, REAL-FUNCTION, ARGUMENTS and
INITIAL-FUNCTION and the NOT *GC-INHIBIT* assertion out of the
critical section.
Tests have been added in threads.pure.lisp and timer.impure.lisp.
fixes lp#1180102.
Attila Lendvai [Wed, 27 Oct 2010 12:03:34 +0000 (14:03 +0200)]
Wrap the body of sb-debug:backtrace with with-debug-io-syntax.
Added with-debug-io-syntax macro.
Some whitespace changes as well.
Attila Lendvai [Sun, 31 Oct 2010 00:18:53 +0000 (02:18 +0200)]
Provide more info in debugger-disabled-hook.
Before this change if there was an error printing the condition object,
then we didn't even try printing the backtrace afterwards, which can be
a useful source of information even if the condition printing has failed.
Some modifications by Paul Khuong.
Attila Lendvai [Sun, 2 Jan 2011 15:41:47 +0000 (16:41 +0100)]
Make the printing of a slot-unbound error more error tolerant.
Especially against errors coming from custom PRINT-OBJECT methods, in which
case only print the TYPE-OF the instance. Also, print fully qualified
symbol names.
Slightly modified by Paul Khuong.
Attila Lendvai [Sun, 2 Jan 2011 15:42:15 +0000 (16:42 +0100)]
Use sb!ext:print-symbol-with-prefix in implicit-generic-function-warning.
Also export this useful function from sb-ext.
Slight mangling by Paul Khuong.
Pierre Thierry [Sun, 28 Aug 2011 19:38:28 +0000 (21:38 +0200)]
Update URL of "Long, Painful History of Time"
Previous URL was not available anymore (404 error)
Also rewrapped the paragraph to the same width as the others
Stas Boukarev [Mon, 24 Jun 2013 10:28:30 +0000 (14:28 +0400)]
Simplify EMIT-VOP further.
EMIT-VOP is only ever used in conjunction with INSERT-VOP-SEQUENCE, by
returning two values: first and last VOPs, all linked together,
INSERT-VOP-SEQUENCE then inserts them into the block. But nowadays
EMIT-VOP always returns the same VOP as the second value.
* EMIT-VOP now returns one value, the emitted VOP.
* INSERT-VOP-SEQUENCE is renamed to INSERT-VOP, accepts only one VOP.
* A new function EMIT-AND-INSERT-VOP is added, which combines them,
and is used anywhere where EMIT-VOP was used.
This makes things less complicated, and reduces core size by 32KB, the
same as the previous commit, for a total of 64KB of savings
essentially for free.
(Also squeeze in a couple of line-break fixes)
Stas Boukarev [Mon, 24 Jun 2013 09:50:35 +0000 (13:50 +0400)]
Simplify EMIT-GENERIC-VOP.
Since there's only one kind of templates now, there's no need for
indirection. Rename EMIT-GENERIC-VOP to EMIT-VOP, remove EMIT-FUNCTION
slot from TEMPLATE, call EMIT-VOP directly.
Stas Boukarev [Sat, 22 Jun 2013 15:37:18 +0000 (19:37 +0400)]
backtrace: don't cons large lists when RCX is overwritten inside XEPs.
To present a list with the actual number of passed arguments in the
backtrace, clean-xep used the arg-count register and added missing
arguments in the form of #<unknown>, but if the register is
overwritten by other code, it could cons very large lists, exhausting
heap. Do such arg-list clean up only upon INVALID-ARG-COUNT-ERROR.
Fixes lp#1192929.
Joshua Elsasser [Sun, 9 Jun 2013 04:36:48 +0000 (21:36 -0700)]
Hopefully fix the windows build to grovel time structures correctly.
It is a little misleading to say "correctly" since struct timespec
doesn't really exist on windows. Groveling the definition that we
define in our own pthreads wrapper seems the most consistent choice.
The grovel-headers.c changes have only been tested in isolation, not
with a real build. Thanks to Kyle Isom for testing, any resulting
build problems are entirely my fault.
Joshua Elsasser [Sun, 12 May 2013 15:36:11 +0000 (08:36 -0700)]
Grovel timeval and timespec struct definitions rather than hard-coding.
Paul Khuong [Tue, 18 Jun 2013 17:23:42 +0000 (13:23 -0400)]
Fix instruction encoding for XMM shifts with immediate count
x86 keeps getting more and more devious: the source/dest operand
is in the r/m field for these instructions, so REX.B must be set,
rather than REX.R, to access > xmm7. Intel's new documentation
seems clearer about these issues, at least.
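The encoding rule can be sketched for one such instruction, PSLLQ
with an immediate count (66 0F 73 /6 ib in Intel's notation). The
function encode_psllq_imm is a hypothetical helper written for this
example, not SBCL's assembler.

```python
def encode_psllq_imm(xmm, count):
    """Encode PSLLQ xmmN, imm8 for a register number 0..15."""
    if not 0 <= xmm <= 15 or not 0 <= count <= 255:
        raise ValueError("register or count out of range")
    out = [0x66]                  # mandatory operand-size prefix
    if xmm >= 8:
        out.append(0x40 | 0x01)   # REX.B extends the r/m field
    # ModRM: mod=11, reg=/6 (opcode extension), r/m=low 3 register bits
    modrm = 0xC0 | (6 << 3) | (xmm & 7)
    out += [0x0F, 0x73, modrm, count]
    return bytes(out)


print(encode_psllq_imm(1, 4).hex())
print(encode_psllq_imm(9, 4).hex())
```

The reg field of ModRM holds the opcode extension here, so REX.R has
nothing to extend; the register lives in r/m and needs REX.B.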
Stas Boukarev [Tue, 11 Jun 2013 11:20:10 +0000 (15:20 +0400)]
check-mod-fixnum: correct the test for power-of-two.
Testing for the power of two was performed on a fixnumized number,
causing the optimization for power-of-two to be never applied.
Stas Boukarev [Mon, 10 Jun 2013 18:54:55 +0000 (22:54 +0400)]
Add a missing :suppress-errors keyword for WRITE defknown.
Christophe Rhodes [Mon, 10 Jun 2013 12:26:27 +0000 (13:26 +0100)]
Note removal of post-receive-email in NEWS
Christophe Rhodes [Mon, 10 Jun 2013 12:20:14 +0000 (13:20 +0100)]
remove git/ directory
The scripts therein weren't directly relevant for SBCL development,
only for infrastructure to assist that development; additionally,
they are derived from git examples licensed under the GPL, and it
was difficult to explain the effects of this in a short paragraph
in the master COPYING file. A new sbcl-git-hooks repository is
available on SourceForge for these and any other infrastructure
customizations.
Lutz Euler [Mon, 10 Jun 2013 11:44:20 +0000 (13:44 +0200)]
Micro-optimize DOUBLE-FLOAT-LOW-BITS on x86-64.
Instead of loading a 64-bit register from memory and zeroing the upper
32 bits of it by the sequence SHL reg, 32; SHR reg, 32 simply load the
corresponding 32-bit register from memory, relying on the implicit
zero-extension to 64 bits this does. This is smaller and faster.
For example, if the input to the VOP is a descriptor register, the old
instruction sequence is:
MOV RDX, [RDX-7]
SHL RDX, 32
SHR RDX, 32
and the new one:
MOV EDX, [RDX-7]
Regarding store-to-load forwarding this change should make no
difference: Most current processors can forward a 64-bit store to a
32-bit load from the same address. The exception is Intel's Atom which
can forward only to a load of the same size as the store; but it also
supports this only between integer registers, and DOUBLE-FLOAT-LOW-BITS
mostly or even always acts on memory slots written from an XMM register
(of the three storage classes it supports as input, for the first it
does the store itself from an XMM register; for the other two I have
investigated some disassemblies and always found the prior store to be
from an XMM register).
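The equivalence of the two sequences can be modelled with the struct
module. This is a sketch with invented function names; it assumes a
little-endian layout, as on x86-64.

```python
import struct


def low_bits_shift(x):
    """Old sequence: 64-bit load, then SHL 32 / SHR 32."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]  # MOV RDX, [RDX-7]
    bits = (bits << 32) & 0xFFFFFFFFFFFFFFFF             # SHL RDX, 32
    return bits >> 32                                    # SHR RDX, 32


def low_bits_load32(x):
    """New sequence: a single zero-extending 32-bit load."""
    return struct.unpack_from("<I", struct.pack("<d", x), 0)[0]


for value in (1.0, 0.1, -2.5):
    print(hex(low_bits_shift(value)), hex(low_bits_load32(value)))
```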
Lutz Euler [Mon, 10 Jun 2013 10:37:22 +0000 (12:37 +0200)]
Make clean.sh clean up doc/internals, too.
For completeness and equal treatment with doc/manual.
Lutz Euler [Mon, 10 Jun 2013 10:37:22 +0000 (12:37 +0200)]
git: Add entries for the HTML manual to doc/internals/.gitignore.
These are the files and directories generated by "make html" in
doc/internals.
Lutz Euler [Sun, 9 Jun 2013 15:58:52 +0000 (17:58 +0200)]
git: New file doc/internals/.gitignore.
Ignore the files generated by building the internals manual.
Copied and adapted from doc/manual/.gitignore.
Paul Khuong [Sat, 8 Jun 2013 16:53:16 +0000 (12:53 -0400)]
Insert error traps after full calls inferred not to return
An explicit error trap after full calls to known functions helps
understand type derivation errors at runtime; it's certainly better
than executing arbitrary bytes.
Only do this when the return type was tightened to NIL via type
derivation; if a function is defknowned not to return, it really
shouldn't.
Paul Khuong [Sat, 8 Jun 2013 15:26:19 +0000 (11:26 -0400)]
Only use MASK-SIGNED-FIELD VOPs as last resorts
The MOVE hack usually leads to better code when it can be used.
Paul Khuong [Sat, 8 Jun 2013 05:39:10 +0000 (01:39 -0400)]
Handle (aref v (+ i k)), with i negative
* Update the fndb to allow negative index values for foo-ref-with-offset
and foo-set-with-offset.
* Adjust VOPs accordingly.
* Fix fold-index-addressing: only fold constant offsets if the resulting
index argument would be a fixnum, and compute the new offset correctly
for subtractions.
* Unmark the corresponding test as an expected failure, and add a test
to make sure VOPs for data-vector-{ref,set}-with-offset accept negative
index values (unless the element size is too small
to fold offsets in an EA).
* Un-package-qualify a few spurious test-util:with-test.
Paul Khuong [Sat, 8 Jun 2013 05:38:32 +0000 (01:38 -0400)]
Fix a typo in bignum--ref-with-offset
Paul Khuong [Sat, 8 Jun 2013 05:34:52 +0000 (01:34 -0400)]
Consistently force (double) rounding of foreign x87 values
SBCL always functions in 64 bit mode, but switches to 80 bit for
foreign calls. Return values might be unexpectedly precise.
Force a round-trip from the x87 unit and the stack to make sure
FP return values are rounded to the correct width.
Paul Khuong [Sat, 8 Jun 2013 05:31:22 +0000 (01:31 -0400)]
Look for left-over dead code when *check-consistency*
If ir1opt leaves dead code around, later parts of the compilation
pipeline can become seriously confused. Detect such issues earlier,
rather than as mysterious failures.
Paul Khuong [Sat, 8 Jun 2013 03:37:52 +0000 (23:37 -0400)]
Simplify RATIONAL/constant FLOAT and INTEGER/constant RATIO comparisons
The spec says that rationals and floats are compared by first calling
RATIONAL on the float. Constant fold the call to RATIONAL, and round
to an integer if applicable.
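One reading of this simplification, for an integer compared with a
constant float, can be sketched as below. Fraction plays the role of
RATIONAL (the exact value of the float); fold_less_than is a
hypothetical name, and rounding up to the ceiling is this example's
interpretation of "round to an integer".

```python
import math
from fractions import Fraction


def fold_less_than(constant_float):
    """Integer n such that (x < constant_float) == (x < n) for integer x."""
    exact = Fraction(constant_float)  # exact rational value of the float
    return math.ceil(exact)


for c in (2.5, 3.0, -0.1):
    print(c, fold_less_than(c))
```

Once the constant is an integer, the comparison needs no rational or
float arithmetic at runtime.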
Paul Khuong [Sat, 8 Jun 2013 02:29:55 +0000 (22:29 -0400)]
Silence notes about being specialised EQ templates on x86oids
During LTN, we emit notes when the final chosen template costs
at least 6 more units than an inapplicable template. Adjust the
costs of EQ VOPs to avoid triggering this logic.
Paul Khuong [Sat, 8 Jun 2013 02:27:59 +0000 (22:27 -0400)]
Silence the transforms that detect rightward arithmetic shift
It seems most of the time, the notes complained about clear left
shifts.
Paul Khuong [Sat, 8 Jun 2013 02:26:59 +0000 (22:26 -0400)]
Mark DATA-VECTOR-REF[-WITH-OFFSET] as unsafely flushable
Unsafe code will be able to eliminate array reads as dead code.
Paul Khuong [Sat, 8 Jun 2013 02:25:57 +0000 (22:25 -0400)]
Adjust internal encoding for TN location for larger SC count limit
We need one more bit to encode values up to 62.
Paul Khuong [Sat, 8 Jun 2013 01:43:16 +0000 (21:43 -0400)]
New VOP for LOGAND of bignum and word-sized constant on x86-64
We can directly read the first bignum digit and AND it.
Paul Khuong [Sat, 8 Jun 2013 01:42:17 +0000 (21:42 -0400)]
MASK-SIGNED-FIELD VOPs on x86-64
Other platforms go through the MOVE hack like before.
Paul Khuong [Fri, 7 Jun 2013 23:36:13 +0000 (19:36 -0400)]
More identity folding for LOGAND and LOGIOR with constants
* Handle more complex cases than only powers of two:
compare the variant argument with a power-of-two-sized
prefix of the constant bit pattern.
* Add parallel logic for LOGIOR: if all the ones we're ORing in
are already set because the variant argument is a small enough
negative integer, we've got an identity.
* This is a bit hairy, so exhaustively check the logic with small
values.
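The two identity conditions can be stated and exhaustively checked
on small values, in the spirit of the last item above. This sketch
assumes the variant argument is either an unsigned integer of a
given width or a negative integer no smaller than -2**width; the
predicate names are invented here.

```python
def logand_is_identity(width, constant):
    """x & constant == x for every x in [0, 2**width)."""
    mask = (1 << width) - 1
    return constant & mask == mask


def logior_is_identity(width, constant):
    """x | constant == x for every x in [-2**width, -1]."""
    return constant & ((1 << width) - 1) == 0


def check(max_width=5, span=64):
    """Compare both predicates against brute force on small values."""
    for width in range(max_width + 1):
        for c in range(-span, span):
            and_ok = all(x & c == x for x in range(1 << width))
            ior_ok = all(x | c == x for x in range(-(1 << width), 0))
            if and_ok != logand_is_identity(width, c):
                return False
            if ior_ok != logior_is_identity(width, c):
                return False
    return True


print(check())
```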
Paul Khuong [Fri, 7 Jun 2013 23:06:32 +0000 (19:06 -0400)]
More associativity-based constant-folding
Detect patterns like (+ (- x k1) k2) and (* (/ x k1) k2).
Paul Khuong [Fri, 7 Jun 2013 23:03:44 +0000 (19:03 -0400)]
Enable signed modular arithmetic for LOGIOR
When the result of a bitwise or is known to be negative, we don't
need to compute the most significant bits (they're all ones).
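Why the high bits can be skipped is shown in this sketch.
sign_extend plays a role similar to MASK-SIGNED-FIELD; both function
names are assumptions of the example.

```python
def sign_extend(width, value):
    """Keep the low `width` bits and sign-extend from the top one."""
    low = value & ((1 << width) - 1)
    return low - (1 << width) if low >> (width - 1) else low


def logior_narrow(width, x, y):
    """Bitwise or computed on the low `width` bits only."""
    mask = (1 << width) - 1
    return sign_extend(width, (x & mask) | (y & mask))


# x | y is -3 even though y has a very high bit set: x already has it.
x, y = -3, 1 << 100
print(x | y, logior_narrow(8, x, y))
```

Whenever the full result fits in the signed width, as a small
negative result does, the narrow computation reproduces it exactly.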
Paul Khuong [Fri, 7 Jun 2013 23:01:09 +0000 (19:01 -0400)]
Enable more modular arithmetic
The rewrites now trigger when the result type for LOGAND or
MASK-SIGNED-FIELD is a union of integer types.
Paul Khuong [Fri, 7 Jun 2013 22:46:25 +0000 (18:46 -0400)]
Complete cut-to-width for modular arithmetic
For each modular argument, go through the nodes that provide its
value and try to narrow down their bitwidth. If we fail on any and
the result might be too wide, splice in an explicit call to LOGAND
or MASK-SIGNED-FIELD. Skip that last step if the value is an argument
to an equivalent LOGAND or MASK-SIGNED-FIELD.
Test case by Eric Marsden.
Paul Khuong [Fri, 7 Jun 2013 22:42:42 +0000 (18:42 -0400)]
Fix negation of SIMD-PACK types
type-union2 can punt and return NIL.
Lutz Euler [Thu, 6 Jun 2013 14:26:30 +0000 (16:26 +0200)]
Simplify getting the contents of assembler segments.
Extend FINALIZE-SEGMENT to compact the segment's buffer and provide an
exported function to get at this buffer. This resolves an old KLUDGE
noted at ON-SEGMENT-CONTENTS-VECTORLY, making this function unnecessary.
There are several benefits to this change: First, the consumers of
assembler segment's contents, like WRITE-SEGMENT-CONTENTS which is used
for example during FASL dumping, or MAKE-CORE-COMPONENT, now call
WRITE-SEQUENCE or COPY-BYTE-VECTOR-TO-SYSTEM-AREA, respectively, only once
per segment and not once per the pieces of the segment's contents that
ON-SEGMENT-CONTENTS-VECTORLY provided, which makes for less overhead.
Second, this greatly simplifies the whole operation of
DISASSEMBLE-ASSEM-SEGMENT, in the process deleting several of its helpers.
So far this repartitioned the pieces of the segment's contents from
ON-SEGMENT-CONTENTS-VECTORLY, while caring not to split the contents
inside instructions, which needed a sizable amount of code. Now the
segment's contents are simply disassembled as a whole. Also, the old
code (specifically SEGMENT-OVERFLOW) didn't take prefix instructions
into account correctly which surfaced as the bug in lp#1085729.
Fixes lp#1085729.
Also, fix an unrelated typo in NEWS.
Stas Boukarev [Wed, 5 Jun 2013 23:11:42 +0000 (03:11 +0400)]
Stop exporting unused symbols.