Nikodemus Siivola [Wed, 22 Apr 2009 20:11:05 +0000 (20:11 +0000)]
1.0.27.22: better signaling from MAKE-STATIC-VECTOR
* Patch by Daniel Lowe.
Nikodemus Siivola [Wed, 22 Apr 2009 19:15:06 +0000 (19:15 +0000)]
1.0.27.21: more careful (SETF DOCUMENTATION) for functions
* Use VALID-FUNCTION-NAME-P to check if we should store the docstring:
previously we stored docstrings for anonymous functions under names
like (LAMBDA (X)) -- Not Good.
Nikodemus Siivola [Wed, 22 Apr 2009 19:09:30 +0000 (19:09 +0000)]
1.0.27.20: lutex-wait doesn't yet support deadlines
* One test depends on that; skip it for now on lutex builds to avoid a
hang.
Richard M Kreuter [Wed, 22 Apr 2009 18:51:21 +0000 (18:51 +0000)]
1.0.27.19: Restore variable access in debugger REPL.
* Contributed by Alex Plotnick <plotnick@cs.brandeis.edu>
Richard M Kreuter [Wed, 22 Apr 2009 15:42:41 +0000 (15:42 +0000)]
1.0.27.18: Changes to ECHO-STREAMs
* Bugfix: PEEK-CHAR always popped the unread-stuff, leading to
spurious duplicate echoes in some cases.
* Minor incompatible change: UNREAD-CHAR on an ECHO-STREAM now unreads
onto the echo-stream's input stream. This is unspecified in the
CLHS, but makes SBCL compatible with most implementations (AFAICT,
everybody but CMUCL).
* Minor incompatible change: echo-streams used to buffer arbitrarily
many characters in UNREAD-CHAR. Conforming programs can't have
relied on this, but non-conforming ones might have; users who need
the old CMUCL/SBCL behavior can do it easily and de-facto-portably
with Gray Streams.
* Possible bugfix that nobody cares about: ECHO-N-BIN (which
implements a path through READ-SEQUENCE) can never have worked after
an UNREAD-CHAR, because it tried to store characters into an octet
buffer.
Gabor Melis [Tue, 21 Apr 2009 11:33:38 +0000 (11:33 +0000)]
1.0.27.17: faster local calls on x86/x86-64
Instead of JMPing to TARGET, CALL a trampoline that saves the return
pc and jumps. Although this is an incredibly stupid trick, the paired
CALL/RET instructions are a big win.
Gabor Melis [Tue, 21 Apr 2009 11:30:38 +0000 (11:30 +0000)]
1.0.27.16: slightly smaller LISTIFY-REST-ARGS on x86/x86-64
Gabor Melis [Tue, 21 Apr 2009 11:28:46 +0000 (11:28 +0000)]
1.0.27.15: optimize multiple values receivers on x86/x86-64
... by not emitting unreachable instructions.
Gabor Melis [Tue, 21 Apr 2009 11:25:51 +0000 (11:25 +0000)]
1.0.27.14: bias x86oid frame pointer
Forward port of Alastair Bridgewater's patch.
Duplicate it on x86-64.
Make it so that fp points to ocfp just as if the call had been made by
CALL to a function with the standard prologue "PUSH EBP; MOV ESP,
EBP".
Fix the debugger.
Gabor Melis [Tue, 21 Apr 2009 10:26:05 +0000 (10:26 +0000)]
1.0.27.13: more RET on x86oids
With 0, 2 or 3 values return with idiomatic "POP EBP; RET".
Gabor Melis [Tue, 21 Apr 2009 10:25:04 +0000 (10:25 +0000)]
1.0.27.12: x86/x86-64 calling convention comments, refactoring
Gabor Melis [Tue, 21 Apr 2009 10:24:15 +0000 (10:24 +0000)]
1.0.27.11: swap ocfp and return-pc slots in x86oid call frames
Forward port of Alastair Bridgewater's patch. Also, port it to x86-64.
Bring x86 and x86-64 sources closer in the process.
Plus cleanups, indentation, remove dead code, comments, more checks.
Gabor Melis [Tue, 21 Apr 2009 07:33:10 +0000 (07:33 +0000)]
1.0.27.10: fix call_into_lisp return value on x86-64
Christophe Rhodes [Mon, 13 Apr 2009 21:24:31 +0000 (21:24 +0000)]
1.0.27.9: fix print-object cache handling
1.0.25.50 exposed a bug in the print-object discriminating
function: we need to have the methods for critical printing at
all times, but the implementation allowed other methods into
that initial cache, which was wrong if those methods
were subsequently invalidated. The fix is to keep the initial
cache pristine and to use only copies in the print-object
generic function itself.
Gabor Melis [Mon, 13 Apr 2009 20:00:21 +0000 (20:00 +0000)]
1.0.27.8: slightly faster x86oid pseudo atomic with {e,r}bp
Trusting that within SBCL ebp is even and that it doesn't change
within a pseudo atomic section it's possible to use it to set and
clear pseudo atomic bits. It is ever so slightly faster (about 0.5%
overall on cl-bench on a P4).
Alastair Bridgewater [Sun, 12 Apr 2009 14:03:08 +0000 (14:03 +0000)]
1.0.27.7: Win32 build fix
On Win32, the default cross-compilation host is SBCL with a --sysinit
NUL --userinit NUL. Unfortunately, SBCL itself doesn't recognize NUL
as a valid filename as it's actually a DOS device name and there's a
separate API to check for them. The least losing workaround is to use
a real file with known-harmless content for userinit and sysinit, and
the simplest choice is version.lisp-expr. This change makes it
possible to build on Win32 without specifying a host lisp.
Alastair Bridgewater [Sat, 11 Apr 2009 18:19:08 +0000 (18:19 +0000)]
1.0.27.6: Make alien-type-class definition work from outside sb-alien.
Added a slot to the alien-type-class structure to hold the name of
the structure for the class.
Added the class structure name as a parameter to
create-alien-type-class-if-necessary in order to populate the slot in
the new alien-type-class structure.
Changed define-alien-type-class to look up included alien type
defstruct names in the alien-type-class for the included type rather
than construct it via SYMBOLICATE (thus breaking the requirement that
all uses of define-alien-type-class be in the sb-alien package).
Gabor Melis [Wed, 8 Apr 2009 16:04:02 +0000 (16:04 +0000)]
1.0.27.5: fix compilation on windows
Thanks to Bart Botta.
Gabor Melis [Tue, 7 Apr 2009 13:04:14 +0000 (13:04 +0000)]
1.0.27.4: x86/x86-64 REP prefix has the same code as REPE (not REPNE)
... although it seems to work either way.
Gabor Melis [Tue, 7 Apr 2009 13:00:35 +0000 (13:00 +0000)]
1.0.27.3: fix UNWIND-TO-FRAME-AND-CALL
Gabor Melis [Mon, 6 Apr 2009 08:54:27 +0000 (08:54 +0000)]
1.0.27.2: fix bug in heap implementation
... used by timers.
Thanks to Ole Arndt for the patch.
Richard M Kreuter [Sat, 4 Apr 2009 01:05:52 +0000 (01:05 +0000)]
1.0.27.1: Fix binary input after UNREAD-CHAR on bivalent streams.
* After an UNREAD-CHAR, READ-BYTE returned a character, and
READ-SEQUENCE with an octet buffer failed when trying to store a
character into the buffer.
Richard M Kreuter [Thu, 2 Apr 2009 21:46:34 +0000 (21:46 +0000)]
release, will be tagged as sbcl_1_0_27
Juho Snellman [Fri, 27 Mar 2009 00:39:39 +0000 (00:39 +0000)]
1.0.26.22: Revert 1.0.26.12
* And add testcase showing why the revert was needed.
Gabor Melis [Tue, 24 Mar 2009 14:44:09 +0000 (14:44 +0000)]
1.0.26.21: fix ERROR leaking memory
Make *COMPILED-DEBUG-FUNS* a weak keyed hash table. Add test.
Christophe Rhodes [Mon, 23 Mar 2009 11:59:48 +0000 (11:59 +0000)]
1.0.26.20: tighter VECTOR-PUSH-EXTEND argument type
The optional extension parameter must be a positive integer.
... declare this in fndb;
... fix the erroneous use in constraints (not only ensuring
positivity, but also it's an extension not a new-length
parameter).
Issue brought to light by Peter Graves' XCL.
Gabor Melis [Mon, 23 Mar 2009 11:26:51 +0000 (11:26 +0000)]
1.0.26.19: more stack safety
Add another guard page to the control, binding and alien stacks that
will lose() whenever it's touched so that if the handler manages to
recover from stack exhaustion then we can be sure that image is not
corrupted.
Juho Snellman [Sun, 22 Mar 2009 22:34:58 +0000 (22:34 +0000)]
1.0.26.18: Solaris x86-64 support
* Patch by Alex Viskovatoff
Gabor Melis [Sun, 22 Mar 2009 22:06:03 +0000 (22:06 +0000)]
1.0.26.17: fix GC/SIG_STOP_FOR_GC race
Consider this: in a PA section GC is requested: GC_PENDING,
pseudo_atomic_interrupted and gc_blocked_deferrables are set,
deferrables are blocked then pseudo_atomic_atomic is cleared, but a
SIG_STOP_FOR_GC arrives before trapping to interrupt_handle_pending.
In sig_stop_for_gc_handler, GC_PENDING is cleared but
pseudo_atomic_interrupted is not and we go on running with
pseudo_atomic_interrupted but without a pending interrupt or GC.
GC_BLOCKED_DEFERRABLES is also left at 1.
Add more checks, fix comments.
Gabor Melis [Sun, 22 Mar 2009 21:45:04 +0000 (21:45 +0000)]
1.0.26.16: fix gencgc on ppc
Regression from 1.0.25.37.
Store the context of allocation trap in interrupt_data and frob that
when gencgc wants to block deferrables.
Also: remove unused, buggy get_interrupt_context_for_thread.
Gabor Melis [Sun, 22 Mar 2009 21:44:07 +0000 (21:44 +0000)]
1.0.26.15: interrupt.c refactoring
- check that all or none of the deferrable signals are blocked
- make passing NULL for sigset in the right context mean the current
sigmask: there is only a single block_signals() function that can
perform sigset arithmetic or change the current mask.
- print pc and sp on memory faults to ease debugging
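The "NULL means the current sigmask" convention from the list above can be sketched as follows. This is only an illustration under stated assumptions, not the code from interrupt.c: the name block_signals_sketch, its parameter order, and the limit of 31 signals are all hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Assumption: only the classic signals 1..31 matter for this sketch. */
enum { LAST_STANDARD_SIGNAL = 31 };

/* Hypothetical sketch, not SBCL's code. Block every signal in `what`.
 * If `where` is non-NULL, only do sigset arithmetic on `*where`.
 * If `where` is NULL, change the caller's current signal mask instead.
 * `old`, when non-NULL, receives the previous contents. */
static void block_signals_sketch(const sigset_t *what, sigset_t *where,
                                 sigset_t *old)
{
    assert(what != NULL);
    if (where != NULL) {
        if (old != NULL)
            *old = *where;
        for (int sig = 1; sig <= LAST_STANDARD_SIGNAL; sig++)
            if (sigismember(what, sig) == 1)
                sigaddset(where, sig);
    } else {
        /* A threaded runtime would use pthread_sigmask() here. */
        sigprocmask(SIG_BLOCK, what, old);
    }
}
```

Keeping both behaviours in one function means callers that prepare a mask for a saved context and callers that block signals right away share the same code path.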
Christophe Rhodes [Sun, 22 Mar 2009 21:34:28 +0000 (21:34 +0000)]
1.0.26.14: minor portability fixes
Motivated by restarting work on a repeatable-xc-fasl project,
somewhat delayed by Real Life matters...
... use an explicit TYPE declaration for defined types;
... don't redefine host functions when building fasls from the
xc;
... catch one egregiously bad case of a dead clause in TYPECASE
(more lurk);
... don't use host symbols in genesis;
... define a total order for emitting constants.h.
Now clisp on my machine, with the current phase of the moon,
gets as far as dumping the cold core. More Work Needed.
Juho Snellman [Sun, 22 Mar 2009 20:07:50 +0000 (20:07 +0000)]
1.0.26.13: OpenBSD x86-64 support
* Patch by Josh Elsasser
Juho Snellman [Sun, 22 Mar 2009 19:44:13 +0000 (19:44 +0000)]
1.0.26.12: Don't allow (LOOP FOR X ACROSS A ...) where A evaluates to NIL
* Patch by Daniel Lowe
Juho Snellman [Sun, 22 Mar 2009 19:07:09 +0000 (19:07 +0000)]
1.0.26.11: Fix the error message for ENOMEM on mprotect
* Error message was not updated when the variable was renamed
* s/size/bytes/, s/parms/backend-parms/
Gabor Melis [Fri, 20 Mar 2009 11:15:17 +0000 (11:15 +0000)]
1.0.26.10: darwin interrupt fixes
Work around raise(signal) apparently not raising the signal under some
circumstances. See sbcl-devel thread "Hang in tests on Intel MacOS
10.5.6" starting on 2009-03-14.
Also, block all blockables when in install_handler, having just one of
the signals blocked breaks invariants (not really darwin specific).
Replace abort() after the call to mach_msg_server() with a more
informative lose(). It actually returns after attaching and
detaching gdb.
Gabor Melis [Thu, 19 Mar 2009 13:42:05 +0000 (13:42 +0000)]
1.0.26.9: reduce consing in MAP-ALLOCATED-OBJECTS
... on platforms where dynamic space extends past fixnum range
Thanks to Bart Botta for the patch.
Gabor Melis [Tue, 17 Mar 2009 14:05:45 +0000 (14:05 +0000)]
1.0.26.8: QSHOW changes, bug reporting guidelines
- change runtime.h so that a simple '#define QSHOW_SIGNAL 1' turns
QSHOW on automatically and defaults to blocking signals during printing
- add notes to BUGS on how to report bugs related to signal handling
- kill a warning in thread.c in code conditional on QSHOW_SIGNAL
- add #include <stdio.h> to x86{-64,}-darwin-os.c so that it compiles
with QSHOW
- add comment explaining the previous commit
Gabor Melis [Tue, 17 Mar 2009 11:27:08 +0000 (11:27 +0000)]
1.0.26.7: use a signal for SIG_STOP_FOR_GC > SIGSEGV on Linux
On Linux a signal generated by pthread_kill() with a signum that's
lower than SIGSEGV can be delivered before a synchronously triggered
SIGSEGV. This means that the sigsegv handler will be invoked with its
context pointing to the handler for the signal that pthread_kill()
sent. It's not really specific to SIGSEGV, it's the same for any
synchronously generated signal.
To work around this, we must never pthread_kill() with a signal with a
lower signum than any of the synchronously triggered signals that we
use: SIGTRAP, SIGSEGV, etc. In practice, currently we only send
SIGPIPE to indicate that the thread interruption queue may need to be
looked at and SIG_STOP_FOR_GC that's defined as SIGUSR1 currently.
With SIGUSR1 being 10 and SIGSEGV 11 this can make
handle_guard_page_triggered lose badly if GC wants to stop the thread
at the same time. So let's use SIGUSR2 instead, which is 12. Do the same
on other OSes, as they may have the same bug.
See thread "Signal delivery order" from 2009-03-14 on
kernel-devel@vger.kernel.org:
http://groups.google.com/group/fa.linux.kernel/browse_thread/thread/6773ac3dcb867da3#
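The rule stated in this commit message, never pthread_kill() with a signum lower than a synchronously triggered signal, can be expressed as a small check. This is a hedged sketch: the function name is hypothetical, and the list of synchronous signals beyond SIGTRAP and SIGSEGV is an assumption filling in the "etc." of the message.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: SIGBUS, SIGILL and SIGFPE stand in for the "etc." in the
 * commit message; only SIGTRAP and SIGSEGV are named there. */
static const int synchronous_signals[] = {
    SIGTRAP, SIGSEGV, SIGBUS, SIGILL, SIGFPE
};

/* Hypothetical helper: true if `signum` is numerically higher than every
 * synchronously triggered signal, so a pending pthread_kill() signal
 * cannot be delivered ahead of a synchronous fault. */
static bool safe_for_pthread_kill(int signum)
{
    size_t n = sizeof synchronous_signals / sizeof synchronous_signals[0];
    for (size_t i = 0; i < n; i++)
        if (signum <= synchronous_signals[i])
            return false;
    return true;
}
```

With the Linux numbers quoted above (SIGUSR1 is 10, SIGSEGV is 11, SIGUSR2 is 12), the check rejects SIGUSR1 and accepts SIGUSR2. Signal numbers differ between platforms, so the result is only meaningful for the system the code is compiled on.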
Gabor Melis [Mon, 16 Mar 2009 16:01:07 +0000 (16:01 +0000)]
1.0.26.6: use private operations on futexes
It allows the linux kernel to avoid contention with mmap.
Gabor Melis [Mon, 16 Mar 2009 16:00:07 +0000 (16:00 +0000)]
1.0.26.5: improve CONDITION-WAIT, RELEASE-MUTEX
- minimize the window where a CONDITION-WAIT and a CONDITION-NOTIFY
race to FUTEX-WAIT and FUTEX-WAKE respectively
- make reacquisition of the mutex in CONDITION-WAIT interruptible
- make RELEASE-MUTEX return silently without doing anything if the
owner is not the current thread. This eliminates spurious warnings
upon async unwinding from the mutex reacquisition path of
CONDITION-WAIT.
- add IF-NOT-OWNER parameter to RELEASE-MUTEX with three possible
values: :PUNT, :WARN, :FORCE (see docstring).
Gabor Melis [Mon, 16 Mar 2009 15:59:03 +0000 (15:59 +0000)]
1.0.26.4: less pessimal waitqueues
Readers calling CONDITION-WAIT don't interfere with each other.
CONDITION-WAIT used to set WAITQUEUE-DATA to *CURRENT-THREAD* causing
other readers entering FUTEX-WAIT to return with EWOULDBLOCK.
Set WAITQUEUE-DATA to NIL in readers, and to *CURRENT-THREAD* in
writers.
Also, fix a warning in :SEMAPHORE-MULTIPLE-WAITERS test.
Gabor Melis [Sat, 14 Mar 2009 23:00:29 +0000 (23:00 +0000)]
1.0.26.3: cleanup accesses to *STEPPING* on MIPS and HPPA
Gabor Melis [Sat, 14 Mar 2009 18:21:33 +0000 (18:21 +0000)]
1.0.26.2: alloc_code_object facelift
- use offsetof trace_table_offset instead of a hard coded constant
- alloc (boxed + unboxed) bytes not words
- check the gc is inhibited because the half-initialized code object
could trip gc
Christophe Rhodes [Mon, 2 Mar 2009 08:03:58 +0000 (08:03 +0000)]
1.0.26.1: Modify uploading instructions in release checklist
Christophe Rhodes [Sun, 1 Mar 2009 21:49:12 +0000 (21:49 +0000)]
1.0.26: release, will be tagged as sbcl_1_0_26
Christophe Rhodes [Sun, 1 Mar 2009 20:34:50 +0000 (20:34 +0000)]
1.0.25.58: HPPA fixes from Larry Valkama
Gabor Melis [Sun, 1 Mar 2009 15:57:08 +0000 (15:57 +0000)]
1.0.25.57: fix compilation on win32
Gabor Melis [Tue, 24 Feb 2009 10:32:11 +0000 (10:32 +0000)]
1.0.25.56: SUB-GC: don't observe deadlines
- because the condition that's signalled can cause arbitrary code to
run catching us with pants down
- and we should not skip gc if it was triggered
Alastair Bridgewater [Tue, 17 Feb 2009 23:41:22 +0000 (23:41 +0000)]
1.0.25.55: x86 disassembler fixes.
Made operand-size-prefix bytes disassemble correctly, using the same
approach used in the x86-64 backend (extra instruction formats for
reading the prefix byte).
Fixed movzx and movsx instructions to indicate the size of the source
data when moving from memory.
Added printers for cbw, cwde and cwd instructions.
Gabor Melis [Mon, 16 Feb 2009 22:30:25 +0000 (22:30 +0000)]
1.0.25.54: centralize scattered arch_os_get_context() calls
... to the six signal handler trampolines. These are now the only
places where void_context appears, the rest of the runtime uses
os_context_t.
Also, the &void_context+37 hack was removed. On Sparc/Linux the third
parameter of SA_SIGINFO signal handlers is a pointer to sigcontext,
which happens to be the same as &void_context+37 most of the time,
though.
I have tested on two Sparc/Linux boxes, one running 2.6.26 that
randomly segfaulted compiling itself with 1.0.25.12, and another
running 2.6.24 that worked fine before at that version.
Thanks to Bruce O'Neel for the shell access.
Gabor Melis [Mon, 16 Feb 2009 22:29:06 +0000 (22:29 +0000)]
1.0.25.53: fix gencgc_handle_wp_violation on multicpu systems
Acquire free_pages_lock around page_table accesses to be sure that the
changes actually propagate to other CPUs and we don't end up losing in
the else branch and also to prevent problems caused by the compiler or
the processor reordering stuff.
Gabor Melis [Mon, 16 Feb 2009 22:27:07 +0000 (22:27 +0000)]
1.0.25.52: go through lisp_memory_fault_error on all platforms
... so that the corruption mechanism can kick in.
Gabor Melis [Mon, 16 Feb 2009 22:26:25 +0000 (22:26 +0000)]
1.0.25.51: use WITH-RECURSIVE-SYSTEM-SPINLOCK
... instead of WITH-RECURSIVE-SPINLOCK because it's possible to
deadlock due to lock ordering with sufficiently unlucky interrupts as
demonstrated by test (:timer :parallel-unschedule) with low
probability.
This affects hash tables and some pcl locks.
Also, use WITH-RECURSIVE-MUTEX for packages.
Not a spinlock because it can be held for a long time and not a system
lock (i.e. with WITHOUT-INTERRUPTS) because conflicts are signalled
while holding the lock, which I think warrants a FIXME.
Gabor Melis [Mon, 16 Feb 2009 22:23:08 +0000 (22:23 +0000)]
1.0.25.50: detect binding and alien stack exhaustion
Alien stack exhaustion machinery only works on x86oids.
Gabor Melis [Mon, 16 Feb 2009 22:22:23 +0000 (22:22 +0000)]
1.0.25.49: x86/x86-64 unithread: use the allocated alien stack
... in struct thread and not the original control stack that we switch
away from in call_into_lisp_first_time.
Gabor Melis [Mon, 16 Feb 2009 22:20:39 +0000 (22:20 +0000)]
1.0.25.48: signals internals doc
Gabor Melis [Mon, 16 Feb 2009 22:20:16 +0000 (22:20 +0000)]
1.0.25.47: OOAO restoring fp control word
Do it in all signal handlers, no matter what, on platforms that
require it.
Gabor Melis [Mon, 16 Feb 2009 22:19:27 +0000 (22:19 +0000)]
1.0.25.46: restore errno in signal handlers
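The usual pattern behind a change like this is to save errno on entry to a handler and restore it on exit, so that interrupted code never sees a value clobbered by the handler. A minimal sketch, with a hypothetical handler name and a stand-in assignment for the work that clobbers errno (this is not the runtime's actual handler code):

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>

static volatile sig_atomic_t handler_ran = 0;

/* Hypothetical handler illustrating the save/restore pattern. */
static void errno_preserving_handler(int signum)
{
    int saved_errno = errno;   /* save on entry */
    (void)signum;
    errno = EINTR;             /* stand-in for a call that clobbers errno */
    handler_ran = 1;
    errno = saved_errno;       /* restore before returning */
}
```

Installing this with signal(SIGUSR1, errno_preserving_handler) and raising SIGUSR1 leaves errno in the interrupted code unchanged.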
Gabor Melis [Mon, 16 Feb 2009 22:17:56 +0000 (22:17 +0000)]
1.0.25.45: fix futex_wait deadlines when interrupted
When the syscall returned with EINTR futex_wait called again with the
same timeout. Now it lets Lisp recalculate the relative timeout from
the deadline.
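Recomputing the relative timeout from an absolute deadline, as described above, might look like the following. This is a sketch only: in SBCL the recalculation happens on the Lisp side, and the helper name and the use of struct timespec here are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper, not part of SBCL: compute the relative timeout
 * left until an absolute deadline. Returns false, with *out set to
 * zero, once the deadline has been reached or passed. */
static bool remaining_timeout(const struct timespec *deadline,
                              const struct timespec *now,
                              struct timespec *out)
{
    long long sec = (long long)deadline->tv_sec - (long long)now->tv_sec;
    long nsec = deadline->tv_nsec - now->tv_nsec;
    if (nsec < 0) {            /* borrow one second */
        sec -= 1;
        nsec += 1000000000L;
    }
    if (sec < 0 || (sec == 0 && nsec == 0)) {
        out->tv_sec = 0;
        out->tv_nsec = 0;
        return false;
    }
    out->tv_sec = (time_t)sec;
    out->tv_nsec = nsec;
    return true;
}
```

After an EINTR the caller would call this again with a fresh reading of the clock rather than reusing the original timeout, so repeated interruptions cannot stretch the total wait past the deadline.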
Gabor Melis [Mon, 16 Feb 2009 22:16:20 +0000 (22:16 +0000)]
1.0.25.44: INTERRUPT-THREAD and timer improvements
The main thing accomplished by this commit is that it's finally
possible to use INTERRUPT-THREAD and TIMERS sanely:
- there is a per thread interruption queue, interruptions are executed
in order of arrival
- the interruption has to explicitly enable interrupts with
WITH-INTERRUPTS if needed. In the absence of WITH-INTERRUPTS the
interruption itself is not interrupted and running out of stack is
not a problem.
- timers have an improved repeat mechanism
Implementation notes:
- INTERRUPT-THREAD is implemented on all platforms and builds (that
is, even without :SB-THREAD) by sending a signal to the current
thread (or process without thread). This allows us to hook into the
normal interrupt deferral mechanism without having to commit OAOO
violations on the Lisp side. And it makes threaded and non-threaded
builds closer, hopefully easing testing.
- SIG_INTERRUPT_THREAD is SIGPIPE on all platforms. SIGPIPE is not
used in SBCL for its original purpose, instead it's for signalling a
thread that it should look at its interruption queue. The handler
(RUN_INTERRUPTION) just returns if there is nothing to do so it's
safe to receive spurious SIGPIPEs coming from the kernel.
- IN-INTERRUPTION does not unblock deferrables anymore, but arranges
for them to be unblocked when interrupts are enabled (see
*UNBLOCK-DEFERRABLES-ON-ENABLING-INTERRUPTS-P*).
- Thread interruptions run wrapped in a (WITHOUT-INTERRUPTS
(ALLOW-WITH-INTERRUPTS ...)).
- Repeating timers reschedule themselves when they finish to the
current expiry time + repeat interval even if that's in the past.
Hence, a timer's schedule does not get shifted if it takes a long
time to run. If it takes more time than the repeat interval then it
may catch up on later invocations.
- Timers run wrapped in a (WITHOUT-INTERRUPTS (ALLOW-WITH-INTERRUPTS
...)) even if run in a new thread.
- Enable previously failing tests.
- Add more tests.
- Automatically unschedule repeating timers if they take up all the
CPU.
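The repeat rule from the implementation notes above (new expiry = current expiry time + repeat interval, even if that is in the past) can be illustrated with plain integer arithmetic. The function names and the abstract "tick" time unit are hypothetical; the real timers are implemented in Lisp.

```c
#include <assert.h>
#include <stdint.h>

/* Reschedule relative to the previous expiry, not to the time the run
 * finished, so a slow run does not shift the schedule. */
static int64_t reschedule(int64_t expiry, int64_t repeat_interval)
{
    assert(repeat_interval > 0);
    return expiry + repeat_interval;
}

/* A timer is due when its expiry time is not in the future. */
static int is_due(int64_t expiry, int64_t now)
{
    return expiry <= now;
}
```

For example, a timer with expiry 100 and interval 10 whose run finishes at time 125 is rescheduled to 110, which is already due; it runs again, moves to 120, and then to 130, which is back in the future. That is the catch-up behaviour described in the notes.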
Gabor Melis [Mon, 16 Feb 2009 22:12:36 +0000 (22:12 +0000)]
1.0.25.43: alpha interrupt context fixes
- interrupt context pointers are 64 bit
- add padding to DEFINE-PRIMITIVE-OBJECT THREAD, because the C
compiler aligns interrupt_contexts on a double word boundary
Gabor Melis [Mon, 16 Feb 2009 22:08:38 +0000 (22:08 +0000)]
1.0.25.42: make os_thread 0 on unithread builds
... because storing the pid there (two places really, thread.os_thread
and THREAD-OS-THREAD) complicates SB-POSIX:FORK, and makes a simple
fork() lose.
Gabor Melis [Mon, 16 Feb 2009 22:07:55 +0000 (22:07 +0000)]
1.0.25.41: only call pthread_kill with valid thread ids
... else it segfaults (at least on Linux). Fixes sporadic "Unhandled
memory fault" running timer, INTERRUPT-THREAD heavy code. And block
signals while calling pthread_kill because it is not async signal
safe.
Gabor Melis [Mon, 16 Feb 2009 22:05:45 +0000 (22:05 +0000)]
1.0.25.40: fix JOIN-THREAD
If the thread has not returned normally signal the error when not
holding the mutex anymore. Disable interrupts for the duration of
holding the mutex. Fix test.
Gabor Melis [Mon, 16 Feb 2009 22:04:26 +0000 (22:04 +0000)]
1.0.25.39: thread start/stop fixes
- disable interrupts during create_thread
... to protect against signals (async unwinds, reentrancy, ...)
during malloc, pthread code and to make the pinning of
INITIAL-FUNCTION effective. Also add checks to
new_thread_trampoline.
- make-thread: assert gc enabled (to prevent deadlocks)
- block blockables while holding LOCK_CREATE_THREAD lock
- make dying threads safe from interrupts
Gabor Melis [Mon, 16 Feb 2009 22:01:45 +0000 (22:01 +0000)]
1.0.25.38: fix maybe_gc
Gabor Melis [Mon, 16 Feb 2009 22:01:15 +0000 (22:01 +0000)]
1.0.25.37: block deferrables when gc pending in PA
Consider this:
    set_pseudo_atomic_atomic()
    alloc and set pseudo_atomic_interrupted and GC_PENDING
    clear_pseudo_atomic_atomic()
    if (get_pseudo_atomic_interrupted())
        gc();
If an async interrupt happens and unwinds between
clear_pseudo_atomic_atomic and gc() we have lost a gc trigger until
the next alloc. Same with SIG_STOP_FOR_GC instead of alloc.
This patch addresses the above problem by blocking deferrables when a
gc is requested in a pseudo atomic section. On exit from the protected
section there is no race because no async unwinding interrupts can
happen.
Gabor Melis [Mon, 16 Feb 2009 21:56:20 +0000 (21:56 +0000)]
1.0.25.36: unblock signals on low level errors
Low level errors (control stack exhausted, memory fault) may trigger
at inconvenient times such as in signal handlers still running with
deferrables blocked. Due to the condition handling mechanism this
leads to executing user code at unsafe places and the image may very
well become corrupted. To allow continuing anyway with fingers crossed
to diagnose the problem, deferrable and/or gc signals are unblocked
before arranging for returning to the Lisp function that signals the
appropriate condition. Before that is done, however, a warning is
fprintf'ed to stderr so that this important piece of information is
not lost in some generic condition handler (or in an interrupt handler
async unwinding before anyone gets a chance to handle the condition)
which makes diagnosis of subsequent problems very hard.
This was brought to light by the checks added in the previous commit.
Gabor Melis [Mon, 16 Feb 2009 21:54:06 +0000 (21:54 +0000)]
1.0.25.35: check that gc signals are unblocked
... when alloc() is called and when calling into Lisp.
Gabor Melis [Mon, 16 Feb 2009 21:53:25 +0000 (21:53 +0000)]
1.0.25.34: gc trigger improvements
- Make sure that gc never runs user code (after gc hooks and
finalizers) if interrupts are disabled and they cannot be enabled
(i.e. *ALLOW-WITH-INTERRUPTS* is NIL).
- clean up maybe_gc: we know that in the interrupted context gc
signals are unblocked (see "check that gc signals are unblocked"
commit) so it's safe to simply enable them and call SUB-GC
Gabor Melis [Mon, 16 Feb 2009 21:52:04 +0000 (21:52 +0000)]
1.0.25.33: protect against recursive gcs
... while holding the *already-in-gc* lock instead of allowing gc to
trigger and then punting.
Gabor Melis [Mon, 16 Feb 2009 21:49:30 +0000 (21:49 +0000)]
1.0.25.32: improvements to WITHOUT-GCING
- implement it with only one UNWIND-PROTECT (about 30% faster)
- add *IN-WITHOUT-GCING*
- in maybe_defer_handler defer interrupts if we are in the racy part
of WITHOUT-GCING, that is with interrupts enabled, gc allowed but
still *IN-WITHOUT-GCING*
- check more invariants
Gabor Melis [Mon, 16 Feb 2009 21:48:20 +0000 (21:48 +0000)]
1.0.25.31: axe GC-{ON,OFF}
... because they are broken, nobody uses them (?), and complicated to
fix.
Rationale:
- There is no way to safely allow gc in a WITHOUT-GCING without making
it entirely useless (nothing like the interrupt protocol with
ALLOW-WITH-INTERRUPTS).
- WITHOUT-GCING implies WITHOUT-INTERRUPTS because interrupts running
with gc disabled may lead to deadlocks (see internals manual on lock
ordering and WITHOUT-GCING) or running out of memory. To adhere to
this contract GC-{ON,OFF} would need to enable/disable interrupts by
setting *INTERRUPTS-ENABLED*, complicated business for little gain.
Gabor Melis [Mon, 16 Feb 2009 21:45:22 +0000 (21:45 +0000)]
1.0.25.30: INTERRUPT-THREAD without RT signals
All non-win32 platforms converted to use normal signals
(SIGINFO/SIGPWR) to implement INTERRUPT-THREAD.
Remove mention of RT signals from the internals manual.
Gabor Melis [Mon, 16 Feb 2009 21:44:11 +0000 (21:44 +0000)]
1.0.25.29: thread state visibility and synchronization
C does not guarantee that changes made to a volatile variable in one
thread are visible to other threads. Use locking primitives (that
have memory barrier semantics) for that purpose.
Thus, all_threads need not be volatile: it's always accessed while
holding all_threads_lock. Thread state does not need volatile either,
as signal handlers don't change it (save for sig_stop_for_gc_handler,
but that sets it and then restores its value). But visibility issues can arise
and potentially deadlock in stop_the_world, so the thread state is
given a lock. And to reduce busy looping while waiting for the state
to change to STATE_SUSPENDED a condition variable is added as well.
Also convert sig_stop_for_gc_handler to use
wait_for_thread_state_change which frees up SIG_RESUME_FROM_GC. I
think this also guarantees that the changes made by the gc are visible
in other threads on non x86 platforms (on x86 it's already the case).
With these changes threads.impure.lisp runs 10% faster in real time.
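The combination described in this message, a per-thread state guarded by a lock plus a condition variable to avoid busy looping, can be sketched with POSIX threads primitives. The struct, the function names and the states other than STATE_SUSPENDED are assumptions made for illustration, not the runtime's actual definitions.

```c
#include <assert.h>
#include <pthread.h>

/* Only STATE_SUSPENDED is named in the commit message; the other
 * states are hypothetical. */
enum thread_state { STATE_RUNNING, STATE_SUSPENDED, STATE_DEAD };

struct thread_sketch {
    pthread_mutex_t state_lock;   /* gives visibility across threads */
    pthread_cond_t state_cond;    /* lets waiters sleep, not spin */
    enum thread_state state;
};

/* Change the state under the lock and wake every waiter. */
static void set_thread_state(struct thread_sketch *t, enum thread_state s)
{
    pthread_mutex_lock(&t->state_lock);
    t->state = s;
    pthread_cond_broadcast(&t->state_cond);
    pthread_mutex_unlock(&t->state_lock);
}

/* Block until the thread reaches state `s`. */
static void wait_for_thread_state(struct thread_sketch *t,
                                  enum thread_state s)
{
    pthread_mutex_lock(&t->state_lock);
    while (t->state != s)
        pthread_cond_wait(&t->state_cond, &t->state_lock);
    pthread_mutex_unlock(&t->state_lock);
}
```

Because every read and write of the state happens with the mutex held, the lock's memory barrier semantics make the update visible to the waiter, which is the property that volatile alone does not provide.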
Gabor Melis [Mon, 16 Feb 2009 21:42:05 +0000 (21:42 +0000)]
1.0.25.28: always use SIG_RESUME_FROM_GC
The other mechanism relied on real time signals which made it freeze
when the system-wide real time signal queue got full on Linux. A
full queue spells trouble for other processes using rt signals.
All platforms are changed to use SIGUSR1 and SIGUSR2 for
SIG_STOP_FOR_GC and SIG_RESUME_FROM_GC.
Check that SIG_RESUME_FROM_GC is never signalled without a
corresponding sigwait.
Gabor Melis [Mon, 16 Feb 2009 21:41:08 +0000 (21:41 +0000)]
1.0.25.27: codify interrupt handling invariants
Gabor Melis [Mon, 16 Feb 2009 21:39:59 +0000 (21:39 +0000)]
1.0.25.26: less interrupt handling leftovers
- less clear_pseudo_atomic_interrupted
Remove it from the beginning of the protected sections in the
runtime to match what we do in Lisp.
- less resetting of signal mask
Don't call reset-signal-mask from the toplevel, I don't think it's
useful anymore and quite likely it was never more than duct tape.
Even when unmasking of signals is called for (in
INVOKE-INTERRUPTION), only unblock the deferrable ones: gc signals
are unblocked in Lisp anyway.
- removed unneeded sigemptyset on pending mask
- move MIPS SIGTRAP workaround to the runtime
Gabor Melis [Mon, 16 Feb 2009 21:39:08 +0000 (21:39 +0000)]
1.0.25.25: sig_stop_for_gc_handler looks at GC_INHIBIT first
... else we'd trap on every single pseudo atomic until gc is finally
allowed.
Gabor Melis [Mon, 16 Feb 2009 21:38:43 +0000 (21:38 +0000)]
1.0.25.24: x86/x86-64 runtime pseudo atomic fixes
Make sure that {clear,set}_pseudo_atomic_{atomic,interrupted} boil
down to a single assembly instruction. And that the compiler does not
reorder them.
I think the ppc bits in pseudo-atomic.h are still broken on both
accounts.
Gabor Melis [Mon, 16 Feb 2009 21:37:34 +0000 (21:37 +0000)]
1.0.25.23: more allocation checks
- in gencgc alloc() check that we are in pseudo atomic
- assert that interrupt_handle_pending does not happen in pseudo atomic
- assert that cheneygc pa_alloc() runs with deferrable signals blocked
- add test code to trigger gc from pa_alloc to help testing how the
runtime deals with such rare events
Gabor Melis [Mon, 16 Feb 2009 21:36:46 +0000 (21:36 +0000)]
1.0.25.22: SIGTERM and SIGABRT
- add SIGTERM to deferrables, because it has a Lisp handler and it
cannot run in some contexts (on the alt stack, gc signals blocked,
...)
- move SIGABRT/SIGIOT handling to C to abort without possibly
triggering Lisp errors due to the same problems as in the previous
point
Gabor Melis [Mon, 16 Feb 2009 21:36:13 +0000 (21:36 +0000)]
1.0.25.21: handling of potential corruptions
- add corruption_warning_and_maybe_lose that prints a warning and
loses depending on lose_on_corruption_p (false by default)
- use corruption_warning_and_maybe_lose when the control stack is
exhausted and on memory faults
- use corruption_warning_and_maybe_lose on the lisp handlers of
SIGILL, SIGBUS and SIGEMT, as invoking them is surely not a good
sign.
- add --lose-on-corruption as a runtime option
- add --disable-ldb as a runtime option
- update the man page and the user manual
- HEAP-EXHAUSTED fixes:
- exit pseudo atomic properly and do pending interrupt if needed
- signalling HEAP-EXHAUSTED in a WITHOUT-INTERRUPTS is dangerous
- use --lose-on-corruption in make-target*.sh
Also, block blockable signals on lose() to prevent other threads,
timers and such from interfering. If only all threads could be stopped
somehow.
Gabor Melis [Mon, 16 Feb 2009 21:32:58 +0000 (21:32 +0000)]
1.0.25.20: test util: print names, status
Gabor Melis [Mon, 16 Feb 2009 21:32:24 +0000 (21:32 +0000)]
1.0.25.19: small test fixes
- mop-6 test: use keywords in test name
... because symbols from the mop-6 package cannot be read later from
the file containing test results.
- fix gc deadlock test
Instead of with-all-threads-lock, it was using with-mutex that
enables interrupts.
Gabor Melis [Mon, 16 Feb 2009 21:31:32 +0000 (21:31 +0000)]
1.0.25.18: it's only SHOW
- fix compilation with QSHOW
- SHOW prints thread id on threaded builds
- SHOWing os_threads
- do not print pthread_self() that's the job of SHOW
- always print thread ids with %lu
- states with %x
- add more SHOW to ease debugging
- gc_stop_the_world: don't flood with FSHOW_SIGNAL when waiting for
another thread to change states
- signal safe SHOW
(if QSHOW_SAFE is defined)
Gabor Melis [Mon, 16 Feb 2009 21:29:41 +0000 (21:29 +0000)]
1.0.25.17: kill runtime warnings
Gabor Melis [Mon, 16 Feb 2009 21:28:24 +0000 (21:28 +0000)]
1.0.25.16: minor stylistics changes in the runtime
Gabor Melis [Mon, 16 Feb 2009 21:27:26 +0000 (21:27 +0000)]
1.0.25.15: less compilation warnings
Gabor Melis [Mon, 16 Feb 2009 21:25:44 +0000 (21:25 +0000)]
1.0.25.14: comments
Gabor Melis [Mon, 16 Feb 2009 21:18:36 +0000 (21:18 +0000)]
1.0.25.13: 80 chars, trailing space
Gabor Melis [Wed, 11 Feb 2009 13:56:26 +0000 (13:56 +0000)]
1.0.25.12: fix debugger on non-x86oids
(regression from 1.0.25.9 that broke TOP-FRAME)
Alastair Bridgewater [Wed, 11 Feb 2009 12:46:19 +0000 (12:46 +0000)]
1.0.25.11: Remove unused SIZE slot from catch-block structure.
Richard M Kreuter [Thu, 5 Feb 2009 23:03:36 +0000 (23:03 +0000)]
1.0.25.10: Commit version.lisp-expr, missed in last commit. (Oops.)
Richard M Kreuter [Thu, 5 Feb 2009 18:27:31 +0000 (18:27 +0000)]
1.0.25.10: Fix package locks checks for local functions in the interpreter.
* Package lock checks were being performed in the function binding form's
body's lexical environment, causing lossage for forms like
    (locally (declare (disable-package-locks foo:bar))
      (flet ((foo:bar ...))
        (declare (enable-package-locks foo:bar))
        ...))
In particular, this broke some TRACE extensions.
Gabor Melis [Thu, 5 Feb 2009 09:56:46 +0000 (09:56 +0000)]
1.0.25.9: INVOKE-WITH-SAVED-FP-AND-PC changes
On x86/x86-64 we stash away the fp and the pc when calling an alien
function in order to allow the debugger to get back at the lisp frames
even if the alien frames confuse the frame parsing heuristics.
This commit optimizes INVOKE-WITH-SAVED-FP-AND-PC to cancel much of
the slowdown caused by 1.0.21.32 and it makes its use conditional on
(<= speed debug).
Gabor Melis [Wed, 4 Feb 2009 14:10:22 +0000 (14:10 +0000)]
1.0.25.8: fix sxhash bug
... brought to light by 1.0.20.27. Declare hashes to be of type HASH
(not INDEX).
Note that INDEX still is used to mean different things:
- a valid index: (integer 0 (array-dimension-limit))
- a "bound" such as the :START arguments: (integer 0 array-dimension-limit)
- a "dimension" as in (make-array 10): (integer 0 array-dimension-limit)
which leads to all kinds of nastiness with arrays near the limit.
Richard M Kreuter [Tue, 3 Feb 2009 20:22:04 +0000 (20:22 +0000)]
1.0.25.7: Muffle style-warnings around lambda list parsing in the interpreter.
Alastair Bridgewater [Tue, 3 Feb 2009 04:27:08 +0000 (04:27 +0000)]
1.0.25.6: Reunite x86oid and non-x86oid sub-{access,set}-debug-var-slot
Merged the x86oid and non-x86oid versions of sub-access-debug-var-slot
and sub-set-debug-var-slot, reducing the size of debug-int.lisp but
arguably making the conditionalization worse.