1.0.25.44: INTERRUPT-THREAD and timer improvements
authorGabor Melis <mega@hotpop.com>
Mon, 16 Feb 2009 22:16:20 +0000 (22:16 +0000)
committerGabor Melis <mega@hotpop.com>
Mon, 16 Feb 2009 22:16:20 +0000 (22:16 +0000)
The main thing accomplished by this commit is that it's finally
possible to use INTERRUPT-THREAD and TIMERS sanely:

- there is a per thread interruption queue, interruption are executed
  in order of arrival

- the interruption has to explicitly enable interrupts with
  WITH-INTERRUPTS if needed. In the absence of WITH-INTERRUPTS the
  interruption itself is not interrupted and running out of stack is
  not a problem.

- timers have an improved repeat mechanism

Implementation notes:

- INTERRUPT-THREAD is implemented on all platforms and builds (that
  is, even without :SB-THREAD) by sending a signal to the current
  thread (or process without thread). This allows us to hook into the
  normal, interrupt deferral mechanism without having to commit OAOO
  violations on the Lisp side. And it makes threaded, non-threaded
  builds closer, hopefully easing testing.

- SIG_INTERRUPT_THREAD is SIGPIPE on all platforms. SIGPIPE is not
  used in SBCL for its original purpose, instead it's for signalling a
  thread that it should look at its interruption queue. The handler
  (RUN_INTERRUPTION) just returns if there is nothing to do so it's
  safe to receive spurious SIGPIPEs coming from the kernel.

- IN-INTERRUPTION does not unblock deferrables anymore, but arranges
  for them to be unblocked when interrupts are enabled (see
  *UNBLOCK-DEFERRABLES-ON-ENABLING-INTERRUPTS-P*).

- Thread interruption run wrapped in a (WITHOUT-INTERRUPTS
  (ALLOW-WITH-INTERRUPTS ...)).

- Repeating timers reschedule themselves when they finished to the
  current expiry time + repeat interval even if that's in the past.
  Hence, a timer's schedule does not get shifted if it takes a long
  time to run. If it takes more time than the repeat interval then it
  may catch up on later invokations.

- Timers run wrapped in a (WITHOUT-INTERRUPTS (ALLOW-WITH-INTERRUPTS
  ...)) even in run in a new thread.

- Enable previously failing tests.

- Add more tests.

- Automatically unschedule repeating timers if they take up all the
  CPU.

23 files changed:
NEWS
src/code/cold-init.lisp
src/code/early-impl.lisp
src/code/signal.lisp
src/code/target-signal.lisp
src/code/target-thread.lisp
src/code/timer.lisp
src/compiler/generic/parms.lisp
src/runtime/backtrace.c
src/runtime/bsd-os.c
src/runtime/bsd-os.h
src/runtime/darwin-os.h
src/runtime/interrupt.c
src/runtime/interrupt.h
src/runtime/linux-os.c
src/runtime/linux-os.h
src/runtime/sunos-os.c
src/runtime/sunos-os.h
src/runtime/thread.c
src/runtime/win32-os.h
tests/threads.impure.lisp
tests/timer.impure.lisp
version.lisp-expr

diff --git a/NEWS b/NEWS
index e3b105d..fdce7b6 100644 (file)
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,13 @@
 ;;;; -*- coding: utf-8; fill-column: 78 -*-
 changes in sbcl-1.0.26 relative to 1.0.25:
+  * incompatible change: the interruption (be it a function passed to
+    INTERRUPT-THREAD or a timer function) runs in an environment where
+    interrupts can be enabled. The interruption can use
+    WITH-INTERRUPTS or WITHOUT-INTERRUPTS as it sees fit. Use
+    WITHOUT-INTERRUPTS to avoid nesting of interruptions and
+    potentially running out of stack. Keep in mind that in the absance
+    of WITHOUT-INTERRUPTS some potentially blocking operation such as
+    acquiring a lock can enable interrupts.
   * incompatible change: GC-OFF and GC-ON are removed, they were
     always unsafe. Use WITHOUT-GCING instead.
   * new feature: runtime option --disable-ldb
@@ -8,6 +16,13 @@ changes in sbcl-1.0.26 relative to 1.0.25:
     memory, stack, alien stack, binding stack, encountering a memory
     fault, etc. In the absence of --lose-on-corruption a warning is
     printed to stderr.
+  * improvement: generally more stable and reliable interrupt handling
+  * improvement: there is a per thread interruption queue,
+    interruptions are executed in order of arrival
+  * improvement: a repeating timer reschedules itself when the it has
+    finished, but expiration times are spaced equally. If an
+    expiration time is in the past it will trigger after a short grace
+    period that may give a chance to other things to run.
   * optimization: slightly faster gc on multithreaded builds
   * optimization: faster WITHOUT-GCING
   * bug fix: when JOIN-THREAD signals an error, do it when not holding
index c812dab..b9b1994 100644 (file)
@@ -98,6 +98,7 @@
         *gc-pending* nil
         #!+sb-thread *stop-for-gc-pending* #!+sb-thread nil
         *allow-with-interrupts* t
+        sb!unix::*unblock-deferrables-on-enabling-interrupts-p* nil
         *interrupts-enabled* t
         *interrupt-pending* nil
         *break-on-signals* nil
index 3cd6dc8..51fe656 100644 (file)
@@ -35,6 +35,7 @@
                   #!+(or x86 x86-64) *pseudo-atomic-bits*
                   #!+(or hpux) sb!vm::*c-lra*
                   *allow-with-interrupts*
+                  sb!unix::*unblock-deferrables-on-enabling-interrupts-p*
                   *interrupts-enabled*
                   *interrupt-pending*
                   *free-interrupt-context-index*
index 4040857..8d21743 100644 (file)
 (defvar *interrupts-enabled* t)
 (defvar *interrupt-pending* nil)
 (defvar *allow-with-interrupts* t)
+;;; This is to support signal handlers that want to return to the
+;;; interrupted context without leaving anything extra on the stack. A
+;;; simple
+;;;
+;;;  (without-interrupts
+;;;   (unblock-deferrable-signals)
+;;;   (allow-with-interrupts ...))
+;;;
+;;; would not cut it, as upon leaving WITHOUT-INTERRUPTS the pending
+;;; handlers is run with stuff from the function in which this is
+;;; still on the stack.
+(defvar *unblock-deferrables-on-enabling-interrupts-p* nil)
 
 (sb!xc:defmacro without-interrupts (&body body)
   #!+sb-doc
@@ -105,9 +117,13 @@ WITHOUT-INTERRUPTS in:
                              ,',outer-allow-with-interrupts)
                             (*interrupts-enabled*
                              ,',outer-allow-with-interrupts))
-                        (when (and ,',outer-allow-with-interrupts
-                                   *interrupt-pending*)
-                          (receive-pending-interrupt))
+                        (when ,',outer-allow-with-interrupts
+                          (when *unblock-deferrables-on-enabling-interrupts-p*
+                            (setq *unblock-deferrables-on-enabling-interrupts-p*
+                                  nil)
+                            (sb!unix::unblock-deferrable-signals))
+                          (when *interrupt-pending*
+                            (receive-pending-interrupt)))
                         (locally ,@with-forms))))
                 (let ((*interrupts-enabled* nil)
                       (,outer-allow-with-interrupts *allow-with-interrupts*)
@@ -149,8 +165,12 @@ by ALLOW-WITH-INTERRUPTS."
     `(let* ((,allowp *allow-with-interrupts*)
             (,enablep *interrupts-enabled*)
             (*interrupts-enabled* (or ,enablep ,allowp)))
-       (when (and (and ,allowp (not ,enablep)) *interrupt-pending*)
-         (receive-pending-interrupt))
+       (when (and ,allowp (not ,enablep))
+         (when *unblock-deferrables-on-enabling-interrupts-p*
+           (setq *unblock-deferrables-on-enabling-interrupts-p* nil)
+           (sb!unix::unblock-deferrable-signals))
+         (when *interrupt-pending*
+           (receive-pending-interrupt)))
        (locally ,@body))))
 
 (defmacro allow-with-interrupts (&body body)
@@ -163,11 +183,12 @@ by ALLOW-WITH-INTERRUPTS."
   (error "~S is valid only inside ~S."
          'with-local-interrupts 'without-interrupts))
 
-;;; A low-level operation that assumes that *INTERRUPTS-ENABLED* is false,
-;;; and *ALLOW-WITH-INTERRUPTS* is true.
+;;; A low-level operation that assumes that *INTERRUPTS-ENABLED* is
+;;; false, *ALLOW-WITH-INTERRUPTS* is true and deferrable signals are
+;;; unblocked.
 (defun %check-interrupts ()
-  ;; Here we check for pending interrupts first, because reading a special
-  ;; is faster then binding it!
+  ;; Here we check for pending interrupts first, because reading a
+  ;; special is faster then binding it!
   (when *interrupt-pending*
     (let ((*interrupts-enabled* t))
       (receive-pending-interrupt))))
index da471d1..6ddf0b5 100644 (file)
           (sb!impl::*merge-sort-temp-vector* ,empty))
        ,@body)))
 
+;;; Evaluate CLEANUP-FORMS iff PROTECTED-FORM does a non-local exit.
+(defmacro nlx-protect (protected-form &rest cleanup-froms)
+  (with-unique-names (completep)
+    `(let ((,completep nil))
+       (without-interrupts
+         (unwind-protect
+              (progn
+                (allow-with-interrupts
+                  ,protected-form)
+                (setq ,completep t))
+           (unless ,completep
+             ,@cleanup-froms))))))
+
 (defun invoke-interruption (function)
   (without-interrupts
-    (with-interrupt-bindings
-      ;; Reset signal mask: the C-side handler has blocked all
-      ;; deferrable interrupts before arranging return to lisp. This is
-      ;; safe because we can't get a pending interrupt before we unblock
-      ;; signals.
-      (unblock-deferrable-signals)
-      (let ((sb!debug:*stack-top-hint*
-             (nth-value 1 (sb!kernel:find-interrupted-name-and-frame))))
-        (allow-with-interrupts (funcall function))))))
+    ;; Reset signal mask: the C-side handler has blocked all
+    ;; deferrable signals before funcalling into lisp. They are to be
+    ;; unblocked the first time interrupts are enabled. With this
+    ;; mechanism there are no extra frames on the stack from a
+    ;; previous signal handler when the next signal is delivered
+    ;; provided there is no WITH-INTERRUPTS.
+    (let ((*unblock-deferrables-on-enabling-interrupts-p* t))
+      (with-interrupt-bindings
+        (let ((sb!debug:*stack-top-hint*
+               (nth-value 1 (sb!kernel:find-interrupted-name-and-frame))))
+          (allow-with-interrupts
+            (nlx-protect (funcall function)
+                         ;; We've been running with deferrables
+                         ;; blocked in Lisp called by a C signal
+                         ;; handler. If we return normally the sigmask
+                         ;; in the interrupted context is restored.
+                         ;; However, if we do an nlx the operating
+                         ;; system will not restore it for us.
+                         (when *unblock-deferrables-on-enabling-interrupts-p*
+                           ;; This means that storms of interrupts
+                           ;; doing an nlx can still run out of stack.
+                           (unblock-deferrable-signals)))))))))
 
 (defmacro in-interruption ((&key) &body body)
   #!+sb-doc
   (/show "in Lisp-level SIGINT handler" (sap-int context))
   (flet ((interrupt-it ()
            (with-alien ((context (* os-context-t) context))
-             (%break 'sigint 'interactive-interrupt
-                     :context context
-                     :address (sap-int (sb!vm:context-pc context))))))
+             (with-interrupts
+               (%break 'sigint 'interactive-interrupt
+                       :context context
+                       :address (sap-int (sb!vm:context-pc context)))))))
     (sb!thread:interrupt-thread (sb!thread::foreground-thread)
                                 #'interrupt-it)))
 
   (sb!thread::terminate-session)
   (sb!ext:quit))
 
+;;; SIGPIPE is not used in SBCL for its original purpose, instead it's
+;;; for signalling a thread that it should look at its interruption
+;;; queue. The handler (RUN_INTERRUPTION) just returns if there is
+;;; nothing to do so it's safe to receive spurious SIGPIPEs coming
+;;; from the kernel.
+(defun sigpipe-handler (signal code context)
+  (declare (ignore signal code context))
+  (sb!thread::run-interruption))
+
 (defun sb!kernel:signal-cold-init-or-reinit ()
   #!+sb-doc
   "Enable all the default signals that Lisp knows how to deal with."
   (enable-interrupt sigsegv #'sigsegv-handler)
   #!-linux
   (enable-interrupt sigsys #'sigsys-handler)
-  (ignore-interrupt sigpipe)
   (enable-interrupt sigalrm #'sigalrm-handler)
+  (enable-interrupt sigpipe #'sigpipe-handler)
   #!+hpux (ignore-interrupt sigxcpu)
   (unblock-gc-signals)
   (unblock-deferrable-signals)
index 3884f95..87ba3b8 100644 (file)
@@ -106,6 +106,11 @@ in future versions."
 
 ;;;; Aliens, low level stuff
 
+(define-alien-routine "kill_safely"
+    integer
+  (os-thread #!-alpha unsigned-long #!+alpha unsigned-int)
+  (signal int))
+
 #!+sb-thread
 (progn
   ;; FIXME it would be good to define what a thread id is or isn't
@@ -115,9 +120,6 @@ in future versions."
   (define-alien-routine ("create_thread" %create-thread)
       unsigned-long (lisp-fun-address unsigned-long))
 
-  (define-alien-routine "signal_interrupt_thread"
-      integer (os-thread unsigned-long))
-
   (define-alien-routine "block_deferrable_signals"
       void)
 
@@ -218,10 +220,14 @@ in future versions."
         (if (and (not *interrupts-enabled*) *allow-with-interrupts*)
             ;; If interrupts are disabled, but we are allowed to
             ;; enabled them, check for pending interrupts every once
-            ;; in a while.
-            (loop
-              (loop repeat 128 do (cas)) ; 128 is arbitrary here
-              (sb!unix::%check-interrupts))
+            ;; in a while. %CHECK-INTERRUPTS is taking shortcuts, make
+            ;; sure that deferrables are unblocked by doing an empty
+            ;; WITH-INTERRUPTS once.
+            (progn
+              (with-interrupts)
+              (loop
+               (loop repeat 128 do (cas)) ; 128 is arbitrary here
+               (sb!unix::%check-interrupts)))
             (loop (cas)))))
     t))
 
@@ -909,53 +915,47 @@ return DEFAULT if given or else signal JOIN-THREAD-ERROR."
   `(with-system-mutex ((thread-interruptions-lock ,thread))
      ,@body))
 
-;;; Called from the signal handler in C.
+;;; Called from the signal handler.
 (defun run-interruption ()
-  (in-interruption ()
-    (loop
-     (let ((interruption (with-interruptions-lock (*current-thread*)
-                           (pop (thread-interruptions *current-thread*)))))
-       ;; Resignalling after popping one works fine, because from the
-       ;; OS's point of view we have returned from the signal handler
-       ;; (thanks to arrange_return_to_lisp_function) so at least one
-       ;; more signal will be delivered.
-       (when (thread-interruptions *current-thread*)
-         (signal-interrupt-thread (thread-os-thread *current-thread*)))
-       (if interruption
-           (with-interrupts
-             (funcall interruption))
-           (return))))))
-
-;;; The order of interrupt execution is peculiar. If thread A
-;;; interrupts thread B with I1, I2 and B for some reason receives I1
-;;; when FUN2 is already on the list, then it is FUN2 that gets to run
-;;; first. But when FUN2 is run SIG_INTERRUPT_THREAD is enabled again
-;;; and I2 hits pretty soon in FUN2 and run FUN1. This is of course
-;;; just one scenario, and the order of thread interrupt execution is
-;;; undefined.
+  (let ((interruption (with-interruptions-lock (*current-thread*)
+                        (pop (thread-interruptions *current-thread*)))))
+    ;; If there is more to do, then resignal and let the normal
+    ;; interrupt deferral mechanism take care of the rest. From the
+    ;; OS's point of view the signal we are in the handler for is no
+    ;; longer pending, so the signal will not be lost.
+    (when (thread-interruptions *current-thread*)
+      (kill-safely (thread-os-thread *current-thread*) sb!unix:sigpipe))
+    (when interruption
+      (funcall interruption))))
+
 (defun interrupt-thread (thread function)
   #!+sb-doc
   "Interrupt the live THREAD and make it run FUNCTION. A moderate
 degree of care is expected for use of INTERRUPT-THREAD, due to its
 nature: if you interrupt a thread that was holding important locks
 then do something that turns out to need those locks, you probably
-won't like the effect."
-  #!-sb-thread (declare (ignore thread))
-  #!-sb-thread
-  (with-interrupt-bindings
-    (with-interrupts (funcall function)))
-  #!+sb-thread
-  (if (eq thread *current-thread*)
-      (with-interrupt-bindings
-        (with-interrupts (funcall function)))
-      (let ((os-thread (thread-os-thread thread)))
-        (cond ((not os-thread)
-               (error 'interrupt-thread-error :thread thread))
-              (t
-               (with-interruptions-lock (thread)
-                 (push function (thread-interruptions thread)))
-               (when (minusp (signal-interrupt-thread os-thread))
-                 (error 'interrupt-thread-error :thread thread)))))))
+won't like the effect. FUNCTION runs with interrupts disabled, but
+WITH-INTERRUPTS is allowed in it. Keep in mind that many things may
+enable interrupts (GET-MUTEX when contended, for instance) so the
+first thing to do is usually a WITH-INTERRUPTS or a
+WITHOUT-INTERRUPTS. Within a thread interrupts are queued, they are
+run in same the order they were sent."
+  (let ((os-thread (thread-os-thread thread)))
+    (cond ((not os-thread)
+           (error 'interrupt-thread-error :thread thread))
+          (t
+           (with-interruptions-lock (thread)
+             ;; Append to the end of the interruptions queue. It's
+             ;; O(N), but it does not hurt to slow interruptors down a
+             ;; bit when the queue gets long.
+             (setf (thread-interruptions thread)
+                   (append (thread-interruptions thread)
+                           (list (lambda ()
+                                   (without-interrupts
+                                     (allow-with-interrupts
+                                       (funcall function))))))))
+           (when (minusp (kill-safely os-thread sb!unix:sigpipe))
+             (error 'interrupt-thread-error :thread thread))))))
 
 (defun terminate-thread (thread)
   #!+sb-doc
index 29a121c..d7c0c13 100644 (file)
         (heap-extract contents i :key keyfun :test #'<=)
         i))))
 
-;;; thread utility
-
-(defun make-cancellable-interruptor (function)
-  ;; return a list of two functions: one that does the same as
-  ;; FUNCTION until the other is called, from when it does nothing.
-  (let ((mutex (sb!thread:make-mutex))
-        (cancelled-p nil))
-    (list
-     #'(lambda ()
-         (sb!thread:with-recursive-lock (mutex)
-           (unless cancelled-p
-             (funcall function))))
-     #'(lambda ()
-         (sb!thread:with-recursive-lock (mutex)
-           (setq cancelled-p t))))))
-
 ;;; timers
 
 (defstruct (timer
@@ -176,7 +160,10 @@ future versions."
 THREAD is a thread then that thread is to be interrupted with
 FUNCTION. If THREAD is T then a new thread is created each timer
 FUNCTION is run. If THREAD is NIL then FUNCTION can be run in any
-thread."
+thread. When THREAD is not T, INTERRUPT-THREAD is used to run FUNCTION
+and the ordering guarantees of INTERRUPT-THREAD also apply here.
+FUNCTION always runs with interrupts disabled but WITH-INTERRUPTS is
+allowed."
   (%make-timer :name name :function function :thread thread))
 
 (defun timer-name (timer)
@@ -222,6 +209,30 @@ from now. For timers with a repeat interval it returns true."
 
 ;;; Public interface
 
+(defun make-cancellable-interruptor (timer)
+  ;; return a list of two functions: one that does the same as
+  ;; FUNCTION until the other is called, from when it does nothing.
+  (let ((mutex (sb!thread:make-mutex))
+        (cancelledp nil)
+        (function (if (%timer-repeat-interval timer)
+                      (lambda ()
+                        (unwind-protect
+                             (funcall (%timer-function timer))
+                          (reschedule-timer timer)))
+                      (%timer-function timer))))
+    (list
+     (lambda ()
+       ;; Use WITHOUT-INTERRUPTS for the acquiring lock to avoid
+       ;; unblocking deferrables unless it's inevitable.
+       (without-interrupts
+         (sb!thread:with-recursive-lock (mutex)
+           (unless cancelledp
+             (allow-with-interrupts
+               (funcall function))))))
+     (lambda ()
+       (sb!thread:with-recursive-lock (mutex)
+         (setq cancelledp t))))))
+
 (defun %schedule-timer (timer)
   (let ((changed-p nil)
         (old-position (priority-queue-remove *schedule* timer)))
@@ -235,8 +246,7 @@ from now. For timers with a repeat interval it returns true."
       (setq changed-p t))
     (setf (values (%timer-interrupt-function timer)
                   (%timer-cancel-function timer))
-          (values-list (make-cancellable-interruptor
-                        (%timer-function timer))))
+          (values-list (make-cancellable-interruptor timer)))
     (when changed-p
       (set-system-timer)))
   (values))
@@ -290,28 +300,36 @@ triggers."
 ;;; Not public, but related
 
 (defun reschedule-timer (timer)
-  (let ((thread (%timer-thread timer)))
-    (if (and (sb!thread::thread-p thread) (not (sb!thread:thread-alive-p thread)))
-        (unschedule-timer timer)
-        (with-scheduler-lock ()
-          (setf (%timer-expire-time timer) (+ (get-internal-real-time)
-                                              (%timer-repeat-interval timer)))
-          (%schedule-timer timer)))))
+  ;; unless unscheduled
+  (when (%timer-expire-time timer)
+    (let ((thread (%timer-thread timer)))
+      (if (and (sb!thread::thread-p thread)
+               (not (sb!thread:thread-alive-p thread)))
+          (unschedule-timer timer)
+          (with-scheduler-lock ()
+            ;; Schedule at regular intervals. If TIMER has not finished
+            ;; in time then it may catch up later.
+            (incf (%timer-expire-time timer) (%timer-repeat-interval timer))
+            (%schedule-timer timer))))))
 
 ;;; Expiring timers
 
-(defun real-time->sec-and-usec(time)
-  (if (minusp time)
-      (list 0 1)
-      (multiple-value-bind (s u) (floor time internal-time-units-per-second)
-        (setf u (floor (* (/ u internal-time-units-per-second) 1000000)))
-        (if (= 0 s u)
-            ;; 0 0 means "shut down the timer" for setitimer
-            (list 0 1)
-            (list s u)))))
+(defun real-time->sec-and-usec (time)
+  ;; KLUDGE: Always leave 0.0001 second for other stuff in order to
+  ;; avoid starvation.
+  (let ((min-usec 100))
+    (if (minusp time)
+        (list 0 min-usec)
+        (multiple-value-bind (s u) (floor time internal-time-units-per-second)
+          (setf u (floor (* (/ u internal-time-units-per-second) 1000000)))
+          (if (and (= 0 s) (< u min-usec))
+              ;; 0 0 means "shut down the timer" for setitimer
+              (list 0 min-usec)
+              (list s u))))))
 
 (defun set-system-timer ()
   (assert (under-scheduler-lock-p))
+  (assert (not *interrupts-enabled*))
   (let ((next-timer (peek-schedule)))
     (if next-timer
         (let ((delta (- (%timer-expire-time next-timer)
@@ -321,40 +339,34 @@ triggers."
         (sb!unix:unix-setitimer :real 0 0 0 0))))
 
 (defun run-timer (timer)
-  (symbol-macrolet ((function (%timer-function timer))
-                    (repeat-interval (%timer-repeat-interval timer))
-                    (thread (%timer-thread timer)))
-    (when repeat-interval
-      (reschedule-timer timer))
-    (cond ((null thread)
-           (funcall function))
-          ((eq t thread)
-           (sb!thread:make-thread function))
-          (t
-           (handler-case
-               (sb!thread:interrupt-thread thread function)
-             (sb!thread:interrupt-thread-error (c)
-               (declare (ignore c))
-               (warn "Timer ~S failed to interrupt thread ~S."
-                     timer thread)))))))
-
-;; Called from the signal handler.
+  (let ((function (%timer-interrupt-function timer))
+        (thread (%timer-thread timer)))
+    (if (eq t thread)
+        (sb!thread:make-thread (without-interrupts
+                                 (allow-with-interrupts
+                                   function))
+                               :name (format nil "Timer ~A"
+                                             (%timer-name timer)))
+        (let ((thread (or thread sb!thread:*current-thread*)))
+          (handler-case
+              (sb!thread:interrupt-thread thread function)
+            (sb!thread:interrupt-thread-error (c)
+              (declare (ignore c))
+              (warn "Timer ~S failed to interrupt thread ~S."
+                    timer thread)))))))
+
+;;; Called from the signal handler.
 (defun run-expired-timers ()
-  (unwind-protect
-       (with-interrupts
-         (let (timer)
-           (loop
-            (with-scheduler-lock ()
-              (setq timer (peek-schedule))
-              (unless (and timer
-                           (> (get-internal-real-time)
-                              (%timer-expire-time timer)))
-                (return-from run-expired-timers nil))
-              (assert (eq timer (priority-queue-extract-maximum *schedule*))))
-            ;; run the timer without the lock
-            (run-timer timer))))
+  (let (timer)
     (with-scheduler-lock ()
-      (set-system-timer))))
+      (setq timer (peek-schedule))
+      (when (or (null timer)
+                (< (get-internal-real-time)
+                   (%timer-expire-time timer)))
+        (return-from run-expired-timers nil))
+      (assert (eq timer (priority-queue-extract-maximum *schedule*)))
+      (set-system-timer))
+    (run-timer timer)))
 
 (defmacro sb!ext:with-timeout (expires &body body)
   #!+sb-doc
index 623aa51..7954fc4 100644 (file)
@@ -25,7 +25,6 @@
     sb!di::handle-breakpoint
     sb!di::handle-single-step-trap
     fdefinition-object
-    #!+sb-thread sb!thread::run-interruption
     #!+win32 sb!kernel::handle-win32-exception))
 
 (defparameter *common-static-symbols*
index e0e1285..a36d373 100644 (file)
@@ -539,9 +539,6 @@ describe_thread_state(void)
     printf(" SIGALRM = %d\n", sigismember(&mask, SIGALRM));
     printf(" SIGINT = %d\n", sigismember(&mask, SIGINT));
     printf(" SIGPROF = %d\n", sigismember(&mask, SIGPROF));
-#ifdef SIG_INTERRUPT_THREAD
-    printf(" SIG_INTERRUPT_THREAD = %d\n", sigismember(&mask, SIG_INTERRUPT_THREAD));
-#endif
 #ifdef SIG_STOP_FOR_GC
     printf(" SIG_STOP_FOR_GC = %d\n", sigismember(&mask, SIG_STOP_FOR_GC));
 #endif
index e9dacd5..794e98a 100644 (file)
@@ -262,8 +262,6 @@ os_install_interrupt_handlers(void)
 #endif
 
 #ifdef LISP_FEATURE_SB_THREAD
-    undoably_install_low_level_interrupt_handler(SIG_INTERRUPT_THREAD,
-                                                 interrupt_thread_handler);
     undoably_install_low_level_interrupt_handler(SIG_STOP_FOR_GC,
                                                  sig_stop_for_gc_handler);
 #endif
index eb6037b..0f5a4b9 100644 (file)
@@ -59,7 +59,6 @@ typedef ucontext_t os_context_t;
 extern int sig_memory_fault;
 #define SIG_MEMORY_FAULT (sig_memory_fault)
 
-#define SIG_INTERRUPT_THREAD (SIGINFO)
 #define SIG_STOP_FOR_GC (SIGUSR1)
 
 #elif defined __OpenBSD__
index 86d3e0e..35cc0fc 100644 (file)
@@ -33,7 +33,6 @@ typedef ucontext_t os_context_t;
 
 #define SIG_MEMORY_FAULT SIGBUS
 
-#define SIG_INTERRUPT_THREAD (SIGINFO)
 #define SIG_STOP_FOR_GC (SIGUSR1)
 
 #endif /* _DARWIN_OS_H */
index ad53a4c..3c6f716 100644 (file)
@@ -104,10 +104,6 @@ sigaddset_deferrable(sigset_t *s)
     sigaddset(s, SIGVTALRM);
     sigaddset(s, SIGPROF);
     sigaddset(s, SIGWINCH);
-
-#ifdef LISP_FEATURE_SB_THREAD
-    sigaddset(s, SIG_INTERRUPT_THREAD);
-#endif
 }
 
 void
@@ -129,10 +125,6 @@ sigdelset_deferrable(sigset_t *s)
     sigdelset(s, SIGVTALRM);
     sigdelset(s, SIGPROF);
     sigdelset(s, SIGWINCH);
-
-#ifdef LISP_FEATURE_SB_THREAD
-    sigdelset(s, SIG_INTERRUPT_THREAD);
-#endif
 }
 
 void
@@ -202,6 +194,16 @@ check_deferrables_blocked_in_sigset_or_lose(sigset_t *sigset)
 }
 
 void
+check_deferrables_unblocked_or_lose(void)
+{
+#if !defined(LISP_FEATURE_WIN32)
+    sigset_t current;
+    fill_current_sigmask(&current);
+    check_deferrables_unblocked_in_sigset_or_lose(&current);
+#endif
+}
+
+void
 check_deferrables_blocked_or_lose(void)
 {
 #if !defined(LISP_FEATURE_WIN32)
@@ -1330,34 +1332,6 @@ arrange_return_to_lisp_function(os_context_t *context, lispobj function)
            (long)function));
 }
 
-#ifdef LISP_FEATURE_SB_THREAD
-
-int
-signal_interrupt_thread(os_thread_t os_thread)
-{
-    /* FSHOW first, in case we are signalling ourselves. */
-    FSHOW((stderr,"/signal_interrupt_thread: %lu\n", os_thread));
-    return kill_safely(os_thread, SIG_INTERRUPT_THREAD);
-}
-
-/* FIXME: this function can go away when all lisp handlers are invoked
- * via arrange_return_to_lisp_function. */
-void
-interrupt_thread_handler(int num, siginfo_t *info, void *v_context)
-{
-    os_context_t *context = (os_context_t*)arch_os_get_context(&v_context);
-
-    FSHOW_SIGNAL((stderr,"/interrupt_thread_handler\n"));
-    check_blockables_blocked_or_lose();
-
-    /* let the handler enable interrupts again when it sees fit */
-    sigaddset_deferrable(os_context_sigmask_addr(context));
-    arrange_return_to_lisp_function(context,
-                                    StaticSymbolFunction(RUN_INTERRUPTION));
-}
-
-#endif
-
 /* KLUDGE: Theoretically the approach we use for undefined alien
  * variables should work for functions as well, but on PPC/Darwin
  * we get bus error at bogus addresses instead, hence this workaround,
@@ -1538,11 +1512,7 @@ undoably_install_low_level_interrupt_handler (int signal,
     sa.sa_flags = SA_SIGINFO | SA_RESTART
         | (sigaction_nodefer_works ? SA_NODEFER : 0);
 #ifdef LISP_FEATURE_C_STACK_IS_CONTROL_STACK
-    if((signal==SIG_MEMORY_FAULT)
-#ifdef SIG_INTERRUPT_THREAD
-       || (signal==SIG_INTERRUPT_THREAD)
-#endif
-       )
+    if((signal==SIG_MEMORY_FAULT))
         sa.sa_flags |= SA_ONSTACK;
 #endif
 
index 325723f..a9d6ad9 100644 (file)
@@ -123,7 +123,6 @@ extern void do_pending_interrupt(void);
 #endif
 
 #ifdef LISP_FEATURE_SB_THREAD
-extern void interrupt_thread_handler(int, siginfo_t*, void*);
 extern void sig_stop_for_gc_handler(int, siginfo_t*, void*);
 #endif
 typedef void (*interrupt_handler_t)(int, siginfo_t *, void *);
index 4a3f994..62a482b 100644 (file)
@@ -418,8 +418,6 @@ os_install_interrupt_handlers(void)
     undoably_install_low_level_interrupt_handler(SIG_MEMORY_FAULT,
                                                  sigsegv_handler);
 #ifdef LISP_FEATURE_SB_THREAD
-    undoably_install_low_level_interrupt_handler(SIG_INTERRUPT_THREAD,
-                                                 interrupt_thread_handler);
     undoably_install_low_level_interrupt_handler(SIG_STOP_FOR_GC,
                                                  sig_stop_for_gc_handler);
 #endif
index e930693..f0387e9 100644 (file)
@@ -39,5 +39,4 @@ typedef int os_vm_prot_t;
 
 #define SIG_MEMORY_FAULT SIGSEGV
 
-#define SIG_INTERRUPT_THREAD (SIGPWR)
 #define SIG_STOP_FOR_GC (SIGUSR1)
index 8d8bb28..4c42c46 100644 (file)
@@ -234,8 +234,6 @@ os_install_interrupt_handlers()
                                                  sigsegv_handler);
 
 #ifdef LISP_FEATURE_SB_THREAD
-    undoably_install_low_level_interrupt_handler(SIG_INTERRUPT_THREAD,
-                                                 interrupt_thread_handler);
     undoably_install_low_level_interrupt_handler(SIG_STOP_FOR_GC,
                                                  sig_stop_for_gc_handler);
 #endif
index dda6f4f..acb827a 100644 (file)
@@ -32,7 +32,6 @@ typedef int os_vm_prot_t;
 
 #define SIG_MEMORY_FAULT SIGSEGV
 
-#define SIG_INTERRUPT_THREAD (SIGPWR)
 #define SIG_STOP_FOR_GC (SIGUSR1)
 
 /* Yaargh?! */
index bcc470c..038f16b 100644 (file)
@@ -550,7 +550,7 @@ os_thread_t create_thread(lispobj initial_function) {
     /* Must defend against async unwinds. */
     if (SymbolValue(INTERRUPTS_ENABLED, thread) != NIL)
         lose("create_thread is not safe when interrupts are enabled.\n");
-    
+
     /* Assuming that a fresh thread struct has no lisp objects in it,
      * linking it to all_threads can be left to the thread itself
      * without fear of gc lossage. initial_function violates this
@@ -690,37 +690,40 @@ thread_yield()
 int
 kill_safely(os_thread_t os_thread, int signal)
 {
+    FSHOW_SIGNAL((stderr,"/kill_safely: %lu, %d\n", os_thread, signal));
+    {
 #ifdef LISP_FEATURE_SB_THREAD
-    sigset_t oldset;
-    struct thread *thread;
-    /* pthread_kill is not async signal safe and we don't want to be
-     * interrupted while holding the lock. */
-    thread_sigmask(SIG_BLOCK, &deferrable_sigset, &oldset);
-    pthread_mutex_lock(&all_threads_lock);
-    for (thread = all_threads; thread; thread = thread->next) {
-        if (thread->os_thread == os_thread) {
-            int status = pthread_kill(os_thread, signal);
-            if (status)
-                lose("kill_safely: pthread_kill failed with %d\n", status);
-            break;
+        sigset_t oldset;
+        struct thread *thread;
+        /* pthread_kill is not async signal safe and we don't want to be
+         * interrupted while holding the lock. */
+        thread_sigmask(SIG_BLOCK, &deferrable_sigset, &oldset);
+        pthread_mutex_lock(&all_threads_lock);
+        for (thread = all_threads; thread; thread = thread->next) {
+            if (thread->os_thread == os_thread) {
+                int status = pthread_kill(os_thread, signal);
+                if (status)
+                    lose("kill_safely: pthread_kill failed with %d\n", status);
+                break;
+            }
         }
-    }
-    pthread_mutex_unlock(&all_threads_lock);
-    thread_sigmask(SIG_SETMASK,&oldset,0);
-    if (thread)
-        return 0;
-    else
-        return -1;
+        pthread_mutex_unlock(&all_threads_lock);
+        thread_sigmask(SIG_SETMASK,&oldset,0);
+        if (thread)
+            return 0;
+        else
+            return -1;
 #else
-    int status;
-    if (os_thread != 0)
-        lose("kill_safely: who do you want to kill? %d?\n", os_thread);
-    status = raise(signal);
-    if (status == 0) {
-        return 0;
-    } else {
-        lose("cannot raise signal %d, %d %s\n",
-             signal, status, strerror(errno));
-    }
+        int status;
+        if (os_thread != 0)
+            lose("kill_safely: who do you want to kill? %d?\n", os_thread);
+        status = raise(signal);
+        if (status == 0) {
+            return 0;
+        } else {
+            lose("cannot raise signal %d, %d %s\n",
+                 signal, status, strerror(errno));
+        }
 #endif
+    }
 }
index 1526e8d..a09c37f 100644 (file)
@@ -35,7 +35,6 @@ typedef void *siginfo_t;
 
 #define SIG_MEMORY_FAULT SIGSEGV
 
-#define SIG_INTERRUPT_THREAD (SIGRTMIN)
 #define SIG_STOP_FOR_GC (SIGRTMIN+1)
 #define SIG_DEQUEUE (SIGRTMIN+2)
 #define SIG_THREAD_EXIT (SIGRTMIN+3)
index ccc78ac..31341c6 100644 (file)
   (with-mutex (mutex)
     mutex))
 
+(sb-alien:define-alien-routine "check_deferrables_blocked_or_lose"
+    void)
+(sb-alien:define-alien-routine "check_deferrables_unblocked_or_lose"
+    void)
+
+(with-test (:name (:interrupt-thread :deferrables-blocked))
+  (sb-thread:interrupt-thread sb-thread:*current-thread*
+                              (lambda ()
+                                (check-deferrables-blocked-or-lose))))
+
+(with-test (:name (:interrupt-thread :deferrables-unblocked))
+  (sb-thread:interrupt-thread sb-thread:*current-thread*
+                              (lambda ()
+                                (with-interrupts
+                                  (check-deferrables-unblocked-or-lose)))))
+
+(with-test (:name (:interrupt-thread :nlx))
+  (catch 'xxx
+    (sb-thread:interrupt-thread sb-thread:*current-thread*
+                                (lambda ()
+                                  (check-deferrables-blocked-or-lose)
+                                  (throw 'xxx nil))))
+  (check-deferrables-unblocked-or-lose))
+
 #-sb-thread (sb-ext:quit :unix-status 104)
 
+(with-test (:name (:interrupt-thread :deferrables-unblocked-by-spinlock))
+  (let ((spinlock (sb-thread::make-spinlock))
+        (thread (sb-thread:make-thread (lambda ()
+                                         (loop (sleep 1))))))
+    (sb-thread::get-spinlock spinlock)
+    (sb-thread:interrupt-thread thread
+                                (lambda ()
+                                  (check-deferrables-blocked-or-lose)
+                                  (sb-thread::get-spinlock spinlock)
+                                  (check-deferrables-unblocked-or-lose)
+                                  (sb-ext:quit)))
+    (sleep 1)
+    (sb-thread::release-spinlock spinlock)))
+
 ;;; compare-and-swap
 
 (defmacro defincf (name accessor &rest args)
 
 (format t "~&interrupt count test done~%")
 
+(defvar *runningp* nil)
+
+(with-test (:name (:interrupt-thread :no-nesting))
+  (let ((thread (sb-thread:make-thread
+                 (lambda ()
+                   (catch 'xxx
+                     (loop))))))
+    (declare (special runningp))
+    (sleep 0.2)
+    (sb-thread:interrupt-thread thread
+                                (lambda ()
+                                    (let ((*runningp* t))
+                                      (sleep 1))))
+    (sleep 0.2)
+    (sb-thread:interrupt-thread thread
+                                (lambda ()
+                                  (throw 'xxx *runningp*)))
+    (assert (not (sb-thread:join-thread thread)))))
+
+(with-test (:name (:interrupt-thread :nesting))
+  (let ((thread (sb-thread:make-thread
+                 (lambda ()
+                   (catch 'xxx
+                     (loop))))))
+    (declare (special runningp))
+    (sleep 0.2)
+    (sb-thread:interrupt-thread thread
+                                (lambda ()
+                                  (let ((*runningp* t))
+                                    (sb-sys:with-interrupts
+                                      (sleep 1)))))
+    (sleep 0.2)
+    (sb-thread:interrupt-thread thread
+                                (lambda ()
+                                  (throw 'xxx *runningp*)))
+    (assert (sb-thread:join-thread thread))))
+
 (let (a-done b-done)
   (make-thread (lambda ()
                  (dotimes (i 100)
        (interruptor-thread
         (make-thread (lambda ()
                        (sleep 2)
-                       (interrupt-thread main-thread #'break)
+                       (interrupt-thread main-thread
+                                         (lambda ()
+                                           (with-interrupts
+                                             (break))))
                        (sleep 2)
                        (interrupt-thread main-thread #'continue))
                      :name "interruptor")))
index 3cea11c..23029cb 100644 (file)
 
 (use-package :test-util)
 
+(sb-alien:define-alien-routine "check_deferrables_blocked_or_lose"
+    void)
+(sb-alien:define-alien-routine "check_deferrables_unblocked_or_lose"
+    void)
+
+(defun make-limited-timer (fn n &rest args)
+  (let (timer)
+    (setq timer
+          (apply #'sb-ext:make-timer
+                 (lambda ()
+                   (sb-sys:without-interrupts
+                     (decf n)
+                     (cond ((minusp n)
+                            (warn "Unscheduling timer ~A ~
+                                   upon reaching run limit. System too slow?"
+                                  timer)
+                            (sb-ext:unschedule-timer timer))
+                           (t
+                            (sb-sys:allow-with-interrupts
+                              (funcall fn))))))
+                 args))))
+
+(defun make-and-schedule-and-wait (fn time)
+  (let ((finishedp nil))
+    (sb-ext:schedule-timer (sb-ext:make-timer
+                            (lambda ()
+                              (sb-sys:without-interrupts
+                                (unwind-protect
+                                     (sb-sys:allow-with-interrupts
+                                       (funcall fn))
+                                  (setq finishedp t)))))
+                           time)
+    (loop until finishedp)))
+
+(with-test (:name (:timer :deferrables-blocked))
+  (make-and-schedule-and-wait (lambda ()
+                                (check-deferrables-blocked-or-lose))
+                              (random 0.1))
+  (check-deferrables-unblocked-or-lose))
+
+(with-test (:name (:timer :deferrables-unblocked))
+  (make-and-schedule-and-wait (lambda ()
+                                (sb-sys:with-interrupts
+                                  (check-deferrables-unblocked-or-lose)))
+                              (random 0.1))
+  (check-deferrables-unblocked-or-lose))
+
+(with-test (:name (:timer :deferrables-unblocked :unwind))
+  (catch 'xxx
+    (make-and-schedule-and-wait (lambda ()
+                                  (check-deferrables-blocked-or-lose)
+                                  (throw 'xxx nil))
+                                (random 0.1))
+    (sleep 1))
+  (check-deferrables-unblocked-or-lose))
+
 (defmacro raises-timeout-p (&body body)
   `(handler-case (progn (progn ,@body) nil)
     (sb-ext:timeout () t)))
         (loop
            (assert (eq wanted (subtypep type1 type2))))))))
 
-;;; Disabled. Hangs occasionally at least on x86. See comment before
-;;; the next test case.
-#+(and nil sb-thread)
+;;; Used to hang occasionally at least on x86. Two bugs caused it:
+;;; running out of stack (due to repeating timers being rescheduled
+;;; before they ran) and dying threads were open interrupts.
+#+sb-thread
 (with-test (:name (:timer :parallel-unschedule))
   (let ((timer (sb-ext:make-timer (lambda () 42) :name "parallel schedulers"))
         (other nil))
                        (loop for i from 1 upto 10
                              collect (let* ((thread (sb-thread:make-thread #'flop
                                                                            :name (format nil "scheduler ~A" i)))
-                                            (ticker (sb-ext:make-timer (lambda () 13) :thread (or other thread)
-                                                                       :name (format nil "ticker ~A" i))))
+                                            (ticker (make-limited-timer (lambda () 13)
+                                                                               1000
+                                                                               :thread (or other thread)
+                                                                               :name (format nil "ticker ~A" i))))
                                        (setf other thread)
                                        (sb-ext:schedule-timer ticker 0 :repeat-interval 0.00001)
                                        thread)))))))
 
 ;;;; FIXME: OS X 10.4 doesn't like these being at all, and gives us a SIGSEGV
 ;;;; instead of using the Mach expection system! 10.5 on the other tends to
-;;;; lose() where with interrupt already pending. :/
-;;;;
-;;;; FIXME: This test also occasionally hangs on Linux/x86-64 at least. The
-;;;; common feature is one thread in gc_stop_the_world, and another trying to
-;;;; signal_interrupt_thread, but both (apparently) getting EAGAIN repeatedly.
-;;;; Exactly how or why this is happening remains under investigation -- but
-;;;; it seems plausible that the fast timers simply fill up the interrupt
-;;;; queue completely. (On some occasions the process unwedges itself after
-;;;; a few minutes, but not always.)
+;;;; lose() here with interrupt already pending. :/
 ;;;;
-;;;; FIXME: Another failure mode on Linux: recursive entries to
-;;;; RUN-EXPIRED-TIMERS blowing the stack.
-#+nil
+;;;; Used to have problems in genereal, see comment on (:TIMER
+;;;; :PARALLEL-UNSCHEDULE).
 (with-test (:name (:timer :schedule-stress))
   (flet ((test ()
-           (let* ((slow-timers (loop for i from 1 upto 100
-                                     collect (sb-ext:make-timer (lambda () 13) :name (format nil "slow ~A" i))))
-                  (fast-timer (sb-ext:make-timer (lambda () 42) :name "fast")))
-             (sb-ext:schedule-timer fast-timer 0.0001 :repeat-interval 0.0001)
-             (dolist (timer slow-timers)
-               (sb-ext:schedule-timer timer (random 0.1) :repeat-interval (random 0.1)))
-             (dolist (timer slow-timers)
-               (sb-ext:unschedule-timer timer))
-             (sb-ext:unschedule-timer fast-timer))))
-    #+sb-thread
-    (mapcar #'sb-thread:join-thread (loop repeat 10 collect (sb-thread:make-thread #'test)))
-    #-sb-thread
-    (loop repeat 10 do (test))))
+         (let* ((slow-timers
+                 (loop for i from 1 upto 1
+                       collect (make-limited-timer
+                                (lambda () 13)
+                                1000
+                                :name (format nil "slow ~A" i))))
+                (fast-timer (make-limited-timer (lambda () 42) 1000
+                                                :name "fast")))
+           (sb-ext:schedule-timer fast-timer 0.0001 :repeat-interval 0.0001)
+           (dolist (timer slow-timers)
+             (sb-ext:schedule-timer timer (random 0.1)
+                                    :repeat-interval (random 0.1)))
+           (dolist (timer slow-timers)
+             (sb-ext:unschedule-timer timer))
+           (sb-ext:unschedule-timer fast-timer))))
+  #+sb-thread
+  (mapcar #'sb-thread:join-thread
+          (loop repeat 10 collect (sb-thread:make-thread #'test)))
+  #-sb-thread
+  (loop repeat 10 do (test))))
 
 #+sb-thread
 (with-test (:name (:timer :threaded-stress))
index e48494b..16d3626 100644 (file)
@@ -17,4 +17,4 @@
 ;;; checkins which aren't released. (And occasionally for internal
 ;;; versions, especially for internal versions off the main CVS
 ;;; branch, it gets hairier, e.g. "0.pre7.14.flaky4.13".)
-"1.0.25.43"
+"1.0.25.44"