X-Git-Url: http://repo.macrolet.net/gitweb/?a=blobdiff_plain;f=doc%2Finternals%2Fsignals.texinfo;h=e4aa90b6f735e80f91bc983263fca5ecab9a556c;hb=371577a214ce2659c271279ad48e4c42e1c0c93e;hp=ebdc42146ba3af45939bc20ef65ca5c6188a3d88;hpb=c07c56242a2a7e7949dad974331d5257d44fe937;p=sbcl.git
diff --git a/doc/internals/signals.texinfo b/doc/internals/signals.texinfo
index ebdc421..e4aa90b 100644
--- a/doc/internals/signals.texinfo
+++ b/doc/internals/signals.texinfo
@@ -14,17 +14,16 @@
 
 There are two distinct groups of signals.
 
-@subsection Semi-synchronous signals
+@subsection Synchronous signals
 
-The first group, tentatively named ``semi-synchronous'', consists of
-signals that are raised on illegal instruction, hitting a protected
-page, or on a trap. Examples from this group are:
+This group consists of signals that are raised on illegal instruction,
+hitting a protected page, or on a trap. Examples from this group are:
 @code{SIGBUS}/@code{SIGSEGV}, @code{SIGTRAP}, @code{SIGILL} and
 @code{SIGEMT}. The exact meaning and function of these signals varies
 by platform and OS. Understandably, because these signals are raised
 in a controllable manner they are never blocked or deferred.
 
-@subsection Blockable signals
+@subsection Asynchronous or blockable signals
 
 The other group is of blockable signals. Typically, signal handlers
 block them to protect against being interrupted at all. For example
@@ -70,31 +69,60 @@ Something of a special case, a signal that is blockable but not
 deferrable by @code{WITHOUT-INTERRUPTS} is @code{SIG_STOP_FOR_GC}. It
 is deferred by pseudo atomic and @code{WITHOUT-GCING}.
 
+@subsection When are signals handled?
+
+At once or as soon as the mechanism that deferred them allows.
+
+First, if something is deferred by pseudo atomic then it is run at the
+end of pseudo atomic without exceptions. Even when both a GC request
+or a @code{SIG_STOP_FOR_GC} and a deferrable signal such as
+SIG_INTERRUPT_THREAD interrupts the pseudo atomic section.
+
+Second, an interrupt deferred by WITHOUT-INTERRUPTS is run when the
+interrupts are enabled again. GC cannot interfere.
+
+Third, if GC or @code{SIG_STOP_FOR_GC} is deferred by
+@code{WITHOUT-GCING} then the GC or stopping for GC will happen when
+GC is not inhibited anymore. Interrupts cannot delay a gc.
+
 @node Implementation warts
 @section Implementation warts
 
-@subsection RT signals
+@subsection Miscellaneous issues
 
-Sending and receiving the same number of signals is crucial for
-@code{INTERRUPT-THREAD} and @code{sig_stop_for_gc}, hence they are
-real-time signals for which the kernel maintains a queue as opposed to
-just setting a flag for ``sigint pending''.
+Signal handlers automatically restore errno and fp state, but
+arrange_return_to_lisp_function does not restore errno.
 
-Note, however, that the rt signal queue is finite and on current linux
-kernels a system wide resource. If the queue is full, SBCL tries to
-signal until it succeeds. This behaviour can lead to deadlocks, if a
-thread in a @code{WITHOUT-INTERRUPTS} is interrupted many times,
-filling up the queue and then a gc hits and tries to send
-@code{SIG_STOP_FOR_GC}.
+@subsection POSIX -- Letter and Spirit
 
-@subsection Miscellaneous issues
+POSIX restricts signal handlers to a use only a narrow subset of POSIX
+functions, and declares anything else to have undefined semantics.
 
-Signal handlers should automatically restore errno and fp
-state. Currently, this is not the case.
+Apparently the real reason is that a signal handler is potentially
+interrupting a POSIX call: so the signal safety requirement is really
+a re-entrancy requirement. We can work around the letter of the
+standard by arranging to handle the interrupt when the signal handler
+returns (see: @code{arrange_return_to_lisp_function}.) This does,
+however, in no way protect us from the real issue of re-entrancy: even
+though we would no longer be in a signal handler, we might still be in
+the middle of an interrupted POSIX call.
 
-Furthormore, while @code{arrange_return_to_lisp_function} exits, most
-signal handlers invoke unsafe functions without hesitation: gc and all
-lisp level handlers think nothing of it.
+For some signals this appears to be a non-issue: @code{SIGSEGV} and
+other synchronous signals are raised by our code for our code, and so
+we can be sure that we are not interrupting a POSIX call with any of
+them.
+
+For asynchronous signals like @code{SIGALARM} and @code{SIGINT} this
+is a real issue.
+
+The right thing to do in multithreaded builds would probably be to use
+POSIX semaphores (which are signal safe) to inform a separate handler
+thread about such asynchronous events. In single-threaded builds there
+does not seem to be any other option aside from generally blocking
+asynch signals and listening for them every once and a while at safe
+points. Neither of these is implemented as of SBCL 1.0.4.
+
+Currently all our handlers invoke unsafe functions without hesitation.
 
 @node Programming with signal handling in mind
 @section Programming with signal handling in mind
@@ -132,9 +160,27 @@ derive the rule: in a @code{WITHOUT-GCING} form (or pseudo atomic for
 that matter) never wait for another thread that's not in
 @code{WITHOUT-GCING}.
 
+Somewhat of a special case, it is enforced by the runtime that
+@code{SIG_STOP_FOR_GC} and @code{SIG_RESUME_FROM_GC} always unblocked
+when we might trigger a gc (i.e. on alloc or calling into Lisp).
+
 @subsection Calling user code
 
 For the reasons above, calling user code, i.e. functions passed in, or
 in other words code that one cannot reason about, from non-reentrant
 code (holding locks), @code{WITHOUT-INTERRUPTS}, @code{WITHOUT-GCING}
 is dangerous and best avoided.
+
+@section Debugging
+
+It is not easy to debug signal problems. The best bet probably is to
+enable @code{QSHOW} and @code{QSHOW_SIGNALS} in runtime.h and once
+SBCL runs into problems attach gdb. A simple @code{thread apply all
+ba} is already tremendously useful. Another possibility is to send a
+SIGABORT to SBCL to provoke landing in LDB, if it's compiled with it
+and it has not yet done so on its own.
+
+Note, that fprintf used by QSHOW is not reentrant and at least on x86
+linux it is known to cause deadlocks, so place SHOW and co carefully,
+ideally to places where blockable signals are blocked. Use
+@code{QSHOW_SAFE} if you like.