Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BugFix: restore correct handling to SIGNALS in pigpio #58

Closed
wants to merge 1 commit into from

Conversation

SlySven
Copy link
Contributor

@SlySven SlySven commented Apr 10, 2016

Prior code attempted to assign a generic handler to catch ALL signals; however SIGKILL and SIGSTOP cannot be caught; also the default action for SIGTTOU, SIGTTIN, SIGTSTP and SIGCONT should perhaps be "ignored" NOT "terminate", these signals are associated with Job Control and the first three WILL halt execution until receipt of the forth SIGCONT. As a result the previous code will result in program termination when it is run in a debugger such as gdb and the program is halted at a breakpoint - which is not helpful when debugging!

The previous signal handler uses a macro DBG(...) however that uses the printf(...) family of functions - this is not safe as the signal handler can be executed asynchronously (at any time and at any point in the main program flow) and printf() is not re-entrant! Indeed a big warning ought to be put into the documentation for setgpioSetSignalFunc(...) and gpioSetSignalFuncEx(...) that the user provided function must be safe to run asynchronously - for guidance it should ONLY contain functions in the "safe-functions" list in POSIX.1-2004 (also known as POSIX.1-2001 Technical Corrigendum 2) which is reproduced in the signal(7) man page under the section headed "Async-signal-safe functions" with some revisions from POSIX.1-2008 .

The range of signals that CAN be handled ranges from SIGHUP (1) to SIGRTMAX however SIGRTMAX is not a compile time constant but a runtime macro that typically expands to 64. There is also a gap between SIGUNUSED (a.k.a. SIGSYS, typically 31 on all but SPARC and MIPS platforms - and the RPi is neither of these!) and SIGRTMIN (either 34 or 35 depending on glibc version and the real-time extensions system which uses two or three "reserved" signals internally) the man page for signal(7) explicitly states:

"Because the range of available real-time signals varies according to the
glibc threading implementation (and this variation can occur at run time
according to the available kernel and glibc), and indeed the range of real
-time signals varies across UNIX systems, programs should never refer to
real-time signals using hard-coded numbers, but instead should always
refer to real-time signals using the notation SIGRTMIN+n, and include
suitable (run-time) checks that SIGRTMIN+n does not exceed SIGRTMAX."

Because the signal handler can be executed whilst the signal handler for a particular signal is being changed it is necessary to protect against it using something that may be being changed in the main program loop. The fix that this commit proposes is to maintain a shadow copy of the array of handlers and to use an atomic flag variable to toggle between using the main and shadow copy.

This commit restores a default "ignore" action for those signals that typically are so treated. For the mentioned ones here a short debugging message is also created that warns of the data lose that will occur from pausing the execution of the pigpio library code. This typically happens if a breakpoint is reached in the debugger or if -z is pressed during the running of a executable. This commit also reverts to dumping core for signals that do that by default - this does it by restoring the default signal action for an otherwise unhandled signal in the generic handler that is supposed to do so and then returns to the point in the code that invokes it - a second signal should then occur which WILL dump the core and terminate. This is a change to previous behaviour in that NO attempt is made to clean up {by running gpioTerminate() or exit(...)} - which would possibly corrupt the data that the dump would otherwise contain (as well as moving the program location away from the point where the signal was raised for!)

For otherwise unhandled signals that do not terminate pigpio the problems with the DBG macro and its use of printf() type functions is overcome by using individual atomic flags for each such signal - this does mean that a helper function (void)processSignals() is required. This MUST be included in the main loop of pigpiod and other applications using the pigpio library to produce the actions that were previously unsafely done in the signal handler code. This is the debug level adjustments caused by SIGUSR1 and SIGUSR2 and the debug messages that most of the others produced - those have been extended to include the extra signals this commit recognises as falling in this category.

To prevent "undefined behaviour" it is necessary to change away from using sleep() in the main wait loop of pigpiod as it is subject to that issue if SIGALARM handling is modified which we do allow, fortunately nanosleep() does not suffer from the same problem.

I am not entirely happy about the use of a (pair of) static arrays to hold the user signal handler details - it would probably be better to grab some memory dynamically to get the right size as needed, but my first attempt at doing that didn't work - I kept getting segment violations - I now suspect that I was making a pointer error but I thought it would be better to put this code out for display/criticism/revision as it is currently working for me and to revisit that if I am not just barking up the wrong tree with all of my modifications here.

I think I may have left some documentation errors in some comments as the code here has been tweaked a few times on my local system before squashing the changes into one commit for release! 🙈

Signed-off-by: Stephen Lyons [email protected]

Prior code attempted to assign a generic handler to catch ALL signals;
however SIGKILL and SIGSTOP cannot be caught; also the default action for
SIGTTOU, SIGTTIN, SIGTSTP and SIGCONT should perhaps be "ignored" NOT
"terminate", these signals are associated with Job Control and the first
three WILL halt execution until receipt of the forth SIGCONT.  As a result
the previous code will result in program termination when it is run in a
debugger such as gdb and the program is halted at a breakpoint - which is
not helpful when debugging!

The previous signal handler uses a macro "DBG(...)" however that uses
the printf(...) family of functions - **this is not safe as the signal
handler can be executed asynchronously (at any time and at any point in the
main program flow) and printf() is not re-entrant!** Indeed a big warning
ought to be put into the documentation for setgpioSetSignalFunc(...) and
gpioSetSignalFuncEx(...) that the user provided function **must be safe to
run asynchronously** - for guidance it should ONLY contain functions in the
"safe-functions" list in POSIX.1-2004 (also known as POSIX.1-2001 Technical
Corrigendum 2) which is reproduced in the signal(7) man page under the
section headed "Async-signal-safe functions" with some revisions from
POSIX.1-2008 .

The range of signals that CAN be handled ranges from SIGHUP (1) to SIGRTMAX
however SIGRTMAX is NOT a compile time constant but a runtime macro that
typically expands to 64.  There is also a gap between SIGUNUSED (a.k.a.
SIGSYS, typically 31) and SIGRTMIN (either 34 or 35 depending on glibc
version and the real-time extensions system which uses two or three
"reserved" signals internally) the man page for signal(7) explicitly
states:
"Because the range of available real-time signals varies according to the
 glibc threading implementation (and this variation can occur at run time
 according to the available kernel and glibc), and indeed the range of real
 -time signals varies across UNIX systems, programs should never refer to
 real-time signals using hard-coded numbers, but instead should always
 refer to real-time signals using the notation SIGRTMIN+n, and include
 suitable (run-time) checks that SIGRTMIN+n does not exceed SIGRTMAX."

Because the signal handler can be executed whilst the signal handler for a
particular signal is being changed it is necessary to protect against it
using something that may be being changed in the main program loop.  The
fix that this commit proposes is to maintain a shadow copy of the array of
handlers and to use an atomic flag variable to toggle between using the
main and shadow copy.

This commit restores a default "ignore" action for those signals that
typically are so treated.  For the mentioned ones here a short debugging
message is also created that warns of the data lose that will occur from
pausing the execution of the pigpio library code.  This typically happens
if a breakpoint is reached in the debugger or if <Cntl>-z is pressed during
the running of a executable.  This commit also reverts to dumping core for
signals that do that by default - this does it by restoring the default
signal action for an otherwise unhandled signal in the generic handler that
is supposed to do so and then returns to the point in the code that invokes
it - a second signal should then occur which WILL dump the core and
terminate. This is a change to previous behaviour in that NO attempt is
made to clean up {by running gpioTerminate() or exit(...)} - which would
possibly corrupt the data that the dump would otherwise contain (as well as
moving the program location away from the point where the signal was raised
for!)

For otherwise unhandled signals that do not terminate pigpio the problems
with the DBG macro and its use of printf type functions is overcome by
using individual atomic flags for each such signal - this does mean that
a helper function (void)processSignals() is required. This MUST be included
in the main loop of pigpiod and other applications using the pigpio library
to produce the actions that were previously **unsafely** done in the signal
handler code.  This is the debug level adjustments caused by SIGUSR1 and
SIGUSR2 and the debug messages that most of the others produced - those
have been extended to include the extra signals this commit recognises as
falling in this category.  To prevent "undefined behaviour" it is necessary
to change away from using sleep() in the main wait loop of pigpiod as
it is subject to that issue if SIGALARM handling is modified which we allow
fortunately nanosleep() does not suffer from the same problem.

Signed-off-by: Stephen Lyons <[email protected]>
@joan2937
Copy link
Owner

Thank for that. There is quite a bit to take in.

Would it not be simpler just to use a semaphore to protect against signal handler versus signal handler set up clashes?

I don't have a particular problem with removing the DBG statements although I'd prefer to find some way of keeping the log of unexpected signal events.

What I don't have a feel for is how many real world crashes/failures with the current code would be fixed by the proposed code. I.e. is it mainly to help in (gdb) debugging or have you noticed problems in "normal" usage.

@SlySven
Copy link
Contributor Author

SlySven commented Apr 10, 2016

Yeah there are several things there and they might be better split off but they do interact so it was easier to lump them all together in one swell foop.

I don't have a particular problem with removing the DBG statements although I'd prefer to find some way of keeping the log of unexpected signal events.

I think we have to remove the DBG statements from the handler because of their reliance on *printf(...) family functions - I agree that keeping track of unexpected signal events is desirable which is why there is that bank of signal_xxxx_received - but it is not safe to do increment/decrement type arguments on them (only get/test/set are safe). I could not see a way to record multiple calls - though for POSIX type signals the kernel does not queue multiples of the same one if they are blocked (which I think should happen automagically for a signal whilst it is being handled) - mind you multiple identical real-time ones can be queued up. The processSignals() function is a bit of a bodge to move the reporting to the main loop - there could be a better way to do this so suggestions are welcome.

Would it not be simpler just to use a semaphore to protect against signal handler versus signal handler set up clashes?

Well that is what I thought I was doing with the isGpioSignalArrayBeingEdited variable - the volatile static sig_atomic_t type should be safe against access both within the signal handler and the main code - it is possible a smaller type e.g. a volatile(?) char might also work but I was trying to code in a conservative manner with something that should work - Oh! I've been reading up and this is a Read-Copy-Update method to provide an always consistent set of handlers...! 😀 .

What I don't have a feel for is how many real world crashes/failures with the current code would be fixed by the proposed code. I.e. is it mainly to help in (gdb) debugging or have you noticed problems in "normal" usage.

Well I am developing a usage in the real world and having the program terminate when debugging it is a pain! 😮 Perhaps some of this might be considered overkill but I've recently found trying to debug something when a bug is actually in a modified version of a library is harder than when it is one's own project (I tried to compile a debug version of this library to find out why it was dying with a SIGSEGV in (int)initInitialise(void) when I was trying to use a dynamically allocated gpioSignal array - but that didn't go well/work for me, sigh).

BTW is terminating abnormally (i.e. without calling gpioTerminate() or exit(EXIT_FAILURE) going to cause system hardware problems? Or is there any hardware settings that cannot be restored to normal by a fresh invocation or the kernel as it cleans up after a crash/core dump?

I am sure that pigpio needs to be more firmer about rejecting attempts to assign handlers for signal numbers where that is not permitting {SIGKILL, SIGSTOP, zero, more than SIGRTMAX and the ones in the gap between the POSIX ones and the Real-time ones}. I also think it needs to "ignore" more than it has previously done SIGTSTP and SIGCONT at least {from those in the table in signal(7) list that are marked as "ignored" by default} - though in hindsight I guess a dummy/empty User handler would also do the job without a change with the current code.

@SlySven
Copy link
Contributor Author

SlySven commented Apr 10, 2016

📖 I was looking around the 'net and found a very interesting document that is all about secure/safe/good (?) C coding and it even has a section on the intricacies of signals, now I need to look and see whether the RPi Raspbian implementation has persistent or non-persistent signal handlers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants