So signals make sense once you understand where they came from.
The writers of Unix needed a kernel->user downcall mechanism. Back then, user mode code was thought of as being like what we now call unikernels, with the kernel thought of a bit more like a hypervisor. The individual user sessions were, early on in Unix's history, only a couple more primitives on top of what we'd later think of as DOS-style environments, simply multiplexed on the hardware. From that perspective, mirroring the microcode->kernel downcall mechanism (interrupts) at the kernel->user boundary makes plenty of sense.
Once you see them for what they are (interrupts for user mode instead of kernel mode), all of their goofiness makes sense.
* SIGSEGV, SIGILL, etc. map to processor exceptions.
* SIGIO, SIGWINCH, etc. map to I/O device interrupts, including SIGALRM as a timer interrupt.
* SIGSTOP, SIGKILL, etc. map to platform interrupts like NMIs.
* SIGUSR, etc. map to IPIs.
It's a one-shot downcall mechanism that can happen at any time, so all of the same ways you think about concurrency during an interrupt apply here as well. You can't take a regular mutex because it might already be held by the context you interrupted. Less of an issue now, but any non-reentrant code shared with the regular user context is off the table. So just as large swaths of the kernel are off limits in interrupt context, large swaths of user mode are off limits in a signal handler, except for very carefully written, very context-dependent code.
I even ported an RTOS to run in *nix user mode for CI/development purposes, where the interrupt controller abstraction was simply implemented as signals. The code didn't look any different than you'd expect from running under a hypervisor with a paravirtualized interrupt controller.
> * SIGSEGV, SIGILL, etc. map to processor exceptions.
That's the idea anyway. But you can also `kill -SEGV $SOMEPID`. There are (now, since threading became a thing on Unix) three categories of signals:
* async, process-directed, as with the command above, SIGALRM, or the like.
* async, thread-directed, as with `pthread_kill`.
* sync, thread-directed, as with the processor exceptions you mentioned.
The signal numbers generally are good enough, but the `si_code` member of `siginfo_t` will more authoritatively distinguish the source of a signal.
Anyway, I agree it's useful to think about the type of signal you're interested in handling. I think most people are primarily interested in async, process-directed signals for graceful shutdown on `SIGTERM`/`SIGINT` and/or config reloading on `SIGHUP`. Fortunately, that's the easiest to handle safely. One of my favorite ways is to not use a signal handler at all. Mask out the signal in main() before creating any threads, and create a thread which loops over sigwaitinfo.
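A minimal sketch of that pattern, assuming Linux/POSIX and compiling with -pthread (the particular signals and the exit-on-SIGTERM policy are just examples):

    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static sigset_t blocked;

    /* Dedicated signal thread: no handler, so no async-signal-safety worries. */
    static void *signal_thread(void *arg) {
        (void)arg;
        for (;;) {
            siginfo_t info;
            int sig = sigwaitinfo(&blocked, &info);
            if (sig < 0)
                continue;                       /* spurious wakeup / EINTR */
            if (sig == SIGTERM || sig == SIGINT) {
                fprintf(stderr, "shutting down (signal %d from pid %d)\n",
                        sig, (int)info.si_pid);
                exit(0);                        /* or set a flag and notify the app */
            } else if (sig == SIGHUP) {
                fprintf(stderr, "reloading config\n");
            }
        }
        return NULL;
    }

    int main(void) {
        /* Block these in main() BEFORE creating any threads, so every thread
           inherits the mask and only signal_thread ever sees the signals. */
        sigemptyset(&blocked);
        sigaddset(&blocked, SIGTERM);
        sigaddset(&blocked, SIGINT);
        sigaddset(&blocked, SIGHUP);
        pthread_sigmask(SIG_BLOCK, &blocked, NULL);

        pthread_t tid;
        pthread_create(&tid, NULL, signal_thread, NULL);

        /* ... the rest of the program, worker threads, etc. ... */
        for (;;)
            pause();
    }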
This is a useful classification. With the benefit of hindsight--and threading--it would be nice if async signals could just go away, since the kernel can send a notification message to the application over a file descriptor (see man signalfd(2)) and the application can have a dedicated thread to deal with it at its leisure. That leaves the synchronous signals, which are the kernel's way of telling you that you executed a bad instruction and giving you a chance to do something awfully clever instead of crashing.
> Back then, user mode code was thought of as being like what we now call unikernels, with the kernel thought of a bit more like a hypervisor. The individual user sessions were, early on in Unix's history, only a couple more primitives on top of what we'd later think of as DOS-style environments, simply multiplexed on the hardware.
How are signals actually implemented? I'm asking because I'm currently working with WASM/WASI, in which a process is strictly single-threaded. The WASM equivalent of a syscall (a "host call") exists, but as you'd expect, it suspends the execution of the WASM code until it returns. As a result, if you want to set up a bidirectional bytestream between the WASM code and the host-language's runtime, you can't block on a call to `read()`, since this would effectively block until data arrives. Polling is obviously not a great solution.
So if I were to implement a signaling mechanism in a WASM runtime, what would that look like?
I think the closest existing concept to signals is the gas/fuel mechanism implemented in some WebAssembly runtimes, used by WASM blockchain VMs to preempt untrusted code. Runtimes either instrument the JITed code to check a fuel counter in every function/loop, or they register a timer signal/interrupt that periodically checks the counter.
From a similar handler, you could check if a signal flag has been set, then call a user-defined signal handler exported from the module.
If you don't control the runtime, you could do this instrumentation to the bytecode rather than the generated machine code, effectively simulating preemption via injected yield points.
There don't seem to be many write-ups on this concept. The best reference seems to be existing implementations:
Wasmer's implementation of metering[0] just traps when it runs out of fuel. WasmEdge's implementation of interruptibility[1] checks a flag and stops execution if it's set.
While neither of these support resuming execution after the deadline, replacing the halt with a call to a signal dispatcher should work.
Wasmtime has two different implementations of interrupting execution that both support resuming[2]. The fuel mechanism[3] is deterministic but the epoch mechanism[4] is more performant. If you're free to pick your runtime, I'm sure you could configure Wasmtime into doing what you want.
The idea of running code up to some resource limit and then aborting it is documented in the Lisp 1.5 manual from 1962, P. 34:
6.4 The Cons Counter and Errorset
The cons counter is a useful device for breaking out of program loops. It automatically causes a trap when a certain number of conses have been performed. The counter is turned on by executing count [n], where n is an integer. If n conses are performed before the counter is turned off, a trap will occur and an error diagnostic will be given.
> if I were to implement a signaling mechanism in a WASM runtime, what would that look like?
Synthesizing the other comments:
The "natural" way to do it exactly resembles a microcontroller interrupt handler. At an instruction boundary, push the current execution state onto the stack; jump to an alternate execution point; provide a special instruction to return to the previous execution state. This also copies all the _problems_ of POSIX signals (may happen in the middle of any execution state, what to do if you're in the middle of a syscall, etc).
If you just want a bidirectional bytestream: build a bidirectional bytestream. The classic select() or poll() style interface.
If you need higher performance but don't want the hassle of interruption at any stack state: build co-operative multitasking into the runtime. Provide a means of very quickly checking a single bit (in practice, a word of atomic granularity) to see if something needs to be done, then write the software so it can check this at various points and co-operate. Similar to uses of CancellationToken in C#.
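In C terms, that "single bit" can be as small as one atomic flag the host sets and the worker polls at its own yield points. A rough sketch (every name here is made up):

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Host/runtime side sets this; the worker polls it at convenient points. */
    static atomic_bool interrupt_requested;

    void host_request_interrupt(void) {
        atomic_store_explicit(&interrupt_requested, true, memory_order_release);
    }

    /* Called by the worker at loop heads / function entries ("yield points"). */
    static inline bool should_yield(void) {
        return atomic_load_explicit(&interrupt_requested, memory_order_acquire);
    }

    void do_work(void) {
        for (long i = 0; i < 1000000000L; i++) {
            /* ... one small unit of work ... */
            if (should_yield()) {
                atomic_store(&interrupt_requested, false);
                /* run the pending "handler", or save state and return */
                break;
            }
        }
    }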
Let me make sure I'm tracking the essence of your comment:
>At an instruction boundary, push the current execution state onto the stack; jump to an alternate execution point; provide a special instruction to return to the previous execution state.
In the VM, for each instruction:
1. Check if some bit has been set, which would indicate "we gotta interrupt"
2. Push current state onto the stack
3. Jump to pre-defined location & execute whatever's there
At some point, the code from step 3 (be it code I wrote or some guest-supplied interrupt handler) will push an instruction onto the stack that says "resume".
Is that roughly correct?
If so, this sounds like something that (unsurprisingly) requires support at the VM level. WASM has no such feature, though... are you aware of any clever hacks or workarounds?
While this doesn’t answer the question, and apologies if you’re aware of this already… if you just want a bidirectional bytestream, you are probably better off implementing an interface like select()/poll(). Actually, while I have zero WASI experience, it seems like it has one already. This is not the same as “polling” as in checking over and over; rather, the WASM program makes a list of conditions like “data available to read” or “buffer space available to write”, and performs a syscall which blocks until any of those conditions is met. Then the actual read/write call can be done without blocking. (There is also a design variant where the whole read/write operation is scheduled in advance and the blocking syscall just notifies you which previously scheduled operation has completed, similar to Windows IOCP or Linux io_uring.)
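For reference, the classic host-side C shape of that readiness model (WASI's poll_oneoff is roughly the same idea dressed up as subscriptions) looks like:

    #include <poll.h>
    #include <unistd.h>

    /* Wait until stdin is readable or stdout is writable, with a 500 ms cap,
       then do the actual read knowing it won't block. */
    int wait_for_io(void) {
        struct pollfd fds[2] = {
            { .fd = STDIN_FILENO,  .events = POLLIN  },
            { .fd = STDOUT_FILENO, .events = POLLOUT },
        };
        int n = poll(fds, 2, 500);          /* -1 timeout would mean "block forever" */
        if (n <= 0)
            return n;                       /* 0 = timeout, -1 = error */
        if (fds[0].revents & POLLIN) {
            char buf[4096];
            ssize_t r = read(STDIN_FILENO, buf, sizeof buf);
            (void)r;                        /* hand the data to the application */
        }
        return n;
    }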
No no, you're right in the thick of the issue, so thank you for your comment :)
At issue is the fact that select/poll requires integration into the language runtime of whatever has been compiled to WASM. For example, imagine that you've compiled an application written in Go to WASM and you want to create a syscall that is analogous to select while still being a distinct thing. The problem you run into is this: what if there's a timer that's going to fire in 500µs? You have no way of determining that and setting your `select` timeout accordingly. To do so would require hacking the Go runtime itself, along with the compiler.
>There is also a design variant where the whole read/write operation is scheduled in advance and the blocking syscall just notifies you which previously scheduled operation has completed, similar to Windows IOCP or Linux io_uring.
I'm not sure I'm fully grasping the distinction, but it seems appealing... can you elaborate? Or is there something I could read/watch?
Nobody else answered you so I'm going to make up an answer which, even if it's not a specifically correct answer, it's a good answer :)
look into a.out and ELF formats for executable files; your executable files, said files hereby called "your code", will be in this format on linux.
There will be an entry point for your code, so if your program gets run, the kernel (or whatever userspace launcher it's given this task to... now that I think of it, from the command line, the shell would fork itself, and one fork would exec your program, so we're talking about exec here) will load your code into memory and transfer control to your code at a particular address (the address of C main(), so to speak); "transfer control to" just means "set the program counter to". Before this happens, the launcher will have allocated memory for you, set up the stack pointer, opened stdin/out/err, put the command line arguments into place (argc, argv), etc. Then your code can run till it's done, making system calls if it needs to, and potentially being preemptively multitasked, but that's all invisible to your code: a black box in terms of how it's implemented.
now for signals...
Signals from the operating system need their own entry point(s) into your code. So, while your code is running, the OS can interrupt it (stuffing everything of yours onto your stack so it can be popped off later when control flow is returned to the same point later). In this meantime, the OS will now transfer control (set the program counter just like it did when it first launched you) and you're off and running with code to handle the signal, the interrupt.
If you have not prepared your code for this event, the default program flow (set up by the launcher for programs) will send control at this point to a generic handler that says "unhandled signal", and your program will be killed. The message is actually lying: the signal is being handled, but it's being handled by a handler that cleans up and exits.
But, you can write code to handle the signal yourself, and when your program launches, you (generally in main() or nearby) register the address of the handler you want to use so that when the interrupt comes the OS will know where to send control. At that point it's as if you just called a function/subroutine from wherever your code was running at that moment, and when that function returns, things continue as they were before, modulo whatever you did in your handler.
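In C, that registration step is a sigaction() call. A bare-bones sketch, where the handler does the only thing it can safely do (set a flag):

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static volatile sig_atomic_t got_sigint = 0;

    /* This is the "entry point" the OS transfers control to on SIGINT. */
    static void on_sigint(int signo) {
        (void)signo;
        got_sigint = 1;
    }

    int main(void) {
        struct sigaction sa = {0};
        sa.sa_handler = on_sigint;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGINT, &sa, NULL);   /* tell the kernel where to send control */

        while (!got_sigint)
            pause();                    /* normal program flow */
        printf("caught SIGINT, cleaning up\n");
        return 0;
    }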
There's a bunch of specifics I don't know, is there one entry point for all signals, and a signal argument token decides where to transfer control, or are there 10 separate handlers so you don't waste time? etc.
if you're doing no-wait I/O with a single thread, one way is to make a blocking call, call into something that doesn't return till there's new data, and that new data may then arrive, handled by your interrupt handler; in that case returning from your input-getting handler could return from the blocking call and your code is off and running as if it just returned in the first place. There are a myriad variations on this, but that's the general idea.
happy to clarify anything, others can probably answer better, etc.
Firstly, thank you. Your comment is relating a couple of distinct ideas together in a way that is very helpful.
>Then your code can run till it's done, making system calls if it needs to, and potentially being preemptively multitasked, but that's all invisible to your code, black box in terms of how it's implemented.
I'm realizing that I don't actually understand how pre-emptive multi-tasking is implemented at the VM level, and I think this is perhaps what I actually want. Can you elaborate and/or suggest any articles/video-lectures on the subject?
Maybe what I need is an operating-systems course...
I think glibc or equiv provides the default signal handlers as part of the stuff that gets included with even "hello world" to make it 10kb (or 10mb, whatever your tools emit).
The default actions are handled by the kernel so that they still exist even before glibc has a chance to come up (for instance so core dumps happen even in early dynamic linking).
> do something with garbage collection (but what? I’m confused about this still.)

It's an almost hilarious technique. In a garbage-collected, multi-threaded virtual machine the runtime may at some point decide it is time for garbage collection. However, all threads must first be stopped, because allocating memory while the garbage collector runs leads to badness and/or the garbage collector moves objects referenced by threads.

There is no direct way to signal to a thread that it should stop, and having the thread poll some flag would hurt performance. Instead the compiler inserts dummy instructions (safepoints) at the top of every loop body or function entry point which write to a page the runtime controls:

    mov [addr], eax

Then when it's time to stop all threads, it just write-protects the page! Every thread eventually stumbles upon the bomb. Boom! SIGSEGV! Boom! SIGSEGV! Boom! When every thread has exploded, the garbage collector can run in peace. Then the runtime removes the write protection and resumes executing all threads. Something very similar is used on Windows (though I think it's called events and not signals) and sometimes you use read-protected pages rather than write-protected ones, but the principle is the same.
You could have each thread select(), poll(), or read() a byte, from the same pipe in their SIGSEGV handler. (Those syscalls are async-signal-safe in POSIX.)
When it's time to resume, the GC unprotects the page then writes a byte (or enough bytes if read() is used), which wakes all the blocked mutator threads.
Those threads also need to synchronise acks with the GC. After the GC protects the guard page, the mutators don't stop mutating immediately, so they must send an ack to the GC thread from their SIGSEGV handler to say when they have stopped. A pipe can be used for this too. Use of a non-mutex-using atomic counter, if available, can speed this up by ensuring the GC only wakes once. Similarly, the mutator threads must send an ack after they resume, before returning from their SIGSEGV handler, to ensure the next GC run does not race with a mutator thread still slowly handling the current GC run.
On Linux, futex or eventfd are async-signal-safe but non-portable alternatives to pipes. Futex has the advantage of not using a file descriptor and may be faster.
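Putting this comment and its parent together, a toy sketch of the protect/fault/park-on-a-pipe/resume cycle for a single mutator thread (the ack handshake described above is deliberately left out to keep it short):

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <signal.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static volatile char *guard_page;      /* the page the "safepoint" writes hit */
    static int resume_pipe[2];             /* GC writes here to release mutators  */

    /* A fault on the guard page means "you hit an armed safepoint, park here". */
    static void on_segv(int sig, siginfo_t *info, void *ctx) {
        (void)sig; (void)ctx;
        if (info->si_addr != (void *)guard_page)
            _exit(1);                       /* a real crash, not our safepoint */
        char c;
        read(resume_pipe[0], &c, 1);        /* async-signal-safe; blocks until the
                                               GC is done, then the write is retried */
    }

    static void *mutator(void *arg) {
        (void)arg;
        for (;;) {
            /* ... mutator work ... */
            *guard_page = 0;                /* the injected "mov [addr], eax" */
        }
        return NULL;
    }

    int main(void) {
        long pagesz = sysconf(_SC_PAGESIZE);
        guard_page = mmap(NULL, pagesz, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        pipe(resume_pipe);

        struct sigaction sa = {0};
        sa.sa_sigaction = on_segv;
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);

        pthread_t t;
        pthread_create(&t, NULL, mutator, NULL);

        sleep(1);
        mprotect((void *)guard_page, pagesz, PROT_READ);      /* stop the world  */
        /* ... the mutator is now parked in on_segv(); "collect garbage" here ... */
        mprotect((void *)guard_page, pagesz, PROT_READ | PROT_WRITE);
        write(resume_pipe[1], "x", 1);                         /* resume          */

        sleep(1);
        return 0;
    }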
A naïve approach is to have each SEGV handler poll something to tell when GC is done, and simply return when it can. State is then restored by the kernel.
The most important thing I know about signals is that in dockerized applications, your ENTRYPOINT or CMD must use the exec form (`["command", "arg1", "arg2"]`) in order for your application to receive signals.
A lot of applications, like web/API frameworks, will gracefully shutdown if they receive SIGTERM or SIGINT, but if that signal doesn't propagate to the app you'll need to SIGKILL instead. Kubernetes will do this for you, waiting for your process to shutdown and then ungracefully killing it when it doesn't do so in time.
But you can't do env var substitution with the exec form...
Yeah I don’t disagree at all. I’ve seen so many people just put up with waiting for their containers to get SIGKILLed by the docker timeout, when they could instead use tini to forward SIGINT to trigger their graceful shutdowns. Drives me nuts haha
Ah gotcha. I believe it can be baked into images as well, per the entrypoint example in the readme: https://github.com/krallin/tini
Not sure how this will fare IRL in k8s as I haven’t much experience there. It’s still silly that this is the default behavior where you need something like Tini, but I digress.
A weird (at first sight) thing about signal handlers - the kernel doesn't really know, or care, if you're running in one.
Which makes sense because signal handlers can nest, so ideally the kernel wouldn't have to maintain some state for each enter/exit of signal handlers to be tracked.
Instead, the kernel sets up the necessary conditions for things to act like a signal handler expects and puts some state on your stack representing your registers, active signal mask, etc, plus some state to ensure a sigreturn syscall will run.
If you return from your handler, that'll take effect and your state is restored to where you were before. If you go into a nested signal handler, the kernel pushes another pile of state on the stack to return to where you are now.
You're free to change that saved state before returning - or just siglongjmp out of there and discard it, if you know what you're doing.
Libc is a lot more tricky about signals, since not all libc functions can be safely called from handlers. From the kernel's point of view, there's nothing magic about them at all but usually we're stuck with libc restrictions into the bargain.
It also makes sense because the kernel has no way of knowing. When you exec a binary, all that the kernel sees is an ELF image with a blob of bytes in the .text section. It doesn't know which part(s) of that blob of bytes are a signal handler. Even if it did know, the kernel has no way of verifying that the function is only ever invoked by the signal (userspace could manually call the signal handler function).
> Libc is a lot more tricky about signals, since not all libc functions can be safely called from handlers.
And this is a huge thing. People do all kinds of operations in signal handlers completely oblivious to the pitfalls. Pitfalls which often do not manifest, making it a great "it works for me" territory.
I once raised a ticket on fluentbit[1] about it but they have abused signal handlers so thoroughly that I do not think they can mitigate the issue without a major rewriting of the signal and crash handling.
Calls to printf() are particularly common in signal handlers I've seen in commercial code. malloc() too, occasionally. Sometimes calls to logging functions.
These are undefined behaviour, for real (and for good reasons), not just theoretically. They are a cause of reported occasional random crashes, but people don't realise, and it's tricky to demonstrate or warn at compile time.
The first time I encountered signals and EINTR, I was baffled. It felt like I was missing something obvious. This couldn't be how things really work. This must be a joke or something. I must be reading an outdated documentation, and there's a sane solution for this, right?
The existence of EINTR is famously discussed in the Unix-Haters Handbook [0] and Richard Gabriel's "worse is better" essay [1], not sure which one told the story first. Paraphrasing: in the "MIT philosophy" the kernel should obviously continue the system call automatically because that would be simpler for the caller, and in the "New Jersey philosophy" the Unix implementation is obviously better because the kernel is simpler and functions more transparently.
Though now there's `SA_RESTART`, so Unix ended up doing the "right thing" eventually.
There's a lesson here, I think: "worse is better" is good advice when it lets you ship and get software into the hands of customers quicker, but it doesn't change the fact that you should do the "right thing" at some point.
I wouldn't be so sure that SA_RESTART is the right thing, because using SA_RESTART means you have to do actual work inside your signal handlers. The nice thing about EINTR is your signal handlers can be dumb, and just set a "got_signal = true" variable, so you know 100% for sure your signal handler is signal safe. Then your read() loop just ignores EINTR and checks that variable, to know when it needs to do work.
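Roughly this shape (the handler does nothing but set the flag, and SA_RESTART is deliberately not set so read() actually returns with EINTR):

    #include <errno.h>
    #include <signal.h>
    #include <unistd.h>

    static volatile sig_atomic_t got_signal = 0;

    static void handler(int sig) { (void)sig; got_signal = 1; }

    void install(int sig) {
        struct sigaction sa = {0};
        sa.sa_handler = handler;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;                    /* deliberately NOT SA_RESTART */
        sigaction(sig, &sa, NULL);
    }

    /* Read loop that tolerates EINTR and checks the flag between attempts. */
    ssize_t read_or_stop(int fd, char *buf, size_t len) {
        for (;;) {
            if (got_signal)
                return -2;                  /* caller decides how to shut down */
            ssize_t n = read(fd, buf, len);
            if (n >= 0)
                return n;                   /* data or EOF */
            if (errno == EINTR)
                continue;                   /* a signal woke us; recheck the flag */
            return -1;                      /* a real error */
        }
    }

(There's still the classic race where the signal lands between the flag check and the read() blocking; that's exactly the gap the self-pipe trick and pselect() exist to close.)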
Unfortunately, due to the delay, a lot of people reimplemented it badly.
For example, signal handling is completely broken in Python since blocking syscalls (think `select`, but see `signal(7)` for a complete list) will not be interrupted.
`SA_RESTART` correctly excludes such syscalls so you can properly handle the EINTR instead of hanging.
The big problem is; if you have a thread reading from a socket, how can you ever interrupt it if you wish to do so? EINTR on signals is really not so bad when you think of it from that angle. (If you don't want to deal with EINTR everywhere, you can block the signals and use e.g. epoll_pwait() to temporarily unblock them at opportune moments. Or you can set SA_RESTART when setting up the signal handler if you do not ever wish this behavior.)
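A sketch of that block-then-atomically-unblock pattern using pselect(), which is the portable cousin of epoll_pwait() (the signal can only be delivered while we're parked in the wait):

    #include <errno.h>
    #include <signal.h>
    #include <stdio.h>
    #include <sys/select.h>
    #include <unistd.h>

    static volatile sig_atomic_t stop = 0;
    static void on_term(int sig) { (void)sig; stop = 1; }

    int main(void) {
        /* Keep SIGTERM blocked except while parked in pselect(). */
        sigset_t blocked, orig;
        sigemptyset(&blocked);
        sigaddset(&blocked, SIGTERM);
        sigprocmask(SIG_BLOCK, &blocked, &orig);

        struct sigaction sa = {0};
        sa.sa_handler = on_term;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGTERM, &sa, NULL);

        while (!stop) {
            fd_set rfds;
            FD_ZERO(&rfds);
            FD_SET(STDIN_FILENO, &rfds);
            /* pselect atomically swaps in 'orig' (SIGTERM unblocked) for the
               duration of the wait, then restores the blocked mask. */
            int n = pselect(STDIN_FILENO + 1, &rfds, NULL, NULL, NULL, &orig);
            if (n < 0 && errno == EINTR)
                continue;                   /* signal arrived; loop re-checks stop */
            if (n > 0) {
                char buf[4096];
                read(STDIN_FILENO, buf, sizeof buf);
            }
        }
        fprintf(stderr, "clean shutdown\n");
        return 0;
    }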
For regular files, Linux has a simple solution for you: You'll never get EINTR. It's only a thing for sockets, pipes and other things that can block indefinitely, which a file cannot.
(Except if you're on NFS, in which case you have to choose between two evils depending on whether your file system is mounted intr or nointr :-) )
"nointr", last time I used it, meant your program could never be interrupted by NFS errors under any circumstances. Therefore, if the NFS master went away, your program couldn't be killed either and would be stuck in D state forever until you rebooted the whole NFS client machine.
(You don't need to reboot, though; you can “mount -o remount,intr …” to switch states and then kill. I wonder if you can also now actually do kill -9 specifically even on nointr, but I haven't checked.)
In my experience (admittedly from some years ago), you have to be pretty careful about using the terminal when NFS fails, since you can pretty easily lock up your terminal, of which you only have a "fixed" supply. If you're using NFS, chances are good your home directory is on NFS so you need to make sure you don't accidentally stat() something in the current path (which is probably in your home directory). Obviously, starting new terminal sessions once you've locked one up isn't going to happen. But you'll probably waste a couple trying to figure out what's going on. You'll also want to have that mount command written on a piece of paper somewhere (and remember that you have it), since you can't read your notes.txt file in your home directory. You probably can't do a web search, since the browser reads/writes a cache to your home directory. (Maybe lynx or links would work?) Hopefully you didn't add ~/bin or an NFS-mounted /opt or something before the system paths or you'll need to run everything with the full path. But... a bunch of tool installers like to prepend their directory to your path if you let them fix everything automatically. I have a ~/.gem/ruby/2.3.0/bin early in my path right now, completely didn't realize.
I haven't used NFS in decades; back in the day (we had Sun-3's running SunOS), when an NFS server hung, we got a message like `NFS server foo not responding, still trying' over and over again. In the absence of an admin, all you could do was to login again on a different machine. We called it the `Notwork File System'.
The joy of trying to log in to UNIX thin terminals with the home directory mounted via NFS and the network cable having a broken terminator, back before Ethernet became a thing.
You kind of put the cart before the horse... Threads are the coping mechanism, that has to cope with signals, not the other way around. Threads exist in the way they are because of the original bad design (which included signals).
Potentially, there could be other ways of dealing with communication; some of them already exist in popular operating systems, such as sockets. It's actually funny that you mention one in your problem statement. Erlang-style ports are another possible solution.
Replace “thread” with “process”, then. The basic fact doesn't really change; if you want a clean shutdown on Ctrl-C, you'll need to allow an EINTR-like return from a blocking syscall.
>The big problem is; if you have a thread reading from a socket, how can you ever interrupt it if you wish to do so? EINTR on signals is really not so bad when you think of it from that angle.
Yeah, but what if the thread is not currently blocked and you want to interrupt it / stop the thread? Unfortunately, it doesn't queue the EINTR for the next blocking function if the thread was NOT blocked at the moment the signal was sent.
In typical server applications you can use signalfd to handle signals like receiving data from a client instead of polling. I think on older versions of Linux you could use a trick where you write a single byte to a pipe in the signal handler to achieve a similar effect.
Right which is why you don't actually do much of anything in the signal handler except save that it happened somewhere to be processed by your application later.
Uh, that Examples section calls print inside the signal handler. Python's IO stack is non-reentrant. I hope nobody follows these examples and gets exceptions at runtime because of it.
"A Python signal handler does not get executed inside the low-level (C) signal handler. Instead, the low-level signal handler sets a flag which tells the virtual machine to execute the corresponding Python signal handler at a later point(for example at the next bytecode instruction)"
The low-level C handler sets the flag, then returns. Only some time later does the Python run-time call the associated Python handler that you see in the Examples section.
In this case that's before the next bytecode executes. The write syscall gets interrupted and returns EINTR, then cpython checks what signal was caught and executes the signal handler, before trying to do the remaining write:
There are a few ways to handle this design question ("what happens if something needs to interact with a process while it's blocked in a system call?") more or less sanely, and UNIX chooses the least sane one (making userspace deal with the complexity of system calls possibly doing part or none of the work that was requested). Other operating systems might fully transparently guarantee that the system call completes before the process is notified of said interaction, but this isn't compatible with IPC as "lightweight" (unbuffered and without backpressure) as signals.
The Right Thing is to make system call submission atomic and asynchronous, only waiting on completion by explicit choice, and remove signals entirely in favor of buffered message-passing IPC. This is basically the world we're approaching with io_uring and signalfd, except for ugly interaction with coalescing and signal dispositions (see https://ldpreload.com/blog/signalfd-is-useless), and the fact that many syscalls still can't be performed through io_uring.
If UNIX had a better API for process management, people wouldn't see signals as necessary, but that's its own can of worms with its own Linux-specific partial fix (pidfd) and genre of gripe article (e.g. https://news.ycombinator.com/item?id=35264487).
1. For signal handling: if your signal handler needs to do more than setting a global value that ought to be read regularly by your program, consider that it might not be the mechanism you need, or that you're trying to hack around a code design that has implications you really might not want to deal with.
2. For signal sending for process group control: read the manuals and forum answers carefully to understand the semantic subtleties between groups and sessions, as sending signals from a terminal with ctrl-<d,c,z,…> does not do the same thing as signaling processes from a program's system call, which can really sway intuitions you might have on the subject.
You can do surprising stuff in signal handlers; a program I used to work on had an interrupt menu in the signal handler. You could ctrl-c it and choose various "stop", "checkpoint", "dump status info" options. Similar to https://news.ycombinator.com/item?id=37899269 although without going quite so far as to adjust the stack with longjmp(). Whether you're "allowed" to do any of these things is another matter and they're almost certainly undefined behavior.
The interaction between threads and signals is really dire though. You need to mess around with signal masking. I think we'd ended up with an architecture where one thread caught the SIGINT and passed a SIGRT to a different thread; this behaved better in some way I cannot now remember.
If you want your program to be portable to Windows and MacOS as well, prepare for things to be differently broken there.
If your program has an async loop (select or poll etc.) then you need to know about the self-pipe trick for avoiding problems with signals. https://cr.yp.to/docs/selfpipe.html
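The trick itself is tiny: the handler does nothing but write a byte to a pipe whose read end sits in your poll set, so "a signal arrived" becomes just another readable fd. A sketch:

    #include <fcntl.h>
    #include <poll.h>
    #include <signal.h>
    #include <unistd.h>

    static int selfpipe[2];

    /* write() is async-signal-safe; O_NONBLOCK means a flood of signals can
       never block the handler if the pipe fills up. */
    static void on_signal(int sig) {
        (void)sig;
        char c = 1;
        write(selfpipe[1], &c, 1);
    }

    int main(void) {
        pipe(selfpipe);
        fcntl(selfpipe[0], F_SETFL, O_NONBLOCK);
        fcntl(selfpipe[1], F_SETFL, O_NONBLOCK);

        struct sigaction sa = {0};
        sa.sa_handler = on_signal;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGINT, &sa, NULL);

        struct pollfd fds[2] = {
            { .fd = STDIN_FILENO, .events = POLLIN },
            { .fd = selfpipe[0],  .events = POLLIN },   /* signals show up here */
        };
        for (;;) {
            if (poll(fds, 2, -1) < 0)
                continue;                               /* EINTR: just retry */
            if (fds[1].revents & POLLIN) {
                char buf[64];
                read(selfpipe[0], buf, sizeof buf);     /* drain the pipe */
                break;                                  /* SIGINT: shut down */
            }
            if (fds[0].revents & POLLIN) {
                char buf[4096];
                read(STDIN_FILENO, buf, sizeof buf);    /* normal I/O */
            }
        }
        return 0;
    }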
This pattern of the signal handler merely setting a global var seems robust. Of course, now the programmer must decide when and where to check the global value.
It must also be of an atomic type. POSIX specifies exactly one, namely “volatile sig_atomic_t”. I guess in C++11 etc. you can also use the language-provided atomics.
I was talking to someone at a vintage computer meetup this weekend and we ended up talking about "fear" in exactly this context - fear of asking, or fear that we couldn't know something or that it was too hard. From my perspective it was for exactly things like signals that were not intuitive or broke my mental model of linear programming (threads too, when they first became popular). He was an artist and his fear had been around not understanding what other people are doing with their art, or if his art was getting his message across.
In both our cases, we benefitted from having mentors that helped explain (like this fine article does) by breaking things down and then... letting us learn from the broken-down pieces rather than all at once. A mentor can help us conceptualize difficult concepts by showing us, intuitively, where to draw our abstractions.
Overcoming the fear for both of us came down to acknowledging it, "owning it", and being ok with it. (Sorry if this is rambling, but it was one of the most rewarding conversations I've had in years, and touched at least tangentially on this article.)
(And I still have a fear of MMUs, but I've never had to write anything at that level, although it's on my programming bucket list...)
Absolutely. Asynchronous signal safety is among the murkiest waters of systems programming. It's pointless to even try to do real work in a signal handler; it's not safe to do anything more complex than setting a flag. The sanest way to handle signals seems to be signalfd. You just turn off normal signal delivery and handle them by epolling a signals file descriptor instead. Not portable of course, it's a Linux feature.
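Something like this, for the record (Linux-only; a single-threaded sketch, a threaded program would use pthread_sigmask instead):

    #include <signal.h>
    #include <stdio.h>
    #include <sys/epoll.h>
    #include <sys/signalfd.h>
    #include <unistd.h>

    int main(void) {
        sigset_t mask;
        sigemptyset(&mask);
        sigaddset(&mask, SIGTERM);
        sigaddset(&mask, SIGINT);
        sigprocmask(SIG_BLOCK, &mask, NULL);     /* turn off normal delivery */

        int sfd = signalfd(-1, &mask, SFD_CLOEXEC);
        int ep  = epoll_create1(EPOLL_CLOEXEC);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = sfd };
        epoll_ctl(ep, EPOLL_CTL_ADD, sfd, &ev);
        /* ... add your sockets etc. to the same epoll instance ... */

        for (;;) {
            struct epoll_event out;
            if (epoll_wait(ep, &out, 1, -1) <= 0)
                continue;
            if (out.data.fd == sfd) {
                struct signalfd_siginfo si;
                read(sfd, &si, sizeof si);        /* a signal, read as plain data */
                fprintf(stderr, "got signal %u from pid %u\n",
                        si.ssi_signo, si.ssi_pid);
                break;
            }
            /* ... otherwise it's one of your own fds ... */
        }
        return 0;
    }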
When using green threads/fibers/coroutines, an interesting technique to make signal handling safer is to run the signal handler asynchronously on a separate fiber/green thread. That way most of the problems of dealing with signals go away, and there's basically no limitation on what you can do inside the signal handler.
I've successfully used this technique in Polyphony [1], a fiber-based Ruby gem for writing concurrent programs. When a signal occurs, Polyphony creates a special-purpose fiber that runs the signal handling code. The fiber is put at the head of the run queue, and is resumed once the currently executed fiber yields control.
I love that userfaultfd now exists as an alternative to trapping sigsegv. I wish it was a bit more flexible though (I'd like to be able to associate user defined meta-data with the fault depending on address, for example.)
That sort of sounds like the solution I was thinking of when reading this.. that handlers should only cache the signal received to be handled at the program's convenience, except for a kill signal etc.
Yes. Run. Never look behind you. One of the most terrifying tar-pits of my life came from trying to solve a process concurrency problem by monitoring PID files and using Unix signals to pause and resume things.
SBCL now uses the "write-bitmap"/"card marking" scheme on some platforms, which is a tiny bit faster (1-2% mentioned on sbcl-devel). More interesting is that doing the touch-detection in software allows for finer grained precision (e.g. #+mark-region-gc uses 128 byte cards) than hardware (e.g. 4kiB pages on x86-64, 16kiB on M1) which can drastically affect scavenging time [0]. The precision is also really nice for non-moving generational schemes: if old and new objects exist on the same card, writes to new objects (which are more common too!) will cause old objects to needlessly be scanned by GC, which is called "card pollution" by Demers et al [1], so reducing the card size reduces the likelihood of that happening.
Oh neat, I thought it was still unoptimized compared to gengc (which afaik does still use write protection). I'll have to check it out, see if I have a machine it'll run faster on
gencgc uses software protection on some but not all architectures -- I recall x86-64 and MIPS but not ARM though. On x86-64 with SBCL 2.3.8 for example:
Wasn't there some additional magic around BSD vs. System V-style signal handling? In one of these the signal handler is restored to the default after each invocation, so the first thing you have to do is reinstate your own handler (and you still might get surprised/terminated by a second signal that arrives too quickly to do this, and gets handled by the default signal handler).
Or is that an arcane thing that no longer applies to modern systems?
Yes, you are describing traditional System V style signals. BSD made signal handlers persistent. Modern (since the 1990s) code uses sigaction() to set up signal handlers which allows you to choose the sensible (BSD) semantics.
I remember someone telling me that the only thing you should be allowed to do in a signal handler is to set a semaphore (or similar synchronization primitive).
The actual logic is to be done in the main application. That's a similar idea to the signalfd function the article mentions.
Obviously an oversimplification, but I think, a good rule of thumb.
For once, Betteridge's Law of Headlines doesn't hold. Signals suck and are best avoided.
When handling the concept in a vacuum, like writing a simple program from scratch that uses them, sure, they're usable.
But add threads, libraries, and complex code and it quickly risks becoming a huge pain. It's one of those things from UNIX that are in principle not that bad of an idea, but just never grew up with the times.
It’s a mess that originates from the notion that terminals must be able to control and interrupt processes started from a shell. Events from the terminal are translated to SIGxxxx and acted upon by well-behaving programs. Had there been no terminals from the beginning, we would probably have a different abstraction for interrupting processes.
There's also the signals that can originate from "within", like SIGILL, SIGPIPE when you write to a pipe with no reader, and SIGCHLD for subprocess management.
IMO it's fine to be scared of these, and fine to be scared of threads (and I have similar feelings about floating-point). You don't actually need threads; threads can't do anything that processes can't, indeed in the Linux 2.4 days threads and processes were the same thing to the scheduler. The fact that there are programs in common use that handle signals does not convince me that it's possible to handle signals correctly in all cases; lots of programs get used for decades while being subtly broken (see the famous "you are not expected to understand this").
While Julia starts scared and ends up feeling better, I've only gotten more scared of Unix signals over time.
Context: I've written a robust command-line utility [0] that must handle signals, and unlike most utilities that are I/O-bound, mine is CPU-bound.
An I/O-bound utility can easily use signalfd() (subject to the gotchas in the "signalfd() is useless" post that Julia links to, which you should also read). signalfd() will work for I/O-bound utilities because it turns those signals into I/O. Perfect.
However, in a CPU-bound program, signals are used specifically to interrupt execution. This is, to put it mildly, as difficult as writing code for interrupts in the embedded space. Why? Because that's really what you're doing: handling an interrupt that can happen at any time.
My solution was something I wish on no one: I used setjmp() and longjmp().
Horrors!
Yep. And it gets worse: I had to longjmp() out of the signal handler.
AH!
And it gets worse: to ensure that there were no memory leaks, I had to keep a stack of jmp_bufs and manually jump to each one, which would be in a function where memory had to be cleaned up.
Cue screams of bloody murder
You may insist that longjmp()'ing out of signal handler is not allowed; it actually is [1], but unlike most other "async-signal-safe" functions, you need to ensure you don't interrupt a syscall or other code that is not async-signal-safe.
So it gets worse: I have a signal lock that the signal handler checks. If it's not locked, the signal handler will longjmp() out of the signal handler. If it is locked, the signal handler sets a flag and returns. Then the code that unlocks signals checks for the flag and does a longjmp() if it's set.
He's dead, Jim!
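Stripped down to its bones, the lock-or-defer scheme looks something like this (a sketch of the idea, not the actual bc code):

    #include <setjmp.h>
    #include <signal.h>
    #include <stdio.h>

    static sigjmp_buf interrupt_point;
    static volatile sig_atomic_t sig_locked  = 0;   /* inside unsafe code?         */
    static volatile sig_atomic_t sig_pending = 0;   /* signal arrived while locked */

    static void on_sigint(int sig) {
        (void)sig;
        if (sig_locked) {
            sig_pending = 1;                /* defer; the unlock path will jump */
            return;
        }
        siglongjmp(interrupt_point, 1);     /* interrupt the computation now */
    }

    static void sig_lock(void)   { sig_locked = 1; }
    static void sig_unlock(void) {
        sig_locked = 0;
        if (sig_pending) {
            sig_pending = 0;
            siglongjmp(interrupt_point, 1); /* the deferred interrupt fires here */
        }
    }

    int main(void) {
        struct sigaction sa = {0};
        sa.sa_handler = on_sigint;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGINT, &sa, NULL);

        if (sigsetjmp(interrupt_point, 1)) {    /* 1: also save/restore the mask */
            fprintf(stderr, "interrupted, cleaning up\n");
            return 1;
        }
        for (;;) {
            /* pure CPU-bound work: safe to be jumped out of at any point */
            sig_lock();
            /* ... a syscall or other non-async-signal-safe region ... */
            sig_unlock();
        }
    }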
I have another project that is a framework in C. This framework needs to handle signals for clients. It has to be general, so it has to handle CPU-bound clients. So I had to implement the same thing. I was able to make it easier, but it is also harder because I have that one thing that messes up every Unix API: threads.
Nuclear mushroom cloud
So should you be scared? It depends; if you can get away with signalfd() and know its gotchas, maybe not.
But if you need anything more complex, yes, be very afraid.
I searched for longjmp and was pleased to see someone else brought it up first. Yes, you can longjmp from a signal handler back to standard execution flow. It is even considered a "best practice" when using readline!
Have you ever read up on communicating sequential processes? In that model you have one or multiple threads executing work while new messages arrive on a queue. In such an architecture handling signals is trivial.
Yeah, I was going to mention that if the long-running/CPU-bound work was put in a worker thread with a message queue in which state messages are received/handled, it'd be relatively trivial to handle the signal across these constraints, since signals are only delivered upon context switches.
But if you've just got one very tight loop and are running like a bat out of hell, well you deserve the hell you're in. ;) longjmp() is definitely not the right way to do this. A thread dedicated to handling signals (or doing the CPU work) which properly tells the main() thread to die/quit/clean up is how I'd do it .. even if you only spawn a thread for signal handling, its still way cleaner than that longjmp() business .. which is very difficult to understand, even with the context you've given in this thread (I read your code - its nasty) ...
Because signals are delivered on context switch, and you're not allowing any of that to happen. Its not a well-behaved process in that case .. not that you shouldn't be able to just burn as much CPU as you want, just that you would need to open up some space to communicate with the OS in the meantime, and having a signal-handler-thread or a cpu-working-thread is how to do it properly. (Do you also handle SIGHUP? SIGSTOP?)
longjmp() is just asking for hell 6 months later when you come back and have to work out why you did such a hokey thing in the first place.
EDIT: it should be noted that the "side-effect" of using longjmp is .. a context switch .. so you gave yourself some space to behave properly with the OS. Its just that the more future-proof way to do it is to have a thread for work (you have this already with main()) and a thread for signals (you'd just add this to service signals only and give main() a way to cleanly exit) ... Not too complex, and a tad bit easier to read 6 months later ..
Unless you include OS preemption in "context switch," you are wrong; signals can come at any time.
If you do include it, then yes, signals only happen on context switch. But in that case, my CPU-bound code is not preventing context switches; the OS preempts it. This is how all major OS's are designed.
If I didn't longjmp(), by the way, it could take an infinite amount of time between when the user sent SIGINT to when the program stopped by the very nature of what that utility does. By using longjmp(), a SIGINT becomes a true interrupt, which is what users want.
Having a working thread and a signal thread does not change that because the signal thread would need to interrupt the working thread in the same way.
They can come at any time but they are only delivered to your process by the kernel during a context switch. They're queued until then.
> it could take an infinite amount of time between when the user sent SIGINT to when the program stopped by the very nature of what that utility does
A well coordinated signal-handling-thread and a workload-thread won't manifest this issue - poorly managed threads however, will.
>By using longjmp(), a SIGINT becomes a true interrupt, which is what users want.
Hard disagree.
What you've done is turned signals into interrupts, which is .. hokey. And not how signals are intended to be used. Its quite possible to get the behaviour you expect - fast interruption and death of work-code - but you'd have to sort your issues with threads out, first.
EDIT: its decades-old proven technology: use a semaphore or a mutex to keep your threads in lockstep, and avoid this longjmp() malarkey... signals aren't hard, but maybe they're only just a little less harder than threads ..
> What you've done is turned signals into interrupts, which is .. hokey
No, that is precisely what they are - user space interrupts. That's how they work. That's why one has to worry about async safety and signal-safe calls.
I have no idea where you get off on this “allowing a context switch” nonsense. In most of the systems being discussed, if a signal is delivered and the thread is runnable it will be delivered immediately and asynchronously - there is no queuing going on in that case. If the thread is not runnable/scheduled that’s another story but this does not square with what you’re saying, because it sounds strongly that you’re saying that signals are delivered synchronously (with context switches) and they are most definitely not, generally.
Also longjmp is an entirely user space concept - the kernel on the most common systems being discussed has no idea of its operation.
> They can come at any time but they are only delivered to your process by the kernel during a context switch.
This is backwards and extremely misleading; a signal being queued can trigger an immediate context switch (effected via an inter-processor interrupt). See `kick_process()` in the Linux kernel [1].
> What you've done is turned signals into interrupts, which is .. hokey. And not how signals are intended to be used.
Signals are userspace interrupts. That's exactly what they are, and they're no more hokey than hardware interrupts (so, pretty hokey).
I used the system you are advocating in my utility at first.
The problem is that you need to constantly check for work. That is expensive on tight loops. Yes, I checked on every loop iteration unless I knew a small, but constant, amount of work had to be done.
That's the thing: if you have a tight loop, do you know how long it's going to take to run through everything? In addition, when a SIGINT comes, can you be sure that your state is correct?
Here's a test: download my utility and run these commands:
In the 'real world' compare-and-swap operations (such as one would find in atomic types used for communication between worker and handler) are single-cycle operations, if not a hard CPU flag...
>I understand why you think the way you do; I did too. But the real world is more complicated.
Please consider the complications of the high-end audio world, where such techniques are well established. Not only must bat-out-of-hell threads have all the gumption they can muster, but they have to be able to be controlled - in as close to realtime as possible - by outside handler threads.
I think boffinaudio is trying to help you improve your code quality. Its not bad advice to re-think this.
If I was in that context, yes, I would use outside handler threads, but that's because I could probably put the compare-and-swap in one place (or very few).
As of 2.7.0, my bc had 77 places where signals were checked, and I was probably missing a few.
Real-time is different from what I was doing, so it required different techniques.
There is a case for signals in strictly real-time, strictly high-performance, and also strictly realtime+high-performance code.
You have decided to go strictly for high-performance, for your well-argued reasons, and you've abandoned a standard practice for your stated claims, but this isn't just about your code - its about how people can mis-use signals, and in your case you're mis-using signals by not using them.
It is the advice:
>"Yes be very afraid of signals."
.. which feels not entirely appropriate.
So I took a look at the bc code, and I too am terrified of your use of longjmp.
It appears to me you've gotten somewhat smelly code because you didn't find the appropriate datatype for your case, and decided to roll your own scheduling instead. Ouch.
>As of 2.7.0, my bc had 77 places where signals were checked, and I was probably missing a few.
To refactor this to a simple CAS operation to see if the thread should terminate doesn't seem too unrealistic to me. Only 77 places to drop a macro that does the op - checking only whether the signal handler has told your high-performance thread to stop, hup, die, etc.
Signals are awesome, and work great - obviously - for many, many high-performance applications, and your high-performance, CPU-bound application might feel like the only way you could do it - but you certainly can attain the same performance and still handle signals like a well-behaved application that doesn't have to take big leaps just to stay ahead of the scheduler ..
>Also, I didn't have a scheduler. The point of interrupts is that you don't need a scheduler.
I think where we digress in position is that I do not think you have a good justification for the statement "be scared of signals" because, after all, you are clearly not scared of them and have decided to bend them to your own thoughts on how best to optimize your application, so it's sort of disingenuous to hold that position having completely wiped "the standard way to do high-performance signal-handling" from your slate, to put your own special case forward by example. You're clearly not scared of them.
I'm calling you out on it because signals are absolutely not scary, but maybe talking about them with other experts can be.
Your case is more an example of how unscary signals are - but you've opted for longjmp()'s (which are, imho as a systems programmer, a far more cromulent fear) in your code as a solution to a problem which I don't think is really typical.
Thus, not really scary at all.
Well, it was a fun read of some code, and thanks for bc anyway.
Bus activity makes CAS and any atomic operation far more costly than a single cycle. If they were really that cheap then every operation would just be atomic.
In general you must trade off bandwidth for improved latency. Audio work by its nature can and must do this. It is not the appropriate trade off in frankly most cases of computing (even if it is arguably more interesting)
You're assuming that the work can be split into "smallish chunks", instead of a single large CPU-bound computation. Sure, "injecting a stop package in its stream of work" is the best design, but only if you do have a stream of work in the first place. Otherwise, it's either signals, or constantly checking a shared "stop" variable in the middle of a very hot CPU-bound loop.
Yes, but how do you limit the size of those chunks?
Perhaps only do one math operation?
Calculating 2^4294967296 is one operation, but it can take minutes (hours?) on my bc. Do you want it to take that long, not responding to SIGINT the entire time?
Probably not.
Smaller chunks? Then you have to check for signal in a loop, which causes its own problems.
I used to do that, and as I told boffinAudio, that caused problems with state being invalid and such.
It's a userspace RISC-V vector extension emulator that handles SIGILL signals to emulate the missing vector instructions on RISC-V CPUs without the vector extension.
It can just be inserted in to any binary with LD_PRELOAD.
I used the Intel Software Development Emulator (SDE) about a decade ago. It lets you develop and test software on a CPU that doesn't have the necessary vector extensions (AVX, etc.). I presume it uses the SIGILL trapping technique. In practice, I found it to be about 1000× slower than native - but that's why it's intended to be a development tool. https://www.intel.com/content/www/us/en/developer/articles/t...
I know this has been said before, but I absolutely love Julia's style of writing. It's humble and accessible, and full of great humor that is always good natured, as opposed to the more common snarky or negative humor:
> Signals are a way for Unix processes to communicate! Except for SIGKILL. When you get sent SIGKILL nobody communicates with you, you just die immediately.
It's refreshingly earnest and twee, and always technical and to the point (as opposed to other vacuous and rambling bloggers).
Also, it's reassuring to see someone say "I don't know this"/"I'm not sure", instead of trying to pass as an authority on the subject. We all learn together.
SIGSTOP is also not communication, if you consider it a thing in its own right, since it can't be ignored and your process just pauses.
Really though it is a pair with SIGCONT which is communication: SIGSTOP happens (which your process doesn't even see) then later SIGCONT happens which you do get notified about (“hey, process, you were just paused for a bit, you might want to reassess your surroundings”). This is little different from the scheduler not giving your process any time for a while, what difference there is being the while could be a lot longer and you get explicitly told that it has happened.
The other possibility is that your process never wakes up because a SIGKILL happens first. Or maybe SIGTERM, I'm not sure whether or not a SIGCONT will be sent first in that instance.
There is also the friendlier SIGTSTP, which can be handled (or even ignored) but is otherwise identical to SIGSTOP. This is what you get sent if a user hits ctrl+z in most interactive shells (or a more polite process than one that just sends SIGSTOP wants you to pause).
> Signals are a way for Unix processes to communicate!
Not really. They are primarily a mechanism for processes to be interrupted by their "controlling terminal", inherited from the shell (the session leader). User types something special on the TTY, the kernel translates it to SIGxxxx and the foreground process is interrupted or killed, possibly returning control to the shell.
"In v1 there was a separate system call to catch each of interrupt, quit, and two kinds of machine traps. Floating point hardware (v3) brought another and no end was in sight. To stop the proliferation, all traps were subsumed under a single system call, signal (v4). The various traps were given numbers, by which they were known until symbolic names were assigned in v7. We have not been fully weaned yet: the numbers are still needed in the shell trap command."
Regarding communications it adds:
"Never, however, was the basically unstructured signal-kill mechanism regarded as a significant means of interprocess communication. The research systems therefore declined to adopt the more reliable, but also more complex, Berkeley signals."
Signals neither have to originate from a controlling shell nor relate to job control.
And even if your point was correct, that still wouldn’t mean that signals aren’t a way for processes to communicate. Job control relies upon IPC to work and signals are just one form of IPC.
I've probably learned more about unix signals from using them on a running machine than I did when I had to write c code using them.
Both nowadays, when I have to send a SIGHUP to GNOME to restart it because some extension has a memory leak and it locks up (really wish the Wayland version had that functionality), and quite a while back, when I'd often SIGSTOP Firefox so my crummy Core 2 Duo machine could play videos without skipping frames.
In the past I have used a non-blocking pre-allocated ring buffer to hold signal information, so that the sig handler is not held for too long. Using sem_post, it wakes up a separate thread (waiting using sem_wait). This has been quite pain-free, but not entirely sure how to smoke out bugs in the system.
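Roughly this shape, if it helps anyone picture it (one handler as producer, one consumer thread; sem_post() is on the async-signal-safe list, the ring is preallocated and the handler never blocks):

    #include <pthread.h>
    #include <semaphore.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    #define RING_SIZE 64                        /* preallocated, never grows */

    static siginfo_t ring[RING_SIZE];
    static volatile sig_atomic_t head = 0;      /* written only by the handler */
    static volatile sig_atomic_t tail = 0;      /* written only by the worker  */
    static sem_t pending;

    static void on_signal(int sig, siginfo_t *info, void *ctx) {
        (void)sig; (void)ctx;
        int next = (head + 1) % RING_SIZE;
        if (next == tail)
            return;                             /* ring full: drop, never block */
        ring[head] = *info;
        head = next;
        sem_post(&pending);                     /* async-signal-safe wakeup */
    }

    static void *worker(void *arg) {
        (void)arg;
        for (;;) {
            sem_wait(&pending);
            siginfo_t info = ring[tail];
            tail = (tail + 1) % RING_SIZE;
            printf("signal %d from pid %d\n", info.si_signo, (int)info.si_pid);
        }
        return NULL;
    }

    int main(void) {
        sem_init(&pending, 0, 0);
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);

        struct sigaction sa = {0};
        sa.sa_sigaction = on_signal;
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGUSR1, &sa, NULL);

        for (;;)
            pause();
    }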
As someone who once in a while dabbles in Linux systems programming, is there any reason to use signalfd over timerfd when all you want is a timer? The article suggests you can use signals as timers, which left me wondering...
Signals scare the hell out of me - that was the first point in learning about UNIX programming that made me question everything I'd learned about what is available to you as a function implementor/UNIX user in user space. It got even worse when I moved to threads and suddenly I couldn't reconcile threads with signals (which are very process-oriented).
Reentrant code gives me the same willies- or more correctly, it was a big surprise that people wrote code that depended on static values to store state between invocations.
If you want portability, keep in mind that the man pages specify that you should use sigaction instead of signal when using a function as a signal handler rather than SIG_IGN or SIG_DFL. [0]
i built a robot with two processes. one process is the brain, and all the brain does is wait for SIGUSR1. when it stops getting SIGUSR1 for like 2 tenths of a second in a row, it "ceases all motor function".
the other process, which takes input and game controller stuff and sends camera feeds and turns on lights and measures temperature and blah blah blah, it continuously sends a SIGUSR1 to the brain process. it's like a dead man's switch for a robot. doesn't matter what goes wrong, basically when it stops sending SIGUSR1 then it means bad things have happened and the robot stops itself. now what if it gets stuck in a loop continuously sending SIGUSR1 erroneously? yes, that could be a problem, but it's never happened. unlike before i implemented this where, say, interference with the wifi or dead batteries or whatever could leave the robot running away in 'move forward mode' forever.
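for the curious, sigtimedwait() with a 200ms timeout maps almost one-to-one onto that dead man's switch (just a sketch; stop_motors() is made up):

    #include <signal.h>
    #include <stdio.h>
    #include <time.h>

    int main(void) {
        sigset_t set;
        sigemptyset(&set);
        sigaddset(&set, SIGUSR1);
        sigprocmask(SIG_BLOCK, &set, NULL);   /* wait for it, don't handle it */

        const struct timespec window = { .tv_sec = 0, .tv_nsec = 200 * 1000 * 1000 };
        for (;;) {
            if (sigtimedwait(&set, NULL, &window) < 0) {
                /* no SIGUSR1 within 200ms (or EINTR): the switch trips */
                fprintf(stderr, "heartbeat lost, ceasing all motor function\n");
                /* stop_motors();  -- hypothetical */
                break;
            }
            /* got a heartbeat; keep driving */
        }
        return 0;
    }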
I'm surprised that SIGPIPE is not mentioned (and barely present in comments). It's 90% of why I'm scared of signals and every time I write something mixing IO and threads it eventually rears its disgusting head. Can someone shine a light on what the heck it's designed to do?
The earlier part tries to write, gets `0` or `-1` with an error, concludes the pipe is closed, and exits? The doc for `write` even specifies the error code `EPIPE` that is set alongside with the `SIGPIPE`. Why the signal, when it's redundant?
I think the description of SIGWINCH is subtly wrong. AIUI, it’s sent by the terminal program to the child process (e.g. the shell). It is true if by “terminal program” you mean “program that runs in a terminal” (and arguably there are things which are both, e.g. tmux or emacs)
Should you be scared of Unix signals? Yes, so turn them all into self-pipes and handle them in an async I/O event loop if at all possible, and stop thinking about blocking them.