Problem
One of the projects I’m working on is responsible for coordinating and running multiple processes. It’s a simple real-time web app that allows starting, stopping, pausing and terminating processes.
Since I’m developing and testing on OSX while the service runs on Linux, I came across a surprising difference in how those platforms handle particular signals.
Description
A common scenario for running a process looks like this:
- Run a process to do some work
- Pause it (with SIGSTOP)
- Terminate the paused process (with SIGTERM)
I’ve observed 2 different behaviours:
- OSX: works as expected
- Linux: never terminates
Here’s the Go app to illustrate the problem:
package main // sigtest.go

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

func main() {
	// NOTE: err handling omitted for brevity

	// 1. start a process to do some work
	cmd := exec.Command("bash", "-c", "sleep 10000")
	cmd.Start()
	<-time.After(100 * time.Millisecond)

	// 2. "pause" the process
	cmd.Process.Signal(syscall.SIGSTOP)
	<-time.After(100 * time.Millisecond)

	// 3. terminate the process
	cmd.Process.Signal(syscall.SIGTERM)

	var (
		errc    = make(chan error)
		slow    = time.After(2000 * time.Millisecond)
		timeout = time.After(5000 * time.Millisecond)
	)

	// wait for the process to terminate
	go func() { errc <- cmd.Wait() }()

retry:
	select {
	case err := <-errc:
		fmt.Println(err)
		fmt.Println("Done")
	case <-slow:
		fmt.Println("Taking longer than it should...")
		slow = nil // a nil channel blocks forever, so this case fires only once
		goto retry
	case <-timeout:
		fmt.Println("Timeout")
	}
}
Build the program for both platforms:
GOOS=darwin GOARCH=amd64 go build -o sigtest_darwin sigtest.go
GOOS=linux GOARCH=amd64 go build -o sigtest_linux sigtest.go
Running on Linux produces:
$ ./sigtest_linux
Taking longer than it should...
Timeout
Works as expected on OSX:
$ ./sigtest_darwin
signal: terminated
Done
I’ve not found the official docs/explanation yet, but my assumption is that SIGTERM is a signal that must be handled by the process itself, yet on Linux the process is unable to do so after being SIGSTOP-ed.
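One way to check this assumption on Linux is to look at the stopped process in /proc/<pid>/status, which reports the process state and the pending-signal masks (the State, SigPnd and ShdPnd fields). The sketch below is only an illustration: statusFields and hasSignal are hypothetical helpers I made up for this post, not part of any library. If SIGTERM (signal 15) has been queued but not delivered, I’d expect its bit to be set in one of the pending masks.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
	"syscall"
)

// statusFields (hypothetical helper) extracts the requested "Key: value"
// fields from the text of a /proc/<pid>/status file.
func statusFields(status string, keys ...string) map[string]string {
	want := make(map[string]bool, len(keys))
	for _, k := range keys {
		want[k] = true
	}
	out := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if ok && want[key] {
			out[key] = strings.TrimSpace(val)
		}
	}
	return out
}

// hasSignal (hypothetical helper) reports whether signal number sig is set
// in a hexadecimal signal mask; signal N is stored in bit N-1.
func hasSignal(mask string, sig int) bool {
	v, err := strconv.ParseUint(mask, 16, 64)
	if err != nil || sig < 1 || sig > 64 {
		return false
	}
	return v&(uint64(1)<<uint(sig-1)) != 0
}

func main() {
	pid := "self" // inspect this process when no PID argument is given
	if len(os.Args) > 1 {
		pid = os.Args[1]
	}
	data, err := os.ReadFile("/proc/" + pid + "/status")
	if err != nil {
		fmt.Println("cannot read status (Linux only):", err)
		return
	}
	f := statusFields(string(data), "State", "SigPnd", "ShdPnd")
	term := int(syscall.SIGTERM)
	pending := hasSignal(f["SigPnd"], term) || hasSignal(f["ShdPnd"], term)
	fmt.Printf("state=%q SIGTERM pending=%v\n", f["State"], pending)
}
```

To try it, run the sigtest program on Linux and pass the PID of the paused child as the only argument while the parent is still waiting.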
Workarounds
- send SIGCONT right before SIGTERM for a SIGSTOP-ed process on Linux
- send SIGKILL instead of SIGTERM, but it removes the opportunity for the process to shut down gracefully
PS: Let me know if you have more info on this.
Thanks!