Send SIGQUIT when cancelling an execution

When the context passed to Nomad Allocation Exec is cancelled, the
process is not terminated. Instead, just the WebSocket connection is
closed. In order to terminate long-running processes, a special
character is injected into the standard input stream. This character is
parsed by the tty line discipline (tty has to be true). The line
discipline sends a SIGQUIT signal to the process, terminating it and
producing a core dump (in a file called 'core'). The SIGQUIT signal can
be caught but isn't by default, which is why the runner is destroyed if
the program does not terminate during a grace period after the signal
was sent.
This commit is contained in:
Konrad Hanff
2021-07-21 09:12:44 +02:00
parent 91537a7364
commit 8d24bda61a
10 changed files with 237 additions and 63 deletions

View File

@ -8,7 +8,7 @@ import (
"github.com/hashicorp/nomad/nomad/structs"
"gitlab.hpi.de/codeocean/codemoon/poseidon/internal/config"
"gitlab.hpi.de/codeocean/codemoon/poseidon/pkg/logging"
"gitlab.hpi.de/codeocean/codemoon/poseidon/pkg/nullreader"
"gitlab.hpi.de/codeocean/codemoon/poseidon/pkg/nullio"
"io"
"net/url"
"strconv"
@ -348,7 +348,7 @@ func (a *APIClient) executeCommandInteractivelyWithStderr(allocationID string, c
go func() {
// Catch stderr in separate execution.
exit, err := a.Execute(allocationID, ctx, stderrFifoCommand(currentNanoTime), true,
nullreader.NullReader{}, stderr, io.Discard)
nullio.Reader{}, stderr, io.Discard)
if err != nil {
log.WithError(err).WithField("runner", allocationID).Warn("Stderr task finished with error")
}