72 Commits

Author SHA1 Message Date
89fc7b2637 Fix Nomad event stream ignoring errors
once an event stream has been established.
2022-09-07 21:16:20 +02:00
c6e65c14bb Monitor Nomad allocation startup duration. 2022-07-31 19:42:35 +02:00
1239699e74 Add a warning when allocations fail (#83)
* Log a warning when an allocation fails

* Restructure allocation event handling
2021-12-23 13:10:55 +01:00
c22b76720c Add documentation for guarding the Nomad tasks 2021-12-22 17:30:16 +01:00
251129aa74 Modify filter for runners that should be deleted
Now only "dead" jobs are excluded from the deletion requests. Previously, pending and starting runners were ignored as well.
2021-12-22 17:30:16 +01:00
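The filter described in the commit above could look roughly like the following sketch. The `job` type, the function name, and the status strings are assumptions for illustration (the strings are modelled on Nomad's job statuses); this is not the project's actual code.

```go
package main

// Assumed job status values, modelled on Nomad's "pending", "running", "dead".
const (
	statusPending = "pending"
	statusRunning = "running"
	statusDead    = "dead"
)

// job is a hypothetical, minimal view of a Nomad job.
type job struct {
	ID     string
	Status string
}

// jobsToDelete returns the IDs of all jobs that are not already dead.
// Pending and running jobs are included, so they are no longer ignored.
func jobsToDelete(jobs []job) []string {
	ids := make([]string, 0, len(jobs))
	for _, j := range jobs {
		if j.Status == statusDead {
			continue // a dead job does not need another delete request
		}
		ids = append(ids, j.ID)
	}
	return ids
}
```

Excluding one status instead of listing the accepted ones means that a job in any other state is still cleaned up.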
d57a0c07b8 Implement review suggestions 2021-12-22 17:30:16 +01:00
9f0b04660f Fix goroutine leak in the nullio reader 2021-12-14 13:24:53 +01:00
9cd81930e9 Add API Querier test 2021-12-10 11:30:56 +01:00
ebbbfdb9be Unwrap Nomad error for allocation exec
* This will allow us to inspect whether the websocket connection was closed normally
2021-12-10 10:01:31 +01:00
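A minimal sketch of why unwrapping matters here, using only the standard library. `closeError` is a hypothetical stand-in for the WebSocket library's close error type, and `wrapExecError` is an assumed wrapper; only the status code 1000 (normal closure, RFC 6455) is taken as given.

```go
package main

import (
	"errors"
	"fmt"
)

// closeError is a hypothetical stand-in for a WebSocket close error.
type closeError struct{ Code int }

func (e *closeError) Error() string { return fmt.Sprintf("websocket: close %d", e.Code) }

// closeNormal is the "normal closure" status code defined by RFC 6455.
const closeNormal = 1000

// wrapExecError adds context but keeps the cause inspectable by using %w.
func wrapExecError(cause error) error {
	return fmt.Errorf("error executing command in allocation: %w", cause)
}

// isNormalClose reports whether err, or any error it wraps, is a closeError
// carrying the normal closure code.
func isNormalClose(err error) bool {
	var ce *closeError
	return errors.As(err, &ce) && ce.Code == closeNormal
}
```

If the wrapper formatted the cause with `%v` instead of `%w`, `errors.As` could not reach the close error, and a normal close would look like any other failure.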
dce895faff Move the error handler to the api querier
to catch the ws normal close error for all Execute requests
2021-12-09 19:12:20 +01:00
825ebdd3e6 Add forcePull option
* Add forcePull option
for pulling the image when the execution environment gets updated

* Apply suggestions from code review

Co-authored-by: Sebastian Serth <MrSerth@users.noreply.github.com>

* Add unit tests

* Clean up and implement option two

Co-authored-by: Sebastian Serth <MrSerth@users.noreply.github.com>
2021-12-09 14:54:14 +01:00
af939b7810 Catch the "Close normal" error 2021-12-09 13:05:18 +01:00
ac6ce56c38 Remove flaky test case 2021-11-10 13:11:38 +01:00
fff67246d6 Infinite busy waiting for lost event (#31)
* Close evaluation stream for Nomad Job creation
 when the set event handlers have finished

* Remove evaluation event stream requests
by handling the events via the main Nomad event handler.
2021-11-10 09:57:40 +01:00
4db1ceb41e Fix bug in the runner recovery
where the runners of environment 10 were also recovered for environment 1.
2021-10-22 16:24:55 +02:00
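One plausible way such a bug arises is matching runner IDs by string prefix. The sketch below assumes runner IDs of the form `<environmentID>-<suffix>`; that format and both function names are assumptions for illustration, and the commit itself does not state the cause.

```go
package main

import (
	"strconv"
	"strings"
)

// naivePrefixMatch shows the faulty approach: "10-abc" starts with "1",
// so a runner of environment 10 also matches environment 1.
func naivePrefixMatch(runnerID string, environmentID int) bool {
	return strings.HasPrefix(runnerID, strconv.Itoa(environmentID))
}

// belongsToEnvironment parses the part before the first "-" and compares
// it as a number, so only the exact environment ID matches.
func belongsToEnvironment(runnerID string, environmentID int) bool {
	prefix, _, found := strings.Cut(runnerID, "-")
	if !found {
		return false
	}
	id, err := strconv.Atoi(prefix)
	return err == nil && id == environmentID
}
```

With these definitions, `naivePrefixMatch("10-abc", 1)` is true, while `belongsToEnvironment("10-abc", 1)` is false.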
34d4bb7ea0 Implement routes to list, get and delete execution environments
* #9 Implement routes to list, get and delete execution environments.
A refactoring was required to introduce the ExecutionEnvironment interface.

* Fix MR comments, linting issues and bug that led to e2e test failure

* Add e2e tests

* Add unit tests
2021-10-21 10:33:52 +02:00
9b106f4cd8 Fix linting issues
An update of golangci-lint yielded new linting issues. This commit
fixes them.
2021-08-05 13:40:48 +02:00
c8c5357b8c Rename module for GitHub 2021-07-30 16:43:05 +02:00
6a60b6cd89 Add config option to enable (m)TLS between Poseidon and Nomad 2021-07-29 09:43:21 +00:00
8d24bda61a Send SIGQUIT when cancelling an execution
When the context passed to Nomad Allocation Exec is cancelled, the
process is not terminated. Instead, just the WebSocket connection is
closed. In order to terminate long-running processes, a special
character is injected into the standard input stream. This character is
parsed by the tty line discipline (tty has to be true). The line
discipline sends a SIGQUIT signal to the process, terminating it and
producing a core dump (in a file called 'core'). The SIGQUIT signal can
be caught but isn't by default, which is why the runner is destroyed if
the program does not terminate during a grace period after the signal
was sent.
2021-07-29 10:28:47 +02:00
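The mechanism described in the commit above can be sketched as follows. This is not the project's implementation: the function names are hypothetical, and the sketch assumes the default terminal settings, where the quit character is 0x1C (Ctrl+\). It only has an effect when the remote process is attached to a tty.

```go
package main

import (
	"bytes"
	"context"
	"io"
	"time"
)

// quitCharacter is ASCII 0x1C (Ctrl+\), the default VQUIT character of the
// tty line discipline.
const quitCharacter = 0x1C

// injectQuitOnCancel blocks until ctx is cancelled and then writes the quit
// character to stdin. With a tty attached, the line discipline translates
// it into SIGQUIT for the foreground process.
func injectQuitOnCancel(ctx context.Context, stdin io.Writer) error {
	<-ctx.Done()
	_, err := stdin.Write([]byte{quitCharacter})
	return err
}

// terminatedWithinGrace reports whether exited is closed before the grace
// period elapses. On false, the caller would destroy the runner.
func terminatedWithinGrace(exited <-chan struct{}, grace time.Duration) bool {
	select {
	case <-exited:
		return true
	case <-time.After(grace):
		return false
	}
}

// cancelledInput is a demonstration helper: it returns the bytes that reach
// an in-memory stdin after the context has been cancelled.
func cancelledInput() ([]byte, error) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	var stdin bytes.Buffer
	err := injectQuitOnCancel(ctx, &stdin)
	return stdin.Bytes(), err
}
```

A caller would typically run `injectQuitOnCancel` in its own goroutine next to the exec request and call `terminatedWithinGrace` afterwards to decide whether the runner has to be destroyed.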
3aa1227db6 Use authentication token from config for communication with Nomad 2021-07-27 11:35:55 +00:00
8b26ecbe5f Restructure project
We previously didn't really have any structure in our project apart
from creating a new folder for each package in our project root.
Now that we have accumulated some packages, we use the well-known
Golang project layout in order to clearly communicate our intent
with packages. See https://github.com/golang-standards/project-layout
2021-07-21 12:55:35 +02:00