eb818f92f7
Refactor Runner Destroy Reason Masking
...
and ignore expected reasons, such as when the runner was destroyed by an API request.
2023-07-24 11:48:14 +01:00
6a1677dea0
Introduce reason for destroying runner
...
in order to return a specific error for OOM Killed Executions.
2023-07-21 15:30:21 +02:00
bfb5977d24
Destroy runner on allocation stopped
...
Destroying the runner when Nomad informs us that its allocation was stopped fixes the error of executions running into their timeout even though the allocation was stopped long ago.
2023-07-21 15:30:21 +02:00
e7df777db4
Always log Runner and Environment ID.
...
Systematically log the runner ID and the environment ID by adding the information in the findRunnerMiddleware.
2023-07-15 21:46:56 +02:00
527aaf713f
Fix decreased prewarming pool due to inactivity timer.
...
When allocations fail and restart, they are added to the idle runners again. The bug fixed with this commit is that the inactivity timer was not stopped on restart. This led to the idle runner being removed when the timer expired.
2023-06-16 17:27:45 +01:00
f031219cb8
Fix Nomad event race condition
...
that was triggered by simultaneous deletion of the runner due to inactivity, and the allocation being rescheduled due to a lost node.
It led to the allocation first being rescheduled, and then being stopped. This caused an unexpected stopping of a pending runner on a lower level.
To fix it, we added communication from the upper level that the stop of the job was expected.
2023-06-13 14:20:20 +02:00
b620d0fad7
Introduce Allocation State Tracking
...
in order to break down the current state and evaluate if it is invalid.
2023-06-13 14:20:20 +02:00
8f89c14ea1
Cleanup logs for Allocation recovery
...
on startup. The changes do not have functional consequences as adding the allocation just overwrites the old one.
2023-05-10 18:56:51 +01:00
0c8fa9ccfa
Add context to log statements.
2023-04-11 20:45:30 +01:00
038d71ff51
Nomad: Handle Container re-allocation
2023-03-31 14:42:55 +02:00
e0db1bafe8
Fix multiple user Runner use
...
A previously unknown Nomad reload adds already known runners to the idle runners again, even if they are already in use.
2023-03-31 14:42:55 +02:00
a78ee22e67
Reduce the race window between the delete and listFileSystem routes.
2023-01-02 11:23:02 +01:00
160df3d9e6
Add retry-mechanism for sample, mark-as-used and return
...
of Nomad runners.
2022-10-24 22:12:09 +01:00
9677253b35
Change Influx field name for the startup duration
...
due to a type mismatch that cannot currently be resolved.
2022-08-10 20:46:17 +02:00
89e15c5c2f
Fix startup time format
...
Previously it was a string. To use it efficiently, we want it to be a number, in this case in nanoseconds.
2022-08-05 21:16:58 +02:00
c6e65c14bb
Monitor Nomad allocation startup duration.
2022-07-31 19:42:35 +02:00
34040162c2
#89 Generalise the three Storage interfaces and structs into one generic storage manager.
2022-06-29 16:21:19 +02:00
b7a20e3114
Introduce method "Environment" to the Runners interface.
...
This way we can determine which environment a runner belongs to.
2022-04-18 13:17:49 +02:00
136f596dc2
Add AWS environments to the statistics
...
but only with the field usedRunners.
2022-04-09 16:35:53 +02:00
6123d20525
Implement core functionality of AWS integration
2022-02-28 14:54:40 +01:00
dd41e0d5c4
Generate structures for an AWS environment and runner
2022-02-28 14:54:40 +01:00
0ef5a4e39f
Make Execution Environment interface Nomad independent
2022-02-28 14:54:40 +01:00
ba43f667c2
Add architecture for multiple managers
...
using the chain of responsibility pattern.
2022-02-28 14:54:40 +01:00