6a1677dea0
Introduce reason for destroying runner
in order to return a specific error for OOM-killed executions.
2023-07-21 15:30:21 +02:00
bfb5977d24
Destroy runner on allocation stopped
Destroying the runner when Nomad informs us that its allocation has been stopped fixes the error of executions running into their timeout even though the allocation was stopped long ago.
2023-07-21 15:30:21 +02:00
527aaf713f
Fix decreased prewarming pool due to inactivity timer.
When allocations fail and restart, they are added to the idle runners again. This commit fixes the bug that the inactivity timer was not stopped on restart, which led to the idle runner being removed when the timer expired.
2023-06-16 17:27:45 +01:00
b620d0fad7
Introduce Allocation State Tracking
in order to break down the current state and evaluate whether it is invalid.
2023-06-13 14:20:20 +02:00
e0db1bafe8
Fix use of one runner by multiple users
A previously unknown Nomad reload adds already known runners to the idle runners again, even if they are already in use.
2023-03-31 14:42:55 +02:00
0c6c48c3cf
#190 Add unit tests for runner recovery.
2022-11-26 13:33:44 +00:00
160df3d9e6
Add retry-mechanism for sample, mark-as-used and return
of Nomad runners.
2022-10-24 22:12:09 +01:00
c6e65c14bb
Monitor Nomad allocation startup duration.
2022-07-31 19:42:35 +02:00
498e8f5ff5
#110 Refactor influxdb monitoring
to use it as a singleton.
This makes it possible to monitor processes that are independent of an incoming request.
2022-07-01 15:29:31 +02:00
34040162c2
#89 Generalise the three Storage interfaces and structs into one generic storage manager.
2022-06-29 16:21:19 +02:00
0ef5a4e39f
Make Execution Environment interface Nomad-independent
2022-02-28 14:54:40 +01:00
ba43f667c2
Add architecture for multiple managers
using the chain of responsibility pattern.
2022-02-28 14:54:40 +01:00