poseidon

Author	SHA1	Message	Date
Maximilian Paß	354c16cc37	Fix missing rescheduled idle runners. During today's unattended upgrade, the prewarming pool size dropped to (near) zero. This was caused by lost Nomad allocations. The allocations got rescheduled, but were not added to Poseidon again. The reason is a miscommunication between the Event Handling and the Nomad Manager: `removedByPoseidon` was true even when the runner had not been removed by the manager but was an idle runner.	2023-09-05 15:15:39 +02:00
Maximilian Paß	13a9da95e5	Introduce a context for RetryExponential as a second criterion (next to the maximum number of attempts) for canceling the retrying. This is required because the previous commit started retrying the Nomad environment recovery. The recovery always fails in unit tests (as they are not connected to a Nomad cluster). Previously, we ignored this single error, but retrying it leads to unit test timeouts. Additionally, we now stop retrying to create a runner when the environment got deleted.	2023-08-18 09:28:23 +02:00
Maximilian Paß	73759f8a3c	Retry Environment Recovery	2023-08-18 09:28:23 +02:00
Maximilian Paß	e7df777db4	Always log Runner and Environment ID. Systematically log the runner id and the environment id by adding the information in the findRunnerMiddleware.	2023-07-15 21:46:56 +02:00
Maximilian Paß	f7339570ae	Fix increased prewarming pool size by checking the number of required runners before creating an additional runner.	2023-05-28 23:47:07 +01:00
Maximilian Paß	160df3d9e6	Add retry mechanism for sample, mark-as-used and return of Nomad runners.	2022-10-24 22:12:09 +01:00
Maximilian Paß	7119f3e012	Fix not canceling monitoring events for removed environments and runners.	2022-10-24 13:15:14 +02:00
Maximilian Paß	5d54b0f786	Fix wrong environment id in monitoring data for created or updated environments.	2022-10-24 13:15:14 +02:00
Sebastian Serth	d372e37d1a	Add cni/secure-bridge to isolate host network	2022-09-18 19:02:04 +02:00
Maximilian Paß	1eef26cc83	Add environment id to periodical monitoring events.	2022-08-20 09:17:43 +02:00
Maximilian Paß	5590c50e14	#110 Add periodical monitoring events.	2022-08-19 20:48:46 +02:00
Maximilian Paß	18daa1152c	Save the environment id for runner monitoring.	2022-07-31 19:42:35 +02:00
Maximilian Paß	498e8f5ff5	#110 Refactor influxdb monitoring to use it as a singleton. This makes it possible to monitor processes that are independent of an incoming request.	2022-07-01 15:29:31 +02:00
Maximilian Paß	34040162c2	#89 Generalise the three Storage interfaces and structs into one generic storage manager.	2022-06-29 16:21:19 +02:00
Maximilian Paß	a41659eed4	Enable memory oversubscription (#102). Fix and add e2e test.	2022-03-18 08:31:27 +01:00
Maximilian Paß	2cf890ab91	Implement review comments	2022-02-28 14:54:40 +01:00
Maximilian Paß	6123d20525	Implement core functionality of AWS integration	2022-02-28 14:54:40 +01:00
Maximilian Paß	dd41e0d5c4	Generate structures for an AWS environment and runner	2022-02-28 14:54:40 +01:00

18 Commits