for internal decisions as this error is strongly used by other packages. By checking such wrapped errors the internal decision can be influenced accidentally.
In this case the retry mechanism checked if the error is context.DeadlineExceeded and assumed it would be created by the internal context. This assumption was wrong.
from an approach that loaded the runners only once at the startup
to a method that will be repeated i.e. if the Nomad Event Stream connection interrupts.
Before the List function dropped all idleRunners of all environments when fetch was set.
Additionally, the replaced environment was not destroyed properly so that a goroutine for it and for all its idle runners remained running.
that was caused by creating an intermediate environment `fetchedEnvironment` when fetching the environments but not removing it in case that we just copy its configuration to the existing environment.
into the utils including all other retry mechanisms.
With this change we fix that the WatchEventStream goroutine does not stop directly when the context is done (but previously only one second after).
as second criteria (next to the maximum number of attempts) for canceling the retrying. This is required as we started with the previous commit to retry the nomad environment recovery. This always fails for unit tests (as they are not connected to an Nomad cluster). Before, we ignored the one error but the retrying leads to unit test timeouts.
Additionally, we now stop retrying to create a runner when the environment got deleted.