from an approach that loaded the runners only once at the startup
to a method that will be repeated i.e. if the Nomad Event Stream connection interrupts.
that are rescheduled while the previous allocation was still pending.
We fix this by removing the race condition handling that should prevent Poseidon from throwing warnings of unexpected allocation stopping.
In today's unattended upgrade, we have seen how the prewarming pool size dropped to (near) zero. This was based on lost Nomad allocations. The allocations got rescheduled, but not added again to Poseidon.
The reason for this is a miscommunication between the Event Handling and the Nomad Manager. `removedByPoseidon` was true even if the runner was not removed by the manager, but an idle runner.
that was triggered by simultaneous deletion of the runner due to inactivity, and the allocation being rescheduled due to a lost node.
It led to the allocation first being rescheduled, and then being stopped. This caused an unexpected stopping of a pending runner on a lower level.
To fix it we added communication from the upper level that the stop of the job was expected.
In the context of #358 we identified that the event with the type `AllocationUpdated` and the client status `pending` is common but not always send by Nomad.
With this Commit we remove the condition that limits the evaluated Nomad events to the event with the type `AllocationUpdated`. Without the condition the event of the type `PlanResult` and the status `pending` will be evaluated equally. By now, this event seems to be sent every time.
This restriction led to started allocation not being registered when the `AllocationUpdated` event with client status `pending` was missing.
- Abstracting from the exec form while generating.
- Removal of single quotes (usage of only double-quotes).
- Bash-nesting using escaping of special characters.