2713e8672c
Add error for empty list file system execution.
...
Normally, the result of executing the `lsCommand` should never be empty. However, we have observed that CodeOcean sometimes receives an empty JSON result if the runner is being deleted while the list file system request is processed. Therefore, we check whether anything has been written to CodeOcean and report an error otherwise.
2023-10-29 15:23:40 +01:00
e4769ee1b3
Bump github.com/google/uuid from 1.3.1 to 1.4.0
...
Bumps [github.com/google/uuid](https://github.com/google/uuid) from 1.3.1 to 1.4.0.
- [Release notes](https://github.com/google/uuid/releases)
- [Changelog](https://github.com/google/uuid/blob/master/CHANGELOG.md)
- [Commits](https://github.com/google/uuid/compare/v1.3.1...v1.4.0)
---
updated-dependencies:
- dependency-name: github.com/google/uuid
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-10-27 09:45:21 +02:00
a024183802
Bump google.golang.org/grpc from 1.58.2 to 1.58.3
...
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.58.2 to 1.58.3.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.58.2...v1.58.3)
---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-10-26 15:15:23 +02:00
14b012486d
Formalize Memory Monitoring
...
by extracting the interval and threshold into configuration options.
Related to f670b07e.
2023-10-12 16:16:46 +02:00
ca42369057
Bump golang.org/x/net from 0.15.0 to 0.17.0
...
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.15.0 to 0.17.0.
- [Commits](https://github.com/golang/net/compare/v0.15.0...v0.17.0)
---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-10-12 08:26:38 +02:00
3d87cfceeb
Bump github.com/getsentry/sentry-go from 0.24.1 to 0.25.0
...
Bumps [github.com/getsentry/sentry-go](https://github.com/getsentry/sentry-go) from 0.24.1 to 0.25.0.
- [Release notes](https://github.com/getsentry/sentry-go/releases)
- [Changelog](https://github.com/getsentry/sentry-go/blob/master/CHANGELOG.md)
- [Commits](https://github.com/getsentry/sentry-go/compare/v0.24.1...v0.25.0)
---
updated-dependencies:
- dependency-name: github.com/getsentry/sentry-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-10-06 13:26:43 +02:00
bd6ca6f88f
Bump org.apache.maven.plugins:maven-shade-plugin
...
Bumps [org.apache.maven.plugins:maven-shade-plugin](https://github.com/apache/maven-shade-plugin) from 3.5.0 to 3.5.1.
- [Release notes](https://github.com/apache/maven-shade-plugin/releases)
- [Commits](https://github.com/apache/maven-shade-plugin/compare/maven-shade-plugin-3.5.0...maven-shade-plugin-3.5.1)
---
updated-dependencies:
- dependency-name: org.apache.maven.plugins:maven-shade-plugin
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-10-01 13:37:24 +00:00
da63a8d119
Bump com.amazonaws:aws-java-sdk-apigatewaymanagementapi
...
Bumps [com.amazonaws:aws-java-sdk-apigatewaymanagementapi](https://github.com/aws/aws-sdk-java) from 1.12.555 to 1.12.560.
- [Changelog](https://github.com/aws/aws-sdk-java/blob/master/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-java/compare/1.12.555...1.12.560)
---
updated-dependencies:
- dependency-name: com.amazonaws:aws-java-sdk-apigatewaymanagementapi
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-10-01 13:27:54 +00:00
89a7ca8d13
Update Dependencies
2023-09-23 16:26:50 +02:00
83bc0115e4
Bump docker/metadata-action from 3 to 5
...
Bumps [docker/metadata-action](https://github.com/docker/metadata-action) from 3 to 5.
- [Release notes](https://github.com/docker/metadata-action/releases)
- [Upgrade guide](https://github.com/docker/metadata-action/blob/master/UPGRADE.md)
- [Commits](https://github.com/docker/metadata-action/compare/v3...v5)
---
updated-dependencies:
- dependency-name: docker/metadata-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 10:16:29 +02:00
110e2318e1
Bump tj-actions/branch-names from 6 to 7
...
Bumps [tj-actions/branch-names](https://github.com/tj-actions/branch-names) from 6 to 7.
- [Release notes](https://github.com/tj-actions/branch-names/releases)
- [Changelog](https://github.com/tj-actions/branch-names/blob/main/HISTORY.md)
- [Commits](https://github.com/tj-actions/branch-names/compare/v6...v7)
---
updated-dependencies:
- dependency-name: tj-actions/branch-names
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 10:16:19 +02:00
0058bb7dbf
Bump docker/build-push-action from 3 to 5
...
Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 3 to 5.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](https://github.com/docker/build-push-action/compare/v3...v5)
---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 10:16:09 +02:00
cadbfcf5ca
Bump docker/login-action from 2 to 3
...
Bumps [docker/login-action](https://github.com/docker/login-action) from 2 to 3.
- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](https://github.com/docker/login-action/compare/v2...v3)
---
updated-dependencies:
- dependency-name: docker/login-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 10:14:01 +02:00
4a93238a15
Bump actions/checkout from 3 to 4
...
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)
---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 10:13:46 +02:00
dfad36dfde
Bump com.amazonaws:aws-lambda-java-events in /deploy/aws/java11Exec
...
Bumps [com.amazonaws:aws-lambda-java-events](https://github.com/aws/aws-lambda-java-libs) from 3.11.2 to 3.11.3.
- [Commits](https://github.com/aws/aws-lambda-java-libs/commits)
---
updated-dependencies:
- dependency-name: com.amazonaws:aws-lambda-java-events
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 08:01:20 +00:00
2aaa4ab800
Bump com.amazonaws:aws-java-sdk-apigatewaymanagementapi
...
Bumps [com.amazonaws:aws-java-sdk-apigatewaymanagementapi](https://github.com/aws/aws-sdk-java) from 1.12.542 to 1.12.555.
- [Changelog](https://github.com/aws/aws-sdk-java/blob/master/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-java/compare/1.12.542...1.12.555)
---
updated-dependencies:
- dependency-name: com.amazonaws:aws-java-sdk-apigatewaymanagementapi
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 09:53:10 +02:00
1cfcd09454
Add Dependabot for GitHub actions
2023-09-21 09:44:46 +02:00
f670b07ea7
Introduce debug memory monitoring
...
that alerts and prints additional debug information when Poseidon exceeds 1 GB of memory usage.
2023-09-19 22:31:06 +02:00
c83bdbf083
Fix WebSocket JSON schema
...
by changing the required attribute to be an array as described in the official documentation.
2023-09-19 11:15:58 +02:00
3d252492fe
Fix rescheduled used runners being removed.
...
As they are already rescheduled and therefore recreated, they do not need to be removed but can be handled as new runners.
2023-09-18 01:06:35 +02:00
6dc83ca7b5
Add regression test for rescheduled used runners being removed.
...
As they are already rescheduled and therefore recreated, they do not need to be removed but can be handled as new runners.
2023-09-18 01:06:35 +02:00
90d591d4ec
Change default behavior in Nomad Event Handling
...
to not propagate that pending runners are being stopped.
2023-09-18 00:54:26 +02:00
2eb15c8d93
Fix losing of rescheduled runners
...
that are rescheduled while the previous allocation was still pending.
We fix this by removing the race condition handling that was meant to prevent Poseidon from throwing warnings about allocations stopping unexpectedly.
2023-09-18 00:54:26 +02:00
788cb0f660
Add regression test for the recent lost runners.
2023-09-18 00:54:26 +02:00
39fc0f9d9d
Update Dependencies
2023-09-16 19:52:52 +02:00
68cd8f43b4
Defuse data race condition of TestWithSeparateStderrReturnsCommandError.
2023-09-11 13:44:29 +02:00
6159f2a045
Fix Goroutine Leak of Nomad execute command
...
that was triggered when [the execution timeout got exceeded, the runner got destroyed, or the WebSocket connection to CodeOcean closed] and the Allocation did not react to the SIGQUIT within the grace period.
2023-09-11 13:44:29 +02:00
59da36303c
Fix Goroutine Leak of Environment Get
...
that was caused by creating an intermediate environment `fetchedEnvironment` when fetching the environments but not removing it when we just copy its configuration to the existing environment.
2023-09-11 13:44:29 +02:00
460b8b2065
Refactor TestReturnReturnsErrorWhenApiCallFailed
...
to handle the retry mechanism.
2023-09-11 13:44:29 +02:00
3abd4d9a3d
Refactor all tests to use the MemoryLeakTestSuite.
2023-09-11 13:44:29 +02:00
e3161637a9
Extract the WatchEventStream retry mechanism
...
into the utils including all other retry mechanisms.
With this change, the WatchEventStream goroutine stops as soon as the context is done; previously, it only stopped one second later.
2023-09-11 13:44:29 +02:00
0d6b4f660c
Refactor NewAbstractManager
...
to require a context used for the monitoring.
2023-09-11 13:44:29 +02:00
b28b87d56f
Refactor periodicallySendMonitoringData
...
in order to return directly when the context is done and not just at the next iteration.
2023-09-11 13:44:29 +02:00
a01bd0fa7e
Provide Memory Leak Test Suite
...
by adding an assertion about the number of Goroutines to the unit tests.
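The idea behind such an assertion can be sketched as follows. This is not the actual MemoryLeakTestSuite; `leakedGoroutines` is a hypothetical helper that compares the goroutine count before and after running a function.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// leakedGoroutines runs f and returns how many more goroutines exist afterwards.
// It polls for up to one second because finishing goroutines need time to exit.
func leakedGoroutines(f func()) int {
	before := runtime.NumGoroutine()
	f()
	deadline := time.Now().Add(time.Second)
	for runtime.NumGoroutine() > before && time.Now().Before(deadline) {
		time.Sleep(10 * time.Millisecond)
	}
	return runtime.NumGoroutine() - before
}

func main() {
	block := make(chan struct{})
	// leaky starts a goroutine that stays blocked until block is closed.
	leaky := func() { go func() { <-block }() }
	// clean waits for its goroutine before returning.
	clean := func() {
		done := make(chan struct{})
		go func() { close(done) }()
		<-done
	}
	fmt.Println("clean:", leakedGoroutines(clean))
	fmt.Println("leaky:", leakedGoroutines(leaky))
	close(block)
}
```

A unit test would then fail whenever the reported difference is greater than zero.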
2023-09-11 13:44:29 +02:00
b708dddd23
Add Nomad Manager test case
...
that ensures that `onAllocationStopped` returns true when the runner was deleted before by the inactivity timer.
This feature is required for handling a race condition with the event handling of a rescheduled allocation.
2023-09-05 15:15:39 +02:00
354c16cc37
Fix missing rescheduled idle runners.
...
In today's unattended upgrade, we have seen how the prewarming pool size dropped to (near) zero. This was caused by lost Nomad allocations. The allocations got rescheduled, but not added again to Poseidon.
The reason for this is a miscommunication between the Event Handling and the Nomad Manager. `removedByPoseidon` was true even if the runner had not been removed by the manager but was an idle runner.
2023-09-05 15:15:39 +02:00
67297ec5a2
Add regression test for rescheduled idle runner.
2023-09-05 15:15:39 +02:00
8820938624
Increase severity of two log statements.
2023-09-05 15:15:39 +02:00
390d02055b
Bump com.amazonaws:aws-lambda-java-core in /deploy/aws/java11Exec
...
Bumps [com.amazonaws:aws-lambda-java-core](https://github.com/aws/aws-lambda-java-libs) from 1.2.2 to 1.2.3.
- [Commits](https://github.com/aws/aws-lambda-java-libs/commits)
---
updated-dependencies:
- dependency-name: com.amazonaws:aws-lambda-java-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-09-01 03:22:19 +00:00
847e11387a
Bump com.amazonaws:aws-java-sdk-apigatewaymanagementapi
...
Bumps [com.amazonaws:aws-java-sdk-apigatewaymanagementapi](https://github.com/aws/aws-sdk-java) from 1.12.519 to 1.12.542.
- [Changelog](https://github.com/aws/aws-sdk-java/blob/master/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-java/compare/1.12.519...1.12.542)
---
updated-dependencies:
- dependency-name: com.amazonaws:aws-java-sdk-apigatewaymanagementapi
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-09-01 03:13:52 +00:00
188d012bc4
Fix Memory Leak caused by the merge_context.
...
The now-removed statement that sent an empty struct into the channel blocked the goroutine until something received from the channel returned by Done. This led to a goroutine leak, as callers do not necessarily have to call the Done function of a context.
We fix this issue by removing this statement. It was unnecessary anyway, as receiving from a closed channel always returns the zero value of the channel's element type.
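The Go behavior this fix relies on can be shown in isolation (the helper `receive` is only for illustration):

```go
package main

import "fmt"

// receive reads one value from ch and reports whether the channel was still open.
func receive(ch <-chan int) (int, bool) {
	v, ok := <-ch
	return v, ok
}

func main() {
	ch := make(chan int)
	close(ch)
	// Receiving from a closed channel never blocks and yields the zero value,
	// so closing alone is enough to notify every listener.
	v, ok := receive(ch)
	fmt.Println(v, ok) // prints "0 false"
}
```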
2023-08-26 22:51:22 +02:00
b06ff4088f
Bump github.com/google/uuid from 1.3.0 to 1.3.1
...
Bumps [github.com/google/uuid](https://github.com/google/uuid) from 1.3.0 to 1.3.1.
- [Release notes](https://github.com/google/uuid/releases)
- [Changelog](https://github.com/google/uuid/blob/master/CHANGELOG.md)
- [Commits](https://github.com/google/uuid/compare/v1.3.0...v1.3.1)
---
updated-dependencies:
- dependency-name: github.com/google/uuid
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-08-22 06:35:59 +00:00
c0a3fb12c3
Fix UpdateFileSystem Context
...
to be done when either the runner is destroyed (case ignored before) or the request is interrupted.
2023-08-21 22:49:09 +02:00
09604997a7
Implement MergeContext
...
that has multiple contexts as parent and chooses the earliest deadline.
2023-08-21 22:49:09 +02:00
306512bf9c
Fix Context Values are not logged.
...
Only the Sentry hook uses the values of the passed context. Therefore, the values disappeared from our log statements when we moved them from an extra `WithField` call to the context.
We fix this behavior by introducing a Logrus Hook that copies a fixed set of context values to the logging data.
2023-08-21 22:40:37 +02:00
a7d27e8f65
Add missing error log statements.
...
When "markRunnerAsUsed" failed, we silently ignored it. Only when the return of the runner additionally failed did we report the error.
When a Runner is destroyed, we are only notified that Nomad removed the allocation, but cannot tell the reason.
For "the execution did not stop after SIGQUIT", we did not log the corresponding runner ID.
2023-08-21 22:40:37 +02:00
13cd19ed58
Refactor Nomad Event Stream log message.
2023-08-18 09:28:23 +02:00
13a9da95e5
Introduce a context for RetryExponential
...
as a second criterion (next to the maximum number of attempts) for canceling the retrying. This is required because, with the previous commit, we started to retry the Nomad environment recovery. This always fails in unit tests (as they are not connected to a Nomad cluster). Before, we ignored the single error, but the retrying leads to unit test timeouts.
Additionally, we now stop retrying to create a runner when the environment got deleted.
2023-08-18 09:28:23 +02:00
73759f8a3c
Retry Environment Recovery
2023-08-18 09:28:23 +02:00
89c18ad45c
Refactor to WithoutCancel context.
...
With Go 1.21, the WithoutCancel context was introduced. This way, we can keep the values passed in a new context without the new context being canceled together with its parent. This behavior suits two occurrences well where we explicitly had to copy one required value instead of implicitly keeping all values.
2023-08-16 15:13:05 +02:00