Commit Graph

73 Commits

Author SHA1 Message Date
Maximilian Paß
317590d3ea Revert "Debug Health route latency."
This reverts commit 213628b958.
2024-01-26 22:51:55 +01:00
Maximilian Paß
213628b958 Debug Health route latency. 2024-01-26 14:36:16 +01:00
Maximilian Paß
ab12c9046d Decrease Log Severity
of errors trying to read the request body.
2023-11-22 19:14:42 +01:00
Maximilian Paß
70c108aebf Unify the representation of the three dots. 2023-11-09 13:11:39 +01:00
Maximilian Paß
0f7e98f78e Refactor PrewarmingPoolAlert triggering
from route-based to Nomad-Event-Stream-based.
2023-11-09 13:11:39 +01:00
Maximilian Paß
543939e5cb Add independent environment reload
in the case that the prewarming pool is depleting (see PrewarmingPoolThreshold) and is still depleting after a timeout (PrewarmingPoolReloadTimeout).
2023-11-09 13:11:39 +01:00
Maximilian Paß
c46a09eeae Add Prewarming Pool Alert
that checks for every environment whether the filled share of the prewarming pool is at least the specified threshold.
2023-11-09 13:11:39 +01:00
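A minimal sketch of the threshold check described in this commit message, assuming the filled share is the number of idle runners divided by the configured pool size. The function name and parameters are hypothetical and not taken from the repository.

```go
package main

import "fmt"

// poolBelowThreshold reports whether the filled share of a prewarming pool
// (idle runners / configured pool size) is lower than the given threshold.
// Hypothetical helper: the name and signature are illustrative assumptions.
func poolBelowThreshold(idleRunners, poolSize uint, threshold float64) bool {
	if poolSize == 0 {
		// An environment without a prewarming pool cannot be depleted.
		return false
	}
	filledShare := float64(idleRunners) / float64(poolSize)
	return filledShare < threshold
}

func main() {
	// 1 of 10 runners idle with a threshold of 0.5: an alert would be due.
	fmt.Println(poolBelowThreshold(1, 10, 0.5))
	// 5 of 10 runners idle meets the threshold exactly: no alert.
	fmt.Println(poolBelowThreshold(5, 10, 0.5))
}
```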
Maximilian Paß
d0dd5c08cb Remove usage of context.DeadlineExceeded
for internal decisions as this error is widely used by other packages. Checking for such wrapped errors can accidentally influence the internal decision.
In this case the retry mechanism checked if the error is context.DeadlineExceeded and assumed it would be created by the internal context. This assumption was wrong.
2023-10-31 15:49:56 +01:00
Maximilian Paß
6b69a2d732 Refactor Nomad Recovery
from an approach that loaded the runners only once at the startup
to a method that is repeated, e.g. when the Nomad Event Stream connection is interrupted.
2023-10-31 15:49:56 +01:00
Maximilian Paß
6159f2a045 Fix Goroutine Leak of Nomad execute command
that was triggered when [the execution timeout got exceeded, the runner got destroyed, or the WebSocket connection to CodeOcean closed] and the Allocation did not react to the SIGQUIT within the grace period.
2023-09-11 13:44:29 +02:00
Maximilian Paß
3abd4d9a3d Refactor all tests to use the MemoryLeakTestSuite. 2023-09-11 13:44:29 +02:00
Maximilian Paß
a7d27e8f65 Add missing error log statements.
When "markRunnerAsUsed" failed, we silently ignored it. Only when returning the runner failed as well did we report the error.

When a Runner is destroyed, we are only notified that Nomad removed the allocation, but cannot tell the reason.

For "the execution did not stop after SIGQUIT" we did not log the corresponding runner id.
2023-08-21 22:40:37 +02:00
Maximilian Paß
89c18ad45c Refactor to WithoutCancel context.
With Go 1.21 the WithoutCancel context was introduced. This way we can keep the values passed in a new context without the new context being canceled together with its parent. This behavior suits two occurrences well where we explicitly had to copy one required value instead of implicitly keeping all values.
2023-08-16 15:13:05 +02:00
Maximilian Paß
90092c48c1 Fix incomplete debug message
that is created by sending SIGQUIT to the bash process
by not processing output after the client disconnected or we have sent the SIGQUIT.
2023-08-14 11:37:51 +02:00
Maximilian Paß
eb818f92f7 Refactor Runner Destroy Reason Masking
and ignore expected reasons such as when the runner got destroyed by an API request.
2023-07-24 11:48:14 +01:00
Maximilian Paß
6a1677dea0 Introduce reason for destroying runner
in order to return a specific error for OOM Killed Executions.
2023-07-21 15:30:21 +02:00
Maximilian Paß
bfb5977d24 Destroy runner on allocation stopped
Destroying the runner when Nomad informs us about its allocation being stopped fixes the error of executions running into their timeout even if the allocation was stopped long ago.
2023-07-21 15:30:21 +02:00
Maximilian Paß
d64d8995bd Refactor monitoring of runner and environment id. 2023-07-15 21:46:56 +02:00
Maximilian Paß
e7df777db4 Always log Runner and Environment ID.
Systematically log the runner id and the environment id by adding the information at the findRunnerMiddleware.
2023-07-15 21:46:56 +02:00
Maximilian Paß
2aa10a130f Introduce context for the codeOceanOutputWriter
that represents its lifespan.
2023-04-11 20:45:30 +01:00
Maximilian Paß
0c8fa9ccfa Add context to log statements. 2023-04-11 20:45:30 +01:00
Maximilian Paß
7dadc5dfe9 Refactor Nomad Command Generation.
- Abstracting from the exec form while generating.
- Removal of single quotes (usage of only double-quotes).
- Bash-nesting using escaping of special characters.
2023-03-14 23:42:19 +01:00
Maximilian Paß
4550a4589e Dangerous Context Enrichment
by passing the Sentry Context down our abstraction stack.
This included changes to the complex context management of a Command Execution.
2023-02-03 10:29:18 +00:00
Maximilian Paß
2650efbb38 Sentry Tracing Identifier 2023-02-03 10:29:18 +00:00
Maximilian Paß
f2c205a8ed Add additional performance spans 2023-02-03 10:29:18 +00:00
Maximilian Paß
8950ab3776 Add single quotes for inner command.
Change to bash as interpreter.
Forbid single quotes for user commands.
2022-11-04 15:15:43 +01:00
Maximilian Paß
28fb0ca61c Catch context canceled error 2022-10-25 09:36:52 +02:00
Maximilian Paß
5d54b0f786 Fix wrong environment id at monitoring
data for created or updated environments.
2022-10-24 13:15:14 +02:00
Maximilian Paß
195f88177e Add Content-Length and Content-Disposition Header
for GetFileContent route.
2022-10-05 12:11:47 +01:00
Maximilian Paß
0c70ad3b24 Enable unprivileged retrieval of file listing and content. 2022-10-05 12:11:47 +01:00
Maximilian Paß
3469d0ce77 Specify HTTP Not Found status code
by replacing it with StatusGone (410) for a missing runner and StatusFailedDependency (424) for missing or not accessible files.
2022-10-05 12:11:47 +01:00
Maximilian Paß
fc77f11d4d Enquote file path for shell execution.
Also, fix json of 500 response.
2022-10-05 12:11:47 +01:00
Maximilian Paß
152b77afe5 Add listing of the runner's file system. 2022-10-05 12:11:47 +01:00
Maximilian Paß
c7ee7c1e25 Remove superfluous response.WriteHeader call
as the Write of the responseWriter also sends the header automatically.
2022-10-05 12:11:47 +01:00
Maximilian Paß
f2b25566dd #136 Copy files back from Nomad runner. 2022-10-05 12:11:47 +01:00
Sebastian Serth
1a5a49d7c8 Explicitly switch user for code execution.
Co-authored-by: Maximilian Pass <maximilian.pass@student.hpi.uni-potsdam.de>
2022-09-24 23:09:23 +02:00
Maximilian Paß
d10b31a1fb Remove static (nil) return value. 2022-08-01 11:24:56 +02:00
Maximilian Paß
498e8f5ff5 #110 Refactor influxdb monitoring
to use it as singleton.
This makes it possible to monitor processes that are independent of an incoming request.
2022-07-01 15:29:31 +02:00
Maximilian Paß
203d5a3a4f #155 refactor and synchronise writing to CodeOcean. (#174)
* #155 refactor and synchronise writing to CodeOcean.

* Reduce complexity of input parsing.

* Update typo in internal/api/ws/codeocean_writer.go

Co-authored-by: Sebastian Serth <MrSerth@users.noreply.github.com>
2022-06-26 20:19:23 +02:00
Maximilian Paß
3afe8ddb66 #155 Enable stopping of the CodeOcean WebSocket read independently of writing to CodeOcean. 2022-06-21 12:13:33 +02:00
Maximilian Paß
3d1ed7cb0f #155 Minimise timing issues with websocket close. 2022-06-21 12:13:33 +02:00
Maximilian Paß
79dc3a94da #155 Add log statement for further investigations (#164) 2022-06-12 11:10:57 +02:00
Maximilian Paß
ecce3c294f #155 Add log statement for further investigations 2022-06-09 22:34:43 +02:00
Maximilian Paß
1e59c1146e Fix CodeQL log injection warning
by removing newlines from logged user input.
2022-06-07 17:21:05 +02:00
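A minimal sketch of the mitigation named in this commit message: stripping line breaks from user input before logging it, so that the input cannot forge additional log lines. The function name is an assumption and fmt.Printf stands in for the project's logger.

```go
package main

import (
	"fmt"
	"strings"
)

// removeNewlines strips carriage returns and line feeds from user input
// so that a logged value always stays on a single log line.
// Hypothetical helper name, chosen for this example.
func removeNewlines(s string) string {
	s = strings.ReplaceAll(s, "\r", "")
	return strings.ReplaceAll(s, "\n", "")
}

func main() {
	userInput := "abc\nlevel=error msg=\"forged entry\""
	// Without sanitizing, the input would start a second, forged log line.
	fmt.Printf("level=info runner_id=%q\n", removeNewlines(userInput))
}
```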
Maximilian Paß
25f92e5f94 Add environment specific data to the influxdb data. 2022-04-18 13:17:49 +02:00
Maximilian Paß
eabe3a1b27 Add the Environment ID to the influxdb data.
Also move the interface of an execution environment into its own file, execution_environment.go.
2022-04-18 13:17:49 +02:00
Maximilian Paß
8feffdae3a Add initial structure of influxdb monitoring. 2022-04-18 13:17:49 +02:00
Sebastian Serth
e4ebb5b384 Add trace statements for WebSocket messages
* With `logger.level: TRACE`, the content of WebSocket messages is logged
  together with the corresponding timestamp.
* The input is not further sanitized as this log level
  is not intended for production use.
2022-04-15 12:39:03 +02:00
Maximilian Paß
6123d20525 Implement core functionality of AWS integration 2022-02-28 14:54:40 +01:00
Maximilian Paß
dd41e0d5c4 Generate structures for an AWS environment and runner 2022-02-28 14:54:40 +01:00