Commit Graph

64 Commits

Author SHA1 Message Date
0bfef5e105 Degrade InfluxDB Retry Write log. 2023-07-14 18:54:57 +02:00
5b64725faa Fix golangci-lint errors
that appeared due to the new version v1.53.1.
2023-06-04 11:54:42 +01:00
f377b1376c Add Client Status to Nomad Allocation monitoring
Also add the Nomad Node name as additional debug information.
2023-05-10 19:09:31 +01:00
42efebc194 Monitor the Nomad events
and send all Nomad events to Influxdb.
2023-05-09 00:13:58 +01:00
d8d9abbddd Add Job ID to Nomad Allocation monitoring. 2023-04-23 12:54:57 +01:00
0c8fa9ccfa Add context to log statements. 2023-04-11 20:45:30 +01:00
43221c717e Add context to Sentry Hook.
With this context, tracing information stored in the context can be associated with sentry events/issues.
2023-04-11 20:45:30 +01:00
038d71ff51 Nomad: Handle Container re-allocation 2023-03-31 14:42:55 +02:00
e0db1bafe8 Fix multiple user Runner use
A before unknown Nomad reload adds already known runner again to the idle runner - even if they are already in use.
2023-03-31 14:42:55 +02:00
e877cd1e52 Rename Sentry Span Descriptions. 2023-03-14 23:42:19 +01:00
7dadc5dfe9 Refactor Nomad Command Generation.
- Abstracting from the exec form while generating.
- Removal of single quotes (usage of only double-quotes).
- Bash-nesting using escaping of special characters.
2023-03-14 23:42:19 +01:00
a4599f2cf9 Fix panic on influx shutdown.
Influx was shutdown before Poseidon was terminated. In that mean time the Profiling data has been written. Also in that mean time, a periodical influx event triggers a panic since influx is already shutdown.

We implemented two changes, each fixing this scenario.
2023-03-13 15:21:24 +01:00
aa9d4d30e2 Actual retry sending InfluxDB data
Previously, we always logged the error on first failure and (nevertheless) tried to send the data within 3 minutes (default configuration).

Fixes POSEIDON-1H
Closes #262
2023-02-28 23:47:35 +01:00
2650efbb38 Sentry Tracing Identifier 2023-02-03 10:29:18 +00:00
a9581ac1d9 Performance for ListFileSystem 2023-02-03 10:29:18 +00:00
8950ab3776 Add single quotes for inner command.
Change to bash as interpreter.
Forbid single quotes for user commands.
2022-11-04 15:15:43 +01:00
5e5e13806e Monitor file download. 2022-10-26 01:33:26 +02:00
160df3d9e6 Add retry-mechanism for sample, mark-as-used and return
of Nomad runners.
2022-10-24 22:12:09 +01:00
b9c923da8a Remove unused and deprecated Storer interface. 2022-10-24 22:12:09 +01:00
7119f3e012 Fix not canceling monitoring events for removed environments
and runners.
2022-10-24 13:15:14 +02:00
3509109b6f Fix Ls2JsonWriter
by allowing more spaces in the ls response.
by sending the error response of the list file system route only when no content has been written.
2022-10-05 12:11:47 +01:00
195f88177e Add Content-Length and Content-Disposition Header
for GetFileContent route.
2022-10-05 12:11:47 +01:00
847e5cda65 Extend ls2json reader
by also parsing the link target, permissions, group and owner.
2022-10-05 12:11:47 +01:00
fc77f11d4d Enquote file path for shell execution.
Also, fix json of 500 response.
2022-10-05 12:11:47 +01:00
152b77afe5 Add listing of runners file system. 2022-10-05 12:11:47 +01:00
f2b25566dd #136 Copy files back from Nomad runner. 2022-10-05 12:11:47 +01:00
1a5a49d7c8 Explicitly switch user for code execution.
Co-authored-by: Maximilian Pass <maximilian.pass@student.hpi.uni-potsdam.de>
2022-09-24 23:09:23 +02:00
89fc7b2637 Fix Nomad event stream is ignoring errors
when an event stream could be established once.
2022-09-07 21:16:20 +02:00
e8457ca035 Remove monitoring debug statement. 2022-08-31 09:19:07 +02:00
5590c50e14 #110 Add periodical monitoring events. 2022-08-19 20:48:46 +02:00
9677253b35 Change Influx field name for the startup duration
due to a currently not resolvable type mismatch.
2022-08-10 20:46:17 +02:00
770327cf64 Add storage count debug statement. 2022-08-08 09:17:27 +02:00
6e52b8660d Avoid elements being removed multiple times
as this leads to multiple deletion events in the monitoring.
2022-08-01 11:36:18 +02:00
c6e65c14bb Monitor Nomad allocation startup duration. 2022-07-31 19:42:35 +02:00
49c7a2d405 Save the runner and environment id for executions monitoring. 2022-07-31 19:42:35 +02:00
d9b7989a6c Enable logging for failed monitoring. 2022-07-01 15:29:31 +02:00
3f0c781997 Monitor storage object count. 2022-07-01 15:29:31 +02:00
051fe29d59 Add unit test for monitored storage. 2022-07-01 15:29:31 +02:00
498e8f5ff5 #110 Refactor influxdb monitoring
to use it as singleton.
This enables the possibility to monitor processes that are independent of an incoming request.
2022-07-01 15:29:31 +02:00
275b6aa642 #89 Adjust golangci-lint configuration
as it does not support generics at this moment.

See https://github.com/golangci/golangci-lint/issues/2649
2022-06-29 16:21:19 +02:00
34040162c2 #89 Generalise the three Storage interfaces and structs into one generic storage manager. 2022-06-29 16:21:19 +02:00
a4d13fb8cb #148 Add stage to influx monitoring. 2022-06-21 15:31:29 +02:00
59ca63268b Add CODEOCEAN environment variable. 2022-06-10 18:10:28 +02:00
669ec039ce Update dependencies 2022-06-07 17:21:05 +02:00
1e59c1146e Fix CodeQL log injection warning
by removing newlines from logged user input.
2022-06-07 17:21:05 +02:00
25f92e5f94 Add environment specific data to the influxdb data. 2022-04-18 13:17:49 +02:00
eabe3a1b27 Add the Environment ID to the influxdb data.
Also move the interface of an execution environment into its own file, execution_environment.go.
2022-04-18 13:17:49 +02:00
8feffdae3a Add initial structure of influxdb monitoring. 2022-04-18 13:17:49 +02:00
9f0b04660f Fix goroutine leak in the nullio reader 2021-12-14 13:24:53 +01:00
1de559cebc Add statistics route for execution environments
* Add statistics route for execution environments

* Add maximum to port api definition

Co-authored-by: Sebastian Serth <MrSerth@users.noreply.github.com>
2021-12-08 12:08:22 +01:00