Commit Graph

860 Commits

Author SHA1 Message Date
sirkrypt0
8b26ecbe5f Restructure project
We previously didn't really have any structure in our project apart
from creating a new folder for each package in our project root.
Now that we have accumulated some packages, we use the well-known
Golang project layout in order to clearly communicate our intent
with packages. See https://github.com/golang-standards/project-layout
2021-07-21 12:55:35 +02:00
sirkrypt0
2f1383b743 Add tests for returning mapped ports of runners 2021-07-21 08:22:10 +02:00
sirkrypt0
64764a9809 Return mapped ports when requesting runners
We now store the mapped ports returned by Nomad locally in our runner
struct and return them when requesting the runner. In most Nomad setups,
the returned IP address is not reachable by external users.
2021-07-20 23:22:58 +02:00
sirkrypt0
d7c1787b57 Disable allow-failure for linting pipeline
Now that all linting issues are fixed, we disable allow-failure for
the linting step to ensure that later commits adhere to the linter.
2021-07-13 08:59:25 +02:00
sirkrypt0
c7606f3d5f Fix a lot of linting issues
After we introduced the linter we haven't really touched the old code.
This commit now fixes all linting issues that exist right now.
2021-07-13 08:59:25 +02:00
Maximilian Paß
bd7fb53385 Fix bug that the count of the default task group is set to the prewarming pool size 2021-07-07 09:21:57 +02:00
Maximilian Paß
68eacae7fe Fix bug that config task group is not added to the template job (and the faulty tests) 2021-07-06 10:09:36 +02:00
Maximilian Paß
bbc1ce12ca Delete idle runners when the environment is scaled down 2021-07-02 13:00:13 +02:00
Maximilian Paß
66d04fde2a Remove unused function ScaleAllEnvironments 2021-07-01 09:21:09 +00:00
sirkrypt0
50a2a22b74 Only create exactly one new runner when one runner is claimed
Previously we would create as many runners as needed based on the
local idleRunnersCount and the desiredIdleRunnersCount. This is
problematic if two runners are claimed shortly after one another.
As we only add a runner to the idleRunners list once we get the
event from Nomad, the second runner claim in a short timeframe
would create two new runners. This has been fixed now.
2021-06-29 09:11:21 +02:00
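The race described in 50a2a22b74 can be sketched with a simplified model. This is only an illustration under assumptions: the `pool` type and both method names are hypothetical and not taken from Poseidon, and the idle count is assumed to grow only when a Nomad event arrives, which does not happen between the two claims here.

```go
package main

import "fmt"

// pool is a simplified, hypothetical model of the idle-runner bookkeeping.
// It is not Poseidon's actual data structure.
type pool struct {
	idle      int // runners known to be idle (only grows on Nomad events)
	desired   int // desired number of idle runners
	requested int // new runners requested from Nomad so far
}

// claimRefillToDesired models the old behavior: after a claim, request as
// many runners as are missing according to the local idle count.
func (p *pool) claimRefillToDesired() {
	p.idle--
	p.requested += p.desired - p.idle
}

// claimCreateOne models the fixed behavior: each claim requests exactly
// one replacement runner.
func (p *pool) claimCreateOne() {
	p.idle--
	p.requested++
}

func main() {
	// Two claims arrive before Nomad reports any new runner as idle.
	before := &pool{idle: 5, desired: 5}
	before.claimRefillToDesired()
	before.claimRefillToDesired()

	after := &pool{idle: 5, desired: 5}
	after.claimCreateOne()
	after.claimCreateOne()

	fmt.Println("refill to desired count:", before.requested)
	fmt.Println("one runner per claim:   ", after.requested)
}
```

In this model the old strategy requests three runners for two claims, because the second claim still sees the first replacement as missing, while the fixed strategy requests two.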
Konrad Hanff
e0e254a6af Persist runner timeout in metadata
To be able to restore the runner timeouts even after a Poseidon restart,
the timeout is stored in the Nomad metadata. After a restart the timeout
starts over, but at least the runner is still returned eventually.
2021-06-23 11:07:17 +02:00
Konrad Hanff
ae08e37106 Add end to end test for inactivity timeout 2021-06-23 11:04:19 +02:00
Konrad Hanff
6c887de6f1 Move NullReader from nomad to util package. 2021-06-23 11:04:19 +02:00
Konrad Hanff
14f8a096eb Add unit and integration tests for runner inactivity timeout. 2021-06-23 11:04:19 +02:00
Konrad Hanff
4b2cae0bd1 Add inactivity timeout for runners.
By removing runners after a specified timeout they no longer stay
around indefinitely and block Nomad's capacity. The timeout can be set
individually per runner when requesting the provide route. If it is set
to 0, the runner is never removed automatically.

The timeout is reset when activity is detected. Currently that is when
something gets executed or the filesystem gets modified.
2021-06-23 11:04:18 +02:00
Konrad Hanff
c7ed54942d Move ChannelReceivesSomething to tests package.
ChannelReceivesSomething (formerly WaitForChannel) originally was
located in the helpers package.
This move was done to remove a cyclic dependency with the nomad package.
2021-06-21 10:54:07 +02:00
Konrad Hanff
92f1af83ae Add tests for codeOceanToRaw and null readers
The tests ensure the readers do not return when there is no data
available.
2021-06-21 08:20:04 +00:00
Konrad Hanff
17c1e379c2 Fix busy waiting on stdin
When running an execution, Nomad continuously reads from the stdin
reader. Because the readers we implemented (codeOceanToRawReader and
nullReader) return zero if there is no input available, this leads to
busy waiting and a high CPU load on Poseidon. By waiting indefinitely in
case of the nullReader and for at least one byte in case of the
codeOceanToRawReader before returning, we prevent this issue.
2021-06-21 08:20:04 +00:00
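The idea behind 17c1e379c2 can be sketched as a reader whose Read blocks until at least one byte is available, instead of returning zero bytes and being polled again. This is an illustration under assumptions: `blockingReader` and its channel-based buffer are hypothetical and do not reproduce the actual codeOceanToRawReader.

```go
package main

import (
	"fmt"
	"time"
)

// blockingReader is a hypothetical stand-in for a stdin reader that waits
// for input rather than returning zero bytes immediately.
type blockingReader struct {
	data chan byte
}

func newBlockingReader() *blockingReader {
	return &blockingReader{data: make(chan byte, 64)}
}

// Write queues input for later reads. In this sketch it blocks once the
// 64-byte buffer is full.
func (r *blockingReader) Write(p []byte) (int, error) {
	for _, b := range p {
		r.data <- b
	}
	return len(p), nil
}

// Read blocks until at least one byte is available, then copies whatever
// else is already buffered without blocking again.
func (r *blockingReader) Read(p []byte) (int, error) {
	if len(p) == 0 {
		return 0, nil
	}
	p[0] = <-r.data // wait here instead of returning 0 to a polling caller
	n := 1
	for n < len(p) {
		select {
		case b := <-r.data:
			p[n] = b
			n++
		default:
			return n, nil
		}
	}
	return n, nil
}

func main() {
	reader := newBlockingReader()
	go func() {
		time.Sleep(50 * time.Millisecond)
		reader.Write([]byte("x"))
	}()

	buf := make([]byte, 8)
	start := time.Now()
	n, err := reader.Read(buf) // returns only once the byte has arrived
	fmt.Println(n, err, time.Since(start) >= 50*time.Millisecond)
}
```

A caller that reads in a loop then sleeps inside Read while no input exists, so it no longer spins on the CPU.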
Tobias Kantusch
0b9e5a5ba5 Update README
* Update port to 7200
* Update linter instructions
* Update Docker instructions
2021-06-18 07:31:24 +00:00
sirkrypt0
f5f7521a18 Fix environment recovery
As the environment is no longer stored in the meta information,
Poseidon wasn't able to recover environments. It expected the
environment id to be found in the meta data. We now recover
the environment id from the job id.
2021-06-18 08:39:54 +02:00
Maximilian Paß
2e4a975588 Implement even more merge request comments 2021-06-15 12:05:51 +02:00
sirkrypt0
ff582805b4 Move Nomad job creation to Nomad package
Previously, low level Nomad job creation was done in the environment manager.
It used many functions of the nomad package so we felt like this logic
belongs in the nomad package.
2021-06-15 11:38:02 +02:00
Maximilian Paß
87f823756b Implement merge request comments 2021-06-15 11:37:47 +02:00
Maximilian Paß
25d78df557 Restore existing jobs and fix rebase (7c99eff3) issues 2021-06-15 11:37:35 +02:00
sirkrypt0
0020590c96 Update all runners when updating environment
Previously only the default job would be updated to the newest specs.
Now all Nomad jobs that belong to the given environment are updated
accordingly.
2021-06-15 11:35:59 +02:00
sirkrypt0
c7d59810e5 Use Nomad jobs as runners instead of allocations
As we can't control which allocations are destroyed when downscaling a job, we decided
to use Nomad jobs as our runners. Thus for each runner we prewarm for an environment,
a corresponding job is created in Nomad. We create a default job that serves as a template
for the runners. Using this, already existing execution environments can easily be restored,
once Poseidon is restarted.
2021-06-15 11:35:54 +02:00
sirkrypt0
8de489929e Remove stderr fifo after interactive execution with stderr finished
Previously the stderr fifo would not be removed, leaving unwanted
artifacts from the execution behind. We now remove the stderr fifo
after the command finished.
2021-06-14 15:04:09 +02:00
sirkrypt0
d3300e839e Add unit tests for separate stdout and stderr on execution 2021-06-11 08:47:25 +00:00
sirkrypt0
f122dd9376 Split stdout and stderr on interactive execution
When running a command interactively, we previously would get stdout
and stderr both served on stdout by Nomad. To circumvent this issue,
we now start a separate execution inside the allocation to split
both streams.
2021-06-11 08:47:25 +00:00
sirkrypt0
19cd4b840e Update Nomad to 1.1.1 and other project dependencies 2021-06-10 18:53:48 +02:00
Jan-Eric Hellenberg
61bc7d0143 Add unit tests for provide runner route 2021-06-10 06:11:31 +00:00
sirkrypt0
7bbd7b7bae Fix task group name
Previously when creating a job, Poseidon would still use the old
task group name format instead of default-group as expected.
2021-06-09 18:22:28 +02:00
Maximilian Paß
32fe47d669 Fix linting issues and implement merge request comments 2021-06-09 08:35:20 +00:00
Maximilian Paß
4b5f0a3eb6 Add tests for runner manager updating runners 2021-06-09 08:35:20 +00:00
Maximilian Paß
d0a2a1d96c Add tests for receiving allocation updates from Nomad 2021-06-09 08:35:20 +00:00
sirkrypt0
3f572261c2 Add updating cached allocations 2021-06-09 08:35:20 +00:00
sirkrypt0
66821dbfc8 Add query options to Nomad API queries to make sure we query the correct namespace 2021-06-09 08:35:20 +00:00
Jan-Eric Hellenberg
ce2b82d43d Copy files with relative path to active workspace directory of container 2021-06-09 10:24:29 +02:00
sirkrypt0
b32e9c2a67 Remove off by one with needed runners
Earlier we used a channel to store the runners. To make the environment
refresh block, we scheduled an additional runner as the buffered channel
was then filled up. As we don't use the channel anymore, we don't need
the additional runner anymore. Furthermore this leads to weird race
conditions in tests when comparing the runner count to the desired one.
2021-06-03 13:21:49 +00:00
sirkrypt0
3d7b7e1761 Set default minimum count in scaling policy to 0
Previously the minimum was not set, thus defaulting to the value of count.
This did not allow creating execution environments with a prewarmingPoolSize
of 0 as the task group count must not be less than the minimum count in the
scaling policy.
2021-06-03 13:21:49 +00:00
sirkrypt0
630a006258 Use more uints
Previously we accepted int values although only uint values made sense.
We adjusted this to accept uints where appropriate.
2021-06-03 13:21:49 +00:00
sirkrypt0
1c4daa99a9 Add e2e tests for exec env createOrUpdate
This also adds a Nomad client to the e2e_tests that can be used to
query Nomad and validate that certain actions happened in Nomad
correctly.
2021-06-03 13:21:49 +00:00
sirkrypt0
1be744f2d4 Explicitly set task groups network when networkAccess is false
Previously, switching an environment from having network access to
having none would leave the network resources in the task group as
they were before.
2021-06-03 13:21:49 +00:00
sirkrypt0
b990df7b9d Add route to create or update execution environments 2021-06-03 13:21:49 +00:00
sirkrypt0
3d395f0a38 Set network_mode to bridge to overwrite old setting
Previously, the network_mode was only set when creating a job with
network_access = false. This results in Nomad leaving this setting
as is when updating the job to use network. Thus a job would have
had the mapped ports in the Nomad UI, but the Docker network_mode
would still be 'none'.
2021-06-03 13:21:49 +00:00
Maximilian Paß
5f35ba30a2 Remove data race in the runner length function 2021-06-03 11:38:40 +02:00
Jan-Eric Hellenberg
02b3f52a11 Add ability to copy files to and delete files from runner 2021-06-02 14:54:54 +02:00
Konrad Hanff
242d0175a2 Add tests and end-to-end tests for websocket execution
For unit tests, this mocks the runner's Execute method with a
customizable function that operates on the request, streams and exit
channel to simulate a real execution.

End-to-end tests are moved to the tests/e2e_tests folder. The tests
folder allows us to have shared helper functions for all tests in a
separate package (tests) that is not included in the non-test build.

This also adds one second of delay before each end-to-end test case by
using the TestSetup method of suite. By slowing down test execution,
this gives Nomad time to create new allocations when a test requested a
runner. Another solution could be to increase the scale of the job to
have enough allocations for all end-to-end tests.

Co-authored-by: Maximilian Paß <maximilian.pass@student.hpi.uni-potsdam.de>
2021-05-31 12:32:51 +02:00
Konrad Hanff
3afcdeaba8 Execute commands in runner via WebSocket
This enables executing commands in runners and forwarding input and
output between the runner and the websocket to the client.

Co-authored-by: Maximilian Paß <maximilian.pass@student.hpi.uni-potsdam.de>
2021-05-31 12:32:51 +02:00
sirkrypt0
892f902377 Add deps dependency to e2e-test make target
Without the deps dependency the e2e-test target would sometimes
fail in the CI due to missing entries in the go.sum file.
2021-05-28 09:13:59 +02:00