added poseidon with aws to k8s changes
138
docs/configuration.md
Normal file
@ -0,0 +1,138 @@
# Configuration

Poseidon can be configured to suit different use cases.

## Poseidon

The file `config/config.go` contains a configuration struct with all possible configuration options for Poseidon. The file also defines default values for most of these options.
The options *can* be overridden with a YAML configuration file whose path is set with the `-config` flag. By default, Poseidon searches for `configuration.yaml` in the working directory. `configuration.example.yaml` is an example of a configuration file and contains explanations for all options. The keys of the options specified in the configuration file must be written in lowercase.
The options *can* also be overridden by environment variables. Currently, only the Go types `string`, `int`, `bool` and `struct` (nested) are supported. The name of the environment variable is constructed as follows: `POSEIDON_(<name of nested struct>_)*<name of field>` (all letters are uppercase).

The precedence of the configuration possibilities is:

1. Environment variables
2. Configuration file
3. Default values

If a value is not specified by one possibility, the value of the subsequent possibility is used.

### Example

- The default value for the `Port` field (type `int`) in the `Server` field (type `struct`) of the configuration is `7200`.
- This can be overridden with the following `configuration.yaml`:

  ```yaml
  server:
    port: 4000
  ```

- Again, this can be overridden by the environment variable `POSEIDON_SERVER_PORT`, e.g., using `export POSEIDON_SERVER_PORT=5000`.
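
As a rough illustration of the precedence and of the naming scheme for environment variables, the following Python sketch resolves a single option. It is not Poseidon's implementation (that lives in `config/config.go`): the helpers `env_name` and `resolve` are hypothetical, the configuration file is assumed to be already parsed into a dictionary, and only `int` options are handled.

```python
import os

# Default values, mirroring the `Server.Port` example above.
DEFAULTS = {"server": {"port": 7200}}


def env_name(path):
    """Build the variable name POSEIDON_(<NESTED>_)*<FIELD> from a key path."""
    return "_".join(["POSEIDON", *path]).upper()


def resolve(path, file_config, environ=os.environ):
    """Return the value for a key path: environment, then file, then default."""
    name = env_name(path)
    if name in environ:
        # Environment variables are strings; this sketch assumes an int option.
        return int(environ[name])
    for source in (file_config, DEFAULTS):
        node = source
        for key in path:
            node = node.get(key) if isinstance(node, dict) else None
        if node is not None:
            return node
    return None


if __name__ == "__main__":
    file_config = {"server": {"port": 4000}}  # stands in for a parsed configuration.yaml
    print(resolve(["server", "port"], {}, {}))           # falls back to the default
    print(resolve(["server", "port"], file_config, {}))  # taken from the file
    print(resolve(["server", "port"], file_config,
                  {"POSEIDON_SERVER_PORT": "5000"}))     # taken from the environment
```

Passing the environment as a parameter keeps the sketch easy to try out without touching the real process environment.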

### Systemd

Poseidon can be configured to run as a systemd service. Optionally, it can also be configured to use a systemd socket.
The use of systemd provides capabilities for managing Poseidon's state and for zero-downtime deployments.
Minimal examples for systemd configurations can be found in `.github/workflows/resources`.

## Nomad

As a subsystem of Poseidon, Nomad can and should also be configured accordingly.

### Memory Oversubscription

Poseidon uses Nomad's memory oversubscription feature. This way, all runners are allocated with just 16 MB. The memory limit defined per execution environment is used as the upper bound for the memory oversubscription.
On the one hand, this feature allows Nomad to execute many more runners in parallel; on the other hand, it introduces a risk of overloading the Nomad host. Still, this feature is obligatory for Poseidon to work and therefore needs to be enabled. [Example Configuration](./resources/server.example.hcl)

```hcl
default_scheduler_config {
  memory_oversubscription_enabled = true
}
```

### Scheduler

By default, Nomad uses a bin-packing scheduler. This places Jobs on as few hosts as possible. In our case, a high load then leads to one Nomad client being fully utilised while the others remain mostly idle.
To mitigate the overload of a Nomad client, the ["spread" scheduler algorithm](https://www.nomadproject.io/api-docs/operator/scheduler#update-scheduler-configuration) should be used.

### Maximum Connections per Client

By default, Nomad allows at most 100 concurrent connections per client. However, as Poseidon is a client, this would significantly limit the performance of Poseidon. Thus, this limit should be disabled.

To do so, ensure the following configuration is set in your Nomad agents, for example by adding it to `/etc/nomad.d/base.hcl`:

```hcl
limits {
  http_max_conns_per_client = 0
}
```

### Enable Networking Support in Nomad

In order to allow full networking support in Nomad, the `containernetworking-plugins` are required on the host. They can be installed either manually or through a package manager. In the latter case, the installation path might differ. Hence, add the following line to the `client` directive of the Nomad configuration in `/etc/nomad.d/client.hcl`:

```hcl
cni_path = "/usr/lib/cni"
```

If the path is not set up correctly or the dependency is missing, the following error will be shown in Nomad: `failed to find plugin "bridge" in path [/opt/cni/bin]`.

Additionally, we provide a [secure-bridge](./resources/secure-bridge.conflist) configuration for the `containernetworking-plugins`. We highly recommend using this configuration, as it automatically configures an appropriate firewall and isolates your local network. Store the [secure-bridge](./resources/secure-bridge.conflist) file in an (otherwise) empty folder and specify that folder in `/etc/nomad.d/client.hcl`:

```hcl
cni_config_dir = "<path to folder with *.conflist>"
```

If the path is not set up correctly or the configuration has a different name, the placement of allocations will fail in Nomad: `Constraint missing network filtered [all] nodes`. Be sure to set the "dns" and "dns-search" options in `/etc/docker/daemon.json` with reasonable defaults, for example with those shown in our [example configuration for Docker](./resources/docker.daemon.json).

### Network range

The default subnet range for Docker containers can be adjusted.
This can be done both in the Docker daemon configuration and the CNI secure-bridge configuration.
Accordingly, every container using the secure-bridge will receive an IP address from the subnet of the CNI configuration.
The two subnet ranges should not overlap.

An example configuration could use `10.151.0.0/20` for all containers without the CNI secure-bridge and `10.151.16.0/20` for all containers using the CNI secure-bridge.
This would grant 4096 IP addresses to each subnet and keep 14 further `/20` blocks of the `10.151.0.0/16` network free for future use (e.g., in other CNI configs).
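
The numbers in this example can be checked with Python's standard `ipaddress` module. The sketch below is only an illustration; the helper `summarize` is a hypothetical name, and the subnets are the ones from the example above.

```python
import ipaddress

parent = ipaddress.ip_network("10.151.0.0/16")
docker_default = ipaddress.ip_network("10.151.0.0/20")  # containers without the secure-bridge
secure_bridge = ipaddress.ip_network("10.151.16.0/20")  # containers using the secure-bridge


def summarize(parent, used, prefix=20):
    """Return (whether all used subnets are disjoint, number of unused blocks)."""
    disjoint = all(
        not a.overlaps(b) for i, a in enumerate(used) for b in used[i + 1:]
    )
    # Split the parent network into equally sized blocks and drop the used ones.
    blocks = list(parent.subnets(new_prefix=prefix))
    free = [b for b in blocks if not any(b.overlaps(u) for u in used)]
    return disjoint, len(free)


if __name__ == "__main__":
    print(docker_default.num_addresses, secure_bridge.num_addresses)
    print(summarize(parent, [docker_default, secure_bridge]))
```

The same helper can be reused to validate a different choice of subnets before writing them into the Docker daemon and CNI configurations.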

### Use gVisor as a sandbox

We recommend using gVisor as a sandbox for the execution environments. First, [install gVisor following the official documentation](https://gvisor.dev/docs/user_guide/install/). Second, adapt `/etc/docker/daemon.json` with reasonable defaults as shown in our [example configuration for Docker](./resources/docker.daemon.json).

## Supported Docker Images

In general, any Docker image can be used as an execution environment.

### Users

If the `privilegedExecution` flag is set to `true` during execution, no additional user is required. Otherwise, the following two requirements must be met:

- A non-privileged user called `user` needs to be present in the image. This user is used to execute the code.
- The Docker image needs to have a `/sbin/setuser` script allowing the execution of the user code as a non-root user, similar to `/usr/bin/su`.

### Executable Commands

In order to function properly, Poseidon expects the following commands to be available within the PATH:

- `cat`
- `env`
- `ls`
- `mkfifo`
- `rm`
- `bash` (not compatible with `sh` or `zsh`)
- `sleep`
- `tar` (including the `--absolute-names` option)
- `true`
- `unset`
- `whoami`

Tests need additional commands:

- `echo`
- `head`
- `id`
- `make`
- `tail`
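
A quick way to check a candidate image is to look up these commands from inside the container. The following Python sketch is one possible approach, assuming Python is available in the image (Poseidon itself does not require it); `missing_commands` is a hypothetical helper. It treats `unset` as a bash builtin rather than a file in the PATH and does not verify that `tar` supports `--absolute-names`.

```python
import shutil

REQUIRED = ["cat", "env", "ls", "mkfifo", "rm", "bash", "sleep", "tar",
            "true", "unset", "whoami"]
TEST_ONLY = ["echo", "head", "id", "make", "tail"]
# Assumption: provided by bash itself, so there is no executable to find.
SHELL_BUILTINS = {"unset"}


def missing_commands(commands, lookup=shutil.which):
    """Return the commands that `lookup` cannot find, skipping shell builtins."""
    return [c for c in commands if c not in SHELL_BUILTINS and lookup(c) is None]


if __name__ == "__main__":
    print("missing for Poseidon:", missing_commands(REQUIRED))
    print("missing for tests:", missing_commands(REQUIRED + TEST_ONLY))
```

An empty list means every listed command was found in the PATH of the environment in which the script ran.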
136
docs/development.md
Normal file
@ -0,0 +1,136 @@
# Development

## Setup

If you haven't installed Go on your system yet, follow the [golang installation guide](https://golang.org/doc/install).

To get your local setup going, run `make bootstrap`. It installs all required dependencies and sets up our git hooks. Run `make help` to get an overview of the available make targets.

The project can be compiled using `make build`. This should create a binary which can then be executed.

Alternatively, the `go run ./cmd/poseidon` command can be used to automatically compile and run the project.

### URLs

Once you have completed the project setup, you can check that Poseidon is available using the following URL:

```http request
http://localhost:7200/api/v1/version
```

Using the prefix `/api/v1`, all routes described in the [API documentation](../api/swagger.yaml) are available and can thus be used in conjunction with [CodeOcean](https://github.com/openHPI/codeocean).

## Tests

As a testing framework, we use the [testify](https://github.com/stretchr/testify) toolkit.

Run `make test` to run the unit tests.

### Mocks

For mocks, we use [mockery](https://github.com/vektra/mockery). You can create a mock for an interface of your choice by running:

```bash
make mock name=INTERFACE_NAME pkg=./PATH/TO/PKG
```

For example, for an interface called `ExecutorApi` in the package `nomad`, you might run

```bash
make mock name=ExecutorApi pkg=./nomad
```

If the interface changes, you can rerun this command (you may need to delete the mock file first to avoid errors).

Mocks can also be generated by using mockery directly on a specific interface. To do this, first navigate to the package the interface is defined in. Then run

```bash
mockery \
  --name=<<interface_name>> \
  --structname=<<interface_name>>Mock \
  --filename=<<interface_name>>Mock.go \
  --inpackage
```

For example, for an interface called `ExecutorApi` in the package `nomad`, you might run

```bash
mockery \
  --name=ExecutorApi \
  --structname=ExecutorAPIMock \
  --filename=ExecutorAPIMock.go \
  --inpackage
```

Note that by default, the mocks are created in a `mocks` sub-folder. However, in some cases (if the mock implements private interface methods), the mock needs to be in the same package as the interface it is mocking. The `--inpackage` flag can be used to avoid creating it in a subdirectory.

### End-to-end tests

For e2e tests, we provide a separate package. The e2e tests require a connection to a Nomad cluster.
Run `make e2e-tests` to run the e2e tests. This requires Poseidon to be already running.
Alternatively, you can run `make e2e-docker` to run the API in a Docker container and the e2e tests afterwards.
You can use the `DOCKER_OPTS` variable to pass additional arguments to the Docker run command that runs the API. By default, it is set to `-v $(shell pwd)/configuration.yaml:/configuration.yaml`, which means that your local configuration file is mounted into the container. If you don't want this, use the following command:

```shell
$ make e2e-docker DOCKER_OPTS=""
```

### Local Nomad

In order to support the development of Poseidon, a local Nomad dev server is recommended. Following the instructions below, you can set up a Nomad server on your local system that won't persist any data between restarts. More details can be found on [Nomad's official website](https://www.nomadproject.io/docs/install).

#### macOS

```shell
brew tap hashicorp/tap
brew install hashicorp/tap/nomad
brew services start nomad
```

**Prerequisites**: [Docker for Mac](https://docs.docker.com/desktop/mac/install/) is installed and started:
```shell
brew install --cask docker
```

**Note**: Due to the architecture of Docker networking on macOS, the bridge network is not available with Nomad. Please refer to the [Nomad FAQ](https://www.nomadproject.io/docs/faq#q-how-to-connect-to-my-host-network-when-using-docker-desktop-windows-and-macos) for more information. As a result, environments that have network access enabled won't sync properly to Nomad and thus cannot be started.

#### Linux

```shell
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt-get update && sudo apt-get install nomad
sudo nomad agent -dev
```

#### Namespace registration

As the Nomad dev server does not persist any data, the namespace selected in the configuration of Poseidon needs to be created each time the Nomad server is started. This can be done with the following command:

```shell
nomad namespace apply -description "Poseidon development namespace" poseidon
```

Alternatively, the namespace used by Poseidon can be updated to `default` so that no additional namespace is required.

## Coding Style

### Git hooks

The repository contains a git pre-commit hook which runs the Go formatting tool `gofmt` to ensure the code is formatted properly before committing. To enable it, run `make git-hooks`.

### Linter

To lint our source code and ensure a common code style in the codebase, we use [Golang CI Lint](https://golangci-lint.run/usage/install/#local-installation) as a linter. Use `make lint` to execute it.

## Continuous Integration

We use GitLab CI to automatically build the project, run unit and e2e tests, perform an automated dependency check, and deploy instances of the API.

### Docker

The CI builds a Docker image and pushes it to the Docker registry associated with this repo. Execute `sudo docker run -p 7200:7200 ghcr.io/openhpi/poseidon` to run the image locally. You can find all available images on the [package listing on GitHub](https://github.com/openHPI/poseidon/pkgs/container/poseidon). Once started, you can interact with the webserver on your local port 7200.

You can also build the Docker image locally by executing `make docker` in the root directory of this project. It builds the binary first and a container with the tag `poseidon:latest` afterwards. You can then start a Docker container with `sudo docker run --rm -p 7200:7200 poseidon:latest`.
44
docs/nomad_usage.md
Normal file
@ -0,0 +1,44 @@
# Nomad Usage

Poseidon is an abstraction of the functionality provided by Nomad. In the following, we look at how Poseidon uses Nomad's functionality.

Nomad is structured in different levels of abstraction. Jobs are collected in namespaces. Each Job can contain several Task Groups. Each Task Group can contain several Tasks. Finally, Allocations map Task Groups to Nomad Clients. For more insights, take a look at [the official description](https://www.nomadproject.io/docs/internals/architecture).
In our case, a Task is executed in a Docker container.



## Execution environments as template Jobs

Execution Environments are mapped to Nomad Jobs. In the following, we will call these Jobs `Template Jobs`.
The naming schema for Template Jobs is `template-<execution-environment-id>`.

The figure above shows the structure in Nomad.
Each template Job contains a "config" Task Group including a "config" Task. This Task does not perform any computations but is used to store environment-specific attributes, such as the prewarming pool size.
In addition, the template Job contains a "default-group" Task Group with a "default-task" Task. In this Task, `sleep infinity` is executed so that the Task remains active and is ready for dynamic executions in the container.

As shown in the figure, the "config" Task Group has no Allocation, while the "default-group" has an Allocation.
This is because the "config" Task Group only stores information but does not execute anything.
In the "default-group", the user's code submissions are executed. Therefore, Nomad creates an Allocation that points the Task Group to a Nomad Client for execution.
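
The structure described in this section can be modelled with a few data classes. This is a simplified, hypothetical model for illustration and not Nomad's or Poseidon's actual data types; in particular, expressing "no Allocation" through a `count` of `0` is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    name: str
    command: Optional[str] = None  # None: the Task does not run anything


@dataclass
class TaskGroup:
    name: str
    tasks: List[Task]
    count: int  # assumed number of Allocations requested for this group


@dataclass
class Job:
    id: str
    groups: List[TaskGroup] = field(default_factory=list)


def template_job(environment_id):
    """Model the template Job of one execution environment."""
    return Job(
        id=f"template-{environment_id}",
        groups=[
            # Only stores environment attributes, so it needs no Allocation.
            TaskGroup("config", [Task("config")], count=0),
            # Keeps a container alive that is ready for executions.
            TaskGroup("default-group",
                      [Task("default-task", "sleep infinity")], count=1),
        ],
    )


if __name__ == "__main__":
    print(template_job(10))
```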

## Runner as Nomad Jobs

As an abstraction of the execution engine, we use the term `Runner` to describe Docker containers (currently used) or microVMs.
If a user requests a new runner, Poseidon duplicates the template Job of the corresponding environment.

When a user then executes their code, Poseidon copies the code into the container and executes it.

## Prewarming

To reduce the response time when claiming a runner, Poseidon creates a pool of runners that have been started in advance.
When a user requests a runner, a runner from this pool can be used.
In the background, a new runner is created, thus replenishing the pool.
Because the new runner is started in the background, the user does not have to wait for it to start.
The implementation of this concept can be seen in [the Runner Manager](/internal/runner/manager.go).
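
The idea behind the pool can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the Runner Manager's implementation: `PrewarmingPool` and `start_runner` are hypothetical names, and the replacement runner is created inline, whereas Poseidon creates it in the background.

```python
from collections import deque
from itertools import count


class PrewarmingPool:
    """Keep `size` idle runners ready and replace each one that is claimed."""

    def __init__(self, size, start_runner):
        # `start_runner` is a hypothetical factory, e.g. one that starts a Nomad Job.
        self._start_runner = start_runner
        self._idle = deque(start_runner() for _ in range(size))

    def claim(self):
        """Hand out an idle runner and replenish the pool."""
        if not self._idle:
            raise LookupError("no idle runner available")
        runner = self._idle.popleft()
        # Done inline to keep the sketch short; a real system would not
        # make the caller wait for the replacement runner to start.
        self._idle.append(self._start_runner())
        return runner

    def idle_count(self):
        return len(self._idle)


if __name__ == "__main__":
    ids = count(1)
    pool = PrewarmingPool(2, lambda: f"runner-{next(ids)}")
    print(pool.claim(), pool.idle_count())
```

Claiming a runner leaves the number of idle runners unchanged, which is the property that keeps response times low for the next request.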

### Lifecycle

The prewarming pool is initialized when a new environment is created, according to the requested prewarming pool size.

Every change to the environment (resource constraints, prewarming pool size, network access) leads to the destruction of the environment, including all used and idle runners (the prewarming pool). After that, the environment and its prewarming pool are re-created.

Other causes that lead to the destruction of the prewarming pool are the explicit deletion of the environment via the API route, or a force update requested via the `GET /execution-environments/{id}?fetch=true` route when the corresponding template job for the given environment is no longer available on Nomad. The latter case should not occur in normal operation, but could arise from manually deleting the template job, scheduling issues on Nomad, or other unforeseeable edge cases.
BIN
docs/resources/OverviewCodeOceanPoseidonNomad.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 184 KiB |
1
docs/resources/OverviewPoseidonNomadMapping.drawio
Normal file
@ -0,0 +1 @@
<mxfile host="app.diagrams.net" modified="2021-07-29T09:10:40.306Z" agent="5.0 (X11)" etag="xc0NA0uUuogw2a5ns6QW" version="14.9.2" type="device"><diagram id="_cTqiwv0DkfbUgT-qijx" name="Page-1">7Vxbc+I2FP41PCZjyxfgMckm23bSTtpkp7uPCha2urZFbUGgv76SLeGLDDEYEE4gmcE6ulg+3/mOdGSJgXUXLb8mcBb8TjwUDoDhLQfWlwEApg3AgP8b3iqXDF07F/gJ9kShQvCM/0NCaAjpHHsorRSkhIQUz6rCCYljNKEVGUwS8lYtNiVh9a4z6CNF8DyBoSr9G3s0yKUjMCzkvyDsB/LOpjvOcyIoC4snSQPokbeSyLofWHcJITS/ipZ3KOTKk3rJ6z1syF13LEExbVNh/P3XW/LtzwUdBuljtHiYPl75V6KVBQzn4oEHwA1Ze7dTwpplvaYroQr33zmRGVdpBtQNK8C6sCwy2ZUvvrNWXqXgiaQIeySWGayfr/XCTJbfVYpBpQMgIfPYQ/xxDJb9FmCKnmdwwnPfmPUxWUCjkKVMdrlACcUMx5sQ+zGTUTJbt8nz0HKjHs01OsysEYkQTVasyFuBP3AEqEEJ+6FEGgqb89d1C1jYhUBmB5TARpQUHXaD7Q8SQW8jHE3InRdEogJwBRIlyBx3eErIrM7EMl2GUKb2OkgwQmmm1FPwJsKexxu6neIwvCMhSbJGranD/5g8pQn5iUo5bvYRD1SS5x8hF57edA+DuD1y8ipimGkwABM0cNY9Fv5DBQXksYFFJElCA+KTGIb3hfS2ilNR5pFwYmTo/IMoXQndwTklVezQEtPvvPq1I1I/RGP8+suynFjJRMwet1SJJ3/I9niiqJalZL06iCj2bvh4y9IehhGJvZcAx3nGA+bKE43IQR4Ya+i5arYDzzRJ5skEbdG4mFdQmPiIvsdM1ZASFEKKF9V+HNwsbMUt/EZe08/IWDA0Kow1Rw0+u2mYBdaxOOsq4LygaMbMgnlag+HUDaa6DjfCdgDluk5VuWudafOHpqlo7wQOcYuTmoQwTfGk4qDMioNyO3jHfTzxAX3hqKUvdDr6QlH1ieBsTiOszxpXrc8yalaV90vUKgyLQQNXpWIzXiDdch9QHfQtsxYBvdevanl2kfegsPK1TjoYvtFPw9drwONTGvCudmfbPbC7sTKYXV9f92cEs42azoA6go1POoCpQfgLTH8yyVemxNllDsdBawCpcQ43PhpKatx9E4ZkwrwBiT8nRu7ouuquHP0oqWEQ59LnxKfGIVc7OgDomLEcYR2i3UzngLMW0205bXE7Tlu6sU+NcycknmK/P7ODNUnkwGM5emcHQEt4+yE4M2zJGVMvaYYfjjSudtJYF9LsSZq2Czx6OTNSOOOhKZyH9MrnMVN/qDM8u/HGuVBnT+q0XVoqbdLQwR11/UZyh2bLDn2ljv5Rx75QZz/qSCTfn6qNdFJHdrNEHbheA+ovcRzdxLEuxNmXOG3XBfTypv/rAmB0bvM0PfuAPgJn2oY4QGuMA9QYp++k0T5Dsy7Bzb6kaRvcdB1o9ntvXtuvId/xbnxvvr38cd6bg82BV88WLaxzGwzlJrQLr3fltSWPvry7uVXr7lbZzd4vWtSpo39IBBfq7Ekdq+08cqyVOurGlT4uWtSJo3/RQnMAVqZNiUW7EycmMaqwxlBY48E0yHrNu5BvbpFnFsGBSdV2QUPvcKQuaMgjUx5eyONSz5QfAS2OUpWyGkrvfIoLNJ2ze0qIN5/k1GbPZvA+4NjPE9k+za1H8F6ThkN5lU528Rf93EflntkBMFvzjpAP6njaBrhaF4UsNYa8X6LJPCe8cR8vcELiCInjD5+bqfrPJjkKWn0992VZxpnp1ta8x2c3N1hxZN184lHdoA1aukGtMY3s5fb511/zOEbJZSKzwT
02HXw5LYXVNxwSsn45x/p7jiMeKWLJ4odd8pXp4udxrPv/AQ==</diagram></mxfile>
BIN
docs/resources/OverviewPoseidonNomadMapping.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 1.5 MiB |
17
docs/resources/client.example.hcl
Normal file
@ -0,0 +1,17 @@
client {
  enabled = true
  servers = [
    "server domain 1",
    "server domain 2"
  ]
  cni_path = "/usr/lib/cni"
}

plugin "docker" {
  config {
    allow_runtimes = ["runsc"]
    gc {
      image_delay = "0s"
    }
  }
}
17
docs/resources/docker.daemon.json
Normal file
@ -0,0 +1,17 @@
{
  "dns": [
    "8.8.8.8",
    "8.8.4.4"
  ],
  "dns-search": [
    "codeocean.internal"
  ],
  "default-runtime": "runsc",
  "runtimes": {
    "runsc": {
      "path": "/usr/bin/runsc",
      "runtimeArgs": [
      ]
    }
  }
}
28
docs/resources/nomad.example.hcl
Normal file
@ -0,0 +1,28 @@
# Full configuration options can be found at https://www.nomadproject.io/docs/configuration

data_dir = "/opt/nomad/data"
bind_addr = "0.0.0.0"

limits {
  http_max_conns_per_client = 0
}

# Require TLS
tls {
  http = true
  rpc = true

  ca_file = "/home/ubuntu/ca.crt"
  cert_file = "/home/ubuntu/cert.crt"
  key_file = "/home/ubuntu/cert-key.pem"

  verify_server_hostname = true
  verify_https_client = true
}

# telemetry {
# collection_interval = "10s"
# prometheus_metrics = true
# publish_allocation_metrics = true
# publish_node_metrics = true
# }
30
docs/resources/poseidon_policy.hcl
Normal file
@ -0,0 +1,30 @@
// Allow-all access policy

namespace "*" {
  policy = "write"
  capabilities = ["alloc-node-exec", "read-job"]
}

agent {
  policy = "write"
}

operator {
  policy = "write"
}

quota {
  policy = "write"
}

node {
  policy = "write"
}

host_volume "*" {
  policy = "write"
}

plugin {
  policy = "read"
}
105
docs/resources/secure-bridge.conflist
Normal file
@ -0,0 +1,105 @@
{
  "cniVersion": "0.4.0",
  "name": "secure-bridge",
  "plugins": [
    {
      "type": "loopback"
    },
    {
      "type": "bridge",
      "bridge": "nomad-filtered",
      "ipMasq": true,
      "isGateway": true,
      "forceAddress": true,
      "dns":{
        "nameservers":[
          "8.8.8.8",
          "8.8.4.4",
          "2001:4860:4860::8888",
          "2001:4860:4860::8844"
        ],
        "domain": "poseidon.internal",
        "search": [
          "poseidon.internal"
        ]
      },
      "ipam": {
        "type": "host-local",
        "ranges": [
          [
            {
              "subnet": "10.151.16.0/20"
            }
          ],
          [
            {
              "subnet": "fd00:2::/64"
            }
          ]
        ],
        "routes": [
          { "dst": "0.0.0.0/5" },
          { "dst": "8.0.0.0/7" },
          { "dst": "11.0.0.0/8" },
          { "dst": "12.0.0.0/6" },
          { "dst": "16.0.0.0/4" },
          { "dst": "32.0.0.0/3" },
          { "dst": "64.0.0.0/2" },
          { "dst": "128.0.0.0/3" },
          { "dst": "160.0.0.0/5" },
          { "dst": "168.0.0.0/8" },
          { "dst": "169.0.0.0/9" },
          { "dst": "169.128.0.0/10" },
          { "dst": "169.192.0.0/11" },
          { "dst": "169.224.0.0/12" },
          { "dst": "169.240.0.0/13" },
          { "dst": "169.248.0.0/14" },
          { "dst": "169.252.0.0/15" },
          { "dst": "169.255.0.0/16" },
          { "dst": "170.0.0.0/8" },
          { "dst": "171.0.0.0/12" },
          { "dst": "171.32.0.0/11" },
          { "dst": "171.64.0.0/10" },
          { "dst": "171.128.0.0/9" },
          { "dst": "172.0.0.0/6" },
          { "dst": "176.0.0.0/4" },
          { "dst": "192.0.0.0/9" },
          { "dst": "192.128.0.0/11" },
          { "dst": "192.160.0.0/13" },
          { "dst": "192.169.0.0/16" },
          { "dst": "192.170.0.0/15" },
          { "dst": "192.172.0.0/14" },
          { "dst": "192.176.0.0/12" },
          { "dst": "192.192.0.0/10" },
          { "dst": "193.0.0.0/8" },
          { "dst": "194.0.0.0/7" },
          { "dst": "196.0.0.0/6" },
          { "dst": "200.0.0.0/5" },
          { "dst": "208.0.0.0/4" },
          { "dst": "224.0.0.0/3" },
          { "dst": "::/1" },
          { "dst": "8000::/2" },
          { "dst": "c000::/3" },
          { "dst": "e000::/4" },
          { "dst": "f000::/5" },
          { "dst": "f800::/6" },
          { "dst": "fe00::/9" },
          { "dst": "fec0::/10" },
          { "dst": "ff00::/8" }
        ]
      }
    },
    {
      "type": "firewall",
      "backend": "iptables",
      "iptablesAdminChainName": "NOMAD-ADMIN-FILTERED"
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      },
      "snat": true
    }
  ]
}
15
docs/resources/server.example.hcl
Normal file
@ -0,0 +1,15 @@
server {
  enabled = true
  bootstrap_expect = 2
  server_join {
    retry_join = ["<<other servers domain>>"]
    retry_max = 3
    retry_interval = "15s"
  }

  # https://www.nomadproject.io/docs/configuration/server
  default_scheduler_config {
    scheduler_algorithm = "spread"
    memory_oversubscription_enabled = true
  }
}
70
docs/security.md
Normal file
@ -0,0 +1,70 @@
# Security configurations

## TLS

⚠️ We strongly encourage the use of TLS for this API to increase security.

### Poseidon

To enable TLS, you need to create an appropriate certificate first.
You can do this in the same way [as for Nomad](https://learn.hashicorp.com/tutorials/nomad/security-enable-tls):

- `cfssl print-defaults csr | cfssl gencert -initca - | cfssljson -bare poseidon-ca`
- Copy `cfssl.json`
- `echo '{}' | cfssl gencert -ca=poseidon-ca.pem -ca-key=poseidon-ca-key.pem -config=cfssl.json -hostname="<<poseidon server hostname>>,localhost,127.0.0.1" - | cfssljson -bare poseidon-server`

Then, set `server.tls.active` or the corresponding environment variable to `true` and specify the `server.tls.certfile` and `server.tls.keyfile` options.

### Nomad

To enable TLS between Poseidon and Nomad, TLS first needs to be activated in Nomad. See [the Nomad documentation](https://learn.hashicorp.com/collections/nomad/transport-security) for a guide on how to do that.

Afterwards, it is *required* to set the `nomad.tls.active` config option to `true`, as Nomad will no longer accept any connections over HTTP. To make sure the authenticity of the Nomad host can be validated, the `nomad.tls.cafile` option has to point to a certificate of the signing authority.

If mutual TLS (mTLS) between Poseidon and Nomad is desired, the `nomad.tls.certfile` and `nomad.tls.keyfile` options can hold a client certificate. This certificate must be signed by the same CA as the certificates of the Nomad hosts. Note that mTLS can (and should) be enforced by Nomad in this case using the [verify_https_client](https://www.nomadproject.io/docs/configuration/tls#verify_https_client) configuration option.

Here are sample configurations for [all Nomad nodes](resources/nomad.example.hcl), [the Nomad servers](resources/server.example.hcl) and [the Nomad clients](resources/client.example.hcl).

## Authentication

⚠️ Don't use authentication without TLS enabled, as the token will otherwise be transmitted in clear text.

### Poseidon

⚠️ We encourage you to enable authentication for this API. If it is disabled, everyone with access to your API also has indirect access to your Nomad cluster, as this API uses it.

The API supports authentication via an HTTP header. To enable it, specify the `server.token` value in the `configuration.yaml` or the corresponding environment variable `POSEIDON_SERVER_TOKEN`.

Once configured, all requests to the API, except those to the `health` route, require the configured token in the `Poseidon-Token` header.

An example `curl` command with the configured token being `SECRET` looks as follows:

```bash
$ curl -H "Poseidon-Token: SECRET" http://localhost:7200/api/v1/some-protected-route
```
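
The same request can be built programmatically. The following Python sketch uses only the standard library; `authenticated_request` and `send` are hypothetical helper names, and the URL and token are the placeholders from the `curl` example. Note that `urllib` normalizes the capitalization of header names, which is harmless because HTTP header names are case-insensitive.

```python
import urllib.request


def authenticated_request(url, token):
    """Build a request that carries the token in the Poseidon-Token header."""
    return urllib.request.Request(url, headers={"Poseidon-Token": token})


def send(request, timeout=5):
    """Send the request and return the HTTP status; needs a running Poseidon."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

With Poseidon running locally, `send(authenticated_request("http://localhost:7200/api/v1/some-protected-route", "SECRET"))` would issue the request. In a real deployment, the URL should use `https`, in line with the TLS warning above.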

### Nomad

An alternative or additional measure to mTLS (as mentioned above) is to enable access control in the Nomad cluster to prevent unauthorised actors from performing unwanted actions in the cluster.
Instructions on setting up the cluster appropriately can be found in [the Nomad documentation](https://learn.hashicorp.com/collections/nomad/access-control).

Afterwards, it is recommended to create a specific [Access Policy](https://learn.hashicorp.com/tutorials/nomad/access-control-policies?in=nomad/access-control) for Poseidon with the minimal set of capabilities it needs for operating the cluster. A non-minimal example with complete permissions can be found [here](resources/poseidon_policy.hcl). Poseidon requires a corresponding [Access Token](https://learn.hashicorp.com/tutorials/nomad/access-control-tokens?in=nomad/access-control) to send commands to Nomad. A Token looks like this:

```text
Accessor ID  = 463d3216-dc16-570f-380c-a48f5d26d955
Secret ID    = ea1ac4c5-892b-0bcc-9fc5-5faeb5273a13
Name         = Poseidon access token
Type         = client
Global       = false
Policies     = [poseidon]
Create Time  = 2021-07-26 12:45:11.437786378 +0000 UTC
Create Index = 246238
Modify Index = 246238
```

The `Secret ID` of the Token needs to be specified as the value of `nomad.token` in the `configuration.yaml` or in the corresponding environment variable `POSEIDON_NOMAD_TOKEN`. It may also be required for authentication in the Nomad Web UI and for using the Nomad CLI on the Nomad hosts (where the token can be specified via the `NOMAD_TOKEN` environment variable).

Once configured, all requests to the Nomad API automatically contain an `X-Nomad-Token` header containing the token.

⚠️ Make sure that no (overly permissive) `anonymous` access policy is present in the cluster after the policy for Poseidon has been added. Anyone can perform actions as specified by this special policy without authenticating!