added poseidon with aws to k8s changes

Elmar Kresse
2024-08-12 10:02:36 +02:00
parent 5376f7a027
commit 254460d64c
60 changed files with 6912 additions and 0 deletions

138
docs/configuration.md Normal file

@ -0,0 +1,138 @@
# Configuration
Poseidon can be configured to suit different use cases.
## Poseidon
The file `config/config.go` defines a configuration struct with all available configuration options for Poseidon, along with default values for most of them.
The options *can* be overridden with a YAML configuration file whose path is set with the `-config` flag. By default, Poseidon searches for `configuration.yaml` in the working directory. `configuration.example.yaml` is an example configuration file and explains all options. The keys of the options in the configuration file must be written in lowercase.
The options *can* also be overridden by environment variables. Currently, only the Go types `string`, `int`, `bool` and `struct` (nested) are supported. The name of the environment variable is constructed as follows: `POSEIDON_(<name of nested struct>_)*<name of field>` (all letters uppercase).
The precedence of configuration possibilities is:
1. Environment variables
2. Configuration file
3. Default values
If a value is not specified at one level, the value from the next level in this list is used.
### Example
- The default value for the `Port` field (type `int`) in the `Server` field (type `struct`) of the configuration is `7200`.
- This can be overridden with the following `configuration.yaml`:
```yaml
server:
  port: 4000
```
- This, in turn, can be overridden by the environment variable `POSEIDON_SERVER_PORT`, e.g., using `export POSEIDON_SERVER_PORT=5000`.
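The precedence rules can be sketched as follows. This is a minimal Python illustration and not Poseidon's Go implementation: the helper names `env_name`, `lookup` and `resolve` are hypothetical, the configuration file is represented as an already-parsed nested dictionary, and only `int` and `string` values are converted.

```python
import os


def env_name(path):
    """Build the variable name for a nested option, e.g. ("server", "port")."""
    return "POSEIDON_" + "_".join(part.upper() for part in path)


def lookup(config, path):
    """Walk a nested dict along path; return None if any key is missing."""
    node = config
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def resolve(path, defaults, file_config, environ=os.environ):
    """Return the effective value: environment variable, then file, then default."""
    default = lookup(defaults, path)
    raw = environ.get(env_name(path))
    if raw is not None:
        # Environment values are strings; convert integers based on the default.
        if isinstance(default, int) and not isinstance(default, bool):
            return int(raw)
        return raw
    from_file = lookup(file_config, path)
    return default if from_file is None else from_file


defaults = {"server": {"port": 7200}}
file_config = {"server": {"port": 4000}}
print(resolve(("server", "port"), defaults, {}, environ={}))           # 7200
print(resolve(("server", "port"), defaults, file_config, environ={}))  # 4000
print(resolve(("server", "port"), defaults, file_config,
              environ={"POSEIDON_SERVER_PORT": "5000"}))                # 5000
```

Passing `environ` explicitly keeps the sketch independent of the variables set in the current shell.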
### Systemd
Poseidon can be configured to run as a systemd service. Poseidon can optionally also be configured to use a systemd socket.
Using systemd provides capabilities for managing Poseidon's state and enables zero-downtime deployments.
Minimal examples for systemd configurations can be found in `.github/workflows/resources`.
## Nomad
As a subsystem of Poseidon, Nomad can and should also be configured accordingly.
### Memory Oversubscription
Poseidon uses Nomad's memory oversubscription feature. This way, all runners are allocated with just 16 MB. The memory limit defined per execution environment is used as an upper bound for the memory oversubscription.
On the one hand, this feature allows Nomad to execute many more runners in parallel; on the other hand, it introduces a risk of overloading the Nomad host. Still, this feature is obligatory for Poseidon to work and therefore needs to be enabled. [Example Configuration](./resources/server.example.hcl)
```hcl
default_scheduler_config {
  memory_oversubscription_enabled = true
}
```
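To illustrate the trade-off, the following sketch compares how many runners fit on one host with and without oversubscription. Only the 16 MB reservation comes from the text above; the host size of 8192 MB and the per-environment limit of 512 MB are made-up numbers, and the sketch assumes that placement considers only the reserved memory.

```python
RESERVED_MB = 16  # memory reserved per runner, as stated above


def schedulable_runners(host_memory_mb, per_runner_mb):
    """Number of runners that fit if each one reserves per_runner_mb."""
    return host_memory_mb // per_runner_mb


host_memory_mb = 8192   # hypothetical Nomad client
memory_limit_mb = 512   # hypothetical limit of one execution environment

without_oversubscription = schedulable_runners(host_memory_mb, memory_limit_mb)
with_oversubscription = schedulable_runners(host_memory_mb, RESERVED_MB)
print(without_oversubscription)  # 16
print(with_oversubscription)     # 512

# Worst case: every runner uses its full limit at the same time.
print(with_oversubscription * memory_limit_mb)  # 262144 MB requested on an 8192 MB host
```

The last number shows where the risk of overloading the host comes from: the limits add up to far more memory than the host provides.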
### Scheduler
By default, Nomad uses a bin-packing scheduler, which packs Jobs onto as few hosts as possible. In our case, a high load then leads to one Nomad client being fully utilised while the others remain mostly idle.
To mitigate the overload of a Nomad client, the ["spread" scheduler algorithm](https://www.nomadproject.io/api-docs/operator/scheduler#update-scheduler-configuration) should be used.
### Maximum Connections per Client
By default, Nomad allows at most 100 concurrent connections per client. As Poseidon acts as a single client, this limit would significantly impair the performance of Poseidon. Thus, the limit should be disabled.
To do so, ensure the following configuration is set in your Nomad agents, for example by adding it to `/etc/nomad.d/base.hcl`:
```hcl
limits {
  http_max_conns_per_client = 0
}
```
### Enable Networking Support in Nomad
In order to allow full networking support in Nomad, the `containernetworking-plugins` are required on the host. They can be either installed manually or through a package manager. In the latter case, the installation path might differ. Hence, add the following line to the `client` directive of the Nomad configuration in `/etc/nomad.d/client.hcl`:
```hcl
cni_path = "/usr/lib/cni"
```
If the path is not set up correctly or the dependency is missing, the following error will be shown in Nomad: `failed to find plugin "bridge" in path [/opt/cni/bin]`
Additionally, we provide a [secure-bridge](./resources/secure-bridge.conflist) configuration for the `containernetworking-plugins`. We highly recommend using this configuration, as it will automatically configure an appropriate firewall and isolate your local network. Store the [secure-bridge](./resources/secure-bridge.conflist) in an (otherwise) empty folder and specify that folder in `/etc/nomad.d/client.hcl`:
```hcl
cni_config_dir = "<path to folder with *.conflist>"
```
If the path is not set up correctly or with a different name, the placement of allocations will fail in Nomad: `Constraint missing network filtered [all] nodes`. Be sure to set the "dns" and "dns-search" options in `/etc/docker/daemon.json` with reasonable defaults, for example with those shown in our [example configuration for Docker](./resources/docker.daemon.json).
### Network range
The default subnet range for Docker containers can be adjusted.
This can be done both in the Docker daemon configuration and the CNI secure-bridge configuration.
Accordingly, every container using the secure-bridge will receive an IP address from the range defined in the CNI configuration.
The two subnet ranges must not overlap.
An example configuration could use `10.151.0.0/20` for all containers without the CNI secure-bridge and `10.151.16.0/20` for all containers using the CNI secure-bridge.
This would grant 4096 IP addresses to each subnet and keep 14 of the 16 `/20` blocks in the `10.151.0.0/16` network free for future use (e.g., in other CNI configs).
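The numbers in this example can be checked with Python's standard `ipaddress` module. The variable names are chosen for illustration only.

```python
import ipaddress

docker_range = ipaddress.ip_network("10.151.0.0/20")
cni_range = ipaddress.ip_network("10.151.16.0/20")
parent = ipaddress.ip_network("10.151.0.0/16")

# The two ranges must not share any address.
print(docker_range.overlaps(cni_range))  # False

# Each /20 network spans 2**(32 - 20) = 4096 addresses.
print(docker_range.num_addresses, cni_range.num_addresses)  # 4096 4096

# A /16 splits into 16 blocks of size /20; two of them are in use here.
blocks = list(parent.subnets(new_prefix=20))
free = [block for block in blocks if block not in (docker_range, cni_range)]
print(len(blocks), len(free))  # 16 14
```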
### Use gVisor as a sandbox
We recommend using gVisor as a sandbox for the execution environments. First, [install gVisor following the official documentation](https://gvisor.dev/docs/user_guide/install/) and second, adapt the `/etc/docker/daemon.json` with reasonable defaults as shown in our [example configuration for Docker](./resources/docker.daemon.json).
## Supported Docker Images
In general, any Docker image can be used as an execution environment.
### Users
If the `privilegedExecution` flag is set to `true` during execution, no additional user is required. Otherwise, the following two requirements must be met:
- A non-privileged user called `user` needs to be present in the image. This user is used to execute the code.
- The Docker image needs to have a `/sbin/setuser` script allowing the execution of the user code as a non-root user, similar to `/usr/bin/su`.
### Executable Commands
In order to function properly, Poseidon expects the following commands to be available within the PATH:
- `cat`
- `env`
- `ls`
- `mkfifo`
- `rm`
- `bash` (not compatible with `sh` or `zsh`)
- `sleep`
- `tar` (including the `--absolute-names` option)
- `true`
- `unset`
- `whoami`
Tests need additional commands:
- `echo`
- `head`
- `id`
- `make`
- `tail`
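A candidate image can be checked against both lists with a small script. The sketch below is an assumption-laden helper, not part of Poseidon: it expects Python to be available inside the container, the function names are hypothetical, and it resolves names through `bash` so that builtins such as `unset` are found as well. It does not verify that `tar` supports `--absolute-names`.

```python
import subprocess

REQUIRED = ["cat", "env", "ls", "mkfifo", "rm", "bash", "sleep", "tar",
            "true", "unset", "whoami"]
TEST_ONLY = ["echo", "head", "id", "make", "tail"]


def available_in_bash(name):
    """Return True if bash resolves name as a builtin or an executable in PATH."""
    try:
        result = subprocess.run(
            ["bash", "-c", 'command -v "$1"', "bash", name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # bash itself is not installed.
        return False
    return result.returncode == 0


def missing_commands(names, is_available=available_in_bash):
    """Return the names that cannot be resolved, in their original order."""
    return [name for name in names if not is_available(name)]


if __name__ == "__main__":
    print("missing:", missing_commands(REQUIRED + TEST_ONLY))
```

An empty list means all commands were found. The `is_available` parameter allows swapping in a different lookup, for example when checking an image from the outside.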

136
docs/development.md Normal file

@ -0,0 +1,136 @@
# Development
## Setup
If you haven't installed Go on your system yet, follow the [golang installation guide](https://golang.org/doc/install).
To get your local setup going, run `make bootstrap`. It installs all required dependencies and sets up our git hooks. Run `make help` to get an overview of available make targets.
The project can be compiled using `make build`. This should create a binary which can then be executed.
Alternatively, the `go run ./cmd/poseidon` command can be used to automatically compile and run the project.
### URLs
Once you have completed the project setup, you can check the availability using the following URL:
```http request
http://localhost:7200/api/v1/version
```
Using the prefix `/api/v1`, all routes as described in [API documentation](../api/swagger.yaml) are available and thus can be used in conjunction with [CodeOcean](https://github.com/openHPI/codeocean).
## Tests
As a testing framework, we use the [testify](https://github.com/stretchr/testify) toolkit.
Run `make test` to run the unit tests.
### Mocks
For mocks we use [mockery](https://github.com/vektra/mockery). You can create a mock for the interface of your choice by running
```bash
make mock name=INTERFACE_NAME pkg=./PATH/TO/PKG
```
on a specific interface.
For example, for an interface called `ExecutorApi` in the package `nomad`, you might run
```bash
make mock name=ExecutorApi pkg=./nomad
```
If the interface changes, you can rerun this command (deleting the mock file first to avoid errors may be necessary).
Mocks can also be generated by using mockery directly on a specific interface. To do this, first navigate to the package the interface is defined in. Then run
```bash
mockery \
  --name=<<interface_name>> \
  --structname=<<interface_name>>Mock \
  --filename=<<interface_name>>Mock.go \
  --inpackage
```
For example, for an interface called `ExecutorApi` in the package `nomad`, you might run
```bash
mockery \
  --name=ExecutorApi \
  --structname=ExecutorAPIMock \
  --filename=ExecutorAPIMock.go \
  --inpackage
```
Note that by default, the mocks are created in a `mocks` sub-folder. However, in some cases (if the mock implements private interface methods), the mock needs to be in the same package as the interface it is mocking. The `--inpackage` flag can be used to avoid creating it in a subdirectory.
### End-to-end tests
For e2e tests, we provide a separate package. The e2e tests require a connection to a Nomad cluster.
Run `make e2e-tests` to run the e2e tests. This requires Poseidon to be already running.
Alternatively, you can run `make e2e-docker` to run the API in a Docker container, and the e2e tests afterwards.
You can use the `DOCKER_OPTS` variable to pass additional arguments to the Docker run command that runs the API. By default, it is set to `-v $(shell pwd)/configuration.yaml:/configuration.yaml`, which means your local configuration file is mounted into the container. If you don't want this, use the following command.
```shell
$ make e2e-docker DOCKER_OPTS=""
```
### Local Nomad
To support the development of Poseidon, a local Nomad dev server is recommended. Following the instructions below, you can set up a Nomad server on your local system that won't persist any data between restarts. More details can be found on [Nomad's official website](https://www.nomadproject.io/docs/install).
#### macOS
```shell
brew tap hashicorp/tap
brew install hashicorp/tap/nomad
brew services start nomad
```
**Prerequisites**: [Docker for Mac](https://docs.docker.com/desktop/mac/install/) is installed and started:
```shell
brew install --cask docker
```
**Note**: Due to the architecture of Docker networking on macOS, the bridge network is not available with Nomad. Please refer to the [Nomad FAQ](https://www.nomadproject.io/docs/faq#q-how-to-connect-to-my-host-network-when-using-docker-desktop-windows-and-macos) for more information. As a result, environments that have network access enabled won't sync properly to Nomad and thus cannot be started.
#### Linux
```shell
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt-get update && sudo apt-get install nomad
sudo nomad agent -dev
```
#### Namespace registration
As the Nomad dev server does not persist any data, the namespace selected in the configuration of Poseidon needs to be created each time the Nomad server is started. This can be done with the following command:
```shell
nomad namespace apply -description "Poseidon development namespace" poseidon
```
Alternatively, the namespace used by Poseidon can be updated to `default` so that no additional namespace is required.
## Coding Style
### Git hooks
The repository contains a git pre-commit hook which runs the Go formatting tool `gofmt` to ensure the code is formatted properly before committing. To enable it, run `make git-hooks`.
### Linter
To lint our source code and ensure a common code style in the codebase we use [Golang CI Lint](https://golangci-lint.run/usage/install/#local-installation) as a linter. Use `make lint` to execute it.
## Continuous Integration
We use GitLab CI to automatically build the project, run unit and e2e tests, perform an automated dependency check and deploy instances of the API.
### Docker
The CI builds a Docker image and pushes it to the Docker registry associated with this repo. Execute `sudo docker run -p 7200:7200 ghcr.io/openhpi/poseidon` to run the image locally. You can find all available images on the [package listing on GitHub](https://github.com/openHPI/poseidon/pkgs/container/poseidon). Once started, you can then interact with the webserver on your local port 7200.
You can also build the Docker image locally by executing `make docker` in the root directory of this project. It builds the binary first and a container with the tag `poseidon:latest` afterwards. You can then start a Docker container with `sudo docker run --rm -p 7200:7200 poseidon:latest`.

44
docs/nomad_usage.md Normal file

@ -0,0 +1,44 @@
# Nomad Usage
Poseidon is an abstraction of the functionality provided by Nomad. In the following, we will look at how Poseidon uses Nomad's functionality.
Nomad is structured in different levels of abstraction. Jobs are collected in namespaces. Each Job can contain several Task Groups. Each Task Group can contain several Tasks. Finally, Allocations map Task Groups to Nomad Clients. For more insights take a look at [the official description](https://www.nomadproject.io/docs/internals/architecture).
In our case, a Task is executed in a Docker container.
![Overview Poseidon-Nomad mapping](resources/OverviewPoseidonNomadMapping.png)
## Execution environments as template Jobs
Execution Environments are mapped to Nomad Jobs. In the following, we will call these Jobs `Template Jobs`.
The naming schema for Template Jobs is "template-\<execution-environment-id\>".
The figure above shows the structure in Nomad.
Each template Job contains a "config" Task Group including a "config" Task. This Task does not perform any computations but is used to store environment-specific attributes, such as the prewarming pool size.
In addition, the template Job contains a "default-group" Task Group with a "default-task" Task. In this Task, `sleep infinity` is executed so that the Task remains active and is ready for dynamic executions in the container.
As shown in the figure, the "config" Task Group has no Allocation, while the "default-group" has an Allocation.
This is because the "config" Task Group only stores information but does not execute anything.
In the "default-group" the user's code submissions are executed. Therefore, Nomad creates an Allocation that points the Task Group to a Nomad Client for execution.
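The structure described above can be summarised as a small data model. This is a simplified sketch for orientation and not Nomad's job specification: the dictionary keys (`task_groups`, `meta`, `allocated`, and so on) and the function names are hypothetical, while the group names, task names, naming schema and `sleep infinity` command are taken from the description.

```python
def template_job_id(environment_id):
    """Naming schema for template Jobs: template-<execution-environment-id>."""
    return f"template-{environment_id}"


def template_job(environment_id, prewarming_pool_size):
    """Describe the two Task Groups of a template Job."""
    return {
        "id": template_job_id(environment_id),
        "task_groups": [
            {
                # Stores environment-specific attributes, executes nothing.
                "name": "config",
                "task": "config",
                "meta": {"prewarming_pool_size": prewarming_pool_size},
                "allocated": False,
            },
            {
                # Stays alive so that code can be executed in the container later.
                "name": "default-group",
                "task": "default-task",
                "command": ["sleep", "infinity"],
                "allocated": True,
            },
        ],
    }


job = template_job(environment_id=42, prewarming_pool_size=5)
print(job["id"])                                             # template-42
print([group["name"] for group in job["task_groups"]])       # ['config', 'default-group']
print([group["allocated"] for group in job["task_groups"]])  # [False, True]
```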
## Runner as Nomad Jobs
As an abstraction of the execution engine, we use `Runner` as a description for Docker containers (currently used) or microVMs.
If a user requests a new runner, Poseidon duplicates the template Job of the corresponding environment.
When a user then executes their code, Poseidon copies the code into the container and executes it.
## Prewarming
To reduce the response time in the process of claiming a runner, Poseidon creates a pool of runners that have been started in advance.
When a user requests a runner, a runner from this pool can be used.
In the background, a new runner is created, thus replenishing the pool.
Because this happens in the background, the user does not have to wait for a runner to start.
The implementation of this concept can be seen in [the Runner Manager](/internal/runner/manager.go).
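The idea can be sketched independently of the actual Go implementation. The following Python class is a minimal illustration under stated assumptions: the class name `PrewarmingPool`, its methods and the `start_runner` callback are hypothetical, and starting a runner is reduced to returning a string.

```python
import itertools
import queue
import threading


class PrewarmingPool:
    """Keeps `size` runners started in advance for one execution environment."""

    def __init__(self, size, start_runner):
        self._start_runner = start_runner
        self._idle = queue.Queue()
        self._workers = []
        for _ in range(size):
            self._idle.put(start_runner())

    def claim(self, timeout=None):
        """Hand out an idle runner and replenish the pool in the background."""
        runner = self._idle.get(timeout=timeout)
        worker = threading.Thread(target=self._replenish)
        worker.start()
        self._workers.append(worker)
        return runner

    def _replenish(self):
        self._idle.put(self._start_runner())

    def idle_count(self):
        return self._idle.qsize()

    def wait_for_replenishment(self):
        """Block until all background starts have finished (useful in tests)."""
        for worker in self._workers:
            worker.join()
        self._workers.clear()


counter = itertools.count(1)
counter_lock = threading.Lock()


def start_runner():
    # Stand-in for starting a real runner.
    with counter_lock:
        return f"runner-{next(counter)}"


pool = PrewarmingPool(size=2, start_runner=start_runner)
first = pool.claim()
pool.wait_for_replenishment()
print(first, pool.idle_count())  # runner-1 2
```

`claim` returns as soon as an idle runner is available, while the replacement is started on a separate thread.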
### Lifecycle
The prewarming pool is initiated when a new environment is requested/created according to the requested prewarming pool size.
Every change to the environment (resource constraints, prewarming pool size, network access) leads to the destruction of the environment including all used and idle runners (the prewarming pool). After that, the environment and its prewarming pool are re-created.
The prewarming pool is also destroyed when the environment is explicitly deleted via the API route, or when the corresponding template job for a given environment is no longer available on Nomad but a force update is requested using the `GET /execution-environments/{id}?fetch=true` route. The latter case should not occur in normal operation, but could arise from manually deleting the template job, scheduling issues on Nomad or other unforeseeable edge cases.

Binary file not shown. Size: 184 KiB


@ -0,0 +1 @@
<mxfile host="app.diagrams.net" modified="2021-07-29T09:10:40.306Z" agent="5.0 (X11)" etag="xc0NA0uUuogw2a5ns6QW" version="14.9.2" type="device"><diagram id="_cTqiwv0DkfbUgT-qijx" name="Page-1">7Vxbc+I2FP41PCZjyxfgMckm23bSTtpkp7uPCha2urZFbUGgv76SLeGLDDEYEE4gmcE6ulg+3/mOdGSJgXUXLb8mcBb8TjwUDoDhLQfWlwEApg3AgP8b3iqXDF07F/gJ9kShQvCM/0NCaAjpHHsorRSkhIQUz6rCCYljNKEVGUwS8lYtNiVh9a4z6CNF8DyBoSr9G3s0yKUjMCzkvyDsB/LOpjvOcyIoC4snSQPokbeSyLofWHcJITS/ipZ3KOTKk3rJ6z1syF13LEExbVNh/P3XW/LtzwUdBuljtHiYPl75V6KVBQzn4oEHwA1Ze7dTwpplvaYroQr33zmRGVdpBtQNK8C6sCwy2ZUvvrNWXqXgiaQIeySWGayfr/XCTJbfVYpBpQMgIfPYQ/xxDJb9FmCKnmdwwnPfmPUxWUCjkKVMdrlACcUMx5sQ+zGTUTJbt8nz0HKjHs01OsysEYkQTVasyFuBP3AEqEEJ+6FEGgqb89d1C1jYhUBmB5TARpQUHXaD7Q8SQW8jHE3InRdEogJwBRIlyBx3eErIrM7EMl2GUKb2OkgwQmmm1FPwJsKexxu6neIwvCMhSbJGranD/5g8pQn5iUo5bvYRD1SS5x8hF57edA+DuD1y8ipimGkwABM0cNY9Fv5DBQXksYFFJElCA+KTGIb3hfS2ilNR5pFwYmTo/IMoXQndwTklVezQEtPvvPq1I1I/RGP8+suynFjJRMwet1SJJ3/I9niiqJalZL06iCj2bvh4y9IehhGJvZcAx3nGA+bKE43IQR4Ya+i5arYDzzRJ5skEbdG4mFdQmPiIvsdM1ZASFEKKF9V+HNwsbMUt/EZe08/IWDA0Kow1Rw0+u2mYBdaxOOsq4LygaMbMgnlag+HUDaa6DjfCdgDluk5VuWudafOHpqlo7wQOcYuTmoQwTfGk4qDMioNyO3jHfTzxAX3hqKUvdDr6QlH1ieBsTiOszxpXrc8yalaV90vUKgyLQQNXpWIzXiDdch9QHfQtsxYBvdevanl2kfegsPK1TjoYvtFPw9drwONTGvCudmfbPbC7sTKYXV9f92cEs42azoA6go1POoCpQfgLTH8yyVemxNllDsdBawCpcQ43PhpKatx9E4ZkwrwBiT8nRu7ouuquHP0oqWEQ59LnxKfGIVc7OgDomLEcYR2i3UzngLMW0205bXE7Tlu6sU+NcycknmK/P7ODNUnkwGM5emcHQEt4+yE4M2zJGVMvaYYfjjSudtJYF9LsSZq2Czx6OTNSOOOhKZyH9MrnMVN/qDM8u/HGuVBnT+q0XVoqbdLQwR11/UZyh2bLDn2ljv5Rx75QZz/qSCTfn6qNdFJHdrNEHbheA+ovcRzdxLEuxNmXOG3XBfTypv/rAmB0bvM0PfuAPgJn2oY4QGuMA9QYp++k0T5Dsy7Bzb6kaRvcdB1o9ntvXtuvId/xbnxvvr38cd6bg82BV88WLaxzGwzlJrQLr3fltSWPvry7uVXr7lbZzd4vWtSpo39IBBfq7Ekdq+08cqyVOurGlT4uWtSJo3/RQnMAVqZNiUW7EycmMaqwxlBY48E0yHrNu5BvbpFnFsGBSdV2QUPvcKQuaMgjUx5eyONSz5QfAS2OUpWyGkrvfIoLNJ2ze0qIN5/k1GbPZvA+4NjPE9k+za1H8F6ThkN5lU528Rf93EflntkBMFvzjpAP6njaBrhaF4UsNYa8X6LJPCe8cR8vcELiCInjD5+bqfrPJjkKWn0992VZxpnp1ta8x2c3N1hxZN184lHdoA1aukGtMY3s5fb511/zOEbJZSKzwT
02HXw5LYXVNxwSsn45x/p7jiMeKWLJ4odd8pXp4udxrPv/AQ==</diagram></mxfile>

Binary file not shown. Size: 1.5 MiB


@ -0,0 +1,17 @@
client {
  enabled = true
  servers = [
    "server domain 1",
    "server domain 2"
  ]
  cni_path = "/usr/lib/cni"
}
plugin "docker" {
  config {
    allow_runtimes = ["runsc"]
    gc {
      image_delay = "0s"
    }
  }
}


@ -0,0 +1,17 @@
{
  "dns": [
    "8.8.8.8",
    "8.8.4.4"
  ],
  "dns-search": [
    "codeocean.internal"
  ],
  "default-runtime": "runsc",
  "runtimes": {
    "runsc": {
      "path": "/usr/bin/runsc",
      "runtimeArgs": [
      ]
    }
  }
}


@ -0,0 +1,28 @@
# Full configuration options can be found at https://www.nomadproject.io/docs/configuration
data_dir = "/opt/nomad/data"
bind_addr = "0.0.0.0"
limits {
  http_max_conns_per_client = 0
}
# Require TLS
tls {
  http = true
  rpc = true
  ca_file = "/home/ubuntu/ca.crt"
  cert_file = "/home/ubuntu/cert.crt"
  key_file = "/home/ubuntu/cert-key.pem"
  verify_server_hostname = true
  verify_https_client = true
}
# telemetry {
#   collection_interval = "10s"
#   prometheus_metrics = true
#   publish_allocation_metrics = true
#   publish_node_metrics = true
# }


@ -0,0 +1,30 @@
// Allow-all access policy
namespace "*" {
  policy = "write"
  capabilities = ["alloc-node-exec", "read-job"]
}
agent {
  policy = "write"
}
operator {
  policy = "write"
}
quota {
  policy = "write"
}
node {
  policy = "write"
}
host_volume "*" {
  policy = "write"
}
plugin {
  policy = "read"
}


@ -0,0 +1,105 @@
{
"cniVersion": "0.4.0",
"name": "secure-bridge",
"plugins": [
{
"type": "loopback"
},
{
"type": "bridge",
"bridge": "nomad-filtered",
"ipMasq": true,
"isGateway": true,
"forceAddress": true,
"dns":{
"nameservers":[
"8.8.8.8",
"8.8.4.4",
"2001:4860:4860::8888",
"2001:4860:4860::8844"
],
"domain": "poseidon.internal",
"search": [
"poseidon.internal"
]
},
"ipam": {
"type": "host-local",
"ranges": [
[
{
"subnet": "10.151.16.0/20"
}
],
[
{
"subnet": "fd00:2::/64"
}
]
],
"routes": [
{ "dst": "0.0.0.0/5" },
{ "dst": "8.0.0.0/7" },
{ "dst": "11.0.0.0/8" },
{ "dst": "12.0.0.0/6" },
{ "dst": "16.0.0.0/4" },
{ "dst": "32.0.0.0/3" },
{ "dst": "64.0.0.0/2" },
{ "dst": "128.0.0.0/3" },
{ "dst": "160.0.0.0/5" },
{ "dst": "168.0.0.0/8" },
{ "dst": "169.0.0.0/9" },
{ "dst": "169.128.0.0/10" },
{ "dst": "169.192.0.0/11" },
{ "dst": "169.224.0.0/12" },
{ "dst": "169.240.0.0/13" },
{ "dst": "169.248.0.0/14" },
{ "dst": "169.252.0.0/15" },
{ "dst": "169.255.0.0/16" },
{ "dst": "170.0.0.0/8" },
{ "dst": "171.0.0.0/12" },
{ "dst": "171.32.0.0/11" },
{ "dst": "171.64.0.0/10" },
{ "dst": "171.128.0.0/9" },
{ "dst": "172.0.0.0/6" },
{ "dst": "176.0.0.0/4" },
{ "dst": "192.0.0.0/9" },
{ "dst": "192.128.0.0/11" },
{ "dst": "192.160.0.0/13" },
{ "dst": "192.169.0.0/16" },
{ "dst": "192.170.0.0/15" },
{ "dst": "192.172.0.0/14" },
{ "dst": "192.176.0.0/12" },
{ "dst": "192.192.0.0/10" },
{ "dst": "193.0.0.0/8" },
{ "dst": "194.0.0.0/7" },
{ "dst": "196.0.0.0/6" },
{ "dst": "200.0.0.0/5" },
{ "dst": "208.0.0.0/4" },
{ "dst": "224.0.0.0/3" },
{ "dst": "::/1" },
{ "dst": "8000::/2" },
{ "dst": "c000::/3" },
{ "dst": "e000::/4" },
{ "dst": "f000::/5" },
{ "dst": "f800::/6" },
{ "dst": "fe00::/9" },
{ "dst": "fec0::/10" },
{ "dst": "ff00::/8" }
]
}
},
{
"type": "firewall",
"backend": "iptables",
"iptablesAdminChainName": "NOMAD-ADMIN-FILTERED"
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
},
"snat": true
}
]
}


@ -0,0 +1,15 @@
server {
  enabled = true
  bootstrap_expect = 2
  server_join {
    retry_join = ["<<other servers domain>>"]
    retry_max = 3
    retry_interval = "15s"
  }
  # https://www.nomadproject.io/docs/configuration/server
  default_scheduler_config {
    scheduler_algorithm = "spread"
    memory_oversubscription_enabled = true
  }
}

70
docs/security.md Normal file

@ -0,0 +1,70 @@
# Security configurations
## TLS
⚠️ We highly encourage the use of TLS in this API to increase the security.
### Poseidon
To enable TLS, you need to create an appropriate certificate first.
You can do this in the same way [as for Nomad](https://learn.hashicorp.com/tutorials/nomad/security-enable-tls):
- `cfssl print-defaults csr | cfssl gencert -initca - | cfssljson -bare poseidon-ca`
- Copy `cfssl.json`
- `echo '{}' | cfssl gencert -ca=poseidon-ca.pem -ca-key=poseidon-ca-key.pem -config=cfssl.json -hostname="<<poseidon server hostname>>,localhost,127.0.0.1" - | cfssljson -bare poseidon-server`
Then, set `server.tls.active` or the corresponding environment variable to `true` and specify the `server.tls.certfile` and `server.tls.keyfile` options.
### Nomad
To enable TLS between Poseidon and Nomad, TLS needs to be first activated in Nomad. See [the Nomad documentation](https://learn.hashicorp.com/collections/nomad/transport-security) for a guideline on how to do that.
Afterwards, it is *required* to set the `nomad.tls.active` config option to `true`, as Nomad will no longer accept any connections over HTTP. To make sure the authenticity of the Nomad host can be validated, the `nomad.tls.cafile` option has to point to a certificate of the signing authority.
If using mutual TLS between Poseidon and Nomad is desired, the `nomad.tls.certfile` and `nomad.tls.keyfile` options can hold a client certificate. This certificate must be signed by the same CA as the certificates of the Nomad hosts. Note that mTLS can (and should) be enforced by Nomad in this case using the [verify_https_client](https://www.nomadproject.io/docs/configuration/tls#verify_https_client) configuration option.
Here are sample configurations for [all Nomad nodes](resources/nomad.example.hcl), [the Nomad servers](resources/server.example.hcl) and [the Nomad clients](resources/client.example.hcl).
## Authentication
⚠️ Don't use authentication without TLS enabled, as otherwise the token will be transmitted in clear text.
### Poseidon
⚠️ We encourage you to enable authentication for this API. If it is disabled, everyone with access to your API also has indirect access to your Nomad cluster, as this API uses it.
The API supports authentication via an HTTP header. To enable it, specify the `server.token` value in the `configuration.yaml` or the corresponding environment variable `POSEIDON_SERVER_TOKEN`.
Once configured, all requests to the API, except those to the `health` route, require the configured token in the `Poseidon-Token` header.
An example `curl` command with the configured token being `SECRET` looks as follows:
```bash
$ curl -H "Poseidon-Token: SECRET" http://localhost:7200/api/v1/some-protected-route
```
### Nomad
An alternative or additional measure to mTLS (as mentioned above) is to enable access control in the Nomad cluster to prevent unauthorised actors from performing unwanted actions in the cluster.
Instructions on setting up the cluster appropriately can be found in [the Nomad documentation](https://learn.hashicorp.com/collections/nomad/access-control).
Afterwards, it is recommended to create a specific [Access Policy](https://learn.hashicorp.com/tutorials/nomad/access-control-policies?in=nomad/access-control) for Poseidon with the minimal set of capabilities it needs for operating the cluster. A non-minimal example with complete permissions can be found [here](resources/poseidon_policy.hcl). Poseidon requires a corresponding [Access Token](https://learn.hashicorp.com/tutorials/nomad/access-control-tokens?in=nomad/access-control) to send commands to Nomad. A Token looks like this:
```text
Accessor ID = 463d3216-dc16-570f-380c-a48f5d26d955
Secret ID = ea1ac4c5-892b-0bcc-9fc5-5faeb5273a13
Name = Poseidon access token
Type = client
Global = false
Policies = [poseidon]
Create Time = 2021-07-26 12:45:11.437786378 +0000 UTC
Create Index = 246238
Modify Index = 246238
```
The `Secret ID` of the Token needs to be specified as the value of `nomad.token` in the `configuration.yaml` or of the corresponding environment variable `POSEIDON_NOMAD_TOKEN`. It may also be required for authentication in the Nomad Web UI and for using the Nomad CLI on the Nomad hosts (where the token can be specified via the `NOMAD_TOKEN` environment variable).
Once configured, all requests to the Nomad API automatically contain an `X-Nomad-Token` header containing the token.
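For manual checks against the Nomad API, the same header can be set by hand. The sketch below only builds the request and does not send it. The host name `nomad.example.internal`, the helper names and the placeholder token are assumptions for illustration; `4646` is Nomad's default HTTP port and `/v1/jobs` lists the jobs.

```python
import os
import urllib.request


def nomad_request(url, token):
    """Build a request that carries the Secret ID in the X-Nomad-Token header."""
    request = urllib.request.Request(url)
    request.add_header("X-Nomad-Token", token)
    return request


def sent_headers(request):
    """Return the request headers with lowercase names for easy lookup."""
    # urllib normalises the capitalisation of header names, so compare lowercase.
    return {name.lower(): value for name, value in request.header_items()}


token = os.environ.get("NOMAD_TOKEN", "SECRET-ID-PLACEHOLDER")
request = nomad_request("https://nomad.example.internal:4646/v1/jobs", token)
print("x-nomad-token" in sent_headers(request))  # True

# Sending the request requires a reachable cluster and, with TLS enabled,
# a suitable CA and client certificate:
# urllib.request.urlopen(request)
```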
⚠️ Make sure that no (overly permissive) `anonymous` access policy is present in the cluster after the policy for Poseidon has been added. Anyone can perform actions as specified by this special policy without authenticating!