Deployers¶
This is only relevant for admins who are starting up the Workflow services manually.
Explanation¶
The Deployments defined for a Pipeline Config need to run on a site, and Workflow needs to know the orchestration engine that is being used on that site, along with other detailed information. This is where site configs, which are translated to Deployers, come into play. In Workflow's docker-compose.yml, there is a service, pipelines_managers, that spawns a container which manages these Deployers.
Required Environment Variables¶
There is an environment variable, SANIC_WORKSPACE_FILEPATH, which should be set to the filepath of your Workflow Workspace, an eligible YAML file, which should be mounted into the container using volumes (e.g. ./workflow/workspaces/development.yml:/etc/workspace.yml). The Workflow Workspace contains information such as which sites should be allowed, where plots and products should be moved to, and, most relevantly here, the deployers section, which is detailed below. Additionally, there is SANIC_DEPLOYER_REDEPLOY_WAIT_TIME, which sets how long to wait after removing a service before re-deploying it (and how long to wait after deploying the service). Lastly, there is SANIC_DEPLOYER_MAX_REPLICAS, which is the maximum number of replicated containers allowed for a single deployment (service).
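As a point of reference, these variables would typically be set on the pipelines_managers service in docker-compose.yml. The fragment below is a minimal sketch only; the wait time and replica values are illustrative, not defaults:

services:
  pipelines_managers:
    environment:
      # Path inside the container where the Workspace YAML is mounted
      SANIC_WORKSPACE_FILEPATH: /etc/workspace.yml
      # Wait around removing/re-deploying a service (illustrative value)
      SANIC_DEPLOYER_REDEPLOY_WAIT_TIME: 30
      # Maximum replicas per deployment/service (illustrative value)
      SANIC_DEPLOYER_MAX_REPLICAS: 5
    volumes:
      - ./workflow/workspaces/development.yml:/etc/workspace.yml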
YAML Example¶
Here is an example Workflow Workspace YAML file, including just the deployers section:
deployers:
  local:
    docker:
      client_url: "tcp://<address>:4444"
      volumes:
        test-nfs-volume:
          # Will only apply these attributes
          # if the volume doesn't exist already
          type: volume
          target: "/test_data/"
          driver_config:
            driver: "local"
            driver_opts:
              type: "nfs"
              o: "nfsvers=4.0,noatime,nodiratime,soft,addr=nfs,rw"
              device: ":/"
        test-shared-memory:
          type: "tmpfs"
          target: "/dev/shm"
          tmpfs_size: 10000000
        test-bind-mount:
          type: bind
          target: "/user_data/"
          source: "/home/user/folder"
      networks:
        test-network:
          # Will only apply these attributes
          # if the network doesn't exist already
          attachable: False # "attachable" is only relevant for "driver" of "overlay"
          driver: "overlay"
      # Will appear as {"type": "workflow"} on the Docker Service
      labels:
        type: "workflow"
      # node.role / node.label should be set by the site admin
      # using the Docker CLI beforehand
      constraints: ["node.role == manager"]
      log_driver:
        # This can be changed to "loki" with custom endpoint options
        name: "json-file"
        options:
          max-size: "10m"
          max-file: "3"
Important Docker-Specific Notes¶
Docker Client URL & Docker Networks¶
Using Docker-in-Docker Engine¶
If you are running the docker-compose.yaml (workflow-pipelines repository) on your machine, you can specify client_url as tcp://local:4444, because in the docker-compose.yaml we specify a docker-in-docker service called local (and Docker creates hostnames from container names, hence the URL). We create this service because Deployers need Docker Swarm, and we don't want to assume the user's local Docker Engine for testing purposes is running a Docker Swarm. Thus, we spawn this docker:dind container, which will start up its own swarm. As such, in docker-compose.yaml, it is vital that a networks: pipelines: driver: bridge is specified (driver: overlay is only supported in swarm mode). Importantly, this network should then be attached to the local dind and pipelines_managers API services, using networks: - pipelines.

The need for this attached network is that every spawned Docker Service in the local Docker Swarm will need to access the buckets and results containers to fetch and update the Work it completes; thus, it needs to be able to reach http://buckets and http://results. Likewise, pipelines_managers needs to be able to see the local container. This is how bridge Docker Networks work; they allow container-to-container visibility and communication. Now, any container inside local will be able to see the local Buckets and Results containers running, and pipelines_managers should be able to see the local container so it can connect to its exposed Docker Engine on port 4444.
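As a rough sketch of that wiring in docker-compose.yaml (service definitions trimmed to the network-related keys; the image and privileged settings are assumptions for illustration, and refer to the repository's compose file for how the dind engine is exposed on port 4444):

networks:
  pipelines:
    driver: bridge

services:
  local:
    image: docker:dind      # docker-in-docker engine running its own swarm
    privileged: true        # dind typically requires privileged mode
    networks:
      - pipelines
  pipelines_managers:
    networks:
      - pipelines           # so it can reach the "local" engine on port 4444
  buckets:
    networks:
      - pipelines           # reachable as http://buckets from spawned services
  results:
    networks:
      - pipelines           # reachable as http://results from spawned services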
Using Your Local Docker Engine¶
If you are running docker-compose-integration.yaml (typically in GitHub Actions, or on your machine if you really want to use your own local Docker Engine and not the docker:dind's Docker Engine), then you'll want to set client_url to unix:///var/run/docker.sock (this assumes your local Docker Engine is already running Docker Swarm via the docker swarm init command). In the pipelines_managers service, we mount your local Docker Engine through volumes: - /var/run/docker.sock:/var/run/docker.sock, to allow the pipelines_managers container to access your local Docker Engine. As such, client_url is unix:///var/run/docker.sock (the protocol must be specified; in the last example it was tcp, in this case it is unix). However, just as before, we now need every spawned Docker Service to be able to see http://buckets and http://results. But our whole local computer isn't wrapped in a container to which we can attach this pipelines network, as it was in the previous example. So, we need to add networks: pipelines: to our deployers config, so that every Deployment will attach to this network (the pipelines network specification is empty because the network is already defined in docker-compose-integration.yaml and created when we run that, so we don't need additional parameters). Additionally, in docker-compose-integration.yaml, since we are running Docker Swarm, we need to change networks: pipelines: driver: bridge to networks: pipelines: name: pipelines, driver: overlay, attachable: true (see the sketch after this list):
- We give it a strict name, pipelines (by default, Docker Compose prefixes the network name with the project name).
- We give it an overlay driver since the Deployment can spawn on any node in your Docker Swarm, so overlay drivers (Docker Networks across nodes) are needed (plus it is mandatory to use overlay drivers in Docker Swarm anyway).
- We allow it to be attachable, so that containers not specifically defined in this docker-compose-integration.yaml, such as our Deployments, can still attach to it (the default behaviour is to not allow this).
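A minimal sketch of the two pieces described above; the exact surrounding layout is an assumption, but the key/value pairs are what matters. First, the swarm-scoped network in docker-compose-integration.yaml:

networks:
  pipelines:
    name: pipelines
    driver: overlay
    attachable: true

And the corresponding deployers entry in the Workflow Workspace, with an empty pipelines network specification since the network already exists:

deployers:
  local:
    docker:
      client_url: "unix:///var/run/docker.sock"
      networks:
        # Empty mapping: Deployments just attach to the pre-existing network
        pipelines: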
Using a Remote Docker Engine¶
If using a remote Docker Engine, it will need to be exposed on a port. One way to do this is to modify /etc/docker/daemon.json on your desired remote machine. To the JSON object, add the line: "hosts": ["tcp://0.0.0.0:<desired_port>", "unix:///var/run/docker.sock"]. This will make that node's Docker Engine reachable at <hostname>:<port> on the host network. You can also start the Docker Engine manually with something like dockerd -H tcp://0.0.0.0:<port> --tls=false; however, it is better to use systemd to start the Docker daemon, otherwise it's tied to your session (and that method requires all subsequent docker commands to include -H tcp://0.0.0.0:<port>). Either way, doing all this will allow a client_url of "tcp://<hostname>:4444" to work. But do not enter localhost for <hostname>; pipelines_managers will be running in its own container, so it needs a specific address to a Docker Engine that the container has access to (localhost would just resolve to inside pipelines_managers).
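The corresponding deployers entry would then point at that remote engine. A rough sketch, where the site name is hypothetical and the hostname/port placeholders are left for you to fill in:

deployers:
  remote_site:          # hypothetical site name
    docker:
      # Remote Docker Engine exposed as described above; must be reachable
      # from inside the pipelines_managers container (not localhost)
      client_url: "tcp://<hostname>:<desired_port>"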
Docker Secrets¶
If, for example, your site is using Docker, and the images you define in your Deployments are on a private registry such as a DockerHub account, the pipelines_managers container will need access to the credentials for that registry. Please add the following to the bottom of the aforementioned docker-compose.yml:
secrets:
  DOCKER_USERNAME:
    # If where you are running this Docker Compose file is not part
    # of a Docker Swarm, then you'll need to use a file, like below
    # file: ./DOCKER_USERNAME
    external: true
  DOCKER_PASSWORD:
    # file: ./DOCKER_PASSWORD
    external: true
Then, reference these secrets in the previously mentioned pipelines_managers service:
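For instance, a minimal sketch of how the secrets would be referenced (the rest of the service definition is omitted):

services:
  pipelines_managers:
    # ... existing service configuration ...
    secrets:
      - DOCKER_USERNAME
      - DOCKER_PASSWORD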
If on a node with Docker Swarm running, you can create global external secrets with:
printf "chimefrb" | docker secret create DOCKER_USERNAME -
printf "PASSWORD" | docker secret create DOCKER_PASSWORD -
YAML Specification¶
site¶
The name of the site should be the first key under the deployers section, e.g. "local", "chime", "canfar", etc.
driver¶
The next key, inside the above key, should be the orchestration engine name used at that site, e.g. "docker", "kubernetes", etc.
volumes¶
Inside of the driver key will be the rest of the keys for this YAML file, starting with volumes, which are file systems that are going to be mounted inside any containers spawned on this site with this driver. For each volume, the key should be the desired or pre-existing name of the volume, and then the values for that key should be:
type¶
The kind of file system the volume is; can be "volume" for the orchestration engine's built-in managed volumes, "tmpfs" for in-memory mounts, or "bind" for simple local bind mounts.
target¶
Where the file system will be mounted inside the container. This is relevant for all types.
source¶
Where the file system is located on the host. This must ONLY be specified for "bind" types.
tmpfs_size¶
This ONLY applies to "tmpfs" types, and represents the size in bytes that the in-memory mount will supply.
driver_config¶
Only applying to "volume" types, these are settings to describe the source file system:
driver¶
The host the volume will be created on, or is already located on. Typically "local" if you want the volume to be created by the local orchestration engine daemon rather than an external one.
driver_opts¶
Options to give details of the driver for the source file system:
type¶
The type of the source file system, typically "nfs".
o¶
Additional options to describe the type field, e.g. "addr=10.0.0.0,rw" for location and read-write permissions.
device¶
The filepath of the source filesystem on the host (essentially "source").
networks¶
Orchestration engine networks that each spawned container will be connected to, to allow container-to-container communication. First specify the name of the network as the key, then for the values, see below:
attachable¶
Whether or not external containers not participating in the multi-node orchestration (e.g. Docker Swarm) can still connect to this network.
driver¶
The type of network, "bridge" for single-node connections, "overlay" for multi-node connections.
labels¶
Any key/value mapping to attach to the containers spawned with the deployer, e.g. "type: 'workflow'". Allows for easy querying/filtering of Workflow services from non-Workflow services.
constraints¶
A list of constraints to apply to all spawned containers, such as which nodes these jobs may spawn on, e.g. ["node.role == manager"].
log_driver¶
Custom key/value mapping to specify where the logs of the containers will be sent to. Can specify JSON settings, Loki settings, etc.
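For instance, a rough sketch of a Loki-based log_driver entry; the option names follow the Grafana Loki Docker logging driver, and the endpoint and values here are placeholders, not recommendations:

log_driver:
  name: "loki"
  options:
    # URL of your Loki push endpoint (placeholder value)
    loki-url: "http://<loki-host>:3100/loki/api/v1/push"
    # Optional retry/batching settings, shown only as examples
    loki-retries: "3"
    loki-batch-size: "400"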