Deployers

This is only relevant for admins who are starting up the Workflow services manually.

Explanation

The Deployments defined for a Pipeline Config need to run on a site, and Workflow needs to know the orchestration engine that is being used on that site, along with other detailed information. This is where site configs, which are translated into Deployers, come into play. In Workflow's docker-compose.yml, there is a service, pipelines_managers, that spawns a container which manages these Deployers.

Required Environment Variables

There is an environment variable, SANIC_WORKSPACE_FILEPATH, which should be set to the filepath of your Workflow Workspace (an eligible YAML file), which should be mounted into the container using volumes (e.g. ./workflow/workspaces/development.yml:/etc/workspace.yml). The Workflow Workspace contains information such as which sites should be allowed, where plots and products should be moved to, and, most relevantly here, the deployers section described on this page. Additionally, there is SANIC_DEPLOYER_REDEPLOY_WAIT_TIME, which sets how long to wait after removing a service before re-deploying it (and how long to wait after deploying the service). Lastly, there is SANIC_DEPLOYER_MAX_REPLICAS, which is the maximum number of replicated containers allowed for a single deployment (service).
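
A minimal sketch of how these variables might be wired into the pipelines_managers service of docker-compose.yml (the values, and the unit of the wait time, are illustrative assumptions, not shipped defaults):

YAML
pipelines_managers:
    environment:
        # Path to the mounted Workflow Workspace inside the container
        SANIC_WORKSPACE_FILEPATH: "/etc/workspace.yml"
        # Wait time around service removal/re-deployment (assumed value)
        SANIC_DEPLOYER_REDEPLOY_WAIT_TIME: 30
        # Maximum replicas per deployment (assumed value)
        SANIC_DEPLOYER_MAX_REPLICAS: 10
    volumes:
        # Mount your Workflow Workspace YAML into the container
        - ./workflow/workspaces/development.yml:/etc/workspace.yml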

YAML Example

Here is an example Workflow Workspace YAML file, including just the deployers section:

YAML
deployers:
    local:
        docker:
            client_url: "tcp://<address>:4444"
            volumes:
                test-nfs-volume:
                    # Will only apply these attributes
                    # if the volume doesn't exist already
                    type: volume
                    target: "/test_data/"
                    driver_config:
                        driver: "local"
                        driver_opts:
                            type: "nfs"
                            o: "nfsvers=4.0,noatime,nodiratime,soft,addr=nfs,rw"
                            device: ":/"
                test-shared-memory:
                    type: "tmpfs"
                    target: "/dev/shm"
                    tmpfs_size: 10000000
                test-bind-mount:
                    type: bind
                    target: "/user_data/"
                    source: "/home/user/folder"
            networks:
                test-network:
                    # Will only apply these attributes
                    # if the network doesn't exist already
                    attachable: False # "attachable" is only relevant for "driver" of "overlay"
                    driver: "overlay"
            # Will appear as {"type": "workflow"} on the Docker Service
            labels:
                type: "workflow"
            # node.role / node.label should be set by the site admin
            # using the Docker CLI beforehand
            constraints: ["node.role == manager"]
            log_driver:
                # This can be changed to "loki" with custom endpoint options
                name: "json-file"
                options: 
                    max-size: "10m"
                    max-file: "3"

Important Docker Specific Notes

Docker Client URL & Docker Networks

Using Docker-in-Docker Engine

If you are running the docker-compose.yaml (workflow-pipelines repository) on your machine, you can specify client_url as tcp://local:4444, because in the docker-compose.yaml we specify a docker-in-docker service called local (Docker creates hostnames from container names, hence the URL). We create this service because Deployers need Docker Swarm, and we don't want to assume that the user's local Docker Engine is running a Docker Swarm for testing purposes. Thus, we spawn this docker:dind container, which will start up its own swarm. As such, in docker-compose.yaml, it is vital that a networks: pipelines: driver: bridge is specified (driver: overlay is only supported in swarm mode). Importantly, this network should then be attached to the local dind and pipelines_managers API services, using networks: - pipelines. This attached network is needed because every spawned Docker Service in the local Docker Swarm will need to access the buckets and results containers to fetch and update the Work it completes, so it needs to be able to reach http://buckets and http://results. Likewise, pipelines_managers needs to be able to see the local container. This is how bridge Docker Networks work; they allow container-to-container visibility and communication. Now, any container inside local will be able to see the local Buckets and Results containers running, and pipelines_managers will be able to see the local container so it can connect to its exposed Docker Engine on port 4444.
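
A minimal sketch of the relevant pieces of this docker-compose.yaml under the setup just described (service names follow the text above; treat the exact values as illustrative):

YAML
services:
    local:
        # Docker-in-Docker engine that starts its own swarm
        image: docker:dind
        privileged: true
        networks:
            - pipelines
    pipelines_managers:
        # Reaches the dind engine at tcp://local:4444
        networks:
            - pipelines
networks:
    pipelines:
        driver: bridge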

Using Your Local Docker Engine

If you are running docker-compose-integration.yaml (typically in GitHub Actions, or on your machine if you really want to use your own local Docker Engine and not the docker:dind's Docker Engine), then you'll want to set client_url to unix:///var/run/docker.sock (this assumes your local Docker Engine is already running Docker Swarm via the docker swarm init command). In the pipelines_managers service, we mount your local Docker Engine through volumes: - /var/run/docker.sock:/var/run/docker.sock, to allow the pipelines_managers container to access your local Docker Engine. As such, client_url is unix:///var/run/docker.sock (you must specify the protocol; in the last example it was tcp, in this case it's unix). However, just as before, we need every spawned Docker Service to be able to see http://buckets and http://results. But our whole local computer isn't wrapped in a container to which we can attach this pipelines network, as it was in the previous example. So, we need to add networks: pipelines: to our deployers config, so that every Deployment will attach to this network (the pipelines network specification is empty because the network is already defined in docker-compose-integration.yaml and created when we run that, so we don't need additional parameters). Additionally, in docker-compose-integration.yaml, since we are running Docker Swarm, we need to change networks: pipelines: driver: bridge to networks: pipelines: name: pipelines, driver: overlay, attachable: true (see the sketch after this list):

  • We give it a strict name, pipelines (Docker Compose prefixes the network key with the project name by default).
  • We give it an overlay driver since the Deployment can spawn on any node in your Docker Swarm, so overlay drivers (Docker Networks across nodes) are needed (plus it is mandatory to use overlay drivers in Docker Swarm anyway).
  • We allow it to be attachable, so that containers not specifically defined in this docker-compose-integration.yaml, such as our Deployments, can still attach to it (default behaviour is to not allow this).
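
Putting those two changes together, a minimal sketch (the deployer-side addition in the Workflow Workspace, and the redefined network in docker-compose-integration.yaml; names follow the text above):

YAML
# In the Workflow Workspace (deployers section):
deployers:
    local:
        docker:
            client_url: "unix:///var/run/docker.sock"
            networks:
                # Left empty: the network already exists,
                # so no extra parameters are needed
                pipelines:

# In docker-compose-integration.yaml:
networks:
    pipelines:
        name: pipelines
        driver: overlay
        attachable: true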

Using Remote Docker Engine

If using a remote Docker Engine, it will need to be exposed on a port. One way to do this is to modify /etc/docker/daemon.json on your desired remote machine. To the JSON object, add the line: "hosts": ["tcp://0.0.0.0:<desired_port>", "unix:///var/run/docker.sock"]. This will expose that node's Docker Engine at <hostname>:<port> on the host network. You can also start the Docker Engine manually with something like dockerd -H tcp://0.0.0.0:<port> --tls=false; however, it is better to use systemd to start the Docker daemon, otherwise it's tied to your session (and that method requires all subsequent docker commands to be prepended with -H tcp://0.0.0.0:<port>). Either way, doing all this will allow a client_url of "tcp://<hostname>:4444" to work. But do not enter localhost for <hostname>; pipelines_managers will be running in its own container, so it needs a specific address to a Docker Engine that the container has access to (localhost would just resolve to inside pipelines_managers).
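
For example, a deployer entry pointing at such a remote engine might look like the sketch below (the site name and hostname are hypothetical):

YAML
deployers:
    remote_site:
        docker:
            # Must be an address the pipelines_managers container can reach
            client_url: "tcp://docker-host.example.com:4444"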

Docker Secrets

If, for example, your site is using Docker, and the images you define in your Deployments are on a private registry, such as a DockerHub account, the pipelines_managers container will need the credentials for that registry. Please add the following to the bottom of the aforementioned docker-compose.yml:

YAML
secrets:
    DOCKER_USERNAME:
        # If the host where you are running this Docker Compose file is
        # not part of a Docker Swarm, then you'll need to use a file, like below
        # file: ./DOCKER_USERNAME
        external: true
    DOCKER_PASSWORD:
        # file: ./DOCKER_PASSWORD
        external: true

Then, reference these secrets in the previously mentioned pipelines_managers service:

YAML
secrets:
    - DOCKER_USERNAME
    - DOCKER_PASSWORD

If on a node with Docker Swarm running, you can create global external secrets with:

Bash
printf "chimefrb" | docker secret create DOCKER_USERNAME -
printf "PASSWORD" | docker secret create DOCKER_PASSWORD -

YAML Specification

site


The name of the site should be the first key under the deployers section, e.g. "local", "chime", "canfar", etc.

driver


The next key, inside the above key, should be the orchestration engine name used at that site, e.g. "docker", "kubernetes", etc.

volumes


Inside of the driver key will be the rest of the keys for this YAML file, starting with volumes, which are file systems that are going to be mounted inside any containers spawned on this site with this driver. For each volume, the key should be the desired or pre-existing name of the volume, and then the values for that key should be:

type

The kind of file system the volume is; can be "volume" for the orchestration engine's built-in managed volumes, "tmpfs" for in-memory mounts, or "bind" for simple local bind mounts.

target

Where the file system will be mounted inside the container. This is relevant for all types.

source

Where the file system is located on the host. This must ONLY be specified for "bind" types.

tmpfs_size

This ONLY applies to "tmpfs" types, and represents the size in bytes that the in-memory mount will supply.

driver_config

Only applying to "volume" types, these are settings to describe the source file system:

driver

The host on which the volume will be created, or where it is already located. Typically "local" if you want the volume to be created by the local orchestration engine daemon, and not an external one.

driver_opts

Options to give details of the driver for the source file system:

type

The type of the source file system, typically "nfs".

o

Additional options to describe the type field, e.g. "addr=10.0.0.0,rw" for location and read-write permissions.

device

The filepath of the source filesystem on the host (essentially "source").

networks


Orchestration engine networks that each spawned container will be connected to, to allow container-to-container communication. First specify the name of the network as the key; then, for values, see below:

attachable

Whether or not external containers not participating in the multi-node orchestration (e.g. Docker Swarm) can still connect to this network.

driver

The type of network, "bridge" for single-node connections, "overlay" for multi-node connections.

labels


Any key/value mapping to attach to the containers spawned with the deployer, e.g. "type: 'workflow'". Allows for easy querying/filtering of Workflow services from non-Workflow services.

constraints


A list of constraints to apply to all spawned containers, such as which nodes these jobs may spawn on, e.g. ["node.role == manager"]

log_driver


Custom key/value mapping to specify where the logs of the containers will be sent to. Can specify JSON settings, Loki settings, etc.
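
For instance, a sketch of a Loki-based log_driver (the endpoint URL is a hypothetical example, and it assumes the Loki Docker logging driver plugin is installed on the nodes):

YAML
log_driver:
    name: "loki"
    options:
        # Hypothetical Loki push endpoint reachable from the nodes
        loki-url: "http://loki.example.com:3100/loki/api/v1/push"
        loki-retries: "5"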