nomad:csi

Overview

democratic-csi is a lightweight CSI provider using OpenZFS to store persistent data, and NFS to expose it to nomad jobs.

Nomad implementation

Installation

The CSI plugin is run as nomad jobs with:

  • two instances in controller mode as a service job
  • one instance per node, running in node mode, as a system job

The controllers are responsible for managing the volumes, and the nodes are responsible for mounting the volumes onto the nomad clients before any job that uses them is started.

The job definitions, including the configurations, live in nomad-jobs/democratic-csi

Day to day tasks

Creating a volume

Nomad can't provision new volumes itself yet, so they must be created manually. This requires csc. To install it, run:

GO111MODULE=off go get -u github.com/rexray/gocsi/csc

To create a new 100 MiB (104857600 byte) volume named traefik-acme, run:

~/go/bin/csc -e tcp://democratic-csi.service.consul.sihnon.net:9000 controller create-volume --req-bytes 104857600 traefik-acme
# "traefik-acme"  104857600       "node_attach_driver"="nfs"      "provisioner_driver"="zfs-generic-nfs"  "server"="kowlan.jellybean.sihnon.net"  "share"="/pool2/democratic/root/traefik-acme"
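The --req-bytes value is a plain byte count. A minimal sketch for deriving it from a size in MiB follows; the helper name mib_to_bytes is hypothetical and not part of csc:

```shell
# Hypothetical helper: convert a size in MiB to the byte count
# passed to csc via --req-bytes.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 100   # prints 104857600
```

It could then be substituted inline, for example --req-bytes "$(mib_to_bytes 100)".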

Registering the volume with Nomad

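Create an HCL volume definition file with contents similar to the following (the file is named vol-acme.json here, although its contents are HCL rather than JSON):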

vol-acme.json
id = "traefik-acme"
name = "traefik-acme"
type = "csi"
external_id = "traefik-acme"
plugin_id = "kowlan"
access_mode = "single-node-writer"
attachment_mode = "file-system"
mount_options {
    fs_type = "nfs"
    mount_flags = ["nolock"]
}
context {
    node_attach_driver = "nfs"
    provisioner_driver = "zfs-generic-nfs"
    server = "kowlan.jellybean.sihnon.net"
    share = "/pool2/democratic/root/traefik-acme"
}
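The context values in this file match the key/value pairs printed by csc create-volume. As a rough sketch, they could be extracted from that output rather than copied by hand. The helper name csc_to_context is hypothetical, and the sketch assumes only that each pair appears in the "key"="value" form shown in the sample output, whatever whitespace separates the pairs:

```shell
# Hypothetical helper: read csc create-volume output on stdin and print
# each "key"="value" pair as an indented key = "value" line, ready to be
# pasted into the context block.
csc_to_context() {
  grep -o '"[^"]*"="[^"]*"' |
    sed 's/^"\([^"]*\)"="\([^"]*\)"$/    \1 = "\2"/'
}

# Sample line copied from the create-volume output shown earlier.
sample='"traefik-acme" 104857600 "node_attach_driver"="nfs" "provisioner_driver"="zfs-generic-nfs" "server"="kowlan.jellybean.sihnon.net" "share"="/pool2/democratic/root/traefik-acme"'

printf '%s\n' "$sample" | csc_to_context
```

The volume name and size are skipped because they are not in "key"="value" form; only the four context pairs are printed.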

Register this volume with nomad using:

nomad volume register vol-acme.json

Making changes to a volume definition

If it's necessary to make changes to the volume definition, the volume must be deregistered and re-registered with the new options:

nomad volume deregister traefik-acme
nomad volume register vol-acme.json

The volume must not be in use when it is deregistered. If a job failed, nomad might not properly record that the allocation is no longer using the volume, in which case it can be forcibly deregistered with:

nomad volume deregister -force traefik-acme

Resizing a volume

csc is supposed to be able to resize a volume using:

~/go/bin/csc -e tcp://democratic-csi.service.consul.sihnon.net:9000 controller expand-volume --req-bytes 209715200 traefik-acme

This segfaulted for me. However, all it does is call zfs set refquota on the volume's dataset, and if this is done manually instead, the change in size is picked up correctly.
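As a sketch of the manual route, the helper below only prints the zfs command rather than running it. It assumes the dataset name is the share path from the create-volume output without its leading slash (pool2/democratic/root/traefik-acme), which holds only if the dataset uses its default mountpoint. The function name refquota_cmd is hypothetical.

```shell
# Hypothetical helper: print (not run) the zfs command that sets the
# refquota of a dataset to a size given in MiB.
refquota_cmd() {
  dataset="$1"
  mib="$2"
  echo "zfs set refquota=$(( mib * 1024 * 1024 )) $dataset"
}

# Assumed dataset name, derived from the share path of the volume.
refquota_cmd pool2/democratic/root/traefik-acme 200
# prints: zfs set refquota=209715200 pool2/democratic/root/traefik-acme
```

The printed command would then be run as root on the fileserver host.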

Notes

  • The controller creates volumes by running zfs create commands via ssh to the fileserver host
  • It uses a root ssh key for this, which is stored in vault, and made available to the controller by nomad
  • It might be possible to reduce the permissions required by creating a dedicated user account for this and delegating zfs permissions to that user (future investigation).
  • Mount option nolock is required because nomad checks whether statd is running before allowing the filesystem to be mounted. Because nomad itself is running inside a container here, it cannot see that statd really is running on the host. Still to do: check how nomad looks for statd and whether it can be exposed into the container.
  • The controller is configured to enable zfs snapshots by setting the com.sun:auto-snapshot=true dataset attribute in its config file
nomad/csi.txt · Last modified: 2021/01/01 23:26 by ben