
Cluster

See also: Universal Workflow Cluster

Shared Data via Cluster Profile

The cluster profile is a shared profile assigned to all machines in the cluster, including the manager. The profile is self-referential: it must contain its own name in a parameter so that machine actions can find the shared profile.

For example, if we are using the profile example to create a cluster, then we need to include the Param cluster/profile: example in that profile. While this may appear redundant, it is essential for the machines to find the profile when they are operating against it.
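
For illustration, the self-referential Param could be set with a drpcli call like the following (a sketch, assuming the profile example already exists):

drpcli profiles set example param "cluster/profile" to "example"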

All cluster scripts typically start with a "does my cluster profile exist" stanza from the cluster-utilities template, which is included in any cluster-related task with the following:

{{ template "cluster-utilities.tmpl" .}}

The cluster-utilities template has the following code to initialize variables for cluster tasks.

{{ if .ParamExists "cluster/profile" }}
CLUSTER_PROFILE={{.Param "cluster/profile"}}
PROFILE_TOKEN={{.GenerateProfileToken (.Param "cluster/profile") 7200}}
echo "  Cluster Profile is $CLUSTER_PROFILE"
{{ else }}
echo "  WARNING: no cluster profile defined!  Run cluster-initialize task."
{{ end }}

The API has special behaviors that allow machines to modify the shared profile, including an extension to Golang template rendering (see Data Architecture) that provides .GenerateProfileToken. This special token must be used when updating the shared profile.

Adding Data to the Cluster Profile

As data accumulates from the manager or other members, it is common to store shared values as Params on the cluster profile. This makes the data available to all members of the cluster.

drpcli -T $PROFILE_TOKEN profiles add $CLUSTER_PROFILE param "myval" to "$OUTPUT"

Developers should pay attention to timing with Param data. Params that are injected during template rendering (e.g. {{ .Param "myval" }}) are evaluated only when the job is created, and will not change during a task run (also known as a job).

If you are looking for data that could be added or changed inside a job, then you should use drpcli to retrieve the information from the shared profile at runtime with the -T $PROFILE_TOKEN pattern.
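
For example, a task can re-read a shared value mid-job with a call like the following (a sketch; "myval" is the illustrative Param from above):

drpcli -T "$PROFILE_TOKEN" profiles get "$CLUSTER_PROFILE" param "myval"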

Resolving Potential Race Conditions via Atomic Updates with JSON PATCH

In cases where multiple machines write data into the cluster profile, there is a potential for race conditions. The following strategy addresses these cases in a scalable way; no additional tooling is required.

The CLI and UX make extensive use of JSON PATCH (https://tools.ietf.org/html/rfc6902) instead of PUT. PATCH allows atomic field-level updates by including tests in the update. This means that simultaneous updates do not create "last in" race conditions; instead, the losing update fails in a predictable way that scripts can handle.
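
Conceptually, such an update is a PATCH body whose test operation must pass before the replace is applied. The following is an illustrative RFC 6902 document for the my/data example used below (note that / in a key is escaped as ~1 in a JSON Pointer):

[
  { "op": "test", "path": "/Params/my~1data", "value": "none" },
  { "op": "replace", "path": "/Params/my~1data", "value": "foo" }
]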

The CLI facilitates the use of PATCH for atomic operations by allowing scripts to pass in a reference (or pre-modified) object. If the -r reference object does not match the current state, the update is rejected.

This allows machines to take actions that require synchronization across the cluster, such as waiting for operations to finish on other machines. The manager pattern mitigates much of this requirement.

The following example shows code that runs on all machines but only succeeds for the cluster leader. It assumes the Param my/data is set to a default of none.

{{template "setup.tmpl" .}}
cl=$(get_param "my/data")
while [[ $cl = "none" ]]; do
  # attempt an atomic update: it succeeds only if the stored value still matches our reference copy
  drpcli -r "$cl" -T "$PROFILE_TOKEN" profiles set $CLUSTER_PROFILE param "my/data" to "foo" 2>/dev/null >/dev/null && break
  # sleep is a hack, but it allows for backoff between retries
  sleep 1
  # re-read the shared value; if another machine won the race, the loop exits
  cl=$(get_param "my/data")
done

Cluster Filter to Collect Members

The cluster/filter Param plays a critical role in allowing the cluster manager to collect the members of the cluster. The filter is a drpcli filter string that is applied to a drpcli machines list or drpcli machines count call to identify the cluster membership.

This process is baked into the helper routines used for the cluster pattern, and the filter should be defined in the cluster profile if the default is not sufficient. By default, cluster/filter is set to Profiles Eq $CLUSTER_PROFILE, which selects all machines attached to the cluster profile, including the manager. Developers may choose to define clusters by other criteria, such as pool membership, machine attributes, or endpoint.

The following example shows how cluster/filter can be used in a task to collect the cluster members, including the manager. The --slim flag is used to reduce the return overhead.

CLUSTER_MEMBERS="$(drpcli machines list {{.Param "cluster/filter"}} --slim Params,Meta)"

In practice, additional filters are applied to further select machines based on cluster role or capability.
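
For instance, a role-based count could combine the shared filter with the role Params described below (a sketch):

WORKER_COUNT="$(drpcli machines count cluster/manager Eq false cluster/leader Eq false {{.Param "cluster/filter"}})"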

Starting a Workflow on Cluster Members

During multi-machine tasks, a simple loop can be used to start workflows on the targeted members.

This example shows a loop that selects all members who are cluster leaders (cluster/leader Eq true) and omits the cluster manager as a safeguard (cluster/manager Eq false). Then it applies the target workflow and sets an icon on each leader.

CLUSTER_LEADERS="$(drpcli machines list cluster/manager Eq false cluster/leader Eq true {{.Param "cluster/filter"}} --slim Params,Meta)"
UUIDS=$(jq -rc ".[].Uuid" <<< "$CLUSTER_LEADERS")
for uuid in $UUIDS; do
  echo "  starting k3s leader install workflow on $uuid"
  drpcli machines meta set $uuid key icon to anchor > /dev/null
  drpcli machines workflow $uuid k3s-machine-install > /dev/null
done

Since these operations are made against other machines, multi-machine tasks need to be called with an ExtraClaims definition that allows * actions for the scope: machines.
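
In a task definition, such a claim might look like the following stanza (a sketch; narrow the action list to the minimum your task needs):

ExtraClaims:
  - scope: "machines"
    action: "*"
    specific: "*"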

Working with Cluster Roles

The cluster pattern includes three built-in roles:

  • manager
  • leader
  • worker (assumed when a machine is neither leader nor manager)

Cluster leaders (cluster/leader Eq true) are selected randomly during the cluster-initialize task when it runs on the cluster manager. The default number of leaders is 1.
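
For example, a task template could branch on a member's role (a sketch, assuming cluster/leader has been set by cluster-initialize as described above):

{{ if .Param "cluster/leader" }}
echo "  running leader-only steps"
{{ else }}
echo "  running worker or manager steps"
{{ end }}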

Developers can define additional roles by defining and assigning Params to members during the process. The three built-in roles serve as a reference.
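
For example, a custom role could be assigned from a multi-machine task with a call like the following (a sketch; cluster/database is a hypothetical role Param, $uuid identifies a member as in the loop above, and the task needs the machines ExtraClaims shown earlier):

drpcli machines set "$uuid" param "cluster/database" to "true" > /dev/null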