Configuration Guide

Administrators configure Kaiwo primarily through:

  • The cluster-scoped KaiwoQueueConfig Custom Resource Definition (CRD)
  • The cluster-scoped KaiwoConfig CRD
  • Environment variables or flags passed to the Kaiwo operator

Users configure the CLI tool separately.

KaiwoQueueConfig CRD

This is the central point for managing how Kaiwo interacts with Kueue. There can be only one KaiwoQueueConfig resource in the cluster, and its metadata.name must be kaiwo (or the value of the KaiwoConfig field spec.defaultKaiwoQueueConfigName, which defaults to kaiwo; see below).

Default Configuration on Startup:

The Kaiwo operator includes a startup routine that checks if a KaiwoQueueConfig named kaiwo exists. If it does not, the operator automatically creates a default one. This default configuration aims to provide a functional baseline:

  • It attempts to auto-discover node pools based on common GPU labels (e.g., amd.com/gpu.product-name, nvidia.com/gpu.product, nvidia.com/gpu.count) and CPU/Memory capacity.
  • It creates corresponding Kueue ResourceFlavor resources based on this discovery, labeling the nodes with kaiwo/nodepool=<generated-flavor-name>.
  • It defines a single Kueue ClusterQueue named kaiwo (or the value of DEFAULT_CLUSTER_QUEUE_NAME), configured to use all discovered ResourceFlavors and their estimated capacities as nominalQuota.
  • It specifies that this default ClusterQueue should have a corresponding LocalQueue automatically created in the kaiwo namespace.
  • It does not define any WorkloadPriorityClass resources by default.

You can modify this automatically created configuration, or create your own kaiwo resource manually, with kubectl edit kaiwoqueueconfig kaiwo or by applying a YAML manifest.

Key Fields (spec):

  • resourceFlavors: Defines the types of hardware resources available in the cluster, corresponding to Kueue ResourceFlavor resources.

    • name: A unique name for the flavor (e.g., amd-mi300-8gpu, nvidia-a100-40gb, cpu-standard).
    • nodeLabels: A map of labels that nodes must possess to be considered part of this flavor. This is crucial for scheduling pods onto the correct hardware. Example: {"kaiwo/nodepool": "amd-mi300-nodes"}.
    • topologyName: (Optional) The name of a Kueue Topology (defined in spec.topologies) that this flavor references. When set, the flavor enables Topology Aware Scheduling (TAS) for workloads that opt in. See the TAS section below for details.
    • taints: (Optional) A list of Kubernetes taints associated with this flavor. Pods scheduled to this flavor will need corresponding tolerations. Kaiwo automatically adds tolerations for GPU taints if ADD_TAINTS_TO_GPU_NODES is enabled.

    Auto-Discovery vs. Explicit Definition

    If spec.resourceFlavors is empty or omitted in the kaiwo KaiwoQueueConfig, the operator's startup logic attempts to auto-discover node pools and create corresponding flavors as described above. Auto-discovered flavors automatically reference the default topology (configured via KaiwoConfig.spec.defaultTopologyName, defaulting to default-topology) to enable TAS capability. While convenient for initial setup, explicitly defining resourceFlavors in the KaiwoQueueConfig provides more precise control and is generally recommended for production environments. Explicitly defined flavors will override any auto-discovered ones during reconciliation.

    ResourceFlavor Immutability

    Kueue makes ResourceFlavor specs immutable once topologyName is set. If you need to change the spec of such a flavor (e.g., changing or removing topologyName), the Kaiwo controller will automatically handle this by deleting and recreating the flavor. If the old flavor is still in use by a ClusterQueue, Kueue's resource-in-use finalizer may delay the deletion; the controller will converge on subsequent reconciliation cycles.
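
    For instance, a flavor carrying a GPU taint could be declared as follows (a sketch; the node label, taint key, and values are illustrative, not values Kaiwo mandates):

    ```yaml
    resourceFlavors:
      - name: amd-mi300-8gpu
        nodeLabels:
          kaiwo/nodepool: amd-mi300-nodes   # nodes must carry this label
        topologyName: gpu-topology          # opt this flavor into TAS
        taints:
          - key: amd.com/gpu                # illustrative taint key
            value: "true"
            effect: NoSchedule
    ```

    Pods scheduled onto this flavor need a matching toleration; as noted above, Kaiwo adds tolerations for GPU taints automatically when ADD_TAINTS_TO_GPU_NODES is enabled.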

  • topologies: Defines Kueue Topology resources that describe the physical or logical topology of the cluster (e.g., rack, block, host hierarchy). Topologies are referenced by resourceFlavors via the topologyName field. Each topology specifies a list of levels, where each level is a node label key (e.g., kaiwo/topology-block, kaiwo/topology-rack, kubernetes.io/hostname).

  • clusterQueues: Defines the Kueue ClusterQueue resources managed by Kaiwo.

    • name: The name of the ClusterQueue (e.g., team-a-queue, default-gpu-queue).
    • spec: The full Kueue ClusterQueueSpec. This is where you define resource quotas, cohorts, preemption policies, etc. See Kueue ClusterQueue Documentation.
      • resourceGroups: Define sets of flavors and their associated quotas (nominalQuota). This links the queue to the available hardware defined in resourceFlavors.
      • namespaceSelector: Controls which namespaces can use this queue via LocalQueue resources if those LocalQueues exist. Note that Kaiwo's automatic LocalQueue creation relies on the namespaces field below, not this selector.
    • namespaces: A list of namespace names where Kaiwo should automatically create and manage a Kueue LocalQueue pointing to this ClusterQueue. The LocalQueue created will have the same name as the ClusterQueue.
  • workloadPriorityClasses: Defines Kueue WorkloadPriorityClass resources.

    • Follows the standard Kueue WorkloadPriorityClass structure (name, value, description). Kaiwo ensures these exist as defined. See Kueue Priority Documentation.

Example KaiwoQueueConfig:

apiVersion: kaiwo.silogen.ai/v1alpha1
kind: KaiwoQueueConfig
metadata:
  name: kaiwo # Must be named 'kaiwo' (or DEFAULT_KAIWO_QUEUE_CONFIG_NAME)
spec:
  topologies:
    - name: gpu-topology
      levels:
        - kaiwo/topology-block
        - kaiwo/topology-rack
        - kubernetes.io/hostname

  resourceFlavors:
    - name: amd-mi300-8gpu
      nodeLabels:
        kaiwo/nodepool: amd-mi300-nodes
      topologyName: gpu-topology  # Enables TAS for workloads using this flavor
    - name: cpu-high-mem
      nodeLabels:
        kaiwo/nodepool: cpu-high-mem-nodes
        # No topologyName — TAS not available for this flavor

  clusterQueues:
    - name: ai-research-queue
      namespaces:
        - ai-research-ns-1
        - ai-research-ns-2
      spec:
        queueingStrategy: BestEffortFIFO
        resourceGroups:
          - coveredResources: ["cpu", "memory", "amd.com/gpu"]
            flavors:
              - name: amd-mi300-8gpu
                resources:
                  - name: "cpu"
                    nominalQuota: "192"
                  - name: "memory"
                    nominalQuota: "1024Gi"
                  - name: "amd.com/gpu"
                    nominalQuota: "8"
          - coveredResources: ["cpu", "memory"]
            flavors:
              - name: cpu-high-mem
                resources:
                  - name: "cpu"
                    nominalQuota: "256"
                  - name: "memory"
                    nominalQuota: "2048Gi"

  workloadPriorityClasses:
    - name: high-priority
      value: 1000
    - name: low-priority
      value: 100

Controller Operation and Kueue Resource Synchronization

The KaiwoQueueConfigController acts as a translator, continuously ensuring that the Kueue resources in your cluster accurately reflect the configuration defined in the single kaiwo KaiwoQueueConfig resource. It monitors this resource and automatically manages the lifecycle of the associated Kueue objects:

  • spec.topologies -> Kueue Topology:

    • Each entry defines a Kueue Topology resource describing the cluster's physical or logical topology hierarchy.
    • Topologies are synced before ResourceFlavors to ensure flavors can reference them immediately.
  • spec.resourceFlavors -> Kueue ResourceFlavor:

    • Each entry in this list directly defines a Kueue ResourceFlavor.
    • The controller ensures a corresponding ResourceFlavor exists for each entry, creating or updating it as necessary based on the specified name, nodeLabels, topologyName, and taints.
    • If an entry is removed from this list, the controller deletes the corresponding ResourceFlavor.
    • For flavors with topologyName set, Kueue makes the spec immutable. If a spec change is needed, the controller handles this transparently via delete-and-recreate.
  • spec.clusterQueues -> Kueue ClusterQueue and LocalQueue:

    • Each entry in this list defines a Kueue ClusterQueue. The controller translates the structure into a standard ClusterQueueSpec and ensures the resource exists and matches the definition. Removing an entry deletes the corresponding ClusterQueue.
    • The namespaces field within each clusterQueues entry dictates where Kueue LocalQueues should exist. The controller automatically creates a LocalQueue (named after the ClusterQueue) in each listed namespace, pointing to the corresponding ClusterQueue. If a namespace is removed from the list, or the parent ClusterQueue entry is removed, the controller deletes the associated LocalQueue in that namespace.
  • spec.workloadPriorityClasses -> Kueue WorkloadPriorityClass:

    • Each entry defines a Kueue WorkloadPriorityClass.
    • The controller manages these resources, ensuring they exist with the specified name, value, and description.
    • Removing an entry results in the deletion of the corresponding WorkloadPriorityClass.
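
As a concrete illustration of the LocalQueue synchronization: for a ClusterQueue named ai-research-queue with ai-research-ns-1 in its namespaces list, the controller would create a LocalQueue roughly like the following (a sketch; the controller may attach additional labels or metadata):

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: ai-research-queue        # same name as the ClusterQueue
  namespace: ai-research-ns-1    # one entry per listed namespace
spec:
  clusterQueue: ai-research-queue
```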

Owner References

The controller establishes the kaiwo KaiwoQueueConfig as the owner of all the Kueue resources it creates. This linkage ensures that if the KaiwoQueueConfig is deleted, Kubernetes automatically cleans up all the managed Kueue resources (ResourceFlavor, ClusterQueue, LocalQueue, WorkloadPriorityClass).
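
This linkage appears as a standard Kubernetes ownerReference on each managed resource, roughly as follows (a sketch; the uid is a placeholder for the actual KaiwoQueueConfig UID):

```yaml
metadata:
  ownerReferences:
    - apiVersion: kaiwo.silogen.ai/v1alpha1
      kind: KaiwoQueueConfig
      name: kaiwo
      uid: <uid-of-the-kaiwoqueueconfig>   # placeholder
      controller: true
```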

Kueue resource management

Kaiwo takes ownership of Kueue ResourceFlavor, ClusterQueue, LocalQueue, and WorkloadPriorityClass resources. This means that resources of these types that are created manually, i.e., not via the KaiwoQueueConfig, may be deleted by the Kaiwo controller.

The controller updates the status.status field of the KaiwoQueueConfig resource (Pending, Ready, or Failed) to indicate the current state of synchronization between the desired configuration and the actual Kueue resources in the cluster. This continuous reconciliation keeps the Kueue setup aligned with the central KaiwoQueueConfig.

Topology Aware Scheduling (TAS)

Kaiwo integrates with Kueue's Topology Aware Scheduling to place workload pods close together in the cluster topology (e.g., same rack, same network block), which can improve performance for distributed training workloads.

How it works:

TAS in Kaiwo is a two-layer opt-in system:

  1. Infrastructure layer (admin): ResourceFlavors must reference a Topology via the topologyName field to enable TAS capability. Without this, workloads cannot use TAS even if they request it. Auto-discovered flavors (when dynamicallyUpdateDefaultClusterQueue is enabled) automatically reference the default topology.
  2. Workload layer (user): Individual workloads opt in to TAS by setting preferredTopologyLabel or requiredTopologyLabel in their spec. If neither is set, the workload is scheduled normally without topology constraints, even if the underlying flavor supports TAS.

Configuration:

  1. Define a Topology in spec.topologies with the appropriate hierarchy of node labels.
  2. Reference that topology in your ResourceFlavor via topologyName.
  3. Ensure the nodes in the cluster are labeled with the topology labels (e.g., kaiwo/topology-rack, kaiwo/topology-block, kubernetes.io/hostname).

Users then activate TAS on individual workloads by setting preferredTopologyLabel or requiredTopologyLabel. See the Scheduling guide for workload-level configuration.
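
For step 3 above, the topology labels would appear on each node's metadata roughly as follows (a sketch; the node name and label values are illustrative):

```yaml
# Excerpt of a Node's metadata carrying the topology labels
apiVersion: v1
kind: Node
metadata:
  name: gpu-node-01                   # hypothetical node name
  labels:
    kaiwo/topology-block: block-a     # illustrative value
    kaiwo/topology-rack: rack-3       # illustrative value
    kubernetes.io/hostname: gpu-node-01
```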

Default Topology

The default topology name is configured in KaiwoConfig via spec.defaultTopologyName (defaults to default-topology). Auto-generated ResourceFlavors always reference this topology. The operator also creates this default topology with levels kaiwo/topology-block, kaiwo/topology-rack, and kubernetes.io/hostname.
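
Based on the defaults described above, the operator-created topology would look roughly like this (a sketch; the Kueue Topology API group/version is an assumption and may differ by Kueue release):

```yaml
apiVersion: kueue.x-k8s.io/v1alpha1   # may vary by Kueue release
kind: Topology
metadata:
  name: default-topology
spec:
  levels:
    - nodeLabel: kaiwo/topology-block
    - nodeLabel: kaiwo/topology-rack
    - nodeLabel: kubernetes.io/hostname
```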

KaiwoConfig CRD

The Kaiwo Operator's runtime configuration is managed through the KaiwoConfig Custom Resource Definition (CRD). This approach allows Kubernetes administrators to dynamically adjust operator behavior without requiring a restart. The operator always retrieves the most recent configuration values during each reconcile loop.

Configuration Structure

The primary configuration resource is the KaiwoConfig CRD, typically maintained as a singleton within the Kubernetes cluster. Its key components are encapsulated in the KaiwoConfigSpec, which includes:

  • kueue: Configures default integration settings with Kueue, including the default cluster queue name.
  • ray: Specifies Ray-specific parameters, including default container images and memory allocations.
  • storage: Manages default filesystem paths for mounting data storage and HuggingFace caches.
  • nodes: Defines node-specific settings such as GPU resource keys, GPU node taints, and node pool exclusions.
  • scheduling: Sets scheduling-related configurations, like the Kubernetes scheduler name.
  • resourceMonitoring: Configures resource monitoring, including averaging intervals, utilization thresholds, and targeted namespaces.
  • defaultKaiwoQueueConfigName: Specifies the default name for the Kaiwo queue configuration object.

Specifying the Configuration CR

The Kaiwo Operator identifies its configuration resource via the environment variable CONFIG_NAME. By default, this is set to kaiwo. Ensure that a KaiwoConfig resource with this exact name exists in your cluster. The operator automatically creates a default configuration at startup if none exists.
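
For example, the variable could be set on the operator's container like this (a sketch of a Deployment excerpt):

```yaml
# Excerpt of the operator Deployment's container spec
env:
  - name: CONFIG_NAME
    value: "kaiwo"   # name of the KaiwoConfig resource to use
```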

Note

The operator waits up to 30 seconds for the specified configuration resource to be found. If no resource is detected within this period, the operator pod will fail with an error.

Example KaiwoConfig CR

Here's a minimal example of a valid KaiwoConfig definition:

apiVersion: config.kaiwo.silogen.ai/v1alpha1
kind: KaiwoConfig
metadata:
  name: kaiwo
spec:
  scheduling:
    kubeSchedulerName: "kaiwo-scheduler"
  resourceMonitoring:
    averagingTime: "20m"
    lowUtilizationThreshold: 20
    profile: "gpu"

For detailed descriptions of individual configuration fields, please see the full API reference.

Operator Environment Variables

Some configuration cannot be changed at runtime, or is commonly referenced from other ConfigMaps or Secrets, and is therefore supplied as environment variables. These settings are not dynamic: the operator must be restarted for changes to these values to take effect.

Kueue

  • DEFAULT_KAIWO_QUEUE_CONFIG_NAME: The name of the singleton KaiwoQueueConfig custom resource to be used (defaults to kaiwo)
  • DEFAULT_CLUSTER_QUEUE_NAME: The name of the default Kueue cluster queue (defaults to kaiwo)

Resource Monitoring

To enable and configure resource monitoring within the Kaiwo Operator, the following environment variables must be set on the operator deployment:

  • RESOURCE_MONITORING_ENABLED=true – Enables the resource monitoring component.
  • RESOURCE_MONITORING_PROMETHEUS_ENDPOINT=<prometheus-endpoint> – Specifies the Prometheus endpoint to query metrics from.
  • RESOURCE_MONITORING_POLLING_INTERVAL=10m – Sets the interval between metric polling queries.
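
In the operator Deployment, these variables could be set as follows (a sketch; the Prometheus endpoint URL is illustrative):

```yaml
# Excerpt of the operator Deployment's container spec
env:
  - name: RESOURCE_MONITORING_ENABLED
    value: "true"
  - name: RESOURCE_MONITORING_PROMETHEUS_ENDPOINT
    value: "http://prometheus.monitoring.svc:9090"   # illustrative endpoint
  - name: RESOURCE_MONITORING_POLLING_INTERVAL
    value: "10m"
```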

Other configuration options

  • WEBHOOK_CERT_DIRECTORY: Path to manually provided webhook certificates (overrides automatic management if set). See Installation.

Forthcoming feature

ENFORCE_KAIWO_ON_GPU_WORKLOADS (Default: false): If true, the mutating admission webhook for batchv1.Job will automatically add the kaiwo.silogen.ai/managed: "true" label to any job requesting GPU resources, forcing it to be managed by Kaiwo/Kueue.

All environment variables are typically set in the operator's Deployment manifest.

Command-Line Flags

Refer to the output of kaiwo-operator --help (or check cmd/operator/main.go) for flags controlling metrics, health probes, leader election, and certificate paths.