Configuration Guide
Administrators configure Kaiwo primarily through:

- The cluster-scoped `KaiwoQueueConfig` Custom Resource Definition (CRD)
- The cluster-scoped `KaiwoConfig` CRD
- Environment variables or flags passed to the Kaiwo operator

Users configure the CLI tool separately.
KaiwoQueueConfig CRD
This is the central point for managing how Kaiwo interacts with Kueue. There can be only one KaiwoQueueConfig resource in the cluster, and its metadata.name must be kaiwo (or the value specified in the KaiwoConfig Custom Resource field spec.defaultKaiwoQueueConfigName, which defaults to kaiwo, see more below).
Default Configuration on Startup:
The Kaiwo operator includes a startup routine that checks whether a KaiwoQueueConfig named kaiwo exists. If it does not, the operator automatically creates a default one. This default configuration aims to provide a functional baseline:

- It attempts to auto-discover node pools based on common GPU labels (e.g., `amd.com/gpu.product-name`, `nvidia.com/gpu.product`, `nvidia.com/gpu.count`) and CPU/memory capacity.
- It creates corresponding Kueue `ResourceFlavor` resources based on this discovery, labeling the nodes with `kaiwo/nodepool=<generated-flavor-name>`.
- It defines a single Kueue `ClusterQueue` named `kaiwo` (or the value of `DEFAULT_CLUSTER_QUEUE_NAME`), configured to use all discovered `ResourceFlavor`s and their estimated capacities as `nominalQuota`.
- It specifies that this default `ClusterQueue` should have a corresponding `LocalQueue` automatically created in the `kaiwo` namespace.
- It does not define any `WorkloadPriorityClass` resources by default.
You can modify this automatically created configuration, or create your own `kaiwo` resource manually, with `kubectl edit kaiwoqueueconfig kaiwo` or by applying a YAML manifest.
Key Fields (spec):
- `resourceFlavors`: Defines the types of hardware resources available in the cluster, corresponding to Kueue `ResourceFlavor` resources.
    - `name`: A unique name for the flavor (e.g., `amd-mi300-8gpu`, `nvidia-a100-40gb`, `cpu-standard`).
    - `nodeLabels`: A map of labels that nodes must possess to be considered part of this flavor. This is crucial for scheduling pods onto the correct hardware. Example: `{"kaiwo/nodepool": "amd-mi300-nodes"}`.
    - `topologyName`: (Optional) The name of a Kueue `Topology` (defined in `spec.topologies`) that this flavor references. When set, the flavor enables Topology Aware Scheduling (TAS) for workloads that opt in. See the TAS section below for details.
    - `taints`: (Optional) A list of Kubernetes taints associated with this flavor. Pods scheduled to this flavor will need corresponding tolerations. Kaiwo automatically adds tolerations for GPU taints if `ADD_TAINTS_TO_GPU_NODES` is enabled.
Auto-Discovery vs. Explicit Definition
If `spec.resourceFlavors` is empty or omitted in the `kaiwo` `KaiwoQueueConfig`, the operator's startup logic attempts to auto-discover node pools and create corresponding flavors as described above. Auto-discovered flavors automatically reference the default topology (configured via `KaiwoConfig.spec.defaultTopologyName`, defaulting to `default-topology`) to enable TAS capability. While convenient for initial setup, explicitly defining `resourceFlavors` in the `KaiwoQueueConfig` provides more precise control and is generally recommended for production environments. Explicitly defined flavors override any auto-discovered ones during reconciliation.

ResourceFlavor Immutability

Kueue makes `ResourceFlavor` specs immutable once `topologyName` is set. If you need to change the spec of such a flavor (e.g., changing or removing `topologyName`), the Kaiwo controller automatically handles this by deleting and recreating the flavor. If the old flavor is still in use by a `ClusterQueue`, Kueue's `resource-in-use` finalizer may delay the deletion; the controller will converge on subsequent reconciliation cycles.
- `topologies`: Defines Kueue `Topology` resources that describe the physical or logical topology of the cluster (e.g., rack, block, host hierarchy). Topologies are referenced by `resourceFlavors` via the `topologyName` field. Each topology specifies a list of `levels`, where each level is a node label key (e.g., `kaiwo/topology-block`, `kaiwo/topology-rack`, `kubernetes.io/hostname`).
- `clusterQueues`: Defines the Kueue `ClusterQueue` resources managed by Kaiwo.
    - `name`: The name of the `ClusterQueue` (e.g., `team-a-queue`, `default-gpu-queue`).
    - `spec`: The full Kueue `ClusterQueueSpec`. This is where you define resource quotas, cohorts, preemption policies, etc. See the Kueue ClusterQueue documentation.
        - `resourceGroups`: Define sets of flavors and their associated quotas (`nominalQuota`). This links the queue to the available hardware defined in `resourceFlavors`.
        - `namespaceSelector`: Controls which namespaces can use this queue via `LocalQueue` resources, if those LocalQueues exist. Note that Kaiwo's automatic `LocalQueue` creation relies on the `namespaces` field below, not this selector.
    - `namespaces`: A list of namespace names where Kaiwo should automatically create and manage a Kueue `LocalQueue` pointing to this `ClusterQueue`. The `LocalQueue` created will have the same name as the `ClusterQueue`.
- `workloadPriorityClasses`: Defines Kueue `WorkloadPriorityClass` resources.
    - Each entry follows the standard Kueue `WorkloadPriorityClass` structure (`name`, `value`, `description`). Kaiwo ensures these exist as defined. See the Kueue priority documentation.
Example KaiwoQueueConfig:
```yaml
apiVersion: kaiwo.silogen.ai/v1alpha1
kind: KaiwoQueueConfig
metadata:
  name: kaiwo # Must be named 'kaiwo' (or DEFAULT_KAIWO_QUEUE_CONFIG_NAME)
spec:
  topologies:
    - name: gpu-topology
      levels:
        - kaiwo/topology-block
        - kaiwo/topology-rack
        - kubernetes.io/hostname
  resourceFlavors:
    - name: amd-mi300-8gpu
      nodeLabels:
        kaiwo/nodepool: amd-mi300-nodes
      topologyName: gpu-topology # Enables TAS for workloads using this flavor
    - name: cpu-high-mem
      nodeLabels:
        kaiwo/nodepool: cpu-high-mem-nodes
      # No topologyName, so TAS is not available for this flavor
  clusterQueues:
    - name: ai-research-queue
      namespaces:
        - ai-research-ns-1
        - ai-research-ns-2
      spec:
        queueingStrategy: BestEffortFIFO
        resourceGroups:
          - coveredResources: ["cpu", "memory", "amd.com/gpu"]
            flavors:
              - name: amd-mi300-8gpu
                resources:
                  - name: "cpu"
                    nominalQuota: "192"
                  - name: "memory"
                    nominalQuota: "1024Gi"
                  - name: "amd.com/gpu"
                    nominalQuota: "8"
          - coveredResources: ["cpu", "memory"]
            flavors:
              - name: cpu-high-mem
                resources:
                  - name: "cpu"
                    nominalQuota: "256"
                  - name: "memory"
                    nominalQuota: "2048Gi"
  workloadPriorityClasses:
    - name: high-priority
      value: 1000
    - name: low-priority
      value: 100
```
Controller Operation and Kueue Resource Synchronization
The KaiwoQueueConfigController acts as a translator, continuously ensuring that the Kueue resources in your cluster accurately reflect the configuration defined in the single kaiwo KaiwoQueueConfig resource. It monitors this resource and automatically manages the lifecycle of the associated Kueue objects:
- `spec.topologies` -> Kueue `Topology`:
    - Each entry defines a Kueue `Topology` resource describing the cluster's physical or logical topology hierarchy.
    - Topologies are synced before ResourceFlavors to ensure flavors can reference them immediately.
- `spec.resourceFlavors` -> Kueue `ResourceFlavor`:
    - Each entry in this list directly defines a Kueue `ResourceFlavor`.
    - The controller ensures a corresponding `ResourceFlavor` exists for each entry, creating or updating it as necessary based on the specified `name`, `nodeLabels`, `topologyName`, and `taints`.
    - If an entry is removed from this list, the controller deletes the corresponding `ResourceFlavor`.
    - For flavors with `topologyName` set, Kueue makes the spec immutable. If a spec change is needed, the controller handles this transparently via delete-and-recreate.
- `spec.clusterQueues` -> Kueue `ClusterQueue` and `LocalQueue`:
    - Each entry in this list defines a Kueue `ClusterQueue`. The controller translates the structure into a standard `ClusterQueueSpec` and ensures the resource exists and matches the definition. Removing an entry deletes the corresponding `ClusterQueue`.
    - The `namespaces` field within each `clusterQueues` entry dictates where Kueue `LocalQueue`s should exist. The controller automatically creates a `LocalQueue` (named after the `ClusterQueue`) in each listed namespace, pointing to the corresponding `ClusterQueue`. If a namespace is removed from the list, or the parent `ClusterQueue` entry is removed, the controller deletes the associated `LocalQueue` in that namespace.
- `spec.workloadPriorityClasses` -> Kueue `WorkloadPriorityClass`:
    - Each entry defines a Kueue `WorkloadPriorityClass`.
    - The controller manages these resources, ensuring they exist with the specified `name`, `value`, and `description`.
    - Removing an entry results in the deletion of the corresponding `WorkloadPriorityClass`.
Owner References
The controller establishes the kaiwo KaiwoQueueConfig as the owner of all the Kueue resources it creates. This linkage ensures that if the KaiwoQueueConfig is deleted, Kubernetes automatically cleans up all the managed Kueue resources (ResourceFlavor, ClusterQueue, LocalQueue, WorkloadPriorityClass).
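For illustration, a managed `ResourceFlavor` might carry an owner reference like the following sketch. The exact metadata is an assumption (the API versions shown are those used elsewhere in this guide and in recent Kueue releases, and the `uid` is filled in by Kubernetes):

```yaml
# Illustrative sketch only: a Kueue ResourceFlavor as Kaiwo might create it,
# owned by the kaiwo KaiwoQueueConfig so it is garbage-collected with it.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: amd-mi300-8gpu
  ownerReferences:
    - apiVersion: kaiwo.silogen.ai/v1alpha1
      kind: KaiwoQueueConfig
      name: kaiwo
      uid: <uid-of-the-kaiwoqueueconfig>  # set by Kubernetes
      controller: true
spec:
  nodeLabels:
    kaiwo/nodepool: amd-mi300-nodes
```

Because of this owner reference, deleting the `KaiwoQueueConfig` triggers Kubernetes garbage collection of the flavor without any extra cleanup logic.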
Kueue resource management
Kaiwo takes ownership of Kueue `ResourceFlavor`, `ClusterQueue`, `LocalQueue`, and `WorkloadPriorityClass` resources. Resources of these types that are created manually, i.e. not via the `KaiwoQueueConfig`, may be deleted by the Kaiwo controller.
The controller updates the status.status field of the KaiwoQueueConfig resource (Pending, Ready, or Failed) to indicate the current state of synchronization between the desired configuration and the actual Kueue resources in the cluster. This continuous reconciliation keeps the Kueue setup aligned with the central KaiwoQueueConfig.
Topology Aware Scheduling (TAS)
Kaiwo integrates with Kueue's Topology Aware Scheduling to place workload pods close together in the cluster topology (e.g., same rack, same network block), which can improve performance for distributed training workloads.
How it works:
TAS in Kaiwo is a two-layer opt-in system:
- Infrastructure layer (admin): `ResourceFlavor`s must reference a `Topology` via the `topologyName` field to enable TAS capability. Without this, workloads cannot use TAS even if they request it. Auto-discovered flavors (when `dynamicallyUpdateDefaultClusterQueue` is enabled) automatically reference the default topology.
- Workload layer (user): Individual workloads opt in to TAS by setting `preferredTopologyLabel` or `requiredTopologyLabel` in their spec. If neither is set, the workload is scheduled normally without topology constraints, even if the underlying flavor supports TAS.
Configuration:
- Define a `Topology` in `spec.topologies` with the appropriate hierarchy of node labels.
- Reference that topology in your `ResourceFlavor` via `topologyName`.
- Ensure the nodes in the cluster are labeled with the topology labels (e.g., `kaiwo/topology-rack`, `kaiwo/topology-block`, `kubernetes.io/hostname`).
Users then activate TAS on individual workloads by setting preferredTopologyLabel or requiredTopologyLabel. See the Scheduling guide for workload-level configuration.
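As a sketch, a workload-level opt-in might look like the fragment below. The `KaiwoJob` kind and the exact placement of the field are assumptions for illustration; only the `preferredTopologyLabel`/`requiredTopologyLabel` field names come from this guide, so consult the Scheduling guide for the authoritative spec:

```yaml
apiVersion: kaiwo.silogen.ai/v1alpha1
kind: KaiwoJob  # hypothetical workload kind, for illustration only
metadata:
  name: distributed-train
  namespace: ai-research-ns-1
spec:
  # Prefer packing pods within the same rack, but allow spillover
  preferredTopologyLabel: kaiwo/topology-rack
  # Or hard-require co-location at a given level instead:
  # requiredTopologyLabel: kaiwo/topology-block
```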
Default Topology
The default topology name is configured in KaiwoConfig via spec.defaultTopologyName (defaults to default-topology). Auto-generated ResourceFlavors always reference this topology. The operator also creates this default topology with levels kaiwo/topology-block, kaiwo/topology-rack, and kubernetes.io/hostname.
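Expressed as a `KaiwoQueueConfig` fragment, the operator-created default topology would correspond to the sketch below (the name and levels are taken from the text above):

```yaml
# Equivalent spec.topologies entry for the operator-created default topology
topologies:
  - name: default-topology  # KaiwoConfig.spec.defaultTopologyName
    levels:
      - kaiwo/topology-block
      - kaiwo/topology-rack
      - kubernetes.io/hostname
```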
KaiwoConfig CRD
The Kaiwo Operator's runtime configuration is managed through the KaiwoConfig Custom Resource Definition (CRD). This approach allows Kubernetes administrators to dynamically adjust operator behavior without requiring a restart. The operator always retrieves the most recent configuration values during each reconcile loop.
Configuration Structure
The primary configuration resource is the KaiwoConfig CRD, typically maintained as a singleton within the Kubernetes cluster. Its key components are encapsulated in the KaiwoConfigSpec, which briefly includes:
- `kueue`: Configures default integration settings with Kueue, including the default cluster queue name.
- `ray`: Specifies Ray-specific parameters, including default container images and memory allocations.
- `storage`: Manages default filesystem paths for mounting data storage and HuggingFace caches.
- `nodes`: Defines node-specific settings such as GPU resource keys, GPU node taints, and node pool exclusions.
- `scheduling`: Sets scheduling-related configurations, such as the Kubernetes scheduler name.
- `resourceMonitoring`: Configures resource monitoring, including averaging intervals, utilization thresholds, and targeted namespaces.
- `defaultKaiwoQueueConfigName`: Specifies the default name for the Kaiwo queue configuration object.
Specifying the Configuration CR
The Kaiwo Operator identifies its configuration resource via the environment variable CONFIG_NAME. By default, this is set to kaiwo. Ensure that a KaiwoConfig resource with this exact name exists in your cluster. The operator automatically creates a default configuration at startup if none exists.
Note
The operator waits up to 30 seconds for the specified configuration resource to be found. If no resource is detected within this period, the operator pod will fail with an error.
Example KaiwoConfig CR
Here's a minimal example of a valid KaiwoConfig definition:
```yaml
apiVersion: config.kaiwo.silogen.ai/v1alpha1
kind: KaiwoConfig
metadata:
  name: kaiwo
spec:
  scheduling:
    kubeSchedulerName: "kaiwo-scheduler"
  resourceMonitoring:
    averagingTime: "20m"
    lowUtilizationThreshold: 20
    profile: "gpu"
```
For detailed descriptions of individual configuration fields, please see the full API reference.
Operator Environmental Variables
Some configuration cannot be changed at runtime, or is commonly referenced from other ConfigMaps or Secrets, and is therefore supplied through environment variables. These settings are not dynamic; the operator must be restarted for changes to these values to take effect.
Kueue
- `DEFAULT_KAIWO_QUEUE_CONFIG_NAME`: The name of the singleton `KaiwoQueueConfig` custom resource to be used (defaults to `kaiwo`)
- `DEFAULT_CLUSTER_QUEUE_NAME`: The name of the default Kueue cluster queue (defaults to `kaiwo`)
Resource Monitoring
To enable and configure resource monitoring within the Kaiwo Operator, the following environment variables must be set on the operator deployment:
- `RESOURCE_MONITORING_ENABLED=true` – Enables the resource monitoring component.
- `RESOURCE_MONITORING_PROMETHEUS_ENDPOINT=<prometheus-endpoint>` – Specifies the Prometheus endpoint to query metrics from.
- `RESOURCE_MONITORING_POLLING_INTERVAL=10m` – Sets the interval between metric polling queries.
Other configuration options
- `WEBHOOK_CERT_DIRECTORY`: Path to manually provided webhook certificates (overrides automatic management if set). See Installation.
Forthcoming feature
`ENFORCE_KAIWO_ON_GPU_WORKLOADS` (Default: `false`): If `true`, the mutating admission webhook for `batchv1.Job` will automatically add the `kaiwo.silogen.ai/managed: "true"` label to any job requesting GPU resources, forcing it to be managed by Kaiwo/Kueue.
All environmental variables are typically set in the operator's Deployment manifest.
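As a sketch, the variables above might appear in the operator Deployment's container spec as follows (the Prometheus endpoint value is a placeholder, not a Kaiwo default):

```yaml
# Hypothetical excerpt from the Kaiwo operator Deployment manifest
env:
  - name: DEFAULT_KAIWO_QUEUE_CONFIG_NAME
    value: kaiwo
  - name: DEFAULT_CLUSTER_QUEUE_NAME
    value: kaiwo
  - name: RESOURCE_MONITORING_ENABLED
    value: "true"
  - name: RESOURCE_MONITORING_PROMETHEUS_ENDPOINT
    value: http://prometheus.monitoring.svc:9090  # placeholder endpoint
  - name: RESOURCE_MONITORING_POLLING_INTERVAL
    value: 10m
```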
Command-Line Flags:
Refer to the output of `kaiwo-operator --help` (or check `cmd/operator/main.go`) for flags controlling metrics, health probes, leader election, and certificate paths.