Scheduling
replicas, gpus, gpusPerReplica, and gpuVendor
These fields collectively control the number of workload instances and how GPUs are allocated across them. Their interaction depends on the workload type (Job/Service) and whether Ray is used (ray: true).
Purpose:
- `replicas`: Sets the desired number of instances (pods). Default: 1. Ignored for non-Ray Jobs.
- `gpus`: Specifies the total number of GPUs requested across all replicas. Default: 0.
- `gpusPerReplica`: Specifies the number of GPUs requested per replica. Default: 0.
- `gpuVendor`: Either `amd` (default) or `nvidia`. Determines the GPU resource key (e.g., `amd.com/gpu`, `nvidia.com/gpu`).
Behavior:
- Non-Ray Workloads (`ray: false`):
  - KaiwoJob: Only one pod is created. `replicas` is ignored. `gpus` or `gpusPerReplica` (if set > 0) determines the GPU request for the single pod's container. If both `gpus` and `gpusPerReplica` are set, `gpusPerReplica` takes precedence if > 0, otherwise `gpus` is used.
  - KaiwoService (Deployment): `replicas` directly sets `deployment.spec.replicas`. `gpus` or `gpusPerReplica` (if set > 0) determines the GPU request for each replica's container. If both `gpus` and `gpusPerReplica` are set, `gpusPerReplica` takes precedence if > 0, otherwise `gpus` is used (implying `gpusPerReplica = gpus / replicas`, though this division isn't explicitly performed; the request per pod is set based on the determined `gpusPerReplica` value).
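The non-Ray precedence rule can be illustrated with a minimal Python sketch. The helper name `non_ray_gpu_request` and the empty-dict return for "no GPUs" are assumptions made for illustration, not part of the controller. Following the text literally, `gpus` is applied to each pod as-is when `gpusPerReplica` is not set.

```python
# Resource keys per vendor, as listed in the field descriptions.
GPU_RESOURCE_KEYS = {"amd": "amd.com/gpu", "nvidia": "nvidia.com/gpu"}


def non_ray_gpu_request(gpus=0, gpus_per_replica=0, gpu_vendor="amd"):
    """Return the per-pod GPU resource request for a non-Ray workload.

    Hypothetical helper: gpusPerReplica wins when > 0, otherwise gpus is
    used for the pod without dividing by replicas.
    """
    count = gpus_per_replica if gpus_per_replica > 0 else gpus
    if count <= 0:
        return {}  # assumption: no GPU entry when nothing is requested
    return {GPU_RESOURCE_KEYS[gpu_vendor]: count}


print(non_ray_gpu_request(gpus=8, gpus_per_replica=2))   # {'amd.com/gpu': 2}
print(non_ray_gpu_request(gpus=4, gpu_vendor="nvidia"))  # {'nvidia.com/gpu': 4}
```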
- Ray Workloads (`ray: true`):
  - The controller performs a calculation (`CalculateNumberOfReplicas`) considering cluster node capacity (specifically, the minimum GPU capacity available on nodes matching the `gpuVendor`, referred to as `minGpusPerNode`).
  - User Precedence: If the user explicitly sets both `replicas` (> 0) and `gpusPerReplica` (> 0), these values are used directly, provided the total requested GPUs (`replicas * gpusPerReplica`) does not exceed the total available GPUs of the specified `gpuVendor` in the cluster. The `gpus` field is ignored in this case.
  - Calculation Fallback: If the user does not explicitly set both `replicas` and `gpusPerReplica`, or if the requested total exceeds cluster capacity, the controller calculates the optimal `replicas` and `gpusPerReplica` based on the `gpus` field and the cluster's `minGpusPerNode`.
    - The `totalUserRequestedGpus` is determined (using the `gpus` field, capped at total cluster capacity).
    - The final `replicas` is calculated as `ceil(totalUserRequestedGpus / minGpusPerNode)`.
    - The final `gpusPerReplica` is calculated as `totalUserRequestedGpus / replicas`.
  - The calculated or user-provided `replicas` value sets the Ray worker group replica count (`minReplicas`, `maxReplicas`, `replicas`). This is because Kueue does not support Ray's autoscaling.
  - The calculated or user-provided `gpusPerReplica` value sets the GPU resource request/limit for each Ray worker pod's container.
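The Ray sizing decision described above can be sketched in Python as follows. This is an illustration of the documented rules, not the controller's actual `CalculateNumberOfReplicas` code: the function name, its parameters, the use of integer division for `gpusPerReplica`, and the behavior when no GPUs are requested are all assumptions.

```python
import math


def calculate_replicas(replicas, gpus_per_replica, gpus,
                       min_gpus_per_node, total_cluster_gpus):
    """Return (replicas, gpusPerReplica) for a Ray workload (hypothetical sketch)."""
    # User precedence: both values are set and the total fits in the cluster.
    if replicas > 0 and gpus_per_replica > 0:
        if replicas * gpus_per_replica <= total_cluster_gpus:
            return replicas, gpus_per_replica

    # Calculation fallback: driven by gpus, capped at total cluster capacity.
    total_requested = min(gpus, total_cluster_gpus)
    if total_requested <= 0 or min_gpus_per_node <= 0:
        # Assumption: with no GPUs requested, keep the replica count (default 1).
        return max(replicas, 1), 0
    final_replicas = math.ceil(total_requested / min_gpus_per_node)
    # Assumption: integer division; the text does not specify rounding.
    return final_replicas, total_requested // final_replicas


# 8-GPU nodes, 32 GPUs in the cluster:
print(calculate_replicas(0, 0, 16, 8, 32))  # (2, 8)  gpus only
print(calculate_replicas(0, 0, 12, 8, 32))  # (2, 6)  uneven request
print(calculate_replicas(4, 2, 0, 8, 32))   # (4, 2)  user values fit
print(calculate_replicas(8, 8, 16, 8, 32))  # (2, 8)  64 > 32, falls back to gpus
```

In the second call, 12 GPUs do not fit on one 8-GPU node, so the sketch spreads them across two replicas of 6 GPUs each rather than requesting 8 + 4.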
Summary Table (Ray Workloads):
| User Input (`spec.*`) | Calculation Performed? | Outcome (`replicas`, `gpusPerReplica`) | Notes |
|---|---|---|---|
| `replicas > 0`, `gpusPerReplica > 0` | No* | Uses user's `replicas`, user's `gpusPerReplica` | *If total fits cluster. `gpus` ignored. Highest precedence. |
| `gpus > 0` (only) | Yes | Calculated based on `gpus` and `minGpusPerNode` | Aims to maximize GPUs per node up to `minGpusPerNode`. |
| `replicas > 0`, `gpus > 0` | Yes | Calculated based on `gpus` and `minGpusPerNode` (user `replicas` ignored) | Falls back to calculation based on total `gpus`. |
| `gpusPerReplica > 0`, `gpus > 0` | Yes | Calculated based on `gpus` and `minGpusPerNode` (user `gpusPerReplica` ignored) | Falls back to calculation based on total `gpus`. |
| All three set | No* | Uses user's `replicas`, user's `gpusPerReplica` | *If total fits cluster (like row 1). Otherwise, calculates based on `gpus`. |
| None set (or only `gpuVendor`) | No | `replicas=1`, `gpusPerReplica=0` | No GPUs requested. |