Scheduling
`replicas`, `gpus`, `gpusPerReplica`, and `gpuVendor`
These fields collectively control the number of workload instances and how GPUs are allocated across them. Their interaction depends on the workload type (Job/Service) and whether Ray is used (`ray: true`).
Purpose:
- `replicas`: Sets the desired number of instances (pods). Default: 1. Ignored for non-Ray Jobs.
- `gpus`: Specifies the total number of GPUs requested across all replicas. Default: 0.
- `gpusPerReplica`: Specifies the number of GPUs requested per replica. Default: 0.
- `gpuVendor`: Either `amd` (default) or `nvidia`. Determines the GPU resource key (e.g., `amd.com/gpu`, `nvidia.com/gpu`).
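As a rough illustration of how `gpuVendor` selects the resource key, the sketch below maps the two documented vendors to their keys and builds a container `resources` stanza. The helper names are hypothetical and not part of Kaiwo. Rejecting unknown vendors and setting requests equal to limits are assumptions made for the example.

```python
# Illustrative sketch only: these helpers are hypothetical, not Kaiwo's API.
GPU_RESOURCE_KEYS = {
    "amd": "amd.com/gpu",
    "nvidia": "nvidia.com/gpu",
}


def gpu_resource_key(gpu_vendor: str = "amd") -> str:
    """Return the resource key for a vendor; `amd` is the documented default."""
    try:
        return GPU_RESOURCE_KEYS[gpu_vendor]
    except KeyError:
        # Assumption: vendors other than the two documented ones are rejected.
        raise ValueError(f"unsupported gpuVendor: {gpu_vendor!r}") from None


def container_gpu_resources(gpu_count: int, gpu_vendor: str = "amd") -> dict:
    """Build a container `resources` stanza; empty when no GPUs are requested."""
    if gpu_count <= 0:
        return {}
    key = gpu_resource_key(gpu_vendor)
    # Assumption: the same count is used for both the request and the limit.
    return {"requests": {key: gpu_count}, "limits": {key: gpu_count}}
```

For example, `container_gpu_resources(2, "nvidia")` produces a stanza keyed on `nvidia.com/gpu`, while a count of 0 produces no GPU entry at all.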
Behavior:
- Non-Ray Workloads (`ray: false`):
  - KaiwoJob: Only one pod is created. `replicas` is ignored. `gpus` or `gpusPerReplica` (if set > 0) determines the GPU request for the single pod's container. If both `gpus` and `gpusPerReplica` are set, `gpusPerReplica` takes precedence if > 0; otherwise `gpus` is used.
  - KaiwoService (Deployment): `replicas` directly sets `deployment.spec.replicas`. `gpus` or `gpusPerReplica` (if set > 0) determines the GPU request for each replica's container. If both `gpus` and `gpusPerReplica` are set, `gpusPerReplica` takes precedence if > 0; otherwise `gpus` is used (implying `gpusPerReplica = gpus / replicas`, though this division isn't explicitly performed; the request per pod is set based on the determined `gpusPerReplica` value).
- Ray Workloads (`ray: true`):
  - The controller performs a calculation (`CalculateNumberOfReplicas`) considering cluster node capacity (specifically, the minimum GPU capacity available on nodes matching the `gpuVendor`, referred to as `minGpusPerNode`).
  - User Precedence: If the user explicitly sets both `replicas` (> 0) and `gpusPerReplica` (> 0), these values are used directly, provided the total requested GPUs (`replicas * gpusPerReplica`) does not exceed the total available GPUs of the specified `gpuVendor` in the cluster. The `gpus` field is ignored in this case.
  - Calculation Fallback: If the user does not explicitly set both `replicas` and `gpusPerReplica`, or if the requested total exceeds cluster capacity, the controller calculates the optimal `replicas` and `gpusPerReplica` based on the `gpus` field and the cluster's `minGpusPerNode`.
    - The `totalUserRequestedGpus` is determined (using the `gpus` field, capped at total cluster capacity).
    - The final `replicas` is calculated as `ceil(totalUserRequestedGpus / minGpusPerNode)`.
    - The final `gpusPerReplica` is calculated as `totalUserRequestedGpus / replicas`.
  - The calculated or user-provided `replicas` value sets the Ray worker group replica count (`minReplicas`, `maxReplicas`, `replicas`). This is because Kueue does not support Ray's autoscaling.
  - The calculated or user-provided `gpusPerReplica` value sets the GPU resource request/limit for each Ray worker pod's container.
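The non-Ray rules in the Behavior list can be sketched as two small functions. This is a minimal illustration, not the controller's code: the function names are hypothetical, and it reads the text as meaning that `gpus` is applied to each pod as-is (without dividing by `replicas`) whenever `gpusPerReplica` is not set.

```python
# Illustrative sketch only: function names are hypothetical, not Kaiwo's API.
def non_ray_pod_gpu_request(gpus: int = 0, gpus_per_replica: int = 0) -> int:
    """GPU count requested by each non-Ray pod's container."""
    # gpusPerReplica takes precedence whenever it is set above zero.
    if gpus_per_replica > 0:
        return gpus_per_replica
    # Otherwise gpus is used directly (assumption: no division by replicas).
    return max(gpus, 0)


def non_ray_pod_count(kind: str, replicas: int = 1) -> int:
    """Pod count: a KaiwoJob always gets one pod, a KaiwoService follows replicas."""
    if kind == "KaiwoJob":
        return 1  # replicas is ignored for non-Ray jobs
    if kind == "KaiwoService":
        return replicas  # becomes deployment.spec.replicas
    raise ValueError(f"unknown workload kind: {kind!r}")
```

Under these assumptions, a KaiwoService with `replicas: 3`, `gpus: 4`, and `gpusPerReplica: 2` would run three pods that each request 2 GPUs.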
Summary Table (Ray Workloads):
User Input (`spec.*`) | Calculation Performed? | Outcome (`replicas`, `gpusPerReplica`) | Notes |
---|---|---|---|
`replicas > 0`, `gpusPerReplica > 0` | No* | Uses user's `replicas`, user's `gpusPerReplica` | *If total fits cluster. `gpus` ignored. Highest precedence. |
`gpus > 0` (only) | Yes | Calculated based on `gpus` and `minGpusPerNode` | Aims to maximize GPUs per node up to `minGpusPerNode`. |
`replicas > 0`, `gpus > 0` | Yes | Calculated based on `gpus` and `minGpusPerNode` (user `replicas` ignored) | Falls back to calculation based on total `gpus`. |
`gpusPerReplica > 0`, `gpus > 0` | Yes | Calculated based on `gpus` and `minGpusPerNode` (user `gpusPerReplica` ignored) | Falls back to calculation based on total `gpus`. |
All three set | No* | Uses user's `replicas`, user's `gpusPerReplica` | *If total fits cluster (like row 1). Otherwise, calculates based on `gpus`. |
None set (or only `gpuVendor`) | No | `replicas=1`, `gpusPerReplica=0` | No GPUs requested. |
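The rows of the table can be reproduced with a short sketch of the Ray calculation. This is not the actual `CalculateNumberOfReplicas` implementation, and the function and parameter names are hypothetical. It assumes that an unset `replicas` is passed as `None`, that the final division is integer (floor) division, and that any input requesting no GPUs yields the documented `replicas=1`, `gpusPerReplica=0` outcome.

```python
import math
from typing import Optional, Tuple


def calculate_ray_replicas(
    replicas: Optional[int] = None,
    gpus: int = 0,
    gpus_per_replica: int = 0,
    *,
    min_gpus_per_node: int,
    total_cluster_gpus: int,
) -> Tuple[int, int]:
    """Return (replicas, gpusPerReplica) for a Ray workload. Illustrative only."""
    if min_gpus_per_node <= 0:
        raise ValueError("min_gpus_per_node must be positive")

    # User precedence: both values set explicitly and the total fits the cluster.
    if replicas is not None and replicas > 0 and gpus_per_replica > 0:
        if replicas * gpus_per_replica <= total_cluster_gpus:
            return replicas, gpus_per_replica  # gpus is ignored here

    # Calculation fallback: start from gpus, capped at total cluster capacity.
    total_requested = min(gpus, total_cluster_gpus)
    if total_requested <= 0:
        # Assumption: every zero-GPU input is treated like the "None set" row.
        return 1, 0

    final_replicas = math.ceil(total_requested / min_gpus_per_node)
    # Assumption: floor division, so a remainder is dropped rather than rounded up.
    return final_replicas, total_requested // final_replicas
```

For instance, with `min_gpus_per_node=8` and `total_cluster_gpus=32`, a request of `gpus=10` gives `(2, 5)`, and a request of `gpus=100` is capped to 32 and gives `(4, 8)`.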