API Reference
Packages
aim.silogen.ai/v1alpha1
Package v1alpha1 contains API Schema definitions for the AIM v1alpha1 API group.
Resource Types
- AIMClusterImage
- AIMClusterImageList
- AIMClusterRuntimeConfig
- AIMClusterRuntimeConfigList
- AIMClusterServiceTemplate
- AIMClusterServiceTemplateList
- AIMImage
- AIMImageList
- AIMModelCache
- AIMModelCacheList
- AIMRuntimeConfig
- AIMRuntimeConfigList
- AIMService
- AIMServiceList
- AIMServiceTemplate
- AIMServiceTemplateList
- AIMTemplateCache
- AIMTemplateCacheList
AIMClusterImage
AIMClusterImage is the Schema for cluster-scoped AIM image catalog entries.
Appears in: - AIMClusterImageList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMClusterImage | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec AIMImageSpec | | | |
| status AIMImageStatus | | | |
AIMClusterImageList
AIMClusterImageList contains a list of AIMClusterImage.
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMClusterImageList | | |
| metadata ListMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| items AIMClusterImage array | | | |
AIMClusterRuntimeConfig
AIMClusterRuntimeConfig defines cluster-scoped runtime defaults for AIM resources.
Appears in: - AIMClusterRuntimeConfigList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMClusterRuntimeConfig | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec AIMClusterRuntimeConfigSpec | | | |
| status AIMRuntimeConfigStatus | | | |
AIMClusterRuntimeConfigList
AIMClusterRuntimeConfigList contains a list of AIMClusterRuntimeConfig.
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMClusterRuntimeConfigList | | |
| metadata ListMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| items AIMClusterRuntimeConfig array | | | |
AIMClusterRuntimeConfigSpec
AIMClusterRuntimeConfigSpec defines cluster-wide defaults for AIM resources.
Appears in: - AIMClusterRuntimeConfig
| Field | Description | Default | Validation |
|---|---|---|---|
| defaultStorageClassName string | DefaultStorageClassName is the storage class used for model caches when one is not specified directly on the consumer resource. | | |
| routing AIMRuntimeRoutingConfig | Routing controls HTTP routing defaults applied to AIM resources. | | |
AIMClusterServiceTemplate
Appears in: - AIMClusterServiceTemplateList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMClusterServiceTemplate | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec AIMClusterServiceTemplateSpec | | | |
| status AIMServiceTemplateStatus | | | |
AIMClusterServiceTemplateList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMClusterServiceTemplateList | | |
| metadata ListMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| items AIMClusterServiceTemplate array | | | |
AIMClusterServiceTemplateSpec
AIMClusterServiceTemplateSpec defines the desired state of AIMClusterServiceTemplate (cluster-scoped).
A cluster-scoped template that selects a runtime profile for a given AIM model.
Appears in: - AIMClusterServiceTemplate
| Field | Description | Default | Validation |
|---|---|---|---|
| aimImageName string | AIMImageName is the AIM image name. Matches metadata.name of an AIMImage. Immutable. Example: meta/llama-3-8b:1.1+20240915 | | MinLength: 1 |
| metric AIMMetric | Metric selects the optimization goal: latency prioritizes low end-to-end latency; throughput prioritizes sustained requests/second. | | Enum: [latency throughput] |
| precision AIMPrecision | Precision selects the numeric precision used by the runtime. | | Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
| gpuSelector AimGpuSelector | AimGpuSelector contains the strategy to choose the resources to give each replica. | | |
| runtimeConfigName string | RuntimeConfigName references the AIM runtime configuration (by name) to use for this template. | default | |
| resources ResourceRequirements | Resources defines the default container resource requirements applied to services derived from this template. Service-specific values override the template defaults. | | |
AIMDiscoveryProfileMetadata
Appears in: - AIMDiscoveryProfile
| Field | Description | Default | Validation |
|---|---|---|---|
| engine string | | | |
| gpu string | | | |
| gpu_count integer | | | |
| metric AIMMetric | | | Enum: [latency throughput] |
| precision AIMPrecision | | | Enum: [bf16 fp16 fp8 int8] |
AIMImage
AIMImage is the Schema for namespace-scoped AIM image catalog entries.
Appears in: - AIMImageList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMImage | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec AIMImageSpec | | | |
| status AIMImageStatus | | | |
AIMImageDiscoverySpec
AIMImageDiscoverySpec configures metadata discovery and template generation for an image.
Appears in: - AIMImageSpec
| Field | Description | Default | Validation |
|---|---|---|---|
| enabled boolean | Enabled toggles metadata discovery for this image. Disabled by default. | | |
| autoCreateTemplates boolean | AutoCreateTemplates controls whether recommended deployments from discovery automatically create ServiceTemplates. Enabled by default when discovery runs. | | |
AIMImageList
AIMImageList contains a list of AIMImage.
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMImageList | | |
| metadata ListMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| items AIMImage array | | | |
AIMImageSpec
AIMImageSpec defines the desired state of AIMImage.
Appears in: - AIMClusterImage - AIMImage
| Field | Description | Default | Validation |
|---|---|---|---|
| image string | Image is the container image URI for this AIM model. This image is inspected by the operator to select runtime profiles used by templates. | | MinLength: 1 |
| defaultServiceTemplate string | DefaultServiceTemplate is the default template to use for this image when the user does not provide one. | | |
| discovery AIMImageDiscoverySpec | Discovery controls metadata extraction and automatic template creation for this image. | | |
| resources ResourceRequirements | Resources defines the default resource requirements for services using this image. Template- or service-level values override these defaults. Requests must include both cpu and memory; limits must include memory. | | Required: {} |
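
The rule on the resources field (cpu and memory in requests, memory in limits) can be checked before a manifest is submitted. The following is a minimal sketch, assuming the resources value is available as a plain dict shaped like a Kubernetes ResourceRequirements object; `validate_image_resources` is a hypothetical helper, not part of the operator.

```python
def validate_image_resources(resources: dict) -> list:
    """Return a list of violations of the documented AIMImageSpec.resources rule.

    Hypothetical pre-flight check; the operator's own validation may differ.
    """
    requests = resources.get("requests") or {}
    limits = resources.get("limits") or {}
    errors = []
    # Requests must name both cpu and memory.
    for key in ("cpu", "memory"):
        if key not in requests:
            errors.append(f"requests.{key} is required")
    # Limits must name memory.
    if "memory" not in limits:
        errors.append("limits.memory is required")
    return errors
```

An empty list means the resources block satisfies the documented rule.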
AIMImageStatus
AIMImageStatus defines the observed state of AIMImage.
Appears in: - AIMClusterImage - AIMImage
| Field | Description | Default | Validation |
|---|---|---|---|
| observedGeneration integer | ObservedGeneration is the most recent generation observed by the controller | | |
| status AIMImageStatusEnum | Status represents the overall status of the image based on its templates | Pending | Enum: [Pending Progressing Ready Degraded Failed] |
| conditions Condition array | Conditions represent the latest available observations of the model's state | | |
| resolvedRuntimeConfig AIMResolvedRuntimeConfig | ResolvedRuntimeConfig captures metadata about the runtime config that was resolved. | | |
| imageMetadata ImageMetadata | ImageMetadata is the metadata extracted from an AIM image | | |
AIMImageStatusEnum
Underlying type: string
AIMImageStatusEnum represents the overall status of an AIMImage.
Validation: - Enum: [Pending Progressing Ready Degraded Failed]
Appears in: - AIMImageStatus
| Field | Description |
|---|---|
| Pending | AIMImageStatusPending indicates the image has been created but template generation has not started. |
| Progressing | AIMImageStatusProgressing indicates one or more templates are still being discovered. |
| Ready | AIMImageStatusReady indicates all templates are available and ready. |
| Degraded | AIMImageStatusDegraded indicates one or more templates are degraded or failed. |
| Failed | AIMImageStatusFailed indicates all templates are degraded or failed. |
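
The enum descriptions above imply that the image status is an aggregate over its templates. The sketch below shows one way such an aggregation could work. It assumes template statuses use the AIMTemplateStatusEnum strings (Pending, Progressing, Available, Degraded, Failed), and the precedence between Degraded and Progressing is a guess; the controller's real logic is not documented here.

```python
from typing import Sequence

# Template states treated as unhealthy for aggregation purposes.
UNHEALTHY = ("Degraded", "Failed")


def aggregate_image_status(template_statuses: Sequence[str]) -> str:
    """Derive an AIMImage status from its template statuses (illustrative only)."""
    if not template_statuses:
        # No templates yet: template generation has not started.
        return "Pending"
    unhealthy = [s for s in template_statuses if s in UNHEALTHY]
    if len(unhealthy) == len(template_statuses):
        return "Failed"      # all templates degraded or failed
    if unhealthy:
        return "Degraded"    # one or more, but not all (assumed precedence)
    if all(s == "Available" for s in template_statuses):
        return "Ready"       # every template is available
    return "Progressing"     # at least one template still being discovered
```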
AIMMetric
Underlying type: string
AIMMetric enumerates the targeted service characteristic
Validation: - Enum: [latency throughput]
Appears in: - AIMClusterServiceTemplateSpec - AIMDiscoveryProfileMetadata - AIMProfileMetadata - AIMRuntimeParameters - AIMServiceOverrides - AIMServiceTemplateSpec - AIMServiceTemplateSpecCommon
| Field | Description |
|---|---|
| latency | |
| throughput | |
AIMModelCache
AIMModelCache is the Schema for the modelcaches API
Appears in: - AIMModelCacheList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMModelCache | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec AIMModelCacheSpec | | | |
| status AIMModelCacheStatus | | | |
AIMModelCacheList
AIMModelCacheList contains a list of AIMModelCache
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMModelCacheList | | |
| metadata ListMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| items AIMModelCache array | | | |
AIMModelCacheSpec
AIMModelCacheSpec defines the desired state of AIMModelCache
Appears in: - AIMModelCache
| Field | Description | Default | Validation |
|---|---|---|---|
| sourceUri string | SourceURI is the source of the model to be downloaded. This is the only identifier | | MinLength: 1 Pattern: ^(hf\|s3)://[^ \t\r\n]+$ |
| storageClassName string | StorageClassName specifies the storage class for the cache volume | | |
| size Quantity | Size specifies the size of the cache volume | | |
| env EnvVar array | Env lists the environment variables to use for authentication when downloading models. These variables are used for authentication with model registries (e.g., HuggingFace tokens). | | |
| modelDownloadImage string | ModelDownloadImage is the image used to download the model | kserve/storage-initializer:v0.16.0-rc0 | |
| imagePullSecrets LocalObjectReference array | ImagePullSecrets references secrets for pulling AIM container images. | | |
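
The sourceUri validation accepts only hf:// and s3:// URIs that contain no whitespace. As a rough illustration, the sketch below applies the documented pattern client-side and builds an AIMModelCache manifest as a plain dict. The helper names and the example values passed to them are hypothetical; only the pattern, apiVersion, kind, and field names come from this reference.

```python
import re

# Pattern from the sourceUri validation rule (the "\|" in the table is only a table escape).
SOURCE_URI_PATTERN = re.compile(r"^(hf|s3)://[^ \t\r\n]+$")


def is_valid_source_uri(uri: str) -> bool:
    """Check MinLength: 1 and the documented pattern against the whole string."""
    # fullmatch avoids Python's "$" also matching before a trailing newline.
    return len(uri) >= 1 and SOURCE_URI_PATTERN.fullmatch(uri) is not None


def model_cache_manifest(name: str, source_uri: str, size: str) -> dict:
    """Build a minimal AIMModelCache manifest (hypothetical helper)."""
    if not is_valid_source_uri(source_uri):
        raise ValueError(f"sourceUri does not match the documented pattern: {source_uri!r}")
    return {
        "apiVersion": "aim.silogen.ai/v1alpha1",
        "kind": "AIMModelCache",
        "metadata": {"name": name},
        "spec": {"sourceUri": source_uri, "size": size},
    }
```

Fields left out of the spec, such as modelDownloadImage, would fall back to the defaults listed in the table.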
AIMModelCacheStatus
AIMModelCacheStatus defines the observed state of AIMModelCache
Appears in: - AIMModelCache
| Field | Description | Default | Validation |
|---|---|---|---|
| observedGeneration integer | | | |
| conditions Condition array | Conditions represent the latest available observations of the model cache's state | | |
| status AIMModelCacheStatusEnum | Status represents the current status of the model cache | Pending | Enum: [Pending Progressing Available Failed] |
| lastUsed Time | LastUsed represents the last time a model was deployed that used this cache | | |
| persistentVolumeClaim string | PersistentVolumeClaim represents the name of the created PVC | | |
AIMModelCacheStatusEnum
Underlying type: string
Validation: - Enum: [Pending Progressing Available Failed]
Appears in: - AIMModelCacheStatus
| Field | Description |
|---|---|
| Pending | AIMModelCacheStatusPending denotes that the model cache has not been created yet |
| Progressing | AIMModelCacheStatusProgressing denotes that the model cache is currently being filled |
| Available | AIMModelCacheStatusAvailable denotes that a model cache is filled and ready to be used |
| Failed | AIMModelCacheStatusFailed denotes that the model cache has failed. A more detailed reason will be available in the conditions. |
AIMModelSource
Appears in: - AIMServiceTemplateStatus
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the name of the model | | |
| sourceUri string | SourceURI is the source where the model should be downloaded from | | |
| size Quantity | Size is the amount of storage that the source expects | | |
AIMPrecision
Underlying type: string
AIMPrecision enumerates supported numeric precisions
Validation: - Enum: [bf16 fp16 fp8 int8]
Appears in: - AIMClusterServiceTemplateSpec - AIMDiscoveryProfileMetadata - AIMProfileMetadata - AIMRuntimeParameters - AIMServiceOverrides - AIMServiceTemplateSpec - AIMServiceTemplateSpecCommon
| Field | Description |
|---|---|
| auto | |
| fp4 | |
| fp8 | |
| fp16 | |
| fp32 | |
| bf16 | |
| int4 | |
| int8 | |
AIMProfile
Appears in: - AIMServiceTemplateStatus
| Field | Description | Default | Validation |
|---|---|---|---|
| engine_args JSON | | | Schemaless: {} |
| env_vars object (keys:string, values:string) | | | |
| metadata AIMProfileMetadata | Refer to Kubernetes API documentation for fields of metadata. | | |
AIMProfileMetadata
Appears in: - AIMProfile
| Field | Description | Default | Validation |
|---|---|---|---|
| engine string | | | |
| gpu string | | | |
| gpu_count integer | | | |
| metric AIMMetric | | | Enum: [latency throughput] |
| precision AIMPrecision | | | Enum: [bf16 fp16 fp8 int8] |
AIMResolutionScope
Underlying type: string
AIMResolutionScope describes the scope of a resolved reference.
Validation: - Enum: [Namespace Cluster Unknown]
Appears in: - AIMResolvedReference - AIMResolvedRuntimeConfig - AIMServiceResolvedTemplate
| Field | Description |
|---|---|
| Namespace | AIMResolutionScopeNamespace denotes a namespace-scoped resource. |
| Cluster | AIMResolutionScopeCluster denotes a cluster-scoped resource. |
| Unknown | AIMResolutionScopeUnknown denotes that the scope could not be determined. |
AIMResolvedReference
AIMResolvedReference captures metadata about a resolved reference.
Appears in: - AIMResolvedRuntimeConfig - AIMServiceResolvedTemplate - AIMServiceStatus - AIMServiceTemplateStatus
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the resource name that satisfied the reference. | | |
| namespace string | Namespace identifies where the resource was found when namespace-scoped. Empty indicates a cluster-scoped resource. | | |
| scope AIMResolutionScope | Scope indicates whether the resolved resource was namespace or cluster scoped. | | Enum: [Namespace Cluster Unknown] |
| kind string | Kind is the fully-qualified kind of the resolved reference, when known. | | |
| uid UID | UID captures the unique identifier of the resolved reference, when known. | | |
AIMResolvedRuntimeConfig
AIMResolvedRuntimeConfig captures metadata about the runtime config that was resolved. This follows the same pattern as AIMServiceResolvedTemplate for consistency.
Appears in: - AIMImageStatus - AIMServiceStatus - AIMServiceTemplateStatus - AIMTemplateCacheStatus
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the resource name that satisfied the reference. | | |
| namespace string | Namespace identifies where the resource was found when namespace-scoped. Empty indicates a cluster-scoped resource. | | |
| scope AIMResolutionScope | Scope indicates whether the resolved resource was namespace or cluster scoped. | | Enum: [Namespace Cluster Unknown] |
| kind string | Kind is the fully-qualified kind of the resolved reference, when known. | | |
| uid UID | UID captures the unique identifier of the resolved reference, when known. | | |
AIMRuntimeConfig
AIMRuntimeConfig defines namespace-scoped runtime overrides for AIM resources.
Appears in: - AIMRuntimeConfigList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMRuntimeConfig | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec AIMRuntimeConfigSpec | | | |
| status AIMRuntimeConfigStatus | | | |
AIMRuntimeConfigCommon
AIMRuntimeConfigCommon captures configuration fields shared across cluster and namespace scopes.
Appears in: - AIMClusterRuntimeConfigSpec - AIMRuntimeConfigSpec
| Field | Description | Default | Validation |
|---|---|---|---|
| defaultStorageClassName string | DefaultStorageClassName is the storage class used for model caches when one is not specified directly on the consumer resource. | | |
| routing AIMRuntimeRoutingConfig | Routing controls HTTP routing defaults applied to AIM resources. | | |
AIMRuntimeConfigCredentials
AIMRuntimeConfigCredentials captures namespace-scoped authentication knobs.
Appears in: - AIMRuntimeConfigSpec
| Field | Description | Default | Validation |
|---|---|---|---|
| serviceAccountName string | ServiceAccountName is the service account used for discovery jobs, cache warmers, and any other workloads spawned by the operator on behalf of this runtime config. | | |
| imagePullSecrets LocalObjectReference array | ImagePullSecrets are merged with controller defaults when creating pods that need to pull model or runtime images. | | |
AIMRuntimeConfigList
AIMRuntimeConfigList contains a list of AIMRuntimeConfig.
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMRuntimeConfigList | | |
| metadata ListMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| items AIMRuntimeConfig array | | | |
AIMRuntimeConfigSpec
AIMRuntimeConfigSpec defines namespace-scoped overrides for AIM resources.
Appears in: - AIMRuntimeConfig
| Field | Description | Default | Validation |
|---|---|---|---|
| defaultStorageClassName string | DefaultStorageClassName is the storage class used for model caches when one is not specified directly on the consumer resource. | | |
| routing AIMRuntimeRoutingConfig | Routing controls HTTP routing defaults applied to AIM resources. | | |
| serviceAccountName string | ServiceAccountName is the service account used for discovery jobs, cache warmers, and any other workloads spawned by the operator on behalf of this runtime config. | | |
| imagePullSecrets LocalObjectReference array | ImagePullSecrets are merged with controller defaults when creating pods that need to pull model or runtime images. | | |
AIMRuntimeConfigStatus
AIMRuntimeConfigStatus records the resolved config reference surfaced to consumers.
Appears in: - AIMClusterRuntimeConfig - AIMRuntimeConfig
| Field | Description | Default | Validation |
|---|---|---|---|
| observedGeneration integer | ObservedGeneration is the last reconciled generation. | | |
| conditions Condition array | Conditions communicate reconciliation progress. | | |
AIMRuntimeParameters
AIMRuntimeParameters contains the runtime configuration parameters shared across templates and services. Fields use pointers to allow optional usage in different contexts (required in templates, optional in service overrides).
Appears in: - AIMClusterServiceTemplateSpec - AIMServiceOverrides - AIMServiceTemplateSpec - AIMServiceTemplateSpecCommon
| Field | Description | Default | Validation |
|---|---|---|---|
| metric AIMMetric | Metric selects the optimization goal: latency prioritizes low end-to-end latency; throughput prioritizes sustained requests/second. | | Enum: [latency throughput] |
| precision AIMPrecision | Precision selects the numeric precision used by the runtime. | | Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
| gpuSelector AimGpuSelector | AimGpuSelector contains the strategy to choose the resources to give each replica. | | |
AIMRuntimeRoutingConfig
AIMRuntimeRoutingConfig configures routing defaults applied during inference service creation.
Appears in: - AIMClusterRuntimeConfigSpec - AIMRuntimeConfigCommon - AIMRuntimeConfigSpec
| Field | Description | Default | Validation |
|---|---|---|---|
| enabled boolean | Enabled toggles HTTP routing management for consumers of this runtime config. | | |
| gatewayRef ParentReference | GatewayRef identifies the Gateway parent that should receive HTTPRoutes for consumers. | | |
| routeTemplate string | RouteTemplate renders an HTTP path prefix using the AIMService as context. Example: /{.metadata.namespace}/{.metadata.labels['team']}/{.spec.model}/ | | |
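
To make the routeTemplate example concrete, here is a small sketch of how a template of that shape could be rendered against an AIMService object held as a nested dict. It handles only dotted field access and quoted bracket keys, which is enough for the example above; it is not a JSONPath implementation and is not the operator's renderer. All function names are illustrative.

```python
import re

# One "{...}" placeholder in the template.
_PLACEHOLDER = re.compile(r"\{([^{}]+)\}")
# One path step: ".field" or "['key']".
_STEP = re.compile(r"\.([A-Za-z_][\w-]*)|\['([^']*)'\]")


def lookup(obj: dict, expr: str):
    """Walk obj following a path such as ".metadata.labels['team']"."""
    current, pos = obj, 0
    while pos < len(expr):
        step = _STEP.match(expr, pos)
        if step is None:
            raise ValueError(f"unsupported expression: {expr!r}")
        key = step.group(1) if step.group(1) is not None else step.group(2)
        current = current[key]
        pos = step.end()
    return current


def render_route_template(template: str, service: dict) -> str:
    """Replace every placeholder with the value found in the service object."""
    return _PLACEHOLDER.sub(lambda m: str(lookup(service, m.group(1))), template)
```

A missing field raises KeyError in this sketch; how the operator reports an unresolved expression is not specified in this reference.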
AIMService
AIMService manages a KServe-based AIM inference service for the selected model and template.
Appears in: - AIMServiceList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMService | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec AIMServiceSpec | | | |
| status AIMServiceStatus | | | |
AIMServiceList
AIMServiceList contains a list of AIMService.
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMServiceList | | |
| metadata ListMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| items AIMService array | | | |
AIMServiceOverrides
AIMServiceOverrides allows overriding template parameters at the service level. All fields are optional. When specified, they override the corresponding values from the referenced AIMServiceTemplate.
Appears in: - AIMServiceSpec
| Field | Description | Default | Validation |
|---|---|---|---|
| metric AIMMetric | Metric selects the optimization goal: latency prioritizes low end-to-end latency; throughput prioritizes sustained requests/second. | | Enum: [latency throughput] |
| precision AIMPrecision | Precision selects the numeric precision used by the runtime. | | Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
| gpuSelector AimGpuSelector | AimGpuSelector contains the strategy to choose the resources to give each replica. | | |
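
Because every override field is optional, the effective runtime parameters are the template values with any specified overrides layered on top. A minimal sketch of that merge follows, assuming parameters are plain dicts and that an unset override is represented as None or simply absent. The enum sets mirror the field-level validation shown in the table; `apply_overrides` is a hypothetical name.

```python
ALLOWED_METRICS = {"latency", "throughput"}
ALLOWED_PRECISIONS = {"auto", "fp4", "fp8", "fp16", "fp32", "bf16", "int4", "int8"}


def apply_overrides(template_params: dict, overrides: dict) -> dict:
    """Return template parameters with service-level overrides applied."""
    effective = dict(template_params)
    for field, value in overrides.items():
        if value is not None:  # None means "not overridden"
            effective[field] = value
    # Re-check the enum constraints on the merged result.
    metric = effective.get("metric")
    if metric is not None and metric not in ALLOWED_METRICS:
        raise ValueError(f"unsupported metric: {metric!r}")
    precision = effective.get("precision")
    if precision is not None and precision not in ALLOWED_PRECISIONS:
        raise ValueError(f"unsupported precision: {precision!r}")
    return effective
```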
AIMServiceResolvedTemplate
AIMServiceResolvedTemplate retains the historical name while reusing the shared structure.
Appears in: - AIMServiceStatus
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the resource name that satisfied the reference. | | |
| namespace string | Namespace identifies where the resource was found when namespace-scoped. Empty indicates a cluster-scoped resource. | | |
| scope AIMResolutionScope | Scope indicates whether the resolved resource was namespace or cluster scoped. | | Enum: [Namespace Cluster Unknown] |
| kind string | Kind is the fully-qualified kind of the resolved reference, when known. | | |
| uid UID | UID captures the unique identifier of the resolved reference, when known. | | |
AIMServiceRouting
AIMServiceRouting configures optional HTTP routing for the service.
Appears in: - AIMServiceSpec
| Field | Description | Default | Validation |
|---|---|---|---|
| enabled boolean | Enabled toggles HTTP routing management. | false | |
| gatewayRef ParentReference | GatewayRef identifies the Gateway parent that should receive the HTTPRoute. When omitted while routing is enabled, reconciliation will report a failure. | | |
| annotations object (keys:string, values:string) | Annotations to add to the HTTPRoute resource. | | |
| routeTemplate string | RouteTemplate overrides the HTTP path template used for routing. The value is rendered against the AIMService object using JSONPath expressions. | | |
AIMServiceRoutingStatus
AIMServiceRoutingStatus captures observed routing details.
Appears in: - AIMServiceStatus
| Field | Description | Default | Validation |
|---|---|---|---|
| path string | Path is the HTTP path prefix used when routing is enabled. Example: /tenant/svc-uuid. | | |
AIMServiceSpec
AIMServiceSpec defines the desired state of AIMService.
Binds a canonical model to an AIMServiceTemplate and configures replicas, caching behavior, and optional overrides. The template governs the base runtime selection knobs, while the overrides field allows service-specific customization.
Appears in: - AIMService
| Field | Description | Default | Validation |
|---|---|---|---|
| aimImageName string | AIMImageName is the canonical model name (including version/revision) to deploy. Expected to match the metadata.name of an AIMImage. Example: meta-llama-3-8b-1-1-20240915. | | MinLength: 1 |
| templateRef string | TemplateRef is the name of the AIMServiceTemplate or AIMClusterServiceTemplate to use. The template selects the runtime profile and GPU parameters. | | |
| cacheModel boolean | CacheModel requests that model sources be cached when starting the service if the template itself does not warm the cache. When warmCache: false on the template, this setting ensures caching is performed before the service becomes ready. | false | |
| replicas integer | Replicas overrides the number of replicas for this service. Other runtime settings remain governed by the template unless overridden. | 1 | |
| runtimeConfigName string | RuntimeConfigName references the AIM runtime configuration (by name) to use for this service. | default | |
| resources ResourceRequirements | Resources overrides the container resource requirements for this service. When specified, these values take precedence over the template and image defaults. | | |
| overrides AIMServiceOverrides | Overrides allows overriding specific template parameters for this service. When specified, these values take precedence over the template values. | | |
| env EnvVar array | Env specifies environment variables to use for authentication when downloading models. These variables are used for authentication with model registries (e.g., HuggingFace tokens). | | |
| imagePullSecrets LocalObjectReference array | ImagePullSecrets references secrets for pulling AIM container images. | | |
| routing AIMServiceRouting | Routing enables HTTP routing through Gateway API for this service. | | |
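
The resources fields on AIMImageSpec, the template specs, and AIMServiceSpec form a precedence chain: image defaults are overridden by template values, which are in turn overridden by service values. The sketch below illustrates that ordering with plain dicts. Merging key by key within requests and limits is an assumption; this reference does not say whether the operator merges individual keys or replaces the whole block.

```python
from typing import Optional


def effective_resources(
    image: Optional[dict] = None,
    template: Optional[dict] = None,
    service: Optional[dict] = None,
) -> dict:
    """Combine resource requirements, lowest precedence first (illustrative only)."""
    merged = {"requests": {}, "limits": {}}
    # Later layers overwrite earlier ones: image < template < service.
    for layer in (image, template, service):
        for section in ("requests", "limits"):
            merged[section].update((layer or {}).get(section) or {})
    return merged
```

For example, a service that sets only limits.memory would keep the cpu and memory requests inherited from the template or image under this merge strategy.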
AIMServiceStatus
AIMServiceStatus defines the observed state of AIMService.
Appears in: - AIMService
| Field | Description | Default | Validation |
|---|---|---|---|
| observedGeneration integer | ObservedGeneration is the most recent generation observed by the controller. | | |
| conditions Condition array | Conditions represent the latest observations of service state. | | |
| resolvedRuntimeConfig AIMResolvedRuntimeConfig | ResolvedRuntimeConfig captures metadata about the runtime config that was resolved. | | |
| resolvedImage AIMResolvedReference | ResolvedImage captures metadata about the image that was resolved. | | |
| status AIMServiceStatusEnum | Status represents the current high-level status of the service lifecycle. Values: Pending, Starting, Running, Failed, Degraded. | Pending | Enum: [Pending Starting Running Failed Degraded] |
| routing AIMServiceRoutingStatus | Routing surfaces information about the configured HTTP routing, when enabled. | | |
| resolvedTemplate AIMServiceResolvedTemplate | ResolvedTemplate captures metadata about the template that satisfied the reference. | | |
AIMServiceStatusEnum
Underlying type: string
AIMServiceStatusEnum defines coarse-grained states for a service.
Validation: - Enum: [Pending Starting Running Failed Degraded]
Appears in: - AIMServiceStatus
| Field | Description |
|---|---|
| Pending | AIMServiceStatusPending denotes that the template has been created and discovery has not yet started. |
| Starting | AIMServiceStatusStarting denotes that discovery and/or cache warm is in progress. |
| Running | AIMServiceStatusRunning denotes that discovery succeeded and, if requested, caches are warmed. |
| Failed | AIMServiceStatusFailed denotes a terminal failure for discovery or warm operations. |
| Degraded | AIMServiceStatusDegraded denotes a recoverable failure state. |
AIMServiceTemplate
Appears in: - AIMServiceTemplateList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMServiceTemplate | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec AIMServiceTemplateSpec | | | |
| status AIMServiceTemplateStatus | | | |
AIMServiceTemplateList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMServiceTemplateList | | |
| metadata ListMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| items AIMServiceTemplate array | | | |
AIMServiceTemplateSpec
AIMServiceTemplateSpec defines the desired state of AIMServiceTemplate (namespace-scoped).
A namespaced and versioned template that selects a runtime profile for a given AIM model (by canonical name). Templates are intentionally narrow: they describe runtime selection knobs for the AIM container and do not redefine the full Kubernetes deployment shape.
Appears in: - AIMServiceTemplate
| Field | Description | Default | Validation |
|---|---|---|---|
| aimImageName string | AIMImageName is the AIM image name. Matches metadata.name of an AIMImage. Immutable. Example: meta/llama-3-8b:1.1+20240915 | | MinLength: 1 |
| metric AIMMetric | Metric selects the optimization goal: latency prioritizes low end-to-end latency; throughput prioritizes sustained requests/second. | | Enum: [latency throughput] |
| precision AIMPrecision | Precision selects the numeric precision used by the runtime. | | Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
| gpuSelector AimGpuSelector | AimGpuSelector contains the strategy to choose the resources to give each replica. | | |
| runtimeConfigName string | RuntimeConfigName references the AIM runtime configuration (by name) to use for this template. | default | |
| resources ResourceRequirements | Resources defines the default container resource requirements applied to services derived from this template. Service-specific values override the template defaults. | | |
| caching AIMTemplateCachingConfig | Caching configures model caching behavior for this namespace-scoped template. When enabled, models will be cached using the specified environment variables during download. | | |
| env EnvVar array | Env specifies environment variables to use for authentication when downloading models. These variables are used for authentication with model registries (e.g., HuggingFace tokens). | | |
| imagePullSecrets LocalObjectReference array | ImagePullSecrets references secrets for pulling AIM container images. | | |
AIMServiceTemplateSpecCommon
AIMServiceTemplateSpecCommon contains the shared fields for both cluster-scoped and namespace-scoped service templates.
Appears in: - AIMClusterServiceTemplateSpec - AIMServiceTemplateSpec
| Field | Description | Default | Validation |
|---|---|---|---|
| aimImageName string | AIMImageName is the AIM image name. Matches metadata.name of an AIMImage. Immutable. Example: meta/llama-3-8b:1.1+20240915 | | MinLength: 1 |
| metric AIMMetric | Metric selects the optimization goal: latency prioritizes low end-to-end latency; throughput prioritizes sustained requests/second. | | Enum: [latency throughput] |
| precision AIMPrecision | Precision selects the numeric precision used by the runtime. | | Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8] |
| gpuSelector AimGpuSelector | AimGpuSelector contains the strategy to choose the resources to give each replica. | | |
| runtimeConfigName string | RuntimeConfigName references the AIM runtime configuration (by name) to use for this template. | default | |
| resources ResourceRequirements | Resources defines the default container resource requirements applied to services derived from this template. Service-specific values override the template defaults. | | |
AIMServiceTemplateStatus
AIMServiceTemplateStatus defines the observed state of AIMServiceTemplate.
Appears in: - AIMClusterServiceTemplate - AIMServiceTemplate
| Field | Description | Default | Validation |
|---|---|---|---|
| observedGeneration integer | ObservedGeneration is the most recent generation observed by the controller. | | |
| conditions Condition array | Conditions represent the latest observations of template state. | | |
| resolvedRuntimeConfig AIMResolvedRuntimeConfig | ResolvedRuntimeConfig captures metadata about the runtime config that was resolved. | | |
| resolvedImage AIMResolvedReference | ResolvedImage captures metadata about the image that was resolved. | | |
| status AIMTemplateStatusEnum | Status represents the current high-level status of the template lifecycle. Values: Pending, Progressing, Available, Degraded, Failed. | Pending | Enum: [Pending Progressing Available Degraded Failed] |
| modelSources AIMModelSource array | ModelSources list the models that this template requires to run. These are the models that will be cached, if this template is cached. | | |
| profile AIMProfile | Profile contains the full discovery result profile as a free-form JSON object. This includes metadata, engine args, environment variables, and model details. | | |
AIMTemplateCache
AIMTemplateCache pre-warms model caches for a specified template.
Appears in: - AIMTemplateCacheList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMTemplateCache | | |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| spec AIMTemplateCacheSpec | | | |
| status AIMTemplateCacheStatus | | | |
AIMTemplateCacheList
AIMTemplateCacheList contains a list of AIMTemplateCache.
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | aim.silogen.ai/v1alpha1 | | |
| kind string | AIMTemplateCacheList | | |
| metadata ListMeta | Refer to Kubernetes API documentation for fields of metadata. | | |
| items AIMTemplateCache array | | | |
AIMTemplateCacheSpec
AIMTemplateCacheSpec defines the desired state of AIMTemplateCache.
Appears in: - AIMTemplateCache
| Field | Description | Default | Validation |
|---|---|---|---|
| templateRef string | TemplateRef is the name of the AIMServiceTemplate or AIMClusterServiceTemplate to cache. The controller first looks for a namespace-scoped AIMServiceTemplate in the same namespace. If none is found, it looks for a cluster-scoped AIMClusterServiceTemplate with the same name. Namespace-scoped templates take priority over cluster-scoped templates. | | MinLength: 1 |
| env EnvVar array | Env specifies environment variables used to authenticate with model registries when downloading models (e.g., HuggingFace tokens). | | |
| imagePullSecrets LocalObjectReference array | ImagePullSecrets references secrets for pulling AIM container images. | | |
| storageClassName string | StorageClassName is the name of the storage class to use for this cache. | | |
| runtimeConfigName string | RuntimeConfigName references the AIM runtime configuration (by name) to use for this template cache. | default | |
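The lookup order described for templateRef (namespace-scoped first, then cluster-scoped) can be sketched as follows. This is an illustration only, not the controller's code: `resolve_template` is a hypothetical helper, and the two dictionaries stand in for API server lookups. The returned kind string matches the values documented for resolvedTemplateKind in AIMTemplateCacheStatus.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_template(
    template_ref: str,
    namespace: str,
    namespaced: Dict[Tuple[str, str], Any],
    cluster: Dict[str, Any],
) -> Optional[Tuple[str, Any]]:
    """Return (kind, template) for template_ref, or None if no template matches.

    namespaced is keyed by (namespace, name); cluster is keyed by name.
    Both are in-memory stand-ins for real API lookups.
    """
    # Namespace-scoped templates take priority over cluster-scoped ones.
    key = (namespace, template_ref)
    if key in namespaced:
        return "AIMServiceTemplate", namespaced[key]
    if template_ref in cluster:
        return "AIMClusterServiceTemplate", cluster[template_ref]
    return None


# Hypothetical templates for illustration.
namespaced_templates = {("team-a", "llama"): {"scope": "namespace"}}
cluster_templates = {"llama": {"scope": "cluster"}, "mixtral": {"scope": "cluster"}}

print(resolve_template("llama", "team-a", namespaced_templates, cluster_templates))
print(resolve_template("llama", "team-b", namespaced_templates, cluster_templates))
print(resolve_template("missing", "team-a", namespaced_templates, cluster_templates))
```

In this sketch, a cache in `team-a` resolves to the namespace-scoped template even though a cluster-scoped template with the same name exists.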
AIMTemplateCacheStatus
AIMTemplateCacheStatus defines the observed state of AIMTemplateCache.
Appears in: - AIMTemplateCache
| Field | Description | Default | Validation |
|---|---|---|---|
| observedGeneration integer | ObservedGeneration is the most recent generation observed by the controller. | | |
| conditions Condition array | Conditions represent the latest observations of the template cache state. | | |
| resolvedRuntimeConfig AIMResolvedRuntimeConfig | ResolvedRuntimeConfig captures metadata about the runtime config that was resolved. | | |
| status AIMTemplateCacheStatusEnum | Status represents the current high-level status of the template cache. | Pending | Enum: [Pending Progressing Available Failed] |
| resolvedTemplateKind string | ResolvedTemplateKind indicates whether the template resolved to a namespace-scoped AIMServiceTemplate or cluster-scoped AIMClusterServiceTemplate. Values: "AIMServiceTemplate", "AIMClusterServiceTemplate". | | |
AIMTemplateCacheStatusEnum
Underlying type: string
AIMTemplateCacheStatusEnum defines the status of the template cache.
Validation: - Enum: [Pending Progressing Available Failed]
Appears in: - AIMTemplateCacheStatus
| Field | Description |
|---|---|
| Pending | AIMTemplateCacheStatusPending denotes that the template cache has been created but not yet processed. |
| Progressing | AIMTemplateCacheStatusProgressing denotes that the template cache is being warmed. |
| Available | AIMTemplateCacheStatusAvailable denotes that the template cache is ready and models are cached. |
| Failed | AIMTemplateCacheStatusFailed denotes that the template cache operation has failed. |
AIMTemplateCachingConfig
AIMTemplateCachingConfig configures model caching behavior for namespace-scoped templates.
Appears in: - AIMServiceTemplateSpec
| Field | Description | Default | Validation |
|---|---|---|---|
| enabled boolean | Enabled controls whether caching is enabled for this template. Defaults to false. | false | |
| env EnvVar array | Env specifies environment variables to use when downloading the model. These variables are available to the model download process and can be used to configure download behavior, authentication, proxies, etc. | | |
AIMTemplateStatusEnum
Underlying type: string
AIMTemplateStatusEnum defines coarse-grained states for a template.
Validation: - Enum: [Pending Progressing Available Degraded Failed]
Appears in: - AIMServiceTemplateStatus
| Field | Description |
|---|---|
| Pending | AIMTemplateStatusPending denotes that the template has been created and discovery has not yet started. |
| Progressing | AIMTemplateStatusProgressing denotes that discovery and/or cache warming is in progress. |
| Available | AIMTemplateStatusAvailable denotes that discovery succeeded and, if requested, caches are warmed. |
| Degraded | AIMTemplateStatusDegraded denotes that the template is non-functional, for example because the cluster lacks the specified resources. |
| Failed | AIMTemplateStatusFailed denotes a terminal failure for discovery or cache-warming operations. |
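A client that waits for a template to become usable has to decide what each state means for its polling loop. The mapping below is one possible reading of the descriptions above, offered as a sketch: the action names and the `next_action` helper are hypothetical, and treating Degraded as something to investigate rather than to wait on is an assumption, not documented behavior.

```python
# Hypothetical mapping from AIMTemplateStatusEnum values to client actions.
ACTIONS = {
    "Pending": "wait",         # discovery has not started yet
    "Progressing": "wait",     # discovery and/or cache warming is running
    "Available": "use",        # discovery succeeded; caches warmed if requested
    "Degraded": "investigate", # assumption: needs operator attention
    "Failed": "stop",          # documented as a terminal failure
}


def next_action(status: str) -> str:
    """Return the client action for a template status, rejecting unknown values."""
    try:
        return ACTIONS[status]
    except KeyError:
        raise ValueError("unknown template status: " + repr(status))


for status in ("Pending", "Available", "Failed"):
    print(status, "->", next_action(status))
```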
AimGpuSelector
Appears in: - AIMClusterServiceTemplateSpec - AIMRuntimeParameters - AIMServiceOverrides - AIMServiceTemplateSpec - AIMServiceTemplateSpecCommon
| Field | Description | Default | Validation |
|---|---|---|---|
| count integer | Count is the number of GPU resources requested per replica. | | Minimum: 1 |
| model string | Model is the name of the GPU model supported by this template. | | MinLength: 1 |
ImageMetadata
ImageMetadata contains metadata extracted from or provided for a container image.
Appears in: - AIMImageStatus
| Field | Description | Default | Validation |
|---|---|---|---|
| model ModelMetadata | Model contains AMD Silogen model-specific metadata. | | |
| oci OCIMetadata | OCI contains standard OCI image metadata. | | |
ModelMetadata
ModelMetadata contains AMD Silogen model-specific metadata extracted from image labels.
Appears in: - ImageMetadata
| Field | Description | Default | Validation |
|---|---|---|---|
| canonicalName string | CanonicalName is the canonical model identifier (e.g., mistralai/Mixtral-8x22B-Instruct-v0.1). Extracted from: org.amd.silogen.model.canonicalName | | |
| source string | Source is the URL where the model can be found. Extracted from: org.amd.silogen.model.source | | |
| tags string array | Tags are descriptive tags (e.g., ["text-generation", "chat", "instruction"]). Extracted from: org.amd.silogen.model.tags (comma-separated) | | |
| versions string array | Versions lists available versions. Extracted from: org.amd.silogen.model.versions (comma-separated) | | |
| variants string array | Variants lists model variants. Extracted from: org.amd.silogen.model.variants (comma-separated) | | |
| hfTokenRequired boolean | HFTokenRequired indicates if a HuggingFace token is required. Extracted from: org.amd.silogen.hfToken.required | | |
| title string | Title is the Silogen-specific title for the model. Extracted from: org.amd.silogen.title | | |
| descriptionFull string | DescriptionFull is the full description. Extracted from: org.amd.silogen.description.full | | |
| releaseNotes string | ReleaseNotes contains release notes for this version. Extracted from: org.amd.silogen.release.notes | | |
| recommendedDeployments RecommendedDeployment array | RecommendedDeployments contains recommended deployment configurations. Extracted from: org.amd.silogen.model.recommendedDeployments (parsed from JSON array) | | |
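The label-to-field mapping in this table can be illustrated with a small parser. This is a sketch under stated assumptions rather than the operator's extraction code: `parse_model_metadata` and `split_csv` are hypothetical helpers, the boolean label is assumed to be encoded as the string "true" or "false", and the JSON array is passed through without interpreting its keys.

```python
import json
from typing import Any, Dict, List

PREFIX = "org.amd.silogen."

# Fields copied verbatim from a single label (field name -> label suffix).
STRING_FIELDS = {
    "canonicalName": "model.canonicalName",
    "source": "model.source",
    "title": "title",
    "descriptionFull": "description.full",
    "releaseNotes": "release.notes",
}


def split_csv(value: str) -> List[str]:
    """Split a comma-separated label value, dropping blanks and extra spaces."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_model_metadata(labels: Dict[str, str]) -> Dict[str, Any]:
    """Build a ModelMetadata-shaped dict from image labels; absent labels are skipped."""
    meta: Dict[str, Any] = {}
    for field, suffix in STRING_FIELDS.items():
        if PREFIX + suffix in labels:
            meta[field] = labels[PREFIX + suffix]
    for field in ("tags", "versions", "variants"):
        key = PREFIX + "model." + field
        if key in labels:
            meta[field] = split_csv(labels[key])
    hf_key = PREFIX + "hfToken.required"
    if hf_key in labels:
        # Assumption: the label holds "true" or "false" as text.
        meta["hfTokenRequired"] = labels[hf_key].strip().lower() == "true"
    rec_key = PREFIX + "model.recommendedDeployments"
    if rec_key in labels:
        meta["recommendedDeployments"] = json.loads(labels[rec_key])
    return meta


# Hypothetical labels for illustration.
example_labels = {
    "org.amd.silogen.model.canonicalName": "mistralai/Mixtral-8x22B-Instruct-v0.1",
    "org.amd.silogen.model.tags": "text-generation, chat,instruction",
    "org.amd.silogen.hfToken.required": "true",
    "org.amd.silogen.model.recommendedDeployments": "[]",
}

print(parse_model_metadata(example_labels))
```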
OCIMetadata
OCIMetadata contains standard OCI image metadata extracted from image labels.
Appears in: - ImageMetadata
| Field | Description | Default | Validation |
|---|---|---|---|
| title string | Title is the human-readable title. Extracted from: org.opencontainers.image.title | | |
| description string | Description is a brief description. Extracted from: org.opencontainers.image.description | | |
| licenses string | Licenses is the SPDX license identifier(s). Extracted from: org.opencontainers.image.licenses | | |
| vendor string | Vendor is the organization that produced the image. Extracted from: org.opencontainers.image.vendor | | |
| authors string | Authors is contact details of the authors. Extracted from: org.opencontainers.image.authors | | |
| source string | Source is the URL to the source code repository. Extracted from: org.opencontainers.image.source | | |
| documentation string | Documentation is the URL to documentation. Extracted from: org.opencontainers.image.documentation | | |
| created string | Created is the creation timestamp. Extracted from: org.opencontainers.image.created | | |
| revision string | Revision is the source control revision. Extracted from: org.opencontainers.image.revision | | |
| version string | Version is the image version. Extracted from: org.opencontainers.image.version | | |
RecommendedDeployment
RecommendedDeployment describes a recommended deployment configuration for a model.
Appears in: - ModelMetadata
| Field | Description | Default | Validation |
|---|---|---|---|
| gpuModel string | GPUModel is the GPU model name (e.g., MI300X, MI325X). | | |
| gpuCount integer | GPUCount is the number of GPUs required. | | |
| precision string | Precision is the recommended precision (e.g., fp8, fp16, bf16). | | |
| metric string | Metric is the optimization target (e.g., latency, throughput). | | |
| description string | Description provides additional context about this deployment configuration. | | |
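One way a client might use these entries is to filter them by the GPU model available in the cluster and, optionally, by optimization target. The sketch below assumes the entries have already been read from status as dictionaries keyed by the field names in the table; `pick_deployments` and the sample data are hypothetical.

```python
from typing import Any, Dict, List, Optional


def pick_deployments(
    recommendations: List[Dict[str, Any]],
    gpu_model: str,
    metric: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Return the recommendations matching gpu_model and, if given, metric."""
    return [
        rec
        for rec in recommendations
        if rec.get("gpuModel") == gpu_model
        and (metric is None or rec.get("metric") == metric)
    ]


# Hypothetical recommendations for illustration.
recommendations = [
    {"gpuModel": "MI300X", "gpuCount": 1, "precision": "fp8", "metric": "latency"},
    {"gpuModel": "MI300X", "gpuCount": 8, "precision": "fp16", "metric": "throughput"},
    {"gpuModel": "MI325X", "gpuCount": 1, "precision": "fp8", "metric": "latency"},
]

for rec in pick_deployments(recommendations, "MI300X", metric="latency"):
    print(rec["gpuModel"], rec["gpuCount"], rec["precision"], rec["metric"])
```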