API Reference

Packages

aim.silogen.ai/v1alpha1

Package v1alpha1 contains API Schema definitions for the AIM v1alpha1 API group.

Resource Types

AIMClusterImage

AIMClusterImage is the Schema for cluster-scoped AIM image catalog entries.

Appears in: - AIMClusterImageList

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMClusterImage
metadata ObjectMeta Refer to Kubernetes API documentation for fields of metadata.
spec AIMImageSpec
status AIMImageStatus

AIMClusterImageList

AIMClusterImageList contains a list of AIMClusterImage.

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMClusterImageList
metadata ListMeta Refer to Kubernetes API documentation for fields of metadata.
items AIMClusterImage array

AIMClusterRuntimeConfig

AIMClusterRuntimeConfig defines cluster-scoped runtime defaults for AIM resources.

Appears in: - AIMClusterRuntimeConfigList

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMClusterRuntimeConfig
metadata ObjectMeta Refer to Kubernetes API documentation for fields of metadata.
spec AIMClusterRuntimeConfigSpec
status AIMRuntimeConfigStatus

AIMClusterRuntimeConfigList

AIMClusterRuntimeConfigList contains a list of AIMClusterRuntimeConfig.

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMClusterRuntimeConfigList
metadata ListMeta Refer to Kubernetes API documentation for fields of metadata.
items AIMClusterRuntimeConfig array

AIMClusterRuntimeConfigSpec

AIMClusterRuntimeConfigSpec defines cluster-wide defaults for AIM resources.

Appears in: - AIMClusterRuntimeConfig

Field Description Default Validation
defaultStorageClassName string DefaultStorageClassName is the storage class used for model caches when one is not
specified directly on the consumer resource.
routing AIMRuntimeRoutingConfig Routing controls HTTP routing defaults applied to AIM resources.

AIMClusterServiceTemplate

Appears in: - AIMClusterServiceTemplateList

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMClusterServiceTemplate
metadata ObjectMeta Refer to Kubernetes API documentation for fields of metadata.
spec AIMClusterServiceTemplateSpec
status AIMServiceTemplateStatus

AIMClusterServiceTemplateList

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMClusterServiceTemplateList
metadata ListMeta Refer to Kubernetes API documentation for fields of metadata.
items AIMClusterServiceTemplate array

AIMClusterServiceTemplateSpec

AIMClusterServiceTemplateSpec defines the desired state of AIMClusterServiceTemplate (cluster-scoped).

A cluster-scoped template that selects a runtime profile for a given AIM model.

Appears in: - AIMClusterServiceTemplate

Field Description Default Validation
aimImageName string AIMImageName is the AIM image name. Matches metadata.name of an AIMImage. Immutable.
Example: meta/llama-3-8b:1.1+20240915
MinLength: 1
metric AIMMetric Metric selects the optimization goal.
- latency: prioritize low end‑to‑end latency
- throughput: prioritize sustained requests/second
Enum: [latency throughput]
precision AIMPrecision Precision selects the numeric precision used by the runtime. Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8]
gpuSelector AimGpuSelector AimGpuSelector contains the strategy to choose the resources to give each replica
runtimeConfigName string RuntimeConfigName references the AIM runtime configuration (by name) to use for this template. default
resources ResourceRequirements Resources defines the default container resource requirements applied to services derived from this template.
Service-specific values override the template defaults.

AIMDiscoveryProfileMetadata

Appears in: - AIMDiscoveryProfile

Field Description Default Validation
engine string
gpu string
gpu_count integer
metric AIMMetric Enum: [latency throughput]
precision AIMPrecision Enum: [bf16 fp16 fp8 int8]

AIMImage

AIMImage is the Schema for namespace-scoped AIM image catalog entries.

Appears in: - AIMImageList

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMImage
metadata ObjectMeta Refer to Kubernetes API documentation for fields of metadata.
spec AIMImageSpec
status AIMImageStatus

AIMImageDiscoverySpec

AIMImageDiscoverySpec configures metadata discovery and template generation for an image.

Appears in: - AIMImageSpec

Field Description Default Validation
enabled boolean Enabled toggles metadata discovery for this image. Disabled by default.
autoCreateTemplates boolean AutoCreateTemplates controls whether recommended deployments from discovery
automatically create ServiceTemplates. Enabled by default when discovery runs.

AIMImageList

AIMImageList contains a list of AIMImage.

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMImageList
metadata ListMeta Refer to Kubernetes API documentation for fields of metadata.
items AIMImage array

AIMImageSpec

AIMImageSpec defines the desired state of AIMImage.

Appears in: - AIMClusterImage - AIMImage

Field Description Default Validation
image string Image is the container image URI for this AIM model.
This image is inspected by the operator to select runtime profiles used by templates.
MinLength: 1
defaultServiceTemplate string DefaultServiceTemplate is the default template to use for this image when the user does not specify one.
discovery AIMImageDiscoverySpec Discovery controls metadata extraction and automatic template creation for this image.
resources ResourceRequirements Resources defines the default resource requirements for services using this image.
Template- or service-level values override these defaults.
Must have both cpu and memory in requests
Must have memory in limits
Required: {}
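
The two rules on resources (cpu and memory in requests, memory in limits) can be checked before a manifest is applied. The following is a minimal sketch, assuming the resources block is available as a plain dict; the function name is hypothetical and is not part of the operator.

```python
def validate_image_resources(resources):
    """Return a list of rule violations; an empty list means the value passes."""
    errors = []
    requests = resources.get("requests") or {}
    limits = resources.get("limits") or {}
    # Rule 1: requests must carry both cpu and memory.
    for key in ("cpu", "memory"):
        if key not in requests:
            errors.append("requests." + key + " is required")
    # Rule 2: limits must carry memory.
    if "memory" not in limits:
        errors.append("limits.memory is required")
    return errors


complete = {"requests": {"cpu": "4", "memory": "32Gi"}, "limits": {"memory": "64Gi"}}
incomplete = {"requests": {"cpu": "4"}}
print(validate_image_resources(complete))    # []
print(validate_image_resources(incomplete))
```

The quantities shown are illustrative values, not recommended sizes.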

AIMImageStatus

AIMImageStatus defines the observed state of AIMImage.

Appears in: - AIMClusterImage - AIMImage

Field Description Default Validation
observedGeneration integer ObservedGeneration is the most recent generation observed by the controller
status AIMImageStatusEnum Status represents the overall status of the image based on its templates Pending Enum: [Pending Progressing Ready Degraded Failed]
conditions Condition array Conditions represent the latest available observations of the model's state
resolvedRuntimeConfig AIMResolvedRuntimeConfig ResolvedRuntimeConfig captures metadata about the runtime config that was resolved.
imageMetadata ImageMetadata ImageMetadata is the metadata extracted from an AIM image

AIMImageStatusEnum

Underlying type: string

AIMImageStatusEnum represents the overall status of an AIMImage.

Validation: - Enum: [Pending Progressing Ready Degraded Failed]

Appears in: - AIMImageStatus

Field Description
Pending AIMImageStatusPending indicates the image has been created but template generation has not started.
Progressing AIMImageStatusProgressing indicates one or more templates are still being discovered.
Ready AIMImageStatusReady indicates all templates are available and ready.
Degraded AIMImageStatusDegraded indicates one or more templates are degraded or failed.
Failed AIMImageStatusFailed indicates all templates are degraded or failed.
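
One way to read the five values above is as an aggregation over the statuses of the image's templates. The sketch below is an interpretation, not the controller's code: the function name is hypothetical, template statuses are taken from AIMTemplateStatusEnum, and the precedence (Failed, then Degraded, then Ready, then Progressing) is an assumption.

```python
def aggregate_image_status(template_statuses):
    """Derive an image status from a list of template status strings."""
    # Assumption: no templates yet means generation has not started.
    if not template_statuses:
        return "Pending"
    unhealthy = [s for s in template_statuses if s in ("Degraded", "Failed")]
    # All templates degraded or failed.
    if len(unhealthy) == len(template_statuses):
        return "Failed"
    # At least one, but not all, degraded or failed.
    if unhealthy:
        return "Degraded"
    # Every template is available.
    if all(s == "Available" for s in template_statuses):
        return "Ready"
    # Otherwise some template is still pending or being discovered.
    return "Progressing"


print(aggregate_image_status(["Available", "Progressing"]))  # Progressing
print(aggregate_image_status(["Available", "Failed"]))       # Degraded
```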

AIMMetric

Underlying type: string

AIMMetric enumerates the targeted service characteristic

Validation: - Enum: [latency throughput]

Appears in: - AIMClusterServiceTemplateSpec - AIMDiscoveryProfileMetadata - AIMProfileMetadata - AIMRuntimeParameters - AIMServiceOverrides - AIMServiceTemplateSpec - AIMServiceTemplateSpecCommon

Field Description
latency
throughput

AIMModelCache

AIMModelCache is the Schema for the modelcaches API

Appears in: - AIMModelCacheList

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMModelCache
metadata ObjectMeta Refer to Kubernetes API documentation for fields of metadata.
spec AIMModelCacheSpec
status AIMModelCacheStatus

AIMModelCacheList

AIMModelCacheList contains a list of AIMModelCache

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMModelCacheList
metadata ListMeta Refer to Kubernetes API documentation for fields of metadata.
items AIMModelCache array

AIMModelCacheSpec

AIMModelCacheSpec defines the desired state of AIMModelCache

Appears in: - AIMModelCache

Field Description Default Validation
sourceUri string SourceURI is the source of the model to be downloaded. This is the only
identifier
MinLength: 1
Pattern: ^(hf|s3)://[^ \t\r\n]+$
storageClassName string StorageClassName specifies the storage class for the cache volume
size Quantity Size specifies the size of the cache volume
env EnvVar array Env lists the environment variables to use for authentication when downloading models.
These variables are used for authentication with model registries (e.g., HuggingFace tokens).
modelDownloadImage string ModelDownloadImage is the image used to download the model kserve/storage-initializer:v0.16.0-rc0
imagePullSecrets LocalObjectReference array ImagePullSecrets references secrets for pulling AIM container images.
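
The sourceUri constraints (MinLength 1 plus the pattern) can be tried out locally. This sketch takes `(hf|s3)` in the pattern as an alternation between the two schemes. The helper name and the sample URIs are hypothetical.

```python
import re

# Same expression as the documented Pattern for sourceUri.
SOURCE_URI_PATTERN = re.compile(r"^(hf|s3)://[^ \t\r\n]+$")


def is_valid_source_uri(uri):
    """True when uri is non-empty and matches the documented pattern."""
    # fullmatch is used so a trailing newline is rejected; with match(),
    # Python's "$" would also accept a position just before a final newline.
    return bool(uri) and SOURCE_URI_PATTERN.fullmatch(uri) is not None


print(is_valid_source_uri("hf://example-org/example-model"))  # True
print(is_valid_source_uri("s3://example-bucket/models/a"))    # True
print(is_valid_source_uri("https://example.com/model"))       # False
print(is_valid_source_uri("hf://has space"))                  # False
```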

AIMModelCacheStatus

AIMModelCacheStatus defines the observed state of AIMModelCache

Appears in: - AIMModelCache

Field Description Default Validation
observedGeneration integer
conditions Condition array Conditions represent the latest available observations of the model cache's state
status AIMModelCacheStatusEnum Status represents the current status of the model cache Pending Enum: [Pending Progressing Available Failed]
lastUsed Time LastUsed represents the last time a model was deployed that used this cache
persistentVolumeClaim string PersistentVolumeClaim represents the name of the created PVC

AIMModelCacheStatusEnum

Underlying type: string

Validation: - Enum: [Pending Progressing Available Failed]

Appears in: - AIMModelCacheStatus

Field Description
Pending AIMModelCacheStatusPending denotes that the model cache has not been created yet
Progressing AIMModelCacheStatusProgressing denotes that the model cache is currently being filled
Available AIMModelCacheStatusAvailable denotes that a model cache is filled and ready to be used
Failed AIMModelCacheStatusFailed denotes that the model cache has failed. A more detailed reason will be available in the conditions.

AIMModelSource

Appears in: - AIMServiceTemplateStatus

Field Description Default Validation
name string Name is the name of the model
sourceUri string SourceURI is the source where the model should be downloaded from
size Quantity Size is the amount of storage that the source expects

AIMPrecision

Underlying type: string

AIMPrecision enumerates supported numeric precisions

Validation: - Enum: [bf16 fp16 fp8 int8]

Appears in: - AIMClusterServiceTemplateSpec - AIMDiscoveryProfileMetadata - AIMProfileMetadata - AIMRuntimeParameters - AIMServiceOverrides - AIMServiceTemplateSpec - AIMServiceTemplateSpecCommon

Field Description
auto
fp4
fp8
fp16
fp32
bf16
int4
int8

AIMProfile

Appears in: - AIMServiceTemplateStatus

Field Description Default Validation
engine_args JSON Schemaless: {}
env_vars object (keys:string, values:string)
metadata AIMProfileMetadata

AIMProfileMetadata

Appears in: - AIMProfile

Field Description Default Validation
engine string
gpu string
gpu_count integer
metric AIMMetric Enum: [latency throughput]
precision AIMPrecision Enum: [bf16 fp16 fp8 int8]

AIMResolutionScope

Underlying type: string

AIMResolutionScope describes the scope of a resolved reference.

Validation: - Enum: [Namespace Cluster Unknown]

Appears in: - AIMResolvedReference - AIMResolvedRuntimeConfig - AIMServiceResolvedTemplate

Field Description
Namespace AIMResolutionScopeNamespace denotes a namespace-scoped resource.
Cluster AIMResolutionScopeCluster denotes a cluster-scoped resource.
Unknown AIMResolutionScopeUnknown denotes that the scope could not be determined.

AIMResolvedReference

AIMResolvedReference captures metadata about a resolved reference.

Appears in: - AIMResolvedRuntimeConfig - AIMServiceResolvedTemplate - AIMServiceStatus - AIMServiceTemplateStatus

Field Description Default Validation
name string Name is the resource name that satisfied the reference.
namespace string Namespace identifies where the resource was found when namespace-scoped.
Empty indicates a cluster-scoped resource.
scope AIMResolutionScope Scope indicates whether the resolved resource was namespace or cluster scoped. Enum: [Namespace Cluster Unknown]
kind string Kind is the fully-qualified kind of the resolved reference, when known.
uid UID UID captures the unique identifier of the resolved reference, when known.

AIMResolvedRuntimeConfig

AIMResolvedRuntimeConfig captures metadata about the runtime config that was resolved. This follows the same pattern as AIMServiceResolvedTemplate for consistency.

Appears in: - AIMImageStatus - AIMServiceStatus - AIMServiceTemplateStatus - AIMTemplateCacheStatus

Field Description Default Validation
name string Name is the resource name that satisfied the reference.
namespace string Namespace identifies where the resource was found when namespace-scoped.
Empty indicates a cluster-scoped resource.
scope AIMResolutionScope Scope indicates whether the resolved resource was namespace or cluster scoped. Enum: [Namespace Cluster Unknown]
kind string Kind is the fully-qualified kind of the resolved reference, when known.
uid UID UID captures the unique identifier of the resolved reference, when known.

AIMRuntimeConfig

AIMRuntimeConfig defines namespace-scoped runtime overrides for AIM resources.

Appears in: - AIMRuntimeConfigList

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMRuntimeConfig
metadata ObjectMeta Refer to Kubernetes API documentation for fields of metadata.
spec AIMRuntimeConfigSpec
status AIMRuntimeConfigStatus

AIMRuntimeConfigCommon

AIMRuntimeConfigCommon captures configuration fields shared across cluster and namespace scopes.

Appears in: - AIMClusterRuntimeConfigSpec - AIMRuntimeConfigSpec

Field Description Default Validation
defaultStorageClassName string DefaultStorageClassName is the storage class used for model caches when one is not
specified directly on the consumer resource.
routing AIMRuntimeRoutingConfig Routing controls HTTP routing defaults applied to AIM resources.

AIMRuntimeConfigCredentials

AIMRuntimeConfigCredentials captures namespace-scoped authentication knobs.

Appears in: - AIMRuntimeConfigSpec

Field Description Default Validation
serviceAccountName string ServiceAccountName is the service account used for discovery jobs, cache warmers,
and any other workloads spawned by the operator on behalf of this runtime config.
imagePullSecrets LocalObjectReference array ImagePullSecrets are merged with controller defaults when creating pods that need
to pull model or runtime images.

AIMRuntimeConfigList

AIMRuntimeConfigList contains a list of AIMRuntimeConfig.

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMRuntimeConfigList
metadata ListMeta Refer to Kubernetes API documentation for fields of metadata.
items AIMRuntimeConfig array

AIMRuntimeConfigSpec

AIMRuntimeConfigSpec defines namespace-scoped overrides for AIM resources.

Appears in: - AIMRuntimeConfig

Field Description Default Validation
defaultStorageClassName string DefaultStorageClassName is the storage class used for model caches when one is not
specified directly on the consumer resource.
routing AIMRuntimeRoutingConfig Routing controls HTTP routing defaults applied to AIM resources.
serviceAccountName string ServiceAccountName is the service account used for discovery jobs, cache warmers,
and any other workloads spawned by the operator on behalf of this runtime config.
imagePullSecrets LocalObjectReference array ImagePullSecrets are merged with controller defaults when creating pods that need
to pull model or runtime images.
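
The description of imagePullSecrets says they are merged with controller defaults but does not define the merge. The sketch below assumes a union that keeps controller defaults first and drops duplicates by name. That ordering and the de-duplication are assumptions, and the function and secret names are hypothetical.

```python
def merge_pull_secrets(controller_defaults, runtime_config_secrets):
    """Union two lists of LocalObjectReference-like dicts, de-duplicated by name."""
    merged = []
    seen = set()
    for ref in list(controller_defaults) + list(runtime_config_secrets):
        name = ref["name"]
        if name not in seen:
            seen.add(name)
            merged.append({"name": name})
    return merged


defaults = [{"name": "registry-default"}]
from_config = [{"name": "team-registry"}, {"name": "registry-default"}]
print(merge_pull_secrets(defaults, from_config))
```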

AIMRuntimeConfigStatus

AIMRuntimeConfigStatus records the resolved config reference surfaced to consumers.

Appears in: - AIMClusterRuntimeConfig - AIMRuntimeConfig

Field Description Default Validation
observedGeneration integer ObservedGeneration is the last reconciled generation.
conditions Condition array Conditions communicate reconciliation progress.

AIMRuntimeParameters

AIMRuntimeParameters contains the runtime configuration parameters shared across templates and services. Fields use pointers to allow optional usage in different contexts (required in templates, optional in service overrides).

Appears in: - AIMClusterServiceTemplateSpec - AIMServiceOverrides - AIMServiceTemplateSpec - AIMServiceTemplateSpecCommon

Field Description Default Validation
metric AIMMetric Metric selects the optimization goal.
- latency: prioritize low end‑to‑end latency
- throughput: prioritize sustained requests/second
Enum: [latency throughput]
precision AIMPrecision Precision selects the numeric precision used by the runtime. Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8]
gpuSelector AimGpuSelector AimGpuSelector contains the strategy to choose the resources to give each replica

AIMRuntimeRoutingConfig

AIMRuntimeRoutingConfig configures routing defaults applied during inference service creation.

Appears in: - AIMClusterRuntimeConfigSpec - AIMRuntimeConfigCommon - AIMRuntimeConfigSpec

Field Description Default Validation
enabled boolean Enabled toggles HTTP routing management for consumers of this runtime config.
gatewayRef ParentReference GatewayRef identifies the Gateway parent that should receive HTTPRoutes for consumers.
routeTemplate string RouteTemplate renders a HTTP path prefix using the AIMService as context.
Example: /{.metadata.namespace}/{.metadata.labels['team']}/{.spec.model}/
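
To illustrate how a routeTemplate could expand, the sketch below renders the example against a dict standing in for an AIMService, reading the braces in the example as literal `{` and `}`. It handles only dotted field names and quoted bracket keys. The operator's actual JSONPath support is not documented here, so this renderer, its function names, and the sample object are assumptions.

```python
import re

# One path step: ".field" or "['key']".
_STEP = re.compile(r"\.([A-Za-z_][\w-]*)|\['([^']*)'\]")


def lookup(obj, path):
    """Walk obj along a path such as .metadata.labels['team']."""
    value = obj
    pos = 0
    for m in _STEP.finditer(path):
        if m.start() != pos:
            raise ValueError("unsupported path: " + path)
        key = m.group(1) if m.group(1) is not None else m.group(2)
        value = value[key]
        pos = m.end()
    if pos != len(path):
        raise ValueError("unsupported path: " + path)
    return value


def render_route(template, obj):
    """Replace every {path} placeholder with the value found in obj."""
    return re.sub(r"\{([^{}]+)\}", lambda m: str(lookup(obj, m.group(1))), template)


service = {
    "metadata": {"namespace": "ml-team", "labels": {"team": "search"}},
    "spec": {"model": "llama"},
}
template = "/{.metadata.namespace}/{.metadata.labels['team']}/{.spec.model}/"
print(render_route(template, service))  # /ml-team/search/llama/
```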

AIMService

AIMService manages a KServe-based AIM inference service for the selected model and template.

Appears in: - AIMServiceList

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMService
metadata ObjectMeta Refer to Kubernetes API documentation for fields of metadata.
spec AIMServiceSpec
status AIMServiceStatus

AIMServiceList

AIMServiceList contains a list of AIMService.

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMServiceList
metadata ListMeta Refer to Kubernetes API documentation for fields of metadata.
items AIMService array

AIMServiceOverrides

AIMServiceOverrides allows overriding template parameters at the service level. All fields are optional. When specified, they override the corresponding values from the referenced AIMServiceTemplate.

Appears in: - AIMServiceSpec

Field Description Default Validation
metric AIMMetric Metric selects the optimization goal.
- latency: prioritize low end‑to‑end latency
- throughput: prioritize sustained requests/second
Enum: [latency throughput]
precision AIMPrecision Precision selects the numeric precision used by the runtime. Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8]
gpuSelector AimGpuSelector AimGpuSelector contains the strategy to choose the resources to give each replica
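
Because every override field is optional, the effective parameters can be thought of as the template values with any set override applied on top. A minimal sketch, assuming both sides are plain dicts and that an absent or null field means "not overridden"; the function name is hypothetical.

```python
OVERRIDABLE_FIELDS = ("metric", "precision", "gpuSelector")


def apply_overrides(template_params, overrides):
    """Return template parameters with service-level overrides applied."""
    effective = dict(template_params)
    for field in OVERRIDABLE_FIELDS:
        # Only fields that are actually set replace the template value.
        if overrides.get(field) is not None:
            effective[field] = overrides[field]
    return effective


template_params = {"metric": "latency", "precision": "fp16"}
print(apply_overrides(template_params, {"precision": "fp8"}))
# {'metric': 'latency', 'precision': 'fp8'}
```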

AIMServiceResolvedTemplate

AIMServiceResolvedTemplate retains the historical name while reusing the shared structure.

Appears in: - AIMServiceStatus

Field Description Default Validation
name string Name is the resource name that satisfied the reference.
namespace string Namespace identifies where the resource was found when namespace-scoped.
Empty indicates a cluster-scoped resource.
scope AIMResolutionScope Scope indicates whether the resolved resource was namespace or cluster scoped. Enum: [Namespace Cluster Unknown]
kind string Kind is the fully-qualified kind of the resolved reference, when known.
uid UID UID captures the unique identifier of the resolved reference, when known.

AIMServiceRouting

AIMServiceRouting configures optional HTTP routing for the service.

Appears in: - AIMServiceSpec

Field Description Default Validation
enabled boolean Enabled toggles HTTP routing management. false
gatewayRef ParentReference GatewayRef identifies the Gateway parent that should receive the HTTPRoute.
When omitted while routing is enabled, reconciliation will report a failure.
annotations object (keys:string, values:string) Annotations to add to the HTTPRoute resource.
routeTemplate string RouteTemplate overrides the HTTP path template used for routing.
The value is rendered against the AIMService object using JSONPath expressions.
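
The gatewayRef rule (routing enabled without a gateway reference leads to a reconciliation failure) can be caught before the manifest is applied. A minimal sketch, assuming the routing block is a plain dict; the function name, the message text, and the gateway name are hypothetical.

```python
def routing_problem(routing):
    """Return a description of the problem, or None when the block looks usable."""
    enabled = routing.get("enabled", False)  # documented default is false
    if enabled and not routing.get("gatewayRef"):
        return "routing is enabled but gatewayRef is not set"
    return None


print(routing_problem({"enabled": True}))
print(routing_problem({"enabled": True, "gatewayRef": {"name": "example-gateway"}}))  # None
```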

AIMServiceRoutingStatus

AIMServiceRoutingStatus captures observed routing details.

Appears in: - AIMServiceStatus

Field Description Default Validation
path string Path is the HTTP path prefix used when routing is enabled.
Example: /tenant/svc-uuid.

AIMServiceSpec

AIMServiceSpec defines the desired state of AIMService.

Binds a canonical model to an AIMServiceTemplate and configures replicas, caching behavior, and optional overrides. The template governs the base runtime selection knobs, while the overrides field allows service-specific customization.

Appears in: - AIMService

Field Description Default Validation
aimImageName string AIMImageName is the canonical model name (including version/revision) to deploy.
Expected to match the metadata.name of an AIMImage. Example:
meta-llama-3-8b-1-1-20240915.
MinLength: 1
templateRef string TemplateRef is the name of the AIMServiceTemplate or AIMClusterServiceTemplate to use.
The template selects the runtime profile and GPU parameters.
cacheModel boolean CacheModel requests that model sources be cached when starting the service
if the template itself does not warm the cache.
When warmCache: false on the template, this setting ensures caching is
performed before the service becomes ready.
false
replicas integer Replicas overrides the number of replicas for this service.
Other runtime settings remain governed by the template unless overridden.
1
runtimeConfigName string RuntimeConfigName references the AIM runtime configuration (by name) to use for this service. default
resources ResourceRequirements Resources overrides the container resource requirements for this service.
When specified, these values take precedence over the template and image defaults.
overrides AIMServiceOverrides Overrides allows overriding specific template parameters for this service.
When specified, these values take precedence over the template values.
env EnvVar array Env specifies environment variables to use for authentication when downloading models.
These variables are used for authentication with model registries (e.g., HuggingFace tokens).
imagePullSecrets LocalObjectReference array ImagePullSecrets references secrets for pulling AIM container images.
routing AIMServiceRouting Routing enables HTTP routing through Gateway API for this service.
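
The resources field takes part in a three-level precedence: service values win over template values, which win over image defaults. The sketch below assumes the most specific level that sets resources replaces the others wholesale. Whether the operator instead merges individual keys is not stated here, so treat that as an assumption; the function name is hypothetical.

```python
def resolve_resources(image=None, template=None, service=None):
    """Pick the most specific resources block that is set."""
    for candidate in (service, template, image):
        if candidate is not None:
            return candidate
    return None


image_defaults = {"requests": {"cpu": "2", "memory": "16Gi"}, "limits": {"memory": "16Gi"}}
template_defaults = {"requests": {"cpu": "4", "memory": "32Gi"}, "limits": {"memory": "32Gi"}}

# No service-level value, so the template defaults apply.
print(resolve_resources(image=image_defaults, template=template_defaults))
```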

AIMServiceStatus

AIMServiceStatus defines the observed state of AIMService.

Appears in: - AIMService

Field Description Default Validation
observedGeneration integer ObservedGeneration is the most recent generation observed by the controller.
conditions Condition array Conditions represent the latest observations of service state.
resolvedRuntimeConfig AIMResolvedRuntimeConfig ResolvedRuntimeConfig captures metadata about the runtime config that was resolved.
resolvedImage AIMResolvedReference ResolvedImage captures metadata about the image that was resolved.
status AIMServiceStatusEnum Status represents the current high‑level status of the service lifecycle.
Values: Pending, Starting, Running, Failed, Degraded.
Pending Enum: [Pending Starting Running Failed Degraded]
routing AIMServiceRoutingStatus Routing surfaces information about the configured HTTP routing, when enabled.
resolvedTemplate AIMServiceResolvedTemplate ResolvedTemplate captures metadata about the template that satisfied the reference.

AIMServiceStatusEnum

Underlying type: string

AIMServiceStatusEnum defines coarse-grained states for a service.

Validation: - Enum: [Pending Starting Running Failed Degraded]

Appears in: - AIMServiceStatus

Field Description
Pending AIMServiceStatusPending denotes that the service has been created and discovery has not yet started.
Starting AIMServiceStatusStarting denotes that discovery and/or cache warm is in progress.
Running AIMServiceStatusRunning denotes that discovery succeeded and, if requested, caches are warmed.
Failed AIMServiceStatusFailed denotes a terminal failure for discovery or warm operations.
Degraded AIMServiceStatusDegraded denotes a recoverable failure state.

AIMServiceTemplate

Appears in: - AIMServiceTemplateList

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMServiceTemplate
metadata ObjectMeta Refer to Kubernetes API documentation for fields of metadata.
spec AIMServiceTemplateSpec
status AIMServiceTemplateStatus

AIMServiceTemplateList

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMServiceTemplateList
metadata ListMeta Refer to Kubernetes API documentation for fields of metadata.
items AIMServiceTemplate array

AIMServiceTemplateSpec

AIMServiceTemplateSpec defines the desired state of AIMServiceTemplate (namespace-scoped).

A namespaced and versioned template that selects a runtime profile for a given AIM model (by canonical name). Templates are intentionally narrow: they describe runtime selection knobs for the AIM container and do not redefine the full Kubernetes deployment shape.

Appears in: - AIMServiceTemplate

Field Description Default Validation
aimImageName string AIMImageName is the AIM image name. Matches metadata.name of an AIMImage. Immutable.
Example: meta/llama-3-8b:1.1+20240915
MinLength: 1
metric AIMMetric Metric selects the optimization goal.
- latency: prioritize low end‑to‑end latency
- throughput: prioritize sustained requests/second
Enum: [latency throughput]
precision AIMPrecision Precision selects the numeric precision used by the runtime. Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8]
gpuSelector AimGpuSelector AimGpuSelector contains the strategy to choose the resources to give each replica
runtimeConfigName string RuntimeConfigName references the AIM runtime configuration (by name) to use for this template. default
resources ResourceRequirements Resources defines the default container resource requirements applied to services derived from this template.
Service-specific values override the template defaults.
caching AIMTemplateCachingConfig Caching configures model caching behavior for this namespace-scoped template.
When enabled, models will be cached using the specified environment variables
during download.
env EnvVar array Env specifies environment variables to use for authentication when downloading models.
These variables are used for authentication with model registries (e.g., HuggingFace tokens).
imagePullSecrets LocalObjectReference array ImagePullSecrets references secrets for pulling AIM container images.

AIMServiceTemplateSpecCommon

AIMServiceTemplateSpecCommon contains the shared fields for both cluster-scoped and namespace-scoped service templates.

Appears in: - AIMClusterServiceTemplateSpec - AIMServiceTemplateSpec

Field Description Default Validation
aimImageName string AIMImageName is the AIM image name. Matches metadata.name of an AIMImage. Immutable.
Example: meta/llama-3-8b:1.1+20240915
MinLength: 1
metric AIMMetric Metric selects the optimization goal.
- latency: prioritize low end‑to‑end latency
- throughput: prioritize sustained requests/second
Enum: [latency throughput]
precision AIMPrecision Precision selects the numeric precision used by the runtime. Enum: [auto fp4 fp8 fp16 fp32 bf16 int4 int8]
gpuSelector AimGpuSelector AimGpuSelector contains the strategy to choose the resources to give each replica
runtimeConfigName string RuntimeConfigName references the AIM runtime configuration (by name) to use for this template. default
resources ResourceRequirements Resources defines the default container resource requirements applied to services derived from this template.
Service-specific values override the template defaults.

AIMServiceTemplateStatus

AIMServiceTemplateStatus defines the observed state of AIMServiceTemplate.

Appears in: - AIMClusterServiceTemplate - AIMServiceTemplate

Field Description Default Validation
observedGeneration integer ObservedGeneration is the most recent generation observed by the controller.
conditions Condition array Conditions represent the latest observations of template state.
resolvedRuntimeConfig AIMResolvedRuntimeConfig ResolvedRuntimeConfig captures metadata about the runtime config that was resolved.
resolvedImage AIMResolvedReference ResolvedImage captures metadata about the image that was resolved.
status AIMTemplateStatusEnum Status represents the current high‑level status of the template lifecycle.
Values: Pending, Progressing, Available, Degraded, Failed.
Pending Enum: [Pending Progressing Available Degraded Failed]
modelSources AIMModelSource array ModelSources list the models that this template requires to run. These are the models that will be
cached, if this template is cached.
profile AIMProfile Profile contains the full discovery result profile as a free-form JSON object.
This includes metadata, engine args, environment variables, and model details.

AIMTemplateCache

AIMTemplateCache pre-warms model caches for a specified template.

Appears in: - AIMTemplateCacheList

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMTemplateCache
metadata ObjectMeta Refer to Kubernetes API documentation for fields of metadata.
spec AIMTemplateCacheSpec
status AIMTemplateCacheStatus

AIMTemplateCacheList

AIMTemplateCacheList contains a list of AIMTemplateCache.

Field Description Default Validation
apiVersion string aim.silogen.ai/v1alpha1
kind string AIMTemplateCacheList
metadata ListMeta Refer to Kubernetes API documentation for fields of metadata.
items AIMTemplateCache array

AIMTemplateCacheSpec

AIMTemplateCacheSpec defines the desired state of AIMTemplateCache

Appears in: - AIMTemplateCache

Field Description Default Validation
templateRef string TemplateRef is the name of the AIMServiceTemplate or AIMClusterServiceTemplate to cache.
The controller will first look for a namespace-scoped AIMServiceTemplate in the same namespace.
If not found, it will look for a cluster-scoped AIMClusterServiceTemplate with the same name.
Namespace-scoped templates take priority over cluster-scoped templates.
MinLength: 1
env EnvVar array Env specifies environment variables to use for authentication when downloading models.
These variables are used for authentication with model registries (e.g., HuggingFace tokens).
imagePullSecrets LocalObjectReference array ImagePullSecrets references secrets for pulling AIM container images.
storageClassName string StorageClassName is the name of the storage class to use for this cache
runtimeConfigName string RuntimeConfigName references the AIM runtime configuration (by name) to use for this template cache. default

AIMTemplateCacheStatus

AIMTemplateCacheStatus defines the observed state of AIMTemplateCache

Appears in: - AIMTemplateCache

Field Description Default Validation
observedGeneration integer ObservedGeneration is the most recent generation observed by the controller.
conditions Condition array Conditions represent the latest observations of the template cache state.
resolvedRuntimeConfig AIMResolvedRuntimeConfig ResolvedRuntimeConfig captures metadata about the runtime config that was resolved.
status AIMTemplateCacheStatusEnum Status represents the current high-level status of the template cache. Pending Enum: [Pending Progressing Available Failed]
resolvedTemplateKind string ResolvedTemplateKind indicates whether the template resolved to a namespace-scoped
AIMServiceTemplate or cluster-scoped AIMClusterServiceTemplate.
Values: "AIMServiceTemplate", "AIMClusterServiceTemplate"

AIMTemplateCacheStatusEnum

Underlying type: string

AIMTemplateCacheStatusEnum defines the status of the template cache.

Validation: - Enum: [Pending Progressing Available Failed]

Appears in: - AIMTemplateCacheStatus

Field Description
Pending AIMTemplateCacheStatusPending denotes that the template cache has been created but not yet processed.
Progressing AIMTemplateCacheStatusProgressing denotes that the template cache is being warmed.
Available AIMTemplateCacheStatusAvailable denotes that the template cache is ready and models are cached.
Failed AIMTemplateCacheStatusFailed denotes that the template cache operation has failed.

AIMTemplateCachingConfig

AIMTemplateCachingConfig configures model caching behavior for namespace-scoped templates.

Appears in: - AIMServiceTemplateSpec

Field Description Default Validation
enabled boolean Enabled controls whether caching is enabled for this template.
Defaults to false.
false
env EnvVar array Env specifies environment variables to use when downloading the model.
These variables are available to the model download process and can be used
to configure download behavior, authentication, proxies, etc.

AIMTemplateStatusEnum

Underlying type: string

AIMTemplateStatusEnum defines coarse-grained states for a template.

Validation: - Enum: [Pending Progressing Available Degraded Failed]

Appears in: - AIMServiceTemplateStatus

Field Description
Pending AIMTemplateStatusPending denotes that the template has been created and discovery has not yet started.
Progressing AIMTemplateStatusProgressing denotes that discovery and/or cache warm is in progress.
Available AIMTemplateStatusAvailable denotes that discovery succeeded and, if requested, caches are warmed.
Degraded AIMTemplateStatusDegraded denotes that the template is non-functional, for example because the cluster lacks the specified resources.
Failed AIMTemplateStatusFailed denotes a terminal failure for discovery or warm operations.

AimGpuSelector

AimGpuSelector selects the GPU model and the number of GPUs requested per replica.

Appears in: - AIMClusterServiceTemplateSpec - AIMRuntimeParameters - AIMServiceOverrides - AIMServiceTemplateSpec - AIMServiceTemplateSpecCommon

Field Description Default Validation
count integer Count is the number of GPU resources requested per replica. Minimum: 1
model string Model is the model name of the GPU that is supported by this template MinLength: 1

ImageMetadata

ImageMetadata contains metadata extracted from or provided for a container image.

Appears in: - AIMImageStatus

Field Description Default Validation
model ModelMetadata Model contains AMD Silogen model-specific metadata.
oci OCIMetadata OCI contains standard OCI image metadata.

ModelMetadata

ModelMetadata contains AMD Silogen model-specific metadata extracted from image labels.

Appears in: - ImageMetadata

Field Description Default Validation
canonicalName string CanonicalName is the canonical model identifier (e.g., mistralai/Mixtral-8x22B-Instruct-v0.1).
Extracted from: org.amd.silogen.model.canonicalName
source string Source is the URL where the model can be found.
Extracted from: org.amd.silogen.model.source
tags string array Tags are descriptive tags (e.g., ["text-generation", "chat", "instruction"]).
Extracted from: org.amd.silogen.model.tags (comma-separated)
versions string array Versions lists available versions.
Extracted from: org.amd.silogen.model.versions (comma-separated)
variants string array Variants lists model variants.
Extracted from: org.amd.silogen.model.variants (comma-separated)
hfTokenRequired boolean HFTokenRequired indicates if a HuggingFace token is required.
Extracted from: org.amd.silogen.hfToken.required
title string Title is the Silogen-specific title for the model.
Extracted from: org.amd.silogen.title
descriptionFull string DescriptionFull is the full description.
Extracted from: org.amd.silogen.description.full
releaseNotes string ReleaseNotes contains release notes for this version.
Extracted from: org.amd.silogen.release.notes
recommendedDeployments RecommendedDeployment array RecommendedDeployments contains recommended deployment configurations.
Extracted from: org.amd.silogen.model.recommendedDeployments (parsed from JSON array)
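
As a rough illustration of the label-to-field mapping above, the sketch below builds a ModelMetadata-shaped dictionary from a plain dict of image labels. The helper names are hypothetical, and two details are assumptions not stated in this reference: that hfToken.required is encoded as the string "true", and that missing labels yield empty values.

```python
import json
from typing import Any, Dict, List

PREFIX = "org.amd.silogen."


def split_csv(value: str) -> List[str]:
    """Split a comma-separated label value, dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_model_metadata(labels: Dict[str, str]) -> Dict[str, Any]:
    """Map org.amd.silogen.* image labels onto ModelMetadata field names."""

    def get(key: str) -> str:
        # Assumption: a missing label is treated as an empty string.
        return labels.get(PREFIX + key, "")

    return {
        "canonicalName": get("model.canonicalName"),
        "source": get("model.source"),
        "tags": split_csv(get("model.tags")),
        "versions": split_csv(get("model.versions")),
        "variants": split_csv(get("model.variants")),
        # Assumption: the label holds "true" or "false" as text.
        "hfTokenRequired": get("hfToken.required").strip().lower() == "true",
        "title": get("title"),
        "descriptionFull": get("description.full"),
        "releaseNotes": get("release.notes"),
        # The label value is documented as a JSON array.
        "recommendedDeployments": json.loads(
            get("model.recommendedDeployments") or "[]"
        ),
    }
```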

OCIMetadata

OCIMetadata contains standard OCI image metadata extracted from image labels.

Appears in: - ImageMetadata

Field Description Default Validation
title string Title is the human-readable title.
Extracted from: org.opencontainers.image.title
description string Description is a brief description.
Extracted from: org.opencontainers.image.description
licenses string Licenses is the SPDX license identifier(s).
Extracted from: org.opencontainers.image.licenses
vendor string Vendor is the organization that produced the image.
Extracted from: org.opencontainers.image.vendor
authors string Authors contains contact details for the image authors.
Extracted from: org.opencontainers.image.authors
source string Source is the URL to the source code repository.
Extracted from: org.opencontainers.image.source
documentation string Documentation is the URL to documentation.
Extracted from: org.opencontainers.image.documentation
created string Created is the creation timestamp.
Extracted from: org.opencontainers.image.created
revision string Revision is the source control revision.
Extracted from: org.opencontainers.image.revision
version string Version is the image version.
Extracted from: org.opencontainers.image.version

RecommendedDeployment

RecommendedDeployment describes a recommended deployment configuration for a model.

Appears in: - ModelMetadata

Field Description Default Validation
gpuModel string GPUModel is the GPU model name (e.g., MI300X, MI325X)
gpuCount integer GPUCount is the number of GPUs required
precision string Precision is the recommended precision (e.g., fp8, fp16, bf16)
metric string Metric is the optimization target (e.g., latency, throughput)
description string Description provides additional context about this deployment configuration
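
A consumer of this metadata might pick a recommendation that matches the GPUs available in a cluster. The sketch below is one way to do that, under stated assumptions: the JSON keys are assumed to match the field names in the table above, and the functions load_deployments and pick_deployment are hypothetical helpers, not part of the AIM API.

```python
import json
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class RecommendedDeployment:
    gpuModel: str = ""
    gpuCount: int = 0
    precision: str = ""
    metric: str = ""
    description: str = ""


_KNOWN = {f.name for f in fields(RecommendedDeployment)}


def load_deployments(raw: str) -> List[RecommendedDeployment]:
    """Parse a JSON array of deployment objects, ignoring unknown keys."""
    return [
        RecommendedDeployment(**{k: v for k, v in entry.items() if k in _KNOWN})
        for entry in json.loads(raw)
    ]


def pick_deployment(
    deployments: List[RecommendedDeployment],
    gpu_model: str,
    metric: Optional[str] = None,
) -> Optional[RecommendedDeployment]:
    """Return the first entry for gpu_model, optionally filtered by metric."""
    for dep in deployments:
        if dep.gpuModel != gpu_model:
            continue
        if metric is not None and dep.metric != metric:
            continue
        return dep
    return None
```

Returning the first match keeps the order of the source array meaningful; a different selection policy (for example, the lowest gpuCount) would be an equally reasonable choice.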