Welcome to Kaiwo

🚀️🚀️ Kaiwo supports AMD GPUs! 🚀️🚀️

Description

Kaiwo (pronounced "ky-voh") is a Kubernetes-native tool designed to optimize GPU resource utilization for AI workloads. Built on top of Ray and Kueue, Kaiwo minimizes GPU idleness and increases resource efficiency through intelligent job queueing, fair sharing of resources, guaranteed quotas, and opportunistic gang scheduling.

Kaiwo supports a wide range of AI workloads, including distributed multi-node pretraining, fine-tuning, online inference, and batch inference, with seamless integration into Kubernetes environments.

This documentation is intended for two main audiences:

AI Scientists/Engineers: users who want Kaiwo to manage their AI workloads on Kubernetes. See here
Infrastructure/Platform Administrators: users who want to deploy and manage Kaiwo on their Kubernetes clusters. See here

Main Features

GPU Utilization Optimization: Kaiwo Operator dynamically queues workloads to reduce GPU idle time and maximize resource utilization.
CLI Tool: Simplified workload submission using the kaiwo CLI tool.
Distributed Workload Scheduling: Effortlessly schedule distributed workloads across multiple Kubernetes nodes with Kaiwo Operator.
Broad Workload Support with Pre-built Templates: Runs Kubernetes Jobs, Deployments, RayJobs, and RayServices.
Integration with Ray and Kueue: Leverages the power of Ray for distributed computing and Kueue for efficient job queueing.
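
To make the queueing terms in the description more concrete, the following is a minimal conceptual sketch of all-or-nothing (gang) admission: a distributed workload is admitted only if GPUs for all of its workers are free at once, and smaller workloads further back in the queue may be admitted opportunistically in the meantime. This is not Kaiwo's or Kueue's actual implementation; the `Workload` class, the `admit_gangs` function, and the numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Workload:
    """Hypothetical description of a distributed workload (illustration only)."""
    name: str
    workers: int           # workers that must all start together
    gpus_per_worker: int

    @property
    def gpus_needed(self) -> int:
        return self.workers * self.gpus_per_worker


def admit_gangs(queue: List[Workload], free_gpus: int) -> Tuple[List[str], List[str], int]:
    """Walk the queue in order and admit each workload all-or-nothing.

    A workload is admitted only if every one of its workers fits in the
    remaining GPUs; otherwise it keeps waiting and holds no GPUs, so a
    smaller workload behind it can still be admitted.
    """
    admitted: List[str] = []
    waiting: List[str] = []
    for wl in queue:
        if wl.gpus_needed <= free_gpus:
            free_gpus -= wl.gpus_needed
            admitted.append(wl.name)
        else:
            waiting.append(wl.name)
    return admitted, waiting, free_gpus


if __name__ == "__main__":
    queue = [
        Workload("pretrain", workers=4, gpus_per_worker=8),  # needs 32 GPUs
        Workload("finetune", workers=2, gpus_per_worker=4),  # needs 8 GPUs
        Workload("inference", workers=1, gpus_per_worker=1), # needs 1 GPU
    ]
    admitted, waiting, remaining = admit_gangs(queue, free_gpus=16)
    print("admitted:", admitted)
    print("waiting:", waiting)
    print("free GPUs left:", remaining)
```

The sketch deliberately omits quotas, fair sharing, and priorities. A real scheduler would also need a policy that stops large workloads from waiting indefinitely behind a stream of smaller ones.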