Welcome to Kaiwo
🚀️🚀️ Kaiwo supports AMD GPUs! 🚀️🚀️
Description
Kaiwo (pronounced "ky-voh") is a Kubernetes-native tool designed to optimize GPU resource utilization for AI workloads. Built on top of Ray and Kueue, Kaiwo minimizes GPU idle time and increases resource efficiency through intelligent job queueing, fair resource sharing, guaranteed quotas, and opportunistic gang scheduling.
Kaiwo supports a wide range of AI workloads, including distributed multi-node pretraining, fine-tuning, online inference, and batch inference, with seamless integration into Kubernetes environments.
This documentation is intended for two main audiences:
- AI Scientists/Engineers: who want Kaiwo to manage their AI workloads on Kubernetes. See here
- Infrastructure/Platform Administrators: who want to deploy and manage Kaiwo on their Kubernetes clusters. See here
Main Features
- GPU Utilization Optimization
    - The Kaiwo Operator dynamically queues workloads to reduce GPU idle time and maximize resource utilization.
- CLI Tool
    - Simplified workload submission using the kaiwo CLI tool.
- Distributed Workload Scheduling
    - Effortlessly schedule distributed workloads across multiple Kubernetes nodes with the Kaiwo Operator.
- Broad Workload Support with Pre-built Templates
    - Supports running Kubernetes Jobs, Deployments, RayJobs, and RayServices.
- Integration with Ray and Kueue
    - Leverages Ray for distributed computing and Kueue for efficient job queueing.
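To illustrate the queueing model Kaiwo builds on, the sketch below shows a plain Kubernetes Job submitted through Kueue: the `kueue.x-k8s.io/queue-name` label targets a LocalQueue, and the Job starts suspended until Kueue admits it against the queue's quota. This is a minimal hand-written example, not Kaiwo's own manifest format; the queue name, image, and Job name are placeholders, and in practice Kaiwo's operator and CLI generate and manage such workloads for you.

```yaml
# Illustrative only: a plain Kubernetes Job queued through Kueue.
# Queue name, Job name, and image are placeholders, not Kaiwo defaults.
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-training-job
  labels:
    kueue.x-k8s.io/queue-name: user-queue  # LocalQueue to submit to
spec:
  suspend: true          # Kueue admits the Job by unsuspending it
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: trainer
          image: my-registry/train:latest  # placeholder image
          resources:
            requests:
              amd.com/gpu: "1"             # AMD GPU via the device plugin
            limits:
              amd.com/gpu: "1"
```

Gang scheduling and fair sharing then operate at the queue level: workloads wait until their full resource request can be admitted at once, rather than holding partial GPU allocations idle.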