Overcommit Ratio
⚠️ This feature is still experimental. Please use it with caution.
Understanding the Overcommit Ratio in Hami vGPU
Hami supports configuring a global overcommit ratio for both vGPU compute cores and memory. The purpose of the vGPU overcommit ratio is to improve overall GPU utilization, not to increase the resources available to any individual task. Overcommit is a purely logical mechanism: it only changes how hami-scheduler accounts for allocatable vGPU resources, not the physical capacity of the card.
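For intuition, here is a minimal sketch (not HAMi code) of what "logical" means here: the configured ratios only scale the capacity that hami-scheduler accounts for when admitting pods, while the physical card stays the same. The card size and ratio values below are illustrative assumptions.

```python
# Minimal sketch (not HAMi code): the overcommit ratio scales only the
# capacity that hami-scheduler uses for admission decisions.
PHYSICAL_CORES_PERCENT = 100   # one physical GPU card = 100% compute
PHYSICAL_MEMORY_MIB = 24_576   # assumption: a 24 GiB card

core_scaling = 2.0             # "NVIDIA Device Core Scaling"
memory_scaling = 1.5           # "NVIDIA Device Memory Scaling"

# What the scheduler treats as allocatable on this card:
logical_cores_percent = PHYSICAL_CORES_PERCENT * core_scaling   # 200%
logical_memory_mib = PHYSICAL_MEMORY_MIB * memory_scaling       # 36 GiB

print(logical_cores_percent, logical_memory_mib)
```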
Key Concepts
- NVIDIA Device Core Scaling: Overcommit ratio applied to GPU compute cores.
- NVIDIA Device Memory Scaling: Overcommit ratio applied to GPU memory.
Core Capabilities
- Enable higher GPU utilization, allowing more workloads to share a single GPU card.
Configuring the Overcommit Ratio
- Go to Administrator → Marketplace → Cluster Plugin.
- Switch to the target cluster.
- Update the parameters NVIDIA Device Core Scaling and NVIDIA Device Memory Scaling when deploying or upgrading the Alauda Build of Hami cluster plugin (a verification sketch follows this list).
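After deploying or upgrading the plugin, one optional way to sanity-check the result is to inspect the allocatable extended resources advertised by a GPU node. The sketch below uses the Kubernetes Python client; the node name is hypothetical, and the resource names (filtered here by the upstream HAMi prefix `nvidia.com/`) may differ in your build.

```python
# Minimal verification sketch, assuming a working kubeconfig.
# Node name and resource-name prefix are assumptions; adjust as needed.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

node = v1.read_node("gpu-node-1")  # hypothetical GPU node name
for name, quantity in sorted(node.status.allocatable.items()):
    if name.startswith("nvidia.com/"):
        print(f"{name}: {quantity}")
```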
Notes
vGPU Core Overcommit Ratio
- When the overcommit ratio for GPU cores is greater than 1, multiple workloads may request more than 100% of the GPU compute capacity.
- If all workloads run at full load, they share the physical GPU compute equally (up to their requested share). As a result, each workload may run slower compared to using a dedicated GPU.
- If some workloads are idle, active workloads can utilize the freed capacity.
Example:
- Core overcommit ratio = 2 → one GPU card provides a logical 200% of allocatable cores.
- Four pods request: Pod A = 80%, Pod B = 60%, Pod C = 40%, Pod D = 20%.
- Scenarios:
- If all pods are busy, Pod D receives its requested 20%, while Pods A–C compete for the remaining 80% (≈26.7% each); this arithmetic is reproduced in the sketch after this example.
- If only Pod A is active, it can utilize up to 80% of the cores.
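The sketch below reproduces the arithmetic of the example above. It assumes that under full load the physical 100% of compute is divided in a max-min fair way, with each pod capped at its requested share; the actual runtime enforcement is done by HAMi's in-container core limiting and may differ in detail.

```python
# Sketch of the sharing arithmetic above, assuming max-min fair sharing of
# the physical 100% compute, with each pod capped at its requested share.
def full_load_shares(requests, capacity=100.0):
    shares = {name: 0.0 for name in requests}
    remaining = dict(requests)          # unmet requests, in percent
    capacity_left = capacity
    while remaining and capacity_left > 1e-9:
        fair = capacity_left / len(remaining)
        # Pods asking for no more than the fair share are fully satisfied;
        # their unused capacity is redistributed to the remaining pods.
        satisfied = {n: r for n, r in remaining.items() if r <= fair}
        if not satisfied:
            for n in remaining:
                shares[n] += fair
            break
        for n, r in satisfied.items():
            shares[n] += r
            capacity_left -= r
            del remaining[n]
    return shares

print(full_load_shares({"A": 80, "B": 60, "C": 40, "D": 20}))
# {'A': 26.67, 'B': 26.67, 'C': 26.67, 'D': 20.0} (approximately)
```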
vGPU Memory Overcommit Ratio
- When the memory overcommit ratio is greater than 1, workloads may collectively request more than the physical GPU memory of a card.
- If the total requests exceed the physical memory and all pods attempt to use their full allocation, some workloads may encounter `CUDA out of memory` errors.
- Use the memory overcommit ratio with caution, as it can directly lead to application failures; see the sketch below.
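As an illustration of this failure mode, the sketch below shows how pods can be admitted against the scaled (logical) memory even though their combined usage cannot fit in the physical memory. The card size, ratio, and pod requests are assumptions.

```python
# Sketch (not HAMi code): admission is checked against logical memory,
# but runtime allocations still hit the physical limit.
PHYSICAL_MEMORY_MIB = 24_576               # assumption: 24 GiB card
memory_scaling = 1.5
logical_memory_mib = PHYSICAL_MEMORY_MIB * memory_scaling   # 36 GiB

requests_mib = {"pod-a": 16_384, "pod-b": 16_384}           # 32 GiB total

# Scheduling succeeds: total requests fit within the logical capacity.
assert sum(requests_mib.values()) <= logical_memory_mib

# But if both pods allocate their full request, peak usage exceeds the
# physical memory and one of them will fail with a CUDA out-of-memory error.
peak_usage_mib = sum(requests_mib.values())
print(peak_usage_mib > PHYSICAL_MEMORY_MIB)   # True -> OOM risk
```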
Scope
- The overcommit ratio described here applies only to NVIDIA GPUs.