Upgrade
This article explains how to upgrade from GPU-manager, or from an older HAMi release (v2.5), to the newest HAMi version.
GPU-manager to HAMi
Note
- GPU-manager and HAMi cannot be deployed on the same node, but they can be deployed in the same cluster.
- When you start upgrading, applications need to be modified one by one, which causes the business pods to restart.
- When you have only one GPU node, you need to uninstall GPU-manager and then install HAMi. If both plugins are deployed in the cluster, you can control which one runs on a node by modifying the node's labels. For example, remove the `nvidia-device-enable=vgpu` label to remove the GPU-manager instance from the node, then add the `gpu=on` label to deploy the HAMi plugin on it.
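Assuming both plugins select nodes via the labels mentioned above, the single-node switch can be sketched with two `kubectl` commands (the node name is a placeholder):

```shell
# Remove the GPU-manager selector label; its DaemonSet pod is then evicted from the node
kubectl label node <your-gpu-node> nvidia-device-enable-

# Add the HAMi selector label so the HAMi device plugin is scheduled onto the node
kubectl label node <your-gpu-node> gpu=on
```

Wait for the GPU-manager pod to terminate before adding the HAMi label, so the two plugins never run on the node at the same time.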
Procedure
Modify your applications one by one. For example:
Your old GPU-manager instance:
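A minimal sketch of a GPU-manager pod spec, assuming the commonly used `tencent.com/vcuda-core` (1 unit = 1% of a GPU) and `tencent.com/vcuda-memory` (1 unit = 256 MiB) extended resources; the pod name and image are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: vcuda-demo
spec:
  containers:
  - name: cuda
    image: nvidia/cuda:12.2.0-base-ubuntu22.04
    resources:
      limits:
        tencent.com/vcuda-core: "50"    # 50% of one GPU's compute
        tencent.com/vcuda-memory: "32"  # 32 x 256 MiB = 8 GiB of GPU memory
```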
Migrate to HAMi:
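A hedged sketch of the equivalent request under HAMi, assuming HAMi's `nvidia.com/gpu`, `nvidia.com/gpucores`, and `nvidia.com/gpumem` resource names (pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hami-demo
spec:
  containers:
  - name: cuda
    image: nvidia/cuda:12.2.0-base-ubuntu22.04
    resources:
      limits:
        nvidia.com/gpu: "1"        # number of vGPUs
        nvidia.com/gpucores: "50"  # 50% of one GPU's compute
        nvidia.com/gpumem: "8192"  # GPU memory in MiB (8 GiB)
```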
HAMi to HAMi
Important Changes (v2.5.0 → v2.7.1)
⚠️ Upgrading from v2.5 to v2.7.1 should not affect existing applications. ✅ Restarting applications with a rolling update is still recommended to avoid unexpected issues.
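A rolling restart can be triggered per workload; for a Deployment, for example (the name is a placeholder):

```shell
kubectl rollout restart deployment/<your-app>
kubectl rollout status deployment/<your-app>
```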
Procedure
- Upgrade the ACP version if needed.
- Upload the HAMi v2.7.1 plugin package to ACP.
- Go to the `Administrator -> Clusters -> <Target Cluster> -> Functional Components` page, then click the `Upgrade` button; you will see that `Alauda Build of HAMi` can be upgraded.
- Update the ConfigMaps that define extended resources, which are used to configure extended resources on ACP. Run the following script in your GPU cluster:
Note
If you configured resource quotas for HAMi resources in versions prior to v2.7.1, delete them and reconfigure them after the upgrade.
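For example, a namespace quota on HAMi resources would be deleted and then recreated along these lines (the quota name, namespace, and values are illustrative assumptions):

```yaml
# Recreate after removing the old quota, e.g.:
# kubectl delete resourcequota gpu-quota -n demo
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota
  namespace: demo
spec:
  hard:
    requests.nvidia.com/gpu: "4"
    requests.nvidia.com/gpumem: "32768"
```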