Configuring VerticalPodAutoscaler (VPA)

Overview

The Alauda Container Platform Vertical Pod Autoscaler is based on the open-source Vertical Pod Autoscaler component. It analyzes historical resource usage data of Pods and provides quota recommendations to improve resource utilization.

For both stateless and stateful applications, the VerticalPodAutoscaler (VPA) automatically recommends—and optionally applies—more appropriate CPU and memory resource limits based on your application's needs. This helps ensure Pods have sufficient resources while improving overall cluster resource utilization.

Release Notes

This section details the new features, enhancements, deprecated features, and known issues for the Alauda Container Platform Vertical Pod Autoscaler plugin.

Supported Versions

Plugin Version	Supported Platform Versions
v4.2.0	v4.0, v4.1, v4.2

v4.2 Release Notes

v4.2.0

Security vulnerability remediation.
Now independently releasable.

Lifecycle Policy

The following table outlines the lifecycle schedule for released versions of the Alauda Container Platform Vertical Pod Autoscaler plugin:

Version	Release Date	End of Life
v4.2.0	2025-12-16	2027-12-16

Understanding VerticalPodAutoscalers

A VerticalPodAutoscaler (VPA) is used to recommend or automatically update CPU and memory resource requests and limits for your Pods based on their historical usage patterns.

After creating a VPA, the platform begins monitoring the resource usage of the target Pods. Once sufficient data is collected, the VPA calculates recommended resource values. Depending on its configured update mode, the VPA can either apply these recommendations automatically or simply provide them for manual review and application.

By analyzing resource usage over time, VPA helps ensure your Pods are allocated the resources they need without over-provisioning, leading to more efficient cluster-wide resource utilization.

How Does the VPA Work?

The VerticalPodAutoscaler (VPA) continuously monitors the resource usage of your pods and recommends CPU and memory requests based on the observed usage patterns, updating its recommendations as new data becomes available. The VPA can operate in the following modes:

Off: VPA only provides recommendations without automatically applying them.
Manual Adjustment: You can manually adjust resource configurations based on VPA recommendations.

Important: Elastic scaling (horizontal or vertical) performs best when cluster resources are sufficient. When resources are scarce, scaling actions may cause Pods to become stuck in a Pending state. Ensure your cluster has adequate resources, set reasonable quotas, and configure alerts to monitor scaling events.

Supported Features

The VPA provides resource recommendations based on historical usage patterns, allowing you to optimize your pod's CPU and memory configurations.

Important: When manually applying VPA recommendations, pod recreation will occur, which can cause temporary disruption to your application. Consider applying recommendations during maintenance windows for production workloads.

Prerequisites

Before using VPA, ensure the following:

The Alauda Container Platform Vertical Pod Autoscaler cluster plugin is installed in your cluster.
- Download the latest plugin package compatible with your platform version.
- Use the violet CLI tool to upload the Alauda Container Platform Vertical Pod Autoscaler package to your target cluster. For detailed instructions on using violet, please refer to the CLI.
The monitoring components are deployed in the current cluster and are functioning properly. You can check their deployment and health status by navigating to Platform Health Status from the top right corner of the platform.

Installing the Vertical Pod Autoscaler Plugin

Log in and navigate to the Administrators page.
Click Marketplace > Cluster Plugins to access the Cluster Plugins list page.
Locate the Alauda Container Platform Vertical Pod Autoscaler cluster plugin, click Install, then proceed to the installation page.

Upgrading the Vertical Pod Autoscaler Plugin

Log in and navigate to the Administrators page.
Click Marketplace > Cluster Plugins to access the Cluster Plugins list page.
Locate the Alauda Container Platform Vertical Pod Autoscaler cluster plugin, click Upgrade, then proceed to the installation page.

Creating a VerticalPodAutoscaler

Using the CLI

You can create a VerticalPodAutoscaler using the command line interface by defining a YAML file and using the kubectl create command. The following example shows vertical pod autoscaling for a Deployment object:

Create a YAML file named vpa.yaml with the following content:

apiVersion: autoscaling.k8s.io/v1   # 1
kind: VerticalPodAutoscaler
metadata:
  name: my-deployment-vpa           # 2
  namespace: default
spec:
  targetRef:                        # 3
    apiVersion: apps/v1             # 4
    kind: Deployment                # 5
    name: my-deployment             # 6
  updatePolicy:                     # 7
    updateMode: 'Off'
  resourcePolicy:                   # 8
    containerPolicies:
      - containerName: '*'          # 9
        mode: 'Auto'                # 10

1. Use the autoscaling.k8s.io/v1 API.
2. The name of the VPA.
3. Specify the target workload object. VPA uses the workload's selector to find pods that need resource adjustment. Supported workload types include DaemonSet, Deployment, ReplicaSet, StatefulSet, ReplicationController, Job, and CronJob.
4. Specify the API version of the target object.
5. Specify the kind of the target object.
6. Specify the name of the target object to which the VPA applies.
7. Update policy that defines how VPA applies recommendations. The updateMode can be:
  - Auto: Automatically sets resource requests when creating pods and updates current pods to recommended resource requests. Currently equivalent to "Recreate". This mode may cause application downtime. Once in-place pod resource updates are supported, "Auto" mode will adopt this update mechanism.
  - Recreate: Automatically sets resource requests when creating pods and evicts current pods to update to recommended resource requests. Will not use in-place updates.
  - Initial: Only sets resource requests when creating pods, no modifications afterward.
  - Off: Does not automatically modify pod resource requests, only provides recommendations in the VPA object.
8. Resource policy that can set specific strategies for different containers. For example, setting a container's mode to "Auto" means it will calculate recommendations for that container, while "Off" means it won't calculate recommendations.
9. Apply policy to all containers in the pod.
10. Set the mode to Auto or Off. Auto means recommendations will be generated for this container, Off means no recommendations will be generated.
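
The same structure applies to the other supported workload types; only targetRef changes. As a minimal sketch, assuming a hypothetical StatefulSet named my-statefulset in the default namespace (neither name comes from this guide), a recommendation-only VPA might look like this:

```yaml
# Illustrative only: my-statefulset and my-statefulset-vpa are hypothetical names.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-statefulset-vpa
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet          # one of the supported workload kinds
    name: my-statefulset
  updatePolicy:
    updateMode: 'Off'          # provide recommendations only
```

The remaining steps in this section continue with the Deployment manifest in vpa.yaml.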

Apply the YAML file to create the VPA:

kubectl create -f vpa.yaml

Example output:

verticalpodautoscaler.autoscaling.k8s.io/my-deployment-vpa created

After you create the VPA, you can view the recommendations by running the following command:

kubectl describe vpa my-deployment-vpa

Example output (partial):

Status:
  Recommendation:
    Container Recommendations:
      Container Name:  my-container
      Lower Bound:
        Cpu:     100m
        Memory:  262144k
      Target:
        Cpu:     200m
        Memory:  524288k
      Upper Bound:
        Cpu:     300m
        Memory:  786432k
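
To illustrate how these values might be used when you apply a recommendation manually, the fragment below copies the Target values from the example output into the container's resource requests in the Deployment. This is a sketch rather than a prescribed procedure: it assumes the container is named my-container, as in the output above, and leaves any adjustment of limits to your own policy.

```yaml
# Fragment of spec.template.spec in the Deployment (illustrative).
containers:
  - name: my-container
    resources:
      requests:
        cpu: 200m          # Target CPU from the recommendation
        memory: 524288k    # Target memory from the recommendation (equal to 500Mi)
```

Changing the pod template of a Deployment triggers a rollout, so the pods are recreated when this change is applied.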

Using the Web Console

Log in and navigate to the Container Platform.
In the left navigation bar, click Workloads > Deployments.
Click the name of the target Deployment.
Scroll down to the Elastic Scaling area and click Update on the right.

Select Vertical Scaling and configure the scaling rules.

Parameter	Description
Scaling Mode	Currently supports Manual Scaling mode, which provides recommended resource configurations by analyzing past resource usage. You can manually adjust according to the recommended values. Adjustments will cause pods to be recreated and restarted, so please choose an appropriate time to avoid impacting running applications. Typically, after pods have been running for more than 8 days, the recommended values will become accurate. Note that when cluster resources are insufficient, scaling may cause Pods to be in a Pending state. Please ensure that the cluster has sufficient resources or reasonable quotas, or configure alerts to monitor scaling conditions.
Target Container	Defaults to the first container of the workload. You can choose to enable resource limit recommendations for one or more containers as needed.

Click Update.

Advanced VPA Configuration

Update Policy Options

updateMode: "Off" - VPA only provides recommendations without automatically applying them. You can manually apply these recommendations as needed.
updateMode: "Auto" - Automatically sets resource requests when creating pods and updates current pods to recommended values. Currently equivalent to "Recreate".
updateMode: "Recreate" - Automatically sets resource requests when creating pods and evicts current pods to update to recommended values.
updateMode: "Initial" - Only sets resource requests when creating pods, no modifications afterward.
minReplicas: <number> - Minimum number of replicas. Ensures this minimum number of pods remain available when the Updater evicts pods. Must be greater than 0.
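
As an illustration of how these options combine, the fragment below sketches an updatePolicy stanza that allows the Updater to evict pods. The value 2 is an arbitrary example, not a recommendation from this guide, and automatic update modes should be used only if your platform version supports them.

```yaml
# Fragment of spec in a VerticalPodAutoscaler (illustrative values).
updatePolicy:
  updateMode: 'Recreate'   # evict running pods to apply recommendations
  minReplicas: 2           # example value; must be greater than 0
```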

Container Policy Options

containerName: "*" - Apply policy to all containers in the pod.
mode: "Auto" - Automatically generate recommendations for the container.
mode: "Off" - Do not generate recommendations for the container.
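
For example, a policy can generate recommendations for most containers while skipping one of them. The sketch below assumes a hypothetical sidecar container named log-agent; in the upstream VPA API, an entry that names a specific container takes precedence over the '*' default.

```yaml
# Fragment of spec in a VerticalPodAutoscaler.
resourcePolicy:
  containerPolicies:
    - containerName: '*'         # default for containers without their own entry
      mode: 'Auto'               # generate recommendations
    - containerName: log-agent   # hypothetical sidecar name
      mode: 'Off'                # no recommendations for this container
```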

Notes:

VPA recommendations are based on historical usage data, so it may take several days of pod operation before recommendations become accurate.
Pod recreation will occur when VPA recommendations are applied in Auto mode, which can cause temporary disruption to your application.

Follow-Up Actions

After configuring VPA, the recommended values for CPU and memory resource limits of the target container can be viewed in the Elastic Scaling area. In the Containers area, select the target container tab and click the icon on the right side of Resource Limits to update the resource limits according to the recommended values.
