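Creating a VMware vSphere Cluster in the global Cluster
This document explains how to create a VMware vSphere workload cluster from the global cluster using the standard CAPV mode, in which CAPV connects directly to vCenter. The procedure covers the minimum supported topology: one datacenter, one NIC per node, and static IP allocation through VSphereMachineConfigPool.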
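Scenarios
Use this document in the following scenarios:
- You want to create the first basic VMware vSphere workload cluster in your environment.
- You use one datacenter and one NIC per node for the initial validation.
- You want to keep the first deployment as simple as possible before enabling advanced placement or networking features.
This document applies to the following deployment model:
- CAPV connects directly to vCenter.
- Control plane nodes and worker nodes both use VSphereMachineConfigPool for static IP allocation and data disk configuration.
- ClusterResourceSet delivers the vSphere CPI component automatically.
- The first validation uses one datacenter and one NIC per node.
This document does not apply to the following scenarios:
- Deployments that depend on vSphere Supervisor or vm-operator.
- Deployments that do not use VSphereMachineConfigPool.
- First deployments that enable multiple datacenters, multiple NICs, and complex disk extensions at the same time.
This document is written for the current platform environment. The kube-ovn delivery path depends on platform controllers that consume annotations on the Cluster resource, so this workflow is not intended as a generic standalone CAPV deployment guide outside the platform context.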
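Prerequisites
Before you begin, make sure that the following conditions are met:
- You have collected the values described in Preparing Parameters for a VMware vSphere Cluster.
- The global cluster can reach vCenter.
- The target template, network, datastore, and vCenter resource pool are available.
- The control plane VIP and load balancer are ready.
- All required static IP addresses are allocated and not in use.
- ClusterResourceSet=true is enabled.
- The platform has a valid public image registry configuration.
- The platform can process the cluster annotations required to install the network plugin.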
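Key Objects
ClusterResourceSet
ClusterResourceSet is a Cluster API resource in the global cluster. After the workload API server becomes reachable, it applies the referenced ConfigMap and Secret resources to the workload cluster.
In this workflow, ClusterResourceSet delivers the vSphere CPI resources automatically.
vSphere CPI component
The vSphere CPI component is delivered to the workload cluster through ClusterResourceSet. It connects the workload nodes to the vSphere infrastructure so that the cluster can report infrastructure identity and complete cloud-provider initialization.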
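machine config pool
A machine config pool is a VSphereMachineConfigPool custom resource. In the basic workflow:
- One machine config pool is used for the control plane nodes.
- One machine config pool is used for the worker nodes.
Each node slot contains a hostname, a datacenter, a static IP assignment, and optional data disk definitions.
For network configuration, distinguish the following fields:
- networkName is the vCenter network or port group name.
- deviceName is the NIC name inside the guest operating system.
If deviceName is set, CAPV writes the value into the generated guest-network metadata. If it is omitted, the current implementation typically names NICs in order, such as eth0, eth1, and eth2.
Also distinguish the following value formats:
- A node IP address is written together with its prefix length, for example 10.10.10.11/24.
- The gateway field contains only the gateway IP address, for example 10.10.10.1.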
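The two value formats are easy to mix up when filling in node slots, so a small pre-check can help. The following is a minimal sketch, assuming IPv4 addresses only; the function names is_ipv4, is_node_ip, and is_gateway are hypothetical helpers and are not part of CAPV or the platform.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CAPV): check IPv4 value formats before
# writing them into a VSphereMachineConfigPool node slot.

# is_ipv4 ADDR: succeed when ADDR is a dotted-quad IPv4 address.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the validated address into four octets
  IFS=$old_ifs
  for octet in "$@"; do
    [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# is_node_ip VALUE: succeed for "<ipv4>/<prefix>" with a prefix of 0..32.
is_node_ip() {
  case "$1" in
    */*) ;;
    *) return 1 ;;
  esac
  addr=${1%/*}
  prefix=${1##*/}
  printf '%s\n' "$prefix" | grep -Eq '^[0-9]{1,2}$' || return 1
  [ "$prefix" -le 32 ] || return 1
  is_ipv4 "$addr"
}

# is_gateway VALUE: succeed for a bare IPv4 address without a prefix length.
is_gateway() {
  is_ipv4 "$1"
}
```

For example, is_node_ip 10.10.10.11/24 and is_gateway 10.10.10.1 both succeed, while is_gateway 10.10.10.1/24 fails because the gateway field must not carry a prefix length.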
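VM Template Requirements
The VM template used by this workflow should meet the following minimum requirements:
- It uses the operating system required by the target platform environment.
- It includes cloud-init.
- It includes VMware Tools or open-vm-tools.
- It includes containerd.
- It includes the base components required for kubeadm bootstrap.
- It includes pre-exported container image tar files under /root/images/. Before kubeadm runs, capv-load-local-images.sh imports these files into containerd so that node bootstrap does not depend on pulling images from a remote registry.
The /root/images/*.tar files must include the sandbox (pause) image, and its reference must exactly match the sandbox_image value (containerd v1) or the sandbox value (containerd v2) configured in /etc/containerd/config.toml. For example, if containerd is configured with sandbox_image = "registry.example.com/tkestack/pause:3.10", one of the tar files must contain that exact image reference. A mismatch causes containerd to pull the sandbox image from the network, which defeats the purpose of local preloading and fails in air-gapped environments.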
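The exact-match requirement for the sandbox image can be checked on a template VM before it is converted to a template. The sketch below assumes the value is written with double quotes on a single line of config.toml; sandbox_ref and has_image are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers for a VM template; not shipped with CAPV.

# sandbox_ref CONFIG_FILE: print the sandbox image reference configured in a
# containerd config.toml. Matches `sandbox_image = "..."` (containerd v1
# layout) or `sandbox = "..."` (containerd v2 layout).
sandbox_ref() {
  sed -n 's/^[[:space:]]*sandbox\(_image\)\{0,1\}[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\2/p' "$1" | head -n 1
}

# has_image REF: read image references from stdin (one per line) and succeed
# only if REF appears as an exact, whole-line match.
has_image() {
  grep -Fxq -- "$1"
}

# Example use on a node after the tar files have been imported:
#   ctr -n k8s.io images list -q | has_image "$(sandbox_ref /etc/containerd/config.toml)"
```

The comparison is deliberately a fixed-string, whole-line match, because a reference that differs only in registry host or tag still counts as a mismatch.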
Static IP configuration, hostname injection, and other initialization settings depend on cloud-init. Node IP reporting depends on the guest tools.
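Local File Layout
Workload cluster naming
The workload cluster_name must not be global. That name is reserved for the global cluster, and reusing it causes the workload cluster resources to conflict with the global cluster resources in cpaas-system. The global- prefix is reserved for resources owned by the DR workflow of the global cluster; see Common Prerequisites. Do not use global- for workload cluster resources, because a failover operation may treat those resources as belonging to the global cluster and select them.
By convention, the CAPI Cluster and the provider cluster resource (VSphereCluster) should have exactly the same name as <cluster_name>, while non-root CAPI and provider resources (KubeadmControlPlane, KubeadmConfigTemplate, MachineDeployment, VSphereMachineTemplate, VSphereMachineConfigPool, and so on) should be named with the <cluster_name>- prefix. For example, the sample manifests use <cluster_name>-kcp and <cluster_name>-md-0. This is a recommendation rather than a rule enforced by the controllers, but it avoids same-namespace collisions when several workload clusters coexist in cpaas-system, and it makes resource ownership easier to see during operations.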
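The naming rules above can be applied mechanically before any manifest is written. This is a minimal sketch that covers only the two reserved-name rules stated here; check_cluster_name and derived_names are hypothetical helpers, and the suffixes are the ones used by the sample manifests in this document.

```shell
#!/bin/sh
# Hypothetical helper: apply the naming rules above to a candidate name.

# check_cluster_name NAME: fail for the reserved name "global" and for any
# name that starts with the reserved "global-" prefix.
check_cluster_name() {
  case "$1" in
    "") echo "cluster name is empty" >&2; return 1 ;;
    global) echo "'global' is reserved for the global cluster" >&2; return 1 ;;
    global-*) echo "the 'global-' prefix is reserved" >&2; return 1 ;;
  esac
  return 0
}

# derived_names NAME: print the root name followed by some of the prefixed
# resource names used by the sample manifests.
derived_names() {
  check_cluster_name "$1" || return 1
  printf '%s\n' "$1" "$1-kcp" "$1-md-0" "$1-cp-pool" "$1-worker-pool"
}
```

Kubernetes applies its own object-name validation on top of these rules, so passing this check does not guarantee that the API server accepts the name.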
Create a local working directory and save the manifests in the following layout:
capv-cluster/
├── 00-namespace.yaml
├── 01-vsphere-credentials-secret.yaml
├── 02-vspheremachineconfigpool-control-plane.yaml
├── 03-vspheremachineconfigpool-worker.yaml
├── 10-cluster.yaml
├── 15-vsphere-cpi-clusterresourceset.yaml
├── 20-control-plane.yaml
└── 30-workers-md-0.yaml
Create the directory with the following commands:
mkdir -p ./capv-cluster
cd ./capv-cluster
Procedure
Verify the environment
Run the following commands from the global cluster to verify the minimum prerequisites:
kubectl get ns
kubectl get minfo -l cpaas.io/module-name=cluster-api-provider-vsphere
kubectl get minfo -l cpaas.io/module-name=cluster-api-provider-kubeadm
kubectl -n cpaas-system get deploy capi-controller-manager -o jsonpath='{.spec.template.spec.containers[0].args}'
kubectl -n cpaas-system get secret public-registry-credential -o jsonpath='{.data.content}'
Confirm the following results:
- The global cluster is reachable.
- Alauda Container Platform Kubeadm Provider and Alauda Container Platform VMware vSphere Infrastructure Provider are running.
- The controller arguments include ClusterResourceSet=true.
- The public registry credential data.content is not empty.
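Two of these results can be checked from captured command output rather than by eye. The sketch below assumes you pass in the strings printed by the kubectl commands above; has_crs_gate and is_nonempty are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers: evaluate captured kubectl output.

# has_crs_gate ARGS: succeed when the controller arguments contain
# ClusterResourceSet=true, for example inside a --feature-gates list.
has_crs_gate() {
  case "$1" in
    *ClusterResourceSet=true*) return 0 ;;
    *) return 1 ;;
  esac
}

# is_nonempty VALUE: succeed when VALUE has at least one non-blank character.
is_nonempty() {
  [ -n "$(printf '%s' "$1" | tr -d '[:space:]')" ]
}

# Example use against the global cluster:
#   args=$(kubectl -n cpaas-system get deploy capi-controller-manager \
#     -o jsonpath='{.spec.template.spec.containers[0].args}')
#   has_crs_gate "$args" || echo "ClusterResourceSet=true is missing" >&2
```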
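Before you continue, also verify the following items:
- The vCenter server address is reachable.
- The vCenter username and password are valid.
- The thumbprint is correct.
- The template name is correct.
- The template can be resolved in the target datacenter.
- If VMs are cloned as fullClone, the template system disk is no larger than the diskGiB value used in the later manifests. If CAPV completes a linkedClone, the system disk keeps the template size and diskGiB is ignored.
- VMware Tools or open-vm-tools is installed in the template.
- The VIP exists, and port 6443 is reachable from the execution environment.
- The ownership model for real-server maintenance on the load balancer is clear.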
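Create the namespace and the vCenter credentials Secret
Create the namespace that stores the workload cluster objects.
This workflow stores the workload cluster objects in the cpaas-system namespace. In the manifests and commands below, replace every <namespace> placeholder with cpaas-system.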
00-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: <namespace>
Create the vCenter credentials Secret referenced by VSphereCluster.spec.identityRef.
01-vsphere-credentials-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: <credentials_secret_name>
  namespace: <namespace>
type: Opaque
stringData:
  username: "<vsphere_username>"
  password: "<vsphere_password>"
Apply both manifests:
kubectl apply -f 00-namespace.yaml
kubectl apply -f 01-vsphere-credentials-secret.yaml
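Create the Cluster and VSphereCluster objects
Create the base cluster manifest, which contains the workload cluster network settings, the control plane endpoint, and the vCenter connection settings.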
10-cluster.yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: <cluster_name>
  namespace: <namespace>
  labels:
    cluster.x-k8s.io/cluster-name: <cluster_name>
    cluster-type: VSphere
    addons.cluster.x-k8s.io/vsphere-cpi: "enabled"
  annotations:
    capi.cpaas.io/resource-group-version: infrastructure.cluster.x-k8s.io/v1beta1
    capi.cpaas.io/resource-kind: VSphereCluster
    cpaas.io/sentry-deploy-type: Baremetal
    cpaas.io/alb-address-type: ClusterAddress
    cpaas.io/network-type: kube-ovn
    cpaas.io/kube-ovn-version: <kube_ovn_version>
    cpaas.io/kube-ovn-join-cidr: <kube_ovn_join_cidr>
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - <pod_cidr>
    services:
      cidrBlocks:
        - <service_cidr>
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: <cluster_name>-kcp
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: VSphereCluster
    name: <cluster_name>
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereCluster
metadata:
  name: <cluster_name>
  namespace: <namespace>
spec:
  controlPlaneEndpoint:
    host: "<vip>"
    port: <api_server_port>
  identityRef:
    kind: Secret
    name: <credentials_secret_name>
  server: "<vsphere_server>"
  thumbprint: "<thumbprint>"
Apply the manifest:
kubectl apply -f 10-cluster.yaml
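Create the vSphere CPI delivery resources
Create a ClusterResourceSet so that the workload cluster automatically receives the vSphere CPI configuration and manifests after the workload API server becomes reachable.
INFO
In the basic workflow, VSphereCluster.spec.failureDomainSelector is intentionally left unset, and the CPI vsphere.conf does not contain a [Labels] section. Both are required only after failure domains are enabled; configure them together as described in Extension Scenarios. Adding [Labels] to vsphere.conf without matching VSphereFailureDomain objects causes the CPI to look up zone and region tags that do not exist.
WARNING
The CPI ConfigMap, Secret, and ClusterResourceSet resources must be created in the same namespace as the Cluster resource. In this guide, that namespace is cpaas-system. A ClusterResourceSet can match only clusters in its own namespace; deploying it to a different namespace silently prevents resource delivery.
INFO
The kube-ovn configuration in the Cluster annotations is consumed by a platform controller. This document does not install the network plugin directly.
TIP
The manifest is long and nests YAML inside data fields. Validate it before applying: kubectl apply --dry-run=client -f 15-vsphere-cpi-clusterresourceset.yaml.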
15-vsphere-cpi-clusterresourceset.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: <cluster_name>-vsphere-cpi-config
  namespace: <namespace>
data:
  data: |
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cloud-config
      namespace: kube-system
    data:
      vsphere.conf: |
        [Global]
        secret-name = "vsphere-cloud-secret"
        secret-namespace = "kube-system"
        service-account = "cloud-controller-manager"
        port = "443"
        insecure-flag = "<cpi_insecure_flag>"
        datacenters = "<cpi_datacenters>"
        [VirtualCenter "<vsphere_server>"]
---
apiVersion: v1
kind: Secret
metadata:
  name: <cluster_name>-vsphere-cpi-secret
  namespace: <namespace>
type: addons.cluster.x-k8s.io/resource-set
stringData:
  data: |
    apiVersion: v1
    kind: Secret
    metadata:
      name: vsphere-cloud-secret
      namespace: kube-system
    type: Opaque
    stringData:
      <vsphere_server>.username: <vsphere_username>
      <vsphere_server>.password: <vsphere_password>
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: <cluster_name>-vsphere-cpi-manifests
  namespace: <namespace>
data:
  data: |
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: cloud-controller-manager
      namespace: kube-system
    automountServiceAccountToken: false
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:cloud-controller-manager
    rules:
      - apiGroups: [""]
        resources: ["events"]
        verbs: ["create", "patch", "update"]
      - apiGroups: [""]
        resources: ["nodes"]
        verbs: ["*"]
      - apiGroups: [""]
        resources: ["nodes/status"]
        verbs: ["patch"]
      - apiGroups: [""]
        resources: ["services"]
        verbs: ["list", "patch", "update", "watch"]
      - apiGroups: [""]
        resources: ["services/status"]
        verbs: ["patch"]
      - apiGroups: [""]
        resources: ["serviceaccounts"]
        verbs: ["create", "get", "list", "watch", "update"]
      - apiGroups: [""]
        resources: ["persistentvolumes"]
        verbs: ["get", "list", "update", "watch"]
      - apiGroups: [""]
        resources: ["endpoints"]
        verbs: ["create", "get", "list", "watch", "update"]
      - apiGroups: [""]
        resources: ["secrets"]
        verbs: ["get", "list", "watch"]
      - apiGroups: ["coordination.k8s.io"]
        resources: ["leases"]
        verbs: ["get", "list", "watch", "create", "update"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: servicecatalog.k8s.io:apiserver-authentication-reader
      namespace: kube-system
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: extension-apiserver-authentication-reader
    subjects:
      - apiGroup: ""
        kind: ServiceAccount
        name: cloud-controller-manager
        namespace: kube-system
      - apiGroup: ""
        kind: User
        name: cloud-controller-manager
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: system:cloud-controller-manager
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:cloud-controller-manager
    subjects:
      - kind: ServiceAccount
        name: cloud-controller-manager
        namespace: kube-system
      - kind: User
        name: cloud-controller-manager
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        component: cloud-controller-manager
        tier: control-plane
        k8s-app: vsphere-cloud-controller-manager
      name: vsphere-cloud-controller-manager
      namespace: kube-system
    spec:
      selector:
        matchLabels:
          k8s-app: vsphere-cloud-controller-manager
      updateStrategy:
        type: RollingUpdate
      template:
        metadata:
          labels:
            component: cloud-controller-manager
            k8s-app: vsphere-cloud-controller-manager
        spec:
          securityContext:
            runAsUser: 1001
          automountServiceAccountToken: true
          # Optional: required when the CPI image is stored in a private
          # registry that needs authentication. The platform automatically
          # syncs a dockerconfigjson secret named "global-registry-auth"
          # into every namespace of the workload cluster when the
          # `global` cluster secret "public-registry-credential"
          # (data.content) is configured. If your environment does not
          # use a private registry, remove the imagePullSecrets block.
          imagePullSecrets:
            - name: global-registry-auth
          serviceAccountName: cloud-controller-manager
          hostNetwork: true
          tolerations:
            - operator: Exists
            - key: node.cloudprovider.kubernetes.io/uninitialized
              value: "true"
              effect: NoSchedule
            - key: node-role.kubernetes.io/master
              effect: NoSchedule
            - key: node.kubernetes.io/not-ready
              effect: NoSchedule
              operator: Exists
          containers:
            - name: vsphere-cloud-controller-manager
              image: <image_registry>/ait/cloud-provider-vsphere:<cpi_image_tag>
              args:
                - --v=2
                - --cloud-provider=vsphere
                - --cloud-config=/etc/cloud/vsphere.conf
              volumeMounts:
                - mountPath: /etc/cloud
                  name: vsphere-config-volume
                  readOnly: true
              resources:
                requests:
                  cpu: 200m
          volumes:
            - name: vsphere-config-volume
              configMap:
                name: cloud-config
    ---
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        component: cloud-controller-manager
      name: vsphere-cloud-controller-manager
      namespace: kube-system
    spec:
      type: NodePort
      ports:
        - port: 43001
          protocol: TCP
          targetPort: 43001
      selector:
        component: cloud-controller-manager
---
apiVersion: addons.cluster.x-k8s.io/v1beta1
kind: ClusterResourceSet
metadata:
  name: <cluster_name>-vsphere-cpi
  namespace: <namespace>
spec:
  strategy: Reconcile
  clusterSelector:
    matchLabels:
      addons.cluster.x-k8s.io/vsphere-cpi: "enabled"
  resources:
    - name: <cluster_name>-vsphere-cpi-config
      kind: ConfigMap
    - name: <cluster_name>-vsphere-cpi-secret
      kind: Secret
    - name: <cluster_name>-vsphere-cpi-manifests
      kind: ConfigMap
Apply the manifest:
kubectl apply -f 15-vsphere-cpi-clusterresourceset.yaml
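Create the machine config pools
Create the control plane machine config pool.
INFO
Each node slot declares its NIC layout under network.primary (required) and additional NICs under network.additional (optional list). networkName on the primary NIC is required. The provider derives the Kubernetes node name, the DNS SAN of the kubelet serving certificate, and the kubelet node-ip from the hostname and the resolved primary NIC address. The hostname must be a valid DNS-1123 subdomain.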
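Because the hostname becomes the Kubernetes node name, it is worth checking each planned hostname against the DNS-1123 subdomain rule before creating the pool. The following is a minimal sketch that uses the same pattern Kubernetes documents for subdomain names; is_dns1123_subdomain is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: validate a node slot hostname.

# is_dns1123_subdomain NAME: succeed when NAME is at most 253 characters and
# consists of dot-separated labels made of lowercase alphanumerics and "-",
# where each label starts and ends with an alphanumeric character.
is_dns1123_subdomain() {
  [ "${#1}" -ge 1 ] && [ "${#1}" -le 253 ] || return 1
  printf '%s\n' "$1" |
    grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
}
```

For example, cp-01 and cp-01.example.com pass, while CP_01 and -cp fail.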
INFO
deviceName is optional. If you do not need to force the guest NIC name, remove the deviceName line from each node slot. The provider assigns NIC names in NIC order, for example eth0 and eth1.
02-vspheremachineconfigpool-control-plane.yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineConfigPool
metadata:
  name: <cluster_name>-cp-pool
  namespace: <namespace>
spec:
  clusterRef:
    apiVersion: cluster.x-k8s.io/v1beta1
    kind: Cluster
    name: <cluster_name>
  datacenter: "<default_datacenter>"
  releaseDelayHours: <release_delay_hours>
  configs:
    - hostname: "<cp_node_name_1>"
      datacenter: "<master_01_datacenter>"
      network:
        primary:
          networkName: "<nic1_network_name>"
          deviceName: "<nic1_device_name>"
          ip: "<master_01_nic1_ip>/<nic1_prefix>"
          gateway: "<nic1_gateway>"
          dns:
            - "<nic1_dns_1>"
      persistentDisks:
        - name: var-cpaas
          sizeGiB: <cp_var_cpaas_size_gib>
          mountPath: /var/cpaas
          fsFormat: ext4
        - name: var-lib-containerd
          sizeGiB: <cp_var_lib_containerd_size_gib>
          mountPath: /var/lib/containerd
          fsFormat: ext4
        - name: var-lib-etcd
          sizeGiB: <cp_var_lib_etcd_size_gib>
          mountPath: /var/lib/etcd
          fsFormat: ext4
          wipeFilesystem: true
    - hostname: "<cp_node_name_2>"
      datacenter: "<master_02_datacenter>"
      network:
        primary:
          networkName: "<nic1_network_name>"
          deviceName: "<nic1_device_name>"
          ip: "<master_02_nic1_ip>/<nic1_prefix>"
          gateway: "<nic1_gateway>"
          dns:
            - "<nic1_dns_1>"
      persistentDisks:
        - name: var-cpaas
          sizeGiB: <cp_var_cpaas_size_gib>
          mountPath: /var/cpaas
          fsFormat: ext4
        - name: var-lib-containerd
          sizeGiB: <cp_var_lib_containerd_size_gib>
          mountPath: /var/lib/containerd
          fsFormat: ext4
        - name: var-lib-etcd
          sizeGiB: <cp_var_lib_etcd_size_gib>
          mountPath: /var/lib/etcd
          fsFormat: ext4
          wipeFilesystem: true
    - hostname: "<cp_node_name_3>"
      datacenter: "<master_03_datacenter>"
      network:
        primary:
          networkName: "<nic1_network_name>"
          deviceName: "<nic1_device_name>"
          ip: "<master_03_nic1_ip>/<nic1_prefix>"
          gateway: "<nic1_gateway>"
          dns:
            - "<nic1_dns_1>"
      persistentDisks:
        - name: var-cpaas
          sizeGiB: <cp_var_cpaas_size_gib>
          mountPath: /var/cpaas
          fsFormat: ext4
        - name: var-lib-containerd
          sizeGiB: <cp_var_lib_containerd_size_gib>
          mountPath: /var/lib/containerd
          fsFormat: ext4
        - name: var-lib-etcd
          sizeGiB: <cp_var_lib_etcd_size_gib>
          mountPath: /var/lib/etcd
          fsFormat: ext4
          wipeFilesystem: true
Create the worker machine config pool.
03-vspheremachineconfigpool-worker.yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineConfigPool
metadata:
  name: <cluster_name>-worker-pool
  namespace: <namespace>
spec:
  clusterRef:
    apiVersion: cluster.x-k8s.io/v1beta1
    kind: Cluster
    name: <cluster_name>
  datacenter: "<default_datacenter>"
  releaseDelayHours: <release_delay_hours>
  configs:
    - hostname: "<worker_node_name_1>"
      datacenter: "<worker_01_datacenter>"
      network:
        primary:
          networkName: "<nic1_network_name>"
          deviceName: "<nic1_device_name>"
          ip: "<worker_01_nic1_ip>/<nic1_prefix>"
          gateway: "<nic1_gateway>"
          dns:
            - "<nic1_dns_1>"
      persistentDisks:
        - name: var-cpaas
          sizeGiB: <worker_var_cpaas_size_gib>
          mountPath: /var/cpaas
          fsFormat: ext4
        - name: var-lib-containerd
          sizeGiB: <worker_var_lib_containerd_size_gib>
          mountPath: /var/lib/containerd
          fsFormat: ext4
Apply both manifests:
kubectl apply -f 02-vspheremachineconfigpool-control-plane.yaml
kubectl apply -f 03-vspheremachineconfigpool-worker.yaml
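Create the control plane objects
Create the VSphereMachineTemplate and KubeadmControlPlane objects. Replace the placeholders in the complete template below with the values collected in the checklist document.
The template keeps both cloneMode and diskGiB because CAPV accepts both fields. In practice, diskGiB affects the system disk only when the actual clone operation is a fullClone. If cloneMode is linkedClone and the template has a usable snapshot, CAPV completes a linked clone and the system disk keeps the size of the source template. If no usable snapshot exists, CAPV falls back to fullClone, and diskGiB takes effect again.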
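The interaction between cloneMode, snapshots, and diskGiB described above can be written down as a small decision function. This is only a sketch of the rules as stated in this document, not CAPV code; effective_system_disk is a hypothetical helper, and the "yes"/"no" snapshot flag is an assumed input that you would determine from vCenter.

```shell
#!/bin/sh
# Hypothetical helper: predict the resulting system disk size in GiB.

# effective_system_disk CLONE_MODE HAS_SNAPSHOT TEMPLATE_GIB DISK_GIB
#   CLONE_MODE    linkedClone or fullClone (the requested cloneMode)
#   HAS_SNAPSHOT  "yes" when the template has a usable snapshot
effective_system_disk() {
  mode=$1
  has_snapshot=$2
  template_gib=$3
  disk_gib=$4
  if [ "$mode" = "linkedClone" ] && [ "$has_snapshot" = "yes" ]; then
    # Linked clone succeeds: the disk keeps the template size.
    echo "$template_gib"
    return 0
  fi
  # fullClone, or linkedClone falling back to fullClone: diskGiB applies
  # and must not be smaller than the template disk.
  if [ "$disk_gib" -lt "$template_gib" ]; then
    echo "diskGiB ($disk_gib) is smaller than the template disk ($template_gib)" >&2
    return 1
  fi
  echo "$disk_gib"
}
```

For example, a 40 GiB template with diskGiB set to 100 yields 40 for a successful linked clone and 100 for a full clone.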
20-control-plane.yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineTemplate
metadata:
  name: <cluster_name>-control-plane
  namespace: <namespace>
spec:
  template:
    spec:
      server: "<vsphere_server>"
      template: "<template_name>"
      cloneMode: <clone_mode>
      folder: "<vm_folder>"
      datastore: "<cp_system_datastore>"
      diskGiB: <cp_system_disk_gib>
      memoryMiB: <cp_memory_mib>
      numCPUs: <cp_num_cpus>
      os: Linux
      powerOffMode: <power_off_mode>
      network:
        devices:
          - networkName: "<nic1_network_name>"
      machineConfigPoolRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: VSphereMachineConfigPool
        name: <cluster_name>-cp-pool
        namespace: <namespace>
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: <cluster_name>-kcp
  namespace: <namespace>
spec:
  rolloutStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
  version: "<k8s_version>"
  replicas: <cp_replicas>
  machineTemplate:
    nodeDrainTimeout: 1m
    nodeDeletionTimeout: 5m
    metadata:
      labels:
        node-role.kubernetes.io/control-plane: ""
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: VSphereMachineTemplate
      name: <cluster_name>-control-plane
  kubeadmConfigSpec:
    users:
      - name: boot
        sudo: ALL=(ALL) NOPASSWD:ALL
        sshAuthorizedKeys:
          - "<ssh_public_key>"
    files:
      - path: /etc/kubernetes/admission/psa-config.yaml
        owner: "root:root"
        permissions: "0644"
        content: |
          apiVersion: apiserver.config.k8s.io/v1
          kind: AdmissionConfiguration
          plugins:
          - name: PodSecurity
            configuration:
              apiVersion: pod-security.admission.config.k8s.io/v1
              kind: PodSecurityConfiguration
              defaults:
                enforce: "privileged"
                enforce-version: "latest"
                audit: "baseline"
                audit-version: "latest"
                warn: "baseline"
                warn-version: "latest"
              exemptions:
                usernames: []
                runtimeClasses: []
                namespaces:
                - kube-system
                - <namespace>
      - path: /etc/kubernetes/patches/kubeletconfiguration0+strategic.json
        owner: "root:root"
        permissions: "0644"
        content: |
          {
            "apiVersion": "kubelet.config.k8s.io/v1beta1",
            "kind": "KubeletConfiguration",
            "protectKernelDefaults": true,
            "streamingConnectionIdleTimeout": "5m",
            "tlsCertFile": "/etc/kubernetes/pki/kubelet.crt",
            "tlsPrivateKeyFile": "/etc/kubernetes/pki/kubelet.key"
          }
      # Generate the encryption key with: head -c 32 /dev/urandom | base64
      - path: /etc/kubernetes/encryption-provider.conf
        owner: "root:root"
        append: false
        permissions: "0644"
        content: |
          apiVersion: apiserver.config.k8s.io/v1
          kind: EncryptionConfiguration
          resources:
          - resources:
            - secrets
            providers:
            - aescbc:
                keys:
                - name: key1
                  secret: <encryption_provider_secret>
      - path: /etc/kubernetes/audit/policy.yaml
        owner: "root:root"
        append: false
        permissions: "0644"
        content: |
          apiVersion: audit.k8s.io/v1
          kind: Policy
          omitStages:
          - "RequestReceived"
          rules:
          - level: None
            users:
            - system:kube-controller-manager
            - system:kube-scheduler
            - system:serviceaccount:kube-system:endpoint-controller
            verbs: ["get", "update"]
            namespaces: ["kube-system"]
            resources:
            - group: ""
              resources: ["endpoints"]
          - level: None
            nonResourceURLs:
            - /healthz*
            - /version
            - /swagger*
          - level: None
            resources:
            - group: ""
              resources: ["events"]
          - level: None
            resources:
            - group: "devops.alauda.io"
          - level: None
            verbs: ["get", "list", "watch"]
          - level: None
            resources:
            - group: "coordination.k8s.io"
              resources: ["leases"]
          - level: None
            resources:
            - group: "authorization.k8s.io"
              resources: ["subjectaccessreviews", "selfsubjectaccessreviews"]
            - group: "authentication.k8s.io"
              resources: ["tokenreviews"]
          - level: None
            resources:
            - group: "app.alauda.io"
              resources: ["imagewhitelists"]
            - group: "k8s.io"
              resources: ["namespaceoverviews"]
          - level: Metadata
            resources:
            - group: ""
              resources: ["secrets", "configmaps"]
          - level: Metadata
            resources:
            - group: "operator.connectors.alauda.io"
              resources: ["installmanifests"]
            - group: "operators.katanomi.dev"
              resources: ["katanomis"]
          - level: RequestResponse
            resources:
            - group: ""
            - group: "aiops.alauda.io"
            - group: "apps"
            - group: "app.k8s.io"
            - group: "authentication.istio.io"
            - group: "auth.alauda.io"
            - group: "autoscaling"
            - group: "asm.alauda.io"
            - group: "clusterregistry.k8s.io"
            - group: "crd.alauda.io"
            - group: "infrastructure.alauda.io"
            - group: "monitoring.coreos.com"
            - group: "operators.coreos.com"
            - group: "networking.istio.io"
            - group: "extensions.istio.io"
            - group: "install.istio.io"
            - group: "security.istio.io"
            - group: "telemetry.istio.io"
            - group: "opentelemetry.io"
            - group: "networking.k8s.io"
            - group: "portal.alauda.io"
            - group: "rbac.authorization.k8s.io"
            - group: "storage.k8s.io"
            - group: "tke.cloud.tencent.com"
            - group: "devopsx.alauda.io"
            - group: "core.katanomi.dev"
            - group: "deliveries.katanomi.dev"
            - group: "integrations.katanomi.dev"
            - group: "artifacts.katanomi.dev"
            - group: "builds.katanomi.dev"
            - group: "versioning.katanomi.dev"
            - group: "sources.katanomi.dev"
            - group: "tekton.dev"
            - group: "operator.tekton.dev"
            - group: "eventing.knative.dev"
            - group: "flows.knative.dev"
            - group: "messaging.knative.dev"
            - group: "operator.knative.dev"
            - group: "sources.knative.dev"
            - group: "operator.devops.alauda.io"
            - group: "flagger.app"
            - group: "jaegertracing.io"
            - group: "velero.io"
              resources: ["deletebackuprequests"]
            - group: "connectors.alauda.io"
            - group: "operator.connectors.alauda.io"
              resources: ["connectorscores", "connectorsgits", "connectorsocis"]
          - level: Metadata
      - path: /usr/local/bin/capv-load-local-images.sh
        owner: "root:root"
        permissions: "0755"
        content: |
          #!/bin/bash
          set -euo pipefail
          until mountpoint -q /var/lib/containerd; do
            echo "waiting for /var/lib/containerd mount"
            sleep 1
          done
          systemctl restart containerd
          until systemctl is-active --quiet containerd; do
            echo "waiting for containerd"
            sleep 1
          done
          if [ ! -d "/root/images" ]; then
            echo "ERROR: /root/images directory not found" >&2
            exit 1
          fi
          image_count=0
          for image_file in /root/images/*.tar; do
            if [ -f "$image_file" ]; then
              echo "importing image: $image_file"
              ctr -n k8s.io images import "$image_file"
              image_count=$((image_count + 1))
            fi
          done
          if [ "$image_count" -eq 0 ]; then
            echo "ERROR: no tar files found in /root/images" >&2
            exit 1
          fi
          echo "imported $image_count images"
    preKubeadmCommands:
      - hostnamectl set-hostname "{{ ds.meta_data.hostname }}"
      - echo "::1 ipv6-localhost ipv6-loopback localhost6 localhost6.localdomain6" >/etc/hosts
      - echo "127.0.0.1 {{ ds.meta_data.hostname }} {{ local_hostname }} localhost localhost.localdomain localhost4 localhost4.localdomain4" >>/etc/hosts
      - while ! ip route | grep -q "default via"; do sleep 1; done; echo "NetworkManager started"
      - /usr/local/bin/capv-load-local-images.sh
    postKubeadmCommands:
      - chmod 600 /var/lib/kubelet/config.yaml
    clusterConfiguration:
      imageRepository: <image_registry>/tkestack
      dns:
        imageTag: <dns_image_tag>
      etcd:
        local:
          imageTag: <etcd_image_tag>
      apiServer:
        extraArgs:
          admission-control-config-file: /etc/kubernetes/admission/psa-config.yaml
          audit-log-format: json
          audit-log-maxage: "30"
          audit-log-maxbackup: "10"
          audit-log-maxsize: "200"
          audit-log-mode: batch
          audit-log-path: /etc/kubernetes/audit/audit.log
          audit-policy-file: /etc/kubernetes/audit/policy.yaml
          encryption-provider-config: /etc/kubernetes/encryption-provider.conf
          kubelet-certificate-authority: /etc/kubernetes/pki/ca.crt
          profiling: "false"
          tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
          tls-min-version: VersionTLS12
        extraVolumes:
          - hostPath: /etc/kubernetes
            mountPath: /etc/kubernetes
            name: vol-dir-0
            pathType: Directory
      controllerManager:
        extraArgs:
          bind-address: "::"
          cloud-provider: external
          profiling: "false"
          tls-min-version: VersionTLS12
      scheduler:
        extraArgs:
          bind-address: "::"
          profiling: "false"
          tls-min-version: VersionTLS12
    initConfiguration:
      nodeRegistration:
        criSocket: /var/run/containerd/containerd.sock
        ignorePreflightErrors:
          - ImagePull
        kubeletExtraArgs:
          cloud-provider: external
          node-labels: kube-ovn/role=master
        name: '{{ local_hostname }}'
      patches:
        directory: /etc/kubernetes/patches
    joinConfiguration:
      nodeRegistration:
        criSocket: /var/run/containerd/containerd.sock
        ignorePreflightErrors:
          - ImagePull
        kubeletExtraArgs:
          cloud-provider: external
          node-labels: kube-ovn/role=master
          volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
        name: '{{ local_hostname }}'
      patches:
        directory: /etc/kubernetes/patches
Apply the manifest:
kubectl apply -f 20-control-plane.yaml
Create the worker objects
Create the worker machine template, bootstrap template, and MachineDeployment.
30-workers-md-0.yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineTemplate
metadata:
  name: <cluster_name>-worker
  namespace: <namespace>
spec:
  template:
    spec:
      server: "<vsphere_server>"
      template: "<template_name>"
      cloneMode: <clone_mode>
      folder: "<vm_folder>"
      datastore: "<worker_system_datastore>"
      diskGiB: <worker_system_disk_gib>
      memoryMiB: <worker_memory_mib>
      numCPUs: <worker_num_cpus>
      os: Linux
      powerOffMode: <power_off_mode>
      network:
        devices:
          - networkName: "<nic1_network_name>"
      machineConfigPoolRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: VSphereMachineConfigPool
        name: <cluster_name>-worker-pool
        namespace: <namespace>
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: <cluster_name>-worker-bootstrap
  namespace: <namespace>
spec:
  template:
    spec:
      files:
        - path: /etc/kubernetes/patches/kubeletconfiguration0+strategic.json
          owner: "root:root"
          permissions: "0644"
          content: |
            {
              "apiVersion": "kubelet.config.k8s.io/v1beta1",
              "kind": "KubeletConfiguration",
              "protectKernelDefaults": true,
              "staticPodPath": null,
              "streamingConnectionIdleTimeout": "5m",
              "tlsCertFile": "/etc/kubernetes/pki/kubelet.crt",
              "tlsPrivateKeyFile": "/etc/kubernetes/pki/kubelet.key"
            }
        - path: /usr/local/bin/capv-load-local-images.sh
          owner: "root:root"
          permissions: "0755"
          content: |
            #!/bin/bash
            set -euo pipefail
            until mountpoint -q /var/lib/containerd; do
              echo "waiting for /var/lib/containerd mount"
              sleep 1
            done
            systemctl restart containerd
            until systemctl is-active --quiet containerd; do
              echo "waiting for containerd"
              sleep 1
            done
            if [ ! -d "/root/images" ]; then
              echo "ERROR: /root/images directory not found" >&2
              exit 1
            fi
            image_count=0
            for image_file in /root/images/*.tar; do
              if [ -f "$image_file" ]; then
                echo "importing image: $image_file"
                ctr -n k8s.io images import "$image_file"
                image_count=$((image_count + 1))
              fi
            done
            if [ "$image_count" -eq 0 ]; then
              echo "ERROR: no tar files found in /root/images" >&2
              exit 1
            fi
            echo "imported $image_count images"
      joinConfiguration:
        nodeRegistration:
          criSocket: /var/run/containerd/containerd.sock
          ignorePreflightErrors:
            - ImagePull
          kubeletExtraArgs:
            cloud-provider: external
            volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
          name: '{{ local_hostname }}'
        patches:
          directory: /etc/kubernetes/patches
      preKubeadmCommands:
        - hostnamectl set-hostname "{{ ds.meta_data.hostname }}"
        - echo "::1 ipv6-localhost ipv6-loopback localhost6 localhost6.localdomain6" >/etc/hosts
        - echo "127.0.0.1 {{ ds.meta_data.hostname }} {{ local_hostname }} localhost localhost.localdomain localhost4 localhost4.localdomain4" >>/etc/hosts
        - while ! ip route | grep -q "default via"; do sleep 1; done; echo "NetworkManager started"
        - /usr/local/bin/capv-load-local-images.sh
      postKubeadmCommands:
        - chmod 600 /var/lib/kubelet/config.yaml
      users:
        - name: boot
          sudo: ALL=(ALL) NOPASSWD:ALL
          sshAuthorizedKeys:
            - "<ssh_public_key>"
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: <cluster_name>-md-0
  namespace: <namespace>
spec:
  clusterName: <cluster_name>
  replicas: <worker_replicas>
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
  selector:
    matchLabels:
      nodepool: md-0
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: <cluster_name>
        nodepool: md-0
    spec:
      clusterName: <cluster_name>
      version: "<k8s_version>"
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: <cluster_name>-worker-bootstrap
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: VSphereMachineTemplate
        name: <cluster_name>-worker
Apply the manifest:
kubectl apply -f 30-workers-md-0.yaml
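In the basic workflow, note the following worker-specific rules:
- The main worker manifest does not set failureDomain by default, because the basic workflow assumes a single datacenter. If you need a worker MachineDeployment to land in a specific VSphereDeploymentZone, add failureDomain as described in Extension Scenarios.
- Some environments add extra runtime-image replacement commands or service restart commands to KubeadmConfigTemplate. These commands are intentionally omitted from the basic example. Add them only when the platform requirements in your environment explicitly call for them.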
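Wait for the cluster to become ready
After all manifests are applied, cluster creation proceeds asynchronously. Monitor progress with the following command: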
kubectl -n <namespace> get cluster,kubeadmcontrolplane,machinedeployment,machine -w
Before you continue to verification, wait until KubeadmControlPlane reports the expected number of ready replicas and all Machine objects reach the Running phase.
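The readiness condition can also be evaluated in a script instead of watching the output. The sketch below only judges values that you feed in; all_running and kcp_ready are hypothetical helper names, and the jsonpath expressions in the comments assume the standard Cluster API v1beta1 status fields.

```shell
#!/bin/sh
# Hypothetical helpers: decide whether the wait condition is satisfied.

# all_running: read Machine phases from stdin (one per line) and succeed only
# if there is at least one phase and every phase is "Running".
all_running() {
  awk 'NF { n++; if ($1 != "Running") bad++ }
       END { exit ((n > 0 && bad == 0) ? 0 : 1) }'
}

# kcp_ready DESIRED READY: succeed when READY is set and equals DESIRED.
kcp_ready() {
  [ -n "$2" ] && [ "$1" -eq "$2" ]
}

# Example use against the global cluster:
#   kubectl -n <namespace> get machine \
#     -o jsonpath='{range .items[*]}{.status.phase}{"\n"}{end}' | all_running
#   kcp_ready "$(kubectl -n <namespace> get kubeadmcontrolplane <cluster_name>-kcp \
#       -o jsonpath='{.spec.replicas}')" \
#     "$(kubectl -n <namespace> get kubeadmcontrolplane <cluster_name>-kcp \
#       -o jsonpath='{.status.readyReplicas}')"
```

An empty readyReplicas value is treated as not ready, because the field is typically absent until the first control plane node reports in.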
Verification
Use the following commands to verify the cluster creation workflow.
- Check the CPI delivery resources in the global cluster:
kubectl -n <namespace> get clusterresourceset
kubectl -n <namespace> get clusterresourcesetbinding
- Export the workload kubeconfig:
kubectl -n <namespace> get secret <cluster_name>-kubeconfig -o jsonpath='{.data.value}' | base64 -d > /tmp/<cluster_name>.kubeconfig
- Check whether the vSphere CPI daemonset was created in the workload cluster:
kubectl --kubeconfig=/tmp/<cluster_name>.kubeconfig -n kube-system get daemonset
- Check the global cluster objects:
kubectl -n <namespace> get cluster,vspherecluster,kubeadmcontrolplane,machinedeployment,machine,vspheremachine,vspherevm
- Check the workload nodes:
kubectl --kubeconfig=/tmp/<cluster_name>.kubeconfig get nodes -o wide
Confirm the following results:
- vsphere-cloud-controller-manager appears in the workload cluster.
- The control plane and worker nodes are created.
- The nodes eventually become Ready.
Troubleshooting
When the workflow fails, start with the following commands:
kubectl -n <namespace> describe cluster <cluster_name>
kubectl -n <namespace> describe vspherecluster <cluster_name>
kubectl -n <namespace> describe kubeadmcontrolplane <cluster_name>-kcp
kubectl -n <namespace> describe machinedeployment <cluster_name>-md-0
kubectl -n <namespace> get cluster,vspherecluster,kubeadmcontrolplane,machinedeployment,machine,vspheremachine,vspherevm
kubectl -n cpaas-system logs deploy/capi-controller-manager
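Check the following items first:
- If the CPI resources are not delivered, verify ClusterResourceSet=true, the ClusterResourceSet, and the ClusterResourceSetBinding.
- If the ClusterResourceSet exists but no ClusterResourceSetBinding is created, check whether the controller has the required delete permission on the referenced ConfigMap and Secret resources.
- If the network plugin is not installed, verify that the required cluster annotations exist and that the platform controller has processed them.
- If the cpaas.io/registry-address annotation is missing, verify the public registry credential and the platform controller responsible for injecting that annotation.
- If a machine is stuck in Provisioning, check the MachineConfigPoolReady condition on the VSphereMachine. It shows whether slot allocation failed because of pool binding or a datacenter mismatch.
- If a VM is waiting for IP allocation, verify VMware Tools, the static IP settings, and VSphereVM.status.addresses.
- If a workload Node object never gets spec.providerID, first verify the CPI delivery resources, and then check for duplicate vCenter guest hostnames. When an old VM in the same datacenter still reports the same guest hostname as the new node, cloud-provider-vsphere may fall back to a node-name lookup, cache the old VM, and reject the new node because the VM IP does not match the kubelet node IP. Check the leader vsphere-cloud-controller-manager logs, the node SystemUUID, the real VM UUID, and the vCenter guest hostname and IP values. After you fix or remove the duplicate hostname or the old VM conflict, restart the vsphere-cloud-controller-manager Pods of the workload cluster to clear the stale in-memory cache: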
kubectl --kubeconfig=/tmp/<cluster_name>.kubeconfig -n kube-system delete pod \
-l k8s-app=vsphere-cloud-controller-manager
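- If the datastore runs out of space, check whether old VM directories or .vmdk files remain in the target datastore.
- If the template system disk size differs from the manifest value, check the actual clone mode first. When a VM is created as a linkedClone, the system disk keeps the template size and diskGiB is ignored. Only fullClone uses diskGiB, and in that case diskGiB must not be smaller than the template disk size.
- If the control plane endpoint does not come up, verify the load balancer, the VIP, and port 6443.
- If the TLS connection to vCenter fails, verify the thumbprint, the vCenter address, and whether proxy settings interfere with the connection.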
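For the duplicate guest hostname case in the list above, a quick way to spot collisions is to compare the planned node hostnames with the guest hostnames already present in the datacenter. The sketch below assumes you have already collected both sets into a plain list, one hostname per line; how that list is exported from vCenter is outside this sketch, and duplicate_hostnames is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: find hostnames that occur more than once.

# duplicate_hostnames: read hostnames from stdin (one per line) and print each
# name that appears at least twice. Names are compared case-insensitively and
# blank lines are ignored.
duplicate_hostnames() {
  tr '[:upper:]' '[:lower:]' | sed '/^[[:space:]]*$/d' | sort | uniq -d
}
```

Any name printed by this function identifies a VM pair to inspect in vCenter before you restart the vsphere-cloud-controller-manager Pods.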
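When you review controller logs, follow these rules:
- deploy/capi-controller-manager runs in the cpaas-system namespace of the global cluster.
- Do not use the workload cluster kubeconfig to view the capi-controller-manager logs.
- If a platform controller processes the cluster network annotations, also check the platform network-controller logs and the platform cluster-lifecycle-controller logs.
Next Steps
After the basic topology is running, continue with Extension Scenarios if you need a second NIC, multiple datacenters, failure domains, additional data disks, or more worker replicas.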