Creating a VMware vSphere Cluster from the global Cluster
This document describes how to create a VMware vSphere workload cluster from the global cluster using the standard CAPV model with a direct connection to vCenter. The procedure covers a minimal supported topology: one datacenter, one NIC per node, and static IP assignment through VSphereMachineConfigPool.
Scenarios
Use this document in the following scenarios:
- You want to create the first baseline VMware vSphere workload cluster in your environment.
- You are using one datacenter and one NIC per node for initial validation.
- You want to keep the first deployment as simple as possible before enabling advanced placement or networking features.
This document applies to the following deployment model:
- CAPV connects directly to vCenter.
- Both control plane and worker nodes use VSphereMachineConfigPool for static IP assignment and data disk configuration.
- ClusterResourceSet automatically delivers the vSphere CPI components.
- Initial validation uses one datacenter and one NIC per node.
This document does not apply to the following scenarios:
- Deployments that rely on vSphere Supervisor or vm-operator.
- Deployments that do not use VSphereMachineConfigPool.
- First deployments that enable multiple datacenters, multiple NICs, and complex disk expansion at the same time.
This document is written for the current platform environment. The kube-ovn delivery path depends on platform controllers that consume Cluster resource annotations, so this workflow is not a general guide for standalone CAPV deployments outside the platform context.
Prerequisites
Before you begin, make sure the following conditions are met:
- You have collected the values described in Preparing Parameters for a VMware vSphere Cluster.
- The global cluster can reach vCenter.
- The target template, network, datastore, and vCenter resource pool are available.
- The control plane VIP and load balancer are ready.
- All required static IP addresses are allocated and not in use.
- ClusterResourceSet=true is enabled.
- The platform has a valid public registry configuration.
- The platform can process the cluster annotations required to install the network plugin.
Key objects
ClusterResourceSet
ClusterResourceSet is a Cluster API resource in the management cluster. Once the workload API server is reachable, it applies the referenced ConfigMap and Secret resources to the workload cluster.
In this workflow, the ClusterResourceSet automatically delivers the vSphere CPI resources.
vSphere CPI component
The vSphere CPI component is delivered to the workload cluster through the ClusterResourceSet. It connects the workload nodes to the vSphere infrastructure so that the cluster can report infrastructure identity and complete cloud-provider initialization.
machine config pool
The machine config pool is the VSphereMachineConfigPool custom resource. In the baseline workflow:
- One machine config pool serves the control plane nodes.
- One machine config pool serves the worker nodes.
Each node slot contains the hostname, the datacenter, the static IP assignment, and optional data disk definitions.
For network configuration, distinguish the following fields:
- networkName is the vCenter network or port group name.
- deviceName is the NIC name inside the guest operating system.
If deviceName is set, CAPV writes that value into the generated guest-network metadata. If the field is omitted, the current implementation typically assigns NIC names such as eth0, eth1, and eth2 in NIC order.
Also distinguish the following value formats, illustrated by the sketch after this list:
- Node IP addresses include the prefix length, for example 10.10.10.11/24.
- The gateway field contains only the gateway IP address, for example 10.10.10.1.
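As a minimal illustration (the addresses are placeholders; the full structure appears in the machine config pool manifests later in this document), a primary NIC entry combines both formats:
network:
  primary:
    networkName: "<nic1_network_name>"
    ip: "10.10.10.11/24"
    gateway: "10.10.10.1"
    dns:
      - "<nic1_dns_1>"
The ip value carries the prefix length, while gateway is a bare address.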
VM template requirements
The VM template used in this workflow must meet the following minimum requirements:
- It uses the operating system required by the target platform environment.
- It includes cloud-init.
- It includes VMware Tools or open-vm-tools.
- It includes containerd.
- It includes the base components required by kubeadm bootstrap.
- It contains pre-exported container image tar files under /root/images/. These files are imported into containerd by capv-load-local-images.sh before kubeadm runs, so node bootstrap does not depend on pulling images from a remote registry.
The /root/images/*.tar files must include the sandbox (pause) image, and its reference must exactly match the sandbox_image value (containerd v1) or the sandbox value (containerd v2) configured in /etc/containerd/config.toml. For example, if containerd is configured with sandbox_image = "registry.example.com/tkestack/pause:3.10", one of the tar files must contain that exact image reference. A mismatch causes containerd to pull the sandbox image over the network, which defeats the purpose of local preloading and fails in air-gapped environments.
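To catch a mismatch before cloning, you can boot a VM from the template and compare the two sides directly. A minimal sketch, assuming the containerd v1 key and the same one-file-at-a-time import pattern as capv-load-local-images.sh:
# Sandbox image reference that containerd expects
grep sandbox_image /etc/containerd/config.toml
# Import the preloaded tars, then list what the k8s.io namespace actually contains
for f in /root/images/*.tar; do ctr -n k8s.io images import "$f"; done
ctr -n k8s.io images ls -q | grep pause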
Static IP configuration, hostname injection, and other initialization settings depend on cloud-init. Node IP reporting depends on the guest tools.
Local file layout
Create a local working directory and save the manifests in the following layout:
capv-cluster/
├── 00-namespace.yaml
├── 01-vsphere-credentials-secret.yaml
├── 02-vspheremachineconfigpool-control-plane.yaml
├── 03-vspheremachineconfigpool-worker.yaml
├── 10-cluster.yaml
├── 15-vsphere-cpi-clusterresourceset.yaml
├── 20-control-plane.yaml
└── 30-workers-md-0.yaml
Create the directory with the following commands:
mkdir -p ./capv-cluster
cd ./capv-cluster
Procedure
Verify the environment
Run the following commands from the management environment to verify the minimum prerequisites:
kubectl get ns
kubectl get minfo -l cpaas.io/module-name=cluster-api-provider-vsphere
kubectl get minfo -l cpaas.io/module-name=cluster-api-provider-kubeadm
kubectl -n cpaas-system get deploy capi-controller-manager -o jsonpath='{.spec.template.spec.containers[0].args}'
kubectl -n cpaas-system get secret public-registry-credential -o jsonpath='{.data.content}'
Confirm the following results:
- The management cluster is reachable.
- The Alauda Container Platform Kubeadm Provider and the Alauda Container Platform VMware vSphere Infrastructure Provider are running.
- The controller arguments include ClusterResourceSet=true.
- The public registry credential data.content is not empty.
Before you continue, also check the following items:
- The vCenter server address is reachable.
- The vCenter username and password are valid.
- The thumbprint is correct (one way to cross-check it is shown after this list).
- The template name is correct.
- The template can be resolved in the target datacenter.
- If VMs are cloned with fullClone, the template system disk must not be larger than the diskGiB value used in the later manifests. If CAPV completes a linkedClone, the system disk keeps the template size and diskGiB is ignored.
- VMware Tools or open-vm-tools is installed in the template.
- The VIP exists, and port 6443 is reachable from the execution environment.
- The load balancer ownership model for real-server maintenance is clear.
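One way to cross-check the thumbprint is to read the SHA-1 fingerprint directly from the vCenter certificate; a sketch, assuming the standard HTTPS port:
echo | openssl s_client -connect <vsphere_server>:443 2>/dev/null | openssl x509 -fingerprint -sha1 -noout
Compare the printed SHA1 Fingerprint value with the thumbprint used in the manifests below.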
Create the namespace and the vCenter credentials Secret
Create the namespace that stores the workload cluster objects.
This workflow stores the workload cluster objects in the cpaas-system namespace. In the manifests and commands below, replace every <namespace> placeholder with cpaas-system.
00-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
name: <namespace>
Create the vCenter credentials Secret referenced by VSphereCluster.spec.identityRef.
01-vsphere-credentials-secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: <credentials_secret_name>
namespace: <namespace>
type: Opaque
stringData:
username: "<vsphere_username>"
password: "<vsphere_password>"
Apply both manifests:
kubectl apply -f 00-namespace.yaml
kubectl apply -f 01-vsphere-credentials-secret.yaml
Create the Cluster and VSphereCluster objects
Create the base cluster manifest, which contains the workload cluster network settings, the control plane endpoint, and the vCenter connection settings.
10-cluster.yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
name: <cluster_name>
namespace: <namespace>
labels:
cluster.x-k8s.io/cluster-name: <cluster_name>
cluster-type: VSphere
addons.cluster.x-k8s.io/vsphere-cpi: "enabled"
annotations:
capi.cpaas.io/resource-group-version: infrastructure.cluster.x-k8s.io/v1beta1
capi.cpaas.io/resource-kind: VSphereCluster
cpaas.io/sentry-deploy-type: Baremetal
cpaas.io/alb-address-type: ClusterAddress
cpaas.io/network-type: kube-ovn
cpaas.io/kube-ovn-version: <kube_ovn_version>
cpaas.io/kube-ovn-join-cidr: <kube_ovn_join_cidr>
spec:
clusterNetwork:
pods:
cidrBlocks:
- <pod_cidr>
services:
cidrBlocks:
- <service_cidr>
controlPlaneRef:
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
name: <cluster_name>
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereCluster
name: <cluster_name>
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereCluster
metadata:
name: <cluster_name>
namespace: <namespace>
spec:
controlPlaneEndpoint:
host: "<vip>"
port: <api_server_port>
identityRef:
kind: Secret
name: <credentials_secret_name>
server: "<vsphere_server>"
thumbprint: "<thumbprint>"
Apply the manifest:
kubectl apply -f 10-cluster.yaml
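No VMs are created at this point; the infrastructure objects simply wait for the remaining manifests. A quick check that both objects were accepted (the same listing is used again in the verification section):
kubectl -n <namespace> get cluster,vspherecluster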
Create the vSphere CPI delivery resources
Create a ClusterResourceSet so that the workload cluster automatically receives the vSphere CPI configuration and manifests once the workload API server is reachable.
WARNING
The CPI ConfigMap, Secret, and ClusterResourceSet resources must be created in the same namespace as the Cluster resource. In this guide, that namespace is cpaas-system. A ClusterResourceSet only matches clusters in its own namespace; deploying these resources to a different namespace silently prevents delivery.
INFO
The kube-ovn configuration in the Cluster annotations is consumed by platform controllers. This document does not install the network plugin directly.
TIP
This manifest is long and contains nested YAML inside data fields. Validate it before applying:
kubectl apply --dry-run=client -f 15-vsphere-cpi-clusterresourceset.yaml
15-vsphere-cpi-clusterresourceset.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: <cluster_name>-vsphere-cpi-config
namespace: <namespace>
data:
data: |
apiVersion: v1
kind: ConfigMap
metadata:
name: cloud-config
namespace: kube-system
data:
vsphere.conf: |
[Global]
secret-name = "vsphere-cloud-secret"
secret-namespace = "kube-system"
service-account = "cloud-controller-manager"
port = "443"
insecure-flag = "<cpi_insecure_flag>"
datacenters = "<cpi_datacenters>"
[Labels]
zone = "k8s-zone"
region = "k8s-region"
[VirtualCenter "<vsphere_server>"]
---
apiVersion: v1
kind: Secret
metadata:
name: <cluster_name>-vsphere-cpi-secret
namespace: <namespace>
type: addons.cluster.x-k8s.io/resource-set
stringData:
data: |
apiVersion: v1
kind: Secret
metadata:
name: vsphere-cloud-secret
namespace: kube-system
type: Opaque
stringData:
<vsphere_server>.username: <vsphere_username>
<vsphere_server>.password: <vsphere_password>
---
apiVersion: v1
kind: ConfigMap
metadata:
name: <cluster_name>-vsphere-cpi-manifests
namespace: <namespace>
data:
data: |
apiVersion: v1
kind: ServiceAccount
metadata:
name: cloud-controller-manager
namespace: kube-system
automountServiceAccountToken: false
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: system:cloud-controller-manager
rules:
- apiGroups: [""]
resources: ["events"]
verbs: ["create", "patch", "update"]
- apiGroups: [""]
resources: ["nodes"]
verbs: ["*"]
- apiGroups: [""]
resources: ["nodes/status"]
verbs: ["patch"]
- apiGroups: [""]
resources: ["services"]
verbs: ["list", "patch", "update", "watch"]
- apiGroups: [""]
resources: ["services/status"]
verbs: ["patch"]
- apiGroups: [""]
resources: ["serviceaccounts"]
verbs: ["create", "get", "list", "watch", "update"]
- apiGroups: [""]
resources: ["persistentvolumes"]
verbs: ["get", "list", "update", "watch"]
- apiGroups: [""]
resources: ["endpoints"]
verbs: ["create", "get", "list", "watch", "update"]
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get", "list", "watch"]
- apiGroups: ["coordination.k8s.io"]
resources: ["leases"]
verbs: ["get", "list", "watch", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: servicecatalog.k8s.io:apiserver-authentication-reader
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: extension-apiserver-authentication-reader
subjects:
- apiGroup: ""
kind: ServiceAccount
name: cloud-controller-manager
namespace: kube-system
- apiGroup: ""
kind: User
name: cloud-controller-manager
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: system:cloud-controller-manager
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:cloud-controller-manager
subjects:
- kind: ServiceAccount
name: cloud-controller-manager
namespace: kube-system
- kind: User
name: cloud-controller-manager
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ""
labels:
component: cloud-controller-manager
tier: control-plane
k8s-app: vsphere-cloud-controller-manager
name: vsphere-cloud-controller-manager
namespace: kube-system
spec:
selector:
matchLabels:
k8s-app: vsphere-cloud-controller-manager
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
component: cloud-controller-manager
k8s-app: vsphere-cloud-controller-manager
spec:
securityContext:
runAsUser: 1001
automountServiceAccountToken: true
# Optional: required when the CPI image is stored in a private
# registry that needs authentication. The platform automatically
# syncs a dockerconfigjson secret named "global-registry-auth"
# into every namespace of the workload cluster when the
# management-cluster secret "public-registry-credential"
# (data.content) is configured. If your environment does not
# use a private registry, remove the imagePullSecrets block.
imagePullSecrets:
- name: global-registry-auth
serviceAccountName: cloud-controller-manager
hostNetwork: true
tolerations:
- operator: Exists
- key: node.cloudprovider.kubernetes.io/uninitialized
value: "true"
effect: NoSchedule
- key: node-role.kubernetes.io/master
effect: NoSchedule
- key: node.kubernetes.io/not-ready
effect: NoSchedule
operator: Exists
containers:
- name: vsphere-cloud-controller-manager
image: <image_registry>/ait/cloud-provider-vsphere:<cpi_image_tag>
args:
- --v=2
- --cloud-provider=vsphere
- --cloud-config=/etc/cloud/vsphere.conf
volumeMounts:
- mountPath: /etc/cloud
name: vsphere-config-volume
readOnly: true
resources:
requests:
cpu: 200m
volumes:
- name: vsphere-config-volume
configMap:
name: cloud-config
---
apiVersion: v1
kind: Service
metadata:
labels:
component: cloud-controller-manager
name: vsphere-cloud-controller-manager
namespace: kube-system
spec:
type: NodePort
ports:
- port: 43001
protocol: TCP
targetPort: 43001
selector:
component: cloud-controller-manager
---
apiVersion: addons.cluster.x-k8s.io/v1beta1
kind: ClusterResourceSet
metadata:
name: <cluster_name>-vsphere-cpi
namespace: <namespace>
spec:
strategy: Reconcile
clusterSelector:
matchLabels:
addons.cluster.x-k8s.io/vsphere-cpi: "enabled"
resources:
- name: <cluster_name>-vsphere-cpi-config
kind: ConfigMap
- name: <cluster_name>-vsphere-cpi-secret
kind: Secret
- name: <cluster_name>-vsphere-cpi-manifests
kind: ConfigMap
Apply the manifest:
kubectl apply -f 15-vsphere-cpi-clusterresourceset.yaml
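The ClusterResourceSet matches the cluster through the addons.cluster.x-k8s.io/vsphere-cpi: "enabled" label set in 10-cluster.yaml. To confirm that the set exists and, once the workload API server is reachable, that a binding was created:
kubectl -n <namespace> get clusterresourceset <cluster_name>-vsphere-cpi
kubectl -n <namespace> get clusterresourcesetbinding
An empty binding list is expected until the control plane comes up.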
Create the machine config pools
Create the control plane machine config pool.
INFO
Each node slot declares its NIC layout under network.primary (required) and network.additional (an optional list). The networkName of the primary NIC is required; the provider derives the Kubernetes node name, the kubelet serving certificate DNS SAN, and the kubelet node-ip from the hostname and the resolved primary NIC address. The hostname must be a valid DNS-1123 subdomain.
INFO
deviceName is optional. If you do not need to force a specific guest NIC name, remove the deviceName lines from each node slot; the provider then assigns NIC names such as eth0 and eth1 in NIC order.
02-vspheremachineconfigpool-control-plane.yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineConfigPool
metadata:
name: <cp_pool_name>
namespace: <namespace>
spec:
clusterRef:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
name: <cluster_name>
datacenter: "<default_datacenter>"
releaseDelayHours: <release_delay_hours>
configs:
- hostname: "<cp_node_name_1>"
datacenter: "<master_01_datacenter>"
network:
primary:
networkName: "<nic1_network_name>"
deviceName: "<nic1_device_name>"
ip: "<master_01_nic1_ip>/<nic1_prefix>"
gateway: "<nic1_gateway>"
dns:
- "<nic1_dns_1>"
persistentDisks:
- name: var-cpaas
sizeGiB: <cp_var_cpaas_size_gib>
mountPath: /var/cpaas
fsFormat: ext4
- name: var-lib-containerd
sizeGiB: <cp_var_lib_containerd_size_gib>
mountPath: /var/lib/containerd
fsFormat: ext4
- name: var-lib-etcd
sizeGiB: <cp_var_lib_etcd_size_gib>
mountPath: /var/lib/etcd
fsFormat: ext4
wipeFilesystem: true
- hostname: "<cp_node_name_2>"
datacenter: "<master_02_datacenter>"
network:
primary:
networkName: "<nic1_network_name>"
deviceName: "<nic1_device_name>"
ip: "<master_02_nic1_ip>/<nic1_prefix>"
gateway: "<nic1_gateway>"
dns:
- "<nic1_dns_1>"
persistentDisks:
- name: var-cpaas
sizeGiB: <cp_var_cpaas_size_gib>
mountPath: /var/cpaas
fsFormat: ext4
- name: var-lib-containerd
sizeGiB: <cp_var_lib_containerd_size_gib>
mountPath: /var/lib/containerd
fsFormat: ext4
- name: var-lib-etcd
sizeGiB: <cp_var_lib_etcd_size_gib>
mountPath: /var/lib/etcd
fsFormat: ext4
wipeFilesystem: true
- hostname: "<cp_node_name_3>"
datacenter: "<master_03_datacenter>"
network:
primary:
networkName: "<nic1_network_name>"
deviceName: "<nic1_device_name>"
ip: "<master_03_nic1_ip>/<nic1_prefix>"
gateway: "<nic1_gateway>"
dns:
- "<nic1_dns_1>"
persistentDisks:
- name: var-cpaas
sizeGiB: <cp_var_cpaas_size_gib>
mountPath: /var/cpaas
fsFormat: ext4
- name: var-lib-containerd
sizeGiB: <cp_var_lib_containerd_size_gib>
mountPath: /var/lib/containerd
fsFormat: ext4
- name: var-lib-etcd
sizeGiB: <cp_var_etcd_size_gib>
mountPath: /var/lib/etcd
fsFormat: ext4
wipeFilesystem: true
Create the worker machine config pool.
03-vspheremachineconfigpool-worker.yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineConfigPool
metadata:
name: <worker_pool_name>
namespace: <namespace>
spec:
clusterRef:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
name: <cluster_name>
datacenter: "<default_datacenter>"
releaseDelayHours: <release_delay_hours>
configs:
- hostname: "<worker_node_name_1>"
datacenter: "<worker_01_datacenter>"
network:
primary:
networkName: "<nic1_network_name>"
deviceName: "<nic1_device_name>"
ip: "<worker_01_nic1_ip>/<nic1_prefix>"
gateway: "<nic1_gateway>"
dns:
- "<nic1_dns_1>"
persistentDisks:
- name: var-cpaas
sizeGiB: <worker_var_cpaas_size_gib>
mountPath: /var/cpaas
fsFormat: ext4
- name: var-lib-containerd
sizeGiB: <worker_var_lib_containerd_size_gib>
mountPath: /var/lib/containerd
fsFormat: ext4
Apply both manifests:
kubectl apply -f 02-vspheremachineconfigpool-control-plane.yaml
kubectl apply -f 03-vspheremachineconfigpool-worker.yaml
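To confirm that both pools were accepted (this assumes the CRD exposes the lowercase kind name as the resource, consistent with the other resource names in this guide):
kubectl -n <namespace> get vspheremachineconfigpool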
Create the control plane objects
Create the VSphereMachineTemplate and KubeadmControlPlane objects. Replace the placeholders in the full template below with the values collected in the checklist document.
Both cloneMode and diskGiB are kept in the template because CAPV accepts both fields. In practice, diskGiB affects the system disk only when the actual clone operation is a fullClone. If cloneMode is linkedClone and the template has a usable snapshot, CAPV completes a linked clone and the system disk keeps the source template size. If no usable snapshot exists, CAPV falls back to fullClone, in which case diskGiB takes effect again.
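If you are unsure whether the template has a usable snapshot, one way to check is the govc CLI. This is not part of the workflow; it is only an illustrative sketch and assumes govc is installed and the GOVC_URL, GOVC_USERNAME, and GOVC_PASSWORD environment variables point at your vCenter:
govc snapshot.tree -vm "<template_name>"
If no snapshots are listed, a linkedClone request falls back to fullClone as described above.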
20-control-plane.yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineTemplate
metadata:
name: <cluster_name>-control-plane
namespace: <namespace>
spec:
template:
spec:
server: "<vsphere_server>"
template: "<template_name>"
cloneMode: <clone_mode>
datastore: "<cp_system_datastore>"
diskGiB: <cp_system_disk_gib>
memoryMiB: <cp_memory_mib>
numCPUs: <cp_num_cpus>
os: Linux
powerOffMode: <power_off_mode>
network:
devices:
- networkName: "<nic1_network_name>"
machineConfigPoolRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineConfigPool
name: <cp_pool_name>
namespace: <namespace>
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
name: <cluster_name>
namespace: <namespace>
spec:
rolloutStrategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 0
version: "<k8s_version>"
replicas: <cp_replicas>
machineTemplate:
nodeDrainTimeout: 1m
nodeDeletionTimeout: 5m
metadata:
labels:
node-role.kubernetes.io/control-plane: ""
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineTemplate
name: <cluster_name>-control-plane
kubeadmConfigSpec:
users:
- name: boot
sudo: ALL=(ALL) NOPASSWD:ALL
sshAuthorizedKeys:
- "<ssh_public_key>"
files:
- path: /etc/kubernetes/admission/psa-config.yaml
owner: "root:root"
permissions: "0644"
content: |
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
configuration:
apiVersion: pod-security.admission.config.k8s.io/v1
kind: PodSecurityConfiguration
defaults:
enforce: "privileged"
enforce-version: "latest"
audit: "baseline"
audit-version: "latest"
warn: "baseline"
warn-version: "latest"
exemptions:
usernames: []
runtimeClasses: []
namespaces:
- kube-system
- <namespace>
- path: /etc/kubernetes/patches/kubeletconfiguration0+strategic.json
owner: "root:root"
permissions: "0644"
content: |
{
"apiVersion": "kubelet.config.k8s.io/v1beta1",
"kind": "KubeletConfiguration",
"protectKernelDefaults": true,
"streamingConnectionIdleTimeout": "5m",
"tlsCertFile": "/etc/kubernetes/pki/kubelet.crt",
"tlsPrivateKeyFile": "/etc/kubernetes/pki/kubelet.key"
}
# Generate the encryption key with: head -c 32 /dev/urandom | base64
- path: /etc/kubernetes/encryption-provider.conf
owner: "root:root"
append: false
permissions: "0644"
content: |
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
- secrets
providers:
- aescbc:
keys:
- name: key1
secret: <encryption_provider_secret>
- path: /etc/kubernetes/audit/policy.yaml
owner: "root:root"
append: false
permissions: "0644"
content: |
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
- "RequestReceived"
rules:
- level: None
users:
- system:kube-controller-manager
- system:kube-scheduler
- system:serviceaccount:kube-system:endpoint-controller
verbs: ["get", "update"]
namespaces: ["kube-system"]
resources:
- group: ""
resources: ["endpoints"]
- level: None
nonResourceURLs:
- /healthz*
- /version
- /swagger*
- level: None
resources:
- group: ""
resources: ["events"]
- level: None
resources:
- group: "devops.alauda.io"
- level: None
verbs: ["get", "list", "watch"]
- level: None
resources:
- group: "coordination.k8s.io"
resources: ["leases"]
- level: None
resources:
- group: "authorization.k8s.io"
resources: ["subjectaccessreviews", "selfsubjectaccessreviews"]
- group: "authentication.k8s.io"
resources: ["tokenreviews"]
- level: None
resources:
- group: "app.alauda.io"
resources: ["imagewhitelists"]
- group: "k8s.io"
resources: ["namespaceoverviews"]
- level: Metadata
resources:
- group: ""
resources: ["secrets", "configmaps"]
- level: Metadata
resources:
- group: "operator.connectors.alauda.io"
resources: ["installmanifests"]
- group: "operators.katanomi.dev"
resources: ["katanomis"]
- level: RequestResponse
resources:
- group: ""
- group: "aiops.alauda.io"
- group: "apps"
- group: "app.k8s.io"
- group: "authentication.istio.io"
- group: "auth.alauda.io"
- group: "autoscaling"
- group: "asm.alauda.io"
- group: "clusterregistry.k8s.io"
- group: "crd.alauda.io"
- group: "infrastructure.alauda.io"
- group: "monitoring.coreos.com"
- group: "operators.coreos.com"
- group: "networking.istio.io"
- group: "extensions.istio.io"
- group: "install.istio.io"
- group: "security.istio.io"
- group: "telemetry.istio.io"
- group: "opentelemetry.io"
- group: "networking.k8s.io"
- group: "portal.alauda.io"
- group: "rbac.authorization.k8s.io"
- group: "storage.k8s.io"
- group: "tke.cloud.tencent.com"
- group: "devopsx.alauda.io"
- group: "core.katanomi.dev"
- group: "deliveries.katanomi.dev"
- group: "integrations.katanomi.dev"
- group: "artifacts.katanomi.dev"
- group: "builds.katanomi.dev"
- group: "versioning.katanomi.dev"
- group: "sources.katanomi.dev"
- group: "tekton.dev"
- group: "operator.tekton.dev"
- group: "eventing.knative.dev"
- group: "flows.knative.dev"
- group: "messaging.knative.dev"
- group: "operator.knative.dev"
- group: "sources.knative.dev"
- group: "operator.devops.alauda.io"
- group: "flagger.app"
- group: "jaegertracing.io"
- group: "velero.io"
resources: ["deletebackuprequests"]
- group: "connectors.alauda.io"
- group: "operator.connectors.alauda.io"
resources: ["connectorscores", "connectorsgits", "connectorsocis"]
- level: Metadata
- path: /usr/local/bin/capv-load-local-images.sh
owner: "root:root"
permissions: "0755"
content: |
#!/bin/bash
set -euo pipefail
until mountpoint -q /var/lib/containerd; do
echo "waiting for /var/lib/containerd mount"
sleep 1
done
systemctl restart containerd
until systemctl is-active --quiet containerd; do
echo "waiting for containerd"
sleep 1
done
if [ ! -d "/root/images" ]; then
echo "ERROR: /root/images directory not found" >&2
exit 1
fi
image_count=0
for image_file in /root/images/*.tar; do
if [ -f "$image_file" ]; then
echo "importing image: $image_file"
ctr -n k8s.io images import "$image_file"
image_count=$((image_count + 1))
fi
done
if [ "$image_count" -eq 0 ]; then
echo "ERROR: no tar files found in /root/images" >&2
exit 1
fi
echo "imported $image_count images"
preKubeadmCommands:
- hostnamectl set-hostname "{{ ds.meta_data.hostname }}"
- echo "::1 ipv6-localhost ipv6-loopback localhost6 localhost6.localdomain6" >/etc/hosts
- echo "127.0.0.1 {{ ds.meta_data.hostname }} {{ local_hostname }} localhost localhost.localdomain localhost4 localhost4.localdomain4" >>/etc/hosts
- while ! ip route | grep -q "default via"; do sleep 1; done; echo "NetworkManager started"
- /usr/local/bin/capv-load-local-images.sh
postKubeadmCommands:
- chmod 600 /var/lib/kubelet/config.yaml
clusterConfiguration:
imageRepository: <image_registry>/tkestack
dns:
imageTag: <dns_image_tag>
etcd:
local:
imageTag: <etcd_image_tag>
apiServer:
extraArgs:
admission-control-config-file: /etc/kubernetes/admission/psa-config.yaml
audit-log-format: json
audit-log-maxage: "30"
audit-log-maxbackup: "10"
audit-log-maxsize: "200"
audit-log-mode: batch
audit-log-path: /etc/kubernetes/audit/audit.log
audit-policy-file: /etc/kubernetes/audit/policy.yaml
encryption-provider-config: /etc/kubernetes/encryption-provider.conf
kubelet-certificate-authority: /etc/kubernetes/pki/ca.crt
profiling: "false"
tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
tls-min-version: VersionTLS12
extraVolumes:
- hostPath: /etc/kubernetes
mountPath: /etc/kubernetes
name: vol-dir-0
pathType: Directory
controllerManager:
extraArgs:
bind-address: "::"
cloud-provider: external
profiling: "false"
tls-min-version: VersionTLS12
scheduler:
extraArgs:
bind-address: "::"
profiling: "false"
tls-min-version: VersionTLS12
initConfiguration:
nodeRegistration:
criSocket: /var/run/containerd/containerd.sock
ignorePreflightErrors:
- ImagePull
kubeletExtraArgs:
cloud-provider: external
node-labels: kube-ovn/role=master
name: '{{ local_hostname }}'
patches:
directory: /etc/kubernetes/patches
joinConfiguration:
nodeRegistration:
criSocket: /var/run/containerd/containerd.sock
ignorePreflightErrors:
- ImagePull
kubeletExtraArgs:
cloud-provider: external
node-labels: kube-ovn/role=master
volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
name: '{{ local_hostname }}'
patches:
directory: /etc/kubernetes/patches
Apply the manifest:
kubectl apply -f 20-control-plane.yaml
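CAPV now begins cloning the control plane VMs. Early progress can be followed on the control plane and VM objects (the same resource types are used again in the verification and troubleshooting sections):
kubectl -n <namespace> get kubeadmcontrolplane <cluster_name>
kubectl -n <namespace> get vspherevm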
Create the worker objects
Create the worker machine template, the bootstrap template, and the MachineDeployment.
30-workers-md-0.yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineTemplate
metadata:
name: <cluster_name>-worker
namespace: <namespace>
spec:
template:
spec:
server: "<vsphere_server>"
template: "<template_name>"
cloneMode: <clone_mode>
datastore: "<worker_system_datastore>"
diskGiB: <worker_system_disk_gib>
memoryMiB: <worker_memory_mib>
numCPUs: <worker_num_cpus>
os: Linux
powerOffMode: <power_off_mode>
network:
devices:
- networkName: "<nic1_network_name>"
machineConfigPoolRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineConfigPool
name: <worker_pool_name>
namespace: <namespace>
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
name: <cluster_name>-worker-bootstrap
namespace: <namespace>
spec:
template:
spec:
files:
- path: /etc/kubernetes/patches/kubeletconfiguration0+strategic.json
owner: "root:root"
permissions: "0644"
content: |
{
"apiVersion": "kubelet.config.k8s.io/v1beta1",
"kind": "KubeletConfiguration",
"protectKernelDefaults": true,
"staticPodPath": null,
"streamingConnectionIdleTimeout": "5m",
"tlsCertFile": "/etc/kubernetes/pki/kubelet.crt",
"tlsPrivateKeyFile": "/etc/kubernetes/pki/kubelet.key"
}
- path: /usr/local/bin/capv-load-local-images.sh
owner: "root:root"
permissions: "0755"
content: |
#!/bin/bash
set -euo pipefail
until mountpoint -q /var/lib/containerd; do
echo "waiting for /var/lib/containerd mount"
sleep 1
done
systemctl restart containerd
until systemctl is-active --quiet containerd; do
echo "waiting for containerd"
sleep 1
done
if [ ! -d "/root/images" ]; then
echo "ERROR: /root/images directory not found" >&2
exit 1
fi
image_count=0
for image_file in /root/images/*.tar; do
if [ -f "$image_file" ]; then
echo "importing image: $image_file"
ctr -n k8s.io images import "$image_file"
image_count=$((image_count + 1))
fi
done
if [ "$image_count" -eq 0 ]; then
echo "ERROR: no tar files found in /root/images" >&2
exit 1
fi
echo "imported $image_count images"
joinConfiguration:
nodeRegistration:
criSocket: /var/run/containerd/containerd.sock
ignorePreflightErrors:
- ImagePull
kubeletExtraArgs:
cloud-provider: external
volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
name: '{{ local_hostname }}'
patches:
directory: /etc/kubernetes/patches
preKubeadmCommands:
- hostnamectl set-hostname "{{ ds.meta_data.hostname }}"
- echo "::1 ipv6-localhost ipv6-loopback localhost6 localhost6.localdomain6" >/etc/hosts
- echo "127.0.0.1 {{ ds.meta_data.hostname }} {{ local_hostname }} localhost localhost.localdomain localhost4 localhost4.localdomain4" >>/etc/hosts
- while ! ip route | grep -q "default via"; do sleep 1; done; echo "NetworkManager started"
- /usr/local/bin/capv-load-local-images.sh
postKubeadmCommands:
- chmod 600 /var/lib/kubelet/config.yaml
users:
- name: boot
sudo: ALL=(ALL) NOPASSWD:ALL
sshAuthorizedKeys:
- "<ssh_public_key>"
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
name: <cluster_name>-md-0
namespace: <namespace>
spec:
clusterName: <cluster_name>
replicas: <worker_replicas>
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 0
maxUnavailable: 1
selector:
matchLabels:
nodepool: md-0
template:
metadata:
labels:
cluster.x-k8s.io/cluster-name: <cluster_name>
nodepool: md-0
spec:
clusterName: <cluster_name>
version: "<k8s_version>"
bootstrap:
configRef:
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
name: <cluster_name>-worker-bootstrap
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineTemplate
name: <cluster_name>-worker
Apply the manifest:
kubectl apply -f 30-workers-md-0.yaml
In the baseline workflow, note the following worker-related rules:
- The main worker manifest does not set failureDomain by default, because the baseline workflow assumes a single datacenter. If you need the worker MachineDeployment to land in a specific VSphereDeploymentZone, add failureDomain as described in Extension Scenarios.
- Some environments add extra runtime-image replacement commands or service restart commands to the KubeadmConfigTemplate. The baseline example intentionally omits them. Add them only when your platform requirements explicitly call for them.
Wait for the cluster to become ready
Cluster creation is asynchronous once all manifests are applied. Monitor progress with the following command:
kubectl -n <namespace> get cluster,kubeadmcontrolplane,machinedeployment,machine -w
Before you continue with verification, wait until the KubeadmControlPlane reports the expected number of ready replicas and all Machine objects reach the Running phase.
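A compact readiness check; readyReplicas and replicas are standard KubeadmControlPlane status and spec fields:
kubectl -n <namespace> get kubeadmcontrolplane <cluster_name> -o jsonpath='{.status.readyReplicas}/{.spec.replicas}{"\n"}'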
Verification
Verify the cluster creation workflow with the following commands.
- Check the CPI delivery resources in the management cluster:
kubectl -n <namespace> get clusterresourceset
kubectl -n <namespace> get clusterresourcesetbinding
- Export the workload cluster kubeconfig:
kubectl -n <namespace> get secret <cluster_name>-kubeconfig -o jsonpath='{.data.value}' | base64 -d > /tmp/<cluster_name>.kubeconfig
- Check that the vSphere CPI daemonset was created in the workload cluster:
kubectl --kubeconfig=/tmp/<cluster_name>.kubeconfig -n kube-system get daemonset
- Check the management cluster objects:
kubectl -n <namespace> get cluster,vspherecluster,kubeadmcontrolplane,machinedeployment,machine,vspheremachine,vspherevm
- Check the workload cluster nodes:
kubectl --kubeconfig=/tmp/<cluster_name>.kubeconfig get nodes -o wide
Confirm the following results:
- vsphere-cloud-controller-manager is present in the workload cluster.
- The control plane and worker nodes are created.
- The nodes eventually become Ready.
Troubleshooting
When the workflow fails, start with the following commands:
kubectl -n <namespace> describe cluster <cluster_name>
kubectl -n <namespace> describe vspherecluster <cluster_name>
kubectl -n <namespace> describe kubeadmcontrolplane <cluster_name>
kubectl -n <namespace> describe machinedeployment <cluster_name>-md-0
kubectl -n <namespace> get cluster,vspherecluster,kubeadmcontrolplane,machinedeployment,machine,vspheremachine,vspherevm
kubectl -n cpaas-system logs deploy/capi-controller-manager
Check the following items first:
- If the CPI resources are not delivered, check ClusterResourceSet=true, the ClusterResourceSet, and the ClusterResourceSetBinding.
- If the ClusterResourceSet exists but no ClusterResourceSetBinding is created, check whether the controller has the required delete permissions on the referenced ConfigMap and Secret resources.
- If the network plugin is not installed, check that the required cluster annotations exist and that the platform controllers have processed them.
- If the cpaas.io/registry-address annotation is missing, check the public registry credential and the platform controller responsible for injecting the annotation.
- If a machine stays in Provisioning, check the MachineConfigPoolReady condition on the VSphereMachine; it shows whether slot assignment failed because of pool binding or a datacenter mismatch.
- If a VM keeps waiting for an IP assignment, check VMware Tools, the static IP settings, and VSphereVM.status.addresses.
- If datastore space runs out, check whether old VM directories or .vmdk files remain in the target datastore.
- If the template system disk size does not match the manifest value, check the actual clone mode first. When a VM is created with linkedClone, the system disk keeps the template size and diskGiB is ignored. Only fullClone uses diskGiB, and in that case diskGiB must not be smaller than the template disk size.
- If the control plane endpoint does not come up, check the load balancer, the VIP, and port 6443.
- If TLS to vCenter fails, check the thumbprint, the vCenter address, and whether proxy settings interfere with the connection.
Follow these rules when reading controller logs:
- deploy/capi-controller-manager runs in the cpaas-system namespace of the global cluster.
- Do not use the workload cluster kubeconfig to read the capi-controller-manager logs.
- If the platform controllers process the cluster network annotations, also check the platform network-controller logs and the platform cluster-lifecycle-controller logs.
Next steps
After the baseline topology is running, continue with Extension Scenarios if you need a second NIC, multiple datacenters, failure domains, additional data disks, or more worker replicas.