Planning Infra Nodes for Logging Storage

This guide explains the planning considerations for running Logging storage plugins on dedicated Kubernetes infra nodes.

Objectives

  • Isolate resources: Prevent contention with business workloads.
  • Enforce stability: Reduce evictions and scheduling conflicts.
  • Simplify management: Centralize infra components with consistent scheduling rules.

Where to Configure Placement

  • For Alauda Container Platform Log Storage for Elasticsearch, configure placement through spec.valuesOverride.ait/chart-alauda-log-center.global.nodeSelector and spec.valuesOverride.ait/chart-alauda-log-center.global.tolerations in Installation.
  • For Alauda Container Platform Log Storage for ClickHouse, configure placement through Advanced Configuration in the console or through spec.config.components.nodeSelector and spec.config.components.tolerations in Installation.
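As a sketch of what those field paths look like in practice, the fragments below show placement overrides inside an Installation spec. Only the field paths themselves come from the list above; the label key and toleration values (node-role.kubernetes.io/infra) are assumptions borrowed from the Troubleshooting example later in this guide, so substitute the labels and taints used by your own infra node group.

```yaml
# Elasticsearch plugin: Installation spec fragment (sketch).
# "ait/chart-alauda-log-center" is a single map key containing a slash.
spec:
  valuesOverride:
    ait/chart-alauda-log-center:
      global:
        nodeSelector:
          node-role.kubernetes.io/infra: ""   # assumption: infra node label
        tolerations:
        - key: node-role.kubernetes.io/infra  # assumption: infra node taint
          operator: Exists
          effect: NoSchedule
---
# ClickHouse plugin: Installation spec fragment (sketch).
spec:
  config:
    components:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      tolerations:
      - key: node-role.kubernetes.io/infra
        operator: Exists
        effect: NoSchedule
```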

Do not patch the generated StatefulSets, Deployments, or ClickHouseInstallation resources to place Logging storage workloads on infra nodes; always configure placement through the Installation resource as described above.

Before You Configure Placement

  1. Plan the infra nodes according to Cluster Node Planning.
  2. Confirm whether your storage uses LocalVolume or other PVs with spec.nodeAffinity.
  3. Make sure the selected infra nodes can satisfy both the scheduling rules and the storage placement constraints.
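To illustrate step 1, an infra node prepared for these workloads might carry a label and taint like the following sketch. The node name is hypothetical, and the label value and taint effect are assumptions; align them with your Cluster Node Planning (the taint key and value here match the ones shown in the Troubleshooting example below).

```yaml
apiVersion: v1
kind: Node
metadata:
  name: infra-node-1                      # hypothetical node name
  labels:
    node-role.kubernetes.io/infra: ""     # assumption: infra node label
spec:
  taints:
  - key: node-role.kubernetes.io/infra    # matches the taint in Troubleshooting
    value: "true"
    effect: NoSchedule
```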

Check Local PVs and nodeAffinity

If your components use local storage (for example, TopoLVM or local PVs), check whether the PVs have spec.nodeAffinity. If they do, either:

  1. Add all nodes referenced by pv.spec.nodeAffinity to the infra node group, or
  2. Redeploy components using a storage class without node affinity (for example Ceph/RBD).

Example (Elasticsearch):

# 1) Get ES PVCs
kubectl get pvc -n cpaas-system | grep elastic

# 2) Inspect one PV
kubectl get pv elasticsearch-log-node-pv-192.168.135.243 -o yaml

If the PV shows:

spec:
  local:
    path: /cpaas/data/elasticsearch/data
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - 192.168.135.243

Then Elasticsearch data is pinned to node 192.168.135.243. Ensure that node is part of the infra node group, or migrate storage.

The same principle applies to any Logging storage component that uses node-bound local storage.

Historical Kafka and ZooKeeper Nodes

For historical reasons, Kafka and ZooKeeper are scheduled to nodes carrying dedicated labels. Ensure those nodes are also labeled and tainted as infra:

kubectl get nodes -l kafka=true
kubectl get nodes -l zk=true
# Add the listed nodes into infra nodes as above

Troubleshooting

Common issues and fixes:

  • Pods stuck in Pending. Diagnosis: kubectl describe pod <pod> | grep Events. Solution: add tolerations or adjust selectors.
  • Taint/toleration mismatch. Diagnosis: kubectl describe node <node> | grep Taints. Solution: add matching tolerations to the workloads.
  • Resource starvation. Diagnosis: kubectl top nodes -l node-role.kubernetes.io/infra. Solution: scale infra nodes or tune resource requests.

Example error:

Events:
  Warning  FailedScheduling  2m  default-scheduler  0/3 nodes are available:
  3 node(s) had untolerated taint {node-role.kubernetes.io/infra: true}

Fix: add matching tolerations to the plugin configuration and make sure the selected infra nodes also satisfy the required storage placement constraints.
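For the error above, a toleration that exactly matches the reported taint could look like this fragment. It is shown under the ClickHouse-style spec.config.components path as a sketch; the same tolerations block applies under the Elasticsearch valuesOverride path, and the nodeSelector label is an assumption about how your infra nodes are labeled.

```yaml
spec:
  config:
    components:
      nodeSelector:
        node-role.kubernetes.io/infra: ""  # assumption: infra node label
      tolerations:
      - key: node-role.kubernetes.io/infra # matches the taint in the error
        operator: Equal
        value: "true"
        effect: NoSchedule
```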