How to Archive Logs to Third-Party Storage

Currently, logs generated by the platform are stored in the log storage component, but they are retained for only a relatively short period. Enterprises with strict compliance requirements usually need to keep logs longer to satisfy audit needs, and the cost-effectiveness of storage is also a key concern.

For these scenarios, the platform provides a log archiving solution that allows users to transfer logs to external NFS or object storage.

INFO

This document describes how to export or archive logs to storage outside the platform.

If you want ClickHouse to use S3 as its native log data storage or as cold data storage, see How to Use S3 Storage with ClickHouseInstallation.

Transfer to External NFS

Prerequisites

- NFS: Set up the NFS service in advance and determine the NFS path to mount (a mount sketch follows this list).
- Kafka: Obtain the Kafka service address in advance.
- Image addresses: Run the following commands with the CLI tool in the global cluster to obtain the image addresses:
  - Alpine image address: kubectl get daemonset nevermore -n cpaas-system -o jsonpath='{.spec.template.spec.initContainers[0].image}'
  - Razor image address: kubectl get deployment razor -n cpaas-system -o jsonpath='{.spec.template.spec.containers[0].image}'
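The Deployment below mounts the NFS path as a hostPath, so the NFS export should already be mounted at that path on the nodes where the log-exporter Pod may run. A minimal sketch, assuming a hypothetical NFS server at 192.168.1.10 exporting /export/logarchive:

  # On each candidate node: mount the NFS export at the path that will be used as hostPath (example values)
  mkdir -p /cpaas/data/logarchive
  mount -t nfs 192.168.1.10:/export/logarchive /cpaas/data/logarchive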

Create the Log Synchronization Resources

  1. In the left navigation bar, click Cluster Management > Clusters.

  2. Click the action button on the right of the cluster whose logs are to be transferred > CLI Tools.

  3. Modify the YAML according to the parameter descriptions below. When you are done, paste the code into the opened CLI Tools command line and press Enter to execute it.

    Resource type, field path, and description:

    ConfigMap
    - data.export.yml.output.compression: Compression applied to the log text; supported options are none (no compression), zlib, and gzip.
    - data.export.yml.output.file_type: File type of the exported log files; txt, csv, and json are supported.
    - data.export.yml.output.max_size: Maximum size of a single archive file, in MB. When a file exceeds this value, the logs are automatically compressed and archived according to the compression field.
    - data.export.yml.scopes: Scope of the log transfer; currently supported log types are system logs, application logs, Kubernetes logs, and product logs.

    Deployment
    - spec.template.spec.containers[0].command[6]: Kafka service address.
    - spec.template.spec.volumes[3].hostPath.path: NFS path to mount.
    - spec.template.spec.initContainers[0].image: Alpine image address.
    - spec.template.spec.containers[0].image: Razor image address.
    cat << "EOF" |kubectl apply -f -
    apiVersion: v1
    data:
      export.yml: |
        scopes: # Scope of the log transfer; by default only application logs are collected
          system: false  # System logs
          workload: true # Application logs
          kubernetes: false # Kubernetes logs
          platform: false # Product logs
        output:
          type: local
          path: /cpaas/data/logarchive
          layout: TimePrefixed
          # Maximum size of a single archive file, in MB. When exceeded, logs are automatically compressed and archived according to the compression field.
          max_size: 200
          compression: zlib    # Options: none (no compression) / zlib / gzip
          file_type: txt   # Options: txt, csv, json
    kind: ConfigMap
    metadata:
      name: log-exporter-config
      namespace: cpaas-system
    
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        service_name: log-exporter
      name: log-exporter
      namespace: cpaas-system
    spec:
      progressDeadlineSeconds: 600
      replicas: 1
      revisionHistoryLimit: 5
      selector:
        matchLabels:
          service_name: log-exporter
      strategy:
        rollingUpdate:
          maxSurge: 0
          maxUnavailable: 1
        type: RollingUpdate
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: lanaya
            cpaas.io/product: Platform-Center
            service_name: log-exporter
            version: v1
          namespace: cpaas-system
        spec:
          automountServiceAccountToken: true
          affinity:
            podAffinity: {}
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
                - podAffinityTerm:
                    labelSelector:
                      matchExpressions:
                        - key: service_name
                          operator: In
                          values:
                            - log-exporter
                    topologyKey: kubernetes.io/hostname
                  weight: 50
          initContainers:
            - args:
                - -ecx
                - |
                  chown -R 697:697 /cpaas/data/logarchive
              command:
                - /bin/sh
              image: registry.example.cn:60080/ops/alpine:3.16 # Alpine image address
              imagePullPolicy: IfNotPresent
              name: chown
              resources:
                limits:
                  cpu: 100m
                  memory: 200Mi
                requests:
                  cpu: 10m
                  memory: 50Mi
              securityContext:
                runAsUser: 0
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              volumeMounts:
                - mountPath: /cpaas/data/logarchive
                  name: data
          containers:
            - command:
              - /razor
              - consumer
              - --v=1
              - --kafka-group-log=log-nfs
              - --kafka-auth-enabled=true
              - --kafka-tls-enabled=true
              - --kafka-endpoint=192.168.143.120:9092  # Set to the actual Kafka service address in your environment
              - --database-type=file
              - --export-config=/etc/log-export/export.yml
              image: registry.example.cn:60080/ait/razor:v3.16.0-beta.3.g3df8e987  # Razor image address
              imagePullPolicy: Always
              livenessProbe:
                failureThreshold: 5
                httpGet:
                  path: /metrics
                  port: 8080
                  scheme: HTTP
                initialDelaySeconds: 20
                periodSeconds: 10
                successThreshold: 1
                timeoutSeconds: 3
              name: log-export
              ports:
                - containerPort: 80
                  protocol: TCP
              readinessProbe:
                failureThreshold: 5
                httpGet:
                  path: /metrics
                  port: 8080
                  scheme: HTTP
                initialDelaySeconds: 20
                periodSeconds: 10
                successThreshold: 1
                timeoutSeconds: 3
              resources:
                limits:
                  cpu: "2"
                  memory: 4Gi
                requests:
                  cpu: 440m
                  memory: 1280Mi
              securityContext:
                runAsGroup: 697
                runAsUser: 697
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              volumeMounts:
                - mountPath: /etc/secrets/kafka
                  name: kafka-basic-auth
                  readOnly: true
                - mountPath: /etc/log-export
                  name: config
                  readOnly: true
                - mountPath: /cpaas/data/logarchive
                  name: data
          dnsPolicy: ClusterFirst
          nodeSelector:
            kubernetes.io/os: linux
          restartPolicy: Always
          schedulerName: default-scheduler
          securityContext:
            fsGroup: 697
          serviceAccount: lanaya
          serviceAccountName: lanaya
          terminationGracePeriodSeconds: 10
          tolerations:
            - effect: NoSchedule
              key: node-role.kubernetes.io/master
              operator: Exists
            - effect: NoSchedule
              key: node-role.kubernetes.io/control-plane
              operator: Exists
            - effect: NoSchedule
              key: node-role.kubernetes.io/cpaas-system
              operator: Exists
          volumes:
            - name: kafka-basic-auth
              secret:
                defaultMode: 420
                secretName: kafka-basic-auth
            - name: elasticsearch-basic-auth
              secret:
                defaultMode: 420
                secretName: elasticsearch-basic-auth
            - configMap:
                defaultMode: 420
                name: log-exporter-config
              name: config
            - hostPath:
                path: /cpaas/data/logarchive    # NFS path to mount
                type: DirectoryOrCreate
              name: data
    EOF
  4. After the container status changes to Running, you can view the continuously archived logs under the NFS path. The log file directory structure is as follows:

    /cpaas/data/logarchive/$date/$project/$namespace-$cluster/logfile
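    A minimal verification sketch, assuming kubectl access to the cluster and that the NFS path is visible from where you run the commands; the date and project segments are placeholders:

    # Check that the log-exporter Pod created above is Running
    kubectl get pods -n cpaas-system -l service_name=log-exporter

    # List the archived log files under the NFS path
    ls /cpaas/data/logarchive/<date>/<project>/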

Transfer to External S3 Storage

Prerequisites

- S3 storage: Prepare the S3 storage service address in advance, obtain the access_key_id and secret_access_key values, and create the bucket for storing logs (a bucket creation sketch follows this list).
- Kafka: Obtain the Kafka service address in advance.
- Image addresses: Run the following commands with the CLI tool in the global cluster to obtain the image addresses:
  - Alpine image address: kubectl get daemonset nevermore -n cpaas-system -o jsonpath='{.spec.template.spec.initContainers[0].image}'
  - Razor image address: kubectl get deployment razor -n cpaas-system -o jsonpath='{.spec.template.spec.containers[0].image}'
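A minimal sketch for creating the bucket with the AWS CLI, assuming the example endpoint used later in this document; any S3-compatible client (for example, the MinIO client) works as well, and the bucket name is a placeholder:

  # Create the bucket that will hold the archived logs (example endpoint, placeholder name)
  aws s3 mb s3://<bucket_name> --endpoint-url http://192.168.179.86:9000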

Create the Log Synchronization Resources

  1. In the left navigation bar, click Cluster Management > Clusters.

  2. Click the action button on the right of the cluster whose logs are to be transferred > CLI Tools.

  3. Modify the YAML according to the parameter descriptions below. When you are done, paste the code into the opened CLI Tools command line and press Enter to execute it.

    Resource type, field path, and description:

    Secret
    - data.access_key_id: Base64-encoded value of the obtained access_key_id (see the encoding sketch below).
    - data.secret_access_key: Base64-encoded value of the obtained secret_access_key (see the encoding sketch below).

    ConfigMap
    - data.export.yml.output.compression: Compression applied to the log text; supported options are none (no compression), zlib, and gzip.
    - data.export.yml.output.file_type: File type of the exported log files; txt, csv, and json are supported.
    - data.export.yml.output.max_size: Maximum size of a single archive file, in MB. When a file exceeds this value, the logs are automatically compressed and archived according to the compression field.
    - data.export.yml.scopes: Scope of the log transfer; currently supported log types are system logs, application logs, Kubernetes logs, and product logs.
    - data.export.yml.output.s3.bucket_name: Bucket name.
    - data.export.yml.output.s3.endpoint: S3 storage service address.
    - data.export.yml.output.s3.region: Region of the S3 storage service.

    Deployment
    - spec.template.spec.containers[0].command[6]: Kafka service address.
    - spec.template.spec.volumes[3].hostPath.path: Local mount path used for temporarily storing logs. Log files are automatically deleted after they are synchronized to the S3 storage.
    - spec.template.spec.initContainers[0].image: Alpine image address.
    - spec.template.spec.containers[0].image: Razor image address.
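    A minimal sketch for producing the Base64 values used in the Secret below, assuming you have already obtained the credentials from your S3 service; the placeholders are illustrative:

    # Base64-encode the S3 credentials and paste the output into the Secret's data fields
    echo -n '<access_key_id>' | base64
    echo -n '<secret_access_key>' | base64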
    cat << "EOF" |kubectl apply -f -
    apiVersion: v1
    type: Opaque
    data:
      # The following two keys are required
      access_key_id: bWluaW9hZG1pbg==  # Base64-encoded value of the obtained access_key_id
      secret_access_key: bWluaW9hZG1pbg==  # Base64-encoded value of the obtained secret_access_key
    kind: Secret
    metadata:
      name: log-export-s3-secret
      namespace: cpaas-system
    
    ---
    apiVersion: v1
    data:
      export.yml: |
        scopes: # Scope of the log transfer; by default only application logs are collected
          system: false  # System logs
          workload: true # Application logs
          kubernetes: false # Kubernetes logs
          platform: false # Product logs
        output:
          type: s3
          path: /cpaas/data/logarchive
    
          s3:
            s3forcepathstyle: true
            bucket_name: bucket_name_s3            # Name of the prepared bucket
            endpoint: http://192.168.179.86:9000   # Address of the prepared S3 storage service
            region: "dummy"                        # Region of the S3 storage service
            access_secret: log-export-s3-secret
            insecure: true
    
          layout: TimePrefixed
          # Maximum size of a single archive file, in MB. When exceeded, logs are automatically compressed and archived according to the compression field.
          max_size: 200
          compression: zlib                        # Options: none (no compression) / zlib / gzip
          file_type: txt                           # Options: txt, csv, json
    kind: ConfigMap
    metadata:
      name: log-exporter-config
      namespace: cpaas-system
    
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        service_name: log-exporter
      name: log-exporter
      namespace: cpaas-system
    spec:
      progressDeadlineSeconds: 600
      replicas: 1
      revisionHistoryLimit: 5
      selector:
        matchLabels:
          service_name: log-exporter
      strategy:
        rollingUpdate:
          maxSurge: 0
          maxUnavailable: 1
        type: RollingUpdate
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: lanaya
            cpaas.io/product: Platform-Center
            service_name: log-exporter
            version: v1
          namespace: cpaas-system
        spec:
          affinity:
            podAffinity: {}
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
                - podAffinityTerm:
                    labelSelector:
                      matchExpressions:
                        - key: service_name
                          operator: In
                          values:
                            - log-exporter
                    topologyKey: kubernetes.io/hostname
                  weight: 50
          initContainers:
            - args:
                - -ecx
                - |
                  chown -R 697:697 /cpaas/data/logarchive
              command:
                - /bin/sh
              image: registry.example.cn:60080/ops/alpine:3.16 # Alpine image address
              imagePullPolicy: IfNotPresent
              name: chown
              resources:
                limits:
                  cpu: 100m
                  memory: 200Mi
                requests:
                  cpu: 10m
                  memory: 50Mi
              securityContext:
                runAsUser: 0
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              volumeMounts:
                - mountPath: /cpaas/data/logarchive
                  name: data
          containers:
            - command:
                - /razor
                - consumer
                - --v=1
                - --kafka-group-log=log-s3
                - --kafka-auth-enabled=true
                - --kafka-tls-enabled=true
                - --kafka-endpoint=192.168.179.86:9092  # Set to the actual Kafka service address in your environment
                - --database-type=file
                - --export-config=/etc/log-export/export.yml
              image: registry.example.cn:60080/ait/razor:v3.16.0-beta.3.g3df8e987  # Razor image address
              imagePullPolicy: Always
              livenessProbe:
                failureThreshold: 5
                httpGet:
                  path: /metrics
                  port: 8080
                  scheme: HTTP
                initialDelaySeconds: 20
                periodSeconds: 10
                successThreshold: 1
                timeoutSeconds: 3
              name: log-export
              ports:
                - containerPort: 80
                  protocol: TCP
              readinessProbe:
                failureThreshold: 5
                httpGet:
                  path: /metrics
                  port: 8080
                  scheme: HTTP
                initialDelaySeconds: 20
                periodSeconds: 10
                successThreshold: 1
                timeoutSeconds: 3
              resources:
                limits:
                  cpu: "2"
                  memory: 4Gi
                requests:
                  cpu: 440m
                  memory: 1280Mi
              securityContext:
                runAsGroup: 697
                runAsUser: 697
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              volumeMounts:
                - mountPath: /etc/secrets/kafka
                  name: kafka-basic-auth
                  readOnly: true
                - mountPath: /etc/log-export
                  name: config
                  readOnly: true
                - mountPath: /cpaas/data/logarchive
                  name: data
          dnsPolicy: ClusterFirst
          nodeSelector:
            kubernetes.io/os: linux
          restartPolicy: Always
          schedulerName: default-scheduler
          securityContext:
            fsGroup: 697
          serviceAccount: lanaya
          serviceAccountName: lanaya
          terminationGracePeriodSeconds: 10
          tolerations:
            - effect: NoSchedule
              key: node-role.kubernetes.io/master
              operator: Exists
            - effect: NoSchedule
              key: node-role.kubernetes.io/control-plane
              operator: Exists
            - effect: NoSchedule
              key: node-role.kubernetes.io/cpaas-system
              operator: Exists
          volumes:
            - name: kafka-basic-auth
              secret:
                defaultMode: 420
                secretName: kafka-basic-auth
            - name: elasticsearch-basic-auth
              secret:
                defaultMode: 420
                secretName: elasticsearch-basic-auth
            - configMap:
                defaultMode: 420
                name: log-exporter-config
              name: config
            - hostPath:
                path: /cpaas/data/logarchive    # Local path for temporarily storing logs
                type: DirectoryOrCreate
              name: data
    EOF
  4. After the container status changes to Running, you can view the continuously archived logs in the bucket.
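    A minimal verification sketch, assuming the AWS CLI and the example endpoint used above; the bucket name is a placeholder:

    # Check that the log-exporter Pod created above is Running
    kubectl get pods -n cpaas-system -l service_name=log-exporter

    # List the archived objects in the bucket (example endpoint, placeholder name)
    aws s3 ls s3://<bucket_name> --recursive --endpoint-url http://192.168.179.86:9000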