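How to Tune Webservice Ingress Timeouts and Request Body Size

This article describes how to configure the three NGINX Ingress parameters exposed under gitlab.webservice.ingress, when they need adjusting, and how to apply the same intent on other Ingress controllers.

When this applies:

Pushing a large repository, LFS objects, or container images fails with 413 Request Entity Too Large.
git clone / git push / project import on a large repository times out after about 10 minutes with 502 Bad Gateway or 504 Gateway Timeout.
A brief 502 Bad Gateway appears after webservice is upgraded or restarted.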

Background

GitLab webservice is exposed through NGINX Ingress. The bundled Helm chart exposes three parameters under spec.helmValues.gitlab.webservice.ingress in the GitlabOfficial CR. They are rendered as NGINX Ingress annotations on the <RELEASE>-webservice-default Ingress object:

apiVersion: operator.alaudadevops.io/v1alpha1
kind: GitlabOfficial
metadata:
  name: sample
spec:
  helmValues:
    gitlab:
      webservice:
        ingress:
          proxyConnectTimeout: 15    # seconds, -> nginx.ingress.kubernetes.io/proxy-connect-timeout
          proxyReadTimeout: 600      # seconds, -> nginx.ingress.kubernetes.io/proxy-read-timeout
          proxyBodySize: "512m"      # size,    -> nginx.ingress.kubernetes.io/proxy-body-size

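Parameter	Meaning	Default
`proxyConnectTimeout`	How long NGINX waits to establish a TCP connection to a webservice Pod.	`15` s
`proxyReadTimeout`	How long NGINX waits between two successive reads from the upstream Pod.	`600` s
`proxyBodySize`	The largest client request body that NGINX accepts and forwards.	`512m`

These defaults suit most installations. The three parameters are closely related: a large repository usually needs both a larger body size and a longer read timeout, so they should generally be tuned together rather than one at a time.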

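Prerequisites

Permission to edit the GitlabOfficial CR (kubectl edit gitlabofficial <NAME> -n <NS>).
The proxyConnectTimeout / proxyReadTimeout / proxyBodySize fields take effect only when the cluster uses the community ingress-nginx controller (kubernetes/ingress-nginx), because the chart renders them under the nginx.ingress.kubernetes.io/* annotation namespace. For other controllers, see "Configuring other Ingress controllers" below.
Check every hop on the request path. If a platform-level LB or reverse proxy sits in front of GitLab's own Ingress, raise the limits there as well: the effective limit is the smallest value anywhere on the chain.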

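Tuning for large repositories / uploads (ingress-nginx)

On installations that host large repositories, LFS objects, or container/package Registry traffic, the following three symptoms usually show up together and share one fix: raise the three parameters together.

Symptom	Parameter to raise
`git push` / UI upload / LFS / Registry returns `413 Request Entity Too Large`. Log: `client intended to send too large body`.	`proxyBodySize`
`git clone` / `git push` / project import hangs for about 10 minutes and then fails with `502`, `504`, or `RPC failed`. Log: `upstream timed out (110: Connection timed out)`.	`proxyReadTimeout`
A brief `502 Bad Gateway` during a webservice rolling update (it disappears once the Pods become Ready).	`proxyConnectTimeout`

Suggested starting values for a GitLab instance with large repositories / LFS / Registry: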

spec:
  helmValues:
    gitlab:
      webservice:
        ingress:
          proxyConnectTimeout: 30      # 15 -> 30; modest bump to absorb pod-restart jitter
          proxyReadTimeout: 1800       # 600 -> 1800; 30 min for large clone/push/import
          proxyBodySize: "5g"          # 512m -> 5g;  fits LFS / registry blobs

Choose concrete values based on actual usage:

Use case	`proxyBodySize`	`proxyReadTimeout`
Source code only, small repositories	`512m` (default)	`600` (default)
Git LFS / large binary assets	`2g` to `5g`	`1800`
Container / Package Registry	`5g` to `10g`	`1800` to `3600`
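
As an alternative to kubectl edit, the three fields can be set non-interactively with a JSON merge patch. The sketch below is illustrative only: the helper names build_patch and apply_ingress_tuning are hypothetical, and it assumes the operator reconciles a patched CR the same way it reconciles one changed through kubectl edit.

```shell
# Build a JSON merge patch that sets only the three ingress fields.
# Arguments: connect timeout (s), read timeout (s), body size (e.g. 5g).
build_patch() {
  printf '{"spec":{"helmValues":{"gitlab":{"webservice":{"ingress":{"proxyConnectTimeout":%d,"proxyReadTimeout":%d,"proxyBodySize":"%s"}}}}}}' "$1" "$2" "$3"
}

# Apply the suggested starting values to a GitlabOfficial CR.
# Arguments: CR name, namespace.
apply_ingress_tuning() {
  kubectl -n "$2" patch gitlabofficial "$1" --type merge \
    -p "$(build_patch 30 1800 5g)"
}
```

Calling apply_ingress_tuning <NAME> <NS> would then send the patch. A merge patch leaves every other field of the CR untouched, which keeps the change easy to review and to revert.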

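proxyConnectTimeout is usually a symptom, not a tuning knob. A brief 502 during a rolling update usually means the webservice Pods start slowly or the readiness probe is misconfigured; fix those first. Raise the value (to 30 to 60 s) only when TCP connection setup is genuinely slow in your environment, for example across AZs. A large value such as 600 only masks real backend failures and leaves pending connections piling up on the NGINX workers.

proxyBodySize constrains only the Ingress layer. GitLab itself also enforces application-level limits under Admin Area → Settings → General → Account and limit (max push size, max attachment size, max import size, and so on). Raise those in step if needed.

Tip: For very large Git operations, prefer SSH (git@) over HTTPS. SSH traffic does not pass through the HTTP Ingress, so none of the three parameters affect it.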

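Configuring other Ingress controllers

The three top-level fields above generate annotations only in the nginx.ingress.kubernetes.io/* namespace, so the following controllers ignore them:

Traefik, HAProxy, Contour, Istio Gateway, and other non-NGINX controllers.
F5 NGINX Inc.'s nginxinc/kubernetes-ingress, which uses a different annotation namespace (nginx.org/*).

For these controllers, set the equivalent annotations directly through gitlab.webservice.ingress.annotations, which is merged into the rendered Ingress object.

Example for F5 NGINX Inc. (nginx.org/*):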

spec:
  helmValues:
    gitlab:
      webservice:
        ingress:
          annotations:
            nginx.org/client-max-body-size: "5g"
            nginx.org/proxy-read-timeout: "1800s"
            nginx.org/proxy-connect-timeout: "30s"

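For Traefik, the equivalent of proxyBodySize is buffering.maxRequestBodyBytes in a Middleware resource, while timeouts are configured at the IngressRoute / EntryPoint level rather than through per-Ingress annotations. Define these resources separately and, optionally, reference them through traefik.ingress.kubernetes.io/router.middlewares in annotations.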
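
Because maxRequestBodyBytes takes a plain byte count rather than an NGINX-style size such as 5g, a small conversion helps keep the two in step. The following is a minimal sketch, not a definitive setup: the Middleware name gitlab-body-limit, the namespace gitlab-system, and both helper functions are hypothetical, and the traefik.io/v1alpha1 API group assumes a recent Traefik release (older releases use traefik.containo.us/v1alpha1).

```shell
# Convert an NGINX-style size ("512m", "5g") to bytes.
to_bytes() {
  num=${1%[kKmMgG]}          # strip a trailing unit suffix, if any
  case "$1" in
    *[kK]) echo $((num * 1024)) ;;
    *[mM]) echo $((num * 1024 * 1024)) ;;
    *[gG]) echo $((num * 1024 * 1024 * 1024)) ;;
    *)     echo "$num" ;;
  esac
}

# Print a Traefik Middleware manifest that caps the request body size.
# Argument: NGINX-style size. Name and namespace below are placeholders.
render_middleware() {
  cat <<EOF
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: gitlab-body-limit
  namespace: gitlab-system
spec:
  buffering:
    maxRequestBodyBytes: $(to_bytes "$1")
EOF
}

render_middleware 5g
```

Piping the output into kubectl apply -f - would create the resource. With these placeholder names, the router.middlewares annotation value would be gitlab-system-gitlab-body-limit@kubernetescrd, following Traefik's <namespace>-<name>@kubernetescrd convention.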

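When global.ingress.provider is set to anything other than nginx, the nginx.ingress.kubernetes.io/* annotations are not injected, but the Ingress resource itself is still rendered and the values in annotations are preserved. If the chosen controller cannot configure these limits through per-Ingress annotations at all, configure them on the controller itself.

Verifying the applied configuration

After updating the CR and waiting for reconciliation to finish, inspect the annotations on the Ingress object: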

kubectl -n <NAMESPACE> get ingress <RELEASE>-webservice-default \
  -o jsonpath='{.metadata.annotations}' | tr ',' '\n' \
  | grep -E 'body-size|read-timeout|connect-timeout'

Expected output (ingress-nginx example):

"nginx.ingress.kubernetes.io/proxy-body-size":"5g"
"nginx.ingress.kubernetes.io/proxy-connect-timeout":"30"
"nginx.ingress.kubernetes.io/proxy-read-timeout":"1800"

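If the values do not match:

Confirm that the CR was updated under spec.helmValues.gitlab.webservice.ingress (not under spec.helmValues.nginx-ingress.controller.*, which belongs to a different layer).
Check that the operator reconciled successfully: kubectl describe gitlabofficial <NAME> -n <NS>.
Confirm that no upstream Ingress / LB in front of GitLab's own Ingress is enforcing a stricter limit.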
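
When comparing values one at a time, it can be convenient to read a single annotation instead of filtering the whole map. The sketch below relies on kubectl's jsonpath accepting backslash-escaped dots inside a key; the helper names escape_key and get_annotation are hypothetical.

```shell
# Escape dots so kubectl's jsonpath treats the annotation key as one field.
escape_key() {
  printf '%s' "$1" | sed 's/\./\\./g'
}

# Print one annotation value from the webservice Ingress.
# Arguments: namespace, release name, annotation key.
get_annotation() {
  kubectl -n "$1" get ingress "$2-webservice-default" \
    -o "jsonpath={.metadata.annotations.$(escape_key "$3")}"
}
```

For example, get_annotation <NAMESPACE> <RELEASE> nginx.ingress.kubernetes.io/proxy-body-size would print only that annotation's value, which is easier to compare against the CR in a script.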

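Are larger values always better?

No. Each parameter has a cost:

proxyBodySize too large: NGINX buffers (or streams) the entire request body, so a single oversized upload can noticeably increase memory and disk usage on the Ingress Controller node. Set it slightly above the real maximum rather than arbitrarily high.
proxyReadTimeout too large: slow or stuck upstream connections occupy NGINX worker connections for a long time and reduce the concurrency available to other users. Pick a value that fits your largest legitimate request rather than "the higher the better".
proxyConnectTimeout too large: real backend failures (Pod not ready, network failure, and so on) are masked because NGINX waits many minutes before returning an error. Keep it small (15 to 60 s) and fix the backend.

References

NGINX Ingress annotations: https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/
F5 NGINX Inc. annotations: https://docs.nginx.com/nginx-ingress-controller/configuration/ingress-resources/advanced-configuration-with-annotations/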
