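404 errors occur when multiple gateways are configured with the same TLS certificate
Problem Description
Symptoms
Requests sent over HTTP/2 through the Istio Ingress Gateway return 404 errors.
This is a known issue in the Istio community. For more information, see 404 errors occur when multiple gateways configured with same TLS certificate.
Analysis
If multiple gateways are configured with the same TLS certificate, browsers that reuse HTTP/2 connections (that is, most browsers) receive a 404 error when they access a second host after a connection to another host has already been established.
Example: Suppose a.example.com and b.example.com use the same TLS certificate and are served by the same Istio Ingress Gateway, but are configured in two different Gateway resources. An HTTP/2 browser client that visits a.example.com and then b.example.com receives a 404 error for the second host. The browser reuses the HTTP/2 connection it opened for a.example.com, so the request for b.example.com arrives on a connection that the gateway associates only with a.example.com.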
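The mismatch described above can be approximated without a browser. The following is a minimal sketch under stated assumptions, not a procedure from the original report: it prints a curl command that opens the TLS connection with SNI a.example.com but sends the request with the authority b.example.com, which is roughly what a browser does when it reuses the connection. The helper name print_repro_command and the address 203.0.113.10 are hypothetical; replace the address with the external IP of your ingress gateway.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a curl command whose TLS SNI is $1
# while the HTTP/2 :authority (set through the Host header) is $2.
# $3 is the ingress gateway address, an assumption for illustration.
print_repro_command() {
  sni_host=$1
  authority=$2
  gateway_ip=$3
  # --resolve pins the SNI host to the gateway address; -k skips certificate
  # verification; -w prints only the HTTP status code.
  printf 'curl -sk --http2 -o /dev/null -w "%%{http_code}\\n" --resolve %s:443:%s -H "Host: %s" https://%s/\n' \
    "$sni_host" "$gateway_ip" "$authority" "$sni_host"
}

print_repro_command a.example.com b.example.com 203.0.113.10
```

On an affected setup, running the printed command is expected to report 404, while the same command with matching SNI and Host values should reach the application.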
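Troubleshooting
You can use the following script to quickly check whether your environment contains Gateway configurations that match the problem description. Run the script on a master node of the business cluster where the Istio Ingress Gateway is deployed.
NOTE
- The script depends on the jq tool. If jq is not installed on your cluster nodes, install it before running the script. Download link: jq download.
- The jq version must be 1.7 or later.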
#!/bin/bash
nslist=$(kubectl get ns -o jsonpath='{.items[*].metadata.name}')
declare -A cred_map
echo "begin to check gw"
for ns in $nslist; do
  # Get gw resources
  #echo "begin to list gw in $ns"
  gateways=$(kubectl get gw -n "$ns" -o jsonpath='{.items[*].metadata.name}')
  # Get the YAML and JSON of each Gateway resource
  for gateway in $gateways; do
    gateway_yaml=$(kubectl get gw -n "$ns" "$gateway" -o yaml)
    gateway_json=$(kubectl get gw -n "$ns" "$gateway" -o json)
    tls_lines=$(echo "$gateway_yaml" | grep 'credentialName:')
    secname=$(echo "$gateway_yaml" | grep 'credentialName:' | awk '{print $2}')
    if [[ -n "$tls_lines" ]]; then
      # Check whether this credentialName was already seen in another Gateway
      found=false
      for key in "${!cred_map[@]}"; do
        if [[ "$key" == "$secname" ]]; then
          found=true
          break
        fi
      done
      if [[ $found == true ]]; then
        echo -e "\033[31m cred already exist in other gw resource ,please must merge hosts in the gw resource ${cred_map[$secname]} ,and delete this gw! \033[0m"
        hosts=$(echo "$gateway_json" | jq -r '.spec.servers[] | .hosts[]')
        # Output Gateway name and hosts information
        echo -e "\033[31m invalid gw name namespace: $gateway , $ns \033[0m"
        echo "Hosts: $hosts"
      else
        echo "first get secret name $secname the gw is $gateway $ns"
        cred_map["$secname"]="$gateway~$ns"
      fi
      #for key in "${!cred_map[@]}"; do
      #echo "Key: $key, Value: ${cred_map[$key]}"
      #done
      echo ""
    fi
  done
done
Example script output:
[root@idp-lihuang-w9x9w-9n9jv-cluster0-dt2n4 gwtls]# sh check.sh
begin to check gw
first get secret name jiaxiurc-com the gw is drawdb-gateway drawdb
first get secret name gyssg-com the gw is ec jxb-ec
first get secret name nexus the gw is nexus-gateway nexus
cred already exist in other gw resource, please must merge hosts in the gw resource drawdb-gateway~drawdb, and delete this gw!
invalid gw name namespace: authory-gateway, nm-edu-authory
Hosts: rzzx-test.jiaxiurc.com
rzzx-test.jiaxiurc.com
If the output contains a message similar to "cred already exist in other gw resource, please must merge hosts in the gw resource drawdb-gateway~drawdb, and delete this gw!", your environment is affected by the issue described in this article.
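The script above records one credentialName value per Gateway, so it can miss a Gateway whose server entries reference the same secret more than once. As a complementary sketch, the helper below counts references per server entry instead. It is an illustration under assumptions, not part of the original script: the function name find_shared_credentials and the tab-separated input format (namespace/name, then credentialName) are hypothetical, and the sample rows reuse names from this article.

```shell
#!/bin/sh
# Hypothetical helper: read "namespace/name<TAB>credentialName" lines on stdin,
# one line per Gateway server entry, and report every credentialName that is
# referenced by more than one server entry.
find_shared_credentials() {
  awk -F'\t' '
    NF >= 2 && $2 != "" {
      count[$2]++
      users[$2] = (users[$2] == "" ? $1 : users[$2] ", " $1)
      if (!($2 in order)) { order[$2] = ++n; names[n] = $2 }
    }
    END {
      # Print in first-seen order so the output is stable.
      for (i = 1; i <= n; i++)
        if (count[names[i]] > 1)
          printf "%s is referenced by: %s\n", names[i], users[names[i]]
    }'
}

# Sample input standing in for live cluster data.
printf 'istio-system/default2\ttesthl\nistio-system/default\ttesthl\nnexus/nexus-gateway\tnexus\n' |
  find_shared_credentials

# Against a live cluster (assumes kubectl and jq are installed), the input
# could be produced like this:
# kubectl get gw -A -o json |
#   jq -r '.items[] | . as $g | .spec.servers[]?
#          | select(.tls.credentialName != null)
#          | [$g.metadata.namespace + "/" + $g.metadata.name, .tls.credentialName]
#          | @tsv' |
#   find_shared_credentials
```

A Gateway that appears twice on the same output line has two server entries sharing one secret, which corresponds to the second error example in Solution 1.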
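Solution Overview
We provide two solutions for this issue. Review the comparison below and choose one to implement in your environment.
Solution comparison
Solution 1: Merge Gateway resources
Description
Merge the multiple Gateway resources that use the same TLS certificate into one.
Procedure
- Merge the Gateway resources into a single Gateway configuration that lists all hosts in one spec.servers.hosts list, or use a wildcard domain.
- Update the related VirtualService resources so that they reference the merged Gateway.
For example, in the original configuration, two Gateways use the same TLS certificate testhl: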
# Gateway Error Example 1: Two Gateways use the same TLS certificate
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: default2
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - "asm2.test.com"
    tls:
      mode: SIMPLE
      credentialName: "testhl"
    port:
      name: https
      number: 443
      protocol: HTTPS
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: default
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - "asm1.test.com"
    tls:
      mode: SIMPLE
      credentialName: "testhl"
    port:
      name: https
      number: 443
      protocol: HTTPS
---
# Gateway Error Example 2: The same Gateway uses the same TLS certificate in different Hosts sections
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: error-3
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - "asm1.test.com"
    tls:
      mode: SIMPLE
      credentialName: "testhl"
    port:
      name: https-2
      number: 443
      protocol: HTTPS
  - hosts:
    - "asm2.test.com"
    tls:
      mode: SIMPLE
      credentialName: "testhl"
    port:
      name: https
      number: 443
      protocol: HTTPS
---
# VirtualService Example
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: default2
  namespace: bus-system
spec:
  gateways:
  - istio-system/default2
  hosts:
  - asm2.test.com
  http:
  - route:
    - destination:
        host: asm-0.testhl.svc.cluster.local
        port:
          number: 80
...
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: default
  namespace: bus-system
spec:
  gateways:
  - istio-system/default
  hosts:
  - asm1.test.com
...
Correct configuration after merging:
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: default2
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - "asm2.test.com"
    - "asm1.test.com"
    tls:
      mode: SIMPLE
      credentialName: "testhl"
    port:
      name: https
      number: 443
      protocol: HTTPS
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: default1
  namespace: istio-system
spec:
  gateways:
  - istio-system/default2
  hosts:
  - asm2.test.com
...
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: default2
  namespace: istio-system
spec:
  gateways:
  - istio-system/default2
  hosts:
  - asm1.test.com
...
You can also use a wildcard domain:
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: default2
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - "*.test.com"
    tls:
      mode: SIMPLE
      credentialName: "testhl"
    port:
      name: https
      number: 443
      protocol: HTTPS
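Summary of steps
- Merge the spec.servers.hosts lists of the Gateway resources so that all Gateway resources using the same certificate are combined into a single server entry.
- Update the VirtualService resources so that they reference the merged Gateway.
- Make sure that destination in each VirtualService uses the Kubernetes FQDN format (for example, asm-0.testhl.svc.cluster.local).
Important: After completing the steps above, run the check script again to confirm that the issue is resolved.
Solution 2: Return response code 421
Description
Returning status code 421 (Misdirected Request) when the problem occurs prompts the client to establish a new connection, which is then routed to the correct target host.
Procedure
Apply the following EnvoyFilter configuration: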
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: misdirected-request
  namespace: istio-system
spec:
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: GATEWAY
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.http_connection_manager
            subFilter:
              name: envoy.filters.http.router
    patch:
      operation: INSERT_BEFORE
      value:
        name: envoy.lua
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
          inlineCode: |
            -- Strip an optional ":port" suffix from the :authority value.
            local function get_host_from_authority(authority)
              local colon_pos = authority:find(":", 1, true)
              return colon_pos and authority:sub(1, colon_pos - 1) or authority
            end
            -- Respond with 421 when the TLS SNI and the request authority differ.
            function envoy_on_request(request_handle)
              local streamInfo = request_handle:streamInfo()
              local requestedServerName = streamInfo:requestedServerName()
              if requestedServerName ~= "" then
                local host = get_host_from_authority(request_handle:headers():get(":authority"))
                local isWildcard = string.sub(requestedServerName, 1, 2) == "*."
                -- Plain-text find (4th argument true) so "." and "-" in the domain are not treated as Lua pattern characters.
                if isWildcard and not string.find(host, string.sub(requestedServerName, 3), 1, true) then
                  request_handle:respond({[":status"] = "421"}, "Misdirected Request")
                elseif not isWildcard and requestedServerName ~= host then
                  request_handle:respond({[":status"] = "421"}, "Misdirected Request")
                end
              end
            end
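To reason about which requests the filter rejects, its decision can be restated outside Envoy. The following is a plain-shell approximation written for illustration only; the function names is_misdirected and check are hypothetical, and the wildcard branch uses a literal substring match. It takes the SNI of the connection and the request authority, and succeeds when the filter would answer with 421.

```shell
#!/bin/sh
# Hypothetical helper approximating the Lua filter's decision.
# $1 = SNI requested at the TLS handshake, $2 = HTTP :authority of the request.
# Returns 0 (true) when the request would be answered with 421.
is_misdirected() {
  sni=$1
  host=${2%%:*}               # drop an optional :port, like get_host_from_authority
  [ -n "$sni" ] || return 1   # no SNI recorded: the filter takes no action
  case $sni in
    '*.'*)
      # Wildcard SNI: compare against the part after "*." as a literal substring.
      suffix=${sni#'*.'}
      case $host in
        *"$suffix"*) return 1 ;;
        *) return 0 ;;
      esac
      ;;
    *)
      # Exact SNI: any difference from the authority host is a mismatch.
      [ "$sni" != "$host" ]
      ;;
  esac
}

check() {
  if is_misdirected "$1" "$2"; then
    echo "SNI=$1 authority=$2 -> 421"
  else
    echo "SNI=$1 authority=$2 -> passed to the router"
  fi
}

check a.example.com a.example.com:443
check a.example.com b.example.com
```

The second call models the reused connection from the Analysis section: the connection was opened for a.example.com, the request targets b.example.com, and the 421 response tells the client to retry on a new connection.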