NeMo Guardrails

NeMo Guardrails provides programmable safety controls for LLM applications. It runs as a standalone service in front of the model and can enforce:

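Sensitive data detection (for example, PII in inputs and outputs).
Content policies (for example, forbidden topics or competitor mentions).
Custom validation flows written in Colang and Python.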

The TrustyAI Operator exposes NeMo Guardrails through the NemoGuardrails custom resource (CR). This document focuses on a basic deployment that can:

Protect an existing model that is already deployed on a serving platform.
Use NeMo Guardrails for input/output filtering and simple business rules.

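Contents

Prerequisites
Architecture
NeMo Configuration ConfigMap
config.yaml Basics
rails.co Basics
actions.py Basics
Deploying the NemoGuardrails Custom Resource
Authentication (When Auth Is Enabled)
How to Obtain a Token
Accessing the NeMo Guardrails API
Basic Chat Completion (Allowed Content)
Message Length Guardrail Example
Forbidden Content Example
Sensitive Data Detection Example
Further Reading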

Prerequisites

The TrustyAI Operator is installed (see Install TrustyAI).
A model exposing an OpenAI-compatible API is deployed on a serving platform (for example, vLLM).

Architecture

The overall request path is:

Client → NeMo Guardrails service → model predictor (OpenAI-compatible API)

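NeMo Guardrails:

Receives OpenAI-style chat/completions requests.
Runs the configured rails (sensitive data detection, length checks, forbidden topics, and so on).
Forwards allowed requests to the underlying model.
For blocked requests, returns an appropriate assistant message without calling the model.

The TrustyAI Operator manages the NeMo Guardrails server Pod and Service through the NemoGuardrails CR. You can then expose the Service externally with the ingress or gateway solution used in your cluster.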

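NeMo Configuration ConfigMap

NeMo Guardrails expects a configuration directory that typically contains:

config.yaml: the main NeMo Guardrails configuration file.
rails.co: Colang flows that implement input/output rails and other control logic.
actions.py: Python actions that Colang flows can call.

Example NeMo configuration ConfigMap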

apiVersion: v1
kind: ConfigMap
metadata:
  name: nemo-config
  namespace: <your-namespace>
data:
  config.yaml: |
    models:
      - type: main
        engine: openai
        parameters:
          # Internal URL of the model predictor (OpenAI-compatible).
          # Use http:// here if the predictor is served over plain HTTP.
          openai_api_base: "https://<model-predictor-host>:<port>/v1"
          model_name: "<model-name>"

    rails:
      config:
        sensitive_data_detection:
          input:
            entities:
              - EMAIL_ADDRESS
          output:
            entities:
              - EMAIL_ADDRESS
      input:
        flows:
          - detect sensitive data on input
          - check message length
          - check forbidden words
      output:
        flows:
          - detect sensitive data on output

  rails.co: |
    define flow check message length
      $length_result = execute check_message_length
      if $length_result == "blocked_too_long"
        bot inform message too long
        stop
      if $length_result == "warning_long"
        bot warn message long

    define bot inform message too long
      "Please keep your message under 100 words for better assistance."

    define bot warn message long
      "That's quite detailed! I'll help as best I can."

    define flow check forbidden words
      $forbidden_result = execute check_forbidden_words
      if $forbidden_result != "allowed"
        bot inform forbidden content
        stop

    define bot inform forbidden content
      "I can't help with that type of request. Please ask something else."

  actions.py: |
    from typing import Optional

    from nemoguardrails.actions import action


    @action(is_system_action=True)
    async def check_message_length(context: Optional[dict] = None) -> str:
        """
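        Example custom action invoked from Colang:
          $length_result = execute check_message_length
        Input:
          - context: dict-like object provided by NeMo; the latest user message is stored under the "user_message" key.
        Output:
          - A short string interpreted by the Colang flow, for example:
            * "blocked_too_long": the message is too long and is blocked.
            * "warning_long": the message is long; a warning is sent but processing continues.
            * "allowed": the length is acceptable.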
        """
        user_message = (context or {}).get("user_message", "")
        word_count = len(user_message.split())
        max_words = 20  # Sample threshold; the canned reply in rails.co mentions 100 words, so align the two.

        if word_count > max_words:
            return "blocked_too_long"
        if word_count > int(max_words * 0.8):
            return "warning_long"
        return "allowed"


    @action(is_system_action=True)
    async def check_forbidden_words(context: Optional[dict] = None) -> str:
        """
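        Example custom action for a simple forbidden-word check.
        Invoked from Colang:
          $forbidden_result = execute check_forbidden_words
        Returns:
          - "allowed" when no forbidden word is found.
          - A value other than "allowed" (for example, "blocked_password") when a forbidden word is detected.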
        """
        user_message = (context or {}).get("user_message", "").lower()

        forbidden_words = ["password", "hack", "exploit", "illegal", "violence"]
        for word in forbidden_words:
            if word in user_message:
                return f"blocked_{word}"

        return "allowed"

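config.yaml Basics

In the example above, config.yaml:

Declares one backend model in the models section and configures its OpenAI-compatible endpoint through openai_api_base and model_name.
Configures the built-in PII detection under rails.config.sensitive_data_detection:
- input.entities / output.entities list the entity types to protect (for example, EMAIL_ADDRESS, PERSON).
- When the detect sensitive data on input / detect sensitive data on output rails run, NeMo invokes its internal detector according to this configuration.
Defines which rails run, and in what order, through rails.input.flows and rails.output.flows:
- detect sensitive data on input / detect sensitive data on output are built-in rails driven by sensitive_data_detection.
- check message length and check forbidden words are custom rails implemented in rails.co; they call the Python actions in actions.py.
Replace <model-predictor-host>, <port>, and <model-name> with the actual predictor service URL and model name.
Make sure the backend predictor implements the OpenAI-compatible /v1/chat/completions API.
For more advanced config.yaml options (additional rail types, prompts, tracing, knowledge bases, and integration with other safety providers), see the official NeMo Guardrails YAML configuration reference: Nvidia NeMo Guardrails Configuration.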
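
Because each custom entry in rails.input.flows has to match a define flow name in rails.co, a quick check before deployment can catch typos. The following is a minimal sketch and not part of NeMo Guardrails: the helper names (defined_flows, missing_flows) and the BUILT_IN_FLOWS set are assumptions made for illustration, and the flow list is written out by hand rather than parsed from config.yaml.

```python
import re

# Built-in rails referenced in this document; they need no definition in rails.co.
BUILT_IN_FLOWS = {
    "detect sensitive data on input",
    "detect sensitive data on output",
}


def defined_flows(colang_source: str) -> set[str]:
    """Return the names of all `define flow <name>` blocks in the Colang source."""
    pattern = re.compile(r"^\s*define flow (.+?)\s*$", re.MULTILINE)
    return {match.group(1) for match in pattern.finditer(colang_source)}


def missing_flows(configured: list[str], colang_source: str) -> list[str]:
    """List configured flows that are neither built-in nor defined in rails.co."""
    known = defined_flows(colang_source) | BUILT_IN_FLOWS
    return [name for name in configured if name not in known]


RAILS_CO = """
define flow check message length
  $length_result = execute check_message_length

define bot inform message too long
  "Please keep your message under 100 words for better assistance."

define flow check forbidden words
  $forbidden_result = execute check_forbidden_words
"""

INPUT_FLOWS = [
    "detect sensitive data on input",
    "check message length",
    "check forbidden words",
]

print(missing_flows(INPUT_FLOWS, RAILS_CO))              # []
print(missing_flows(["check mesage length"], RAILS_CO))  # the misspelled name is reported
```

An empty list means every configured flow is accounted for. Extend BUILT_IN_FLOWS if the configuration uses other built-in rails.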

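`rails.co` Basics

In this example, rails.co defines two custom input rails:

define flow check message length:
- The flow name check message length must match the entry in rails.input.flows in config.yaml.
- $length_result = execute check_message_length calls the Python action check_message_length in actions.py, passing the current conversation context.
- The if statements branch on the returned string:
  - call a bot ... block to send a reply (for example, bot inform message too long),
  - use stop to end further processing and prevent the LLM call,
  - or do nothing when "allowed" is returned, letting processing continue with the next rail.
define flow check forbidden words:
- Follows the same pattern: it calls the check_forbidden_words action and blocks the request only when the return value is not "allowed".

Additional notes:

bot ... blocks (for example, bot inform message too long) define canned assistant messages. When a rail decides to stop, the message is sent directly to the client without calling the backend LLM.
Rails defined in rails.co run in the order listed in rails.input.flows / rails.output.flows. Built-in rails such as detect sensitive data on input run before or after the custom rails depending on their position in the list.
The Colang shown here is a minimal example; the language supports more complex flows (multiple steps, variables, additional actions). For the full syntax and feature set, see the Colang reference in the NeMo Guardrails documentation. For an introduction, see: Colang Getting Started.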
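
The ordering and stop semantics described above can be illustrated with a small pure-Python model. This is a conceptual sketch only: it does not use or reproduce the NeMo Guardrails runtime, and the Rail type, the two rail functions, and run_input_rails are hypothetical names.

```python
from typing import Callable, Optional

# Hypothetical model: a rail returns a canned reply to block the request, or None to pass.
Rail = Callable[[str], Optional[str]]


def length_rail(message: str) -> Optional[str]:
    """Block messages with more than 20 words (mirrors the sample threshold)."""
    if len(message.split()) > 20:
        return "Please keep your message under 100 words for better assistance."
    return None


def forbidden_rail(message: str) -> Optional[str]:
    """Block messages that contain one of two sample forbidden words."""
    if any(word in message.lower() for word in ("password", "hack")):
        return "I can't help with that type of request. Please ask something else."
    return None


def run_input_rails(message: str, rails: list[Rail]) -> Optional[str]:
    """Run rails in list order; return the first blocking reply, or None if all pass."""
    for rail in rails:
        reply = rail(message)
        if reply is not None:
            return reply  # later rails and the model are never reached
    return None  # the request would be forwarded to the model


RAILS = [length_rail, forbidden_rail]
print(run_input_rails("hello", RAILS))              # None
print(run_input_rails("help me hack this", RAILS))  # forbidden-content reply
```

A message that is both too long and contains a forbidden word gets the length reply here, because the length rail comes first in the list.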

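`actions.py` Basics

The actions.py file contains Python functions decorated with @action, which Colang flows call with execute <action_name>:

An action receives a context object, a dict-like structure populated by NeMo (for example, it holds the latest user message under the "user_message" key).
An action returns a value (usually a short string) that the Colang flow uses to decide which branch to take.

In this example:

check_message_length:
- Reads context["user_message"], counts the words, and returns:
  - "blocked_too_long": the message is too long and should be rejected.
  - "warning_long": the message is long; a warning is sent but processing continues.
  - "allowed": the message length is acceptable.
check_forbidden_words:
- Lowercases the user message, searches for forbidden words, and returns:
  - "allowed": no forbidden word was found.
  - A value other than "allowed" (for example, "blocked_password"): a forbidden word was detected.

The same pattern extends to more complex guardrails, such as structural checks, numeric thresholds, or calls to external services.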
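
One thing to watch when extending this pattern: check_forbidden_words uses substring matching, so "hack" also matches "hackathon". A possible variation, sketched below as a plain function without the @action decorator so that it runs outside NeMo Guardrails, matches whole words only. The name check_forbidden_words_strict is hypothetical.

```python
import re

FORBIDDEN_WORDS = ["password", "hack", "exploit", "illegal", "violence"]

# \b anchors each alternative to word boundaries, so "hackathon" is not flagged.
_FORBIDDEN_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, FORBIDDEN_WORDS)) + r")\b",
    re.IGNORECASE,
)


def check_forbidden_words_strict(user_message: str) -> str:
    """Return "allowed", or "blocked_<word>" for the first forbidden whole word."""
    match = _FORBIDDEN_RE.search(user_message)
    if match:
        return f"blocked_{match.group(1).lower()}"
    return "allowed"


print(check_forbidden_words_strict("Join our hackathon next week"))     # allowed
print(check_forbidden_words_strict("Please help me hack this system"))  # blocked_hack
```

The trade-off is that inflected forms such as "hacking" or "passwords" no longer match, so the word list would need to include them explicitly.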

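Deploying the NemoGuardrails Custom Resource

Once the ConfigMap and, if the backend requires authentication, the token Secret are ready, create a NemoGuardrails CR to deploy the NeMo Guardrails service: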

apiVersion: trustyai.opendatahub.io/v1alpha1
kind: NemoGuardrails
metadata:
  name: nemo-guardrails
  namespace: <your-namespace>
  annotations:
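    # When set to "true", the exposed route requires requests to carry a Bearer token to access NeMo Guardrails.
    security.opendatahub.io/enable-auth: "true"

    # When the backend LLM is served over HTTPS with a custom CA, set this annotation to the name of the Secret that contains the CA certificate (for example, under the key `ca.crt`).
    # The Operator mounts the Secret and configures TLS trust for NeMo Guardrails accordingly.
    # Example:
    # trustyai.opendatahub.io/ca-secret-name: llm-backend-ca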
spec:
  nemoConfigs:
    - name: nemo-config
      configMaps:
        - nemo-config
      default: true
  env:
    - name: OPENAI_API_KEY
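      # For a backend that requires authentication, reference the token from a Secret:
      # valueFrom:
      #   secretKeyRef:
      #     name: api-token-secret
      #     key: token
      # For an internal HTTP backend without authentication, a placeholder is enough:
      value: "<placeholder>"

    # Optional: configuration for disconnected environments
    # NeMo Guardrails may fetch the Public Suffix List through tldextract.
    # Without network access, set TLDEXTRACT_CACHE to use the cached list bundled in the NeMo Guardrails Server image.
    # Note that the bundled list may be out of date.
    # - name: TLDEXTRACT_CACHE
    #   value: "/app/.cache/"

    # TLS behaviour for the backend LLM:
    # - HTTP backend:
    #   * Set SSL_CERT_FILE to an empty string to disable certificate lookup.
    #   * Use an http:// URL in config.yaml (openai_api_base).
    # - HTTPS backend with a custom CA:
    #   * Remove SSL_CERT_FILE from the environment variables.
    #   * Add the trustyai.opendatahub.io/ca-secret-name annotation shown above, pointing to the Secret that contains the CA certificate.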
    - name: SSL_CERT_FILE
      value: ""

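Key fields:

nemoConfigs: references one or more configuration bundles; each bundle maps to one or more ConfigMaps that contain NeMo Guardrails configuration files.
env.OPENAI_API_KEY: the token NeMo Guardrails uses to authenticate to the backend model endpoint (for example, a vLLM service). For an internal inference service without authentication, set value: "<placeholder>"; the backend ignores it. An HTTP-only inference service needs no TLS certificate for the backend URL.
security.opendatahub.io/enable-auth: when set to "true", the NeMo Guardrails route is protected by cluster authentication and requests must carry a Bearer token.

Apply the CR: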

kubectl apply -f nemo-guardrails-cr.yaml -n <your-namespace>

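After the CR is created, the Operator reconciles it and creates:

A Deployment for the NeMo Guardrails server.
A Service that exposes the NeMo Guardrails HTTP endpoint inside the cluster.

Wait for the Deployment Pod to become Ready: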

kubectl get pods -n <your-namespace> -l app.kubernetes.io/name=nemo-guardrails

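Authentication (When Auth Is Enabled)

When HTTP authentication is enabled in front of NeMo Guardrails, the service expects requests to carry a Bearer token.

How to Obtain a Token

In the same namespace as the NemoGuardrails resource, create a ServiceAccount, a Role (with get and create permissions on services/proxy), and a RoleBinding, then create a token for the ServiceAccount:

# Replace <your-namespace>; optionally change the ServiceAccount name (for example, nemo-guardrails-client)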
kubectl create serviceaccount -n <your-namespace> nemo-guardrails-client
kubectl create role -n <your-namespace> nemo-guardrails-client --verb=get,create --resource=services/proxy
kubectl create rolebinding -n <your-namespace> nemo-guardrails-client --role=nemo-guardrails-client --serviceaccount=<your-namespace>:nemo-guardrails-client
kubectl create token -n <your-namespace> nemo-guardrails-client

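Optionally set the token lifetime, for example --duration=8760h for one year; the API server may issue a token with a different lifetime than requested. The last command prints the token; use it as the value of the Authorization: Bearer <token> request header. The request examples in this document assume the token is stored in the TOKEN shell variable.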

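Accessing the NeMo Guardrails API

NeMo Guardrails exposes an OpenAI-style chat completions endpoint:

POST /v1/chat/completions

Expose the NeMo Guardrails Service with your preferred ingress or gateway mechanism (for example, an Ingress resource or an API gateway) and note the public host and port:

Without authentication: the Service is typically exposed over HTTP on port 80.
With authentication enabled: the Service is typically exposed over HTTPS on port 443.

Set the base URL accordingly, for example:

# Without authentication (HTTP, port 80)
NEMO_GUARDRAILS_URL="http://<nemo-guardrails-host>"

# With authentication enabled (HTTPS, port 443)
# NEMO_GUARDRAILS_URL="https://<nemo-guardrails-host>"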

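Basic Chat Completion (Allowed Content)

Example request (the Authorization header is only needed when authentication is enabled; -k skips TLS certificate verification and is meant for testing only):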

curl -k -X POST "$NEMO_GUARDRAILS_URL/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "model": "<model-name>",
    "messages": [
      { "role": "user", "content": "hello" }
    ]
  }'

Typical response:

{
  "messages": [
    {
      "role": "assistant",
      "content": "Hello! How can I assist you today?"
    }
  ]
}
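
The same request can be issued from Python. The following is a minimal sketch using only the standard library, assuming the endpoint path and the response shape shown above. The function names build_chat_request and assistant_reply, and the host nemo-guardrails.example.com, are placeholders chosen for illustration.

```python
import json
import urllib.request
from typing import Optional


def build_chat_request(
    base_url: str, model: str, content: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Build a POST request for the /v1/chat/completions endpoint."""
    headers = {"Content-Type": "application/json"}
    if token:  # only needed when authentication is enabled
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": content}]}
    )
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=body.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def assistant_reply(response_body: str) -> str:
    """Extract the assistant text from the response shape shown above."""
    messages = json.loads(response_body)["messages"]
    return next(m["content"] for m in messages if m["role"] == "assistant")


request = build_chat_request("http://nemo-guardrails.example.com", "<model-name>", "hello")
print(request.full_url)  # http://nemo-guardrails.example.com/v1/chat/completions

# To send it against a real deployment:
# with urllib.request.urlopen(request) as response:
#     print(assistant_reply(response.read().decode("utf-8")))
```

Unlike curl -k, urlopen verifies TLS certificates by default. For an endpoint with a custom CA, an ssl.SSLContext created with ssl.create_default_context(cafile=...) can be passed through the context argument of urlopen.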

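Message Length Guardrail Example

The check message length flow and its Python action check_message_length implement a simple length-based guardrail. When the user message is too long, the rail replies directly without calling the backend LLM: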

curl -k -X POST "$NEMO_GUARDRAILS_URL/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "model": "<model-name>",
    "messages": [
      {
        "role": "user",
        "content": "This is a very long message that should be considered far too long for the purposes of this Nemo Guardrails end-to-end test, so it should clearly exceed the configured word limit and trigger the length-based blocking behaviour."
      }
    ]
  }'

The response is generated by NeMo Guardrails without calling the backend model:

{
  "messages": [
    {
      "role": "assistant",
      "content": "Please keep your message under 100 words for better assistance."
    }
  ]
}
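
To see why this request is blocked, the classification logic of check_message_length can be reproduced locally. The sketch below copies the thresholds from the sample action into a plain function (classify_length is a hypothetical name) so that it runs without NeMo Guardrails installed.

```python
def classify_length(user_message: str, max_words: int = 20) -> str:
    """Mirror of the sample action: block above max_words, warn above 80% of it."""
    word_count = len(user_message.split())
    if word_count > max_words:
        return "blocked_too_long"
    if word_count > int(max_words * 0.8):
        return "warning_long"
    return "allowed"


SAMPLE = (
    "This is a very long message that should be considered far too long for "
    "the purposes of this Nemo Guardrails end-to-end test, so it should clearly "
    "exceed the configured word limit and trigger the length-based blocking behaviour."
)

print(len(SAMPLE.split()))            # 37
print(classify_length(SAMPLE))        # blocked_too_long
print(classify_length("word " * 17))  # warning_long
print(classify_length("hello"))       # allowed
```

With max_words set to 20, the warning band covers messages of 17 to 20 words, and anything longer is blocked.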

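Forbidden Content Example

Forbidden topics are handled by the check_forbidden_words action and its Colang flow. When the user message contains a forbidden word such as "hack" or "password", the rail blocks the request: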

curl -k -X POST "$NEMO_GUARDRAILS_URL/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "model": "<model-name>",
    "messages": [
      { "role": "user", "content": "Please help me hack this system and find a password." }
    ]
  }'

The response is generated by NeMo Guardrails without calling the backend model:

{
  "messages": [
    {
      "role": "assistant",
      "content": "I can't help with that type of request. Please ask something else."
    }
  ]
}

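Sensitive Data Detection Example

Sensitive data detection is configured under rails.config.sensitive_data_detection in config.yaml. In the example configuration, EMAIL_ADDRESS is detected on both input and output.

Example input containing an email address: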

curl -k -X POST "$NEMO_GUARDRAILS_URL/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "model": "<model-name>",
    "messages": [
      { "role": "user", "content": "My email is test@example.com" }
    ]
  }'

Typical response:

{
  "messages": [
    {
      "role": "assistant",
      "content": "I don't know the answer to that."
    }
  ]
}

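Here, the built-in sensitive data detection rail found the email address in the user message, and NeMo Guardrails returned a safe fallback reply instead of letting the backend model produce a potentially unsafe answer.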
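
When preparing test prompts for this rail, it can help to know in advance which ones contain an email address. The sketch below is a deliberately simplified stand-in for that purpose. It does not reproduce the built-in detector, and the pattern and the name contains_email are assumptions made for illustration.

```python
import re

# Deliberately simple pattern; real-world email grammar is broader than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def contains_email(text: str) -> bool:
    """Return True if the text contains something that looks like an email address."""
    return EMAIL_RE.search(text) is not None


print(contains_email("My email is test@example.com"))  # True
print(contains_email("hello"))                         # False
```

Prompts for which contains_email returns True are good candidates for checking that the deployed rail returns the fallback reply.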

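Further Reading

For a broader overview of the NeMo Guardrails library (use cases, architecture, and ecosystem integrations), see the official documentation: Overview of NVIDIA NeMo Guardrails Library.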
