Metrics Collection for Tekton Components

Overview

Tekton components expose Prometheus-compatible metrics via HTTP endpoints. By deploying ServiceMonitor resources, Prometheus (or VictoriaMetrics) can automatically discover and scrape these metrics.

Namespace Note: This document uses tekton-pipelines as the default namespace for control-plane components (Pipelines, Triggers, Results, Chains). The primary exception is EventListener Services, which run in application namespaces where EventListeners are created.

If your deployment uses different namespaces, update both the commands and the namespaceSelector fields in ServiceMonitor resources below.

This document covers metrics for the following Tekton components:

  • Tekton Pipelines - PipelineRun / TaskRun execution metrics
  • Tekton Triggers - EventListener, TriggerBinding, and related resource metrics
  • Tekton Results - Run deletion and storage metrics
  • Tekton Chains - Signing and provenance metrics
  • Controller Framework - Infrastructure metrics shared by all controllers

It also covers:

  • How to configure metrics behavior via config-observability
  • How to deploy ServiceMonitor resources for scraping
  • How to verify that metrics collection is working

Prerequisites

  • Tekton control-plane components are installed and running (at minimum, the components you plan to scrape: Pipelines, Triggers, Results, and/or Chains).
  • kubectl is configured against the target cluster, and your account can create ServiceMonitor resources in the monitoring namespace.
  • A monitoring stack is deployed (Prometheus, or a compatible system such as VictoriaMetrics) that can discover and scrape ServiceMonitor resources (or your platform's equivalent scrape-discovery objects).
  • Your Prometheus/VictoriaMetrics instance is configured to discover the ServiceMonitor objects you create (namespace and label selectors must match).
  • Network policies and firewalls allow scraper pods to reach Tekton metrics ports (9090 for most control-plane services, 9000 for Triggers controller and EventListener sink).
  • If you want EventListener sink metrics, EventListeners must exist in their target namespaces and expose the http-metrics port.

Tekton Pipelines

The Tekton Pipelines component includes multiple sub-services that expose metrics on port 9090:

| Service | Description | Metrics Port |
| --- | --- | --- |
| tekton-pipelines-controller | Main reconciler for PipelineRun / TaskRun | 9090 |
| tekton-pipelines-webhook | Admission webhook | 9090 |
| tekton-events-controller | CloudEvents controller | 9090 |
| tekton-pipelines-remote-resolvers | Remote resource resolution | 9090 |

The Pipeline controller metrics use the prefix tekton_pipelines_controller_.

PipelineRun Metrics

| Metric Name | Type | Description | Labels |
| --- | --- | --- | --- |
| pipelinerun_duration_seconds | Histogram / LastValue | PipelineRun execution time in seconds | status, namespace, pipeline*, pipelinerun*, reason* |
| pipelinerun_total | Counter | Total number of completed PipelineRuns | status |
| running_pipelineruns | LastValue (Gauge) | Number of currently running PipelineRuns | Controlled by metrics.running-pipelinerun.level (see below) |
| running_pipelineruns_waiting_on_pipeline_resolution | LastValue (Gauge) | PipelineRuns waiting on Pipeline reference resolution | - |
| running_pipelineruns_waiting_on_task_resolution | LastValue (Gauge) | PipelineRuns waiting on Task reference resolution | - |

* Labels marked with * are optional and depend on the config-observability configuration.

running_pipelineruns Label Levels

The running_pipelineruns metric labels are controlled by metrics.running-pipelinerun.level:

| Level | Labels |
| --- | --- |
| "" (default, cluster) | No labels |
| "namespace" | namespace |
| "pipeline" | namespace, pipeline |
| "pipelinerun" | namespace, pipeline, pipelinerun |

Status Label Values

For PipelineRun metrics:

  • success - PipelineRun completed successfully
  • failed - PipelineRun failed
  • cancelled - PipelineRun was cancelled

For TaskRun metrics:

  • success - TaskRun completed successfully
  • failed - TaskRun failed

TaskRun Metrics

| Metric Name | Type | Description | Labels |
| --- | --- | --- | --- |
| taskrun_duration_seconds | Histogram / LastValue | Standalone TaskRun execution time in seconds | status, namespace, task*, taskrun*, reason* |
| pipelinerun_taskrun_duration_seconds | Histogram / LastValue | TaskRun execution time when part of a PipelineRun | status, namespace, task*, taskrun*, pipeline*, pipelinerun*, reason* |
| taskrun_total | Counter | Total number of completed TaskRuns | status |
| running_taskruns | LastValue (Gauge) | Number of currently running TaskRuns | - |
| running_taskruns_waiting_on_task_resolution_count | LastValue (Gauge) | TaskRuns waiting on Task reference resolution | - |
| running_taskruns_throttled_by_quota | LastValue (Gauge) | TaskRuns throttled by ResourceQuota | namespace* |
| running_taskruns_throttled_by_node | LastValue (Gauge) | TaskRuns throttled by node-level resource constraints | namespace* |
| taskruns_pod_latency_milliseconds | LastValue | Pod scheduling latency for TaskRuns in milliseconds | namespace, pod, task*, taskrun* |

config-observability Configuration

The config-observability ConfigMap in the tekton-pipelines namespace controls metrics behavior for the Pipeline controller. This ConfigMap is managed by the Tekton Operator and should be configured via the TektonConfig resource's spec.pipeline.options.configMaps field. See Adjusting Optional Configuration Items for Subcomponents for details.

Hot reload behavior: config-observability is watched at runtime. Most key changes (for example metrics.*) take effect without restarting Pods. Allow one or two scrape intervals for dashboard/query changes to appear. A restart is only needed when Pod spec settings change (for example changing CONFIG_OBSERVABILITY_NAME in the Deployment).

Example configuration via TektonConfig:

apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
  name: config
spec:
  pipeline:
    options:
      disabled: false
      configMaps:
        config-observability:
          data:
            metrics.backend-destination: prometheus

            # PipelineRun metrics aggregation level.
            # Values: "pipelinerun" | "pipeline" (default) | "namespace"
            #   - "pipelinerun": includes pipeline + pipelinerun labels; duration uses LastValue
            #   - "pipeline": includes pipeline label only
            #   - "namespace": no pipeline/pipelinerun labels
            metrics.pipelinerun.level: "pipeline"

            # TaskRun metrics aggregation level.
            # Values: "taskrun" | "task" (default) | "namespace"
            #   - "taskrun": includes task + taskrun labels; duration uses LastValue
            #   - "task": includes task label only
            #   - "namespace": no task/taskrun labels
            metrics.taskrun.level: "task"

            # Duration metric type for PipelineRun / TaskRun.
            # Values: "histogram" (default) | "lastvalue"
            # Note: When pipelinerun.level is "pipelinerun" or taskrun.level is "taskrun",
            #       duration type is forced to "lastvalue" regardless of this setting.
            metrics.pipelinerun.duration-type: "histogram"
            metrics.taskrun.duration-type: "histogram"

            # Running PipelineRun metrics aggregation level.
            # Values: "pipelinerun" | "pipeline" | "namespace" | "" (default, cluster-level)
            metrics.running-pipelinerun.level: ""

            # Include reason label on duration metrics (pipelinerun_duration_seconds,
            # taskrun_duration_seconds, pipelinerun_taskrun_duration_seconds).
            # Values: "true" | "false" (default)
            # Warning: Enabling this increases label cardinality.
            # Note: Despite the key name, this does NOT affect count metrics
            # (pipelinerun_total / taskrun_total), only duration metrics.
            metrics.count.enable-reason: "false"

            # Include namespace label on throttled TaskRun metrics.
            # Values: "true" | "false" (default)
            metrics.taskrun.throttle.enable-namespace: "false"

Histogram Buckets

When the duration type is histogram, the following bucket boundaries (in seconds) are used:

10, 30, 60, 300, 900, 1800, 3600, 5400, 10800, 21600, 43200, 86400

This corresponds to: 10s, 30s, 1m, 5m, 15m, 30m, 1h, 1.5h, 3h, 6h, 12h, 24h.
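As an illustration of how these cumulative buckets behave (a sketch of Prometheus histogram semantics, not Tekton source code), an observed duration increments every le bucket whose upper bound it does not exceed, plus the implicit +Inf bucket:

```python
# Sketch of Prometheus cumulative-histogram semantics (illustration only):
# a duration increments every "le" bucket with upper bound >= the value.
BUCKETS = [10, 30, 60, 300, 900, 1800, 3600, 5400, 10800, 21600, 43200, 86400]

def bucket_counts(durations_seconds):
    """Cumulative per-bucket counts, mirroring *_duration_seconds_bucket{le=...}."""
    counts = {le: 0 for le in BUCKETS}
    counts["+Inf"] = 0
    for d in durations_seconds:
        for le in BUCKETS:
            if d <= le:
                counts[le] += 1  # cumulative: all buckets >= d are incremented
        counts["+Inf"] += 1      # +Inf always counts every observation
    return counts

# A 4-minute (240s) PipelineRun falls into le="300" and all larger buckets:
counts = bucket_counts([240])
print(counts[60], counts[300], counts["+Inf"])  # -> 0 1 1
```

This is why histogram_quantile works on rate()-d bucket series: the buckets are cumulative, not disjoint.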

For production environments, use aggregated levels to control label cardinality:

metrics.pipelinerun.level: "pipeline"
metrics.taskrun.level: "task"
metrics.pipelinerun.duration-type: "histogram"
metrics.taskrun.duration-type: "histogram"
metrics.count.enable-reason: "false"

If you need per-run granularity for debugging, temporarily switch to:

metrics.pipelinerun.level: "pipelinerun"
metrics.taskrun.level: "taskrun"

Note that this will significantly increase the number of time series.
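A rough back-of-envelope sketch of why per-run levels are expensive (all workload numbers below are hypothetical assumptions, not measurements): at "pipelinerun"/"taskrun" level, duration metrics become LastValue and every completed run contributes its own label set, whereas aggregated levels scale only with the number of distinct Pipelines and Tasks:

```python
# Hypothetical workload numbers (assumptions for illustration only).
runs_per_day = 2_000       # completed PipelineRuns per day
taskruns_per_run = 5       # TaskRuns per PipelineRun
pipelines, tasks = 20, 60  # distinct Pipeline / Task names
statuses = 3               # success / failed / cancelled

# Aggregated levels ("pipeline"/"task"): series scale with distinct names.
aggregated_series = (pipelines + tasks) * statuses

# Per-run levels ("pipelinerun"/"taskrun"): each run adds a new label set
# for the duration metrics (retention and other labels ignored here).
per_run_series = runs_per_day * (1 + taskruns_per_run)

print(aggregated_series, per_run_series)  # -> 240 12000
```

Even with these modest assumptions, per-run granularity produces two orders of magnitude more series per day.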


Tekton Triggers

The Tekton Triggers component exposes two categories of metrics from different processes.

Controller Metrics (port 9000)

The Triggers controller reports resource count metrics every 60 seconds.

| Service | Metrics Port |
| --- | --- |
| tekton-triggers-controller | 9000 |

The Triggers controller metrics use the prefix controller_.

| Metric Name | Type | Description | Labels |
| --- | --- | --- | --- |
| eventlistener_count | LastValue (Gauge) | Number of EventListener resources | - |
| triggerbinding_count | LastValue (Gauge) | Number of TriggerBinding resources | - |
| clustertriggerbinding_count | LastValue (Gauge) | Number of ClusterTriggerBinding resources | - |
| triggertemplate_count | LastValue (Gauge) | Number of TriggerTemplate resources | - |
| clusterinterceptor_count | LastValue (Gauge) | Number of ClusterInterceptor resources | - |

EventListener Sink Metrics

Each EventListener pod exposes additional HTTP and event processing metrics. These metrics come from the EventListener sink process (not the controller). The Prometheus metric prefix is eventlistener_.

| Metric Name (Prometheus) | Type | Description | Labels |
| --- | --- | --- | --- |
| eventlistener_http_duration_seconds | Histogram | EventListener HTTP request duration | - |
| eventlistener_event_received_count | Counter | Total events received by the sink | status |
| eventlistener_triggered_resources | Counter | Total resources created by triggers | kind |

  • eventlistener_http_duration_seconds histogram buckets: 0.001, 0.01, 0.1, 1, 10 (seconds)
  • eventlistener_event_received_count status values: succeeded, failed
  • eventlistener_triggered_resources kind values: the Kubernetes resource Kind of the created object (e.g., PipelineRun, TaskRun)

These sink metrics are exposed per EventListener pod, not by the central Triggers controller, so they require separate ServiceMonitor (or PodMonitor) resources that target the EventListener Services exposing a metrics port (see the EventListener Sink ServiceMonitor section).
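Assuming the sink metrics are scraped, illustrative queries over the names above can surface request latency and event failure rates:

```promql
# EventListener HTTP request duration P95
histogram_quantile(0.95,
  sum by (le) (rate(eventlistener_http_duration_seconds_bucket[5m])))

# Share of received events that failed processing
sum(rate(eventlistener_event_received_count{status="failed"}[5m]))
  / clamp_min(sum(rate(eventlistener_event_received_count[5m])), 1e-9)
```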


Tekton Results

Tekton Results has two sub-services that expose metrics.

| Service | Description | Metrics Port |
| --- | --- | --- |
| tekton-results-watcher | Watches and cleans up PipelineRun/TaskRun resources | 9090 |
| tekton-results-api | gRPC/REST API server | 9090 |

Watcher Metrics

The Watcher metrics use the prefix watcher_.

Deletion Metrics

| Metric Name | Type | Description | Labels |
| --- | --- | --- | --- |
| pipelinerun_delete_count | Counter | Total number of deleted PipelineRuns | status, namespace |
| pipelinerun_delete_duration_seconds | Histogram / LastValue | Time from PipelineRun completion to deletion | status, namespace, pipeline* |
| taskrun_delete_count | Counter | Total number of deleted TaskRuns | status, namespace |
| taskrun_delete_duration_seconds | Histogram / LastValue | Time from TaskRun completion to deletion | status, namespace, pipeline*, task* |

* Optional labels depend on config-observability settings for the Results Watcher.

Note: pipelinerun_delete_count, pipelinerun_delete_duration_seconds, taskrun_delete_count, and taskrun_delete_duration_seconds are only recorded when the Watcher actually deletes runs. These metrics will remain empty (no data points) unless the --completed_run_grace_period flag is set to a non-zero value on the tekton-results-watcher Deployment. By default this flag is 0, which disables automatic deletion. Set it to a positive duration (e.g. 10m) to enable deletion after a grace period, or to a negative value to delete immediately after archiving.
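As a hedged illustration (verify the flag name against your deployed Results version; this is a partial container spec, not a complete manifest), the grace period is passed as an argument to the tekton-results-watcher container:

```yaml
# Partial container spec for tekton-results-watcher (illustrative fragment).
# "10m" deletes runs 10 minutes after completion; "0" (the default) disables
# deletion; a negative value deletes immediately after archiving.
args:
  - "--completed_run_grace_period=10m"
```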

Status label values for Results Watcher:

  • success - Run completed successfully
  • failed - Run failed
  • cancelled - Run was cancelled

Shared Metrics

These metrics are registered by both the PipelineRun and TaskRun reconcilers in the Watcher, tracking storage-related events.

| Metric Name | Type | Description | Labels |
| --- | --- | --- | --- |
| runs_not_stored_count | Counter | Runs deleted without being stored to Results | kind, namespace |
| run_storage_latency_seconds | Histogram | Time from run completion to successful storage | kind, namespace |

The kind label identifies the run type; note that the value casing differs between metric series (PipelineRun / TaskRun in some, pipelinerun / taskrun in others).

Note: runs_not_stored_count is only recorded when a run is externally deleted (e.g. via kubectl delete) while the Watcher is holding a finalizer to coordinate archiving. It will remain empty unless all of the following conditions are met:

  1. The --logs_api flag is false (log storage disabled) — if logs are enabled, the Watcher skips finalizer-based coordination entirely.
  2. The --disable_crd_update flag is false (annotation updates enabled).
  3. The --store_deadline flag is set to a non-zero duration — this is the maximum time the Watcher waits for archiving to complete before giving up and allowing deletion.
  4. A run is externally deleted before it is successfully archived (no results.tekton.dev/stored=true annotation), and the store_deadline has elapsed.

In normal operation (runs archived before deletion, or deletion triggered by the Watcher itself via --completed_run_grace_period), this counter stays at zero. A non-zero value indicates potential data loss: runs were deleted before their state could be saved to the Results API.

Quick reproduction (test environment): If you do not see this metric, that usually means the trigger conditions were not met, not that the metric is missing.

  1. Configure Results Watcher via TektonConfig so that logs_api=false, disable_crd_update=false, and store_deadline is non-zero (for example 30s).
  2. Temporarily set Results API replicas to 0 via TektonConfig (spec.result.options.deployments.tekton-results-api.spec.replicas: 0) so runs cannot be archived.
  3. Create a TaskRun or PipelineRun and wait until it completes.
  4. Wait until store_deadline has elapsed, then externally delete the run (kubectl delete ...).
  5. Check Watcher /metrics or Prometheus for watcher_runs_not_stored_count (component-prefixed name in exposition format); it should increase.
  6. Restore the original TektonConfig (re-enable Results API replicas and normal logs_api settings).

The run_storage_latency_seconds histogram uses the following bucket boundaries (in seconds):

0.1, 0.5, 1, 2, 5, 10, 30, 60, 120, 300, 600, 1800
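Assuming the watcher_ prefix and a Prometheus scrape, an illustrative P95 query over this histogram looks like:

```promql
# P95 time from run completion to successful storage, per run kind
histogram_quantile(0.95,
  sum by (kind, le) (rate(watcher_run_storage_latency_seconds_bucket[5m])))
```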

Watcher config-observability

The Results Watcher has its own config-observability ConfigMap (named via the CONFIG_OBSERVABILITY_NAME environment variable, typically tekton-results-config-observability). This ConfigMap is managed by the Tekton Operator and should be configured via the TektonConfig resource's spec.results.options.configMaps field. See Adjusting Optional Configuration Items for Subcomponents for details.

Hot reload behavior: Results Watcher also watches this ConfigMap and applies most key changes without Pod restarts. A restart is only needed when Deployment-level settings (such as env vars/args) are changed.

It supports the following keys:

| Key | Default | Values | Description |
| --- | --- | --- | --- |
| metrics.pipelinerun.level | pipeline | pipeline, namespace | Controls pipeline label on delete duration metrics |
| metrics.taskrun.level | task | task, namespace | Controls task label on delete duration metrics |
| metrics.pipelinerun.duration-type | histogram | histogram, lastvalue | Duration metric aggregation type for both PipelineRun and TaskRun deletion |
| metrics.taskrun.duration-type | histogram | histogram, lastvalue | Parsed but currently not used; metrics.pipelinerun.duration-type controls both |

Note: Unlike Tekton Pipelines, the Results Watcher does not support pipelinerun / taskrun individual-run granularity levels. It also does not have the metrics.count.enable-reason, metrics.running-pipelinerun.level, or metrics.taskrun.throttle.enable-namespace keys.

Known upstream issue: taskrun_delete_duration_seconds uses metrics.pipelinerun.duration-type (not metrics.taskrun.duration-type) to determine its aggregation type. This appears to be a copy-paste bug in the Results source code.

API Server Metrics

The API server exposes standard gRPC Prometheus metrics via the go-grpc-prometheus library on port 9090. These include:

  • grpc_server_handled_total - Total RPCs completed on the server
  • grpc_server_started_total - Total RPCs started on the server
  • grpc_server_msg_received_total / grpc_server_msg_sent_total - Message counts
  • grpc_server_handling_seconds (if PROMETHEUS_HISTOGRAM is enabled) - RPC handling duration
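These standard counters make it straightforward to track API error rates; for example (illustrative query):

```promql
# Share of Results API RPCs finishing with a non-OK gRPC code
sum(rate(grpc_server_handled_total{grpc_code!="OK"}[5m]))
  / clamp_min(sum(rate(grpc_server_handled_total[5m])), 1e-9)
```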

Tekton Chains

Tekton Chains is a security component that generates, signs, and stores provenance for artifacts built with Tekton Pipelines. It observes completed TaskRuns and PipelineRuns, then creates attestations and signatures.

| Service | Description | Metrics Port |
| --- | --- | --- |
| tekton-chains-metrics | Chains watcher/controller | 9090 (http-metrics) |

The Chains controller metrics use the prefix watcher_ (same as Results Watcher, but the custom metric names are different, so there are no collisions).

Chains Metrics

All Chains metrics are Counters with no labels.

| Metric Name (Prometheus) | Type | Description |
| --- | --- | --- |
| watcher_taskrun_sign_created_total | Counter | Total signed messages for TaskRuns |
| watcher_taskrun_payload_stored_total | Counter | Total stored payloads for TaskRuns |
| watcher_taskrun_marked_signed_total | Counter | Total TaskRuns marked as signed |
| watcher_pipelinerun_sign_created_total | Counter | Total signed messages for PipelineRuns |
| watcher_pipelinerun_payload_stored_total | Counter | Total stored payloads for PipelineRuns |
| watcher_pipelinerun_marked_signed_total | Counter | Total PipelineRuns marked as signed |

Note: The official Tekton Chains documentation also mentions *_signing_failures_total counters for both TaskRun and PipelineRun, but these are not present in the current upstream source code. Verify against your deployed version.


Controller Framework Metrics

All Tekton controllers automatically expose the following infrastructure metrics. These metrics use the same prefix as the component's custom metrics (e.g., tekton_pipelines_controller_, controller_, watcher_).

| Metric Name (without prefix) | Type | Description |
| --- | --- | --- |
| client_latency | Histogram | Kubernetes API client request latency (seconds) |
| client_results | Counter | Kubernetes API request count (by status code) |
| workqueue_depth | Gauge | Current workqueue depth |
| workqueue_adds_total | Counter | Total workqueue additions |
| workqueue_queue_latency_seconds | Histogram | Time items spend waiting in the workqueue |
| workqueue_work_duration_seconds | Histogram | Time spent processing workqueue items |
| workqueue_retries_total | Counter | Total workqueue retries |
| workqueue_unfinished_work_seconds | Histogram | Duration of unfinished workqueue items |
| workqueue_longest_running_processor_seconds | Histogram | Duration of longest running workqueue processor |
| reconcile_count | Counter | Total reconciler invocations (labeled by reconciler, success, namespace_name) |
| reconcile_latency | Histogram | Reconciler invocation latency (labeled by reconciler, success, namespace_name) |
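These framework metrics support queries such as the following for the Pipelines controller (illustrative; check the /metrics HELP text for the units your version reports):

```promql
# Current workqueue depth
tekton_pipelines_controller_workqueue_depth

# Reconcile failure rate by reconciler
sum by (reconciler)
  (rate(tekton_pipelines_controller_reconcile_count{success="false"}[5m]))

# Reconcile latency P95
histogram_quantile(0.95,
  sum by (le) (rate(tekton_pipelines_controller_reconcile_latency_bucket[5m])))
```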

Setting Up ServiceMonitor

To enable Prometheus scraping for Tekton components, deploy ServiceMonitor resources.

See the Prerequisites section above for the required setup.

Use the following guidance based on your monitoring stack:

  • If you use Prometheus (Prometheus Operator), the ServiceMonitor labels (for example metadata.labels.prometheus: kube-prometheus) must match the Prometheus CR's spec.serviceMonitorSelector; otherwise the ServiceMonitor will not be scraped.
  • If you use VictoriaMetrics, labels like prometheus: kube-prometheus are typically not needed; create ServiceMonitor/VMServiceScrape resources according to your monitoring setup.

When using Prometheus, use the following commands to find and verify the selector:

# 1) Locate Prometheus CRs (resource type: monitoring.coreos.com/v1, Kind=Prometheus)
$ kubectl get prometheus -A

# 2) Check ServiceMonitor selector on the target Prometheus instance
$ kubectl get prometheus -n <prometheus-namespace> <prometheus-name> -o yaml | yq '.spec.serviceMonitorSelector'

If no Prometheus CR exists in your cluster, monitoring is likely platform-managed (for example, by VictoriaMetrics) or implemented differently. In that case, labels like prometheus: kube-prometheus are usually not required; follow your platform's scraping rules.

For more information, see Integrating External Metrics.

Pipeline ServiceMonitor

Pipeline ServiceMonitor YAML
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tekton-pipelines-metrics
  namespace: tekton-pipelines
  labels:
    app.kubernetes.io/name: tekton-pipelines
    # prometheus: kube-prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/part-of: tekton-pipelines
  endpoints:
  - port: http-metrics
    path: /metrics
    interval: 30s
  namespaceSelector:
    matchNames:
    - tekton-pipelines

This ServiceMonitor matches Pipeline services with the label app.kubernetes.io/part-of: tekton-pipelines (including remote-resolvers) and scrapes them in the tekton-pipelines namespace.

Triggers ServiceMonitor

Triggers ServiceMonitor YAML
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tekton-triggers-metrics
  namespace: tekton-pipelines
  labels:
    app.kubernetes.io/name: tekton-triggers
    # prometheus: kube-prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/part-of: tekton-triggers
      app.kubernetes.io/component: controller
  endpoints:
  - port: http-metrics
    path: /metrics
    interval: 30s
  namespaceSelector:
    matchNames:
    - tekton-pipelines

This ServiceMonitor collects Triggers controller metrics (controller_*) only. It does not include EventListener sink metrics.

EventListener Sink ServiceMonitor

EventListener Sink ServiceMonitor YAML
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tekton-eventlistener-sink-metrics
  namespace: tekton-pipelines
  labels:
    app.kubernetes.io/name: tekton-eventlistener-sink
    # prometheus: kube-prometheus
spec:
  selector:
    matchExpressions:
    - key: eventlistener
      operator: Exists
    - key: app.kubernetes.io/managed-by
      operator: In
      values:
      - EventListener
  endpoints:
  - port: http-metrics
    path: /metrics
    interval: 30s
  namespaceSelector:
    any: true

EventListener Services usually run in application namespaces, so this example uses namespaceSelector.any: true for cross-namespace scraping. If you need tighter scope, switch to matchNames and list allowed namespaces explicitly.

Results ServiceMonitor

The Results services have both app.kubernetes.io/part-of: tekton-results and app.kubernetes.io/name labels. To precisely target API + Watcher (and exclude Postgres), this example matches app.kubernetes.io/name:

Results ServiceMonitor YAML
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tekton-results-metrics
  namespace: tekton-pipelines
  labels:
    app.kubernetes.io/name: tekton-results
    # prometheus: kube-prometheus
spec:
  selector:
    matchExpressions:
    - key: app.kubernetes.io/name
      operator: In
      values:
      - tekton-results-api
      - tekton-results-watcher
  endpoints:
  - port: prometheus
    path: /metrics
    interval: 30s
  - port: metrics
    path: /metrics
    interval: 30s
  namespaceSelector:
    matchNames:
    - tekton-pipelines

The Results API server uses port name prometheus (9090) and the Watcher uses port name metrics (9090). Each service only exposes one of these port names, so only the matching endpoint will be scraped.

Chains ServiceMonitor

Chains ServiceMonitor YAML
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tekton-chains-metrics
  namespace: tekton-pipelines
  labels:
    app.kubernetes.io/name: tekton-chains
    # prometheus: kube-prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/part-of: tekton-chains
  endpoints:
  - port: http-metrics
    path: /metrics
    interval: 30s
  namespaceSelector:
    matchNames:
    - tekton-pipelines

Verification

After deploying the ServiceMonitor resources, verify that Prometheus is scraping the targets.

Check Metrics Endpoints Directly

# Pipeline controller
$ kubectl port-forward -n tekton-pipelines svc/tekton-pipelines-controller 9090:9090
$ curl -s http://localhost:9090/metrics | grep tekton_pipelines_controller_

# HELP tekton_pipelines_controller_client_latency How long Kubernetes API requests take
# TYPE tekton_pipelines_controller_client_latency histogram
tekton_pipelines_controller_client_latency_bucket{name="",le="1e-05"} 0
tekton_pipelines_controller_client_latency_bucket{name="",le="0.0001"} 0
tekton_pipelines_controller_client_latency_bucket{name="",le="0.001"} 0

# Triggers controller
$ kubectl port-forward -n tekton-pipelines svc/tekton-triggers-controller 9000:9000
$ curl -s http://localhost:9000/metrics | grep controller_

# HELP controller_client_latency How long Kubernetes API requests take
# TYPE controller_client_latency histogram
controller_client_latency_bucket{name="",le="1e-05"} 0
controller_client_latency_bucket{name="",le="0.0001"} 1
controller_client_latency_bucket{name="",le="0.001"} 2

# EventListener sink metrics (replace namespace/service)
$ kubectl port-forward -n <eventlistener-namespace> svc/<eventlistener-service> 9000:9000
$ curl -s http://localhost:9000/metrics | grep eventlistener_

# HELP eventlistener_client_latency How long Kubernetes API requests take
# TYPE eventlistener_client_latency histogram
eventlistener_client_latency_bucket{name="",le="1e-05"} 0
eventlistener_client_latency_bucket{name="",le="0.0001"} 0
eventlistener_client_latency_bucket{name="",le="0.001"} 0

# HELP eventlistener_triggered_resources Count of the number of triggered eventlistener resources
# TYPE eventlistener_triggered_resources counter
eventlistener_triggered_resources{kind="PipelineRun"} 10

# Results watcher
$ kubectl port-forward -n tekton-pipelines svc/tekton-results-watcher 9091:9090
$ curl -s http://localhost:9091/metrics | grep watcher_

# HELP watcher_client_latency How long Kubernetes API requests take
# TYPE watcher_client_latency histogram
watcher_client_latency_bucket{name="",le="1e-05"} 0
watcher_client_latency_bucket{name="",le="0.0001"} 0
watcher_client_latency_bucket{name="",le="0.001"} 0

# Results API
$ kubectl port-forward -n tekton-pipelines svc/tekton-results-api-service 9092:9090
$ curl -s http://localhost:9092/metrics | grep grpc_server_

# HELP grpc_server_handled_total Total number of RPCs completed on the server, regardless of success or failure.
# TYPE grpc_server_handled_total counter
grpc_server_handled_total{grpc_code="Aborted",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Aborted",grpc_method="CreateRecord",grpc_service="tekton.results.v1alpha2.Results",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Aborted",grpc_method="CreateResult",grpc_service="tekton.results.v1alpha2.Results",grpc_type="unary"} 0

# HELP grpc_server_started_total Total number of RPCs started on the server.
# TYPE grpc_server_started_total counter
grpc_server_started_total{grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 337606
grpc_server_started_total{grpc_method="CreateRecord",grpc_service="tekton.results.v1alpha2.Results",grpc_type="unary"} 10301
grpc_server_started_total{grpc_method="CreateResult",grpc_service="tekton.results.v1alpha2.Results",grpc_type="unary"} 832

# Chains controller
$ kubectl port-forward -n tekton-pipelines svc/tekton-chains-metrics 9093:9090
$ curl -s http://localhost:9093/metrics | grep watcher_

# HELP watcher_client_latency How long Kubernetes API requests take
# TYPE watcher_client_latency histogram
watcher_client_latency_bucket{name="",le="1e-05"} 0
watcher_client_latency_bucket{name="",le="0.0001"} 0
watcher_client_latency_bucket{name="",le="0.001"} 0

EventListener sink metrics such as eventlistener_event_received_count and eventlistener_http_duration_seconds are request-driven. Send at least one request to the EventListener before validating these metrics.

Check Prometheus Targets

# Verify ServiceMonitor resources exist
$ kubectl get servicemonitor -n tekton-pipelines

NAME                                AGE
tekton-chains-metrics               10m
tekton-eventlistener-sink-metrics   10m
tekton-pipelines-metrics            10m
tekton-results-metrics              10m
tekton-triggers-metrics             10m

# Check Prometheus targets (via Prometheus UI or API)
# Look for targets with job labels matching the ServiceMonitor names

Example PromQL Queries

# PipelineRun cumulative success rate (avoids misinterpretation in empty completion windows)
100 * sum(tekton_pipelines_controller_pipelinerun_total{status="success"}) / clamp_min(sum(tekton_pipelines_controller_pipelinerun_total), 1)

# Completed PipelineRuns in the last 5 minutes (throughput)
round(sum(increase(tekton_pipelines_controller_pipelinerun_total[5m])))

# PipelineRun duration P95 (histogram mode)
histogram_quantile(0.95,
  rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket[5m])
)

# TaskRun duration P95 (histogram mode, includes standalone + in-pipeline TaskRuns)
histogram_quantile(0.95,
  (
    sum by (le) (rate(tekton_pipelines_controller_taskrun_duration_seconds_bucket[5m]))
    +
    sum by (le) (rate(tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket[5m]))
  )
)

# PipelineRun duration (lastvalue mode)
avg_over_time(tekton_pipelines_controller_pipelinerun_duration_seconds[5m])

# Currently running PipelineRuns (single series to avoid duplicate legends)
max(tekton_pipelines_controller_running_pipelineruns)

# TaskRuns throttled by resource quota
max(tekton_pipelines_controller_running_taskruns_throttled_by_quota)

# Trigger resource counts
controller_eventlistener_count
controller_triggertemplate_count

# Chains signing activity
watcher_taskrun_sign_created_total
watcher_pipelinerun_sign_created_total

MonitorDashboard Examples

The following MonitorDashboard resources provide ready-to-use dashboards for monitoring Tekton components. Deploy them to the cpaas-system namespace under the tekton folder.

Important: Each panel must include id (unique integer), datasource: prometheus, and transformations: []. Each target must include datasource: prometheus and refId. Duration P50/P95 panels in this document use *_bucket queries and require metrics.*.duration-type=histogram; if you use lastvalue, replace those queries with LastValue-style expressions such as avg_over_time(...).

Tekton Pipeline Dashboard

Tekton Pipeline Dashboard YAML
kind: MonitorDashboard
apiVersion: ait.alauda.io/v1alpha2
metadata:
  labels:
    cpaas.io/dashboard.folder: tekton
    cpaas.io/dashboard.is.home.dashboard: "false"
    cpaas.io/dashboard.tag.tekton: "true"
  name: tekton-pipeline
  namespace: cpaas-system
spec:
  body:
    titleZh: Tekton Pipeline Overview
    tags:
      - tekton
    time:
      from: now-1h
      to: now
    templating:
      list: []
    panels:
      - id: 1
        title: PipelineRun Total (by status)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 0, y: 0 }
        targets:
          - datasource: prometheus
            expr: sum by (status) (tekton_pipelines_controller_pipelinerun_total)
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 2
        title: TaskRun Total (by status)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 8, y: 0 }
        targets:
          - datasource: prometheus
            expr: sum by (status) (tekton_pipelines_controller_taskrun_total)
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 3
        title: PipelineRun Success Rate (cumulative)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 4, x: 16, y: 0 }
        targets:
          - datasource: prometheus
            expr: "100 * sum(tekton_pipelines_controller_pipelinerun_total{status=\"success\"}) / clamp_min(sum(tekton_pipelines_controller_pipelinerun_total), 1)"
            refId: A
        fieldConfig:
          defaults:
            unit: percent
            color: { mode: thresholds }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds:
              mode: absolute
              steps:
                - { color: red, value: null }
                - { color: orange, value: 80 }
                - { color: green, value: 95 }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 12
        title: Completed PipelineRuns (last 5m)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 4, x: 20, y: 0 }
        targets:
          - datasource: prometheus
            expr: "round(sum(increase(tekton_pipelines_controller_pipelinerun_total[5m])))"
            legendFormat: completed
            refId: A
        fieldConfig:
          defaults:
            unit: short
            decimals: 0
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 4
        title: Running PipelineRuns
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 0, y: 8 }
        targets:
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_pipelineruns)
            legendFormat: running
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 5
        title: Running TaskRuns
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 8, y: 8 }
        targets:
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_taskruns)
            legendFormat: running
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 6
        title: TaskRuns Throttled
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 16, y: 8 }
        targets:
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_taskruns_throttled_by_quota)
            legendFormat: by quota
            refId: A
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_taskruns_throttled_by_node)
            legendFormat: by node
            refId: B
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: orange, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 7
        title: PipelineRun Duration P50 / P95
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 0, y: 16 }
        targets:
          - datasource: prometheus
            expr: (histogram_quantile(0.5, sum by (le) (rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P50
            refId: A
          - datasource: prometheus
            expr: (histogram_quantile(0.95, sum by (le) (rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P95
            refId: B
        fieldConfig:
          defaults:
            unit: s
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 8
        title: TaskRun Duration P50 / P95 (Standalone)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 8, y: 16 }
        targets:
          - datasource: prometheus
            expr: (histogram_quantile(0.5, sum by (le) (rate(tekton_pipelines_controller_taskrun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_taskrun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P50
            refId: A
          - datasource: prometheus
            expr: (histogram_quantile(0.95, sum by (le) (rate(tekton_pipelines_controller_taskrun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_taskrun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P95
            refId: B
        fieldConfig:
          defaults:
            unit: s
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 13
        title: TaskRun Duration P50 / P95 (In-Pipeline)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 16, y: 16 }
        targets:
          - datasource: prometheus
            expr: (histogram_quantile(0.5, sum by (le) (rate(tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P50
            refId: A
          - datasource: prometheus
            expr: (histogram_quantile(0.95, sum by (le) (rate(tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P95
            refId: B
        fieldConfig:
          defaults:
            unit: s
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 9
        title: Workqueue Depth
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 0, y: 24 }
        targets:
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_workqueue_depth)
            legendFormat: depth
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 10
        title: Reconcile Count (by success)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 8, y: 24 }
        targets:
          - datasource: prometheus
            expr: sum(increase(tekton_pipelines_controller_reconcile_count{success="true"}[5m]))
            legendFormat: success=true
            refId: A
          - datasource: prometheus
            expr: sum(increase(tekton_pipelines_controller_reconcile_count{success="false"}[5m]))
            legendFormat: success=false
            refId: B
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 11
        title: Resolution Waiting
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 16, y: 24 }
        targets:
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_pipelineruns_waiting_on_pipeline_resolution)
            legendFormat: PR waiting pipeline
            refId: A
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_pipelineruns_waiting_on_task_resolution)
            legendFormat: PR waiting task
            refId: B
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_taskruns_waiting_on_task_resolution_count)
            legendFormat: TR waiting task
            refId: C
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: orange, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []

Tekton Pipeline Dashboard Interpretation (Common Questions)

  • PipelineRun Total (by status) is a completion-event counter recorded by the controller, not the total number of PipelineRun objects. In the current implementation, user-triggered cancellation (spec.status=Cancelled) may not enter this counting path, so the cancelled series may be absent. To validate cancellation volume, check PipelineRun objects and events.
  • Running PipelineRuns is a real-time snapshot (how many are running now). It can change independently from PipelineRun Total.
  • Completed PipelineRuns (last 5m) is throughput (newly completed runs in the last 5 minutes). Seeing 0 during low traffic or idle periods is expected.
  • PipelineRun Success Rate (cumulative) is cumulative since controller start, not a 5-minute window success rate. A short-term failure does not immediately cause a large shift.
  • Reconcile Count (by success) measures controller reconcile loops, not PipelineRun counts.
  • Status series are shown only for label values that actually have samples in the selected time range. If a status has no samples in the window, its curve/legend will not appear.
  • TaskRun Duration P50 / P95 (Standalone) and TaskRun Duration P50 / P95 (In-Pipeline) are split to avoid mixed-query instability. In environments that only expose one histogram family, the other panel may be empty, which is expected.
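
The cumulative-versus-windowed distinction above can be illustrated numerically. The counter values below are made-up samples, not real metrics; max(..., 1) plays the role of the clamp_min guard in the panel query.

```python
# Made-up counter samples showing why the cumulative success rate
# (panel id 3) moves slowly: it is computed over all completions since
# controller start, not over a 5-minute window.
success_total, failed_total = 950, 50          # since controller start
cumulative_rate = 100 * success_total / max(success_total + failed_total, 1)
print(f"cumulative success rate: {cumulative_rate:.1f}%")   # 95.0%

# Now 5 new runs complete in the last 5 minutes, all failing:
recent_success, recent_failed = 0, 5
windowed_rate = 100 * recent_success / max(recent_success + recent_failed, 1)
cumulative_after = 100 * success_total / max(success_total + failed_total + 5, 1)
print(f"windowed success rate:   {windowed_rate:.1f}%")     # 0.0%
print(f"cumulative after burst:  {cumulative_after:.1f}%")  # 94.5%
```

A short burst of failures drops a windowed rate to 0% while the cumulative rate barely moves, which is the behavior described in the bullet above.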

Tekton Triggers Dashboard

Tekton Triggers Dashboard YAML
kind: MonitorDashboard
apiVersion: ait.alauda.io/v1alpha2
metadata:
  labels:
    cpaas.io/dashboard.folder: tekton
    cpaas.io/dashboard.is.home.dashboard: "false"
    cpaas.io/dashboard.tag.tekton: "true"
  name: tekton-triggers
  namespace: cpaas-system
spec:
  body:
    titleZh: Tekton Triggers Overview
    tags:
      - tekton
    time:
      from: now-1h
      to: now
    templating:
      list: []
    panels:
      - id: 1
        title: EventListener Count
        type: timeseries
        datasource: prometheus
        gridPos: { h: 6, w: 5, x: 0, y: 0 }
        targets:
          - datasource: prometheus
            expr: controller_eventlistener_count
            legendFormat: EventListener
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 2
        title: TriggerTemplate Count
        type: timeseries
        datasource: prometheus
        gridPos: { h: 6, w: 5, x: 5, y: 0 }
        targets:
          - datasource: prometheus
            expr: controller_triggertemplate_count
            legendFormat: TriggerTemplate
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 3
        title: TriggerBinding Count
        type: timeseries
        datasource: prometheus
        gridPos: { h: 6, w: 5, x: 10, y: 0 }
        targets:
          - datasource: prometheus
            expr: controller_triggerbinding_count
            legendFormat: TriggerBinding
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 4
        title: ClusterTriggerBinding
        type: timeseries
        datasource: prometheus
        gridPos: { h: 6, w: 5, x: 15, y: 0 }
        targets:
          - datasource: prometheus
            expr: controller_clustertriggerbinding_count
            legendFormat: ClusterTriggerBinding
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 5
        title: ClusterInterceptor
        type: timeseries
        datasource: prometheus
        gridPos: { h: 6, w: 4, x: 20, y: 0 }
        targets:
          - datasource: prometheus
            expr: controller_clusterinterceptor_count
            legendFormat: ClusterInterceptor
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 6
        title: All Trigger Resource Counts (trend)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 24, x: 0, y: 6 }
        targets:
          - datasource: prometheus
            expr: controller_eventlistener_count
            legendFormat: EventListener
            refId: A
          - datasource: prometheus
            expr: controller_triggertemplate_count
            legendFormat: TriggerTemplate
            refId: B
          - datasource: prometheus
            expr: controller_triggerbinding_count
            legendFormat: TriggerBinding
            refId: C
          - datasource: prometheus
            expr: controller_clustertriggerbinding_count
            legendFormat: ClusterTriggerBinding
            refId: D
          - datasource: prometheus
            expr: controller_clusterinterceptor_count
            legendFormat: ClusterInterceptor
            refId: E
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []

Tekton Triggers Dashboard Interpretation (Common Questions)

  • EventListener Count, TriggerTemplate Count, TriggerBinding Count, ClusterTriggerBinding, and ClusterInterceptor are object-count snapshots, not request volume or event-processing throughput.
  • All Trigger Resource Counts (trend) overlays the same resource counts in a single panel. Brief deviations from the single-resource count panels within one scrape interval are expected.
  • Showing 0 when no Triggers resources exist is normal and does not indicate a scraping failure.

Tekton Results Dashboard

Tekton Results Dashboard YAML
kind: MonitorDashboard
apiVersion: ait.alauda.io/v1alpha2
metadata:
  labels:
    cpaas.io/dashboard.folder: tekton
    cpaas.io/dashboard.is.home.dashboard: "false"
    cpaas.io/dashboard.tag.tekton: "true"
  name: tekton-results
  namespace: cpaas-system
spec:
  body:
    titleZh: Tekton Results Overview
    tags:
      - tekton
    time:
      from: now-1h
      to: now
    templating:
      list: []
    panels:
      - id: 1
        title: PipelineRun Reconcile Count (last 5m)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 0, y: 0 }
        targets:
          - datasource: prometheus
            expr: round(sum(increase(watcher_reconcile_count{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.pipelinerun.Reconciler",success="true"}[5m])))
            legendFormat: success=true
            refId: A
          - datasource: prometheus
            expr: round(sum(increase(watcher_reconcile_count{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.pipelinerun.Reconciler",success="false"}[5m])))
            legendFormat: success=false
            refId: B
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 2
        title: TaskRun Reconcile Count (last 5m)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 12, y: 0 }
        targets:
          - datasource: prometheus
            expr: round(sum(increase(watcher_reconcile_count{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.taskrun.Reconciler",success="true"}[5m])))
            legendFormat: success=true
            refId: A
          - datasource: prometheus
            expr: round(sum(increase(watcher_reconcile_count{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.taskrun.Reconciler",success="false"}[5m])))
            legendFormat: success=false
            refId: B
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 3
        title: PipelineRun Reconcile Latency P95
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 0, y: 8 }
        targets:
          - datasource: prometheus
            expr: histogram_quantile(0.95, sum by (le) (rate(watcher_reconcile_latency_bucket{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.pipelinerun.Reconciler"}[5m])))
            legendFormat: P95
            refId: A
        fieldConfig:
          defaults:
            unit: ms
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 4
        title: TaskRun Reconcile Latency P95
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 12, y: 8 }
        targets:
          - datasource: prometheus
            expr: histogram_quantile(0.95, sum by (le) (rate(watcher_reconcile_latency_bucket{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.taskrun.Reconciler"}[5m])))
            legendFormat: P95
            refId: A
        fieldConfig:
          defaults:
            unit: ms
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 5
        title: Workqueue Depth (PipelineRun vs TaskRun)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 0, y: 16 }
        targets:
          - datasource: prometheus
            expr: sum(watcher_work_queue_depth{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.pipelinerun.Reconciler"})
            legendFormat: pipelinerun
            refId: A
          - datasource: prometheus
            expr: sum(watcher_work_queue_depth{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.taskrun.Reconciler"})
            legendFormat: taskrun
            refId: B
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 6
        title: Workqueue Adds (last 5m)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 12, y: 16 }
        targets:
          - datasource: prometheus
            expr: round(sum(increase(watcher_workqueue_adds_total{name=~"github.com.tektoncd.results.pkg.watcher.reconciler.pipelinerun.Reconciler-(consumer|fast|slow)"}[5m])))
            legendFormat: pipelinerun adds
            refId: A
          - datasource: prometheus
            expr: round(sum(increase(watcher_workqueue_adds_total{name=~"github.com.tektoncd.results.pkg.watcher.reconciler.taskrun.Reconciler-(consumer|fast|slow)"}[5m])))
            legendFormat: taskrun adds
            refId: B
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 7
        title: gRPC Request Rate (Results API)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 0, y: 24 }
        targets:
          - datasource: prometheus
            expr: "sum(rate(grpc_server_handled_total{grpc_service=~\"tekton.results.*\"}[5m]))"
            legendFormat: requests
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 8
        title: gRPC Error Percentage (Results API, excl. NotFound/Canceled)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 12, y: 24 }
        targets:
          - datasource: prometheus
            expr: "100 * ((sum(rate(grpc_server_handled_total{grpc_service=~\"tekton.results.*\",grpc_code!~\"OK|NotFound|Canceled\"}[5m])) or vector(0)) / clamp_min((sum(rate(grpc_server_handled_total{grpc_service=~\"tekton.results.*\"}[5m])) or vector(0)), 0.001))"
            legendFormat: error %
            refId: A
        fieldConfig:
          defaults:
            unit: percent
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: red, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []

Tekton Results Dashboard Interpretation (Common Questions)

  • This dashboard revision is based on Results Watcher reconcile/workqueue metrics plus Results API gRPC metrics, so it stays populated under common deployments (logs_api=true, automatic deletion disabled).
  • PipelineRun Reconcile Count (last 5m) and TaskRun Reconcile Count (last 5m) show separate 5-minute increments for success=true and success=false.
  • PipelineRun Reconcile Latency P95 and TaskRun Reconcile Latency P95 are calculated from watcher reconcile latency histograms. Under low traffic, the line can be sparse.
  • Workqueue Depth shows current queue depth, and Workqueue Adds (last 5m) shows enqueue volume over the last 5 minutes.
  • gRPC Error Percentage (Results API, excl. NotFound/Canceled) is the share of unexpected error responses among all requests; return codes that are common in normal operation (NotFound, Canceled) are excluded from the error count.
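
The error-percentage query in panel id 8 can be mirrored in plain arithmetic. The per-second request rates below are made-up samples; max(..., 0.001) plays the role of the clamp_min guard so an idle API yields 0 instead of a division by zero.

```python
# Made-up per-second gRPC rates by status code, mirroring panel id 8:
# error % = rate of codes outside {OK, NotFound, Canceled} / total rate.
rates = {"OK": 8.0, "NotFound": 1.5, "Canceled": 0.2, "Internal": 0.3}
excluded = {"OK", "NotFound", "Canceled"}
error_rate = sum(v for code, v in rates.items() if code not in excluded)
total_rate = sum(rates.values())
error_pct = 100 * error_rate / max(total_rate, 0.001)
print(f"error %: {error_pct:.1f}")  # 3.0
```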

Tekton Chains Dashboard

Tekton Chains Dashboard YAML
kind: MonitorDashboard
apiVersion: ait.alauda.io/v1alpha2
metadata:
  labels:
    cpaas.io/dashboard.folder: tekton
    cpaas.io/dashboard.is.home.dashboard: "false"
    cpaas.io/dashboard.tag.tekton: "true"
  name: tekton-chains
  namespace: cpaas-system
spec:
  body:
    titleZh: Tekton Chains Overview
    tags:
      - tekton
    time:
      from: now-1h
      to: now
    templating:
      list: []
    panels:
      - id: 1
        title: TaskRun Signatures Created (last 5m)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 0, y: 0 }
        targets:
          - datasource: prometheus
            expr: round(increase(watcher_taskrun_sign_created_total[5m]))
            legendFormat: sign created
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 2
        title: PipelineRun Signatures Created (last 5m)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 12, y: 0 }
        targets:
          - datasource: prometheus
            expr: round(increase(watcher_pipelinerun_sign_created_total[5m]))
            legendFormat: sign created
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 3
        title: Payloads Stored (last 5m, TaskRun vs PipelineRun)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 0, y: 8 }
        targets:
          - datasource: prometheus
            expr: round(increase(watcher_taskrun_payload_stored_total[5m]))
            legendFormat: TaskRun
            refId: A
          - datasource: prometheus
            expr: round(increase(watcher_pipelinerun_payload_stored_total[5m]))
            legendFormat: PipelineRun
            refId: B
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 4
        title: Marked Signed (last 5m, TaskRun vs PipelineRun)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 12, x: 12, y: 8 }
        targets:
          - datasource: prometheus
            expr: round(increase(watcher_taskrun_marked_signed_total[5m]))
            legendFormat: TaskRun
            refId: A
          - datasource: prometheus
            expr: round(increase(watcher_pipelinerun_marked_signed_total[5m]))
            legendFormat: PipelineRun
            refId: B
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []

Tekton Chains Dashboard Interpretation (Common Questions)

  • TaskRun Signatures Created (last 5m), PipelineRun Signatures Created (last 5m), Payloads Stored (last 5m), and Marked Signed (last 5m) are all computed with increase(...[5m]), so each data point shows the counter increment over the trailing five-minute window.
  • When there is no new signing or storage activity, these lines drop to 0; a flat zero line does not by itself indicate a component fault.
  • Payloads Stored and Marked Signed measure different processing stages, so their values are not expected to match exactly at all times.
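To build intuition for how these panels behave, the counter semantics behind increase(...[5m]) can be sketched in a few lines. This is a simplified illustration only: real PromQL also extrapolates the result to the window boundaries, so actual panel values may be slightly larger than a raw last-minus-first difference. The function name and sample data below are hypothetical.

```python
def simple_increase(samples):
    """Approximate PromQL increase() over one window of counter samples.

    samples: list of (timestamp, value) pairs inside the window.
    Counter resets (the value dropping, e.g. after a controller restart)
    are handled by counting the post-reset value as new growth.
    Boundary extrapolation, which real PromQL applies, is omitted.
    """
    if len(samples) < 2:
        return 0.0
    total = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        if value < prev:
            # Counter reset: the counter restarted near zero, so the
            # observed value is entirely new increase.
            total += value
        else:
            total += value - prev
        prev = value
    return total

# Steady signing activity: the counter rises from 10 to 14 in the window.
print(simple_increase([(0, 10), (60, 11), (120, 13), (180, 14)]))  # 4.0

# Idle window: the counter is unchanged, so the panel shows 0 —
# this is expected, not a component fault.
print(simple_increase([(0, 14), (60, 14), (120, 14)]))  # 0.0
```

This is why the dashboard lines drop to zero between runs: the underlying counters (e.g. watcher_taskrun_sign_created_total) only ever move when Chains performs work inside the window.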