Metrics Collection for Tekton Components
Overview
Tekton components expose Prometheus-compatible metrics via HTTP endpoints. By deploying ServiceMonitor resources, Prometheus (or VictoriaMetrics) can automatically discover and scrape these metrics.
Namespace Note: This document uses tekton-pipelines as the default namespace for control-plane components (Pipelines, Triggers, Results, Chains).
The primary exception is EventListener Services, which run in application namespaces where EventListeners are created.
If your deployment uses different namespaces, update both the commands and the namespaceSelector fields in ServiceMonitor resources below.
This document covers metrics for the following Tekton components:
- Tekton Pipelines - PipelineRun / TaskRun execution metrics
- Tekton Triggers - EventListener, TriggerBinding, and related resource metrics
- Tekton Results - Run deletion and storage metrics
- Tekton Chains - Signing and provenance metrics
- Controller Framework - Infrastructure metrics shared by all controllers
It also covers:
- How to configure metrics behavior via config-observability
- How to deploy ServiceMonitor resources for scraping
- How to verify that metrics collection is working
Prerequisites
- Tekton control-plane components are installed and running (at minimum, the components you plan to scrape: Pipelines, Triggers, Results, and/or Chains).
- kubectl is configured against the target cluster, and your account can create ServiceMonitor resources in the monitoring namespace.
- A monitoring stack (Prometheus or a compatible system such as VictoriaMetrics) is deployed and can discover and scrape ServiceMonitor resources (or the equivalent scrape-discovery objects used by your platform).
- Your Prometheus/VictoriaMetrics instance is configured to discover the ServiceMonitor objects you create (namespace and label selectors must match).
- Network policies and firewalls allow scraper pods to reach the Tekton metrics ports (9090 for most control-plane services, 9000 for the Triggers controller and EventListener sink).
- If you want EventListener sink metrics, EventListeners must exist in their target namespaces and expose the http-metrics port.
Tekton Pipelines
The Tekton Pipelines component includes multiple sub-services that expose metrics on port 9090:
The Pipeline controller metrics use the prefix tekton_pipelines_controller_.
PipelineRun Metrics
* Labels marked with * are optional and depend on the config-observability configuration.
running_pipelineruns Label Levels
The running_pipelineruns metric labels are controlled by metrics.running-pipelinerun.level:
Status Label Values
For PipelineRun metrics:
- success - PipelineRun completed successfully
- failed - PipelineRun failed
- cancelled - PipelineRun was cancelled
For TaskRun metrics:
- success - TaskRun completed successfully
- failed - TaskRun failed
TaskRun Metrics
config-observability Configuration
The config-observability ConfigMap in the tekton-pipelines namespace controls metrics behavior for the Pipeline controller. This ConfigMap is managed by the Tekton Operator and should be configured via the TektonConfig resource's spec.pipeline.options.configMaps field. See Adjusting Optional Configuration Items for Subcomponents for details.
Hot reload behavior: config-observability is watched at runtime. Most key changes (for example metrics.*) take effect without restarting Pods. Allow one or two scrape intervals for dashboard/query changes to appear. A restart is only needed when Pod spec settings change (for example changing CONFIG_OBSERVABILITY_NAME in the Deployment).
Example configuration via TektonConfig:
apiVersion: operator.tekton.dev/v1alpha1
kind: TektonConfig
metadata:
  name: config
spec:
  pipeline:
    options:
      disabled: false
      configMaps:
        config-observability:
          data:
            metrics.backend-destination: prometheus
            # PipelineRun metrics aggregation level.
            # Values: "pipelinerun" | "pipeline" (default) | "namespace"
            # - "pipelinerun": includes pipeline + pipelinerun labels; duration uses LastValue
            # - "pipeline": includes pipeline label only
            # - "namespace": no pipeline/pipelinerun labels
            metrics.pipelinerun.level: "pipeline"
            # TaskRun metrics aggregation level.
            # Values: "taskrun" | "task" (default) | "namespace"
            # - "taskrun": includes task + taskrun labels; duration uses LastValue
            # - "task": includes task label only
            # - "namespace": no task/taskrun labels
            metrics.taskrun.level: "task"
            # Duration metric type for PipelineRun / TaskRun.
            # Values: "histogram" (default) | "lastvalue"
            # Note: When pipelinerun.level is "pipelinerun" or taskrun.level is "taskrun",
            # duration type is forced to "lastvalue" regardless of this setting.
            metrics.pipelinerun.duration-type: "histogram"
            metrics.taskrun.duration-type: "histogram"
            # Running PipelineRun metrics aggregation level.
            # Values: "pipelinerun" | "pipeline" | "namespace" | "" (default, cluster-level)
            metrics.running-pipelinerun.level: ""
            # Include reason label on duration metrics (pipelinerun_duration_seconds,
            # taskrun_duration_seconds, pipelinerun_taskrun_duration_seconds).
            # Values: "true" | "false" (default)
            # Warning: Enabling this increases label cardinality.
            # Note: Despite the key name, this does NOT affect count metrics
            # (pipelinerun_total / taskrun_total), only duration metrics.
            metrics.count.enable-reason: "false"
            # Include namespace label on throttled TaskRun metrics.
            # Values: "true" | "false" (default)
            metrics.taskrun.throttle.enable-namespace: "false"
Histogram Buckets
When the duration type is histogram, the following bucket boundaries (in seconds) are used:
10, 30, 60, 300, 900, 1800, 3600, 5400, 10800, 21600, 43200, 86400
This corresponds to: 10s, 30s, 1m, 5m, 15m, 30m, 1h, 1.5h, 3h, 6h, 12h, 24h.
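With fixed boundaries like these, completion-time ratios can be read straight from the buckets. A hedged PromQL sketch (assumes the histogram duration type and the default tekton_pipelines_controller_ prefix):

```promql
# Fraction of PipelineRuns completing within 5 minutes (le="300" is one of the fixed boundaries)
sum(rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket{le="300"}[30m]))
/
sum(rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket{le="+Inf"}[30m]))
```

Keep in mind that histogram_quantile interpolates within these boundaries, so quantile estimates are only as precise as the buckets around them.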
Recommended Production Configuration
For production environments, use aggregated levels to control label cardinality:
metrics.pipelinerun.level: "pipeline"
metrics.taskrun.level: "task"
metrics.pipelinerun.duration-type: "histogram"
metrics.taskrun.duration-type: "histogram"
metrics.count.enable-reason: "false"
If you need per-run granularity for debugging, temporarily switch to:
metrics.pipelinerun.level: "pipelinerun"
metrics.taskrun.level: "taskrun"
Note that this will significantly increase the number of time series.
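To quantify that increase before committing to it, count the time series the duration metric family currently produces and compare after switching levels. A sketch (the metric name assumes the default prefix):

```promql
# Active series for the PipelineRun duration metric family
count({__name__=~"tekton_pipelines_controller_pipelinerun_duration_seconds(_bucket|_sum|_count)?"})
```

At pipelinerun level each run contributes its own label set, so the count grows with run volume rather than with the number of distinct pipelines.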
Tekton Triggers
The Tekton Triggers component exposes two categories of metrics from different processes.
Controller Metrics (port 9000)
The Triggers controller reports resource count metrics every 60 seconds.
The Triggers controller metrics use the prefix controller_.
EventListener Sink Metrics
Each EventListener pod exposes additional HTTP and event processing metrics. These metrics come from the EventListener sink process (not the controller). The Prometheus metric prefix is eventlistener_.
eventlistener_http_duration_seconds histogram buckets: 0.001, 0.01, 0.1, 1, 10 (seconds)
eventlistener_event_received_count status values: succeeded, failed
eventlistener_triggered_resources kind values: the Kubernetes resource Kind of the created object (e.g., PipelineRun, TaskRun)
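Once scraped, the sink metrics support the usual histogram and counter queries. A hedged sketch (names assume the eventlistener_ prefix described above):

```promql
# P95 EventListener HTTP handling latency over the last 5 minutes
histogram_quantile(0.95, sum by (le) (rate(eventlistener_http_duration_seconds_bucket[5m])))

# Event receive rate, split by status (succeeded / failed)
sum by (status) (rate(eventlistener_event_received_count[5m]))
```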
These sink metrics are exposed per EventListener pod, not from the central controller. You may need a separate ServiceMonitor or PodMonitor to scrape them if the EventListener pods expose a metrics port.
Tekton Results
Tekton Results has two sub-services that expose metrics.
Watcher Metrics
The Watcher metrics use the prefix watcher_.
Deletion Metrics
* Optional labels depend on config-observability settings for the Results Watcher.
Note: pipelinerun_delete_count, pipelinerun_delete_duration_seconds, taskrun_delete_count, and taskrun_delete_duration_seconds are only recorded when the Watcher actually deletes runs. These metrics will remain empty (no data points) unless the --completed_run_grace_period flag is set to a non-zero value on the tekton-results-watcher Deployment. By default this flag is 0, which disables automatic deletion. Set it to a positive duration (e.g. 10m) to enable deletion after a grace period, or to a negative value to delete immediately after archiving.
Status label values for the Results Watcher:
- success - Run completed successfully
- failed - Run failed
- cancelled - Run was cancelled
Shared Metrics
These metrics are registered by both the PipelineRun and TaskRun reconcilers in the Watcher, tracking storage-related events.
The kind label identifies run type (PipelineRun / TaskRun in some metric series, pipelinerun / taskrun in others).
Note: runs_not_stored_count is only recorded when a run is externally deleted (e.g. via kubectl delete) while the Watcher is holding a finalizer to coordinate archiving. It will remain empty unless all of the following conditions are met:
- The --logs_api flag is false (log storage disabled); if logs are enabled, the Watcher skips finalizer-based coordination entirely.
- The --disable_crd_update flag is false (annotation updates enabled).
- The --store_deadline flag is set to a non-zero duration; this is the maximum time the Watcher waits for archiving to complete before giving up and allowing deletion.
- A run is externally deleted before it is successfully archived (no results.tekton.dev/stored=true annotation), and the store_deadline has elapsed.
In normal operation (runs archived before deletion, or deletion triggered by the Watcher itself via --completed_run_grace_period), this counter stays at zero. A non-zero value indicates potential data loss: runs were deleted before their state could be saved to the Results API.
Quick reproduction (test environment):
If you do not see this metric, that usually means the trigger conditions were not met, not that the metric is missing.
- Configure the Results Watcher via TektonConfig so that logs_api=false, disable_crd_update=false, and store_deadline is non-zero (for example 30s).
- Temporarily set Results API replicas to 0 via TektonConfig (spec.result.options.deployments.tekton-results-api.spec.replicas: 0) so runs cannot be archived.
- Create a TaskRun or PipelineRun and wait until it completes.
- Wait until store_deadline has elapsed, then externally delete the run (kubectl delete ...).
- Check the Watcher /metrics endpoint or Prometheus for watcher_runs_not_stored_count (the component-prefixed name in exposition format); it should increase.
- Restore the original TektonConfig (re-enable Results API replicas and the normal logs_api settings).
The run_storage_latency_seconds histogram uses the following bucket boundaries (in seconds):
0.1, 0.5, 1, 2, 5, 10, 30, 60, 120, 300, 600, 1800
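These boundaries make archiving-latency percentiles straightforward to compute. A hedged sketch (assumes the watcher_ component prefix in exposition format):

```promql
# P95 time from run completion to successful archiving in Results
histogram_quantile(0.95, sum by (le) (rate(watcher_run_storage_latency_seconds_bucket[5m])))
```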
Watcher config-observability
The Results Watcher has its own config-observability ConfigMap (named via the CONFIG_OBSERVABILITY_NAME environment variable, typically tekton-results-config-observability). This ConfigMap is managed by the Tekton Operator and should be configured via the TektonConfig resource's spec.results.options.configMaps field. See Adjusting Optional Configuration Items for Subcomponents for details.
Hot reload behavior: Results Watcher also watches this ConfigMap and applies most key changes without Pod restarts. A restart is only needed when Deployment-level settings (such as env vars/args) are changed.
It supports the following keys:
Note: Unlike Tekton Pipelines, the Results Watcher does not support pipelinerun / taskrun individual-run granularity levels. It also does not have the metrics.count.enable-reason, metrics.running-pipelinerun.level, or metrics.taskrun.throttle.enable-namespace keys.
Known issue in upstream: taskrun_delete_duration_seconds uses metrics.pipelinerun.duration-type (not metrics.taskrun.duration-type) to determine the aggregation type. This appears to be a copy-paste bug in the Results source code.
API Server Metrics
The API server exposes standard gRPC Prometheus metrics via the go-grpc-prometheus library on port 9090. These include:
grpc_server_handled_total - Total RPCs completed on the server
grpc_server_started_total - Total RPCs started on the server
grpc_server_msg_received_total / grpc_server_msg_sent_total - Message counts
grpc_server_handling_seconds (if PROMETHEUS_HISTOGRAM is enabled) - RPC handling duration
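These standard gRPC metrics support the usual rate and error-ratio queries. A hedged sketch (label names follow go-grpc-prometheus conventions):

```promql
# Results API request rate by RPC method
sum by (grpc_method) (rate(grpc_server_started_total{grpc_service=~"tekton.results.*"}[5m]))

# Server-side error ratio (non-OK completions over all completions)
sum(rate(grpc_server_handled_total{grpc_code!="OK"}[5m]))
/
sum(rate(grpc_server_handled_total[5m]))
```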
Tekton Chains
Tekton Chains is a security component that generates, signs, and stores provenance for artifacts built with Tekton Pipelines. It observes completed TaskRuns and PipelineRuns, then creates attestations and signatures.
The Chains controller metrics use the prefix watcher_ (same as Results Watcher, but the custom metric names are different, so there are no collisions).
Chains Metrics
All Chains metrics are Counters with no labels.
Note: The official Tekton Chains documentation also mentions *_signing_failures_total counters for both TaskRun and PipelineRun, but these are not present in the current upstream source code. Verify against your deployed version.
Controller Framework Metrics
All Tekton controllers automatically expose the following infrastructure metrics. These metrics use the same prefix as the component's custom metrics (e.g., tekton_pipelines_controller_, controller_, watcher_).
Setting Up ServiceMonitor
To enable Prometheus scraping for Tekton components, deploy ServiceMonitor resources.
The Prerequisites section above applies to all of the ServiceMonitor resources below.
Use the following guidance based on your monitoring stack:
- If you use Prometheus (Prometheus Operator), labels such as metadata.labels.prometheus: kube-prometheus must match the Prometheus CR spec.serviceMonitorSelector; otherwise, the ServiceMonitor will not be scraped.
- If you use VictoriaMetrics, you typically do not need labels like prometheus: kube-prometheus; create ServiceMonitor/VMServiceScrape resources according to your monitoring setup.
When using Prometheus, use the following commands to find and verify the selector:
# 1) Locate Prometheus CRs (resource type: monitoring.coreos.com/v1, Kind=Prometheus)
$ kubectl get prometheus -A
# 2) Check ServiceMonitor selector on the target Prometheus instance
$ kubectl get prometheus -n <prometheus-namespace> <prometheus-name> -o yaml | yq '.spec.serviceMonitorSelector'
If no Prometheus CR exists in your cluster, monitoring is usually platform-managed (for example, VictoriaMetrics) or implemented differently. In such cases, labels like prometheus: kube-prometheus are usually not required; follow your platform scraping rules.
For more information, see Integrating External Metrics.
Pipeline ServiceMonitor
Pipeline ServiceMonitor YAML
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tekton-pipelines-metrics
  namespace: tekton-pipelines
  labels:
    app.kubernetes.io/name: tekton-pipelines
    # prometheus: kube-prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/part-of: tekton-pipelines
  endpoints:
    - port: http-metrics
      path: /metrics
      interval: 30s
  namespaceSelector:
    matchNames:
      - tekton-pipelines
This ServiceMonitor matches Pipeline services with the label app.kubernetes.io/part-of: tekton-pipelines (including remote-resolvers) and scrapes them in the tekton-pipelines namespace.
Triggers ServiceMonitor
Triggers ServiceMonitor YAML
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tekton-triggers-metrics
  namespace: tekton-pipelines
  labels:
    app.kubernetes.io/name: tekton-triggers
    # prometheus: kube-prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/part-of: tekton-triggers
      app.kubernetes.io/component: controller
  endpoints:
    - port: http-metrics
      path: /metrics
      interval: 30s
  namespaceSelector:
    matchNames:
      - tekton-pipelines
This ServiceMonitor collects Triggers controller metrics (controller_*) only. It does not include EventListener sink metrics.
EventListener Sink ServiceMonitor
EventListener Sink ServiceMonitor YAML
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tekton-eventlistener-sink-metrics
  namespace: tekton-pipelines
  labels:
    app.kubernetes.io/name: tekton-eventlistener-sink
    # prometheus: kube-prometheus
spec:
  selector:
    matchExpressions:
      - key: eventlistener
        operator: Exists
      - key: app.kubernetes.io/managed-by
        operator: In
        values:
          - EventListener
  endpoints:
    - port: http-metrics
      path: /metrics
      interval: 30s
  namespaceSelector:
    any: true
EventListener Services usually run in application namespaces, so this example uses namespaceSelector.any: true for cross-namespace scraping. If you need tighter scope, switch to matchNames and list allowed namespaces explicitly.
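For example, to limit scraping to a known set of application namespaces, replace the selector with an explicit list (the namespace names here are placeholders):

```yaml
# Scoped alternative to `any: true`; list your EventListener namespaces explicitly
namespaceSelector:
  matchNames:
    - team-a
    - team-b
```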
Results ServiceMonitor
The Results services have both app.kubernetes.io/part-of: tekton-results and app.kubernetes.io/name labels. To precisely target API + Watcher (and exclude Postgres), this example matches app.kubernetes.io/name:
Results ServiceMonitor YAML
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tekton-results-metrics
  namespace: tekton-pipelines
  labels:
    app.kubernetes.io/name: tekton-results
    # prometheus: kube-prometheus
spec:
  selector:
    matchExpressions:
      - key: app.kubernetes.io/name
        operator: In
        values:
          - tekton-results-api
          - tekton-results-watcher
  endpoints:
    - port: prometheus
      path: /metrics
      interval: 30s
    - port: metrics
      path: /metrics
      interval: 30s
  namespaceSelector:
    matchNames:
      - tekton-pipelines
The Results API server uses port name prometheus (9090) and the Watcher uses port name metrics (9090). Each service only exposes one of these port names, so only the matching endpoint will be scraped.
Chains ServiceMonitor
Chains ServiceMonitor YAML
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tekton-chains-metrics
  namespace: tekton-pipelines
  labels:
    app.kubernetes.io/name: tekton-chains
    # prometheus: kube-prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/part-of: tekton-chains
  endpoints:
    - port: http-metrics
      path: /metrics
      interval: 30s
  namespaceSelector:
    matchNames:
      - tekton-pipelines
Verification
After deploying the ServiceMonitor resources, verify that Prometheus is scraping the targets.
Check Metrics Endpoints Directly
# Pipeline controller
$ kubectl port-forward -n tekton-pipelines svc/tekton-pipelines-controller 9090:9090
$ curl -s http://localhost:9090/metrics | grep tekton_pipelines_controller_
# HELP tekton_pipelines_controller_client_latency How long Kubernetes API requests take
# TYPE tekton_pipelines_controller_client_latency histogram
tekton_pipelines_controller_client_latency_bucket{name="",le="1e-05"} 0
tekton_pipelines_controller_client_latency_bucket{name="",le="0.0001"} 0
tekton_pipelines_controller_client_latency_bucket{name="",le="0.001"} 0
# Triggers controller
$ kubectl port-forward -n tekton-pipelines svc/tekton-triggers-controller 9000:9000
$ curl -s http://localhost:9000/metrics | grep controller_
# HELP controller_client_latency How long Kubernetes API requests take
# TYPE controller_client_latency histogram
controller_client_latency_bucket{name="",le="1e-05"} 0
controller_client_latency_bucket{name="",le="0.0001"} 1
controller_client_latency_bucket{name="",le="0.001"} 2
# EventListener sink metrics (replace namespace/service)
$ kubectl port-forward -n <eventlistener-namespace> svc/<eventlistener-service> 9000:9000
$ curl -s http://localhost:9000/metrics | grep eventlistener_
# HELP eventlistener_client_latency How long Kubernetes API requests take
# TYPE eventlistener_client_latency histogram
eventlistener_client_latency_bucket{name="",le="1e-05"} 0
eventlistener_client_latency_bucket{name="",le="0.0001"} 0
eventlistener_client_latency_bucket{name="",le="0.001"} 0
# HELP eventlistener_triggered_resources Count of the number of triggered eventlistener resources
# TYPE eventlistener_triggered_resources counter
eventlistener_triggered_resources{kind="PipelineRun"} 10
# Results watcher
$ kubectl port-forward -n tekton-pipelines svc/tekton-results-watcher 9091:9090
$ curl -s http://localhost:9091/metrics | grep watcher_
# HELP watcher_client_latency How long Kubernetes API requests take
# TYPE watcher_client_latency histogram
watcher_client_latency_bucket{name="",le="1e-05"} 0
watcher_client_latency_bucket{name="",le="0.0001"} 0
watcher_client_latency_bucket{name="",le="0.001"} 0
# Results API
$ kubectl port-forward -n tekton-pipelines svc/tekton-results-api-service 9092:9090
$ curl -s http://localhost:9092/metrics | grep grpc_server_
# HELP grpc_server_handled_total Total number of RPCs completed on the server, regardless of success or failure.
# TYPE grpc_server_handled_total counter
grpc_server_handled_total{grpc_code="Aborted",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Aborted",grpc_method="CreateRecord",grpc_service="tekton.results.v1alpha2.Results",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Aborted",grpc_method="CreateResult",grpc_service="tekton.results.v1alpha2.Results",grpc_type="unary"} 0
# HELP grpc_server_started_total Total number of RPCs started on the server.
# TYPE grpc_server_started_total counter
grpc_server_started_total{grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 337606
grpc_server_started_total{grpc_method="CreateRecord",grpc_service="tekton.results.v1alpha2.Results",grpc_type="unary"} 10301
grpc_server_started_total{grpc_method="CreateResult",grpc_service="tekton.results.v1alpha2.Results",grpc_type="unary"} 832
# Chains controller
$ kubectl port-forward -n tekton-pipelines svc/tekton-chains-metrics 9093:9090
$ curl -s http://localhost:9093/metrics | grep watcher_
# HELP watcher_client_latency How long Kubernetes API requests take
# TYPE watcher_client_latency histogram
watcher_client_latency_bucket{name="",le="1e-05"} 0
watcher_client_latency_bucket{name="",le="0.0001"} 0
watcher_client_latency_bucket{name="",le="0.001"} 0
EventListener sink metrics such as eventlistener_event_received_count and eventlistener_http_duration_seconds are request-driven. Send at least one request to the EventListener before validating these metrics.
Check Prometheus Targets
# Verify ServiceMonitor resources exist
$ kubectl get servicemonitor -n tekton-pipelines
NAME AGE
tekton-chains-metrics 10m
tekton-eventlistener-sink-metrics 10m
tekton-pipelines-metrics 10m
tekton-results-metrics 10m
tekton-triggers-metrics 10m
# Check Prometheus targets (via Prometheus UI or API)
# Look for targets with job labels matching the ServiceMonitor names
Example PromQL Queries
# PipelineRun cumulative success rate (avoids misinterpretation in empty completion windows)
100 * sum(tekton_pipelines_controller_pipelinerun_total{status="success"}) / clamp_min(sum(tekton_pipelines_controller_pipelinerun_total), 1)
# Completed PipelineRuns in the last 5 minutes (throughput)
round(sum(increase(tekton_pipelines_controller_pipelinerun_total[5m])))
# PipelineRun duration P95 (histogram mode)
histogram_quantile(0.95,
rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket[5m])
)
# TaskRun duration P95 (histogram mode, includes standalone + in-pipeline TaskRuns)
histogram_quantile(0.95,
(
sum by (le) (rate(tekton_pipelines_controller_taskrun_duration_seconds_bucket[5m]))
+
sum by (le) (rate(tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket[5m]))
)
)
# PipelineRun duration (lastvalue mode)
avg_over_time(tekton_pipelines_controller_pipelinerun_duration_seconds[5m])
# Currently running PipelineRuns (single series to avoid duplicate legends)
max(tekton_pipelines_controller_running_pipelineruns)
# TaskRuns throttled by resource quota
max(tekton_pipelines_controller_running_taskruns_throttled_by_quota)
# Trigger resource counts
controller_eventlistener_count
controller_triggertemplate_count
# Chains signing activity
watcher_taskrun_sign_created_total
watcher_pipelinerun_sign_created_total
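The same queries can back alerting rules. A minimal PrometheusRule sketch for Prometheus Operator setups (the alert name, threshold, and labels are assumptions; adapt them to your monitoring stack):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: tekton-pipelines-alerts
  namespace: tekton-pipelines
spec:
  groups:
    - name: tekton-pipelines
      rules:
        - alert: TektonPipelineRunFailures
          # Fires when any PipelineRun failed within the last 15 minutes
          expr: sum(increase(tekton_pipelines_controller_pipelinerun_total{status="failed"}[15m])) > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: PipelineRuns are failing
```

As with the ServiceMonitor examples, this resource must match your Prometheus CR's ruleSelector to be loaded.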
MonitorDashboard Examples
The following MonitorDashboard resources provide ready-to-use dashboards for monitoring Tekton components. Deploy them to the cpaas-system namespace under the tekton folder.
Important: Each panel must include id (unique integer), datasource: prometheus, and transformations: []. Each target must include datasource: prometheus and refId. Duration P50/P95 panels in this document use *_bucket queries and require metrics.*.duration-type=histogram; if you use lastvalue, replace those queries with LastValue-style expressions such as avg_over_time(...).
Tekton Pipeline Dashboard
Tekton Pipeline Dashboard YAML
kind: MonitorDashboard
apiVersion: ait.alauda.io/v1alpha2
metadata:
  labels:
    cpaas.io/dashboard.folder: tekton
    cpaas.io/dashboard.is.home.dashboard: "false"
    cpaas.io/dashboard.tag.tekton: "true"
  name: tekton-pipeline
  namespace: cpaas-system
spec:
  body:
    titleZh: Tekton Pipeline Overview
    tags:
      - tekton
    time:
      from: now-1h
      to: now
    templating:
      list: []
    panels:
      - id: 1
        title: PipelineRun Total (by status)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 0, y: 0 }
        targets:
          - datasource: prometheus
            expr: sum by (status) (tekton_pipelines_controller_pipelinerun_total)
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 2
        title: TaskRun Total (by status)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 8, y: 0 }
        targets:
          - datasource: prometheus
            expr: sum by (status) (tekton_pipelines_controller_taskrun_total)
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 3
        title: PipelineRun Success Rate (cumulative)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 4, x: 16, y: 0 }
        targets:
          - datasource: prometheus
            expr: "100 * sum(tekton_pipelines_controller_pipelinerun_total{status=\"success\"}) / clamp_min(sum(tekton_pipelines_controller_pipelinerun_total), 1)"
            refId: A
        fieldConfig:
          defaults:
            unit: percent
            color: { mode: thresholds }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds:
              mode: absolute
              steps:
                - { color: red, value: null }
                - { color: orange, value: 80 }
                - { color: green, value: 95 }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 12
        title: Completed PipelineRuns (last 5m)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 4, x: 20, y: 0 }
        targets:
          - datasource: prometheus
            expr: "round(sum(increase(tekton_pipelines_controller_pipelinerun_total[5m])))"
            legendFormat: completed
            refId: A
        fieldConfig:
          defaults:
            unit: short
            decimals: 0
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 4
        title: Running PipelineRuns
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 0, y: 8 }
        targets:
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_pipelineruns)
            legendFormat: running
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 5
        title: Running TaskRuns
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 8, y: 8 }
        targets:
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_taskruns)
            legendFormat: running
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 6
        title: TaskRuns Throttled
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 16, y: 8 }
        targets:
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_taskruns_throttled_by_quota)
            legendFormat: by quota
            refId: A
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_taskruns_throttled_by_node)
            legendFormat: by node
            refId: B
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: orange, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 7
        title: PipelineRun Duration P50 / P95
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 0, y: 16 }
        targets:
          - datasource: prometheus
            expr: (histogram_quantile(0.5, sum by (le) (rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P50
            refId: A
          - datasource: prometheus
            expr: (histogram_quantile(0.95, sum by (le) (rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_pipelinerun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P95
            refId: B
        fieldConfig:
          defaults:
            unit: s
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 8
        title: TaskRun Duration P50 / P95 (Standalone)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 8, y: 16 }
        targets:
          - datasource: prometheus
            expr: (histogram_quantile(0.5, sum by (le) (rate(tekton_pipelines_controller_taskrun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_taskrun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P50
            refId: A
          - datasource: prometheus
            expr: (histogram_quantile(0.95, sum by (le) (rate(tekton_pipelines_controller_taskrun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_taskrun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P95
            refId: B
        fieldConfig:
          defaults:
            unit: s
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 13
        title: TaskRun Duration P50 / P95 (In-Pipeline)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 16, y: 16 }
        targets:
          - datasource: prometheus
            expr: (histogram_quantile(0.5, sum by (le) (rate(tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P50
            refId: A
          - datasource: prometheus
            expr: (histogram_quantile(0.95, sum by (le) (rate(tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket[5m])))) and on() (sum(rate(tekton_pipelines_controller_pipelinerun_taskrun_duration_seconds_bucket{le="+Inf"}[5m])) > 0)
            legendFormat: P95
            refId: B
        fieldConfig:
          defaults:
            unit: s
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 9
        title: Workqueue Depth
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 0, y: 24 }
        targets:
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_workqueue_depth)
            legendFormat: depth
            refId: A
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 10
        title: Reconcile Count (by success)
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 8, y: 24 }
        targets:
          - datasource: prometheus
            expr: sum(increase(tekton_pipelines_controller_reconcile_count{success="true"}[5m]))
            legendFormat: success=true
            refId: A
          - datasource: prometheus
            expr: sum(increase(tekton_pipelines_controller_reconcile_count{success="false"}[5m]))
            legendFormat: success=false
            refId: B
        fieldConfig:
          defaults:
            color: { mode: palette-classic }
            custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
            thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
          overrides: []
        options:
          legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
          tooltip: { mode: multi, sort: desc }
        transformations: []
      - id: 11
        title: Resolution Waiting
        type: timeseries
        datasource: prometheus
        gridPos: { h: 8, w: 8, x: 16, y: 24 }
        targets:
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_pipelineruns_waiting_on_pipeline_resolution)
            legendFormat: PR waiting pipeline
            refId: A
          - datasource: prometheus
            expr: max(tekton_pipelines_controller_running_pipelineruns_waiting_on_task_resolution
legendFormat: PR waiting task
refId: B
- datasource: prometheus
expr: max(tekton_pipelines_controller_running_taskruns_waiting_on_task_resolution_count)
legendFormat: TR waiting task
refId: C
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: orange, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
Tekton Pipelines Dashboard Interpretation (Common Questions)
- PipelineRun Total (by status) is a completion-event counter recorded by the controller, not the total number of PipelineRun objects. In the current implementation, user-triggered cancellation (spec.status=Cancelled) may not pass through this counting path, so the cancelled series may be absent. To validate cancellation volume, inspect PipelineRun objects and events instead.
- Running PipelineRuns is a real-time snapshot (how many runs are executing right now); it can change independently of PipelineRun Total.
- Completed PipelineRuns (last 5m) is throughput (runs newly completed in the last 5 minutes). A value of 0 during low-traffic or idle periods is expected.
- PipelineRun Success Rate (cumulative) is cumulative since controller start, not a 5-minute success rate, so a short-lived failure does not immediately cause a large shift.
- Reconcile Count (by success) measures controller reconcile loops, not PipelineRun counts.
- Status series appear only for label values that actually have samples in the selected time range; if a status has no samples in the window, its curve and legend entry are absent.
- TaskRun Duration P50 / P95 (Standalone) and TaskRun Duration P50 / P95 (In-Pipeline) are split to avoid mixed-query instability. In environments that expose only one of the two histogram families, the other panel may be empty, which is expected.
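To make the duration panels easier to reason about, here is a minimal Python sketch (not the actual Prometheus implementation, and with extrapolation details omitted) of how histogram_quantile derives P50/P95 from cumulative le-buckets, and why the dashboard queries guard with `and on() (sum(rate(..._bucket{le="+Inf"}[5m])) > 0)`:

```python
def histogram_quantile(q, buckets):
    """buckets: list of (le_upper_bound, cumulative_count), sorted by le,
    ending with float('inf'). Returns the interpolated quantile, or None
    when the window holds no samples -- the case the dashboard's
    `and on() (... > 0)` guard exists to suppress."""
    total = buckets[-1][1]
    if total == 0:
        return None  # no samples: histogram_quantile would yield NaN
    rank = q * total
    prev_le, prev_count = 0.0, 0.0
    for le, count in buckets:
        if count >= rank:
            if le == float("inf"):
                return prev_le  # quantile falls in the open-ended bucket
            # linear interpolation inside the bucket, as Prometheus does
            return prev_le + (le - prev_le) * (rank - prev_count) / (count - prev_count)
        prev_le, prev_count = le, count

# Hypothetical window: 10 TaskRuns with durations spread over 0-60s buckets
buckets = [(10.0, 4), (30.0, 8), (60.0, 10), (float("inf"), 10)]
print(histogram_quantile(0.5, buckets))   # -> 15.0 (interpolated in the 10-30s bucket)
print(histogram_quantile(0.95, buckets))  # -> 52.5 (interpolated in the 30-60s bucket)
```

This also explains why quantile lines can look coarse when bucket boundaries are wide: the value is interpolated within a bucket, not observed directly.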
Tekton Triggers Dashboard
Tekton Triggers Dashboard YAML
kind: MonitorDashboard
apiVersion: ait.alauda.io/v1alpha2
metadata:
labels:
cpaas.io/dashboard.folder: tekton
cpaas.io/dashboard.is.home.dashboard: "false"
cpaas.io/dashboard.tag.tekton: "true"
name: tekton-triggers
namespace: cpaas-system
spec:
body:
titleZh: Tekton Triggers Overview
tags:
- tekton
time:
from: now-1h
to: now
templating:
list: []
panels:
- id: 1
title: EventListener Count
type: timeseries
datasource: prometheus
gridPos: { h: 6, w: 5, x: 0, y: 0 }
targets:
- datasource: prometheus
expr: controller_eventlistener_count
legendFormat: EventListener
refId: A
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 2
title: TriggerTemplate Count
type: timeseries
datasource: prometheus
gridPos: { h: 6, w: 5, x: 5, y: 0 }
targets:
- datasource: prometheus
expr: controller_triggertemplate_count
legendFormat: TriggerTemplate
refId: A
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 3
title: TriggerBinding Count
type: timeseries
datasource: prometheus
gridPos: { h: 6, w: 5, x: 10, y: 0 }
targets:
- datasource: prometheus
expr: controller_triggerbinding_count
legendFormat: TriggerBinding
refId: A
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 4
title: ClusterTriggerBinding
type: timeseries
datasource: prometheus
gridPos: { h: 6, w: 5, x: 15, y: 0 }
targets:
- datasource: prometheus
expr: controller_clustertriggerbinding_count
legendFormat: ClusterTriggerBinding
refId: A
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 5
title: ClusterInterceptor
type: timeseries
datasource: prometheus
gridPos: { h: 6, w: 4, x: 20, y: 0 }
targets:
- datasource: prometheus
expr: controller_clusterinterceptor_count
legendFormat: ClusterInterceptor
refId: A
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 6
title: All Trigger Resource Counts (trend)
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 24, x: 0, y: 6 }
targets:
- datasource: prometheus
expr: controller_eventlistener_count
legendFormat: EventListener
refId: A
- datasource: prometheus
expr: controller_triggertemplate_count
legendFormat: TriggerTemplate
refId: B
- datasource: prometheus
expr: controller_triggerbinding_count
legendFormat: TriggerBinding
refId: C
- datasource: prometheus
expr: controller_clustertriggerbinding_count
legendFormat: ClusterTriggerBinding
refId: D
- datasource: prometheus
expr: controller_clusterinterceptor_count
legendFormat: ClusterInterceptor
refId: E
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
Tekton Triggers Dashboard Interpretation (Common Questions)
- EventListener Count, TriggerTemplate Count, TriggerBinding Count, ClusterTriggerBinding, and ClusterInterceptor are object-count snapshots, not request volume or event-processing throughput.
- All Trigger Resource Counts (trend) shows the combined trend for the same resource counts. Small deviations from the single-resource panels within a scrape interval are expected.
- Showing 0 when no Triggers resources exist is normal and does not indicate a scraping failure.
Tekton Results Dashboard
Tekton Results Dashboard YAML
kind: MonitorDashboard
apiVersion: ait.alauda.io/v1alpha2
metadata:
labels:
cpaas.io/dashboard.folder: tekton
cpaas.io/dashboard.is.home.dashboard: "false"
cpaas.io/dashboard.tag.tekton: "true"
name: tekton-results
namespace: cpaas-system
spec:
body:
titleZh: Tekton Results Overview
tags:
- tekton
time:
from: now-1h
to: now
templating:
list: []
panels:
- id: 1
title: PipelineRun Reconcile Count (last 5m)
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 0, y: 0 }
targets:
- datasource: prometheus
expr: round(sum(increase(watcher_reconcile_count{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.pipelinerun.Reconciler",success="true"}[5m])))
legendFormat: success=true
refId: A
- datasource: prometheus
expr: round(sum(increase(watcher_reconcile_count{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.pipelinerun.Reconciler",success="false"}[5m])))
legendFormat: success=false
refId: B
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 2
title: TaskRun Reconcile Count (last 5m)
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 12, y: 0 }
targets:
- datasource: prometheus
expr: round(sum(increase(watcher_reconcile_count{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.taskrun.Reconciler",success="true"}[5m])))
legendFormat: success=true
refId: A
- datasource: prometheus
expr: round(sum(increase(watcher_reconcile_count{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.taskrun.Reconciler",success="false"}[5m])))
legendFormat: success=false
refId: B
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 3
title: PipelineRun Reconcile Latency P95
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 0, y: 8 }
targets:
- datasource: prometheus
expr: histogram_quantile(0.95, sum by (le) (rate(watcher_reconcile_latency_bucket{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.pipelinerun.Reconciler"}[5m])))
legendFormat: P95
refId: A
fieldConfig:
defaults:
unit: ms
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 4
title: TaskRun Reconcile Latency P95
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 12, y: 8 }
targets:
- datasource: prometheus
expr: histogram_quantile(0.95, sum by (le) (rate(watcher_reconcile_latency_bucket{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.taskrun.Reconciler"}[5m])))
legendFormat: P95
refId: A
fieldConfig:
defaults:
unit: ms
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 5
title: Workqueue Depth (PipelineRun vs TaskRun)
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 0, y: 16 }
targets:
- datasource: prometheus
expr: sum(watcher_work_queue_depth{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.pipelinerun.Reconciler"})
legendFormat: pipelinerun
refId: A
- datasource: prometheus
expr: sum(watcher_work_queue_depth{reconciler="github.com.tektoncd.results.pkg.watcher.reconciler.taskrun.Reconciler"})
legendFormat: taskrun
refId: B
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 6
title: Workqueue Adds (last 5m)
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 12, y: 16 }
targets:
- datasource: prometheus
expr: round(sum(increase(watcher_workqueue_adds_total{name=~"github.com.tektoncd.results.pkg.watcher.reconciler.pipelinerun.Reconciler-(consumer|fast|slow)"}[5m])))
legendFormat: pipelinerun adds
refId: A
- datasource: prometheus
expr: round(sum(increase(watcher_workqueue_adds_total{name=~"github.com.tektoncd.results.pkg.watcher.reconciler.taskrun.Reconciler-(consumer|fast|slow)"}[5m])))
legendFormat: taskrun adds
refId: B
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 7
title: gRPC Request Rate (Results API)
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 0, y: 24 }
targets:
- datasource: prometheus
expr: "sum(rate(grpc_server_handled_total{grpc_service=~\"tekton.results.*\"}[5m]))"
legendFormat: requests
refId: A
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 8
title: gRPC Error Percentage (Results API, excl. NotFound/Canceled)
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 12, y: 24 }
targets:
- datasource: prometheus
expr: "100 * ((sum(rate(grpc_server_handled_total{grpc_service=~\"tekton.results.*\",grpc_code!~\"OK|NotFound|Canceled\"}[5m])) or vector(0)) / clamp_min((sum(rate(grpc_server_handled_total{grpc_service=~\"tekton.results.*\"}[5m])) or vector(0)), 0.001))"
legendFormat: error %
refId: A
fieldConfig:
defaults:
unit: percent
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: red, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
Tekton Results Dashboard Interpretation (Common Questions)
- This dashboard revision is based on Results Watcher reconcile/workqueue metrics plus Results API gRPC metrics, so it stays populated under common deployments (logs_api=true, automatic deletion disabled).
- PipelineRun Reconcile Count (last 5m) and TaskRun Reconcile Count (last 5m) show separate 5-minute increments for success=true and success=false.
- PipelineRun Reconcile Latency P95 and TaskRun Reconcile Latency P95 are calculated from the watcher reconcile-latency histograms; under low traffic, the line can be sparse.
- Workqueue Depth shows current queue depth, while Workqueue Adds (last 5m) shows enqueue volume over the last 5 minutes.
- gRPC Error Percentage (Results API, excl. NotFound/Canceled) is the percentage of abnormal errors over total requests, excluding common business return codes (NotFound, Canceled).
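The error-percentage query uses two guards that are easy to miss: `or vector(0)` substitutes 0 when the error series is entirely absent, and `clamp_min(total, 0.001)` prevents division by zero on an idle API. A minimal Python sketch of the same arithmetic (with `None` standing in for an absent series; the values are assumptions for illustration):

```python
def error_percentage(error_rate, total_rate):
    """Mirror of the panel query:
    100 * ((errors or vector(0)) / clamp_min((total or vector(0)), 0.001))"""
    errors = error_rate if error_rate is not None else 0.0  # `or vector(0)`
    total = total_rate if total_rate is not None else 0.0   # `or vector(0)`
    total = max(total, 0.001)                               # clamp_min guard
    return 100.0 * errors / total

print(error_percentage(0.05, 2.0))   # -> 2.5 (2.5% of requests errored)
print(error_percentage(None, None))  # -> 0.0 (idle API: 0, not NaN or a gap)
```

Without these guards, the panel would show gaps (missing series) or NaN (zero denominator) whenever the Results API receives no traffic.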
Tekton Chains Dashboard
Tekton Chains Dashboard YAML
kind: MonitorDashboard
apiVersion: ait.alauda.io/v1alpha2
metadata:
labels:
cpaas.io/dashboard.folder: tekton
cpaas.io/dashboard.is.home.dashboard: "false"
cpaas.io/dashboard.tag.tekton: "true"
name: tekton-chains
namespace: cpaas-system
spec:
body:
titleZh: Tekton Chains Overview
tags:
- tekton
time:
from: now-1h
to: now
templating:
list: []
panels:
- id: 1
title: TaskRun Signatures Created (last 5m)
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 0, y: 0 }
targets:
- datasource: prometheus
expr: round(increase(watcher_taskrun_sign_created_total[5m]))
legendFormat: sign created
refId: A
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 2
title: PipelineRun Signatures Created (last 5m)
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 12, y: 0 }
targets:
- datasource: prometheus
expr: round(increase(watcher_pipelinerun_sign_created_total[5m]))
legendFormat: sign created
refId: A
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 3
title: Payloads Stored (last 5m, TaskRun vs PipelineRun)
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 0, y: 8 }
targets:
- datasource: prometheus
expr: round(increase(watcher_taskrun_payload_stored_total[5m]))
legendFormat: TaskRun
refId: A
- datasource: prometheus
expr: round(increase(watcher_pipelinerun_payload_stored_total[5m]))
legendFormat: PipelineRun
refId: B
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
- id: 4
title: Marked Signed (last 5m, TaskRun vs PipelineRun)
type: timeseries
datasource: prometheus
gridPos: { h: 8, w: 12, x: 12, y: 8 }
targets:
- datasource: prometheus
expr: round(increase(watcher_taskrun_marked_signed_total[5m]))
legendFormat: TaskRun
refId: A
- datasource: prometheus
expr: round(increase(watcher_pipelinerun_marked_signed_total[5m]))
legendFormat: PipelineRun
refId: B
fieldConfig:
defaults:
color: { mode: palette-classic }
custom: { drawStyle: line, fillOpacity: 0, lineWidth: 1, spanNulls: false }
thresholds: { mode: absolute, steps: [{ color: green, value: null }] }
overrides: []
options:
legend: { calcs: [latest], displayMode: list, placement: bottom, showLegend: true }
tooltip: { mode: multi, sort: desc }
transformations: []
Tekton Chains Dashboard Interpretation (Common Questions)
- TaskRun Signatures Created (last 5m), PipelineRun Signatures Created (last 5m), Payloads Stored (last 5m), and Marked Signed (last 5m) use increase(...[5m]) and therefore represent increments over the last five minutes.
- When there is no new signing or storage activity, these lines drop to 0; this does not imply a component fault.
- Payloads Stored and Marked Signed represent different processing stages, so their values are not expected to always match.
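The increase(...[5m]) behavior behind these panels can be sketched in a few lines of Python (a simplification: real Prometheus also extrapolates to the window boundaries, which this sketch omits; the sample values are assumptions for illustration):

```python
def increase(samples):
    """samples: counter values scraped within the window, oldest first.
    Sums positive deltas; a drop is treated as a counter reset, in which
    case the restarted counter's value is the post-reset increase."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total

print(increase([100, 102, 105]))  # -> 5.0 (5 new signatures in the window)
print(increase([105, 105, 105]))  # -> 0.0 (idle: flat counter, not a fault)
print(increase([105, 2, 4]))      # -> 4.0 (Chains restart mid-window handled)
```

This is why an idle cluster legitimately shows 0 on all four panels, and why a Chains controller restart does not produce a huge negative spike.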