Grafana Plugin (grafana/v1-alpha)

The Grafana plugin is an optional plugin that scaffolds Grafana dashboards for viewing the default metrics exported by projects that use controller-runtime.

When to use it?

How to use it?

Prerequisites:

Basic Usage

The Grafana plugin is attached to the init subcommand and the edit subcommand:

# Initialize a new project with grafana plugin
kubebuilder init --plugins grafana.kubebuilder.io/v1-alpha

# Enable the grafana plugin for an existing project
kubebuilder edit --plugins grafana.kubebuilder.io/v1-alpha

The plugin creates a new directory and scaffolds the JSON files under it (e.g. grafana/controller-runtime-metrics.json).

Showcase:

See an example of how to use the plugin in your project:

[Figure: output]

Now, let’s look at how to use the Grafana dashboards:

  1. Copy the content of the JSON file.
  2. Visit <your-grafana-url>/dashboard/import to import a new dashboard.
  3. Paste the JSON content into "Import via panel json", then press the Load button.
  4. Select the data source for Prometheus metrics.
  5. Once the JSON is imported into Grafana, the dashboard is ready.
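
Before pasting the file into Grafana, it can help to confirm that it parses as JSON and to see which panels it defines. The sketch below is a minimal illustration and not part of the plugin. It assumes the scaffolded file follows the usual Grafana dashboard layout, with a top-level "panels" list whose entries carry a "title"; the helper name panel_titles is hypothetical.

```python
import json


def panel_titles(dashboard_text):
    """Return the panel titles found in a Grafana dashboard JSON document.

    Assumption: the document has a top-level "panels" list whose entries
    may carry a "title"; row panels may nest further panels.
    """
    dashboard = json.loads(dashboard_text)  # raises ValueError on invalid JSON
    titles = []

    def walk(panels):
        for panel in panels:
            if "title" in panel:
                titles.append(panel["title"])
            # Collapsed rows keep their children under a nested "panels" key.
            walk(panel.get("panels", []))

    walk(dashboard.get("panels", []))
    return titles
```

For example, calling panel_titles on the text of grafana/controller-runtime-metrics.json should list the panels described in the following sections, if the assumption about the layout holds.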

Grafana Dashboard

Controller Runtime Reconciliation total & errors

  • Metrics:
    • controller_runtime_reconcile_total
    • controller_runtime_reconcile_errors_total
  • Query:
    • sum(rate(controller_runtime_reconcile_total{job="$job"}[5m])) by (instance, pod)
    • sum(rate(controller_runtime_reconcile_errors_total{job="$job"}[5m])) by (instance, pod)
  • Description:
    • Per-second rate of total reconciliations, measured over the last 5 minutes
    • Per-second rate of reconciliation errors, measured over the last 5 minutes
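
To make "per-second rate over the last 5 minutes" concrete, the following sketch computes a simplified rate from counter samples. It only approximates what the PromQL rate() function does: counter resets and the extrapolation Prometheus applies at window boundaries are ignored, and the function name and sample values are made up for illustration.

```python
def simple_rate(samples):
    """Approximate per-second increase of a counter.

    `samples` is a list of (unix_timestamp_seconds, counter_value) pairs,
    ordered by time, all falling inside the window of interest.
    Counter resets and boundary extrapolation are deliberately ignored.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a rate")
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    elapsed = t_last - t_first
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return (v_last - v_first) / elapsed


# Hypothetical values: 150 reconciliations observed across 300 seconds.
window = [(0, 1000), (150, 1080), (300, 1150)]
print(simple_rate(window))  # 0.5 reconciliations per second
```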

Controller CPU & Memory Usage

  • Metrics:
    • process_cpu_seconds_total
    • process_resident_memory_bytes
  • Query:
    • rate(process_cpu_seconds_total{job="$job", namespace="$namespace", pod="$pod"}[5m]) * 100
    • process_resident_memory_bytes{job="$job", namespace="$namespace", pod="$pod"}
  • Description:
    • Per-second rate of CPU usage, measured over the last 5 minutes and multiplied by 100 to express it as a percentage of one CPU core
    • Resident memory used by the running controller process

Seconds of P50/90/99 Items Stay in Work Queue

  • Metrics:
    • workqueue_queue_duration_seconds_bucket
  • Query:
    • histogram_quantile(0.50, sum(rate(workqueue_queue_duration_seconds_bucket{job="$job", namespace="$namespace"}[5m])) by (instance, name, le))
  • Description:
    • Seconds an item stays in the workqueue before being requested.
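
The panel title mentions P50, P90, and P99, while only the 0.50 query is listed. The sketch below generates all three variants under the assumption that they differ only in the first argument to histogram_quantile; the helper name quantile_queries is hypothetical and the scaffolded dashboard remains the authoritative source for the actual queries.

```python
# Template mirrors the P50 query listed above; only the quantile varies.
QUEUE_DURATION_TEMPLATE = (
    "histogram_quantile({q:.2f}, "
    "sum(rate(workqueue_queue_duration_seconds_bucket"
    '{{job="$job", namespace="$namespace"}}[5m])) '
    "by (instance, name, le))"
)


def quantile_queries(quantiles=(0.50, 0.90, 0.99)):
    """Return a mapping such as {"P50": "<query>"} for each quantile."""
    return {
        f"P{round(q * 100)}": QUEUE_DURATION_TEMPLATE.format(q=q)
        for q in quantiles
    }


for label, query in quantile_queries().items():
    print(label, query)
```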

Seconds of P50/90/99 Items Processed in Work Queue

  • Metrics:
    • workqueue_work_duration_seconds_bucket
  • Query:
    • histogram_quantile(0.50, sum(rate(workqueue_work_duration_seconds_bucket{job="$job", namespace="$namespace"}[5m])) by (instance, name, le))
  • Description:
    • Seconds that processing an item from the workqueue takes.

Add Rate in Work Queue

  • Metrics:
    • workqueue_adds_total
  • Query:
    • sum(rate(workqueue_adds_total{job="$job", namespace="$namespace"}[5m])) by (instance, name)
  • Description:
    • Per-second rate of items added to the work queue

Retries Rate in Work Queue

  • Metrics:
    • workqueue_retries_total
  • Query:
    • sum(rate(workqueue_retries_total{job="$job", namespace="$namespace"}[5m])) by (instance, name)
  • Description:
    • Per-second rate of retries handled by the workqueue

Number of Workers in Use

  • Metrics:
    • controller_runtime_active_workers
  • Query:
    • controller_runtime_active_workers{job="$job", namespace="$namespace"}
  • Description:
    • The number of active controller workers

WorkQueue Depth

  • Metrics:
    • workqueue_depth
  • Query:
    • workqueue_depth{job="$job", namespace="$namespace"}
  • Description:
    • Current depth of the workqueue

Unfinished Seconds

  • Metrics:
    • workqueue_unfinished_work_seconds
  • Query:
    • rate(workqueue_unfinished_work_seconds{job="$job", namespace="$namespace"}[5m])
  • Description:
    • Seconds of work done so far on in-progress items that has not yet been observed by work_duration.

Visualize Custom Metrics

The Grafana plugin supports scaffolding manifests for custom metrics.

Generate Config Template

When the plugin is triggered for the first time, grafana/custom-metrics/config.yaml is generated.

---
customMetrics:
#  - metric: # Raw custom metric (required)
#    type:   # Metric type: counter/gauge/histogram (required)
#    expr:   # Prom_ql for the metric (optional)
#    unit:   # Unit of measurement, examples: s,none,bytes,percent,etc. (optional)
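
The "required" and "optional" markers in the template can be turned into a small pre-flight check before re-running the plugin. The sketch below is an illustration only, not something the plugin provides. It assumes the file has already been parsed into Python data (for example with a YAML library, which is not in the standard library), and the name validate_custom_metrics is hypothetical.

```python
ALLOWED_TYPES = {"counter", "gauge", "histogram"}


def validate_custom_metrics(entries):
    """Return a list of human-readable problems; empty means none found.

    `entries` is the already-parsed `customMetrics` value: a list of dicts,
    or None when every entry in the template is still commented out.
    """
    problems = []
    for index, entry in enumerate(entries or []):
        if not entry.get("metric"):
            problems.append(f"entry {index}: 'metric' is required")
        metric_type = entry.get("type")
        if not metric_type:
            problems.append(f"entry {index}: 'type' is required")
        elif metric_type not in ALLOWED_TYPES:
            problems.append(
                f"entry {index}: unknown type {metric_type!r}, "
                f"expected one of {sorted(ALLOWED_TYPES)}"
            )
    return problems
```

The optional fields expr and unit are deliberately left unchecked, since the template does not constrain their values beyond a few examples.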

Add Custom Metrics to Config

You can enter multiple custom metrics in the file. For each element, you need to specify the metric and its type. If expr is omitted, the Grafana plugin generates one automatically for visualization; if you provide expr, the plugin uses it as given.

---
customMetrics:
  - metric: memcached_operator_reconcile_total # Raw custom metric (required)
    type: counter # Metric type: counter/gauge/histogram (required)
    unit: none
  - metric: memcached_operator_reconcile_time_seconds_bucket
    type: histogram
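
The expressions the plugin generates are not shown on this page. Purely as an illustration of what an expr derived from a metric and its type could look like, the sketch below builds expressions in the style of the built-in panel queries documented earlier. This is an assumption, not the plugin's implementation: the function name example_expr, the grouping labels, and the 0.90 quantile are all hypothetical choices.

```python
def example_expr(metric, metric_type):
    """Build an illustrative PromQL expression for a custom metric.

    NOT the plugin's actual logic: the shapes below simply mirror the
    built-in panel queries documented earlier on this page.
    """
    selector = f'{metric}{{job="$job", namespace="$namespace"}}'
    if metric_type == "counter":
        # Counters are usually viewed as a per-second rate.
        return f"sum(rate({selector}[5m])) by (instance, pod)"
    if metric_type == "histogram":
        # Histogram buckets are usually summarized as a quantile.
        return (
            "histogram_quantile(0.90, "
            f"sum(rate({selector}[5m])) by (instance, le))"
        )
    if metric_type == "gauge":
        # Gauges can be plotted directly.
        return selector
    raise ValueError(f"unsupported metric type: {metric_type!r}")


print(example_expr("memcached_operator_reconcile_total", "counter"))
```

To see what the plugin really produces, inspect the generated dashboard JSON or set expr explicitly in config.yaml.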

Scaffold Manifest

Once config.yaml is configured, you can run kubebuilder edit --plugins grafana.kubebuilder.io/v1-alpha again. This time, the plugin generates grafana/custom-metrics/custom-metrics-dashboard.json, which can be imported into the Grafana UI.

Showcase:

See an example of how to visualize your custom metrics:

[Figure: output2]

Subcommands

The Grafana plugin implements the following subcommands:

  • edit ($ kubebuilder edit [OPTIONS])

  • init ($ kubebuilder init [OPTIONS])

Affected files

The following scaffolds will be created or updated by this plugin:

  • grafana/*.json

Further resources