# Prometheus

Coder exposes many metrics that a Prometheus server can consume to give
insight into the current state of a live Coder deployment.

If you don't have a Prometheus server installed, you can follow the Prometheus
[Getting started](https://prometheus.io/docs/prometheus/latest/getting_started/) guide.

## Enable Prometheus metrics

The Coder server exports metrics via an HTTP endpoint, which you can enable
using either the environment variable `CODER_PROMETHEUS_ENABLE` or the flag
`--prometheus-enable`.

The Prometheus endpoint address is `http://localhost:2112/` by default. You can
use either the environment variable `CODER_PROMETHEUS_ADDRESS` or the flag
`--prometheus-address <network-interface>:<port>` to select a different listen
address.

If `coder server --prometheus-enable` is started locally, you can preview the
metrics endpoint in your browser or with `curl`:

```console
$ curl http://localhost:2112/
# HELP coderd_api_active_users_duration_hour The number of users that have been active within the last hour.
# TYPE coderd_api_active_users_duration_hour gauge
coderd_api_active_users_duration_hour 0
...
```

### Kubernetes deployment

The Prometheus endpoint can be enabled in the [Helm chart's](https://github.com/coder/coder/tree/main/helm)
`values.yml` by setting the environment variable `CODER_PROMETHEUS_ENABLE=true`.
Once enabled, the environment variable `CODER_PROMETHEUS_ADDRESS` is set to
`0.0.0.0:2112` by default. The chart does not expose a Service endpoint for the
metrics port; if you need to expose the Prometheus port on a Service (for
example, to use a `ServiceMonitor`), create a separate headless service instead.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: coder-prom
  namespace: coder
spec:
  clusterIP: None
  ports:
    - name: prom-http
      port: 2112
      protocol: TCP
      targetPort: 2112
  selector:
    app.kubernetes.io/instance: coder
    app.kubernetes.io/name: coder
  type: ClusterIP
```
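
As a rough sketch, the `CODER_PROMETHEUS_ENABLE=true` setting mentioned at the
start of this section could look like the following in your Helm values. The
`coder.env` list layout is an assumption about the chart's structure, so check
it against the values file shipped with your chart version.

```yaml
coder:
  env:
    # Turns on the Prometheus metrics endpoint of the Coder server.
    # Assumption: the chart accepts extra environment variables under coder.env.
    - name: CODER_PROMETHEUS_ENABLE
      value: "true"
```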

### Prometheus configuration

To allow Prometheus to scrape the Coder metrics, you will need to create a
`scrape_config` in your `prometheus.yml` file, or in the Prometheus Helm chart
values. The following is an example `scrape_config`.

```yaml
scrape_configs:
  - job_name: "coder"
    scheme: "http"
    static_configs:
      # replace with the IP address of the Coder pod or server
      - targets: ["<ip>:2112"]
        labels:
          apps: "coder"
```
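
If Prometheus runs inside the same cluster and you created the headless
`coder-prom` service shown earlier, DNS-based discovery is one possible
alternative to hard-coding pod IPs. The sketch below assumes the default
`cluster.local` cluster domain and the `coder` namespace; adjust both to match
your environment.

```yaml
scrape_configs:
  - job_name: "coder"
    scheme: "http"
    dns_sd_configs:
      # A headless service resolves to one A record per Coder pod,
      # so each pod becomes its own scrape target.
      # Assumption: default cluster domain and the coder namespace.
      - names: ["coder-prom.coder.svc.cluster.local"]
        type: "A"
        port: 2112
```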

To use the Kubernetes Prometheus operator to scrape metrics, you will need to
create a `ServiceMonitor` in your Coder deployment namespace. The following is
an example `ServiceMonitor`.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: coder-service-monitor
  namespace: coder
spec:
  endpoints:
    - port: prom-http
      interval: 10s
      scrapeTimeout: 10s
  namespaceSelector:
    matchNames:
    - coder
  selector:
    matchLabels:
      app.kubernetes.io/name: coder
```

## Available metrics

The `coderd_agentstats_*` metrics must first be enabled with the flag
`--prometheus-collect-agent-stats` or the environment variable
`CODER_PROMETHEUS_COLLECT_AGENT_STATS` before they can be retrieved from the
deployment. They are always available from the agent.
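
For a Helm-based deployment, agent stats collection might be enabled alongside
the metrics endpoint as sketched below. The `coder.env` list layout is an
assumption about the chart's structure; verify it against your chart version.

```yaml
coder:
  env:
    # Enables the Prometheus metrics endpoint.
    - name: CODER_PROMETHEUS_ENABLE
      value: "true"
    # Makes the coderd_agentstats_* metrics available from the deployment.
    - name: CODER_PROMETHEUS_COLLECT_AGENT_STATS
      value: "true"
```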

<!-- Code generated by 'make docs/admin/integrations/prometheus.md'. DO NOT EDIT -->

| Name                                                          | Type      | Description                                                                                                                      | Labels                                                                               |
|---------------------------------------------------------------|-----------|----------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------|
| `agent_scripts_executed_total`                                | counter   | Total number of scripts executed by the Coder agent. Includes cron scheduled scripts.                                            | `agent_name` `success` `template_name` `username` `workspace_name`                   |
| `coderd_agents_apps`                                          | gauge     | Agent applications with statuses.                                                                                                | `agent_name` `app_name` `health` `username` `workspace_name`                         |
| `coderd_agents_connection_latencies_seconds`                  | gauge     | Agent connection latencies in seconds.                                                                                           | `agent_name` `derp_region` `preferred` `username` `workspace_name`                   |
| `coderd_agents_connections`                                   | gauge     | Agent connections with statuses.                                                                                                 | `agent_name` `lifecycle_state` `status` `tailnet_node` `username` `workspace_name`   |
| `coderd_agents_up`                                            | gauge     | The number of active agents per workspace.                                                                                       | `template_name` `username` `workspace_name`                                          |
| `coderd_agentstats_connection_count`                          | gauge     | The number of established connections by agent                                                                                   | `agent_name` `username` `workspace_name`                                             |
| `coderd_agentstats_connection_median_latency_seconds`         | gauge     | The median agent connection latency                                                                                              | `agent_name` `username` `workspace_name`                                             |
| `coderd_agentstats_currently_reachable_peers`                 | gauge     | The number of peers (e.g. clients) that are currently reachable over the encrypted network.                                      | `agent_name` `connection_type` `template_name` `username` `workspace_name`           |
| `coderd_agentstats_rx_bytes`                                  | gauge     | Agent Rx bytes                                                                                                                   | `agent_name` `username` `workspace_name`                                             |
| `coderd_agentstats_session_count_jetbrains`                   | gauge     | The number of session established by JetBrains                                                                                   | `agent_name` `username` `workspace_name`                                             |
| `coderd_agentstats_session_count_reconnecting_pty`            | gauge     | The number of session established by reconnecting PTY                                                                            | `agent_name` `username` `workspace_name`                                             |
| `coderd_agentstats_session_count_ssh`                         | gauge     | The number of session established by SSH                                                                                         | `agent_name` `username` `workspace_name`                                             |
| `coderd_agentstats_session_count_vscode`                      | gauge     | The number of session established by VSCode                                                                                      | `agent_name` `username` `workspace_name`                                             |
| `coderd_agentstats_startup_script_seconds`                    | gauge     | The number of seconds the startup script took to execute.                                                                        | `agent_name` `success` `template_name` `username` `workspace_name`                   |
| `coderd_agentstats_tx_bytes`                                  | gauge     | Agent Tx bytes                                                                                                                   | `agent_name` `username` `workspace_name`                                             |
| `coderd_api_active_users_duration_hour`                       | gauge     | The number of users that have been active within the last hour.                                                                  |                                                                                      |
| `coderd_api_concurrent_requests`                              | gauge     | The number of concurrent API requests.                                                                                           |                                                                                      |
| `coderd_api_concurrent_websockets`                            | gauge     | The total number of concurrent API websockets.                                                                                   |                                                                                      |
| `coderd_api_request_latencies_seconds`                        | histogram | Latency distribution of requests in seconds.                                                                                     | `method` `path`                                                                      |
| `coderd_api_requests_processed_total`                         | counter   | The total number of processed API requests                                                                                       | `code` `method` `path`                                                               |
| `coderd_api_websocket_durations_seconds`                      | histogram | Websocket duration distribution of requests in seconds.                                                                          | `path`                                                                               |
| `coderd_api_workspace_latest_build`                           | gauge     | The latest workspace builds with a status.                                                                                       | `status`                                                                             |
| `coderd_api_workspace_latest_build_total`                     | gauge     | DEPRECATED: use coderd_api_workspace_latest_build instead                                                                        | `status`                                                                             |
| `coderd_insights_applications_usage_seconds`                  | gauge     | The application usage per template.                                                                                              | `application_name` `slug` `template_name`                                            |
| `coderd_insights_parameters`                                  | gauge     | The parameter usage per template.                                                                                                | `parameter_name` `parameter_type` `parameter_value` `template_name`                  |
| `coderd_insights_templates_active_users`                      | gauge     | The number of active users of the template.                                                                                      | `template_name`                                                                      |
| `coderd_license_active_users`                                 | gauge     | The number of active users.                                                                                                      |                                                                                      |
| `coderd_license_limit_users`                                  | gauge     | The user seats limit based on the active Coder license.                                                                          |                                                                                      |
| `coderd_license_user_limit_enabled`                           | gauge     | Returns 1 if the current license enforces the user limit.                                                                        |                                                                                      |
| `coderd_metrics_collector_agents_execution_seconds`           | histogram | Histogram for duration of agents metrics collection in seconds.                                                                  |                                                                                      |
| `coderd_oauth2_external_requests_rate_limit`                  | gauge     | The total number of allowed requests per interval.                                                                               | `name` `resource`                                                                    |
| `coderd_oauth2_external_requests_rate_limit_next_reset_unix`  | gauge     | Unix timestamp of the next interval                                                                                              | `name` `resource`                                                                    |
| `coderd_oauth2_external_requests_rate_limit_remaining`        | gauge     | The remaining number of allowed requests in this interval.                                                                       | `name` `resource`                                                                    |
| `coderd_oauth2_external_requests_rate_limit_reset_in_seconds` | gauge     | Seconds until the next interval                                                                                                  | `name` `resource`                                                                    |
| `coderd_oauth2_external_requests_rate_limit_total`            | gauge     | DEPRECATED: use coderd_oauth2_external_requests_rate_limit instead                                                               | `name` `resource`                                                                    |
| `coderd_oauth2_external_requests_rate_limit_used`             | gauge     | The number of requests made in this interval.                                                                                    | `name` `resource`                                                                    |
| `coderd_oauth2_external_requests_total`                       | counter   | The total number of api calls made to external oauth2 providers. 'status_code' will be 0 if the request failed with no response. | `name` `source` `status_code`                                                        |
| `coderd_prebuilt_workspace_claim_duration_seconds`            | histogram | Time to claim a prebuilt workspace by organization, template, and preset.                                                        | `organization_name` `preset_name` `template_name`                                    |
| `coderd_provisionerd_job_timings_seconds`                     | histogram | The provisioner job time duration in seconds.                                                                                    | `provisioner` `status`                                                               |
| `coderd_provisionerd_jobs_current`                            | gauge     | The number of currently running provisioner jobs.                                                                                | `provisioner`                                                                        |
| `coderd_provisionerd_num_daemons`                             | gauge     | The number of provisioner daemons.                                                                                               |                                                                                      |
| `coderd_provisionerd_workspace_build_timings_seconds`         | histogram | The time taken for a workspace to build.                                                                                         | `status` `template_name` `template_version` `workspace_transition`                   |
| `coderd_workspace_builds_total`                               | counter   | The number of workspaces started, updated, or deleted.                                                                           | `action` `owner_email` `status` `template_name` `template_version` `workspace_name`  |
| `coderd_workspace_creation_duration_seconds`                  | histogram | Time to create a workspace by organization, template, preset, and type (regular or prebuild).                                    | `organization_name` `preset_name` `template_name` `type`                             |
| `coderd_workspace_creation_total`                             | counter   | Total regular (non-prebuilt) workspace creations by organization, template, and preset.                                          | `organization_name` `preset_name` `template_name`                                    |
| `coderd_workspace_latest_build_status`                        | gauge     | The current workspace statuses by template, transition, and owner.                                                               | `status` `template_name` `template_version` `workspace_owner` `workspace_transition` |
| `go_gc_duration_seconds`                                      | summary   | A summary of the pause duration of garbage collection cycles.                                                                    |                                                                                      |
| `go_goroutines`                                               | gauge     | Number of goroutines that currently exist.                                                                                       |                                                                                      |
| `go_info`                                                     | gauge     | Information about the Go environment.                                                                                            | `version`                                                                            |
| `go_memstats_alloc_bytes`                                     | gauge     | Number of bytes allocated and still in use.                                                                                      |                                                                                      |
| `go_memstats_alloc_bytes_total`                               | counter   | Total number of bytes allocated, even if freed.                                                                                  |                                                                                      |
| `go_memstats_buck_hash_sys_bytes`                             | gauge     | Number of bytes used by the profiling bucket hash table.                                                                         |                                                                                      |
| `go_memstats_frees_total`                                     | counter   | Total number of frees.                                                                                                           |                                                                                      |
| `go_memstats_gc_sys_bytes`                                    | gauge     | Number of bytes used for garbage collection system metadata.                                                                     |                                                                                      |
| `go_memstats_heap_alloc_bytes`                                | gauge     | Number of heap bytes allocated and still in use.                                                                                 |                                                                                      |
| `go_memstats_heap_idle_bytes`                                 | gauge     | Number of heap bytes waiting to be used.                                                                                         |                                                                                      |
| `go_memstats_heap_inuse_bytes`                                | gauge     | Number of heap bytes that are in use.                                                                                            |                                                                                      |
| `go_memstats_heap_objects`                                    | gauge     | Number of allocated objects.                                                                                                     |                                                                                      |
| `go_memstats_heap_released_bytes`                             | gauge     | Number of heap bytes released to OS.                                                                                             |                                                                                      |
| `go_memstats_heap_sys_bytes`                                  | gauge     | Number of heap bytes obtained from system.                                                                                       |                                                                                      |
| `go_memstats_last_gc_time_seconds`                            | gauge     | Number of seconds since 1970 of last garbage collection.                                                                         |                                                                                      |
| `go_memstats_lookups_total`                                   | counter   | Total number of pointer lookups.                                                                                                 |                                                                                      |
| `go_memstats_mallocs_total`                                   | counter   | Total number of mallocs.                                                                                                         |                                                                                      |
| `go_memstats_mcache_inuse_bytes`                              | gauge     | Number of bytes in use by mcache structures.                                                                                     |                                                                                      |
| `go_memstats_mcache_sys_bytes`                                | gauge     | Number of bytes used for mcache structures obtained from system.                                                                 |                                                                                      |
| `go_memstats_mspan_inuse_bytes`                               | gauge     | Number of bytes in use by mspan structures.                                                                                      |                                                                                      |
| `go_memstats_mspan_sys_bytes`                                 | gauge     | Number of bytes used for mspan structures obtained from system.                                                                  |                                                                                      |
| `go_memstats_next_gc_bytes`                                   | gauge     | Number of heap bytes when next garbage collection will take place.                                                               |                                                                                      |
| `go_memstats_other_sys_bytes`                                 | gauge     | Number of bytes used for other system allocations.                                                                               |                                                                                      |
| `go_memstats_stack_inuse_bytes`                               | gauge     | Number of bytes in use by the stack allocator.                                                                                   |                                                                                      |
| `go_memstats_stack_sys_bytes`                                 | gauge     | Number of bytes obtained from system for stack allocator.                                                                        |                                                                                      |
| `go_memstats_sys_bytes`                                       | gauge     | Number of bytes obtained from system.                                                                                            |                                                                                      |
| `go_threads`                                                  | gauge     | Number of OS threads created.                                                                                                    |                                                                                      |
| `process_cpu_seconds_total`                                   | counter   | Total user and system CPU time spent in seconds.                                                                                 |                                                                                      |
| `process_max_fds`                                             | gauge     | Maximum number of open file descriptors.                                                                                         |                                                                                      |
| `process_open_fds`                                            | gauge     | Number of open file descriptors.                                                                                                 |                                                                                      |
| `process_resident_memory_bytes`                               | gauge     | Resident memory size in bytes.                                                                                                   |                                                                                      |
| `process_start_time_seconds`                                  | gauge     | Start time of the process since unix epoch in seconds.                                                                           |                                                                                      |
| `process_virtual_memory_bytes`                                | gauge     | Virtual memory size in bytes.                                                                                                    |                                                                                      |
| `process_virtual_memory_max_bytes`                            | gauge     | Maximum amount of virtual memory available in bytes.                                                                             |                                                                                      |
| `promhttp_metric_handler_requests_in_flight`                  | gauge     | Current number of scrapes being served.                                                                                          |                                                                                      |
| `promhttp_metric_handler_requests_total`                      | counter   | Total number of scrapes by HTTP status code.                                                                                     | `code`                                                                               |

<!-- End generated by 'make docs/admin/integrations/prometheus.md'. -->

### Note on Prometheus native histogram support

The following metrics support native histograms:

* `coderd_workspace_creation_duration_seconds`
* `coderd_prebuilt_workspace_claim_duration_seconds`

Native histograms are an **experimental** Prometheus feature that removes the need to predefine bucket boundaries and allows higher-resolution buckets that adapt to deployment characteristics.
Whether a metric is exposed as classic or native depends entirely on the Prometheus server configuration (see [Prometheus docs](https://prometheus.io/docs/specs/native_histograms/) for details):

* If native histograms are enabled, Prometheus ingests the high-resolution histogram.
* If not, it falls back to the predefined buckets.

⚠️ Important: classic and native histograms cannot be aggregated together. If Prometheus is switched from classic to native at a certain point in time, dashboards may need to account for that transition.
For this reason, it’s recommended to follow [Prometheus’ migration guidelines](https://prometheus.io/docs/specs/native_histograms/#migration-considerations) when moving from classic to native histograms.
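
To illustrate why dashboards need to account for the transition, the following
sketch of a Prometheus rules file computes the same 95th percentile in both
forms. Classic histograms are queried through the `_bucket` series and grouped
by `le`, while native histograms are queried on the metric name directly. The
group name, recording rule names, and the 5-minute window are hypothetical
choices for illustration.

```yaml
groups:
  - name: coder-workspace-creation
    rules:
      # Classic histogram: aggregate the predefined buckets by their le label.
      - record: coder:workspace_creation_duration_seconds:p95_classic
        expr: >-
          histogram_quantile(0.95,
            sum by (le) (rate(coderd_workspace_creation_duration_seconds_bucket[5m])))
      # Native histogram: no _bucket suffix and no le label.
      - record: coder:workspace_creation_duration_seconds:p95_native
        expr: >-
          histogram_quantile(0.95,
            sum(rate(coderd_workspace_creation_duration_seconds[5m])))
```

Only the rule that matches the histogram type Prometheus ingested for a given
time range returns data, so a dashboard that spans the switch may need both
queries.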