coder

mirror of https://github.com/coder/coder.git synced 2026-06-02 20:48:20 +00:00

Author	SHA1	Message	Date
Cian Johnston	4b585465b8	feat: label chatd metrics by model, add stream-state diagnostics (#24475) Adds production-observability metrics to coderd/x/chatd/ for model-level correlation and a chatStreams memory-leak investigation.
- Label per-request chatd metrics (steps_total, message_count, prompt_size_bytes, tool_result_size_bytes, ttft_seconds, compaction_total) with `model` and enrich the per-turn logger with provider/model.
- Add `coderd_chatd_stream_retries_total{provider, model, kind}` counter incremented in chatloop before OnRetry.
- Register a prometheus.Collector exposing `streams_active`, `stream_buffer_size_max`, `stream_buffer_events`, `stream_subscribers` from p.chatStreams.
- Add `coderd_chatd_stream_buffer_dropped_total` counter, incremented per publishToStream drop independently of the existing log-rate-limited bufferDropCount.
- Snapshot logger/model before the title-generation goroutine to avoid a data race with the logger/model rebind below it.
> 🤖	2026-04-17 16:16:30 +01:00
Cian Johnston	d7439a9de0	feat: add Prometheus metrics for chatd subsystem (#24371) Adds 7 Prometheus metrics to the chatd subsystem and introduces typed `ActivityBumpReason` for deadline bump attribution.
| Metric | Type | Labels |
|--------|------|--------|
| `coderd_chatd_chats` | Gauge | `state` (streaming, waiting) |
| `coderd_chatd_message_count` | Histogram | `provider` |
| `coderd_chatd_prompt_size_bytes` | Histogram | `provider` |
| `coderd_chatd_tool_result_size_bytes` | Histogram | `provider`, `tool_name` |
| `coderd_chatd_ttft_seconds` | Histogram | `provider` |
| `coderd_chatd_compaction_total` | Counter | `provider`, `result` |
| `coderd_chatd_steps_total` | Counter | `provider` |
> 🤖	2026-04-15 19:53:10 +01:00
Danny Kopping	48b90f8cc8	feat: add coder_build_info metric (#24365 ) _Disclaimer: produced by Claude Opus 4.6_ Adds a `coder_build_info` metric which allows operators to see which versions of Coder are currently running. --------- Signed-off-by: Danny Kopping <danny@coder.com>	2026-04-15 12:48:38 +00:00
J. Scott Miller	20b953a99d	feat: add Prometheus metric for agent first connection duration (#24179)
## Summary
Add `coderd_agents_first_connection_seconds` histogram metric that records the duration from workspace agent creation to first connection. This fills an observability gap — provisioner job timings and startup script metrics exist, but the agent connection phase (which can take several minutes) was not exposed to Prometheus.
Closes https://github.com/coder/coder/issues/21282
## Changes
- `coderd/prometheusmetrics/prometheusmetrics.go` — Define and register a `HistogramVec` in the existing `Agents()` polling loop. Observe `first_connected_at - created_at` exactly once per agent via a deduplication map, pruned each tick to prevent unbounded memory growth.
- `coderd/prometheusmetrics/prometheusmetrics_test.go` — Update `TestAgents` to set `first_connected_at` on the test agent and assert the histogram is collected with correct labels, sample count, and sample sum.
- `docs/admin/integrations/prometheus.md`, `scripts/metricsdocgen/generated_metrics` — Auto-generated documentation updates from `make gen`.
## Metric details
| Property | Value |
|---|---|
| Name | `coderd_agents_first_connection_seconds` |
| Type | histogram |
| Labels | `template_name`, `agent_name`, `username`, `workspace_name` |
| Buckets | 1s, 10s, 30s, 1m, 2m, 5m, 10m, 30m, 1h |
## Example PromQL
```promql
# P95 agent connection time by template
histogram_quantile(0.95,
  sum(rate(coderd_agents_first_connection_seconds_bucket[1h])) by (le, template_name)
)
```
<details>
<summary>Implementation notes</summary>
### Design decisions
- Histogram over gauge: Enables `histogram_quantile()` for percentile queries.
- Observe in `Agents()` polling loop: All required data is already fetched by `GetWorkspaceAgentsForMetrics()` — no new DB queries.
- Dedup via `map[uuid.UUID]struct{}`: Prevents re-observing the same agent across polling ticks. Pruned each cycle to bound memory.
- Buckets: Aligned with `coderd_provisionerd_workspace_build_timings_seconds` range (1s–1h).
### Overhead at scale (100k active workspaces)
The deduplication map (`observedFirstConnection`) and per-tick pruning map (`currentAgentIDs`) are both `map[[16]byte]struct{}`. At 100k agents:
- Memory: ~2.25 MB persistent + ~2.25 MB transient per tick = ~4.5 MB peak.
- CPU: ~25 ms of map operations per tick (one tick per minute) = <0.05% of one core.
Both are negligible relative to the existing cost of the `Agents()` loop (the DB query, per-agent `GetWorkspaceAppsByAgentID` calls, and coordinator node lookups dominate).
</details>
> 🤖 Generated by Coder Agents	2026-04-14 12:00:46 -05:00
Cian Johnston	f164463c6a	fix(scripts/metricsdocgen): shush the prometheus scanner in CI (#23642 ) - Suppress informational `log.Printf` messages from the metrics scanner when stdout is not a TTY (i.e. piped via `atomic_write` in `make gen` or CI) - Genuine warnings (`warnf`) still print unconditionally so real problems remain visible - `log.Fatalf` for fatal errors is unchanged > 🤖 Created by Coder Agents and reviewed by a human	2026-03-26 12:58:02 +00:00
Cian Johnston	f1d333f0e6	refactor: deduplicate utility helpers across the codebase (#23338) Audited exported helpers in `coderd/util/`, `testutil`, `cryptorand`, and friends, then replaced duplicated implementations with canonical versions.
- fix: `maps.SortedKeys` generic signature — value type was hardcoded to `any`, making it impossible to actually call. Added second type parameter `V any`. Added table-driven tests with `cmp.Diff`.
- refactor: replace ad-hoc ptr helpers with `ptr.Ref` — removed `int64Ptr`, `stringPtr`, `boolPtr`, `i64ptr`, `strPtr`, `PtrInt32` across 6 files.
- refactor: replace local `sortedKeys`/`sortKeys` with `maps.SortedKeys` — now that the signature is fixed, scripts can use it.
- refactor: replace hand-rolled `capitalize` with `strings.Capitalize` — the typegen version was also not UTF-8 safe.
> 🤖 This PR was created with the help of Coder Agents, and was reviewed by my human. 🧑‍💻	2026-03-20 15:12:41 +00:00
Jon Ayers	6c44de951d	feat: add Prometheus collector for DERP server expvar metrics (#22583) This PR does three things:
- Exports derp expvars to the pprof endpoint
- Exports the expvar metrics as prometheus metrics in both coderd and wsproxy
- Updates our tailscale to a fix I also had to make to avoid a data race condition
I generated this with mux but I also manually tested that the metrics were getting properly emitted	2026-03-06 01:57:58 -06:00
Mathias Fredriksson	a6a8fd94d7	build(Makefile): enable parallel `make -j gen` with correct dependency graph (#22612) `make gen` could not run with `-j` because inter-target dependency edges were missing. Multiple recipes compile `coderd/rbac` (which includes generated files like `object_gen.go`), and without explicit ordering, parallel runs produced syntax errors from mid-write reads. Three main changes:
Dependency graph fixes declare the compile-time chain through `coderd/rbac` so that `object_gen.go` is written before anything that imports it is compiled. The DB generation targets use a GNU Make 4.3+ grouped target (`&:`) so Make knows `generate.sh` co-produces `querier.go`, `unique_constraint.go`, `dbmetrics`, and `dbauthz` in a single invocation. `SKIP_DUMP_SQL=1` avoids re-entrant `make` inside `generate.sh` when the Makefile already guarantees `dump.sql` is fresh.
`scripts/atomicwrite` package replaces `os.WriteFile` in all gen scripts with a temp-file-in-same-dir + rename pattern, preventing interrupted runs from leaving partial files.
`.PRECIOUS` and shell atomic writes protect git-tracked generated files from Make's default delete-on-error behavior. Since these files are committed, deletion is worse than staleness -- `git restore` is the recovery path.
CI now runs `make -j --output-sync -B gen` (~32s, down from ~85s serial).
| Scenario                          | Before             | After    |
|-----------------------------------|--------------------|----------|
| `make gen` (serial)               | 95s                | 95s      |
| `make -j gen` (parallel)          | race error         | 22s      |
| CI `make -j --output-sync -B gen` | forced serial ~85s | ~32s     |
	2026-03-05 11:58:10 +00:00
Zach	5b7377c375	feat: add Prometheus metrics for boundary log drop reporting (#22521 ) Add Prometheus metrics to the boundary log proxy for observability: - batches_dropped_total (reason: buffer_full, forward_failed) - logs_dropped_total (reason: buffer_full, forward_failed, boundary_channel_full, boundary_batch_full) - batches_forwarded_total Also add BoundaryStatus to the BoundaryMessage envelope so boundary can report dropped log counts as a separate wire message. The agent records these as Prometheus metrics, making boundary-side data loss visible. Backwards compatibility for older versions of boundary is maintained. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 12:42:34 -07:00
Susana Ferreira	ca234f346d	fix: mark presets as validation_failed to prevent endless prebuild retries (#22085 ) ## Description - Updates `wsbuilder` to return a `BuildError` with `http.StatusBadRequest` to signify a "validation error" on missing or invalid parameters - Adds a short-circuit in `prebuilds.StoreReconciler` to mark presets for which creating a build returns a "validation error" as "validation failed" and skip further attempts to reconcile. - Adds a test to verify the above - Introduces a new Prometheus metric `coderd_prebuilt_workspaces_preset_validation_failed` to track the above Closes: https://github.com/coder/coder/issues/21237 --------- Co-authored-by: Cian Johnston <cian@coder.com>	2026-02-27 14:26:48 +00:00
Garrett Delfosse	4057363f78	fix(coderd): add organization_name label to insights Prometheus metrics (#22296)
## Description
When multiple organizations have templates with the same name, the Prometheus `/metrics` endpoint returns HTTP 500 because Prometheus rejects duplicate label combinations. The three `coderd_insights_` metrics (`coderd_insights_templates_active_users`, `coderd_insights_applications_usage_seconds`, `coderd_insights_parameters`) used only `template_name` as a distinguishing label, so two templates named e.g. `"openstack-v1"` in different orgs would produce duplicate metric series.
This adds `organization_name` as a label to all three insight metric descriptors to disambiguate templates across organizations.
## Changes
`coderd/prometheusmetrics/insights/metricscollector.go`:
- Added `organization_name` label to all three metric descriptors
- Added `organizationNames` field (template ID → org name) to the `insightsData` struct
- In `doTick`: after fetching templates, collect unique org IDs, fetch organizations via `GetOrganizations`, and build a template-ID-to-org-name mapping
- In `Collect()`: pass the organization name as an additional label value in every `MustNewConstMetric` call
`coderd/prometheusmetrics/insights/testdata/insights-metrics.json`: Updated golden file to include `organization_name=coder` in all metric label keys.
Fixes #21748	2026-02-25 08:58:50 +00:00
Susana Ferreira	a613ffa3d6	chore: integrate metrics scanner into Makefile (#21465 ) ## Description This PR wires up the metrics scanner in the Makefile to automatically regenerate metrics documentation when source files change. ## Changes * Add Makefile target `scripts/metricsdocgen/generated_metrics` to run the AST scanner to generate the metrics file * Update `docs/admin/integrations/prometheus.md` Makefile target to depend on `scripts/metricsdocgen/generated_metrics` * Add `scripts/metricsdocgen/README.md` documenting the metrics generation process Closes: https://github.com/coder/coder/issues/13223	2026-02-13 12:31:33 +00:00
Susana Ferreira	df84cea924	feat(scripts/metricsdocgen): support merging static and generated metrics files (#21464 ) ## Description This PR refactors `scripts/metricsdocgen/main.go` to support merging static and generated metrics files for documentation generation. The static `metrics` file remains necessary for metrics not defined in the coder codebase (`go_`, `process_`, `promhttp_`, `coder_aibridged_`), as well as edge cases the scanner cannot handle (e.g., such as metrics with runtime-determined labels or function-local variable references for fields, ...). Handling these edge cases in the scanner would make it significantly more complex, so we keep this hybrid approach to accommodate them. This means that in such cases, developers need to update the `metrics` file directly, meaning there is still a risk of out-of-date information in the documentation. However, this solution should already encompass most cases. Static metrics take priority over generated metrics when both files contain the same metric name, allowing manual overrides without modifying the scanner. Some of these edge cases could be easily fixed by updating the codebase to use one of the supported patterns. ## Changes * Update `scripts/metricsdocgen/main.go` to read from two separate metrics files: * `metrics`: static, manually maintained metrics (e.g., `go_`, `process_`, `promhttp_`, `coder_aibridged_`) * `generated_metrics`: auto-generated by the AST scanner * Update `metrics` file to contain only static and edge-case metrics * Skip metrics with empty HELP descriptions in the scanner * Update `generated_metrics` to reflect skipped metrics * Update `docs/admin/integrations/prometheus.md` with merged metrics Related to: https://github.com/coder/coder/issues/13223 Disclosure: This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira	2026-02-13 12:19:33 +00:00
Susana Ferreira	55d1a32424	feat(scripts/metricsdocgen): add promauto.With() pattern to metrics scanner (#21463 ) ## Description This PR implements extraction of metrics defined using `promauto.With()` factory patterns. ## Changes * Add `extractPromautoMetric()` to handle: * `promauto.With(reg).NewCounterVec(prometheus.CounterOpts{...}, labels)` * `factory.NewGaugeVec(prometheus.GaugeOpts{...}, labels)` * Script generates an updated `scripts/metricsdocgen/generated_metrics` file Related to: https://github.com/coder/coder/issues/13223 Disclosure: This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira	2026-02-13 11:24:33 +00:00
Susana Ferreira	bcb437d281	feat(scripts/metricsdocgen): add prometheus.New() and NewVec() patterns to metrics scanner (#21462 ) ## Description This PR implements extraction of metrics defined using `prometheus.New()` and `prometheus.NewVec()` patterns with `Opts{}` structs. ## Changes Add `extractOptsMetric()` to handle: * `prometheus.NewGauge(prometheus.GaugeOpts{...})` * `prometheus.NewCounter(prometheus.CounterOpts{...})` * `prometheus.NewHistogram(prometheus.HistogramOpts{...})` * `prometheus.NewSummary(prometheus.SummaryOpts{...})` * `prometheus.NewVec(prometheus.Opts{...}, labels)` * Script generates an updated `scripts/metricsdocgen/generated_metrics` file Related to: https://github.com/coder/coder/issues/13223 Disclosure: This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira	2026-02-13 11:13:55 +00:00
Susana Ferreira	45280d5516	feat(scripts/metricsdocgen): add prometheus.NewDesc() pattern to metrics scanner (#21461 ) ## Description This PR implements extraction of metrics defined using the `prometheus.NewDesc()` pattern. ## Changes * Add `extractNewDescMetric()` to extract metrics from `prometheus.NewDesc()` calls * Script generates an updated `scripts/metricsdocgen/generated_metrics` file Related to: https://github.com/coder/coder/issues/13223 Disclosure: This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira	2026-02-13 11:01:34 +00:00
Susana Ferreira	a9180d406e	feat(scripts/metricsdocgen): add AST scanner core for metrics doc generation (#21460)
## Description
This PR adds an AST-based scanner to automatically generate Prometheus metrics documentation from the coder source code.
## Changes
* Add `scripts/metricsdocgen/scanner/scanner.go` with:
  * Directory walking for `agent/`, `coderd/`, `enterprise/`, `provisionerd/`
  * Go file parsing (skipping `_test.go` files) AST inspection for metric extraction
  * `Metric.String()` for Prometheus text exposition format rendering
  * `writeMetrics()` to output metrics to stdout
  * Placeholder `extractMetricFromCall()` (implemented in subsequent PRs)
* Empty `scripts/metricsdocgen/generated_metrics` placeholder (populated by subsequent PRs)
Note: To facilitate the review process, this was separated into scoped stacked PRs. The division was based on the main structure, the different Prometheus patterns currently present in the codebase, and updates to the build process.
Related to: https://github.com/coder/coder/issues/13223
Disclosure: This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira	2026-02-13 10:48:55 +00:00
Callum Styan	5f3be6b288	feat: add provisioner job queue wait time histogram and jobs enqueued counter (#21869 ) This PR adds some metrics to help identify job enqueue rates and latencies. This work was initiated as a way to help reduce the cost of the observation/measurement itself for autostart scaletests, which impacts our ability to identify/reason about the load caused by autostart. See: https://github.com/coder/internal/issues/1209 I've extended the metrics here to account for regular user initiated builds, prebuilds, autostarts, etc. IMO there is still the question here of whether we want to include or need the `transition` label, which is only present on workspace builds. Including it does lead to an increase in cardinality, and in the case of the histogram (when not using native histograms) that's at least a few extra series for every bucket. We could remove the transition label there but keep it on the counter. Additionally, the histogram is currently observing latencies for other jobs, such as template builds/version imports, those do not have a transition type associated with them. Tested briefly in a workspace, can see metric values like the following: - `coderd_workspace_builds_enqueued_total{build_reason="autostart",provisioner_type="terraform",status="success",transition="start"} 1` - `coderd_provisioner_job_queue_wait_seconds_bucket{build_reason="autostart",job_type="workspace_build",provisioner_type="terraform",transition="start",le="0.025"} 1` --------- Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-12 13:40:47 -08:00
Jon Ayers	6035e45cb8	feat: add e2e workspace build duration metric (#21739 ) Adds coderd_template_workspace_build_duration_seconds histogram that tracks the full duration from workspace build creation to agent ready. This captures the complete user-perceived build time including provisioning and agent startup. The metric is emitted when the agent reports ready/error/timeout via the lifecycle API, ensuring each build is counted exactly once per replica.	2026-02-06 16:26:02 -06:00
Marcin Tojek	036ed5672f	fix!: remove deprecated prometheus metrics (#21788 ) ## Description Removes the following deprecated Prometheus metrics: - `coderd_api_workspace_latest_build_total` → use `coderd_api_workspace_latest_build` instead - `coderd_oauth2_external_requests_rate_limit_total` → use `coderd_oauth2_external_requests_rate_limit` instead These metrics were deprecated in #12976 because gauge metrics should avoid the `_total` suffix per [Prometheus naming conventions](https://prometheus.io/docs/practices/naming/). ## Changes - Removed deprecated metric `coderd_api_workspace_latest_build_total` from `coderd/prometheusmetrics/prometheusmetrics.go` - Removed deprecated metric `coderd_oauth2_external_requests_rate_limit_total` from `coderd/promoauth/oauth2.go` - Updated tests to use the non-deprecated metric name Fixes #12999	2026-01-30 13:30:06 +01:00
Marcin Tojek	04b0253e8a	feat: add Prometheus metrics for license warnings and errors (#21749) Fixes: coder/internal#767 Adds two new Prometheus metrics for license health monitoring:
- `coderd_license_warnings` - count of active license warnings
- `coderd_license_errors` - count of active license errors
Metrics endpoint after startup of a deployment with license enabled:
```
...
# HELP coderd_license_errors The number of active license errors.
# TYPE coderd_license_errors gauge
coderd_license_errors 0
...
# HELP coderd_license_warnings The number of active license warnings.
# TYPE coderd_license_warnings gauge
coderd_license_warnings 0
...
```
	2026-01-29 13:50:15 +01:00
Callum Styan	806d7e4c11	docs: update metrics docs to include metadata batcher metrics (#21665 ) This updates the metrics docs to include metrics added in https://github.com/coder/coder/pull/21330 Signed-off-by: Callum Styan <callumstyan@gmail.com>	2026-01-26 09:22:14 -08:00
Danny Kopping	c6631e1e50	feat: expose `aibridged` metrics (#20865 ) Upgrades `coder/aibridge` to v0.2.0 which includes https://github.com/coder/aibridge/pull/62. Creates a `prometheus.Registerer` with a prefix `coder_aibridged_` and passes that along to coder/aibridge which actually exposes the metrics. Also includes a side-effect of a change described in https://github.com/coder/aibridge/pull/62#discussion_r2550017470. --------- Signed-off-by: Danny Kopping <danny@coder.com>	2025-11-24 18:16:06 +02:00
Susana Ferreira	c1f8465de6	fix: add missing provisionerd metrics to docs (#20358 ) ## Description Add missing provisionerd metrics to Prometheus documentation: * `coderd_provisionerd_num_daemons`: The number of provisioner daemons. * `coderd_provisionerd_workspace_build_timings_seconds`: The time taken for a workspace to build. Related to internal thread: https://codercom.slack.com/archives/C07GRNNRW03/p1760642020583019	2025-10-20 11:33:45 +01:00
Susana Ferreira	0ab345ca84	feat: add prebuild timing metrics to Prometheus (#19503 ) ## Description This PR introduces one counter and two histograms related to workspace creation and claiming. The goal is to provide clearer observability into how workspaces are created (regular vs prebuild) and the time cost of those operations. ### `coderd_workspace_creation_total` * Metric type: Counter * Name: `coderd_workspace_creation_total` * Labels: `organization_name`, `template_name`, `preset_name` This counter tracks whether a regular workspace (not created from a prebuild pool) was created using a preset or not. Currently, we already expose `coderd_prebuilt_workspaces_claimed_total` for claimed prebuilt workspaces, but we lack a comparable metric for regular workspace creations. This metric fills that gap, making it possible to compare regular creations against claims. Implementation notes: * Exposed as a `coderd_` metric, consistent with other workspace-related metrics (e.g. `coderd_api_workspace_latest_build`: https://github.com/coder/coder/blob/main/coderd/prometheusmetrics/prometheusmetrics.go#L149). * Every `defaultRefreshRate` (1 minute ), DB query `GetRegularWorkspaceCreateMetrics` is executed to fetch all regular workspaces (not created from a prebuild pool). * The counter is updated with the total from all time (not just since metric introduction). This differs from the histograms below, which only accumulate from their introduction forward. 
### `coderd_workspace_creation_duration_seconds` & `coderd_prebuilt_workspace_claim_duration_seconds` * Metric types: Histogram * Names: * `coderd_workspace_creation_duration_seconds` * Labels: `organization_name`, `template_name`, `preset_name`, `type` (`regular`, `prebuild`) * `coderd_prebuilt_workspace_claim_duration_seconds` * Labels: `organization_name`, `template_name`, `preset_name` We already have `coderd_provisionerd_workspace_build_timings_seconds`, which tracks build run times for all workspace builds handled by the provisioner daemon. However, in the context of this issue, we are only interested in creation and claim build times, not all transitions; additionally, this metric does not include `preset_name`, and adding it there would significantly increase cardinality. Therefore, separate more focused metrics are introduced here: * `coderd_workspace_creation_duration_seconds`: Build time to create a workspace (either a regular workspace or the build into a prebuild pool, for prebuild initial provisioning build). * `coderd_prebuilt_workspace_claim_duration_seconds`: Time to claim a prebuilt workspace from the pool. The reason for two separate histograms is that: * Creation (regular or prebuild): provisioning builds with similar time magnitude, generally expected to take longer than a claim operation. * Claim: expected to be a much faster provisioning build. #### Native histogram usage Provisioning times vary widely between projects. Using static buckets risks unbalanced or poorly informative histograms. 
To address this, these metrics use [Prometheus native histograms](https://prometheus.io/docs/specs/native_histograms/): * First introduced in Prometheus v2.40.0 * Recommended stable usage from v2.45+ * Requires Go client `prometheus/client_golang` v1.15.0+ * Experimental and must be explicitly enabled on the server (`--enable-feature=native-histograms`) For compatibility, we also retain a classic bucket definition (aligned with the existing provisioner metric: https://github.com/coder/coder/blob/main/provisionerd/provisionerd.go#L182-L189). * If native histograms are enabled, Prometheus ingests the high-resolution histogram. * If not, it falls back to the predefined buckets. Implementation notes: * Unlike the counter, these histograms are updated in real-time at workspace build job completion. * They reflect data only from the point of introduction forward (no historical backfill). ## Relates to Closes: https://github.com/coder/coder/issues/19528 Native histograms tested in observability stack: https://github.com/coder/observability/pull/50	2025-08-28 15:00:26 +01:00
Muhammad Atif Ali	419eba5fb6	docs: restructure docs (#14421 ) Closes #13434 Supersedes #14182 --------- Co-authored-by: Ethan <39577870+ethanndickson@users.noreply.github.com> Co-authored-by: Ethan Dickson <ethan@coder.com> Co-authored-by: Ben Potter <ben@coder.com> Co-authored-by: Stephen Kirby <58410745+stirby@users.noreply.github.com> Co-authored-by: Stephen Kirby <me@skirby.dev> Co-authored-by: EdwardAngert <17991901+EdwardAngert@users.noreply.github.com> Co-authored-by: Edward Angert <EdwardAngert@users.noreply.github.com>	2024-10-05 10:52:04 -05:00
Ethan	fb28979537	fix(docs): add `coderd_workspace_latest_build_status` prometheus metric (#14828 )	2024-09-27 02:55:24 +10:00
Ethan	c8580a415a	feat: expose current agent connections by type via prometheus (#14612 )	2024-09-11 14:13:30 +10:00
dependabot[bot]	c41d0efff9	chore: bump github.com/prometheus/client_golang from 1.18.0 to 1.19.1 (#13232 ) * chore: bump github.com/prometheus/client_golang from 1.18.0 to 1.19.1	2024-05-13 13:01:28 +00:00
Pavel Aseev	4682355eed	chore: deprecate gauge metrics with _total suffix (#12744 ) (#12976 ) * chore: deprecate gauge metrics with _total suffix (#12744) Deprecated metrics: - coderd_oauth2_external_requests_rate_limit_total - coderd_api_workspace_latest_build_total * Apply suggestions from code review add link to follow-up issue Co-authored-by: Cian Johnston <public@cianjohnston.ie> --------- Co-authored-by: Cian Johnston <public@cianjohnston.ie>	2024-04-24 11:23:24 +03:00
Steven Masley	13359aa16f	chore: drop github per user rate limit tracking (#12286 ) * chore: drop github per user rate limit tracking Rate limits for authenticated requests are per user. This would be an excessive number of prometheus labels, so we only track the unauthorized limit.	2024-02-23 11:17:52 -06:00
Steven Masley	89ab659114	chore: add oauth2 prometheus metrics to documentation (#11534)	2024-01-10 15:46:37 +00:00
Steven Masley	b7bdb17460	feat: add metrics to workspace agent scripts (#11132 ) * push startup script metrics to agent	2023-12-13 11:45:43 -06:00
Eric Paulsen	167c759149	docs: add license and template insights prom metrics (#11109 ) * docs: add license and template insights prom metrics * add: coderd_insights_applications_usage_seconds	2023-12-08 14:17:14 -05:00
Colin Adler	bc862fa493	chore: upgrade tailscale to v1.46.1 (#8913 )	2023-08-09 19:50:26 +00:00
Marcin Tojek	942aba3a66	feat: expose agent stats via Prometheus endpoint (#7115 ) * WIP * WIP * WIP * Agents * fix * 1min * fix * WIP * Test * docs * fmt * Add timer to measure the metrics collection * Use CachedGaugeVec * Unit tests * WIP * WIP * db: GetWorkspaceAgentStatsAndLabels * fmt * WIP * gauges * feat: collect * fix * fmt * minor fixes * Prometheus flag * fix * WIP * fix tests * WIP * fix json * Rx Tx bytes * CloseFunc * fix * fix * Fixes * fix * fix: IgnoreErrors * Fix: Windows * fix * reflect.DeepEquals	2023-04-14 16:14:52 +02:00
Marcin Tojek	0347231bb8	feat: expose agent metrics via Prometheus endpoint (#7011 ) * WIP * WIP * WIP * Agents * fix * 1min * fix * WIP * Test * docs * fmt * Add timer to measure the metrics collection * Use CachedGaugeVec * Unit tests * Address PR comments	2023-04-07 17:48:52 +02:00
Cian Johnston	43e8ba0811	feat(api): add prometheus metric coderd_workspace_builds_total (#6314 ) This PR adds the prometheus metric coderd_workspace_builds_total. It measures the total number of workspace builds, along with a number of labels intended to be useful for an operator debugging a failed workspace build trying to discover the scope of the issue.	2023-02-23 01:28:10 +00:00
Ammar Bandukwala	f05609b4da	chore: format Go more aggressively	2023-02-18 18:32:09 -06:00
Kyle Carberry	026b1cd2a4	chore: update to go 1.20 (#5968 ) Co-authored-by: Colin Adler <colin1adler@gmail.com>	2023-02-02 12:36:27 -06:00
Steven Masley	f76ef98a32	chore!: Standardize prometheus time metrics to seconds (#5709 ) * chore!: Standardize prometheus time metrics to seconds * Update prometheus docs	2023-01-13 11:15:25 -06:00
Marcin Tojek	dc6d271293	feat: Build framework for generating API docs (#5383 ) * WIP * Gen * WIP * chi swagger * WIP * WIP * WIP * GetWorkspaces * GetWorkspaces * Markdown * Use widdershins * WIP * WIP * WIP * Markdown template * Fix: makefile * fmt * Fix: comment * Enable swagger conditionally * fix: site * Default false * Flag tests * fix * fix * template fixes * Fix * Fix * Fix * WIP * Formatted * Cleanup * Templates * BEGIN END SECTION * subshell exit code * Fix * Fix merge * WIP * Fix * Fix fmt * Fix * Generic api.md page * Fix merge * Link pages * Fix * Fix * Fix: links * Add icon * Write manifest file * Fix fmt * Fix: enterprise * Fix: Swagger.Enable * Fix: rename apidocs to apidoc * Fix: find -not -prune * Fix: json not available * Fix: rename Coderd API to Coder API * Fix: npm exec * Fix: api dir * Fix: by ID * Fix: string uuid * Fix: include deleted * Fix: indirect go.mod * Fix: source lib.sh * Fix: shellcheck * Fix: pushd popd * Fix: fmt * Fix: improve workspaces * Fix: swagger-enable * Fix * Fix: mention only HTTP 200 * Fix: IDs * Fix: https * Fix: icon * More APis * Fix: format swagger.json * Fix: SwaggerEndpoint * Fix: SCRIPT_DIR * Fix: PROJECT_ROOT * Fix: use code tags in schemas.md * Fix: examples * Fix: examples * Fix: improve format * Fix: date-time,enums * Fix: include_deleted * Fix: array of * Fix: parameter, response * Fix: string time or null * Workspaces: more docs * Workspaces: more docs * Fix: renderDisplayName * Fix: ActiveUserCount * Fix * Fix: typo * Templates: docs * Notice: incomplete	2022-12-19 18:43:46 +01:00
Marcin Tojek	883cf8afa9	chore: Add missing metrics description (#5212 ) * chore: Add missing metrics description * Update provisionerd/provisionerd.go Co-authored-by: Mathias Fredriksson <mafredri@gmail.com> * Fix Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>	2022-12-01 12:50:57 +01:00
Marcin Tojek	38bdae7016	docs: Prometheus metrics + generator (#5179 ) * docs: Prometheus metrics * Fix * Typo * Typo * Typo * Fix: link * Update docs/admin/prometheus.md Co-authored-by: Dean Sheather <dean@deansheather.com> * Update docs/admin/prometheus.md Co-authored-by: Dean Sheather <dean@deansheather.com> * Update docs/admin/prometheus.md Co-authored-by: Dean Sheather <dean@deansheather.com> * Update docs/admin/prometheus.md Co-authored-by: Dean Sheather <dean@deansheather.com> * Update docs/admin/prometheus.md Co-authored-by: Dean Sheather <dean@deansheather.com> * Rephrase * notice * use ```shell * Generator * gosec * fix: lint * PR comments * not needed anymore Co-authored-by: Dean Sheather <dean@deansheather.com> Co-authored-by: Geoffrey Huntley <ghuntley@ghuntley.com>	2022-11-30 17:39:51 +01:00

44 Commits