Commit Graph

88 Commits

Author SHA1 Message Date
Danielle Maywood f91475cd51 test: remove unnecessary dbauthz.AsSystemRestricted calls in tests (#22663) 2026-03-05 20:29:49 +00:00
Garrett Delfosse 4057363f78 fix(coderd): add organization_name label to insights Prometheus metrics (#22296)
## Description

When multiple organizations have templates with the same name, the
Prometheus `/metrics` endpoint returns HTTP 500 because Prometheus
rejects duplicate label combinations. The three `coderd_insights_*`
metrics (`coderd_insights_templates_active_users`,
`coderd_insights_applications_usage_seconds`,
`coderd_insights_parameters`) used only `template_name` as a
distinguishing label, so two templates named e.g. `"openstack-v1"` in
different orgs would produce duplicate metric series.

This adds `organization_name` as a label to all three insight metric
descriptors to disambiguate templates across organizations.

## Changes

**`coderd/prometheusmetrics/insights/metricscollector.go`**:
- Added `organization_name` label to all three metric descriptors
- Added `organizationNames` field (template ID → org name) to the
`insightsData` struct
- In `doTick`: after fetching templates, collect unique org IDs, fetch
organizations via `GetOrganizations`, and build a
template-ID-to-org-name mapping
- In `Collect()`: pass the organization name as an additional label
value in every `MustNewConstMetric` call

**`coderd/prometheusmetrics/insights/testdata/insights-metrics.json`**:
Updated golden file to include `organization_name=coder` in all metric
label keys.

Fixes #21748
2026-02-25 08:58:50 +00:00
Marcin Tojek 036ed5672f fix!: remove deprecated prometheus metrics (#21788)
## Description

Removes the following deprecated Prometheus metrics:

- `coderd_api_workspace_latest_build_total` → use
`coderd_api_workspace_latest_build` instead
- `coderd_oauth2_external_requests_rate_limit_total` → use
`coderd_oauth2_external_requests_rate_limit` instead

These metrics were deprecated in #12976 because gauge metrics should
avoid the `_total` suffix per [Prometheus naming
conventions](https://prometheus.io/docs/practices/naming/).

## Changes

- Removed deprecated metric `coderd_api_workspace_latest_build_total`
from `coderd/prometheusmetrics/prometheusmetrics.go`
- Removed deprecated metric
`coderd_oauth2_external_requests_rate_limit_total` from
`coderd/promoauth/oauth2.go`
- Updated tests to use the non-deprecated metric name

Fixes #12999
2026-01-30 13:30:06 +01:00
Spike Curtis bddb808b25 chore: arrange imports in a standard way (#21452)
Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example:

```
import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"golang.org/x/xerrors"
	"gopkg.in/natefinch/lumberjack.v2"

	"cdr.dev/slog/v3"
	"github.com/coder/coder/v2/codersdk/agentsdk"
	"github.com/coder/serpent"
)
```

3 groups: standard library, 3rd partly libs, Coder libs.

This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.
2026-01-08 15:24:11 +04:00
Spike Curtis 49b34a716a fix: fix slog to always use array of Fields (#21426)
Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptible call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder).

It also updates dependencies that also use slog and were updated.

I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule.

Other dependencies, I pushed new tags.
2026-01-08 10:29:41 +04:00
Steven Masley 3194bcfc9e chore: distinct operations for provisioner's 'parse', 'init', 'plan', 'apply', 'graph' (#21064)
Provisioner steps broken into smaller granular actions.
Changes:
- `ExtractArchive` moved to `init` request (was in `configure`)
- Writing `tfstate` moved to `plan` (was in `configure`)
- Moved most plan/apply outputs to `GraphComplete`
2025-12-15 11:26:41 -06:00
Ethan 645da33767 test: fix TestDescCacheTimestampUpdate flake (#20975)
## Problem

`TestDescCacheTimestampUpdate` was flaky on Windows CI because
`time.Now()` has ~15.6ms resolution, causing consecutive calls to return
identical timestamps.

## Solution

Inject `quartz.Clock` into `MetricsAggregator` using an options pattern,
making the test deterministic by using a mock clock with explicit time
advancement.

### Changes
- Add `clock quartz.Clock` field to `MetricsAggregator` struct
- Add `WithClock()` option for dependency injection
- Replace all `time.Now()` calls with `ma.clock.Now()`
- Update test to use mock clock with `mClock.Advance(time.Second)`

---

This PR was fully generated by [`mux`](https://github.com/coder/mux)
using Claude Opus 4.5, and reviewed by me.

Closes https://github.com/coder/internal/issues/1146
2025-12-02 10:53:36 +11:00
Callum Styan 658e8c34a9 perf: improve performance of metricsAggregator path by reducing memory allocations (#20724)
Signed-off-by: Callum Styan <callumstyan@gmail.com>
2025-11-24 15:45:08 -08:00
Callum Styan 5a18cf4c86 fix: remove unintentionally added print in test code (#20391)
accidentally added in https://github.com/coder/coder/pull/19786

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2025-10-20 18:51:15 -07:00
Callum Styan 141ef23c81 fix: introduce dedicated queries for workspaces and workspace agents metrics (#19786)
aid in differentiation between sources of calls to `GetWorkspaces` but introducing new queries for metrics specific use cases

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2025-10-17 13:40:10 -07:00
Spike Curtis 1354d84eb4 chore: refactor instance identity to be a SessionTokenProvider (#19566)
Refactors Agent instance identity to be a SessionTokenProvider.

Refactors the CLI to create Agent clients via a centralized function, rather than add-hoc via individual command handlers and their flags.

This allows commands besides `coder agent`, but which still use the agent identity, to support instance identity authentication.

Fixes #19111 by unifying all API requests to go thru the SessionTokenProvider for auth credentials.
2025-09-03 10:38:42 +04:00
dependabot[bot] 519812776e chore: bump github.com/stretchr/testify from 1.10.0 to 1.11.1 (#19599)
Bumps [github.com/stretchr/testify](https://github.com/stretchr/testify)
from 1.10.0 to 1.11.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/stretchr/testify/releases">github.com/stretchr/testify's
releases</a>.</em></p>
<blockquote>
<h2>v1.11.1</h2>
<p>This release fixes <a
href="https://redirect.github.com/stretchr/testify/issues/1785">#1785</a>
introduced in v1.11.0 where expected argument values implementing the
stringer interface (<code>String() string</code>) with a method which
mutates their value, when passed to mock.Mock.On
(<code>m.On(&quot;Method&quot;, &lt;expected&gt;).Return()</code>) or
actual argument values passed to mock.Mock.Called may no longer match
one another where they previously did match. The behaviour prior to
v1.11.0 where the stringer is always called is restored. Future testify
releases may not call the stringer method at all in this case.</p>
<h2>What's Changed</h2>
<ul>
<li>Backport <a
href="https://redirect.github.com/stretchr/testify/issues/1786">#1786</a>
to release/1.11: mock: revert to pre-v1.11.0 argument matching behavior
for mutating stringers by <a
href="https://github.com/brackendawson"><code>@​brackendawson</code></a>
in <a
href="https://redirect.github.com/stretchr/testify/pull/1788">stretchr/testify#1788</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/stretchr/testify/compare/v1.11.0...v1.11.1">https://github.com/stretchr/testify/compare/v1.11.0...v1.11.1</a></p>
<h2>v1.11.0</h2>
<h2>What's Changed</h2>
<h3>Functional Changes</h3>
<p>v1.11.0 Includes a number of performance improvements.</p>
<ul>
<li>Call stack perf change for CallerInfo by <a
href="https://github.com/mikeauclair"><code>@​mikeauclair</code></a> in
<a
href="https://redirect.github.com/stretchr/testify/pull/1614">stretchr/testify#1614</a></li>
<li>Lazily render mock diff output on successful match by <a
href="https://github.com/mikeauclair"><code>@​mikeauclair</code></a> in
<a
href="https://redirect.github.com/stretchr/testify/pull/1615">stretchr/testify#1615</a></li>
<li>assert: check early in Eventually, EventuallyWithT, and Never by <a
href="https://github.com/cszczepaniak"><code>@​cszczepaniak</code></a>
in <a
href="https://redirect.github.com/stretchr/testify/pull/1427">stretchr/testify#1427</a></li>
<li>assert: add IsNotType by <a
href="https://github.com/bartventer"><code>@​bartventer</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1730">stretchr/testify#1730</a></li>
<li>assert.JSONEq: shortcut if same strings by <a
href="https://github.com/dolmen"><code>@​dolmen</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1754">stretchr/testify#1754</a></li>
<li>assert.YAMLEq: shortcut if same strings by <a
href="https://github.com/dolmen"><code>@​dolmen</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1755">stretchr/testify#1755</a></li>
<li>assert: faster and simpler isEmpty using reflect.Value.IsZero by <a
href="https://github.com/dolmen"><code>@​dolmen</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1761">stretchr/testify#1761</a></li>
<li>suite: faster methods filtering (internal refactor) by <a
href="https://github.com/dolmen"><code>@​dolmen</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1758">stretchr/testify#1758</a></li>
</ul>
<h3>Fixes</h3>
<ul>
<li>assert.ErrorAs: log target type by <a
href="https://github.com/craig65535"><code>@​craig65535</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1345">stretchr/testify#1345</a></li>
<li>Fix failure message formatting for Positive and Negative asserts in
<a
href="https://redirect.github.com/stretchr/testify/pull/1062">stretchr/testify#1062</a></li>
<li>Improve ErrorIs message when error is nil but an error was expected
by <a href="https://github.com/tsioftas"><code>@​tsioftas</code></a> in
<a
href="https://redirect.github.com/stretchr/testify/pull/1681">stretchr/testify#1681</a></li>
<li>fix Subset/NotSubset when calling with mixed input types by <a
href="https://github.com/siliconbrain"><code>@​siliconbrain</code></a>
in <a
href="https://redirect.github.com/stretchr/testify/pull/1729">stretchr/testify#1729</a></li>
<li>Improve ErrorAs failure message when error is nil by <a
href="https://github.com/ccoVeille"><code>@​ccoVeille</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1734">stretchr/testify#1734</a></li>
<li>mock.AssertNumberOfCalls: improve error msg by <a
href="https://github.com/3scalation"><code>@​3scalation</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1743">stretchr/testify#1743</a></li>
</ul>
<h3>Documentation, Build &amp; CI</h3>
<ul>
<li>docs: Fix typo in README by <a
href="https://github.com/alexandear"><code>@​alexandear</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1688">stretchr/testify#1688</a></li>
<li>Replace deprecated io/ioutil with io and os by <a
href="https://github.com/alexandear"><code>@​alexandear</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1684">stretchr/testify#1684</a></li>
<li>Document consequences of calling t.FailNow() by <a
href="https://github.com/greg0ire"><code>@​greg0ire</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1710">stretchr/testify#1710</a></li>
<li>chore: update docs for Unset <a
href="https://redirect.github.com/stretchr/testify/issues/1621">#1621</a>
by <a href="https://github.com/techfg"><code>@​techfg</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1709">stretchr/testify#1709</a></li>
<li>README: apply gofmt to examples by <a
href="https://github.com/alexandear"><code>@​alexandear</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1687">stretchr/testify#1687</a></li>
<li>refactor: use %q and %T to simplify fmt.Sprintf by <a
href="https://github.com/alexandear"><code>@​alexandear</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1674">stretchr/testify#1674</a></li>
<li>Propose Christophe Colombier (ccoVeille) as approver by <a
href="https://github.com/brackendawson"><code>@​brackendawson</code></a>
in <a
href="https://redirect.github.com/stretchr/testify/pull/1716">stretchr/testify#1716</a></li>
<li>Update documentation for the Error function in assert or require
package by <a
href="https://github.com/architagr"><code>@​architagr</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1675">stretchr/testify#1675</a></li>
<li>assert: remove deprecated build constraints by <a
href="https://github.com/alexandear"><code>@​alexandear</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1671">stretchr/testify#1671</a></li>
<li>assert: apply gofumpt to internal test suite by <a
href="https://github.com/ccoVeille"><code>@​ccoVeille</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1739">stretchr/testify#1739</a></li>
<li>CI: fix shebang in .ci.*.sh scripts by <a
href="https://github.com/dolmen"><code>@​dolmen</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1746">stretchr/testify#1746</a></li>
<li>assert,require: enable parallel testing on (almost) all top tests by
<a href="https://github.com/dolmen"><code>@​dolmen</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1747">stretchr/testify#1747</a></li>
<li>suite.Passed: add one more status test report by <a
href="https://github.com/Ararsa-Derese"><code>@​Ararsa-Derese</code></a>
in <a
href="https://redirect.github.com/stretchr/testify/pull/1706">stretchr/testify#1706</a></li>
<li>Add Helper() method in internal mocks and assert.CollectT by <a
href="https://github.com/dolmen"><code>@​dolmen</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1423">stretchr/testify#1423</a></li>
<li>assert.Same/NotSame: improve usage of Sprintf by <a
href="https://github.com/ccoVeille"><code>@​ccoVeille</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1742">stretchr/testify#1742</a></li>
<li>mock: enable parallel testing on internal testsuite by <a
href="https://github.com/dolmen"><code>@​dolmen</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1756">stretchr/testify#1756</a></li>
<li>suite: cleanup use of 'testing' internals at runtime by <a
href="https://github.com/dolmen"><code>@​dolmen</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1751">stretchr/testify#1751</a></li>
<li>assert: check test failure message for Empty and NotEmpty by <a
href="https://github.com/ccoVeille"><code>@​ccoVeille</code></a> in <a
href="https://redirect.github.com/stretchr/testify/pull/1745">stretchr/testify#1745</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/stretchr/testify/commit/2a57335dc9cd6833daa820bc94d9b40c26a7917d"><code>2a57335</code></a>
Merge pull request <a
href="https://redirect.github.com/stretchr/testify/issues/1788">#1788</a>
from brackendawson/1785-backport-1.11</li>
<li><a
href="https://github.com/stretchr/testify/commit/af8c91234f184009f57ef29027b39ca89cb00100"><code>af8c912</code></a>
Backport <a
href="https://redirect.github.com/stretchr/testify/issues/1786">#1786</a>
to release/1.11</li>
<li><a
href="https://github.com/stretchr/testify/commit/b7801fbf5cd58d201296d5d0e132d1849966dbd4"><code>b7801fb</code></a>
Merge pull request <a
href="https://redirect.github.com/stretchr/testify/issues/1778">#1778</a>
from stretchr/dependabot/github_actions/actions/chec...</li>
<li><a
href="https://github.com/stretchr/testify/commit/69831f3b08c40d56a09d0be93e9d5ae034f1590b"><code>69831f3</code></a>
build(deps): bump actions/checkout from 4 to 5</li>
<li><a
href="https://github.com/stretchr/testify/commit/a53be35c3b0cfcd5189cffcfd75df60ea581104c"><code>a53be35</code></a>
Improve captureTestingT helper</li>
<li><a
href="https://github.com/stretchr/testify/commit/aafb604176db7e1f2c9810bc90d644291d057687"><code>aafb604</code></a>
mock: improve formatting of error message</li>
<li><a
href="https://github.com/stretchr/testify/commit/7218e0390acd2aea3edb18574110ec2753c0aeef"><code>7218e03</code></a>
improve error msg</li>
<li><a
href="https://github.com/stretchr/testify/commit/929a2126c2702df436312656a0304580b526c6e9"><code>929a212</code></a>
Merge pull request <a
href="https://redirect.github.com/stretchr/testify/issues/1758">#1758</a>
from stretchr/dolmen/suite-faster-method-filtering</li>
<li><a
href="https://github.com/stretchr/testify/commit/bc7459ec38128532ff32f23cfab4ea0b725210f2"><code>bc7459e</code></a>
suite: faster filtering of methods (-testify.m)</li>
<li><a
href="https://github.com/stretchr/testify/commit/7d37b5c962954410bcd7a71ff3a77c79514056d1"><code>7d37b5c</code></a>
suite: refactor methodFilter</li>
<li>Additional commits viewable in <a
href="https://github.com/stretchr/testify/compare/v1.10.0...v1.11.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github.com/stretchr/testify&package-manager=go_modules&previous-version=1.10.0&new-version=1.11.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ethan Dickson <ethan@coder.com>
2025-09-02 03:12:37 +00:00
Susana Ferreira 0ab345ca84 feat: add prebuild timing metrics to Prometheus (#19503)
## Description

This PR introduces one counter and two histograms related to workspace
creation and claiming. The goal is to provide clearer observability into
how workspaces are created (regular vs prebuild) and the time cost of
those operations.

### `coderd_workspace_creation_total`

* Metric type: Counter
* Name: `coderd_workspace_creation_total`
* Labels: `organization_name`, `template_name`, `preset_name`

This counter tracks whether a regular workspace (not created from a
prebuild pool) was created using a preset or not.
Currently, we already expose `coderd_prebuilt_workspaces_claimed_total`
for claimed prebuilt workspaces, but we lack a comparable metric for
regular workspace creations. This metric fills that gap, making it
possible to compare regular creations against claims.

Implementation notes:
* Exposed as a `coderd_` metric, consistent with other workspace-related
metrics (e.g. `coderd_api_workspace_latest_build`:
https://github.com/coder/coder/blob/main/coderd/prometheusmetrics/prometheusmetrics.go#L149).
* Every `defaultRefreshRate` (1 minute ), DB query
`GetRegularWorkspaceCreateMetrics` is executed to fetch all regular
workspaces (not created from a prebuild pool).
* The counter is updated with the total from all time (not just since
metric introduction). This differs from the histograms below, which only
accumulate from their introduction forward.

### `coderd_workspace_creation_duration_seconds` &
`coderd_prebuilt_workspace_claim_duration_seconds`

* Metric types: Histogram
* Names:
  * `coderd_workspace_creation_duration_seconds`
* Labels: `organization_name`, `template_name`, `preset_name`, `type`
(`regular`, `prebuild`)
  * `coderd_prebuilt_workspace_claim_duration_seconds`
    * Labels: `organization_name`, `template_name`, `preset_name`

We already have `coderd_provisionerd_workspace_build_timings_seconds`,
which tracks build run times for all workspace builds handled by the
provisioner daemon.
However, in the context of this issue, we are only interested in
creation and claim build times, not all transitions; additionally, this
metric does not include `preset_name`, and adding it there would
significantly increase cardinality. Therefore, separate more focused
metrics are introduced here:
* `coderd_workspace_creation_duration_seconds`: Build time to create a
workspace (either a regular workspace or the build into a prebuild pool,
for prebuild initial provisioning build).
* `coderd_prebuilt_workspace_claim_duration_seconds`: Time to claim a
prebuilt workspace from the pool.

The reason for two separate histograms is that:
* Creation (regular or prebuild): provisioning builds with similar time
magnitude, generally expected to take longer than a claim operation.
* Claim: expected to be a much faster provisioning build.

#### Native histogram usage

Provisioning times vary widely between projects. Using static buckets
risks unbalanced or poorly informative histograms.
To address this, these metrics use [Prometheus native
histograms](https://prometheus.io/docs/specs/native_histograms/):
* First introduced in Prometheus v2.40.0
* Recommended stable usage from v2.45+
* Requires Go client `prometheus/client_golang` v1.15.0+
* Experimental and must be explicitly enabled on the server
(`--enable-feature=native-histograms`)

For compatibility, we also retain a classic bucket definition (aligned
with the existing provisioner metric:
https://github.com/coder/coder/blob/main/provisionerd/provisionerd.go#L182-L189).
* If native histograms are enabled, Prometheus ingests the
high-resolution histogram.
* If not, it falls back to the predefined buckets.

Implementation notes:
* Unlike the counter, these histograms are updated in real-time at
workspace build job completion.
* They reflect data only from the point of introduction forward (no
historical backfill).

## Relates to 

Closes: https://github.com/coder/coder/issues/19528
Native histograms tested in observability stack:
https://github.com/coder/observability/pull/50
2025-08-28 15:00:26 +01:00
Callum Styan 014a2d5b0f perf: don't call GetUserByID unnecessarily for Agents metrics loops (#19395)
At the moment, the loop which retrieves and updates the values of the
agents metrics excessively calls `GetUserByID` (a DB query). First it
retrieves a list of all workspaces, filtering out inactive agents (not
entirely clear to me whether this is non-running workspaces, or just
dead agents), and then iterates over those workspaces to get the rest of
the relevant data for the metrics. The next call is `GetUserByID` for
`workspace.OwnerID`. This is unnecessary because the `workspaces_visible` view we pull workspaces from has already been joined with the users table to get the username/name/etc.

This should at least partially resolve
https://github.com/coder/internal/issues/726 
---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2025-08-21 11:01:32 -07:00
Dean Sheather 6eb02d1c2a chore: wire up usage tracking for managed agents (#19096)
Wires up the usage collector and publisher to coderd.

Relates to coder/internal#814
2025-08-20 23:38:09 +10:00
Mathias Fredriksson 1b66495b70 fix(coderd/prometheusmetrics)!: filter deleted wsbuilds to reduce db load (#19197)
This change removes the `GetLatestWorkspaceBuilds` query which includes
all workspaces for all time (including deleted). This allows us to also
stop using `GetProvisionerJobsByIDs` for said builds as the job status
is included in `GetWorkspaces` called separately.

**BREAKING CHANGE**: The `coderd_api_workspace_latest_build` Prometheus
metric no longer includes builds belonging to deleted workspaces, as
such, this metric will show fewer statuses.

Fixes coder/internal#717
2025-08-11 14:48:31 +03:00
Steven Masley 0a3afeddc8 chore: add more pprof labels for various go routines (#19243)
- ReplicaSync
- Notifications
- MetricsAggregator
- DBPurge
2025-08-07 20:05:32 +00:00
Steven Masley 8ba8b4f061 chore: add profiling labels for pprof analysis (#19232)
PProf labels segment the code into groups for determing the source of
cpu/memory profiles. Since the web server and background jobs share a
lot of the same code (eg wsbuilder), it helps to know if the load is
user induced, or background job based.
2025-08-07 11:21:17 -05:00
Callum Styan ffbfaf2a6f feat: allow bypassing current CORS magic based on template config (#18706)
Solves https://github.com/coder/coder/issues/15096

This is a slight rework/refactor of the earlier PRs from @dannykopping
and @Emyrk:
- https://github.com/coder/coder/pull/15669
- https://github.com/coder/coder/pull/15684
- https://github.com/coder/coder/pull/17596

Rather than having a per-app CORS behaviour setting and additionally a
template level setting for ports, this PR adds a single template level
CORS behaviour setting that is then used by all apps/ports for
workspaces created from that template.

The main changes are in `proxy.go` and `request.go` to:
a) get the CORS behaviour setting from the template
b) have `HandleSubdomain` bypass the CORS middleware handler if the
selected behaviour is `passthru`
c) in `proxyWorkspaceApp`, do not modify the response if the selected
behaviour is `passthru`

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added support for configuring CORS behavior ("simple" or "passthru")
at the template level for all shared ports.
* Introduced a new "CORS Behavior" setting in the template creation and
settings forms.
* API endpoints and responses now include the optional `cors_behavior`
property for templates.
* Workspace apps and proxy now honor the specified CORS behavior,
enabling conditional CORS middleware application.
* Enhanced workspace app tests with comprehensive scenarios covering
CORS behaviors and authentication states.

* **Bug Fixes**
  * None.

* **Documentation**
* Updated API and admin documentation to describe the new
`cors_behavior` property and its usage.
* Added examples and schema references for CORS behavior in relevant API
docs.

* **Tests**
* Extended automated tests to cover different CORS behavior scenarios
for templates and workspace apps.

* **Chores**
* Updated audit logging to track changes to the `cors_behavior` field on
templates.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2025-07-30 13:42:39 -07:00
ケイラ fae30a00fd chore: remove unnecessary redeclarations in for loops (#18440) 2025-06-20 13:16:55 -06:00
Hugo Dutka 277c2c7ea7 chore(coderd/prometheusmetrics): remove dbmem from tests (#18238) 2025-06-05 09:30:27 +02:00
Steven Masley ca38729840 chore: revert dynamic params as a safe experiment (#17510) 2025-04-22 16:21:15 +00:00
Jon Ayers 17ddee05e5 chore: update golang to 1.24.1 (#17035)
- Update go.mod to use Go 1.24.1
- Update GitHub Actions setup-go action to use Go 1.24.1
- Fix linting issues with golangci-lint by:
  - Updating to golangci-lint v1.57.1 (more compatible with Go 1.24.1)

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
2025-03-26 01:56:39 -05:00
Eng Zer Jun 04c33968cf refactor: replace golang.org/x/exp/slices with slices (#16772)
The experimental functions in `golang.org/x/exp/slices` are now
available in the standard library since Go 1.21.

Reference: https://go.dev/doc/go1.21#slices

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2025-03-04 00:46:49 +11:00
Spike Curtis 5861e516b9 chore: add standard test logger ignoring db canceled (#15556)
Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests.  This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes.

Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.
2024-11-18 14:09:22 +04:00
Colin Adler 3de98c25db feat: add prometheus metric for tracking user statuses (#15281) 2024-10-30 18:41:16 +00:00
Steven Masley 343f8ec9ab chore: join owner, template, and org in new workspace view (#15116)
Joins in fields like `username`, `avatar_url`, `organization_name`,
`template_name` to `workspaces` via a **view**. 
The view must be maintained moving forward, but this prevents needing to
add RBAC permissions to fetch related workspace fields.
2024-10-22 09:20:54 -05:00
Garrett Delfosse 922f4c545f fix: handle new agent stat format correctly (#14576)
---------

Co-authored-by: Ethan Dickson <ethan@coder.com>
2024-09-20 01:52:14 +10:00
Kayla Washburn-Love bf4b7abf14 chore(coderd): allow creating workspaces without specifying an organization (#14048) 2024-07-30 10:44:02 -06:00
Ethan dd243686e4 chore!: remove deprecated agent v1 routes (#13486) 2024-06-11 12:22:59 +10:00
Garrett Delfosse 5b9a65e5c1 chore: move Batcher and Tracker to workspacestats (#13418) 2024-06-10 15:35:23 -04:00
Colin Adler 40390ecc30 chore: fix TestServer/Prometheus/DBMetricsDisabled test flake (#13453)
See: https://github.com/coder/coder/actions/runs/9352137263/job/25739550487#step:5:368
2024-06-03 15:38:59 -05:00
Garrett Delfosse 5789ea5397 chore: move stat reporting into workspacestats package (#13386) 2024-05-29 11:49:08 -04:00
Garrett Delfosse c550d0641d feat: move shared ports out of experiment (#13120) 2024-05-02 14:11:33 -04:00
Pavel Aseev 4682355eed chore: deprecate gauge metrics with _total suffix (#12744) (#12976)
* chore: deprecate gauge metrics with _total suffix (#12744)

Deprecated metrics:
- coderd_oauth2_external_requests_rate_limit_total
- coderd_api_workspace_latest_build_total

* Apply suggestions from code review

add link to follow-up issue

Co-authored-by: Cian Johnston <public@cianjohnston.ie>

---------

Co-authored-by: Cian Johnston <public@cianjohnston.ie>
2024-04-24 11:23:24 +03:00
Steven Masley 0a8c8ce5cc chore: remove InsertWorkspaceAgentStat query (#12869)
* chore: remove InsertWorkspaceAgentStat query

InsertWorkspaceAgentStats (batch) exists. We only used the singular in
a single unit test place. Removing the single for the batch, reducing
the interface size.
2024-04-09 12:35:27 -05:00
Danny Kopping 79fb8e43c5 feat: expose workspace statuses (with details) as a prometheus metric (#12762)
Implements #12462
2024-04-02 09:57:36 +02:00
Mathias Fredriksson b183236482 feat(coderd/database): use template_usage_stats in *ByTemplate insights queries (#12668)
This PR updates the `*ByTempalte` insights queries used for generating Prometheus metrics to behave the same way as the new rollup query and re-written insights queries that utilize the rolled up data.
2024-03-25 17:42:02 +02:00
Danny Kopping 9cfd5baa91 feat(coderd): export metric indicating each experiment's status (#12657) 2024-03-19 14:11:27 +02:00
Steven Masley f0f9569d51 chore: enforce that provisioners can only acquire jobs in their own organization (#12600)
* chore: add org ID as optional param to AcquireJob
* chore: plumb through organization id to provisioner daemons
* add org id to provisioner domain key
* enforce org id argument
* dbgen provisioner jobs defaults to default org
2024-03-18 12:48:13 -05:00
Danny Kopping da54c8a51f fix: fix data race in TestLabelsAggregation tests (#12578) 2024-03-13 13:47:22 +02:00
Danny Kopping 7a7105ad66 feat: make agent stats' cardinality configurable (#12535) 2024-03-13 12:03:36 +02:00
Cian Johnston 8f40ee3465 Revert "feat: make agent stats' cardinality configurable (#12468)" (#12533)
This reverts commit 21d1873d97.
2024-03-11 14:33:36 +00:00
Danny Kopping 21d1873d97 feat: make agent stats' cardinality configurable (#12468)
Closes #12221
2024-03-11 16:04:08 +02:00
Marcin Tojek aacb4a2b4c feat: use map instead of slice in metrics aggregator (#11815) 2024-01-29 09:12:41 +01:00
Spike Curtis fdd60d316e fix: fix MetricsAggregator check for metric sameness (#11508)
Fixes #11451

A refactor of the Agent API passes metrics as protobufs, which include pointers to label name/value pairs.  The aggregator tested for sameness by doing a shallow compare of label values, which for different stats reports would compare unequal because the pointers would be different.

This fix does a deep compare.

While testing I also noted that we neglect to compare template names. This is unlikely to have caused any issue in practice, since the combination of username/workspace is unique, but in the context of comparing metric labels we should do the comparison.

If a user creates a workspace, deletes it, then recreates from a different template, we could in principle have reported incorrect stats for the old template.
2024-01-09 15:21:30 +04:00
Steven Masley fe867d02e0 fix: correct perms for forbidden error in TemplateScheduleStore.Load (#11286)
* chore: TemplateScheduleStore.Load() throwing forbidden error
* fix: workspace agent scope to include template
2023-12-20 11:38:49 -06:00
Dean Sheather e46431078c feat: add AgentAPI using DRPC (#10811)
Co-authored-by: Spike Curtis <spike@coder.com>
2023-12-18 22:53:28 +10:00
Steven Masley b7bdb17460 feat: add metrics to workspace agent scripts (#11132)
* push startup script metrics to agent
2023-12-13 11:45:43 -06:00
Marcin Tojek ef4d1b68e1 test: insights metrics: verify plugin usage (#11156) 2023-12-13 10:46:52 +01:00