Commit Graph

579 Commits

Author SHA1 Message Date
George K e5c19d0af4 feat: backend support for creating and storing service accounts (#22698)
Add is_service_account column to users table with CHECK constraints
enforcing login_type='none' and empty email for service accounts.
Update user creation API to validate service account constraints.

Related to:
https://linear.app/codercom/issue/PLAT-27/feat-backend-support-for-creating-and-storing-service-accounts
2026-03-11 10:19:08 -07:00
Jon Ayers 6c44de951d feat: add Prometheus collector for DERP server expvar metrics (#22583)
This PR does three things:
- Exports derp expvars to the pprof endpoint
- Exports the expvar metrics as prometheus metrics in both coderd and
wsproxy
- Updates our tailscale to a fix I also had to make to avoid a data race
condition

I generated this with mux but I also manually tested that the metrics
were getting properly emitted
2026-03-06 01:57:58 -06:00
Jon Ayers 25dac6e5f7 docs: add process priority management documentation (#22626) 2026-03-05 14:16:29 -06:00
Zach 5b7377c375 feat: add Prometheus metrics for boundary log drop reporting (#22521)
Add Prometheus metrics to the boundary log proxy for observability:
- batches_dropped_total (reason: buffer_full, forward_failed)
- logs_dropped_total (reason: buffer_full, forward_failed,
  boundary_channel_full, boundary_batch_full)
- batches_forwarded_total

Also add BoundaryStatus to the BoundaryMessage envelope so boundary
can report dropped log counts as a separate wire message. The agent
records these as Prometheus metrics, making boundary-side data loss
visible. Backwards compatibility for older versions of boundary is maintained.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 12:42:34 -07:00
Spike Curtis 56eb57caf4 chore: enable agent socket by default (#22352)
relates to #21335

Enables the agent socket by default and updates docs to strike references to having to enable it.

The PRs in this stack change the MCP server that Tasks use to update their status to rely on the agent socket, rather than directly dialing Coderd with the agent token.

Default disable was a reasonable default when it was only used for the experimental script ordering features, but now that we want to use it for Tasks, it should be default on.
2026-03-03 21:23:59 +04:00
Susana Ferreira ca234f346d fix: mark presets as validation_failed to prevent endless prebuild retries (#22085)
## Description

- Updates `wsbuilder` to return a `BuildError` with
`http.StatusBadRequest` to signify a "validation error" on missing or
invalid parameters
- Adds a short-circuit in `prebuilds.StoreReconciler` to mark presets
for which creating a build returns a "validation error" as "validation
failed" and skip further attempts to reconcile.
- Adds a test to verify the above
- Introduces a new Prometheus metric
`coderd_prebuilt_workspaces_preset_validation_failed` to track the above

Closes: https://github.com/coder/coder/issues/21237

---------

Co-authored-by: Cian Johnston <cian@coder.com>
2026-02-27 14:26:48 +00:00
Garrett Delfosse 4057363f78 fix(coderd): add organization_name label to insights Prometheus metrics (#22296)
## Description

When multiple organizations have templates with the same name, the
Prometheus `/metrics` endpoint returns HTTP 500 because Prometheus
rejects duplicate label combinations. The three `coderd_insights_*`
metrics (`coderd_insights_templates_active_users`,
`coderd_insights_applications_usage_seconds`,
`coderd_insights_parameters`) used only `template_name` as a
distinguishing label, so two templates named e.g. `"openstack-v1"` in
different orgs would produce duplicate metric series.

This adds `organization_name` as a label to all three insight metric
descriptors to disambiguate templates across organizations.

## Changes

**`coderd/prometheusmetrics/insights/metricscollector.go`**:
- Added `organization_name` label to all three metric descriptors
- Added `organizationNames` field (template ID → org name) to the
`insightsData` struct
- In `doTick`: after fetching templates, collect unique org IDs, fetch
organizations via `GetOrganizations`, and build a
template-ID-to-org-name mapping
- In `Collect()`: pass the organization name as an additional label
value in every `MustNewConstMetric` call

**`coderd/prometheusmetrics/insights/testdata/insights-metrics.json`**:
Updated golden file to include `organization_name=coder` in all metric
label keys.

Fixes #21748
2026-02-25 08:58:50 +00:00
Kacper Sawicki 1e274063d4 feat(coderd): filter expired API tokens server-side (#22263)
## Summary

Moves expired token filtering from client-side to server-side by adding
an `include_expired` parameter to the `GetAPIKeysByLoginType` and
`GetAPIKeysByUserID` database queries. This is more efficient for large
deployments with many expired/short-lived tokens.

## Changes

- Add `include_expired` parameter to SQL queries using `OR`
short-circuit
- Add `include_expired` query parameter to `GET
/users/{user}/keys/tokens`
- Add `IncludeExpired` field to `codersdk.TokensFilter`
- Remove client-side filtering from CLI `tokens list` command
- Add `TestTokensFilterExpired` test

Fixes coder/internal#1357
2026-02-24 15:27:03 +00:00
Jon Ayers 0a7a3da178 fix: exclude provisioner_state from workspace_build_with_user view (#22159)
The provisioner state for a workspace build was being loaded for every
long-lived agent rpc connection. Since this state can be anywhere from
kilobytes to megabytes this can gradually cause the `coderd` memory
footprint to grow over time. It's also a lot of unnecessary allocations
for every query that fetches a workspace build since only a few callers
ever actually reference the provisioner state.

This PR removes it from the returned workspace build and adds a query to
fetch the provisioner state explicitly.
2026-02-23 22:46:17 -06:00
Thomas Kosiewski b776a14b46 fix(coderd): harden OAuth2 provider security (#22194)
## Summary

Harden the OAuth2 provider with multiple security fixes addressing
`coder/security#121` (CSRF session takeover) and converge on OAuth 2.1
compliance.

### Security Fixes

| Fix | Description | Commits |
|-----|-------------|---------|
| **CSRF on `/oauth2/authorize`** | Enforce CSRF protection on the
authorize endpoint POST (consent form submission) | `ba7d646`, `b94a64e`
|
| **Clickjacking: `frame-ancestors` CSP** | Prevent consent page from
being iframed (`Content-Security-Policy: frame-ancestors 'none'` +
`X-Frame-Options: DENY`) | `597aeb2` |
| **Exact redirect URI matching** | Changed from prefix matching to full
string exact matching per OAuth 2.1 §4.1.2.1 | `73d64b1`, `93897f1` |
| **Store & verify `redirect_uri`** | Store redirect_uri with auth code
in DB, verify at token exchange matches exactly (RFC 6749 §4.1.3) |
`50569b9`, `d7ca315` |
| **Mandatory PKCE** | Require `code_challenge` at authorization (for
`response_type=code`) + unconditional `code_verifier` verification at
token exchange | `d7ca315`, `1cda1a9` |
| **Reject implicit grant** | `response_type=token` now returns
`unsupported_response_type` error page (OAuth 2.1 removes implicit flow)
| `d7ca315`, `91b8863` |

### Changes by File

**`coderd/httpmw/csrf.go`** — Extended the CSRF `ExemptFunc` to enforce
CSRF on `/oauth2/authorize` in addition to `/api` routes. The consent
form POST is now CSRF-protected to prevent cross-site authorization code
theft.

**`site/site.go`** — Added `Content-Security-Policy: frame-ancestors
'none'` and `X-Frame-Options: DENY` headers to `RenderOAuthAllowPage`
(consent page only — does not affect the SPA/global CSP used by AI
tasks).

**`coderd/httpapi/queryparams.go`** — Changed `RedirectURL` from prefix
matching (`strings.HasPrefix(v.Path, base.Path)`) to full URI exact
matching (`v.String() != base.String()`), comparing scheme, host, path,
and query.

**`coderd/oauth2provider/authorize.go`** — Added PKCE enforcement:
`code_challenge` is required when `response_type=code` (via a
conditional check, not `RequiredNotEmpty`, so `response_type=token` can
reach the explicit rejection path). `ShowAuthorizePage` (GET) validates
`response_type` before rendering and returns a 400 error page for
unsupported types. `ProcessAuthorize` (POST) stores the `redirect_uri`
with the auth code when explicitly provided.

**`coderd/oauth2provider/tokens.go`** — PKCE verification is now
unconditional (not gated on `code_challenge` being present in DB). If
the stored code has a `redirect_uri`, the token endpoint verifies it
matches exactly — mismatch returns `errBadCode` → `invalid_grant`.
Missing `code_verifier` returns `invalid_grant`.

**`codersdk/oauth2.go`** — `OAuth2ProviderResponseTypeToken` constant
and `Valid()` acceptance are **kept** so the authorize handler can parse
`response_type=token` and return the proper `unsupported_response_type`
error rather than failing at parameter validation.

**`coderd/database/migrations/000421_*`** — Added `redirect_uri text`
column to `oauth2_provider_app_codes`.

### Design Decisions

**`state` parameter remains optional** — The plan initially required
`state` via `RequiredNotEmpty`, but this was reverted in `376a753` to
avoid breaking existing clients. The `state` is still hashed and stored
when provided (via `state_hash` column), securing clients that opt in.

**`response_type=token` kept in `Valid()`** — Removing it from `Valid()`
would cause the parameter parser to reject the request before the
authorize handler can return the proper `unsupported_response_type`
error. The constant is kept for correct error handling flow.

**CSP scoped to consent page only** — `frame-ancestors 'none'` is set
only on the OAuth consent page renderer, not globally. The SPA/global
CSP was previously changed to allow framing for AI tasks
([#18102](https://github.com/coder/coder/pull/18102)); this change does
not regress that.

### Out of Scope (follow-up PRs)

- Bearer tokens in query strings (needs internal caller audit)
- Scope enforcement on OAuth2 tokens
- Rate limiting on dynamic client registration


---

<details>
<summary>📋 Implementation Plan</summary>

# Plan: Harden OAuth2 Provider — Security Fixes + OAuth 2.1 Compliance

## Context & Why

Security issue `coder/security#121` reports a critical session takeover
via CSRF on the OAuth2 provider. This plan covers all remaining security
fixes from that issue **plus** convergence on OAuth 2.1 requirements.
The goal is a single PR that closes all actionable gaps.

## Current State (already committed on branch `csrf-sjx1`)

| Fix | Status | Commits |
|-----|--------|---------|
| Fix 1: CSRF on `/oauth2/authorize` |  Done | `ba7d646`, `b94a64e` |
| CSRF token in consent form HTML |  Done | `b94a64e` |
| `state_hash` column + storage |  Done (hash stored, but state still
optional) | `9167d83`, `b94a64e` |
| Tests for CSRF + state hash |  Done | `e4119b5` |

## Remaining Work

### ~~Fix 2 — Require `state` parameter~~ (DROPPED)

> **Decision:** Do not enforce `state` as required. The `state`
parameter is still hashed and stored when provided (via
`hashOAuth2State` / `state_hash` column from prior commits), but clients
are not forced to supply it. This avoids breaking existing integrations
that omit state.

**Rollback:** Remove `"state"` from the `RequiredNotEmpty` call in
`coderd/oauth2provider/authorize.go:42`:

```go
// BEFORE (current on branch)
p.RequiredNotEmpty("response_type", "client_id", "state", "code_challenge")

// AFTER
p.RequiredNotEmpty("response_type", "client_id", "code_challenge")
```

No test changes needed — tests already pass `state` voluntarily.

### Fix 4 — Exact redirect URI matching

Currently `coderd/httpapi/queryparams.go:233` uses prefix matching:

```go
// CURRENT — prefix match
if v.Host != base.Host || !strings.HasPrefix(v.Path, base.Path) {
```

OAuth 2.1 requires **exact string matching**. Change to:

```go
// AFTER — exact match (OAuth 2.1 §4.1.2.1)
if v.Host != base.Host || v.Path != base.Path {
```

**File: `coderd/httpapi/queryparams.go` — `RedirectURL` method**

Also update the error message from "must be a subset of" to "must
exactly match".

**Additionally**, store `redirect_uri` with the auth code and verify at
the token endpoint (RFC 6749 §4.1.3):

1. **New migration** (same migration file or a new `000421`): Add
`redirect_uri text` column to `oauth2_provider_app_codes`
2. **Update INSERT query** in `coderd/database/queries/oauth2.sql` to
include `redirect_uri`
3. **`coderd/oauth2provider/authorize.go`**: Store
`params.redirectURL.String()` when inserting the code
4. **`coderd/oauth2provider/tokens.go`**: After retrieving the code from
DB, verify that `redirect_uri` from the token request matches the stored
value exactly. Currently `tokens.go:103` calls `p.RedirectURL(vals,
callbackURL, "redirect_uri")` for prefix validation only — it must
compare against the stored redirect_uri from the code, not just the
app's callback URL.

<details>
<summary>Why both exact match AND store+verify?</summary>

Exact matching at the authorize endpoint prevents open redirectors
(attacker can't use a sub-path).
Storing and verifying at the token endpoint prevents code injection — an
attacker who steals a code can't exchange it with a different
redirect_uri than was originally authorized. This is required by RFC
6749 §4.1.3 and OAuth 2.1.
</details>

### Fix 7 — `frame-ancestors` CSP on consent page

The consent page can be iframed by a workspace app (same-site), which is
the attack vector. Add a `Content-Security-Policy` header to prevent
framing.

**File: `site/site.go` — `RenderOAuthAllowPage` function (~line 731)**

Before writing the response, add:

```go
func RenderOAuthAllowPage(rw http.ResponseWriter, r *http.Request, data RenderOAuthAllowData) {
    rw.Header().Set("Content-Type", "text/html; charset=utf-8")
    // Prevent the consent page from being framed to mitigate
    // clickjacking attacks (coder/security#121).
    rw.Header().Set("Content-Security-Policy", "frame-ancestors 'none'")
    rw.Header().Set("X-Frame-Options", "DENY")
    ...
```

Both headers for defense-in-depth (CSP for modern browsers,
X-Frame-Options for legacy).

### OAuth 2.1 — Mandatory PKCE

Currently PKCE is checked only when `code_challenge` was provided during
authorization (`tokens.go:258`):

```go
// CURRENT — conditional check
if dbCode.CodeChallenge.Valid && dbCode.CodeChallenge.String != "" {
    // verify PKCE
}
```

OAuth 2.1 requires PKCE for ALL authorization code flows. Change to:

**File: `coderd/oauth2provider/authorize.go`** — Add `"code_challenge"`
to required params:

```go
p.RequiredNotEmpty("response_type", "client_id", "code_challenge")
```

**File: `coderd/oauth2provider/tokens.go:257-265`** — Make PKCE
verification unconditional:

```go
// AFTER — PKCE always required (OAuth 2.1)
if req.CodeVerifier == "" {
    return codersdk.OAuth2TokenResponse{}, errInvalidPKCE
}
if !dbCode.CodeChallenge.Valid || dbCode.CodeChallenge.String == "" {
    // Code was issued without a challenge — should not happen
    // with the authorize endpoint enforcement, but defend in
    // depth.
    return codersdk.OAuth2TokenResponse{}, errInvalidPKCE
}
if !VerifyPKCE(dbCode.CodeChallenge.String, req.CodeVerifier) {
    return codersdk.OAuth2TokenResponse{}, errInvalidPKCE
}
```

**File: `codersdk/oauth2.go`** — Remove
`OAuth2ProviderResponseTypeToken` from the enum or reject it explicitly
in the authorize handler. Currently it's defined at line 216 but the
handler ignores `response_type` and always issues a code. We should
either:
- (a) Remove the `"token"` variant from the enum and reject it with
`unsupported_response_type`, OR
- (b) Add an explicit check in `ProcessAuthorize` that rejects
`response_type=token`

Option (b) is simpler and more backwards-compatible:

```go
// In ProcessAuthorize, after extracting params:
if params.responseType != codersdk.OAuth2ProviderResponseTypeCode {
    httpapi.WriteOAuth2Error(ctx, rw, http.StatusBadRequest,
        codersdk.OAuth2ErrorCodeUnsupportedResponseType,
        "Only response_type=code is supported")
    return
}
```

### OAuth 2.1 — Bearer tokens in query strings

`coderd/httpmw/apikey.go:743` accepts `access_token` from URL query
parameters. OAuth 2.1 prohibits this. However, this may be used
internally (e.g., workspace apps, DERP). Need to audit callers before
removing.

**Approach:** This is a larger change with potential breakage. Mark as a
**separate follow-up issue** rather than including in this PR. Document
the finding.

### OAuth 2.1 — Removed flows

 **Already compliant.** `tokens.go` only supports `authorization_code`
and `refresh_token` grant types. The implicit grant
(`response_type=token`) will be explicitly rejected per the PKCE section
above.

### OAuth 2.1 — Refresh token rotation

 **Already compliant.** `tokens.go:442` deletes the old API key when a
refresh token is used.

## Migration Plan

All DB changes can go in a single new migration (or extend 000420 if the
branch is rebased before merge). Columns to add:
- `redirect_uri text` on `oauth2_provider_app_codes`

The `state_hash` column is already added by migration 000420.

## Implementation Order

1. **Fix 7** — CSP headers on consent page (isolated, no deps)
2. ~~**Fix 2** — Require `state` parameter~~ (DROPPED — state stays
optional)
3. **Fix 4** — Exact redirect URI matching + store/verify redirect_uri
4. **PKCE mandatory** — Require `code_challenge` + reject
`response_type=token`
5. **Rollback** — Remove `"state"` from `RequiredNotEmpty` in
`authorize.go`
6. **Tests** — Update/add tests for all changes
7. **`make gen`** after DB changes

## Out of Scope (separate PRs)

- Bearer tokens in query strings (needs internal caller audit)
- Scope enforcement on OAuth2 tokens
- Rate limiting / quota on dynamic client registration

</details>

---
_Generated with [`mux`](https://github.com/coder/mux) • Model:
`anthropic:claude-opus-4-6` • Thinking: `xhigh`_
2026-02-23 12:18:44 +01:00
Danielle Maywood 02a80eac2e docs: document new terraform-managed devcontainers (#21978) 2026-02-19 11:45:04 +00:00
Cian Johnston 4a3304fc38 feat(cli)!: expire tokens by default (#21783)
## Summary

> NOTE: Calling this out as a breaking change in case existing consumers
of the CLI depend on being able to see expired tokens OR being able to
delete tokens immediately.

Updates the `coder tokens rm` command to immediately expire a token by
ID, preserving the token record for audit trail purposes. Tokens can
still be deleted by passing `--delete`.

## Problem

During an incident on dev.coder.com, operators needed to urgently expire
an API key that was stuck in a hot loop. The only way to do this was via
direct database access:

```sql
UPDATE api_keys SET expires_at = NOW() WHERE id = '...';
```

This is not ideal for operators who may not have direct DB access or
want to avoid manual SQL.

## Solution

This PR adds:

- **API endpoint**: `PUT /api/v2/users/{user}/keys/{keyid}/expire` -
Sets the token's `expires_at` to now
- **SDK method**: `ExpireAPIKey(ctx, userID, keyID)` 
- **Updates CLI**: `coder tokens rm <name|id|token>` now _expires_ by
default. You can still delete by passing the `--delete` flag. The `coder
tokens list` command now also hides expired tokens by default. You can
`--include-expired` if needed to include them.
- **Audit logging**: The expire action is logged with old and new key
states

## Test plan

- Tests cover: owner expiring own token, admin expiring other user's
token, non-admin cannot expire other's token, 404 for non-existent token

Closes #21782

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-17 13:16:46 +00:00
Atif Ali 63563e57db docs: add registry mirroring guide for Artifactory (#22025)
Verified to be working locally.

---------

Co-authored-by: Phorcys <57866459+phorcys420@users.noreply.github.com>
2026-02-16 18:29:48 +01:00
Susana Ferreira df84cea924 feat(scripts/metricsdocgen): support merging static and generated metrics files (#21464)
## Description

This PR refactors `scripts/metricsdocgen/main.go` to support merging static and generated metrics files for documentation generation.

The static `metrics` file remains necessary for metrics not defined in the coder codebase (`go_*`, `process_*`, `promhttp_*`, `coder_aibridged_*`), as well as **edge cases** the scanner cannot handle (e.g.,  such as metrics with runtime-determined labels or function-local variable references for fields, ...). Handling these edge cases in the scanner would make it significantly more complex, so we keep this hybrid approach to accommodate them. This means that in such cases, developers need to update the `metrics` file directly, meaning there is still a risk of out-of-date information in the documentation. However, this solution should already encompass most cases.

Static metrics take priority over generated metrics when both files contain the same metric name, allowing manual overrides without modifying the scanner. Some of these edge cases could be easily fixed by updating the codebase to use one of the supported patterns.

## Changes

* Update `scripts/metricsdocgen/main.go` to read from two separate metrics files:
  * `metrics`: static, manually maintained metrics (e.g., `go_*`, `process_*`, `promhttp_*`, `coder_aibridged_*`)
  * `generated_metrics`: auto-generated by the AST scanner
* Update `metrics` file to contain only static and edge-case metrics
* Skip metrics with empty HELP descriptions in the scanner
* Update `generated_metrics` to reflect skipped metrics
* Update `docs/admin/integrations/prometheus.md` with merged metrics

Related to: https://github.com/coder/coder/issues/13223

**Disclosure:** This PR was mainly developed with Claude Sonnet 4, with iterative review and refinement by @ssncferreira
2026-02-13 12:19:33 +00:00
Callum Styan 5f3be6b288 feat: add provisioner job queue wait time histogram and jobs enqueued counter (#21869)
This PR adds some metrics to help identify job enqueue rates and
latencies. This work was initiated as a way to help reduce the cost of
the observation/measurement itself for autostart scaletests, which
impacts our ability to identify/reason about the load caused by
autostart. See: https://github.com/coder/internal/issues/1209

I've extended the metrics here to account for regular user initiated
builds, prebuilds, autostarts, etc. IMO there is still the question here
of whether we want to include or need the `transition` label, which is
only present on workspace builds. Including it does lead to an increase
in cardinality, and in the case of the histogram (when not using native
histograms) that's at least a few extra series for every bucket. We
could remove the transition label there but keep it on the counter.

Additionally, the histogram is currently observing latencies for other
jobs, such as template builds/version imports, those do not have a
transition type associated with them.

Tested briefly in a workspace, can see metric values like the following:
-
`coderd_workspace_builds_enqueued_total{build_reason="autostart",provisioner_type="terraform",status="success",transition="start"}
1`
-
`coderd_provisioner_job_queue_wait_seconds_bucket{build_reason="autostart",job_type="workspace_build",provisioner_type="terraform",transition="start",le="0.025"}
1`

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 13:40:47 -08:00
Kacper Sawicki 60e3ab7632 feat(site)!: add consent prompt for auto-creation with prefilled parameters (#22011)
### Summary

Workspace created via mode=auto links now require explicit user
confirmation before provisioning. A warning dialog shows all prefilled
param.* values from the URL and blocks creation until the user clicks
`Confirm and Create`. Clicking `Cancel` falls back to the standard form
view.

<img width="820" height="475" alt="auto-create-consent-dialog"
src="https://github.com/user-attachments/assets/8339e3bd-434f-4a04-9385-436bf95f49d7"
/>

### Breaking behavior change

Links using `mode=auto` (e.g., "Open in Coder" buttons) will no longer
silently create workspaces. Users will now see a consent dialog and must
explicitly confirm before the workspace is provisioned. Any existing
integrations or automation relying on `mode=auto` for seamless workspace
creation will now require manual user interaction.

---------

Co-authored-by: Jake Howell <jacob@coder.com>
2026-02-12 15:39:02 +01:00
Jon Ayers 6035e45cb8 feat: add e2e workspace build duration metric (#21739)
Adds coderd_template_workspace_build_duration_seconds histogram that
tracks the full duration from workspace build creation to agent ready.
This captures the complete user-perceived build time including
provisioning and agent startup.

The metric is emitted when the agent reports ready/error/timeout via the
lifecycle API, ensuring each build is counted exactly once per replica.
2026-02-06 16:26:02 -06:00
blinkagent[bot] e5c3d151bb docs: add upgrade best practices guide (#21656) 2026-02-06 16:08:59 +00:00
blinkagent[bot] e98ee5e33d docs: fix incorrect path to coder modules in registry repo (#21976)
## Description

Fixes an incorrect path in the air-gapped/offline installation
documentation for publishing Coder modules to Artifactory.

The [coder/registry](https://github.com/coder/registry) repo has the
following structure:
```
registry/           # repo root
└── registry/       # subdirectory
    └── coder/
        └── modules/
```

The documentation previously instructed users to run:
```shell
cd registry/coder/modules
```

But the correct path is:
```shell
cd registry/registry/coder/modules
```

This was causing confusion for users trying to set up Coder modules in
air-gapped environments with Artifactory or similar repository managers.

Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
2026-02-06 09:30:03 -05:00
Steven Masley efd98bd93a chore: add template toggle to disable module caching (#21931)
There exists use cases to disable the new module caching behavior of
workspace builds. This was the legacy behavior.
2026-02-05 14:38:55 -06:00
Steven Masley 6b3d4377c3 feat: archive modules in size order until limit is hit (#21773)
Archiving modules attempts to save as many modules as it can before it hits the limit. Enabling the template as much as it can, rather than a hard failure.
2026-02-02 09:03:18 -06:00
Thomas Kosiewski dd6aec04d7 fix(coderd/oauth2provider): support client_secret_basic client auth (#21793) 2026-02-02 16:01:33 +01:00
Marcin Tojek 3e369c0b04 fix: separate SMTP envelope and header addresses (#21840)
## Description

When configuring a From address with a display name (e.g., `Coder System
<system@coder.com>`), the SMTP `MAIL FROM` command was incorrectly
receiving the full address string instead of just the bare email
address, causing `501 Invalid MAIL argument` errors on some SMTP
servers.

## Changes

- Updated `validateFromAddr` to return both:
  - `envelopeFrom`: bare email for SMTP `MAIL FROM` command (RFC 5321)
- `headerFrom`: original address with display name for email header (RFC
5322)

Fixes #20727
2026-02-02 13:53:02 +01:00
Marcin Tojek 036ed5672f fix!: remove deprecated prometheus metrics (#21788)
## Description

Removes the following deprecated Prometheus metrics:

- `coderd_api_workspace_latest_build_total` → use
`coderd_api_workspace_latest_build` instead
- `coderd_oauth2_external_requests_rate_limit_total` → use
`coderd_oauth2_external_requests_rate_limit` instead

These metrics were deprecated in #12976 because gauge metrics should
avoid the `_total` suffix per [Prometheus naming
conventions](https://prometheus.io/docs/practices/naming/).

## Changes

- Removed deprecated metric `coderd_api_workspace_latest_build_total`
from `coderd/prometheusmetrics/prometheusmetrics.go`
- Removed deprecated metric
`coderd_oauth2_external_requests_rate_limit_total` from
`coderd/promoauth/oauth2.go`
- Updated tests to use the non-deprecated metric name

Fixes #12999
2026-01-30 13:30:06 +01:00
Kacper Sawicki d09300eadf feat(cli): add 'coder login token' command to print session token (#21627)
Adds a new subcommand to print the current session token for use in
scripts and automation, similar to `gh auth token`.

## Usage

```bash
CODER_SESSION_TOKEN=$(coder login token)
```

Fixes #21515
2026-01-29 16:06:17 +01:00
Marcin Tojek 04b0253e8a feat: add Prometheus metrics for license warnings and errors (#21749)
Fixes: coder/internal#767

Adds two new Prometheus metrics for license health monitoring:

- `coderd_license_warnings` - count of active license warnings
- `coderd_license_errors` - count of active license errors

Metrics endpoint after startup of a deployment with license enabled:

```
...
# HELP coderd_license_errors The number of active license errors.
# TYPE coderd_license_errors gauge
coderd_license_errors 0
...
# HELP coderd_license_warnings The number of active license warnings.
# TYPE coderd_license_warnings gauge
coderd_license_warnings 0
...
```
2026-01-29 13:50:15 +01:00
Callum Styan d4cd982608 chore: undeprecate the workspace rename flag and clarify potential issues (#21669)
This undeprecates the `allow-workspace-renames` flag. IIUC, the 'danger'
with using this flag is that the workspace name might have been used in
the definition of some other terraform resources within template code,
so a rename could cause problems such as with persistent disks.

for https://github.com/coder/coder/issues/21628

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2026-01-27 10:53:13 -08:00
Callum Styan 806d7e4c11 docs: update metrics docs to include metadata batcher metrics (#21665)
This updates the metrics docs to include metrics added in
https://github.com/coder/coder/pull/21330

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2026-01-26 09:22:14 -08:00
Ben Potter ece531ab4e chore: mention usage data reporting in AI Gov docs (#21664)
<!--

If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.

-->
2026-01-23 21:40:17 +00:00
Spike Curtis f0152e291a docs: fix 10k docs to include 600 provisioners (#21597)
fixes typo in docs
2026-01-22 10:43:13 +04:00
Sas Swart ffa83a4ebc docs: add documentation for coder script ordering (#21090)
This Pull request adds documentation and guidance for the Coder script
ordering feature. We:
* explain the use case, benefits, and requirements.
* provide example configuration snippets
* discuss best practices and troubleshooting

---------

Co-authored-by: Cian Johnston <cian@coder.com>
Co-authored-by: DevCats <christofer@coder.com>
2026-01-14 14:40:38 +02:00
Andrew Aquino 0c5809726d fix(docs): show dynamic parameters demo in local GIF instead of Imgur link (#21487)
fixes this bug where the dynamic parameters demo GIF isn't viewable in
the UK:

<img width="720" height="798" alt="image"
src="https://github.com/user-attachments/assets/757cd4fb-6b32-4db8-87fa-31a01588d69d"
/>
2026-01-13 09:31:32 -08:00
George K cc2efe9e1f feat(coderd/rbac): make organization-member a per-org system custom role (#21359)
Migrated the built-in organization-member role to DB storage so it can be customized per org.

Closes https://github.com/coder/internal/issues/1073 (part 1)
2026-01-12 18:19:19 -08:00
Steven Masley 89f4d60e7b chore: remove experiment "terraform-directory-reuse" (#21397)
Experiment is no longer required, the new method will be released without an experiment and without a toggle

Main PR is: https://github.com/coder/coder/pull/21398
2026-01-09 11:13:16 -06:00
Spike Curtis 4bc49ed6eb docs: update scale architecture and add 10k user doc (#21454)
Updates 2k, 3k docs to match previous changes to 1k ( #21362), including new database recommendations.

Adds a 10k doc.
2026-01-09 08:16:11 +04:00
Atif Ali 989def7a94 docs: document coder_script resource (#21409)
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-07 00:04:46 +05:00
Spike Curtis ed6d41a5ef docs: simplify 1k scale architecture and change db recommendation (#21362)
DRAFT: I'd like feedback on this approach for 1k before I give the others the same treatment and add a 10k document.



- Bumps database requirements to 8 vCPU, 30 GB memory. In our testing database was nearly always the bottleneck. (This could come back down again with improvements to how we use it.)

- Removes specific machine type recommendations.
    - This only applies to VM-based deployments and many of our customers use Kubernetes.
    - The major clouds upgrade their machine teirs, so our recommendations go out of date
    - In its place we just give CPU and memory requirements
- Removes API requests per second
    - It's not a metric that many operators will know until they are already operating
    - Our API requests vary wildly in cost depending on what they are
    - Replaces them with Users | Running Workspaces | Concurrent Builds - which represents our scale testing scenarios, and are easier for operators to reason about.
- Removes specific advice about workspace sizing, instead gives the minimum specs for the agent
- Gives Kubernetes resource request/limits in notes
- Adds advice about not needing high performance disks for Coderd, but that provisioners will benefit.
2026-01-06 14:29:41 +04:00
blinkagent[bot] 874f3994b5 docs: update VS Code Web subpath comment to reflect current support (#21375)
Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
2026-01-02 17:16:27 +05:00
Bjorn Robertsson 5b3c24c02f docs: document multiple agents for port-forwarding (#21221)
Co-authored-by: Atif Ali <atif@coder.com>
2025-12-19 11:45:51 +00:00
Jason Barnett f9087d6feb fix: correct Slack webhook example code in documentation (#21295)
Fixes #21294
2025-12-17 11:27:39 +01:00
Steven Masley 8fefd91e4a feat!: support PKCE in the oauth2 client's auth/exchange flow (#21215)
**Breaking Change:** Existing oauth apps might now use PKCE. If an
unknown IdP type was being used, and it does not support PKCE, it will
break.

To fix, set the PKCE methods on the external auth to `none`
```
export CODER_EXTERNAL_AUTH_1_PKCE_METHODS=none
```
2025-12-15 17:41:47 +00:00
Mathias Fredriksson ea9f003cdd docs: clarify dev containers entry point and reduce callouts (#21188)
The user guide jumped straight into integration details without explaining
what dev containers are. Now it opens with a brief orientation linking to
the spec, then explains this guide covers the Docker-based approach.

Converted several NOTE callouts to prose where they were just cross-references
or stacked unnecessarily. The Envbuilder index note was reframed to lead with
its strengths rather than "we recommend the other thing."

Also updates platform support to Linux only per current status.

Refs #21157
2025-12-09 16:37:19 +02:00
Mathias Fredriksson f3e26ca557 docs: add guidance on when to use Project Discovery for Dev Containers (#21190)
Refs #21157
2025-12-09 16:36:19 +02:00
Mathias Fredriksson 97bc7eb9e5 docs: restructure dev container documentation (#21157)
Dev container admin docs were scattered across two locations: the Docker-based
integration under extending-templates/ and Envbuilder under managing-templates/.
There was no landing page explaining that two approaches exist or helping admins
choose between them.

This moves everything under admin/integrations/devcontainers/ with a decision
guide at the top. Dev containers are an integration with the dev container
specification, so integrations/ is a natural fit alongside JFrog, Vault, etc.

Stub pages remain at the original locations for discoverability.

New structure:

  admin/integrations/devcontainers/
  ├── index.md                                # Landing page + decision guide
  ├── integration.md                          # Docker-based dev containers
  └── envbuilder/
      ├── index.md
      ├── add-envbuilder.md
      ├── envbuilder-security-caching.md
      └── envbuilder-releases-known-issues.md

Refs #21080
2025-12-09 13:03:02 +02:00
Mathias Fredriksson 61beb7bfa8 docs: rewrite dev containers documentation for GA (#21080)
docs: rewrite dev containers documentation for GA

Corrects inaccuracies in SSH examples (deprecated `--container` flag),
port forwarding (native sub-agent forwarding is primary), and
prerequisites (dev containers are on by default). Fixes template
descriptions: docker-devcontainer uses native Dev Containers while
AWS/Kubernetes templates use Envbuilder.

Renames admin docs folder from `devcontainers/` to `envbuilder/` to
reflect actual content. Adds customization guide documenting agent
naming, display apps, custom apps, and variable interpolation. Documents
multi-repo workspace support and adds note about Terraform module
limitations with sub-agents. Fixes module registry URLs.

Refs #18907
2025-12-05 19:42:16 +02:00
Spike Curtis d5bb1361e2 docs: delete references to adding database replicas (#21077)
Removes references to adding database replicas from the scaling docs, as Coder only allows a single connection URL. These passages where added in error.
2025-12-03 16:15:58 +04:00
Marcin Tojek 65ef6df1df docs: add documentation for preset invalidation (#21018)
Fixes #17917
2025-12-03 11:43:49 +01:00
Mathias Fredriksson f1b2715555 docs: add data retention and export documentation for AI Bridge (#21055)
Previously AI Bridge retention was only documented in the auto-generated
CLI reference, making it difficult for administrators to discover and
understand how to configure data retention for compliance requirements.

This adds retention configuration to the AI Bridge setup guide with
examples, documents the REST API and CLI export options in the monitoring
guide, and cross-references AI Bridge from the central data retention
page for discoverability.

Closes #21038
2025-12-03 11:39:36 +02:00
Mathias Fredriksson ff46917e62 feat: add retention config for workspace_agent_logs (#21039)
Replace hardcoded 7-day retention for workspace agent logs with
configurable retention from deployment settings. Defaults to 7d to
preserve existing behavior.

Depends on #21038
Updates #20743
2025-12-02 16:01:33 +00:00
Mathias Fredriksson d9888ced11 docs: add data retention documentation (#21038)
Document configurable retention policies for Audit Logs, Connection Logs,
and API keys. Add new data-retention.md page and update existing docs to
reference it.

Depends on #21021
Updates #20743
2025-12-02 15:47:36 +00:00