This PR adds a `WatchAllWorkspaces` function with `watch-all-workspaces`
endpoint, which can be used to listen on a single global pubsub channel
for _all_ workspace build updates, and makes use of it in the autostart
scaletest.
This negates the need to use a workspace watch pubsub channel _per_
workspace, which has auth overhead associated with each call. This is
especially relevant in situations such as the autostart scaletest, where
we need to start/stop a set of workspaces before we can configure their
autostart config. The overhead associated with all the watch requests
skews the scaletest results and makes it harder to reason about the
performance of the autostart feature itself.
The autostart scaletest also no longer generates its own metrics nor
does it wait for all the workspaces to actually start via autostart. We
should update the scaletest dashboard after both PRs are merged to
measure autostart performance via the new metrics.
The new function/endpoint and its usage in the autostart scaletest are
gated behind an experiment feature flag, this is something we should
discuss whether we want to enable the endpoint in prod by default or
not. If so, we can remove the experiment.
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Callum Styan <callum@coder.com>
_Disclaimer: implemented by a Coder Agent using Claude Opus 4.6._
Marks the injected MCP approach in AI Bridge as deprecated across the
codebase.
## Changes
- **`codersdk/deployment.go`**: Deprecated `ExternalAuthConfig.MCPURL`,
`.MCPToolAllowRegex`, `.MCPToolDenyRegex` fields; deprecated and hid the
`--aibridge-inject-coder-mcp-tools` server flag; deprecated
`AIBridgeConfig.InjectCoderMCPTools`.
- **`coderd/externalauth/externalauth.go`**: Deprecated `Config.MCPURL`,
`.MCPToolAllowRegex`, `.MCPToolDenyRegex`.
- **`enterprise/aibridgedserver/aibridgedserver.go`**: Added runtime
deprecation warning when `CODER_AIBRIDGE_INJECT_CODER_MCP_TOOLS` is
enabled; deprecated `getCoderMCPServerConfig`.
- **`enterprise/aibridged/mcp.go`**: Deprecated `MCPProxyBuilder`
interface and `MCPProxyFactory` struct.
- **`docs/ai-coder/ai-bridge/mcp.md`**: Added deprecation warning
banner.
OverrideVSCodeConfigs previously unconditionally set
`git.useIntegratedAskPass` and `github.gitAuthentication` to false,
clobbering any values provided by template authors via module settings
(e.g. the vscode-web module's settings block). This change only set
these keys when they are not already present, so template-provided
values are preserved.
Registry PR [#758](https://github.com/coder/registry/pull/758) fixed the
module side (run.sh merges template-author settings into the existing
settings.json instead of overwriting the file). But the agent still
unconditionally stamped false onto both keys before the script ran, so
the merge base always contained the agent's values and template authors
couldn't set them to anything else. This change fixes the agent side by
only writing defaults when the keys are absent.
WaitBuffer is a thread-safe io.Writer that supports blocking until
accumulated output matches a substring or custom predicate. It
replaces ad-hoc safeBuffer/syncWriter types and time.Sleep-based
poll loops in tests with signal-driven waits.
- WaitFor/WaitForNth/WaitForCond for blocking on output
- Replace custom buffer types in cli/sync_test.go and
provisionersdk/agent_test.go
- Convert time.Sleep poll loops to require.Eventually/require.Never
in cli/ssh_test.go, coderd/activitybump_test.go,
coderd/workspaceagentsrpc_test.go, workspaceproxy_test.go, and
scaletest tests
Removes `t.Parallel()` from `TestKeyring` and
`TestWindowsKeyring_WriteReadDelete`. The OS keyring is a shared system
resource that's flaky under concurrent access, especially Windows
Credential Manager in CI.
Fixescoder/internal#1370
- Adds `_API_BASE_URL` to `CODER_EXTERNAL_AUTH_CONFIG_`
- Extracts and refactors existing GitHub PR sync logic to new packages
`coderd/gitsync` and `coderd/externalauth/gitprovider`
- Associated wiring and tests
Created using Opus 4.6
The `start_with_dependencies` golden test was flaky on Windows CI. It
used `time.Sleep(100ms)` in a goroutine hoping the `sync start` command
would have time to call `SyncReady`, find the dependency unsatisfied,
and print the "Waiting..." message before the goroutine completed the
dependency.
On slower Windows runners, the sleep could finish and complete the
dependency before the command's first `SyncReady` call, so `ready` was
already `true` and the "Waiting..." message was never printed, causing
the golden file mismatch.
This replaces the `time.Sleep` with a `syncWriter` that wraps
`bytes.Buffer` with a mutex and a channel. The channel closes when the
written output contains the expected signal string ("Waiting"). The
goroutine blocks on this channel instead of sleeping, so it only
completes the dependency after the command has confirmed it is in the
waiting state.
Fixes https://github.com/coder/internal/issues/1376
_Disclaimer: implemented with Opus 4.6 and Coder Agents._
Follow-up to #22879.
## Problem
The `CODER_SESSION_TOKEN` guard added in #22879 blocks `coder login`
unconditionally when the env var is set. This conflicts with
`--use-token-as-session`, which intentionally uses the provided token
(including from the env var) directly as the session token.
## Fix
Add `&& !useTokenForSession` to the check so that `coder login
--use-token-as-session` still works when `CODER_SESSION_TOKEN` is set.
## Testing
Added `TestLogin/SessionTokenEnvVarWithUseTokenAsSession` — sets the env
var with a valid token and passes `--use-token-as-session`, verifying
login succeeds.
---------
Signed-off-by: Danny Kopping <danny@coder.com>
The `TestGitSSH/Local_SSH_Keys` test was flaking on Windows CI with a
context deadline exceeded error when calling `client.GitSSHKey(ctx)`.
Two issues contributed to the flake:
1. `prepareTestGitSSH` called `coderdtest.AwaitWorkspaceAgents` without
passing the caller's context. This created a separate internal 25s
timeout, wasting time budget independently of the setup context.
Changed to use `NewWorkspaceAgentWaiter(...).WithContext(ctx).Wait()`
so the agent wait shares the caller's timeout.
2. The `Local SSH Keys` subtest used `WaitLong` (25s) for its setup
context, but this subtest does more work than `Dial` (runs the
command twice). Bumped to `WaitSuperLong` (60s) to give slow
Windows CI runners enough time.
Fixescoder/internal#770
Handle errors that were previously assigned to blank identifiers in the
`cli/` package.
- ssh.go: Log ExistsViaCoderConnect DNS lookup error at debug level
instead of silently discarding it. Fallthrough behavior preserved.
- exp_scaletest_llmmock.go: Log srv.Stop() error via the existing
logger instead of discarding it.
_Disclaimer: created with Opus 4.6 and Coder Agents._
## Problem
When `CODER_SESSION_TOKEN` is set as an environment variable with an
invalid value, `coder login` fails with a confusing error:
```
error: Trace=[create api key: ]
You are signed out or your session has expired. Please sign in again to continue.
Suggestion: Try logging in using 'coder login'.
```
The suggestion to run `coder login` is what the user just did, making it
circular and unhelpful.
## Root cause
The `--token` flag is mapped to `CODER_SESSION_TOKEN` via serpent. When
the env var is set, `coder login` picks it up as the session token and
tries to use it to create a new API key, which fails because the token
is invalid. Even if login were to succeed and write a new token to disk,
subsequent commands would still use the env var (which takes precedence
over the on-disk token), so the user would remain stuck.
## Fix
Before attempting login, check if `CODER_SESSION_TOKEN` is set in the
environment. If so, return a clear error telling the user to unset it:
```
the environment variable CODER_SESSION_TOKEN is set, which takes precedence
over the session token stored on disk. Please unset it and try again.
unset CODER_SESSION_TOKEN
```
## Testing
Added `TestLogin/SessionTokenEnvVar` that verifies the error is returned
when the env var is set.
Previously `coder login token` didn't load the server URL from config,
so it always required --url or CODER_URL when using the keyring to store
the session token. This command would only print out the token when
already logged in to a deployment and file storage is used to store the
session token (keyring is the default on Windows/macOS). It would also
print out an incorrect token when --url was specified and the session
token stored on disk was for a different deployment that the user logged
into.
This change fixes all of these issues, and also errors out when using
session token file storage with a `--url` argument that doesn't match
the stored config URL, since the file only stores one token and would
silently return the wrong one.
See https://github.com/coder/coder/issues/22733 for a table of the
before/after behaviors.
The `--parameter-default` value is now used to pre-select the default option for a coder parameter
with option blocks when prompting interactively in CLI.
Related to: https://github.com/coder/coder/issues/22078
## Problem
When `coder ssh --stdio` checks for Coder Connect availability, it
constructs a hostname like `agent.workspace.owner.coder` and performs a
DNS AAAA lookup via `ExistsViaCoderConnect`. Without a trailing dot,
this hostname is not a fully-qualified domain name (FQDN), so the system
DNS resolver appends each configured search domain before querying.
Go's pure-Go DNS resolver (used when `CGO_ENABLED=0`, which is the
default for CLI builds) does **not** stop after getting NXDOMAIN on the
first name. It tries all names in the search list sequentially:
1. `agent.workspace.owner.coder.` → NXDOMAIN (fast)
2. `agent.workspace.owner.coder.corp.example.com.` → timeout
3. `agent.workspace.owner.coder.internal.company.com.` → timeout
On corporate networks where the search-domain-expanded queries hit DNS
infrastructure that drops rather than responds (common for nonsensical
hostnames with deep subdomain chains), each expanded query hits the full
DNS timeout (default 5s × 2 attempts = 10s per name). With 2-3 search
domains, this compounds to 20-30+ seconds of blocking.
## Fix
Adding a trailing dot marks the hostname as an FQDN. Go's `nameList()`
in `src/net/dnsclient_unix.go` returns a single-entry list for rooted
names, completely bypassing search domain expansion.
This is consistent with how `IsCoderConnectRunning` already handles its
DNS check — `tailnet.IsCoderConnectEnabledFmtString` includes a trailing
dot for exactly this reason.
## Verification
Tested with a fake DNS server that responds with NXDOMAIN for `.coder`
queries but drops search-domain-expanded queries:
| Hostname | Time | Queries sent |
|---|---|---|
| `main.workstation.kevin.coder` (no trailing dot) | **~15s** | 4 (as-is
+ 3 search domains) |
| `main.workstation.kevin.coder.` (trailing dot) | **<1ms** | 1 (FQDN
only) |
Closes https://github.com/coder/coder/issues/22581
_Generated by [mux](https://github.com/coder/mux) but reviewed by a
human_
## Description
Adds optional TLS support for the AI Bridge Proxy listener. When TLS cert and key files are provided, the proxy serves over HTTPS instead of plain HTTP.
## Changes
* New configuration options to enable TLS on the proxy listener
* Wraps the TCP listener in `tls.NewListener` when configured
* Tests for validation errors, invalid files, and full integration (tunneled + MITM) through a TLS listener
Note: Documentation for TLS listener setup and client configuration will be handled in a follow-up PR.
Related to: https://github.com/coder/internal/issues/1335
## Description
Renames internal fields, variables, and comments related to the proxy's certificate/key configuration to explicitly reference their MITM CA purpose.
The AI Bridge Proxy uses a CA certificate to sign dynamically generated leaf certificates during MITM interception of HTTPS traffic from AI clients. With the upcoming introduction of TLS listener certificates (for serving the proxy itself over HTTPS, implemented upstack https://github.com/coder/coder/pull/22411), the previous generic naming would become ambiguous. This refactor makes it clear which certificate is which.
No user-facing flags, environment variables, YAML keys, or JSON fields were changed, this is purely an internal rename to avoid confusion going forward.
Related to https://github.com/coder/internal/issues/1335
## Summary
Fixes cross-replica chat relay failing with:
```
failed to open initial relay for chat stream
error= dial relay stream: - failed to WebSocket dial: expected handshake response status code 101 but got 200
failed to open relay for message parts
error= dial relay stream: - failed to WebSocket dial: expected handshake response status code 101 but got 200
```
Subscribers see accurate `status=running` (delivered via pubsub) but
miss all in-progress `message_part` events (delivered only via the relay
WebSocket that never connects).
## Root cause
`redirectToAccessURL` in `cli/server.go` redirects any request whose
`Host` header doesn't match the access URL. The enterprise chat relay
dials another replica directly via its DERP relay address (e.g.
`http://10.0.0.2:8080`), so the `Host` header is the pod IP — not the
access URL.
This triggers a **307 redirect** to the access URL. The WebSocket
library follows the redirect, but the second request is a plain GET —
`Connection: Upgrade` and `Upgrade: websocket` headers are **not carried
over** by HTTP redirect semantics. The load-balanced access URL routes
the plain GET to any replica, which serves the SPA catch-all handler and
returns **HTTP 200 with `index.html`**.
The WebSocket library then fails: `expected handshake response status
code 101 but got 200`.
DERP mesh already has an exemption for this exact scenario
(`isDERPPath`). Chat relay was added later and didn't get one.
## Fix
Bypass `redirectToAccessURL` for requests that carry the
`X-Coder-Relay-Source-Replica` header, which the enterprise relay
already sets on every request (`enterprise/coderd/chatd/chatd.go:573`).
## Sequence diagram
**Before (broken):**
```
Replica A (subscriber) Replica B (worker) Load Balancer
| | |
|--- WS dial pod-ip:8080 ----->| |
| |-- 307 redirect to LB --->|
| | |
|<----------- plain GET (no Upgrade headers) ------------->|
| | |-- routes to any replica
|<----------- 200 index.html -------------------------------|
| |
X 'expected 101 but got 200' |
```
**After (fixed):**
```
Replica A (subscriber) Replica B (worker)
| |
|--- WS dial pod-ip:8080 ----->|
| (X-Coder-Relay-Source- |
| Replica header set) |
| |-- bypass redirect
|<--------- 101 Upgrade ------|
|<==== message_part events ====|
```
relates to #21335
Modifies our local MCP server used in Tasks to push task status updates over the agentsocket, rather than directly dialing Coderd. This will significantly reduce pressure on the database at scale because we can avoid expensive authentication of the agent API key.
Disclosure: I used AI to generate a lot of this PR, but hand-reviewed and tweaked it.
relates to #21335
Enables the agent socket by default and updates docs to strike references to having to enable it.
The PRs in this stack change the MCP server that Tasks use to update their status to rely on the agent socket, rather than directly dialing Coderd with the agent token.
Default disable was a reasonable default when it was only used for the experimental script ordering features, but now that we want to use it for Tasks, it should be default on.
Replace manual experiment checks in web-push handlers with the
`RequireExperimentWithDevBypass` middleware on the route group, matching
the pattern used by OAuth2, Agents, and MCP experiments.
## Changes
- **`coderd/coderd.go`**: Add `RequireExperimentWithDevBypass`
middleware to `/webpush` route group
- **`coderd/webpush.go`**: Remove inline
`api.Experiments.Enabled(codersdk.ExperimentWebPush)` checks from all
three handlers
- **`cli/server.go`**: Gate webpush dispatcher initialization with
`buildinfo.IsDev()` fallback so dev builds always init the real
dispatcher
- **`coderd/webpush_test.go`**: Remove experiment enablement from tests
(dev bypass handles it)
Net effect: -26 lines removed, +5 added.
Created using whatchamacallits (Opus 4.6 Max)
## Problem
When the git askpass flow triggered diff status refreshes, it updated
**every chat** connected to the workspace. This was wasteful and could
cause confusing status updates on unrelated chats.
## Solution
Thread the chat ID through the entire git askpass flow so only the chat
that initiated the git operation gets updated:
1. **`coderd/chatd/chattool/execute.go`** — Sets `CODER_CHAT_ID` env var
on spawned processes (alongside the existing `CODER_CHAT_AGENT`)
2. **`cli/gitaskpass.go`** — Reads `CODER_CHAT_ID` from the environment
and sends it as a `chat_id` query parameter in the `ExternalAuthRequest`
3. **`codersdk/agentsdk/agentsdk.go`** — Adds `ChatID` field to
`ExternalAuthRequest` and encodes it as a query param
4. **`coderd/workspaceagents.go`** — Parses `chat_id` query param and
passes it through to `storeChatGitRef` and
`triggerWorkspaceChatDiffStatusRefresh`
5. **`coderd/chats.go`** — `storeChatGitRef` and
`refreshWorkspaceChatDiffStatuses` now scope updates to just the
initiating chat when a chat ID is provided, falling back to
all-workspace-chats behavior for backwards compatibility (non-chat git
operations)
Fixes three bugs that caused `coder update` to always re-prompt for
multi-select (`list(string)`) parameters instead of reusing previous
build values:
1. **`isValidTemplateParameterOption` failed for multi-select values**
(`cli/parameterresolver.go`): It compared the entire JSON array string
(e.g. `["vim","emacs"]`) against individual option values, which never
matched. Now parses the JSON array and validates each element
separately.
2. **`RichParameter` ignored previous build value for multi-select**
(`cli/cliui/parameter.go`): The `list(string)` branch always used the
template's default value instead of the `defaultValue` argument (which
carries the previous build's value). Now uses `defaultValue` when
available, falling back to the template default.
3. **Pre-existing crash when `list(string)` has no default value**
(`cli/cliui/parameter.go`): `json.Unmarshal` on an empty string caused
`unexpected end of JSON input`. Now skips unmarshaling when the default
source is empty.
Fixes#19956
Fixes#22030
## Problem
When a template has `require_active_version = true` and a workspace is
outdated, the web UI always shows "Update and start" as the **only**
button (for all users including admins), but `coder start` starts with
the old version. For admins, this silently succeeds on the stale
version. For non-admins, it goes through a clunky 403→retry path. This
also affects the VS Code extension, which calls `coder start --yes`
under the hood.
## Root Cause
`buildWorkspaceStartRequest()` in `cli/start.go` checks
`workspace.AutomaticUpdates == "always"` but ignores
`workspace.TemplateRequireActiveVersion`. The server-side autostart
already ORs both settings together:
```go
// coderd/autobuild/lifecycle_executor.go
func useActiveVersion(opts, ws) bool {
return opts.RequireActiveVersion || ws.AutomaticUpdates == "always"
}
```
The CLI was missing the `RequireActiveVersion` check.
## Fix
Add `workspace.TemplateRequireActiveVersion` to the existing OR
condition:
```go
// Before:
if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || action == WorkspaceUpdate {
// After:
if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || workspace.TemplateRequireActiveVersion || action == WorkspaceUpdate {
```
Now `coder start` and `coder restart` proactively use the active
template version when `require_active_version` is set, matching the web
UI and server autostart behavior. The 403→retry fallback remains as a
safety net but is no longer the primary path for any user.
## Testing
Updated `enterprise/cli/start_test.go` — all user types (owner, template
admin, ACL admin, group ACL admin, member) now expect the active version
when `require_active_version` is set, and verify the 403→retry message
does NOT appear.
When AgentAPI is configured, `WithTaskReporter` unconditionally
overrides all self-reported states to `working`. The intent was to
distrust the agent's `idle` and rely on the screen watcher, but the
override also blocks `failure` and `complete`, which only the agent can
produce (the screen watcher only knows `running`/`stable`). Tasks get
stuck as `working` or `null` forever.
Now only `idle` is overridden to `working`; `failure`, `complete`, and
`working` pass through as-is.
Also:
- Remove misplaced unconditional `"Failed to watch screen events"` log
that fired on every startup
- Add SSE reconnection with exponential backoff (1s-30s) in
`startWatcher` so it recovers from dropped connections instead of dying
silently
- Add `complete` to the `coder_report_task` tool enum, which the
`coder/claude-code` registry module already instructs agents to use but
was missing from the schema
Refs coder/internal#1350
## Summary
Moves expired token filtering from client-side to server-side by adding
an `include_expired` parameter to the `GetAPIKeysByLoginType` and
`GetAPIKeysByUserID` database queries. This is more efficient for large
deployments with many expired/short-lived tokens.
## Changes
- Add `include_expired` parameter to SQL queries using `OR`
short-circuit
- Add `include_expired` query parameter to `GET
/users/{user}/keys/tokens`
- Add `IncludeExpired` field to `codersdk.TokensFilter`
- Remove client-side filtering from CLI `tokens list` command
- Add `TestTokensFilterExpired` test
Fixescoder/internal#1357
## Problem
When a template adds a new immutable parameter, `coder update
--parameter param=value` fails with:
```
error: start workspace: parameter "machine_type" is immutable and cannot be updated
```
The interactive prompt handles this correctly (allows setting first-time
immutable params), but the CLI `--parameter` flag path does not.
## Root Cause
In `cli/parameterresolver.go`, `verifyConstraints()` runs before the
interactive prompt and unconditionally rejects any immutable parameter
during updates. It doesn't distinguish between **new** immutable
parameters (first-time use, should be allowed) and **existing** ones
(already set, should be blocked from changing).
## Fix
Added an `isFirstTimeUse` check to the immutable parameter constraint,
matching the logic already used by the interactive prompt path (line
323). New immutable parameters can now be set via `--parameter`, while
existing immutable parameters are still blocked from being changed.
## Testing
Added `TestUpdateValidateRichParameters/NewImmutableParameterViaFlag`
which:
1. Creates a workspace with a mutable parameter
2. Updates the template to add a new immutable parameter
3. Runs `coder update --parameter immutable_param=value`
4. Verifies the update succeeds and the parameter is set correctly
Fixes#22164
The provisioner state for a workspace build was being loaded for every
long-lived agent rpc connection. Since this state can be anywhere from
kilobytes to megabytes this can gradually cause the `coderd` memory
footprint to grow over time. It's also a lot of unnecessary allocations
for every query that fetches a workspace build since only a few callers
ever actually reference the provisioner state.
This PR removes it from the returned workspace build and adds a query to
fetch the provisioner state explicitly.
`--secure-auth-cookie` now automatically sources it's default value from `--access-url`
If the access url uses HTTPS, secure is set to `true`.
To revert to old behavior, set the value explicitly to `false`
If a deployment has 2 domains, overriding the oidc url allows the oidc
redirect to differ from the access_url
response to https://github.com/coder/coder/discussions/21500
**This config setting is hidden by default**
`coder templates version list` makes a call to determine the `active`
version:
```
➜ ~ coder templates version list aws-linux-dynamic
NAME CREATED AT CREATED BY STATUS ACTIVE
infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active
mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded
```
but this is not carried across to the `-ojson` output version, so this
PR implements that in order to support programattic addressing.
It is added a top level entry. If it should be nested under
`TemplateVersion` let me know.
```
➜ ~ ./Downloads/coder-cli-templateversions-json-active templates version list aws-linux-dynamic -ojson | jq '.[] | select(.active == true) | { active, id: .TemplateVersion.id }'
{
"active": true,
"id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19"
}
➜ ~ ./Downloads/coder-cli-templateversions-json-active templates version list aws-linux-dynamic -ojson |jq '.[] | select(.active == true)'
{
"TemplateVersion": {
"id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19",
"template_id": "1a84ce78-06a6-41ad-99e4-8ea5d9b91e89",
"organization_id": "35f75f20-890e-4095-95f1-bb8f2ba02e79",
"created_at": "2025-10-10T10:34:02.254357+11:00",
"updated_at": "2025-10-10T10:34:46.594032+11:00",
"name": "infallible_feistel2",
"message": "Uploaded from the CLI",
"job": {
"id": "8afd05ca-b4be-48d5-a6b9-82dcfd12c960",
"created_at": "2025-10-10T10:34:02.251234+11:00",
"started_at": "2025-10-10T10:34:02.257301+11:00",
"completed_at": "2025-10-10T10:34:46.594032+11:00",
"status": "succeeded",
"worker_id": "a0940ade-ecdd-47c2-98c6-f2a4e5eb0733",
"file_id": "05fd653c-3a3f-4e5c-856b-29407732e1b1",
"tags": {
"owner": "",
"scope": "organization"
},
"queue_position": 0,
"queue_size": 0,
"organization_id": "35f75f20-890e-4095-95f1-bb8f2ba02e79",
"initiator_id": "d20c05ff-ecf3-4521-a99d-516c8befbaa6",
"input": {
"template_version_id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19"
},
"type": "template_version_import",
"metadata": {
"template_version_name": "",
"template_id": "00000000-0000-0000-0000-000000000000",
"template_name": "",
"template_display_name": "",
"template_icon": ""
},
"logs_overflowed": false
},
"readme": "---\ndxxxxx,
"created_by": {
"id": "d20c05ff-ecf3-4521-a99d-516c8befbaa6",
"username": "rowansmith",
"name": "rowan smith"
},
"archived": false,
"has_external_agent": false
},
"active": true
}
```
At present it is not possible to obtain the `id` of the template version
in the table output:
```
➜ ~ coder templates version list -h
coder v2.30.1+16408b1
USAGE:
coder templates versions list [flags] <template>
List all the versions of the specified template
OPTIONS:
-O, --org string, $CODER_ORGANIZATION
Select which organization (uuid or name) to use.
-c, --column [name|created at|created by|status|active|archived] (default: name,created at,created by,status,active)
Columns to display in table output.
➜ ~ coder templates version list aws-linux-dynamic
NAME CREATED AT CREATED BY STATUS ACTIVE
infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active
mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded
```
Adding this because it is useful when wanting to programatically
retrieve the details of the latest template version, and `-ojson` does
not include `active` details in it's output.
```
➜ Downloads ./coder-cli-templateversions-list-id templates version list -h
coder v2.30.1-devel+bab99db9e7
USAGE:
coder templates versions list [flags] <template>
List all the versions of the specified template
OPTIONS:
-O, --org string, $CODER_ORGANIZATION
Select which organization (uuid or name) to use.
-c, --column [id|name|created at|created by|status|active|archived] (default: name,created at,created by,status,active)
Columns to display in table output.
--include-archived bool
Include archived versions in the result list.
-o, --output table|json (default: table)
Output format.
———
Run `coder --help` for a list of global options.
➜ Downloads ./coder-cli-templateversions-list-id templates version list aws-linux-dynamic -c id,name,'created at','created by',status,active
ID NAME CREATED AT CREATED BY STATUS ACTIVE
38f66eae-ec63-49b7-a9d2-cdb79c379d19 infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active
aa797ea5-4221-461b-80b0-90c5164f8dc0 mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded
```
## Summary
> NOTE: Calling this out as a breaking change in case existing consumers
of the CLI depend on being able to see expired tokens OR being able to
delete tokens immediately.
Updates the `coder tokens rm` command to immediately expire a token by
ID, preserving the token record for audit trail purposes. Tokens can
still be deleted by passing `--delete`.
## Problem
During an incident on dev.coder.com, operators needed to urgently expire
an API key that was stuck in a hot loop. The only way to do this was via
direct database access:
```sql
UPDATE api_keys SET expires_at = NOW() WHERE id = '...';
```
This is not ideal for operators who may not have direct DB access or
want to avoid manual SQL.
## Solution
This PR adds:
- **API endpoint**: `PUT /api/v2/users/{user}/keys/{keyid}/expire` -
Sets the token's `expires_at` to now
- **SDK method**: `ExpireAPIKey(ctx, userID, keyID)`
- **Updates CLI**: `coder tokens rm <name|id|token>` now _expires_ by
default. You can still delete by passing the `--delete` flag. The `coder
tokens list` command now also hides expired tokens by default. You can
`--include-expired` if needed to include them.
- **Audit logging**: The expire action is logged with old and new key
states
## Test plan
- Tests cover: owner expiring own token, admin expiring other user's
token, non-admin cannot expire other's token, 404 for non-existent token
Closes#21782🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Problem
Site-wide admins (e.g., Owners) could not use `coder create --org <org>`
to create workspaces in organizations they are not members of. The error
was:
```
$ coder create my-workspace -t docker --org data-science
error: organization "data-science" not found, are you sure you are a member of this organization?
```
This was inconsistent with the web UI, where Owners can create
workspaces in any organization.
## Root Cause
The CLI's `OrganizationContext.Selected()` function only checked the
user's membership list, ignoring site-wide RBAC permissions that grant
Owners access to all organizations.
## Solution
Added a fallback in `OrganizationContext.Selected()` that fetches the
org directly via the API when not found in the membership list. This
works because the API endpoint applies RBAC filtering, allowing Owners
to read any org.
## Impact
This fixes `coder create --org` and all other CLI commands that use
`OrganizationContext.Selected()` (29+ commands), including:
- `coder templates push --org <any-org>`
- `coder organizations members add --org <any-org>`
- `coder provisioner list --org <any-org>`
## Testing
Added `TestEnterpriseCreate/OwnerCanCreateInNonMemberOrg` which:
- Creates an Owner user who is NOT a member of a second org
- Verifies they can create a workspace there using `--org`
- Properly fails without the code fix, passes with it
---
*This PR was generated by [mux](https://mux.coder.com) but reviewed by a
human.*
This PR adds some metrics to help identify job enqueue rates and
latencies. This work was initiated as a way to help reduce the cost of
the observation/measurement itself for autostart scaletests, which
impacts our ability to identify/reason about the load caused by
autostart. See: https://github.com/coder/internal/issues/1209
I've extended the metrics here to account for regular user initiated
builds, prebuilds, autostarts, etc. IMO there is still the question here
of whether we want to include or need the `transition` label, which is
only present on workspace builds. Including it does lead to an increase
in cardinality, and in the case of the histogram (when not using native
histograms) that's at least a few extra series for every bucket. We
could remove the transition label there but keep it on the counter.
Additionally, the histogram is currently observing latencies for other
jobs, such as template builds/version imports, those do not have a
transition type associated with them.
Tested briefly in a workspace, can see metric values like the following:
-
`coderd_workspace_builds_enqueued_total{build_reason="autostart",provisioner_type="terraform",status="success",transition="start"}
1`
-
`coderd_provisioner_job_queue_wait_seconds_bucket{build_reason="autostart",job_type="workspace_build",provisioner_type="terraform",transition="start",le="0.025"}
1`
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>