relates to #21335
Enables the agent socket by default and updates docs to strike references to having to enable it.
The PRs in this stack change the MCP server that Tasks use to update their status to rely on the agent socket, rather than directly dialing Coderd with the agent token.
Disabling it by default was reasonable when it was only used for the experimental script ordering features, but now that we want to use it for Tasks, it should be on by default.
Replace manual experiment checks in web-push handlers with the
`RequireExperimentWithDevBypass` middleware on the route group, matching
the pattern used by OAuth2, Agents, and MCP experiments.
## Changes
- **`coderd/coderd.go`**: Add `RequireExperimentWithDevBypass`
middleware to `/webpush` route group
- **`coderd/webpush.go`**: Remove inline
`api.Experiments.Enabled(codersdk.ExperimentWebPush)` checks from all
three handlers
- **`cli/server.go`**: Gate webpush dispatcher initialization with
`buildinfo.IsDev()` fallback so dev builds always init the real
dispatcher
- **`coderd/webpush_test.go`**: Remove experiment enablement from tests
(dev bypass handles it)
Net effect: 26 lines removed, 5 added.
Created using whatchamacallits (Opus 4.6 Max)
## Problem
When the git askpass flow triggered diff status refreshes, it updated
**every chat** connected to the workspace. This was wasteful and could
cause confusing status updates on unrelated chats.
## Solution
Thread the chat ID through the entire git askpass flow so only the chat
that initiated the git operation gets updated:
1. **`coderd/chatd/chattool/execute.go`** — Sets `CODER_CHAT_ID` env var
on spawned processes (alongside the existing `CODER_CHAT_AGENT`)
2. **`cli/gitaskpass.go`** — Reads `CODER_CHAT_ID` from the environment
and sends it as a `chat_id` query parameter in the `ExternalAuthRequest`
3. **`codersdk/agentsdk/agentsdk.go`** — Adds `ChatID` field to
`ExternalAuthRequest` and encodes it as a query param
4. **`coderd/workspaceagents.go`** — Parses `chat_id` query param and
passes it through to `storeChatGitRef` and
`triggerWorkspaceChatDiffStatusRefresh`
5. **`coderd/chats.go`** — `storeChatGitRef` and
`refreshWorkspaceChatDiffStatuses` now scope updates to just the
initiating chat when a chat ID is provided, falling back to
all-workspace-chats behavior for backwards compatibility (non-chat git
operations)
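A minimal sketch of the scoping rule in step 5, assuming chats are identified by plain string IDs. `chatsToRefresh` is a hypothetical helper for illustration, not the function in `coderd/chats.go`, and how the real code treats an unknown chat ID is not shown here:

```go
package main

import "fmt"

// chatsToRefresh is a hypothetical helper. It returns only the initiating
// chat when chatID is set, and every workspace chat otherwise (the
// backwards-compatible path for non-chat git operations).
func chatsToRefresh(workspaceChats []string, chatID string) []string {
	if chatID == "" {
		return workspaceChats
	}
	return []string{chatID}
}

func main() {
	chats := []string{"chat-a", "chat-b", "chat-c"}
	fmt.Println(chatsToRefresh(chats, "chat-b")) // only the initiating chat
	fmt.Println(chatsToRefresh(chats, ""))       // all workspace chats
}
```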
Fixes three bugs that caused `coder update` to always re-prompt for
multi-select (`list(string)`) parameters instead of reusing previous
build values:
1. **`isValidTemplateParameterOption` failed for multi-select values**
(`cli/parameterresolver.go`): It compared the entire JSON array string
(e.g. `["vim","emacs"]`) against individual option values, which never
matched. Now parses the JSON array and validates each element
separately.
2. **`RichParameter` ignored previous build value for multi-select**
(`cli/cliui/parameter.go`): The `list(string)` branch always used the
template's default value instead of the `defaultValue` argument (which
carries the previous build's value). Now uses `defaultValue` when
available, falling back to the template default.
3. **Pre-existing crash when `list(string)` has no default value**
(`cli/cliui/parameter.go`): `json.Unmarshal` on an empty string caused
`unexpected end of JSON input`. Now skips unmarshaling when the default
source is empty.
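To illustrate the first fix, here is a rough sketch of per-element validation. `isValidMultiSelect` is a hypothetical stand-in, not the actual `isValidTemplateParameterOption` signature, and options are assumed to be plain strings:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isValidMultiSelect is a hypothetical stand-in for the fixed check. It
// parses the list(string) value as a JSON array and validates each element
// against the allowed options, instead of comparing the whole array string.
func isValidMultiSelect(value string, options []string) bool {
	var selected []string
	if err := json.Unmarshal([]byte(value), &selected); err != nil {
		return false // not a JSON array of strings (includes the empty string)
	}
	allowed := make(map[string]struct{}, len(options))
	for _, o := range options {
		allowed[o] = struct{}{}
	}
	for _, s := range selected {
		if _, ok := allowed[s]; !ok {
			return false
		}
	}
	return true
}

func main() {
	opts := []string{"vim", "emacs", "nano"}
	fmt.Println(isValidMultiSelect(`["vim","emacs"]`, opts)) // true
	fmt.Println(isValidMultiSelect(`["vim","ed"]`, opts))    // false
}
```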
Fixes #19956
Fixes #22030
## Problem
When a template has `require_active_version = true` and a workspace is
outdated, the web UI always shows "Update and start" as the **only**
button (for all users including admins), but `coder start` starts with
the old version. For admins, this silently succeeds on the stale
version. For non-admins, it goes through a clunky 403→retry path. This
also affects the VS Code extension, which calls `coder start --yes`
under the hood.
## Root Cause
`buildWorkspaceStartRequest()` in `cli/start.go` checks
`workspace.AutomaticUpdates == "always"` but ignores
`workspace.TemplateRequireActiveVersion`. The server-side autostart
already ORs both settings together:
```go
// coderd/autobuild/lifecycle_executor.go
func useActiveVersion(opts, ws) bool {
return opts.RequireActiveVersion || ws.AutomaticUpdates == "always"
}
```
The CLI was missing the `RequireActiveVersion` check.
## Fix
Add `workspace.TemplateRequireActiveVersion` to the existing OR
condition:
```go
// Before:
if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || action == WorkspaceUpdate {
// After:
if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || workspace.TemplateRequireActiveVersion || action == WorkspaceUpdate {
```
Now `coder start` and `coder restart` proactively use the active
template version when `require_active_version` is set, matching the web
UI and server autostart behavior. The 403→retry fallback remains as a
safety net but is no longer the primary path for any user.
## Testing
Updated `enterprise/cli/start_test.go` — all user types (owner, template
admin, ACL admin, group ACL admin, member) now expect the active version
when `require_active_version` is set, and verify the 403→retry message
does NOT appear.
When AgentAPI is configured, `WithTaskReporter` unconditionally
overrides all self-reported states to `working`. The intent was to
distrust the agent's `idle` and rely on the screen watcher, but the
override also blocks `failure` and `complete`, which only the agent can
produce (the screen watcher only knows `running`/`stable`). Tasks get
stuck as `working` or `null` forever.
Now only `idle` is overridden to `working`; `failure`, `complete`, and
`working` pass through as-is.
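The new rule can be sketched as follows. `effectiveState` is a hypothetical name, and states are modelled as plain strings rather than the SDK's types:

```go
package main

import "fmt"

// effectiveState is a hypothetical helper that mirrors the new rule:
// only a self-reported "idle" is rewritten to "working"; every other
// state is passed through unchanged.
func effectiveState(reported string) string {
	if reported == "idle" {
		return "working"
	}
	return reported
}

func main() {
	for _, s := range []string{"idle", "working", "failure", "complete"} {
		fmt.Printf("%s -> %s\n", s, effectiveState(s))
	}
}
```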
Also:
- Remove misplaced unconditional `"Failed to watch screen events"` log
that fired on every startup
- Add SSE reconnection with exponential backoff (1s-30s) in
`startWatcher` so it recovers from dropped connections instead of dying
silently
- Add `complete` to the `coder_report_task` tool enum, which the
`coder/claude-code` registry module already instructs agents to use but
was missing from the schema
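For the reconnection bullet, a doubling backoff clamped to the 1s-30s range might look like the sketch below. `nextBackoff` and the doubling factor are assumptions for illustration; the actual `startWatcher` logic may differ (for example, it may add jitter):

```go
package main

import (
	"fmt"
	"time"
)

const (
	minBackoff = time.Second
	maxBackoff = 30 * time.Second
)

// nextBackoff is a hypothetical helper: it starts at minBackoff, doubles
// the delay after each failed reconnect, and caps it at maxBackoff.
func nextBackoff(cur time.Duration) time.Duration {
	if cur < minBackoff {
		return minBackoff
	}
	next := cur * 2
	if next > maxBackoff {
		return maxBackoff
	}
	return next
}

func main() {
	var d time.Duration
	for i := 0; i < 7; i++ {
		d = nextBackoff(d)
		fmt.Println(d) // 1s, 2s, 4s, 8s, 16s, 30s, 30s
	}
}
```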
Refs coder/internal#1350
## Summary
Moves expired token filtering from client-side to server-side by adding
an `include_expired` parameter to the `GetAPIKeysByLoginType` and
`GetAPIKeysByUserID` database queries. This is more efficient for large
deployments with many expired/short-lived tokens.
## Changes
- Add `include_expired` parameter to SQL queries using `OR`
short-circuit
- Add `include_expired` query parameter to `GET
/users/{user}/keys/tokens`
- Add `IncludeExpired` field to `codersdk.TokensFilter`
- Remove client-side filtering from CLI `tokens list` command
- Add `TestTokensFilterExpired` test
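As an in-memory analogue of the `OR` short-circuit, the predicate can be sketched like this. The `apiKey` type, the Unix-seconds field, and the strict `>` comparison are assumptions for illustration, not the generated query code:

```go
package main

import "fmt"

// apiKey is a hypothetical, simplified key record.
type apiKey struct {
	Name          string
	ExpiresAtUnix int64
}

// filterKeys mirrors a condition of the form
// "include_expired OR expires_at > now": when includeExpired is true the
// expiry comparison is never evaluated.
func filterKeys(keys []apiKey, includeExpired bool, nowUnix int64) []apiKey {
	var out []apiKey
	for _, k := range keys {
		if includeExpired || k.ExpiresAtUnix > nowUnix {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	keys := []apiKey{
		{Name: "old", ExpiresAtUnix: 100},
		{Name: "fresh", ExpiresAtUnix: 300},
	}
	fmt.Println(len(filterKeys(keys, false, 200))) // 1
	fmt.Println(len(filterKeys(keys, true, 200)))  // 2
}
```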
Fixes coder/internal#1357
## Problem
When a template adds a new immutable parameter, `coder update
--parameter param=value` fails with:
```
error: start workspace: parameter "machine_type" is immutable and cannot be updated
```
The interactive prompt handles this correctly (allows setting first-time
immutable params), but the CLI `--parameter` flag path does not.
## Root Cause
In `cli/parameterresolver.go`, `verifyConstraints()` runs before the
interactive prompt and unconditionally rejects any immutable parameter
during updates. It doesn't distinguish between **new** immutable
parameters (first-time use, should be allowed) and **existing** ones
(already set, should be blocked from changing).
## Fix
Added an `isFirstTimeUse` check to the immutable parameter constraint,
matching the logic already used by the interactive prompt path (line
323). New immutable parameters can now be set via `--parameter`, while
existing immutable parameters are still blocked from being changed.
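A simplified sketch of the constraint, assuming previous build values are available as a name-to-value map. `param` and `canSetOnUpdate` are hypothetical names; the real `verifyConstraints()` works on the SDK's parameter types:

```go
package main

import "fmt"

// param is a hypothetical, simplified template parameter.
type param struct {
	Name    string
	Mutable bool
}

// canSetOnUpdate sketches the constraint: mutable parameters can always be
// set, while immutable ones are allowed only on first-time use, i.e. when
// the previous build has no value for them.
func canSetOnUpdate(p param, previous map[string]string) bool {
	if p.Mutable {
		return true
	}
	_, alreadySet := previous[p.Name]
	isFirstTimeUse := !alreadySet
	return isFirstTimeUse
}

func main() {
	previous := map[string]string{"region": "us-east"}
	fmt.Println(canSetOnUpdate(param{Name: "machine_type"}, previous)) // true: new immutable parameter
	fmt.Println(canSetOnUpdate(param{Name: "region"}, previous))       // false: existing immutable parameter
}
```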
## Testing
Added `TestUpdateValidateRichParameters/NewImmutableParameterViaFlag`
which:
1. Creates a workspace with a mutable parameter
2. Updates the template to add a new immutable parameter
3. Runs `coder update --parameter immutable_param=value`
4. Verifies the update succeeds and the parameter is set correctly
Fixes #22164
The provisioner state for a workspace build was being loaded for every
long-lived agent RPC connection. Since this state can be anywhere from
kilobytes to megabytes, this can cause the `coderd` memory footprint to
grow gradually over time. It also causes a lot of unnecessary allocations
for every query that fetches a workspace build, since only a few callers
ever actually reference the provisioner state.
This PR removes it from the returned workspace build and adds a query to
fetch the provisioner state explicitly.
`--secure-auth-cookie` now automatically sources its default value from `--access-url`.
If the access URL uses HTTPS, secure is set to `true`.
To revert to the old behavior, set the value explicitly to `false`.
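The defaulting rule can be sketched as below. `defaultSecureCookie` is a hypothetical helper, not the server's actual option wiring, and the hostname is a placeholder:

```go
package main

import (
	"fmt"
	"net/url"
)

// defaultSecureCookie is a hypothetical helper: the default is true only
// when the access URL uses the https scheme.
func defaultSecureCookie(accessURL string) (bool, error) {
	u, err := url.Parse(accessURL)
	if err != nil {
		return false, err
	}
	return u.Scheme == "https", nil
}

func main() {
	for _, raw := range []string{"https://coder.example.com", "http://localhost:3000"} {
		secure, err := defaultSecureCookie(raw)
		fmt.Println(raw, secure, err)
	}
}
```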
If a deployment has two domains, overriding the OIDC URL allows the OIDC
redirect to differ from the `access_url`.
In response to https://github.com/coder/coder/discussions/21500
**This config setting is hidden by default**
`coder templates version list` makes a call to determine the `active`
version:
```
➜ ~ coder templates version list aws-linux-dynamic
NAME CREATED AT CREATED BY STATUS ACTIVE
infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active
mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded
```
but this is not carried across to the `-ojson` output, so this
PR implements that in order to support programmatic addressing.
It is added as a top-level entry. If it should be nested under
`TemplateVersion`, let me know.
```
➜ ~ ./Downloads/coder-cli-templateversions-json-active templates version list aws-linux-dynamic -ojson | jq '.[] | select(.active == true) | { active, id: .TemplateVersion.id }'
{
"active": true,
"id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19"
}
➜ ~ ./Downloads/coder-cli-templateversions-json-active templates version list aws-linux-dynamic -ojson |jq '.[] | select(.active == true)'
{
"TemplateVersion": {
"id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19",
"template_id": "1a84ce78-06a6-41ad-99e4-8ea5d9b91e89",
"organization_id": "35f75f20-890e-4095-95f1-bb8f2ba02e79",
"created_at": "2025-10-10T10:34:02.254357+11:00",
"updated_at": "2025-10-10T10:34:46.594032+11:00",
"name": "infallible_feistel2",
"message": "Uploaded from the CLI",
"job": {
"id": "8afd05ca-b4be-48d5-a6b9-82dcfd12c960",
"created_at": "2025-10-10T10:34:02.251234+11:00",
"started_at": "2025-10-10T10:34:02.257301+11:00",
"completed_at": "2025-10-10T10:34:46.594032+11:00",
"status": "succeeded",
"worker_id": "a0940ade-ecdd-47c2-98c6-f2a4e5eb0733",
"file_id": "05fd653c-3a3f-4e5c-856b-29407732e1b1",
"tags": {
"owner": "",
"scope": "organization"
},
"queue_position": 0,
"queue_size": 0,
"organization_id": "35f75f20-890e-4095-95f1-bb8f2ba02e79",
"initiator_id": "d20c05ff-ecf3-4521-a99d-516c8befbaa6",
"input": {
"template_version_id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19"
},
"type": "template_version_import",
"metadata": {
"template_version_name": "",
"template_id": "00000000-0000-0000-0000-000000000000",
"template_name": "",
"template_display_name": "",
"template_icon": ""
},
"logs_overflowed": false
},
"readme": "---\ndxxxxx,
"created_by": {
"id": "d20c05ff-ecf3-4521-a99d-516c8befbaa6",
"username": "rowansmith",
"name": "rowan smith"
},
"archived": false,
"has_external_agent": false
},
"active": true
}
```
At present it is not possible to obtain the `id` of the template version
in the table output:
```
➜ ~ coder templates version list -h
coder v2.30.1+16408b1
USAGE:
coder templates versions list [flags] <template>
List all the versions of the specified template
OPTIONS:
-O, --org string, $CODER_ORGANIZATION
Select which organization (uuid or name) to use.
-c, --column [name|created at|created by|status|active|archived] (default: name,created at,created by,status,active)
Columns to display in table output.
➜ ~ coder templates version list aws-linux-dynamic
NAME CREATED AT CREATED BY STATUS ACTIVE
infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active
mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded
```
Adding this because it is useful when wanting to programmatically
retrieve the details of the latest template version, and `-ojson` does
not include `active` details in its output.
```
➜ Downloads ./coder-cli-templateversions-list-id templates version list -h
coder v2.30.1-devel+bab99db9e7
USAGE:
coder templates versions list [flags] <template>
List all the versions of the specified template
OPTIONS:
-O, --org string, $CODER_ORGANIZATION
Select which organization (uuid or name) to use.
-c, --column [id|name|created at|created by|status|active|archived] (default: name,created at,created by,status,active)
Columns to display in table output.
--include-archived bool
Include archived versions in the result list.
-o, --output table|json (default: table)
Output format.
———
Run `coder --help` for a list of global options.
➜ Downloads ./coder-cli-templateversions-list-id templates version list aws-linux-dynamic -c id,name,'created at','created by',status,active
ID NAME CREATED AT CREATED BY STATUS ACTIVE
38f66eae-ec63-49b7-a9d2-cdb79c379d19 infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active
aa797ea5-4221-461b-80b0-90c5164f8dc0 mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded
```
## Summary
> NOTE: Calling this out as a breaking change in case existing consumers
of the CLI depend on being able to see expired tokens OR being able to
delete tokens immediately.
Updates the `coder tokens rm` command to immediately expire a token by
ID, preserving the token record for audit trail purposes. Tokens can
still be deleted by passing `--delete`.
## Problem
During an incident on dev.coder.com, operators needed to urgently expire
an API key that was stuck in a hot loop. The only way to do this was via
direct database access:
```sql
UPDATE api_keys SET expires_at = NOW() WHERE id = '...';
```
This is not ideal for operators who may not have direct DB access or
want to avoid manual SQL.
## Solution
This PR adds:
- **API endpoint**: `PUT /api/v2/users/{user}/keys/{keyid}/expire` -
Sets the token's `expires_at` to now
- **SDK method**: `ExpireAPIKey(ctx, userID, keyID)`
- **Updates CLI**: `coder tokens rm <name|id|token>` now _expires_ by
default. You can still delete by passing the `--delete` flag. The `coder
tokens list` command now also hides expired tokens by default. Pass
`--include-expired` to include them.
- **Audit logging**: The expire action is logged with old and new key
states
## Test plan
- Tests cover: owner expiring own token, admin expiring other user's
token, non-admin cannot expire other's token, 404 for non-existent token
Closes #21782
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Problem
Site-wide admins (e.g., Owners) could not use `coder create --org <org>`
to create workspaces in organizations they are not members of. The error
was:
```
$ coder create my-workspace -t docker --org data-science
error: organization "data-science" not found, are you sure you are a member of this organization?
```
This was inconsistent with the web UI, where Owners can create
workspaces in any organization.
## Root Cause
The CLI's `OrganizationContext.Selected()` function only checked the
user's membership list, ignoring site-wide RBAC permissions that grant
Owners access to all organizations.
## Solution
Added a fallback in `OrganizationContext.Selected()` that fetches the
org directly via the API when not found in the membership list. This
works because the API endpoint applies RBAC filtering, allowing Owners
to read any org.
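A rough sketch of the fallback, with the API call abstracted as a function argument. `Organization`, `selectOrg`, and `fetch` are hypothetical simplifications of `OrganizationContext.Selected()` and the SDK client:

```go
package main

import "fmt"

// Organization is a hypothetical, simplified organization record.
type Organization struct {
	ID   string
	Name string
}

// selectOrg checks the user's memberships first, then falls back to
// fetching the organization directly. The fetch is expected to apply RBAC
// on the server side, so it succeeds for site-wide admins.
func selectOrg(name string, memberships []Organization, fetch func(string) (Organization, error)) (Organization, error) {
	for _, org := range memberships {
		if org.Name == name || org.ID == name {
			return org, nil
		}
	}
	org, err := fetch(name)
	if err != nil {
		return Organization{}, fmt.Errorf("organization %q not found: %w", name, err)
	}
	return org, nil
}

func main() {
	memberships := []Organization{{ID: "1", Name: "default"}}
	// Stub standing in for the API call in this sketch.
	fetch := func(name string) (Organization, error) {
		return Organization{ID: "2", Name: name}, nil
	}
	org, err := selectOrg("data-science", memberships, fetch)
	fmt.Println(org.Name, err)
}
```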
## Impact
This fixes `coder create --org` and all other CLI commands that use
`OrganizationContext.Selected()` (29+ commands), including:
- `coder templates push --org <any-org>`
- `coder organizations members add --org <any-org>`
- `coder provisioner list --org <any-org>`
## Testing
Added `TestEnterpriseCreate/OwnerCanCreateInNonMemberOrg` which:
- Creates an Owner user who is NOT a member of a second org
- Verifies they can create a workspace there using `--org`
- Properly fails without the code fix, passes with it
---
*This PR was generated by [mux](https://mux.coder.com) but reviewed by a
human.*
This PR adds some metrics to help identify job enqueue rates and
latencies. This work was initiated as a way to help reduce the cost of
the observation/measurement itself for autostart scaletests, which
impacts our ability to identify/reason about the load caused by
autostart. See: https://github.com/coder/internal/issues/1209
I've extended the metrics here to account for regular user initiated
builds, prebuilds, autostarts, etc. IMO there is still the question here
of whether we want to include or need the `transition` label, which is
only present on workspace builds. Including it does lead to an increase
in cardinality, and in the case of the histogram (when not using native
histograms) that's at least a few extra series for every bucket. We
could remove the transition label there but keep it on the counter.
Additionally, the histogram is currently observing latencies for other
jobs, such as template builds/version imports, those do not have a
transition type associated with them.
Tested briefly in a workspace, can see metric values like the following:
-
`coderd_workspace_builds_enqueued_total{build_reason="autostart",provisioner_type="terraform",status="success",transition="start"}
1`
-
`coderd_provisioner_job_queue_wait_seconds_bucket{build_reason="autostart",job_type="workspace_build",provisioner_type="terraform",transition="start",le="0.025"}
1`
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
This change adds Linux support for Desktop VPN by aligning Linux
behavior with the existing Windows daemon implementation and adding a
Linux networking stack implementation.
### What changed
- Consolidated the daemon command implementation into a shared file:
- `cli/vpndaemon_windows_linux.go` (`//go:build windows || linux`)
- Consolidated daemon tests into a shared file:
- `cli/vpndaemon_windows_linux_test.go` (`//go:build windows || linux`)
- Removed Linux-only duplicate daemon files:
- `cli/vpndaemon_linux.go`
- `cli/vpndaemon_linux_test.go`
- Removed unsupported-platform stubs per current supported OS targets:
- `cli/vpndaemon_other.go`
- `vpn/tun.go`
- Kept Linux networking stack implementation in:
- `vpn/tun_linux.go`
### Notes
- Linux now uses the same `rpc-read-handle` / `rpc-write-handle` flags
and env vars as Windows.
- The daemon logs to stderr (via CLI logger sinks), and does not forward
logs over the RPC pipe.
closes: https://github.com/coder/internal/issues/1331
Fixes up an issue in the test where we end up calling `FailNow` outside
the main test goroutine. Also adds the ability to name a `ptytest.PTY`
for cases like this one where we start multiple commands. This will help
debugging if we see the issue again.
This doesn't address the root cause of the failure, but I think we
should close the flake issue. I think we'd need like a stacktrace of all
goroutines at the point of failing the test, but that's way too much
effort unless we see this again.
follows on from #21940.
The API endpoints existed for this already, so this PR just adds CLI functionality which uses those API endpoints.
Generated with the help of Mux
## Summary
Fixes flaky `TestServer/BuiltinPostgres` test caused by port conflicts
in CI.
## Fix
Increase retry attempts from 3 to 10 for better odds when port conflicts
occur.
Fixes https://github.com/coder/internal/issues/1017
Adds additional logs for determining what signal the agent receives
prior to shut down. Also helps distinguish whether the signal originated
at the agent or reaper.
Context was created before expensive setup operations (building
workspaces, starting agents), leaving insufficient time for the actual
command execution. Split into setupCtx for setup and a fresh ctx for
the command to ensure both get the full timeout.
The API endpoints existed for this already, so this PR just adds CLI
functionality which uses those API endpoints.
closes #21891
Generated with the help of Mux
* Adds support for parameter `format=text` in the following API routes:
* `/api/v2/workspaceagents/:id/logs`
* `/api/v2/workspacebuilds/:id/logs`
* `/api/v2/templateversions/:id/logs`
* `/api/v2/templateversions/:id/dry-run/:id/logs`
* Adds links to view raw logs on the following pages:
* Workspace build page
* Template editor page
* Template version page
* Refactors existing log formatting in `cli/logs.go` to live in `codersdk`.
🤖 Generated with Claude Opus 4.5, reviewed by me.
---------
Co-authored-by: Claude <noreply@anthropic.com>
These tests use dbfake to set up database state directly and don't
need a provisioner daemon. Removing it fixes a flaky failure on
Windows where the provisioner daemon acquired a job that dbfake had
already "completed", causing the task status to be "error" instead
of "paused".
Fixes coder/internal#1322
Refs coder/internal#1323
Apply optimizations:
* https://github.com/openai/openai-go/pull/602
* https://github.com/coder/aibridge/pull/160
These reduce CPU time and allocation count for OpenAI `chat/completions`
and `responses` APIs, making the use of OpenAI chat models through AI
Bridge more performant.
In order to test these changes, we add scaletesting support for the
responses API.
## Description
Mark the `--ssh-hostname-prefix` flag and `CODER_SSH_HOSTNAME_PREFIX` env
variable as deprecated, recommending that users switch to
`--workspace-hostname-suffix` / `CODER_WORKSPACE_HOSTNAME_SUFFIX`
instead for consistency with Coder Desktop.
The deprecated option is now hidden from help output and docs but
remains functional for backward compatibility. When used, it will show a
deprecation warning pointing to the recommended alternative.
## Changes
- Added `UseInstead` pointing to `workspace-hostname-suffix` option
(triggers deprecation warning)
- Set `Hidden: true` to hide from CLI help and documentation
- Updated description to mention deprecation
- Regenerated docs and help files via `make gen`
Closes #18156
---
_Originally requested by @matifali in
https://github.com/coder/coder/pull/18085#discussion_r2115594447_
The test was creating two template versions without explicit names,
relying on `namesgenerator.NameDigitWith()` which can produce
collisions. When both versions got the same random name, the test failed
with a 409 Conflict error.
Fix by giving each version an explicit name (`v1`, `v2`).
Closes https://github.com/coder/internal/issues/1309
---
*Generated by [mux](https://mux.coder.com)*
Adds a new subcommand to print the current session token for use in
scripts and automation, similar to `gh auth token`.
## Usage
```bash
CODER_SESSION_TOKEN=$(coder login token)
```
Fixes #21515
Fixes: https://github.com/coder/internal/issues/560
"Select" CLI UI component should ignore "space" when `+Add custom value`
is highlighted. Otherwise it interprets that as a potential option...
and panics.
The reaper (PID 1) now returns the child's exit code instead of always
exiting 0. Signal termination uses the standard Unix convention of 128 +
signal number.
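The exit-code mapping can be sketched as below. `reaperExitCode` is a hypothetical helper that takes plain integers so it stays portable; the real reaper presumably reads these values from the child's wait status:

```go
package main

import "fmt"

// reaperExitCode is a hypothetical helper: a child that exited normally
// keeps its own exit code, and a child killed by a signal is reported as
// 128 + signal number.
func reaperExitCode(exited bool, status int, signal int) int {
	if exited {
		return status
	}
	return 128 + signal
}

func main() {
	fmt.Println(reaperExitCode(true, 3, 0))   // 3
	fmt.Println(reaperExitCode(false, 0, 9))  // 137 (SIGKILL)
	fmt.Println(reaperExitCode(false, 0, 15)) // 143 (SIGTERM)
}
```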
fixes #21661
The test occasionally times out at 15s on Windows CI runners.
Investigation of CI logs shows the HTTP request to the agent's
gitsshkey endpoint never appears in server logs, suggesting it
hangs before the request completes (possibly in connection setup,
middleware, or database queries). Increase to 60s to reduce flake
rate.
Fixes coder/internal#770
This undeprecates the `allow-workspace-renames` flag. IIUC, the 'danger'
with using this flag is that the workspace name might have been used in
the definition of some other terraform resources within template code,
so a rename could cause problems such as with persistent disks.
for https://github.com/coder/coder/issues/21628
---------
Signed-off-by: Callum Styan <callumstyan@gmail.com>