coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 04:58:23 +00:00

Author	SHA1	Message	Date
Callum Styan	36665e17b2	feat: add WatchAllWorkspaceBuilds endpoint for autostart scaletests (#22057 ) This PR adds a `WatchAllWorkspaces` function with a `watch-all-workspaces` endpoint, which can be used to listen on a single global pubsub channel for _all_ workspace build updates, and makes use of it in the autostart scaletest. This negates the need to use a workspace watch pubsub channel _per_ workspace, which has auth overhead associated with each call. This is especially relevant in situations such as the autostart scaletest, where we need to start/stop a set of workspaces before we can configure their autostart config. The overhead associated with all the watch requests skews the scaletest results and makes it harder to reason about the performance of the autostart feature itself. The autostart scaletest also no longer generates its own metrics, nor does it wait for all the workspaces to actually start via autostart. We should update the scaletest dashboard after both PRs are merged to measure autostart performance via the new metrics. The new function/endpoint and its usage in the autostart scaletest are gated behind an experiment feature flag. We should discuss whether to enable the endpoint in prod by default; if so, we can remove the experiment. --------- Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Callum Styan <callum@coder.com>	2026-03-13 20:37:41 -07:00
Danny Kopping	870583224d	chore: deprecate injected MCP approach in AI Bridge (#23031 ) _Disclaimer: implemented by a Coder Agent using Claude Opus 4.6._ Marks the injected MCP approach in AI Bridge as deprecated across the codebase. ## Changes - `codersdk/deployment.go`: Deprecated `ExternalAuthConfig.MCPURL`, `.MCPToolAllowRegex`, `.MCPToolDenyRegex` fields; deprecated and hid the `--aibridge-inject-coder-mcp-tools` server flag; deprecated `AIBridgeConfig.InjectCoderMCPTools`. - `coderd/externalauth/externalauth.go`: Deprecated `Config.MCPURL`, `.MCPToolAllowRegex`, `.MCPToolDenyRegex`. - `enterprise/aibridgedserver/aibridgedserver.go`: Added runtime deprecation warning when `CODER_AIBRIDGE_INJECT_CODER_MCP_TOOLS` is enabled; deprecated `getCoderMCPServerConfig`. - `enterprise/aibridged/mcp.go`: Deprecated `MCPProxyBuilder` interface and `MCPProxyFactory` struct. - `docs/ai-coder/ai-bridge/mcp.md`: Added deprecation warning banner.	2026-03-13 16:15:33 +02:00
Zach	2488cf0d41	fix(agent): don't overwrite existing vscode git auth settings (#22871 ) OverrideVSCodeConfigs previously unconditionally set `git.useIntegratedAskPass` and `github.gitAuthentication` to false, clobbering any values provided by template authors via module settings (e.g. the vscode-web module's settings block). This change only sets these keys when they are not already present, so template-provided values are preserved. Registry PR [#758](https://github.com/coder/registry/pull/758) fixed the module side (run.sh merges template-author settings into the existing settings.json instead of overwriting the file). But the agent still unconditionally stamped false onto both keys before the script ran, so the merge base always contained the agent's values and template authors couldn't set them to anything else. This change fixes the agent side by only writing defaults when the keys are absent.	2026-03-12 13:39:24 -06:00
Mathias Fredriksson	57af7abf1f	test: add testutil.WaitBuffer and replace time.Sleep in tests (#22922 ) WaitBuffer is a thread-safe io.Writer that supports blocking until accumulated output matches a substring or custom predicate. It replaces ad-hoc safeBuffer/syncWriter types and time.Sleep-based poll loops in tests with signal-driven waits. - WaitFor/WaitForNth/WaitForCond for blocking on output - Replace custom buffer types in cli/sync_test.go and provisionersdk/agent_test.go - Convert time.Sleep poll loops to require.Eventually/require.Never in cli/ssh_test.go, coderd/activitybump_test.go, coderd/workspaceagentsrpc_test.go, workspaceproxy_test.go, and scaletest tests	2026-03-12 18:07:52 +02:00
Zach	5cb820387c	fix: use quartz clock in task status test (#22969 ) Replace time.Since() usage with a quartz.Clock injected via RootCmd to ensure relative time strings ("Xs ago") are deterministic.	2026-03-12 08:33:09 -06:00
George K	e5c19d0af4	feat: backend support for creating and storing service accounts (#22698 ) Add is_service_account column to users table with CHECK constraints enforcing login_type='none' and empty email for service accounts. Update user creation API to validate service account constraints. Related to: https://linear.app/codercom/issue/PLAT-27/feat-backend-support-for-creating-and-storing-service-accounts	2026-03-11 10:19:08 -07:00
Kyle Carberry	71b132b9e7	fix(cli/sessionstore): don't run Windows keyring tests in parallel (#22937 ) Removes `t.Parallel()` from `TestKeyring` and `TestWindowsKeyring_WriteReadDelete`. The OS keyring is a shared system resource that's flaky under concurrent access, especially Windows Credential Manager in CI. Fixes coder/internal#1370	2026-03-11 15:19:56 +02:00
Cian Johnston	bc27274aba	feat(coderd): refactors github pr sync functionality (#22715 ) - Adds `_API_BASE_URL` to `CODER_EXTERNAL_AUTH_CONFIG_` - Extracts and refactors existing GitHub PR sync logic to new packages `coderd/gitsync` and `coderd/externalauth/gitprovider` - Associated wiring and tests Created using Opus 4.6	2026-03-10 18:46:01 +00:00
Mathias Fredriksson	33136dfe39	fix: use signal-based sync instead of time.Sleep in sync test (#22918 ) The `start_with_dependencies` golden test was flaky on Windows CI. It used `time.Sleep(100ms)` in a goroutine hoping the `sync start` command would have time to call `SyncReady`, find the dependency unsatisfied, and print the "Waiting..." message before the goroutine completed the dependency. On slower Windows runners, the sleep could finish and complete the dependency before the command's first `SyncReady` call, so `ready` was already `true` and the "Waiting..." message was never printed, causing the golden file mismatch. This replaces the `time.Sleep` with a `syncWriter` that wraps `bytes.Buffer` with a mutex and a channel. The channel closes when the written output contains the expected signal string ("Waiting"). The goroutine blocks on this channel instead of sleeping, so it only completes the dependency after the command has confirmed it is in the waiting state. Fixes https://github.com/coder/internal/issues/1376	2026-03-10 17:21:08 +00:00
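The signal-based wait described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the type name `signalWriter` and its methods are hypothetical, and only the idea (a mutex-guarded `bytes.Buffer` plus a channel that closes once the output contains the signal string) comes from the commit message.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"sync"
)

// signalWriter is a hypothetical stand-in for the syncWriter described in
// the commit message. It accumulates output and closes a channel the first
// time the accumulated output contains the signal string.
type signalWriter struct {
	mu     sync.Mutex
	buf    bytes.Buffer
	signal string
	once   sync.Once
	seen   chan struct{}
}

func newSignalWriter(signal string) *signalWriter {
	return &signalWriter{signal: signal, seen: make(chan struct{})}
}

// Write appends p to the buffer and checks the whole buffer, so a signal
// split across two writes is still detected.
func (w *signalWriter) Write(p []byte) (int, error) {
	w.mu.Lock()
	defer w.mu.Unlock()
	n, err := w.buf.Write(p)
	if strings.Contains(w.buf.String(), w.signal) {
		w.once.Do(func() { close(w.seen) })
	}
	return n, err
}

// Seen returns a channel that is closed once the signal has been written.
func (w *signalWriter) Seen() <-chan struct{} { return w.seen }

func main() {
	w := newSignalWriter("Waiting")
	done := make(chan struct{})
	go func() {
		// Block until the command has printed its waiting message,
		// instead of sleeping for a fixed duration.
		<-w.Seen()
		close(done)
	}()
	fmt.Fprint(w, "Wait")
	fmt.Fprint(w, "ing for dependency...\n")
	<-done
	fmt.Println("dependency completed after the waiting message")
}
```

Because the goroutine blocks on the channel rather than a timer, the ordering no longer depends on how fast the CI runner is.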
Danny Kopping	2948400aef	fix(cli): skip CODER_SESSION_TOKEN check when --use-token-as-session is set (#22888 ) _Disclaimer: implemented with Opus 4.6 and Coder Agents._ Follow-up to #22879. ## Problem The `CODER_SESSION_TOKEN` guard added in #22879 blocks `coder login` unconditionally when the env var is set. This conflicts with `--use-token-as-session`, which intentionally uses the provided token (including from the env var) directly as the session token. ## Fix Add `&& !useTokenForSession` to the check so that `coder login --use-token-as-session` still works when `CODER_SESSION_TOKEN` is set. ## Testing Added `TestLogin/SessionTokenEnvVarWithUseTokenAsSession` — sets the env var with a valid token and passes `--use-token-as-session`, verifying login succeeds. --------- Signed-off-by: Danny Kopping <danny@coder.com>	2026-03-10 15:40:54 +02:00
Mathias Fredriksson	73bf8478d8	fix(cli): fix flaky TestGitSSH/Local_SSH_Keys on Windows CI (#22883 ) The `TestGitSSH/Local_SSH_Keys` test was flaking on Windows CI with a context deadline exceeded error when calling `client.GitSSHKey(ctx)`. Two issues contributed to the flake: 1. `prepareTestGitSSH` called `coderdtest.AwaitWorkspaceAgents` without passing the caller's context. This created a separate internal 25s timeout, wasting time budget independently of the setup context. Changed to use `NewWorkspaceAgentWaiter(...).WithContext(ctx).Wait()` so the agent wait shares the caller's timeout. 2. The `Local SSH Keys` subtest used `WaitLong` (25s) for its setup context, but this subtest does more work than `Dial` (runs the command twice). Bumped to `WaitSuperLong` (60s) to give slow Windows CI runners enough time. Fixes coder/internal#770	2026-03-10 12:12:15 +02:00
Mathias Fredriksson	41c505f03b	fix(cli): handle ignored errors in ssh and scaletest commands (#22852 ) Handle errors that were previously assigned to blank identifiers in the `cli/` package. - ssh.go: Log ExistsViaCoderConnect DNS lookup error at debug level instead of silently discarding it. Fallthrough behavior preserved. - exp_scaletest_llmmock.go: Log srv.Stop() error via the existing logger instead of discarding it.	2026-03-10 12:08:40 +02:00
Danny Kopping	d936a99e6b	fix(cli): error when CODER_SESSION_TOKEN env var is set during login (#22879 ) _Disclaimer: created with Opus 4.6 and Coder Agents._ ## Problem When `CODER_SESSION_TOKEN` is set as an environment variable with an invalid value, `coder login` fails with a confusing error: ``` error: Trace=[create api key: ] You are signed out or your session has expired. Please sign in again to continue. Suggestion: Try logging in using 'coder login'. ``` The suggestion to run `coder login` is what the user just did, making it circular and unhelpful. ## Root cause The `--token` flag is mapped to `CODER_SESSION_TOKEN` via serpent. When the env var is set, `coder login` picks it up as the session token and tries to use it to create a new API key, which fails because the token is invalid. Even if login were to succeed and write a new token to disk, subsequent commands would still use the env var (which takes precedence over the on-disk token), so the user would remain stuck. ## Fix Before attempting login, check if `CODER_SESSION_TOKEN` is set in the environment. If so, return a clear error telling the user to unset it: ``` the environment variable CODER_SESSION_TOKEN is set, which takes precedence over the session token stored on disk. Please unset it and try again. unset CODER_SESSION_TOKEN ``` ## Testing Added `TestLogin/SessionTokenEnvVar` that verifies the error is returned when the env var is set.	2026-03-10 09:41:05 +00:00
Zach	14341edfc2	fix(cli): fix `coder login token` failing without --url flag (#22742 ) Previously `coder login token` didn't load the server URL from config, so it always required --url or CODER_URL when using the keyring to store the session token. This command would only print out the token when already logged in to a deployment and file storage is used to store the session token (keyring is the default on Windows/macOS). It would also print out an incorrect token when --url was specified and the session token stored on disk was for a different deployment that the user logged into. This change fixes all of these issues, and also errors out when using session token file storage with a `--url` argument that doesn't match the stored config URL, since the file only stores one token and would silently return the wrong one. See https://github.com/coder/coder/issues/22733 for a table of the before/after behaviors.	2026-03-10 08:57:27 +01:00
Danielle Maywood	4cf8d4414e	feat: make `coder task send` resume paused tasks (#22203 )	2026-03-07 01:36:03 +00:00
Cian Johnston	0b1e4880bd	chore(cli): fix TestTokens harder (#22684 ) `time.Now()` has finer than microsecond precision, while the timestamps we store in Postgres have only microsecond precision. Flake potential is non-zero.	2026-03-06 00:04:09 +00:00
George K	5dd570f099	fix(cli/cliui): apply defaults when rendering select prompts (#22093 ) The `--parameter-default` value is now used to pre-select the default option for a coder parameter with option blocks when prompting interactively in CLI. Related to: https://github.com/coder/coder/issues/22078	2026-03-05 09:35:57 -08:00
Ethan	5a5828b090	fix(cli): add trailing dot to Coder Connect hostname to prevent DNS search domain expansion (#22607 ) ## Problem When `coder ssh --stdio` checks for Coder Connect availability, it constructs a hostname like `agent.workspace.owner.coder` and performs a DNS AAAA lookup via `ExistsViaCoderConnect`. Without a trailing dot, this hostname is not a fully-qualified domain name (FQDN), so the system DNS resolver appends each configured search domain before querying. Go's pure-Go DNS resolver (used when `CGO_ENABLED=0`, which is the default for CLI builds) does not stop after getting NXDOMAIN on the first name. It tries all names in the search list sequentially: 1. `agent.workspace.owner.coder.` → NXDOMAIN (fast) 2. `agent.workspace.owner.coder.corp.example.com.` → timeout 3. `agent.workspace.owner.coder.internal.company.com.` → timeout On corporate networks where the search-domain-expanded queries hit DNS infrastructure that drops rather than responds (common for nonsensical hostnames with deep subdomain chains), each expanded query hits the full DNS timeout (default 5s × 2 attempts = 10s per name). With 2-3 search domains, this compounds to 20-30+ seconds of blocking. ## Fix Adding a trailing dot marks the hostname as an FQDN. Go's `nameList()` in `src/net/dnsclient_unix.go` returns a single-entry list for rooted names, completely bypassing search domain expansion. This is consistent with how `IsCoderConnectRunning` already handles its DNS check — `tailnet.IsCoderConnectEnabledFmtString` includes a trailing dot for exactly this reason. 
## Verification Tested with a fake DNS server that responds with NXDOMAIN for `.coder` queries but drops search-domain-expanded queries: \| Hostname \| Time \| Queries sent \| \|---\|---\|---\| \| `main.workstation.kevin.coder` (no trailing dot) \| ~15s \| 4 (as-is + 3 search domains) \| \| `main.workstation.kevin.coder.` (trailing dot) \| <1ms \| 1 (FQDN only) \| Closes https://github.com/coder/coder/issues/22581 _Generated by [mux](https://github.com/coder/mux) but reviewed by a human_	2026-03-06 01:56:54 +11:00
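A minimal sketch of the trailing-dot technique, assuming a hypothetical helper named `ensureFQDN` (the actual change in the PR may be structured differently):

```go
package main

import (
	"fmt"
	"strings"
)

// ensureFQDN appends a trailing dot so resolvers treat the name as rooted
// and skip search-domain expansion. Empty and already-rooted names are
// returned unchanged.
func ensureFQDN(host string) string {
	if host == "" || strings.HasSuffix(host, ".") {
		return host
	}
	return host + "."
}

func main() {
	fmt.Println(ensureFQDN("main.workstation.kevin.coder"))
	fmt.Println(ensureFQDN("main.workstation.kevin.coder."))
}
```

The rooted name can then be passed to a lookup such as `net.DefaultResolver.LookupIP`, which is not called here so the example stays offline.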
Cian Johnston	89cee2dd81	chore(cli): fix flaky temporal assertion in TestTokens (#22654 ) Fixes https://github.com/coder/internal/issues/1379	2026-03-05 10:18:51 +00:00
Susana Ferreira	21c91cebaa	feat: add TLS listener support to aibridgeproxyd (#22411 ) ## Description Adds optional TLS support for the AI Bridge Proxy listener. When TLS cert and key files are provided, the proxy serves over HTTPS instead of plain HTTP. ## Changes * New configuration options to enable TLS on the proxy listener * Wraps the TCP listener in `tls.NewListener` when configured * Tests for validation errors, invalid files, and full integration (tunneled + MITM) through a TLS listener Note: Documentation for TLS listener setup and client configuration will be handled in a follow-up PR. Related to: https://github.com/coder/internal/issues/1335	2026-03-05 09:19:34 +00:00
Susana Ferreira	c79e8f2707	refactor: clarify MITM certificate naming in aibridgeproxyd (#22408 ) ## Description Renames internal fields, variables, and comments related to the proxy's certificate/key configuration to explicitly reference their MITM CA purpose. The AI Bridge Proxy uses a CA certificate to sign dynamically generated leaf certificates during MITM interception of HTTPS traffic from AI clients. With the upcoming introduction of TLS listener certificates (for serving the proxy itself over HTTPS, implemented upstack https://github.com/coder/coder/pull/22411), the previous generic naming would become ambiguous. This refactor makes it clear which certificate is which. No user-facing flags, environment variables, YAML keys, or JSON fields were changed, this is purely an internal rename to avoid confusion going forward. Related to https://github.com/coder/internal/issues/1335	2026-03-05 09:06:38 +00:00
Kyle Carberry	a6b9a25f82	fix(cli): bypass access URL redirect for inter-replica chat relay (#22635 ) ## Summary Fixes cross-replica chat relay failing with: ``` failed to open initial relay for chat stream error= dial relay stream: - failed to WebSocket dial: expected handshake response status code 101 but got 200 failed to open relay for message parts error= dial relay stream: - failed to WebSocket dial: expected handshake response status code 101 but got 200 ``` Subscribers see accurate `status=running` (delivered via pubsub) but miss all in-progress `message_part` events (delivered only via the relay WebSocket that never connects). ## Root cause `redirectToAccessURL` in `cli/server.go` redirects any request whose `Host` header doesn't match the access URL. The enterprise chat relay dials another replica directly via its DERP relay address (e.g. `http://10.0.0.2:8080`), so the `Host` header is the pod IP — not the access URL. This triggers a 307 redirect to the access URL. The WebSocket library follows the redirect, but the second request is a plain GET — `Connection: Upgrade` and `Upgrade: websocket` headers are not carried over by HTTP redirect semantics. The load-balanced access URL routes the plain GET to any replica, which serves the SPA catch-all handler and returns HTTP 200 with `index.html`. The WebSocket library then fails: `expected handshake response status code 101 but got 200`. DERP mesh already has an exemption for this exact scenario (`isDERPPath`). Chat relay was added later and didn't get one. ## Fix Bypass `redirectToAccessURL` for requests that carry the `X-Coder-Relay-Source-Replica` header, which the enterprise relay already sets on every request (`enterprise/coderd/chatd/chatd.go:573`). 
## Sequence diagram Before (broken): ``` Replica A (subscriber) Replica B (worker) Load Balancer \| \| \| \|--- WS dial pod-ip:8080 ----->\| \| \| \|-- 307 redirect to LB --->\| \| \| \| \|<----------- plain GET (no Upgrade headers) ------------->\| \| \| \|-- routes to any replica \|<----------- 200 index.html -------------------------------\| \| \| X 'expected 101 but got 200' \| ``` After (fixed): ``` Replica A (subscriber) Replica B (worker) \| \| \|--- WS dial pod-ip:8080 ----->\| \| (X-Coder-Relay-Source- \| \| Replica header set) \| \| \|-- bypass redirect \|<--------- 101 Upgrade ------\| \|<==== message_part events ====\| ```	2026-03-04 20:26:03 -05:00
Kayla はな	e35717bc19	fix: show a notice when workspace sharing is disabled globally in organization settings (#22580 )	2026-03-04 11:14:52 -07:00
Spike Curtis	1a30ca1a2a	chore: use agentsocket for task status updates in MCP server (#22354 ) relates to #21335 Modifies our local MCP server used in Tasks to push task status updates over the agentsocket, rather than directly dialing Coderd. This will significantly reduce pressure on the database at scale because we can avoid expensive authentication of the agent API key. Disclosure: I used AI to generate a lot of this PR, but hand-reviewed and tweaked it.	2026-03-04 21:41:21 +04:00
Spike Curtis	56eb57caf4	chore: enable agent socket by default (#22352 ) relates to #21335 Enables the agent socket by default and updates docs to strike references to having to enable it. The PRs in this stack change the MCP server that Tasks use to update their status to rely on the agent socket, rather than directly dialing Coderd with the agent token. Default disable was a reasonable default when it was only used for the experimental script ordering features, but now that we want to use it for Tasks, it should be default on.	2026-03-03 21:23:59 +04:00
Cian Johnston	517cb0ce73	refactor(webpush): use RequireExperimentWithDevBypass middleware (#22525 ) Replace manual experiment checks in web-push handlers with the `RequireExperimentWithDevBypass` middleware on the route group, matching the pattern used by OAuth2, Agents, and MCP experiments. ## Changes - `coderd/coderd.go`: Add `RequireExperimentWithDevBypass` middleware to `/webpush` route group - `coderd/webpush.go`: Remove inline `api.Experiments.Enabled(codersdk.ExperimentWebPush)` checks from all three handlers - `cli/server.go`: Gate webpush dispatcher initialization with `buildinfo.IsDev()` fallback so dev builds always init the real dispatcher - `coderd/webpush_test.go`: Remove experiment enablement from tests (dev bypass handles it) Net effect: -26 lines removed, +5 added. Created using whatchamacallits (Opus 4.6 Max)	2026-03-03 09:49:04 +00:00
Mathias Fredriksson	b80dbd2d4e	test(cli): fix flaky TestProvisioners_Golden (#22491 ) Fixes coder/internal#449	2026-03-03 08:47:34 +00:00
Kyle Carberry	56f95a3e6d	fix: scope git askpass diff status updates to initiating chat (#22534 ) ## Problem When the git askpass flow triggered diff status refreshes, it updated every chat connected to the workspace. This was wasteful and could cause confusing status updates on unrelated chats. ## Solution Thread the chat ID through the entire git askpass flow so only the chat that initiated the git operation gets updated: 1. `coderd/chatd/chattool/execute.go` — Sets `CODER_CHAT_ID` env var on spawned processes (alongside the existing `CODER_CHAT_AGENT`) 2. `cli/gitaskpass.go` — Reads `CODER_CHAT_ID` from the environment and sends it as a `chat_id` query parameter in the `ExternalAuthRequest` 3. `codersdk/agentsdk/agentsdk.go` — Adds `ChatID` field to `ExternalAuthRequest` and encodes it as a query param 4. `coderd/workspaceagents.go` — Parses `chat_id` query param and passes it through to `storeChatGitRef` and `triggerWorkspaceChatDiffStatusRefresh` 5. `coderd/chats.go` — `storeChatGitRef` and `refreshWorkspaceChatDiffStatuses` now scope updates to just the initiating chat when a chat ID is provided, falling back to all-workspace-chats behavior for backwards compatibility (non-chat git operations)	2026-03-02 22:52:39 -05:00
Steven Masley	7bc454eed8	chore: version is 2.31 not 1.31 (#22494 )	2026-03-02 16:23:09 +00:00
Kyle Carberry	edee917d88	feat: add experimental agents support (#22290 ) feat: add AI chat system with agent tools and chat UI Introduce the chatd subsystem and Agents UI for AI-powered chat within Coder workspaces. - Add chatd package with chat loop, message compaction, prompt management, and LLM provider integration (OpenAI, Anthropic) - Add agent tools: create workspace, list/read templates, read/write/ edit files, execute commands - Add chat API endpoints with streaming, message editing, and durable reconnection - Add database schema and migrations for chats, chat messages, chat providers, and chat model configs - Add RBAC policies and dbauthz enforcement for chat resources - Add Agents UI pages with conversation timeline, queued messages list, diff viewer, and model configuration panel - Add comprehensive test coverage including coderd integration tests, chatd unit tests, and Storybook stories - Gate feature behind experiments flag --------- Co-authored-by: Cian Johnston <cian@coder.com> Co-authored-by: Danielle Maywood <danielle@themaywoods.com> Co-authored-by: Jeremy Ruppel <jeremy@coder.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-27 16:50:56 +00:00
Steven Masley	21bc185254	doc: add language to mention disruptive nature of cookie host prefix (#22384 )	2026-02-27 15:59:01 +00:00
Kacper Sawicki	ab28ecde88	fix(cli): reuse multi-select parameter values on workspace update (#22261 ) Fixes three bugs that caused `coder update` to always re-prompt for multi-select (`list(string)`) parameters instead of reusing previous build values: 1. `isValidTemplateParameterOption` failed for multi-select values (`cli/parameterresolver.go`): It compared the entire JSON array string (e.g. `["vim","emacs"]`) against individual option values, which never matched. Now parses the JSON array and validates each element separately. 2. `RichParameter` ignored previous build value for multi-select (`cli/cliui/parameter.go`): The `list(string)` branch always used the template's default value instead of the `defaultValue` argument (which carries the previous build's value). Now uses `defaultValue` when available, falling back to the template default. 3. Pre-existing crash when `list(string)` has no default value (`cli/cliui/parameter.go`): `json.Unmarshal` on an empty string caused `unexpected end of JSON input`. Now skips unmarshaling when the default source is empty. Fixes #19956	2026-02-26 14:34:30 +01:00
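The first and third fixes above (validating each element of the JSON array, and skipping the unmarshal on an empty value) can be sketched as below. The function name `isValidMultiSelect` and its signature are hypothetical and do not reproduce `isValidTemplateParameterOption`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isValidMultiSelect reports whether value, a JSON array of strings such as
// ["vim","emacs"], contains only entries present in options.
func isValidMultiSelect(value string, options []string) bool {
	// An empty string is not valid JSON; skip unmarshaling to avoid the
	// "unexpected end of JSON input" error.
	if value == "" {
		return false
	}
	var selected []string
	if err := json.Unmarshal([]byte(value), &selected); err != nil {
		return false
	}
	allowed := make(map[string]struct{}, len(options))
	for _, o := range options {
		allowed[o] = struct{}{}
	}
	// Validate each element separately rather than comparing the whole
	// array string against individual option values.
	for _, s := range selected {
		if _, ok := allowed[s]; !ok {
			return false
		}
	}
	return true
}

func main() {
	options := []string{"vim", "emacs", "nano"}
	fmt.Println(isValidMultiSelect(`["vim","emacs"]`, options))
	fmt.Println(isValidMultiSelect(`["vim","ed"]`, options))
	fmt.Println(isValidMultiSelect("", options))
}
```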
Jon Ayers	4e365e59b6	fix: add provision/tags to prebuilds scenario (#22294 )	2026-02-25 11:16:20 -06:00
Garrett Delfosse	6c16794173	fix(cli): proactively use active template version when require_active_version is set (#22033 ) Fixes #22030 ## Problem When a template has `require_active_version = true` and a workspace is outdated, the web UI always shows "Update and start" as the only button (for all users including admins), but `coder start` starts with the old version. For admins, this silently succeeds on the stale version. For non-admins, it goes through a clunky 403→retry path. This also affects the VS Code extension, which calls `coder start --yes` under the hood. ## Root Cause `buildWorkspaceStartRequest()` in `cli/start.go` checks `workspace.AutomaticUpdates == "always"` but ignores `workspace.TemplateRequireActiveVersion`. The server-side autostart already ORs both settings together: ```go // coderd/autobuild/lifecycle_executor.go func useActiveVersion(opts, ws) bool { return opts.RequireActiveVersion \|\| ws.AutomaticUpdates == "always" } ``` The CLI was missing the `RequireActiveVersion` check. ## Fix Add `workspace.TemplateRequireActiveVersion` to the existing OR condition: ```go // Before: if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways \|\| action == WorkspaceUpdate { // After: if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways \|\| workspace.TemplateRequireActiveVersion \|\| action == WorkspaceUpdate { ``` Now `coder start` and `coder restart` proactively use the active template version when `require_active_version` is set, matching the web UI and server autostart behavior. The 403→retry fallback remains as a safety net but is no longer the primary path for any user. ## Testing Updated `enterprise/cli/start_test.go` — all user types (owner, template admin, ACL admin, group ACL admin, member) now expect the active version when `require_active_version` is set, and verify the 403→retry message does NOT appear.	2026-02-24 19:51:48 -05:00
Mathias Fredriksson	947b390c5a	fix: allow agent-reported final states, add SSE reconnection (#22286 ) When AgentAPI is configured, `WithTaskReporter` unconditionally overrides all self-reported states to `working`. The intent was to distrust the agent's `idle` and rely on the screen watcher, but the override also blocks `failure` and `complete`, which only the agent can produce (the screen watcher only knows `running`/`stable`). Tasks get stuck as `working` or `null` forever. Now only `idle` is overridden to `working`; `failure`, `complete`, and `working` pass through as-is. Also: - Remove misplaced unconditional `"Failed to watch screen events"` log that fired on every startup - Add SSE reconnection with exponential backoff (1s-30s) in `startWatcher` so it recovers from dropped connections instead of dying silently - Add `complete` to the `coder_report_task` tool enum, which the `coder/claude-code` registry module already instructs agents to use but was missing from the schema Refs coder/internal#1350	2026-02-24 20:28:50 +02:00
Kacper Sawicki	1e274063d4	feat(coderd): filter expired API tokens server-side (#22263 ) ## Summary Moves expired token filtering from client-side to server-side by adding an `include_expired` parameter to the `GetAPIKeysByLoginType` and `GetAPIKeysByUserID` database queries. This is more efficient for large deployments with many expired/short-lived tokens. ## Changes - Add `include_expired` parameter to SQL queries using `OR` short-circuit - Add `include_expired` query parameter to `GET /users/{user}/keys/tokens` - Add `IncludeExpired` field to `codersdk.TokensFilter` - Remove client-side filtering from CLI `tokens list` command - Add `TestTokensFilterExpired` test Fixes coder/internal#1357	2026-02-24 15:27:03 +00:00
Kacper Sawicki	3c69d683f4	fix(cli): allow new immutable parameters via --parameter flag during update (#22221 ) ## Problem When a template adds a new immutable parameter, `coder update --parameter param=value` fails with: ``` error: start workspace: parameter "machine_type" is immutable and cannot be updated ``` The interactive prompt handles this correctly (allows setting first-time immutable params), but the CLI `--parameter` flag path does not. ## Root Cause In `cli/parameterresolver.go`, `verifyConstraints()` runs before the interactive prompt and unconditionally rejects any immutable parameter during updates. It doesn't distinguish between new immutable parameters (first-time use, should be allowed) and existing ones (already set, should be blocked from changing). ## Fix Added an `isFirstTimeUse` check to the immutable parameter constraint, matching the logic already used by the interactive prompt path (line 323). New immutable parameters can now be set via `--parameter`, while existing immutable parameters are still blocked from being changed. ## Testing Added `TestUpdateValidateRichParameters/NewImmutableParameterViaFlag` which: 1. Creates a workspace with a mutable parameter 2. Updates the template to add a new immutable parameter 3. Runs `coder update --parameter immutable_param=value` 4. Verifies the update succeeds and the parameter is set correctly Fixes #22164	2026-02-24 09:15:02 +01:00
Jon Ayers	0a7a3da178	fix: exclude provisioner_state from workspace_build_with_user view (#22159 ) The provisioner state for a workspace build was being loaded for every long-lived agent rpc connection. Since this state can be anywhere from kilobytes to megabytes this can gradually cause the `coderd` memory footprint to grow over time. It's also a lot of unnecessary allocations for every query that fetches a workspace build since only a few callers ever actually reference the provisioner state. This PR removes it from the returned workspace build and adds a query to fetch the provisioner state explicitly.	2026-02-23 22:46:17 -06:00
Sushant P	37a8e61ea2	chore: move Shared Workspaces from experiments to beta (#22206 ) * Removed the shared-workspaces experiment and cleaned up related middleware * Added beta tagging to the UI for shared workspaces	2026-02-23 08:30:32 -08:00
Steven Masley	b0f35316da	chore!: automatically use secure cookies if using https access-url (#22198 ) `--secure-auth-cookie` now automatically sources its default value from `--access-url`. If the access URL uses HTTPS, secure is set to `true`. To revert to the old behavior, set the value explicitly to `false`.	2026-02-20 10:33:37 -06:00
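The defaulting rule can be illustrated with a short sketch. The function `defaultSecureCookie` is hypothetical and only shows the scheme check; how the server wires the flag default is not described in the commit message.

```go
package main

import (
	"fmt"
	"net/url"
)

// defaultSecureCookie derives the default for the secure-cookie setting
// from the access URL: true when the scheme is https, false otherwise.
func defaultSecureCookie(accessURL string) (bool, error) {
	u, err := url.Parse(accessURL)
	if err != nil {
		return false, fmt.Errorf("parse access url: %w", err)
	}
	return u.Scheme == "https", nil
}

func main() {
	for _, raw := range []string{"https://coder.example.com", "http://localhost:3000"} {
		secure, err := defaultSecureCookie(raw)
		fmt.Println(raw, secure, err)
	}
}
```

An explicitly configured value would take precedence over this derived default.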
Steven Masley	efdaaa2c8f	chore: add oidc redirect url to override access url (#21521 ) If a deployment has two domains, overriding the OIDC redirect URL allows the OIDC redirect to differ from the access_url (in response to https://github.com/coder/coder/discussions/21500). This config setting is hidden by default.	2026-02-20 09:11:01 -06:00
Steven Masley	e5f64eb21d	chore: optionally prefix authentication related cookies (#22148 ) When the deployment option is enabled, auth cookies are prefixed with `__HOST-` ([info](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Set-Cookie)). This is all done in a middleware that intercepts all requests and strips the prefix from incoming request cookies.	2026-02-20 09:01:00 -06:00
Spike Curtis	1069ce6e19	feat: add support for agentsock on Windows (#22171 ) relates to #21335 Adds support for the agentsock and thus `coder exp sync` commands on Windows. This support was initially missing.	2026-02-20 16:27:32 +04:00
Rowan Smith	c664e4f72d	chore: add active field to template versions json output (#22165 ) `coder templates version list` makes a call to determine the `active` version: ``` ➜ ~ coder templates version list aws-linux-dynamic NAME CREATED AT CREATED BY STATUS ACTIVE infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded ``` but this is not carried across to the `-ojson` output, so this PR adds it in order to support programmatic addressing. It is added as a top-level entry. If it should be nested under `TemplateVersion`, let me know. ``` ➜ ~ ./Downloads/coder-cli-templateversions-json-active templates version list aws-linux-dynamic -ojson \| jq '.[] \| select(.active == true) \| { active, id: .TemplateVersion.id }' { "active": true, "id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19" } ➜ ~ ./Downloads/coder-cli-templateversions-json-active templates version list aws-linux-dynamic -ojson \|jq '.[] \| select(.active == true)' { "TemplateVersion": { "id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19", "template_id": "1a84ce78-06a6-41ad-99e4-8ea5d9b91e89", "organization_id": "35f75f20-890e-4095-95f1-bb8f2ba02e79", "created_at": "2025-10-10T10:34:02.254357+11:00", "updated_at": "2025-10-10T10:34:46.594032+11:00", "name": "infallible_feistel2", "message": "Uploaded from the CLI", "job": { "id": "8afd05ca-b4be-48d5-a6b9-82dcfd12c960", "created_at": "2025-10-10T10:34:02.251234+11:00", "started_at": "2025-10-10T10:34:02.257301+11:00", "completed_at": "2025-10-10T10:34:46.594032+11:00", "status": "succeeded", "worker_id": "a0940ade-ecdd-47c2-98c6-f2a4e5eb0733", "file_id": "05fd653c-3a3f-4e5c-856b-29407732e1b1", "tags": { "owner": "", "scope": "organization" }, "queue_position": 0, "queue_size": 0, "organization_id": "35f75f20-890e-4095-95f1-bb8f2ba02e79", "initiator_id": "d20c05ff-ecf3-4521-a99d-516c8befbaa6", "input": { "template_version_id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19" }, "type": 
"template_version_import", "metadata": { "template_version_name": "", "template_id": "00000000-0000-0000-0000-000000000000", "template_name": "", "template_display_name": "", "template_icon": "" }, "logs_overflowed": false }, "readme": "---\ndxxxxx, "created_by": { "id": "d20c05ff-ecf3-4521-a99d-516c8befbaa6", "username": "rowansmith", "name": "rowan smith" }, "archived": false, "has_external_agent": false }, "active": true } ```	2026-02-19 09:31:12 +11:00
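For scripts that cannot rely on `jq`, the same selection can be sketched in Python. This is a minimal illustration, not part of the PR: the function name `active_version_ids` and the trimmed `SAMPLE` payload are hypothetical, and it assumes each array element has the top-level `active` key and nested `TemplateVersion.id` shown in the commit message above. Whether inactive rows carry `"active": false` or omit the key is not shown in the source, so the filter accepts either.

```python
import json


def active_version_ids(raw_json: str) -> list[str]:
    """Return the IDs of template versions whose top-level `active` is true.

    `raw_json` is assumed to be the text printed by
    `coder templates version list <template> -ojson`.
    """
    rows = json.loads(raw_json)
    return [
        row["TemplateVersion"]["id"]
        for row in rows
        # `.get` tolerates rows where the key is absent (an assumption;
        # the commit message only shows an active row).
        if row.get("active") is True
    ]


# Trimmed, hypothetical sample mirroring the shape in the commit message.
SAMPLE = json.dumps([
    {
        "TemplateVersion": {
            "id": "38f66eae-ec63-49b7-a9d2-cdb79c379d19",
            "name": "infallible_feistel2",
        },
        "active": True,
    },
    {
        "TemplateVersion": {
            "id": "aa797ea5-4221-461b-80b0-90c5164f8dc0",
            "name": "mystifying_almeida1",
        },
        "active": False,
    },
])

if __name__ == "__main__":
    print(active_version_ids(SAMPLE))
```

In practice the input would come from the CLI's standard output rather than a literal string.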
Rowan Smith	1c4dd78b05	chore: add id to template version output columns (#22163) At present it is not possible to obtain the `id` of the template version in the table output: ``` ➜ ~ coder templates version list -h coder v2.30.1+16408b1 USAGE: coder templates versions list [flags] <template> List all the versions of the specified template OPTIONS: -O, --org string, $CODER_ORGANIZATION Select which organization (uuid or name) to use. -c, --column [name|created at|created by|status|active|archived] (default: name,created at,created by,status,active) Columns to display in table output. ➜ ~ coder templates version list aws-linux-dynamic NAME CREATED AT CREATED BY STATUS ACTIVE infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded ``` Adding this because it is useful when wanting to programmatically retrieve the details of the latest template version, and `-ojson` does not include `active` details in its output. ``` ➜ Downloads ./coder-cli-templateversions-list-id templates version list -h coder v2.30.1-devel+bab99db9e7 USAGE: coder templates versions list [flags] <template> List all the versions of the specified template OPTIONS: -O, --org string, $CODER_ORGANIZATION Select which organization (uuid or name) to use. -c, --column [id|name|created at|created by|status|active|archived] (default: name,created at,created by,status,active) Columns to display in table output. --include-archived bool Include archived versions in the result list. -o, --output table|json (default: table) Output format. ——— Run `coder --help` for a list of global options. ➜ Downloads ./coder-cli-templateversions-list-id templates version list aws-linux-dynamic -c id,name,'created at','created by',status,active ID NAME CREATED AT CREATED BY STATUS ACTIVE 38f66eae-ec63-49b7-a9d2-cdb79c379d19 infallible_feistel2 2025-10-10T10:34:02+11:00 rowansmith Succeeded Active aa797ea5-4221-461b-80b0-90c5164f8dc0 mystifying_almeida1 2025-10-10T10:32:38+11:00 rowansmith Succeeded ```	2026-02-18 16:47:45 +11:00
Danielle Maywood	d737f8c104	feat(cli): add `coder task resume` command (#22066) Complements https://github.com/coder/coder/pull/22012 by adding a `coder task resume` command.	2026-02-17 16:24:13 +00:00
Cian Johnston	4a3304fc38	feat(cli)!: expire tokens by default (#21783) ## Summary > NOTE: Calling this out as a breaking change in case existing consumers of the CLI depend on being able to see expired tokens OR being able to delete tokens immediately. Updates the `coder tokens rm` command to immediately expire a token by ID, preserving the token record for audit trail purposes. Tokens can still be deleted by passing `--delete`. ## Problem During an incident on dev.coder.com, operators needed to urgently expire an API key that was stuck in a hot loop. The only way to do this was via direct database access: ```sql UPDATE api_keys SET expires_at = NOW() WHERE id = '...'; ``` This is not ideal for operators who may not have direct DB access or want to avoid manual SQL. ## Solution This PR adds: - API endpoint: `PUT /api/v2/users/{user}/keys/{keyid}/expire` - Sets the token's `expires_at` to now - SDK method: `ExpireAPIKey(ctx, userID, keyID)` - Updates CLI: `coder tokens rm <name|id|token>` now _expires_ by default. You can still delete by passing the `--delete` flag. The `coder tokens list` command now also hides expired tokens by default. Pass `--include-expired` to include them. - Audit logging: The expire action is logged with old and new key states ## Test plan - Tests cover: owner expiring own token, admin expiring other user's token, non-admin cannot expire other's token, 404 for non-existent token Closes #21782 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-17 13:16:46 +00:00
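As a rough illustration of the endpoint named in the entry above, the following Python sketch builds (but does not send) the `PUT` request. Only the path `/api/v2/users/{user}/keys/{keyid}/expire` comes from the commit message. The helper name `build_expire_request`, the example deployment URL, and the use of a `Coder-Session-Token` header for authentication are assumptions made for this example and should be checked against the deployment's API documentation.

```python
import urllib.request
from urllib.parse import quote


def build_expire_request(
    base_url: str, user: str, key_id: str, session_token: str
) -> urllib.request.Request:
    """Build a PUT request for the expire endpoint described in the commit.

    The request is only constructed here; pass it to
    `urllib.request.urlopen` to actually send it.
    """
    # Path template taken from the commit message; segments are
    # percent-encoded so unusual characters cannot alter the path.
    path = (
        f"/api/v2/users/{quote(user, safe='')}"
        f"/keys/{quote(key_id, safe='')}/expire"
    )
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        method="PUT",
        # Assumed authentication header; not stated in the commit message.
        headers={"Coder-Session-Token": session_token},
    )


if __name__ == "__main__":
    # Hypothetical deployment URL, user, key ID, and token.
    req = build_expire_request(
        "https://coder.example.com/", "alice", "abc123", "secret"
    )
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the sketch testable without network access.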
Ethan	4b3889e4f9	fix(cli): allow site admins to use `coder create --org` for any organization (#21528) ## Problem Site-wide admins (e.g., Owners) could not use `coder create --org <org>` to create workspaces in organizations they are not members of. The error was: ``` $ coder create my-workspace -t docker --org data-science error: organization "data-science" not found, are you sure you are a member of this organization? ``` This was inconsistent with the web UI, where Owners can create workspaces in any organization. ## Root Cause The CLI's `OrganizationContext.Selected()` function only checked the user's membership list, ignoring site-wide RBAC permissions that grant Owners access to all organizations. ## Solution Added a fallback in `OrganizationContext.Selected()` that fetches the org directly via the API when not found in the membership list. This works because the API endpoint applies RBAC filtering, allowing Owners to read any org. ## Impact This fixes `coder create --org` and all other CLI commands that use `OrganizationContext.Selected()` (29+ commands), including: - `coder templates push --org <any-org>` - `coder organizations members add --org <any-org>` - `coder provisioner list --org <any-org>` ## Testing Added `TestEnterpriseCreate/OwnerCanCreateInNonMemberOrg` which: - Creates an Owner user who is NOT a member of a second org - Verifies they can create a workspace there using `--org` - Properly fails without the code fix, passes with it --- This PR was generated by [mux](https://mux.coder.com) but reviewed by a human.	2026-02-16 12:16:08 +11:00
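The lookup order described in the entry above (membership list first, then a direct fetch that relies on server-side RBAC) can be sketched as follows. This is a language-neutral illustration in Python, not the actual Go code in `OrganizationContext.Selected()`. The names `select_org`, `OrgNotFound`, and the `fetch_org` callback are hypothetical, and `fetch_org` is assumed to return `None` when the API denies access or the organization does not exist.

```python
from typing import Callable, Optional


class OrgNotFound(Exception):
    """Raised when the org is in neither the memberships nor the API."""


def select_org(
    name_or_id: str,
    memberships: list[dict],
    fetch_org: Callable[[str], Optional[dict]],
) -> dict:
    """Resolve an organization, falling back to a direct API fetch."""
    # 1. Fast path: the caller is a member of the organization.
    for org in memberships:
        if name_or_id in (org["name"], org["id"]):
            return org
    # 2. Fallback: ask the API directly. The server applies RBAC, so a
    #    site-wide Owner can read organizations they do not belong to.
    org = fetch_org(name_or_id)
    if org is not None:
        return org
    raise OrgNotFound(f'organization "{name_or_id}" not found')


if __name__ == "__main__":
    member_of = [{"id": "org-1", "name": "default"}]
    # Stand-in for the API: only "data-science" is readable via RBAC.
    readable = {"data-science": {"id": "org-2", "name": "data-science"}}
    print(select_org("data-science", member_of, readable.get))
```

Keeping the membership check first preserves the existing behavior for ordinary members and adds the extra API round trip only when that check misses.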
Danielle Maywood	6d41d98b65	feat(cli): add `coder task pause` command (#22012) Adds a new `coder task pause` command.	2026-02-13 14:21:31 +00:00
Callum Styan	5f3be6b288	feat: add provisioner job queue wait time histogram and jobs enqueued counter (#21869) This PR adds some metrics to help identify job enqueue rates and latencies. This work was initiated as a way to help reduce the cost of the observation/measurement itself for autostart scaletests, which impacts our ability to identify/reason about the load caused by autostart. See: https://github.com/coder/internal/issues/1209 I've extended the metrics here to account for regular user-initiated builds, prebuilds, autostarts, etc. IMO there is still the question of whether we want or need the `transition` label, which is only present on workspace builds. Including it does lead to an increase in cardinality, and in the case of the histogram (when not using native histograms) that's at least a few extra series for every bucket. We could remove the transition label there but keep it on the counter. Additionally, the histogram is currently observing latencies for other jobs, such as template builds/version imports, which do not have a transition type associated with them. Tested briefly in a workspace; metric values look like the following: - `coderd_workspace_builds_enqueued_total{build_reason="autostart",provisioner_type="terraform",status="success",transition="start"} 1` - `coderd_provisioner_job_queue_wait_seconds_bucket{build_reason="autostart",job_type="workspace_build",provisioner_type="terraform",transition="start",le="0.025"} 1` --------- Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-12 13:40:47 -08:00

1793 Commits