coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 04:58:23 +00:00

Author	SHA1	Message	Date
Asher	47daca6eea	feat: add filtering to org members (#23334 ) Continuation of https://github.com/coder/coder/pull/23067 Add filtering to the paginated org member endpoint (pretty much the same as what I did in the previous PR with group members, except there I also had to add pagination since it was missing).	2026-03-21 16:58:45 -08:00
Susana Ferreira	139594a4f4	feat: block CONNECT tunnels to private/reserved IP ranges (#23109 ) ## Description Blocks `CONNECT` tunnels to private and reserved IP ranges in aibridgeproxyd, preventing the proxy from being used to reach internal networks. The Coder access URL is always exempt (hostname+port match) so the proxy can reach its own deployment. It is possible to exempt additional ranges via `CODER_AIBRIDGE_PROXY_ALLOWED_PRIVATE_CIDRS`. DNS rebinding is handled differently per path: * Direct (no upstream proxy): validate the resolved IP right before the TCP dial, no window between check and connect. * Upstream proxy: Resolves and checks before forwarding to the upstream dialer. A small rebinding window exists since the upstream proxy re-resolves independently. ## Changes * Add blocked IP denylist covering private, reserved, and special-purpose ranges * Add `AllowedPrivateCIDRs` option with CLI flag and env var * Wire IP checks into `proxy.ConnectDial` for both upstream and direct paths * Add tests for blocked/allowed cases across direct dial, upstream proxy, CIDR exemptions, and CoderAccessURL exemption Notes: documentation will be handled in a follow-up PR. Closes: https://github.com/coder/security/issues/124	2026-03-20 09:49:26 +00:00
Cian Johnston	06c50d13ad	fix(cli): exorcise the DERP healthcheck demon from TestSupportBundle (#23337 ) - Replace real healthcheck with mock `HealthcheckFunc` that returns a canned report instantly - Remove healthcheck cache-seeding goroutine/channel workaround - Remove `HealthcheckTimeout: testutil.WaitSuperLong` (no longer needed) - Reduce `setupCtx` from `WaitSuperLong` (60s) to `WaitLong` (25s) The DERP healthcheck performs real network operations (portmapper gateway probing, STUN) that hang for 60s+ on macOS CI runners. Since `TestSupportBundle` validates bundle generation, not healthcheck correctness, a canned report eliminates this entire class of flake. Fixes coder/internal#272 > 🤖 This PR was created with the help of Coder Agents, and was reviewed by my human. 🧑‍💻	2026-03-20 09:46:13 +00:00
Michael Suchacz	6d214644f6	fix: make TestInterruptAutoPromotionIgnoresLaterUsageLimitIncrease deterministic (#23279 ) Eliminates the timing flake in `TestInterruptAutoPromotionIgnoresLaterUsageLimitIncrease` by making the chatd worker loop clock-controllable. ## Changes `coderd/chatd/chatd.go` - Replace `time.NewTicker` calls in `Server.start()` with `p.clock.NewTicker` using named quartz tags `("chatd", "acquire")` and `("chatd", "stale-recovery")`. `coderd/chatd/chatd_test.go` - Inject `quartz.NewMock(t)` into the test via `newActiveTestServer` config override. - Trap the acquire ticker so the test controls exactly when pending chats are reacquired. - Rewrite the test flow as explicit clock-advance steps instead of wall-clock polling. `AGENTS.md` - Document the PR title scope rule (scope must be a real path containing all changed files). ## Validation - `go test ./coderd/chatd -run TestInterruptAutoPromotionIgnoresLaterUsageLimitIncrease -count=100` ✅ - `go test ./coderd/chatd` ✅ - `make lint` ✅	2026-03-19 15:14:00 +00:00
Cian Johnston	be1c06dec9	feat: add endpoint and CLI for users to view their own OIDC claims (#23053) - Adds a new API endpoint `GET /api/v2/users/oidc-claims` that returns only the merged claims (not the separate id_token/userinfo breakdown). Scoped exclusively to the authenticated user's own identity: there is no user parameter, so users cannot view each other's claims. - Adds a new CLI command, `coder users oidc-claims`, that hits the above endpoint. - The existing owner-only debug endpoint is preserved unchanged for admins who need the full claim breakdown. > 🤖 This PR was created with the help of Coder Agents, and will be reviewed by my human. 🧑‍💻	2026-03-18 22:10:04 +00:00
Jon Ayers	eba7d943a0	fix: run stop build before starting a workspace with a failed start (#22925 )	2026-03-18 14:58:20 -05:00
Kacper Sawicki	1e07ec49a6	feat: add merge_strategy support for coder_env resources (#23107 ) ## Description Implements the server-side merge logic for the `merge_strategy` attribute added to `coder_env` in [terraform-provider-coder v2.15.0](https://github.com/coder/terraform-provider-coder/pull/489). This allows template authors to control how duplicate environment variable names are combined across multiple `coder_env` resources. Relates to https://github.com/coder/coder/issues/21885 ## Supported strategies \| Strategy \| Behavior \| \|----------\|----------\| \| `replace` (default) \| Last value wins — backward compatible \| \| `append` \| Joins values with `:` separator (e.g. PATH additions) \| \| `prepend` \| Prepends value with `:` separator \| \| `error` \| Fails the build if the variable is already defined \| ## Example ```hcl resource "coder_env" "path_tools" { agent_id = coder_agent.dev.id name = "PATH" value = "/home/coder/tools/bin" merge_strategy = "append" } ``` ## Changes - Proto: Added `merge_strategy` field to `Env` message in `provisioner.proto` - State reader: Updated `agentEnvAttributes` struct and proto construction in `resources.go` - Merge logic: Added `mergeExtraEnvs()` function in `provisionerdserver.go` with strategy-aware merging for both agent envs and devcontainer subagent envs - Tests: 15 unit tests covering all strategies, edge cases (empty values, mixed strategies, multiple appends) - Dependency: Bumped `terraform-provider-coder` v2.14.0 → v2.15.0 - Fixtures: Updated `duplicate-env-keys` test fixtures and golden files ## Ordering When multiple resources `append` or `prepend` to the same key, they are processed in alphabetical order by Terraform resource address (per the determinism fix in #22706).	2026-03-18 15:43:28 +01:00
Ethan	fc3508dc60	feat: configure acquire chat batch size (#23196 ) ## Summary - add a hidden deployment config option for chat acquire batch size (`CODER_CHAT_ACQUIRE_BATCH_SIZE` / `chat.acquireBatchSize`) - thread the configured value into chatd startup while preserving the existing default of `10` - clamp the deployment value to the `int32` range before passing it into chatd - regenerate the API/docs/types/testdata artifacts for the new config field ## Why `chatd` currently acquires pending chats in batches of `10` via a compile-time default. This change makes that batch size operator-configurable from deployment config, so we can tune acquisition behavior without another code change.	2026-03-19 00:54:32 +11:00
Cian Johnston	fe82d0aeb9	fix: allow member users to generate support bundles (#23040 ) Fixes AIGOV-141 The `coder support bundle` command previously required admin permissions (`Read DeploymentConfig`) and would abort entirely for non-admin `member` users with: ``` failed authorization check: cannot Read DeploymentValues ``` This change makes the command degrade gracefully instead of failing outright. <details> <summary> Changes </summary> ### `support/support.go` - `Run()`: The authorization check for `Read DeploymentValues` is now a soft warning instead of a hard gate. Unauthenticated users (401) still fail, but authenticated users with insufficient permissions proceed with reduced data. - `DeploymentInfo()`: `DeploymentConfig` and `DebugHealth` fetches now handle 403/401 responses gracefully, matching the existing pattern used by `DeploymentStats`, `Entitlements`, and `HealthSettings`. - `NetworkInfo()`: Coordinator debug and tailnet debug fetches now check response status codes for 403/401 before reading the body. ### `cli/support.go` - `summarizeBundle()`: No longer returns early when `Config` or `HealthReport` is nil. Instead prints warnings and continues summarizing available data (e.g., netcheck). ### Tests - `MissingPrivilege` → `MemberNoWorkspace`: Asserts member users can generate a bundle successfully with degraded admin-only data. - `NoPrivilege` → `MemberCanGenerateBundle`: Asserts the CLI produces a valid zip bundle for member users. - All existing tests continue to pass (`NoAuth`, `OK`, `OK_NoWorkspace`, `DontPanic`, etc.). ## Behavior matrix \| User type \| Before \| After \| \|---\|---\|---\| \| Admin \| Full bundle \| Full bundle (no change) \| \| Member \| Hard error \| Bundle with degraded admin-only data \| \| Unauthenticated \| Hard error \| Hard error (no change) \| Related to PRODUCT-182	2026-03-18 13:43:10 +00:00
Atif Ali	bd5b62c976	feat: expose MCP tool annotations for tool grouping (#23195) ## Summary - add shared MCP annotation metadata to toolsdk tools - emit MCP tool annotations from both coderd and CLI MCP servers - cover annotation serialization in toolsdk, coderd MCP e2e, and CLI MCP tests ## Why - Coder already exposed MCP tools, but it did not populate MCP tool annotation hints (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`). - Hosts such as Claude Desktop use those hints to classify and group tools, so without them Coder tools can get lumped together. - This change adds a shared annotation source in `toolsdk` and has both MCP servers emit those hints through `mcp.Tool.Annotations`, avoiding drift between local and remote MCP implementations. ## Testing - Tested locally on Claude Desktop; the tools are categorized correctly. Note: Done using Coder Agents, reviewed and tested by human locally	2026-03-18 10:21:45 +00:00
Asher	903cfb183f	feat: add --service-account to cli user creation (#23186 )	2026-03-17 14:07:20 -08:00
George K	91ec0f1484	feat: add service_accounts workspace sharing mode (#23093 ) Introduce a three-way workspace sharing setting (none, everyone, service_accounts) replacing the boolean workspace_sharing_disabled. In service_accounts mode, only service account-owned workspaces can be shared while regular members' share permissions are removed. Adds a new organization-service-account system role with per-org permissions reconciled alongside the existing organization-member system role. Related to: https://linear.app/codercom/issue/PLAT-28/feat-service-accounts-sharing-mode-and-rbac-role --------- Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com> Co-authored-by: Kayla はな <mckayla@hey.com>	2026-03-17 12:16:43 -07:00
Zach	3f76f312e4	feat(cli): add --no-wait flag to coder create (#22867 ) Adds a `--no-wait` flag (CODER_CREATE_NO_WAIT) to the create command, matching the existing pattern in `coder start`. When set, the `coder create` command returns immediately after the workspace creation API call succeeds instead of streaming build logs until completion. This enables fire-and-forget workspace creation in CI/automation contexts (e.g., GitHub Actions), where waiting for the build to finish is unnecessary. Combined with other existing flags, users can create a workspace with no interactivity, assuming the user is already authenticated.	2026-03-16 11:54:30 -06:00
Callum Styan	36665e17b2	feat: add WatchAllWorkspaceBuilds endpoint for autostart scaletests (#22057 ) This PR adds a `WatchAllWorkspaces` function with `watch-all-workspaces` endpoint, which can be used to listen on a single global pubsub channel for _all_ workspace build updates, and makes use of it in the autostart scaletest. This negates the need to use a workspace watch pubsub channel _per_ workspace, which has auth overhead associated with each call. This is especially relevant in situations such as the autostart scaletest, where we need to start/stop a set of workspaces before we can configure their autostart config. The overhead associated with all the watch requests skews the scaletest results and makes it harder to reason about the performance of the autostart feature itself. The autostart scaletest also no longer generates its own metrics nor does it wait for all the workspaces to actually start via autostart. We should update the scaletest dashboard after both PRs are merged to measure autostart performance via the new metrics. The new function/endpoint and its usage in the autostart scaletest are gated behind an experiment feature flag, this is something we should discuss whether we want to enable the endpoint in prod by default or not. If so, we can remove the experiment. --------- Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Callum Styan <callum@coder.com>	2026-03-13 20:37:41 -07:00
Danny Kopping	870583224d	chore: deprecate injected MCP approach in AI Bridge (#23031 ) _Disclaimer: implemented by a Coder Agent using Claude Opus 4.6._ Marks the injected MCP approach in AI Bridge as deprecated across the codebase. ## Changes - `codersdk/deployment.go`: Deprecated `ExternalAuthConfig.MCPURL`, `.MCPToolAllowRegex`, `.MCPToolDenyRegex` fields; deprecated and hid the `--aibridge-inject-coder-mcp-tools` server flag; deprecated `AIBridgeConfig.InjectCoderMCPTools`. - `coderd/externalauth/externalauth.go`: Deprecated `Config.MCPURL`, `.MCPToolAllowRegex`, `.MCPToolDenyRegex`. - `enterprise/aibridgedserver/aibridgedserver.go`: Added runtime deprecation warning when `CODER_AIBRIDGE_INJECT_CODER_MCP_TOOLS` is enabled; deprecated `getCoderMCPServerConfig`. - `enterprise/aibridged/mcp.go`: Deprecated `MCPProxyBuilder` interface and `MCPProxyFactory` struct. - `docs/ai-coder/ai-bridge/mcp.md`: Added deprecation warning banner.	2026-03-13 16:15:33 +02:00
Zach	2488cf0d41	fix(agent): don't overwrite existing vscode git auth settings (#22871) OverrideVSCodeConfigs previously unconditionally set `git.useIntegratedAskPass` and `github.gitAuthentication` to false, clobbering any values provided by template authors via module settings (e.g. the vscode-web module's settings block). This change only sets these keys when they are not already present, so template-provided values are preserved. Registry PR [#758](https://github.com/coder/registry/pull/758) fixed the module side (run.sh merges template-author settings into the existing settings.json instead of overwriting the file). But the agent still unconditionally stamped false onto both keys before the script ran, so the merge base always contained the agent's values and template authors couldn't set them to anything else. This change fixes the agent side by only writing defaults when the keys are absent.	2026-03-12 13:39:24 -06:00
Mathias Fredriksson	57af7abf1f	test: add testutil.WaitBuffer and replace time.Sleep in tests (#22922 ) WaitBuffer is a thread-safe io.Writer that supports blocking until accumulated output matches a substring or custom predicate. It replaces ad-hoc safeBuffer/syncWriter types and time.Sleep-based poll loops in tests with signal-driven waits. - WaitFor/WaitForNth/WaitForCond for blocking on output - Replace custom buffer types in cli/sync_test.go and provisionersdk/agent_test.go - Convert time.Sleep poll loops to require.Eventually/require.Never in cli/ssh_test.go, coderd/activitybump_test.go, coderd/workspaceagentsrpc_test.go, workspaceproxy_test.go, and scaletest tests	2026-03-12 18:07:52 +02:00
Zach	5cb820387c	fix: use quartz clock in task status test (#22969 ) Replace time.Since() usage with a quartz.Clock injected via RootCmd to ensure relative time strings ("Xs ago") are deterministic.	2026-03-12 08:33:09 -06:00
George K	e5c19d0af4	feat: backend support for creating and storing service accounts (#22698 ) Add is_service_account column to users table with CHECK constraints enforcing login_type='none' and empty email for service accounts. Update user creation API to validate service account constraints. Related to: https://linear.app/codercom/issue/PLAT-27/feat-backend-support-for-creating-and-storing-service-accounts	2026-03-11 10:19:08 -07:00
Kyle Carberry	71b132b9e7	fix(cli/sessionstore): don't run Windows keyring tests in parallel (#22937 ) Removes `t.Parallel()` from `TestKeyring` and `TestWindowsKeyring_WriteReadDelete`. The OS keyring is a shared system resource that's flaky under concurrent access, especially Windows Credential Manager in CI. Fixes coder/internal#1370	2026-03-11 15:19:56 +02:00
Cian Johnston	bc27274aba	feat(coderd): refactors github pr sync functionality (#22715 ) - Adds `_API_BASE_URL` to `CODER_EXTERNAL_AUTH_CONFIG_` - Extracts and refactors existing GitHub PR sync logic to new packages `coderd/gitsync` and `coderd/externalauth/gitprovider` - Associated wiring and tests Created using Opus 4.6	2026-03-10 18:46:01 +00:00
Mathias Fredriksson	33136dfe39	fix: use signal-based sync instead of time.Sleep in sync test (#22918 ) The `start_with_dependencies` golden test was flaky on Windows CI. It used `time.Sleep(100ms)` in a goroutine hoping the `sync start` command would have time to call `SyncReady`, find the dependency unsatisfied, and print the "Waiting..." message before the goroutine completed the dependency. On slower Windows runners, the sleep could finish and complete the dependency before the command's first `SyncReady` call, so `ready` was already `true` and the "Waiting..." message was never printed, causing the golden file mismatch. This replaces the `time.Sleep` with a `syncWriter` that wraps `bytes.Buffer` with a mutex and a channel. The channel closes when the written output contains the expected signal string ("Waiting"). The goroutine blocks on this channel instead of sleeping, so it only completes the dependency after the command has confirmed it is in the waiting state. Fixes https://github.com/coder/internal/issues/1376	2026-03-10 17:21:08 +00:00
Danny Kopping	2948400aef	fix(cli): skip CODER_SESSION_TOKEN check when --use-token-as-session is set (#22888 ) _Disclaimer: implemented with Opus 4.6 and Coder Agents._ Follow-up to #22879. ## Problem The `CODER_SESSION_TOKEN` guard added in #22879 blocks `coder login` unconditionally when the env var is set. This conflicts with `--use-token-as-session`, which intentionally uses the provided token (including from the env var) directly as the session token. ## Fix Add `&& !useTokenForSession` to the check so that `coder login --use-token-as-session` still works when `CODER_SESSION_TOKEN` is set. ## Testing Added `TestLogin/SessionTokenEnvVarWithUseTokenAsSession` — sets the env var with a valid token and passes `--use-token-as-session`, verifying login succeeds. --------- Signed-off-by: Danny Kopping <danny@coder.com>	2026-03-10 15:40:54 +02:00
Mathias Fredriksson	73bf8478d8	fix(cli): fix flaky TestGitSSH/Local_SSH_Keys on Windows CI (#22883 ) The `TestGitSSH/Local_SSH_Keys` test was flaking on Windows CI with a context deadline exceeded error when calling `client.GitSSHKey(ctx)`. Two issues contributed to the flake: 1. `prepareTestGitSSH` called `coderdtest.AwaitWorkspaceAgents` without passing the caller's context. This created a separate internal 25s timeout, wasting time budget independently of the setup context. Changed to use `NewWorkspaceAgentWaiter(...).WithContext(ctx).Wait()` so the agent wait shares the caller's timeout. 2. The `Local SSH Keys` subtest used `WaitLong` (25s) for its setup context, but this subtest does more work than `Dial` (runs the command twice). Bumped to `WaitSuperLong` (60s) to give slow Windows CI runners enough time. Fixes coder/internal#770	2026-03-10 12:12:15 +02:00
Mathias Fredriksson	41c505f03b	fix(cli): handle ignored errors in ssh and scaletest commands (#22852 ) Handle errors that were previously assigned to blank identifiers in the `cli/` package. - ssh.go: Log ExistsViaCoderConnect DNS lookup error at debug level instead of silently discarding it. Fallthrough behavior preserved. - exp_scaletest_llmmock.go: Log srv.Stop() error via the existing logger instead of discarding it.	2026-03-10 12:08:40 +02:00
Danny Kopping	d936a99e6b	fix(cli): error when CODER_SESSION_TOKEN env var is set during login (#22879 ) _Disclaimer: created with Opus 4.6 and Coder Agents._ ## Problem When `CODER_SESSION_TOKEN` is set as an environment variable with an invalid value, `coder login` fails with a confusing error: ``` error: Trace=[create api key: ] You are signed out or your session has expired. Please sign in again to continue. Suggestion: Try logging in using 'coder login'. ``` The suggestion to run `coder login` is what the user just did, making it circular and unhelpful. ## Root cause The `--token` flag is mapped to `CODER_SESSION_TOKEN` via serpent. When the env var is set, `coder login` picks it up as the session token and tries to use it to create a new API key, which fails because the token is invalid. Even if login were to succeed and write a new token to disk, subsequent commands would still use the env var (which takes precedence over the on-disk token), so the user would remain stuck. ## Fix Before attempting login, check if `CODER_SESSION_TOKEN` is set in the environment. If so, return a clear error telling the user to unset it: ``` the environment variable CODER_SESSION_TOKEN is set, which takes precedence over the session token stored on disk. Please unset it and try again. unset CODER_SESSION_TOKEN ``` ## Testing Added `TestLogin/SessionTokenEnvVar` that verifies the error is returned when the env var is set.	2026-03-10 09:41:05 +00:00
Zach	14341edfc2	fix(cli): fix `coder login token` failing without --url flag (#22742 ) Previously `coder login token` didn't load the server URL from config, so it always required --url or CODER_URL when using the keyring to store the session token. This command would only print out the token when already logged in to a deployment and file storage is used to store the session token (keyring is the default on Windows/macOS). It would also print out an incorrect token when --url was specified and the session token stored on disk was for a different deployment that the user logged into. This change fixes all of these issues, and also errors out when using session token file storage with a `--url` argument that doesn't match the stored config URL, since the file only stores one token and would silently return the wrong one. See https://github.com/coder/coder/issues/22733 for a table of the before/after behaviors.	2026-03-10 08:57:27 +01:00
Danielle Maywood	4cf8d4414e	feat: make `coder task send` resume paused tasks (#22203 )	2026-03-07 01:36:03 +00:00
Cian Johnston	0b1e4880bd	chore(cli): fix TestTokens harder (#22684) `time.Now()` has finer-than-microsecond precision, while timestamps we store in Postgres have only microsecond precision. Flake potential is non-zero.	2026-03-06 00:04:09 +00:00
George K	5dd570f099	fix(cli/cliui): apply defaults when rendering select prompts (#22093 ) The `--parameter-default` value is now used to pre-select the default option for a coder parameter with option blocks when prompting interactively in CLI. Related to: https://github.com/coder/coder/issues/22078	2026-03-05 09:35:57 -08:00
Ethan	5a5828b090	fix(cli): add trailing dot to Coder Connect hostname to prevent DNS search domain expansion (#22607 ) ## Problem When `coder ssh --stdio` checks for Coder Connect availability, it constructs a hostname like `agent.workspace.owner.coder` and performs a DNS AAAA lookup via `ExistsViaCoderConnect`. Without a trailing dot, this hostname is not a fully-qualified domain name (FQDN), so the system DNS resolver appends each configured search domain before querying. Go's pure-Go DNS resolver (used when `CGO_ENABLED=0`, which is the default for CLI builds) does not stop after getting NXDOMAIN on the first name. It tries all names in the search list sequentially: 1. `agent.workspace.owner.coder.` → NXDOMAIN (fast) 2. `agent.workspace.owner.coder.corp.example.com.` → timeout 3. `agent.workspace.owner.coder.internal.company.com.` → timeout On corporate networks where the search-domain-expanded queries hit DNS infrastructure that drops rather than responds (common for nonsensical hostnames with deep subdomain chains), each expanded query hits the full DNS timeout (default 5s × 2 attempts = 10s per name). With 2-3 search domains, this compounds to 20-30+ seconds of blocking. ## Fix Adding a trailing dot marks the hostname as an FQDN. Go's `nameList()` in `src/net/dnsclient_unix.go` returns a single-entry list for rooted names, completely bypassing search domain expansion. This is consistent with how `IsCoderConnectRunning` already handles its DNS check — `tailnet.IsCoderConnectEnabledFmtString` includes a trailing dot for exactly this reason. 
## Verification Tested with a fake DNS server that responds with NXDOMAIN for `.coder` queries but drops search-domain-expanded queries: \| Hostname \| Time \| Queries sent \| \|---\|---\|---\| \| `main.workstation.kevin.coder` (no trailing dot) \| ~15s \| 4 (as-is + 3 search domains) \| \| `main.workstation.kevin.coder.` (trailing dot) \| <1ms \| 1 (FQDN only) \| Closes https://github.com/coder/coder/issues/22581 _Generated by [mux](https://github.com/coder/mux) but reviewed by a human_	2026-03-06 01:56:54 +11:00
Cian Johnston	89cee2dd81	chore(cli): fix flaky temporal assertion in TestTokens (#22654 ) Fixes https://github.com/coder/internal/issues/1379	2026-03-05 10:18:51 +00:00
Susana Ferreira	21c91cebaa	feat: add TLS listener support to aibridgeproxyd (#22411 ) ## Description Adds optional TLS support for the AI Bridge Proxy listener. When TLS cert and key files are provided, the proxy serves over HTTPS instead of plain HTTP. ## Changes * New configuration options to enable TLS on the proxy listener * Wraps the TCP listener in `tls.NewListener` when configured * Tests for validation errors, invalid files, and full integration (tunneled + MITM) through a TLS listener Note: Documentation for TLS listener setup and client configuration will be handled in a follow-up PR. Related to: https://github.com/coder/internal/issues/1335	2026-03-05 09:19:34 +00:00
Susana Ferreira	c79e8f2707	refactor: clarify MITM certificate naming in aibridgeproxyd (#22408 ) ## Description Renames internal fields, variables, and comments related to the proxy's certificate/key configuration to explicitly reference their MITM CA purpose. The AI Bridge Proxy uses a CA certificate to sign dynamically generated leaf certificates during MITM interception of HTTPS traffic from AI clients. With the upcoming introduction of TLS listener certificates (for serving the proxy itself over HTTPS, implemented upstack https://github.com/coder/coder/pull/22411), the previous generic naming would become ambiguous. This refactor makes it clear which certificate is which. No user-facing flags, environment variables, YAML keys, or JSON fields were changed, this is purely an internal rename to avoid confusion going forward. Related to https://github.com/coder/internal/issues/1335	2026-03-05 09:06:38 +00:00
Kyle Carberry	a6b9a25f82	fix(cli): bypass access URL redirect for inter-replica chat relay (#22635 ) ## Summary Fixes cross-replica chat relay failing with: ``` failed to open initial relay for chat stream error= dial relay stream: - failed to WebSocket dial: expected handshake response status code 101 but got 200 failed to open relay for message parts error= dial relay stream: - failed to WebSocket dial: expected handshake response status code 101 but got 200 ``` Subscribers see accurate `status=running` (delivered via pubsub) but miss all in-progress `message_part` events (delivered only via the relay WebSocket that never connects). ## Root cause `redirectToAccessURL` in `cli/server.go` redirects any request whose `Host` header doesn't match the access URL. The enterprise chat relay dials another replica directly via its DERP relay address (e.g. `http://10.0.0.2:8080`), so the `Host` header is the pod IP — not the access URL. This triggers a 307 redirect to the access URL. The WebSocket library follows the redirect, but the second request is a plain GET — `Connection: Upgrade` and `Upgrade: websocket` headers are not carried over by HTTP redirect semantics. The load-balanced access URL routes the plain GET to any replica, which serves the SPA catch-all handler and returns HTTP 200 with `index.html`. The WebSocket library then fails: `expected handshake response status code 101 but got 200`. DERP mesh already has an exemption for this exact scenario (`isDERPPath`). Chat relay was added later and didn't get one. ## Fix Bypass `redirectToAccessURL` for requests that carry the `X-Coder-Relay-Source-Replica` header, which the enterprise relay already sets on every request (`enterprise/coderd/chatd/chatd.go:573`). 
## Sequence diagram Before (broken): ``` Replica A (subscriber) Replica B (worker) Load Balancer \| \| \| \|--- WS dial pod-ip:8080 ----->\| \| \| \|-- 307 redirect to LB --->\| \| \| \| \|<----------- plain GET (no Upgrade headers) ------------->\| \| \| \|-- routes to any replica \|<----------- 200 index.html -------------------------------\| \| \| X 'expected 101 but got 200' \| ``` After (fixed): ``` Replica A (subscriber) Replica B (worker) \| \| \|--- WS dial pod-ip:8080 ----->\| \| (X-Coder-Relay-Source- \| \| Replica header set) \| \| \|-- bypass redirect \|<--------- 101 Upgrade ------\| \|<==== message_part events ====\| ```	2026-03-04 20:26:03 -05:00
Kayla はな	e35717bc19	fix: show a notice when workspace sharing is disabled globally in organization settings (#22580 )	2026-03-04 11:14:52 -07:00
Spike Curtis	1a30ca1a2a	chore: use agentsocket for task status updates in MCP server (#22354 ) relates to #21335 Modifies our local MCP server used in Tasks to push task status updates over the agentsocket, rather than directly dialing Coderd. This will significantly reduce pressure on the database at scale because we can avoid expensive authentication of the agent API key. Disclosure: I used AI to generate a lot of this PR, but hand-reviewed and tweaked it.	2026-03-04 21:41:21 +04:00
Spike Curtis	56eb57caf4	chore: enable agent socket by default (#22352 ) relates to #21335 Enables the agent socket by default and updates docs to strike references to having to enable it. The PRs in this stack change the MCP server that Tasks use to update their status to rely on the agent socket, rather than directly dialing Coderd with the agent token. Default disable was a reasonable default when it was only used for the experimental script ordering features, but now that we want to use it for Tasks, it should be default on.	2026-03-03 21:23:59 +04:00
Cian Johnston	517cb0ce73	refactor(webpush): use RequireExperimentWithDevBypass middleware (#22525 ) Replace manual experiment checks in web-push handlers with the `RequireExperimentWithDevBypass` middleware on the route group, matching the pattern used by OAuth2, Agents, and MCP experiments. ## Changes - `coderd/coderd.go`: Add `RequireExperimentWithDevBypass` middleware to `/webpush` route group - `coderd/webpush.go`: Remove inline `api.Experiments.Enabled(codersdk.ExperimentWebPush)` checks from all three handlers - `cli/server.go`: Gate webpush dispatcher initialization with `buildinfo.IsDev()` fallback so dev builds always init the real dispatcher - `coderd/webpush_test.go`: Remove experiment enablement from tests (dev bypass handles it) Net effect: -26 lines removed, +5 added. Created using whatchamacallits (Opus 4.6 Max)	2026-03-03 09:49:04 +00:00
Mathias Fredriksson	b80dbd2d4e	test(cli): fix flaky TestProvisioners_Golden (#22491 ) Fixes coder/internal#449	2026-03-03 08:47:34 +00:00
Kyle Carberry	56f95a3e6d	fix: scope git askpass diff status updates to initiating chat (#22534 ) ## Problem When the git askpass flow triggered diff status refreshes, it updated every chat connected to the workspace. This was wasteful and could cause confusing status updates on unrelated chats. ## Solution Thread the chat ID through the entire git askpass flow so only the chat that initiated the git operation gets updated: 1. `coderd/chatd/chattool/execute.go` — Sets `CODER_CHAT_ID` env var on spawned processes (alongside the existing `CODER_CHAT_AGENT`) 2. `cli/gitaskpass.go` — Reads `CODER_CHAT_ID` from the environment and sends it as a `chat_id` query parameter in the `ExternalAuthRequest` 3. `codersdk/agentsdk/agentsdk.go` — Adds `ChatID` field to `ExternalAuthRequest` and encodes it as a query param 4. `coderd/workspaceagents.go` — Parses `chat_id` query param and passes it through to `storeChatGitRef` and `triggerWorkspaceChatDiffStatusRefresh` 5. `coderd/chats.go` — `storeChatGitRef` and `refreshWorkspaceChatDiffStatuses` now scope updates to just the initiating chat when a chat ID is provided, falling back to all-workspace-chats behavior for backwards compatibility (non-chat git operations)	2026-03-02 22:52:39 -05:00
Steven Masley	7bc454eed8	chore: version is 2.31 not 1.31 (#22494 )	2026-03-02 16:23:09 +00:00
Kyle Carberry	edee917d88	feat: add experimental agents support (#22290 ) feat: add AI chat system with agent tools and chat UI Introduce the chatd subsystem and Agents UI for AI-powered chat within Coder workspaces. - Add chatd package with chat loop, message compaction, prompt management, and LLM provider integration (OpenAI, Anthropic) - Add agent tools: create workspace, list/read templates, read/write/edit files, execute commands - Add chat API endpoints with streaming, message editing, and durable reconnection - Add database schema and migrations for chats, chat messages, chat providers, and chat model configs - Add RBAC policies and dbauthz enforcement for chat resources - Add Agents UI pages with conversation timeline, queued messages list, diff viewer, and model configuration panel - Add comprehensive test coverage including coderd integration tests, chatd unit tests, and Storybook stories - Gate feature behind experiments flag --------- Co-authored-by: Cian Johnston <cian@coder.com> Co-authored-by: Danielle Maywood <danielle@themaywoods.com> Co-authored-by: Jeremy Ruppel <jeremy@coder.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-27 16:50:56 +00:00
Steven Masley	21bc185254	doc: add language to mention disruptive nature of cookie host prefix (#22384 )	2026-02-27 15:59:01 +00:00
Kacper Sawicki	ab28ecde88	fix(cli): reuse multi-select parameter values on workspace update (#22261 ) Fixes three bugs that caused `coder update` to always re-prompt for multi-select (`list(string)`) parameters instead of reusing previous build values: 1. `isValidTemplateParameterOption` failed for multi-select values (`cli/parameterresolver.go`): It compared the entire JSON array string (e.g. `["vim","emacs"]`) against individual option values, which never matched. Now parses the JSON array and validates each element separately. 2. `RichParameter` ignored previous build value for multi-select (`cli/cliui/parameter.go`): The `list(string)` branch always used the template's default value instead of the `defaultValue` argument (which carries the previous build's value). Now uses `defaultValue` when available, falling back to the template default. 3. Pre-existing crash when `list(string)` has no default value (`cli/cliui/parameter.go`): `json.Unmarshal` on an empty string caused `unexpected end of JSON input`. Now skips unmarshaling when the default source is empty. Fixes #19956	2026-02-26 14:34:30 +01:00
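The first fix in ab28ecde88 (validate each element of a multi-select value, not the whole JSON array string) can be illustrated with a minimal sketch. The function name `isValidOption`, its signature, and the option list are hypothetical and are not taken from `cli/parameterresolver.go`; only the idea of parsing the JSON array and checking every element comes from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isValidOption is a hypothetical helper, not the function from the commit.
// For single-value parameters it compares the value against the allowed
// options directly. For multi-select (list(string)) parameters the value is
// a JSON array string, so it is parsed and each element is checked.
func isValidOption(value string, multiSelect bool, options []string) bool {
	allowed := make(map[string]struct{}, len(options))
	for _, o := range options {
		allowed[o] = struct{}{}
	}
	if !multiSelect {
		_, ok := allowed[value]
		return ok
	}
	var selected []string
	// An empty or malformed string fails to unmarshal and is treated as invalid.
	if err := json.Unmarshal([]byte(value), &selected); err != nil {
		return false
	}
	for _, s := range selected {
		if _, ok := allowed[s]; !ok {
			return false
		}
	}
	return true
}

func main() {
	opts := []string{"vim", "emacs", "nano"}
	fmt.Println(isValidOption(`["vim","emacs"]`, true, opts))
	fmt.Println(isValidOption(`["vim","ed"]`, true, opts))
	// Comparing the whole array string against single options never matches,
	// which is the bug the commit describes.
	fmt.Println(isValidOption(`["vim","emacs"]`, false, opts))
}
```

Comparing `["vim","emacs"]` as one string against `vim`, `emacs`, and `nano` can never match, which is why the pre-fix behavior always re-prompted.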
Jon Ayers	4e365e59b6	fix: add provision/tags to prebuilds scenario (#22294 )	2026-02-25 11:16:20 -06:00
Garrett Delfosse	6c16794173	fix(cli): proactively use active template version when require_active_version is set (#22033 ) Fixes #22030 ## Problem When a template has `require_active_version = true` and a workspace is outdated, the web UI always shows "Update and start" as the only button (for all users including admins), but `coder start` starts with the old version. For admins, this silently succeeds on the stale version. For non-admins, it goes through a clunky 403→retry path. This also affects the VS Code extension, which calls `coder start --yes` under the hood. ## Root Cause `buildWorkspaceStartRequest()` in `cli/start.go` checks `workspace.AutomaticUpdates == "always"` but ignores `workspace.TemplateRequireActiveVersion`. The server-side autostart already ORs both settings together: ```go // coderd/autobuild/lifecycle_executor.go func useActiveVersion(opts, ws) bool { return opts.RequireActiveVersion || ws.AutomaticUpdates == "always" } ``` The CLI was missing the `RequireActiveVersion` check. ## Fix Add `workspace.TemplateRequireActiveVersion` to the existing OR condition: ```go // Before: if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || action == WorkspaceUpdate { // After: if workspace.AutomaticUpdates == codersdk.AutomaticUpdatesAlways || workspace.TemplateRequireActiveVersion || action == WorkspaceUpdate { ``` Now `coder start` and `coder restart` proactively use the active template version when `require_active_version` is set, matching the web UI and server autostart behavior. The 403→retry fallback remains as a safety net but is no longer the primary path for any user. ## Testing Updated `enterprise/cli/start_test.go` — all user types (owner, template admin, ACL admin, group ACL admin, member) now expect the active version when `require_active_version` is set, and verify the 403→retry message does NOT appear.	2026-02-24 19:51:48 -05:00
Mathias Fredriksson	947b390c5a	fix: allow agent-reported final states, add SSE reconnection (#22286 ) When AgentAPI is configured, `WithTaskReporter` unconditionally overrides all self-reported states to `working`. The intent was to distrust the agent's `idle` and rely on the screen watcher, but the override also blocks `failure` and `complete`, which only the agent can produce (the screen watcher only knows `running`/`stable`). Tasks get stuck as `working` or `null` forever. Now only `idle` is overridden to `working`; `failure`, `complete`, and `working` pass through as-is. Also: - Remove misplaced unconditional `"Failed to watch screen events"` log that fired on every startup - Add SSE reconnection with exponential backoff (1s-30s) in `startWatcher` so it recovers from dropped connections instead of dying silently - Add `complete` to the `coder_report_task` tool enum, which the `coder/claude-code` registry module already instructs agents to use but was missing from the schema Refs coder/internal#1350	2026-02-24 20:28:50 +02:00
Kacper Sawicki	1e274063d4	feat(coderd): filter expired API tokens server-side (#22263 ) ## Summary Moves expired token filtering from client-side to server-side by adding an `include_expired` parameter to the `GetAPIKeysByLoginType` and `GetAPIKeysByUserID` database queries. This is more efficient for large deployments with many expired/short-lived tokens. ## Changes - Add `include_expired` parameter to SQL queries using `OR` short-circuit - Add `include_expired` query parameter to `GET /users/{user}/keys/tokens` - Add `IncludeExpired` field to `codersdk.TokensFilter` - Remove client-side filtering from CLI `tokens list` command - Add `TestTokensFilterExpired` test Fixes coder/internal#1357	2026-02-24 15:27:03 +00:00
Kacper Sawicki	3c69d683f4	fix(cli): allow new immutable parameters via --parameter flag during update (#22221 ) ## Problem When a template adds a new immutable parameter, `coder update --parameter param=value` fails with: ``` error: start workspace: parameter "machine_type" is immutable and cannot be updated ``` The interactive prompt handles this correctly (allows setting first-time immutable params), but the CLI `--parameter` flag path does not. ## Root Cause In `cli/parameterresolver.go`, `verifyConstraints()` runs before the interactive prompt and unconditionally rejects any immutable parameter during updates. It doesn't distinguish between new immutable parameters (first-time use, should be allowed) and existing ones (already set, should be blocked from changing). ## Fix Added an `isFirstTimeUse` check to the immutable parameter constraint, matching the logic already used by the interactive prompt path (line 323). New immutable parameters can now be set via `--parameter`, while existing immutable parameters are still blocked from being changed. ## Testing Added `TestUpdateValidateRichParameters/NewImmutableParameterViaFlag` which: 1. Creates a workspace with a mutable parameter 2. Updates the template to add a new immutable parameter 3. Runs `coder update --parameter immutable_param=value` 4. Verifies the update succeeds and the parameter is set correctly Fixes #22164	2026-02-24 09:15:02 +01:00
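The `isFirstTimeUse` check described in 3c69d683f4 can be sketched as follows. The types and the `verifyImmutable` function are hypothetical and do not mirror `verifyConstraints()` in `cli/parameterresolver.go`; only the rule (a new immutable parameter may be set, an existing one may not be changed) and the error wording come from the commit message. Rejecting any flag value for an already-set immutable parameter, even an identical one, is a simplifying assumption of this sketch.

```go
package main

import "fmt"

// templateParam is a hypothetical, trimmed-down parameter definition.
type templateParam struct {
	Name    string
	Mutable bool
}

// verifyImmutable allows a value for an immutable parameter only when the
// previous build has no value for it (first-time use).
func verifyImmutable(p templateParam, previous map[string]string) error {
	if p.Mutable {
		return nil
	}
	_, alreadySet := previous[p.Name]
	isFirstTimeUse := !alreadySet
	if isFirstTimeUse {
		return nil
	}
	return fmt.Errorf("parameter %q is immutable and cannot be updated", p.Name)
}

func main() {
	previous := map[string]string{"region": "eu"}

	// New immutable parameter added by a template update: allowed.
	fmt.Println(verifyImmutable(templateParam{Name: "machine_type"}, previous))

	// Immutable parameter that already has a value: rejected.
	fmt.Println(verifyImmutable(templateParam{Name: "region"}, previous))
}
```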

1806 Commits