coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 04:58:23 +00:00

Author	SHA1	Message	Date
Yevhenii Shcherbina	dd73ea54bd	feat: add allow-byok option for ai-gateway (#24274 ) ## Summary Adds `--ai-gateway-allow-byok` deployment option to control whether users can use Bring Your Own Key (BYOK) mode with AI Gateway. When disabled (`--ai-gateway-allow-byok=false`), BYOK requests are rejected with a 403 and a message directing the admin to enable the flag. Centralized key authentication works regardless of this setting. Defaults to `true` (BYOK allowed). --------- Co-authored-by: Danny Kopping <danny@coder.com>	2026-04-15 14:16:49 -04:00
Stephen Kirby	e3f2398343	fix(cli): prevent false deprecation warnings for renamed options (#23931 ) Co-authored-by: dylanhuff-at-coder <dylan@coder.com>	2026-04-15 12:33:31 -05:00
Danny Kopping	48b90f8cc8	feat: add coder_build_info metric (#24365 ) _Disclaimer: produced by Claude Opus 4.6_ Adds a `coder_build_info` metric which allows operators to see which versions of Coder are currently running. --------- Signed-off-by: Danny Kopping <danny@coder.com>	2026-04-15 12:48:38 +00:00
Danny Kopping	08045c2aac	feat: configure multiple AI Bridge providers of the same type (#23948 ) _Disclaimer: produced mostly by Claude Opus 4.6 following detailed planning._ ## Summary - Support multiple instances of the same AI Bridge provider type via indexed env vars (`CODER_AIBRIDGE_PROVIDER_<N>_<KEY>`), following the `CODER_EXTERNAL_AUTH_<N>_<KEY>` pattern - Existing single-provider env vars (`CODER_AIBRIDGE_OPENAI_KEY`, etc.) continue to work unchanged - Setting both a legacy env var and an indexed provider with the same name errors at startup to prevent silent misconfiguration - Mark legacy provider fields (`OpenAI`, `Anthropic`, `Bedrock`) as deprecated in `AIBridgeConfig` in favor of `Providers` ## Example ```sh CODER_AIBRIDGE_PROVIDER_0_TYPE=anthropic CODER_AIBRIDGE_PROVIDER_0_NAME=anthropic-corp CODER_AIBRIDGE_PROVIDER_0_KEY=sk-ant-corp-xxx CODER_AIBRIDGE_PROVIDER_0_BASE_URL=https://llm-proxy.internal.example.com/anthropic CODER_AIBRIDGE_PROVIDER_1_TYPE=anthropic CODER_AIBRIDGE_PROVIDER_1_NAME=anthropic-direct CODER_AIBRIDGE_PROVIDER_1_KEY=sk-ant-direct-yyy ``` Each instance is routed by name: - /api/v2/aibridge/anthropic-corp/v1/messages - /api/v2/aibridge/anthropic-direct/v1/messages Closes [AIGOV-157](https://linear.app/codercom/issue/AIGOV-157/spike-to-understand-if-there-is-a-simple-way-to-handle-multi-api-key) --------- Signed-off-by: Danny Kopping <danny@coder.com>	2026-04-15 07:59:37 +00:00
Cian Johnston	116323d3cf	feat: graduate web-push from experiment to always-on (#24310 ) * Removes experiment `web-push`. * Falls back to NoopWebpusher in case of error * Checks browser capability in FE * Adds note to agents getting-started docs regarding webpush without TLS > 🤖	2026-04-14 09:07:06 +01:00
Thomas Kosiewski	6ab30123bf	feat: add chat debug log tables, queries, and SDK types (#23913 )	2026-04-13 15:06:06 +02:00
J. Scott Miller	7bde763b66	feat: add workspace build transition to provisioner job list (#24131 ) Closes #16332 Previously `coder provisioner jobs list` showed no indication of what a workspace build job was doing (i.e., start, stop, or delete). This adds `workspace_build_transition` to the provisioner job metadata, exposed in both the REST API and CLI. Template and workspace name columns were also added, both available via `-c`. ``` $ coder provisioner jobs list -c id,type,status,"workspace build transition" ID TYPE STATUS WORKSPACE BUILD TRANSITION 95f35545-a59f-4900-813d-80b8c8fd7a33 template_version_import succeeded 0a903bbe-cef5-4e72-9e62-f7e7b4dfbb7a workspace_build succeeded start ```	2026-04-10 09:50:11 -05:00
Ehab Younes	1d0653cdab	fix(cli): retry dial timeouts in SSH connection setup (#24199 ) Reorder error checks in isRetryableError so IsConnectionError is evaluated before context.DeadlineExceeded. Dial timeouts (*net.OpError wrapping DeadlineExceeded) were incorrectly treated as non-retryable, causing Coder Connect to fail immediately on broken tunnels with valid DNS despite existing retry logic. Fixes #24201	2026-04-10 00:55:16 +03:00
Cian Johnston	1e40cea199	feat: warn in CLI when server runs dev or RC builds (#24158 ) Adds warning on stderr when the server version contains `-devel` or `-rc.N` > 🤖 Written by a Coder Agent. Will be reviewed by a human.	2026-04-09 12:48:35 -04:00
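A rough sketch of the version check described in 1e40cea199, assuming the warning is keyed on the two substrings the message names (`-devel` and `-rc.N`). The function name, the regular expression, and the sample version strings are illustrative assumptions, not the CLI's actual code.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
	"strings"
)

// rcPattern matches a release-candidate suffix such as "-rc.1".
var rcPattern = regexp.MustCompile(`-rc\.\d+`)

// isDevOrRC reports whether a server version string looks like a
// development or release-candidate build. Hypothetical helper.
func isDevOrRC(version string) bool {
	return strings.Contains(version, "-devel") || rcPattern.MatchString(version)
}

func main() {
	// Sample version strings are made up for illustration.
	for _, v := range []string{"v2.0.0", "v2.0.0-rc.1", "v2.0.0-devel+abc123"} {
		if isDevOrRC(v) {
			// Warnings go to stderr so they do not pollute command output.
			fmt.Fprintf(os.Stderr, "warning: server is running a pre-release build (%s)\n", v)
		}
	}
}
```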
Kyle Carberry	391b22aef7	feat: add CLI commands for managing chat context from workspaces (#24105 ) Adds `coder exp chat context add` and `coder exp chat context clear` commands that run inside a workspace to manage chat context files via the agent token. `add` reads instruction and skill files from a directory (defaulting to cwd) and inserts them as context-file messages into an active chat. Multiple calls are additive — `instructionFromContextFiles` already accumulates all context-file parts across messages. `clear` soft-deletes all context-file messages, causing `contextFileAgentID()` to return `!found` on the next turn, which triggers `needsInstructionPersist=true` and re-fetches defaults from the agent. Both commands auto-detect the target chat via `CODER_CHAT_ID` (already set by `agentproc` on chat-spawned processes), or fall back to single-active-chat resolution for the agent. The `--chat` flag overrides both. Also adds sub-agent context inheritance: `createChildSubagentChat` now copies parent context-file messages to child chats at spawn time, so delegated sub-agents share the same instruction context without independently re-fetching from the workspace agent. 
<details><summary>Implementation details</summary> New files: - `cli/exp_chat.go` — CLI command tree under `coder exp chat context` Modified files: - `agent/agentcontextconfig/api.go` — `ConfigFromDir()` reads context from an arbitrary directory without env vars - `codersdk/agentsdk/agentsdk.go` — `AddChatContext`/`ClearChatContext` SDK methods - `coderd/workspaceagents.go` — POST/DELETE handlers on `/workspaceagents/me/chat-context` - `coderd/coderd.go` — Route registration - `coderd/database/queries/chats.sql` — `GetActiveChatsByAgentID`, `SoftDeleteContextFileMessages` - `coderd/database/dbauthz/dbauthz.go` — RBAC implementations for new queries - `coderd/x/chatd/subagent.go` — `copyParentContextFiles` for sub-agent inheritance - `cli/root.go` — Register `chatCommand()` in `AGPLExperimental()` Auth pattern: Uses `AgentAuth` (same as `coder external-auth`) — agent token via `CODER_AGENT_TOKEN` + `CODER_AGENT_URL` env vars. </details> > 🤖 Generated by Coder Agents --------- Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>	2026-04-09 16:33:00 +02:00
Garrett Delfosse	7b7baea851	feat: support disabling reverse/local port forwarding in agent SSH server (#24026 ) The agent SSH server unconditionally allows all four SSH forwarding paths (TCP local, TCP reverse, Unix local, Unix reverse). This is a sandbox escape vector when workspaces are used for AI agent containment — a reverse tunnel lets anything inside the workspace reach the user's local machine, bypassing network isolation. This adds two new agent CLI flags / environment variables: - `--block-reverse-port-forwarding` / `CODER_AGENT_BLOCK_REVERSE_PORT_FORWARDING` — blocks both TCP (`ssh -R`) and Unix socket reverse forwarding - `--block-local-port-forwarding` / `CODER_AGENT_BLOCK_LOCAL_PORT_FORWARDING` — blocks both TCP (`ssh -L`) and Unix socket local forwarding Template admins can set these via the `env` block on the container/VM resource that runs the agent (e.g. `docker_container`, `kubernetes_pod`), or via `coder_env` resources tied to the agent. Fixes https://github.com/coder/coder/issues/22275 <details> <summary>Implementation notes</summary> Follows the existing `BlockFileTransfer` pattern: 1. `agent/agentssh/agentssh.go` — New `BlockReversePortForwarding` and `BlockLocalPortForwarding` fields on `Config`. TCP callbacks check these before allowing forwarding. The `direct-streamlocal@openssh.com` channel handler is wrapped to reject Unix local forwards. 2. `agent/agentssh/forward.go` — `forwardedUnixHandler` gains a `blockReversePortForwarding` field to reject `streamlocal-forward@openssh.com` requests. 3. `agent/agent.go` — New fields on `Options` and `agent` struct, plumbed to SSH config. 4. `cli/agent.go` — New serpent flags with env vars. 5. Tests cover all four blocked paths: TCP local, TCP reverse, Unix local, Unix reverse. </details> > 🤖 Generated by Coder Agents	2026-04-08 10:41:55 -04:00
Ehab Younes	027c222e82	fix(cli): add dial timeout and keepalive for Coder Connect (#24015 ) The default `net.Dialer` in the Coder Connect path had no timeout, falling back to the OS TCP timeout when the tunnel was broken but DNS still resolved. Add a 5s dial timeout and 30s TCP keepalive. Fixes #24006	2026-04-08 01:11:28 +03:00
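For reference, the two settings named in 027c222e82 map onto standard `net.Dialer` fields. This is a minimal sketch with a hypothetical constructor name; only the 5s and 30s values come from the commit message.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// newConnectDialer returns a dialer with an explicit dial timeout and
// TCP keepalive interval instead of relying on OS defaults.
// The function name is hypothetical.
func newConnectDialer() *net.Dialer {
	return &net.Dialer{
		Timeout:   5 * time.Second,  // give up on a broken tunnel quickly
		KeepAlive: 30 * time.Second, // probe idle connections periodically
	}
}

func main() {
	d := newConnectDialer()
	fmt.Println(d.Timeout, d.KeepAlive)
}
```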
Ehab Younes	d00f148b76	fix(cli): retry transient connection failures during SSH setup (#24010 ) When `coder ssh` connects to a workspace after laptop wake, DNS or the control plane may be briefly unavailable. Previously this caused an immediate failure, which VS Code Remote SSH classified as permanent ("Reload Window"). Wrap each network step (workspace resolution, template version fetch, agent connection info, Coder Connect dial, tailnet dial) with `retryWithInterval` so transient errors (DNS, connection refused, 5xx) are retried individually. Non-retryable errors (auth, 404) and context cancellation stop immediately. Data transfer is never retried.	2026-04-08 00:59:10 +03:00
Kayla はな	c5f1a2fccf	feat: make service accounts a Premium feature (#24020 )	2026-04-07 12:25:32 -06:00
Jon Ayers	7e63fe68f7	fix: avoid instantiating a logger if provided /dev/null (#24027 ) - Adds some additional context to workspace traffic logging - Fails traffic tests if 0 bytes read from connection	2026-04-03 16:26:14 -05:00
Sas Swart	5b6b7719df	fix: make prebuild claiming durable and idempotent (#23108 ) ## Problem When a prebuilt workspace is claimed, the agent reinitializes via a single fire-and-forget pubsub event over SSE. If the agent's SSE connection is interrupted at claim time, the event is permanently lost — the workspace is stuck with no self-healing path. Additionally, regular (non-prebuild) workspaces had no way to opt out of the `/reinit` polling loop — agents would reconnect indefinitely to an endpoint that would never send them anything useful. ## Root Cause `workspaceAgentReinit` fetches the workspace (with its current `owner_id`) via `GetWorkspaceByAgentID`, but never checked whether a claim already happened. It only subscribed to pubsub for future events. The database already has durable claim state (`owner_id` changes from `PrebuildsSystemUserID` to the real user), but no layer ever consulted it on reconnection. ## Solution ### Server-side durable check with first-build-initiator gating TOCTOU-safe ordering: Subscribe to pubsub claim events before any durable checks, so a claim that fires during the check is buffered in the channel rather than lost. First-build-initiator gating: When `!workspace.IsPrebuild()` (owner is no longer the system user), look up the first build's `InitiatorID`. The prebuild reconciler always uses `PrebuildsSystemUserID` as the initiator. This distinguishes claimed prebuilds from regular workspaces without any SQL schema changes. 
- Regular workspace (first build initiator ≠ system user) → 409 Conflict, agent stops reconnecting - Claimed prebuild, build completed → pre-seed channel with reinit event and close it, transmitter delivers one-shot then exits - Claimed prebuild, build in-progress → fall through to pubsub subscription, agent waits for completion event - Unclaimed prebuild → pubsub subscription (existing happy path) ### Declarative reinit events (defense-in-depth) - Added `UserID` field to `ReinitializationEvent` with JSON tags - Switched pubsub serialization from raw string to JSON (with backward-compat fallback for rolling upgrades) - Populated `UserID` at both the publish site and the durable check ### Agent SDK: 409 handling `WaitForReinitLoop` detects 409 Conflict from the server and closes the `reinitEvents` channel, cleanly exiting the retry goroutine. ### Agent CLI: fixed two bugs + added reinitCtx - Closed channel (`!ok`): now blocks on `<-ctx.Done()` instead of `continue`, keeping the current agent running. Previously this would leak agents by skipping `agnt.Close()` and re-entering the loop. - Duplicate owner reinit: cancels `reinitCtx` (stops the reinit goroutine), then blocks on `<-ctx.Done()`. Previously `continue` would skip cleanup and create a new agent on the next loop iteration. - `reinitCtx`: a cancellable child of `ctx` passed to `WaitForReinitLoop`, allowing the agent to stop the reinit HTTP polling after reinit completes. ### Agent-side idempotency Tracks `lastOwnerID` in the agent reinit loop — duplicate events for the same owner are skipped. 
## Testing - "unclaimed prebuild receives reinit via pubsub": prebuild owned by system user, pubsub event triggers reinit - "claimed prebuild receives one-shot reinit on reconnect": first build by system user, owner changed, build completed → immediate reinit (no pubsub needed) - "claimed prebuild waits during in-progress claim build": claimed but build still running → no reinit until build completes - "regular workspace gets 409": first build by real user → 409 Conflict, agent stops polling - Updated claim publisher/listener tests: verify `UserID` survives JSON round-trip + backward compat with raw string payloads - Updated SSE round-trip test: verify `UserID` survives transmit → receive cycle Fixes #22359 ## Rolling upgrade note During a rolling deploy where old coderd instances coexist with new ones, the pubsub `ReinitializationEvent` has a new `workspace_id` field (JSON key `workspace_id`). Old publishers send a raw reason string instead of JSON; the new listener gracefully falls back by treating the entire payload as the reason and filling in `WorkspaceID` from context. The only visible effect during the upgrade window is that `WorkspaceID` may be the zero UUID in agent-side logs — this is cosmetic and resolves once all instances are updated.	2026-04-02 23:51:02 +02:00
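The backward-compatible decoding described in 5b6b7719df (JSON payloads from new publishers, raw reason strings from old ones) might look roughly like the sketch below. The struct, its JSON keys, and the sample reason string are assumptions made for illustration; they are not taken from the Coder source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reinitEvent is a hypothetical, trimmed-down event type.
type reinitEvent struct {
	Reason string `json:"reason"`
	UserID string `json:"user_id"`
}

// parseReinitPayload tries JSON first. If decoding fails, the payload is
// assumed to come from an older publisher and is treated as the reason.
func parseReinitPayload(payload []byte) reinitEvent {
	var ev reinitEvent
	if err := json.Unmarshal(payload, &ev); err != nil {
		return reinitEvent{Reason: string(payload)}
	}
	return ev
}

func main() {
	newStyle := parseReinitPayload([]byte(`{"reason":"claimed","user_id":"u-123"}`))
	oldStyle := parseReinitPayload([]byte("claimed"))
	fmt.Printf("%+v\n%+v\n", newStyle, oldStyle)
}
```

The same idea extends to the agent-side idempotency check: remembering the last owner ID seen and skipping events that repeat it.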
Max Schwenk	1cc23a3144	fix(cli): allow multiple depends-on args in `coder exp sync want` (#23869 ) Previously the command required exactly two arguments, forcing users to run it multiple times to declare multiple dependencies for a single unit. This accepts variadic depends-on arguments so all dependencies can be declared in one call: ``` coder exp sync want my-unit dep-1 dep-2 dep-3 ``` --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Marcin Tojek <mtojek@users.noreply.github.com>	2026-04-01 15:55:32 +00:00
Danielle Maywood	19390a5841	fix: resolve TestScheduleOverride/extend flake caused by timezone hour boundary race (#23830 )	2026-04-01 07:53:04 +01:00
Yevhenii Shcherbina	84b94a8376	feat: add chatgpt support for aibridge proxy (#23826 ) Add ChatGPT support for AIBridgeProxy	2026-03-31 12:54:38 -04:00
Susana Ferreira	b0036af57b	feat: register multiple Copilot providers for business and enterprise upstreams (#23811 ) ## Description Adds support for multiple Copilot provider instances to route requests to different Copilot upstreams (individual, business, enterprise). Each instance has its own name and base URL, enabling per-upstream metrics, logs, circuit breakers, API dump, and routing. ## Changes * Add Copilot business and enterprise provider names and host constants * Register three Copilot provider instances in aibridged (default, business, enterprise) * Update `defaultAIBridgeProvider` in `aibridgeproxy` to route new Copilot hosts to their corresponding providers ## Related * Depends on: https://github.com/coder/aibridge/pull/240 * Closes: https://github.com/coder/aibridge/issues/152 Note: documentation changes will be added in a follow-up PR. _Disclaimer: initially produced by Claude Opus 4.6, heavily modified and reviewed by @ssncferreira ._	2026-03-31 16:00:37 +01:00
Cian Johnston	81fe7543b4	chore: set tls.VersionTLS12 MinVersion in cli/server.go to address gosec warning (#23646 ) I was investigating `//nolint` comments and this one popped up. It raised my eyebrows enough to warrant its own PR.	2026-03-26 14:53:47 +00:00
Cian Johnston	847a88c6ca	chore: clean up stale and dangerous //nolint comments (#23643 ) ## Changes - Commit 1: Remove 17 unnecessary `//nolint` directives: - `//nolint:varnamelen` — linter not active - `//nolint:unused` on exported `SlimUnsupported` - `//nolint:govet` in `coderd/httpmw/csrf` — no longer fires - `//nolint:revive` on functions refactored since the nolint was added - `//nolint:paralleltest` citing Go 1.22 loop variable capture (obsolete) - Bare `//nolint` narrowed to specific `//nolint:gocritic` with justification - Commit 2: Fix root causes behind 5 dangerous nolint suppressions: - Add `MinVersion: tls.VersionTLS12` to TLS client config (removes `gosec` G402) - Delete trivial unexported wrappers `apiKey()`/`normalizeProvider()` in chatprovider (removes `revive` confusing-naming) - Add doc comments to `StartWithAssert` and `Router` (removes `revive` exported) - Rename unused parameters to `_` in integration test helpers > 🤖 This PR was created using Coder Agents and reviewed by me.	2026-03-26 14:13:53 +00:00
Jaayden Halko	3fb7c6264f	feat: display the AI add-on column in the UI on the Users and Organization Members tables (#23291 ) ## Summary Adds an entitlement-gated AI add-on column to both the Users table and the Organization Members table. When `ai_governance_user_limit` is entitled, each row shows whether the user is consuming an AI seat. ## Background The AI governance add-on tracks which users are consuming AI seats. Admins need visibility into per-user seat consumption directly from the user management tables. This change surfaces that information through both the site-wide Users table and the per-organization Members table, gated behind the `ai_governance_user_limit` entitlement so the column only appears when the feature is licensed. ## Implementation ### Backend - New SQL query `GetUserAISeatStates` (`coderd/database/queries/aiseatstate.sql`) — returns user IDs consuming an AI seat, derived from: - Users with entries in `aibridge_interceptions` (AI Bridge usage) - Users who own workspaces with `has_ai_task = true` builds (AI Tasks usage) - SDK types — added `has_ai_seat: boolean` to `codersdk.User` and `codersdk.OrganizationMemberWithUserData` - Handler wiring — both the Users list endpoint (`coderd/users.go`) and all Members endpoints (`coderd/members.go`) query AI seat state per page of user IDs and populate the response field - dbauthz — per-user `ActionRead` checks on `ResourceUserObject` ### Frontend - Shared `AISeatCell` component (`site/src/modules/users/AISeatCell.tsx`) — green `CircleCheck` for consuming, gray `X` for non-consuming - `TableColumnHelpTooltip` — extended with `ai_addon` variant with tooltip: "Users with access to AI features like AI Bridge, Boundary, or Tasks who are actively consuming a seat." 
- Column visibility gated behind `useFeatureVisibility().ai_governance_user_limit` ## Validation - Backend: dbauthz full method suite (`TestMethodTestSuite`) passes including new `GetUserAISeatStates` test - Backend: `TestGetUsers`, `TestUsersFilter`, CLI golden file tests pass - Frontend: 7/7 tests pass across `UsersPage.test.tsx` and `OrganizationMembersPage.test.tsx` (column visibility gating both directions) - `go build ./coderd/...` compiles clean - `pnpm --dir site run lint:types` passes - `make gen` clean ## Risks - Pagination performance: The AI seat query is scoped to the current page's user IDs (not a full table scan), keeping it efficient for paginated views. - Semantic scope: The workspace-side AI seat derivation uses "any build with `has_ai_task = true`" rather than "latest build only". If the product intent is latest-build-only, this can be tightened in a follow-up. --- _Generated with `mux` • Model: `anthropic:claude-opus-4-6` • Thinking: `xhigh` • Cost: `$27.25`_ <!-- mux-attribution: model=anthropic:claude-opus-4-6 thinking=xhigh costs=27.25 -->	2026-03-26 10:36:40 +00:00
Cian Johnston	46edaf2112	test: reduce number of coderdtest instances (#23463 ) Consolidates coderdtest invocations in 7 tests to reduce 23 instances to 7 across: - `TestGetUser` (3 → 1) — read-only user lookups - `TestUserTerminalFont` (3 → 1) — each creates own user via CreateAnotherUser - `TestUserTaskNotificationAlertDismissed` (3 → 1) — each creates own user - `TestUserLogin` (3 → 1) — each creates/deletes own user - `TestExpMcpConfigureClaudeCode` (5 → 1) — writes to isolated temp dirs - `TestOAuth2RegistrationTokenSecurity` (3 → 1) — independent registrations - `TestOAuth2SpecificErrorScenarios` (3 → 1) — independent error scenarios > 🤖 This PR was created with the help of Coder Agents, and has been reviewed by my human. 🧑‍💻	2026-03-25 09:53:06 +00:00
Mathias Fredriksson	38f723288f	fix: correct malformed struct tags in organizationroles and scim_test (#23497 ) Fix leading space in table tag and escaped-quote tag syntax. Extracted from #23201.	2026-03-25 13:11:08 +11:00
Mathias Fredriksson	147df5c971	refactor: replace sort.Strings with slices.Sort (#23457 ) The slices package provides type-safe generic replacements for the old typed sort convenience functions. The codebase already uses slices.Sort in 43 call sites; this finishes the migration for the remaining 29. - sort.Strings(x) -> slices.Sort(x) - sort.Float64s(x) -> slices.Sort(x) - sort.StringsAreSorted(x) -> slices.IsSorted(x)	2026-03-23 23:19:23 +02:00
Asher	47daca6eea	feat: add filtering to org members (#23334 ) Continuation of https://github.com/coder/coder/pull/23067 Add filtering to the paginated org member endpoint (pretty much the same as what I did in the previous PR with group members, except there I also had to add pagination since it was missing).	2026-03-21 16:58:45 -08:00
Susana Ferreira	139594a4f4	feat: block CONNECT tunnels to private/reserved IP ranges (#23109 ) ## Description Blocks `CONNECT` tunnels to private and reserved IP ranges in aibridgeproxyd, preventing the proxy from being used to reach internal networks. The Coder access URL is always exempt (hostname+port match) so the proxy can reach its own deployment. It is possible to exempt additional ranges via `CODER_AIBRIDGE_PROXY_ALLOWED_PRIVATE_CIDRS`. DNS rebinding is handled differently per path: * Direct (no upstream proxy): validate the resolved IP right before the TCP dial, no window between check and connect. * Upstream proxy: Resolves and checks before forwarding to the upstream dialer. A small rebinding window exists since the upstream proxy re-resolves independently. ## Changes * Add blocked IP denylist covering private, reserved, and special-purpose ranges * Add `AllowedPrivateCIDRs` option with CLI flag and env var * Wire IP checks into `proxy.ConnectDial` for both upstream and direct paths * Add tests for blocked/allowed cases across direct dial, upstream proxy, CIDR exemptions, and CoderAccessURL exemption Notes: documentation will be handled in a follow-up PR. Closes: https://github.com/coder/security/issues/124	2026-03-20 09:49:26 +00:00
Cian Johnston	06c50d13ad	fix(cli): exorcise the DERP healthcheck demon from TestSupportBundle (#23337 ) - Replace real healthcheck with mock `HealthcheckFunc` that returns a canned report instantly - Remove healthcheck cache-seeding goroutine/channel workaround - Remove `HealthcheckTimeout: testutil.WaitSuperLong` (no longer needed) - Reduce `setupCtx` from `WaitSuperLong` (60s) to `WaitLong` (25s) The DERP healthcheck performs real network operations (portmapper gateway probing, STUN) that hang for 60s+ on macOS CI runners. Since `TestSupportBundle` validates bundle generation, not healthcheck correctness, a canned report eliminates this entire class of flake. Fixes coder/internal#272 > 🤖 This PR was created with the help of Coder Agents, and was reviewed by my human. 🧑‍💻	2026-03-20 09:46:13 +00:00
Michael Suchacz	6d214644f6	fix: make TestInterruptAutoPromotionIgnoresLaterUsageLimitIncrease deterministic (#23279 ) Eliminates the timing flake in `TestInterruptAutoPromotionIgnoresLaterUsageLimitIncrease` by making the chatd worker loop clock-controllable. ## Changes `coderd/chatd/chatd.go` - Replace `time.NewTicker` calls in `Server.start()` with `p.clock.NewTicker` using named quartz tags `("chatd", "acquire")` and `("chatd", "stale-recovery")`. `coderd/chatd/chatd_test.go` - Inject `quartz.NewMock(t)` into the test via `newActiveTestServer` config override. - Trap the acquire ticker so the test controls exactly when pending chats are reacquired. - Rewrite the test flow as explicit clock-advance steps instead of wall-clock polling. `AGENTS.md` - Document the PR title scope rule (scope must be a real path containing all changed files). ## Validation - `go test ./coderd/chatd -run TestInterruptAutoPromotionIgnoresLaterUsageLimitIncrease -count=100` ✅ - `go test ./coderd/chatd` ✅ - `make lint` ✅	2026-03-19 15:14:00 +00:00
Cian Johnston	be1c06dec9	feat: add endpoint and CLI for users to view their own OIDC claims (#23053) - Adds a new API endpoint `GET /api/v2/users/oidc-claims` that returns only the merged claims (not the separate id_token/userinfo breakdown). It is scoped exclusively to the authenticated user's own identity; there is no user parameter, so users cannot view each other's claims. - Adds a new CLI command, `coder users oidc-claims`, that hits the above endpoint. - The existing owner-only debug endpoint is preserved unchanged for admins who need the full claim breakdown. > 🤖 This PR was created with the help of Coder Agents, and will be reviewed by my human. 🧑‍💻	2026-03-18 22:10:04 +00:00
Jon Ayers	eba7d943a0	fix: run stop build before starting a workspace with a failed start (#22925 )	2026-03-18 14:58:20 -05:00
Kacper Sawicki	1e07ec49a6	feat: add merge_strategy support for coder_env resources (#23107 ) ## Description Implements the server-side merge logic for the `merge_strategy` attribute added to `coder_env` in [terraform-provider-coder v2.15.0](https://github.com/coder/terraform-provider-coder/pull/489). This allows template authors to control how duplicate environment variable names are combined across multiple `coder_env` resources. Relates to https://github.com/coder/coder/issues/21885 ## Supported strategies \| Strategy \| Behavior \| \|----------\|----------\| \| `replace` (default) \| Last value wins — backward compatible \| \| `append` \| Joins values with `:` separator (e.g. PATH additions) \| \| `prepend` \| Prepends value with `:` separator \| \| `error` \| Fails the build if the variable is already defined \| ## Example ```hcl resource "coder_env" "path_tools" { agent_id = coder_agent.dev.id name = "PATH" value = "/home/coder/tools/bin" merge_strategy = "append" } ``` ## Changes - Proto: Added `merge_strategy` field to `Env` message in `provisioner.proto` - State reader: Updated `agentEnvAttributes` struct and proto construction in `resources.go` - Merge logic: Added `mergeExtraEnvs()` function in `provisionerdserver.go` with strategy-aware merging for both agent envs and devcontainer subagent envs - Tests: 15 unit tests covering all strategies, edge cases (empty values, mixed strategies, multiple appends) - Dependency: Bumped `terraform-provider-coder` v2.14.0 → v2.15.0 - Fixtures: Updated `duplicate-env-keys` test fixtures and golden files ## Ordering When multiple resources `append` or `prepend` to the same key, they are processed in alphabetical order by Terraform resource address (per the determinism fix in #22706).	2026-03-18 15:43:28 +01:00
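The strategy table in 1e07ec49a6 can be sketched as a small merge loop. The types and function below are hypothetical and are not the `mergeExtraEnvs()` implementation; in particular, the handling of an empty existing value is an assumption.

```go
package main

import (
	"fmt"
)

// envVar is a hypothetical, simplified view of a coder_env resource.
type envVar struct {
	Name          string
	Value         string
	MergeStrategy string // "", "replace", "append", "prepend", or "error"
}

// mergeEnvs applies each variable in order using its merge strategy.
func mergeEnvs(envs []envVar) (map[string]string, error) {
	merged := make(map[string]string)
	for _, e := range envs {
		existing, exists := merged[e.Name]
		switch e.MergeStrategy {
		case "", "replace": // default: last value wins
			merged[e.Name] = e.Value
		case "append":
			if exists && existing != "" {
				merged[e.Name] = existing + ":" + e.Value
			} else {
				merged[e.Name] = e.Value
			}
		case "prepend":
			if exists && existing != "" {
				merged[e.Name] = e.Value + ":" + existing
			} else {
				merged[e.Name] = e.Value
			}
		case "error":
			if exists {
				return nil, fmt.Errorf("env %q is already defined", e.Name)
			}
			merged[e.Name] = e.Value
		default:
			return nil, fmt.Errorf("unknown merge strategy %q", e.MergeStrategy)
		}
	}
	return merged, nil
}

func main() {
	out, err := mergeEnvs([]envVar{
		{Name: "PATH", Value: "/usr/bin"},
		{Name: "PATH", Value: "/home/coder/tools/bin", MergeStrategy: "append"},
	})
	fmt.Println(out["PATH"], err)
}
```

Because the result depends on processing order, a deterministic input order (the commit message cites alphabetical order by Terraform resource address) matters for `append` and `prepend`.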
Ethan	fc3508dc60	feat: configure acquire chat batch size (#23196 ) ## Summary - add a hidden deployment config option for chat acquire batch size (`CODER_CHAT_ACQUIRE_BATCH_SIZE` / `chat.acquireBatchSize`) - thread the configured value into chatd startup while preserving the existing default of `10` - clamp the deployment value to the `int32` range before passing it into chatd - regenerate the API/docs/types/testdata artifacts for the new config field ## Why `chatd` currently acquires pending chats in batches of `10` via a compile-time default. This change makes that batch size operator-configurable from deployment config, so we can tune acquisition behavior without another code change.	2026-03-19 00:54:32 +11:00
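Clamping a wider configured integer into the `int32` range, as fc3508dc60 describes, takes only a few lines. The helper name is hypothetical; the default of 10 is the only value taken from the commit message.

```go
package main

import (
	"fmt"
	"math"
)

// clampToInt32 narrows v to int32 without overflow by saturating at the
// type's bounds. Hypothetical helper.
func clampToInt32(v int64) int32 {
	if v > math.MaxInt32 {
		return math.MaxInt32
	}
	if v < math.MinInt32 {
		return math.MinInt32
	}
	return int32(v)
}

func main() {
	fmt.Println(clampToInt32(10))      // in range: unchanged
	fmt.Println(clampToInt32(1 << 40)) // too large: saturates at MaxInt32
}
```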
Cian Johnston	fe82d0aeb9	fix: allow member users to generate support bundles (#23040 ) Fixes AIGOV-141 The `coder support bundle` command previously required admin permissions (`Read DeploymentConfig`) and would abort entirely for non-admin `member` users with: ``` failed authorization check: cannot Read DeploymentValues ``` This change makes the command degrade gracefully instead of failing outright. <details> <summary> Changes </summary> ### `support/support.go` - `Run()`: The authorization check for `Read DeploymentValues` is now a soft warning instead of a hard gate. Unauthenticated users (401) still fail, but authenticated users with insufficient permissions proceed with reduced data. - `DeploymentInfo()`: `DeploymentConfig` and `DebugHealth` fetches now handle 403/401 responses gracefully, matching the existing pattern used by `DeploymentStats`, `Entitlements`, and `HealthSettings`. - `NetworkInfo()`: Coordinator debug and tailnet debug fetches now check response status codes for 403/401 before reading the body. ### `cli/support.go` - `summarizeBundle()`: No longer returns early when `Config` or `HealthReport` is nil. Instead prints warnings and continues summarizing available data (e.g., netcheck). ### Tests - `MissingPrivilege` → `MemberNoWorkspace`: Asserts member users can generate a bundle successfully with degraded admin-only data. - `NoPrivilege` → `MemberCanGenerateBundle`: Asserts the CLI produces a valid zip bundle for member users. - All existing tests continue to pass (`NoAuth`, `OK`, `OK_NoWorkspace`, `DontPanic`, etc.). ## Behavior matrix \| User type \| Before \| After \| \|---\|---\|---\| \| Admin \| Full bundle \| Full bundle (no change) \| \| Member \| Hard error \| Bundle with degraded admin-only data \| \| Unauthenticated \| Hard error \| Hard error (no change) \| Related to PRODUCT-182	2026-03-18 13:43:10 +00:00
Atif Ali	bd5b62c976	feat: expose MCP tool annotations for tool grouping (#23195) ## Summary - add shared MCP annotation metadata to toolsdk tools - emit MCP tool annotations from both coderd and CLI MCP servers - cover annotation serialization in toolsdk, coderd MCP e2e, and CLI MCP tests ## Why - Coder already exposed MCP tools, but it did not populate MCP tool annotation hints (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`). - Hosts such as Claude Desktop use those hints to classify and group tools, so without them Coder tools can get lumped together. - This change adds a shared annotation source in `toolsdk` and has both MCP servers emit those hints through `mcp.Tool.Annotations`, avoiding drift between local and remote MCP implementations. ## Testing - Tested locally on Claude Desktop and the tools are categorized correctly. <table> <tr> <td> Before <td> After <tr> <td> <img width="613" height="183" alt="image" src="https://github.com/user-attachments/assets/29d2e3fb-53bc-4ea7-bdb3-f10df4ef996b" /> <td> <img width="600" height="457" alt="image" src="https://github.com/user-attachments/assets/cc384036-c9a7-4db9-9400-43ad51920ff5" /> </table> Note: Done using Coder Agents, reviewed and tested by human locally	2026-03-18 10:21:45 +00:00
Asher	903cfb183f	feat: add --service-account to cli user creation (#23186 )	2026-03-17 14:07:20 -08:00
George K	91ec0f1484	feat: add service_accounts workspace sharing mode (#23093 ) Introduce a three-way workspace sharing setting (none, everyone, service_accounts) replacing the boolean workspace_sharing_disabled. In service_accounts mode, only service account-owned workspaces can be shared while regular members' share permissions are removed. Adds a new organization-service-account system role with per-org permissions reconciled alongside the existing organization-member system role. Related to: https://linear.app/codercom/issue/PLAT-28/feat-service-accounts-sharing-mode-and-rbac-role --------- Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com> Co-authored-by: Kayla はな <mckayla@hey.com>	2026-03-17 12:16:43 -07:00
Zach	3f76f312e4	feat(cli): add --no-wait flag to coder create (#22867 ) Adds a `--no-wait` flag (CODER_CREATE_NO_WAIT) to the create command, matching the existing pattern in `coder start`. When set, the `coder create` command returns immediately after the workspace creation API call succeeds instead of streaming build logs until completion. This enables fire-and-forget workspace creation in CI/automation contexts (e.g., GitHub Actions), where waiting for the build to finish is unnecessary. Combined with other existing flags, users can create a workspace with no interactivity, assuming the user is already authenticated.	2026-03-16 11:54:30 -06:00
Callum Styan	36665e17b2	feat: add WatchAllWorkspaceBuilds endpoint for autostart scaletests (#22057 ) This PR adds a `WatchAllWorkspaces` function with `watch-all-workspaces` endpoint, which can be used to listen on a single global pubsub channel for _all_ workspace build updates, and makes use of it in the autostart scaletest. This negates the need to use a workspace watch pubsub channel _per_ workspace, which has auth overhead associated with each call. This is especially relevant in situations such as the autostart scaletest, where we need to start/stop a set of workspaces before we can configure their autostart config. The overhead associated with all the watch requests skews the scaletest results and makes it harder to reason about the performance of the autostart feature itself. The autostart scaletest also no longer generates its own metrics nor does it wait for all the workspaces to actually start via autostart. We should update the scaletest dashboard after both PRs are merged to measure autostart performance via the new metrics. The new function/endpoint and its usage in the autostart scaletest are gated behind an experiment feature flag, this is something we should discuss whether we want to enable the endpoint in prod by default or not. If so, we can remove the experiment. --------- Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Callum Styan <callum@coder.com>	2026-03-13 20:37:41 -07:00
Danny Kopping	870583224d	chore: deprecate injected MCP approach in AI Bridge (#23031 ) _Disclaimer: implemented by a Coder Agent using Claude Opus 4.6._ Marks the injected MCP approach in AI Bridge as deprecated across the codebase. ## Changes - `codersdk/deployment.go`: Deprecated `ExternalAuthConfig.MCPURL`, `.MCPToolAllowRegex`, `.MCPToolDenyRegex` fields; deprecated and hid the `--aibridge-inject-coder-mcp-tools` server flag; deprecated `AIBridgeConfig.InjectCoderMCPTools`. - `coderd/externalauth/externalauth.go`: Deprecated `Config.MCPURL`, `.MCPToolAllowRegex`, `.MCPToolDenyRegex`. - `enterprise/aibridgedserver/aibridgedserver.go`: Added runtime deprecation warning when `CODER_AIBRIDGE_INJECT_CODER_MCP_TOOLS` is enabled; deprecated `getCoderMCPServerConfig`. - `enterprise/aibridged/mcp.go`: Deprecated `MCPProxyBuilder` interface and `MCPProxyFactory` struct. - `docs/ai-coder/ai-bridge/mcp.md`: Added deprecation warning banner.	2026-03-13 16:15:33 +02:00
Zach	2488cf0d41	fix(agent): don't overwrite existing vscode git auth settings (#22871 ) OverrideVSCodeConfigs previously unconditionally set `git.useIntegratedAskPass` and `github.gitAuthentication` to false, clobbering any values provided by template authors via module settings (e.g. the vscode-web module's settings block). This change only sets these keys when they are not already present, so template-provided values are preserved. Registry PR [#758](https://github.com/coder/registry/pull/758) fixed the module side (run.sh merges template-author settings into the existing settings.json instead of overwriting the file). But the agent still unconditionally stamped false onto both keys before the script ran, so the merge base always contained the agent's values and template authors couldn't set them to anything else. This change fixes the agent side by only writing defaults when the keys are absent.	2026-03-12 13:39:24 -06:00
Mathias Fredriksson	57af7abf1f	test: add testutil.WaitBuffer and replace time.Sleep in tests (#22922 ) WaitBuffer is a thread-safe io.Writer that supports blocking until accumulated output matches a substring or custom predicate. It replaces ad-hoc safeBuffer/syncWriter types and time.Sleep-based poll loops in tests with signal-driven waits. - WaitFor/WaitForNth/WaitForCond for blocking on output - Replace custom buffer types in cli/sync_test.go and provisionersdk/agent_test.go - Convert time.Sleep poll loops to require.Eventually/require.Never in cli/ssh_test.go, coderd/activitybump_test.go, coderd/workspaceagentsrpc_test.go, workspaceproxy_test.go, and scaletest tests	2026-03-12 18:07:52 +02:00
Zach	5cb820387c	fix: use quartz clock in task status test (#22969 ) Replace time.Since() usage with a quartz.Clock injected via RootCmd to ensure relative time strings ("Xs ago") are deterministic.	2026-03-12 08:33:09 -06:00
George K	e5c19d0af4	feat: backend support for creating and storing service accounts (#22698 ) Add is_service_account column to users table with CHECK constraints enforcing login_type='none' and empty email for service accounts. Update user creation API to validate service account constraints. Related to: https://linear.app/codercom/issue/PLAT-27/feat-backend-support-for-creating-and-storing-service-accounts	2026-03-11 10:19:08 -07:00
Kyle Carberry	71b132b9e7	fix(cli/sessionstore): don't run Windows keyring tests in parallel (#22937 ) Removes `t.Parallel()` from `TestKeyring` and `TestWindowsKeyring_WriteReadDelete`. The OS keyring is a shared system resource that's flaky under concurrent access, especially Windows Credential Manager in CI. Fixes coder/internal#1370	2026-03-11 15:19:56 +02:00
Cian Johnston	bc27274aba	feat(coderd): refactors github pr sync functionality (#22715 ) - Adds `_API_BASE_URL` to `CODER_EXTERNAL_AUTH_CONFIG_` - Extracts and refactors existing GitHub PR sync logic to new packages `coderd/gitsync` and `coderd/externalauth/gitprovider` - Associated wiring and tests Created using Opus 4.6	2026-03-10 18:46:01 +00:00
Mathias Fredriksson	33136dfe39	fix: use signal-based sync instead of time.Sleep in sync test (#22918 ) The `start_with_dependencies` golden test was flaky on Windows CI. It used `time.Sleep(100ms)` in a goroutine hoping the `sync start` command would have time to call `SyncReady`, find the dependency unsatisfied, and print the "Waiting..." message before the goroutine completed the dependency. On slower Windows runners, the sleep could finish and complete the dependency before the command's first `SyncReady` call, so `ready` was already `true` and the "Waiting..." message was never printed, causing the golden file mismatch. This replaces the `time.Sleep` with a `syncWriter` that wraps `bytes.Buffer` with a mutex and a channel. The channel closes when the written output contains the expected signal string ("Waiting"). The goroutine blocks on this channel instead of sleeping, so it only completes the dependency after the command has confirmed it is in the waiting state. Fixes https://github.com/coder/internal/issues/1376	2026-03-10 17:21:08 +00:00
Danny Kopping	2948400aef	fix(cli): skip CODER_SESSION_TOKEN check when --use-token-as-session is set (#22888 ) _Disclaimer: implemented with Opus 4.6 and Coder Agents._ Follow-up to #22879. ## Problem The `CODER_SESSION_TOKEN` guard added in #22879 blocks `coder login` unconditionally when the env var is set. This conflicts with `--use-token-as-session`, which intentionally uses the provided token (including from the env var) directly as the session token. ## Fix Add `&& !useTokenForSession` to the check so that `coder login --use-token-as-session` still works when `CODER_SESSION_TOKEN` is set. ## Testing Added `TestLogin/SessionTokenEnvVarWithUseTokenAsSession` — sets the env var with a valid token and passes `--use-token-as-session`, verifying login succeeds. --------- Signed-off-by: Danny Kopping <danny@coder.com>	2026-03-10 15:40:54 +02:00
Mathias Fredriksson	73bf8478d8	fix(cli): fix flaky TestGitSSH/Local_SSH_Keys on Windows CI (#22883 ) The `TestGitSSH/Local_SSH_Keys` test was flaking on Windows CI with a context deadline exceeded error when calling `client.GitSSHKey(ctx)`. Two issues contributed to the flake: 1. `prepareTestGitSSH` called `coderdtest.AwaitWorkspaceAgents` without passing the caller's context. This created a separate internal 25s timeout, wasting time budget independently of the setup context. Changed to use `NewWorkspaceAgentWaiter(...).WithContext(ctx).Wait()` so the agent wait shares the caller's timeout. 2. The `Local SSH Keys` subtest used `WaitLong` (25s) for its setup context, but this subtest does more work than `Dial` (runs the command twice). Bumped to `WaitSuperLong` (60s) to give slow Windows CI runners enough time. Fixes coder/internal#770	2026-03-10 12:12:15 +02:00

1832 Commits