coder

mirror of https://github.com/coder/coder.git synced 2026-06-02 20:48:20 +00:00

Author	SHA1	Message	Date
Cian Johnston	0a73ec6a50	feat(site/src/pages/AgentsPage): show error details for generic errors (#25803 ) Error messages in agent chat now expose the actual error detail instead of hiding it entirely. Also captures API response detail for generic errors that previously dropped it. (cherry picked from commit `78d556fffc`)	2026-06-02 12:23:59 +01:00
github-actions[bot]	26c035d742	fix(site): show condensed count for multi-provider in sessions list (#25705 ) (#25932 ) Cherry-pick of https://github.com/coder/coder/pull/25705 Original PR: #25705 — fix(site): show condensed count for multi-provider in sessions list Merge commit: `fc01aeeb0f` Requested by: @tracyjohnsonux Co-authored-by: TJ <tracy@coder.com>	2026-06-01 14:09:56 -04:00
github-actions[bot]	01766e9694	docs: document chat sharing (#25592 ) (#25927 ) Cherry-pick of https://github.com/coder/coder/pull/25592 Original PR: #25592 — docs: document chat sharing Merge commit: `372265a0b5` Requested by: @david-fraley Co-authored-by: Danielle Maywood <danielle@themaywoods.com>	2026-06-01 13:42:21 -04:00
github-actions[bot]	f4bf286deb	docs: document AI providers seeding mechanism & support for new types (#25855 ) (#25906 ) Cherry-pick of https://github.com/coder/coder/pull/25855 Original PR: #25855 — docs: document AI providers seeding mechanism & support for new types Merge commit: `f9937a8931` Requested by: @dannykopping --------- Co-authored-by: Danny Kopping <danny@coder.com> Co-authored-by: Susana Ferreira <susana@coder.com>	2026-06-01 13:41:19 -04:00
github-actions[bot]	ec2d20a7f1	feat: support adding GitHub Copilot AI provider via UI (#25888 ) (#25902 ) Cherry-pick of https://github.com/coder/coder/pull/25888 Original PR: #25888 — feat: support adding GitHub Copilot AI provider via UI Merge commit: `a85462bd49` Requested by: @dannykopping Co-authored-by: Danny Kopping <danny@coder.com>	2026-06-01 13:40:25 -04:00
github-actions[bot]	ea971d54f3	fix: deprecate ai provider seeding env config (#25854 ) (#25900 ) Cherry-pick of https://github.com/coder/coder/pull/25854 Original PR: #25854 — fix: deprecate ai provider seeding env config Merge commit: `c8555e2163` Requested by: @dannykopping Co-authored-by: Danny Kopping <danny@coder.com>	2026-06-01 13:40:09 -04:00
Dean Sheather	f7369502bf	chore: disable release freezing on dev.coder.com (#25881 ) (#25912 ) (cherry picked from commit `9c111a2be2`)	2026-06-01 17:01:43 +02:00
github-actions[bot]	32882aee95	fix: recreate `ai_provider_type` instead of ADD VALUE (#25895 ) (#25904 ) Cherry-pick of https://github.com/coder/coder/pull/25895 Original PR: #25895 — fix: recreate `ai_provider_type` instead of ADD VALUE Merge commit: `85f56e4944` Requested by: @dannykopping Signed-off-by: Danny Kopping <danny@coder.com> Co-authored-by: Danny Kopping <danny@coder.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 10:32:44 -04:00
github-actions[bot]	eb918f9ad5	chore: Style fixes and nits across the AI Governance docs (#25793 ) (#25897 ) Backport of https://github.com/coder/coder/pull/25793 Original PR: #25793 — chore: Style fixes and nits across the AI Governance docs Merge commit: `61a9c4a61d` Requested by: @nickvigilante Co-authored-by: Nick Vigilante <nickvigilante@users.noreply.github.com> Co-authored-by: Danny Kopping <danny@coder.com>	2026-06-01 10:06:34 -04:00
github-actions[bot]	295d2de5d7	feat(site): add Opus 4.8 known model (#25839 ) (#25853 ) Cherry-pick of https://github.com/coder/coder/pull/25839 Original PR: #25839 — feat(site): add Opus 4.8 known model Merge commit: `9448624d2d` Requested by: @ibetitsmike Co-authored-by: Thomas Kosiewski <tk@coder.com>	2026-05-29 19:59:47 -04:00
Cian Johnston	2d640eaf76	feat: classify provider_disabled 503 as non-retryable (#25800) (#25860) (NOTE: Depends on https://github.com/coder/coder/pull/25837) Adds a new `provider_disabled` error classification in `chatd` with the corresponding plumbing to classify it as non-retryable. Also adds a story for how this particular error kind is displayed in the UI. (cherry picked from commit `d0a51da0a9`)	2026-05-29 16:54:20 -04:00
Cian Johnston	359a39f58a	fix: add missing_key error kind for missing chat api_key_id (#25783) (#25798) Refs CODAGT-486 - `codersdk/chats.go`: New `ChatErrorKindMissingKey` constant and `AllChatErrorKinds` entry - `coderd/x/chatd/chaterror/message.go`: `terminalMessage` and `retryMessage` cases - `coderd/x/chatd/model_routing_aibridge.go`: Pre-classify error with `WithClassification` - `coderd/x/chatd/model_routing_internal_test.go`: Classification assertion on production path (CRF-2) - `chatStatusHelpers.ts`: Frontend title "Chat interrupted" - `LiveStreamTail.stories.tsx`: Storybook story with `detail` assertion - `docs/ai-coder/ai-gateway/clients/coder-agents.md`: Troubleshooting entry - Tests: classification round-trip, terminal message, metrics kind enumeration > Generated with [Coder Agents](https://coder.com/agents) on behalf of @johnstcn (cherry picked from commit `6df1536256`)	2026-05-29 13:07:19 -04:00
Cian Johnston	804bb3c0cf	fix(coderd): enforce api_key_id on user messages at type level (#25729) (#25797) - Empty string is valid for `apiKeyID` in paths that genuinely lack a caller key (e.g. agent-initiated context injection in `workspaceAgentAddChatContext`). AI Gateway fail-closed check remains the runtime safety net. - Context injection paths (`persistInstructionFiles`, compaction) read the key from `aibridge.DelegatedAPIKeyIDFromContext(ctx)`, set upstream by `contextWithActiveTurnAPIKeyID`. - Subagent context copy branches on `copiedRole == database.ChatMessageRoleUser` to choose the right append function. > Generated by Coder Agents (cherry picked from commit `b278be7361`)	2026-05-29 13:06:06 -04:00
github-actions[bot]	476ed480d1	fix(coderd): block ai provider env key drift (#25849 ) (#25851 ) Cherry-pick of https://github.com/coder/coder/pull/25849 Original PR: #25849 — fix(coderd): block ai provider env key drift Merge commit: `110210d7c9` Requested by: @dannykopping Co-authored-by: Danny Kopping <danny@coder.com>	2026-05-29 13:00:45 -04:00
github-actions[bot]	663f1ee834	fix: track credential hint across key failover attempts in aibridge (#25735 ) (#25847 ) Cherry-pick of https://github.com/coder/coder/pull/25735 Original PR: #25735 — fix: track credential hint across key failover attempts in aibridge Merge commit: `7b903cad73` Requested by: @ssncferreira Co-authored-by: Susana Ferreira <susana@coder.com>	2026-05-29 12:59:39 -04:00
github-actions[bot]	cccf436db2	feat: serve 503 sentinel for disabled providers (#25794 ) (#25837 ) Cherry-pick of https://github.com/coder/coder/pull/25794 Original PR: #25794 — feat: serve 503 sentinel for disabled providers Merge commit: `5b10268827` Requested by: @dannykopping Signed-off-by: Danny Kopping <danny@coder.com> Co-authored-by: Danny Kopping <danny@coder.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 12:58:13 -04:00
github-actions[bot]	cf6311b9e0	fix(coderd/x/chatd): harden openai-compatible chat calls (#25737 ) (#25796 ) Cherry-pick of https://github.com/coder/coder/pull/25737 Original PR: #25737 — fix(coderd/x/chatd): harden openai-compatible chat calls Merge commit: `f529577bee` Requested by: @ibetitsmike Co-authored-by: Michael Suchacz <203725896+ibetitsmike@users.noreply.github.com>	2026-05-29 12:53:24 -04:00
github-actions[bot]	c350e98a6e	fix(site): update models settings page description text (#25830 ) (#25831 ) Cherry-pick of https://github.com/coder/coder/pull/25830 Original PR: #25830 — fix(site): update models settings page description text Merge commit: `a801d996e7` Requested by: @tracyjohnsonux Co-authored-by: TJ <tracy@coder.com>	2026-05-29 12:49:46 -04:00
Danny Kopping	7e5e8eb9d2	fix: add ai provider status and reload freshness metrics (#25770 ) (#25795 ) Add metrics for `aibridged` and `aibridgeproxyd`'s provider statuses. AI providers can be modified, and possibly misconfigured, at runtime. These metrics help operators understand the state of these provider definitions in case unexpected behaviour is observed. (cherry picked from commit `12520ee964`)	2026-05-28 18:54:02 +02:00
github-actions[bot]	85d39b3dbe	fix(coderd/x/chatd/chatloop): use stream silence timeout (#25782 ) (#25786 ) Cherry-pick of https://github.com/coder/coder/pull/25782 Original PR: #25782 — fix(coderd/x/chatd/chatloop): use stream silence timeout Merge commit: `7e2f7198dd` Requested by: @ethanndickson Co-authored-by: Ethan <ethanndickson@gmail.com>	2026-05-28 11:29:14 -04:00
github-actions[bot]	eb8b062b1d	fix: re-validate provider per request and classify reloads (#25766 ) (#25788 ) Cherry-pick of https://github.com/coder/coder/pull/25766 Original PR: #25766 — fix: re-validate provider per request and classify reloads Merge commit: `a9f5ed7644` Requested by: @dannykopping Co-authored-by: Danny Kopping <danny@coder.com>	2026-05-28 09:29:30 -04:00
github-actions[bot]	570b193ed7	refactor(site): update BYOK link to use "View docs" on AI settings page (#25743 ) (#25764 ) Cherry-pick of https://github.com/coder/coder/pull/25743 Original PR: #25743 — refactor(site): update BYOK link to use "View docs" on AI settings page Merge commit: `cfa343e456` Requested by: @dannykopping Co-authored-by: TJ <tracy@coder.com>	2026-05-28 09:29:02 -04:00
blinkagent[bot]	75f51532f3	chore: update terraform to v1.15.5 (#25747 ) Cherry-pick of #25746 to `release/2.34`. Bumps bundled Terraform from `1.15.2` to `1.15.5`. Terraform 1.15.5 is built with Go 1.25.10 (vs Go 1.25.8 in 1.15.2), addressing Go stdlib CVEs flagged by security scanners. Files changed: - `.github/actions/setup-tf/action.yaml` - `scripts/Dockerfile.base` - `install.sh` - `flake.nix` (+ updated SRI hash for the linux_amd64 zip) - `mise.toml` - `mise.lock` (+ updated per-platform SHA256 checksums) - `provisioner/terraform/testdata/version.txt` - `provisioner/terraform/testdata/resources/ai-tasks-disabled/ai-tasks-disabled.tfplan.json` Release notes: https://github.com/hashicorp/terraform/releases/tag/v1.15.5 (cherry picked from commit `bcc6cca040` — will be updated to the merged SHA from #25746) Created on behalf of @Shelnutt2 Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>	2026-05-27 16:46:09 -04:00
github-actions[bot]	c457a62d41	ci: trigger CI on release branch creation (#25744 ) (#25752 ) Cherry-pick of https://github.com/coder/coder/pull/25744 Original PR: #25744 — ci: trigger CI on release branch creation Merge commit: `5991a2c8b0` Requested by: @f0ssel Co-authored-by: Garrett Delfosse <garrett@coder.com>	2026-05-27 14:47:49 -04:00
Ethan	f422ac89cc	ci: extract go-test-failure-report composite action (#25670 ) The Go test jobs in `ci.yaml` each had ~30 lines of inline shell that wrapped `gotestsum` with a PATH shim to capture JSON, then ran `gotestsummary` and `upload-artifact` to publish a failure report. Three jobs carried three near-identical copies. This change replaces the three inline blocks with a single composite action at `.github/actions/go-test-failure-report/` that runs the same `gotestsummary` invocation, writes the same markdown to `GITHUB_STEP_SUMMARY`, and uploads the same NDJSON artifact. The PATH shim is gone; gotestsum's native `GOTESTSUM_JSONFILE` env variable is used instead, plumbed through the `test-go-pg` composite. `test-go-pg` gains three optional inputs: - `gotestsum-json-file` — explicit JSON file path (or `default` for `${RUNNER_TEMP}/go-test.json`) - `run-regex` — passed to `go test -run` - `test-shuffle` — passed to `go test -shuffle` All three have safe defaults so existing callers are unaffected. No observable change in CI behavior: the three existing test-go-pg jobs continue to emit the same JSON, render the same failure summary, and upload the same artifact. Stacked under #25667, which uses the new composite and inputs to power a new flake-detector workflow.	2026-05-28 00:16:46 +10:00
Danny Kopping	2770bdc9d1	feat: route extra ai_provider_types through OpenAI and Anthropic providers (#25722) _Disclosure: produced with Claude Opus 4.7_ AI Gateway only supports Anthropic (+Bedrock), OpenAI, and Copilot providers at present. All other types (Vercel, Gemini, etc.) will be mapped to OpenAI since they support OpenAI-compatible endpoints.	2026-05-27 16:16:05 +02:00
Spike Curtis	6f06ace949	chore: export MsgQueue from pubsub package (#25707) Makes `MsgQueue` exported, so it can be used in pubsub implementations outside PGPubsub.	2026-05-27 10:11:51 -04:00
Danielle Maywood	d1e27889eb	fix(site): improve chat sharing mobile layout (#25687 )	2026-05-27 15:03:29 +01:00
Danielle Maywood	5603be19cc	feat(site): add transcript tool icons (#25724 )	2026-05-27 14:43:14 +01:00
Nick Vigilante	ecaf5e022b	docs: fix broken references and add users oidc-claims to manifest (#25706 ) ## Summary Three small docs fixes: - `docs/admin/integrations/oauth2-provider.md`: Replace broken relative link to `scripts/oauth2/README.md` with an absolute GitHub URL. The previous link escaped the `docs/` tree (`../../../scripts/oauth2/README.md`) and does not resolve in the published docs site. - `docs/install/releases/feature-stages.md`: Point the "Coder documentation" link to `docs/about/contributing/documentation.md`. The previous `../../README.md` target does not exist under `docs/`. - `docs/manifest.json`: Add the missing `users oidc-claims` entry alongside the other `users` CLI subcommands so the generated reference page (`docs/reference/cli/users_oidc-claims.md`) is reachable from the sidebar. ## Validation - Confirmed each new link target exists on `main` (`docs/about/contributing/documentation.md`, `scripts/oauth2/README.md`, `docs/reference/cli/users_oidc-claims.md`). - Pre-commit hooks pass (`fmt/markdown`, `lint/markdown`, `lint/emdash`, `lint/typos`, etc.). --- _This PR was prepared by a [Coder Agents](https://coder.com/) session on behalf of @nickvigilante. Human review requested since this is a docs-only change._	2026-05-27 09:29:16 -04:00
Cian Johnston	0c27224fc2	fix(coderd): pass title API key context (#25723 ) Fixes CODAGT-503 - Add failing-first coverage for manual title generation with missing message `api_key_id`, with both context fallback and fail-closed cases. - Set `aibridge.WithDelegatedAPIKeyID(ctx, apiKey.ID)` in `regenerateChatTitle` and `proposeChatTitle`. - In `generateManualTitleCandidate`, fall back to `aibridge.DelegatedAPIKeyIDFromContext(ctx)` only when `modelBuildOptionsFromMessages` yields an empty `ActiveAPIKeyID`. - Keep `modelBuildOptionsFromMessages` pure and leave automatic title generation unchanged.	2026-05-27 13:20:36 +01:00
Danny Kopping	10f37db35d	fix(coderd/x/chatd/chatprovider): keep gateway model prefix in ResolveModelWithProviderHint (#25725 ) For `vercel`, `openrouter`, and `openai-compat`, the `<provider>/<model>` slash is part of the upstream model ID rather than a hint. `ResolveModelWithProviderHint` was running `parseCanonicalModelRef` before honoring `providerHint`, so a config like `(provider=vercel, model=anthropic/claude-4-5-sonnet)` resolved to `provider=anthropic, model=claude-4-5-sonnet` and the prefix-less model name was forwarded to Vercel, which returned `Model 'claude-4-5-sonnet' not found`. Honor an explicit gateway provider hint before attempting canonical-ref parsing. Non-gateway hints (anthropic, openai, etc.) keep the existing canonical-ref-first behavior so `anthropic/claude-...` still has its prefix stripped when routed directly to Anthropic. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 11:13:39 +00:00
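The resolution order described in the commit above can be sketched as follows. This is a minimal illustration, not the code in `coderd/x/chatd/chatprovider`: `resolveModel` and `gatewayProviders` are hypothetical names, and splitting on the first slash is an assumed simplification of what `parseCanonicalModelRef` does.

```go
package main

import (
	"fmt"
	"strings"
)

// gatewayProviders lists the providers named in the commit message for which
// the "<provider>/<model>" prefix is part of the upstream model ID.
var gatewayProviders = map[string]bool{
	"vercel":        true,
	"openrouter":    true,
	"openai-compat": true,
}

// resolveModel is a hypothetical stand-in for ResolveModelWithProviderHint.
// It returns the provider to route to and the model name to forward.
func resolveModel(providerHint, model string) (string, string) {
	// An explicit gateway hint wins: forward the model ID untouched.
	if gatewayProviders[providerHint] {
		return providerHint, model
	}
	// Otherwise try the canonical "<provider>/<model>" form first
	// (assumed here to be a plain split on the first slash).
	if provider, name, ok := strings.Cut(model, "/"); ok && provider != "" && name != "" {
		return provider, name
	}
	// No canonical prefix: fall back to the hint.
	return providerHint, model
}

func main() {
	fmt.Println(resolveModel("vercel", "anthropic/claude-4-5-sonnet"))
	fmt.Println(resolveModel("anthropic", "anthropic/claude-4-5-sonnet"))
}
```

With a gateway hint the prefixed model ID is forwarded unchanged; with a direct provider hint the prefix is stripped, matching the behavior the commit message describes.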
Max Schwenk	ae492495ee	fix(cli): show ready sync start dependencies (#25546 ) ## Problem Follow-on to: - https://github.com/coder/coder/pull/25089 `coder exp sync start` still printed a generic success message when the unit was ready on the first status check. That hid whether the unit had no dependencies or had dependencies that were already satisfied before `sync start` ran. Before: ```text Success ``` ## Solution Print explicit startup output for both ready-at-first-check cases. After, dependencies already satisfied: ```text Unit "test-unit" started immediately, dependencies already satisfied: [dep-unit, dep-unit-2] ``` After, no dependencies: ```text Unit "test-unit" started with no dependencies ``` The existing waiting path is unchanged and still reports the dependencies while waiting and after waiting finishes. Co-authored-by: Sas Swart <sas.swart.cdk@gmail.com>	2026-05-27 12:33:39 +02:00
Danny Kopping	79e007cf30	feat: hot-reload aibridged and aibridgeproxyd providers on DB changes (#25673 ) Previously the in-process aibridge daemon and the enterprise aibridgeproxy daemon both snapshotted their provider routing once at boot. Any `ai_providers` or `ai_provider_keys` mutation required a restart for either to pick it up. Add an `ai_providers_changed` pubsub channel that the CRUD handlers publish on after Create / Update / Delete. Both daemons subscribe: - aibridged rebuilds its `[]aibridge.Provider` snapshot via `BuildProviders` and swaps it into the pool atomically. Inflight requests keep serving against the bridge they already acquired; new acquires build against the new snapshot. Per-provider construction errors stay scoped to the offending row. - aibridgeproxyd rebuilds its routing snapshot from `GetAIProviders` and swaps the host→provider map atomically. The MITM listener picks up new providers without restart. DB read for aibridgeproxyd uses the existing `AsAIProviderMetadataReader` subject for routing-only access.	2026-05-27 11:58:43 +02:00
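The atomic snapshot swap described in the commit above might look roughly like this. It is a sketch under assumptions: `router`, `snapshot`, `reload`, and `acquire` are hypothetical names, the pubsub subscription and database read are omitted, and `atomic.Pointer` is used as one plausible way to swap a routing table without locking readers.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// snapshot is an immutable host -> provider routing table.
type snapshot struct {
	hostToProvider map[string]string
}

// router holds the current snapshot behind an atomic pointer.
type router struct {
	current atomic.Pointer[snapshot]
}

// reload copies the given rows into a fresh snapshot and swaps it in
// atomically. In the real daemons this would run after a message arrives on
// the ai_providers_changed channel; here the rows are passed in directly.
func (r *router) reload(rows map[string]string) {
	next := &snapshot{hostToProvider: make(map[string]string, len(rows))}
	for host, provider := range rows {
		next.hostToProvider[host] = provider
	}
	r.current.Store(next)
}

// acquire returns the snapshot a request should use for its whole lifetime.
func (r *router) acquire() *snapshot {
	return r.current.Load()
}

// demo shows that a request holding an old snapshot is unaffected by a reload.
func demo() (inflight, fresh string) {
	r := &router{}
	r.reload(map[string]string{"api.example.com": "provider-a"})
	held := r.acquire() // simulates an inflight request
	r.reload(map[string]string{"api.example.com": "provider-b"})
	return held.hostToProvider["api.example.com"], r.acquire().hostToProvider["api.example.com"]
}

func main() {
	inflight, fresh := demo()
	fmt.Println(inflight, fresh)
}
```

Because each snapshot is never mutated after it is stored, readers need no lock, and a failed rebuild can simply skip the `Store` call and leave the previous table in place.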
Cian Johnston	6acfe6c835	fix: classify quota errors as usage_limit instead of auth (#25676 ) Fixes CODAGT-484. - Removed "quota", "billing", "insufficient_quota", "payment required" from `authStrongPatterns` - Added `usageLimitPatterns` slice with those patterns - Added `usageLimitMatch` signal and rule between overloaded and authStrong in priority - Added terminal/retry messages for `ChatErrorKindUsageLimit` - Simplified auth message (removed billing reference) - Frontend: conditional `!usageLimitStatus.provider` guard on the "View Usage" Alert - Added `TestClassify_UsageLimitBeatsAuth` with 5 cases including real production OpenAI error - Added `ProviderQuotaExceeded` story asserting no "View Usage" link and correct `ChatStatusCallout` rendering > Generated with [Coder Agents](https://coder.com/agents)	2026-05-27 09:45:36 +01:00
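The priority change in the commit above can be illustrated with a small classifier. This is a sketch, not the `chaterror` package: the usage-limit patterns come from the commit message, while the overloaded and auth patterns, the returned kind strings, and the `classify` helper are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Patterns moved out of the auth list, as listed in the commit message.
var usageLimitPatterns = []string{"quota", "billing", "insufficient_quota", "payment required"}

// Placeholder patterns for the neighbouring rules (assumptions).
var overloadedPatterns = []string{"overloaded"}
var authStrongPatterns = []string{"invalid api key", "unauthorized"}

// matchesAny reports whether msg contains any of the patterns.
func matchesAny(msg string, patterns []string) bool {
	for _, p := range patterns {
		if strings.Contains(msg, p) {
			return true
		}
	}
	return false
}

// classify checks rules in priority order: overloaded, then usage limit,
// then auth. The first matching rule wins.
func classify(msg string) string {
	lower := strings.ToLower(msg)
	switch {
	case matchesAny(lower, overloadedPatterns):
		return "overloaded"
	case matchesAny(lower, usageLimitPatterns):
		return "usage_limit"
	case matchesAny(lower, authStrongPatterns):
		return "auth"
	default:
		return "generic"
	}
}

func main() {
	fmt.Println(classify("401 Unauthorized: insufficient_quota"))
	fmt.Println(classify("401 Unauthorized: invalid API key"))
}
```

An error that mentions both an auth failure and a quota problem is reported as a usage limit because that rule is evaluated first.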
Thomas Kosiewski	e32be68687	fix(dogfood/coder): verify Homebrew installer (#25721 )	2026-05-27 10:45:21 +02:00
Jake Howell	9c10ec2ca7	fix: resolve mui `<TimelineDateRow />` regression (#25716 )	2026-05-27 18:36:55 +10:00
Thomas Kosiewski	bfa17c315e	fix(dogfood/coder): persist mise user installs (#25720 )	2026-05-27 09:54:09 +02:00
Ethan	e91bec8574	fix(cli): close aibridge daemon before WebSocket shutdown wait (#25719 ) > [!WARNING] > The investigation and solution in this PR were done with [Mux](https://mux.coder.com/). I've reviewed the investigation methodology, evidence and solution, and it all appears sound. ## Summary PR #25570 (`refactor: move aibridged out of enterprise to AGPL`, merged 2026-05-22) added an in-memory aibridge DRPC server in `coderd/aibridged.go` that does `api.WebsocketWaitGroup.Add(1)` and only releases `Done()` when its client session is closed. PR #25575 then flipped `CODER_AI_GATEWAY_ENABLED` to default to `true`, so every `cli.Server()` invocation now spins up that goroutine. In `cli/server.go`, the only call to `aibridgeDaemon.Close()` was a `defer` scheduled at function return. During graceful shutdown the code first calls `coderAPICloser.Close()`, which waits on `api.WebsocketWaitGroup`. That wait sits for the full 10s timeout in `coderd/coderd.go` (`websocket shutdown timed out after 10 seconds`), then returns, then the function unwinds, and only then does the deferred `aibridgeDaemon.Close()` fire and let the goroutine call `Done()`. The 10s tax was previously latent (aibridged was enterprise-only and opt-in). After the two May 22 PRs it hit every `cli.Server()` test. On Linux/macOS CI it just makes the suite slower; on the Depot Windows runner, the ramdisk reservation leaves only ~17 GiB of headroom and the ~10s shutdown tails of multiple concurrent package binaries overlap into an OOM, presenting as `test-go-pg (windows-2022)` jobs that die silently at the ~600s watchdog with an empty `steps` array. See Slack: https://codercom.slack.com/archives/C05AE94121Z/p1779807717764189 ## Fix Close `aibridgeDaemon` explicitly during graceful shutdown, before `coderAPICloser.Close()` waits on the WebSocket wait group. This matches the existing ordered-shutdown pattern used for `tunnel` and `notificationsManager`. 
The deferred `aibridgeDaemon.Close()` is retained as a safety net for early-return paths, and is safe to double-call because `aibridged.Server.Close()` is already idempotent via `shutdownOnce` in `coderd/aibridged/aibridged.go`. ## Regression test `TestServer_AIGatewayShutdownOrdering` boots a real `coder server` with `--ai-gateway-enabled=true`, cancels its context, and asserts graceful shutdown finishes in under 8s. With the fix the test runs in ~0.1s; without the fix it fails deterministically at ~10.0s. The flag is passed explicitly so the test continues to guard the ordering even if the deployment default is ever flipped back. ## Evidence this fixes the OOM On Linux the patched `cli` test package drops from 114 s back to its pre-regression 30 s wall time at the same single-process peak RSS (~7.6 GiB), and the `websocket shutdown timed out after 10 seconds` log line disappears from every server-test run. Since the Windows OOM is the sum of multiple concurrent 10 s shutdown tails overlapping past the runner's ~17 GiB headroom, removing those tails returns the concurrent-RSS budget to its pre-regression level. The Windows OOM was intermittent (a handful of hits across many runs since May 22), so a single green `test-go-pg (windows-2022)` job on this PR is not by itself proof; confirmation will come from watching Windows runs on `main` over the next several days and seeing the ~600 s silent-kill fingerprint stop recurring. Relates to ENG-2771	2026-05-27 17:33:14 +10:00
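The shutdown-ordering problem in the commit above can be reproduced in miniature. The sketch below is an assumption-laden model, not the `cli/server.go` code: `daemon`, `waitTimeout`, and `shutdown` are hypothetical names, and a 50ms timeout stands in for the 10s WebSocket wait.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// daemon holds a slot in the wait group until it is closed.
type daemon struct {
	once sync.Once
	done chan struct{}
}

func newDaemon(wg *sync.WaitGroup) *daemon {
	d := &daemon{done: make(chan struct{})}
	wg.Add(1)
	go func() {
		defer wg.Done()
		<-d.done // released only by Close
	}()
	return d
}

// Close is idempotent, so an explicit call plus a deferred call is safe.
func (d *daemon) Close() {
	d.once.Do(func() { close(d.done) })
}

// waitTimeout reports whether the wait group drained before the timeout.
func waitTimeout(wg *sync.WaitGroup, timeout time.Duration) bool {
	drained := make(chan struct{})
	go func() {
		wg.Wait()
		close(drained)
	}()
	select {
	case <-drained:
		return true
	case <-time.After(timeout):
		return false
	}
}

// shutdown models graceful shutdown. With closeFirst set, the daemon is
// closed before the wait; otherwise only the deferred Close runs, which is
// too late to help the wait.
func shutdown(closeFirst bool) bool {
	var wg sync.WaitGroup
	d := newDaemon(&wg)
	defer d.Close() // safety net for early returns
	if closeFirst {
		d.Close()
	}
	return waitTimeout(&wg, 50*time.Millisecond)
}

func main() {
	fmt.Println("close before wait drained:", shutdown(true))
	fmt.Println("deferred close only drained:", shutdown(false))
}
```

The deferred `Close` stays in place for early-return paths, and `sync.Once` makes the second call a no-op, mirroring the idempotency the commit message relies on.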
TJ	916094c71c	feat(site): replace usage bars with ring indicators (#25708 ) Replaces the linear progress bars and text labels in the sidebar footer usage trigger with SVG donut ring charts that show the section icon centered inside each ring. ## Changes - `SvgRingProgress`: shared SVG component used by both `UsageIndicator` and `ContextUsageIndicator` - Ring colors follow the existing severity system (normal/warning/exceeded) - Hover tooltips show "Spend $12.50" and "Workspaces 30/100" - Dropdown menu content unchanged; full usage details still appear on click - Removed dead `summaryValue` field and `size="compact"` variant - Updated stories to cover ring trigger rendering and dropdown usage details > Generated by Coder Agents on behalf of @tracyjohnsonux	2026-05-26 22:01:31 -07:00
TJ	2afb33ac5e	feat(site/src/pages/AgentsPage): inline setup notice banner with admin/member distinction (#25518) Replaces the blocking Dialog modal setup notice with a context-aware inline banner above the chat input, with different messaging for admins and members. ## Inline notice banner The `AgentSetupNotice` component now renders as a `bg-surface-tertiary` inline box instead of an unclosable `Dialog` modal. The notice sits above the chat composer using negative margin overlap, and the composer is forced opaque (`bg-surface-secondary`) when the notice is present so the banner doesn't bleed through the semi-transparent desktop background. Four states based on role and configuration: - Admin, no providers or models: links to both provider and model setup - Admin, missing provider only: link to provider setup - Admin, has providers but no models: link to model setup only - Member, no models available: generic "your admin is still getting things set up" message The admin/member distinction is determined via `permissions.editDeploymentConfig` and applied in both `AgentChatPage` and `AgentCreatePage`. ## Conflict resolution notes During merge with main, the following were adapted: - Sidebar filter props updated to main's `sidebarFilters`/`onSidebarFiltersChange` pattern (replacing old `archivedFilter`) - Accepted `Sidebar/` -> `ChatsSidebar/` directory refactor from main - Dropped `hasArchivedChats` query (its sidebar consumer was removed in the refactor) - Provider link updated to `/ai/settings` (new AI settings page) > Generated with the assistance of Coder Agents on behalf of @tracyjohnsonux --------- Co-authored-by: jaaydenh <jaaydenh@users.noreply.github.com>	2026-05-26 21:00:53 -07:00
Ethan	e99f7171e4	ci: require docs lint when docs change (#25608 ) Move docs linting into the required CI umbrella and reuse the existing `changes` job so docs lint runs when docs or CI files change, plus on `main` as a backstop. This is motivated by the docs lint failures on #25601. That PR touched `.claude/docs/TESTING.md`; the standalone `Docs CI` workflow picked it up because `docs-ci.yaml` used broad `.md` matching, but local `pnpm lint-docs` and `make lint` did not catch the same file because they only scanned `docs/` plus root `.md`. The first failed Docs CI run reported markdownlint errors in `.claude/docs/TESTING.md` (`MD040` and `MD031`), and the next run reported a markdown table formatter failure in the same file. That mismatch is why this PR exists: prevent unrelated PRs from being surprised by stale `.claude/docs/` lint drift only after they happen to touch one of those files. The local docs scripts now include `.claude/docs/*`, and the old standalone `Docs CI` workflow is removed so we do not maintain separate path-filter logic outside the required CI workflow. > Generated by mux, but reviewed by a human	2026-05-27 12:30:05 +10:00
Zach	20b50dd4b8	docs: mark user secrets as beta (#25704 ) Update the user secrets user guide, the admin security secrets reference, and the docs manifest to label the feature as Beta instead of Early Access, and link to the beta section of the feature stages doc.	2026-05-26 15:22:17 -06:00
Zach	47ac4b309a	feat: enforce per-user limits on user_secrets (#25588 ) Add a Postgres trigger and matching codersdk constants that cap each user's secrets in four dimensions: count (50), total stored value bytes (200 KiB), env-injected stored value bytes (24 KiB), and env name length (256 bytes). Without these caps a user could overflow the 4 MiB DRPC agent manifest, the ~32 KiB Windows process env block, or Linux/macOS ARG_MAX at workspace start. The trigger is the source of truth on aggregates; the handler maps its check_violation error into a 400 that names the per-user budget in stored (post-encryption) bytes. A handler test exercises off-by-one at each cap across POST and PATCH, plus per-user budget isolation. Generated with help from Coder Agents.	2026-05-26 14:42:31 -06:00
Cian Johnston	d3155e1cab	test(enterprise/cli): add test to prove fix for #25699 (#25701 ) Adds an end-to-end enterprise CLI test to ensure legacy AI provider keys seeded at server startup are encrypted at rest when DBCrypt external token encryption is enabled, preventing regressions related to #25699. > Partially implemented by Coder Agents, and massaged afterwards by me.	2026-05-26 20:08:07 +00:00
Kyle Carberry	58f6b9c4d0	fix(coderd/externalauth): retry transient refresh failures with backoff (#25686 ) ## Summary Wraps external auth token refresh in an exponential-backoff retry so a brief upstream hiccup (5xx, network timeout, rate-limited 429) no longer surfaces as an `InvalidTokenError` and forces users to re-authenticate. GitHub in particular has been flaky enough lately that this is hitting real users. ## Behavior - `(Config).RefreshToken` now calls a helper that retries the `TokenSource.Token()` exchange with exponential backoff (250ms → 2s), bounded by a 10s total budget. - Errors classified as permanent by `isFailedRefresh` (e.g. `bad_refresh_token`, `invalid_grant`, `unauthorized_client`, ...) skip the retry loop. Retrying a permanent failure wastes the refresh quota and, on providers with single-use refresh tokens, can mask a legitimate concurrent winner with repeated `bad_refresh_token` responses. - Refreshes with an empty refresh token still short-circuit without making an API call. - The existing concurrent-refresh-race detection and optimistic-lock paths are unchanged. ## Tunables Three new `time.Duration` fields on `externalauth.Config` (`RefreshRetryInitialBackoff`, `RefreshRetryMaxBackoff`, `RefreshRetryTimeout`) let callers override the defaults. They default to zero, which falls back to the package defaults, so existing call sites are unaffected. The fields exist primarily so tests can dial the timing way down without touching package globals (and therefore without serializing parallel tests). ## Tests - `TestRefreshToken/RefreshRetries` now disables internal retries via `RefreshRetryTimeout = time.Nanosecond` so its existing "1 IDP call per `RefreshToken` invocation" assertion still holds. Otherwise its assertions are unchanged. - New `TestRefreshToken/RefreshTokenWithBackoff` simulates 3 transient 5xx failures followed by success and verifies the refresh ultimately succeeds with 4 total IDP attempts. 
- New `TestRefreshToken/RefreshTokenBackoffPermanentError` returns `bad_refresh_token` and verifies the refresh is not retried even with a generous 1s budget. <details> <summary>Why the explicit <code>retryCtx.Err()</code> guard?</summary> `retry.Retrier.Wait` `select`s between `time.After(delay)` and `ctx.Done()`. The first call has `delay == 0`, so `time.After(0)` and an already-cancelled context both fire immediately and Go picks the case nondeterministically. Without the guard, a near-zero retry budget would still trigger an unwanted extra refresh attempt roughly half the time, which would have made the `RefreshRetries` test flaky. </details> This PR was opened by a Coder agent on behalf of @kylecarbs.	2026-05-26 15:35:22 -04:00
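The retry behavior described in the commit above can be sketched as a bounded exponential backoff that skips permanent errors. This is an illustration only: `retryWithBackoff`, `simulate`, and the two sentinel errors are hypothetical, the real code classifies errors with `isFailedRefresh`, and the timings are scaled down from the 250ms to 2s backoff and 10s budget named in the message.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Sentinel errors for the example (assumptions, not the package's errors).
var errTransient = errors.New("upstream 503")
var errPermanent = errors.New("bad_refresh_token")

// retryWithBackoff calls fn until it succeeds, returns a permanent error, or
// the context budget runs out. The delay doubles up to maxDelay.
func retryWithBackoff(ctx context.Context, initial, maxDelay time.Duration, fn func() error) error {
	delay := initial
	for {
		err := fn()
		if err == nil || errors.Is(err, errPermanent) {
			return err // success, or a failure that retrying cannot fix
		}
		select {
		case <-ctx.Done():
			return err // budget exhausted: surface the last error
		case <-time.After(delay):
		}
		delay *= 2
		if delay > maxDelay {
			delay = maxDelay
		}
	}
}

// simulate fails transiently the given number of times, then returns final.
// It reports how many attempts were made.
func simulate(transientFailures int, final error) (attempts int, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err = retryWithBackoff(ctx, time.Millisecond, 4*time.Millisecond, func() error {
		attempts++
		if attempts <= transientFailures {
			return errTransient
		}
		return final
	})
	return attempts, err
}

func main() {
	fmt.Println(simulate(3, nil))          // three transient failures, then success
	fmt.Println(simulate(0, errPermanent)) // permanent error on the first attempt
}
```

Passing the timings as parameters, rather than reading package globals, is what lets tests shrink them without serializing parallel runs, which is the motivation the commit gives for its new `Config` fields.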
Michael Suchacz	8b1705eb65	feat: route chatd provider traffic through aibridge (#25629 ) ## Summary Routes chatd model calls backed by concrete AI Provider rows through the in-process aibridge transport by default, with deployment options to use direct provider routing when AI Gateway is disabled or chat AI Gateway routing is disabled. - Splits model routing into common, direct provider, and AI Gateway paths behind a single deployment-mode entry point. - Builds chatd models through explicit request, route, and options data. Active API key attribution is passed explicitly instead of being hidden inside generic model construction. - For AI Gateway BYOK routes, resolves the user's provider key in chatd, forwards it through provider-specific auth headers, and sets `X-Coder-AI-Governance-Token` to the `delegated` marker so aibridge preserves those headers while still stripping Coder-specific metadata. - Keeps central provider credentials and deployment fallback credentials out of forwarded provider auth headers, so AI Gateway central policy remains authoritative. - Redacts delegated provider auth from default string formatting to avoid accidental plaintext logging of user BYOK credentials. - Covers selected chat models, advisor overrides, title and quickgen paths, subagent overrides, computer use model selection, and an integration-style chat turn through the aibridge transport path. - Persists initiating API key IDs on chat and queued user messages, including subagent child messages, and fails closed for AI Gateway-routed model builds without an active key. - Removes unused `api_key_id` indexes while keeping the persistence columns and foreign keys. - Keeps the deployment option available through config and env parsing, but hides it from CLI help and generated docs. - Stabilizes the subagent poll fallback test so background CreateChat processing cannot win the state transition under slower CI environments. 
## Tests - `go test ./coderd/x/chatd -run 'TestAIGatewayProviderAuthForUser\|TestAIGatewayProviderAuthRedactsFormatting\|TestResolveModelRouteForConfigAIGatewayProviderAuth\|TestAIGatewayModelForwardsProviderAuth\|TestProcessChat_AIGatewayRoutingUsesDelegatedAPIKey\|TestAwaitSubagentCompletion' -count=1` - `go test ./coderd/aibridged -run 'TestServeHTTP_DelegatedAPIKey\|TestServeHTTP_StripCoderToken' -count=1` - `git diff --check HEAD~1..HEAD` - `make lint` > Mux working on behalf of Mike.	2026-05-26 19:31:52 +00:00
Danny Kopping	a56c88a0cc	fix: run AI provider seed and build after newAPI so dbcrypt applies (#25699 ) ## Problem Two related symptoms of the same architectural issue: the `dbcrypt` wrapper is installed inside `enterprise/coderd.New`, so any access to `options.Database` that happens before `newAPI` runs bypasses encryption. Symptom 1 (reads): Provider keys added via the admin UI are encrypted at rest. `BuildProviders` was running before `newAPI`, against the unwrapped store, so the ciphertext was read as-is and shoved into the keypool as the upstream credential. Anthropic/OpenAI reject it, and the interception log shows: ``` coderd.aibridged.pool: interception failed ... error="all configured keys failed authentication" credential_kind=centralized credential_hint=PaPb...4A== credential_length=184 ``` Symptom 2 (writes): `SeedAIProvidersFromEnv` was also running before `newAPI`, against the unwrapped store, so env-derived keys (`CODER_AIBRIDGE_OPENAI_KEY`, indexed `CODER_AIBRIDGE_PROVIDER_<N>_KEY`, etc.) landed in `ai_provider_keys` as plaintext with `ApiKeyKeyID = null` even when `CODER_EXTERNAL_TOKEN_ENCRYPTION_KEYS` was set. ## Fix Move both `SeedAIProvidersFromEnv` and `BuildProviders` to after `newAPI`, where `options.Database` is the dbcrypt-wrapped store. Writes encrypt correctly; reads decrypt correctly. The enterprise closure (`enterprise/cli/server.go`) runs inside `newAPI` and calls `BuildProviders` for the aibridgeproxyd at that point. Once the agpl seed moves to after `newAPI`, the proxy on first boot would see no env-seeded providers. Add a matching seed call inside the enterprise closure before its `BuildProviders` to cover that case. Seeding is idempotent, so the agpl-side seed running again post-`newAPI` is a no-op when the rows already exist. ## Known shortcomings The clean version of this fix would just inherit `ctx` like every other startup step and place these calls naturally. 
It can't, for two reasons that are both about the surrounding handler architecture rather than this change: 1. `dbcrypt` wrapping is positioned inside `newAPI`, not around `options.Database` at creation. That's why both seed and build have to wait until after `newAPI` in the first place. The principled fix is to install the wrapper at the point the store is created (behind a hook the enterprise build supplies), so every consumer sees a single authoritative view and the ordering stops mattering. This would also collapse the duplicated seed call back to a single site. 2. The handler's shutdown sequence is not deferred. `coderAPICloser.Close()` and the other teardown steps run only if control reaches the `select` at the bottom of the handler. An early `return` from anywhere in Phase 1 (e.g. seed/build returning `context.Canceled` when the user hits ctrl-c during startup) skips that block and orphans all the goroutines `newAPI` spawned — tailnet workers, gitsync, telemetry batcher, etc. `goleak` then catches them at package teardown and `TestServer_TelemetryDisabled_FinalReport` fails. Moving the shutdown into deferred closers (with a `sync.Once`-guarded close to avoid double-close from the explicit Phase 2 call) is the principled fix. For this PR I took the smallest change that fixes the reported bugs: a detached context (`context.WithoutCancel(ctx)` + a 30s timeout) at the seed and build call sites in both the agpl and enterprise paths. It lets the calls complete even if the user cancels during startup, after which the handler reaches its shutdown select naturally and tears down through Phase 2. Both shortcomings above are worth addressing separately. ## Test plan - `make test RUN=TestServer_TelemetryDisabled_FinalReport` with `-race`; passes locally with `-count=3`. 
- Manually verified on a deployment with `CODER_EXTERNAL_TOKEN_ENCRYPTION_KEYS` set and env-configured providers: `ai_provider_keys.api_key_key_id` is populated, `api_key` is base64 ciphertext, and upstream auth succeeds. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 21:27:02 +02:00
blinkagent[bot]	dd741bd188	fix(site): only highlight Providers item on exact match in AI settings sidebar (#25700 ) ## Problem When visiting `/ai/settings/governance`, both AI Governance and Providers items in the AI settings subnav appear highlighted as active. ## Cause `SettingsSidebarNavItem` is built on react-router's `<NavLink>`, which by default treats a link as active when the current URL starts with the link's `to` path. Since `/ai/settings/governance` starts with `/ai/settings`, the Providers item is also marked active. ## Fix Pass `end` on the Providers nav item so it only matches when the path is exactly `/ai/settings` (the index route). The `SettingsSidebarNavItem` component already supports this prop for exactly this case. Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>	2026-05-26 19:23:13 +00:00
TJ	be184a0591	fix(site): update providers description with BYOK docs link (#25680 ) > 🤖 Generated with [Coder Agents](https://coder.com/agents) on behalf of @tracyjohnsonux Updates the providers page description to explain that providers power Coder Agents, AI Gateway, and other LLM features. Adds a "Manage deployment-wide BYOK" link to the docs. Uses `<Link>` component and `docs()` helper per project conventions.	2026-05-26 12:03:29 -07:00


14542 Commits