coder

mirror of https://github.com/coder/coder.git synced 2026-06-02 20:48:20 +00:00

Author	SHA1	Message	Date
Jon Ayers	bb11946bd4	fix: require update permission to recreate devcontainers (#25812) - The httpmw upstream from this endpoint only checks for read perms to the workspace agent. Recreating a dev container should require `update` perms since it mutates state. This also matches the behavior of the `DELETE` endpoint.	2026-05-28 15:34:36 -05:00
Jon Ayers	c248dfb437	fix: escape agent log HTML (#25808)	2026-05-28 14:43:07 -05:00
Cian Johnston	7ea0eff94e	fix: improve chat audit log descriptions and diff rendering (#25728 ) Chat ACL audit diffs rendered as `[object Object]` because the diff viewer called `.toString()` on object values. Common chat operations (archive, share) showed generic "updated chat" descriptions instead of semantic ones. Add `chatAuditLogDescription` to derive semantic descriptions from the audit diff for successful chat writes: "archived/unarchived chat" for archive toggles, "updated sharing for chat" for ACL-only changes. Extract diff value formatting into `formatAuditDiffValue`, which renders object values as deterministic compact JSON with sorted keys, fixing the `[object Object]` rendering for chat ACLs and any other object-valued fields. The previous `determineIdPSyncMappingDiff` workaround for IdP sync mappings was removed because the generic formatting handles it. Closes CODAGT-513 > Generated by Coder Agents on behalf of @johnstcn	2026-05-28 18:37:57 +01:00
TJ	ebf56ebd12	feat(site): desktop panel toolbar, zoom modes, and pop-out window (#25585 ) Redesigns the agent desktop panel with a persistent toolbar, zoom modes, and a detachable pop-out window. ## Changes Toolbar (`DesktopToolbar`) - Persistent top bar with right-aligned controls: Take/Release control, Zoom toggle, Detach - All buttons use consistent `subtle` variant with icon + label - `h-8` height, `bg-surface-primary` background with bottom border Zoom modes - Defaults to fit-to-window (`scaleViewport = true`) so the full 1920x1080 desktop is visible - Toggle to native 100% resolution via toolbar button or keyboard shortcuts (`Ctrl+0` fit, `Ctrl+1` native) - noVNC background color overridden from hardcoded `rgb(40,40,40)` to `--surface-secondary` so letterbox margins match the app theme in light and dark mode Pop-out window - New route at `/agents/:agentId/desktop` for a dedicated desktop window - Opens via toolbar "Detach" button at 50% of screen size, centered - BroadcastChannel coordination: sidebar shows placeholder with "Bring back" button - Closing the pop-out window automatically restores the sidebar panel Other - `useDesktopConnection` hook accepts `scaleViewport` option, synced to the RFB instance via a secondary effect - `DesktopPanelContext` extended with `agent` and `workspace` fields - Replaces the previous hover-overlay take/release control UX with toolbar buttons > Generated by Coder Agents on behalf of @tracyjohnsonux	2026-05-28 09:26:05 -07:00
DevCats	094fe971ad	chore(aibridge): add AWS PRM user-agent attribution for Bedrock calls (#25221 ) Adds middleware in `withAWSBedrockOptions` that appends the AWS Partner Revenue Measurement (PRM) attribution string to the User-Agent header on every Bedrock API call made through AI Bridge. This is the AI Bridge counterpart to the Terraform provisioner change merged in #23138. Together, they ensure all AWS API calls made by Coder (both workspace infrastructure via Terraform and LLM inference via Bedrock) include PRM attribution. ## How it works - A middleware is added before `bedrock.WithConfig(awsCfg)` that reads the existing `User-Agent` header and appends `sdk-ua-app-id/APN_1.1%2Fpc_cdfmjwn8i6u8l9fwz8h82e4w3%24` - Only affects Bedrock calls; OpenAI and direct Anthropic API calls are unaffected - Uses `option.WithMiddleware` rather than `option.WithHeader` because the existing User-Agent (set by the Anthropic SDK) must be preserved and appended to, not replaced ## Tests - Positive: `TestAWSBedrockIntegration` verifies PRM attribution is present in the User-Agent on Bedrock requests - Negative: `TestAnthropicMessages` verifies PRM attribution is absent on non-Bedrock requests ## References - Companion Terraform provisioner PR: #23138 (merged) - Backport: #24052 (merged) - Preserve existing `AWS_SDK_UA_APP_ID`: #24606 (open) - Original `coder/aibridge` PR: https://github.com/coder/aibridge/pull/224 (superseded by this PR since aibridge was moved into coder/coder via #24190) - [AWS SDK Application ID docs](https://docs.aws.amazon.com/sdkref/latest/guide/feature-appid.html) - [AWS PRM Automated User Agent](https://prm.partner.aws.dev/automated-user-agent.html) (partner login required) > Generated with [Coder Agents](https://coder.com/agents) Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-05-28 11:08:00 -05:00
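The append-rather-than-replace behavior described in 094fe971ad can be sketched in Go as follows. This is a minimal illustration only: it uses a plain `net/http` `RoundTripper` as a stand-in for the SDK middleware, and `appendUserAgent`, `userAgentAppender`, and the single-space separator are assumptions, not names or details taken from the commit. Only the attribution string itself comes from the commit message.

```go
package main

import (
	"fmt"
	"net/http"
)

// prmAttribution is the attribution token quoted in the commit message.
const prmAttribution = "sdk-ua-app-id/APN_1.1%2Fpc_cdfmjwn8i6u8l9fwz8h82e4w3%24"

// appendUserAgent keeps the existing User-Agent and appends the suffix.
// Hypothetical helper; joining with a single space is an assumption.
func appendUserAgent(existing, suffix string) string {
	if existing == "" {
		return suffix
	}
	return existing + " " + suffix
}

// userAgentAppender is a hypothetical stand-in for SDK middleware: it
// rewrites the header on a clone of the request and forwards it.
type userAgentAppender struct {
	next   http.RoundTripper
	suffix string
}

func (a userAgentAppender) RoundTrip(req *http.Request) (*http.Response, error) {
	// RoundTrippers must not mutate the caller's request, so clone first.
	clone := req.Clone(req.Context())
	clone.Header.Set("User-Agent", appendUserAgent(req.Header.Get("User-Agent"), a.suffix))
	return a.next.RoundTrip(clone)
}

func main() {
	// The SDK-provided value survives; the attribution is appended after it.
	fmt.Println(appendUserAgent("example-sdk/1.0", prmAttribution))
}
```

Setting the header outright would discard whatever the upstream SDK had already written, which is the reason the commit gives for preferring a middleware over a fixed header option.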
Danielle Maywood	0d1340a430	fix: collapse agent command output by default (#25748)	2026-05-28 16:54:52 +01:00
dependabot[bot]	df929467f6	chore: bump github.com/open-policy-agent/opa from 1.11.0 to 1.17.0 (#25200) Bumps [github.com/open-policy-agent/opa](https://github.com/open-policy-agent/opa) from 1.11.0 to 1.17.0. Release notes, sourced from [github.com/open-policy-agent/opa's releases](https://github.com/open-policy-agent/opa/releases): ## v1.16.2 This release updates the version of Go used to build the OPA binaries and images to 1.26.3; addressing [a number of vulnerabilities](https://groups.google.com/g/golang-announce/c/qcCIEXso47M). ## v1.16.1 This is a patch release addressing a regression in the plugin manager that may cause the service to hang on shutdown ([#8590](https://redirect.github.com/open-policy-agent/opa/pull/8590)). ## v1.16.0 > [!WARNING] > A regression has been found in the plugin manager, which may cause the service to hang on shutdown. Users are advised to go directly to v1.16.1. This release contains a mix of new features, performance improvements, and bugfixes. Notably: - New `uri.parse` and `uri.is_valid` built-in functions - Data API Request/Response Metadata - Prometheus metrics exported via OTLP - Formatter improvements > NOTE: In v1.15.x, OPA was dropping logs for bundle downloads, `print()` calls and other plugin-originated logs. Users are advised to update, v1.16.0 fixes this bug in ([#8544](https://redirect.github.com/open-policy-agent/opa/pull/8544)). ### New `uri.parse` and `uri.is_valid` built-in functions ([#8263](https://redirect.github.com/open-policy-agent/opa/issues/8263)) Two new [built-in functions](https://www.openpolicyagent.org/docs/policy-reference/builtins) have been added: `uri.parse` for parsing a given URI, and `uri.is_valid` for verifying the structure of a given URI. #### uri.parse Parses a URI and returns an object containing its components according to [RFC 3986](https://www.rfc-editor.org/rfc/rfc3986.html). Empty components are omitted. `package example test_uri if { uri.parse("https://example.com:8080/api?q=1#top") == { "scheme": "https", "hostname": "example.com", "port": "8080", "path": "/api", "raw_path": "/api", "raw_query": "q=1", "fragment": "top", } }` ... (truncated) Changelog, sourced from [github.com/open-policy-agent/opa's changelog](https://github.com/open-policy-agent/opa/blob/main/CHANGELOG.md): ## 1.17.0 This release contains a mix of new features, performance improvements, and bugfixes. Notably: - A new `future.keywords.not` import that adds improved semantics to the `not` keyword. - Rule Labels in Decision Logs - Published json schema for IR and bundle manifest - Dropped automaxprocs and x/net dependencies ### Improved Negation Semantics ([#8387](https://redirect.github.com/open-policy-agent/opa/issues/8387)) This OPA release introduces a new [`future.keywords.not` import](https://www.openpolicyagent.org/docs/policy-reference/keywords/not#improved-negation-semantics) that fixes a long-standing semantic issue with negation in Rego. Without the import, the compiler expands a negated composite expression like `not f(g(input.x))` into a series of sub-expressions evaluated *before* the `not`: `__local0__ = input.x`, `g(__local0__, __local1__)`, `not f(__local1__)`. If any sub-expression fails (for example, `input.x` is undefined or `g` produces an undefined result), the entire rule fails rather than the `not` succeeding. This is unintuitive: the user's intent is "the condition does not hold," but an undefined intermediate value causes a silent failure instead of the expected `not` result. With `import future.keywords.not`, composite-expression negation wraps the full compiler expansion in an implicit body: `not { __local0__ = input.x; g(__local0__, __local1__); f(__local1__) }` Now, if *any* sub-expression is undefined or fails, the body is unsatisfiable and the `not` expression succeeds; matching the intuition that "the condition does not hold." > NOTE: Users are recommended to import `future.keywords.not` whenever the `not` keyword is used in a policy. Authored by [`@johanfylling`](https://github.com/johanfylling) ### Rule Labels in Decision Logs ([#2089](https://redirect.github.com/open-policy-agent/opa/issues/2089)) Rule annotations now support a `labels` field. Labels from all successfully evaluated rules are collected and included in each decision log entry as a top-level `rule_labels` ... (truncated) Commits: - [`64a3625`](https://github.com/open-policy-agent/opa/commit/64a3625d33bc6ad8e7c40df03b76ce2fb3ab4d21) Release v1.17.0 ([#8710](https://redirect.github.com/open-policy-agent/opa/issues/8710)) - [`68c9de5`](https://github.com/open-policy-agent/opa/commit/68c9de5da00ea9d631c50327c709d5d7e8844bba) benchmarks: tweak per-PR benchmark regression check based on pr-check - [`7fe3066`](https://github.com/open-policy-agent/opa/commit/7fe3066154b7780eac16c290475f8506573a427f) server: remove dead code (s.partials) ([#8708](https://redirect.github.com/open-policy-agent/opa/issues/8708)) - [`37830be`](https://github.com/open-policy-agent/opa/commit/37830be801a9ce4ec6d23df33f645bb6095f3043) ast,storage/inmem: Add `inmem.NewFromASTObject` and add missing string case t... - [`1661f22`](https://github.com/open-policy-agent/opa/commit/1661f22ba399e94d08d8fb85218580a61779bdc4) ast: add some schema $ref tests - [`3e22f56`](https://github.com/open-policy-agent/opa/commit/3e22f562f1e370973c1b6750eff11d06fe554c70) benchmarks: only run for go changes - [`13aaeab`](https://github.com/open-policy-agent/opa/commit/13aaeabce2221217cb6c175b269475803740fad2) benchmarks: move env vars, remove zizmor-ignore comment - [`93e1708`](https://github.com/open-policy-agent/opa/commit/93e170868ac37f87696adfc2d7f672a0f1814936) benchmarks: fix PR message, skip tests - [`4ce3991`](https://github.com/open-policy-agent/opa/commit/4ce3991901eed5b622a21f2f629029727e192ba7) benchmarks: use go tool machinery, add benchstat - [`41df8df`](https://github.com/open-policy-agent/opa/commit/41df8df4a26d8de7a81bf4c5d78cb94f10a108d5) benchmarks: use benchlab for per-PR feedback - Additional commits viewable in [compare view](https://github.com/open-policy-agent/opa/compare/v1.11.0...v1.17.0) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-28 15:20:56 +00:00
Steven Masley	4591212482	feat: implement SCIM handler for SCIM 2.0 compliance (#25572) Rewrites the SCIM 2.0 user provisioning handler to be RFC 7644 compliant. Verified against an external IdP (Okta). The new behavior is opt-in.	2026-05-28 10:00:37 -05:00
Cian Johnston	6df1536256	fix: add missing_key error kind for missing chat api_key_id (#25783 ) Refs CODAGT-486 - `codersdk/chats.go`: New `ChatErrorKindMissingKey` constant and `AllChatErrorKinds` entry - `coderd/x/chatd/chaterror/message.go`: `terminalMessage` and `retryMessage` cases - `coderd/x/chatd/model_routing_aibridge.go`: Pre-classify error with `WithClassification` - `coderd/x/chatd/model_routing_internal_test.go`: Classification assertion on production path (CRF-2) - `chatStatusHelpers.ts`: Frontend title "Chat interrupted" - `LiveStreamTail.stories.tsx`: Storybook story with `detail` assertion - `docs/ai-coder/ai-gateway/clients/coder-agents.md`: Troubleshooting entry - Tests: classification round-trip, terminal message, metrics kind enumeration > Generated with [Coder Agents](https://coder.com/agents) on behalf of @johnstcn	2026-05-28 15:50:52 +01:00
Nick Vigilante	ea280c5a90	docs(docs/install): strengthen Linux-only requirement on Docker install page (#25742) Closes DOCS-68. Promotes the existing "Linux only" guidance on `docs/install/docker.md` from an easy-to-miss bullet point to a prominent `[!IMPORTANT]` callout, and briefly states why the page is Linux-only so macOS readers do not waste time on the `getent` / `--group-add` snippets. ## Why this re-scope vs. the original ticket The original DOCS-68 scope was "add a macOS `getent` alternative". On inspection, that framing has three problems: 1. The Requirements section already says "A Linux machine. For macOS devices, start Coder using the standalone binary," so macOS users are already redirected. The signal just lives in a bullet that is easy to overlook. 2. The `--group-add $DOCKER_GROUP` mechanism that drives the `getent` call is Linux-specific. macOS Docker runtimes (Docker Desktop, Colima, Rancher Desktop, Podman) use a VM and forward the socket differently; the flag does not translate cleanly to any of them. 3. Defining a canonical macOS Docker path is the scope of [DEVREL-22](https://linear.app/codercom/issue/DEVREL-22) (recommend Colima / Rancher / Podman alternatives in the Quick Start guide). DOCS-68 should not pre-empt that work. This PR narrows the fix to making the existing macOS guidance unmissable. A real macOS Docker install path can come as a separate follow-up once DEVREL-22 lands and the recommended runtime is settled. <details> <summary>Decision log</summary> * (A) Close DOCS-68 as absorbed by DEVREL-22. Rejected: the install page still has a discoverability problem that DEVREL-22 (Quick Start) will not fix. * (B) Re-scope DOCS-68 to a narrow today-fix (this PR). Selected. * (C) Defer DOCS-68 until DEVREL-22 lands. Rejected: the install page is shipping the weaker guidance every day until then. </details> > [!NOTE] > This is a docs-only change. No product code was modified. --- Generated by Coder Agents on behalf of @nickvigilante.	2026-05-28 10:48:53 -04:00
Nick Vigilante	e32fdc813b	ci: rerun docs preview job on subsequent pushes (#25456) Fixes DOCS-174: the docs-preview workflow only fired on `pull_request: opened`. Subsequent pushes left the preview comment stale. ## Changes - Add `synchronize` and `reopened` to trigger types so subsequent pushes retrigger the workflow. - Add a workflow-level `concurrency` group keyed by PR number with `cancel-in-progress: true` so rapid successive pushes don't race the comment-upsert lookup. - Replace always-create comment logic with an upsert: find the existing comment containing `<!-- docs-preview -->` and PATCH it; fall through to create only when none exists or the PATCH itself fails (comment was deleted between find and update). - Filter the upsert lookup to comments authored by `github-actions[bot]` so a human comment containing the marker is never silently overwritten. - Decouple the `gh api` lookup from the `head -n 1` pipe so API failures (network, auth, rate-limit) propagate immediately instead of being swallowed by `|| true`. - Delete the stale preview comment when a `synchronize` push drops all Markdown changes (e.g. a follow-up push that removes the file an earlier push had previewed but still touches `docs/`). The previous preview comment would otherwise point at a deleted page. - Extract the marker and the comment-selector jq into a single `DOCS_PREVIEW_MARKER` variable and a `list_docs_preview_comments` shell function so the stale-cleanup and upsert branches share one source of truth. ## Out of scope Vercel ISR cache invalidation for feature branch previews requires a coder.com change (the `algolia-docs-sync` endpoint only accepts `main` and `release/` refs). Tracked separately in DOCS-174 out-of-scope notes. Pulls that fully revert their `docs/` changes in a follow-up push won't fire this workflow at all (GitHub's `paths` filter requires a path match in the diff), so a stale preview comment can survive on that specific edge. Removing the `paths` filter to handle it would run the workflow on every PR push, which is disproportionate. Acknowledged in [CRF-12](https://github.com/coder/coder/pull/25456#discussion_r3313738550). <details> <summary>Implementation notes</summary> Marker and selector deduplication: The marker string and jq selector previously appeared at three sites (comment body, stale-cleanup API call, upsert API call). They're now consolidated into `DOCS_PREVIEW_MARKER` plus a `list_docs_preview_comments` shell function so a future marker change updates one place. Comment body construction: The double-quoted multi-line string form with escaped backticks (`` \` ``) for the inline-code spans is shellcheck-clean. An earlier draft used `printf -v comment_body` with a single-quoted format string containing backticks, which triggered SC2016; the printf-three-pieces workaround that replaced it has since been simplified to the direct double-quoted form. Upsert logic: `gh api --paginate` fetches all PR comments, jq filters to `github-actions[bot]`-authored comments containing the marker, and the workflow PATCHes the first match. If the PATCH fails (404 because the comment was deleted between find and update), the script falls through to `gh pr comment` to create a new one. Self-heals on the next push if both paths somehow fail. Stale-cleanup logic: Same selector as upsert, but in the early-exit branch when no Markdown files exist in this push. `DELETE` failures are logged and execution continues (the next push will re-attempt or post a fresh comment), so a transient API failure won't fail the CI job. </details> > Generated by Coder Agents on behalf of @nickvigilante	2026-05-28 10:03:21 -04:00
Danielle Maywood	8600b59aae	fix(site): normalize thinking transcript row (#25749)	2026-05-28 14:49:10 +01:00
Danny Kopping	12520ee964	feat: add ai provider status and reload freshness metrics (#25770 ) Add metrics for `aibridged` and `aibridgeproxyd`'s provider statuses. AI providers can be modified, and possibly misconfigured, at runtime. These metrics help operators understand the state of these provider definitions in case unexpected behaviour is observed.	2026-05-28 14:57:33 +02:00
Nick Vigilante	637855e276	docs(docs/ai-coder): clarify Add-On is separate from Premium, add v2.32 requirement callout (#25463 ) Closes DOCS-54. Updates `docs/ai-coder/ai-governance.md` to address two known points of confusion: 1. Add-On is not included in Premium. The intro previously said the Add-On "can be added to Premium seats", which readers interpreted as bundled. Rewritten to say it is a separate per-user license that must be purchased in addition to Premium. 2. v2.32 requirement is now prominent. This was buried in a `## GA status and availability` section at the bottom. A `[!NOTE]` callout is added directly after the feature list so it is visible immediately. The duplicate paragraph in the GA section is removed. Also fixes "extend that platform" → "extend the Coder platform" (the original phrase had no clear antecedent). > [!NOTE] > This is a docs-only change. No product code was modified. --- Generated by Coder Agents on behalf of @nickvigilante. Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>	2026-05-28 08:46:06 -04:00
Nick Vigilante	ea71242f34	docs(docs/admin/monitoring): document log-human disable workaround (#25741 ) Closes DOCS-66. Adds a `[!NOTE]` callout to `docs/admin/monitoring/logs.md` documenting that `--log-human=""` (empty string) does not disable human-readable logging; the working value is `--log-human=/dev/null`. ## Context Reported by Bjorn Robertsson in `#docs` on 2026-04-29. Operators trying to silence the human-readable log stream had been setting `--log-human` (or `CODER_LOGGING_HUMAN`) to an empty string and getting unchanged log output. The empty-string path hits a 2023-vintage code path that falls back to the default `/dev/stderr` instead of disabling output. This PR documents the workaround on the admin-facing logs page. The CLI flag reference under `docs/reference/cli/server.md` is auto-generated and intentionally left unchanged. A separate engineering issue may be worth filing to fix the root cause (empty string should either disable or surface a warning). > [!NOTE] > This is a docs-only change. No product code was modified. --- Generated by Coder Agents on behalf of @nickvigilante.	2026-05-28 08:42:18 -04:00
Mathias Fredriksson	7a9125b953	fix(agent/agentfiles): merge duplicate file paths instead of rejecting (#25767 ) When a caller sends multiple entries for the same literal path, merge their edits into a single entry rather than returning 400. Symlink aliases (different paths, same real file) are still rejected.	2026-05-28 11:54:17 +00:00
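The merge behavior described in 7a9125b953 can be sketched as below. The `Edit` and `FileEdits` types and the `mergeByPath` helper are hypothetical shapes chosen for illustration; the real request types in `agent/agentfiles` are not shown in the commit message. The sketch only covers literal-path merging and does not attempt the symlink-alias rejection.

```go
package main

import "fmt"

// Edit and FileEdits are hypothetical request shapes for this sketch.
type Edit struct {
	Search  string
	Replace string
}

type FileEdits struct {
	Path  string
	Edits []Edit
}

// mergeByPath folds entries that share the same literal path into one
// entry. Paths keep their first-seen order and edits keep request order.
func mergeByPath(files []FileEdits) []FileEdits {
	index := make(map[string]int, len(files))
	merged := make([]FileEdits, 0, len(files))
	for _, f := range files {
		if i, ok := index[f.Path]; ok {
			merged[i].Edits = append(merged[i].Edits, f.Edits...)
			continue
		}
		index[f.Path] = len(merged)
		// Copy the slice so later appends never write into the caller's array.
		merged = append(merged, FileEdits{Path: f.Path, Edits: append([]Edit(nil), f.Edits...)})
	}
	return merged
}

func main() {
	out := mergeByPath([]FileEdits{
		{Path: "a.txt", Edits: []Edit{{Search: "x", Replace: "y"}}},
		{Path: "b.txt", Edits: []Edit{{Search: "1", Replace: "2"}}},
		{Path: "a.txt", Edits: []Edit{{Search: "y", Replace: "z"}}},
	})
	for _, f := range out {
		fmt.Println(f.Path, len(f.Edits))
	}
}
```

Preserving request order within a path matters in this sketch because a later edit may depend on the result of an earlier one.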
Mathias Fredriksson	daf73b7b89	chore: mark database codegen files as linguist-generated (#25787 ) These files all have 'Code generated' / 'DO NOT EDIT' headers but were not in .gitattributes. This causes GitHub to show them expanded in PR diffs and count them toward diff stats. Files added: - coderd/database/queries.sql.go (sqlc) - coderd/database/models.go, querier.go (sqlc) - coderd/database/dbmock/dbmock.go (gomock) - coderd/database/dbmetrics/querymetrics.go (dbgen) - coderd/database/unique_constraint.go (dbgen) - coderd/database/foreign_key_constraint.go (dbgen) - coderd/database/check_constraint.go (dbgen)	2026-05-28 11:30:46 +00:00
Danny Kopping	f91390b2c8	ci: don't fail job if commenting on locked PR (#25765 ) The final step of `.github/workflows/cherry-pick.yaml` comments on the original PR with a link to the cherry-pick PR. If the original PR is locked, `gh pr comment` fails and the whole job exits with status 1, even though the backport branch and PR were created successfully. See https://github.com/coder/coder/actions/runs/26559681779/job/78239379200 for an example. Make the comment non-fatal: log a warning and continue. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 13:26:04 +02:00
Danny Kopping	a9f5ed7644	fix: re-validate provider per request and classify reloads (#25766) Refactors the `aibridgeproxyd` provider reload mechanism, which was unnecessarily complex. Also ensures that providers are evaluated on each CONNECT request to prevent interception of requests to (newly) disabled providers; in this case the requests will pass through unencrypted, by design.	2026-05-28 13:22:38 +02:00
Mathias Fredriksson	673709bd34	ci(.github/workflows/doc-check): update agents-chat-action to v0.3.0 (#25784)	2026-05-28 11:16:34 +00:00
Ethan	7e2f7198dd	fix(coderd/x/chatd/chatloop): use stream silence timeout (#25782 ) Replaces the 60 second first-token timeout in the chat loop with a 10 minute stream-silence timeout. Previously, the guard bounded only the gap before the first stream part. Once any part arrived the attempt could hang indefinitely if the provider stopped streaming without closing the connection, and even normal long-running responses could be killed after 60 seconds if the provider was slow to emit the first token. The guard now arms when a model attempt opens its stream, resets on every received stream part, and fires after 10 minutes of complete silence. The existing retry path still handles the timeout, and the public `startup_timeout` error kind is preserved to avoid API and frontend churn. 10 minutes matches the default request timeout used by the Anthropic and OpenAI Python SDKs. Closes CODAGT-493	2026-05-28 21:02:40 +10:00
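The arm/reset/fire behavior described in 7e2f7198dd can be sketched with a resettable timer. This is an illustration under assumptions, not the chat loop's code: `consume`, `errStreamSilence`, and the use of a string channel as the "stream" are all hypothetical, and the demo uses a 50 ms window where the commit uses 10 minutes.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errStreamSilence is a hypothetical error for this sketch.
var errStreamSilence = errors.New("stream silent for too long")

// consume reads parts until the channel is closed, or until no part has
// arrived for the whole silence window.
func consume(parts <-chan string, silence time.Duration) ([]string, error) {
	timer := time.NewTimer(silence) // armed when the stream opens
	defer timer.Stop()

	var got []string
	for {
		select {
		case p, ok := <-parts:
			if !ok {
				return got, nil // stream closed normally
			}
			got = append(got, p)
			// Re-arm on every part: stop, drain a pending fire, reset.
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(silence)
		case <-timer.C:
			return got, errStreamSilence
		}
	}
}

func main() {
	parts := make(chan string)
	go func() {
		parts <- "hello"
		parts <- "world"
		// The producer now stalls without closing the channel.
	}()
	got, err := consume(parts, 50*time.Millisecond)
	fmt.Println(got, err)
}
```

Unlike a first-token deadline, this guard bounds every gap between parts, so a stream that stalls midway is caught while a slow but steadily progressing response is not cut off.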
Mathias Fredriksson	3770176b7f	fix(scripts): use merge-base in emdash lint to avoid false positives (#25726 ) When GITHUB_BASE_REF is set, the emdash lint compared against the tip of main instead of the merge-base. For PRs behind main, this produced a diff covering all divergent files, flagging pre-existing emdashes the PR never touched. Query the PR commit count via gh, deepen HEAD by that amount, and resolve HEAD~N as the merge-base. Falls back to the branch tip when the merge-base cannot be determined.	2026-05-28 13:45:01 +03:00
Michael Suchacz	f529577bee	fix(coderd/x/chatd): harden openai-compatible chat calls (#25737 ) OpenAI-compatible chat paths hit two provider compatibility issues. Some compatible endpoints reject a named `tool_choice` when there is only one tool, and Gemini's OpenAI-compatible endpoint requires thought signatures on current-turn tool calls. Centralize OpenAI-compatible request patches in the chat provider: rewrite single named tool choices to `"required"`, and add the documented dummy Google thought signature to the first tool call in each current-turn tool step for Gemini routes. Vercel OpenAI-compatible requests are left unchanged for the thought-signature patch. > Mux created this PR on behalf of Mike.	2026-05-28 10:27:32 +02:00
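The first of the two request patches in f529577bee, rewriting a single named tool choice to `"required"`, might look roughly like this. The sketch assumes the request body is handled as a decoded JSON map and that a named choice is the object form of `tool_choice`; `patchToolChoice` is a hypothetical name, and the Gemini thought-signature patch is not attempted here.

```go
package main

import "fmt"

// patchToolChoice rewrites an object-form (named) tool_choice to the
// string "required" when exactly one tool is offered. With one tool,
// "required" forces the same call. Hypothetical helper for illustration.
func patchToolChoice(body map[string]any) {
	tools, _ := body["tools"].([]any)
	_, named := body["tool_choice"].(map[string]any)
	if named && len(tools) == 1 {
		body["tool_choice"] = "required"
	}
}

func main() {
	fn := map[string]any{"name": "get_weather"}
	body := map[string]any{
		"tools":       []any{map[string]any{"type": "function", "function": fn}},
		"tool_choice": map[string]any{"type": "function", "function": fn},
	}
	patchToolChoice(body)
	fmt.Println(body["tool_choice"])
}
```

String-valued choices such as `"auto"` and requests offering several tools are left untouched, since the rewrite is only equivalent when there is exactly one tool to pick.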
TJ	cfa343e456	refactor(site): update BYOK link to use "View docs" on AI settings page (#25743 ) Changes the "Manage deployment-wide BYOK" link on the AI Providers settings page (`/ai/settings`) to "View docs", matching the pattern used on the provisioner keys page (`/organizations/{org}/provisioner-keys`). ### Changes - Swapped `Link` from `react-router` to `#/components/Link/Link` (uses `href` instead of `to`) - Removed `target="_blank"` and `rel="noreferrer"`: the link now navigates in the same tab, matching the provisioner keys page convention - Changed link text from "Manage deployment-wide BYOK" to "View docs" > Generated by Coder Agents on behalf of @tracyjohnsonux	2026-05-28 08:50:10 +02:00
Sas Swart	3b8a9ff802	feat: add preset query parameter for workspace creation deeplinks (#24328) Co-authored-by: Atif Ali <atif@coder.com>	2026-05-28 10:42:53 +05:00
Ethan	ca7f07142e	ci: add Go test flake detector workflow (#25667 ) Adds a `flake-go` workflow that hunts for ordering-dependent and racy Go tests on pull requests. The workflow runs only on PRs (cancelling earlier runs on new commits) and skips test execution when no Go test files changed. A single `flake_go` job uses [coder/whichtests](https://github.com/coder/whichtests) with `--coalesce` to compute the directly-modified `Test*` functions from the PR diff and emit them as one target row. The same job then runs those selected tests on a deliberately resource-constrained 4-vCPU runner with 4x parallelism oversubscription, `-count=25`, and `-shuffle=on` to amplify contention and surface flakes. Pinned at [coder/whichtests@ec33bab](https://github.com/coder/whichtests/commit/ec33bab1ec04cd86beb7a61a069db4463dba63f5). Reuses the `test-go-pg` composite (with its new `run-regex`, `test-shuffle`, and `gotestsum-json-file` inputs) and the `go-test-failure-report` composite, both introduced on the base branch (#25670), so this workflow shares one implementation of the gotestsum + failure-report path with the existing CI jobs. `Makefile` adds `TEST_SHUFFLE` support and single-quotes `RUN` so whichtests' regex survives shell parsing. Stacked on top of #25670. Demo @ https://github.com/coder/coder/actions/runs/26494322649/job/78018779381?pr=25667 Closes CODAGT-381	2026-05-28 12:35:37 +10:00
Callum Styan	9d37f63fbd	feat: report synthetic metadata from fake agents (#25166 ) Fake agents now fetch their manifest, spawn a single per-agent metadata goroutine, and emit batched BatchUpdateMetadata calls with 3072-byte base64 payloads so scaletest runs mirror the load shape of real agents. This matches what the current scaletest workspace template does for metadata. In the future we can extend the harness here to take in a config option for the metadata payload size. --------- Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Mux <mux@coder.com>	2026-05-27 13:49:42 -07:00
blinkagent[bot]	1bfc1ce2c4	chore: update terraform to v1.15.5 (#25746 ) Bumps bundled Terraform from `1.15.2` to `1.15.5` across all pinned locations: - `.github/actions/setup-tf/action.yaml` - `scripts/Dockerfile.base` - `install.sh` - `flake.nix` (+ updated SRI hash for the linux_amd64 zip) - `mise.toml` - `mise.lock` (+ updated per-platform SHA256 checksums) - `provisioner/terraform/testdata/version.txt` - `provisioner/terraform/testdata/resources/ai-tasks-disabled/ai-tasks-disabled.tfplan.json` ## Why Terraform 1.15.5 is built with Go 1.25.10, while the 1.15.2 we currently ship was built with Go 1.25.8. The newer Go runtime addresses recent stdlib CVEs flagged by security scanners. Releases included: 1.15.3 (provider install crash fix, nested-module stack migration fix), 1.15.4 (Linux s390x builds, symlinked provider dir fix), 1.15.5. Release notes: https://github.com/hashicorp/terraform/releases/tag/v1.15.5 ## Cherry-pick #25747 mirrors this PR against `release/2.34`. Created on behalf of @Shelnutt2 Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>	2026-05-27 16:46:25 -04:00
Garrett Delfosse	5991a2c8b0	ci: trigger CI on release branch creation (#25744 ) GitHub Actions does not reliably trigger the push-based CI workflow when a new branch is created at a commit that already has a workflow run from another branch (e.g. `main`). This meant cutting a release branch produced no CI run on it, so `should_deploy.sh` never got to approve the deploy from the release branch. Adds the `create` event trigger to `ci.yaml` with a condition on the `changes` job to only proceed for release branch creations. All other jobs depend on `changes`, so non-release branch creations are a no-op. > Generated with [Coder Agents](https://coder.com/agents) by @f0ssel	2026-05-27 14:46:18 -04:00
Garrett Delfosse	a2e1ddb56f	fix: validate FileSize in NewDataBuilder to prevent OOM DoS (#25710 ) `NewDataBuilder` allocated `make([]byte, 0, req.FileSize)` using the client-supplied `int64` with no upper-bound check. The DRPC 4 MiB wire cap limits message size but not the integer value, so a crafted message with `FileSize = 1<<40` forces a 1 TiB allocation, triggering an unrecoverable `runtime.throw` that kills the entire `coderd` process. Add a `MaxFileSize` constant (100 MiB, matching `HTTPFileMaxBytes` in `coderd/files.go`) and reject negative or oversized `FileSize`, plus negative or excessive `Chunks`, before the allocation. `BytesToDataUpload` also returns an error for oversized data to preserve the encode/decode round-trip contract. Fix a pre-existing reversed subtraction in the `Add()` overflow error message. Closes https://linear.app/codercom/issue/PLAT-231 <details> <summary>Implementation details</summary> - `provisionersdk/proto/dataupload.go`: New exported `MaxFileSize` constant; validation in `NewDataBuilder` and `BytesToDataUpload`. Fixed reversed subtraction in `Add()` error. - `provisionersdk/proto/dataupload_test.go`: New `TestNewDataBuilderValidation` with 7 subtests. - Updated all 5 callers of `BytesToDataUpload` for new error return. - Audited all `make([]byte, ...)` in provisioner paths; no other client-supplied sizes. </details> > Generated by Coder Agents on behalf of @f0ssel	2026-05-27 14:30:11 -04:00
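The validate-before-allocate pattern from a2e1ddb56f can be sketched as follows. The 100 MiB limit comes from the commit message; `newBuffer` and `errInvalidSize` are hypothetical stand-ins and do not reproduce the real `NewDataBuilder` signature or its `Chunks` checks.

```go
package main

import (
	"errors"
	"fmt"
)

// MaxFileSize mirrors the 100 MiB limit named in the commit message.
const MaxFileSize int64 = 100 << 20

// errInvalidSize is a hypothetical sentinel error for this sketch.
var errInvalidSize = errors.New("invalid file size")

// newBuffer rejects negative or oversized client-supplied sizes before
// they reach make, so a crafted value cannot force a huge allocation.
func newBuffer(fileSize int64) ([]byte, error) {
	if fileSize < 0 || fileSize > MaxFileSize {
		return nil, fmt.Errorf("%w: %d (max %d)", errInvalidSize, fileSize, MaxFileSize)
	}
	return make([]byte, 0, fileSize), nil
}

func main() {
	if _, err := newBuffer(1 << 40); err != nil {
		fmt.Println("rejected:", err)
	}
	buf, err := newBuffer(4096)
	fmt.Println(cap(buf), err)
}
```

The check has to come first because a failed allocation of this size is a fatal runtime error rather than a recoverable panic, so it cannot be handled after the fact.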
Jon Ayers	f6f284ea51	feat: add initial NATS implementation (#25602 )	2026-05-27 12:57:20 -05:00
Michael Suchacz	3cf867f84a	fix(site/src/pages/AgentsPage): make other-user chats read-only (#25736 ) Other-user agent chats showed a banner that implied prompts would run as the owner, but submitting from that view is forbidden. This updates the banner to identify the chat owner and makes chats owned by another user read-only in the UI by disabling the composer and hiding inline send or edit follow-up actions. > Mux working on behalf of Mike.	2026-05-27 18:10:10 +02:00
Cian Johnston	b278be7361	fix(coderd): enforce api_key_id on user messages at type level (#25729 ) - Empty string is valid for `apiKeyID` in paths that genuinely lack a caller key (e.g. agent-initiated context injection in `workspaceAgentAddChatContext`). AI Gateway fail-closed check remains the runtime safety net. - Context injection paths (`persistInstructionFiles`, compaction) read the key from `aibridge.DelegatedAPIKeyIDFromContext(ctx)`, set upstream by `contextWithActiveTurnAPIKeyID`. - Subagent context copy branches on `copiedRole == database.ChatMessageRoleUser` to choose the right append function. > Generated by Coder Agents	2026-05-27 17:00:23 +01:00
Danielle Maywood	2d40a40c57	fix(site): tighten execute tool duration spacing (#25739 )	2026-05-27 16:57:05 +01:00
Mathias Fredriksson	2730a87975	ci(.github/workflows/doc-check): update agents-chat-action to v0.2.0 (#25731 )	2026-05-27 17:51:18 +03:00
Ethan	f422ac89cc	ci: extract go-test-failure-report composite action (#25670 ) The Go test jobs in `ci.yaml` each had ~30 lines of inline shell that wrapped `gotestsum` with a PATH shim to capture JSON, then ran `gotestsummary` and `upload-artifact` to publish a failure report. Three jobs carried three near-identical copies. This change replaces the three inline blocks with a single composite action at `.github/actions/go-test-failure-report/` that runs the same `gotestsummary` invocation, writes the same markdown to `GITHUB_STEP_SUMMARY`, and uploads the same NDJSON artifact. The PATH shim is gone; gotestsum's native `GOTESTSUM_JSONFILE` env variable is used instead, plumbed through the `test-go-pg` composite. `test-go-pg` gains three optional inputs: - `gotestsum-json-file` — explicit JSON file path (or `default` for `${RUNNER_TEMP}/go-test.json`) - `run-regex` — passed to `go test -run` - `test-shuffle` — passed to `go test -shuffle` All three have safe defaults so existing callers are unaffected. No observable change in CI behavior: the three existing test-go-pg jobs continue to emit the same JSON, render the same failure summary, and upload the same artifact. Stacked under #25667, which uses the new composite and inputs to power a new flake-detector workflow.	2026-05-28 00:16:46 +10:00
Danny Kopping	2770bdc9d1	feat: route extra ai_provider_types through OpenAI and Anthropic providers (#25722 ) _Disclosure: produced with Claude Opus 4.7_ AI Gateway only supports Anthropic (+Bedrock), OpenAI, and Copilot providers at present. All other types (Vercel, Gemini, etc.) will be mapped to OpenAI since they support OpenAI-compatible endpoints.	2026-05-27 16:16:05 +02:00
Spike Curtis	6f06ace949	chore: export MsgQueue from pubsub package (#25707 ) Makes `MsgQueue` exported, so it can be used in pubsub implementations outside PGPubsub.	2026-05-27 10:11:51 -04:00
Danielle Maywood	d1e27889eb	fix(site): improve chat sharing mobile layout (#25687 )	2026-05-27 15:03:29 +01:00
Danielle Maywood	5603be19cc	feat(site): add transcript tool icons (#25724 )	2026-05-27 14:43:14 +01:00
Nick Vigilante	ecaf5e022b	docs: fix broken references and add users oidc-claims to manifest (#25706 ) ## Summary Three small docs fixes: - `docs/admin/integrations/oauth2-provider.md`: Replace broken relative link to `scripts/oauth2/README.md` with an absolute GitHub URL. The previous link escaped the `docs/` tree (`../../../scripts/oauth2/README.md`) and does not resolve in the published docs site. - `docs/install/releases/feature-stages.md`: Point the "Coder documentation" link to `docs/about/contributing/documentation.md`. The previous `../../README.md` target does not exist under `docs/`. - `docs/manifest.json`: Add the missing `users oidc-claims` entry alongside the other `users` CLI subcommands so the generated reference page (`docs/reference/cli/users_oidc-claims.md`) is reachable from the sidebar. ## Validation - Confirmed each new link target exists on `main` (`docs/about/contributing/documentation.md`, `scripts/oauth2/README.md`, `docs/reference/cli/users_oidc-claims.md`). - Pre-commit hooks pass (`fmt/markdown`, `lint/markdown`, `lint/emdash`, `lint/typos`, etc.). --- _This PR was prepared by a [Coder Agents](https://coder.com/) session on behalf of @nickvigilante. Human review requested since this is a docs-only change._	2026-05-27 09:29:16 -04:00
Cian Johnston	0c27224fc2	fix(coderd): pass title API key context (#25723 ) Fixes CODAGT-503 - Add failing-first coverage for manual title generation with missing message `api_key_id`, with both context fallback and fail-closed cases. - Set `aibridge.WithDelegatedAPIKeyID(ctx, apiKey.ID)` in `regenerateChatTitle` and `proposeChatTitle`. - In `generateManualTitleCandidate`, fall back to `aibridge.DelegatedAPIKeyIDFromContext(ctx)` only when `modelBuildOptionsFromMessages` yields an empty `ActiveAPIKeyID`. - Keep `modelBuildOptionsFromMessages` pure and leave automatic title generation unchanged.	2026-05-27 13:20:36 +01:00
Danny Kopping	10f37db35d	fix(coderd/x/chatd/chatprovider): keep gateway model prefix in ResolveModelWithProviderHint (#25725 ) For `vercel`, `openrouter`, and `openai-compat`, the `<provider>/<model>` slash is part of the upstream model ID rather than a hint. `ResolveModelWithProviderHint` was running `parseCanonicalModelRef` before honoring `providerHint`, so a config like `(provider=vercel, model=anthropic/claude-4-5-sonnet)` resolved to `provider=anthropic, model=claude-4-5-sonnet` and the prefix-less model name was forwarded to Vercel, which returned `Model 'claude-4-5-sonnet' not found`. Honor an explicit gateway provider hint before attempting canonical-ref parsing. Non-gateway hints (anthropic, openai, etc.) keep the existing canonical-ref-first behavior so `anthropic/claude-...` still has its prefix stripped when routed directly to Anthropic. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 11:13:39 +00:00
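The resolution order described in 10f37db35d might look roughly like the sketch below. `resolveModel` is a hypothetical simplification of `ResolveModelWithProviderHint`, and the slash split stands in for `parseCanonicalModelRef`, whose real behavior is not shown in the commit message. Only the three gateway provider names are taken from the source.

```go
package main

import (
	"fmt"
	"strings"
)

// gatewayProviders lists the providers the commit names as gateways, where
// "<provider>/<model>" is part of the upstream model ID rather than a hint.
var gatewayProviders = map[string]bool{
	"vercel":        true,
	"openrouter":    true,
	"openai-compat": true,
}

// resolveModel is a hypothetical simplification: an explicit gateway hint
// wins and the model ID is forwarded untouched; otherwise the canonical
// "<provider>/<model>" form is parsed first, as before.
func resolveModel(hint, model string) (provider, resolved string) {
	if gatewayProviders[hint] {
		return hint, model
	}
	if prefix, rest, ok := strings.Cut(model, "/"); ok && prefix != "" && rest != "" {
		return prefix, rest
	}
	return hint, model
}

func main() {
	fmt.Println(resolveModel("vercel", "anthropic/claude-4-5-sonnet"))
	fmt.Println(resolveModel("anthropic", "anthropic/claude-4-5-sonnet"))
}
```

With this ordering the Vercel case keeps its `anthropic/` prefix, while a direct Anthropic hint still has the prefix stripped.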
Max Schwenk	ae492495ee	fix(cli): show ready sync start dependencies (#25546 ) ## Problem Follow-on to: - https://github.com/coder/coder/pull/25089 `coder exp sync start` still printed a generic success message when the unit was ready on the first status check. That hid whether the unit had no dependencies or had dependencies that were already satisfied before `sync start` ran. Before: ```text Success ``` ## Solution Print explicit startup output for both ready-at-first-check cases. After, dependencies already satisfied: ```text Unit "test-unit" started immediately, dependencies already satisfied: [dep-unit, dep-unit-2] ``` After, no dependencies: ```text Unit "test-unit" started with no dependencies ``` The existing waiting path is unchanged and still reports the dependencies while waiting and after waiting finishes. Co-authored-by: Sas Swart <sas.swart.cdk@gmail.com>	2026-05-27 12:33:39 +02:00
Danny Kopping	79e007cf30	feat: hot-reload aibridged and aibridgeproxyd providers on DB changes (#25673 ) Previously the in-process aibridge daemon and the enterprise aibridgeproxy daemon both snapshotted their provider routing once at boot. Any `ai_providers` or `ai_provider_keys` mutation required a restart for either to pick it up. Add an `ai_providers_changed` pubsub channel that the CRUD handlers publish on after Create / Update / Delete. Both daemons subscribe: - aibridged rebuilds its `[]aibridge.Provider` snapshot via `BuildProviders` and swaps it into the pool atomically. Inflight requests keep serving against the bridge they already acquired; new acquires build against the new snapshot. Per-provider construction errors stay scoped to the offending row. - aibridgeproxyd rebuilds its routing snapshot from `GetAIProviders` and swaps the host→provider map atomically. The MITM listener picks up new providers without restart. DB read for aibridgeproxyd uses the existing `AsAIProviderMetadataReader` subject for routing-only access.	2026-05-27 11:58:43 +02:00
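The atomic snapshot swap described in 79e007cf30 can be illustrated with `sync/atomic`. This is a sketch under assumptions: `routingSnapshot`, `router`, `reload`, and `lookup` are hypothetical names, and the real daemons rebuild from the database on an `ai_providers_changed` pubsub message rather than from an in-memory map.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// routingSnapshot is an immutable host-to-provider view (hypothetical type).
type routingSnapshot struct {
	hostToProvider map[string]string
}

// router holds the current snapshot behind an atomic pointer so readers
// never observe a half-built map.
type router struct {
	current atomic.Pointer[routingSnapshot]
}

// reload builds a fresh snapshot and swaps it in. Callers that already
// loaded the previous snapshot keep using it unchanged.
func (r *router) reload(rows map[string]string) {
	next := &routingSnapshot{hostToProvider: make(map[string]string, len(rows))}
	for host, provider := range rows {
		next.hostToProvider[host] = provider
	}
	r.current.Store(next)
}

// lookup reads from whichever snapshot is current at call time.
func (r *router) lookup(host string) (string, bool) {
	snap := r.current.Load()
	if snap == nil {
		return "", false
	}
	provider, ok := snap.hostToProvider[host]
	return provider, ok
}

func main() {
	r := &router{}
	r.reload(map[string]string{"api.openai.com": "openai"})
	inflight := r.current.Load() // an in-flight request holds the old snapshot

	r.reload(map[string]string{"api.anthropic.com": "anthropic"})
	_, stillThere := inflight.hostToProvider["api.openai.com"]
	_, routedNow := r.lookup("api.openai.com")
	fmt.Println(stillThere, routedNow)
}
```

Because each snapshot is never mutated after `Store`, in-flight work and new lookups can run concurrently without a lock.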
Cian Johnston	6acfe6c835	fix: classify quota errors as usage_limit instead of auth (#25676 ) Fixes CODAGT-484. - Removed "quota", "billing", "insufficient_quota", "payment required" from `authStrongPatterns` - Added `usageLimitPatterns` slice with those patterns - Added `usageLimitMatch` signal and rule between overloaded and authStrong in priority - Added terminal/retry messages for `ChatErrorKindUsageLimit` - Simplified auth message (removed billing reference) - Frontend: conditional `!usageLimitStatus.provider` guard on the "View Usage" Alert - Added `TestClassify_UsageLimitBeatsAuth` with 5 cases including real production OpenAI error - Added `ProviderQuotaExceeded` story asserting no "View Usage" link and correct `ChatStatusCallout` rendering > Generated with [Coder Agents](https://coder.com/agents)	2026-05-27 09:45:36 +01:00
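The priority change in 6acfe6c835 can be sketched as an ordered match. The four usage-limit patterns are the ones listed in the commit message; the contents of `overloadedPatterns` and `authStrongPatterns`, the `classify` signature, and the returned strings are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

var (
	// Hypothetical contents; the real slices are not shown in the commit.
	overloadedPatterns = []string{"overloaded"}
	authStrongPatterns = []string{"invalid api key", "unauthorized"}
	// Patterns the commit moved out of authStrongPatterns.
	usageLimitPatterns = []string{"quota", "billing", "insufficient_quota", "payment required"}
)

// containsAny reports whether s contains at least one of the patterns.
func containsAny(s string, patterns []string) bool {
	for _, p := range patterns {
		if strings.Contains(s, p) {
			return true
		}
	}
	return false
}

// classify checks rules in priority order: overloaded, then usage limit,
// then strong auth signals. The first match wins.
func classify(msg string) string {
	m := strings.ToLower(msg)
	switch {
	case containsAny(m, overloadedPatterns):
		return "overloaded"
	case containsAny(m, usageLimitPatterns):
		return "usage_limit"
	case containsAny(m, authStrongPatterns):
		return "auth"
	}
	return "generic"
}

func main() {
	fmt.Println(classify("You exceeded your current quota, please check your plan and billing details."))
	fmt.Println(classify("401 Unauthorized: invalid API key"))
}
```

Placing the usage-limit rule ahead of the auth rule means a message that mentions both is reported as a usage limit, which is the behavior the commit's test name (`TestClassify_UsageLimitBeatsAuth`) suggests.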
Thomas Kosiewski	e32be68687	fix(dogfood/coder): verify Homebrew installer (#25721 )	2026-05-27 10:45:21 +02:00
Jake Howell	9c10ec2ca7	fix: resolve mui `<TimelineDateRow />` regression (#25716 )	2026-05-27 18:36:55 +10:00
Thomas Kosiewski	bfa17c315e	fix(dogfood/coder): persist mise user installs (#25720 )	2026-05-27 09:54:09 +02:00
Ethan	e91bec8574	fix(cli): close aibridge daemon before WebSocket shutdown wait (#25719 ) > [!WARNING] > The investigation and solution in this PR were done with [Mux](https://mux.coder.com/). I've reviewed the investigation methodology, evidence and solution, and it all appears sound. ## Summary PR #25570 (`refactor: move aibridged out of enterprise to AGPL`, merged 2026-05-22) added an in-memory aibridge DRPC server in `coderd/aibridged.go` that does `api.WebsocketWaitGroup.Add(1)` and only releases `Done()` when its client session is closed. PR #25575 then flipped `CODER_AI_GATEWAY_ENABLED` to default to `true`, so every `cli.Server()` invocation now spins up that goroutine. In `cli/server.go`, the only call to `aibridgeDaemon.Close()` was a `defer` scheduled at function return. During graceful shutdown the code first calls `coderAPICloser.Close()`, which waits on `api.WebsocketWaitGroup`. That wait sits for the full 10s timeout in `coderd/coderd.go` (`websocket shutdown timed out after 10 seconds`), then returns, then the function unwinds, and only then does the deferred `aibridgeDaemon.Close()` fire and let the goroutine call `Done()`. The 10s tax was previously latent (aibridged was enterprise-only and opt-in). After the two May 22 PRs it hit every `cli.Server()` test. On Linux/macOS CI it just makes the suite slower; on the Depot Windows runner, the ramdisk reservation leaves only ~17 GiB of headroom and the ~10s shutdown tails of multiple concurrent package binaries overlap into an OOM, presenting as `test-go-pg (windows-2022)` jobs that die silently at the ~600s watchdog with an empty `steps` array. See Slack: https://codercom.slack.com/archives/C05AE94121Z/p1779807717764189 ## Fix Close `aibridgeDaemon` explicitly during graceful shutdown, before `coderAPICloser.Close()` waits on the WebSocket wait group. This matches the existing ordered-shutdown pattern used for `tunnel` and `notificationsManager`. 
The deferred `aibridgeDaemon.Close()` is retained as a safety net for early-return paths, and is safe to double-call because `aibridged.Server.Close()` is already idempotent via `shutdownOnce` in `coderd/aibridged/aibridged.go`. ## Regression test `TestServer_AIGatewayShutdownOrdering` boots a real `coder server` with `--ai-gateway-enabled=true`, cancels its context, and asserts graceful shutdown finishes in under 8s. With the fix the test runs in ~0.1s; without the fix it fails deterministically at ~10.0s. The flag is passed explicitly so the test continues to guard the ordering even if the deployment default is ever flipped back. ## Evidence this fixes the OOM On Linux the patched `cli` test package drops from 114 s back to its pre-regression 30 s wall time at the same single-process peak RSS (~7.6 GiB), and the `websocket shutdown timed out after 10 seconds` log line disappears from every server-test run. Since the Windows OOM is the sum of multiple concurrent 10 s shutdown tails overlapping past the runner's ~17 GiB headroom, removing those tails returns the concurrent-RSS budget to its pre-regression level. The Windows OOM was intermittent (a handful of hits across many runs since May 22), so a single green `test-go-pg (windows-2022)` job on this PR is not by itself proof; confirmation will come from watching Windows runs on `main` over the next several days and seeing the ~600 s silent-kill fingerprint stop recurring. Relates to ENG-2771	2026-05-27 17:33:14 +10:00

14553 Commits