coder

mirror of https://github.com/coder/coder.git synced 2026-06-02 20:48:20 +00:00

Author	SHA1	Message	Date
Ethan	4f1043a50a	feat(scaletest): add chat scaletest command (#25553 ) Adds `coder exp scaletest chat`, a harness for creating Coder Agents chat load. Start the mock LLM separately, prepare the scaletest workspaces you want to target, then run the chat scaletest against the existing `scaletest-*` fleet selected by the shared workspace targeting flags: ```sh coder exp scaletest llm-mock --address 127.0.0.1:18080 coder exp scaletest chat --llm-mock-url http://127.0.0.1:18080/v1 --chats-per-workspace 10 --turns 1 coder exp scaletest chat --llm-mock-url http://127.0.0.1:18080/v1 --template docker --target-workspaces 0:10 --chats-per-workspace 1 --turns 10 --turn-start-delay 30s ``` This is the same pattern used by the `workspace-traffic` load generator. Keeping the fake LLM as a separate process is intentional so it can be scaled independently from the Coder deployment, which will likely be necessary as we scale up and up. This PR is the starting point: it provides the command, mock provider/model bootstrap, existing workspace selection, chat streaming, follow-up turns, metrics, and cleanup. Follow-up PRs will add multi-step turns via tool calls. I'm still a bit iffy on the mechanism I have for that. It'll likely involve having the runner send some magic strings that the mock will recognise. Relates to CODAGT-307 Relates to GRU-48 Relates to https://github.com/coder/scaletest/issues/124 Generated by Mux, but reviewed by a human	2026-05-26 14:19:36 +10:00
Ethan	fe13bb2a20	fix(coderd/x/chatd): seed afterMessageID test directly (#25665 ) This fixes the flaky `TestSubscribeAfterMessageID` by seeding its chat and messages directly, so the test no longer creates pending work that a chat worker can pick up. The assertion now covers only the `afterMessageID` subscription behavior, independent of chat processing lifecycle timing. Closes DEVEX-326 Closes https://github.com/coder/internal/issues/1489	2026-05-26 13:16:32 +10:00
Michael Suchacz	84240da0c1	fix(site/src/pages/AgentsPage): avoid skills popup flash (#25661 ) When removing the `/` personal skill trigger, the popover content stayed mounted during its close transition and briefly rendered the empty skills state at the viewport origin. This keeps the menu content mounted for stable Radix positioning, preserves the last open menu state during the close transition, and adds a Storybook regression for the backspace path. > Mux is creating this PR on behalf of Mike.	2026-05-25 21:58:37 +02:00
Susana Ferreira	846aac2f74	refactor(aibridge): remove InjectAuthHeader in favor of KeyFailoverConfig (#25618 ) ## Description `Provider.InjectAuthHeader` is no longer needed. With the addition of `KeyFailoverConfig` in #24920, authentication is now applied per-attempt by `KeyFailoverTransport` on passthrough routes. This PR removes the dead method from the `Provider` interface, all implementations (`Anthropic`, `OpenAI`, `Copilot`), and the test mock. The orphaned `InjectAuthHeader` unit tests are replaced with `Test{Anthropic,OpenAI,Copilot}_KeyFailoverConfig`. `TestPassthrough_KeyFailover` is also extended to cover Copilot in the BYOK scenario. Related to: https://linear.app/codercom/issue/AIGOV-334/aibridge-follow-ups-from-key-failover-prs > [!NOTE] > Initially generated by Claude Opus 4.7, modified and reviewed by @ssncferreira	2026-05-25 19:10:38 +01:00
Susana Ferreira	22109a54ad	refactor(aibridge): clean up keypool and provider error handling (#25609 ) ## Description Cleans up how key pool errors are represented and how they get turned into HTTP responses. Consolidates two error types into a single type with a kind tag, and gives the response helpers in both providers consistent names. ## Changes - Replaced the keypool sentinel and transient error struct with one error type that carries a kind and a retry-after duration. - Updated `KeyFailoverConfig.BuildKeyPoolResponse` to take the typed key pool error, so each provider can shape the exhaustion response in its own format. - Removed the per-provider `MarkKey` callback from `KeyFailoverConfig` since providers can rely on the shared `MarkKeyOnStatus` helper. - Renamed the response-error helpers so OpenAI and Anthropic use the same naming. Related to: https://linear.app/codercom/issue/AIGOV-334/aibridge-follow-ups-from-key-failover-prs > [!NOTE] > Initially generated by Claude Opus 4.7, modified and reviewed by @ssncferreira	2026-05-25 18:58:29 +01:00
Susana Ferreira	5d178ada9f	docs(aibridge): document known IsStreaming race condition (#25654) Documents the known race in `EventStream.IsStreaming()` and the resulting flake in `TestStreamingInterception_AgenticLoopFailover/agentic_all_keys_fail`, which is accepted rather than fixed because the inner agentic loop is on track to be removed as part of the reverse proxy migration in coder/aibridge#223. Full reasoning in coder/internal#1524.	2026-05-25 17:57:02 +01:00
Cian Johnston	579daaff70	feat: add GitLab support to coderd/externalauth/gitprovider Fixes CODAGT-146 Add GitLab support to the gitprovider package for gitsync/chatd PR diff flows. This is a squashed stack of 3 PRs: #25651 - refactor(coderd/externalauth): prepare gitprovider for multi-provider support - Change gitprovider.New to return (Provider, error) - Extract shared helpers (parseRetryAfter, checkRateLimitError, countDiffLines, escapePathPreserveSlashes) from github.go - Update all callers (db2sdk, exp_chats, gitsync) for new signature - Add error logging for provider construction failures - Thread context through provider resolution #25652 - feat(coderd/externalauth/gitprovider): add GitLab provider - Implement full Provider interface: FetchPullRequestStatus, FetchPullRequestDiff, FetchBranchDiff, ResolveBranchPullRequest - Handle nested groups, forks, and self-hosted instances - Rate limit detection on both library and raw HTTP paths - URL parsing/building with NormalizePullRequestURL support - Unit tests covering error paths, URL parsing, state mapping - Document GitLab configuration and known limitations #25653 - test(coderd/externalauth/gitprovider): add GitLab VCR integration tests - FetchPullRequestStatus: 4 fixtures (open, conflicts, merged, closed) - FetchPullRequestDiff: 4 fixtures - FetchBranchDiff: 3 fixtures (open, deleted, fork) - ResolveBranchPullRequest: 3 fixtures - go-vcr cassettes with sanitized GitLab API responses	2026-05-25 17:41:02 +01:00
Atif Ali	2ad2f7869d	chore(site): rename AI Bridge Sessions dropdown to AI Sessions (#25656 ) Renames the admin settings dropdown label from "AI Bridge Sessions" to "AI Sessions". The link target (`/aibridge/sessions`) is unchanged.	2026-05-25 16:23:38 +00:00
Mathias Fredriksson	7958ad6d04	fix(cli): use quartz clock in waitForTaskIdle for immediate first poll (#25648 ) waitForTaskIdle used time.NewTicker(5s) which delays the first poll by 5 seconds. Debugger tracing proved the failure mechanism: on slow CI (Windows), the first poll at 5s sees "working" (idle patch has not landed due to goroutine scheduling), needs poll #2 at 10s, but the 25s context expires before it fires. Two changes: 1. Use r.clock.NewTicker (quartz) with time.Nanosecond initial interval and Reset(5s) for immediate first poll. Tests inject a mock clock via clitest.NewWithClock for deterministic control. 2. Rewrite WaitsForWorkingAppState test with quartz traps (NewTicker + TickerReset) for deterministic synchronization instead of racing goroutines. Fix PausedDuringWaitForReady sync point. Closes DEVEX-381	2026-05-25 19:14:29 +03:00
Danny Kopping	8652ef3e3b	refactor: route `TransportFor` by provider name (#25650 ) Delegate `aibridge` routing responsibility to the in-memory transport layer. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 18:04:12 +02:00
Cian Johnston	0a45f96d30	ci: validate dogfood image tooling by running gen, fmt, lint, build (#25475 ) Adds a `test_image` job that runs `make gen`, `make fmt`, `make lint`, and `make build` inside the newly built image via `docker run`. This helps detect breaking changes before merge. > [!NOTE] > Generated with [Coder Agents](https://coder.com/agents)	2026-05-25 17:02:13 +01:00
Mathias Fredriksson	52e73b1343	test(agent/agentcontextconfig): isolate TestContextPartsFromDir from host HOME (#25649 ) ContextPartsFromDir scans ~/.coder/skills via DefaultSkillsDir. On machines with real skills installed, these leaked into test results. Set HOME/USERPROFILE to temp dirs on the parent test so subtests run in a clean environment.	2026-05-25 17:59:32 +03:00
Mathias Fredriksson	00a6dc56a7	test(coderd/x/chatd): wait for settled state in PromoteQueued ordering (#25644 ) TestPromoteQueuedWhileRunningRespectsMessageOrder was flaky because it read queue state from the database immediately after PromoteQueued returned. The active server worker drains queued messages concurrently, so the DB read races the auto-promote pipeline (TOCTOU). Instead of asserting intermediate queue state, wait for all three promoted messages to appear in chat history and verify their relative order (B before A before C). This asserts the same invariant (promote reorders B to the front) without reading during the race window. Closes CODAGT-384	2026-05-25 17:58:31 +03:00
Paweł Banaszewski	1a8a153c56	chore: fix flake in TestResponsesInjectedTool (#25630) Fixes flake in TestResponsesInjectedTool. See https://github.com/coder/coder/pull/25630/changes/d9bfeb20092129127ad5e7958c5b8dbf46740527 for reproduction. Because of AsyncRecorded, token usages may be recorded in a different order than expected. Fixes: https://github.com/coder/internal/issues/1544	2026-05-25 16:41:55 +02:00
Danny Kopping	4ddda3a9db	feat: filter interceptions and sessions by provider name (#25640) Allows filtering sessions & interceptions by provider name, and adds a test to validate that provider name is immutable (at least until #25606 lands).	2026-05-25 16:31:48 +02:00
Mathias Fredriksson	c8359d8598	fix(agent/agentproc): read process info before output to prevent TOCTOU (#25646 ) handleProcessOutput read proc.output() then proc.info() using separate locks. Between the two reads the exit goroutine could finish I/O and set running=false, pairing stale output with final status. On Windows CI this caused OutputExceedsBuffer to flake when the buffer snapshot caught mid-write data (OmittedBytes=0) but info reported the process as exited. Swap the read order so info is read first. The exit goroutine completes cmd.Wait (draining all pipe data) before setting running=false, so seeing Running=false guarantees the subsequent output read reflects the final buffer state. Closes CODAGT-399	2026-05-25 17:27:29 +03:00
Mathias Fredriksson	12f082c864	test(coderd/x/chatd): drain all subscriber events per tick in PromoteQueued tests (#25645 ) The root cause of the TestPromoteQueuedWhileRequiresActionMixedTools flake (CODAGT-425) was the subscriber out-of-order durable message delivery bug, fixed by PR #25433 (`ec1e861`). All five CI failures predate that fix. Zero failures since. This change hardens the subscriber event-drain pattern in both PromoteQueued requires_action tests: wrap the channel select in a for-loop so interleaved non-target events (status, queue_update, message_parts) are consumed in the same Eventually tick instead of each burning a 25ms interval. This is defense-in-depth for slow CI runners, not a standalone bug fix. Closes coder/internal#1523 Closes CODAGT-425	2026-05-25 16:55:48 +03:00
Cian Johnston	a4afb9dfc6	feat: add --env-file flag to develop.sh (#25621 ) Adds `--env-file` to `scripts/develop.sh` to allow reading environment from a given file. This makes it easier to configure things like external auth providers, access URLs, and other dev-time settings without exporting a wall of environment variables in every shell session. > Generated with [Coder Agents](https://coder.com/agents)	2026-05-25 11:54:57 +01:00
Michael Suchacz	ffc51ec8b3	feat(site/src/pages/AgentsPage): show MCP tool inputs (#25568 ) Generic agent chat tool cards now render an `Input` section before the existing output viewer, so MCP and workspace MCP tools expose the arguments sent to the tool. Empty inputs stay hidden, model-intent wrappers are stripped before display, and the formatted input is the single source of truth for whether an input block renders. Refs https://linear.app/codercom/issue/CODAGT-260/show-mcp-tool-inputs-in-agent-chats > Mux worked on this on Mike's behalf.	2026-05-25 12:09:03 +02:00
Sas Swart	3bf5f80277	feat(coderd/database): add boundary_sessions and boundary_logs tables (#25441 ) RFC: [Bridge ↔ Boundaries Correlation RFC](https://www.notion.so/coderhq/Gateway-and-Firewall-Correlation-RFC-31ad579be592803aa8b3d48348ccdde9) Add up/down migrations and matching sqlc queries for persisting Boundary audit events, as specified in the Bridge/Boundaries Correlation RFC. Tables: - `boundary_sessions`: session metadata with `workspace_agent_id` FK, `confined_process_name`, and timestamps (`started_at`, `updated_at`). ID is externally supplied by the Boundary process (no DB-side default). Created lazily when the first log for a session arrives. - `boundary_logs`: individual audit events with `session_id` FK, `sequence_number` (INT, primary ordering key), protocol/method/detail fields, and `matched_rule` (nullable; non-NULL implies allowed). Indexes (per RFC): - `(session_id, sequence_number)` for the ordering query path - `(captured_at)` for the retention purge path Queries: - `InsertBoundarySession` / `GetBoundarySessionByID` - `InsertBoundaryLog` / `GetBoundaryLogByID` - `ListBoundaryLogsBySessionID` with nullable `seq_after`/`seq_before` exclusive bounds for fetching events between two known interception sequence numbers - `DeleteOldBoundaryLogs` with row limit to avoid long-running transactions Also includes: dbgen helpers (`BoundarySession`, `BoundaryLog`), dbauthz implementations (reads gated on `ResourceAuditLog`, deletes on `ResourceSystem`), and all generated wrappers (dbmock, dbmetrics). No callers yet. A follow-up PR will add the dedicated `boundary_log` RBAC resource type. > Generated by Coder Agents	2026-05-25 11:14:36 +02:00
Danny Kopping	eddd4a8c2f	feat(coderd): accept delegated API key ID from in-process aibridge callers (#25625 ) Allows an `api_key_id` to be passed from a trusted in-memory transport (currently: `chatd`) to `aibridged` for use in authenticating LLM requests. This value can _only_ be passed via context, and all users of the in-memory transport _must_ provide it. It can be used in conjunction with BYOK headers. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 11:08:07 +02:00
Tyler	814386dda7	fix(site): left-align template README content instead of centering in narrow column (#25487 ) Closes #24183 ## Changes Drops `mx-auto` so README content left-aligns with the header. Bumps padding from 24px to 32px and widens `max-w` from 800px to 860px for breathing room. Applied to both: - `TemplateDocsPage.tsx` - `StarterTemplatePageView.tsx` > Generated with [Coder Agents](https://coder.com/agents)	2026-05-22 23:37:35 -07:00
Michael Suchacz	6739542875	test(coderd/x/chatd): skip signal wake send flake (#25633 ) Skips `TestSignalWakeSendMessage`, which flakes because the current chatd control notification flow can deliver stale status notifications after a new processing run starts. This mirrors the existing CODAGT-353 skips for the same stale-notification class and leaves the deterministic fix to that notification-flow refactor. Refs https://linear.app/codercom/issue/ENG-2727/flake-testsignalwakesendmessage > Generated by Coder Agents on behalf of @ibetitsmike.	2026-05-22 23:10:31 +00:00
Zach	a8962274fa	docs: describe how secrets reach a workspace (#25538 ) Replace the brief runtime-behavior paragraph with a dedicated section covering when env and file secrets appear in a workspace, what the running workspace sees, and how create/update/delete propagate. Call out that Coder never explicitly removes secret files it has written, so deleting a secret or changing its file path may leave the previous file on disk. Co-authored-by: Coder Agents <noreply@coder.com>	2026-05-22 14:06:47 -06:00
Mathias Fredriksson	471249f3e2	ci: migrate doc-check workflow to coder/agents-chat-action (#25178 ) Replace the inline `curl` + `jq` block in `.github/workflows/doc-check.yaml` with a single `uses: coder/agents-chat-action` step. Closes CODAGT-375	2026-05-22 19:09:36 +03:00
Jaayden Halko	ef3f95a7af	fix(site/src): account for iOS visual viewport offset (#25619 ) ## Summary - Compute mobile dropdown bottom offsets in layout-viewport coordinates, matching the fixed Radix popover wrapper. - Use `visualViewport.offsetTop` to clamp the above-composer popup height when iOS WebKit pans the visual viewport for the soft keyboard. - Align mobile dropdown width/left to the chat composer and add a Storybook regression for shifted visual viewports. ## Testing - `cd site && pnpm tsc --noEmit -p .` - `cd site && pnpm test:storybook src/pages/AgentsPage/components/ChatMessageInput/ChatMessageInput.stories.tsx` - `cd site && pnpm lint` ## Manual mobile verification Start dev mode with `./scripts/develop.sh`, open the forwarded port 8080 URL on a real iPhone in Safari and Chrome, focus the Agents chat input, type `/`, and verify the personal skills popup appears directly above the composer, stays within the visible viewport while the keyboard is open, and scrolls internally for long lists. Generated by Coder Agents.	2026-05-22 16:36:27 +01:00
Danny Kopping	0d9718e217	feat: add 'copilot' to ai_provider_type (#25616 )	2026-05-22 16:10:37 +02:00
Michael Suchacz	de6d62815e	fix(coderd): avoid redundant workspace setup (#25615 ) GPT-class chat turns could eagerly create workspaces or repeat setup such as cloning an existing repo because the system prompt framed setup work as the default path. This updates chatd prompt guidance and the `create_workspace` tool description so agents reuse existing chat and workspace context, treat injected workspace context as already read, avoid recloning present repositories, and create or start workspaces only when workspace-backed work is required. Delegated chats now report workspace needs to the parent instead of trying to create one. > Mux opened this PR on behalf of Mike.	2026-05-22 14:08:07 +00:00
Zach	8d0a73f0b1	chore: bump terraform-provider-coder and coder/preview for coder_secret removal (#25590 ) We decided to remove secret requirements and go a different direction for secrets in Coder (see PLAT-243). As a result, we removed the code in terraform-provider-coder and coder/preview to handle this resource. This PR pulls in said updated versions. Generated with assistance by Coder Agents.	2026-05-22 07:57:54 -06:00
Danielle Maywood	15e63dec6f	perf(site/src/pages/AgentsPage): combine array iterations (#25614 )	2026-05-22 14:23:22 +01:00
Jaayden Halko	e71710df37	fix(site/src): keep personal skills popup on-screen on mobile (#25598 ) On mobile, typing `/` in the chat input could leave the personal-skills popup partially clipped above the visible viewport. With the soft keyboard open, Radix's collision detection flipped the caret-anchored popup above the caret, and the resulting position pushed the top of the list off-screen. Add a `.mobile-full-width-dropdown-above-composer` CSS variant in `site/src/index.css`, driven by a new `--mobile-dropdown-above-composer-bottom` custom property set from the existing composer geometry effect in `AgentChatInput.tsx`. The variant pins the Radix popper wrapper to sit just above the chat input with the same horizontal padding (`calc(100vw - 2rem)`), and caps `max-height` to the space between the viewport top and the composer top so the inner `CommandList` scrolls when the skill list overflows. Apply the new classes to `PersonalSkillsTriggerMenu`'s `PopoverContent`. Desktop behavior is unchanged: the new selectors only apply below the `md` breakpoint, and the caret-anchored `PopoverAnchor` still drives Radix positioning everywhere else. Two new Storybook stories cover the mobile geometry: `MobileAboveChatInput` asserts the popup stays inside the visible viewport, and `MobileLongListScrolls` asserts the popup is scrollable when the skill list is taller than the available space. <details> <summary>Implementation plan</summary> The plan file lives at `/home/coder/.coder/plans/PLAN-28f5e6ed-97dd-4375-a338-60fded8ef8b0.md` in the agent workspace and was followed end-to-end without scope drift. Key decisions: - Did not reuse the existing `.mobile-full-width-dropdown-bottom` because its formula (`window.innerHeight - composer.bottom`) aligns the popup's bottom edge with the composer's bottom edge, which overlaps the composer rather than sitting above it. 
- Did not change the existing class's behavior because other dropdowns (Plus menu, ContextUsageIndicator, ModelSelector, WorkspacePill, CompactOrgSelector) rely on the current geometry. If the project decides the overlap pattern is also a bug, those callsites can migrate to the new variant in a separate change. - Kept the caret-pinned `PopoverAnchor` span in `PersonalSkillsTriggerMenu` because it still drives desktop positioning, and on mobile the CSS overrides the wrapper position entirely (same pattern as the existing `mobile-full-width-dropdown-bottom` usage). - Left `CommandList`'s `max-h-72` in place so desktop still caps the popup at ~18 rem; on mobile the wrapper's CSS-driven `max-height` is the binding constraint. </details> Generated by Coder Agents on behalf of @jaaydenh. --------- Co-authored-by: Coder Agents <noreply@coder.com>	2026-05-22 14:19:19 +01:00
Michael Suchacz	bdf2698fcd	fix: parse skill frontmatter as YAML (#25610 )	2026-05-22 15:09:30 +02:00
Cian Johnston	15ada66e14	feat: add pr, repo, pr_title chat search filters (#25569 ) Relates to CODAGT-432 Adds three new search filters to the chat list endpoint (`GET /api/experimental/chats/`): - `pr:<number>` - exact PR number match - `repo:<owner/repo>` - substring match against git remote origin or URL - `pr_title:<text>` - case-insensitive PR title substring match Includes SQL filter clauses (EXISTS against `chat_diff_statuses`), parser with validation, handler wiring, unit tests, swagger annotation update, and a new search syntax documentation page. > 🤖 Generated with [Coder Agents](https://coder.com/agents)	2026-05-22 13:58:07 +01:00
Danielle Maywood	5deab9f721	test: wait for devcontainer readiness (#25567 )	2026-05-22 13:55:21 +01:00
Matt Vollmer	3a2a97602e	fix(site/src/pages/AgentsPage): dismiss skills trigger on outside click (#25613 ) When the personal skills menu is open and the user clicks outside (e.g. the send button), the Popover closes via `onOpenChange` but the `SkillsTriggerPlugin`'s `dismissedTriggerRef` is not set. The next Lexical update listener call detects the trigger again and briefly reopens the menu, causing a visible flash. Addresses this symptom: https://github.com/user-attachments/assets/0c1442a2-df75-442b-bcf8-4b028dc647b0 Fix by recording the current trigger position in `dismissedTriggerRef` when the `open` prop transitions from `true` to `false`. This mirrors what the Escape key handler already does and prevents `refreshTrigger` from immediately re-opening the menu at the same position. <details><summary>Implementation details</summary> - Added a `useLayoutEffect` in `SkillsTriggerPlugin` that tracks `open` prop transitions via a `prevOpenRef`. When `open` goes from `true` to `false`, it snapshots the current trigger position into `dismissedTriggerRef`, matching the pattern the Escape handler uses (line 225-227). - Added `OutsideClickDismissesTriggerOnRefocus` Storybook regression story that verifies the menu stays closed when clicking back into the editor after an outside-click dismissal. </details> --- PR generated with Coder Agents	2026-05-22 08:49:00 -04:00
Danielle Maywood	fbf6fa1d25	chore(site/src/pages/AgentsPage): use Tailwind size shorthand (#25611 ) Replace redundant matching Tailwind width and height utilities in AgentsPage with the `size-*` shorthand. This addresses the AgentsPage `react-doctor/design-no-redundant-size-axes` findings without changing rendered dimensions.	2026-05-22 13:07:14 +01:00
Cian Johnston	e5293c81f9	fix(coderd): fix flaky TestSendMessageWithModelOverrideUpdatesLastModelConfigID (#25603 ) Fixes: ENG-2719 Fixes the flake in `TestSendMessageWithModelOverrideUpdatesLastModelConfigID` (and the same pattern in `TestSubsequentSendWithoutOverrideUsesPersistedModel`). > Generated with [Coder Agents](https://coder.com/agents)	2026-05-22 12:40:45 +01:00
Danny Kopping	ef6ee2af68	chore: tolerate empty providers at startup and log env seeds (#25605) Since AI Gateway is now enabled by default, the server can start without any configured providers if the AI Gateway Proxy is enabled too. This would previously block startup, which is unacceptable. In an upstack PR we will handle reloading the providers at runtime, so the server needs to be able to start up even if it can't handle any proxy requests to AI Gateway. This change is needed because providers configured in the environment must be seeded _before_ the proxy starts.	2026-05-22 12:45:14 +02:00
Cian Johnston	c8b1fa3196	fix: use UTC day boundaries for chat auto-archive eligibility (#25597 ) Fixes CODAGT-311. Users receive too many auto-archive notification emails because the dbpurge loop runs every 10 minutes and archives chats on each tick using timestamp-precise cutoffs, causing chats to trickle past the threshold continuously. Switch archive eligibility from timestamp arithmetic to date arithmetic (UTC day boundaries). All chats whose last activity falls on the same UTC date are now archived together on the first tick after midnight UTC, reducing notification emails to ~at most~ probably one per day. (Exception: if we hit the auto-archive limit) - SQL compares `(last_activity AT TIME ZONE 'UTC')::date` against cutoff date - Go truncates current time to start-of-day before subtracting archive days - Tests verify date boundary semantics including late-activity and batch edge cases - Docs updated to describe UTC day boundary behavior and at-most-daily notification cadence > [!NOTE] > Generated by Coder Agents	2026-05-22 11:39:44 +01:00
Mathias Fredriksson	0ba702c43f	fix: normalize command paths to base names in shellparse (#25599 ) Normalize program names in shellparse.Parse to their basename. Does not rely on filepath.Base because the server may run on either Linux or Windows where the behavior would differ. Closes CODAGT-470	2026-05-22 13:36:53 +03:00
Danny Kopping	5d40bac79f	feat: add in-memory transport for chatd -> aibridge routing (#25576 ) ### TL;DR Introduces an in-process `TransportFactory` for aibridge so that chatd (coder-agent LLM traffic) can route requests through the aibridged handler without crossing the HTTP route or requiring a license entitlement check. ### What changed? - Added a new `coderd/aibridge` package with a `TransportFactory` interface and a `Source` type for tagging the call site on request contexts. `SourceAgents` is defined as the constant for coder-agent traffic. - Implemented `NewTransportFactory` in `coderd/aibridged/transport.go`, which returns an `http.RoundTripper` that dispatches requests to the aibridged handler in-process. The response body is streamed through an `io.Pipe` so SSE/NDJSON/chunked responses propagate token-by-token. Handler panics are recovered and surfaced as 500 responses, and context cancellation closes the pipe with the appropriate error. - `RegisterInMemoryAIBridgedHTTPHandler` now also constructs a `TransportFactory` from the registered handler and stores it on `API.AIBridgeTransportFactory` (an `atomic.Pointer`), making it available to chatd without going through the license-gated HTTP route. - Added `API.AIBridgeTransportFactory` as a public `atomic.Pointer[aibridge.TransportFactory]` field on `coderd.API`. ### How to test? - `coderd/aibridged/transport_test.go` covers: transport creation, nil-handler errors, source attachment to context, header/status passthrough, streaming (SSE-style chunked writes visible before handler completion), context cancellation closing the body with an error, concurrent requests, handler panics producing 500s, and handlers that return without writing. - `coderd/aibridge_test.go` verifies that `AIBridgeTransportFactory` starts as nil on AGPL coderd, can be stored and loaded atomically, and that the stored factory correctly dispatches requests through the stub handler. ### Why make this change? 
Chatd needs to send LLM requests through aibridge in-process rather than via the external HTTP route, which is license-gated. The `TransportFactory` abstraction provides a clean seam: the entitlement check remains on the HTTP route for external callers, while in-process coder-agent traffic bypasses it through the factory. The `Source` type allows downstream handlers and logs to attribute traffic without gating behavior on the caller identity.	2026-05-22 12:33:10 +02:00
Ethan	c650aabbef	chore: standardize on _internal_test.go for white-box tests (#25601 ) My agent added `//nolint:testpackage` to a test file on one of my PRs. Again. This PR cleans it up across the entire repo and updates the in-repo conventions so future agents stop doing it. The repo already has a precedent for white-box tests that need to touch unexported symbols: `_internal_test.go` (145+ existing files). The `testpackage` linter's default `skip-regexp` exempts that filename suffix, so the `//nolint:testpackage` directive is unnecessary in every case where someone reached for it. This PR renames 51 such files to `_internal_test.go` via `git mv` so blame and history follow, and strips the dead directive from 2 files that were already correctly named (`coderd/oauth2provider/authorize_internal_test.go`, `coderd/x/chatd/advisor_internal_test.go`). `.claude/docs/TESTING.md` now documents the rule explicitly under Test Package Naming, which is imported into the root `AGENTS.md` via `@.claude/docs/TESTING.md`. The rule: prefer `package foo_test`; if you need internal access, rename the file to `_internal_test.go` rather than adding a nolint directive.	2026-05-22 20:24:38 +10:00
Ethan	705421bc5d	test: speed up agent container websocket close test (#25559 ) `TestWatchAgentContainers/CoderdWebSocketCanHandleClientClosing` spent about 15 seconds waiting for the real websocket heartbeat ticker to detect that the client closed. Add a clock-aware `HeartbeatClose` wrapper and pass `api.Clock` through the containers watch handler so the test can drive the heartbeat deterministically with `quartz.Mock`. The test still verifies the same client-close teardown path, but it advances the heartbeat tick instead of waiting for wall-clock time. Refs #25557 Discovered as part of the work on CODAGT-381.	2026-05-22 20:10:25 +10:00
Michael Suchacz	ca1f6b19a2	feat: remove legacy chat provider tables (#25416 )	2026-05-22 09:50:01 +02:00
Danny Kopping	ddec110b0e	refactor: move aibridged out of enterprise to AGPL (#25570) In order to allow Coder Agents to use AI Gateway in OSS, we need to rehome the `aibridged`-related code into the AGPL path. The HTTP API is only registered under enterprise, so it will still require the AI Governance Add-on to be present in order to use it, whereas Coder Agents uses an in-memory pipe to the same handlers.	2026-05-22 09:11:37 +02:00
Danny Kopping	c50b0e84b9	feat!: default `CODER_AI_GATEWAY_ENABLED` to true (#25575) `CODER_AI_GATEWAY_ENABLED` / `CODER_AIBRIDGE_ENABLED` now defaults to `true`, since it will be used by Coder Agents. If you previously disabled this value explicitly, that setting will persist.	2026-05-22 08:57:36 +02:00
Danny Kopping	9341efec9f	feat!: seed ai_providers from env on server startup (#24895) _Disclaimer: implemented by a Coder Agent using Claude Opus 4.7_ Part of the implementation of [RFC: Common AI Provider Configs](https://www.notion.so/coderhq/RFC-Common-AI-Provider-Configs-34bd579be59280ed958feffb82024797) (AIGOV-201). ## Note This change can cause a previously working installation to fail to start should a conflict exist between the providers configured in the environment & those now migrated to the database. I'll raise a PR upstack to document this process and workarounds should a startup fail. ## What this PR does Reconciles environment-derived AI provider configuration with the `ai_providers` table at server startup. The seed runs before the aibridged daemon is initialized, so the runtime always reads providers from the database; the legacy `CODER_AIBRIDGE_` environment variables become a one-shot migration source. ### Behavior - Concurrent server starts are serialized through a Postgres advisory lock (`LockIDAIProvidersEnvSeed`). - Missing rows are inserted with an audit entry attributed to the system actor. - Existing rows whose canonical hash matches the env-derived hash are left alone (the common no-op restart path). - Existing rows whose canonical hash does not match cause server startup to fail with a descriptive error so the operator can explicitly resolve the conflict in either env or DB. - Soft-deleted rows are NOT resurrected from env; an explicit operator deletion is sticky across restarts. - Indexed providers whose name conflicts with a legacy env var fail startup with a clear remediation message. - Unknown provider types (e.g. `copilot`, until the DB enum is widened) are skipped with a log entry rather than failing startup. 
### Canonical hashing The `canonicalAIProvider` shape captures exactly the fields that determine runtime behavior — `type`, `base_url`, and the Bedrock subset of settings (access key, access key secret, region, model, small fast model) — and is hashed with SHA-256. The hash is computed on demand from the row + env, never persisted, so the database does not need a new column for it. API keys live in the separate `ai_provider_keys` table and are intentionally excluded from the hash so operators can rotate keys via the API without forcing a server restart. <details> <summary>Decision log</summary> - The hash is intentionally not persisted in the database. The RFC discussed this trade-off; computing on demand keeps the schema minimal and lets the canonical shape evolve without a migration. - The lock uses an `iota` slot in `coderd/database/lock.go` rather than `GenLockID` so it's stable, easy to audit, and matches the convention used for every other startup lock. - A bearer-token Anthropic provider whose env vars also set Bedrock metadata but no AWS credentials does NOT store the Bedrock fields. Without credentials the discriminated settings would misrepresent the row as Bedrock auth. - We deliberately do NOT publish to the `ai_providers_changed` pubsub channel from the seed because the seed completes before any subscriber is started; the follow-up PR introduces that channel. </details>	2026-05-22 08:37:27 +02:00
Michael Suchacz	06526a5822	feat: use AI provider chat APIs (#25415 )	2026-05-22 07:53:23 +02:00
Kayla はな	10efde3e6c	fix(codersdk): fix stale comment reference (#25552 )	2026-05-21 21:11:11 -06:00
Michael Suchacz	5968c3dac7	feat: use AI provider keys at runtime (#25414 )	2026-05-22 02:17:09 +02:00
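
Commit 0ba702c43f in the log above normalizes program names to their base name without relying on `filepath.Base`, because that function's result depends on the host OS. The following is a minimal sketch of that idea, not the actual `shellparse` code: `baseName` is a hypothetical helper, and treating both `/` and `\` as separators on every platform is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// baseName is a hypothetical helper (not taken from shellparse). It returns
// the last path element of a program name, treating both '/' and '\' as
// separators no matter which OS the server runs on.
func baseName(program string) string {
	// Drop trailing separators so "/usr/bin/git/" still yields "git".
	trimmed := strings.TrimRight(program, `/\`)
	if trimmed == "" {
		// Input was empty or consisted only of separators; return it unchanged.
		return program
	}
	if i := strings.LastIndexAny(trimmed, `/\`); i >= 0 {
		return trimmed[i+1:]
	}
	return trimmed
}

func main() {
	for _, p := range []string{"/usr/bin/git", `C:\Tools\git.exe`, "git"} {
		fmt.Printf("%q -> %q\n", p, baseName(p))
	}
}
```

One trade-off of this sketch: a backslash is a legal file name character on Linux, so a name containing one would be split here. Whether the real change makes the same choice is not stated in the commit message.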


14469 Commits