coder

mirror of https://github.com/coder/coder.git synced 2026-06-02 20:48:20 +00:00

Author	SHA1	Message	Date
Michael Suchacz	8b1705eb65	feat: route chatd provider traffic through aibridge (#25629 ) ## Summary Routes chatd model calls backed by concrete AI Provider rows through the in-process aibridge transport by default, with deployment options to use direct provider routing when AI Gateway is disabled or chat AI Gateway routing is disabled. - Splits model routing into common, direct provider, and AI Gateway paths behind a single deployment-mode entry point. - Builds chatd models through explicit request, route, and options data. Active API key attribution is passed explicitly instead of being hidden inside generic model construction. - For AI Gateway BYOK routes, resolves the user's provider key in chatd, forwards it through provider-specific auth headers, and sets `X-Coder-AI-Governance-Token` to the `delegated` marker so aibridge preserves those headers while still stripping Coder-specific metadata. - Keeps central provider credentials and deployment fallback credentials out of forwarded provider auth headers, so AI Gateway central policy remains authoritative. - Redacts delegated provider auth from default string formatting to avoid accidental plaintext logging of user BYOK credentials. - Covers selected chat models, advisor overrides, title and quickgen paths, subagent overrides, computer use model selection, and an integration-style chat turn through the aibridge transport path. - Persists initiating API key IDs on chat and queued user messages, including subagent child messages, and fails closed for AI Gateway-routed model builds without an active key. - Removes unused `api_key_id` indexes while keeping the persistence columns and foreign keys. - Keeps the deployment option available through config and env parsing, but hides it from CLI help and generated docs. - Stabilizes the subagent poll fallback test so background CreateChat processing cannot win the state transition under slower CI environments. 
## Tests - `go test ./coderd/x/chatd -run 'TestAIGatewayProviderAuthForUser\|TestAIGatewayProviderAuthRedactsFormatting\|TestResolveModelRouteForConfigAIGatewayProviderAuth\|TestAIGatewayModelForwardsProviderAuth\|TestProcessChat_AIGatewayRoutingUsesDelegatedAPIKey\|TestAwaitSubagentCompletion' -count=1` - `go test ./coderd/aibridged -run 'TestServeHTTP_DelegatedAPIKey\|TestServeHTTP_StripCoderToken' -count=1` - `git diff --check HEAD~1..HEAD` - `make lint` > Mux working on behalf of Mike.	2026-05-26 19:31:52 +00:00
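The redaction bullet above ("Redacts delegated provider auth from default string formatting") can be illustrated with a minimal sketch. The type `delegatedAuth`, its fields, and the helper `formatAll` are hypothetical names chosen for this example; the commit does not show the real chatd type. The sketch only demonstrates the general Go technique of implementing `fmt.Stringer` and `fmt.GoStringer` so that common format verbs never print the secret.

```go
package main

import "fmt"

// delegatedAuth is a hypothetical holder for a user's BYOK provider
// credential; the real chatd type and field names are not shown in the log.
type delegatedAuth struct {
	Header string
	Value  string
}

// String implements fmt.Stringer, so %v, %+v, and %s print a redacted form.
func (a delegatedAuth) String() string {
	return "delegatedAuth{Header: " + a.Header + ", Value: [REDACTED]}"
}

// GoString implements fmt.GoStringer, so %#v is redacted as well.
func (a delegatedAuth) GoString() string { return a.String() }

// formatAll renders the value with the common verbs, joined by "|".
func formatAll(a delegatedAuth) string {
	return fmt.Sprintf("%v|%+v|%s|%#v", a, a, a, a)
}

func main() {
	a := delegatedAuth{Header: "Authorization", Value: "Bearer user-secret"}
	fmt.Println(formatAll(a))
}
```

Code that reads `a.Value` directly still sees the credential; the sketch only guards against accidental logging through `fmt`.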
Mathias Fredriksson	32ed9f1f39	fix: use old_text/new_text in edit_files tool schema (#25658 ) Models frequently confuse the search and replace fields in the edit_files tool (CODAGT-312). Rename the model-facing JSON fields to old_text/new_text so the intent is unambiguous. Backend: custom UnmarshalJSON on editFileEdit falls back to deprecated search/replace when old_text/new_text are empty. The workspace agent API is unchanged; toSDKFiles maps old_text/new_text back to search/replace for agent/agentfiles. Frontend: normalizeEdit in parseEditFilesArgs accepts both old_text/new_text and search/replace, normalizing to the internal { search, replace } representation so streaming diff rendering works with either field naming convention.	2026-05-26 11:11:47 +03:00
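The backend fallback described in this commit could look roughly like the sketch below. The type name `editFileEdit` comes from the commit message, but its fields, the per-field fallback rule, and the `decode` helper are assumptions made for illustration; the actual implementation may fall back differently (for example, only when both new fields are empty).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// editFileEdit mirrors the name in the commit message; the fields are assumed.
type editFileEdit struct {
	OldText string
	NewText string
}

// UnmarshalJSON prefers old_text/new_text and falls back to the deprecated
// search/replace keys when the new keys are empty (assumed per-field rule).
func (e *editFileEdit) UnmarshalJSON(data []byte) error {
	var raw struct {
		OldText string `json:"old_text"`
		NewText string `json:"new_text"`
		Search  string `json:"search"`
		Replace string `json:"replace"`
	}
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	e.OldText, e.NewText = raw.OldText, raw.NewText
	if e.OldText == "" {
		e.OldText = raw.Search
	}
	if e.NewText == "" {
		e.NewText = raw.Replace
	}
	return nil
}

// decode is a hypothetical helper that parses one edit from a JSON string.
func decode(s string) (editFileEdit, error) {
	var e editFileEdit
	err := json.Unmarshal([]byte(s), &e)
	return e, err
}

func main() {
	modern, _ := decode(`{"old_text":"foo","new_text":"bar"}`)
	legacy, _ := decode(`{"search":"foo","replace":"bar"}`)
	fmt.Println(modern == legacy)
}
```

Decoding into an anonymous struct inside `UnmarshalJSON` avoids infinite recursion, since the anonymous type has no custom unmarshaler.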
Ethan	fe13bb2a20	fix(coderd/x/chatd): seed afterMessageID test directly (#25665) This fixes the flaky `TestSubscribeAfterMessageID` by seeding its chat and messages directly, so the test no longer creates pending work that a chat worker can pick up. The assertion now covers only the `afterMessageID` subscription behavior, independent of chat processing lifecycle timing. Closes DEVEX-326 Closes https://github.com/coder/internal/issues/1489	2026-05-26 13:16:32 +10:00
Cian Johnston	579daaff70	feat: add GitLab support to coderd/externalauth/gitprovider Fixes CODAGT-146 Add GitLab support to the gitprovider package for gitsync/chatd PR diff flows. This is a squashed stack of 3 PRs: #25651 - refactor(coderd/externalauth): prepare gitprovider for multi-provider support - Change gitprovider.New to return (Provider, error) - Extract shared helpers (parseRetryAfter, checkRateLimitError, countDiffLines, escapePathPreserveSlashes) from github.go - Update all callers (db2sdk, exp_chats, gitsync) for new signature - Add error logging for provider construction failures - Thread context through provider resolution #25652 - feat(coderd/externalauth/gitprovider): add GitLab provider - Implement full Provider interface: FetchPullRequestStatus, FetchPullRequestDiff, FetchBranchDiff, ResolveBranchPullRequest - Handle nested groups, forks, and self-hosted instances - Rate limit detection on both library and raw HTTP paths - URL parsing/building with NormalizePullRequestURL support - Unit tests covering error paths, URL parsing, state mapping - Document GitLab configuration and known limitations #25653 - test(coderd/externalauth/gitprovider): add GitLab VCR integration tests - FetchPullRequestStatus: 4 fixtures (open, conflicts, merged, closed) - FetchPullRequestDiff: 4 fixtures - FetchBranchDiff: 3 fixtures (open, deleted, fork) - ResolveBranchPullRequest: 3 fixtures - go-vcr cassettes with sanitized GitLab API responses	2026-05-25 17:41:02 +01:00
Mathias Fredriksson	00a6dc56a7	test(coderd/x/chatd): wait for settled state in PromoteQueued ordering (#25644 ) TestPromoteQueuedWhileRunningRespectsMessageOrder was flaky because it read queue state from the database immediately after PromoteQueued returned. The active server worker drains queued messages concurrently, so the DB read races the auto-promote pipeline (TOCTOU). Instead of asserting intermediate queue state, wait for all three promoted messages to appear in chat history and verify their relative order (B before A before C). This asserts the same invariant (promote reorders B to the front) without reading during the race window. Closes CODAGT-384	2026-05-25 17:58:31 +03:00
Mathias Fredriksson	12f082c864	test(coderd/x/chatd): drain all subscriber events per tick in PromoteQueued tests (#25645 ) The root cause of the TestPromoteQueuedWhileRequiresActionMixedTools flake (CODAGT-425) was the subscriber out-of-order durable message delivery bug, fixed by PR #25433 (`ec1e861`). All five CI failures predate that fix. Zero failures since. This change hardens the subscriber event-drain pattern in both PromoteQueued requires_action tests: wrap the channel select in a for-loop so interleaved non-target events (status, queue_update, message_parts) are consumed in the same Eventually tick instead of each burning a 25ms interval. This is defense-in-depth for slow CI runners, not a standalone bug fix. Closes coder/internal#1523 Closes CODAGT-425	2026-05-25 16:55:48 +03:00
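The event-drain pattern described in this commit (wrapping the channel `select` in a `for` loop so that non-target events are consumed within a single polling tick) can be sketched as follows. The function `drainUntil` and the use of plain strings as events are simplifications invented for this example; the real tests operate on chatd subscriber events inside an `Eventually` callback.

```go
package main

import "fmt"

// drainUntil consumes every event currently buffered on the channel and
// reports whether the target was seen. It never blocks: when the channel is
// empty it returns false so the caller can poll again on the next tick.
func drainUntil(events <-chan string, target string) bool {
	for {
		select {
		case ev := <-events:
			if ev == target {
				return true
			}
			// Non-target events (status, queue_update, ...) are skipped
			// without ending the current tick.
		default:
			return false
		}
	}
}

func main() {
	events := make(chan string, 4)
	for _, ev := range []string{"status", "queue_update", "message_parts", "message"} {
		events <- ev
	}
	// One call is enough even though three unrelated events come first.
	fmt.Println(drainUntil(events, "message"))
}
```

Without the loop, each interleaved event would use up one polling interval, which is what the commit describes as burning a 25ms tick per event.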
Michael Suchacz	6739542875	test(coderd/x/chatd): skip signal wake send flake (#25633 ) Skips `TestSignalWakeSendMessage`, which flakes because the current chatd control notification flow can deliver stale status notifications after a new processing run starts. This mirrors the existing CODAGT-353 skips for the same stale-notification class and leaves the deterministic fix to that notification-flow refactor. Refs https://linear.app/codercom/issue/ENG-2727/flake-testsignalwakesendmessage > Generated by Coder Agents on behalf of @ibetitsmike.	2026-05-22 23:10:31 +00:00
Michael Suchacz	de6d62815e	fix(coderd): avoid redundant workspace setup (#25615 ) GPT-class chat turns could eagerly create workspaces or repeat setup such as cloning an existing repo because the system prompt framed setup work as the default path. This updates chatd prompt guidance and the `create_workspace` tool description so agents reuse existing chat and workspace context, treat injected workspace context as already read, avoid recloning present repositories, and create or start workspaces only when workspace-backed work is required. Delegated chats now report workspace needs to the parent instead of trying to create one. > Mux opened this PR on behalf of Mike.	2026-05-22 14:08:07 +00:00
Michael Suchacz	bdf2698fcd	fix: parse skill frontmatter as YAML (#25610)	2026-05-22 15:09:30 +02:00
Ethan	c650aabbef	chore: standardize on _internal_test.go for white-box tests (#25601 ) My agent added `//nolint:testpackage` to a test file on one of my PRs. Again. This PR cleans it up across the entire repo and updates the in-repo conventions so future agents stop doing it. The repo already has a precedent for white-box tests that need to touch unexported symbols: `_internal_test.go` (145+ existing files). The `testpackage` linter's default `skip-regexp` exempts that filename suffix, so the `//nolint:testpackage` directive is unnecessary in every case where someone reached for it. This PR renames 51 such files to `_internal_test.go` via `git mv` so blame and history follow, and strips the dead directive from 2 files that were already correctly named (`coderd/oauth2provider/authorize_internal_test.go`, `coderd/x/chatd/advisor_internal_test.go`). `.claude/docs/TESTING.md` now documents the rule explicitly under Test Package Naming, which is imported into the root `AGENTS.md` via `@.claude/docs/TESTING.md`. The rule: prefer `package foo_test`; if you need internal access, rename the file to `_internal_test.go` rather than adding a nolint directive.	2026-05-22 20:24:38 +10:00
Michael Suchacz	ca1f6b19a2	feat: remove legacy chat provider tables (#25416)	2026-05-22 09:50:01 +02:00
Michael Suchacz	06526a5822	feat: use AI provider chat APIs (#25415)	2026-05-22 07:53:23 +02:00
Michael Suchacz	5968c3dac7	feat: use AI provider keys at runtime (#25414)	2026-05-22 02:17:09 +02:00
Michael Suchacz	356bccddc2	feat: add personal skills settings UI and docs (#25066 ) > Mux updated this PR on behalf of Mike. ## Summary - Add experimental personal skills API helpers and an Agents settings UI for listing, creating, editing, deleting, and importing SKILL.md content. - Add docs, Storybook coverage, and unit tests for backend-compatible SKILL.md parsing. - Address review feedback by simplifying frontmatter scalar parsing, clarifying the UI parser scope, defaulting personal skill queries to `me`, and patching React Query caches after create, update, and delete. - Merge latest `main` and resolve the Agents sidebar refactor conflicts. ## Validation - pre-commit hook - `go test ./codersdk/workspacesdk -run TestParseSkillFrontmatter -count=1` - `go test ./coderd/x/chatd/chattool -run 'Test' -count=1` - `cd site && pnpm test -- src/pages/AgentsPage/utils/personalSkills.test.ts src/api/queries/userSkills.test.ts src/utils/fileSize.test.ts --runInBand` - `cd site && pnpm lint:types` - `cd site && pnpm lint:check`	2026-05-22 00:20:10 +02:00
Michael Suchacz	35a624bebd	fix(coderd/x/chatd): gate default branch agent pushes (#25578 ) > Mux is opening this PR on behalf of Mike. Agents could interpret a generic "commit and push" request on `main` as permission to commit on the current branch and push its upstream. Add version-control safety guidance to the default agent system prompt so agents check the current branch and push target, avoid default or protected branch commits and pushes unless the user explicitly confirms after a warning, avoid plain git push from those branches, and create a feature branch first when no explicit confirmation is present.	2026-05-21 22:04:38 +02:00
Mathias Fredriksson	f1b772928d	feat: parse execute tool commands and render them in the chat UI (#25478 ) When the execute tool runs a chained shell command, the UI previously rendered the raw string. Long chains like "cd /repo && git pull && git add . && git commit -m fix" were hard to scan. A new ChatMessagePart.ParsedCommands [][]string field on tool-call parts carries one entry per simple command, parsed in chatd from args via mvdan.cc/sh/v3/syntax. The frontend renders the joined list ("cd, git pull, git add, git commit") in place of the raw command, and falls back to the raw command when the field is absent. Closes CODAGT-446	2026-05-21 08:12:34 +00:00
Mathias Fredriksson	ec1e861152	fix(coderd/x/chatd): deliver out-of-order durable messages on subscribe (#25433 ) The subscriber advanced a single delivery cursor on each notify and trusted it for both lookups. Concurrent publishMessage calls and PG NOTIFY commit ordering let cache appends and notifies arrive out of ID order, after which a late notify would scan above its own message and drop it. The DB fallback was also skipped whenever the cache delivered anything, hiding cross-replica messages that only the DB held. The cursor becomes a high-water mark, not the lookup key. Notifies trigger a rescan over the gap they describe and dedupe per subscription, and the DB pass runs every time so cross-replica messages can't get eaten by a local cache hit. Closes coder/internal#1525 Closes CODAGT-357	2026-05-21 10:35:41 +03:00
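The change from "cursor as lookup key" to "cursor as high-water mark with per-subscription dedupe" can be illustrated with a small sketch. Everything here (the `subscription` type, its fields, and the `deliver` method taking a slice of message IDs) is a hypothetical simplification; the real subscriber rescans a cache and the database rather than receiving IDs directly.

```go
package main

import (
	"fmt"
	"sort"
)

// subscription tracks what one subscriber has already received.
type subscription struct {
	highWater int64              // largest ID delivered so far (informational)
	seen      map[int64]struct{} // per-subscription dedupe set
}

func newSubscription() *subscription {
	return &subscription{seen: make(map[int64]struct{})}
}

// deliver returns the IDs from a rescan that were not delivered before, in
// ascending order. An ID below the high-water mark is still delivered if it
// has not been seen, which is what rescues late, out-of-order messages.
func (s *subscription) deliver(rescanned []int64) []int64 {
	var out []int64
	for _, id := range rescanned {
		if _, dup := s.seen[id]; dup {
			continue
		}
		s.seen[id] = struct{}{}
		out = append(out, id)
		if id > s.highWater {
			s.highWater = id
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	sub := newSubscription()
	fmt.Println(sub.deliver([]int64{3}))    // message 3 arrives first
	fmt.Println(sub.deliver([]int64{2, 3})) // late message 2 is not dropped
	fmt.Println(sub.highWater)
}
```

A design that used `highWater` as the only filter would skip ID 2 in the second call, which matches the dropped-message symptom the commit describes.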
Michael Suchacz	63900d212d	feat: support personal skills in chats (#25366 ) > Mux updated this PR on behalf of Mike. ## Stack Context This PR builds on #25365 in the experimental personal skills stack and completes the chat integration. Stack order: 1. #25362 personal skill resolver 2. #25363 storage, permissions, API, and SDK 3. #25365 API test coverage 4. #25366 chattool and chatd integration 5. #25066 settings UI and docs 6. #25386 personal skills slash menu ## What? Updates chattool skill formatting and `read_skill` resolution so tools can read personal skills from the database, then injects personal skill metadata into chatd prompts and registers the skill-reading tools when skills are available. This branch has also been merged with current `origin/main` to resolve merge conflicts. ## Why? The chattool and chatd changes need to land together so the intermediate stack state stays buildable. This completes personal skill availability in chats without syncing personal skills into workspace filesystems. ## Validation - `go test -count=1 ./coderd/x/chatd/chattool -run 'TestFormatResolvedSkillIndex\|TestReadSkillTool\|TestReadSkillFileTool'` - `go test -count=1 ./coderd/x/chatd -run 'TestPersonalSkillsInSystemPrompt\|TestPersonalAndWorkspaceSkillCollisionInSystemPrompt\|TestSkillIndexRefreshReplacesStaleAliases\|TestFetchPersonalSkillMetadata\|TestLoadPersonalSkillBody'` - `go test -count=1 ./coderd -run 'Test.*UserSkill'` - `git diff --cached --check` - `make lint` - pre-commit hook	2026-05-20 19:50:50 +02:00
Michael Suchacz	13bf0e11f1	docs(coderd/x/chatd): define AI provider glossary (#25411 ) > Mux prepared this PR on behalf of Mike. ## Stack Context This is PR 1 of 6 in the `mike/ai-providers` Graphite stack. The stack migrates Agents chat provider configuration from legacy chat provider tables to the unified AI provider tables used by the AI provider administration surface. See the stack comment for review order and links. ## What? Adds a package-level `coderd/x/chatd/docs.go` glossary for AI Providers, provider-scoped keys, user BYOK keys, and Agents as the consuming feature area. ## Why? Keeping the glossary next to chatd makes the migration language visible where Agents consume AI Providers, without adding a separate PRD, root context file, or ADR structure.	2026-05-20 01:37:38 +02:00
Michael Suchacz	5a8d0016a5	feat: add personal skill storage, API, and SDK (#25363 ) > Mux updated this PR on behalf of Mike. ## Stack Context This PR is the storage, permissions, API, and SDK layer for experimental personal skills. #25362 has landed on `main`, so this branch is restacked directly on `main`. Stack order: 1. #25363 storage, permissions, API, and SDK 2. #25365 API test coverage 3. #25366 chattool and chatd integration 4. #25066 settings UI and docs 5. #25386 personal skills slash menu ## What? Adds the `user_skills` database table, generated queries, RBAC resources and scopes, audit resource handling, experimental user-scoped CRUD endpoints, SDK types, and generated API/site types. Follow-up review and restack fixes: - Enforce a bounded personal skill description in parser and database constraints. - Return `403 Forbidden` for unauthorized create and update attempts. - Return explicit conflict responses when soft-deleted users are targeted. - Keep user admins out of personal skills, while site owners can read and delete but not create or update. - Document trigger-raised constraint names and keep schema constants covered by tests. - Reuse `UserSkillMetadata` in the full `UserSkill` SDK response type. - Generate user skill IDs in Go instead of relying on a database default. - Rebase on latest `main` and renumber the user skills migration to `000502_user_skills`. ## Why? Personal skills need durable user-owned storage with owner authorization, limited site-owner moderation, and a hidden API surface before chatd can consume them. ## Validation - `make gen` - `go test ./coderd/database -run '^TestUserSkillSchemaConstants$' -count=1` - `go test ./coderd/database/dbauthz -run '^TestMethodTestSuite/TestUserSkills$' -count=1` - `go test ./coderd -run '^TestPatchUserSkill$' -count=1` - `go test ./codersdk ./coderd/database/db2sdk` - `make lint` - pre-commit hook on `97fd58108d`	2026-05-20 00:09:09 +02:00
Michael Suchacz	951a8e7237	feat: add intent labels to execute tool (#25482 ) > Mux opened this PR on behalf of Mike. Fixes CODAGT-451 Adds optional `model_intent` metadata to the built-in execute tool schema so tool calls can carry a short user-facing intent label without duplicating the command or duration. The Agents UI now composes that intent with the existing execute command and duration fields, displaying labels like `Checking repository state using git fetch origin for 2.3s` while keeping the shell command visible as the audit-relevant action. Existing execute calls without an intent keep the previous `Ran <command>` fallback label, so only intent-bearing calls get the new composed label.	2026-05-19 18:47:12 +02:00
Michael Suchacz	47b90afce6	fix(coderd/x/chatd/chatadvisor): truncate oversized advisor questions (#25489) Advisor tool calls currently reject questions over 2000 runes, which can leave the parent model retrying the same invalid call. This documents the limit in the advisor tool schema and guidance, then truncates oversized questions rune-safely before building the nested advisor prompt. > Mux working on behalf of Mike.	2026-05-19 17:57:14 +02:00
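Rune-safe truncation, as mentioned in this commit, means cutting on a rune boundary so a multi-byte UTF-8 character is never split. The sketch below shows one common way to do that in Go; the function name `truncateRunes` is hypothetical, and the commit does not show how the advisor code implements it.

```go
package main

import "fmt"

// truncateRunes returns s limited to at most max runes. Ranging over a
// string yields the byte offset at which each rune starts, so slicing at
// that offset never cuts a multi-byte character in half.
func truncateRunes(s string, max int) string {
	if max <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == max {
			return s[:i]
		}
		count++
	}
	return s // s already has max runes or fewer
}

func main() {
	fmt.Println(truncateRunes("héllo wörld", 4))
	fmt.Println(truncateRunes("short", 2000))
}
```

A byte-based slice such as `s[:max]` could end in the middle of `é` and produce invalid UTF-8, which is the failure mode this approach avoids.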
Cian Johnston	ce7f41f56d	fix: bump MaxChatFileIDs from 20 to 50 (#25492) Fixes CODAGT-456	2026-05-19 16:53:30 +01:00
Ethan	1e8c8d7dba	fix(coderd/x/chatd): drop orphan provider tool calls on replay (#25491) Anthropic replay can fail when stored history contains a provider-executed tool call like `web_search` without the matching provider-executed result. That orphaned call is incomplete provider-internal state, so replaying it can make an otherwise usable chat unreplayable even though there is no search result to preserve. This fixes replay by dropping orphan provider-executed tool calls from the model-visible prompt, preserving signed reasoning and the rest of the assistant content, then revalidating before the request. We do not synthesize tool results or drop reasoning. The database can retain the historical artifact for inspection, while Anthropic only sees replayable content. This matches permissively licensed prior art. Vercel AI SDK (Apache-2.0), used by mux, keeps incomplete tool state in UI/history but omits it from model requests with `convertToModelMessages(..., { ignoreIncompleteToolCalls: true })`. LangChain, LiteLLM, and OpenAI Agents (MIT for the relevant open-source code) also preserve Anthropic signed reasoning as opaque replay data. Coder applies that model-visible replay boundary explicitly because our persisted history is already in provider-message form. This matches mux, is cleaner than the older idea around not persisting the search query tool, and the model handles the repaired prompt fine. Closes CODAGT-448	2026-05-20 01:28:02 +10:00
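The replay repair in this commit (dropping provider-executed tool calls that have no matching result, while keeping reasoning and other content) can be sketched like this. The `part` struct, its field names, and the string type tags are assumptions for illustration; the real code works on provider-message types that the log does not show.

```go
package main

import "fmt"

// part is a hypothetical, flattened view of one assistant message part.
type part struct {
	Type             string // "text", "reasoning", "tool-call", "tool-result"
	ToolCallID       string
	ProviderExecuted bool
}

// dropOrphanProviderToolCalls removes provider-executed tool calls that have
// no provider-executed result with the same ID. All other parts, including
// reasoning and locally executed tool calls, are kept in their original order.
func dropOrphanProviderToolCalls(parts []part) []part {
	hasResult := make(map[string]bool)
	for _, p := range parts {
		if p.Type == "tool-result" && p.ProviderExecuted {
			hasResult[p.ToolCallID] = true
		}
	}
	out := make([]part, 0, len(parts))
	for _, p := range parts {
		if p.Type == "tool-call" && p.ProviderExecuted && !hasResult[p.ToolCallID] {
			continue // orphan: nothing to preserve, and replaying it may fail
		}
		out = append(out, p)
	}
	return out
}

func main() {
	history := []part{
		{Type: "reasoning"},
		{Type: "tool-call", ToolCallID: "a", ProviderExecuted: true}, // orphan
		{Type: "text"},
		{Type: "tool-call", ToolCallID: "b", ProviderExecuted: true},
		{Type: "tool-result", ToolCallID: "b", ProviderExecuted: true},
	}
	fmt.Println(len(dropOrphanProviderToolCalls(history)))
}
```

The input slice is left untouched, which mirrors the commit's point that the stored history can keep the artifact while only the model-visible prompt is repaired.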
Ethan	9444eddf4e	feat(coderd/x/chatd): allow attach_file in root plan-mode chats (#25388 ) `attach_file` was registered for plan-mode turns but never added to `builtinPlanToolAllowed`, so the per-turn `ActiveTools` allowlist filtered it out and calls failed with `Tool not active in this turn: attach_file`. This was an omission rather than a deliberate block — the tool (#24280) landed shortly after plan mode (#24236) and no subsequent edit to the allowlist picked it up. Add `attach_file` under the `isRootChat` case, matching how other artifact-producing tools (`propose_plan`, `write_file`, `edit_files`) are gated. The tool only reads from the workspace and writes to chat-attachment storage, so it preserves plan mode's invariant of not making implementation changes to the workspace. Subagents in plan mode remain restricted to the minimal read-only surface.	2026-05-19 17:01:23 +10:00
Danielle Maywood	170a6e1fe9	feat: add chat sharing foundation (#25041)	2026-05-18 22:32:05 +01:00
Kyle Carberry	385146000b	feat: record created_at/completed_at on reasoning ChatMessageParts (#24789 ) Records reasoning start and end times on persisted reasoning `ChatMessagePart`s so reasoning duration can be computed for stored chats. Backend-only: no SSE changes and no frontend rendering ship in this PR. The `created_at` field on `ChatMessagePart` is extended to also be present on `reasoning` parts (it previously appeared only on `tool-call` and `tool-result`), and a new `completed_at` field is added for `reasoning` parts. ### How timestamps are recorded - `StreamPartTypeReasoningStart`: stamp `startedAt = dbtime.Now()` on the active reasoning state. - `StreamPartTypeReasoningEnd`: stamp `completedAt = dbtime.Now()` and append both into parallel `[]time.Time` slices on `stepResult`. - Persistence reads the slices in occurrence order (reasoning has no provider-side ID) and applies them to the matching `ChatMessagePart` via `buildAssistantPartsForPersist`. The first reasoning block's stamps go onto the first reasoning part, and so on. - `flushActiveState` flushes partial reasoning interrupted before `StreamPartTypeReasoningEnd` with `startedAt` from the active state and `completedAt = dbtime.Now()` at the interruption. ### Why two fields, not one? Tool calls and results are point events. The frontend computes their duration by subtracting the call's `created_at` from the result's `created_at`. Reasoning is one assistant part that brackets a span, so we record both endpoints on the part itself. ### Why not stamp in `PartFromContent`? Same rationale as #24101: `PartFromContent` is called during both SSE publishing and persistence. Stamping there would yield incorrect persistence-time timestamps for reasoning blocks that finished much earlier in the step. Instead we capture in the chatloop and apply during persistence. 
<details><summary>Implementation plan</summary> - `codersdk/chats.go`: extend `CreatedAt`'s `variants` to include `reasoning?`; add `CompletedAt *time.Time` with `variants:"reasoning?"`. - `coderd/x/chatd/chatloop/chatloop.go`: extend `reasoningState` with `startedAt`; extend `stepResult` and `PersistedStep` with parallel `[]time.Time` reasoning slices; stamp on `ReasoningStart`/`ReasoningEnd`; thread the slices through all `PersistStep` call sites including the interrupt-safe path; record partial reasoning in `flushActiveState`. - `coderd/x/chatd/attachments.go`: walk reasoning parts in occurrence order and apply `step.ReasoningStartedAt[i]` to `part.CreatedAt` and `step.ReasoningCompletedAt[i]` to `part.CompletedAt`. ### Tests - `codersdk/chats_test.go` round-trips `created_at` + `completed_at` on reasoning parts and verifies omission when absent and partial interrupted parts. - `coderd/x/chatd/chatprompt/chatprompt_test.go` asserts `PartFromContent(ReasoningContent{})` does NOT stamp timestamps. - `coderd/x/chatd/chatloop/chatloop_test.go` `TestRun_ReasoningTimestamps` drives a stream with two reasoning blocks and verifies parallel slices, monotonicity, ordering, non-zero values, and content-block ordering. `TestRun_InterruptedReasoningFlushesTimestamps` cancels mid-reasoning and verifies `flushActiveState` records a non-zero pair. - `coderd/x/chatd/attachments_test.go` covers `buildAssistantPartsForPersist` for normal interleaved reasoning, partial (zero `completed_at`), and missing slices. </details> > Generated by Coder Agents. Co-authored-by: Coder Agent <agent@coder.com>	2026-05-18 12:30:30 -04:00
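The occurrence-order matching described in this commit (the i-th reasoning part receives the i-th entries of the parallel timestamp slices) can be sketched as below. The `messagePart` struct and the function `applyReasoningTimestamps` are hypothetical stand-ins for `ChatMessagePart` and `buildAssistantPartsForPersist`; treating a zero time as "leave the field unset" is also an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// messagePart is a hypothetical, reduced form of a persisted message part.
type messagePart struct {
	Type        string
	CreatedAt   *time.Time
	CompletedAt *time.Time
}

// applyReasoningTimestamps walks parts in order and assigns the i-th start
// and completion time to the i-th reasoning part. Missing or zero entries
// leave the corresponding field nil.
func applyReasoningTimestamps(parts []messagePart, started, completed []time.Time) {
	i := 0
	for idx := range parts {
		if parts[idx].Type != "reasoning" {
			continue
		}
		if i < len(started) && !started[i].IsZero() {
			t := started[i]
			parts[idx].CreatedAt = &t
		}
		if i < len(completed) && !completed[i].IsZero() {
			t := completed[i]
			parts[idx].CompletedAt = &t
		}
		i++
	}
}

// demo builds two reasoning blocks around a text part; the second block has
// a zero completion time.
func demo() []messagePart {
	base := time.Date(2026, 5, 18, 12, 0, 0, 0, time.UTC)
	parts := []messagePart{{Type: "reasoning"}, {Type: "text"}, {Type: "reasoning"}}
	started := []time.Time{base, base.Add(3 * time.Second)}
	completed := []time.Time{base.Add(2 * time.Second), {}}
	applyReasoningTimestamps(parts, started, completed)
	return parts
}

func main() {
	parts := demo()
	fmt.Println(parts[0].CompletedAt.Sub(*parts[0].CreatedAt))
	fmt.Println(parts[2].CompletedAt == nil)
}
```

Because reasoning blocks carry no provider-side ID, position is the only available join key, which is why the slices must stay parallel and ordered.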
Kyle Carberry	159089686a	fix(coderd/x/chatd): prime workspace MCP cache after create/start (#25298 ) ## Problem Mid-turn workspace MCP discovery was broken when an agent was still cold-starting. `PrepareTools` in `chatd.go` flipped `workspaceMCPDiscovered = true` before calling `discoverWorkspaceMCPTools`, so a failed discovery attempt permanently blocked retries within the turn. Customer-reported repro: - New chat with no pre-selected workspace. - LLM calls `create_workspace` mid-turn at `23:35:05`. - `PrepareTools` fires, dials the agent with a 30s timeout, dial times out at `23:38:15`, `discoverWorkspaceMCPTools` returns empty. - Agent connects at `23:38:29`, 14 seconds later. - `workspaceMCPDiscovered` was already true, so `PrepareTools` never retried for the rest of the turn. MCP tools only appeared on the next user message. A naive retry loop in `PrepareTools` would also miss the bigger picture: a workspace boot can take several minutes (EC2 cold start, 10 min startup scripts), and the chatloop only gets a chance to call `PrepareTools` between LLM steps. ## Fix Do the workspace MCP discovery from inside the tool that already waits for the agent. `chattool.CreateWorkspace` and `chattool.StartWorkspace` call `waitForAgentReady`, which has a 2 min agent-online budget plus a 10 min startup-script budget. By the time they fire `OnChatUpdated`, the agent is `Ready`. The chatd `onChatUpdated` callback now launches an async `primeWorkspaceMCPCache` goroutine on every bind that has a valid workspace ID: - The primer calls `discoverWorkspaceMCPTools` until it returns a non-empty list or `workspaceMCPPrimeMaxWait` (30s) elapses, with a 2s backoff between attempts. The bounded wait handles the short race between agent-online and the agent's MCP `Connect` settling. - The primer runs asynchronously so the tool itself never blocks. Some templates simply do not advertise MCP tools, in which case the primer would otherwise spend its full budget for nothing. 
- The primer shares the chat `ctx` (not a detached one) so it is canceled together with the chat. A dangling primer would re-dial the workspace conn after `runChat`'s deferred `workspaceCtx.close()` and leak that conn. - `inflight.Add(1)` ensures server shutdown still waits for any in-progress primer. - `PrepareTools` is simplified back to a single discovery call. It now only sets `workspaceMCPDiscovered = true` on success, so an empty result no longer permanently blocks discovery within the turn. The cache hit warmed by the primer makes that call cheap in the common case; the dial fallback handles the rare cache miss. ## Tests All in `coderd/x/chatd/chatd_internal_test.go`: - `TestPrimeWorkspaceMCPCache_SuccessOnFirstAttempt` — single `ListMCPTools` call returning tools populates the cache. - `TestPrimeWorkspaceMCPCache_RetriesUntilToolsAppear` — first call empty, second returns tools; primer retries past the backoff and writes the cache. Uses `quartz.Mock.Trap` on `NewTimer`. - `TestPrimeWorkspaceMCPCache_GivesUpAfterDeadline` — `ListMCPTools` always empty; primer stops at `workspaceMCPPrimeMaxWait` and refuses to cache the empty result so PrepareTools can retry on the next step. The existing integration test `TestRunChat_WorkspaceMCPDiscoveryAfterMidTurnCreateWorkspace` continues to pass and now also exercises the async-primer path end-to-end via the create_workspace tool. ``` go test ./coderd/x/chatd/... -count=1 go test ./coderd/x/chatd/ -race -count=1 make pre-commit ``` <details> <summary>Design notes</summary> - The first iteration of this PR added retry+cooldown+failure-cap logic inside `PrepareTools`. It worked for the customer's ~30s race window but did not help workspaces that take several minutes to boot, because `PrepareTools` only fires between LLM steps. Reviewer pointed out the right place to handle this is the tool itself; the current implementation does that. 
- Why async: a primer that ran synchronously inside the `OnChatUpdated` callback blocked the create_workspace tool from returning for up to `workspaceMCPPrimeMaxWait`, which broke `TestCreateWorkspaceTool_EndToEnd` and would hurt any template that does not expose MCP tools. Decoupling lets the tool return immediately and lets the primer warm the cache concurrently with the next LLM step. - Why share the chat `ctx` rather than `context.WithoutCancel(ctx)` (the title-generation pattern): the primer touches `workspaceCtx.getWorkspaceConn`, which `runChat`'s deferred `workspaceCtx.close()` invalidates. A detached primer outliving the chat would dial a fresh conn and leak it. - The constant naming distinguishes `workspaceMCPDiscoveryTimeout` (35s per-call dial budget, unchanged from #25169) from `workspaceMCPPrimeMaxWait` (30s total budget for the post-ready primer loop) and `workspaceMCPPrimeRetryInterval` (2s between empty-result retries). </details> Follow-up to #25169. --- _This pull request was generated by Coder Agents._	2026-05-18 07:55:56 -04:00
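The primer loop described in this commit (retry discovery until tools appear or a total budget elapses, with a fixed pause between attempts, and stop as soon as the chat context is canceled) can be sketched as follows. The function `primeCache`, the `discover` callback, and the demo helpers are hypothetical; the real code uses `discoverWorkspaceMCPTools`, the named constants from the commit, and a mockable clock (`quartz`) instead of the standard `time` package used here.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// primeCache calls discover until it returns a non-empty list, the total
// budget maxWait elapses, or the parent context is canceled. It returns nil
// on give-up so an empty result is never treated as a successful discovery.
func primeCache(ctx context.Context, discover func(context.Context) []string, maxWait, interval time.Duration) []string {
	ctx, cancel := context.WithTimeout(ctx, maxWait)
	defer cancel()
	for {
		if tools := discover(ctx); len(tools) > 0 {
			return tools
		}
		timer := time.NewTimer(interval)
		select {
		case <-ctx.Done():
			timer.Stop()
			return nil
		case <-timer.C:
		}
	}
}

// demoRetry simulates an agent whose MCP tools appear on the second attempt.
func demoRetry() ([]string, int) {
	calls := 0
	discover := func(context.Context) []string {
		calls++
		if calls < 2 {
			return nil
		}
		return []string{"workspace_tool"}
	}
	tools := primeCache(context.Background(), discover, time.Second, time.Millisecond)
	return tools, calls
}

// demoGiveUp simulates a template that never advertises MCP tools.
func demoGiveUp() []string {
	discover := func(context.Context) []string { return nil }
	return primeCache(context.Background(), discover, 20*time.Millisecond, 5*time.Millisecond)
}

func main() {
	tools, calls := demoRetry()
	fmt.Println(tools, calls)
	fmt.Println(demoGiveUp() == nil)
}
```

In the commit's design this loop runs in its own goroutine tracked by `inflight`, so the tool call returns immediately; the sketch omits that wiring and shows only the bounded retry.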
Ethan	e75bd3aca4	fix: preserve Anthropic replay fidelity (#25377 ) Anthropic is strict about replaying the latest assistant turn once it contains signed or redacted reasoning. We were still mutating that turn in a few Coder-owned places: dropping empty reasoning blocks on replay, rewriting provider-tool history during sanitization, and in the worst case sending a prompt we already knew Anthropic would reject. This patch keeps the latest signed assistant immutable through Coder's replay and sanitization paths, preserves empty signed or redacted reasoning anywhere Coder owns the ledger, and fails before the provider call if the prompt is still unsafe. It also bumps the existing `coder/fantasy` `coder_2_33` fork that `main` already uses to the commit containing coder/fantasy#35. These fixes have also been upstreamed to charmbracelet/fantasy. Closes CODAGT-409.	2026-05-18 15:20:33 +10:00
Michael Suchacz	792f0b4902	feat: add personal skill resolver (#25362 ) > Mux updated this PR on behalf of Mike. ## Stack Context This stack splits experimental personal skills into smaller reviewable PRs. Personal skills are user-owned `SKILL.md` files stored by Coder and injected into chatd alongside workspace skills. Stack order: 1. #25362 personal skill resolver 2. #25363 storage, permissions, API, and SDK 3. #25365 API test coverage 4. #25366 chattool and chatd integration 5. #25066 settings UI and docs 6. #25386 personal skills slash menu ## What? Adds the shared personal skill parser and resolver package, plus reusable skill-name validation exported from `workspacesdk`. The parser enforces the full personal skill contract: max raw size, kebab-case name, max name length, and non-empty body. ## Why? The rest of the stack needs one source-aware resolver for personal and workspace skills, including collision handling and qualified aliases. Keeping personal skill constraints in the parser prevents callers from accidentally parsing invalid personal skills. ## Validation - `go test ./coderd/x/skills ./codersdk/workspacesdk` - pre-commit hooks on this branch	2026-05-16 15:33:43 +00:00
Ethan	a59b951565	test: skip stale notification chatd flakes (#25376 ) These chatd tests are flaking for the same stale control-notification race tracked by CODAGT-353, so this change skips the newly reflaking advisor-chain and `TestPatchChatMessage/ChangesModel` tests and rewrites the older `TODO(hugodutka)` skips to point at the same root cause. This keeps the known flakes documented consistently until the chatd notification-flow refactor lands. Closes CODAGT-427 Closes https://github.com/coder/internal/issues/1510	2026-05-15 17:36:48 +10:00
Ethan	a35f71cd8a	fix(coderd/x/chatd): retry HTTP/2 stream resets (#25170 ) Mid-stream HTTP/2 peer resets from LLM providers can arrive after a 200 streaming response has already emitted provisional parts. Previously those resets fell through as generic non-retryable errors because `stream ID` messages did not match retryable transport signals, and stream IDs could be misread as HTTP statuses. Classify retryable HTTP/2 RST_STREAM codes as transient timeout failures, ignore stream IDs during status extraction, and keep the existing `retry` event as the rollback boundary for provisional message parts so replacement attempts do not replay failed-attempt output. Closes CODAGT-382	2026-05-14 11:40:43 +10:00
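One way to keep a stream ID from being misread as an HTTP status, as #25170 describes, is to drop the `stream ID <n>` token before searching for a status code. A minimal sketch under stated assumptions: `extractStatus` and both patterns are hypothetical, and the sample message only imitates the shape of an HTTP/2 stream error.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

var (
	// streamIDPattern matches the numeric stream ID so it can be removed first.
	streamIDPattern = regexp.MustCompile(`stream ID \d+`)
	// statusPattern matches a standalone 4xx or 5xx number.
	statusPattern = regexp.MustCompile(`\b[45]\d\d\b`)
)

// extractStatus returns the first 4xx/5xx code in msg, ignoring stream IDs.
// It returns 0 when no status code is present.
func extractStatus(msg string) int {
	cleaned := streamIDPattern.ReplaceAllString(msg, "stream ID")
	match := statusPattern.FindString(cleaned)
	if match == "" {
		return 0
	}
	code, err := strconv.Atoi(match)
	if err != nil {
		return 0
	}
	return code
}

func main() {
	fmt.Println(extractStatus("stream error: stream ID 429; INTERNAL_ERROR; received from peer"))
	fmt.Println(extractStatus("provider returned status 503"))
}
```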
Michael Suchacz	d1a471e29e	fix(coderd/x/chatd): retune subagent selection guidance (#25311) > Mux working on behalf of Mike. ## Summary - retune chatd subagent guidance to prefer `general` for substantial delegated work, including read-only synthesis and planning support - narrow `explore` guidance to repository-local code lookup and bounded tracing - add regression tests for planning, spawn tool, and Plan Mode guidance text ## Tests - `go test ./coderd/x/chatd -run 'Test(DefaultSystemPromptPlanningGuidance_SteersSubagentSelection|SpawnAgent_DescriptionSteersGeneralForSubstantialResearch|SpawnAgent_PlanModeDescriptionOmitsComputerUse|PlanningOverlaySubagentGuidance_UsesPlanModeSafeDescriptions|ExploreSubagentIsReadOnly)$'` - `make lint` - `make test TEST_PACKAGES=./coderd/x/chatd RUN=Guidance && make test TEST_PACKAGES=./coderd/x/chatd RUN=Description` - pre-commit hook during `git commit`	2026-05-13 23:10:21 +02:00
Kyle Carberry	b0b07536fc	feat: add opt-in Coder identity headers for MCP servers (#25153)	2026-05-12 08:54:53 -04:00
Michael Suchacz	f1d160c7f4	fix: allow changing model when editing earlier chat message (#25084 ) Editing a previous user message and selecting a different model in the picker silently kept using the original model: the selection was dropped on the frontend, in the SDK, and in the backend, so both the replacement user message and the assistant turn that followed ran against the old model. Plumb the selected model through all three layers (`AgentChatPage`, `codersdk.EditChatMessageRequest`, `chatd.EditMessageOptions` / `Server.EditMessage`), defaulting to the original message's model when the client does not specify one. The existing `InsertChatMessages` CTE already advances `chats.last_model_config_id` when the inserted message's model differs, so the assistant turn picks up the new selection without further changes. The new model is validated inside the transaction, so an unknown ID rolls the edit back and returns a 400 `Invalid model config ID.`, mirroring the `SendMessage` path. Refs: CODAGT-345 This change was generated by a Coder agent. <details> <summary>Implementation plan</summary> # CODAGT-345: Editing an earlier message cannot change model ## Problem When editing a previous user message in a chat, the user can change the model in the model picker, but the backend keeps using the original message's model. The model selection is dropped at three layers: 1. Frontend: `AgentChatPage.tsx`'s edit branch builds an `EditChatMessageRequest` that omits `model_config_id`. The new-message branch (a few lines below) does include it. 2. SDK: `codersdk.EditChatMessageRequest` has no `ModelConfigID` field at all. 3. Backend: `chatd.EditMessageOptions` has no model field, and `Server.EditMessage` always copies the original message's `ModelConfigID` into the replacement message. 
Once the replacement user message is inserted with the original model, the `InsertChatMessages` CTE leaves `chats.last_model_config_id` unchanged, so the assistant turn that follows runs against the old model. ## Fix Plumb the selected model through all three layers, defaulting to the original message's model when the client doesn't override it. This mirrors the `SendMessage` path, which already accepts a `model_config_id` and validates it via `resolveSendMessageModelConfigID`. ### Backend - `codersdk/chats.go`: add `ModelConfigID *uuid.UUID` to `EditChatMessageRequest`. - `coderd/x/chatd/chatd.go`: - Add `ModelConfigID uuid.UUID` to `EditMessageOptions`. - In `EditMessage`, after fetching the edited message, resolve the model: if `opts.ModelConfigID != uuid.Nil`, validate it exists with `tx.GetChatModelConfigByID` (using `chatdModelConfigLookupContext`), otherwise keep `editedMsg.ModelConfigID.UUID`. Pass the resolved ID into `newChatMessage(...)`. - Reuse the existing `ErrInvalidModelConfigID` sentinel. - `coderd/exp_chats.go` (`patchChatMessage`): - Read `req.ModelConfigID` (nil-safe), pass into `chatd.EditMessageOptions`. - Add a `case xerrors.Is(editErr, chatd.ErrInvalidModelConfigID)` arm returning 400 `Invalid model config ID.`, matching the `postChatMessages` handler. ### Frontend - `site/src/pages/AgentsPage/AgentChatPage.tsx`: - In the edit branch, set `model_config_id: effectiveSelectedModel || undefined` on the `EditChatMessageRequest`. - On success, persist the chosen model to `lastModelConfigIDStorageKey` so the next chat from this browser keeps the same default. Mirrors the new-message branch. ### Generated - `make site/src/api/typesGenerated.ts` and `make coderd/apidoc/swagger.json` produce the updated `EditChatMessageRequest` schema in `typesGenerated.ts`, `coderd/apidoc/{docs.go,swagger.json}`, and `docs/reference/api/{chats.md,schemas.md}`. 
## Tests - `coderd/x/chatd/chatd_test.go`: - `TestEditMessageWithModelConfigOverride`: edit with a different model -> replacement message and `chats.LastModelConfigID` use the new model. - `TestEditMessagePreservesModelConfigByDefault`: edit without `ModelConfigID` -> original model preserved. - `TestEditMessageRejectsUnknownModelConfig`: passes a random UUID -> `ErrInvalidModelConfigID`, original message still present, `LastModelConfigID` unchanged (rollback). - `coderd/exp_chats_test.go` (under `TestPatchChatMessage`): - `ChangesModel`: end-to-end via SDK; `edited.Message.ModelConfigID` and `chat.LastModelConfigID` both match the new model. - `InvalidModelConfigID`: random UUID -> 400 `Invalid model config ID.`. </details>	2026-05-12 14:51:55 +02:00
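The resolution rule from #25084 (use the override when one is supplied and known, otherwise keep the edited message's model, and reject unknown IDs) can be sketched in isolation. This is an illustration only: `resolveEditModel`, the string IDs, and the in-memory `known` set are hypothetical stand-ins for the `uuid.UUID` values and the transactional lookup the plan describes.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalidModelConfigID plays the role of the sentinel named in the plan.
var errInvalidModelConfigID = errors.New("invalid model config ID")

// resolveEditModel returns the model to use for the replacement message.
// A nil or empty override keeps the original model; an unknown override fails.
func resolveEditModel(original string, override *string, known map[string]bool) (string, error) {
	if override == nil || *override == "" {
		return original, nil
	}
	if !known[*override] {
		return "", errInvalidModelConfigID
	}
	return *override, nil
}

func main() {
	known := map[string]bool{"model-a": true, "model-b": true}
	next := "model-b"
	bogus := "model-z"

	fmt.Println(resolveEditModel("model-a", nil, known))
	fmt.Println(resolveEditModel("model-a", &next, known))
	fmt.Println(resolveEditModel("model-a", &bogus, known))
}
```

Returning an error rather than silently falling back matches the rollback behavior the commit describes, where an unknown ID aborts the edit and surfaces as a 400.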
Michael Suchacz	f847ff3731	test(coderd/x/chatd): skip stale notification flakes (#25177 ) Skip the chatd tests that currently flake because the control notification flow cannot distinguish stale wake/status NOTIFY payloads from real interrupt requests. Each skipped test includes a TODO to re-enable it after the chatd notification flow refactor handles stale notifications correctly. Supersedes #25133, #25134, #25135, and #25139. Refs [CODAGT-353](https://linear.app/coder/issue/CODAGT-353), [CODAGT-356](https://linear.app/coder/issue/CODAGT-356), [CODAGT-360](https://linear.app/coder/issue/CODAGT-360), and [CODAGT-361](https://linear.app/coder/issue/CODAGT-361). > Mux working on behalf of Mike.	2026-05-12 14:50:30 +02:00
Ethan	4e08543ace	test(coderd): centralize chat test harness and stabilize flakes (#25171 ) Chat tests previously constructed a real `openai` provider with a fake API key and no `BaseURL`, so background title generation hit `api.openai.com` and timed out under `-race`. The same root cause produced several distinct flakes: title regeneration races with synchronous `UpdateChat`/`ProposeChatTitle`, and pagination races against `updated_at` bumps from real-network processing. This moves the fake OpenAI-compatible provider and the chat-settle wait into first-class `coderdtest` capabilities. `coderd.Options.ChatProviderAPIKeys` is the new seam tests use to redirect chat traffic to a local `httptest.Server`. `coderdtest.WaitForChatSettled` replaces per-test waiters and drains tracked chat-daemon work after the chat row leaves `pending`/`running`. The `newChatClient*` constructors funnel through one options builder that installs the fake provider before the coderd test server so cleanup ordering is deterministic. Closes https://github.com/coder/internal/issues/1528 & Closes ENG-2659 Closes https://github.com/coder/internal/issues/1480 & Closes CODAGT-359 Closes https://github.com/coder/internal/issues/1507 & Closes CODAGT-368 Relates to https://github.com/coder/internal/issues/1397 & Relates to CODAGT-374	2026-05-12 22:13:55 +10:00
Kyle Carberry	376fc80451	fix(coderd/x/chatd): discover workspace MCP tools mid-turn after create_workspace (#25169 ) ## Problem In `coderd/x/chatd/chatd.go` `runChat`, workspace MCP discovery is gated on `chat.WorkspaceID.Valid` at the start of each turn. New chats that bind their workspace mid-turn (via `create_workspace` or `start_workspace`) get an empty workspace tool list on the first step, and the model falls back to `execute` (bash) because no workspace MCP tools are advertised. Repro: new chat → "create a workspace and use MCP tools". No `/api/v0/mcp/tools` request hits the agent on turn 1; turn 2 in the same chat works fine. ## Fix - Add a `PrepareTools` callback to `chatloop.RunOptions`, analogous to `PrepareMessages`. It is invoked once before each LLM step with the current tool list. When it returns non-nil, the chatloop replaces `opts.Tools`, rebuilds the per-step tool definitions, and appends new tool names to `opts.ActiveTools` so newly injected tools are callable immediately. - Wire `PrepareTools` in `runChat` to trigger workspace MCP discovery the first time the chat snapshot reports a valid `WorkspaceID`. The previous top-of-turn discovery path is unchanged for chats that start with a workspace. - Extract the discovery logic into `Server.discoverWorkspaceMCPTools` so the top-of-turn and mid-turn paths share identical behavior (cache, agent resolution, `ListMCPTools` timeout, invalidation). Mid-turn discovery stays disabled in plan-mode turns and Explore subagents, matching the existing top-of-turn gate. The `workspaceMCPDiscovered` flag prevents redundant dials after the first successful discovery. ## Tests - `coderd/x/chatd/chatloop/chatloop_test.go`: two new `TestRun_PrepareTools*` cases covering injection on the next step and active-set merging when `ActiveTools` is non-empty. 
- `coderd/x/chatd/chatd_test.go`: `TestRunChat_WorkspaceMCPDiscoveryAfterMidTurnCreateWorkspace` drives `runChat` through a `create_workspace` tool call against a real Postgres + mocked agent conn and asserts the second streamed LLM request advertises the workspace MCP tool. Verified that the test fails (and pinpoints the missing tool) when the `PrepareTools` wiring is disabled. ## Validation ``` go test ./coderd/x/chatd/chatloop/... -count=1 go test ./coderd/x/chatd/... -count=1 make lint/emdash ``` <details> <summary>Decision log</summary> - Chose a per-step `PrepareTools` callback over mutating `opts.Tools` in place because `chatloop.Run` builds the `fantasy.Tool` definitions once at start; a hook is required to let the LLM see new tools on the next step. - Returned `[]fantasy.AgentTool` (not also active-tool-names) and let the chatloop derive name merges via `mergeNewToolNames`. This avoids leaking plan-mode gating decisions into the callback contract. - Kept the existing top-of-turn discovery path so chats that already have a workspace at turn start pay no extra latency. - Skipped reusing `ReloadMessages` (history reload) since this is purely a tool-availability concern; coupling it to a history reload would defeat the chatloop cache prefix optimizations. </details> --- _This pull request was generated by Coder Agents._	2026-05-12 00:30:56 -04:00
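The active-set merge that #25169 attributes to `mergeNewToolNames` might look something like the sketch below. The signature and behavior here are assumptions made for illustration (the commit does not show the function), so the helper is named `mergeToolNames` to avoid implying it is the real one.

```go
package main

import "fmt"

// mergeToolNames appends names from discovered that are not already in active,
// preserving the existing order and skipping duplicates.
func mergeToolNames(active, discovered []string) []string {
	seen := make(map[string]struct{}, len(active))
	for _, name := range active {
		seen[name] = struct{}{}
	}
	// Copy so the caller's slice is never mutated.
	merged := append([]string(nil), active...)
	for _, name := range discovered {
		if _, ok := seen[name]; ok {
			continue
		}
		seen[name] = struct{}{}
		merged = append(merged, name)
	}
	return merged
}

func main() {
	active := []string{"execute", "create_workspace"}
	discovered := []string{"create_workspace", "workspace_mcp_tool"}
	fmt.Println(mergeToolNames(active, discovered))
}
```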
Kyle Carberry	5a5cd79c4c	fix: drop buffered chat parts after their durable message commits (#25164)	2026-05-12 00:30:38 -04:00
Kyle Carberry	0ed57ee343	fix(coderd/x/chatd): checkpoint buffered message_parts to avoid stale replay (#25145)	2026-05-11 17:27:03 -04:00
Thomas Kosiewski	e56381eb61	feat: stream advisor tool output (#25032 ) Stream advisor output into the advisor tool card while the nested advisor call is still running. This keeps the advisor implementation intentionally advisor-specific: the parent model still receives the same final structured tool result, while the frontend receives transient `tool-result.result_delta` parts to render partial advisor text in the expanded card. The final persisted chat history remains unchanged. Refs CODAGT-322. Generated by Coder Agents. <details> <summary>Implementation plan</summary> - Publish advisor text deltas from the nested `chatloop.Run` via `RunAdvisorOptions.OnAdviceDelta`. - Forward those deltas through `chatadvisor.Tool` with the parent advisor tool call ID. - Emit transient `ChatMessagePartTypeToolResult` websocket parts with `ResultDelta` from `chatd`. - Add `result_delta` to the generated tool-result TypeScript variant. - Accumulate tool result deltas in frontend stream state and keep the tool running until the final result arrives. - Render streamed advisor advice in the existing advisor card using streaming markdown mode, while retaining the updated advisor UI. </details>	2026-05-11 20:18:49 +02:00
Michael Suchacz	6bb88775ab	test(coderd/x/chatd): pin TestGetWorkspaceConn_StatusCheck to mock clock (#25130 ) The `TimedOutAgentCacheHit`, `CacheHitHealthyAgent`, and `CacheHitDBError` subtests of `TestGetWorkspaceConn_StatusCheck` built their `WorkspaceAgent` timestamps with `time.Now()` in the parent test's slice literal and then ran the actual check against the server's real wall clock (`quartz.NewReal()`). On slow Windows CI runners, more than `agentInactiveDisconnectTimeout` (30s) of wall time can elapse between slice construction and the parallel subtest body. In that window, the cached "healthy" agent gets reclassified as disconnected by `agentDisconnectedFor`, and `CacheHitHealthyAgent` fails with `errChatAgentDisconnected` instead of returning the cached connection. Build each agent inside the subtest with `quartz.NewMock(t)` and feed the same clock into the `Server` so the agent timestamps and the status math share a single frozen `now`. This matches the pattern already used by `TestGetWorkspaceConn_DialTimeoutDisconnectedRecoveryThreshold` in the same file. Closes https://github.com/coder/internal/issues/1522 <details> <summary>Verification</summary> Inserting `time.Sleep(35 * time.Second)` at the top of each subtest's body reliably reproduces the original failure (`errChatAgentDisconnected` on `CacheHitHealthyAgent`) on the parent commit and passes with this change. After removing the synthetic sleep, `go test ./coderd/x/chatd -run TestGetWorkspaceConn_StatusCheck -count=50` passes cleanly. </details> > Generated by Coder Agents on behalf of the assignee. Co-authored-by: Coder Agents <noreply@coder.com>	2026-05-11 19:53:58 +02:00
Michael Suchacz	60779ad2ec	test(coderd/x/chatd): stop waking acquireLoop in TestResolveExploreToolSnapshot (#25129 ) Fixes [CODAGT-367](https://linear.app/codercom/issue/CODAGT-367). `TestResolveExploreToolSnapshot/` flaked on CI (Linux and Windows) with `context deadline exceeded` on the `GetMCPServerConfigsByIDs` call inside `resolveExploreToolSnapshot`. Each test setup called `server.CreateChat` twice with `MCPServerIDs` set to fake `.example.com` URLs. `CreateChat` marks the chat pending and calls `signalWake`, which causes the chatd background `acquireLoop` to pick the chat up. That goroutine then dialed the fake MCP URLs (NXDOMAIN, slower on Windows) and made an OpenAI request with the dbgen default test key (401). Under CI load, that activity racing the 4 parallel subtests' `GetMCPServerConfigsByIDs` calls was enough to exceed the 25s test context deadline. The failure logs in the issue showed both side effects firing in the same job. `resolveExploreToolSnapshot` only reads `ID`, `MCPServerIDs`, `PlanMode`, `ParentChatID`, and `Mode` off the parent argument, so the chats do not need to be persisted. Build them as in-memory `database.Chat` values instead. The MCP server configs remain in the DB because the function still queries them via `GetMCPServerConfigsByIDs`. Verified locally with `go test ./coderd/x/chatd -run TestResolveExploreToolSnapshot -count=100 -race` (passes, ~5s total) and the surrounding `TestResolve` / `TestCreateChildSubagentChat` / `TestSpawnAgent_Explore` tests. --- _Made by Coder Agents on behalf of @ibetitsmike. [Linear session](https://linear.app/codercom/issue/CODAGT-367/flake-testresolveexploretoolsnapshot#agent-session-0730f3fe)._	2026-05-11 19:46:59 +02:00
Michael Suchacz	645b8cc63d	fix(coderd/x/chatd/chaterror): deflake TestClassify_ParsesRetryAfterHTTPDate (#25128 ) The test built a `Retry-After` HTTP-date with `time.Now().Add(3*time.Second).UTC().Format(http.TimeFormat)`, then asserted that the parsed `RetryAfter` was `>= 2s`. `http.TimeFormat` has second precision, so `Format()` truncates up to ~1s. Combined with the small elapsed time between formatting in the test and `time.Until()` in production, the value could land just under `offset-1s` (1.997s observed in CI), failing the lower bound. Round the formatted target up to the next whole second so the parsed deadline is never earlier than `now+offset`, and assert against a symmetric `[offset-1s, offset+1s]` window. Closes [CODAGT-365](https://linear.app/codercom/issue/CODAGT-365/flake-testclassify-parsesretryafterhttpdate) Refs https://github.com/coder/internal/issues/1512 <sub>Created by [Coder Agents](https://coder.com/docs/agent).</sub> Co-authored-by: Coder Agents <coderagents@coder.com>	2026-05-11 19:09:51 +02:00
Cian Johnston	e8508b2d90	fix: recover chatd from poisoned chain anchor on retry (#25097) When OpenAI's Responses API returns `Previous response with id ... not found` for a chained turn, classify it as a `ChainBroken` retry, clear `previous_response_id`, exit chain mode, reload full history, and let `chatretry` retry. Self-heals chats whose anchor was poisoned before #25074 stopped truncated streams from being persisted as a successful turn with a stored response id. The new state is exposed via the existing `coderd_chatd_stream_retries_total` counter as a `chain_broken="true"|"false"` label. Aggregating queries (`sum`, `rate` over `provider`/`model`/`kind`) keep working without changes; raw-series matchers without aggregation will now see two series per `(provider, model, kind)` where they previously saw one. The metric is internal-only so the blast radius should be small, but if you have dashboards that index by exact label matchers without aggregation they will need an extra `sum` or an explicit `chain_broken` selector. > 🤖 This PR was created with the help of Coder Agents, and was reviewed by a human 🧑‍💻	2026-05-11 17:43:40 +01:00
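The recovery sequence in #25097 (detect the broken chain, clear the anchor, leave chain mode, then retry with full history) could be sketched as below. Everything here is hypothetical: `turnState`, `isChainBroken`, and the substring match are illustrative and are not taken from chatd's actual classifier.

```go
package main

import (
	"fmt"
	"strings"
)

// turnState is a hypothetical stand-in for the chained-turn bookkeeping.
type turnState struct {
	PreviousResponseID string
	ChainMode          bool
}

// isChainBroken reports whether msg looks like the provider's
// "Previous response with id ... not found" error.
func isChainBroken(msg string) bool {
	lower := strings.ToLower(msg)
	return strings.Contains(lower, "previous response with id") &&
		strings.Contains(lower, "not found")
}

// recoverFromBrokenChain clears the anchor and exits chain mode when the
// error matches. It returns true when the caller should retry with full history.
func recoverFromBrokenChain(s *turnState, errMsg string) bool {
	if !isChainBroken(errMsg) {
		return false
	}
	s.PreviousResponseID = ""
	s.ChainMode = false
	return true
}

func main() {
	state := &turnState{PreviousResponseID: "resp_123", ChainMode: true}
	retry := recoverFromBrokenChain(state, "Previous response with id 'resp_123' not found.")
	fmt.Println(retry, state.PreviousResponseID == "", state.ChainMode)
}
```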
Michael Suchacz	915956460a	feat(coderd/x/chatd): add compact turn status labels (#25043 ) > Mux is acting on Mike's behalf. Changes chat turn-end summaries into compact status labels for the cached `last_turn_summary` and successful web push body. Uses a structured-output model call for successful turns, requiring a 2-5 word `label` and validating it to reject agent-centric phrasing. Pending and requires-action states keep deterministic status labels. Removes the earlier deterministic tool-signal pipeline in favor of the smaller structured-output path.	2026-05-11 17:09:42 +02:00
Mathias Fredriksson	fb60bb0c08	chore(coderd/x/chatd): instrument PromoteQueued + stream subscriber for ENG-2645 (#25085 ) TestPromoteQueuedWhileRequiresActionMixedTools has flaked three times across Windows and Ubuntu CI runners since 2026-05-06; local repro on the dev workspace has not surfaced it. The May 8 Ubuntu log shows all four PromoteQueued post-TX pubsub publishes reaching pg_notify, yet the test still times out 25s later, so the failure is downstream between the subscriber's listener and the test's events channel. Adds three Debug-level markers in chatd.go (no logic change) plus two t.Logf markers in the test's reader so the next CI occurrence pins down exactly which step failed. Closes ENG-2645 Closes coder/internal#1523	2026-05-11 08:33:46 +00:00
Ethan	063c06ca5f	test: prevent expired contexts in chatd parallel subtests (#25107 ) Parallel subtests in `coderd/x/chatd` reused a parent test context with a `testutil.WaitLong` deadline, so the context could expire before a subtest was scheduled under load. That made the subagent lifecycle tools return plain-text context errors instead of the expected JSON payload, causing flaky JSON unmarshal failures. Create fresh `chatdTestContext` values inside the affected parallel subtests and add `chatdTestContext` to the `paralleltestctx` custom function list so this pattern is caught by `make lint`. Closes https://github.com/coder/internal/issues/1494	2026-05-11 17:48:27 +10:00
Ethan	bd6cc1aaf2	feat(coderd): add stop_workspace chatd tool and recovery classification (#24997 ) ## Summary Adds a `stop_workspace` tool to chatd so the model can recover from the "workspace running but agent dead" failure mode (e.g. an OOM that leaves the workspace running but the agent unreachable) by stopping and then starting the workspace. <img width="924" height="742" alt="image" src="https://github.com/user-attachments/assets/279dedb6-6e29-4fe1-8754-3a1f01e538bf" /> ## What changed New `stop_workspace` chatd tool (`coderd/x/chatd/chattool/stopworkspace.go`). Mirrors `start_workspace`: shares `WorkspaceMu` to serialize with create/start, waits for any in-progress build before issuing a stop, and is idempotent only after a successful Stop transition. Failed stop builds re-attempt rather than reporting success. New `chatStopWorkspace` coderd hook (`coderd/exp_chats.go`). Mirrors `chatStartWorkspace` minus the `RequireActiveVersion` gate. Stop should not be blocked by template version policy. Differentiated recovery sentinels (`coderd/x/chatd/chatd.go`). `errChatAgentDisconnected` instructs the model to call `stop_workspace` then `start_workspace`. `errChatDialTimeout` instructs a single retry, then user escalation if it repeats. The previous single message conflated transient and persistent failures. Two-signal recovery gate. Recovery is only surfaced when a tool call times out and a fresh DB read of the latest workspace agent says `Disconnected`. The previous draft escalated on the DB read alone, which would fire on a 30-second heartbeat blip (e.g. agent respawn) and prompt a destructive stop/start unnecessarily. Cache-hit disconnected handling now clears the cache and retries a fresh dial before escalating, rather than returning the recovery sentinel immediately. Latest-agent classification uses `GetWorkspaceAgentsInLatestBuildByWorkspaceID` instead of the chat's bound `AgentID`, so stale bindings after a rebuild don't misclassify. 
Shared chattool helpers in `coderd/x/chatd/chattool/chattool.go`: `latestWorkspaceBuildAndJob`, `publishBuildBinding`, `provisionerJobTerminal`. Applied to both `start_workspace` and `stop_workspace`. ## Notes - Reverts an earlier draft that widened `ask_user_question` to root standard turns. Plan-mode-only behavior is restored. - The `stop_workspace` tool currently renders via the generic chat tool-call UI. A follow-up frontend PR will prettify the `stop_workspace` tool and style it like the `start_workspace` tool. - Never-connected (`Timeout` status) agents are intentionally excluded from recovery. They indicate template or startup failure, not the running-but-dead case this PR targets. Closes CODAGT-315	2026-05-11 16:23:07 +10:00
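The two-signal recovery gate from #24997 reduces to a conjunction: a tool call must have timed out and a fresh read of the latest agent must report it as disconnected. A minimal sketch, with `agentStatus`, its constants, and `shouldSurfaceRecovery` all hypothetical names chosen for illustration:

```go
package main

import "fmt"

// agentStatus is a hypothetical stand-in for the workspace agent status.
type agentStatus string

const (
	statusConnected    agentStatus = "connected"
	statusDisconnected agentStatus = "disconnected"
	// statusTimeout models a never-connected agent, which the commit
	// intentionally excludes from recovery.
	statusTimeout agentStatus = "timeout"
)

// shouldSurfaceRecovery requires both signals before suggesting a
// stop_workspace/start_workspace cycle to the model.
func shouldSurfaceRecovery(toolCallTimedOut bool, latest agentStatus) bool {
	return toolCallTimedOut && latest == statusDisconnected
}

func main() {
	fmt.Println(shouldSurfaceRecovery(true, statusDisconnected))  // both signals present
	fmt.Println(shouldSurfaceRecovery(false, statusDisconnected)) // heartbeat blip only
	fmt.Println(shouldSurfaceRecovery(true, statusTimeout))       // never-connected agent
}
```

Requiring both signals is what keeps a short heartbeat gap, such as an agent respawn, from prompting a destructive stop/start.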
Mathias Fredriksson	3925d3941b	fix(coderd/x/chatd): wait long enough for cold-start workspace MCP discovery (#25035) The 5s timeout cancelled cold-start ListMCPTools calls before the agent's 30s connectTimeout could settle, so workspace MCP tools never reached the LLM. Bump to 35s and scope to ListMCPTools only.	2026-05-08 17:49:10 +03:00


230 Commits