coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 21:18:24 +00:00

Author	SHA1	Message	Date
Michael Suchacz	0211448d09	fix(coderd): sanitize Anthropic provider tool history (#24706 ) Anthropic can reject replayed chat histories when a provider-executed tool call, such as `web_search`, is present without its matching provider result block. This sanitizes unpaired Anthropic provider-executed tool calls during prompt reconstruction, before Anthropic requests, and before persistence so existing poisoned histories can continue and new malformed turns are not stored. Resolves: CODAGT-259 > Mux is acting on Mike's behalf.	2026-04-24 23:57:30 +02:00
Cian Johnston	a02339c66a	fix(coderd/x/chatd): prevent invalid tool results from poisoning chat history (#24663 ) - computeruse.go: Decode base64 screenshot data before storing in `ToolResponse.Data` (was casting base64 string to bytes without decoding) - chatloop.go: Re-encode `ToolResponse.Data` to base64 via `base64.StdEncoding.EncodeToString` instead of `string()` cast - mcpclient.go: UTF-8 validate all text from MCP responses in `convertCallResult()` using `strings.ToValidUTF8` - chatprompt.go (persist): Defense-in-depth UTF-8 sanitization of text and media Text fields before database storage - chatprompt.go (replay): Antivenom layer that validates base64 and UTF-8 at read time, auto-healing already-poisoned chats without requiring a migration - `TestToolResultAntivenom`: 4 subtests covering poisoned text, poisoned media, valid media round-trip, and media with invalid UTF-8 text - Adds `TestConvertCallResult_UTF8Sanitization`: 4 subtests covering invalid UTF-8 in TextContent, EmbeddedResource, valid passthrough, and multi-part - Adds `TestComputerUseTool_Run_ScreenshotDataIsDecodedBinary`: Verifies no double-encode in the computer-use path - Updated existing computer-use tests for the new decoded-binary contract > 🤖	2026-04-23 19:58:38 +01:00
Cian Johnston	72e3ae9c5f	feat: add chatd tool call error metrics and logging (#24559 ) - Add `coderd_chatd_tool_errors_total` prometheus counter (labels: provider, model, tool_name) - Log tool call errors at warn level with correlation fields: chat_id, owner_id, organization_id, workspace_id, agent_id, parent_chat_id, trigger_message_id, tool_name, tool_call_id, provider, model - Thread enriched logger from chatd.go into chatloop via `RunOptions.Logger` - Remove squashing of all MCP tool calls to the `mcp` bucket > 🤖	2026-04-22 16:19:56 +00:00
Ethan	cc4e04afde	feat(site): display file attachments in chat UI (#24281 ) Renders the durable file attachments introduced in #24280 in the chat interface. Without this, attachments were stored and served correctly but the UI showed raw file parts with no previews or download UX. Every attachment gets a download affordance, split into three rendering tiers: - Images — thumbnail with a hover/focus overlay containing a download link. `onFocusCapture`/`onBlurCapture` with `contains(relatedTarget)` keeps the overlay open while tabbing between the image and its download link. - Text-like files (`text/`, `application/json`) — expandable preview button with loading + error-with-retry states and the same download overlay. Preview fetches throw a typed `FetchTextAttachmentError` with a `.status` field instead of a stringly-typed error. - Everything else* — compact `FileCard` with extension badge, filename, and download link. User-side and assistant-side rendering now share `AttachmentBlocks.tsx` (`AttachmentPreviewFrame`, `TextAttachmentButton`, `ImageAttachmentButton`, `FileCard`, plus `getAttachmentHref`/`getAttachmentName`) instead of two near-duplicate implementations. The text-attachment overlay anchors to the preview surface so the download button stays pinned even when a loading/error status line widens the row below. `ComputerRenderer` detects when a screenshot was stored as a durable attachment (`attachment_file_id`) and suppresses the stale base64 rendering — the screenshot appears as a proper file part instead. `ToolLabel` shows the attached filename for `attach_file` tool calls. Storybook coverage in `ConversationTimeline.stories.tsx` was expanded to cover every tier (single/multiple images, inline + file-id text, JSON, download-only files, fetch-failure retry, mixed attachments + file references) with play-function assertions. <img width="811" height="150" alt="image" src="https://github.com/user-attachments/assets/27c71081-3502-4e80-92a7-d8adf1ff9323" /> ## Cleanup Per Mathias' post-merge suggestion on #24280, this PR also relocates `coderd/chatfiles` → `coderd/x/chatfiles` so the durable-attachment helpers live beside the rest of the `chatd` experimental surface. Closes CODAGT-91	2026-04-22 20:11:53 +10:00
Thomas Kosiewski	df7e838c21	feat(coderd): wire debug logging into chat lifecycle (#23917 )	2026-04-20 12:27:16 +02:00
Cian Johnston	4b585465b8	feat: label chatd metrics by model, add stream-state diagnostics (#24475 ) Adds production-observability metrics to coderd/x/chatd/ for model-level correlation and a chatStreams memory-leak investigation. - Label per-request chatd metrics (steps_total, message_count, prompt_size_bytes, tool_result_size_bytes, ttft_seconds, compaction_total) with `model` and enrich the per-turn logger with provider/model. - Add `coderd_chatd_stream_retries_total{provider, model, kind}` counter incremented in chatloop before OnRetry. - Register a prometheus.Collector exposing `streams_active`, `stream_buffer_size_max`, `stream_buffer_events`, `stream_subscribers` from p.chatStreams. - Add `coderd_chatd_stream_buffer_dropped_total` counter, incremented per publishToStream drop independently of the existing log-rate-limited bufferDropCount. - Snapshot logger/model before the title-generation goroutine to avoid a data race with the logger/model rebind below it. > 🤖	2026-04-17 16:16:30 +01:00
Michael Suchacz	1cf0354f72	feat: add plan mode with restricted tool boundary (#24236 ) > This PR was authored by Mux on behalf of Mike. ## Summary - add persistent plan mode for chats and the chat-specific plan file flow - add structured planning tools such as `ask_user_question` and `propose_plan` - keep `write_file` and `edit_files` constrained to the chat-specific plan file during plan turns - allow shell exploration in plan mode, including subagents, via `execute` and `process_output` - block implementation-oriented, provider-native, MCP, dynamic, and computer-use tools during plan turns - update the chat UI, tests, and docs for the new planning flow	2026-04-16 11:12:01 +02:00
Kyle Carberry	d11849d94a	fix: re-fetch context files and skills from workspace on each turn (#24360 ) Context files (AGENTS.md) and skills were only fetched from the workspace on the first turn or when the agent changed. On subsequent turns, stale content from persisted messages was used. This meant that if AGENTS.md or skills were modified on the workspace between turns, the agent wouldn't see the changes until the user created a new chat. ## Changes - Extract `fetchWorkspaceContext` from `persistInstructionFiles` to allow fetching workspace context without persisting - On subsequent turns, re-fetch fresh context from the workspace instead of reading stale persisted content; falls back to persisted messages if the workspace dial fails - Update `ReloadMessages` callback to re-derive instruction and skills from reloaded database messages after compaction, instead of using captured closure variables - Add `formatSystemInstructionsFromParts` helper to build system instructions directly from agent parts without requiring separate OS/directory params - Add tests for the new helper <details><summary>Implementation Notes</summary> ### Root cause In `runChat`, the `else if hasContextFiles` branch (subsequent turns) called `instructionFromContextFiles(messages)` which read stale content from persisted DB messages. The `ReloadMessages` callback (post-compaction) also used captured `instruction`/`skills` closure variables from the start of the turn, never re-deriving them. ### Approach 1. Extract `fetchWorkspaceContext` — Pure refactor of the fetch-only part of `persistInstructionFiles` (agent connection, context config retrieval, content sanitization, metadata stamping). Returns parts + skills without persisting. 2. Subsequent turns: Instead of reading from persisted messages, launch a `g2` goroutine that calls `fetchWorkspaceContext` to get fresh context from the workspace. Falls back gracefully to persisted messages if the workspace is unreachable. 3. ReloadMessages: Re-derive `instruction` from `instructionFromContextFiles(reloadedMsgs)` and `skills` from `skillsFromParts(reloadedMsgs)` using the freshly loaded messages, with fallback to captured values if the reloaded messages don't contain context (e.g. compacted away). </details> > 🤖 Generated by Coder Agents	2026-04-15 16:41:15 -04:00
Cian Johnston	d7439a9de0	feat: add Prometheus metrics for chatd subsystem (#24371 ) Adds 7 Prometheus metrics to the chatd subsystem and introduces typed `ActivityBumpReason` for deadline bump attribution. \| Metric \| Type \| Labels \| \|--------\|------\|--------\| \| `coderd_chatd_chats` \| Gauge \| `state` (streaming, waiting) \| \| `coderd_chatd_message_count` \| Histogram \| `provider` \| \| `coderd_chatd_prompt_size_bytes` \| Histogram \| `provider` \| \| `coderd_chatd_tool_result_size_bytes` \| Histogram \| `provider`, `tool_name` \| \| `coderd_chatd_ttft_seconds` \| Histogram \| `provider` \| \| `coderd_chatd_compaction_total` \| Counter \| `provider`, `result` \| \| `coderd_chatd_steps_total` \| Counter \| `provider` \| > 🤖	2026-04-15 19:53:10 +01:00
Ethan	65bf7c3b18	fix(coderd/x/chatd/chatloop): stabilize startup-timeout tests with quartz (#24193 ) The startup-timeout integration tests in `chatloop` used a 5ms real-time budget and relied on wall-clock scheduling to fire the startup guard timer before the first stream part arrived. On loaded CI runners the timer sometimes lost the race, producing `attempts == 2` instead of `attempts == 1` and flaking `TestRun_FirstPartDisarmsStartupTimeout`. Replace the real `time.Timer` in `startupGuard` with a `quartz.Timer` so tests can control time deterministically. Production behavior is unchanged: `RunOptions.Clock` defaults to `quartz.NewReal()` when nil, and the startup timeout still covers both opening the provider stream and waiting for the first stream part. - Add `RunOptions.Clock quartz.Clock` with nil-safe default. - Tag the startup guard timer as `"startupGuard"` for quartz trap targeting. - Rewrite the four startup-timeout integration tests to use `quartz.NewMock(t)` with trap/advance/release sequences instead of wall-clock sleeps. - Add `awaitRunResult` helper so tests fail with a clear message instead of hanging when `Run` does not complete. Closes https://github.com/coder/internal/issues/1460	2026-04-10 00:40:09 +10:00
Kyle Carberry	35c26ce22a	feat: add CreatedAt to tool-call and tool-result ChatMessageParts (#24101 ) Adds an optional `CreatedAt` timestamp to `tool-call` and `tool-result` `ChatMessagePart` variants so the frontend can compute tool execution duration (`result.created_at - call.created_at`). Timestamps are recorded at the correct moments in the chatloop: - Tool-call: when the model stream emits the tool call - Tool-result: when tool execution completes (or is interrupted) These are passed through `PersistedStep.PartCreatedAt` so the persistence layer can apply accurate timestamps to stored parts. SSE-published parts also carry `CreatedAt` for real-time display. Old persisted messages without `created_at` deserialize to `nil` — fully backward compatible. <details><summary>Implementation notes (Coder Agents generated)</summary> ### Why not stamp in `PartFromContent`? `PartFromContent` is called both for SSE publishing (correct timing) and during persistence (wrong timing — both tool-call and tool-result would get the same "persistence time" timestamp, yielding ~0 duration). Instead, timestamps are captured in the chatloop at the right moments and carried through `PersistedStep.PartCreatedAt` as a `map[string]time.Time` keyed by `"call:<id>"` / `"result:<id>"`. ### Interrupted tool calls `persistInterruptedStep` also stamps `CreatedAt` on synthetic error results for cancelled/interrupted tool calls, so partial duration is available. ### Files changed \| File \| Change \| \|------\|--------\| \| `codersdk/chats.go` \| Add `CreatedAt *time.Time` field \| \| `codersdk/chats_test.go` \| JSON round-trip test \| \| `coderd/database/dbtime/dbtime.go` \| Add `TimePtr` helper \| \| `coderd/x/chatd/chatloop/chatloop.go` \| Track timestamps, pass through `PersistedStep` \| \| `coderd/x/chatd/chatd.go` \| Apply timestamps during persistence \| \| `coderd/x/chatd/chatprompt/chatprompt_test.go` \| Verify `PartFromContent` does NOT stamp \| \| `site/src/api/typesGenerated.ts` \| Auto-generated \| </details> --------- Co-authored-by: Ethan <39577870+ethanndickson@users.noreply.github.com>	2026-04-08 12:42:03 -04:00
Kyle Carberry	b969d66978	feat: add dynamic tools support for chat API (#24036 ) Adds client-executed dynamic tools to the chat API. Dynamic tools are declared by the client at chat creation time, presented to the LLM alongside built-in tools, but executed by the client rather than chatd. This enables external systems (Slack bots, IDE extensions, Discord bots, CI/CD integrations) to plug custom tools into the LLM chat loop without modifying chatd's built-in tool set. Modeled after OpenAI's Assistants API: the chat pauses with `requires_action` status when the LLM calls a dynamic tool, the client POSTs results back via `POST /chats/{id}/tool-results`, and the chat resumes. See [this example](https://github.com/coder/coder-slackbot-poc) as a reference for how this is used. It's highly-configurable, which would enable creating chats from webhooks, periodically polling, or running as a Slackbot. <details> <summary>Design context</summary> ### Architecture The chatloop exits when it encounters dynamic tools and re-enters when results arrive. No blocking channels, no pubsub for tool results, no in-memory registry. The DB is the only coordination mechanism. ``` Phase 1 (chatloop): LLM response → execute built-in tools only → Persist(assistant + built-in results) → status = requires_action → chatloop exits Phase 2 (POST /tool-results): Persist(dynamic tool results) → status = pending → wakeCh → chatloop re-enters ``` ### Validation (POST /tool-results) 1. Chat status must be `requires_action` (409 if not) 2. Read chat's `dynamic_tools` → set of dynamic tool names 3. Read last assistant message → extract tool-call parts matching dynamic tool names 4. Submitted tool_call_ids must match exactly (400 for missing/extra) 5. Persist tool-result message parts, set status to `pending`, signal wake ### Idempotency Tool call IDs scoped per LLM step. State machine (`requires_action` → `pending`) is the guard. First POST wins, subsequent get 409. ### Mixed tool calls When the LLM calls both built-in and dynamic tools in one step, built-in tools execute immediately. Their results are persisted in phase 1. Dynamic tool results arrive via POST in phase 2. The LLM sees all results when the chatloop resumes. </details> > 🤖 Generated by Coder Agents	2026-04-08 11:54:44 -04:00
dylanhuff-at-coder	f796f3645f	fix(coderd): fix isContextLimitKey false positive on max_context_version (#23950 ) `isContextLimitKey` had a fallback heuristic that matched any key starting with `"max"` containing `"context"`, causing false positives on keys like `"max_context_version"`. A provider returning such metadata would have the value parsed as a context limit. Replace substring matching on the separator-stripped key with word-level matching. A new `metadataKeyWords` function tokenizes keys by splitting on separators and camelCase boundaries, then the fallback requires `"context"` paired with a limit-related word (`"limit"`, `"window"` + qualifier, `"length"` + qualifier, or `"tokens"` + qualifier). Known exact forms like `"context_window"` remain in the fast-path switch. Closes https://github.com/coder/coder/issues/23332	2026-04-02 10:07:01 -07:00
Ethan	70f031d793	feat(coderd/chatd): structured chat error classification and retry hardening (#23275 ) > PR Stack > 1. #23351 ← `#23282` > 2. #23282 ← `#23275` > 3. #23275 ← `#23349` (you are here) > 4. #23349 ← `main` --- ## Summary Extracts a structured error classification subsystem for agent chat (`chatd`) so that retry and error payloads carry machine-readable metadata — error kind, provider name, HTTP status code, and retryability — instead of raw error strings. This is the backend half of the error-handling work. The frontend counterpart is in #23282. ## Changes ### New package: `coderd/chatd/chaterror/` Canonical error classification — extracts error kind, provider, status code, and user-facing message from raw provider errors. One source of truth that drives both retry policy and stream payloads. - `kind.go`: Error kind enum (`rate_limit`, `timeout`, `auth`, `config`, `overloaded`, `unknown`). - `signals.go`: Signal extraction — parses provider name, HTTP status code, and retryability from error strings and wrapped types. - `classify.go`: Classification logic — maps extracted signals to an error kind. - `message.go`: User-facing message templates keyed by kind + signals. - `payload.go`: Projectors that build `ChatStreamError` and `ChatStreamRetry` payloads from a classified error. ### Modified - `codersdk/chats.go`: Added `Kind`, `Provider`, `Retryable`, `StatusCode` fields to `ChatStreamError` and `ChatStreamRetry`. - `coderd/chatd/chatretry/`: Thinned to retry-policy only; classification logic moved to `chaterror`. - `coderd/chatd/chatloop/`: Added per-attempt first-chunk timeout (60 s) via `guardedStream` wrapper — produces retryable `startup_timeout` errors instead of hanging forever. - `coderd/chatd/chatd.go`: Publishes normalized retry/error payloads via `chaterror` projectors.	2026-03-25 13:47:54 +11:00
Kyle Carberry	3812b504fc	fix(coderd/x/chatd): prevent nil required field in MCP tool schemas for OpenAI (#23538 )	2026-03-24 18:29:41 -04:00
Michael Suchacz	02356c61f6	fix: use previous_response_id chaining for OpenAI store=true follow-ups (#23450 ) OpenAI Responses follow-up turns were replaying full assistant/tool history even when `store=true`, which breaks after reasoning + provider-executed `web_search` output. This change persists the OpenAI response ID on assistant messages, then in `coderd/x/chatd` switches `store=true` follow-ups to `previous_response_id` chaining with a system + new-user-only prompt. `store=false` and missing-ID cases still fall back to manual replay. It also updates the fake OpenAI server and integration coverage for the chaining contract, and carries the rebased path move to `coderd/x/chatd` plus the migration renumber needed after rebasing onto `main`.	2026-03-24 14:57:40 +01:00
Cian Johnston	80a172f932	chore: move chatd and related packages to /x/ subpackage (#23445 ) - Moves `coderd/chatd/`, `coderd/gitsync/`, `enterprise/coderd/chatd/` under `x/` parent directories to signal instability - Adds `Experimental:` glue code comments in `coderd/coderd.go` > 🤖 This PR was created with the help of Coder Agents, and was reviewed by my human. 🧑‍💻	2026-03-23 17:34:43 +00:00

17 Commits