coder

mirror of https://github.com/coder/coder.git synced 2026-06-05 14:08:20 +00:00

Author	SHA1	Message	Date
Kyle Carberry	6972d073a2	fix: improve background process handling for agent tools (#23132 ) ## Problem Models frequently use shell `&` instead of `run_in_background=true` when starting long-running processes through `/agents`, causing them to die shortly after starting. This happens because: 1. No guidance in tool schema — The `ExecuteArgs` struct had zero `description` tags. The model saw `run_in_background: boolean (optional)` with no explanation of when/why to use it. 2. Shell `&` is silently broken — `sh -c "command &"` forks the process, the shell exits immediately, and the forked child becomes an orphan not tracked by the process manager. 3. No process group isolation — The SSH subsystem sets `Setsid: true` on spawned processes, but the agent process manager set no `SysProcAttr` at all. Signals only hit the top-level `sh`, not child processes. ## Investigation Compared our implementation against openai/codex and coder/mux: \| Aspect \| codex \| mux \| coder/coder (before) \| \|--------\|-------\|-----\|---------------------\| \| Background flag \| Yield/resume with `session_id` \| `run_in_background` with rich description \| `run_in_background` with no description \| \| `&` handling \| `setsid()` + `killpg()` \| `detached: true` + `killProcessTree()` \| Nothing — orphaned children escape \| \| Process isolation \| `setsid()` on every spawn \| `set -m; nohup ... setsid` for background \| No `SysProcAttr` at all \| \| Signal delivery \| `killpg(pgid, sig)` — entire group \| `kill -15 -\$pid` — negative PID \| `proc.cmd.Process.Signal()` — PID only \| ## Changes ### Fix 1: Add descriptions to `ExecuteArgs` (highest impact) The model now sees explicit guidance: "Use for long-running processes like dev servers, file watchers, or builds. Do NOT use shell & — it will not work correctly." ### Fix 2: Update tool description The top-level execute tool description now reinforces: "Use run_in_background=true for long-running processes. Never use shell '&' for backgrounding." ### Fix 3: Detect trailing `&` and auto-promote to background Defense-in-depth: if the model still uses `command &`, we strip the `&` and promote to `run_in_background=true` automatically. Correctly distinguishes `&` from `&&`. ### Fix 4: Process group isolation (`Setpgid`) New platform-specific files (`proc_other.go` / `proc_windows.go`) following the same pattern as `agentssh/exec_other.go`. Every spawned process gets its own process group. ### Fix 5: Process group signaling `signal()` now uses `syscall.Kill(-pid, sig)` on Unix to signal the entire process group, ensuring child processes from shell pipelines are also cleaned up. ## Testing All existing `agent/agentproc` tests pass. Both packages compile cleanly.	2026-03-16 16:22:10 -04:00
Mathias Fredriksson	72689c2552	fix(coderd): improve error handling in chattest, chattool, and chats (#23047 ) - Use t.Errorf in chattest non-streaming helpers so encoding failures fail the test - Thread testing.TB into writeResponsesAPIStreaming and log SSE write errors instead of silently dropping them - Bump createworkspace DB error log from Warn to Error - Use errors.Join for timeout + output error in execute.go	2026-03-13 21:41:24 +02:00
Mathias Fredriksson	9d33c340ec	fix(coderd): handle ignored errors across coderd packages (#22851 ) Handle previously ignored error return values in coderd: - coderd/chats.go: check sendEvent errors, log on failure - coderd/chatd/chattest: thread testing.TB through server structs, replace log.Printf with t.Logf, check writeSSEEvent errors - coderd/chatd/chattool/createworkspace.go: log UpdateChatWorkspace failure instead of discarding both return values - coderd/chatd/chattool/execute.go: surface ProcessOutput error in the timeout message returned to the caller - coderd/provisionerdserver: log stream.Send failure in the DownloadFile error helper	2026-03-13 19:53:20 +02:00
Kyle Carberry	0e1846fe2a	fix(agent): reap exited processes and scope process list by chat ID (#22944 )	2026-03-12 14:51:05 -07:00
Kyle Carberry	56f95a3e6d	fix: scope git askpass diff status updates to initiating chat (#22534 ) ## Problem When the git askpass flow triggered diff status refreshes, it updated every chat connected to the workspace. This was wasteful and could cause confusing status updates on unrelated chats. ## Solution Thread the chat ID through the entire git askpass flow so only the chat that initiated the git operation gets updated: 1. `coderd/chatd/chattool/execute.go` — Sets `CODER_CHAT_ID` env var on spawned processes (alongside the existing `CODER_CHAT_AGENT`) 2. `cli/gitaskpass.go` — Reads `CODER_CHAT_ID` from the environment and sends it as a `chat_id` query parameter in the `ExternalAuthRequest` 3. `codersdk/agentsdk/agentsdk.go` — Adds `ChatID` field to `ExternalAuthRequest` and encodes it as a query param 4. `coderd/workspaceagents.go` — Parses `chat_id` query param and passes it through to `storeChatGitRef` and `triggerWorkspaceChatDiffStatusRefresh` 5. `coderd/chats.go` — `storeChatGitRef` and `refreshWorkspaceChatDiffStatuses` now scope updates to just the initiating chat when a chat ID is provided, falling back to all-workspace-chats behavior for backwards compatibility (non-chat git operations)	2026-03-02 22:52:39 -05:00
Kyle Carberry	a621c3cb13	feat(agent): add process execution API and rewrite execute tool (#22416 ) ## Summary Adds a new agent-side process management HTTP API and rewrites the chat execute tool to use it instead of SSH sessions. ## What changed ### New agent/agentproc/ package - headtail.go — Thread-safe io.Writer with bounded memory (16KB head + 16KB tail ring buffer). Provides LLM-ready output with truncation metadata and long-line truncation at 2048 bytes. - headtail_test.go — 16 tests including race detector coverage for concurrent writes. - process.go — Manager + Process types for lifecycle management using agentexec.Execer for proper OOM/nice scores. - api.go — HTTP API following the agentfiles chi router pattern. 4 endpoints: start, list, output, signal. ### Agent wiring (agent/agent.go, agent/api.go) Mounts the process API at /api/v0/processes, mirroring how agentfiles is mounted. ### SDK (codersdk/workspacesdk/agentconn.go) 4 new AgentConn interface methods + 7 request/response types: - StartProcess, ListProcesses, ProcessOutput, SignalProcess ### Execute tool rewrite (coderd/chatd/chattool/execute.go) - SSH to Agent API: conn.StartProcess() + conn.ProcessOutput() polling - New parameters: workdir, run_in_background - Structured response: success, exit_code, wall_duration_ms, error, truncated, note, background_process_id - Non-interactive env vars: GIT_EDITOR=true, TERM=dumb, NO_COLOR=1, PAGER=cat, etc. - Output truncation: HeadTailBuffer caps at 32KB for LLM consumption - File-dump detection with advisory notes suggesting read_file - Default timeout: 60s to 10s - Foreground polling: 200ms intervals until exit or timeout ## Architecture State lives on the agent, surviving coderd failover and instance changes. Any coderd replica can query any agent via HTTP over tailnet.	2026-02-28 12:33:52 -05:00
Kyle Carberry	edee917d88	feat: add experimental agents support (#22290 ) feat: add AI chat system with agent tools and chat UI Introduce the chatd subsystem and Agents UI for AI-powered chat within Coder workspaces. - Add chatd package with chat loop, message compaction, prompt management, and LLM provider integration (OpenAI, Anthropic) - Add agent tools: create workspace, list/read templates, read/write/ edit files, execute commands - Add chat API endpoints with streaming, message editing, and durable reconnection - Add database schema and migrations for chats, chat messages, chat providers, and chat model configs - Add RBAC policies and dbauthz enforcement for chat resources - Add Agents UI pages with conversation timeline, queued messages list, diff viewer, and model configuration panel - Add comprehensive test coverage including coderd integration tests, chatd unit tests, and Storybook stories - Gate feature behind experiments flag --------- Co-authored-by: Cian Johnston <cian@coder.com> Co-authored-by: Danielle Maywood <danielle@themaywoods.com> Co-authored-by: Jeremy Ruppel <jeremy@coder.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-27 16:50:56 +00:00

7 Commits