coder

mirror of https://github.com/coder/coder.git synced 2026-06-04 05:28:20 +00:00

Author	SHA1	Message	Date
Hugo Dutka	48ab492f49	feat: agents git watch backend (#22565)

Adds real-time git status watching for workspace agents, so the frontend can subscribe over WebSocket and show git file changes in near real-time.

1. Subscription is scoped to a chat via `GET /api/experimental/chats/{chat}/git/watch`.
2. The workspace agent automatically determines which paths to watch based on tool calls made by the chat (and its ancestor chats).
3. The workspace agent polls subscribed repo working trees on a 30s interval, on tool calls, and on an explicit `refresh` from the client.
4. Scans are rate-limited to at most once per second.
5. Edited paths are tracked in-memory inside the workspace agent. There is no database persistence, so state is lost on agent restart. This will be addressed in a future PR.
6. Messages sent over WebSocket include a full-repo snapshot (unified diff, branch, origin). A new message is emitted only when the snapshot changes.

This PR was implemented with AI, with me closely controlling what it was doing. The code follows a plan file that was updated continuously during implementation. Here's the file if you'd like to see it: [project.md](https://gist.github.com/hugodutka/8722cf80c92f8a56555f7bc595b770e2). It reflects the current state of the PR.	2026-03-06 10:47:55 +01:00
Kyle Carberry	5945febf06	feat(agent): add fuzzy whitespace matching to edit_files tool (#22446)

Inspired by openai/codex's `apply_patch` implementation, this changes the `edit_files` search-and-replace to use a cascading match strategy when the exact search string isn't found:

1. Exact substring match (byte-for-byte): existing behavior, unchanged
2. Line-by-line match ignoring trailing whitespace: handles trailing spaces/tabs the LLM omits
3. Line-by-line match ignoring all leading/trailing whitespace: handles tabs-vs-spaces and wrong indentation depth

## Problem

When the chat agent uses `edit_files`, it generates a search string that must match the file content exactly. LLMs frequently get whitespace wrong:

- Emitting spaces when the file uses tabs (or vice versa)
- Getting the indentation depth wrong by one or more levels
- Omitting trailing whitespace that exists in the file

When this happens, the edit silently does nothing, and the agent falls into a retry loop using `cat -A` to diagnose the exact whitespace characters.

## Solution

Adopted the same cascading fuzzy match strategy that [openai/codex uses in `seek_sequence.rs`](https://github.com/openai/codex/blob/main/codex-rs/apply-patch/src/seek_sequence.rs):

- Pass 1: exact match (existing behavior)
- Pass 2: `TrimRight` each line before comparing (trailing whitespace tolerance)
- Pass 3: `TrimSpace` each line before comparing (full indentation tolerance)

When a fuzzy match is found, the matched lines in the original file are replaced with the replacement text. This preserves surrounding content exactly.

## Changes

- `agent/agentfiles/files.go`: Replaced `icholy/replace` streaming transformer with in-memory `fuzzyReplace` + helper functions (`seekLines`, `spliceLines`)
- `agent/agentfiles/files_test.go`: Added 6 new test cases covering trailing whitespace, tabs-vs-spaces, different indent depths, exact match preference, no-match behavior, and mixed whitespace multiline edits
- Removed `icholy/replace` dependency from go.mod/go.sum

---------

Co-authored-by: Kyle Carberry <kylecarbs@users.noreply.github.com>	2026-02-28 17:02:57 -05:00
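The three-pass cascade described in the commit above can be illustrated with a small Go sketch. The function names `fuzzyReplace` and `seekLines` are borrowed from the commit message, but their signatures and bodies here are assumptions written for illustration, not the code in `agent/agentfiles/files.go`. In particular, replacing only the first match and splicing inline (instead of through a separate `spliceLines` helper) are choices made for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// seekLines returns the index of the first window in lines that equals
// needle once norm is applied to both sides, or -1 if none matches.
func seekLines(lines, needle []string, norm func(string) string) int {
	if len(needle) == 0 || len(needle) > len(lines) {
		return -1
	}
	for i := 0; i+len(needle) <= len(lines); i++ {
		matched := true
		for j, n := range needle {
			if norm(lines[i+j]) != norm(n) {
				matched = false
				break
			}
		}
		if matched {
			return i
		}
	}
	return -1
}

// fuzzyReplace tries an exact substring match first, then line-by-line
// matches that ignore trailing whitespace, then all surrounding
// whitespace. It reports whether any pass matched.
func fuzzyReplace(content, search, replace string) (string, bool) {
	if search == "" {
		return content, false
	}
	// Pass 1: exact, byte-for-byte substring match.
	if strings.Contains(content, search) {
		return strings.Replace(content, search, replace, 1), true
	}
	lines := strings.Split(content, "\n")
	needle := strings.Split(strings.TrimSuffix(search, "\n"), "\n")
	var repl []string
	if replace != "" {
		repl = strings.Split(strings.TrimSuffix(replace, "\n"), "\n")
	}
	passes := []func(string) string{
		// Pass 2: tolerate trailing whitespace only.
		func(s string) string { return strings.TrimRight(s, " \t\r") },
		// Pass 3: tolerate leading and trailing whitespace.
		strings.TrimSpace,
	}
	for _, norm := range passes {
		i := seekLines(lines, needle, norm)
		if i < 0 {
			continue
		}
		// Splice: keep everything outside the matched window untouched.
		out := make([]string, 0, len(lines)-len(needle)+len(repl))
		out = append(out, lines[:i]...)
		out = append(out, repl...)
		out = append(out, lines[i+len(needle):]...)
		return strings.Join(out, "\n"), true
	}
	return content, false
}

func main() {
	file := "func f() {\n\treturn 1\n}\n"
	// The search string uses spaces where the file uses a tab.
	out, ok := fuzzyReplace(file, "    return 1", "\treturn 2")
	fmt.Printf("%v %q\n", ok, out)
}
```

Note that the replacement text is inserted verbatim, so the caller remains responsible for giving it the indentation the file expects.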
Kyle Carberry	b65c0766d2	feat: add line-based read_file tool with safety limits (#22400)

## Summary

Adds a new line-based file reading endpoint to the workspace agent, replacing the unbounded byte-based approach for the `read_file` chat tool and `coder_workspace_read_file` MCP tool.

Problem: The current `read_file` tool returns the entire file contents with no limits, which can blow up LLM context windows and cause OOM issues with large files.

Solution: Inspired by [`coder/mux`](https://github.com/coder/mux) and [`openai/codex`](https://github.com/openai/codex), implement a line-based reader with safety limits.

## Changes

### Agent (`agent/agentfiles/`)

- New `/read-file-lines` endpoint with `HandleReadFileLines` handler
- Line-based `offset` (1-based line number, default: 1) and `limit` (line count, default: 2000)
- Safety constants:

| Constant | Value | Purpose |
|---|---|---|
| `MaxFileSize` | 1 MB | Reject files larger than this at stat |
| `MaxLineBytes` | 1,024 | Per-line truncation with `... [truncated]` marker |
| `MaxResponseLines` | 2,000 | Max lines per response |
| `MaxResponseBytes` | 32 KB | Max total response size |
| `DefaultLineLimit` | 2,000 | Default when no limit specified |

- Line numbering format: `1\tcontent` (tab-separated)
- Structured JSON response: `{ success, file_size, total_lines, lines_read, content, error }`
- Hard errors when limits are exceeded, telling the LLM to use `offset`/`limit`
- Existing byte-based `/read-file` endpoint preserved (used by `instruction.go`)

### SDK (`codersdk/workspacesdk/`)

- `ReadFileLinesResponse` type added
- `ReadFileLines` method added to `AgentConn` interface
- Mock regenerated

### Chat tool (`coderd/chatd/chattool/`)

- `read_file` tool now uses `conn.ReadFileLines()` instead of `conn.ReadFile()`
- Updated tool description to document line-based parameters
- Response includes `file_size`, `total_lines`, `lines_read` metadata

### MCP tool (`codersdk/toolsdk/`)

- `coder_workspace_read_file` updated to use line-based reading
- Schema descriptions updated for line-based offset/limit
- Removed `maxFileLimit` constant (agent handles limits now)

### Tests

- 13 new test cases for `TestReadFileLines`:
  - Path validation (empty, relative, non-existent, directory, no permissions)
  - Empty file handling
  - Basic read, offset, limit, offset+limit combinations
  - Offset beyond file length
  - Long line truncation (>1024 bytes)
  - Large file rejection (>1MB)
- All existing tests pass unchanged

## Design decisions

| Decision | Rationale |
|---|---|
| Line-based, not byte-based | Both coder/mux and openai/codex use line-based reading, which matches how LLMs reason about code |
| Default limit of 2000 | Matches codex; prevents accidental full-file dumps while being generous |
| 32 KB response cap | Compromise between mux (16 KB) and codex (no cap) |
| 1024 byte/line truncation with marker | More generous than codex (500); marker helps LLM know data is missing |
| Hard errors on overflow | Matches mux; forces LLM to paginate rather than getting partial data |
| Preserve byte-based endpoint | `instruction.go` needs raw byte access for AGENTS.md |

	2026-02-27 15:12:56 -05:00
Asher	ff9ed91811	chore: move agent's file API into separate package (#21531) This makes it so we can test it directly without having to go through Tailnet, which appears to be causing flakes in CI where the requests time out and never make it to the agent. Takes inspiration from the container-related API endpoints. Would probably make sense to refactor the ls tests to also go through the API (rather than be internal tests like they are currently) but I left those alone for now to keep the diff minimal.	2026-01-16 17:03:17 -09:00

4 Commits