15 Commits

Author SHA1 Message Date
Mathias Fredriksson 7a9125b953 fix(agent/agentfiles): merge duplicate file paths instead of rejecting (#25767)
When a caller sends multiple entries for the same literal path, merge
their edits into a single entry rather than returning 400. Symlink
aliases (different paths, same real file) are still rejected.
2026-05-28 11:54:17 +00:00
Mathias Fredriksson 3986aa8a51 feat(agent/agentfiles): add post-fail diagnostic hints for edit_files (#25092)
When fuzzyReplace exhausts its passes, append a hint to the generic
"search string not found" error.

Inversion: if search did not match but replace does, list the lines
where replace appears.

Miscount: when a search line agrees with a file line except for the
count of one repeated rune, name the codepoint and counts.

Miscount takes precedence; both firing could direct an agent to swap
fields and corrupt the inversion anchor.

  Did you swap "search" and "replace"? Your replace string appears
  at line 12, 47, 89.

  Your search has 32 "─" (U+2500); the file has 37 at line 182.

Closes CODAGT-330
2026-05-11 17:28:12 +00:00
Ethan 3a9080fff6 feat: tag chat-originating agent logs with chat_id (#25019)
Workspace-agent logs emitted while serving chatd-driven requests were
not correlated with the originating chat, making agent logs hard to
attribute to the corresponding/originating chat.

This adds agent-side chat context middleware that parses `Coder-Chat-Id`
once, enriches agent access logs and structured handler/background logs,
and adds a chatd bridge log when chat headers are attached to an agent
connection.

Closes CODAGT-324
2026-05-08 13:25:30 +10:00
Mathias Fredriksson 6b0bb02e5d fix: server-side diffs and stricter fuzzy splicing for edit_files (#24454)
Fixes three classes of edit_files bugs and adds structured per-file
diff output for tool callers:

- New IncludeDiff flag on FileEditRequest; when set, the agent
  returns FileEditResponse.Files[]{Path, Diff} with unified diffs
  computed via go-udiff v0.4.1 Lines + ToUnified (not Unified,
  which calls log.Fatalf on internal error).
- Fuzzy match comparators split each line into leading whitespace,
  body, trailing whitespace, and ending. The splice substitutes at
  each position: on agreement between search and replace the file's
  bytes win; on disagreement the replacement's bytes are spliced
  verbatim. Carve-outs for empty-body lines, multi-line EOF splices,
  and level-aware indent translation for inserted lines.
- Indent-unit detection (GCD for spaces, tab-priority) lets a 4sp
  LLM search insert correctly into tab or 2sp files. Falls back to
  the previous cLead-inheritance path when units can't be detected
  cleanly.
- Empty search is rejected with "search string must not be empty".
- Duplicate file paths in one request are rejected; symlink aliases
  resolved via api.resolvePath before the dedup check.
- Frontend EditFilesRenderer consumes the structured files array by
  explicit path (no label munging) with per-file synthetic fallback
  for older agents or mismatched paths. On error, no diff is
  rendered so the synthetic fallback doesn't misrepresent a
  rejected edit as applied.

Breaking change: AgentConn.EditFiles changes from (ctx, req) error
to (ctx, req) (FileEditResponse, error) in codersdk/workspacesdk.
Source-breaking for external Go consumers; no compat shim per plan
owner.

Out of scope (tracked in CODAGT-214): level-aware indent for
middle-substituted splice lines. Locked in
TestEditFiles_FuzzyIndent_InsertionLevelAware's Lock_* cases plus
TestEditFiles_ReplaceAll_FuzzyIndentGap.
2026-04-18 16:39:34 +03:00
Michael Suchacz 1cf0354f72 feat: add plan mode with restricted tool boundary (#24236)
> This PR was authored by Mux on behalf of Mike.

## Summary
- add persistent plan mode for chats and the chat-specific plan file
flow
- add structured planning tools such as `ask_user_question` and
`propose_plan`
- keep `write_file` and `edit_files` constrained to the chat-specific
plan file during plan turns
- allow shell exploration in plan mode, including subagents, via
`execute` and `process_output`
- block implementation-oriented, provider-native, MCP, dynamic, and
computer-use tools during plan turns
- update the chat UI, tests, and docs for the new planning flow
2026-04-16 11:12:01 +02:00
Mathias Fredriksson 798a6673c6 fix(agent/agentfiles): make multi-file edit_files atomic (#23493)
When edit_files receives multiple files, each file was processed
independently: read, compute edits, write. If file B failed, file A
was already written to disk. The caller got an error but had no way
to know which files were modified.

Split editFile into prepareFileEdit (read + compute, no side
effects) and a write phase. The handler runs all preparations
first and writes only if every file's edits succeed.

A write-phase failure (e.g. disk full) can still leave earlier
files committed. True cross-file atomicity would require
filesystem transactions. The prepare phase catches the common
failure modes: bad paths, search misses, permission errors.
2026-03-24 19:23:57 +00:00
Mathias Fredriksson 1c0442c247 fix(agent/agentfiles): fix replace_all in fuzzy matching mode (#23480)
replace_all in fuzzy mode (passes 2 and 3 of fuzzyReplace) only
replaced the first match. seekLines returned the first match,
spliceLines replaced one range, and there was no loop.

Extract fuzzy pass logic into fuzzyReplaceLines which:
- Returns a 3-tuple (result, matched, error) for clean caller flow
- When replaceAll is true, collects all non-overlapping matches
  then applies replacements from last to first to preserve indices
- When replaceAll is false with multiple matches, returns an error

Add test cases for replace_all with fuzzy trailing whitespace and
fuzzy indent matching.
2026-03-24 14:41:45 +02:00
Mathias Fredriksson 16edcbdd5b fix(agent/agentfiles): follow symlinks in write_file and edit_files (#23478)
Both write_file and edit_files use atomic writes (write to temp
file, then rename). Since rename operates on directory entries, it
replaces symlinks with regular files instead of writing through
the link to the target.

Add resolveSymlink() that uses afero.Lstater/LinkReader to resolve
symlink chains (up to 10 levels) before the atomic write. Both
writeFile and editFile resolve the path before any filesystem
operations, matching the behavior of 'echo content > symlink'.

Gracefully no-ops on filesystems that don't support symlinks (e.g.
MemMapFs used in existing tests).
2026-03-24 12:39:55 +00:00
Mathias Fredriksson f3b91b7f11 fix(agent/agentfiles): use Create-style permissions for temp files (#23339)
Replace afero.TempFile (which uses os.CreateTemp with mode 0600)
with a custom createTempFile that uses OpenFile with mode 0666.
This lets the kernel apply the process umask, matching the default
behavior of os.Create. New files now get ~0644 (with standard
umask) instead of 0600.

Extract atomicWrite(ctx, path, mode, haveMode, reader) to share
the entire temp-file lifecycle between writeFile and editFile.
2026-03-20 21:30:28 +02:00
Mathias Fredriksson de4e568994 fix(agent/agentfiles): atomic writes and permission preservation (#23336)
Both writeFile and editFile now use the same atomic write strategy:
temp file in the same directory, write, rename. This ensures a
failed write leaves the original file intact instead of truncated.

editFile already used temp-and-rename but lost the original file's
permissions because afero.TempFile creates with mode 0600. Both
functions now Chmod after rename to preserve the original mode.

writeFile also swallowed io.Copy errors (logged but returned HTTP
200). Fixed to return the error so the client knows the write
failed.
2026-03-20 01:56:19 +02:00
Kyle Carberry 32a894d4a7 fix: error on ambiguous matches in edit_files tool (#23125)
## Problem

The `edit_files` tool used `strings.ReplaceAll` for exact substring
matches, silently replacing **every** occurrence. When an LLM's search
string wasn't unique in the file, this caused unintended edits. Fuzzy
matches (passes 2 and 3) only replaced the first occurrence, creating
inconsistent behavior. Zero matches were also silently ignored.

## Investigation

Investigated how **coder/mux** and **openai/codex** handle this:

| Tool | Multiple matches | No match | Flag |
|---|---|---|---|
| **coder/mux** `file_edit_replace_string` | Error (default
`replace_count=1`) | Error | `replace_count` (int, default 1, -1=all) |
| **openai/codex** `apply_patch` | Uses first match after cursor
(structural disambiguation via context lines + `@@` markers) | Error |
None (different paradigm) |
| **coder/coder** `edit_files` (before) | Exact: replaces all. Fuzzy:
replaces first. | Silent success | None |

## Solution

Adopted the mux approach (error on ambiguity) with a simpler
`replace_all: bool` instead of `replace_count: int`:

- **Default (`replace_all: false`)**: search string must match exactly
once. Multiple matches → error with guidance: *"search string matches N
occurrences. Include more surrounding context to make the match unique,
or set replace_all to true"*
- **`replace_all: true`**: replaces all occurrences (opt-in for
intentional bulk operations like variable renames)
- **Zero matches**: now returns an error instead of silently succeeding

Chose `bool` over `int` count because:
1. LLMs are bad at counting occurrences
2. The real intent is binary (one specific spot vs. all occurrences)
3. Simpler error recovery loop for the LLM

## Changes

| File | Change |
|---|---|
| `codersdk/workspacesdk/agentconn.go` | Add `ReplaceAll bool` to
`FileEdit` struct |
| `agent/agentfiles/files.go` | Count matches before replacing; error if
>1 and not opted in; error on zero matches; add `countLineMatches`
helper |
| `codersdk/toolsdk/toolsdk.go` | Expose `replace_all` in tool schema
with description |
| `agent/agentfiles/files_test.go` | Update existing tests, add
`EditEditAmbiguous`, `EditEditReplaceAll`, `NoMatchErrors`,
`AmbiguousExactMatch`, `ReplaceAllExact` |
2026-03-16 16:17:33 +00:00
Hugo Dutka 48ab492f49 feat: agents git watch backend (#22565)
Adds real-time git status watching for workspace agents, so the frontend
can subscribe over WebSocket and show
git file changes in near real-time.

1. Subscription is scoped to a **chat** via `GET
/api/experimental/chats/{chat}/git/watch`.
2. The workspace agent automatically determines which paths to watch
based on tool calls made by the chat (and its ancestor chats).
3. Workspace agent polls subscribed repo working trees on a 30s
interval, on tools calls, and on explicit `refresh` from the client.
4. Scans are rate-limited to at most once per second.
5. Edited paths are tracked **in-memory** inside the workspace agent.
There is no database persistence — state is lost on agent restart. This
will be addresses in a future PR.
6. Messages sent over WebSocket include a full-repo snapshot (unified
diff, branch, origin). A new message is emitted only when the snapshot
changes.

This PR was implemented with AI with me closely controlling what it's
doing. The code follows a plan file that was updated continuously during
implementation. Here's the file if you'd like to see it:
[project.md](https://gist.github.com/hugodutka/8722cf80c92f8a56555f7bc595b770e2).
It reflects the current state of the PR.
2026-03-06 10:47:55 +01:00
Kyle Carberry 5945febf06 feat(agent): add fuzzy whitespace matching to edit_files tool (#22446)
Inspired by openai/codex's `apply_patch` implementation, this changes
the `edit_files` search-and-replace to use a cascading match strategy
when the exact search string isn't found:

1. **Exact substring match** (byte-for-byte) — existing behavior,
unchanged
2. **Line-by-line match ignoring trailing whitespace** — handles
trailing spaces/tabs the LLM omits
3. **Line-by-line match ignoring all leading/trailing whitespace** —
handles tabs-vs-spaces and wrong indentation depth

## Problem

When the chat agent uses `edit_files`, it generates a search string that
must match the file content exactly. LLMs frequently get whitespace
wrong:
- Emitting spaces when the file uses tabs (or vice versa)
- Getting the indentation depth wrong by one or more levels
- Omitting trailing whitespace that exists in the file

When this happens, the edit silently does nothing, and the agent falls
into a retry loop using `cat -A` to diagnose the exact whitespace
characters.

## Solution

Adopted the same cascading fuzzy match strategy that [openai/codex uses
in
`seek_sequence.rs`](https://github.com/openai/codex/blob/main/codex-rs/apply-patch/src/seek_sequence.rs):

- Pass 1: exact match (existing behavior)
- Pass 2: `TrimRight` each line before comparing (trailing whitespace
tolerance)
- Pass 3: `TrimSpace` each line before comparing (full indentation
tolerance)

When a fuzzy match is found, the matched lines in the original file are
replaced with the replacement text. This preserves surrounding content
exactly.

## Changes

- `agent/agentfiles/files.go`: Replaced `icholy/replace` streaming
transformer with in-memory `fuzzyReplace` + helper functions
(`seekLines`, `spliceLines`)
- `agent/agentfiles/files_test.go`: Added 6 new test cases covering
trailing whitespace, tabs-vs-spaces, different indent depths, exact
match preference, no-match behavior, and mixed whitespace multiline
edits
- Removed `icholy/replace` dependency from go.mod/go.sum

---------

Co-authored-by: Kyle Carberry <kylecarbs@users.noreply.github.com>
2026-02-28 17:02:57 -05:00
Kyle Carberry b65c0766d2 feat: add line-based read_file tool with safety limits (#22400)
## Summary

Adds a new line-based file reading endpoint to the workspace agent,
replacing the unbounded byte-based approach for the `read_file` chat
tool and `coder_workspace_read_file` MCP tool.

**Problem**: The current `read_file` tool returns the entire file
contents with no limits, which can blow up LLM context windows and cause
OOM issues with large files.

**Solution**: Inspired by [`coder/mux`](https://github.com/coder/mux)
and [`openai/codex`](https://github.com/openai/codex), implement a
line-based reader with safety limits.

## Changes

### Agent (`agent/agentfiles/`)
- New `/read-file-lines` endpoint with `HandleReadFileLines` handler
- Line-based `offset` (1-based line number, default: 1) and `limit`
(line count, default: 2000)
- Safety constants:
  | Constant | Value | Purpose |
  |---|---|---|
  | `MaxFileSize` | 1 MB | Reject files larger than this at stat |
| `MaxLineBytes` | 1,024 | Per-line truncation with `... [truncated]`
marker |
  | `MaxResponseLines` | 2,000 | Max lines per response |
  | `MaxResponseBytes` | 32 KB | Max total response size |
  | `DefaultLineLimit` | 2,000 | Default when no limit specified |
- Line numbering format: `1\tcontent` (tab-separated)
- Structured JSON response: `{ success, file_size, total_lines,
lines_read, content, error }`
- Hard errors when limits exceeded — tells the LLM to use
`offset`/`limit`
- Existing byte-based `/read-file` endpoint preserved (used by
`instruction.go`)

### SDK (`codersdk/workspacesdk/`)
- `ReadFileLinesResponse` type added
- `ReadFileLines` method added to `AgentConn` interface
- Mock regenerated

### Chat tool (`coderd/chatd/chattool/`)
- `read_file` tool now uses `conn.ReadFileLines()` instead of
`conn.ReadFile()`
- Updated tool description to document line-based parameters
- Response includes `file_size`, `total_lines`, `lines_read` metadata

### MCP tool (`codersdk/toolsdk/`)
- `coder_workspace_read_file` updated to use line-based reading
- Schema descriptions updated for line-based offset/limit
- Removed `maxFileLimit` constant (agent handles limits now)

### Tests
- 13 new test cases for `TestReadFileLines`:
- Path validation (empty, relative, non-existent, directory, no
permissions)
  - Empty file handling
  - Basic read, offset, limit, offset+limit combinations
  - Offset beyond file length
  - Long line truncation (>1024 bytes)
  - Large file rejection (>1MB)
- All existing tests pass unchanged

## Design decisions

| Decision | Rationale |
|---|---|
| Line-based, not byte-based | Both coder/mux and openai/codex use
line-based — matches how LLMs reason about code |
| Default limit of 2000 | Matches codex; prevents accidental full-file
dumps while being generous |
| 32 KB response cap | Compromise between mux (16 KB) and codex (no cap)
|
| 1024 byte/line truncation with marker | More generous than codex
(500), marker helps LLM know data is missing |
| Hard errors on overflow | Matches mux; forces LLM to paginate rather
than getting partial data |
| Preserve byte-based endpoint | `instruction.go` needs raw byte access
for AGENTS.md |
2026-02-27 15:12:56 -05:00
Asher ff9ed91811 chore: move agent's file API into separate package (#21531)
This makes it so we can test it directly without having to go through
Tailnet, which appears to be causing flakes in CI where the requests
time out and never make it to the agent.

Takes inspiration from the container-related API endpoints.

Would probably make sense to refactor the ls tests to also go through
the API (rather than be internal tests like they are currently) but I
left those alone for now to keep the diff minimal.
2026-01-16 17:03:17 -09:00