Workspace-agent logs emitted while serving chatd-driven requests were
not correlated with the originating chat, making agent logs hard to
attribute to the corresponding/originating chat.
This adds agent-side chat context middleware that parses `Coder-Chat-Id`
once, enriches agent access logs and structured handler/background logs,
and adds a chatd bridge log when chat headers are attached to an agent
connection.
Closes CODAGT-324
Adds a deployment-wide setting to select the computer-use provider
(Anthropic or OpenAI) for AI agents, plus the OpenAI computer-use runner
needed to honor that selection.
The setting is stored in `site_configs` under
`agents_computer_use_provider`, defaults to Anthropic when unset, and is
exposed via experimental GET/PUT endpoints under
`/api/experimental/chats/config/computer-use-provider`. The chatd
computer-use tool now dispatches to either `runAnthropicComputerUse` or
`runOpenAIComputerUse` based on the resolved provider, with
provider-specific result metadata for OpenAI screenshots.
Frontend adds a provider dropdown to the Agents Experiments settings
page nested under the virtual desktop toggle, with disabled state
handling while virtual desktop is off and skeleton loaders while config
queries are in flight.
Hugo and Codex review follow-up:
- Uses shared provider validation and clearer computer-use constant
names.
- Removes stale OpenAI pending-safety-checks commentary.
- Documents why provider result metadata is needed for OpenAI
screenshots.
- Keeps the computer-use subagent visible when provider credentials are
missing, then returns a clear spawn-time configuration error.
- Uses OpenAI's recommended 1600x900 screenshot geometry to preserve the
native 16:9 aspect ratio.
- Moves OpenAI-specific computer-use helpers into
`coderd/x/chatd/chatopenai/computeruse` after rebasing onto the provider
package refactor in `main`.
- Converts OpenAI pixel scroll deltas to Coder desktop wheel-click
amounts.
- Preserves OpenAI pointer modifiers with key down/up desktop actions
and rejects unsupported non-left double-click buttons explicitly.
- Maps OpenAI back/forward side-button clicks to browser navigation key
actions.
- Defaults omitted OpenAI click buttons to left-click.
- Retries mouse release cleanup if the final OpenAI drag release fails.
- Keeps computer-use subagent availability messages stable when provider
config cannot be loaded, while logging the backend error.
- Releases remaining OpenAI modifier keys if a synthetic key-up cleanup
action fails.
- Updates Storybook interaction stories so provider snapshots show the
selected final provider.
> Mux updated this PR description on behalf of Mike.
Fixes https://github.com/coder/internal/issues/1461
Two synchronization issues caused
`TestPortableDesktop_IdleTimeout_StopsRecordings` (and the
`MultipleRecordings` variant) to flake on macOS CI:
1. **`clk.Advance(idleTimeout)` was not awaited.** In
`MultipleRecordings`, both idle timers fire simultaneously but their
`fire()` goroutines race to remove themselves from the mock clock's
event list. Without `MustWait`, the second timer may still be in `m.all`
when the next `Advance` is called, causing `"cannot advance ... beyond
next timer/ticker event in 0s"`.
2. **The test depended on SIGINT being handled promptly.** After the
`stop_timeout` timer was released, the test relied entirely on the shell
process handling SIGINT (via `rec.done`). On macOS, `/bin/sh` may not
interrupt `wait` reliably, leaving `lockedStopRecordingProcess` blocked
in its `select` while holding `p.mu` — deadlocking the
`require.Eventually` callback.
### Fix
Wait for each `Advance` to complete and advance past the 15s stop
timeout so the process is forcibly killed via the timer path,
independent of signal handling.
Verified with 1000 iterations (500 per test) with zero failures.
> Generated with [Coder Agents](https://coder.com/agents)
The agents chat interface displays thumbnails for videos recorded by the
computer use agent. Currently, to display a thumbnail, the frontend
downloads the entire video and shows the first frame. This PR starts
storing a new thumbnail file in the database for every recorded video,
and exposes the file id in the `wait_agent` tool result alongside the
recording file id, so the frontend can fetch just the thumbnail.
This PR introduces screen recording of the computer use agent using the
virtual desktop.
- Screen recording is triggered by a `wait_agent` tool call. Recording
is stopped by a successful `wait_agent` tool call or when there hasn't
been any desktop activity for 10 minutes.
- Recordings are handled by the `portabledesktop` cli via the `record`
command. The videos are sped up in periods of inactivity.
- Recordings are saved to the database to the `chat_files` table.
There's a hard limit of 100MB per recording. Larger recordings are
dropped.
- A successful `wait_agent` on a computer use subagent tool call returns
a `recording_file_id`, later allowing the frontend to display the
corresponding video.
- Move `agent/agentdesktop/` to `agent/x/agentdesktop/` to signal
experimental/unstable status
- Update import paths in `agent/agent.go` and `api_test.go`
> 🤖 This mechanical refactor was performed by an agent. I made sure it
didn't change anything it wasn't supposed to.