5b32c4d79d
## Problem
MCP servers configured in `.mcp.json` with stdio transport are
discovered successfully (tools appear) but die immediately after
connection, making all tool calls fail.
## Root Cause
In `connectServer`, the subprocess is spawned with `connectCtx` — a
30-second timeout context whose `cancel()` is deferred:
```go
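connectCtx, cancel := context.WithTimeout(ctx, connectTimeout)
defer cancel() // fires as soon as connectServer returns
// Bug: connectCtx also bounds the subprocess lifetime, not just the handshake.
if err := c.Start(connectCtx); err != nil { ... }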
```
The mcp-go stdio transport calls `exec.CommandContext(connectCtx, ...)`,
which ties the subprocess lifetime to `connectCtx`. When `connectServer`
returns, the deferred `cancel()` fires and `exec.CommandContext` sends
SIGKILL to the subprocess, even though the connection succeeded. The
killed process is left behind as a zombie.
Confirmed by checking `/proc/<pid>/status` after context cancellation:
```
State: Z (zombie)
```
## Fix
Pass the parent `ctx` (which is `a.gracefulCtx` — the agent's long-lived
context) to `c.Start()`. `connectCtx` continues to bound only the
`Initialize()` handshake. The subprocess is cleaned up when the Manager
is closed or the parent context is canceled.
## Regression Test
Added `TestConnectServer_StdioProcessSurvivesConnect` which:
- Spawns a real subprocess (re-execs the test binary as a fake MCP
server)
- Calls `connectServer` and lets it return (internal `connectCtx` gets
canceled)
- Verifies the subprocess is still alive by calling `ListTools`
The test **fails** on the old code with
`transport error: context deadline exceeded` and **passes** with the fix.
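
The re-exec technique used by the test can be illustrated on its own. The sketch below is not the test from this change: the environment variable `DEMO_FAKE_SERVER`, the `roundTrip` helper, and the echo behavior are all hypothetical, and the fake server here simply echoes stdin instead of speaking MCP.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"os/exec"
	"strings"
)

// fakeServerEnv is a hypothetical switch that turns this binary into the
// fake server when it is re-executed.
const fakeServerEnv = "DEMO_FAKE_SERVER"

func init() {
	// In the child, behave as a trivial stdio server: echo stdin to stdout
	// until stdin is closed, then exit before main runs.
	if os.Getenv(fakeServerEnv) == "1" {
		_, _ = io.Copy(os.Stdout, os.Stdin)
		os.Exit(0)
	}
}

// roundTrip re-execs the current binary as the fake server, sends one line
// over its stdin, and returns the line echoed back on its stdout.
func roundTrip(msg string) (string, error) {
	exe, err := os.Executable()
	if err != nil {
		return "", err
	}
	cmd := exec.Command(exe)
	cmd.Env = append(os.Environ(), fakeServerEnv+"=1")
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return "", err
	}
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return "", err
	}
	if err := cmd.Start(); err != nil {
		return "", err
	}
	_, werr := fmt.Fprintln(stdin, msg)
	reply, rerr := bufio.NewReader(stdout).ReadString('\n')
	_ = stdin.Close() // EOF on stdin lets the child exit
	waitErr := cmd.Wait()
	for _, e := range []error{werr, rerr, waitErr} {
		if e != nil {
			return "", e
		}
	}
	return strings.TrimSpace(reply), nil
}

func main() {
	reply, err := roundTrip("ping")
	fmt.Println(reply, err)
}
```

Re-executing the running binary avoids depending on an external server program, which keeps such a test hermetic.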
> Generated with [Coder Agents](https://coder.com/agents)