Files
coder/cli
Kyle Carberry a6b9a25f82 fix(cli): bypass access URL redirect for inter-replica chat relay (#22635)
## Summary

Fixes cross-replica chat relay failing with:

```
failed to open initial relay for chat stream
    error= dial relay stream: - failed to WebSocket dial: expected handshake response status code 101 but got 200

failed to open relay for message parts
    error= dial relay stream: - failed to WebSocket dial: expected handshake response status code 101 but got 200
```

Subscribers see accurate `status=running` (delivered via pubsub) but
miss all in-progress `message_part` events (delivered only via the relay
WebSocket that never connects).

## Root cause

`redirectToAccessURL` in `cli/server.go` redirects any request whose
`Host` header doesn't match the access URL. The enterprise chat relay
dials another replica directly via its DERP relay address (e.g.
`http://10.0.0.2:8080`), so the `Host` header is the pod IP — not the
access URL.

This triggers a **307 redirect** to the access URL. The WebSocket
library follows the redirect, but the second request is a plain GET —
`Connection: Upgrade` and `Upgrade: websocket` headers are **not carried
over** by HTTP redirect semantics. The load-balanced access URL routes
the plain GET to any replica, which serves the SPA catch-all handler and
returns **HTTP 200 with `index.html`**.

The WebSocket library then fails: `expected handshake response status
code 101 but got 200`.

DERP mesh already has an exemption for this exact scenario
(`isDERPPath`). Chat relay was added later and didn't get one.

## Fix

Bypass `redirectToAccessURL` for requests that carry the
`X-Coder-Relay-Source-Replica` header, which the enterprise relay
already sets on every request (`enterprise/coderd/chatd/chatd.go:573`).

## Sequence diagram

**Before (broken):**
```
Replica A (subscriber)          Replica B (worker)           Load Balancer
       |                              |                           |
       |--- WS dial pod-ip:8080 ----->|                           |
       |                              |-- 307 redirect to LB --->|
       |                              |                           |
       |<----------- plain GET (no Upgrade headers) ------------->|
       |                              |                           |-- routes to any replica
       |<----------- 200 index.html -------------------------------|
       |                              |
       X  'expected 101 but got 200'  |
```

**After (fixed):**
```
Replica A (subscriber)          Replica B (worker)
       |                              |
       |--- WS dial pod-ip:8080 ----->|
       |    (X-Coder-Relay-Source-    |
       |     Replica header set)      |
       |                              |-- bypass redirect
       |<--------- 101 Upgrade ------|
       |<==== message_part events ====|
```
2026-03-04 20:26:03 -05:00
..
2023-09-08 18:21:33 +00:00
2026-02-09 09:56:33 +00:00
2026-02-09 09:56:33 +00:00