Files
coder/codersdk
Kyle Carberry 63b6868113 fix(codersdk): propagate HTTPClient to websocket.Dial for TLS relay (#22642)
## Problem

In multi-replica Coder deployments, the chat relay WebSocket between
replicas fails with HTTP 401 (or TLS handshake errors). The subscriber
replica cannot relay `message_part` events from the worker replica.

**Root cause:** `codersdk.Client.Dial()` does not pass `c.HTTPClient` to
`websocket.DialOptions.HTTPClient`. The websocket library
(`github.com/coder/websocket`) falls back to `http.DefaultClient`, which
lacks the mesh TLS configuration needed for inter-replica communication.

The relay code in `enterprise/coderd/chatd/chatd.go` correctly sets
`sdkClient.HTTPClient = cfg.ReplicaHTTPClient` (which has mesh TLS
certs), but that client was never used for the actual WebSocket
handshake.

## Fix

One-line fix in `codersdk/client.go`: propagate `c.HTTPClient` to
`opts.HTTPClient` when the caller hasn't already set one.

## Test

Added `TestChatStreamRelay/RelayWithTLSAndCookieAuth` which:
- Sets up two replicas with TLS certificates (simulating mesh TLS in
production)
- Authenticates via cookies (simulating browser WebSocket behavior)
- Verifies message_part events relay across replicas over TLS

This test times out without the fix because the WebSocket handshake
fails with `x509: certificate signed by unknown authority`
(http.DefaultClient rejects self-signed certs).

## Related

Follow-up to #22635 which fixed the `redirectToAccessURL` middleware
bypassing 307 redirects for relay requests. That fix changed the error
from HTTP 200 to HTTP 401, exposing this deeper issue.
2026-03-04 21:57:23 -05:00
..