Commit Graph

14 Commits

Author SHA1 Message Date
Kyle Carberry a6b9a25f82 fix(cli): bypass access URL redirect for inter-replica chat relay (#22635)
## Summary

Fixes cross-replica chat relay failing with:

```
failed to open initial relay for chat stream
    error= dial relay stream: - failed to WebSocket dial: expected handshake response status code 101 but got 200

failed to open relay for message parts
    error= dial relay stream: - failed to WebSocket dial: expected handshake response status code 101 but got 200
```

Subscribers see accurate `status=running` (delivered via pubsub) but
miss all in-progress `message_part` events (delivered only via the relay
WebSocket that never connects).

## Root cause

`redirectToAccessURL` in `cli/server.go` redirects any request whose
`Host` header doesn't match the access URL. The enterprise chat relay
dials another replica directly via its DERP relay address (e.g.
`http://10.0.0.2:8080`), so the `Host` header is the pod IP — not the
access URL.

This triggers a **307 redirect** to the access URL. The WebSocket
library follows the redirect, but the second request is a plain GET —
`Connection: Upgrade` and `Upgrade: websocket` headers are **not carried
over** by HTTP redirect semantics. The load-balanced access URL routes
the plain GET to any replica, which serves the SPA catch-all handler and
returns **HTTP 200 with `index.html`**.

The WebSocket library then fails: `expected handshake response status
code 101 but got 200`.

DERP mesh already has an exemption for this exact scenario
(`isDERPPath`). Chat relay was added later and didn't get one.

## Fix

Bypass `redirectToAccessURL` for requests that carry the
`X-Coder-Relay-Source-Replica` header, which the enterprise relay
already sets on every request (`enterprise/coderd/chatd/chatd.go:573`).

## Sequence diagram

**Before (broken):**
```
Replica A (subscriber)          Replica B (worker)           Load Balancer
       |                              |                           |
       |--- WS dial pod-ip:8080 ----->|                           |
       |                              |-- 307 redirect to LB --->|
       |                              |                           |
       |<----------- plain GET (no Upgrade headers) ------------->|
       |                              |                           |-- routes to any replica
       |<----------- 200 index.html -------------------------------|
       |                              |
       X  'expected 101 but got 200'  |
```

**After (fixed):**
```
Replica A (subscriber)          Replica B (worker)
       |                              |
       |--- WS dial pod-ip:8080 ----->|
       |    (X-Coder-Relay-Source-    |
       |     Replica header set)      |
       |                              |-- bypass redirect
       |<--------- 101 Upgrade ------|
       |<==== message_part events ====|
```
2026-03-04 20:26:03 -05:00
Spike Curtis bddb808b25 chore: arrange imports in a standard way (#21452)
Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example:

```
import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"golang.org/x/xerrors"
	"gopkg.in/natefinch/lumberjack.v2"

	"cdr.dev/slog/v3"
	"github.com/coder/coder/v2/codersdk/agentsdk"
	"github.com/coder/serpent"
)
```

3 groups: standard library, 3rd partly libs, Coder libs.

This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.
2026-01-08 15:24:11 +04:00
Spike Curtis 49b34a716a fix: fix slog to always use array of Fields (#21426)
Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptible call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder).

It also updates dependencies that also use slog and were updated.

I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule.

Other dependencies, I pushed new tags.
2026-01-08 10:29:41 +04:00
ケイラ fae30a00fd chore: remove unnecessary redeclarations in for loops (#18440) 2025-06-20 13:16:55 -06:00
Marcin Tojek 72f62578c1 fix: escape special characters in postgres password (#16510)
Fixes: https://github.com/coder/coder/issues/16319

This PR modifies existing escaping logic for special characters in
Postgres password, so it does fail on edge cases like `#` or `$` when
parser recognizes as invalid port.
2025-02-11 09:06:42 +01:00
Spike Curtis 5861e516b9 chore: add standard test logger ignoring db canceled (#15556)
Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests.  This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes.

Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.
2024-11-18 14:09:22 +04:00
Colin Adler af001773db fix!: remove TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA cipher by default (#13837)
This cipher is included by default in Go as a fallback, but is marked as
an insecure cipher. This removes the 3des cipher by default.

Before:
```
$ nmap --script ssl-enum-ciphers -p 443 xxxxxxx
Starting Nmap 7.94 ( https://nmap.org ) at 2024-07-08 14:16 CDT
Nmap scan report for xxxxx (xxx.xxx.xxx.xxx)
Host is up (0.038s latency).
rDNS record for xxx.xxx.xxx.xxx: xxx.xxx.xxx.xxx.bc.googleusercontent.com

PORT    STATE SERVICE
443/tcp open  https
| ssl-enum-ciphers:
|   TLSv1.2:
|     ciphers:
|       TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (secp256r1) - A
|       TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256 (secp256r1) - A
|       TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (secp256r1) - A
|       TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA (secp256r1) - A
|       TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA (secp256r1) - A
|       TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA (secp256r1) - C
|     compressors:
|       NULL
|     cipher preference: server
|     warnings:
|       64-bit block cipher 3DES vulnerable to SWEET32 attack
|   TLSv1.3:
|     ciphers:
|       TLS_AKE_WITH_AES_128_GCM_SHA256 (ecdh_x25519) - A
|       TLS_AKE_WITH_AES_256_GCM_SHA384 (ecdh_x25519) - A
|       TLS_AKE_WITH_CHACHA20_POLY1305_SHA256 (ecdh_x25519) - A
|     cipher preference: server
|_  least strength: C
```

After:
```
$ nmap --script ssl-enum-ciphers -p 443 xxxxxxx
Starting Nmap 7.94 ( https://nmap.org ) at 2024-07-08 15:04 CDT
Nmap scan report for xxxxx (xxx.xxx.xxx.xxx)
Host is up (0.039s latency).
rDNS record for xxx.xxx.xxx.xxx: xxx.xxx.xxx.xxx.bc.googleusercontent.com

PORT    STATE SERVICE
443/tcp open  https
| ssl-enum-ciphers:
|   TLSv1.2:
|     ciphers:
|       TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA (secp256r1) - A
|       TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (secp256r1) - A
|       TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA (secp256r1) - A
|       TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (secp256r1) - A
|       TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256 (secp256r1) - A
|     compressors:
|       NULL
|     cipher preference: client
|   TLSv1.3:
|     ciphers:
|       TLS_AKE_WITH_AES_128_GCM_SHA256 (ecdh_x25519) - A
|       TLS_AKE_WITH_AES_256_GCM_SHA384 (ecdh_x25519) - A
|       TLS_AKE_WITH_CHACHA20_POLY1305_SHA256 (ecdh_x25519) - A
|     cipher preference: server
|_  least strength: A
```

* fixup! fix!(cli): remove `TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA` cipher by default

* fixup! fix!(cli): remove `TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA` cipher by default
2024-07-09 17:18:27 +00:00
Spike Curtis b43344b672 feat: use latest gVisor and go 1.22.3 (#13338) 2024-05-23 08:22:44 -04:00
Ammar Bandukwala 496232446d chore(cli): replace clibase with external coder/serpent (#12252) 2024-03-15 11:24:38 -05:00
Garrett Delfosse 5981abd689 fix: handle unescaped userinfo in postgres url (#11396)
* fix: handle unescaped userinfo in postgres url

* add tests

* fix tests
2024-01-04 08:46:00 -05:00
Cian Johnston abafc0863c feat(coderd): store workspace proxy version in the database (#10790)
Stores workspace proxy version in database upon registration.
2023-11-21 11:21:25 +00:00
Spike Curtis 5173bce5cc fix: stop redirecting DERP and replicasync http requests (#10752)
Fixes an issue where setting CODER_REDIRECT_TO_ACCESS_URL breaks use of multiple Coder server replicas for DERP traffic.
2023-11-20 14:46:59 +04:00
Spike Curtis 71f87d054f fix: accept legacy redirect HTTP environment variables (#10748)
> Can someone help me understand the differences between these env variables:
>
>    CODER_REDIRECT_TO_ACCESS_URL
>    CODER_TLS_REDIRECT_HTTP_TO_HTTPS
>    CODER_TLS_REDIRECT_HTTP

Oh man, what a mess. It looks like `CODER_TLS_REDIRECT_HTTP ` appears in our config docs. Maybe that was the initial name for the environment variable?

At some point, both the flag and the environment variable were `--tls-redirect-http-to-https` and `CODER_TLS_REDIRECT_HTTP_TO_HTTPS`.  `CODER_TLS_REDIRECT_HTTP` did nothing.

However, then we introduced `CODER_REDIRECT_TO_ACCESS_URL`, we put in some deprecation code that was maybe fat-fingered such that we accept the environment variable `CODER_TLS_REDIRECT_HTTP` but the flag `--tls-redirect-http-to-https`.  Our docs still refer to `CODER_TLS_REDIRECT_HTTP` at https://coder.com/docs/v2/latest/admin/configure#address

So, I think what we gotta do is still accept `CODER_TLS_REDIRECT_HTTP` since it was working and in an example doc, but also fix the deprecation code to accept `CODER_TLS_REDIRECT_HTTP_TO_HTTPS` environment variable.
2023-11-17 15:09:29 +04:00
Steven Masley 64398def48 feat: add configurable cipher suites for tls listening (#10505)
* feat: add configurable cipher suites for tls listening
* tls.VersionName is go 1.21, copy the function
2023-11-07 14:55:39 +00:00