mirror of
https://github.com/coder/coder.git
synced 2026-06-03 04:58:23 +00:00
5a5828b090
## Problem

When `coder ssh --stdio` checks for Coder Connect availability, it constructs a hostname like `agent.workspace.owner.coder` and performs a DNS AAAA lookup via `ExistsViaCoderConnect`. Without a trailing dot, this hostname is not a fully-qualified domain name (FQDN), so the system DNS resolver appends each configured search domain before querying.

Go's pure-Go DNS resolver (used when `CGO_ENABLED=0`, which is the default for CLI builds) does **not** stop after getting NXDOMAIN on the first name. It tries all names in the search list sequentially:

1. `agent.workspace.owner.coder.` → NXDOMAIN (fast)
2. `agent.workspace.owner.coder.corp.example.com.` → timeout
3. `agent.workspace.owner.coder.internal.company.com.` → timeout

On corporate networks where the search-domain-expanded queries hit DNS infrastructure that drops rather than responds (common for nonsensical hostnames with deep subdomain chains), each expanded query hits the full DNS timeout (default 5s × 2 attempts = 10s per name). With 2-3 search domains, this compounds to 20-30+ seconds of blocking.

## Fix

Adding a trailing dot marks the hostname as an FQDN. Go's `nameList()` in `src/net/dnsclient_unix.go` returns a single-entry list for rooted names, completely bypassing search domain expansion.

This is consistent with how `IsCoderConnectRunning` already handles its DNS check: `tailnet.IsCoderConnectEnabledFmtString` includes a trailing dot for exactly this reason.

## Verification

Tested with a fake DNS server that responds with NXDOMAIN for `.coder` queries but drops search-domain-expanded queries:

| Hostname | Time | Queries sent |
|---|---|---|
| `main.workstation.kevin.coder` (no trailing dot) | **~15s** | 4 (as-is + 3 search domains) |
| `main.workstation.kevin.coder.` (trailing dot) | **<1ms** | 1 (FQDN only) |

Closes https://github.com/coder/coder/issues/22581

_Generated by [mux](https://github.com/coder/mux) but reviewed by a human_