Closes https://github.com/coder/internal/issues/563
The [Coder Connect
tunnel](https://github.com/coder/coder/blob/main/vpn/tunnel.go) receives
workspace state from the Coder server over a [dRPC
stream.](https://github.com/coder/coder/blob/114ba4593b2a82dfd41cdcb7fd6eb70d866e7b86/tailnet/controllers.go#L1029)
When first connecting to this stream, the current state of the user's
workspaces is received, with subsequent messages being diffs on top of
that state.
However, if the client disconnects from this stream, such as when the
user's device is suspended, and then reconnects later, no mechanism
exists for the tunnel to differentiate that message containing the
entire initial state from another diff, and so that state is incorrectly
applied as a diff.
In practice:
- Tunnel connects, receives a workspace update containing all the
existing workspaces & agents.
- Tunnel loses connection, but isn't completely stopped.
- All the user's workspaces are restarted, producing a new set of
agents.
- Tunnel regains connection, and receives a workspace update containing
all the existing workspaces & agents.
- This initial update is incorrectly applied as a diff, with the
Tunnel's state containing both the old & new agents.
This PR introduces a solution in which tunnelUpdater, when created,
sends a FreshState flag with the WorkspaceUpdate type. This flag is
handled in the vpn tunnel in the following fashion:
- Preserve existing Agents
- Remove current Agents in the tunnel that are not present in the
WorkspaceUpdate
- Remove unreferenced Workspaces
- Update go.mod to use Go 1.24.1
- Update GitHub Actions setup-go action to use Go 1.24.1
- Fix linting issues with golangci-lint by:
- Updating to golangci-lint v1.57.1 (more compatible with Go 1.24.1)
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <claude@anthropic.com>
Records the Device ID, Device OS and Coder Desktop version to telemetry.
These values are provided by the Coder Desktop client in the StartRequest method of the VPN protocol. We render them as an HTTP header to transmit to Coderd, where they are decoded and added to telemetry.
Prevents the VPN startup from hanging for 5 minutes due to a startup
backoff if `wintun.dll` cannot be loaded.
Because the `wintun` package doesn't expose an easy `Load() error`
method for us, the only way for us to force it to load (without unwanted
side effects) is through `wintun.Version()` which doesn't return an
error message.
So, we call that function so the `wintun` package loads the DLL and
configures the logging properly, then we try to load the DLL ourselves.
`LoadLibraryEx` will not load the library multiple times and returns a
reference to the existing library.
Closes https://github.com/coder/coder-desktop-windows/issues/24
On the Mac app, we want to display the shortest FQDN - we might as well
do the sorting as they leave the tunnel.
Right now it's coming from a map, so it's they arrive in a random order
each peer update.
Previously we were configuring using the display name of the user, which
may contain spaces, special characters, and isn't unique. This was
always supposed to be the username.
Replace Depot build action with Nix for Nix dogfood image builds
The dogfood Nix image is now built using Nix's native container tooling instead of Depot. This change:
- Adds Nix setup steps to the GitHub Actions workflow
- Removes the Dockerfile.nix in favor of a Nix-native container build
- Updates the flake.nix to support building Docker images
- Introduces a hash file to track Nix-related changes
- Updates the vendorHash for Go dependencies
Change-Id: I4e011fe3a19d9a1375fbfd5223c910e59d66a5d9
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Previously, a `nil` Router config would cause a panic in the dylib.
Normally, a nil Router config would indicate a shutdown of the service,
and that settings should be reset. However, for Coder Desktop macOS the
network configuration will be reset by the disconnecting of the system
VPN, so we'll instead do nothing.
- Adds `testutil.GoleakOptions` and consolidates existing options to
this location
- Pre-emptively adds required ignore for this Dependabot PR to pass CI
https://github.com/coder/coder/pull/16066
Migrates us to `coder/websocket` v1.8.12 rather than `nhooyr/websocket` on an older version.
Works around https://github.com/coder/websocket/issues/504 by adding an explicit test for `xerrors.Is(err, io.EOF)` where we were previously getting `io.EOF` from the netConn.
Closes#14734.
- Each outgoing agent upsertion also includes the timestamp of the last wireguard handshake.
- Agent upsertions will be created, for existing agents, with an updated last handshake time on a regular, fixed, interval of 10 seconds.
Addresses #14734.
This PR wires up `tunnel.go` to a `tailnet.Conn` via the new `/tailnet` endpoint, with all the necessary controllers such that a VPN connection can be started, stopped and inspected via the CoderVPN protocol.
Changes the RPC header format from `codervpn <version> <role>` to
`codervpn <role> <version1,version2,...>`.
The versions list is a list of the maximum supported minor version for
each major version, sorted by major versions.
E.g. `1.0,2.3,3.1` means `1.0, 2.0, 2.1, 2.2, 2.3, 3.0, 3.1` are
supported.
When we eventually support multiple versions, the peer's version list
will be compared against the current supported versions list to
determine the maximum major and minor version supported by both peers.
Closes#15601
`coder vpn-daemon run` will instantiate a RPC connection with the
specified pipe handles and communicate with the (yet to be implemented)
parent process.
The tests don't ensure that the tunnel is actually usable yet as the
tunnel functionality isn't implemented, but it does make sure that the
tunnel tries to read from the RPC pipe.
Closes#14735
Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests. This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes.
Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.
Bumps the Tailnet and Agent API version 2.3, and creates some extra controls and machinery around these versions.
What happened is that we accidentally shipped two new API features without bumping the version. `ScriptCompleted` on the Agent API in Coder v2.16 and `RefreshResumeToken` on the Tailnet API in Coder v2.15.
Since we can't easily retroactively bump the versions, we'll roll these changes into API version 2.3 along with the new WorkspaceUpdates RPC, which hasn't been released yet. That means there is some ambiguity in Coder v2.15-v2.17 about exactly what methods are supported on the Tailnet and Agent APIs. This isn't great, but hasn't caused us major issues because
1. RefreshResumeToken is considered optional, and clients just log and move on if the RPC isn't supported.
2. Agents basically never get started talking to a Coderd that is older than they are, since the agent binary is normally downloaded from Coderd at workspace start.
Still it's good to get things squared away in terms of versions for SDK users and possible edge cases around client and server versions.
To mitigate against this thing happening again, this PR also:
1. adds a CODEOWNERS for the API proto packages, so I'll review changes
2. defines interface types for different API versions, and has the agent explicitly use a specific version. That way, if you add a new method, and try to use it in the agent without thinking explicitly about versions, it won't compile.
With the protocol controllers stuff, we've sort of already abstracted the Tailnet API such that the interface type strategy won't work, but I'll work on getting the Controller to be version aware, such that it can check the API version it's getting against the controllers it has -- in a later PR.