Commit Graph

136 Commits

Author SHA1 Message Date
Spike Curtis 2c7f8ac65f chore: migrate to coder/websocket 1.8.12 (#15898)
Migrates us to `coder/websocket` v1.8.12 rather than `nhooyr/websocket` on an older version.

Works around https://github.com/coder/websocket/issues/504 by adding an explicit test for `xerrors.Is(err, io.EOF)` where we were previously getting `io.EOF` from the netConn.
2024-12-19 00:51:30 +04:00
Spike Curtis 5861e516b9 chore: add standard test logger ignoring db canceled (#15556)
Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests.  This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes.

Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.
2024-11-18 14:09:22 +04:00
Spike Curtis 40802958e9 fix: use explicit api versions for agent and tailnet (#15508)
Bumps the Tailnet and Agent API version 2.3, and creates some extra controls and machinery around these versions.

What happened is that we accidentally shipped two new API features without bumping the version.  `ScriptCompleted` on the Agent API in Coder v2.16 and `RefreshResumeToken` on the Tailnet API in Coder v2.15.

Since we can't easily retroactively bump the versions, we'll roll these changes into API version 2.3 along with the new WorkspaceUpdates RPC, which hasn't been released yet.  That means there is some ambiguity in Coder v2.15-v2.17 about exactly what methods are supported on the Tailnet and Agent APIs.  This isn't great, but hasn't caused us major issues because 

1. RefreshResumeToken is considered optional, and clients just log and move on if the RPC isn't supported. 
2. Agents basically never get started talking to a Coderd that is older than they are, since the agent binary is normally downloaded from Coderd at workspace start.

Still it's good to get things squared away in terms of versions for SDK users and possible edge cases around client and server versions.

To mitigate against this thing happening again, this PR also:

1. adds a CODEOWNERS for the API proto packages, so I'll review changes
2. defines interface types for different API versions, and has the agent explicitly use a specific version.  That way, if you add a new method, and try to use it in the agent without thinking explicitly about versions, it won't compile.

With the protocol controllers stuff, we've sort of already abstracted the Tailnet API such that the interface type strategy won't work, but I'll work on getting the Controller to be version aware, such that it can check the API version it's getting against the controllers it has -- in a later PR.
2024-11-15 11:16:28 +04:00
Ethan b1298a3c1e feat: add WorkspaceUpdates tailnet RPC (#14847)
Closes #14716
Closes #14717

Adds a new user-scoped tailnet API endpoint (`api/v2/tailnet`) with a new RPC stream for receiving updates on workspaces owned by a specific user, as defined in #14716. 

When a stream is started, the `WorkspaceUpdatesProvider` will begin listening on the user-scoped pubsub events implemented in #14964. When a relevant event type is seen (such as a workspace state transition), the provider will query the DB for all the workspaces (and agents) owned by the user. This gets compared against the result of the previous query to produce a set of workspace updates. 

Workspace updates can be requested for any user ID, however only workspaces the authorised user is permitted to `ActionRead` will have their updates streamed.
Opening a tunnel to an agent requires that the user can perform `ActionSSH` against the workspace containing it.
2024-11-01 14:53:53 +11:00
Jon Ayers cd890aa3a0 feat: enable key rotation (#15066)
This PR contains the remaining logic necessary to hook up key rotation
to the product.
2024-10-25 17:14:35 +01:00
Steven Masley 343f8ec9ab chore: join owner, template, and org in new workspace view (#15116)
Joins in fields like `username`, `avatar_url`, `organization_name`,
`template_name` to `workspaces` via a **view**. 
The view must be maintained moving forward, but this prevents needing to
add RBAC permissions to fetch related workspace fields.
2024-10-22 09:20:54 -05:00
Spike Curtis 5bd19f8ba3 fix: fix flake in TestWorkspaceAgentClientCoordinate_ResumeToken (#14642)
fixes #14365

I bet what's going on is that in `connectToCoordinatorAndFetchResumeToken()` we call `Coordinate()`, send a message on the `Coordinate` client and then close it in rapid succession. We don't wait around for a response from the coordinator, so dRPC is likely aborting the call `Coordinate()` in the backend because the stream is closed before it even gets a chance.

Instead of using the Coordinator to record the peer ID assigned on the API call, we can wrap the resume token provider, since we call that API _and_ wait for a response. This also affords the opportunity to directly assert we get called with the right token.
2024-09-11 16:32:47 +04:00
Dean Sheather e8c59a1d9d chore: avoid flake in resume token test (#14378) 2024-08-22 13:27:43 +10:00
Dean Sheather cf8be4eac5 feat: add resume support to coordinator connections (#14234) 2024-08-20 17:16:49 +10:00
Kayla Washburn-Love bf4b7abf14 chore(coderd): allow creating workspaces without specifying an organization (#14048) 2024-07-30 10:44:02 -06:00
Spike Curtis ba7d1835e5 fix: fix flake in TestWorkspaceAgent_Metadata_CatchMemoryLeak (#13553)
Fixes flake seen here: https://github.com/coder/coder/actions/runs/9461246505/job/26061605278

#13486 subtly changes the test so that `post` uses the new v2 Agent API, and when canceling context, there is a race condition where the yamux session underpinning the API can get torn down before the RPC processes the canceled context, yielding a different error response than the test was previously expecting.

I've refactored the test to just stop posting when the test finishes, rather than depend on a context cancel to end the posting goroutine.
2024-06-12 18:33:22 +04:00
Ethan dd243686e4 chore!: remove deprecated agent v1 routes (#13486) 2024-06-11 12:22:59 +10:00
Colin Adler 9d00a26a90 fix: add missing route for codersdk.PostLogSource (#13421) 2024-06-03 12:29:50 -05:00
Garrett Delfosse 5789ea5397 chore: move stat reporting into workspacestats package (#13386) 2024-05-29 11:49:08 -04:00
Colin Adler 4d5a7b2d56 chore(codersdk): move all tailscale imports out of codersdk (#12735)
Currently, importing `codersdk` just to interact with the API requires
importing tailscale, which causes builds to fail unless manually using
our fork.
2024-03-26 12:44:31 -05:00
Garrett Delfosse 0723dd3abf fix: ensure agent token is from latest build in middleware (#12443) 2024-03-14 12:27:32 -04:00
Cian Johnston 74b749b890 chore(coderd): add test to assert agent token invalid when workspace deleted (#12290) 2024-02-26 13:27:00 +00:00
Marcin Tojek c0e169ebf9 feat: support custom order of agent metadata (#12066) 2024-02-08 17:29:34 +01:00
Spike Curtis 1cf4b62867 feat: change agent to use v2 API for reporting stats (#12024)
Modifies the agent to use the v2 API to report its statistics, using the `statsReporter` subcomponent.
2024-02-07 15:26:41 +04:00
Spike Curtis 1aa117b9ec chore: rename client Listen to ConnectRPC (#11916)
ConnectRPC seems more appropriate for this function
2024-02-01 14:44:11 +04:00
Spike Curtis 0fc177203e feat: use agent v2 API to update app health (#11889)
Use the Agent v2 API to update App Health
2024-01-30 11:35:12 +04:00
Spike Curtis 2599850e54 feat: use agent v2 API to post startup (#11877)
Uses the v2 Agent API to post startup information.
2024-01-30 11:23:28 +04:00
Spike Curtis da8bb1c198 feat: use agent v2 API to fetch manifest (#11832)
Agent uses the v2 API to obtain the manifest, instead of the HTTP API.
2024-01-30 10:11:28 +04:00
Steven Masley d66e6e78ee fix: always attempt external auth refresh when fetching (#11762) (#11830)
* fix: always attempt external auth refresh when fetching
* refactor validate to check expiry when considering "valid"
2024-01-29 08:55:15 -06:00
Ammar Bandukwala 79568bf628 Revert "fix: always attempt external auth refresh when fetching (#11762)"
This reverts commit 0befc0826a.
2024-01-25 14:22:47 -06:00
Steven Masley 0befc0826a fix: always attempt external auth refresh when fetching (#11762)
* fix: always attempt external auth refresh when fetching
* refactor validate to check expiry when considering "valid"
2024-01-25 10:54:56 -06:00
Spike Curtis f01cab9894 feat: use tailnet v2 API for coordination (#11638)
This one is huge, and I'm sorry.

The problem is that once I change `tailnet.Conn` to start doing v2 behavior, I kind of have to change it everywhere, including in CoderSDK (CLI), the agent, wsproxy, and ServerTailnet.

There is still a bit more cleanup to do, and I need to add code so that when we lose connection to the Coordinator, we mark all peers as LOST, but that will be in a separate PR since this is big enough!
2024-01-22 11:07:50 +04:00
Kayla Washburn-Love 80eac73ed1 chore: remove useLocalStorage hook (#11712) 2024-01-19 16:04:19 -07:00
Marcin Tojek 5eb3e1cdaa feat: expose owner_name in coder_workspace resource (#11639) 2024-01-17 13:20:45 +01:00
Cian Johnston d583acad00 fix(coderd): workspaceapps: update last_used_at when workspace app reports stats (#11603)
- Adds a new query BatchUpdateLastUsedAt
- Adds calls to BatchUpdateLastUsedAt in app stats handler upon flush
- Passes a stats flush channel to apptest setup scaffolding and updates unit tests to assert modifications to LastUsedAt.
2024-01-16 14:06:39 +00:00
Steven Masley 03ee63931c chore: remove duplicate validate calls on same oauth token (#11598)
* chore: remove duplicate validate calls on same oauth token
2024-01-12 14:27:22 -06:00
Spike Curtis 211e59bf65 feat: add tailnet v2 API support to coordinate endpoint (#11228)
closes #10532

Adds v2 support to the /coordinate endpoint via a query parameter.

v1 already has test cases, and we haven't implemented v2 at the client yet, so the only new test case is an unsupported version.
2023-12-15 14:10:24 +04:00
Spike Curtis 43ba3146a9 feat: add test case for BlockDirect + listening ports (#11152)
Adds a test case for #10391 with single tailnet out of experimental
2023-12-13 12:28:09 +04:00
Garrett Delfosse 228cbec99b fix: stop updating agent stats from deleted workspaces (#11026)
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2023-12-07 13:55:29 -05:00
Szabolcs Fruhwald baf3bf6b9c feat: add workspace_id, owner_name to agent manifest (#10199)
Co-authored-by: Kyle Carberry <kyle@carberry.com>
Co-authored-by: Atif Ali <atif@coder.com>
2023-12-04 00:41:54 +03:00
Jon Ayers 967db2801b chore: refactor ResolveAutostart tests to use dbfake (#10603) 2023-11-30 19:33:04 -06:00
Spike Curtis 2dc565d5de chore: remove New----Builder from dbfake function names (#10882)
Drop "New" and "Builder" from the function names, in favor of the top-level resource created.  This shortens tests and gives a nice syntax.  Since everything is a builder, the prefix and suffix don't add much value and just make things harder to read.

I've also chosen to leave `Do()` as the function to insert into the database.  Even though it's a builder pattern, I fear `.Build()` might be confusing with Workspace Builds.  One other idea is `Insert()` but if we later add dbfake functions that update, this might be inconsistent.
2023-11-29 11:06:04 +04:00
Mathias Fredriksson f441ad66e1 fix(codersdk): keep workspace agent connection open after dial context (#10863) 2023-11-27 14:29:57 +02:00
Spike Curtis 4548ad7cef chore: remove dbfake.Workspace (#10880)
Remove dbfake.Workspace and use builder instead.
2023-11-27 14:39:16 +04:00
Spike Curtis 78283a7fb9 chore: remove dbfake.WorkspaceWithAgent (#10879)
Replace dbfake.WorkspaceWithAgent() with the builder pattern and remove this function.
2023-11-27 14:30:15 +04:00
Mathias Fredriksson 6ecba0fda7 fix(coderd): prevent logging error for query cancellation in watchWorkspaceAgentMetadata (#10843) 2023-11-22 15:32:31 +00:00
Dean Sheather a9c0c01629 chore: fix flake in listening ports test (#10833) 2023-11-22 09:30:51 +00:00
Spike Curtis b25e5dc90b chore: remove dbfake.WorkspaceBuild in favor of builder pattern (#10814)
I'd like to convert dbfake into a builder pattern to prevent a proliferation of XXXWithYYY methods.  This is one step of the way by removing the Non-builder function.
2023-11-22 13:04:58 +04:00
Jon Ayers 51b58cfc98 fix: only update last_used_at when connection count > 0 (#10808) 2023-11-21 18:10:41 -06:00
Mathias Fredriksson 198b56c137 fix(coderd): fix memory leak in watchWorkspaceAgentMetadata (#10685)
Fixes #10550
2023-11-16 17:03:53 +02:00
Kyle Carberry 839a16e299 feat: add dbfake for workspace builds and resources (#10426)
* feat: add dbfakedata for workspace builds and resources

This creates `coderdtest.NewWithDatabase` and adds a series of
helper functions to `dbfake` that insert structured fake data
for resources into the database.

It allows us to remove provisionerd from a significant amount of
tests which should speed them up and reduce flakes.

* Rename dbfakedata to dbfake

* Migrate workspaceagents_test.go to use the new dbfake

* Migrate agent_test.go to use the new fakes

* Fix comments
2023-11-02 17:15:07 +00:00
Spike Curtis a7c671ca07 feat: add workspace agent APIVersion (#10419)
Fixes #10339
2023-10-31 10:08:43 +04:00
Mathias Fredriksson 4857d4bd55 feat(codersdk/agentsdk): use new agent metadata batch endpoint (#10224)
Part of #9782
2023-10-13 17:32:28 +03:00
Spike Curtis 8a47262faf fix: ignore logged errors in TestWorkspaceAgent/Timeout
fixes #10167

Annoyingly, there isn't a good way to stop the publish from being sent on shutdown, and subscribing to them in the test is too fragile because empty messages are sent in a bunch of places, so we can't reliably tell it's regarding timeouts.
2023-10-10 15:45:47 +04:00
Kayla Washburn f001a57614 fix: only allow promoting successful template versions (#9998) 2023-10-05 10:49:25 -06:00