coder

mirror of https://github.com/coder/coder.git synced 2026-06-06 22:48:19 +00:00

Author	SHA1	Message	Date
Mathias Fredriksson	c069563af1	test: fix use of `t.Logf` where `t.Log` would suffice (#16328 )	2025-01-29 14:35:04 +00:00
Cian Johnston	7b88776403	chore(testutil): add testutil.GoleakOptions (#16070 ) - Adds `testutil.GoleakOptions` and consolidates existing options to this location - Pre-emptively adds required ignore for this Dependabot PR to pass CI https://github.com/coder/coder/pull/16066	2025-01-08 15:38:37 +00:00
Spike Curtis	63572d9f53	fix: loosen timing checks for heartbeats (#15923 ) Fixes #15782. I believe that Windows doesn't always have high-resolution timers available, so this PR loosens the check for PG Coordinator heartbeats, to avoid flakes like: https://github.com/coder/coder/actions/runs/12397381823/job/34607639048	2024-12-19 13:49:01 +04:00
Hugo Dutka	83c493e832	chore: fix more flaky tests on Windows with Postgres (#15629 ) Addresses the following flakes: - https://github.com/coder/internal/issues/222 - https://github.com/coder/internal/issues/223 - https://github.com/coder/internal/issues/224 - https://github.com/coder/internal/issues/225 - https://github.com/coder/internal/issues/226 - https://github.com/coder/internal/issues/227 - https://github.com/coder/internal/issues/228 - https://github.com/coder/internal/issues/229 - https://github.com/coder/internal/issues/230	2024-11-26 11:56:07 +01:00
Dean Sheather	fbe2fa66f5	chore: add test for coord rolling restart (#14680 ) Closes https://github.com/coder/team-coconut/issues/50 Co-authored-by: Ethan Dickson <ethan@coder.com>	2024-11-20 18:04:33 +11:00
Spike Curtis	5861e516b9	chore: add standard test logger ignoring db canceled (#15556 ) Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests. This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes. Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.	2024-11-18 14:09:22 +04:00
Spike Curtis	8c00ebc6ee	chore: refactor ServerTailnet to use tailnet.Controllers (#15408 ) chore of #14729 Refactors the `ServerTailnet` to use `tailnet.Controller` so that we reuse logic around reconnection and handling control messages, instead of reimplementing. This unifies our "client" use of the tailscale API across CLI, coderd, and wsproxy.	2024-11-08 13:18:56 +04:00
Hugo Dutka	1bfa7d42e8	chore: add postgres template caching for tests (#15336 ) This PR is the first in a series aimed at closing [#15109](https://github.com/coder/coder/issues/15109). Changes: (1) Template Database Creation: `dbtestutil.Open` now has the ability to create a template database if none is provided via `DB_FROM`. The template database’s name is derived from a hash of the migration files, ensuring that it can be reused across tests and is automatically updated whenever migrations change. (2) Optimized Database Handling: Previously, `dbtestutil.Open` would spin up a new container for each test when `DB_FROM` was unset. Now, it first checks for an active PostgreSQL instance on `localhost:5432`. If none is found, it creates a single container that remains available for subsequent tests, eliminating repeated container startups. These changes address the long individual test times (10+ seconds) reported by some users, likely due to the time Docker took to start and complete migrations.	2024-11-04 17:23:31 +01:00
Ethan	b1298a3c1e	feat: add WorkspaceUpdates tailnet RPC (#14847 ) Closes #14716 Closes #14717 Adds a new user-scoped tailnet API endpoint (`api/v2/tailnet`) with a new RPC stream for receiving updates on workspaces owned by a specific user, as defined in #14716. When a stream is started, the `WorkspaceUpdatesProvider` will begin listening on the user-scoped pubsub events implemented in #14964. When a relevant event type is seen (such as a workspace state transition), the provider will query the DB for all the workspaces (and agents) owned by the user. This gets compared against the result of the previous query to produce a set of workspace updates. Workspace updates can be requested for any user ID, however only workspaces the authorised user is permitted to `ActionRead` will have their updates streamed. Opening a tunnel to an agent requires that the user can perform `ActionSSH` against the workspace containing it.	2024-11-01 14:53:53 +11:00
Spike Curtis	7d9f5ab81d	chore: add Coder service prefix to tailnet (#14943 ) re: #14715 This PR introduces the Coder service prefix: `fd60:627a:a42b::/48` and refactors our existing code as calling the Tailscale service prefix explicitly (rather than implicitly). Removes the unused `Addresses` agent option. All clients today assume they can compute the Agent's IP address based on its UUID, so an agent started with a custom address would break things.	2024-10-04 10:04:10 +04:00
Spike Curtis	d6154c4310	chore: remove tailnet v1 API support (#14641 ) Drops support for v1 of the tailnet API, which was the original coordination protocol where we only sent node updates, never marked them lost or disconnected. v2 of the tailnet API went GA for CLI clients in Coder 2.8.0, so clients older than that would stop working.	2024-09-12 07:56:31 +04:00
Spike Curtis	fb3523b37f	chore: remove legacy AgentIP address (#14640 ) Removes the support for the Agent's "legacy IP" which was a hardcoded IP address all agents used to use, before we introduced "single tailnet". Single tailnet went GA in 2.7.0.	2024-09-12 07:40:19 +04:00
Spike Curtis	7b39f6b0d4	fix: improves coordination logging (#14556 )	2024-09-04 15:10:43 +04:00
Dean Sheather	cf8be4eac5	feat: add resume support to coordinator connections (#14234 )	2024-08-20 17:16:49 +10:00
Jon Ayers	4fc047954e	fix: avoid deleting peers on graceful close (#14165 ) Fixes an issue where a coordinator deletes all its peers on shutdown. This can cause disconnects whenever a coderd is redeployed.	2024-08-14 15:16:08 -04:00
Spike Curtis	e5268e4551	chore: spin clock library out to coder/quartz repo (#13777 ) Code that was in `/clock` has been moved to github.com/coder/quartz. This PR refactors our use of the clock library to point to the external Quartz repo.	2024-07-03 15:02:54 +04:00
Dean Sheather	6c94dd4f23	chore: add DRPC server implementation for network telemetry (#13675 )	2024-07-02 01:50:52 +10:00
Spike Curtis	ce7f13c6c3	fix: fix TestPGCoordinatorSingle_MissedHeartbeats flake (#13686 )	2024-06-27 19:17:24 +04:00
Steven Masley	5ccf5084e8	chore: create type for unique role names (#13506 ) Using `string` made it unclear when a role name should be combined with org context and when it should not. The new type is named "RoleIdentifier".	2024-06-11 08:55:28 -05:00
Spike Curtis	8326a3a675	chore: change mock clock to allow Advance() within timer/tick functions (#13500 )	2024-06-10 15:27:24 +04:00
Spike Curtis	a0962ba089	fix: wait for PGCoordinator to clean up db state (#13351 ) c.f. https://github.com/coder/coder/pull/13192#issuecomment-2097657692 We need to wait for PGCoordinator to finish its work before returning on `Close()`, so that we delete database state (best effort -- if this fails others will filter it out based on heartbeats).	2024-05-24 12:01:03 +04:00
Steven Masley	1f5788feff	chore: remove rbac pseudo resources, add custom verbs (#13276 ) Removes our pseudo rbac resources like `WorkspaceApplicationConnect` in favor of additional verbs like `ssh`. This makes permissions more intuitive when building custom roles. The source of truth is now `policy.go`	2024-05-15 11:09:42 -05:00
Steven Masley	cb6b5e8fbd	chore: push rbac actions to policy package (#13274 ) Just moved `rbac.Action` -> `policy.Action`. This lets the stacked PR avoid circular dependencies when doing autogen. Without this, the autogen can produce broken golang code, which prevents the autogen from compiling. Doing this in its own PR to reduce LoC diffs in the primary PR, since this has 0 functional changes.	2024-05-15 09:46:35 -05:00
Colin Adler	205c43da99	fix(enterprise): mark nodes from unhealthy coordinators as lost (#13123 ) Instead of removing the mappings of unhealthy coordinators entirely, mark them as lost instead. This prevents peers from disappearing from other peers if a coordinator misses a heartbeat.	2024-05-03 14:07:29 -05:00
Colin Adler	777dfbe965	feat(enterprise): add ready for handshake support to pgcoord (#12935 )	2024-04-16 15:01:10 -05:00
Spike Curtis	06eae954c9	fix: stop sending DeleteTailnetPeer when coordinator is unhealthy (#12925 ) fixes #12923 Prevents Coordinate peer connections from generating spurious database queries like DeleteTailnetPeer when the coordinator is unhealthy. It does this by checking the health of the querier before accepting a connection, rather than unconditionally accepting it only for it to get swatted down later.	2024-04-10 22:49:13 +04:00
Colin Adler	4d5a7b2d56	chore(codersdk): move all tailscale imports out of `codersdk` (#12735 ) Currently, importing `codersdk` just to interact with the API requires importing tailscale, which causes builds to fail unless manually using our fork.	2024-03-26 12:44:31 -05:00
Steven Masley	bd6ad88077	chore: nolint always return error function (#12701 )	2024-03-21 09:35:10 -05:00
Steven Masley	131d0bd2ba	chore: fix linting issue in main (#12697 )	2024-03-20 20:15:01 -05:00
Colin Adler	e5d911462f	fix(tailnet): enforce valid agent and client addresses (#12197 ) This adds the ability for `TunnelAuth` to also authorize incoming wireguard node IPs, preventing agents from reporting anything other than their static IP generated from the agent ID.	2024-03-01 09:02:33 -06:00
Cian Johnston	a2cbb0f87f	fix(enterprise/coderd): check provisionerd API version on connection (#12191 )	2024-02-16 18:43:07 +00:00
Spike Curtis	627232eae9	fix: fix pgcoord to delete coordinator row last (#12155 ) Fixes #12141 Fixes #11750 PGCoord shutdown was uncoordinated, so an update at an inopportune time during shutdown would be rejected because the coordinator row was already deleted. This PR ensures that the PGCoord subcomponents that write updates are shut down before we take down the heartbeats, which is responsible for deleting the coordinator row.	2024-02-15 16:34:29 +04:00
Spike Curtis	1c8b803785	feat: add logging to pgcoord subscribe/unsubscribe (#11952 ) Adds logging to unsubscribing from peer and tunnel updates in pgcoordinator, since #11950 seems to be a problem with these subscriptions	2024-01-31 12:15:58 +04:00
Spike Curtis	520b12e1a2	fix: close MultiAgentConn when coordinator closes (#11941 ) Fixes an issue where a MultiAgentConn isn't closed properly when the coordinator it is connected to is closed. Since servertailnet checks whether the conn is closed before reinitializing, it is important that we check this, otherwise servertailnet can get stuck if the coordinator closes (e.g. when we switch from AGPL to PGCoordinator after decoding a license).	2024-01-31 00:38:19 +04:00
Cian Johnston	ecae6f9135	fix(enterprise/tailnet): handle query canceled error in sendBeat() (#11794 )	2024-01-24 18:42:05 +00:00
Spike Curtis	f01cab9894	feat: use tailnet v2 API for coordination (#11638 ) This one is huge, and I'm sorry. The problem is that once I change `tailnet.Conn` to start doing v2 behavior, I kind of have to change it everywhere, including in CoderSDK (CLI), the agent, wsproxy, and ServerTailnet. There is still a bit more cleanup to do, and I need to add code so that when we lose connection to the Coordinator, we mark all peers as LOST, but that will be in a separate PR since this is big enough!	2024-01-22 11:07:50 +04:00
Spike Curtis	8910ac715c	feat: add tailnet v2 support to wsproxy coordinate endpoint (#11637 ) wsproxy also needs to be updated to use tailnet v2 because the `tailnet.Conn` stores peers by ID, and the peerID was not being carried by the JSON protocol. This adds a query param to the endpoint to conditionally switch to the new protocol.	2024-01-18 10:10:36 +04:00
Spike Curtis	cae095fdb6	fix: stop logging errors on canceled cleanup queries (#11547 ) Fixes flake seen here: https://github.com/coder/coder/actions/runs/7474259128/job/20340051975	2024-01-10 16:20:29 +04:00
Spike Curtis	64638b381d	feat: promote PG Coordinator out of experimental (#11398 ) Promotes PG Coordinator out of experimental to GA	2024-01-05 08:03:36 +04:00
Steven Masley	dd05a6b13a	chore: mockgen archived, moved to new location (#11415 )	2024-01-04 18:35:56 -06:00
Spike Curtis	f2606a78dd	fix: avoid converting nil node fixes: #11276	2023-12-19 13:38:15 +04:00
Dean Sheather	e46431078c	feat: add AgentAPI using DRPC (#10811 ) Co-authored-by: Spike Curtis <spike@coder.com>	2023-12-18 22:53:28 +10:00
Spike Curtis	ad3fed72bc	chore: rename Coordinator to CoordinatorV1 (#11222 ) Renames the tailnet.Coordinator to represent both v1 and v2 APIs, so that we can use this interface for the main atomic pointer. Part of #10532	2023-12-15 11:38:12 +04:00
Spike Curtis	bf3b35b1e2	fix: stop logging context Canceled as error (#11177 ) fixes #11166 and a related log that could have the same problem	2023-12-13 13:08:30 +04:00
Spike Curtis	b34ecf1e9e	fix: fix deadlock of mappingQuery on context canceled Fixes #11078 replace bare channel send with SendCtx so that we properly shut down when context is canceled.	2023-12-07 17:19:18 +04:00
Spike Curtis	2c86d0bed0	feat: support v2 Tailnet API in AGPL coordinator (#11010 ) Fixes #10529	2023-12-06 15:04:28 +04:00
Spike Curtis	812fb95273	fix: prevent connIO from panicking in race between Close and Enqueue (#10948 ) Spotted during a code read. ConnIO unlocks the mutex before attempting to write to the response channel, which could allow another goroutine to call Close() and close the channel, causing a panic. Fix is to hold the mutex. This won't cause a deadlock because the `select{}` has a `default` case, so we won't block even if the receiver isn't keeping up.	2023-12-01 10:23:29 +04:00
Spike Curtis	612e67a53b	feat: add cleanup of lost tailnet peers and tunnels to PGCoordinator (#10939 ) Adds the "lost" peer cleanup queries to PGCoordinator, including tests.	2023-12-01 10:13:29 +04:00
Spike Curtis	0cab6e7763	feat: support graceful disconnect in PGCoordinator (#10937 ) Adds support for graceful disconnect to PGCoordinator. When peers gracefully disconnect, they send a disconnect message. This triggers the peer to be disconnected from all tunneled peers. The Multi-Agent Client supports graceful disconnect, since it is in memory and we know that when it is closed, we really mean to disconnect. The v1 agent and client Websocket connections do not support graceful disconnect, since the v1 protocol doesn't have this feature. That means that if a v1 peer connects to a v2 peer, when the v1 peer's coordinator connection is closed, the v2 peer will see it as "lost" since we don't know whether the v1 peer meant to disconnect, or it just lost connectivity to the coordinator.	2023-12-01 09:55:25 +04:00
Spike Curtis	52901e1219	feat: implement HTMLDebug for PGCoord with v2 API (#10914 ) Implements HTMLDebug for the PGCoordinator with the new v2 API and related DB tables.	2023-11-28 22:37:20 +04:00

83 Commits