coder

mirror of https://github.com/coder/coder.git synced 2026-06-06 14:38:23 +00:00

Author	SHA1	Message	Date
Spike Curtis	0fc177203e	feat: use agent v2 API to update app health (#11889 ) Use the Agent v2 API to update App Health	2024-01-30 11:35:12 +04:00
Spike Curtis	2599850e54	feat: use agent v2 API to post startup (#11877 ) Uses the v2 Agent API to post startup information.	2024-01-30 11:23:28 +04:00
Spike Curtis	da8bb1c198	feat: use agent v2 API to fetch manifest (#11832 ) Agent uses the v2 API to obtain the manifest, instead of the HTTP API.	2024-01-30 10:11:28 +04:00
Spike Curtis	9cf4e7f15a	fix: prevent agent_test.go from failing on error logs (#11909 ) We're failing tests on error logs like this: https://github.com/coder/coder/actions/runs/7706053882/job/21000984583 Unfortunately, the error we hit, when the underlying connection is closed, is unexported, so we can't specifically ignore it. Part of the issue is that agent.Close() doesn't wait for these goroutines to complete before returning, so the test harness proceeds to close the connection. This looks to our product code like the network connection failing. It would be possible to fix this, but just doesn't seem worth it for the extra insurance of catching other error logs in these tests.	2024-01-30 10:04:01 +04:00
Spike Curtis	0eff646c31	chore: move proto to sdk conversion to agentsdk (#11831 ) `agentsdk` depends on `agent/proto` because it needs to get the version to dial. Therefore, the conversion routines need to live in `agentsdk` so that we can convert to and from the Manifest. I briefly considered refactoring the agent to only reference `proto.Manifest`, but decided against it because we might have multiple protocol versions in the future, and it's useful to have a protocol-independent data structure.	2024-01-30 09:04:56 +04:00
Spike Curtis	13e24f21e4	feat: use Agent v2 API for Service Banner (#11806 ) Agent uses the v2 API for the service banner, rather than the v1 HTTP API. One of several for #10534	2024-01-30 07:44:47 +04:00
Spike Curtis	207328ca50	feat: use appearance.Fetcher in agentapi (#11770 ) This PR updates the Agent API to use the appearance.Fetcher, which is set by entitlement code in Enterprise coderd. This brings the agentapi into compliance with the Enterprise feature.	2024-01-29 21:22:50 +04:00
Steven Masley	081fbef097	fix: code-server path based forwarding, defer to code-server (#11759 ) Do not attempt to construct a path based port forward url. Always defer to code server, as it has its own proxy method.	2024-01-23 11:36:44 -06:00
Spike Curtis	059e533544	feat: agent uses Tailnet v2 API for DERPMap updates (#11698 ) Switches the Agent to use Tailnet v2 API to get DERPMap updates. Subsequent PRs will do the same for the CLI (`codersdk`) and `wsproxy`.	2024-01-23 14:42:07 +04:00
Spike Curtis	3e0e7f8739	feat: check agent API version on connection (#11696 ) fixes #10531 Adds a check for `version` on connection to the Agent API websocket endpoint. This is primarily for future-proofing, so that up-level agents get a sensible error if they connect to a back-level Coderd. It also refactors the location of the `CurrentVersion` variables, to be part of the `proto` packages, since the versions refer to the APIs defined therein.	2024-01-23 14:27:49 +04:00
Spike Curtis	f01cab9894	feat: use tailnet v2 API for coordination (#11638 ) This one is huge, and I'm sorry. The problem is that once I change `tailnet.Conn` to start doing v2 behavior, I kind of have to change it everywhere, including in CoderSDK (CLI), the agent, wsproxy, and ServerTailnet. There is still a bit more cleanup to do, and I need to add code so that when we lose connection to the Coordinator, we mark all peers as LOST, but that will be in a separate PR since this is big enough!	2024-01-22 11:07:50 +04:00
Asher	72d9ec07aa	fix: detect JetBrains running on local ipv6 (#11676 )	2024-01-17 14:08:15 -09:00
Spike Curtis	b173195e0d	Revert "fix: detect JetBrains running on local ipv6 (#11653 )" (#11664 ) This reverts commit `2d61d5332a`.	2024-01-17 15:38:39 +04:00
Asher	2d61d5332a	fix: detect JetBrains running on local ipv6 (#11653 )	2024-01-16 15:53:41 -09:00
Mathias Fredriksson	385d58caf6	fix(agent/agentssh): allow remote forwarding a socket multiple times (#11631 ) * fix(agent/agentssh): allow remote forwarding a socket multiple times Fixes #11198 Fixes https://github.com/coder/customers/issues/407	2024-01-16 21:26:13 +02:00
Mathias Fredriksson	b1d53a68c2	fix(agent/agentssh): fix X11 forwarding by improving Xauthority management (#11550 ) Fixes #11531	2024-01-10 19:04:44 +02:00
Spike Curtis	fdd60d316e	fix: fix MetricsAggregator check for metric sameness (#11508 ) Fixes #11451 A refactor of the Agent API passes metrics as protobufs, which include pointers to label name/value pairs. The aggregator tested for sameness by doing a shallow compare of label values, which for different stats reports would compare unequal because the pointers would be different. This fix does a deep compare. While testing I also noted that we neglect to compare template names. This is unlikely to have caused any issue in practice, since the combination of username/workspace is unique, but in the context of comparing metric labels we should do the comparison. If a user creates a workspace, deletes it, then recreates from a different template, we could in principle have reported incorrect stats for the old template.	2024-01-09 15:21:30 +04:00
Steven Masley	dd05a6b13a	chore: mockgen archived, moved to new location (#11415 ) * chore: mockgen archived, moved to new location	2024-01-04 18:35:56 -06:00
Mathias Fredriksson	df3c310379	feat(cli): add `coder open vscode` (#11191 ) Fixes #7667	2024-01-02 20:46:18 +02:00
Spike Curtis	520c3a8ff7	fix: use TSMP for pings and checking reachability (#11306 ) We're seeing some flaky tests related to agent connectivity - https://github.com/coder/coder/actions/runs/7286675441/job/19856270998 I'm pretty sure what happened in this one is that the client opened a connection while the wgengine was in the process of reconfiguring the wireguard device, so the fact that the peer became "active" as a result of traffic being sent was not noticed. The test calls `AwaitReachable()` but this only tests the disco layer, so it doesn't wait for wireguard to come up. I think we should be using TSMP for pinging and reachability, since this operates at the IP layer, and therefore requires that wireguard comes up before being successful. This should also help with the problems we have seen where a TCP connection starts before wireguard is up and the initial round trip has to wait for the 5 second wireguard handshake retry. fixes: #11294	2024-01-02 15:53:52 +04:00
Spike Curtis	4071f1713b	feat: add logging to agent stats and JetBrains tracking (#11364 ) Adds logging so we can hope to diagnose #11363	2024-01-02 13:34:49 +04:00
Spike Curtis	25f2abf9ab	chore: remove tailnet from agent API and rename client API to tailnet (#11303 ) Refactors our DRPC service definitions slightly. In the previous version, I inserted the RPCs from the tailnet proto directly into the Agent service. This makes things hard to deal with because DRPC then generates a new set of methods with new interfaces with the `DRPCAgent_` prefix. Since you can't have a single method that takes different argument types, we couldn't reuse the implementation of those RPCs without a lot of extra classes and pass-thru methods. Instead, the "right" way to do it is to integrate at the DRPC layer. So, we have two DRPC services available over the Agent websocket, and register them both on the DRPC `mux`. Since the tailnet proto RPC service is now for both clients and agents, I renamed some things to clarify and shorten. This PR also removes the `TailnetAPI` implementation from the `agentapi` package, and the next PR in the stack replaces it with the implementation from the `tailnet` package.	2024-01-02 10:02:45 +04:00
Spike Curtis	db9104c02e	fix: avoid panic on nil connection (#11305 ) Related to https://github.com/coder/coder/actions/runs/7286675441/job/19855871305 Fixes a panic if the listener returns an error, which can obfuscate the underlying problem and cause unrelated tests to be marked failed.	2023-12-21 14:26:11 +04:00
Kayla Washburn	3ab4800a18	chore: clean up lint (#11270 )	2023-12-18 14:59:39 -07:00
Steven Masley	a6901ae2c5	chore: fix race in cron close behavior (TestAgent_WriteVSCodeConfigs) (#11243 ) * chore: add unit test to exercise flake * Implement a *fix for cron stop() before run() This fix still has a race condition. I do not see a clean solution without modifying the cron library. The cron library uses a boolean to indicate running, and that boolean needs to be set to "true" before we call "Close()". Or "Close()" should prevent "Run()" from doing anything. In either case, this solves the issue for a niche unit test bug in which the test finishes, calling Close(), before there was an opportunity to start the go routine. It probably isn't worth a lot of time investment, and this fix will suffice	2023-12-18 09:26:40 -06:00
Dean Sheather	307186325f	fix: avoid db import in slim builds (#11258 )	2023-12-19 00:09:22 +10:00
Dean Sheather	e46431078c	feat: add AgentAPI using DRPC (#10811 ) Co-authored-by: Spike Curtis <spike@coder.com>	2023-12-18 22:53:28 +10:00
Spike Curtis	ad3fed72bc	chore: rename Coordinator to CoordinatorV1 (#11222 ) Renames the tailnet.Coordinator to represent both v1 and v2 APIs, so that we can use this interface for the main atomic pointer. Part of #10532	2023-12-15 11:38:12 +04:00
Steven Masley	b7bdb17460	feat: add metrics to workspace agent scripts (#11132 ) * push startup script metrics to agent	2023-12-13 11:45:43 -06:00
Spike Curtis	edeb9bb42a	fix: appease linter on darwin (#11154 ) Fixing up some linting errors that show up on Darwin, but not in CI.	2023-12-12 17:02:28 +04:00
Steven Masley	8221544514	chore: check if process is nil (#11090 ) * chore: check if process is nil We check if process is nil in the ports_supported file. Just matching that defensive check, not sure if it can be nil.	2023-12-07 22:23:42 +00:00
Asher	dbbf8acc26	fix: track JetBrains connections (#10968 ) * feat: implement jetbrains agentssh tracking Based on tcp forwarding instead of ssh connections * Add JetBrains tracking to bottom bar	2023-12-07 12:15:54 -09:00
Mathias Fredriksson	70cede8f7a	test(agent): improve TestAgent_Dial tests (#11013 ) Refs #11008	2023-12-04 13:11:30 +02:00
Spike Curtis	6c67add2d9	fix: detect and retry reverse port forward on used port (#10844 ) Fixes #10799 The flake happens when we try to remote forward, but the port we've chosen is not free. In the flaked example, it's actually the SSH listener that occupies the port we try to remote forward, leading to confusing reads (c.f. the linked issue). This fix simplifies the tests considerably by using the Go ssh client, rather than shelling out to OpenSSH. This avoids using a pseudoterminal, avoids the need for starting any local OS listeners to communicate the forwarding (go SSH just returns in-process listeners), and avoids an OS listener to wire OpenSSH up to the agentConn. With the simplified logic, we can immediately tell if a remote forward on a random port fails, so we can do this in a loop until success or timeout. I've also simplified and fixed up the other forwarding tests. Since we set up forwarding in-process with Go ssh, we can remove a lot of the `require.Eventually` logic.	2023-11-27 09:42:45 +04:00
Mathias Fredriksson	61be4dfe5a	fix: improve exit codes for agent/agentssh and cli/ssh (#10850 )	2023-11-24 14:35:56 +02:00
Mathias Fredriksson	dbdcad0d09	test(agent/agentssh): fix flake in signal test (#10855 )	2023-11-24 13:47:40 +02:00
Mathias Fredriksson	2c6e0f7d0a	feat(agent/agentssh): handle session signals (#10842 )	2023-11-23 19:55:36 +02:00
Dean Sheather	a9c0c01629	chore: fix flake in listening ports test (#10833 )	2023-11-22 09:30:51 +00:00
Steven Masley	e448c10122	chore: add uuid's to ssh sessions for logging (#10721 ) * chore: add uuid to ssh connection logs	2023-11-17 16:04:23 +00:00
Spike Curtis	f400d8a0c5	fix: handle SIGHUP from OpenSSH (#10638 ) Fixes an issue where remote forwards are not correctly torn down when using OpenSSH with `coder ssh --stdio`. OpenSSH sends a disconnect signal, but then also sends SIGHUP to `coder`. Previously, we just exited when we got SIGHUP, and this raced against properly disconnecting. Fixes https://github.com/coder/customers/issues/327	2023-11-13 15:14:42 +04:00
Steven Masley	63a4f5f4a7	fix: case insensitive magic label (#10592 )	2023-11-08 11:17:14 -06:00
Spike Curtis	2a6fd90140	feat: add tailnet and agent API definitions (#10324 ) Adds API definitions and packages for Tailnet and Agent APIs (API version 2.0)	2023-10-30 12:14:45 +04:00
Mathias Fredriksson	7fecd39e23	fix(agent/agentscripts): display informative error for ErrWaitDelay (#10407 ) Fixes #10400	2023-10-27 19:07:26 +03:00
Asher	4af8446f48	fix: initialize terminal with correct size (#10369 ) * Fit once during creation This does not fix any bugs (that I know of) but we only need to fit once when the terminal is created, not every time we reconnect. Granted, currently we do not support reconnecting without refreshing anyway so it does not really matter, but this just seems more correct. Plus now we will not have to pass the fit addon around. * Pass size when connecting web socket URL I think this will solve an issue where screen does not correctly handle an immediate resize. It seems to ignore the resize, but even if you send it again nothing changes, seemingly thinking it is already at that size? * Use new struct for decoding reconnecting pty requests Decoding a JSON message does not touch omitted (or null) fields so once a message with a resize comes in, every single message from that point will cause a resize. I am not sure if this is an actual problem in practice but at the very least it seems unintentional.	2023-10-23 23:42:39 -04:00
Mathias Fredriksson	1286904de8	test(agent): improve TestAgent_Session_TTY_MOTD_Update (#10385 )	2023-10-23 17:32:28 +00:00
Mathias Fredriksson	09f7b8e88c	fix(agent/agentscripts): track cron run and wait for cron stop (#10388 ) Fixes #10289	2023-10-23 17:08:52 +00:00
Mathias Fredriksson	1a2aea3a6b	fix(agent): prevent metadata from being discarded if report is slow (#10386 )	2023-10-23 17:02:54 +00:00
Mathias Fredriksson	8f1b4fb061	test(agent): fix service banner trim test flake (#10384 )	2023-10-23 18:06:59 +03:00
Mathias Fredriksson	76c65b1e1b	fix(agent): send metadata in batches (#10225 ) Fixes #9782 --- I recommend reviewing with ignore whitespace.	2023-10-13 17:48:25 +03:00
Mathias Fredriksson	4857d4bd55	feat(codersdk/agentsdk): use new agent metadata batch endpoint (#10224 ) Part of #9782	2023-10-13 17:32:28 +03:00


273 Commits