coder

mirror of https://github.com/coder/coder.git synced 2026-06-02 20:48:20 +00:00

Author	SHA1	Message	Date
Jon Ayers	8b2f472f71	chore: use old slog (#21959 )	2026-02-05 16:35:41 -06:00
Jon Ayers	b275be2e7a	chore: backport fixes (#21957 )	2026-02-05 16:09:41 -06:00
blinkagent[bot]	ba71b321bc	fix: remove a sensitive field from an agent log line (#20968 ) (#21063 ) This PR removes a log field that could expose sensitive information in agent logs for workspaces that pass such information to the agent via its manifest. (cherry picked from commit `1d726c81bb`) Co-authored-by: Sas Swart <sas.swart.cdk@gmail.com>	2025-12-02 11:33:50 -06:00
Sas Swart	abe66a38eb	feat: implement agent socket api, client and cli (#20758 ) (#20976 )	2025-12-01 14:07:40 -06:00
Asher	c266bb830c	chore: add debug logging and recovery to agent api requests (#20785 ) This is to debug context timeouts on API requests to the agent. Because rbac and database cannot be imported in slim, split the logger middleware into slim and non-slim versions and break out the recovery middleware.	2025-11-25 14:59:20 -09:00
Spike Curtis	afd40436f0	fix: mock Agent querying OS for listening ports in tests (#20842 ) fixes https://github.com/coder/internal/issues/1123 We want to test that ports are not included after they are no longer used, but this isn't safe on the real OS networking stack because there is no way to guarantee a port _won't_ be used. Instead, we introduce an interface and fake implementation for testing. In order to leave the filtering logic in the test path, this PR also does some refactoring. Caching logic is left in the real OS querying implementation and a new test case is added for it in this PR.	2025-11-25 14:25:24 +04:00
Sas Swart	2840fdcb54	feat(agent): add agent socket API (#20717 ) relates to: https://github.com/coder/internal/issues/1094 This is number 2 of 5 pull requests in an effort to add agent script ordering. It adds a drpc API that is exposed via a local socket. This API serves access to a lightweight DAG based dependency manager that was inspired by systemd. In follow-up PRs: * This unit manager will be plumbed into the workspace agent struct. * CLI commands will use this agentsocket api to express dependencies between coder scripts I used an LLM to produce some of these changes, but I have conducted thorough self review and consider this contribution to be ready for an external reviewer.	2025-11-21 13:09:27 +02:00
Sas Swart	500c17e257	feat(agent): add agent unit manager (#20715 ) relates to: https://github.com/coder/internal/issues/1094 This is number 1 of 5 pull requests in an effort to add agent script ordering. It adds a unit manager, which uses an underlying DAG and a list of subscribers to inform units when their dependencies have changed in status. In follow-up PRs: * This unit manager will be plumbed into the workspace agent struct. * It will then be exposed to users via a new socket based drpc API * The agentsocket API will then become accessible via CLI commands that allow coder scripts to express their dependencies on one another. This is an experimental feature. There may be ways to improve the efficiency of the manager struct, but it is more important to validate this feature with customers before we invest in such optimizations. See the tests for examples of how units may communicate with one another. Actual CLI usage will be analogous. I used an LLM to produce some of these changes, but I have conducted thorough self review and consider this contribution to be ready for an external reviewer.	2025-11-19 19:03:37 +02:00
Asher	643fe38b1e	fix: use temp file on same device with mcp file edit (#20477 ) Otherwise you can get errors like "invalid cross-device link".	2025-10-29 12:23:06 -08:00
Danielle Maywood	e4e4669feb	fix(agent/agentcontainers): remove unneeded default branch (#20511 ) Closes https://github.com/coder/internal/issues/769 According to the `time.NewTicker` documentation [^1] (which is used under the hood by https://github.com/coder/quartz) it will automatically adjust the time interval to make up for slow receivers. This means we should be safe to drop the default branch. > NewTicker returns a new Ticker containing a channel that will send the current time on the channel after each tick. The period of the ticks is specified by the duration argument. The ticker will adjust the time interval or drop ticks to make up for slow receivers. The duration d must be greater than zero; if not, NewTicker will panic. [^1]: https://pkg.go.dev/time#Ticker	2025-10-28 12:16:42 +00:00
Sas Swart	6c621364f8	feat: add a dependency management graph for agents (#20208 ) Relates to https://github.com/coder/internal/issues/1093 This is the first of N pull requests to allow coder script ordering. It introduces what is for now dead code, but paves the way for various interfaces that allow coder scripts and other processes to depend on one another via CLI commands and terraform configurations. The next step is to add reactivity to the graph, such that changes in the status of one vertex will propagate and allow other vertices to change their own statuses. Concurrency and stress testing yield the following: CPU Profile: <img width="1512" height="862" alt="Screenshot 2025-10-17 at 10 38 52" src="https://github.com/user-attachments/assets/f46cf1a2-a0b2-4c02-81a0-069798108ee5" /> Mem Profile: <img width="1512" height="862" alt="Screenshot 2025-10-17 at 10 38 01" src="https://github.com/user-attachments/assets/45be1235-fff6-45ba-a50d-db9880377bd0" /> Predictably, lock contention and memory allocation are the largest components of this system under stress. Nothing seems untoward.	2025-10-24 16:18:16 +02:00
Ethan	33b42fca7a	test: fix flake in TestAgent_Metrics_SSH (#20450 ) Second flake for this test today 😮‍💨. Flake seen here, though I couldn't replicate this locally, some CI exclusive networking issue. https://github.com/coder/coder/actions/runs/18770305895/job/53553517887?pr=20448 ``` agent_test.go:3619: Error Trace: /home/runner/work/coder/coder/agent/agent_test.go:3619 Error: Received unexpected error: expected 1, got 0.000000: github.com/coder/coder/v2/agent_test.TestAgent_Metrics_SSH.func7 /home/runner/work/coder/coder/agent/agent_test.go:3557 Test: TestAgent_Metrics_SSH Messages: check fn for coderd_agentstats_currently_reachable_peers failed ``` This value is incremented by a successful ping to the peer from the agent, which is dependent on all the networking code, which I think is definitely out of scope of this test for agent metrics. So, we'll just assert that the metrics exist with the correct labels (`derp`, `p2p`)	2025-10-24 17:28:57 +11:00
Ethan	86ef3fb497	test: fix flake in TestAgent_Metrics_SSH (#20447 ) Closes https://github.com/coder/internal/issues/921 The flake in the linked issue was caused by the startup script taking longer than 1 second in CI. The existing conditional, that the startup script duration was under a second, was incorrect; the correct conditional is that the metric exists with the `success` label set to `true`.	2025-10-24 14:06:25 +11:00
Dean Sheather	6c99d5eca2	fix: avoid connection logging crashes in agent (#20307 ) - Ignore errors when reporting a connection from the server, just log them instead - Translate connection log IP `localhost` to `127.0.0.1` on both the server and the agent Note that the temporary fix for converting invalid IPs to localhost is not required in main since the database no longer forbids NULL for the IP column since https://github.com/coder/coder/pull/19788 Relates to #20194	2025-10-16 01:56:43 +11:00
Spike Curtis	5807fe01e4	test: prevent TestAgent_ReconnectingPTY connection reporting check from interfering (#20210 ) When we added support for connection tracking in the Workspace agent, we modified the ReconnectingPTY tests to add an initial connection that we immediately hang up and check that connections are logged. In the case of `screen`-based pty handling, hanging up the initial connection can race with the initial attachment to the `screen` process, and cause that process to exit early. This causes subsequent connections to the same session ID to fail. In this PR we just use different pty session IDs so that the initial connections we do to verify logging don't interfere with the rest of the test. _Arguably_ it's a bug in our Reconnecting PTY code that hanging up immediately can leave the system in a weird state, but we do eventually recover and error out, so I don't think it's worth trying to fix.	2025-10-08 16:23:46 +04:00
Zach	4d1003eace	fix: remove initial global HTTP client usage (#20128 ) This PR takes the initial steps toward removing usage of the global Go HTTP client, which was seen to have impacts on test flakiness in https://github.com/coder/internal/issues/1020. The first commit removes uses from tests, with the exception of one test that is tightly coupled to the default client. The second commit makes easy/low-risk removals from application code. This should help reduce test flakiness.	2025-10-02 11:43:13 -06:00
Asher	be7aa58075	feat: add coder_workspace_ls MCP tool (#19652 )	2025-09-12 15:57:15 -08:00
Asher	30330abaea	feat: add coder_workspace_edit_file MCP tool (#19629 )	2025-09-12 15:36:14 -08:00
Michael Suchacz	336e62bc37	fix: deflake BackedWriter tests (#19802 )	2025-09-12 14:00:08 +00:00
Asher	d5a02d570f	feat: add coder_workspace_write_file MCP tool (#19591 )	2025-09-11 12:17:15 -08:00
Michael Suchacz	4c98decfb7	chore: add backed reader, writer and pipe implementation (#19147 ) Relates to: https://github.com/coder/coder/issues/18101 This PR introduces a new `backedpipe` package that provides reliable bidirectional byte streams over unreliable network connections. The implementation includes: - `BackedPipe`: Orchestrates a reader and writer to provide transparent reconnection and data replay - `BackedReader`: Handles reading with automatic reconnection, blocking reads when disconnected - `BackedWriter`: Maintains a ring buffer of recent writes for replay during reconnection - `RingBuffer`: Efficient circular buffer implementation for storing data The package enables resilient connections by tracking sequence numbers and replaying missed data after reconnection. It handles connection failures gracefully, automatically reconnecting and resuming data transfer from the appropriate point.	2025-09-11 14:05:14 +02:00
Asher	4bf63b4068	feat: add coder_workspace_read_file MCP tool (#19562 ) Follows similarly to the bash tool (and some code to connect to an agent was extracted from it). There are two main parts: a new agent endpoint, and then a new MCP tool that consumes that endpoint.	2025-09-09 15:12:24 -08:00
Spike Curtis	1354d84eb4	chore: refactor instance identity to be a SessionTokenProvider (#19566 ) Refactors Agent instance identity to be a SessionTokenProvider. Refactors the CLI to create Agent clients via a centralized function, rather than ad hoc via individual command handlers and their flags. This allows commands besides `coder agent`, but which still use the agent identity, to support instance identity authentication. Fixes #19111 by unifying all API requests to go through the SessionTokenProvider for auth credentials.	2025-09-03 10:38:42 +04:00
Ethan	51d8a05301	test: disable direct connections for a deterministic reachable peers metric (#19458 ) closes https://github.com/coder/internal/issues/921 Not sure what I was thinking when I wrote this test case, but it was relying on the connection being p2p on every ping, which is technically and evidently not always the case. Instead we'll require a DERP peer, and block direct connections.	2025-08-21 11:46:56 +10:00
Garrett Delfosse	dd867bd743	fix: fix jetbrains toolbox connection tracking (#19348 ) Fixes https://github.com/coder/coder/issues/18350 I attempted the route of relying on just the session env vars, in hopes that this issue was fixed in Toolbox and the process name matching was no longer needed, but it was not a fruitful endeavor and it seems to be using the same connection logic as it did in gateway, just with new binary and flag names.	2025-08-20 08:39:08 -04:00
Danielle Maywood	5e84d257b7	refactor: convert workspacesdk.AgentConn to an interface (#19392 ) Fixes https://github.com/coder/internal/issues/907 We convert `workspacesdk.AgentConn` to an interface and generate a mock for it. This allows writing `coderd` tests that rely on the agent's HTTP api to not have to set up an entire tailnet networking stack.	2025-08-20 10:00:44 +01:00
Danielle Maywood	23c494f36b	fix(agent/agentcontainers): resolve symlink in tests (#19440 ) Fixes https://github.com/coder/internal/issues/917	2025-08-20 09:32:28 +01:00
Danielle Maywood	e8795269e4	fix: resolve `TestAPI/Error/DuringInjection` flake (#19407 ) Resolves https://github.com/coder/internal/issues/905	2025-08-19 12:23:37 +01:00
Dean Sheather	c6c8b00b07	chore: require nolint for testutil.RunRetry (#19394 )	2025-08-19 00:48:10 +10:00
Dean Sheather	e2ba9e7d62	chore: retry TestAgent_Dial subtests (#19387 ) Closes https://github.com/coder/internal/issues/595	2025-08-18 13:51:19 +00:00
Danielle Maywood	205eb29e60	fix: stop reading closed channel for `/watch` devcontainers endpoint (#19373 ) Fixes https://github.com/coder/coder/issues/19372 We increase the read limit to 4MiB (we use this limit elsewhere). We also make sure to stop sending messages when `containersCh` becomes closed.	2025-08-15 12:32:33 +01:00
Ethan	d7bdb3cdef	ci: add `paralleltestctx` to `lint/go` (#19369 ) Closes https://github.com/coder/internal/issues/884 We're adding this as a `go run` in `lint/go` for now, since adding it to golangci-lint ourselves involves recompiling golangci-lint and then running that new binary. I'll look into proposing it being added to the public golangci-lint linters. Doesn't appear to cause the lint ci job to take any longer, which is nice.	2025-08-15 16:16:18 +10:00
Spike Curtis	6ba55213fb	test: fix timeout on TestServer_X11_EvictionLRU (#19217 ) fixes https://github.com/coder/internal/issues/878 On my dev system it takes 900ms, but looking at timestamps in CI it took 25 seconds. Bumping timeout to 60s. Also fixes the segfault.	2025-08-07 16:40:38 +04:00
Danielle Maywood	760dc8b467	fix(agent/agentcontainers): fix `TestDevcontainerDiscovery/AutoStart` flake (#19179 ) Fixes https://github.com/coder/internal/issues/864	2025-08-05 13:58:55 +01:00
Spike Curtis	7eb41193f8	test: fix TestSSHServer_ClosesStdin to handle non-atomic write (#19174 ) fixes https://github.com/coder/internal/issues/863 We read an output file in a loop, but this could lead to races where the other process has created the file but not written, or a partial write in progress. Fix is to retry if the content is shorter than we expect.	2025-08-05 11:36:21 +04:00
Danielle Maywood	b8e2344ef5	chore(agent/agentcontainers): disable project autostart by default (#19114 ) We disable the logic that allows autostarting discovered devcontainers by default. We want this behavior to be opt-in rather than opt-out. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-08-04 16:21:13 +01:00
Danielle Maywood	ddb5b87815	chore(agent/agentcontainers): test current prebuilds integration (#19074 ) As it turns out, prebuilds + devcontainers appear to already work together. This PR has created a test that simulates a prebuild claim happening to `agentcontainers.API`, to see how we handle it.	2025-07-31 15:31:44 +01:00
Danielle Maywood	cc4f8da6e1	fix(agent/agentcontainers): fix devcontainer integration tests (#19109 ) It appears we accidentally merged a change that broke our devcontainer integration tests https://github.com/coder/coder/pull/18570.	2025-07-31 13:24:23 +01:00
Danielle Maywood	219d1b4101	chore(agent/agentcontainers): skip part of test if on `darwin` (#19081 )	2025-07-29 17:06:17 +01:00
Danielle Maywood	66cf90c736	feat(agent/agentcontainers): allow auto start for discovered containers (#19040 ) Closes https://github.com/coder/internal/issues/711 When a `devcontainer.json` has been found and it has `.customizations.coder.autoStart = true`, we will now auto start this dev container.	2025-07-28 12:30:52 +01:00
Danielle Maywood	25d70ce7bc	fix(agent/agentcontainers): respect ignore files (#19016 ) Closes https://github.com/coder/coder/issues/19011 We now use [go-git](https://pkg.go.dev/github.com/go-git/go-git/v5@v5.16.2/plumbing/format/gitignore)'s `gitignore` plumbing implementation to parse the `.gitignore` files and match against the patterns generated. We use this to ignore any ignored files in the git repository. Unfortunately I've had to slightly re-implement some of the interface exposed by `go-git` because they use `billy.Filesystem` instead of `afero.Fs`.	2025-07-24 12:12:05 +01:00
Danielle Maywood	f41275eb39	feat(agent/agentcontainers): auto detect dev containers (#18950 ) Relates to https://github.com/coder/internal/issues/711 This PR implements a project discovery mechanism that searches for any dev container projects and makes them visible in the UI so that they can be started. To make the wording on the site clearer, "Rebuild" has been changed to "Start" when there is no container associated with a known dev container configuration. I've also made it so that the site will show the dev container config path when there is no other name available. ### Design decisions Just want to ensure my explanation for a few design decisions is noted down: - We only search for dev container configurations inside git repositories - We only search for these git repositories if they're at the top level or a direct child of the agent directory. This limited approach is to reduce the number of files we ultimately walk when trying to find these projects. It makes sense to limit it to only the agent directory, although I'm open to expanding how deep we search.	2025-07-22 19:02:43 +01:00
Dean Sheather	a1b87a67c6	fix: use client preferred URL for the default DERP (#18911 ) The agentsdk currently does a remap of the DERP map to change the EmbeddedRelay node's URL to match the agent's access URL. This PR makes changes to the `workspacesdk` (used by clients like the CLI) and `vpn` (used by Coder Desktop) to match this behavior. This lets us try Coder clients in dogfood over a VPN without changing the global access URL.	2025-07-17 20:17:44 +10:00
Danielle Maywood	fb00cd2c1a	fix(agent/agentcontainers): fix `TestAPI/NoUpdaterLoopLogspam` flake (#18905 )	2025-07-17 10:59:02 +01:00
Danielle Maywood	bd3d0ea482	fix(agent/agentcontainers): fix `TestAPI/IgnoreCustomization` flake (#18863 )	2025-07-15 10:01:04 +01:00
Danielle Maywood	43b0bb7f61	feat(site): use websocket connection for devcontainer updates (#18808 ) Instead of polling every 10 seconds, we instead use a WebSocket connection for more timely updates.	2025-07-14 21:35:35 +01:00
Ethan	c1b2304d18	test(agent/agentssh): use fish shell compatible exit status checking (#18824 ) This (week-old) test was failing in my workspace because I use fish shell. I really do not like that Fish shell does not support `$?`, but I also do like Fish shell! We have a few people at Coder who use it who would appreciate this change.	2025-07-10 19:50:30 +10:00
Mathias Fredriksson	6c4db7a2bc	feat(cli): replace open vscode container with devcontainer subagent (#18765 ) This change allows a devcontainer to be opened via the agent syntax, `coder open vscode <workspace>.<agent>` and removes the `--container` option to simplify the subcommand. Accessing the subagent will behave similarly to how the `--container` option behaved. Fixes coder/internal#748	2025-07-08 19:21:41 +03:00
Danielle Maywood	0118e75009	fix(agent): disable dev container integration inside sub agents (#18781 ) It appears we accidentally broke this logic in a previous PR. This should now correctly disable the agent api as we'd expect.	2025-07-08 11:05:30 +01:00
blink-so[bot]	2c95a1dd71	chore: update gofumpt from v0.4.0 to v0.8.0 (#18652 )	2025-07-03 11:28:00 -06:00

536 Commits