coder

mirror of https://github.com/coder/coder.git synced 2026-06-04 05:28:20 +00:00

Author	SHA1	Message	Date
Kacper Sawicki	f016d9e505	fix(coderd): add role param to agent RPC to prevent false connectivity (#22052 ) ## Summary coder-logstream-kube and other tools that use the agent token to connect to the RPC endpoint were incorrectly triggering connection monitoring, causing false connected/disconnected timestamps on the agent. This led to VSCode/JetBrains disconnections and incorrect dashboard status. ## Changes Add a `role` query parameter to `/api/v2/workspaceagents/me/rpc`: - `role=agent`: triggers connection monitoring (default for the agent SDK) - any other value (e.g. `logstream-kube`): skips connection monitoring - omitted: triggers monitoring for backward compatibility with older agents The agent SDK now sends `role=agent` by default. A new `Role` field on the `agentsdk.Client` allows non-agent callers to specify a different role. ## Required follow-up coder-logstream-kube needs to set `client.Role = "logstream-kube"` before calling `ConnectRPC20()`. Without that change, it will still send `role=agent` and trigger monitoring. Fixes #21625	2026-02-18 09:44:06 +01:00
Danielle Maywood	2de8cdf160	feat(agent): add subagent ID fields to devcontainers in manifest (#21848 ) Update the agent protobuf schema (agent/proto/agent.proto) to include: - subagent_id field in WorkspaceAgentDevcontainer message - id field in CreateSubAgentRequest message Bump the Agent API version from v2.7 to v2.8 and update all client references throughout the codebase (ConnectRPC27 -> ConnectRPC28, DRPCAgentClient27 -> DRPCAgentClient28).	2026-02-03 12:37:30 +00:00
Spike Curtis	3398833919	test: don't drop error on blank IP address in report (#21642 ) fixes https://github.com/coder/internal/issues/1286 We can get blank IP address from the net connection if the client has already disconnected, as was the case in this flake. Fix is to only log error if we get something non-empty we can't parse. --------- Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>	2026-01-23 10:26:44 +00:00
Asher	ff9ed91811	chore: move agent's file API into separate package (#21531 ) This makes it so we can test it directly without having to go through Tailnet, which appears to be causing flakes in CI where the requests time out and never make it to the agent. Takes inspiration from the container-related API endpoints. Would probably make sense to refactor the ls tests to also go through the API (rather than be internal tests like they are currently) but I left those alone for now to keep the diff minimal.	2026-01-16 17:03:17 -09:00
Spike Curtis	bddb808b25	chore: arrange imports in a standard way (#21452 ) Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example: ``` import ( "context" "time" "github.com/prometheus/client_golang/prometheus" "golang.org/x/xerrors" "gopkg.in/natefinch/lumberjack.v2" "cdr.dev/slog/v3" "github.com/coder/coder/v2/codersdk/agentsdk" "github.com/coder/serpent" ) ``` 3 groups: standard library, 3rd party libs, Coder libs. This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.	2026-01-08 15:24:11 +04:00
Spike Curtis	49b34a716a	fix: fix slog to always use array of Fields (#21426 ) Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptable call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder). It also updates dependencies that also use slog and were updated. I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule. Other dependencies, I pushed new tags.	2026-01-08 10:29:41 +04:00
Zach	07924037e7	feat: add boundary log forwarding from agent to coderd (#21345 ) Add agent forwarding of boundary audit logs from workspaces to coderd via agent API, and re-emission of boundary logs to coderd stderr. This change adds a server to the workspace agent that always listens on a unix socket for boundary to connect and send audit logs. coderd log format example: ``` [API] 2025-12-23 18:31:46.755 [info] coderd.agentrpc: boundary_request owner=.. workspace_name=.. agent_name=.. decision=.. workspace_id=.. http_method=.. http_url=.. event_time=.. request_id=.. ``` Corresponding boundary PR: https://github.com/coder/boundary/pull/124 RFC: https://www.notion.so/coderhq/Agent-Boundary-Logs-2afd579be59280f29629fc9823ac41ba https://github.com/coder/coder/issues/21280	2025-12-31 16:38:19 -07:00
Zach	9d1493a13a	feat: add initial API for boundary log forwarding to coderd (#21293 ) Add the AgentAPI changes to support the feature that transmits boundary logs from workspaces to coderd via the agent API for eventual re-emission to stderr. The API handlers are stubs for now because I'm trying to land this feature from multiple smaller PRs. High level architecture: - Boundary records resource access in batches and sends proto message to agent - Agent proxies messages to coderd (captured by the API changes in this PR) - coderd re-emits logs to stderr RFC: https://www.notion.so/coderhq/Agent-Boundary-Logs-2afd579be59280f29629fc9823ac41ba	2025-12-19 10:41:39 -07:00
Spike Curtis	71c6dc4043	fix: stop disconnecting from coderd early and record disconnect correctly (#21250 ) fixes https://github.com/coder/internal/issues/1196 The above issue exposes two different bugs in Coder. In the agent, there is a race where if the agent is closed while starting up networking, it will erroneously disconnect from Coderd, which delays or breaks writing final status and logs. In Coderd, there is a bug where we don't properly record the latest agent disconnection time if the agent had previously disconnected. This causes us to report the agent status as "Connected" even after it has disconnected up until the inactivity timeout fires. This PR fixes both issues. It also slightly reworks when we send workspace updates based on connection and disconnection. Previously we would send two updates when the agent connected in certain circumstances, even though the status would be the same in both (only times changed). Now we universally only send one on connect, and then another on disconnect.	2025-12-15 12:04:01 +04:00
Spike Curtis	ce9e7ad909	fix(agent): ignore EOF errors during shutdown (#21187 ) fixes: https://github.com/coder/internal/issues/1179 The problem in that flake is that dRPC doesn't consistently return `context.Canceled` if you make an RPC call and then cancel it: sometimes it returns EOF. Without this PR, if we get an EOF on one of the routines that uses the agentapi connection, we tear down the whole connection and reconnect to coderd --- even if we are in the middle of a graceful shutdown. What happened in the linked flake is that writing stats failed with EOF, which then caused us to reconnect and write the lifecycle "SHUTTING DOWN" twice.	2025-12-09 17:32:38 +04:00
Spike Curtis	40df21ed62	fix: fixes use of possibly nil RemoteAddr() and LocalAddr() return values (#21076 ) fixes: https://github.com/coder/internal/issues/1143 Both gVisor and the Go standard library implementations of `net.Conn` can under certain circumstances return `nil` for `RemoteAddr()` and `LocalAddr()` calls. If we call their methods, we segfault. This PR fixes these calls and adds ruleguard rules. Note that `slog.F("remote_addr", conn.RemoteAddr())` is fine because slog detects the `nil` before attempting to stringify the type.	2025-12-03 15:06:00 +04:00
Sas Swart	ce627bf23f	feat: implement agent socket api, client and cli (#20758 ) closes: https://github.com/coder/coder/issues/10352 closes: https://github.com/coder/internal/issues/1094 closes: https://github.com/coder/internal/issues/1095 In this pull request, we enable a new set of experimental cli commands grouped under `coder exp sync`. These commands allow any process acting within a coder workspace to inform the coder agent of its requirements and execution progress. The coder agent will then relay this information to other processes that have subscribed. These commands are: ``` # Check if this feature is enabled in your environment coder exp sync ping # express that your unit depends on another coder exp sync want <unit> <dependency_unit> # express that your unit intends to start a portion of the script that requires # other units to have completed first. This command blocks until all dependencies have been met coder exp sync start <unit> # express that your unit has completed its work, allowing dependent units to begin their execution coder exp sync complete <unit> ``` Example: In order to automatically run claude code in a new workspace, it must first have a git repository cloned. The scripts responsible for cloning the repository and for running claude code would coordinate in the following way: ```bash # Script A: Claude code # Inform the agent that the claude script wants the git script. # That is, the git script must have completed before the claude script can begin its execution coder exp sync want claude git # Inform the agent that we would now like to begin execution of claude. # This command will block until the git script (and any other defined dependencies) # have completed coder exp sync start claude # Now we run claude code and any other commands we need claude ... 
# Once our script has completed, we inform the agent, so that any scripts that depend on this one # may begin their execution coder exp sync complete claude ``` ```bash # Script B: Git # Because the git script does not have any dependencies, we can simply inform the agent that we # intend to start coder exp sync start git git clone ssh://git@github.com/coder/coder # Once the repository has been cloned, we inform the agent that this script is complete, so that # scripts that depend on it may begin their execution. coder exp sync complete git ``` Notes: * Unit names (i.e. `claude` and `git`) given as input to the sync commands are arbitrary strings. You do not have to conform to specific identifiers. We recommend naming your scripts descriptively, but succinctly. * Script unit names should be well documented. Other scripts will need to know the names you've chosen in order to depend on yours. Therefore, you --------- Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>	2025-11-28 08:33:50 +02:00
Sas Swart	1d726c81bb	fix: remove a sensitive field from an agent log line (#20968 ) This PR removes a log field that could expose sensitive information in agent logs for workspaces that pass such information to the agent via its manifest.	2025-11-27 16:12:03 +02:00
Spike Curtis	afd40436f0	fix: mock Agent querying OS for listening ports in tests (#20842 ) fixes https://github.com/coder/internal/issues/1123 We want to test that ports are not included after they are no longer used, but this isn't safe on the real OS networking stack because there is no way to guarantee a port _won't_ be used. Instead, we introduce an interface and fake implementation for testing. In order to leave the filtering logic in the test path, this PR also does some refactoring. Caching logic is left in the real OS querying implementation and a new test case is added for it in this PR.	2025-11-25 14:25:24 +04:00
Dean Sheather	6c99d5eca2	fix: avoid connection logging crashes in agent (#20307 ) - Ignore errors when reporting a connection from the server, just log them instead - Translate connection log IP `localhost` to `127.0.0.1` on both the server and the agent Note that the temporary fix for converting invalid IPs to localhost is not required in main since the database no longer forbids NULL for the IP column since https://github.com/coder/coder/pull/19788 Relates to #20194	2025-10-16 01:56:43 +11:00
Spike Curtis	1354d84eb4	chore: refactor instance identity to be a SessionTokenProvider (#19566 ) Refactors Agent instance identity to be a SessionTokenProvider. Refactors the CLI to create Agent clients via a centralized function, rather than ad hoc via individual command handlers and their flags. This allows commands besides `coder agent`, but which still use the agent identity, to support instance identity authentication. Fixes #19111 by unifying all API requests to go through the SessionTokenProvider for auth credentials.	2025-09-03 10:38:42 +04:00
Danielle Maywood	f41275eb39	feat(agent/agentcontainers): auto detect dev containers (#18950 ) Relates to https://github.com/coder/internal/issues/711 This PR implements a project discovery mechanism that searches for any dev container projects and makes them visible in the UI so that they can be started. To make the wording on the site more clear, "Rebuild" has been changed to "Start" when there is no container associated with a known dev container configuration. I've also made it so that site will show the dev container config path when there is no other name available. ### Design decisions Just want to ensure my explanation for a few design decisions are noted down: - We only search for dev container configurations inside git repositories - We only search for these git repositories if they're at the top level or a direct child of the agent directory. This limited approach is to reduce the amount of files we ultimately walk when trying to find these projects. It makes sense to limit it to only the agent directory, although I'm open to expanding how deep we search.	2025-07-22 19:02:43 +01:00
Dean Sheather	a1b87a67c6	fix: use client preferred URL for the default DERP (#18911 ) The agentsdk currently does a remap of the DERP map to change the EmbeddedRelay node's URL to match the agent's access URL. This PR makes changes to the `workspacesdk` (used by clients like the CLI) and `vpn` (used by Coder Desktop) to match this behavior. This gives us the ability to try Coder clients in dogfood over a VPN without changing the global access URL.	2025-07-17 20:17:44 +10:00
Danielle Maywood	0118e75009	fix(agent): disable dev container integration inside sub agents (#18781 ) It appears we accidentally broke this logic in a previous PR. This should now correctly disable the agent api as we'd expect.	2025-07-08 11:05:30 +01:00
Mathias Fredriksson	0f3a1e9849	fix(agent/agentcontainers): split Init into Init and Start for early API responses (#18640 ) Previously in #18635 we delayed the containers API `Init` to avoid producing errors due to Docker and `@devcontainers/cli` not yet being installed by startup scripts. This had an adverse effect on the UX via UI responsiveness as the detection of devcontainers was greatly delayed. This change splits `Init` into `Init` and `Start` so that we can immediately after `Init` start serving known devcontainers (defined in Terraform), improving the UX. Related #18635 Related #18640	2025-06-27 19:01:50 +03:00
Mathias Fredriksson	8ee2668b39	fix(agent): fix script filtering for devcontainers (#18635 )	2025-06-27 16:59:31 +03:00
Mathias Fredriksson	7e99fb7d7e	fix(agent): delay containerAPI init to ensure startup scripts run before (#18630 )	2025-06-27 14:10:35 +03:00
ケイラ	09cc906981	chore: remove unnecessary redeclarations in for loops (part 2) (#18593 )	2025-06-26 12:28:00 -06:00
Mathias Fredriksson	eca6381314	feat(agent/agentcontainers): add more envs to readconfig for app URL building (#18603 )	2025-06-26 09:33:58 +00:00
Danielle Maywood	c4e4fe85f9	fix(agent): start devcontainers through agentcontainers package (#18471 ) Fixes https://github.com/coder/internal/issues/706 Context for the implementation here https://github.com/coder/internal/issues/706#issuecomment-2990490282 Synchronously starts dev containers defined in terraform with our `DevcontainerCLI` abstraction, instead of piggybacking off of our `agentscripts` package. This gives us more control over logs, instead of being reliant on packages which may or may not exist in the user-provided image.	2025-06-25 11:52:50 +01:00
Mathias Fredriksson	99d124e276	feat(agent): enable devcontainers by default (#18533 )	2025-06-24 21:17:04 +03:00
Danielle Maywood	118bf98145	chore(agent): add workspace owner env var and log dev container app failures (#18433 ) Listen to feedback that was missed in https://github.com/coder/coder/pull/18346 - Adds `CODER_WORKSPACE_OWNER_NAME` into the agent environment. - Logs warnings for when dev container app creation fails.	2025-06-19 09:37:48 +01:00
Mathias Fredriksson	d6df1f23a9	fix(agent/agentcontainers): update sub agent client on reconnect (#18399 ) Fixes coder/internal#697	2025-06-17 13:58:09 +00:00
Mathias Fredriksson	ae0c8701bb	feat(agent): disable devcontainers for sub agents (#18303 ) Updates coder/internal#621 Refs #18245	2025-06-10 10:47:02 +00:00
Mathias Fredriksson	fca99174ad	feat(agent/agentcontainers): implement sub agent injection (#18245 ) This change adds support for sub agent creation and injection into dev containers. Updates coder/internal#621	2025-06-10 12:37:54 +03:00
Mathias Fredriksson	04e4f2fac0	chore(agent): update agent proto client (#18242 )	2025-06-05 16:58:18 +03:00
Bruno Quaresma	d779126ee3	chore: rollback PR #18081 (#18104 ) Rollback https://github.com/coder/coder/pull/18081	2025-05-29 13:12:13 -03:00
Danielle Maywood	b712d0b23f	feat(coderd/agentapi): implement sub agent api (#17823 ) Closes https://github.com/coder/internal/issues/619 Implement the `coderd` side of the AgentAPI for the upcoming dev-container agents work. `agent/agenttest/client.go` is left unimplemented for a future PR working to implement the agent side of this feature.	2025-05-29 12:15:47 +01:00
Bruno Quaresma	2ec7404197	chore: make owner_name and owner_username consistent (#18081 ) We've been using owner_name inconsistently as username. So this PR fixes it by making the attribute naming more consistent.	2025-05-28 17:25:32 -03:00
Mathias Fredriksson	d6c14f3d8a	feat(agent/agentcontainers): update containers periodically (#17972 ) This change introduces a significant refactor to the agentcontainers API and enables periodic updates of Docker containers rather than on-demand. Consequently this change also allows us to move away from using a locking channel and replace it with a mutex, which simplifies usage. Additionally a previous oversight was fixed, and testing added, to clear devcontainer running/dirty status when the container has been removed. Updates coder/coder#16424 Updates coder/internal#621	2025-05-22 19:44:33 +03:00
Danielle Maywood	61f22a59ba	feat(agent): add `ParentId` to agent manifest (#17888 ) Closes https://github.com/coder/internal/issues/648 This change introduces a new `ParentId` field to the agent's manifest. This will allow an agent to know if it is a child or not, as well as knowing who the owner is. This is part of the Dev Container Agents work	2025-05-19 16:09:56 +01:00
Danielle Maywood	83df55700b	revert(agent): remove `CODER_AGENT_IS_SUB_AGENT` cli flag (#17875 ) The RFC has changed, this information will be passed through the manifest instead.	2025-05-16 11:04:21 +00:00
Sas Swart	425ee6fa55	feat: reinitialize agents when a prebuilt workspace is claimed (#17475 ) This pull request allows coder workspace agents to be reinitialized when a prebuilt workspace is claimed by a user. This facilitates the transfer of ownership between the anonymous prebuilds system user and the new owner of the workspace. Only a single agent per prebuilt workspace is supported for now, but plumbing has already been done to facilitate the seamless transition to multi-agent support. --------- Signed-off-by: Danny Kopping <dannykopping@gmail.com> Co-authored-by: Danny Kopping <dannykopping@gmail.com>	2025-05-14 14:15:36 +02:00
Danielle Maywood	7f056da088	feat: add hidden `CODER_AGENT_IS_SUB_AGENT` flag to `coder agent` (#17783 ) Closes https://github.com/coder/internal/issues/620 Adds a new, hidden, flag `CODER_AGENT_IS_SUB_AGENT` to the `coder agent` command.	2025-05-13 10:57:50 +01:00
Mathias Fredriksson	7af188bfc1	fix(agent): fix unexpanded devcontainer paths for agentcontainers (#17736 ) Devcontainers were duplicated in the API because paths weren't absolute, we now normalize them early on to keep it simple. Updates #16424	2025-05-12 14:03:40 +03:00
Mathias Fredriksson	1fc74f629e	refactor(agent): update agentcontainers api initialization (#17600 ) There were too many ways to configure the agentcontainers API resulting in inconsistent behavior or features not being enabled. This refactor introduces a control flag for enabling or disabling the containers API. When disabled, all implementations are no-op and explicit endpoint behaviors are defined. When enabled, concrete implementations are used by default but can be overridden by passing options.	2025-04-29 17:53:10 +03:00
Mathias Fredriksson	268a50c193	feat(agent/agentcontainers): add file watcher and dirty status (#17573 ) Fixes coder/internal#479 Fixes coder/internal#480	2025-04-29 11:53:58 +03:00
Spike Curtis	c1816e3674	fix(agent): fix deadlock if closed while starting listeners (#17329 ) fixes #17328 Fixes a deadlock if we close the Agent in the middle of starting listeners on the tailnet.	2025-04-10 12:46:19 +04:00
Aaron Lehmann	aa0a63a295	fix(agent): log correct error variable in createTailnet (#17283 )	2025-04-07 16:32:52 +00:00
Aaron Lehmann	e9863aba81	fix: log correct error on drpc connection close error (#17265 )	2025-04-04 22:09:42 +03:00
Spike Curtis	42e5d71f59	fix: fix closeMutex unlock bug (#17259 ) Fixes https://github.com/coder/internal/issues/550 Classic return before unlocking bug.	2025-04-04 14:29:56 +04:00
Spike Curtis	f6bf6c6ec4	fix!: use names not IDs for agent SSH key seed (#17258 ) Changes the SSH host key seeding to use the owner username, workspace name, and agent name. This prevents SSH from complaining about a mismatched host key if you use Coder Desktop to connect, and delete and recreate your workspace with the same name. Previously this would generate a different key because the workspace ID changed. We also include the owner's username in anticipation of using Coder Desktop to access shared workspaces (or as a superuser) down the road, so that workspaces with the same name owned by different users will not have the same key. This change is BREAKING in a limited sense that early access users of Coder Desktop will see their SSH clients complain about host keys changing the first time each workspace is rebuilt with this code. It can be resolved by clearing your `.ssh/known_hosts` file of the Coder workspaces you access this way.	2025-04-04 12:51:46 +04:00
Mathias Fredriksson	b61f0ab958	fix(agent): ensure SSH server shutdown with process groups (#17227 ) Fix hanging workspace shutdowns caused by orphaned SSH child processes. Key changes: - Create process groups for non-PTY SSH sessions - Send SIGHUP to entire process group for proper termination - Add 5-second timeout to prevent indefinite blocking Fixes #17108	2025-04-03 16:01:43 +03:00
Mathias Fredriksson	7d4b3c8634	feat(agent): add devcontainer autostart support (#17076 ) This change adds support for devcontainer autostart in workspaces. The preconditions for utilizing this feature are: 1. The `coder_devcontainer` resource must be defined in Terraform 2. By the time the startup scripts have completed, - The `@devcontainers/cli` tool must be installed - The given workspace folder must contain a devcontainer configuration Example Terraform: ```tf resource "coder_devcontainer" "coder" { agent_id = coder_agent.main.id workspace_folder = "/home/coder/coder" config_path = ".devcontainer/devcontainer.json" # (optional) } ``` Closes #16423	2025-03-27 12:31:30 +02:00
Danielle Maywood	1bbbae8d57	chore: migrate to github.com/coder/clistat (#17107 ) Migrate from in-tree `clistat` package to https://github.com/coder/clistat.	2025-03-26 10:36:53 +00:00


263 Commits