coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 13:08:25 +00:00

Author	SHA1	Message	Date
Michael Suchacz	e5707a13d6	feat: support multiple agents with shared instance-identity auth (#24325 ) > This PR was authored by Mux on behalf of Mike. ## Summary Adds support for multiple peer root workspace agents sharing the same `auth_instance_id`, so AWS, Azure, and GCP instance-identity auth can issue the correct session token for a selected agent instead of assuming a single root agent per instance. ## Problem When a Terraform template attaches two or more `coder_agent` resources (with `auth = "aws-instance-identity"`) to a single compute instance, every agent shares the same cloud instance ID. The existing singular lookup picks whichever agent was created most recently, silently ignoring the others. ## Solution Introduce an optional pre-auth agent selector (`CODER_AGENT_NAME`) and make the server-side lookup ambiguity-aware. Database layer: - `GetWorkspaceAgentsByInstanceID` (`:many`): returns all matching root agents for an instance ID. - `GetWorkspaceAgentByInstanceIDAndName` (`:one`): returns the named root agent for disambiguation. SDK and CLI: - `agent_name` field added to AWS, Azure, and GCP request structs (`omitempty` for backward compatibility). - `CODER_AGENT_NAME` env var and `--agent-name` flag wired into the agent bootstrap before instance-identity auth runs. Server handler (`handleAuthInstanceID`): - When `agent_name` is present: direct lookup by (instance ID, name). - When absent: legacy lookup, then resource-scoped ambiguity check. Returns 409 with available agent names if multiple root agents match. - Whitespace-only names are trimmed and treated as unspecified. - Sub-agents remain excluded (`parent_id IS NULL` filter). Verification template: - `examples/templates/aws-multi-agent/` provisions one EC2 instance with two agents (`main` and `dev`), both using instance-identity auth with `CODER_AGENT_NAME` set in the cloud-init user data. ## Backward compatibility Existing single-agent deployments work unchanged. 
The `agent_name` field is optional with `omitempty`, and the unnamed path preserves today's behavior when only one root agent matches.	2026-04-16 13:59:09 +02:00
Sas Swart	5b6b7719df	fix: make prebuild claiming durable and idempotent (#23108 ) ## Problem When a prebuilt workspace is claimed, the agent reinitializes via a single fire-and-forget pubsub event over SSE. If the agent's SSE connection is interrupted at claim time, the event is permanently lost — the workspace is stuck with no self-healing path. Additionally, regular (non-prebuild) workspaces had no way to opt out of the `/reinit` polling loop — agents would reconnect indefinitely to an endpoint that would never send them anything useful. ## Root Cause `workspaceAgentReinit` fetches the workspace (with its current `owner_id`) via `GetWorkspaceByAgentID`, but never checked whether a claim already happened. It only subscribed to pubsub for future events. The database already has durable claim state (`owner_id` changes from `PrebuildsSystemUserID` to the real user), but no layer ever consulted it on reconnection. ## Solution ### Server-side durable check with first-build-initiator gating TOCTOU-safe ordering: Subscribe to pubsub claim events before any durable checks, so a claim that fires during the check is buffered in the channel rather than lost. First-build-initiator gating: When `!workspace.IsPrebuild()` (owner is no longer the system user), look up the first build's `InitiatorID`. The prebuild reconciler always uses `PrebuildsSystemUserID` as the initiator. This distinguishes claimed prebuilds from regular workspaces without any SQL schema changes. 
- Regular workspace (first build initiator ≠ system user) → 409 Conflict, agent stops reconnecting - Claimed prebuild, build completed → pre-seed channel with reinit event and close it, transmitter delivers one-shot then exits - Claimed prebuild, build in-progress → fall through to pubsub subscription, agent waits for completion event - Unclaimed prebuild → pubsub subscription (existing happy path) ### Declarative reinit events (defense-in-depth) - Added `UserID` field to `ReinitializationEvent` with JSON tags - Switched pubsub serialization from raw string to JSON (with backward-compat fallback for rolling upgrades) - Populated `UserID` at both the publish site and the durable check ### Agent SDK: 409 handling `WaitForReinitLoop` detects 409 Conflict from the server and closes the `reinitEvents` channel, cleanly exiting the retry goroutine. ### Agent CLI: fixed two bugs + added reinitCtx - Closed channel (`!ok`): now blocks on `<-ctx.Done()` instead of `continue`, keeping the current agent running. Previously this would leak agents by skipping `agnt.Close()` and re-entering the loop. - Duplicate owner reinit: cancels `reinitCtx` (stops the reinit goroutine), then blocks on `<-ctx.Done()`. Previously `continue` would skip cleanup and create a new agent on the next loop iteration. - `reinitCtx`: a cancellable child of `ctx` passed to `WaitForReinitLoop`, allowing the agent to stop the reinit HTTP polling after reinit completes. ### Agent-side idempotency Tracks `lastOwnerID` in the agent reinit loop — duplicate events for the same owner are skipped. 
## Testing - "unclaimed prebuild receives reinit via pubsub": prebuild owned by system user, pubsub event triggers reinit - "claimed prebuild receives one-shot reinit on reconnect": first build by system user, owner changed, build completed → immediate reinit (no pubsub needed) - "claimed prebuild waits during in-progress claim build": claimed but build still running → no reinit until build completes - "regular workspace gets 409": first build by real user → 409 Conflict, agent stops polling - Updated claim publisher/listener tests: verify `UserID` survives JSON round-trip + backward compat with raw string payloads - Updated SSE round-trip test: verify `UserID` survives transmit → receive cycle Fixes #22359 ## Rolling upgrade note During a rolling deploy where old coderd instances coexist with new ones, the pubsub `ReinitializationEvent` has a new `UserID` field. Old publishers send a raw reason string instead of JSON; the new listener gracefully falls back by treating the entire payload as the reason and filling in `WorkspaceID` from context. The only visible effect during the upgrade window is that `UserID` may be the zero UUID in agent-side logs; this is cosmetic and resolves once all instances are updated.	2026-04-02 23:51:02 +02:00
Danielle Maywood	6ccd20d45f	feat(agent): populate subagent ID for terraform-defined devcontainers (#21942 ) Completes the final piece of the puzzle. Support the pre-creation flow from the agent side.	2026-02-06 15:52:54 +00:00
Cian Johnston	353ebd9664	feat: add link for viewing raw build logs in workspace and template build jobs (#21727 ) * Adds support for parameter `format=text` in the following API routes: * `/api/v2/workspaceagents/:id/logs` * `/api/v2/workspacebuilds/:id/logs` * `/api/v2/templateversions/:id/logs` * `/api/v2/templateversions/:id/dry-run/:id/logs` * Adds links to view raw logs on the following pages: * Workspace build page * Template editor page * Template version page * Refactors existing log formatting in `cli/logs.go` to live in `codersdk`. 🤖 Generated with Claude Opus 4.5, reviewed by me. --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-02-03 09:45:23 +00:00
Danielle Maywood	05529139bc	feat(coderd): support deleting dev containers (#21248 ) Add an endpoint to coderd to support deleting dev containers	2025-12-24 12:34:39 +00:00
Marcin Tojek	0af038bddd	docs: group enumerated values by property in API docs (#21372 ) Fixes #13840	2025-12-22 16:19:25 +01:00
Rafael Rodriguez	e53bc247e9	feat: add tooltip field to workspace app that renders as markdown (#19651 ) In this pull request we're adding an optional `tooltip` field. The `tooltip` field is a string field (with markdown support) that will be used to display tooltips on hover over app buttons in a workspace dashboard. Tooltip screenshot: https://github.com/user-attachments/assets/52c736a1-f632-465b-89a0-35ca99bd367b Tooltip video: https://github.com/user-attachments/assets/21806337-accc-4acf-b8c6-450c031d98f1 Issue: https://github.com/coder/coder/issues/18431 Related provider PR: https://github.com/coder/terraform-provider-coder/pull/435 ### Changes - Added migration to add `tooltip` column to `workspace_apps` table - Updated queries to get/set the new `tooltip` column - Updated frontend to render tooltip as markdown (primary tooltip takes precedence over template tooltip) ### Testing - Added storybook test for `Applink` markdown rendering	2025-09-10 11:01:54 -05:00
Danielle Maywood	43b0bb7f61	feat(site): use websocket connection for devcontainer updates (#18808 ) Instead of polling every 10 seconds, we now use a WebSocket connection for more timely updates.	2025-07-14 21:35:35 +01:00
Danielle Maywood	4756080eb2	feat(site): display devcontainer start error (#18637 ) Fixes https://github.com/coder/internal/issues/705 Surface errors on the UI when a devcontainer agent is unable to be injected.	2025-06-30 21:34:29 +01:00
Danielle Maywood	f2d229eed3	fix!: use devcontainer ID when rebuilding a devcontainer (#18604 ) This PR replaces the use of the container ID with the devcontainer ID. This is a breaking change. This allows rebuilding a devcontainer when there is no valid container ID.	2025-06-26 11:41:57 +01:00
Mathias Fredriksson	97474bb28b	feat: support devcontainer agents in ui and unify backend (#18332 ) This commit consolidates two container endpoints on the backend and improves the frontend devcontainer support by showing names and displaying apps as appropriate. With this change, the frontend now has knowledge of the subagent and we can also display things like port forwards. The frontend was updated to show dev container labels on the border as well as subagent connection status. The recreation flow was also adjusted a bit to show placeholder app icons when relevant. Support for apps was also added, although these are still WIP on the backend. The port forwarding utility was also added, since the subagents now provide the necessary info. Fixes coder/internal#666	2025-06-17 16:06:47 +03:00
ケイラ	9fc3329575	feat: persist app groups in the database (#17977 )	2025-05-27 13:13:08 -06:00
Mathias Fredriksson	a18eb9d08f	feat(site): allow recreating devcontainers and showing dirty status (#18049 ) This change allows showing the devcontainer dirty status in the UI as well as a recreate button to update the devcontainer. Closes #16424	2025-05-27 19:42:24 +03:00
Mathias Fredriksson	0731304905	feat(agent/agentcontainers): recreate devcontainers concurrently (#18042 ) This change introduces a refactor of the devcontainers recreation logic which is now handled asynchronously rather than being request scoped. The response was consequently changed from "No Content" to "Accepted" to reflect this. A new `Status` field was introduced to the devcontainer struct which replaces `Running` (bool). This reflects that the devcontainer can now be in various states (starting, running, stopped or errored). The status field also protects against multiple concurrent recreations, as long as they are initiated via the API. Updates #16424	2025-05-26 18:30:52 +03:00
Mathias Fredriksson	98e2ec4417	feat: show devcontainer dirty status and allow recreate (#17880 ) Updates #16424	2025-05-19 12:56:10 +03:00
Sas Swart	425ee6fa55	feat: reinitialize agents when a prebuilt workspace is claimed (#17475 ) This pull request allows coder workspace agents to be reinitialized when a prebuilt workspace is claimed by a user. This facilitates the transfer of ownership between the anonymous prebuilds system user and the new owner of the workspace. Only a single agent per prebuilt workspace is supported for now, but plumbing has already been done to facilitate the seamless transition to multi-agent support. --------- Signed-off-by: Danny Kopping <dannykopping@gmail.com> Co-authored-by: Danny Kopping <dannykopping@gmail.com>	2025-05-14 14:15:36 +02:00
Danielle Maywood	0b5f27f566	feat: add `parent_id` column to `workspace_agents` table (#17758 ) Adds a new nullable column `parent_id` to `workspace_agents` table. This lays the groundwork for having child agents.	2025-05-13 00:01:31 +01:00
Spike Curtis	12dc086628	feat: return hostname suffix on AgentConnectionInfo (#17334 ) Adds the Hostname Suffix to `AgentConnectionInfo` --- the VPN provider will use it to control the suffix for DNS hostnames. part of: #16828	2025-04-11 13:09:51 +04:00
Kyle Carberry	8ea956fc11	feat: add app status tracking to the backend (#17163 ) This does ~95% of the backend work required to integrate the AI work. Most of what is left to integrate from the tasks branch is frontend work, which I believe will be a lot smaller. The real difference between this branch and that one is the abstraction -- this now attaches statuses to apps, and returns the latest status reported as part of a workspace. This change gives us a UX similar to the one in the tasks branch, but for agents other than Claude Code as well. Any app can report status now.	2025-03-31 10:55:44 -04:00
Cian Johnston	75b27e8f19	fix(agent/agentcontainers): improve testing of convertDockerInspect, return correct host port (#16887 ) * Improves separation of concerns between `runDockerInspect` and `convertDockerInspect`: `runDockerInspect` now just runs the command and returns the output, while `convertDockerInspect` now does all of the conversion and parsing logic. * Improves testing of `convertDockerInspect` using real test fixtures. * Fixes issue where the container port is returned instead of the host port. * Updates UI to link to correct host port. Container port is still displayed in the button text, but the HostIP:HostPort is shown in a popover. * Adds stories for workspace agent UI	2025-03-18 14:37:45 +00:00
Cian Johnston	31b1ff7d3b	feat(agent): add container list handler (#16346 ) Fixes https://github.com/coder/coder/issues/16268 - Adds `/api/v2/workspaceagents/:id/containers` coderd endpoint that allows listing containers visible to the agent. Optional filtering by labels is supported. - Adds go tools to the `coder-dylib` CI step so we can generate mocks if needed	2025-02-10 11:29:30 +00:00
Vincent Vielle	289338f19e	feat(site): connect open_in parameter (#16036 ) Second step to resolve [open_in issue](https://github.com/coder/terraform-provider-coder/issues/297) This PR improves the way the open_in parameter is forwarded across the code, changing the last `string` to const everywhere. It also makes sure the parameter is available and forwarded up to the `CreateLink` component.	2025-01-07 18:08:03 +01:00
Muhammad Atif Ali	94f5d52fdc	chore: adopt markdownlint and markdown-table-formatter for *.md (#15831 ) Co-authored-by: Edward Angert <EdwardAngert@users.noreply.github.com>	2025-01-03 13:12:59 +00:00
Vincent Vielle	08463c27d8	feat: add OpenIn option to coder_app (#15743 ) This PR is the coder/coder part of [the open_in parameter issue](https://github.com/coder/terraform-provider-coder/issues/297) aiming to add a new optional parameter to choose how to open modules. This PR is heavily linked [to this PR](https://github.com/coder/terraform-provider-coder/pull/321). ℹ️ For now, some integration tests cannot be pushed as they require a release on the terraform-provider repo.	2025-01-03 11:27:02 +01:00
Ethan	b1298a3c1e	feat: add WorkspaceUpdates tailnet RPC (#14847 ) Closes #14716 Closes #14717 Adds a new user-scoped tailnet API endpoint (`api/v2/tailnet`) with a new RPC stream for receiving updates on workspaces owned by a specific user, as defined in #14716. When a stream is started, the `WorkspaceUpdatesProvider` will begin listening on the user-scoped pubsub events implemented in #14964. When a relevant event type is seen (such as a workspace state transition), the provider will query the DB for all the workspaces (and agents) owned by the user. This gets compared against the result of the previous query to produce a set of workspace updates. Workspace updates can be requested for any user ID, however only workspaces the authorised user is permitted to `ActionRead` will have their updates streamed. Opening a tunnel to an agent requires that the user can perform `ActionSSH` against the workspace containing it.	2024-11-01 14:53:53 +11:00
Danielle Maywood	ae522c558d	feat: add agent timings (#14713 ) * feat: begin impl of agent script timings * feat: add job_id and display_name to script timings * fix: increment migration number * fix: rename migrations from 251 to 254 * test: get tests compiling * fix: appease the linter * fix: get tests passing again * fix: drop column from correct table * test: add fixture for agent script timings * fix: typo * fix: use job id used in provisioner job timings * fix: increment migration number * test: behaviour of script runner * test: rewrite test * test: does exit 1 script break things? * test: rewrite test again * fix: revert change Not sure how this came to be, I do not recall manually changing these files. * fix: let code breathe * fix: wrap errors * fix: justify nolint * fix: swap require.Equal argument order * fix: add mutex operations * feat: add 'ran_on_start' and 'blocked_login' fields * fix: update testdata fixture * fix: refer to agent_id instead of job_id in timings * fix: JobID -> AgentID in dbauthz_test * fix: add 'id' to scripts, make timing refer to script id * fix: fix broken tests and convert bug * fix: update testdata fixtures * fix: update testdata fixtures again * feat: capture stage and if script timed out * fix: update migration number * test: add test for script api * fix: fake db query * fix: use UTC time * fix: ensure r.scriptComplete is not nil * fix: move err check to right after call * fix: uppercase sql * fix: use dbtime.Now() * fix: debug log on r.scriptCompleted being nil * fix: ensure correct rbac permissions * chore: remove DisplayName * fix: get tests passing * fix: remove space in sql up * docs: document ExecuteOption * fix: drop 'RETURNING' from sql * chore: remove 'display_name' from timing table * fix: testdata fixture * fix: put r.scriptCompleted call in goroutine * fix: track goroutine for test + use separate context for reporting * fix: appease linter, handle trackCommandGoroutine error * fix: resolve race condition * feat: 
replace timed_out column with status column * test: update testdata fixture * fix: apply suggestions from review * revert: linter changes	2024-09-24 10:51:49 +01:00
Danielle Maywood	86f68b220e	feat: add 'display_name' column to 'workspace_agent_scripts' (#14747 ) * feat: add 'display_name' column to 'workspace_agent_scripts' * fix: backfill from workspace_agent_log_sources * fix: run 'make gen'	2024-09-20 14:26:13 +01:00
Danielle Maywood	25f1ddbf5e	feat: add 'hidden' option to 'coder_app' to hide app from UI (#14570 ) Add 'hidden' property to 'coder_app' resource to allow hiding apps from the UI.	2024-09-09 14:39:32 +01:00
Kayla Washburn-Love	95a7c0c4f0	chore: use tabs for prettier and biome (#14283 )	2024-08-15 14:53:53 -06:00
Muhammad Atif Ali	48f29a1995	docs: move api and cli docs routes to reference/ (#14241 )	2024-08-13 18:39:46 +03:00
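
The agent selection rules described in commit e5707a13d6 (direct lookup when `agent_name` is given, 409 with the available names when the unnamed lookup is ambiguous, sub-agents excluded) can be illustrated with a small Python sketch. This is not the coderd implementation: the `Agent` type, the `select_agent` function, and the 404 result for "no match" are assumptions made only for illustration, and the sketch skips the resource-scoped part of the ambiguity check.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Agent:
    """Hypothetical stand-in for a workspace agent row."""
    name: str
    instance_id: str
    parent_id: Optional[str] = None  # non-None marks a sub-agent


def select_agent(
    agents: List[Agent], instance_id: str, agent_name: Optional[str] = None
) -> Tuple[int, Union[Agent, List[str], None]]:
    """Return (HTTP-like status, payload) for an instance-identity auth request."""
    # Only root agents are candidates (the `parent_id IS NULL` filter).
    roots = [
        a for a in agents
        if a.instance_id == instance_id and a.parent_id is None
    ]

    # Whitespace-only names are trimmed and treated as unspecified.
    name = (agent_name or "").strip()

    if name:
        # Named path: direct lookup by (instance ID, name).
        for a in roots:
            if a.name == name:
                return 200, a
        return 404, None  # assumption: the commit message does not state this case

    if not roots:
        return 404, None  # assumption, as above
    if len(roots) > 1:
        # Ambiguous unnamed lookup: report the names the caller can choose from.
        return 409, sorted(a.name for a in roots)
    # Exactly one root agent: the legacy single-agent behavior.
    return 200, roots[0]
```

With two root agents `main` and `dev` on the same instance, an unnamed request yields `409` plus both names, while passing `dev` (for example through `CODER_AGENT_NAME`) resolves to that agent.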

30 Commits
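
The four reconnect outcomes listed in commit 5b6b7719df (regular workspace, claimed prebuild with a completed build, claimed prebuild with a build in progress, unclaimed prebuild) amount to a small decision table. The Python sketch below restates it under stated assumptions: `decide_reinit`, `ReinitAction`, and the placeholder system-user ID are hypothetical names, not identifiers from the coder codebase, and the pubsub subscription that the commit says must happen before these checks is not modelled.

```python
from enum import Enum

# Placeholder value; the real PrebuildsSystemUserID is a UUID defined in coder.
PREBUILDS_SYSTEM_USER_ID = "prebuilds-system-user"


class ReinitAction(Enum):
    CONFLICT = "409 Conflict: agent stops reconnecting"
    ONE_SHOT_REINIT = "send one reinit event, then close"
    WAIT_FOR_PUBSUB = "wait for a pubsub claim event"


def decide_reinit(
    owner_id: str, first_build_initiator_id: str, claim_build_completed: bool
) -> ReinitAction:
    """Map durable workspace state to the behavior of the reinit endpoint."""
    if owner_id == PREBUILDS_SYSTEM_USER_ID:
        # Unclaimed prebuild: the existing happy path.
        return ReinitAction.WAIT_FOR_PUBSUB
    if first_build_initiator_id != PREBUILDS_SYSTEM_USER_ID:
        # Never a prebuild: a regular workspace has nothing to wait for.
        return ReinitAction.CONFLICT
    if claim_build_completed:
        # Claimed prebuild whose claim build finished: deliver the missed event.
        return ReinitAction.ONE_SHOT_REINIT
    # Claimed prebuild with the claim build still running.
    return ReinitAction.WAIT_FOR_PUBSUB
```

The first-build-initiator check is what separates a claimed prebuild from a regular workspace once `owner_id` no longer points at the system user, which is why the sketch needs no extra stored state.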