coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 21:18:24 +00:00

Author	SHA1	Message	Date
Zach	ddc0e99c69	chore: remove coder_secret Terraform integration (#25512 ) Removes the coder_secret Terraform integration: the data.coder_secret consumption path through provisionerdserver → provisioner.proto → provisioner/terraform, the dynamic-parameter secret-requirement validation, and the workspace-update / resolve-autostart surfaces that depended on it. This is being done due to a product/feature direction change (see PLAT-243). User-secret CRUD (DB, REST, CLI, UI, telemetry, audit) and the agent-manifest secret-injection path are untouched. The provisionerd API is bumped from v1.17 to v1.18 rather than rolled back: v1.17 shipped in v2.33.x, so user_secrets field numbers are reserved and the changelog documents both versions. Generated with assistance from Coder Agents.	2026-05-21 09:19:29 -06:00
Zach	79735f2d45	feat: plumb user secrets through provisioner chain to terraform (#24542 ) This change passes user secrets from coderd to the Terraform process at workspace build time so the `data.coder_secret` data source in terraform-provider-coder can resolve values at plan time. Secrets traverse two proto hops: `provisionerdserver` fetches them via`ListUserSecretsWithValues`, attaches them to `AcquiredJob.WorkspaceBuild.user_secrets` on `provisionerd.proto`; `runner.go` forwards into `PlanRequest.user_secrets` on `provisioner.proto`; the Terraform provisioner encodes each as `CODER_SECRET_ENV_<name>` or `CODER_SECRET_FILE_<hex(path)>` before invoking `terraform plan`. Only plan requests carry secrets; apply runs with `nil` because values are baked into plan state. Fetch is gated on a workspace transitioning to start. stop and delete transitions never carry secrets, so revoking or deleting a stored secret cannot make a workspace unstoppable. DB errors on the fetch fail the job outright rather than silently continuing with an empty secret set. Note that user secrets will be stored in the workspace_builds table in provisioner_state with other Terraform state (including other sensitive data).	2026-04-27 08:26:07 -06:00
Michael Suchacz	e5707a13d6	feat: support multiple agents with shared instance-identity auth (#24325 ) > This PR was authored by Mux on behalf of Mike. ## Summary Adds support for multiple peer root workspace agents sharing the same `auth_instance_id`, so AWS, Azure, and GCP instance-identity auth can issue the correct session token for a selected agent instead of assuming a single root agent per instance. ## Problem When a Terraform template attaches two or more `coder_agent` resources (with `auth = "aws-instance-identity"`) to a single compute instance, every agent shares the same cloud instance ID. The existing singular lookup picks whichever agent was created most recently, silently ignoring the others. ## Solution Introduce an optional pre-auth agent selector (`CODER_AGENT_NAME`) and make the server-side lookup ambiguity-aware. Database layer: - `GetWorkspaceAgentsByInstanceID` (`:many`): returns all matching root agents for an instance ID. - `GetWorkspaceAgentByInstanceIDAndName` (`:one`): returns the named root agent for disambiguation. SDK and CLI: - `agent_name` field added to AWS, Azure, and GCP request structs (`omitempty` for backward compatibility). - `CODER_AGENT_NAME` env var and `--agent-name` flag wired into the agent bootstrap before instance-identity auth runs. Server handler (`handleAuthInstanceID`): - When `agent_name` is present: direct lookup by (instance ID, name). - When absent: legacy lookup, then resource-scoped ambiguity check. Returns 409 with available agent names if multiple root agents match. - Whitespace-only names are trimmed and treated as unspecified. - Sub-agents remain excluded (`parent_id IS NULL` filter). Verification template: - `examples/templates/aws-multi-agent/` provisions one EC2 instance with two agents (`main` and `dev`), both using instance-identity auth with `CODER_AGENT_NAME` set in the cloud-init user data. ## Backward compatibility Existing single-agent deployments work unchanged. The `agent_name` field is optional with `omitempty`, and the unnamed path preserves today's behavior when only one root agent matches.	2026-04-16 13:59:09 +02:00
Sas Swart	5b6b7719df	fix: make prebuild claiming durable and idempotent (#23108 ) ## Problem When a prebuilt workspace is claimed, the agent reinitializes via a single fire-and-forget pubsub event over SSE. If the agent's SSE connection is interrupted at claim time, the event is permanently lost — the workspace is stuck with no self-healing path. Additionally, regular (non-prebuild) workspaces had no way to opt out of the `/reinit` polling loop — agents would reconnect indefinitely to an endpoint that would never send them anything useful. ## Root Cause `workspaceAgentReinit` fetches the workspace (with its current `owner_id`) via `GetWorkspaceByAgentID`, but never checked whether a claim already happened. It only subscribed to pubsub for future events. The database already has durable claim state (`owner_id` changes from `PrebuildsSystemUserID` to the real user), but no layer ever consulted it on reconnection. ## Solution ### Server-side durable check with first-build-initiator gating TOCTOU-safe ordering: Subscribe to pubsub claim events before any durable checks, so a claim that fires during the check is buffered in the channel rather than lost. First-build-initiator gating: When `!workspace.IsPrebuild()` (owner is no longer the system user), look up the first build's `InitiatorID`. The prebuild reconciler always uses `PrebuildsSystemUserID` as the initiator. This distinguishes claimed prebuilds from regular workspaces without any SQL schema changes. - Regular workspace (first build initiator ≠ system user) → 409 Conflict, agent stops reconnecting - Claimed prebuild, build completed → pre-seed channel with reinit event and close it, transmitter delivers one-shot then exits - Claimed prebuild, build in-progress → fall through to pubsub subscription, agent waits for completion event - Unclaimed prebuild → pubsub subscription (existing happy path) ### Declarative reinit events (defense-in-depth) - Added `UserID` field to `ReinitializationEvent` with JSON tags - Switched pubsub serialization from raw string to JSON (with backward-compat fallback for rolling upgrades) - Populated `UserID` at both the publish site and the durable check ### Agent SDK: 409 handling `WaitForReinitLoop` detects 409 Conflict from the server and closes the `reinitEvents` channel, cleanly exiting the retry goroutine. ### Agent CLI: fixed two bugs + added reinitCtx - Closed channel (`!ok`): now blocks on `<-ctx.Done()` instead of `continue`, keeping the current agent running. Previously this would leak agents by skipping `agnt.Close()` and re-entering the loop. - Duplicate owner reinit: cancels `reinitCtx` (stops the reinit goroutine), then blocks on `<-ctx.Done()`. Previously `continue` would skip cleanup and create a new agent on the next loop iteration. - `reinitCtx`: a cancellable child of `ctx` passed to `WaitForReinitLoop`, allowing the agent to stop the reinit HTTP polling after reinit completes. ### Agent-side idempotency Tracks `lastOwnerID` in the agent reinit loop — duplicate events for the same owner are skipped. ## Testing - "unclaimed prebuild receives reinit via pubsub": prebuild owned by system user, pubsub event triggers reinit - "claimed prebuild receives one-shot reinit on reconnect": first build by system user, owner changed, build completed → immediate reinit (no pubsub needed) - "claimed prebuild waits during in-progress claim build": claimed but build still running → no reinit until build completes - "regular workspace gets 409": first build by real user → 409 Conflict, agent stops polling - Updated claim publisher/listener tests: verify `UserID` survives JSON round-trip + backward compat with raw string payloads - Updated SSE round-trip test: verify `UserID` survives transmit → receive cycle Fixes #22359 ## Rolling upgrade note During a rolling deploy where old coderd instances coexist with new ones, the pubsub `ReinitializationEvent` has a new `workspace_id` field (JSON key `workspace_id`). Old publishers send a raw reason string instead of JSON; the new listener gracefully falls back by treating the entire payload as the reason and filling in `WorkspaceID` from context. The only visible effect during the upgrade window is that `WorkspaceID` may be the zero UUID in agent-side logs — this is cosmetic and resolves once all instances are updated.	2026-04-02 23:51:02 +02:00
Steven Masley	84de391f26	chore: add tallyman events for ai seat tracking (#22689 ) AI seat tracking inserted as heartbeat into usage table.	2026-03-18 09:30:22 -05:00
Steven Masley	537260aa22	fix: early oidc refresh with fake idp tests (#22712 ) Wrote unit tests that implement a fake idp to verify the oauth package actually refreshes the token	2026-03-06 16:51:27 +00:00
Cian Johnston	81468323e0	fix(coderd): use dbtime.Now() instead of time.Now() in test assertions against DB timestamps (#22685 ) `time.Now()` has nanosecond precision while Postgres timestamps are microsecond precision. When tests compare `time.Now()` against DB-sourced timestamps using `Before`/`After`/`WithinRange`/etc., there is a non-zero flake risk from the precision mismatch. This replaces `time.Now()` with `dbtime.Now()` (which rounds to microsecond precision) in all test assertions that compare against database timestamps. Follows from #22684. ## Changes (11 files) \| File \| Changes \| \|---\|---\| \| `coderd/apikey_test.go` \| 11 comparisons with `ExpiresAt` \| \| `coderd/users_test.go` \| 2 comparisons with `ExpiresAt` \| \| `coderd/oauth2_test.go` \| 1 comparison with `token.Expiry` \| \| `coderd/workspaces_test.go` \| 2 comparisons with `DormantAt` \| \| `coderd/workspaceagents_test.go` \| 3 comparisons with `ConnectedAt`/`DisconnectedAt` \| \| `coderd/workspaceapps/db_test.go` \| 1 comparison with `token.Expiry` \| \| `coderd/provisionerdserver/provisionerdserver_test.go` \| 1 comparison with `key.ExpiresAt` \| \| `enterprise/coderd/workspaces_test.go` \| 1 comparison with `DormantAt` \| \| `enterprise/coderd/license/license_test.go` \| 3 `NotBefore` values \| \| `enterprise/coderd/licenses_test.go` \| 2 `NotBefore` values \| \| `enterprise/coderd/users_test.go` \| 3 `Next()` comparisons \| ## Not changed (intentionally) - `scaletest/placebo/run_test.go` — compares wall-clock elapsed time, not DB timestamps - `cli/server_test.go`, `coderd/jwtutils/jwt_test.go`, `enterprise/aibridgeproxyd/aibridgeproxyd_test.go` — TLS cert fields, not DB-stored - `coderd/azureidentity/azureidentity_test.go` — Azure cert expiry, not DB 🤖 Generated by Claude Opus 4.6 but reviewed manually.	2026-03-06 09:14:11 +00:00
Jon Ayers	0a7a3da178	fix: exclude provisioner_state from workspace_build_with_user view (#22159 ) The provisioner state for a workspace build was being loaded for every long-lived agent rpc connection. Since this state can be anywhere from kilobytes to megabytes this can gradually cause the `coderd` memory footprint to grow over time. It's also a lot of unnecessary allocations for every query that fetches a workspace build since only a few callers ever actually reference the provisioner state. This PR removes it from the returned workspace build and adds a query to fetch the provisioner state explicitly.	2026-02-23 22:46:17 -06:00
Danielle Maywood	911d734df9	fix: avoid re-using `AuthInstanceID` for sub agents (#22196 ) Parent agents were re-using AuthInstanceID when spawning child agents. This caused GetWorkspaceAgentByInstanceID to return the most recently created sub agent instead of the parent when the parent tried to refetch its own manifest. Fix by not reusing AuthInstanceID for sub agents, and updating GetWorkspaceAgentByInstanceID to filter them out entirely.	2026-02-19 16:56:29 +00:00
Danielle Maywood	37aecda165	feat(coderd/provisionerdserver): insert sub agent resource (#21699 ) Update provisionerdserver to handle the changes introduced to provisionerd in https://github.com/coder/coder/pull/21602 We now create a relationship between `workspace_agent_devcontainers` and `workspace_agents` with the newly created `subagent_id`.	2026-01-30 17:19:19 +00:00
Steven Masley	e13f2a9869	chore: remove extra `stop_modules` from provisionerd proto (#21706 ) Was a duplicate of start_modules Closes https://github.com/coder/coder/issues/21206	2026-01-28 09:25:47 -06:00
Cian Johnston	7b44976618	fix(coderd/provisionerdserver): correct managed agent tracking (#21696 ) Relates to https://github.com/coder/internal/issues/1282 Updates tracking of managed agents to be predicated instead on the presence of a related `task_id` instead of the presence of a `coder_ai_task` resource.	2026-01-27 12:14:52 +00:00
Steven Masley	89f4d60e7b	chore: remove experiment "terraform-directory-reuse" (#21397 ) Experiment is no longer required, the new method will be released without an experiment and without a toggle Main PR is: https://github.com/coder/coder/pull/21398	2026-01-09 11:13:16 -06:00
Spike Curtis	bddb808b25	chore: arrange imports in a standard way (#21452 ) Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example: ``` import ( "context" "time" "github.com/prometheus/client_golang/prometheus" "golang.org/x/xerrors" "gopkg.in/natefinch/lumberjack.v2" "cdr.dev/slog/v3" "github.com/coder/coder/v2/codersdk/agentsdk" "github.com/coder/serpent" ) ``` 3 groups: standard library, 3rd partly libs, Coder libs. This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.	2026-01-08 15:24:11 +04:00
Spike Curtis	49b34a716a	fix: fix slog to always use array of Fields (#21426 ) Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptible call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder). It also updates dependencies that also use slog and were updated. I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule. Other dependencies, I pushed new tags.	2026-01-08 10:29:41 +04:00
Steven Masley	9ca5b44b56	chore: implement persistent terraform directories (experimental) (#20563 ) Prior to this, every workspace build ran `terraform init` in a fresh directory. This would mean the `modules` are downloaded fresh. If the module is not pinned, subsequent workspace builds would have different modules.	2025-11-13 07:50:17 -06:00
Steven Masley	04727c06e8	chore: add experiment toggle for terraform workspace caching (#20559 ) Experiments passed to provisioners to determine behavior. This adds `--experiments` flag to provisioner daemons. Prior to this, provisioners had no method to turn on/off experiments.	2025-11-12 14:26:15 -06:00
Steven Masley	9149c1e9f2	chore: append template metadata to protobuf config (#20558 ) Adds some extra meta data sent to provisioners. Also adds a field `reuse_terraform_workspace` to tell the provisioner whether or not to use the caching experiment.	2025-11-12 12:46:39 -06:00
Mathias Fredriksson	ce04f6cc5d	fix(coderd): remove deprecated AITaskSidebarApp column (#20680 ) This column was no longer used in `v2.28` and the codersdk field deprecated. Both can now be dropped in `v2.29`. Closes coder/internal#974	2025-11-07 12:45:45 +02:00
Mathias Fredriksson	a6b0eae38d	refactor(coderd): drop sidebar app constraint and simplify provisionerdserver for tasks (#20591 ) Updates coder/internal#973 Updates coder/internal#974	2025-11-03 13:46:38 +02:00
Cian Johnston	1961252918	chore(coderd/provisionerdserver): address flake in TestServer_ExpirePrebuildsSessionToken (#20648 ) Addresses a flake seen locally by @mafredri: ``` panic: interface conversion: proto.isAcquiredJob_Type is nil, not proto.AcquiredJob_WorkspaceBuild_ [recovered] panic: interface conversion: proto.isAcquiredJob_Type is nil, not proto.AcquiredJob_WorkspaceBuild_ goroutine 77 [running]: testing.tRunner.func1.2({0x35ba440, 0xc000f15620}) /usr/local/go/src/testing/testing.go:1734 +0x21c testing.tRunner.func1() /usr/local/go/src/testing/testing.go:1737 +0x35e panic({0x35ba440?, 0xc000f15620?}) /usr/local/go/src/runtime/panic.go:792 +0x132 github.com/coder/coder/v2/coderd/provisionerdserver_test.TestServer_ExpirePrebuildsSessionToken(0xc00010d500) /home/coder/coder/coderd/provisionerdserver/provisionerdserver_test.go:4128 +0xc4b testing.tRunner(0xc00010d500, 0x4bd8450) /usr/local/go/src/testing/testing.go:1792 +0xf4 created by testing.(*T).Run in goroutine 1 /usr/local/go/src/testing/testing.go:1851 +0x413 FAIL github.com/coder/coder/v2/coderd/provisionerdserver 20.830s FAIL ``` It's unclear why this would happen in the first place.	2025-11-03 11:39:02 +00:00
Danielle Maywood	5a31c590e6	fix(coderd/provisionerdserver): pipe through task id and prompt (#20408 ) Pipes through the Task's ID and prompt into the provisioner. This is required to support the new `coder_ai_task.prompt` field and modified `coder_ai_task.id` field.	2025-10-24 09:43:48 +01:00
Cian Johnston	dc6e50d6b7	feat(coderd/telemetry): add telemetry for database Tasks (#20279 ) Adds Tasks to telemetry snapshots Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>	2025-10-17 10:48:56 +01:00
Mathias Fredriksson	a8f87c2625	feat(coderd): implement task to app linking (#20237 ) This change adds workspace build/agent/app linking to tasks and wires it into `wsbuilder` and `provisionerdserver`. Closes coder/internal#948 Closes coder/coder#20212 Closes coder/coder#19773	2025-10-13 12:57:06 +03:00
Cian Johnston	06cbb2890f	fix: expire token for prebuilds user when regenerating session token (#19667 ) * provisionerdserver: Expires prebuild user token for workspace, if it exists, when regenerating session token. * dbauthz: disallow prebuilds user from creating api keys * dbpurge: added functionality to expire stale api keys owned by the prebuilds user	2025-09-02 09:38:43 +01:00
Susana Ferreira	0ab345ca84	feat: add prebuild timing metrics to Prometheus (#19503 ) ## Description This PR introduces one counter and two histograms related to workspace creation and claiming. The goal is to provide clearer observability into how workspaces are created (regular vs prebuild) and the time cost of those operations. ### `coderd_workspace_creation_total` * Metric type: Counter * Name: `coderd_workspace_creation_total` * Labels: `organization_name`, `template_name`, `preset_name` This counter tracks whether a regular workspace (not created from a prebuild pool) was created using a preset or not. Currently, we already expose `coderd_prebuilt_workspaces_claimed_total` for claimed prebuilt workspaces, but we lack a comparable metric for regular workspace creations. This metric fills that gap, making it possible to compare regular creations against claims. Implementation notes: * Exposed as a `coderd_` metric, consistent with other workspace-related metrics (e.g. `coderd_api_workspace_latest_build`: https://github.com/coder/coder/blob/main/coderd/prometheusmetrics/prometheusmetrics.go#L149). * Every `defaultRefreshRate` (1 minute ), DB query `GetRegularWorkspaceCreateMetrics` is executed to fetch all regular workspaces (not created from a prebuild pool). * The counter is updated with the total from all time (not just since metric introduction). This differs from the histograms below, which only accumulate from their introduction forward. ### `coderd_workspace_creation_duration_seconds` & `coderd_prebuilt_workspace_claim_duration_seconds` * Metric types: Histogram * Names: * `coderd_workspace_creation_duration_seconds` * Labels: `organization_name`, `template_name`, `preset_name`, `type` (`regular`, `prebuild`) * `coderd_prebuilt_workspace_claim_duration_seconds` * Labels: `organization_name`, `template_name`, `preset_name` We already have `coderd_provisionerd_workspace_build_timings_seconds`, which tracks build run times for all workspace builds handled by the provisioner daemon. However, in the context of this issue, we are only interested in creation and claim build times, not all transitions; additionally, this metric does not include `preset_name`, and adding it there would significantly increase cardinality. Therefore, separate more focused metrics are introduced here: * `coderd_workspace_creation_duration_seconds`: Build time to create a workspace (either a regular workspace or the build into a prebuild pool, for prebuild initial provisioning build). * `coderd_prebuilt_workspace_claim_duration_seconds`: Time to claim a prebuilt workspace from the pool. The reason for two separate histograms is that: * Creation (regular or prebuild): provisioning builds with similar time magnitude, generally expected to take longer than a claim operation. * Claim: expected to be a much faster provisioning build. #### Native histogram usage Provisioning times vary widely between projects. Using static buckets risks unbalanced or poorly informative histograms. To address this, these metrics use [Prometheus native histograms](https://prometheus.io/docs/specs/native_histograms/): * First introduced in Prometheus v2.40.0 * Recommended stable usage from v2.45+ * Requires Go client `prometheus/client_golang` v1.15.0+ * Experimental and must be explicitly enabled on the server (`--enable-feature=native-histograms`) For compatibility, we also retain a classic bucket definition (aligned with the existing provisioner metric: https://github.com/coder/coder/blob/main/provisionerd/provisionerd.go#L182-L189). * If native histograms are enabled, Prometheus ingests the high-resolution histogram. * If not, it falls back to the predefined buckets. Implementation notes: * Unlike the counter, these histograms are updated in real-time at workspace build job completion. * They reflect data only from the point of introduction forward (no historical backfill). ## Relates to Closes: https://github.com/coder/coder/issues/19528 Native histograms tested in observability stack: https://github.com/coder/observability/pull/50	2025-08-28 15:00:26 +01:00
Cian Johnston	bd139f3a43	fix(coderd/provisionerdserver): workaround lack of coder_ai_task resource on stop transition (#19560 ) This works around the issue where a task may "disappear" on stop. Re-using the previous value of `has_ai_task` and `sidebar_app_id` on a stop transition. --------- Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>	2025-08-27 10:33:17 +01:00
Dean Sheather	1a601c30ad	chore: move usage types to new package (#19103 )	2025-08-20 23:48:38 +10:00
Dean Sheather	6eb02d1c2a	chore: wire up usage tracking for managed agents (#19096 ) Wires up the usage collector and publisher to coderd. Relates to coder/internal#814	2025-08-20 23:38:09 +10:00
Cian Johnston	155c7bbc65	chore(coderd/provisionerdserver): avoid fk error on invalid ai_task_sidebar_app_id (#19253 ) This is a workaround for https://github.com/coder/coder/issues/18776 We avoid the foreign key issue by checking the previously inserted workspace applications before calling UpdateWorkspaceAITask. Now, affected workspaces will show as "not running an AI task" on the single task view, which is technically correct. We also insert a provisioner job log at WARN level to ensure that the user sees some information that they have run into this issue, as well as logging on the server side. Longer term, we plan to modify how the workspace tasks view is presented. This is a stopgap measure until we solidify that plan. NOTE: this does not address the fact that stopping a workspace with `has_ai_task: true` will result in the completed stop build no longer having `has_ai_task: true`, resulting in tasks "disappearing" on stop.	2025-08-08 18:06:32 +01:00
Cian Johnston	b0ba798f78	chore(coderd/provisionerdserver): remove dangerous usage of dbtestutil.DisableForeignKeysAndTriggers (#19122 ) May have caught https://github.com/coder/coder/issues/18776	2025-08-04 21:51:33 +01:00
Benjamin Peinhardt	e4dc2d9418	fix: add constraint and runtime check for provisioner logs size limit (#18893 ) This PR sets a constraint of 1MB on the provisioner job logs written to the database. This is consistent with the constraint we place on workspace agent logs: https://github.com/coder/coder/blob/4ac6be6d835dc36c242e35a26b584b784040bf28/coderd/database/dump.sql#L2030 It also adds a message printed to the front end about the provisioner log overflow, and updates the message printed to the front end when workspace startup logs exceed the max, as it was causing some customers to think their startup script had failed to run.	2025-07-30 19:09:53 -05:00
Danny Kopping	0238f2926d	feat: persist AI task state in template imports & workspace builds (#18449 )	2025-06-24 10:36:37 +00:00
ケイラ	fae30a00fd	chore: remove unnecessary redeclarations in for loops (#18440 )	2025-06-20 13:16:55 -06:00
Hugo Dutka	910858b731	chore(coderd/provisionerdserver): convert dbmem tests to use postgres (#18278 )	2025-06-09 10:05:29 +02:00
Steven Masley	0428c5ec1c	chore: include 'everyone' group in template importing (#18257 )	2025-06-05 19:25:36 +00:00
Steven Masley	8387dd27ab	chore: add form_type parameter argument to db (#17920 ) `form_type` is a new parameter field in the terraform provider. Bring that field into coder/coder. Validation for `multi-select` has also been added.	2025-05-29 08:55:19 -05:00
Cian Johnston	776c144128	fix(coderd): ensure agent timings are non-zero on insert (#18065 ) Relates to https://github.com/coder/coder/issues/15432 Ensures that no workspace build timings with zero values for started_at or ended_at are inserted into the DB or returned from the API.	2025-05-29 13:36:06 +01:00
Michael Suchacz	b7462fb256	feat: improve transaction safety in CompleteJob function (#17970 ) This PR refactors the CompleteJob function to use database transactions more consistently for better atomicity guarantees. The large function was broken down into three specialized handlers: - completeTemplateImportJob - completeWorkspaceBuildJob - completeTemplateDryRunJob Each handler now uses the Database.InTx wrapper to ensure all database operations for a job completion are performed within a single transaction, preventing partial updates in case of failures. Added comprehensive tests for transaction behavior for each job type. Fixes #17694 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-05-21 16:48:51 +02:00
Steven Masley	789c4beba7	chore: add dynamic parameter error if missing metadata from provisioner (#17809 )	2025-05-14 12:21:36 -05:00
Danny Kopping	6e967780c9	feat: track resource replacements when claiming a prebuilt workspace (#17571 ) Closes https://github.com/coder/internal/issues/369 We can't know whether a replacement (i.e. drift of terraform state leading to a resource needing to be deleted/recreated) will take place apriori; we can only detect it at `plan` time, because the provider decides whether a resource must be replaced and it cannot be inferred through static analysis of the template. This is likely to be the most common gotcha with using prebuilds, since it requires a slight template modification to use prebuilds effectively, so let's head this off before it's an issue for customers. Drift details will now be logged in the workspace build logs: ![image](https://github.com/user-attachments/assets/da1988b6-2cbe-4a79-a3c5-ea29891f3d6f) Plus a notification will be sent to template admins when this situation arises: ![image](https://github.com/user-attachments/assets/39d555b1-a262-4a3e-b529-03b9f23bf66a) A new metric - `coderd_prebuilt_workspaces_resource_replacements_total` - will also increment each time a workspace encounters replacements. We only track _that_ a resource replacement occurred, not how many. Just one is enough to ruin a prebuild, but we can't know apriori which replacement would cause this. For example, say we have 2 replacements: a `docker_container` and a `null_resource`; we don't know which one might cause an issue (or indeed if either would), so we just track the replacement. --------- Signed-off-by: Danny Kopping <dannykopping@gmail.com>	2025-05-14 14:52:22 +02:00
Sas Swart	425ee6fa55	feat: reinitialize agents when a prebuilt workspace is claimed (#17475 ) This pull request allows coder workspace agents to be reinitialized when a prebuilt workspace is claimed by a user. This facilitates the transfer of ownership between the anonymous prebuilds system user and the new owner of the workspace. Only a single agent per prebuilt workspace is supported for now, but plumbing has already been done to facilitate the seamless transition to multi-agent support. --------- Signed-off-by: Danny Kopping <dannykopping@gmail.com> Co-authored-by: Danny Kopping <dannykopping@gmail.com>	2025-05-14 14:15:36 +02:00
Danielle Maywood	0b5f27f566	feat: add `parent_id` column to `workspace_agents` table (#17758 ) Adds a new nullable column `parent_id` to `workspace_agents` table. This lays the groundwork for having child agents.	2025-05-13 00:01:31 +01:00
Danny Kopping	af2941bb92	feat: add `is_prebuild_claim` to distinguish post-claim provisioning (#17757 ) Used in combination with https://github.com/coder/terraform-provider-coder/pull/396 This is required by both https://github.com/coder/coder/pull/17475 and https://github.com/coder/coder/pull/17571 Operators may need to conditionalize their templates to perform certain operations once a prebuilt workspace has been claimed. This value will only be set once a claim takes place and a subsequent `terraform apply` occurs. Any `terraform apply` runs thereafter will be indistinguishable from a normal run on a workspace. --------- Signed-off-by: Danny Kopping <dannykopping@gmail.com>	2025-05-12 14:19:03 +00:00
Steven Masley	ea65ddc17d	fix: correct user roles being passed into terraform context (#17460 ) Roles were being passed into the workspace context incorrectly. Site wide scopes were being org scoped. Roles outside the org should also not be sent.	2025-04-17 15:42:23 -05:00
ケイラ	f670bc31f5	chore: update testutil chan helpers (#17408 )	2025-04-16 10:37:09 -06:00
Sas Swart	a98605913a	feat: mark prebuilds as such and set their preset ids (#16965 ) This pull request closes https://github.com/coder/internal/issues/513	2025-04-14 15:34:50 +02:00
Sas Swart	0b2b643ce2	feat: persist prebuild definitions on template import (#16951 ) This PR allows provisioners to recognise and report prebuild definitions to the coder control plane. It also allows the coder control plane to then persist these to its store. closes https://github.com/coder/internal/issues/507 --------- Signed-off-by: Danny Kopping <dannykopping@gmail.com> Co-authored-by: Danny Kopping <dannykopping@gmail.com> Co-authored-by: evgeniy-scherbina <evgeniy.shcherbina.es@gmail.com>	2025-04-07 10:35:28 +02:00
Mathias Fredriksson	5c8cac9fb7	feat: add name to workspace agent devcontainers (#17089 ) In the presence of multiple devcontainers, it would be nice to differentiate them by name. This change inherits the resource name from terraform. Refs #17076	2025-03-25 12:59:20 +00:00
ケイラ	5b3eda6719	chore: persist template import terraform plan in postgres (#17012 )	2025-03-24 10:01:50 -06:00

1 2 3

135 Commits