Commit Graph

11425 Commits

Author SHA1 Message Date
Cian Johnston 38017010ce fix(coderd): disallow POSTing a workspace build on a deleted workspace (#20584)
- Adds a check on /api/v2/workspacebuilds to disallow creating a START or STOP build if the workspace is deleted. 
- DELETEs are still allowed.
2025-10-30 13:32:18 +00:00
Spike Curtis 984a834e81 docs: revert work in progress 10k scale doc (#20580)
Reverts in-progress 10k docs because people found it confusing.
2025-10-30 16:17:04 +04:00
Cian Johnston 2bcf08457b ci: revert workaround for get.helm.sh outage (#20552) (#20557)
Reverts the temporary workaround in #20552. 
Merge after get.helm.sh is once again operational.
2025-10-30 10:56:24 +00:00
Cian Johnston 73dedcc765 fix: delete related task when deleting workspace (#20567)
* Instead of prompting the user to start a deleted workspace (which is
silly), prompt them to create a new task instead.
* Adds a warning dialog when deleting a workspace
* Updates provisionerdserver to delete the related task if a workspace
is related to a task
2025-10-30 10:37:51 +00:00
Spike Curtis 94f6e83cfa docs: fix typo: worklods (#20578)
fixes typo.
2025-10-30 12:45:47 +04:00
Ethan bc0c4ebaa7 chore(dogfood): add back coder envbuilder template (#20576)
I've given the CI dev.coder user Admin on the template, and tested the template still builds a workspace.
2025-10-30 19:09:08 +11:00
Spike Curtis dc277618ee chore: refactor flags that target workspaces in scaletest (#20537)
For the https://github.com/coder/internal/issues/913 we are going to be targeting running workspaces. So this PR modularizes the CLI flags and logic that select those targets so we can reuse it.
2025-10-30 11:10:24 +04:00
Ethan b90c74a94d chore(scaletest): avoid polling workspace builds during workspace-updates tests (#20534)
This PR is just committing the changes I made while running the
`workspace-updates` load generator.

It ensures we're not polling the workspace build progress in the
background (while we also watch for workspace updates via the tailnet),
and also removes an unnecessary query to `/api/v2/workspace/{id}` after
each workspace is built.
2025-10-30 12:14:25 +11:00
Danny Kopping ff532d9bf3 chore: handle deprecated aibridge experimental routes (#20565)
In v2.28 we're [removing the aibridge
experiment](https://github.com/coder/coder/pull/20544).

We need to handle `/api/experimental/aibridge/*` until Beta (next
release).

Signed-off-by: Danny Kopping <danny@coder.com>
2025-10-29 19:11:34 -06:00
Steven Masley 54497f4f6b chore: add revocation endpoint to oauth well-known (#20561)
Was added to apps endpoints, but not the wider site ones. This is a site
wide oauth route
2025-10-29 16:44:53 -05:00
Danielle Maywood 9629d873fb fix(site): fix disappearing preset selector when switching task template (#20514)
Closes https://github.com/coder/coder/issues/20465

Ensure we set `selectedPresetId` to `undefined` when we change
`selectedTemplateId` to ensure we don't end up breaking the `<Select>`
component by giving it an invalid preset id.
2025-10-29 21:11:44 +00:00
Asher 643fe38b1e fix: use temp file on same device with mcp file edit (#20477)
Otherwise you can get errors like "invalid cross-device link".
2025-10-29 12:23:06 -08:00
Jaayden Halko c827a08c11 refactor: migrate deployment banner to Tailwind and radix (#20479)
before:
<img width="1667" height="48" alt="Screenshot 2025-10-25 at 18 02 45"
src="https://github.com/user-attachments/assets/1525a01e-5976-4d0e-8280-1b9ae8d91197"
/>

after:
<img width="1662" height="35" alt="Screenshot 2025-10-25 at 18 02 17"
src="https://github.com/user-attachments/assets/d0fd7b69-ee88-4986-a539-5917c17a8b85"
/>
2025-10-29 15:41:19 -04:00
Bartek Gatz 1b6556c2f6 fix: improve visual separation between prompt and task list (#20427) 2025-10-29 19:00:27 +00:00
Mathias Fredriksson 859e94d67a fix: deprecate codersdk.AITaskPromptParameterName and reduce usage (#20501)
Depends on coder/sqlc#1
Fixes coder/internal#979
Updates coder/internal#973
2025-10-29 18:59:12 +00:00
Cian Johnston 50749d131b ci: workaround for get.helm.sh outage (#20552)
<!--

If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.

-->
2025-10-29 17:33:03 +00:00
Dean Sheather 9986dc0c38 chore: remove brazil region from dogfood (#20548) 2025-10-29 13:03:14 -04:00
Dean Sheather 92b63871ca chore: remove dogfood envbuilder template deployment (#20546)
Already missing in state, verified with `terraform state list`
2025-10-29 13:03:00 -04:00
Mathias Fredriksson 303e9ef7de fix: switch to coder/sqlc fork (#20536)
Refs https://github.com/coder/sqlc/pull/1
Unblocks https://github.com/coder/coder/pull/20501

Upstream https://github.com/sqlc-dev/sqlc/pull/4159
2025-10-29 18:45:56 +02:00
Cian Johnston 1ebc217624 fix: update task link AppStatus using task_id (#20543)
Fixes https://github.com/coder/coder/issues/20515

Alternative to https://github.com/coder/coder/pull/20519

Adds `task_id` to `workspaces_expanded` view and updates the "View Task"
link in `AppStatuses` component.

NOTE: this contains a migration
2025-10-29 15:45:45 +00:00
Danielle Maywood 06dbadab11 fix(coderd): ensure lifecycle executor has sufficient task permissions (#20539)
We recently made a change to the `wsbuilder` to handle task related
logic. Our test coverage for the lifecycle executor didn't handle this
scenario and so we missed that it had insufficient permissions.

This PR adds `Update` and `Read` permissions for `Task`s in the
lifecycle executor, as well as an autostart/autostop test tailored to
task workspaces to verify the change.

---

Anthropic's Claude Sonnet 4.5 Thinking was involved in writing the tests
2025-10-29 15:44:35 +00:00
Cian Johnston 566146af72 fix(coderd): fix audit log resource link for tasks (#20545)
Existing task audit log links were incorrect. As audit log links are
generated on-the-fly, this does not require backfill.
2025-10-29 15:31:41 +00:00
Susana Ferreira 7e8fcb4b0f perf: optimize prebuilds membership reconciliation to check orgs not presets (#20493)
## Description

The membership reconciliation ensures the prebuilds system user is a
member of all organizations with prebuilds configured. To support
prebuilds quota management, each organization must have a prebuilds
group that the system user belongs to.

## Problem

Previously, membership reconciliation iterated over all presets to check
and update membership status. This meant database queries
`GetGroupByOrgAndName` and `InsertGroupMember` were executed for each
preset. Since presets are unique combinations of `(organization,
template, template version, preset)`, this resulted in several redundant
checks for the same organization.

In dogfood, `InsertGroupMember` was called thousands of times per day,
even though memberships were already configured ([internal Grafana
dashboard link](https://grafana.dev.coder.com/goto/46MZ1UgDg?orgId=1))

<img width="5382" height="1788" alt="Screenshot 2025-10-28 at 16 01 36"
src="https://github.com/user-attachments/assets/757b7253-106f-4f72-8586-8e2ede9f18db"
/>

## Solution

This PR introduces `GetOrganizationsWithPrebuildStatus`, a single query
that returns:
* All unique organizations with prebuilds configured
* Whether the prebuilds user is a member of each organization
* Whether the prebuilds group exists in each organization
* Whether the prebuilds user is in the prebuilds group

The membership reconciliation logic now:
* Fetches status for all organizations in one query
* Only performs inserts for organizations missing required memberships
or groups
* Safely handles concurrent operations via unique constraint violations
* This reduces database load from `O(presets)` to `O(organizations)` per
reconciliation loop, with a single read query when everything is
configured.

## Changes

* Add `GetOrganizationsWithPrebuildStatus` SQL query
* Update `membership.ReconcileAll` to use organization-based
reconciliation instead of preset-based
* Update tests to reflect new behavior

Related to internal thread:
https://codercom.slack.com/archives/C07GRNNRW03/p1760535570381369
2025-10-29 14:24:29 +00:00
Danny Kopping dd28eef5b4 chore: update dogfood template to use new aibridge endpoint (#20525)
Also updating Nix to 2.28.5 since the previous version 404s.

Closes https://github.com/coder/internal/issues/1105
2025-10-29 07:53:34 -06:00
Danny Kopping 2f886ce8d0 chore: update docs (#20521)
Updates AI Bridge docs to remove experiment details.
2025-10-29 07:43:33 -06:00
Danny Kopping dcfd6d6f73 chore: graduate aibridge cli out of experimental (#20524)
<!--

If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.

-->
2025-10-29 07:36:08 -06:00
Danny Kopping b20fd6f2c1 chore: graduate aibridge API out of experimental (#20523)
<!--

If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.

-->
2025-10-29 07:18:54 -06:00
Danny Kopping 2294c55bd9 chore: graduate aibridged* packages out of experimental (#20522)
<!--

If you have used AI to produce some or all of this PR, please ensure you have read our [AI Contribution guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING) before submitting.

-->
2025-10-29 07:00:24 -06:00
Susana Ferreira aad1b401c1 feat: add prebuilds reconciliation duration metric (#20535)
## Description

Adds `coderd_prebuilds_reconciliation_duration_seconds` histogram metric
to track the duration of each prebuilds reconciliation cycle.

This metric helps operators monitor reconciliation performance and
identify potential bottlenecks.

## Changes

- Added `ReconcileStats` struct to capture reconciliation cycle
statistics
- Updated `ReconcileAll()` to return stats including elapsed time
- Added histogram metric
`coderd_prebuilds_reconciliation_duration_seconds`
2025-10-29 12:52:30 +00:00
Steven Masley a8294872a3 chore: use consistent function for statefile path (#20527)
We use `getStateFilePath` in 2 locations, and a manual `filepath.Join`
here. Just refactored to use the helper function
2025-10-29 07:44:09 -05:00
Danny Kopping 95a1ca898f chore: remove aibridge experiment (#20520)
Removes the experiment and all references to it
2025-10-29 06:18:38 -06:00
Susana Ferreira c3e3bb58f2 feat: delete pending canceled prebuilds (#20499)
## Description

PR https://github.com/coder/coder/pull/20387 introduced canceling
pending prebuild jobs from inactive template versions to avoid
provisioning obsolete workspaces. However, the associated prebuilds
remained in the database with "Canceled" status, visible in the UI.

This PR now orphan-deletes these canceled prebuilt workspaces. Since the
canceled jobs were never processed by a provisioner, no Terraform
resources were created, making orphan deletion safe.

Orphan deletion always creates a provisioner job, but behaves
differently based on provisioner availability:
- If no provisioner daemon is available, the job is immediately marked
as completed and the workspace is marked as deleted without any
provisioner processing
- If a provisioner daemon is available, it processes the delete job with
empty Terraform state (no actual resources to destroy)

The job cancellation and workspace deletion occur atomically in the same
transaction. We don't split this into two separate reconciliation runs
because there's no way to distinguish between system-canceled prebuilds
and user-canceled workspaces. If we deleted canceled workspaces in a
later run, we'd delete user-canceled workspaces that users may want to
keep for troubleshooting.

Note: This only applies to system-generated prebuilds from inactive
template versions.

## Changes

* Update `UpdatePrebuildProvisionerJobWithCancel` query to return job
ID, workspace ID, template ID, and template version preset ID
* Add `DeprovisionMode` enum to support orphan deletion in the provision
flow
* Update `ActionTypeCancelPending` handler to cancel jobs and
orphan-delete associated workspaces atomically
2025-10-29 10:37:28 +00:00
Atif Ali 0d765f56f7 chore: update terraform to 1.13.4 (#20532)
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Ethan Dickson <ethan@coder.com>
2025-10-29 07:34:41 +00:00
Asher 8b6f55c312 chore: improve remote mcp workspace and file prompts (#20475)
I have been experimenting (via blink) and these seem to have made the
LLM behave more intelligently and consistently when it comes to creating
workspaces and manipulating files.

Partially addresses https://github.com/coder/internal/issues/1047
2025-10-28 15:20:20 -08:00
Danielle Maywood 40fc337659 fix(site): fix react state violation in filetree create/update utils (#20483) 2025-10-28 21:41:02 +00:00
david-fraley f6df4c0ed8 docs: update release calendar for new patches (#20526) 2025-10-28 14:24:03 -07:00
Steven Masley 924afb753f feat: add more detailed init timings in build timeline (#20503)
This uses the `terraform init -json` logs to add more visibility into
the `init` phase of `plan`. The `init` logs omit `ms` precision, so we
have to use `time.Now()` 😢

Open TF PR: https://github.com/hashicorp/terraform/pull/37818
2025-10-28 15:47:52 -05:00
Callum Styan 45c43d4ec4 fix: refactor agent resource monitoring API to avoid excessive calls to DB (#20430)
This should resolve https://github.com/coder/internal/issues/728 by
refactoring the ResourceMonitorAPI struct to only require querying the
resource monitor once for memory and once for volumes, then using the
stored monitors on the API struct from that point on. This should
eliminate the vast majority of calls to `GetWorkspaceByAgentID` and
`FetchVolumesResourceMonitorsUpdatedAfter`/`FetchMemoryResourceMonitorsUpdatedAfter`
(millions of calls per week).

Tests passed, and I ran an instance of coder via a workspace with a
template that added resource monitoring every 10s. Note that this is the
default docker container, so there are other sources of
`GetWorkspaceByAgentID` db queries. Note that this workspace was running
for ~15 minutes at the time I gathered this data.

Over 30s for the `ResourceMonitor` calls:
```
coder@callum-coder-2:~/coder$ curl localhost:19090/metrics | grep ResourceMonitor | grep count
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsByAgentID"} 2
coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsUpdatedAfter"} 2
100  288k    0  288k    0     0  58.3M      0 --:--:-- --:--:-- --:--:-- 70.4M
coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsByAgentID"} 2
coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsUpdatedAfter"} 2
coderd_db_query_latencies_seconds_count{query="UpdateMemoryResourceMonitor"} 155
coderd_db_query_latencies_seconds_count{query="UpdateVolumeResourceMonitor"} 155
coder@callum-coder-2:~/coder$ curl localhost:19090/metrics | grep ResourceMonitor | grep count
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsByAgentID"} 2
coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsUpdatedAfter"} 2
100  288k    0  288k    0     0  34.7M      0 --:--:-- --:--:-- --:--:-- 40.2M
coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsByAgentID"} 2
coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsUpdatedAfter"} 2
coderd_db_query_latencies_seconds_count{query="UpdateMemoryResourceMonitor"} 158
coderd_db_query_latencies_seconds_count{query="UpdateVolumeResourceMonitor"} 158
```

And over 1m for the `GetWorkspaceAgentByID` calls, the majority are from
the workspace metadata stats updates:
```
coder@callum-coder-2:~/coder$ curl localhost:19090/metrics | grep GetWorkspaceByAgentID | grep count
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  284k    0  284k    0     0  42.4M      0 --:--:-- --:--:-- --:--:-- 46.3M
coderd_db_query_latencies_seconds_count{query="GetWorkspaceByAgentID"} 876
coder@callum-coder-2:~/coder$ curl localhost:19090/metrics | grep GetWorkspaceByAgentID | grep count
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  284k    0  284k    0     0  75.4M      0 --:--:-- --:--:-- --:--:-- 92.7M
coderd_db_query_latencies_seconds_count{query="GetWorkspaceByAgentID"} 918
```

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2025-10-28 13:38:16 -07:00
Danielle Maywood a1e7e105a4 chore: disable task notifications by default (#20518)
Relates to https://github.com/coder/internal/issues/1098

Currently task notifications are incredibly noisy. We should disable
them by default for the upcoming release whilst we iron them out.
2025-10-28 17:21:23 +00:00
david-fraley cf93c34172 docs: update coder_token_lifetime description to include units and examples (#20516) 2025-10-28 15:15:57 +00:00
Cian Johnston 659f89e079 feat(coderd): add owner-related fields to tasks_with_status view (#20471)
Relates to
https://github.com/coder/coder/pull/20431/files#diff-9cfc826a6ce7e77d977b2025482474dd263d12965b2a94479a74c7f1d872b782

If the workspace relating to a task was deleted, most of the
workspace-related fields in `taskFromDBTaskAndWorkspace` will be
zero-valued. However, we can still get information relating to the owner
so that "created by" shows up correctly in the UI.

Updates the `tasks_with_status` view with a join on `visible_users` to
get owner-related info.
2025-10-28 14:29:29 +00:00
Danielle Maywood e4e4669feb fix(agent/agentcontainers): remove unneeded default branch (#20511)
Closes https://github.com/coder/internal/issues/769

According to the `time.NewTicker` documentation [^1] (which is used
under the hood by https://github.com/coder/quartz) it will automatically
adjust the time interval to make up for slow receivers. This means we
should be safe to drop the default branch.

> NewTicker returns a new Ticker containing a channel that will send the
current time on the channel after each tick. The period of the ticks is
specified by the duration argument. The ticker will adjust the time
interval or drop ticks to make up for slow receivers. The duration d
must be greater than zero; if not, NewTicker will panic.

[^1]: https://pkg.go.dev/time#Ticker
2025-10-28 12:16:42 +00:00
Mathias Fredriksson a1fa58ac17 fix: update dbgen and dbfake task creation and toolsdk test fixtures (#20508)
Depends on #20506
Fixes coder/internal#1103
2025-10-28 14:15:58 +02:00
Hugo Dutka 88b7372e7f chore: remove custom go cache download step from CI (#20510)
Since depot added [native support for go
cache](https://depot.dev/docs/cache/reference/gocache), custom cache
download and upload steps are not necessary anymore.
2025-10-28 12:58:25 +01:00
Dean Sheather dec6d310a8 fix: avoid bad switch statement in license code (#20509)
Noticed this while trying to investigate a flake.

Relates to https://github.com/coder/internal/issues/788
2025-10-28 06:19:52 +00:00
Spike Curtis e720afa9d0 docs: add description of dynamic parameters test (#20488)
## Add Dynamic Parameters test procedure to 10k users validated architecture

This PR adds a new test procedure for Dynamic Parameters to the 10k users validated architecture documentation. No changes to the recommended hardware specs as this test case succeeded with no issues.
2025-10-28 10:11:25 +04:00
Danny Kopping d18441debe feat: add AWS Bedrock support (#20507)
Depends on https://github.com/coder/aibridge/pull/44

Closes https://github.com/coder/aibridge/issues/28

---------

Signed-off-by: Danny Kopping <danny@coder.com>
2025-10-28 03:38:14 +00:00
ケイラ 4f7b279fd8 feat: add an organization member permission level (#19953) 2025-10-27 17:14:16 -06:00
Mathias Fredriksson c3cbd977f1 fix(coderd/database/dbfake): use transaction for workspace builder (#20506)
While investigating a flake I noticed that the dbfake workspace builder
executes all database inserts without a transaction. Since our real
wsbuilder implementation utilizes one it makes sense to do here as well.

For example, our normal workspace <-> build relationship is such that a
workspace cannot exist with at least one build. However, our
GetWorkspaces query left joins workspace builds but has types that are
non-nullable, leading to flakes like coder/internal#1103.
2025-10-28 01:06:52 +02:00
ケイラ d8b1ca70d6 chore: remove @Parkreiner from CODEOWNERS (#20504) 2025-10-27 11:55:33 -06:00