Commit Graph

3372 Commits

Author SHA1 Message Date
Susana Ferreira d79a7797c2 fix: exclude prebuilt workspaces from template-level lifecycle updates (#19265)
## Description

This PR ensures that lifecycle-related changes made via template
schedule updates do **not affect prebuilt workspaces**. Since prebuilds
are managed by the reconciliation loop and do not participate in the
regular lifecycle executor flow, they must be excluded from any updates
triggered by template configuration changes.

This includes changes to TTL, dormant-deletion scheduling, deadline and
autostart scheduling.

## Changes

- Updated SQL query `UpdateWorkspacesTTLByTemplateID` to exclude
prebuilt workspaces
- Updated SQL query `UpdateWorkspacesDormantDeletingAtByTemplateID` to
exclude prebuilt workspaces
- Updated application-layer logic to skip any updates to lifecycle
parameters if a workspace is a prebuild
- Preserved all existing update behavior for regular user workspaces

This change guarantees that only lifecycle-managed workspaces are
affected when template-level configurations are modified, preserving
strict boundaries between prebuild and user workspace lifecycles.

Related with: 
* Issue: https://github.com/coder/coder/issues/18898
* PR: https://github.com/coder/coder/pull/19252
2025-08-19 13:08:01 +01:00
Kacper Sawicki 9edceef0bf feat(coderd): add support for external agents to API's and provisioner (#19286)
This pull request introduces support for external workspace management, allowing users to register and manage workspaces that are provisioned and managed outside of the Coder.

Depends on: https://github.com/coder/terraform-provider-coder/pull/424

* GET /api/v2/init-script - Gets the agent initialization script
  * By default, it returns a script for Linux (amd64), but with query parameters (os and arch) you can get the init script for different platforms
* GET /api/v2/workspaces/{workspace}/external-agent/{agent}/credentials - Gets credentials for an external agent **(enterprise)**
* Updated queries to filter workspaces/templates by the has_external_agent field
2025-08-19 10:41:33 +02:00
Kacper Sawicki 5e4aa79a9d feat(coderd): add has_external_agent flag to template_versions and workspace_builds (#19285)
This pull request introduces support for external workspace management, allowing users to register and manage workspaces that are provisioned and managed outside of the Coder.

* Added has_external_agent field to workspace builds and template versions
2025-08-19 10:29:51 +02:00
Danielle Maywood a8c89a120f fix: increase timeout for watch workspace agent devcontainers test (#19390)
Relates to https://github.com/coder/internal/issues/907

The test can take around 10s when it is the only one running, so in a
constrained environment like CI it makes sense that it still hits the 25
second timeout. For now we up the limit to 60 seconds until the test is
rewritten to greatly reduce the time taken.
2025-08-18 09:56:39 +01:00
Ethan fdc9dfae89 test(coderd/database/dbpurge): use mock db in TestPurge (#19386)
Closes https://github.com/coder/internal/issues/906

This test was using dbmem until we removed it. The test just makes sure the background job runs at all, so a mock db continues to be fine here.

No other tests in this package used dbmem, so this is the only test I've changed.
2025-08-18 15:25:52 +10:00
Dean Sheather 8f9f0cda11 chore: avoid DNS lookups for DERP in tests (#19385)
Closes https://github.com/coder/internal/issues/886
2025-08-18 03:44:37 +00:00
Rowan Smith 93279dff24 chore: change format of key from uuid to string to fix swagger issue (#19380)
ref: https://codercom.slack.com/archives/C014JH42DBJ/p1755192759211289

this change allows api keys to be deleted via swagger
2025-08-18 12:37:51 +10:00
Danielle Maywood 4ca69af867 fix: increase timeout for watch workspace agent devcontainers test (#19376) 2025-08-15 19:11:19 +01:00
Callum Styan 6c902a7410 fix: don't create autostart workspace builds with no available provisioners (#19067)
This should fix https://github.com/coder/coder/issues/17941 by introducing a check for whether there are any valid (non-stale provisioners for a job in the autobuild executor code path.

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2025-08-15 08:50:51 -07:00
Dean Sheather a25d85631b chore: add usage tracking package (#19095)
Not used in coderd yet, see stack.

Adds two new packages:
- `coderd/usage`: provides an interface for the "Collector" as well as a stub implementation for AGPL
- `enterprise/coderd/usage`: provides an interface for the "Publisher" as well as a Tallyman implementation

Relates to https://github.com/coder/internal/issues/814
2025-08-16 01:31:00 +10:00
Danielle Maywood 205eb29e60 fix: stop reading closed channel for /watch devcontainers endpoint (#19373)
Fixes https://github.com/coder/coder/issues/19372

We increase the read limit to 4MiB (we use this limit elsewhere). We
also make sure to stop sending messages when `containersCh` becomes
closed.
2025-08-15 12:32:33 +01:00
Sas Swart a9f607afd8 test: add an ergonomic way to access the test database for debugging (#19371)
Accessing the database during debugging currently requires either
spinning up a separate PostgreSQL instance or inspecting memory to
retrieve the DSN—both of which add unnecessary friction. While the test
suite already provisions a database by default, connecting to it for
manual inspection or debugging is not straightforward.

This change introduces a clearer and more accessible way to surface the
DSN during debugging sessions, allowing developers to connect to the
test database directly without relying on external infrastructure or ad
hoc methods.

Expected Usage:
1. Debug using dlv or the IDE.
2. Step through line by line and determine that a query isn't doing what
you'd expect
3. No further insight to be gained at the Go level
4. The next place to test is to connect directly to the database while
it is in the exact state that the test has produced just before running
the query
5. Rerun the test with this option enabled and your breakpoint set right
before the questionable query runs
6. Connect to the database and inspect or troubleshoot as you need to
2025-08-15 11:25:20 +02:00
Ethan d7bdb3cdef ci: add paralleltestctx to lint/go (#19369)
Closes https://github.com/coder/internal/issues/884

We're adding this as a `go run` in `lint/go` for now, since adding it to
golangci-lint ourselves involves recompiling golangci-lint and then
running that new binary. I'll look into proposing it being added to the
public golangci-lint linters.

Doesn't appear to cause the lint ci job to take any longer, which is
nice.
2025-08-15 16:16:18 +10:00
Steven Masley 4926410146 feat: keep original token refresh error in external auth (#19339)
External auth refresh errors lose the original error thrown on the first
refresh. This PR saves that error to the database to be raised on
subsequent refresh attempts
2025-08-14 09:50:31 -05:00
Susana Ferreira 734299de71 fix: disallow lifecycle endpoints for prebuilt workspaces (#19264)
## Description

This PR updates the API to prevent lifecycle configuration endpoints
from being used on prebuilt workspaces. Since prebuilds are managed by
the reconciliation loop and do not participate in the regular workspace
lifecycle, they must not support per-workspace overrides for fields like
deadline, TTL, autostart, or dormancy.

Attempting to use these endpoints on a prebuilt workspace will now
return a clear validation error (`409 Conflict`) with an appropriate
explanation. This prevents accidental misconfiguration and preserves the
lifecycle separation between prebuilds and regular workspaces.

## Changes

The following endpoints now return an error if the target workspace is a
prebuild:

* `PUT /workspaces/{workspace}/extend`
* `PUT /workspaces/{workspace}/ttl`
* `PUT /workspaces/{workspace}/autostart`
* `PUT /workspaces/{workspace}/dormant`

Update endpoints logic to use the API clock in order to allow time
mocking in tests.

Related with: 
* Issue: https://github.com/coder/coder/issues/18898
* PR: https://github.com/coder/coder/pull/19252
2025-08-14 11:30:19 +01:00
Hugo Dutka af97b78e76 chore(coderd/database/dbauthz): migrate TestTemplate to use mocked DB (#19304)
Related to https://github.com/coder/internal/issues/869

---------

Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-08-14 11:32:53 +02:00
Hugo Dutka e10f29c481 chore(coderd/database/dbauthz): migrate File, Group, APIKey, AuditLogs, and ConnectionLogs tests to mocked db (#19299)
Related to https://github.com/coder/internal/issues/869

---------

Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-08-13 09:30:48 -05:00
Susana Ferreira 8567ecbe52 fix: set prebuilds lifecycle parameters on creation and claim (#19252)
## Description

This PR ensures that prebuilt workspaces are properly excluded from the
lifecycle executor and treated as a separate class of workspaces, fully
managed by the prebuild reconciliation loop.

It introduces two lifecycle guarantees:
* When a prebuilt workspace is created (i.e., when the workspace build
completes), all lifecycle-related fields are unset, ensuring the
workspace does not participate in TTL, autostop, autostart, dormancy, or
auto-deletion logic.
* When a prebuilt workspace is claimed, it transitions into a regular
user workspace. At this point, all lifecycle fields are correctly
populated according to template-level configurations, allowing the
workspace to be managed by the lifecycle executor as expected.

## Changes

* Prebuilt workspaces now have all lifecycle-relevant fields unset
during creation
* When a prebuild is claimed:
* Lifecycle fields are set based on template and workspace level
configurations. This ensures a clean transition into the standard
workspace lifecycle flow.
* Updated lifecycle-related SQL update queries to explicitly exclude
prebuilt workspaces.

## Relates 

Related issue: https://github.com/coder/coder/issues/18898

To reduce the scope of this PR and make the review process more
manageable, the original implementation has been split into the
following focused PRs:
* https://github.com/coder/coder/pull/19259
* https://github.com/coder/coder/pull/19263
* https://github.com/coder/coder/pull/19264
* https://github.com/coder/coder/pull/19265

These PRs should be considered in conjunction with this one to
understand the complete set of lifecycle separation changes for prebuilt
workspaces.
2025-08-13 12:45:46 +01:00
Ethan 791d39c261 test(coderd/database): use seperate context for subtests to fix flake (#19330)
Fixes flakes like
https://github.com/coder/coder/actions/runs/16927282256/job/47965470039

https://coder.com/blog/go-testing-contexts-and-t-parallel

...I'm going to take a stab at turning this into a lint rule. I think
it's possible by just reading the AST?
2025-08-13 14:45:35 +10:00
Rafael Rodriguez aab2ccdb38 fix!: support empty or default fields when updating templates (#19256)
Breaking change: Field types in `codersdk.UpdateTemplateMeta` for
`Icon`, `Description`, and `DisplayName` moved to `*string`

## Summary

In this pull request we're updating the `UpdateTemplateMeta` struct to
allow `DisplayName`, `Description`, and `Icon` to be set as empty `""`
or default to the value from the template if not provided in an update
call.

Fixes https://github.com/coder/coder/issues/19036

### The bug

The reported bug occurred when clients were attempting to update a
metadata field in a template via an edit call. When the request was
decoded into an `UpdateTemplateMeta` struct the default values for
fields in the struct were used to update the template even if they
weren't provided. This led to fields like `Icon` being set to `""` (the
default value).

### Changes

To allow for specific fields to be set to `""` these fields were updated
to be `*string` as opposed to `string`. This allows for clients to set
these fields as `""` in an update request or they will default to the
template value if they are not provided in the update request (will be
`nil`).

Added tests to confirm empty and nil values and updated other tests that
use these fields.
2025-08-12 11:37:44 -05:00
Danielle Maywood f349edcc3c refactor: create tasks in coderd instead of frontend (#19280)
Instead of creating tasks with a specialized call to `CreateWorkspace`
on the frontend, we instead lift this to the backend and allow the
frontend to simply call `CreateAITask`.
2025-08-12 11:23:55 +01:00
Hugo Dutka cda1a3a593 chore(coderd/database/dbauthz): migrate TestUser to mocked db (#19305)
Related to https://github.com/coder/internal/issues/869

---------

Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2025-08-12 09:20:12 +00:00
Andrew Aquino b8c9192d0b fix: generalize password invalid message (#19307)
fixes #19044 

password over 64 characters means it's _too_ strong; now the error
message is applicable to this case
2025-08-11 12:20:02 -07:00
Mathias Fredriksson 1b66495b70 fix(coderd/prometheusmetrics)!: filter deleted wsbuilds to reduce db load (#19197)
This change removes the `GetLatestWorkspaceBuilds` query which includes
all workspaces for all time (including deleted). This allows us to also
stop using `GetProvisionerJobsByIDs` for said builds as the job status
is included in `GetWorkspaces` called separately.

**BREAKING CHANGE**: The `coderd_api_workspace_latest_build` Prometheus
metric no longer includes builds belonging to deleted workspaces, as
such, this metric will show fewer statuses.

Fixes coder/internal#717
2025-08-11 14:48:31 +03:00
Steven Masley ce935657f6 test: start migrating dbauthz tests to mocked db (#19257)
This PR adds a framework to move to a mocked db. And therefore massively speed up these tests.
2025-08-08 13:46:24 -05:00
Cian Johnston 155c7bbc65 chore(coderd/provisionerdserver): avoid fk error on invalid ai_task_sidebar_app_id (#19253)
This is a workaround for https://github.com/coder/coder/issues/18776

We avoid the foreign key issue by checking the previously inserted
workspace applications before calling UpdateWorkspaceAITask. Now,
affected workspaces will show as "not running an AI task" on the single
task view, which is technically correct.

We also insert a provisioner job log at WARN level to ensure that the
user sees some information that they have run into this issue, as well
as logging on the server side.

Longer term, we plan to modify how the workspace tasks view is
presented. This is a stopgap measure until we solidify that plan.

NOTE: this does **not** address the fact that stopping a workspace with
`has_ai_task: true` will result in the completed stop build no longer
having `has_ai_task: true`, resulting in tasks "disappearing" on stop.
2025-08-08 18:06:32 +01:00
Cian Johnston afb54f6884 chore: revert feat(enterprise/coderd): allow system users to be added to groups (#19254)
This reverts commit b200fc8e67
(https://github.com/coder/coder/pull/18341).
2025-08-08 12:18:07 +01:00
Sas Swart b200fc8e67 feat(enterprise/coderd): allow system users to be added to groups (#18341)
closes https://github.com/coder/coder/issues/18274

This pull request makes system users visible in various group related
queries so that they can be added to and removed from groups. This
allows system user quotas to be configured. System users are still
ignored in certain queries, such as when license seat consumption is
determined.

This pull request further ensures the existence of a
"coder_prebuilt_workspaces" group in any organization that needs
prebuilt workspaces

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
  * Organization and group member listings now include system users.
* **Bug Fixes**
* Updated tests to reflect the inclusion of system users in member and
group queries.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-08-08 11:03:17 +02:00
ケイラ 7bb52e1f8a test: add tests for updating workspace acl (#19240) 2025-08-07 17:09:46 -06:00
Steven Masley 0a3afeddc8 chore: add more pprof labels for various go routines (#19243)
- ReplicaSync
- Notifications
- MetricsAggregator
- DBPurge
2025-08-07 20:05:32 +00:00
Yevhenii Shcherbina c65996a041 feat: add user_secrets table (#19162)
Closes https://github.com/coder/internal/issues/780

## Summary of changes:
- added `user_secrets` table
- `user_secrets` table contains `env_name` and `file_path` fields which
are not used at the moment, but will be used in later PRs
- `user_secrets` table doesn't contain `value_key_id`, I will add it in
a separate migration in a dbcrypt PR
- on one hand I don't want to add fields which are not used (because
it's a risk smth may change in implementation later), on the other hand
I don't want to add too many migrations for user secrets table
- added unique sql indexes
- added sql queries for CRUD operations on user-secrets
- introduced new `ResourceUserSecret` resource
- basic unit-tests for CRUD ops and authorization behavior
- Role updates:
  - owner:
    - remove `ResourceUserSecret` from site-wide perms
    - add `ResourceUserSecret` to user-wide perms
   - orgAdmin
- remove `ResourceUserSecret` from org-wide perms; seems it's not
strictly required, because `ResourceUserSecret` is not tied to
organization in dbauthz wrappers?
   - memberRole
- no need to change memberRole because it implicitly has access to
user-secrets thanks to the `allPermsExcept`
   - is it enough changes to roles?
   
Main questions:
- [ ] We will have 2 migrations for user-secrets:
  - initial migration (in current PR)
  - adding `value_key_id` in dbcrypt PR
  - is this approach reasonable?
- [ ] Are changes to roles's permissions are correct?
- [ ] Are changes in roles_test.go are correct?

---------

Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>
2025-08-07 15:58:59 -04:00
Steven Masley 34c46c0748 chore: rename service -> coder_service, remove agent_id label (#19241)
Pyroscope uses `service` tag for top level distinction. So move our
`service` -> `coder_service`
2025-08-07 13:58:39 -05:00
Steven Masley 8ba8b4f061 chore: add profiling labels for pprof analysis (#19232)
PProf labels segment the code into groups for determing the source of
cpu/memory profiles. Since the web server and background jobs share a
lot of the same code (eg wsbuilder), it helps to know if the load is
user induced, or background job based.
2025-08-07 11:21:17 -05:00
ケイラ 26458cd6f0 refactor: consolidate template and workspace acl validation (#19192) 2025-08-07 10:14:58 -06:00
Dean Sheather 82d5a20762 fix: fix flake in workspace TTL test caused by new constraint (#19213) 2025-08-07 07:16:01 +00:00
Ethan 99d75cc000 fix: use system context for querying workspaces when deleting users (#19211)
Closes #19209.

In `templates.go`, we do this to make sure we count ALL workspaces for a template before we try and delete that template:
https://github.com/coder/coder/blob/dc598856e3be0926573dbbe2ec680e95a139093a/coderd/templates.go#L81-L99

However, we weren't doing the same when attempting to delete users, leading to the linked issue. We can solve the issue the same way as we do for templates.
2025-08-07 16:31:27 +10:00
Dean Sheather 9edca92c74 fix: fix incorrect migration in constraint (#19212) 2025-08-07 06:19:00 +00:00
Dean Sheather dc598856e3 chore: improve build deadline code (#19203)
- Adds/improves a lot of comments to make the autostop calculation code
clearer
- Changes the behavior of the enterprise template schedule store to
match the behavior of the workspace TTL endpoint when the new TTL is
zero
- Fixes a bug in the workspace TTL endpoint where it could unset the
build deadline, even though a max_deadline was specified
- Adds a new constraint to the workspace_builds table that enforces the
deadline is non-zero and below the max_deadline if it is set
- Adds CHECK constraint enum generation to scripts/dbgen, used for
testing the above constraint
- Adds Dean and Danielle as CODEOWNERS for the autostop calculation code
2025-08-07 11:00:31 +10:00
Steven Masley 3024bdebcb chore: support 'me' as the username for template author (#19204)
`author:me` to find my templates. Much nicer than knowing my own
username
2025-08-06 11:45:07 -05:00
Steven Masley 5b80c47e8c feat: add author filter command to template filtering (#19202)
Can do `author:username` to filter templates created by a certain
author. Adding to help clean out some templates that I created on our
dev instance. This makes sorting a bit easier.
2025-08-06 10:41:01 -05:00
Danielle Maywood cc609cbd53 fix(site): display tasks link when no templates contain an AI task (#19184)
Fixes https://github.com/coder/coder/issues/19059

This change ensures we always show the "Tasks" link, even when a deployment doesn't yet have a tasks template set up.
2025-08-05 19:30:17 +01:00
Cian Johnston b0ba798f78 chore(coderd/provisionerdserver): remove dangerous usage of dbtestutil.DisableForeignKeysAndTriggers (#19122)
May have caught https://github.com/coder/coder/issues/18776
2025-08-04 21:51:33 +01:00
Hugo Dutka 79cd80e5ca feat: add MCP tools for ChatGPT (#19102)
Addresses https://github.com/coder/internal/issues/772.

Adds the toolset query parameter to the `/api/experimental/mcp/http` endpoint, which, when set to "chatgpt", exposes new `fetch` and `search` tools compatible with ChatGPT, as described in the
[ChatGPT docs](https://platform.openai.com/docs/mcp). These tools are
exposed in isolation because in my usage I found that ChatGPT refuses to
connect to Coder if it sees additional MCP tools.

<img width="1248" height="908" alt="Screenshot 2025-07-30 at 16 36 56"
src="https://github.com/user-attachments/assets/ca31e57b-d18b-4998-9554-7a96a141527a"
/>
2025-08-04 14:11:22 +02:00
Spike Curtis d4b44185da chore: add database dump and dbfake logging (#19144)
relates to  #778

Somehow in `TestWorkspaceAgent` the agent with the test instance identifier is not being added to the database, or is getting deleted.

I'm adding some additional logging to `dbfake` and setting the affected tests to dump postgres on error, to see if we can get to the bottom of the issue.
2025-08-04 13:22:04 +04:00
ケイラ 1cffd11619 feat: add workspace sharing page (#19107) 2025-07-31 15:05:09 +00:00
ケイラ ed62ddc38e chore: add workspace-sharing experiment (#19106) 2025-07-31 07:52:57 -06:00
Benjamin Peinhardt e4dc2d9418 fix: add constraint and runtime check for provisioner logs size limit (#18893)
This PR sets a constraint of 1MB on the provisioner job logs written to
the database. This is consistent with the constraint we place on
workspace agent logs:
https://github.com/coder/coder/blob/4ac6be6d835dc36c242e35a26b584b784040bf28/coderd/database/dump.sql#L2030

It also adds a message printed to the front end about the provisioner
log overflow, and updates the message printed to the front end when
workspace startup logs exceed the max, as it was causing some customers
to think their startup script had failed to run.
2025-07-30 19:09:53 -05:00
ケイラ eeb0bbefb9 feat: implement acl for workspaces (#19094) 2025-07-30 17:02:51 -06:00
Callum Styan d736af1fa3 fix: handle potential DB conflict due to concurrent upload requests in postFile (#19005)
This issue manifests when users have multiple templates which rely on
the same files, for example see:
https://github.com/coder/coder/issues/17442

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2025-07-30 13:55:30 -07:00
Callum Styan ffbfaf2a6f feat: allow bypassing current CORS magic based on template config (#18706)
Solves https://github.com/coder/coder/issues/15096

This is a slight rework/refactor of the earlier PRs from @dannykopping
and @Emyrk:
- https://github.com/coder/coder/pull/15669
- https://github.com/coder/coder/pull/15684
- https://github.com/coder/coder/pull/17596

Rather than having a per-app CORS behaviour setting and additionally a
template level setting for ports, this PR adds a single template level
CORS behaviour setting that is then used by all apps/ports for
workspaces created from that template.

The main changes are in `proxy.go` and `request.go` to:
a) get the CORS behaviour setting from the template
b) have `HandleSubdomain` bypass the CORS middleware handler if the
selected behaviour is `passthru`
c) in `proxyWorkspaceApp`, do not modify the response if the selected
behaviour is `passthru`

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added support for configuring CORS behavior ("simple" or "passthru")
at the template level for all shared ports.
* Introduced a new "CORS Behavior" setting in the template creation and
settings forms.
* API endpoints and responses now include the optional `cors_behavior`
property for templates.
* Workspace apps and proxy now honor the specified CORS behavior,
enabling conditional CORS middleware application.
* Enhanced workspace app tests with comprehensive scenarios covering
CORS behaviors and authentication states.

* **Bug Fixes**
  * None.

* **Documentation**
* Updated API and admin documentation to describe the new
`cors_behavior` property and its usage.
* Added examples and schema references for CORS behavior in relevant API
docs.

* **Tests**
* Extended automated tests to cover different CORS behavior scenarios
for templates and workspace apps.

* **Chores**
* Updated audit logging to track changes to the `cors_behavior` field on
templates.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2025-07-30 13:42:39 -07:00