Commit Graph

89 Commits

Author SHA1 Message Date
George K cc2efe9e1f feat(coderd/rbac): make organization-member a per-org system custom role (#21359)
Migrated the built-in organization-member role to DB storage so it can be customized per org.

Closes https://github.com/coder/internal/issues/1073 (part 1)
2026-01-12 18:19:19 -08:00
Spike Curtis bddb808b25 chore: arrange imports in a standard way (#21452)
Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example:

```
import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"golang.org/x/xerrors"
	"gopkg.in/natefinch/lumberjack.v2"

	"cdr.dev/slog/v3"
	"github.com/coder/coder/v2/codersdk/agentsdk"
	"github.com/coder/serpent"
)
```

3 groups: standard library, 3rd partly libs, Coder libs.

This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.
2026-01-08 15:24:11 +04:00
Spike Curtis 49b34a716a fix: fix slog to always use array of Fields (#21426)
Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptible call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder).

It also updates dependencies that also use slog and were updated.

I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule.

Other dependencies, I pushed new tags.
2026-01-08 10:29:41 +04:00
Danielle Maywood f45a179181 test: move context to after db creation (#21224)
Closes  https://github.com/coder/internal/issues/1040

We move the context to just before it is used to avoid the scenario
where NewDB takes a while to spin up and runs up the context to the
deadline.
2025-12-11 21:51:16 +00:00
George K da71e546bb chore: fix test errors on newer debian-based systems due to deprecated TZ (#21115)
It appears on newer Debian systems `Canada/Newfoundland` TZ is not
present and `America/St_Johns` should be used instead. Coder tests use a
docker PG image where `Canada/Newfoundland` is still supported:

```
$ docker run --rm -it us-docker.pkg.dev/coder-v2-images-public/public/postgres:17 bash
root@ca99e82721dc:/# ls -l /usr/share/zoneinfo/Canada/Newfoundland
lrwxrwxrwx 1 root root 19 Mar 26  2025 /usr/share/zoneinfo/Canada/Newfoundland -> ../America/St_Johns
```
However, if a local PG instance is running on a Debian Trixie host,
coder test will use it and error out due to the zone being unavailable:

```
$ docker run --rm -it debian:trixie bash
root@f285092767e4:/# ls -l /usr/share/zoneinfo/Canada/Newfoundland
ls: cannot access '/usr/share/zoneinfo/Canada/Newfoundland': No such file or directory
root@f285092767e4:/# ls -l /usr/share/zoneinfo/America/St_Johns
-rw-r--r-- 1 root root 3655 Aug 24 20:12 /usr/share/zoneinfo/America/St_Johns
```

... which causes the tests to error out:

```
$ go test ./enterprise/coderd
--- FAIL: TestWorkspaceTemplateParamsChange (0.13s)
    workspaces_test.go:3097: TestWorkspaceTagsTerraform: using cached terraform providers
    workspaces_test.go:3097: Set TF_CLI_CONFIG_FILE=/home/geo/.cache/coderv2-test/terraform_workspace_tags_test/a28ed341dee8/terraform.rc
    coderdenttest.go:84:
        	Error Trace:	/home/geo/coder/coderd/database/dbtestutil/db.go:161
        	            				/home/geo/coder/coderd/database/dbtestutil/db.go:122
        	            				/home/geo/coder/coderd/coderdtest/coderdtest.go:270
        	            				/home/geo/coder/enterprise/coderd/coderdenttest/coderdenttest.go:105
        	            				/home/geo/coder/enterprise/coderd/coderdenttest/coderdenttest.go:84
        	            				/home/geo/coder/enterprise/coderd/coderdenttest/coderdenttest.go:84
        	            				/home/geo/coder/enterprise/coderd/workspaces_test.go:3103
        	Error:      	Received unexpected error:
        	            	pq: invalid value for parameter "TimeZone": "Canada/Newfoundland"
        	Test:       	TestWorkspaceTemplateParamsChange
        	Messages:   	failed to set timezone for database
...
```

This commit replaces the problematic TZ with the canonical one.
2025-12-10 08:09:13 -08:00
Steven Masley cefe07d074 feat: purge expired api keys in dbpurge (#20863)
closes https://github.com/coder/coder/issues/19889

This is in response to a migration in v2.27 that takes very long on deployments with large `api_key` tables.
2025-11-24 10:24:32 -06:00
Mathias Fredriksson 1483fd11ff fix(coderd/database): improve task status in tasks_with_status view (#20683)
This change restructures the `tasks_with_status` view query to:

- Improve debuggability by adding a `status_debug` column to better
understand the outcome
- Reduce clutter from `bool_or`, `bool_and` which are aggregate
functions that did not actually have serve a purpose (each join is 0-1
rows)
- Improve agent lifecycle state coverage, `start_timeout` and
`start_error` were omitted
- These states are easy to trigger even in a perfectly functioning
workspace/task so we now rely on app health to report whether or not
there was an issue
- Mark canceling and canceled workspace build jobs as error state
- Agent stop states were implicitly `unknown`, now there are explicit (I
initially considered `error`, could go either way)
2025-11-14 19:52:26 +02:00
Mathias Fredriksson a1fa58ac17 fix: update dbgen and dbfake task creation and toolsdk test fixtures (#20508)
Depends on #20506
Fixes coder/internal#1103
2025-10-28 14:15:58 +02:00
Paweł Banaszewski 50ba223aa1 feat: add db query for setting interception ended_at field (#20437)
Adds UpdateAIBridgeInterceptionEnded query to mark interceptions as
done.
Needed for https://github.com/coder/internal/issues/1051
2025-10-27 09:51:37 +01:00
Mathias Fredriksson 5c802c2627 feat(coderd): use task data model when creating a new task (#20275)
Updates coder/internal#976
2025-10-23 19:12:09 +03:00
Hugo Dutka e62c5db678 chore: remove references to dbtestutil.WillUsePostgres (#20436)
Addresses https://github.com/coder/internal/issues/758.

This PR only cleans up dead code, it makes no changes to test logic.
2025-10-23 14:24:54 +02:00
Cian Johnston dc6e50d6b7 feat(coderd/telemetry): add telemetry for database Tasks (#20279)
Adds Tasks to telemetry snapshots

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2025-10-17 10:48:56 +01:00
Mathias Fredriksson 82945cfb16 fix(coderd/database): add missing columns to tasks with status (#20311)
Updates coder/internal#976
2025-10-15 16:34:33 +00:00
Cian Johnston 9f229370e7 feat(coderd/database): add ListTasks query (#20282)
Relates to https://github.com/coder/internal/issues/981

Adds a `ListTasks` query that allows filtering by OwnerID and OrganizationID.
2025-10-14 17:33:30 +01:00
Mathias Fredriksson 5dc57da6b4 fix(coderd/database): ensure task name uniqueness (#20236)
This change ensures task names are unique per user the same way we do
for workspaces. This ensures we don't create tasks that are impossible
to start due to another task being named the same creating a workspace
name conflict.

Updates coder/internal#948
Supersedes coder/coder#20212
2025-10-13 12:42:38 +03:00
Mathias Fredriksson 952c69f412 feat(coderd/database): add task status and status view (#20235)
This change updates the `task_workspace_apps` table structure for
improved linking to workspace builds and adds queries to manage tasks
and a view to expose task status.

Updates coder/internal#948
Supersedes coder/coder#20212
Supersedes coder/coder#19773
2025-10-13 12:25:58 +03:00
Dean Sheather 39bf3ba628 chore: replace GetManagedAgentCount query with aggregate table (#19636)
- Removes GetManagedAgentCount query
- Adds new table `usage_events_daily` which stores aggregated usage
events by the type and UTC day
- Adds trigger to update the values in this table when a new row is
inserted into `usage_events`
- Adds a migration that adds `usage_events_daily` rows for existing data
in `usage_events`
- Adds tests for the trigger
- Adds tests for the backfill query in the migration

Since the `usage_events` table is unreleased currently, this migration
will do nothing on real deployments and will only affect preview
deployments such as dogfood.

Closes https://github.com/coder/internal/issues/943
2025-08-30 03:39:37 +10:00
Steven Masley ef0d74fb75 chore: improve performance of 'GetLatestWorkspaceBuildsByWorkspaceIDs' (#19452)
Closes https://github.com/coder/internal/issues/716

This prevents a scan over the entire `workspace_build` table by removing
a `join`. This is still imperfect as we are still scanning over the
number of builds for the workspaces in the arguments. Ideally we would
have some index or something precomputed. Then we could skip scanning
over the builds for the correct workspaces that are not the latest.
2025-08-26 09:26:11 -05:00
Rafael Rodriguez ad5e6785f4 feat: add filtering options to provisioners list (#19378)
## Summary

In this pull request we're adding support for additional filtering
options to the `provisioners list` CLI command and the
`/provisionerdaemons` API endpoint.

Resolves: https://github.com/coder/coder/issues/18783

### Changes

#### Added CLI Options

- `--show-offline`: When this option is provided, all provisioner
daemons will be returned. This means that when `--show-offline` is not
provided only `idle` and `busy` provisioner daemons will be returned.
- `--status=<list_of_statuses>`: When this option is provided with a
comma-separated list of valid statuses (`idle`, `busy`, or `offline`)
only provisioner daemons that have these statuses will be returned.
- `--max-age=<duration>`: When this option is provided with a valid
duration value (e.g., `24h`, `30s`) only provisioner daemons with a
`last_seen_at` timestamp within the provided max age will be returned.

#### Query Params

- `?offline=true`: Include offline provisioner daemons in the results.
Offline provisioner daemons will be excluded if `?offline=false` or if
offline is not provided.
- `?status=<list_of_statuses>`: Include provisioner daemons with the
specified statuses.
- `?max_age=<duration>`: Include provisioner daemons with a
`last_seen_at` timestamp within the max age duration.

#### Frontend

- Since offline provisioners will not be returned by default anymore
(`--show-offline` has to be provided to see them), a checkbox was added
to the provisioners list page to allow for offline provisioners to be
displayed
- A revamp of the provisioners page will be done in:
https://github.com/coder/coder/issues/17156, this checkbox change was
just added to maintain currently functionality with the backend updates

Current provisioners page (without checkbox)

<img width="1329" height="574" alt="Screenshot 2025-08-20 at 10 51
00 AM"
src="https://github.com/user-attachments/assets/77b73650-0b62-44f0-a77f-acbe5710809f"
/>

Provisioners page with checkbox (unchecked)

<img width="1314" height="626" alt="Screenshot 2025-08-20 at 10 48
40 AM"
src="https://github.com/user-attachments/assets/7ba164ad-6d3f-417b-bd39-338c0161b145"
/>

Provisioner page with checkbox (checked) and URL updated with query
parameters

<img width="1306" height="597" alt="Screenshot 2025-08-20 at 10 50
14 AM"
src="https://github.com/user-attachments/assets/e78d0986-bbf8-491b-9d56-b682973237a0"
/>

### Show Offline vs Offline Status

To list offline provisioner daemons, users can either:

1. Include the `--show-offline` option

OR

2. Include `offline` in the list of values provided to the `--status`
option
2025-08-21 16:03:34 -04:00
Callum Styan bcdade7d8c fix: add database constraint to enforce minimum username length (#19453)
Username length and format, via regex, are already enforced at the
application layer, but we have some code paths with database queries
where we could optimize away many of the DB query calls if we could be
sure at the database level that the username is never an empty string.

For example: https://github.com/coder/coder/pull/19395

---------

Signed-off-by: Callum Styan <callumstyan@gmail.com>
2025-08-21 07:56:41 -07:00
Ethan 791d39c261 test(coderd/database): use seperate context for subtests to fix flake (#19330)
Fixes flakes like
https://github.com/coder/coder/actions/runs/16927282256/job/47965470039

https://coder.com/blog/go-testing-contexts-and-t-parallel

...I'm going to take a stab at turning this into a lint rule. I think
it's possible by just reading the AST?
2025-08-13 14:45:35 +10:00
Yevhenii Shcherbina c65996a041 feat: add user_secrets table (#19162)
Closes https://github.com/coder/internal/issues/780

## Summary of changes:
- added `user_secrets` table
- `user_secrets` table contains `env_name` and `file_path` fields which
are not used at the moment, but will be used in later PRs
- `user_secrets` table doesn't contain `value_key_id`, I will add it in
a separate migration in a dbcrypt PR
- on one hand I don't want to add fields which are not used (because
it's a risk smth may change in implementation later), on the other hand
I don't want to add too many migrations for user secrets table
- added unique sql indexes
- added sql queries for CRUD operations on user-secrets
- introduced new `ResourceUserSecret` resource
- basic unit-tests for CRUD ops and authorization behavior
- Role updates:
  - owner:
    - remove `ResourceUserSecret` from site-wide perms
    - add `ResourceUserSecret` to user-wide perms
   - orgAdmin
- remove `ResourceUserSecret` from org-wide perms; seems it's not
strictly required, because `ResourceUserSecret` is not tied to
organization in dbauthz wrappers?
   - memberRole
- no need to change memberRole because it implicitly has access to
user-secrets thanks to the `allPermsExcept`
   - is it enough changes to roles?
   
Main questions:
- [ ] We will have 2 migrations for user-secrets:
  - initial migration (in current PR)
  - adding `value_key_id` in dbcrypt PR
  - is this approach reasonable?
- [ ] Are changes to roles's permissions are correct?
- [ ] Are changes in roles_test.go are correct?

---------

Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>
2025-08-07 15:58:59 -04:00
Dean Sheather dc598856e3 chore: improve build deadline code (#19203)
- Adds/improves a lot of comments to make the autostop calculation code
clearer
- Changes the behavior of the enterprise template schedule store to
match the behavior of the workspace TTL endpoint when the new TTL is
zero
- Fixes a bug in the workspace TTL endpoint where it could unset the
build deadline, even though a max_deadline was specified
- Adds a new constraint to the workspace_builds table that enforces the
deadline is non-zero and below the max_deadline if it is set
- Adds CHECK constraint enum generation to scripts/dbgen, used for
testing the above constraint
- Adds Dean and Danielle as CODEOWNERS for the autostop calculation code
2025-08-07 11:00:31 +10:00
Ethan 5c1bf1d46c test(coderd/database): use seperate context for subtests to fix flake (#19029)
Fixes flakes like https://github.com/coder/coder/actions/runs/16487670478/job/46615625141, caused by the issue described in https://coder.com/blog/go-testing-contexts-and-t-parallel

It'd be cool if we could lint for this? That a context from an outer test isn't used in a subtest if that subtest calls `t.Parallel`.
2025-07-24 20:07:54 +10:00
Cian Johnston c4b69bbe63 fix: prioritise human-initiated builds over prebuilds (#18933)
Continues from https://github.com/coder/coder/pull/18882

- Reverts extraneous changes
- Adds explicit `ORDER BY initiator_id = $PREBUILDS_USER_ID` to
`AcquireProvisionerJob`
- Improves test added for above PR

---------

Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: kylecarbs <7122116+kylecarbs@users.noreply.github.com>
2025-07-22 13:03:50 +01:00
Ethan 7c077d39c5 chore: populate connectionlog count using a separate query (#18629)
This is the third PR for moving connection events out of the audit log.

This PR populates `count` on `ConnectionLogResponse` using a separate query, to preemptively mitigate the issue described in #17689. It's structurally identical to a portion of https://github.com/coder/coder/pull/18600, but for the connection log instead of the audit log.
       
Future PRs:
- Implement a table in the Web UI for viewing connection logs.
- Write a query to delete old events from the audit log, call it from dbpurge.
- Write documentation for the endpoint / feature
2025-07-15 15:03:30 +10:00
Ethan 7a339a1ffe feat: add connectionlogs API (#18628)
This is the second PR for moving connection events out of the audit log.

This PR:
- Adds the `/api/v2/connectionlog` endpoint
- Adds filtering for `GetAuthorizedConnectionLogsOffset` and thus the endpoint. 
There's quite a few, but I was aiming for feature parity with the audit log.
  1. `organization:<id|name>`
  2. `workspace_owner:<username>`
  3. `workspace_owner_email:<email>`
  4. `type:<ssh|vscode|jetbrains|reconnecting_pty|workspace_app|port_forwarding>`
  5. `username:<username>` 
     - Only includes web-based connection events (workspace apps, web port forwarding) as only those include user metadata.
  6. `user_email:<email>`
  7. `connected_after:<time>`
  8. `connected_before:<time>`
  9. `workspace_id:<id>`
  10. `connection_id:<id>`
      - If you have one snapshot of the connection log, and some sessions are ongoing in that snapshot, you could use this filter to check if they've been closed since.
  11. `status:<connected|disconnected>`
       - If `connected` only sessions with a null `close_time` are returned, if `disconnected`, only those with a non-null `close_time`. If filter is omitted, both are returned.
       
Future PRs:
- Populate `count` on `ConnectionLogResponse` using a seperate query (to preemptively mitigate the issue described in #17689)
- Implement a table in the Web UI for viewing connection logs.
- Write a query to delete old events from the audit log, call it from dbpurge.
- Write documentation for the endpoint / feature (including these filters)
2025-07-15 14:55:34 +10:00
Ethan 08e17a07fc chore!: route connection logs to new table (#18340)
### Breaking Change (changelog note):
> User connections to workspaces, and the opening of workspace apps or ports will no longer create entries in the audit log. Those events will now be included in the 'Connection Log'.
Please see the 'Connection Log' page in the dashboard, and the Connection Log [documentation](https://coder.com/docs/admin/monitoring/connection-logs) for details. Those with permission to view the Audit Log will also be able to view the Connection Log. The new Connection Log has the same licensing restrictions as the Audit Log, and requires a Premium Coder deployment.

### Context

This is the first PR of a few for moving connection events out of the audit log, and into a new database table and web UI page called the 'Connection Log'.

This PR:
- Creates the new table
- Adds and tests queries for inserting and reading, including reading with an RBAC filter.
- Implements the corresponding RBAC changes, such that anyone who can view the audit log can read from the table
- Implements, under the enterprise package, a `ConnectionLogger` abstraction to replace the `Auditor` abstraction for these logs. (No-op'd in AGPL, like the `Auditor`)
- Routes SSH connection and Workspace App events into the new `ConnectionLogger`
- Updates all existing tests to check the values of the `ConnectionLogger` instead of the `Auditor`.

Future PRs:
- Add filtering to the query
- Add an enterprise endpoint to query the new table
- Write a query to delete old events from the audit log, call it from dbpurge.
- Implement a table in the Web UI for viewing connection logs.


> [!NOTE]
> The PRs in this stack obviously won't be (completely) atomic. Whilst they'll each pass CI, the stack is designed to be merged all at once. I'm splitting them up for the sake of those reviewing, and so changes can be reviewed as early as possible.  Despite this, it's really hard to make this PR any smaller than it already is. I'll be keeping it in draft until it's actually ready to merge.
2025-07-15 14:36:06 +10:00
Cian Johnston 0367dbac43 chore: optimize GetPrebuiltWorkspaces query (#18717)
* Adds GetRunningPrebuiltWorkspacesOptimized query
* Runs both original and updated query side-by-side and logs diffs
2025-07-09 11:30:42 +01:00
Hugo Dutka 3c2f3d640b chore: remove dbmem (#18803)
Remove the in-memory database. Addresses #15109.
2025-07-09 09:46:31 +02:00
Cian Johnston 4e95b1d20e fix: revert changes to GetRunningPrebuiltWorkspaces (#18688)
… (#18588)"

This reverts commit 258a839d27.
2025-07-01 10:11:43 +00:00
Cian Johnston 258a839d27 chore(coderd/database): optimize GetRunningPrebuiltWorkspaces (#18588)
Fixes https://github.com/coder/internal/issues/715

After this change, the only use of the `workspace_prebuilds` view is the
`ClaimPrebuiltWorkspace` query. A subsequent PR will update the view.

Before: ~44ms https://explain.dalibo.com/plan/76cbe21d1a4c9329#plan

After: 7.3ms https://explain.dalibo.com/plan/5abbdf926315677e#plan
2025-07-01 09:42:01 +01:00
Kacper Sawicki 695de6e0c0 chore(coderd/database): optimize AuditLogs queries (#18600)
Closes #17689

This PR optimizes the audit logs query performance by extracting the
count operation into a separate query and replacing the OR-based
workspace_builds with conditional joins.

## Query changes
* Extracted count query to separate one
* Replaced single `workspace_builds` join with OR conditions with
separate conditional joins
* Added conditional joins
* `wb_build` for workspace_build audit logs (which is a direct lookup)
    * `wb_workspace` for workspace create audit logs (via workspace)

Optimized AuditLogsOffset query:
https://explain.dalibo.com/plan/4g1hbedg4a564bg8

New CountAuditLogs query:
https://explain.dalibo.com/plan/ga2fbcecb9efbce3
2025-07-01 07:31:14 +02:00
ケイラ fae30a00fd chore: remove unnecessary redeclarations in for loops (#18440) 2025-06-20 13:16:55 -06:00
Susana Ferreira 72f7d70bab feat: allow TemplateAdmin to delete prebuilds via auth layer (#18333)
## Description

This PR adds support for deleting prebuilt workspaces via the
authorization layer. It introduces special-case handling to ensure that
`prebuilt_workspace` permissions are evaluated when attempting to delete
a prebuilt workspace, falling back to the standard `workspace` resource
as needed.

Prebuilt workspaces are a subset of workspaces, identified by having
`owner_id` set to `PREBUILD_SYSTEM_USER`.
This means:
* A user with `prebuilt_workspace.delete` permission is allowed to
**delete only prebuilt workspaces**.
* A user with `workspace.delete` permission can **delete both normal and
prebuilt workspaces**.

⚠️ This implementation is scoped to **deletion operations only**. No
other operations are currently supported for the `prebuilt_workspace`
resource.

To delete a workspace, users must have the following permissions:
* `workspace.read`: to read the current workspace state
* `update`: to modify workspace metadata and related resources during
deletion (e.g., updating the `deleted` field in the database)
* `delete`: to perform the actual deletion of the workspace

## Changes

* Introduced `authorizeWorkspace()` helper to handle prebuilt workspace
authorization logic.
* Ensured both `prebuilt_workspace` and `workspace` permissions are
checked.
* Added comments to clarify the current behavior and limitations.
* Moved `SystemUserID` constant from the `prebuilds` package to the
`database` package `PrebuildsSystemUserID` to resolve an import cycle
(commit
https://github.com/coder/coder/pull/18333/commits/f24e4ab4b6f0a56726fd04be2d7302c9fdb52d53).
* Update middleware `ExtractOrganizationMember` to include system user
members.
2025-06-20 17:36:32 +01:00
Danielle Maywood b712d0b23f feat(coderd/agentapi): implement sub agent api (#17823)
Closes https://github.com/coder/internal/issues/619

Implement the `coderd` side of the AgentAPI for the upcoming
dev-container agents work.

`agent/agenttest/client.go` is left unimplemented for a future PR
working to implement the agent side of this feature.
2025-05-29 12:15:47 +01:00
Danielle Maywood 6e255c72c6 chore(coderd/database): enforce agent name unique within workspace build (#18052)
Adds a database trigger that runs on insert and update of the
`workspace_agents` table. The trigger ensures that the agent name is
unique within the context of the workspace build it is being inserted
into.
2025-05-28 14:21:17 +01:00
Yevhenii Shcherbina 110102a60a fix: optimize queue position sql query (#17974)
Use only `online provisioner daemons` for
`GetProvisionerJobsByIDsWithQueuePosition` query. It should improve
performance of the query.
2025-05-28 08:21:16 -04:00
Yevhenii Shcherbina 53e8e9c7cd fix: reduce cost of prebuild failure (#17697)
Relates to https://github.com/coder/coder/issues/17432

### Part 1:

Notes:
- `GetPresetsAtFailureLimit` SQL query is added, which is similar to
`GetPresetsBackoff`, they use same CTEs: `filtered_builds`,
`time_sorted_builds`, but they are still different.

- Query is executed on every loop iteration. We can consider marking
specific preset as permanently failed as an optimization to avoid
executing query on every loop iteration. But I decided don't do it for
now.

- By default `FailureHardLimit` is set to 3.

- `FailureHardLimit` is configurable. Setting it to zero - means that
hard limit is disabled.

### Part 2

Notes:
- `PrebuildFailureLimitReached` notification is added.
- Notification is sent to template admins.
- Notification is sent only the first time, when hard limit is reached.
But it will `log.Warn` on every loop iteration.
- I introduced this enum:
```sql
CREATE TYPE prebuild_status AS ENUM (
  'normal',           -- Prebuilds are working as expected; this is the default, healthy state.
  'hard_limited',     -- Prebuilds have failed repeatedly and hit the configured hard failure limit; won't be retried anymore.
  'validation_failed' -- Prebuilds failed due to a non-retryable validation error (e.g. template misconfiguration); won't be retried.
);
```
`validation_failed` not used in this PR, but I think it will be used in
next one, so I wanted to save us an extra migration.

- Notification looks like this:
<img width="472" alt="image"
src="https://github.com/user-attachments/assets/e10efea0-1790-4e7f-a65c-f94c40fced27"
/>

### Latest notification views:
<img width="463" alt="image"
src="https://github.com/user-attachments/assets/11310c58-68d1-4075-a497-f76d854633fe"
/>
<img width="725" alt="image"
src="https://github.com/user-attachments/assets/6bbfe21a-91ac-47c3-a9d1-21807bb0c53a"
/>
2025-05-21 15:16:38 -04:00
brettkolodny b7e08ba7c9 fix: filter out deleted users when attempting to delete an organization (#17621)
Closes
[coder/internal#601](https://github.com/coder/internal/issues/601)
2025-05-01 13:26:01 -03:00
Sas Swart 99c6f235eb feat: add migrations and queries to support prebuilds (#16891)
Depends on https://github.com/coder/coder/pull/16916 _(change base to
`main` once it is merged)_

Closes https://github.com/coder/internal/issues/514

_This is one of several PRs to decompose the `dk/prebuilds` feature
branch into separate PRs to merge into `main`._

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: evgeniy-scherbina <evgeniy.shcherbina.es@gmail.com>
2025-04-03 10:58:30 +02:00
Jon Ayers 17ddee05e5 chore: update golang to 1.24.1 (#17035)
- Update go.mod to use Go 1.24.1
- Update GitHub Actions setup-go action to use Go 1.24.1
- Fix linting issues with golangci-lint by:
  - Updating to golangci-lint v1.57.1 (more compatible with Go 1.24.1)

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
2025-03-26 01:56:39 -05:00
brettkolodny cf10d98aab fix: improve error message when deleting organization with resources (#17049)
Closes
[coder/internal#477](https://github.com/coder/internal/issues/477)

![Screenshot 2025-03-21 at 11 25
57 AM](https://github.com/user-attachments/assets/50cc03e9-395d-4fc7-8882-18cb66b1fac9)

I'm solving this issue in two parts:

1. Updated the postgres function so that it doesn't omit 0 values in the
error
2. Created a new query to fetch the number of resources associated with
an organization and using that information to provider a cleaner error
message to the frontend

> **_NOTE:_** SQL is not my strong suit, and the code was created with
the help of AI. So I'd take extra time looking over what I wrote there
2025-03-25 15:31:24 -04:00
Danny Kopping 4c33846f6d chore: add prebuilds system user (#16916)
Pre-requisite for https://github.com/coder/coder/pull/16891

Closes https://github.com/coder/internal/issues/515

This PR introduces a new concept of a "system" user.

Our data model requires that all workspaces have an owner (a `users`
relation), and prebuilds is a feature that will spin up workspaces to be
claimed later by actual users - and thus needs to own the workspaces in
the interim.

Naturally, introducing a change like this touches a few aspects around
the codebase and we've taken the approach _default hidden_ here; in
other words, queries for users will by default _exclude_ all system
users, but there is a flag to ensure they can be displayed. This keeps
the changeset relatively small.

This user has minimal permissions (it's equivalent to a `member` since
it has no roles). It will be associated with the default org in the
initial migration, and thereafter we'll need to somehow ensure its
membership aligns with templates (which are org-scoped) for which it'll
need to provision prebuilds; that's a solution we'll have in a
subsequent PR.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Sas Swart <sas.swart.cdk@gmail.com>
2025-03-25 12:18:06 +00:00
Marcin Tojek 075e5f4f6e test: skip tests affected by daylight savings issues (#16857)
Related: https://github.com/coder/internal/issues/464

This will unblock the CI pipeline.
2025-03-10 13:10:34 +01:00
Yevhenii Shcherbina 84881a0e98 test: fix flaky tests (#16799)
Relates to: https://github.com/coder/internal/issues/451

Create separate context with timeout for every subtest.
2025-03-04 08:44:48 -05:00
Yevhenii Shcherbina b85ba586ee fix(coderd/database): consider tag sets when calculating queue position (#16685)
Relates to https://github.com/coder/coder/issues/15843

## PR Contents

- Reimplementation of the `GetProvisionerJobsByIDsWithQueuePosition` SQL
query to **take into account** provisioner job tags and provisioner
daemon tags.
- Unit tests covering different **tag sets**, **job statuses**, and
**job ordering** scenarios.

## Notes

- The original row order is preserved by introducing the `ordinality`
field.
- Unnecessary rows are filtered as early as possible to ensure that
expensive joins operate on a smaller dataset.
- A "fake" join with `provisioner_jobs` is added at the end to ensure
`sqlc.embed` compiles successfully.
- **Backward compatibility is preserved**—only the SQL query has been
updated, while the Go code remains unchanged.
2025-03-03 10:02:18 -05:00
Yevhenii Shcherbina 98dfc70f31 fix(coderd/database): remove linux build tags from db package (#16633)
Remove linux build tags from database package to make sure we can run
tests on Mac OS.
2025-02-25 11:39:37 -05:00
Jaayden Halko 546a549dcf feat: enable soft delete for organizations (#16584)
- Add deleted column to organizations table
- Add trigger to check for existing workspaces, templates, groups and
members in a org before allowing the soft delete

---------

Co-authored-by: Steven Masley <stevenmasley@gmail.com>
Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>
2025-02-24 12:59:41 -05:00
Sas Swart 50bf5ca8fe fix(coderd/database): aggregate user engagement statistics by interval (#16150)
This PR addresses the TODO comment here:

https://github.com/coder/coder/pull/16134/files#diff-1844f895bb005f036da11d800fe2a76b54bfddd481c5d8cb15f210c64679caa5R47

The backend now backfills entries for dates with no status changes.
2025-01-16 17:34:53 +02:00