Commit Graph

36 Commits

Author SHA1 Message Date
J. Scott Miller 7bde763b66 feat: add workspace build transition to provisioner job list (#24131)
Closes #16332

Previously `coder provisioner jobs list` showed no indication of what a workspace
build job was doing (i.e., start, stop, or delete). This adds
`workspace_build_transition` to the provisioner job metadata, exposed in
both the REST API and CLI. Template and workspace name columns were also
added, both available via `-c`.

```
$ coder provisioner jobs list -c id,type,status,"workspace build transition"
ID                                    TYPE                     STATUS     WORKSPACE BUILD TRANSITION
95f35545-a59f-4900-813d-80b8c8fd7a33  template_version_import  succeeded
0a903bbe-cef5-4e72-9e62-f7e7b4dfbb7a  workspace_build          succeeded  start
```
2026-04-10 09:50:11 -05:00
Cian Johnston e9025f91e8 chore(db): remove 23 unused database methods (#22999)
Removes 22 database query methods with no callers outside generated code
and the dbauthz wrapper layer (~1,600 lines).

**Security keys (6)** — superseded by `cryptokeys` package:
`GetAppSecurityKey`, `UpsertAppSecurityKey`, `GetOAuthSigningKey`,
`UpsertOAuthSigningKey`, `GetCoordinatorResumeTokenSigningKey`,
`UpsertCoordinatorResumeTokenSigningKey`

**Superseded queries (4):**
- `GetProvisionerJobsByIDs` → `GetProvisionerJobsByIDsWithQueuePosition`
- `GetDeploymentDAUs` / `GetTemplateDAUs` →
`GetTemplateInsightsByInterval`
- `GetWorkspaceBuildParametersByBuildIDs` + its `GetAuthorized...`
variant → unused

**OAuth2 (2):**
`GetOAuth2ProviderAppByRegistrationToken`,
`UpdateOAuth2ProviderAppSecretByID`

**Chat (4)** — pre-wired with no callers:
`GetChatModelConfigByProviderAndModel`, `DeleteChatMessagesByChatID`,
`ListChatsByRootID`, `ListChildChatsByParentID`

**Other (6):**
`DeleteGitSSHKey`, `UpdateUserLinkedID`, `GetFileIDByTemplateVersionID`,
`GetTemplateVersionHasAITask`, `InsertUserGroupsByName`,
`RemoveUserFromAllGroups`
2026-03-12 21:32:57 +00:00
Jon Ayers e7ea649dc2 fix: optimize GetProvisionerJobsByIDsWithQueuePosition query (#22724) 2026-03-09 16:47:02 -05:00
Mathias Fredriksson f75cbab6ce fix(coderd/database): prevent AcquireProvisionerJob from grabbing canceled jobs (#21852)
The AcquireProvisionerJob query only checked started_at IS NULL, allowing
it to acquire jobs that were canceled while pending (which have
completed_at set but started_at still NULL).

Added completed_at IS NULL check to the query to prevent this.

Also fixed JobCompleteBuilder.Do() in dbfake to set started_at when
completing jobs to match production behavior.

Fixes coder/internal#1323
2026-02-03 10:42:17 +02:00
Sas Swart d17dd5d787 feat: add filtering by initiator to provisioner job listing in the CLI (#20137)
Relates to https://github.com/coder/internal/issues/934

This PR provides a mechanism to filter provisioner jobs according to who
initiated the job.
This will be used to find pending prebuild jobs when prebuilds have
overwhelmed the provisioner job queue. They can then be canceled.

If prebuilds are overwhelming provisioners, the following steps will be
taken:

```bash
# pause prebuild reconciliation to limit provisioner queue pollution:
coder prebuilds pause 
# cancel pending provisioner jobs to clear the queue
coder provisioner jobs list --initiator="prebuilds" --status="pending" | jq ... | xargs -n1 -I{} coder provisioner jobs cancel {}
# push a fixed template and wait for the import to complete
coder templates push ... # push a fixed template
# resume prebuild reconciliation
coder prebuilds resume
```

This interface differs somewhat from what was specified in the issue,
but still provides a mechanism that addresses the issue. The original
proposal was made by myself and this simpler implementation makes sense.
I might add a `--search` parameter in a follow-up if there is appetite
for it.

Potential follow ups:
* Support for this usage: `coder provisioner jobs list --search
"initiator:prebuilds status:pending"`
* Adding the same parameters to `coder provisioner jobs cancel` as a
convenience feature so that operators don't have to pipe through `jq`
and `xargs`
2025-10-06 08:56:43 +00:00
Kacper Sawicki 776231d025 fix(coderd): add blocking GetProvisionerJobByIDWithLock for workspace build cancellation (#19737)
Closes https://github.com/coder/internal/issues/885

Adds a new database method GetProvisionerJobByIDWithLock that uses FOR
UPDATE without SKIP LOCKED to fix workspace build cancellation returning
500 errors when jobs are locked.
2025-09-08 15:40:14 +02:00
Benjamin Peinhardt e4dc2d9418 fix: add constraint and runtime check for provisioner logs size limit (#18893)
This PR sets a constraint of 1MB on the provisioner job logs written to
the database. This is consistent with the constraint we place on
workspace agent logs:
https://github.com/coder/coder/blob/4ac6be6d835dc36c242e35a26b584b784040bf28/coderd/database/dump.sql#L2030

It also adds a message printed to the front end about the provisioner
log overflow, and updates the message printed to the front end when
workspace startup logs exceed the max, as it was causing some customers
to think their startup script had failed to run.
2025-07-30 19:09:53 -05:00
Cian Johnston c4b69bbe63 fix: prioritise human-initiated builds over prebuilds (#18933)
Continues from https://github.com/coder/coder/pull/18882

- Reverts extraneous changes
- Adds explicit `ORDER BY initiator_id = $PREBUILDS_USER_ID` to
`AcquireProvisionerJob`
- Improves test added for above PR

---------

Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>
Co-authored-by: kylecarbs <7122116+kylecarbs@users.noreply.github.com>
2025-07-22 13:03:50 +01:00
Yevhenii Shcherbina 110102a60a fix: optimize queue position sql query (#17974)
Use only `online provisioner daemons` for
`GetProvisionerJobsByIDsWithQueuePosition` query. It should improve
performance of the query.
2025-05-28 08:21:16 -04:00
Michael Suchacz 769c9ee337 feat: cancel stuck pending jobs (#17803)
Closes: #16488
2025-05-20 15:22:44 +02:00
Susana Ferreira f044cc3550 feat: add provisioner daemon name to provisioner jobs responses (#17877)
# Description

This PR adds the `worker_name` field to the provisioner jobs endpoint.

To achieve this, the following SQL query was updated:
-
`GetProvisionerJobsByOrganizationAndStatusWithQueuePositionAndProvisioner`

As a result, the `codersdk.ProvisionerJob` type, which represents the
provisioner job API response, was modified to include the new field.

**Notes:** 
* As mentioned in
[comment](https://github.com/coder/coder/pull/17877#discussion_r2093218206),
the `GetProvisionerJobsByIDsWithQueuePosition` query was not changed due
to load concerns. This means that for template and template version
endpoints, `worker_id` will still be returned, but `worker_name` will
not.
* Similar to `worker_id`, the `worker_name` is only present once a job
is assigned to a provisioner daemon. For jobs in a pending state (not
yet assigned), neither `worker_id` nor `worker_name` will be returned.

---

# Affected Endpoints

- `/organizations/{organization}/provisionerjobs`
- `/organizations/{organization}/provisionerjobs/{job}`

---

# Testing

- Added new tests verifying that both `worker_id` and `worker_name` are
returned once a provisioner job reaches the **succeeded** state.
- Existing tests covering state transitions and other logic remain
unchanged, as they test different scenarios.

---

# Front-end Changes

Admin provisioner jobs dashboard:
<img width="1088" alt="Screenshot 2025-05-16 at 11 51 33"
src="https://github.com/user-attachments/assets/0e20e360-c615-4497-84b7-693777c5443e"
/>

Fixes: https://github.com/coder/coder/issues/16982
2025-05-19 16:05:39 +01:00
Yevhenii Shcherbina b85ba586ee fix(coderd/database): consider tag sets when calculating queue position (#16685)
Relates to https://github.com/coder/coder/issues/15843

## PR Contents

- Reimplementation of the `GetProvisionerJobsByIDsWithQueuePosition` SQL
query to **take into account** provisioner job tags and provisioner
daemon tags.
- Unit tests covering different **tag sets**, **job statuses**, and
**job ordering** scenarios.

## Notes

- The original row order is preserved by introducing the `ordinality`
field.
- Unnecessary rows are filtered as early as possible to ensure that
expensive joins operate on a smaller dataset.
- A "fake" join with `provisioner_jobs` is added at the end to ensure
`sqlc.embed` compiles successfully.
- **Backward compatibility is preserved**—only the SQL query has been
updated, while the Go code remains unchanged.
2025-03-03 10:02:18 -05:00
Mathias Fredriksson 5ba7ba6bfc fix(coderd): add strict org ID joins for provisioner job metadata (#16588)
References #16558
2025-02-17 14:16:45 +02:00
Mathias Fredriksson e38bd27183 feat(coderd): add support for provisioner job id and tag filter (#16556)
This change adds to new filters to the provisionerjobs endpoint, id
(array) and tags (map).

Updates #15084
Updates #15192
Related #16532
2025-02-13 18:24:27 +02:00
Bruno Quaresma e9b3561677 refactor: return template_icon and make metadata required (#16496) 2025-02-10 10:00:34 -03:00
Mathias Fredriksson b04d883348 feat: add provisioner job metadata (#16454)
This change adds metadata to provisioner jobs to help with rendering
related tempaltes and workspaces in the UI.

Updates #15084
2025-02-06 16:19:20 +02:00
Mathias Fredriksson 75c899ff71 feat(cli): add provisioner job cancel command (#16252)
Fixes #16117
Updates #15084
2025-01-27 16:26:56 +00:00
Mathias Fredriksson 3864c7e3b0 feat(coderd): add endpoint to list provisioner jobs (#16029)
Closes #15190
Updates #15084
2025-01-20 11:18:53 +02:00
Cian Johnston 36c2cf8a40 fix(coderd/database): exclude canceled jobs in queue position (#15835)
When calculating the queue position in
`GetProvisionerJobsByIDsWithQueuePosition` we only counted jobs with
`started_at = NULL`. This is misleading, as it allows canceling or
canceled jobs to take up rows in the computed queue position, giving an
impression that the queue is larger than it really is.

This modifies the query to also exclude jobs with a null `canceled_at`,
`completed_at`, or `error` field for the purposes of calculating the
queue position, and also adds a test to validate this behaviour.

(Note: due to the behaviour of `dbgen.ProvisionerJob` with `dbmem` I had
to use other proxy methods to validate the corresponding dbmem
implementation.)

---------

Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2024-12-12 12:37:45 +00:00
Sas Swart 814dd6f854 feat(coderd): update API to allow filtering provisioner daemons by tags (#15448)
This PR provides new parameters to an endpoint that will be necessary
for #15048
2024-11-15 11:33:22 +02:00
Bruno Quaresma 705b9ccda8 feat(coderd): add workspace timings endpoint (#14648) 2024-09-16 16:31:05 -03:00
Danny Kopping 6960d194ae feat: add provisioning timings to understand slow build times (#14274) 2024-08-21 14:18:58 +02:00
Steven Masley f0f9569d51 chore: enforce that provisioners can only acquire jobs in their own organization (#12600)
* chore: add org ID as optional param to AcquireJob
* chore: plumb through organization id to provisioner daemons
* add org id to provisioner domain key
* enforce org id argument
* dbgen provisioner jobs defaults to default org
2024-03-18 12:48:13 -05:00
Cian Johnston 53e8f9c0f9 fix(coderd): only allow untagged provisioners to pick up untagged jobs (#12269)
Alternative solution to #6442

Modifies the behaviour of AcquireProvisionerJob and adds a special case for 'un-tagged' jobs such that they can only be picked up by 'un-tagged' provisioners.

Also adds comprehensive test coverage for AcquireJob given various combinations of tags.
2024-02-22 15:04:31 +00:00
Dean Sheather 98a5ae7f48 feat: add provisioner job hang detector (#7927) 2023-06-25 13:17:00 +00:00
Kyle Carberry 69f911dfd5 feat: add queue_position and queue_size to provisioner jobs (#8074) 2023-06-20 15:07:18 -05:00
Colin Adler 8bd9f9c351 feat: unified tracing between coderd<->provisionerd (#7370) 2023-05-03 23:02:35 +00:00
Marcin Tojek 3b87316ad7 feat: propagate job error codes (#6507)
* feat: propagate job error_code

* fix

* Fix

* Fix

* Fix

* add errors to typesGenerated

* Address PR comments

* Fix
2023-03-08 16:32:00 +01:00
Dean Sheather 66a6b590a1 feat: add template max_ttl (#6114)
Co-authored-by: Bruno Quaresma <bruno@coder.com>
2023-03-07 14:14:58 +00:00
Colin Adler a02617b66b fix: remove unnecessary WHEREs from AcquireProvisionerJob (#5257) 2022-12-02 23:53:49 +00:00
Kyle Carberry b6703b11c6 feat: Add external provisioner daemons (#4935)
* Start to port over provisioner daemons PR

* Move to Enterprise

* Begin adding tests for external registration

* Move provisioner daemons query to enterprise

* Move around provisioner daemons schema

* Add tags to provisioner daemons

* make gen

* Add user local provisioner daemons

* Add provisioner daemons

* Add feature for external daemons

* Add command to start a provisioner daemon

* Add provisioner tags to template push and create

* Rename migration files

* Fix tests

* Fix entitlements test

* PR comments

* Update migration

* Fix FE types
2022-11-16 16:34:06 -06:00
Colin Adler e7f1192614 fix: add index for GetProvisionerLogsByIDBetween (#5020) 2022-11-16 14:32:29 -06:00
Jon Ayers 4e57b9fbdc fix: allow regular users to push files (#4500)
- As part of merging support for Template RBAC
  and user groups a permission check on reading files
  was relaxed.

  With the addition of admin roles on individual templates, regular
  users are now able to push template versions if they have
  inherited the 'admin' role for a template. In order to do so
  they need to be able to create and read their own files. Since
  collisions on hash in the past were ignored, this means that a regular user
  who pushes a template version with a file hash that collides with
  an existing hash will not be able to read the file (since it belongs to
  another user).

  This commit fixes the underlying problem which was that
  the files table had a primary key on the 'hash' column.
  This was not a problem at the time because only template
  admins and other users with similar elevated roles were
  able to read all files regardless of ownership. To fix this
  a new column and primary key 'id' has been introduced to the files
  table. The unique constraint has been updated to be hash+created_by.
  Tables (provisioner_jobs) that referenced files.hash have been updated
  to reference files.id. Relevant API endpoints have also been updated.
2022-10-13 18:02:52 -05:00
Kyle Carberry df2649ed2a fix: Test flake in TestWorkspaceStatus (#4333)
This also changes the status to be on the workspace build, since
that's where the true value is calculated. This exposed a bug where
jobs could never enter the canceled state unless fetched by a
provisioner daemon, which was nice to fix!

See: https://github.com/coder/coder/actions/runs/3175304200/jobs/5173479506
2022-10-03 11:43:11 -05:00
Kyle Carberry 4cce969018 feat: Add anonymized telemetry to report product usage (#2273)
* feat: Add anonymized telemetry to report product usage

This adds a background service to report telemetry to a Coder
server for usage data. There will be realtime event data sent
in the future, but for now usage will report on a CRON.

* Fix flake and requested changes

* Add reporting options for setup

* Add reporting for workspaces

* Add resources as they are reported

* Track API key usage

* Ensure telemetry is tracked prior to exit
2022-06-17 00:26:40 -05:00
Colin Adler fd523100bf chore: split queries.sql into files by table (#762) 2022-04-01 15:45:23 -05:00