Commit Graph

44 Commits

Author SHA1 Message Date
Steven Masley 8ba8b4f061 chore: add profiling labels for pprof analysis (#19232)
PProf labels segment the code into groups for determing the source of
cpu/memory profiles. Since the web server and background jobs share a
lot of the same code (eg wsbuilder), it helps to know if the load is
user induced, or background job based.
2025-08-07 11:21:17 -05:00
Dean Sheather 9a6dd73f68 feat: add managed agent license limit checks (#18937)
- Adds a query for counting managed agent workspace builds between two
timestamps
- The "Actual" field in the feature entitlement for managed agents is
now populated with the value read from the database
- The wsbuilder package now validates AI agent usage against the limit
when a license is installed

Closes coder/internal#777
2025-07-22 13:39:26 +10:00
Susana Ferreira 211393a69c fix: exclude prebuilt workspaces from lifecycle executor (#18762)
## Description

This PR updates the lifecycle executor to explicitly exclude prebuilt
workspaces from being considered for lifecycle operations such as
`autostart`, `autostop`, `dormancy`, `default TTL` and `failure TTL`.

Prebuilt workspaces (i.e., those owned by the prebuild system user) are
handled separately by the prebuild reconciliation loop. Including them
in the lifecycle executor could lead to unintended behavior such as
incorrect scheduling or state transitions.

## Changes

* Updated the lifecycle executor query
`GetWorkspacesEligibleForTransition` to exclude workspaces with
`owner_id = 'c42fdf75-3097-471c-8c33-fb52454d81c0'` (prebuilds).
* Added tests to verify prebuilt workspaces are not considered in:
  * Autostop
  * Autostart
  * Default TTL
  * Dormancy
  * Failure TTL

Fixes: https://github.com/coder/coder/issues/18740
Related to: https://github.com/coder/coder/issues/18658
2025-07-08 11:35:28 +01:00
Steven Masley 7b152cdd91 chore: increase fileCache hit rate in autobuilds lifecycle (#18507)
`wsbuilder` hits the file cache when running validation. This solution is imperfect, but by first sorting workspaces by their template version id, the cache hit rate should improve.
2025-06-24 07:36:39 -05:00
Steven Masley 82af2e019d feat: implement dynamic parameter validation (#18482)
# What does this do?

This does parameter validation for dynamic parameters in `wsbuilder`. All input parameters are validated in `coder/coder` before being sent to terraform.

The heart of this PR is [`ResolveParameters`](https://github.com/coder/coder/blob/b65001e89c0577199a8e470c138c51e91cf2350c/coderd/dynamicparameters/resolver.go#L30-L30).

# What else changes?

`wsbuilder` now needs to load the terraform files into memory to succeed. This does add a larger memory requirement to workspace builds.

# Future work

- Sort autostart handling workspaces by template version id. So workspaces with the same template version only load the terraform files once from the db, and store them in the cache.
2025-06-23 12:35:15 -05:00
Steven Masley e76d58f2b6 chore: disable parameter validatation for dynamic params for all transitions (#17926)
Dynamic params skip parameter validation in coder/coder.
This is because conditional parameters cannot be validated 
with the static parameters in the database.
2025-05-20 10:09:53 -05:00
Danielle Maywood 1267c9c405 fix: ensure reason present for workspace autoupdated notification (#17935)
Fixes https://github.com/coder/coder/issues/17930

Update the `WorkspaceAutoUpdated` notification to only display the
reason if it is present.
2025-05-20 16:01:57 +01:00
Danielle Maywood 50ff06cc3c chore: acquire lock for individual workspace transition (#15859)
When Coder is ran in High Availability mode, each Coder instance has a
lifecycle executor. These lifecycle executors are all trying to do the
same work, and whilst transactions saves us from this causing an issue,
we are still doing extra work that could be prevented.

This PR adds a `TryAcquireLock` call for each attempted workspace
transition, meaning two Coder instances shouldn't duplicate effort.
2024-12-13 16:59:27 +00:00
Danielle Maywood e21a301682 fix: make GetWorkspacesEligibleForTransition return even less false positives (#15594)
Relates to https://github.com/coder/coder/issues/15082

Further to https://github.com/coder/coder/pull/15429, this reduces the
amount of false-positives returned by the 'is eligible for autostart'
part of the query. We achieve this by calculating the 'next start at'
time of the workspace, storing it in the database, and using it in our
`GetWorkspacesEligibleForTransition` query.

The prior implementation of the 'is eligible for autostart' query would
return _all_ workspaces that at some point in the future _might_ be
eligible for autostart. This now ensures we only return workspaces that
_should_ be eligible for autostart.

We also now pass `currentTick` instead of `t` to the
`GetWorkspacesEligibleForTransition` query as otherwise we'll have one
round of workspaces that are skipped by `isEligibleForTransition` due to
`currentTick` being a truncated version of `t`.
2024-12-02 21:02:36 +00:00
Cian Johnston 2b57dcc68c feat(coderd): add matched provisioner daemons information to more places (#15688)
- Refactors `checkProvisioners` into `db2sdk.MatchedProvisioners`
- Adds a separate RBAC subject just for reading provisioner daemons
- Adds matched provisioners information to additional endpoints relating to
  workspace builds and templates
-Updates existing unit tests for above endpoints
-Adds API endpoint for matched provisioners of template dry-run job
-Updates CLI to show warning when creating/starting/stopping/deleting
 workspaces for which no provisoners are available

---------

Co-authored-by: Danny Kopping <danny@coder.com>
2024-12-02 20:54:32 +00:00
Steven Masley 144d3f3e3d chore: record lifecycle duration metric to prometheus (#15279)
`autobuild_execution_duration_seconds` keeps track of how long autobuild
takes and exposes it via prometheus histogram
2024-10-30 10:20:47 -04:00
Steven Masley ccfffc6911 chore: add tx metrics and logs for serialization errors (#15215)
Before db_metrics were all or nothing. Now `InTx` metrics are always recorded, and query metrics are opt in.


Adds instrumentation & logging around serialization failures in the database.
2024-10-25 12:14:15 -04:00
Steven Masley 343f8ec9ab chore: join owner, template, and org in new workspace view (#15116)
Joins in fields like `username`, `avatar_url`, `organization_name`,
`template_name` to `workspaces` via a **view**. 
The view must be maintained moving forward, but this prevents needing to
add RBAC permissions to fetch related workspace fields.
2024-10-22 09:20:54 -05:00
Bruno Quaresma 10327fb3a9 fix(coderd): humanize duration on notifications (#14333) 2024-08-19 15:49:47 -03:00
Danny Kopping c6076d2d0d chore: improve notification template tests' resilience (#14196) 2024-08-07 11:33:26 +02:00
Bruno Quaresma 58b810fb0a fix: fix dormancy notifications (#14029) 2024-07-29 11:20:04 -03:00
Bruno Quaresma 0d9615b4fd feat(coderd): notify when workspace is marked as dormant (#13868) 2024-07-24 13:38:21 -03:00
Marcin Tojek fbd1d7f9a7 feat: notify on successful autoupdate (#13903) 2024-07-18 15:19:12 +02:00
Marcin Tojek 7c41f957de feat: autostop workspaces owned by suspended users (#13790) 2024-07-04 13:35:41 +00:00
Steven Masley 845407fe7a chore: cover deadline crossing autostart border on start (#13115)
When starting a workspace, if the deadline crosses an autostart boundary, the deadline is set to autostart + TTL. 
This copies the behavior in `ActivityBumpWorkspace`, but does not require activity.
2024-05-01 10:43:04 -05:00
Jon Ayers 383eed93f8 fix: use correct logger for lifecycle_executor (#11763) 2024-01-23 14:33:55 -06:00
Jon Ayers ce49a55f56 chore: update build_reason 'autolock' -> 'dormancy' (#11074) 2023-12-07 17:11:57 -06:00
Jon Ayers 48d69c9e60 fix: update autostart context to include querying users (#10929) 2023-11-28 17:56:49 -06:00
Colin Adler 7f39ff854e fix: skip autostart for suspended/dormant users (#10771) 2023-11-22 11:14:32 -06:00
Steven Masley 290180b104 feat!: bump workspace activity by 1 hour (#10704)
Marked as a breaking change as the previous activity bump was always the TTL duration of the workspace/template.

This change is more cost conservative, only bumping by 1 hour for workspace activity. To accommodate wrap around, eg bumping a workspace into the next autostart, the deadline is bumped by the TTL if the workspace crosses the autostart threshold.

This is a niche case that is likely caused by an idle terminal making a workspace survive through a night. The next morning, the workspace will get activity bumped the default TTL on the autostart, being similar to as if the workspace was autostarted again.

In practice, a good way to avoid this is to set a max_deadline of <24hrs to avoid wrap around entirely.
2023-11-15 09:42:27 -06:00
Jon Ayers 75ab16d19a fix: prevent db deadlock when workspaces go dormant (#10618) 2023-11-13 13:40:20 -06:00
Jon Ayers 997493d4ae feat: add template setting to require active template version (#10277) 2023-10-18 17:07:21 -05:00
Colin Adler 1ad998ee3a fix: add requester IP to workspace build audit logs (#10242) 2023-10-18 15:08:02 -05:00
Steven Masley 39c0539d42 feat: add controls to template for determining startup days (#10226)
* feat: template controls which days can autostart
* Add unit test to test blocking autostart with DaysOfWeek
2023-10-13 11:57:18 -05:00
Spike Curtis 983e8c3ae8 feat: add API support for workspace automatic updates (#10099)
* Added automatic_updates to workspaces table

Signed-off-by: Spike Curtis <spike@coder.com>

* Queries and API updates

Signed-off-by: Spike Curtis <spike@coder.com>

* Golden files

Signed-off-by: Spike Curtis <spike@coder.com>

* Enable automatic updates on autostart

Signed-off-by: Spike Curtis <spike@coder.com>

* db migration number

Signed-off-by: Spike Curtis <spike@coder.com>

* fix imports and ts mock

Signed-off-by: Spike Curtis <spike@coder.com>

* code review updates

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
2023-10-06 13:27:12 +04:00
Jon Ayers b32d79ef0b fix: fix failed workspaces continuously auto-deleting (#10069)
- Fixes an issue where workspaces that are eligible for auto-deletion
  are retried every tick (1 minute) even if the previous deletion
  transition failed.

  The updated logic only attempts to delete workspaces that previously
  failed once a day (24 hours since last attempt).
2023-10-05 14:11:39 -05:00
Jon Ayers 91265678ad chore: add auditing to workspace dormancy (#10070)
- Adds an audit log for workspaces automatically transitioned to the dormant
  state.
- Imposes a mininum of 1 minute on cleanup-related fields. This is to
  prevent accidental API misuse from resulting in catastrophe.
2023-10-05 13:41:07 -05:00
Steven Masley 5021e23105 chore: compute job status as column (#10024)
* chore: provisioner job status as column
* use provisioner job status for workspace searching
2023-10-04 20:57:46 -05:00
Spike Curtis 375c70d141 feat: integrate Acquirer for provisioner jobs (#9717)
* chore: add Acquirer to provisionerdserver pkg

Signed-off-by: Spike Curtis <spike@coder.com>

* code review improvements & fixes

Signed-off-by: Spike Curtis <spike@coder.com>

* feat: integrate Acquirer for provisioner jobs

Signed-off-by: Spike Curtis <spike@coder.com>

* Fix imports, whitespace

Signed-off-by: Spike Curtis <spike@coder.com>

* provisionerdserver always closes; remove poll interval from playwright

Signed-off-by: Spike Curtis <spike@coder.com>

* post jobs outside transactions

Signed-off-by: Spike Curtis <spike@coder.com>

* graceful shutdown in test

Signed-off-by: Spike Curtis <spike@coder.com>

* Mark AcquireJob deprecated

Signed-off-by: Spike Curtis <spike@coder.com>

* Graceful shutdown on all provisionerd tests

Signed-off-by: Spike Curtis <spike@coder.com>

* Deprecate, not remove CLI flags

Signed-off-by: Spike Curtis <spike@coder.com>

---------

Signed-off-by: Spike Curtis <spike@coder.com>
2023-09-19 10:25:57 +04:00
Mathias Fredriksson ad23d33f28 refactor(coderd/schedule): move cron schedule to cron package (#9507)
This removes an indirect import of `coderd/database` from the CLI and
results in a logical separation between server related and generalized
schedule.

No size change (yet).

Ref: #9380
2023-09-04 16:48:25 +03:00
Mathias Fredriksson 19d7da3d24 refactor(coderd/database): split Time and Now into dbtime package (#9482)
Ref: #9380
2023-09-01 16:50:12 +00:00
Jon Ayers 7f14b50dbe chore: rename locked to dormant (#9290)
* chore: rename locked to dormant

- The following columns have been updated:
  - workspace.locked_at -> dormant_at
  - template.inactivity_ttl -> time_til_dormant
  - template.locked_ttl -> time_til_dormant_autodelete

This change has also been reflected in the SDK.

A route has also been updated from /workspaces/<id>/lock to /workspaces/<id>/dormant
2023-08-24 13:25:54 -05:00
Kyle Carberry 22e781eced chore: add /v2 to import module path (#9072)
* chore: add /v2 to import module path

go mod requires semantic versioning with versions greater than 1.x

This was a mechanical update by running:
```
go install github.com/marwan-at-work/mod/cmd/mod@latest
mod upgrade
```

Migrate generated files to import /v2

* Fix gen
2023-08-18 18:55:43 +00:00
Jon Ayers e43608395c feat: add frontend for locked workspaces (#8655)
- Fix workspaces query for locked workspaces.
2023-08-03 19:46:02 -05:00
Jon Ayers b47d076756 feat: add deleting_at column to workspaces (#8333) 2023-07-20 22:01:11 -05:00
Dean Sheather dc8b73168e feat: add user quiet hours schedule and restart requirement feature flag (#8115) 2023-07-20 23:35:41 +10:00
Jon Ayers 4a9c8f407a feat: add auto-locking/deleting workspace based on template config (#8240) 2023-07-02 21:29:52 -05:00
Marcin Tojek 8e2422d42c feat: use named loggers in coderd (#8148) 2023-06-22 20:09:33 +02:00
Jon Ayers 1b0124ecdb feat: automatically stop workspaces based on failure_ttl (#7989) 2023-06-22 00:33:22 -04:00