Commit Graph

17 Commits

Author SHA1 Message Date
Yevhenii Shcherbina 53e8e9c7cd fix: reduce cost of prebuild failure (#17697)
Relates to https://github.com/coder/coder/issues/17432

### Part 1:

Notes:
- `GetPresetsAtFailureLimit` SQL query is added, which is similar to
`GetPresetsBackoff`, they use same CTEs: `filtered_builds`,
`time_sorted_builds`, but they are still different.

- Query is executed on every loop iteration. We can consider marking
specific preset as permanently failed as an optimization to avoid
executing query on every loop iteration. But I decided don't do it for
now.

- By default `FailureHardLimit` is set to 3.

- `FailureHardLimit` is configurable. Setting it to zero - means that
hard limit is disabled.

### Part 2

Notes:
- `PrebuildFailureLimitReached` notification is added.
- Notification is sent to template admins.
- Notification is sent only the first time, when hard limit is reached.
But it will `log.Warn` on every loop iteration.
- I introduced this enum:
```sql
CREATE TYPE prebuild_status AS ENUM (
  'normal',           -- Prebuilds are working as expected; this is the default, healthy state.
  'hard_limited',     -- Prebuilds have failed repeatedly and hit the configured hard failure limit; won't be retried anymore.
  'validation_failed' -- Prebuilds failed due to a non-retryable validation error (e.g. template misconfiguration); won't be retried.
);
```
`validation_failed` not used in this PR, but I think it will be used in
next one, so I wanted to save us an extra migration.

- Notification looks like this:
<img width="472" alt="image"
src="https://github.com/user-attachments/assets/e10efea0-1790-4e7f-a65c-f94c40fced27"
/>

### Latest notification views:
<img width="463" alt="image"
src="https://github.com/user-attachments/assets/11310c58-68d1-4075-a497-f76d854633fe"
/>
<img width="725" alt="image"
src="https://github.com/user-attachments/assets/6bbfe21a-91ac-47c3-a9d1-21807bb0c53a"
/>
2025-05-21 15:16:38 -04:00
Danny Kopping 6e967780c9 feat: track resource replacements when claiming a prebuilt workspace (#17571)
Closes https://github.com/coder/internal/issues/369

We can't know whether a replacement (i.e. drift of terraform state
leading to a resource needing to be deleted/recreated) will take place
apriori; we can only detect it at `plan` time, because the provider
decides whether a resource must be replaced and it cannot be inferred
through static analysis of the template.

**This is likely to be the most common gotcha with using prebuilds,
since it requires a slight template modification to use prebuilds
effectively**, so let's head this off before it's an issue for
customers.

Drift details will now be logged in the workspace build logs:


![image](https://github.com/user-attachments/assets/da1988b6-2cbe-4a79-a3c5-ea29891f3d6f)

Plus a notification will be sent to template admins when this situation
arises:


![image](https://github.com/user-attachments/assets/39d555b1-a262-4a3e-b529-03b9f23bf66a)

A new metric - `coderd_prebuilt_workspaces_resource_replacements_total`
- will also increment each time a workspace encounters replacements.

We only track _that_ a resource replacement occurred, not how many. Just
one is enough to ruin a prebuild, but we can't know apriori which
replacement would cause this.
For example, say we have 2 replacements: a `docker_container` and a
`null_resource`; we don't know which one might
cause an issue (or indeed if either would), so we just track the
replacement.

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
2025-05-14 14:52:22 +02:00
Danielle Maywood 6dd1056025 feat(coderd/notifications): group workspace build failure report (#17306)
Closes https://github.com/coder/coder/issues/15745

Instead of sending X many reports to a single template admin, we instead
send only 1.
2025-04-10 13:32:19 +01:00
Vincent Vielle ddb06741c9 chore: improve dormant workspace notification wording (#17100)
Related to #17099
2025-03-26 15:54:03 +01:00
Vincent Vielle 7b65422ef3 fix: change notifications actions url (#17083)
Related to #17082

Some notifications ( workspace created and workspace manually updated )
are using wrong variables to build the Action URL. Fixing it.
2025-03-25 11:29:02 +01:00
Vincent Vielle fe24a7a4a8 feat(coderd): remove greetings from notifications templates (#16991)
This PR aimes to [fix this
issue](https://github.com/coder/internal/issues/448) -

The main idea is to remove greetings from templates stored in the DB -
and instead push it into the template for require methods - for now
SMTP.
2025-03-21 16:05:08 +01:00
Danielle Maywood d2419c89ac feat: add tool to send a test notification (#16611)
Relates to https://github.com/coder/coder/issues/16463

Adds a CLI command, and API endpoint, to trigger a test notification for
administrators of a deployment.
2025-02-19 13:08:38 +00:00
Danielle Maywood dbad69dbd9 chore: add workspace oom/ood notification templates (#16250) 2025-02-04 19:25:18 +00:00
Danielle Maywood f3fe3bc785 feat: notify on workspace update (#15979)
Relates to https://github.com/coder/coder/issues/15845

When the `/workspace/<name>/builds` endpoint is hit, we check if the
requested template version is different to the previously used template
version. If these values differ, we can assume that the workspace has
been manually updated and send the appropriate notification. Automatic
updates happen in the lifecycle executor and bypasses this endpoint
entirely.
2025-01-02 12:19:34 +00:00
Danielle Maywood f0e81ab455 feat: notify on workspace creation (#15934) 2024-12-20 13:53:10 +00:00
Danielle Maywood 095c9797c9 feat: notify users on template deprecation (#15195)
Closes https://github.com/coder/coder/issues/15117

Notify users when a template has been deprecated.
2024-10-24 13:12:12 +01:00
Vincent Vielle 297089e944 feat(coderd): add company logo when available for email notifications (#14935)
This PR aims to close #14253 

We keep the default behavior using the Coder logo if there's no logo
set.
Otherwise we want to use the logo based on the URL set in appearance.

---------

Co-authored-by: defelmnq <yvincent@coder.com>
2024-10-22 14:06:19 +02:00
Danielle Maywood 23f61c68b4 fix: urlencode email in reset password link (#15167)
Fixes https://github.com/coder/coder/issues/15151

This runs `urlencode` (provided by `text/template`) on the email address
in the link. This ensures the link will work if a user has an email in
the form `user+label@example.com`.
2024-10-21 16:09:59 +01:00
Bruno Quaresma aaa1223408 feat(site): add forgot password link (#15108)
Demo:

https://github.com/user-attachments/assets/139eb8c0-5bd6-4bbd-8064-a4acc526afda
2024-10-18 09:50:22 -03:00
Sas Swart dfb6bfa4d2 fix(coderd/notifications): exclude unset fields from notifications (#15110)
This PR will ensure that optional fields are ignored when they are unset
in user account related templates.
2024-10-16 21:53:24 +02:00
Sas Swart fac77f956e fix(coderd/notifications): simplify TemplateWorkspaceManualBuildFailed (#15067)
This PR closes #15065.

As advised by @mtojek, a template's display name may be set to "", which
is not useful in an email notification. We'd like to provide a friendly
name for the template, but it also needs to be identifiable.

As such, we fall back to template.Name in the case that the template's
display name is empty.
2024-10-15 21:02:02 +02:00
Sas Swart 208ed1efd7 chore(coderd/notifications): expand golden file testing for notifications (#15032)
This PR aims to close https://github.com/coder/coder/issues/14913.

It expands the golden files for the notifier to include the entire
payload serialised as JSON.
2024-10-14 12:34:32 +00:00