Commit Graph

14 Commits

Author SHA1 Message Date
Danny Kopping 9341efec9f feat!: seed ai_providers from env on server startup (#24895)
_Disclaimer: implemented by a Coder Agent using Claude Opus 4.7_

Part of the implementation of [RFC: Common AI Provider Configs](https://www.notion.so/coderhq/RFC-Common-AI-Provider-Configs-34bd579be59280ed958feffb82024797) (AIGOV-201).

## Note

This change can cause a previously working installation to fail to start should a conflict exist between the providers configured in the environment & those now migrated to the database.

I'll raise a PR upstack to document this process and workarounds should a startup fail.

## What this PR does

Reconciles environment-derived AI provider configuration with the `ai_providers` table at server startup. The seed runs **before** the aibridged daemon is initialized, so the runtime always reads providers from the database; the legacy `CODER_AIBRIDGE_*` environment variables become a one-shot migration source.

### Behavior

- Concurrent server starts are serialized through a Postgres advisory lock (`LockIDAIProvidersEnvSeed`).
- Missing rows are inserted with an audit entry attributed to the system actor.
- Existing rows whose canonical hash matches the env-derived hash are left alone (the common no-op restart path).
- Existing rows whose canonical hash does **not** match cause server startup to fail with a descriptive error so the operator can explicitly resolve the conflict in either env or DB.
- Soft-deleted rows are NOT resurrected from env; an explicit operator deletion is sticky across restarts.
- Indexed providers whose name conflicts with a legacy env var fail startup with a clear remediation message.
- Unknown provider types (e.g. `copilot`, until the DB enum is widened) are skipped with a log entry rather than failing startup.

### Canonical hashing

The `canonicalAIProvider` shape captures exactly the fields that determine runtime behavior — `type`, `base_url`, and the Bedrock subset of settings (access key, access key secret, region, model, small fast model) — and is hashed with SHA-256. The hash is **computed on demand from the row + env**, never persisted, so the database does not need a new column for it. API keys live in the separate `ai_provider_keys` table and are intentionally excluded from the hash so operators can rotate keys via the API without forcing a server restart.

<details>
<summary>Decision log</summary>

- The hash is intentionally not persisted in the database. The RFC discussed this trade-off; computing on demand keeps the schema minimal and lets the canonical shape evolve without a migration.
- The lock uses an `iota` slot in `coderd/database/lock.go` rather than `GenLockID` so it's stable, easy to audit, and matches the convention used for every other startup lock.
- A bearer-token Anthropic provider whose env vars also set Bedrock metadata but no AWS credentials does NOT store the Bedrock fields. Without credentials the discriminated settings would misrepresent the row as Bedrock auth.
- We deliberately do NOT publish to the `ai_providers_changed` pubsub channel from the seed because the seed completes before any subscriber is started; the follow-up PR introduces that channel.

</details>
2026-05-22 08:37:27 +02:00
Zach a31e476623 fix: make boundary usage telemetry collection atomic (#21907)
Previously, UpsertBoundaryUsageStats (INSERT...ON CONFLICT DO UPDATE) and
GetAndResetBoundaryUsageSummary (DELETE...RETURNING) could race during
telemetry period cutover. Without serialization, an upsert concurrent with the
delete could lose data (deleted right after being written) or commit after the
delete (miscounted in the next period). Both operations now acquire
LockIDBoundaryUsageStats within a transaction to ensure a clean cutover.
2026-02-06 09:52:17 -07:00
George K cc2efe9e1f feat(coderd/rbac): make organization-member a per-org system custom role (#21359)
Migrated the built-in organization-member role to DB storage so it can be customized per org.

Closes https://github.com/coder/internal/issues/1073 (part 1)
2026-01-12 18:19:19 -08:00
Yevhenii Shcherbina 27bc60d1b9 feat: implement reconciliation loop (#17261)
Closes https://github.com/coder/internal/issues/510

<details>
<summary> Refactoring Summary </summary>

### 1) `CalculateActions` Function

#### Issues Before Refactoring:

- Large function (~150 lines), making it difficult to read and maintain.
- The control flow is hard to follow due to complex conditional logic.
- The `ReconciliationActions` struct was partially initialized early,
then mutated in multiple places, making the flow error-prone.

Original source:  

https://github.com/coder/coder/blob/fe60b569ad754245e28bac71e0ef3c83536631bb/coderd/prebuilds/state.go#L13-L167

#### Improvements After Refactoring:

- Simplified and broken down into smaller, focused helper methods.
- The flow of the function is now more linear and easier to understand.
- Struct initialization is cleaner, avoiding partial and incremental
mutations.

Refactored function:  

https://github.com/coder/coder/blob/eeb0407d783cdda71ec2418c113f325542c47b1c/coderd/prebuilds/state.go#L67-L84

---

### 2) `ReconciliationActions` Struct

#### Issues Before Refactoring:

- The struct mixed both actionable decisions and diagnostic state, which
blurred its purpose.
- It was unclear which fields were necessary for reconciliation logic,
and which were purely for logging/observability.

#### Improvements After Refactoring:

- Split into two clear, purpose-specific structs:
- **`ReconciliationActions`** — defines the intended reconciliation
action.
- **`ReconciliationState`** — captures runtime state and metadata,
primarily for logging and diagnostics.

Original struct:  

https://github.com/coder/coder/blob/fe60b569ad754245e28bac71e0ef3c83536631bb/coderd/prebuilds/reconcile.go#L29-L41

</details>

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Sas Swart <sas.swart.cdk@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Dean Sheather <dean@deansheather.com>
Co-authored-by: Spike Curtis <spike@coder.com>
Co-authored-by: Danny Kopping <danny@coder.com>
2025-04-17 09:29:29 -04:00
Sas Swart 99c6f235eb feat: add migrations and queries to support prebuilds (#16891)
Depends on https://github.com/coder/coder/pull/16916 _(change base to
`main` once it is merged)_

Closes https://github.com/coder/internal/issues/514

_This is one of several PRs to decompose the `dk/prebuilds` feature
branch into separate PRs to merge into `main`._

---------

Signed-off-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: Danny Kopping <dannykopping@gmail.com>
Co-authored-by: evgeniy-scherbina <evgeniy.shcherbina.es@gmail.com>
2025-04-03 10:58:30 +02:00
Jon Ayers 17ddee05e5 chore: update golang to 1.24.1 (#17035)
- Update go.mod to use Go 1.24.1
- Update GitHub Actions setup-go action to use Go 1.24.1
- Fix linting issues with golangci-lint by:
  - Updating to golangci-lint v1.57.1 (more compatible with Go 1.24.1)

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
2025-03-26 01:56:39 -05:00
Jon Ayers 2d5c068525 feat: implement key rotation system (#14710) 2024-09-19 19:12:44 +01:00
Marcin Tojek 6de59371ea feat: notifications: report failed workspace builds (#14571) 2024-09-18 09:11:44 +02:00
Mathias Fredriksson 3adcccb618 fix(coderd/database): reduce db load via dbpurge advisory locking (#13021) 2024-04-22 11:10:32 +00:00
Mathias Fredriksson 12e6fbf11e feat(coderd/database): add dbrollup service to rollup insights (#12665)
Add `dbrollup` service that runs the `UpsertTemplateUsageStats` query
every 5 minutes, on the minute. This allows us to have fairly real-time
insights data when viewing "today".
2024-03-22 18:42:43 +02:00
Dean Sheather bedd2c5922 fix: avoid race between replicas on start (#12344)
DERP mesh key setup would do a SELECT and then an INSERT on failure, without a lock. During some testing with multiple replicas, I managed to cause a replica to crash due to them initializing simultaneously.

Fixes:

Encountered an error running "coder server"
create coder API: insert mesh key: pq: duplicate key value violates unique constraint "site_configs_key_key"

Co-authored-by: Cian Johnston <cian@coder.com>
2024-02-28 16:14:11 +00:00
Dean Sheather 98a5ae7f48 feat: add provisioner job hang detector (#7927) 2023-06-25 13:17:00 +00:00
Kyle Carberry bbb0fab1de chore: merge database gen scripts (#8073)
* chore: merge database gen scripts

* Fix type params gen

* Merge enum into dbgen
2023-06-20 16:24:33 +00:00
Dean Sheather 1bdd2abed7 feat: use JWT ticket to avoid DB queries on apps (#6148)
Issue a JWT ticket on the first request with a short expiry that
contains details about which workspace/agent/app combo the ticket is
valid for.
2023-03-07 19:38:11 +00:00