coder

mirror of https://github.com/coder/coder.git synced 2026-06-04 21:48:22 +00:00

Author	SHA1	Message	Date
Rowan Smith	107fd97a61	fix: avoid derp-related panic during wsproxy registration (backport release/2.31) (#22526 ) Backport of #22322. - Cherry-picked `7f03bd7`. Co-authored-by: Dean Sheather <dean@deansheather.com>	2026-03-03 13:46:42 +05:00
Jon Ayers	4f34452bcc	fix: use separate http.Transports for wsproxy tests (#22292 ) - Previously all tests were sharing the global http.Transport meaning on `Close` it would close connections presumed to be idle for other tests. fixes https://github.com/coder/internal/issues/112	2026-02-24 23:56:58 -06:00
Jon Ayers	0a7a3da178	fix: exclude provisioner_state from workspace_build_with_user view (#22159 ) The provisioner state for a workspace build was being loaded for every long-lived agent rpc connection. Since this state can be anywhere from kilobytes to megabytes this can gradually cause the `coderd` memory footprint to grow over time. It's also a lot of unnecessary allocations for every query that fetches a workspace build since only a few callers ever actually reference the provisioner state. This PR removes it from the returned workspace build and adds a query to fetch the provisioner state explicitly.	2026-02-23 22:46:17 -06:00
Sushant P	37a8e61ea2	chore: move Shared Workspaces from experiments to beta (#22206 ) * Removed the shared-workspaces experiment and cleaned up related middleware * Added beta tagging to the UI for shared workspaces	2026-02-23 08:30:32 -08:00
Jake Howell	d700f9ebc4	fix: restore block to `Managed Agents` on `Enterprise` (#22210 ) #21998 accidentally allowed `Managed Agents` usages whilst being on an `Enterprise` license. This was incorrect, it should work as the following (same as prior to #21998). \| Scenario \| Before your PRs \| After your PRs (bug) \| After this fix \| \|---\|---\|---\|---\| \| Unlicensed (AGPL) \| Permitted \| Permitted \| Permitted \| \| Licensed, no entitlement \| Blocked \| Permitted \| Blocked \| \| Licensed, explicitly disabled (limit=0) \| Blocked \| Permitted \| Blocked \| \| Licensed, entitled, under limit \| Permitted \| Permitted \| Permitted \| \| Licensed, entitled, over limit \| Blocked \| Permitted (advisory) \| Permitted (advisory) \| \| Any license, stop/delete \| Permitted \| Permitted \| Permitted \| \| Any license, non-AI build \| Permitted \| Permitted \| Permitted \|	2026-02-20 20:15:32 +11:00
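The "After this fix" column of the table in the commit above can be read as a small decision function. The following is only a sketch of that table, not the code in `coder/coder`: the `buildRequest` type, its fields, and `checkManagedAgentBuild` are hypothetical names introduced here for illustration.

```go
package main

import "fmt"

// buildRequest is a hypothetical summary of the inputs the table talks about.
type buildRequest struct {
	Licensed       bool  // false means unlicensed (AGPL)
	HasEntitlement bool  // license includes the managed agents feature
	Limit          int64 // 0 models "explicitly disabled (limit=0)"
	Usage          int64 // managed agent builds consumed so far
	IsAIBuild      bool  // false for non-AI builds
	IsStart        bool  // false for stop/delete transitions
}

// checkManagedAgentBuild mirrors the "After this fix" column: it reports
// whether the build is permitted and whether an advisory (over-limit)
// warning applies.
func checkManagedAgentBuild(r buildRequest) (allowed, advisory bool) {
	// Stop/delete and non-AI builds are always permitted.
	if !r.IsStart || !r.IsAIBuild {
		return true, false
	}
	// Unlicensed (AGPL) deployments are permitted.
	if !r.Licensed {
		return true, false
	}
	// Licensed without entitlement, or explicitly disabled: blocked.
	if !r.HasEntitlement || r.Limit == 0 {
		return false, false
	}
	// Entitled: permitted, with an advisory warning once over the limit.
	return true, r.Usage > r.Limit
}

func main() {
	allowed, advisory := checkManagedAgentBuild(buildRequest{
		Licensed: true, HasEntitlement: true, Limit: 10, Usage: 12,
		IsAIBuild: true, IsStart: true,
	})
	fmt.Println(allowed, advisory)
}
```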
Jake Howell	051ed34580	feat: convert `soft_limit` to `limit` (#22048 ) In relation to [`internal#1281`](https://github.com/coder/internal/issues/1281) Remove the `soft_limit` field from the `Feature` type and simplify license limit handling. This change: - Removes the `soft_limit` field from the API and SDK - Uses the soft limit value as the single `limit` value in the UI and API - Simplifies warning logic to only show warnings when the limit is exceeded - Updates tests to reflect the new behavior - Updates the UI to use the single limit value for display	2026-02-20 16:09:12 +11:00
Jake Howell	203899718f	feat: remove agent workspaces limit (#21998 ) In relation to [`internal#1281`](https://github.com/coder/internal/issues/1281) Managed agent workspace build limits are now advisory only. Breaching the limit no longer blocks workspace creation — it only surfaces a warning. - Removed hard-limit enforcement in `checkAIBuildUsage` so AI task builds are always permitted regardless of managed agent count. - Updated the license warning to remove "Further managed agent builds will be blocked." verbiage. - Updated tests to assert builds succeed beyond the limit instead of failing. - Removed the "Limit" display from the `ManagedAgentsConsumption` progress bar — the bar is now relative to the included allowance (soft limit) only, and turns orange when usage exceeds it. Bonus: - De-MUI'd `LicenseBannerView` — replaced Emotion CSS and MUI `Link` with Tailwind classes. - Added `highlight-orange` color token to the Tailwind theme.	2026-02-20 12:56:00 +11:00
Danielle Maywood	92a6d6c2c0	chore: remove unnecessary loop variable captures (#22180 ) Since Go 1.22, the loop variable capture issue is resolved. Variables declared by for loops are now per-iteration rather than per-loop, making the 'v := v' pattern unnecessary.	2026-02-19 09:02:19 +00:00
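A small illustration of the Go 1.22 per-iteration loop variable behaviour that the commit above relies on. This is a standalone sketch, not code from the repository; the `closures` helper is a made-up name. The `//go:build go1.22` line pins the file to Go 1.22 language semantics regardless of the surrounding module's `go` directive.

```go
//go:build go1.22

package main

import "fmt"

// closures returns one function per item. Each closure captures the loop
// variable directly; with Go 1.22 semantics every iteration gets its own
// `v`, so the old `v := v` copy is not needed.
func closures(items []string) []func() string {
	fns := make([]func() string, 0, len(items))
	for _, v := range items {
		fns = append(fns, func() string { return v })
	}
	return fns
}

func main() {
	for _, fn := range closures([]string{"a", "b", "c"}) {
		fmt.Println(fn())
	}
}
```

Before Go 1.22, all three closures would have shared a single `v` and returned the last element, which is why the `v := v` copies existed.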
Paweł Banaszewski	90c11f3386	feat: add client column to aibridge_interceptions table (#21839 ) Adds `client` column to `aibridge_interceptions` table. It is set accordingly to what is passed from AI Bridge in `RecordInterception`. Adds interception filtering by `client` value. Depends on: https://github.com/coder/aibridge/pull/158 Updates aibridge library to include this change. Fixes: https://github.com/coder/aibridge/issues/31	2026-02-17 15:43:02 +01:00
Steven Masley	01f06671a1	chore: return 404, not 400 if missing or authz deny (#22069 )	2026-02-13 08:19:07 -06:00
Callum Styan	5f3be6b288	feat: add provisioner job queue wait time histogram and jobs enqueued counter (#21869 ) This PR adds some metrics to help identify job enqueue rates and latencies. This work was initiated as a way to help reduce the cost of the observation/measurement itself for autostart scaletests, which impacts our ability to identify/reason about the load caused by autostart. See: https://github.com/coder/internal/issues/1209 I've extended the metrics here to account for regular user initiated builds, prebuilds, autostarts, etc. IMO there is still the question here of whether we want to include or need the `transition` label, which is only present on workspace builds. Including it does lead to an increase in cardinality, and in the case of the histogram (when not using native histograms) that's at least a few extra series for every bucket. We could remove the transition label there but keep it on the counter. Additionally, the histogram is currently observing latencies for other jobs, such as template builds/version imports, those do not have a transition type associated with them. Tested briefly in a workspace, can see metric values like the following: - `coderd_workspace_builds_enqueued_total{build_reason="autostart",provisioner_type="terraform",status="success",transition="start"} 1` - `coderd_provisioner_job_queue_wait_seconds_bucket{build_reason="autostart",job_type="workspace_build",provisioner_type="terraform",transition="start",le="0.025"} 1` --------- Signed-off-by: Callum Styan <callumstyan@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-12 13:40:47 -08:00
Susana Ferreira	220b9f3cc5	fix: track goroutines and fix race condition in reconciler (#21980 ) ## Problem CI failure showed 3 goroutines leaked in the prebuilds reconciler, all stuck in `select` state: 1) `MetricsCollector.BackgroundFetch` (metrics goroutine) 2) `StoreReconciler.Run` (main reconciliation loop) 3) `StoreReconciler.Run.func3()` (provisioner job publisher goroutine) All three goroutines were waiting for `ctx.Done()`, which likely means `cancelFn()` was never called to trigger shutdown. Note: I was unable to reproduce the flake locally. The likely cause was a race condition between `Run()` and `Stop()` where `Stop()` could check `running` (seeing `false`), return early, and then `Run()` would start goroutines that never get cleaned up. This could happen in any `coderd` test that starts a server with prebuilds enabled. ### Problems identified 1) Missing waitgroup tracking: the provisioner job publisher goroutine was not tracked in the waitgroup, so it was not awaited during the clean shutdown in `Run defer func()`. 2) The provisioner job publisher goroutine had a redundant `case <-c.done` that could race with the `Stop()` select statement. 3) Race condition between `Run()` and `Stop()`: the `running` and `stopped` fields were `atomic.Bool` values checked and set independently, allowing a window where `Stop()` could see `running=false` and return early, then `Run()` would set `running=true` and start goroutines that would never be cleaned up. This could happen in any `coderd` test that starts a server with prebuilds enabled.
## Changes * Added `wg.Add(1)` and `defer wg.Done()` to track provisioner job publisher goroutine in waitgroup * Removed redundant `case <-c.done` from provisioner job publisher goroutine to eliminate race condition * Replaced `atomic.Bool` for `running` and `stopped` with a `sync.Mutex` lifecycle state, also protecting `cancelFn` under the same mutex, to eliminate the race between `Run()` and `Stop()` * Added a guard in `Run()` to prevent double-start (`c.stopped \|\| c.running`) * Improved comments in Stop() and Run() to clarify shutdown behavior Closes: https://github.com/coder/internal/issues/1116	2026-02-12 15:35:42 +00:00
Cian Johnston	25a0c807cb	chore(coderd/database/dbfake): add support for provisioner job timestamp control (#21944 ) Relates to https://github.com/coder/coder/pull/21922 / https://github.com/coder/internal/issues/1259 * Adds `dbfake.BuilderOption func(WorkspaceBuildBuilder)` Adds `BuilderOption` methods for setting various provisioner job related fields on `WorkspaceBuildBuilder`. * Migrates a number of existing tests that previously depended on provisioner job timing to use these updated methods in the following packages: * `coderd/jobreaper` * `coderd/notifications/reports` * `enterprise/coderd/schedule` * `enterprise/coderd/prebuilds` * `scripts/workspace-runtime-audit` 🤖 Created using Mux (Opus 4.5) --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-02-06 09:44:40 +00:00
Cian Johnston	91be688e39	chore(coderd/database): remove deprecated db2sdk.List(Lazy)? methods (#21902 ) Removes deprecated methods db2sdk.List and db2sdk.ListLazy.	2026-02-03 17:52:07 +00:00
Jon Ayers	3c1db17361	fix: use existing transaction to claim prebuild (#21862 ) - Claiming a prebuild was happening outside a transaction	2026-02-02 17:57:59 -06:00
Dean Sheather	6954b73f8a	fix: prevent panic from duplicate metrics registration on license upload (#21832 )	2026-02-02 20:57:06 +11:00
Ethan	a464ab67c6	test: use explicit names in TestStartAutoUpdate to prevent flake (#21745 ) The test was creating two template versions without explicit names, relying on `namesgenerator.NameDigitWith()` which can produce collisions. When both versions got the same random name, the test failed with a 409 Conflict error. Fix by giving each version an explicit name (`v1`, `v2`). Closes https://github.com/coder/internal/issues/1309 --- Generated by [mux](https://mux.coder.com)	2026-01-30 13:24:06 +11:00
Marcin Tojek	04b0253e8a	feat: add Prometheus metrics for license warnings and errors (#21749 ) Fixes: coder/internal#767 Adds two new Prometheus metrics for license health monitoring: - `coderd_license_warnings` - count of active license warnings - `coderd_license_errors` - count of active license errors Metrics endpoint after startup of a deployment with license enabled: ``` ... # HELP coderd_license_errors The number of active license errors. # TYPE coderd_license_errors gauge coderd_license_errors 0 ... # HELP coderd_license_warnings The number of active license warnings. # TYPE coderd_license_warnings gauge coderd_license_warnings 0 ... ```	2026-01-29 13:50:15 +01:00
Cian Johnston	c2c225052a	chore(enterprise/coderd): ensure TestManagedAgentLimit differentiates between tasks and workspaces (#21731 ) My previous change to this test did not create another workspace using the template containing `coder_ai_task` resources, meaning that this test was not actually testing the right thing. This PR addresses this oversight.	2026-01-28 16:30:56 +00:00
Zach	2204731ddb	feat: implement boundary usage tracker and telemetry collection (#21716 ) Implements telemetry for boundary usage tracking across all Coder replicas and reports them via telemetry. Changes: - Implement Tracker with Track(), FlushToDB(), and StartFlushLoop() methods - Add telemetry integration via collectBoundaryUsageSummary() - Use telemetry lock to ensure only one replica collects per period The tracker accumulates unique workspaces, unique users, and request counts (allowed/denied) in memory, then flushes to the database periodically. During telemetry collection, stats are aggregated across all replicas and reset for the next period.	2026-01-27 19:11:40 -07:00
Steven Masley	799b190dee	fix: do not enforce managed agent limit for non-task workspaces (#21689 ) Only task workspaces have the checks in wsbuilder for violating the managed agent caps in the license. Stopped tasks that are resumed with a regular workspace start still count as usage.	2026-01-27 19:01:17 -06:00
Cian Johnston	7b44976618	fix(coderd/provisionerdserver): correct managed agent tracking (#21696 ) Relates to https://github.com/coder/internal/issues/1282 Updates tracking of managed agents to be predicated instead on the presence of a related `task_id` instead of the presence of a `coder_ai_task` resource.	2026-01-27 12:14:52 +00:00
Jake Howell	6f15b178a4	feat: extend premium license for `aigovernance` (#21499 ) Closes [#1227](https://github.com/coder/internal/issues/1227) Added support for license addons, starting with AI Governance, to enable dynamic feature grouping without requiring license reissuance. ### What changed? - Introduced a new `Addon` type to represent groupings of features that can be added to licenses - Created the first addon `AddonAIGovernance` which includes AI Bridge and Boundary features - Added validation for addon dependencies to ensure required features are present - Added new features: `FeatureBoundary` and `FeatureAIGovernanceUserLimit` - Updated license entitlement logic to handle addons and their features - Added helper methods to check if features belong to addons - Updated tests to verify addon functionality ### Why make this change? This change introduces a more flexible licensing model that allows features to be grouped into addons that can be added to licenses without requiring reissuance when new features are added to an addon. This is particularly useful for specialized feature sets like AI Governance, where related features can be bundled together and sold as a separate SKU. The addon approach allows for better organization of features and more granular control over entitlements.	2026-01-27 22:33:53 +11:00
Kacper Sawicki	78bc5861e0	feat(enterprise/coderd): add soft warning for AI Bridge GA transition (#21675 ) ## Summary AI Bridge is moving to General Availability in v2.30 and will require the AI Governance Add-On license in future versions. This adds a soft warning for deployments using AI Bridge via Premium/Enterprise FeatureSet without an explicit AI Bridge add-on license. Relates to: https://github.com/coder/internal/issues/1226 ## Changes - Track whether AI Bridge was explicitly granted via license Features (add-on) vs inherited from FeatureSet - Show soft warning when AI Bridge is enabled and entitled via FeatureSet but not via explicit add-on - Changed AI Bridge enablement from hardcoded `true` to check `CODER_AIBRIDGE_ENABLED` deployment config ## Behavior Change AI Bridge is now only marked as "enabled" in entitlements when `CODER_AIBRIDGE_ENABLED=true` is set in the deployment config. Previously, it was always enabled for Premium/Enterprise licenses regardless of the config setting. This change ensures that users who do not use AI Bridge will not see the soft warning about the upcoming license requirement. ## Warning Message > AI Bridge is now Generally Available in v2.30. In a future Coder version, your deployment will require the AI Governance Add-On to continue using this feature. Please reach out to your account team or sales@coder.com to learn more. ## Behavior \| Condition \| Warning Shown \| \|-----------\|---------------\| \| AI Bridge disabled \| ❌ No \| \| AI Bridge enabled + explicit add-on license \| ❌ No \| \| AI Bridge enabled + Premium/Enterprise FeatureSet (no add-on) \| ✅ Yes \| ## Screenshots ### 1. No license <img width="1708" height="577" alt="image" src="https://github.com/user-attachments/assets/cbdbfd4d-55de-4d70-8abf-2665f458e96f" /> ### 2. No license + CODER_AIBRIDGE_ENABLED=true <img width="1716" height="513" alt="image" src="https://github.com/user-attachments/assets/344aae76-7703-485f-b568-1f13a1efa48f" /> ### 3. 
Premium license + CODER_AIBRIDGE_ENABLED=false <img width="1687" height="389" alt="image" src="https://github.com/user-attachments/assets/c2be12b0-1c0f-438d-a293-f9ec9fe6a736" /> ### 4. Premium license + CODER_AIBRIDGE_ENABLED=true <img width="1707" height="525" alt="image" src="https://github.com/user-attachments/assets/1a4640e1-e656-4f9b-bed0-9390cb5d6a84" /> ## Notes - TODO comments added to mark code that should be removed when AI Bridge enforcement is added - Feature continues to work - this is just a transitional warning (soft enforcement)	2026-01-26 10:46:45 +01:00
Kacper Sawicki	b82693d4cc	feat(codersdk): revert "remove AI Bridge entitlement from Premium license" (#21653 ) Reverts coder/coder#21540	2026-01-23 15:58:12 +00:00
Susana Ferreira	f5858c8a18	fix: unregister metrics on reconciler stop to prevent panic on restart (#21647 ) ## Description Fixes a panic that occurs when the prebuilds feature is toggled by adding/removing a license. The `StoreReconciler` was not unregistering the `reconciliationDuration` histogram, causing a "duplicate metrics collector registration attempted" panic when a new reconciler was created. ## Changes * Unregister the `reconciliationDuration` histogram in `Stop()` alongside the existing metrics collector * Change log level when stopping the reconciler with a cause, since "entitlements change" is not an error condition * Add `TestReconcilerLifecycle` to verify the reconciler can be stopped and recreated with the same prometheus registry Related to internal slack thread: https://codercom.slack.com/archives/C07GRNNRW03/p1769116582171379	2026-01-23 14:45:27 +00:00
Kacper Sawicki	9843adb8c6	feat(codersdk): remove AI Bridge entitlement from Premium license (#21540 ) ## Summary AI Bridge is moving out of Premium as a separate add-on (GA in Feb 3). Closes https://github.com/coder/internal/issues/1226 ## Changes - Excludes `FeatureAIBridge` from `Enterprise()` and `FeatureSetPremium.Features()` - Adds soft warning for deployments with AI Bridge enabled but not entitled - Warning is displayed to Auditor/Owner roles in UI banner and CLI headers ## Warning Message When AI Bridge is enabled (`CODER_AIBRIDGE_ENABLED=true`) but the license doesn't include the entitlement: > AI Bridge has reached General Availability and your Coder deployment is not entitled to run this feature. Contact your account team (https://coder.com/contact) for information around getting a license with AI Bridge. ## Behavior - The feature remains usable in v2.30 (soft warning only) - Future versions may include hard enforcement	2026-01-23 13:48:27 +01:00
George K	d29a168785	fix(coderd/rbac): reinstate deployment-wide workspace.share permission for owner role (#21620 ) The removal of that permission from the role broke valid use cases (e.g. a site owner user creating a workspace owned by a system account and then trying to share it with another user). The bulk of the PR is made up of the rollbacks of the previously introduced test updates necessitated by the removal. Related to: https://github.com/coder/internal/issues/1285	2026-01-22 08:12:15 -08:00
Mathias Fredriksson	97e8a5b093	fix(coderd): allow agent auth during workspace shutdown (#21538 ) Agents were losing authentication during workspace shutdown, causing shutdown scripts to fail. The auth query required agents to belong to the latest build, but during shutdown a `stop` build becomes latest while the `start` build's agents are still running. Modified the auth query to allow `start` build agents to authenticate temporarily during `stop` execution. The query allows auth when: - Agent's `start` build job succeeded - Latest build is `stop` with `pending`/`running` job status - Builds are adjacent (`stop` is `build_number + 1`) - Template versions match Auth closes once `stop` completes. Renamed `GetWorkspaceAgentAndLatestBuildByAuthToken` to `GetAuthenticatedWorkspaceAgentAndBuildByAuthToken` since it returns the agent's build (not always latest) during shutdown. Closes coder/internal#1249 Fixes #19467	2026-01-21 13:18:43 +00:00
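The four conditions listed in the commit above can be expressed as a plain predicate. The real change is a SQL auth query; the Go version below is only an illustrative sketch, and the `build` type, its string-valued fields, and `agentMayAuthenticate` are hypothetical.

```go
package main

import "fmt"

// build is a hypothetical, trimmed-down view of a workspace build.
type build struct {
	Number            int
	Transition        string // "start" or "stop"
	JobStatus         string // e.g. "pending", "running", "succeeded"
	TemplateVersionID string
}

// agentMayAuthenticate reports whether an agent belonging to agentBuild may
// authenticate while latest is the workspace's latest build.
func agentMayAuthenticate(agentBuild, latest build) bool {
	// Normal case: the agent belongs to the latest build.
	if agentBuild.Number == latest.Number {
		return true
	}
	// Shutdown window: a succeeded start build's agent stays authenticated
	// while the immediately following stop build is pending or running.
	return agentBuild.Transition == "start" &&
		agentBuild.JobStatus == "succeeded" &&
		latest.Transition == "stop" &&
		(latest.JobStatus == "pending" || latest.JobStatus == "running") &&
		latest.Number == agentBuild.Number+1 &&
		latest.TemplateVersionID == agentBuild.TemplateVersionID
}

func main() {
	start := build{Number: 4, Transition: "start", JobStatus: "succeeded", TemplateVersionID: "tv1"}
	stop := build{Number: 5, Transition: "stop", JobStatus: "running", TemplateVersionID: "tv1"}
	fmt.Println(agentMayAuthenticate(start, stop))
}
```

Once the stop job leaves the `pending`/`running` states the predicate returns false, matching the note that auth closes once `stop` completes.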
Susana Ferreira	6ef9670384	fix: limit concurrent database connections in prebuild reconciliation (#20908 ) ## Description This PR addresses database connection pool exhaustion during prebuilds reconciliation by introducing two changes: * `CanSkipReconciliation`: Filters out presets that don't need reconciliation before spawning goroutines. This ensures we only create goroutines for presets that will (_most likely_) perform database operations, avoiding unnecessary connection pool usage. * Dynamic `eg.SetLimit`: Limits concurrent goroutines based on the configured database connection pool size (`CODER_PG_CONN_MAX_OPEN / 2`). This replaces the previous hardcoded limit of 5, ensuring the reconciliation loop scales appropriately with the configured pool size while leaving capacity for other database operations. ## Changes * Add `CanSkipReconciliation()` method to `PresetSnapshot` that returns true for inactive presets with no running workspaces, no pending jobs, or expired prebuilds. * Add `maxDBConnections` parameter to `NewStoreReconciler` and compute `reconciliationConcurrency` as half the pool size (minimum 1). * Add `ReconciliationConcurrency()` getter method to `StoreReconciler`. * Add `eg.SetLimit(c.reconciliationConcurrency)` to bound concurrent reconciliation goroutines. * Add `PresetsTotal` and `PresetsReconciled` to `ReconcileStats` for observability. * Add `TestCanSkipReconciliation` unit tests. * Add `TestReconciliationConcurrency` unit tests. * Add benchmark tests for reconciliation performance. ## Benchmarks * `BenchmarkReconcileAll_NoOps`: Tests presets with no reconciliation actions. All presets are filtered by `CanSkipReconciliation`, resulting in no goroutines spawned and no database connections used. * `BenchmarkReconcileAll_ConnectionContention`: Tests presets where all require reconciliation actions. All presets spawn goroutines, but concurrency is limited by `eg.SetLimit(reconciliationConcurrency)`. 
* `BenchmarkReconcileAll_Mix`: Simulates a realistic scenario with a large subset of inactive presets (filtered by `CanSkipReconciliation`) and a smaller subset requiring reconciliation (limited by `eg.SetLimit`). Closes: https://github.com/coder/coder/issues/20606	2026-01-21 10:56:31 +00:00
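A rough sketch of the concurrency bound described in the commit above: half of the configured connection pool, with a minimum of 1. The commit uses `eg.SetLimit` on an errgroup; to keep this example standard-library only, a buffered channel is swapped in as the limiter, so `runBounded` is a stand-in and not the reconciler's code. Both function names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// reconciliationConcurrency returns half of the connection pool size,
// never less than 1, as described in the commit message.
func reconciliationConcurrency(maxDBConnections int) int {
	n := maxDBConnections / 2
	if n < 1 {
		return 1
	}
	return n
}

// runBounded runs every task with at most limit tasks in flight and returns
// the peak concurrency it observed. A buffered channel acts as a semaphore
// here in place of errgroup's SetLimit.
func runBounded(limit int, tasks []func()) int {
	sem := make(chan struct{}, limit)
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		inFlight int
		peak     int
	)
	for _, task := range tasks {
		sem <- struct{}{} // blocks while limit tasks are already running
		wg.Add(1)
		go func(run func()) {
			defer wg.Done()
			defer func() { <-sem }()

			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()

			run()

			mu.Lock()
			inFlight--
			mu.Unlock()
		}(task)
	}
	wg.Wait()
	return peak
}

func main() {
	limit := reconciliationConcurrency(10)
	tasks := make([]func(), 20)
	for i := range tasks {
		tasks[i] = func() {} // stand-in for reconciling one preset
	}
	fmt.Println("limit:", limit, "peak within limit:", runBounded(limit, tasks) <= limit)
}
```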
George K	0712faef4f	feat(enterprise): implement organization "disable workspace sharing" option (#21376 ) Adds a per-organization setting to disable workspace sharing. When enabled, all existing workspace ACLs in the organization are cleared and the workspace ACL mutation API endpoints return `403 Forbidden`. This complements the existing site-wide `--disable-workspace-sharing` flag by providing more granular control at the organization level. Closes https://github.com/coder/internal/issues/1073 (part 2) --------- Co-authored-by: Steven Masley <Emyrk@users.noreply.github.com>	2026-01-14 09:47:50 -08:00
Susana Ferreira	000bc334c9	fix: reuse reconciliation lock transaction for read operations in prebuilds (#21408 ) ## Description Reuses the reconciliation lock transaction for read operations during prebuilds reconciliation, reducing unnecessary database connections. ## Changes * Use the lock transaction (`db`) for read operations and `c.store` for write operations: * `GetPrebuildsSettings`: now uses `db` * `SnapshotState`: now uses `db` * `MembershipReconciler`: continues to use `c.store` (performs write operations) * Add comments explaining the transaction model and when to use `db` vs `c.store` Related to: https://github.com/coder/coder/pull/20587	2026-01-13 15:04:51 +00:00
George K	cc2efe9e1f	feat(coderd/rbac): make organization-member a per-org system custom role (#21359 ) Migrated the built-in organization-member role to DB storage so it can be customized per org. Closes https://github.com/coder/internal/issues/1073 (part 1)	2026-01-12 18:19:19 -08:00
Zach	091d31224d	fix: replace moby/moby namesgenerator with internal implementation (#21377 ) Replace the external moby/moby/pkg/namesgenerator dependency with an internal implementation using gofakeit/v7. The moby package has ~25k unique name combinations, and with its retry parameter only adds a random digit 0-9, giving ~250k possibilities. In parallel tests, this has led to collisions (flakes). The new internal API at coderd/util/namesgenerator eliminates the external dependency and offers functions with explicit uniqueness guarantees. This PR also consolidates fragmented name generation in a few places to use the new package. \| Old (moby/moby) \| New \| \|-------------------------------------\|------------------------\| \| namesgenerator.GetRandomName(0) \| NameWith("_") \| \| namesgenerator.GetRandomName(>0) \| NameDigitWith("_") \| \| testutil.GetRandomName(t) \| UniqueName() \| \| testutil.GetRandomNameHyphenated(t) \| UniqueNameWith("-") \| namesgenerator package API: - NameWith(delim): random name, not unique - NameDigitWith(delim): random name with 1-9 suffix, not unique - UniqueName(): guaranteed unique via atomic counter - UniqueNameWith(delim): unique with custom delimiter Names continue to be docker style `[adjective][delim][surname]`. Unique names are truncated to 32 characters (preserving the numeric suffix) to fit common name length limits in Coder. Related test flakes: https://github.com/coder/internal/issues/1212 https://github.com/coder/internal/issues/118 https://github.com/coder/internal/issues/1068	2026-01-09 15:40:26 -07:00
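The "unique via atomic counter, truncated to 32 characters while preserving the numeric suffix" idea from the commit above can be sketched like this. It is not the `coderd/util/namesgenerator` source: `uniqueFrom` and `maxNameLen` are hypothetical, the base name is passed in rather than generated, and the sketch assumes ASCII base names that do not end in digits.

```go
package main

import (
	"fmt"
	"strconv"
	"sync/atomic"
)

// maxNameLen is the length limit mentioned in the commit message.
const maxNameLen = 32

// counter is shared process-wide so concurrent callers never reuse a suffix.
var counter atomic.Int64

// uniqueFrom appends the next counter value to base. If the result would be
// too long, the base is shortened so the numeric suffix is always kept whole.
// Assumption: base is ASCII and does not itself end in digits.
func uniqueFrom(base string) string {
	suffix := strconv.FormatInt(counter.Add(1), 10)
	if keep := maxNameLen - len(suffix); len(base) > keep {
		base = base[:keep]
	}
	return base + suffix
}

func main() {
	fmt.Println(uniqueFrom("happy_einstein"))
	fmt.Println(uniqueFrom("extraordinarily_long_adjective_surname"))
}
```

Truncating the base rather than the whole string is what keeps two long names distinct: the part that differs (the counter) is never cut off.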
Spike Curtis	bddb808b25	chore: arrange imports in a standard way (#21452 ) Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example: ``` import ( "context" "time" "github.com/prometheus/client_golang/prometheus" "golang.org/x/xerrors" "gopkg.in/natefinch/lumberjack.v2" "cdr.dev/slog/v3" "github.com/coder/coder/v2/codersdk/agentsdk" "github.com/coder/serpent" ) ``` 3 groups: standard library, 3rd party libs, Coder libs. This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.	2026-01-08 15:24:11 +04:00
Spike Curtis	49b34a716a	fix: fix slog to always use array of Fields (#21426 ) Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptable call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder). It also updates dependencies that also use slog and were updated. I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule. Other dependencies, I pushed new tags.	2026-01-08 10:29:41 +04:00
Sas Swart	9a0024c45f	chore: add tracing to prebuilds (#21443 ) The implementation for prebuilt workspaces is complex and conversations regarding edge cases and bugs frequently get bogged down by minutiae, because it's hard to reason about the behaviour of the system. To alleviate this, I've introduced otel tracing to the StoreReconciler (see attached). We can now directly observe the behaviour of the prebuilds system under load in order to inform our decisions. Traces are terminated at the boundary between prebuilds and workspace builder, because of prebuilt workspaces' "fire and forget" philosophy and to prevent span explosion. <img width="3024" height="1718" alt="image" src="https://github.com/user-attachments/assets/f9b207be-8f2c-475e-98a8-46ef70bda446" />	2026-01-07 11:04:40 +02:00
Danielle Maywood	467c8bbd6b	fix: prevent notification for dormant delete on dormant-removal (#21427 ) Ensure we do not send "Marked for deletion" notifications when disabling dormancy deletion	2026-01-06 16:26:28 +00:00
Danny Kopping	733b6b7db9	feat: add API to serve proxy certificate (#21391 ) Closes https://github.com/coder/internal/issues/1184	2025-12-29 18:00:06 +00:00
Danny Kopping	a173c38715	chore: remove experimental endpoints (#21390 ) This should've been removed when we cut the Beta release, but we missed it. Adding as a drive-by.	2025-12-29 16:17:46 +00:00
Steven Masley	35f1c44455	test: fix path separator on windows for unit test (#21382 ) Test TestWorkspaceTemplateParamsChange writes a file to disk Closes https://github.com/coder/internal/issues/1213	2025-12-23 15:13:16 +00:00
Steven Masley	61d7d2983f	fix: remove state information from apply (#21373 ) Delete builds were not deleting resources as the tf state being sent in the apply request was empty. State removed from apply request and read from the session instead.	2025-12-22 16:18:53 +00:00
Cian Johnston	f1b930b190	chore(enterprise): increase coverage of TestWorkspaceBuild (#21304 ) Relates to #20925 This PR expands the test coverage of `enterprise/coderd/TestWorkspaceBuild` to also exercise the `postWorkspaceBuilds` handler. Previously, it only exercised the `createWorkspace` handler.	2025-12-17 17:28:38 +00:00
Spike Curtis	bd753d9cb9	fix: mark users seen when activating on login (#21305 ) fixes #21303 Update user last_seen_at when we mark them active on login. This prevents a narrow race where they can be re-marked dormant and fail to log in.	2025-12-17 16:49:40 +04:00
Steven Masley	3194bcfc9e	chore: distinct operations for provisioner's 'parse', 'init', 'plan', 'apply', 'graph' (#21064 ) Provisioner steps broken into smaller granular actions. Changes: - `ExtractArchive` moved to `init` request (was in `configure`) - Writing `tfstate` moved to `plan` (was in `configure`) - Moved most plan/apply outputs to `GraphComplete`	2025-12-15 11:26:41 -06:00
George K	103967ed02	feat: add sharing info to /workspaces endpoint (#21049 ) closes: https://github.com/coder/internal/issues/858 Similar to https://github.com/coder/coder/pull/19375, this one uses system permissions for fetching actual user and group data. Modifies the `workspaces_expanded` view to fetch the required data; this way it's made available to all code paths that make use of it. Also fixes a bug in a test helper function that can result in `null` being saved to the DB for `user_acl` or `group_acl` and break tests; a defensive check constraint that prevents this is worth a PR, e.g: `ALTER TABLE workspaces ADD CONSTRAINT group_acl_is_object CHECK (jsonb_typeof(group_acl) = 'object');` Also adds missing `OwnerName` in `ConvertWorkspaceRows`.	2025-12-15 08:42:08 -08:00
Kacper Sawicki	6f86f67754	feat(coderd): add overload protection with rate limiting and concurrency control (#21161 ) ## Summary This adds configurable overload protection to the AI Bridge daemon to prevent the server from being overwhelmed during periods of high load. Partially addresses coder/internal#1153 (rate limits and concurrency control; circuit breakers are deferred to a follow-up). ## New Configuration Options \| Option \| Environment Variable \| Description \| Default \| \|--------\|---------------------\|-------------\|---------\| \| `--aibridge-max-concurrency` \| `CODER_AIBRIDGE_MAX_CONCURRENCY` \| Maximum number of concurrent AI Bridge requests. Set to 0 to disable (unlimited). \| `0` \| \| `--aibridge-rate-limit` \| `CODER_AIBRIDGE_RATE_LIMIT` \| Maximum number of AI Bridge requests per second. Set to 0 to disable rate limiting. \| `0` \| ## Behavior When limits are exceeded: - Concurrency limit: Returns HTTP `503 Service Unavailable` with message "AI Bridge is currently at capacity. Please try again later." - Rate limit: Returns HTTP `429 Too Many Requests` with `Retry-After` header. Both protections are optional and disabled by default (0 values). ## Implementation The overload protection is implemented as reusable middleware in `coderd/httpmw/ratelimit.go`: 1. `RateLimitByAuthToken`: Per-user rate limiting that uses `APITokenFromRequest` to extract the authentication token, with fallback to `X-Api-Key` header for AI provider compatibility (e.g., Anthropic). Falls back to IP-based rate limiting if no token is present. Includes `Retry-After` header for backpressure signaling. 2. `ConcurrencyLimit`: Uses an atomic counter to track in-flight requests and reject when at capacity. The middleware is applied in `enterprise/coderd/aibridge.go` via `r.Group` in the following order: 1. Concurrency check (faster rejection for load shedding) 2. Rate limit check Note: Rate limiting currently applies to all AI Bridge requests, including pass-through requests. 
Ideally only actual interceptions should count, but this would require changes in the aibridge library. ## Testing Added comprehensive tests for: - Rate limiting by auth token (Bearer token, X-Api-Key, no token fallback to IP) - Different tokens not rate limited against each other - Disabled when limit is zero - Retry-After header is set on 429 responses - Concurrency limiting (allows within limit, rejects over limit, disabled when zero)	2025-12-11 16:38:54 +01:00
Dean Sheather	b199eb1c38	fix: allow stops and deletes after breaching AI limit (#21186 ) Fixes a bug a customer encountered once they breached their limit. Adds a test.	2025-12-09 11:05:12 +00:00
blinkagent[bot]	b4be5bcfed	docs: fix swagger tags for license endpoints (#21101 ) ## Summary Change `@Tags` from `Organizations` to `Enterprise` for `POST /licenses` and `POST /licenses/refresh-entitlements` to match the `GET` and `DELETE` license endpoints which are already tagged as `Enterprise`. ## Problem The license API endpoints were inconsistently tagged in the swagger annotations: - `GET /licenses` → `Enterprise` ✓ - `DELETE /licenses/{id}` → `Enterprise` ✓ - `POST /licenses` → `Organizations` ✗ - `POST /licenses/refresh-entitlements` → `Organizations` ✗ This caused the POST endpoints to be documented in the [Organizations API docs](https://coder.com/docs/reference/api/organizations) instead of the [Enterprise API docs](https://coder.com/docs/reference/api/enterprise) where the other license endpoints live. ## Fix Simply updated the `@Tags` annotation from `Organizations` to `Enterprise` for both POST endpoints. This was an oversight from the original swagger docs addition in #5625 (January 2023). Co-authored-by: blink-so[bot] <211532188+blink-so[bot]@users.noreply.github.com>	2025-12-05 15:27:22 +00:00
Marcin Tojek	9c7135a61d	chore: add license check for prebuilds (#20947 ) Related: https://github.com/coder/coder/pull/20864	2025-11-26 15:00:07 +01:00

732 Commits