coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 21:18:24 +00:00

Author	SHA1	Message	Date
Spike Curtis	bddb808b25	chore: arrange imports in a standard way (#21452 ) Fixes all our Go file imports to match the preferred spec that we've _mostly_ been using. For example: ``` import ( "context" "time" "github.com/prometheus/client_golang/prometheus" "golang.org/x/xerrors" "gopkg.in/natefinch/lumberjack.v2" "cdr.dev/slog/v3" "github.com/coder/coder/v2/codersdk/agentsdk" "github.com/coder/serpent" ) ``` 3 groups: standard library, 3rd partly libs, Coder libs. This PR makes the change across the codebase. The PR in the stack above modifies our formatting to maintain this state of affairs, and is a separate PR so it's possible to review that one in detail.	2026-01-08 15:24:11 +04:00
Spike Curtis	49b34a716a	fix: fix slog to always use array of Fields (#21426 ) Upgrades to slog v3 which includes a small, but backward incompatible API change to the acceptible call arguments when logging. This change allows us to verify via compile time type checking that arguments are correct and won't cause a panic, as was possible in slog v1, which this replaces (v2 was tagged but never used in coder/coder). It also updates dependencies that also use slog and were updated. I've left the `aibridge` dependency as a commit SHA, under the assumption that the team there (cc @pawbana @dannykopping ) will tag and update the dependency soon and on their own schedule. Other dependencies, I pushed new tags.	2026-01-08 10:29:41 +04:00
Callum Styan	45c43d4ec4	fix: refactor agent resource monitoring API to avoid excessive calls to DB (#20430 ) This should resolve https://github.com/coder/internal/issues/728 by refactoring the ResourceMonitorAPI struct to only require querying the resource monitor once for memory and once for volumes, then using the stored monitors on the API struct from that point on. This should eliminate the vast majority of calls to `GetWorkspaceByAgentID` and `FetchVolumesResourceMonitorsUpdatedAfter`/`FetchMemoryResourceMonitorsUpdatedAfter` (millions of calls per week). Tests passed, and I ran an instance of coder via a workspace with a template that added resource monitoring every 10s. Note that this is the default docker container, so there are other sources of `GetWorkspaceByAgentID` db queries. Note that this workspace was running for ~15 minutes at the time I gathered this data. Over 30s for the `ResourceMonitor` calls: ``` coder@callum-coder-2:~/coder$ curl localhost:19090/metrics \| grep ResourceMonitor \| grep count % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsByAgentID"} 2 coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsUpdatedAfter"} 2 100 288k 0 288k 0 0 58.3M 0 --:--:-- --:--:-- --:--:-- 70.4M coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsByAgentID"} 2 coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsUpdatedAfter"} 2 coderd_db_query_latencies_seconds_count{query="UpdateMemoryResourceMonitor"} 155 coderd_db_query_latencies_seconds_count{query="UpdateVolumeResourceMonitor"} 155 coder@callum-coder-2:~/coder$ curl localhost:19090/metrics \| grep ResourceMonitor \| grep count % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsByAgentID"} 2 coderd_db_query_latencies_seconds_count{query="FetchMemoryResourceMonitorsUpdatedAfter"} 2 100 288k 0 288k 0 0 34.7M 0 --:--:-- --:--:-- --:--:-- 40.2M coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsByAgentID"} 2 coderd_db_query_latencies_seconds_count{query="FetchVolumesResourceMonitorsUpdatedAfter"} 2 coderd_db_query_latencies_seconds_count{query="UpdateMemoryResourceMonitor"} 158 coderd_db_query_latencies_seconds_count{query="UpdateVolumeResourceMonitor"} 158 ``` And over 1m for the `GetWorkspaceAgentByID` calls, the majority are from the workspace metadata stats updates: ``` coder@callum-coder-2:~/coder$ curl localhost:19090/metrics \| grep GetWorkspaceByAgentID \| grep count % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 284k 0 284k 0 0 42.4M 0 --:--:-- --:--:-- --:--:-- 46.3M coderd_db_query_latencies_seconds_count{query="GetWorkspaceByAgentID"} 876 coder@callum-coder-2:~/coder$ curl localhost:19090/metrics \| grep GetWorkspaceByAgentID \| grep count % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 284k 0 284k 0 0 75.4M 0 --:--:-- --:--:-- --:--:-- 92.7M coderd_db_query_latencies_seconds_count{query="GetWorkspaceByAgentID"} 918 ``` --------- Signed-off-by: Callum Styan <callumstyan@gmail.com>	2025-10-28 13:38:16 -07:00
Danielle Maywood	ec517657a8	chore: add targets to oom/ood notifications (#16968 ) Add targets to OOM/OOD notifications to allow Coder Inbox clients to filter on these notifications.	2025-03-18 12:59:30 +00:00
Danielle Maywood	d6b9806098	chore: implement oom/ood processing component (#16436 ) Implements the processing logic as set out in the OOM/OOD RFC.	2025-02-17 16:56:52 +00:00
Vincent Vielle	bc609d0056	feat: integrate agentAPI with resources monitoring logic (#16438 ) As part of the new resources monitoring logic - more specifically for OOM & OOD Notifications , we need to update the AgentAPI , and the agents logic. This PR aims to do it, and more specifically : We are updating the AgentAPI & TailnetAPI to version 24 to add two new methods in the AgentAPI : - One method to fetch the resources monitoring configuration - One method to push the datapoints for the resources monitoring. Also, this PR adds a new logic on the agent side, with a routine running and ticking - fetching the resources usage each time , but also storing it in a FIFO like queue. Finally, this PR fixes a problem we had with RBAC logic on the resources monitoring model, applying the same logic than we have for similar entities.	2025-02-14 10:28:15 +01:00

6 Commits