Backport of #23835.
Audit and connection log pages were timing out due to expensive COUNT(*)
queries over large tables. This commit adds opt-in count capping:
requests can return a `count_cap` field signaling that the count was
truncated at a threshold, avoiding full table scans that caused page
timeouts.
Text-cast UUID comparisons in regosql-generated authorization queries
also contributed to the slowdown by preventing index usage for
connection and audit log queries. These now emit native UUID operators.
Frontend changes handle the capped state in usePaginatedQuery and
PaginationWidget, optionally displaying a capped count in the pagination
UI (e.g. "Showing 2,076 to 2,100 of 2,000+ logs")
---
Cherry picked from 86ca61d6ca
### Breaking change (changelog note):
>With new connection events appearing in the Connection Log, connection events older than 90 days will now be deleted from the Audit Log. If you require this legacy data, we recommend querying it from the REST API or making a backup of the database/these events before upgrading your Coder deployment. Please see the PR for details on what exactly will be deleted.
Of note is that there are currently no plans to delete connection events from the Connection Log.
### Context
This is the fifth PR for moving connection events out of the audit log.
In previous PRs:
- **New** connection logs have been routed to the `connection_logs` table. They will *not* appear in the audit log.
- These new connection logs are served from the new `/api/v2/connectionlog` endpoint.
In this PR:
- We'll now clean existing connection events out of the audit log, if they are older than 90 days, We do this in batches of 1000, every 10 minutes.
The criteria for deletion is simple:
```
WHERE
(
action = 'connect'
OR action = 'disconnect'
OR action = 'open'
OR action = 'close'
)
AND "time" < @before_time::timestamp with time zone
```
where `@before_time` is currently configured to 90 days in the past.
Future PRs:
- Write documentation for the endpoint / feature
Closes#17689
This PR optimizes the audit logs query performance by extracting the
count operation into a separate query and replacing the OR-based
workspace_builds with conditional joins.
## Query changes
* Extracted count query to separate one
* Replaced single `workspace_builds` join with OR conditions with
separate conditional joins
* Added conditional joins
* `wb_build` for workspace_build audit logs (which is a direct lookup)
* `wb_workspace` for workspace create audit logs (via workspace)
Optimized AuditLogsOffset query:
https://explain.dalibo.com/plan/4g1hbedg4a564bg8
New CountAuditLogs query:
https://explain.dalibo.com/plan/ga2fbcecb9efbce3
This commit adds new audit resource types for workspace agents and
workspace apps, as well as connect/disconnect and open/close actions.
The idea is that we will log new audit events for connecting to the
agent via SSH/editor.
Likewise, we will log openings of `coder_app`s.
This change also introduces support for filtering by `request_id`.
Updates #15139
* Allow creating test audits with nil org
Not all audit entries have organization IDs, so this will allow us to
test those cases.
* Add organization details to audit log queries
* Add organization to audit log response
This replaces the old ID. This is a breaking change but organizations
were not being used before.