shishantbiswas/coder

mirror of https://github.com/coder/coder.git synced 2026-06-03 13:08:25 +00:00

Commit Graph

Author

SHA1

Message

Date

Zach

a31e476623

fix: make boundary usage telemetry collection atomic (#21907)

Previously, UpsertBoundaryUsageStats (INSERT...ON CONFLICT DO UPDATE) and
GetAndResetBoundaryUsageSummary (DELETE...RETURNING) could race during
telemetry period cutover. Without serialization, an upsert concurrent with the
delete could lose data (deleted right after being written) or commit after the
delete (miscounted in the next period). Both operations now acquire
LockIDBoundaryUsageStats within a transaction to ensure a clean cutover.

2026-02-06 09:52:17 -07:00
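
As a rough illustration of the serialization described in a31e476623, here is a minimal in-memory sketch. It is not the repository's code: `Store`, `Stats`, `Upsert`, `GetAndReset`, and `simulateCutover` are hypothetical names, a `sync.Mutex` stands in for the lock that both database operations acquire inside their transactions, and the upsert simply adds deltas rather than mirroring the real query.

```go
package main

import (
	"fmt"
	"sync"
)

// Stats holds request counters for one replica (hypothetical shape).
type Stats struct {
	Allowed, Denied int64
}

// Store is an in-memory stand-in for the stats table. The mutex plays the
// role of the shared lock taken by both operations; it is an assumption of
// this sketch, not the real locking mechanism.
type Store struct {
	lock sync.Mutex
	rows map[string]Stats // keyed by replica ID
}

func NewStore() *Store { return &Store{rows: map[string]Stats{}} }

// Upsert adds a replica's deltas to its row, creating the row if needed.
func (s *Store) Upsert(replica string, delta Stats) {
	s.lock.Lock()
	defer s.lock.Unlock()
	cur := s.rows[replica]
	cur.Allowed += delta.Allowed
	cur.Denied += delta.Denied
	s.rows[replica] = cur
}

// GetAndReset sums every row and deletes it within one critical section, so
// an upsert lands either entirely before or entirely after the cutover.
func (s *Store) GetAndReset() Stats {
	s.lock.Lock()
	defer s.lock.Unlock()
	var total Stats
	for id, r := range s.rows {
		total.Allowed += r.Allowed
		total.Denied += r.Denied
		delete(s.rows, id)
	}
	return total
}

// simulateCutover runs n upserts concurrently with n collections and returns
// the total number of allowed requests reported across all periods.
func simulateCutover(n int) int64 {
	s := NewStore()
	var wg sync.WaitGroup
	var mu sync.Mutex
	var reported int64
	for i := 0; i < n; i++ {
		wg.Add(2)
		go func(i int) {
			defer wg.Done()
			s.Upsert(fmt.Sprintf("replica-%d", i%3), Stats{Allowed: 1})
		}(i)
		go func() {
			defer wg.Done()
			sum := s.GetAndReset()
			mu.Lock()
			reported += sum.Allowed
			mu.Unlock()
		}()
	}
	wg.Wait()
	reported += s.GetAndReset().Allowed // collect whatever the last period holds
	return reported
}

func main() {
	fmt.Println("reported:", simulateCutover(100))
}
```

Because every write and every collection goes through the same lock, each upsert is counted in exactly one period, which is the property the commit message calls a clean cutover.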

Zach

90aeea5649

fix: handle boundary usage across snapshots and flush races (#21805)

Previously there were two issues that could cause incorrect boundary
usage telemetry data.

1. Bad handling across snapshot intervals: After the telemetry snapshot deleted
the DB row, the next flush would INSERT the stale cumulative data (which
included already-reported usage). This would then be overwritten by
subsequent UPDATE flushes, causing the delta between the last snapshot
and the reset to be lost (under-reporting usage). Additionally, if there
was no new usage after the reset, the tracker would carry over all usage
from the previous period into the next period (over-reporting usage).

2. Missed usage from a race condition: Track() calls between the first
mutex unlock and second mutex lock in FlushToDB() were lost. The data
wasn't included in the current flush (already snapshotted) and was wiped
by the subsequent reset. This likely has little impact on overall usage
numbers in the real world.

Fix by tracking unique workspace/user deltas separately from cumulative
values and always tracking delta allowed/denied requests. Deltas are used
for INSERT (fresh start after reset), cumulative for UPDATE (accurate unique
counts within a period). All counters reset atomically before the DB operation
so Track() calls during the operation are preserved for the next flush.

2026-02-02 09:11:54 -07:00
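
The race in item 2 of 90aeea5649 can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual `FlushToDB()`: `Tracker`, `Delta`, `Flush`, and `demoFlush` are hypothetical, only the allowed/denied deltas are modeled (the unique workspace/user bookkeeping is left out), and the database operation is replaced by a callback.

```go
package main

import (
	"fmt"
	"sync"
)

// Delta is the usage accumulated since the previous flush.
type Delta struct {
	Allowed, Denied int64
}

// Tracker is a hypothetical, simplified stand-in for the real tracker.
type Tracker struct {
	mu    sync.Mutex
	delta Delta
}

// Track records one request decision.
func (t *Tracker) Track(allowed bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if allowed {
		t.delta.Allowed++
	} else {
		t.delta.Denied++
	}
}

// Flush snapshots and resets the counters in a single critical section, then
// calls write (the stand-in for the DB operation) with the lock released.
// Anything tracked while write runs accumulates for the next flush.
func (t *Tracker) Flush(write func(Delta)) {
	t.mu.Lock()
	snap := t.delta
	t.delta = Delta{}
	t.mu.Unlock()
	write(snap)
}

// demoFlush tracks one request, then tracks another while the first flush's
// write is still in flight, and returns what each flush wrote.
func demoFlush() (first, second Delta) {
	t := &Tracker{}
	t.Track(true)
	t.Flush(func(d Delta) {
		first = d
		t.Track(false) // arrives during the "DB write"
	})
	t.Flush(func(d Delta) { second = d })
	return first, second
}

func main() {
	first, second := demoFlush()
	fmt.Printf("first=%+v second=%+v\n", first, second)
}
```

Since the reset happens in the same critical section as the snapshot, there is no window between two lock acquisitions in which a `Track()` call can be recorded and then wiped.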

Zach

2204731ddb

feat: implement boundary usage tracker and telemetry collection (#21716)

Implements boundary usage tracking across all Coder
replicas and reports the results via telemetry.

Changes:
- Implement Tracker with Track(), FlushToDB(), and StartFlushLoop() methods
- Add telemetry integration via collectBoundaryUsageSummary()
- Use telemetry lock to ensure only one replica collects per period

The tracker accumulates unique workspaces, unique users, and request
counts (allowed/denied) in memory, then flushes to the database
periodically. During telemetry collection, stats are aggregated across
all replicas and reset for the next period.

2026-01-27 19:11:40 -07:00
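
The in-memory accumulation described in 2204731ddb might look roughly like the sketch below. All names (`UsageTracker`, `ReplicaStats`, `Aggregate`) and signatures are hypothetical, and the periodic flush loop is omitted. Summing per-replica unique counts is an assumption of this sketch; the commit message does not say how uniques are combined, and a plain sum would count a workspace seen on two replicas twice.

```go
package main

import (
	"fmt"
	"sync"
)

// ReplicaStats is what one replica would flush to the database (assumed shape).
type ReplicaStats struct {
	UniqueWorkspaces, UniqueUsers int
	Allowed, Denied               int64
}

// UsageTracker accumulates usage in memory for a single replica.
type UsageTracker struct {
	mu         sync.Mutex
	workspaces map[string]struct{}
	users      map[string]struct{}
	allowed    int64
	denied     int64
}

func NewUsageTracker() *UsageTracker {
	return &UsageTracker{
		workspaces: map[string]struct{}{},
		users:      map[string]struct{}{},
	}
}

// Track records one request for a workspace and its user.
func (t *UsageTracker) Track(workspaceID, userID string, allowed bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.workspaces[workspaceID] = struct{}{}
	t.users[userID] = struct{}{}
	if allowed {
		t.allowed++
	} else {
		t.denied++
	}
}

// Stats returns the replica's current counters.
func (t *UsageTracker) Stats() ReplicaStats {
	t.mu.Lock()
	defer t.mu.Unlock()
	return ReplicaStats{
		UniqueWorkspaces: len(t.workspaces),
		UniqueUsers:      len(t.users),
		Allowed:          t.allowed,
		Denied:           t.denied,
	}
}

// Aggregate sums the rows flushed by all replicas (assumed aggregation rule).
func Aggregate(rows []ReplicaStats) ReplicaStats {
	var total ReplicaStats
	for _, r := range rows {
		total.UniqueWorkspaces += r.UniqueWorkspaces
		total.UniqueUsers += r.UniqueUsers
		total.Allowed += r.Allowed
		total.Denied += r.Denied
	}
	return total
}

func main() {
	a, b := NewUsageTracker(), NewUsageTracker()
	a.Track("ws-1", "alice", true)
	a.Track("ws-1", "alice", false)
	b.Track("ws-2", "bob", true)
	fmt.Printf("%+v\n", Aggregate([]ReplicaStats{a.Stats(), b.Stats()}))
}
```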

3 Commits