coder/coderd/boundaryusage/doc.go
Zach 7dfa33b410 feat: add boundary usage tracking database schema and tracker skeleton (#21670)
feat: add boundary usage telemetry database schema and RBAC

Adds the foundation for tracking boundary usage telemetry across Coder
replicas. This includes:

  - Database schema: `boundary_usage_stats` table with per-replica stats
    (unique workspaces, unique users, allowed/denied request counts)
  - Database queries: upsert stats, get aggregated summary, reset stats,
    delete by replica ID
  - RBAC: `boundary_usage` resource type with read/update/delete actions,
    accessible only via system `BoundaryUsageTracker` subject (not regular
    user roles)
  - Tracker skeleton + docs: stub implementation in `coderd/boundaryusage/`

The tracker accumulates stats in memory and periodically flushes to the
database. Stats are aggregated across replicas for telemetry reporting,
then reset when a new reporting period begins. The tracker implementation
and plumbing will be done in a subsequent commit/PR.

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 13:29:21 -07:00


// Package boundaryusage tracks workspace boundary usage for telemetry reporting.
// The design intent is to track trends and rough usage patterns.
//
// Each replica does in-memory usage tracking. Boundary usage is inferred at the
// control plane when workspace agents call the ReportBoundaryLogs RPC. Accumulated
// stats are periodically flushed to a database table keyed by replica ID. Telemetry
// aggregates are computed across all replicas when generating snapshots.
//
// Aggregate Precision:
//
// The aggregated stats represent approximate usage over roughly the telemetry
// snapshot interval, not a precise time window. This imprecision arises because:
//
//   - Each replica flushes independently, so each replica's data covers a
//     slightly different time range (varying by up to the flush interval)
//   - Unflushed in-memory data at snapshot time rolls into the next period
//   - The snapshot captures "data flushed since last reset" rather than "usage
//     during exactly the last N minutes"
//
// We accept this imprecision to keep the architecture simple. Each replica
// operates independently and flushes to the database on its own schedule.
// This approach also minimizes database load. The table contains at most one
// row per replica, so flushes are just upserts, and a reset deletes at most
// one row per replica. There's no accumulation of historical data to clean
// up. The only synchronization is a database lock that ensures exactly one
// replica reports telemetry per period.
//
// Known Shortcomings:
//
//   - Unique workspace/user counts may be inflated when the same workspace or
//     user connects through multiple replicas, as each replica tracks its own
//     unique set.
//   - Ad-hoc boundary usage in a workspace may not be accounted for, e.g. if
//     the boundary command is invoked directly with the --log-proxy-socket-path
//     flag set to something other than the workspace agent server.
//
// Implementation:
//
// The Tracker maintains sets of unique workspace IDs and user IDs, plus request
// counters. When boundary logs are reported, Track() adds the IDs to the sets
// and increments request counters.
//
// FlushToDB() writes stats to the database, replacing all values with the current
// in-memory state. Stats accumulate in memory throughout the telemetry period.
//
// A new period is detected when the upsert results in an INSERT (meaning
// telemetry deleted the replica's row). At that point, all in-memory stats are
// reset so they only count usage within the new period.
//
// Below is a sequence diagram showing the flow of boundary usage tracking.
//
//    ┌───────┐       ┌───────────────┐    ┌──────────┐    ┌────┐    ┌───────────┐
//    │ Agent │       │BoundaryLogsAPI│    │ Tracker  │    │ DB │    │ Telemetry │
//    └───┬───┘       └───────┬───────┘    └────┬─────┘    └──┬─┘    └─────┬─────┘
//        │                   │                 │             │            │
//        │ ReportBoundaryLogs│                 │             │            │
//        ├──────────────────►│                 │             │            │
//        │                   │ Track(...)      │             │            │
//        │                   ├────────────────►│             │            │
//        │         :         │                 │             │            │
//        │         :         │                 │             │            │
//        │ ReportBoundaryLogs│                 │             │            │
//        ├──────────────────►│                 │             │            │
//        │                   │ Track(...)      │             │            │
//        │                   ├────────────────►│             │            │
//        │                   │                 │             │            │
//        │                   │                 │ FlushToDB   │            │
//        │                   │                 ├────────────►│            │
//        │                   │                 │      :      │            │
//        │                   │                 │      :      │            │
//        │                   │                 │ FlushToDB   │            │
//        │                   │                 ├────────────►│            │
//        │                   │                 │             │            │
//        │                   │                 │             │ Snapshot   │
//        │                   │                 │             │ interval   │
//        │                   │                 │             │◄───────────┤
//        │                   │                 │             │ Aggregate  │
//        │                   │                 │             │ & Reset    │
package boundaryusage
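
The flow described in the package comment can be sketched roughly as follows. This is a minimal illustration based on a literal reading of the comment, not the actual Coder implementation (which the commit message says lands in a later PR). The `Stats`, `Store`, and `MemStore` types, the `Track` parameter list, and the `AggregateAndReset` helper are all hypothetical; `MemStore` stands in for the `boundary_usage_stats` table.

```go
package main

import "sync"

// Stats is a hypothetical per-replica row.
type Stats struct {
	UniqueWorkspaces, UniqueUsers int
	Allowed, Denied               int64
}

// Store is a hypothetical stand-in for the database layer. Upsert reports
// whether the write created a new row (INSERT) instead of updating one.
type Store interface {
	Upsert(replicaID string, s Stats) (inserted bool)
}

// MemStore is an in-memory Store, used only for illustration.
type MemStore struct {
	mu   sync.Mutex
	rows map[string]Stats
}

func NewMemStore() *MemStore { return &MemStore{rows: map[string]Stats{}} }

func (m *MemStore) Upsert(replicaID string, s Stats) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	_, existed := m.rows[replicaID]
	m.rows[replicaID] = s // replace all values with the flushed state
	return !existed
}

// AggregateAndReset mimics the telemetry side: sum every replica's row,
// then delete the rows so the next flush from each replica is an INSERT.
func (m *MemStore) AggregateAndReset() Stats {
	m.mu.Lock()
	defer m.mu.Unlock()
	var total Stats
	for id, s := range m.rows {
		total.UniqueWorkspaces += s.UniqueWorkspaces
		total.UniqueUsers += s.UniqueUsers
		total.Allowed += s.Allowed
		total.Denied += s.Denied
		delete(m.rows, id)
	}
	return total
}

// Tracker accumulates one replica's usage in memory.
type Tracker struct {
	mu              sync.Mutex
	replicaID       string
	workspaces      map[string]struct{}
	users           map[string]struct{}
	allowed, denied int64
}

func NewTracker(replicaID string) *Tracker {
	return &Tracker{
		replicaID:  replicaID,
		workspaces: map[string]struct{}{},
		users:      map[string]struct{}{},
	}
}

// Track records the IDs in the unique sets and bumps the request counters.
func (t *Tracker) Track(workspaceID, userID string, allowed, denied int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.workspaces[workspaceID] = struct{}{}
	t.users[userID] = struct{}{}
	t.allowed += allowed
	t.denied += denied
}

// FlushToDB writes the current in-memory state. If the upsert was an INSERT,
// the row had been deleted by telemetry (or never existed), so a new period
// has begun and the in-memory stats are reset.
func (t *Tracker) FlushToDB(db Store) {
	t.mu.Lock()
	defer t.mu.Unlock()
	s := Stats{
		UniqueWorkspaces: len(t.workspaces),
		UniqueUsers:      len(t.users),
		Allowed:          t.allowed,
		Denied:           t.denied,
	}
	if db.Upsert(t.replicaID, s) {
		t.workspaces = map[string]struct{}{}
		t.users = map[string]struct{}{}
		t.allowed, t.denied = 0, 0
	}
}
```

Because the aggregate simply sums per-replica rows, two trackers that each see the same workspace contribute two "unique" workspaces to the total, which is the inflation noted under Known Shortcomings.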