mirror of
https://github.com/coder/coder.git
synced 2026-06-02 20:48:20 +00:00
feat: add prebuild timing metrics to Prometheus (#19503)
## Description

This PR introduces one counter and two histograms related to workspace creation and claiming. The goal is to provide clearer observability into how workspaces are created (regular vs prebuild) and the time cost of those operations.

### `coderd_workspace_creation_total`

* Metric type: Counter
* Name: `coderd_workspace_creation_total`
* Labels: `organization_name`, `template_name`, `preset_name`

This counter tracks whether a regular workspace (not created from a prebuild pool) was created using a preset or not. Currently, we already expose `coderd_prebuilt_workspaces_claimed_total` for claimed prebuilt workspaces, but we lack a comparable metric for regular workspace creations. This metric fills that gap, making it possible to compare regular creations against claims.

Implementation notes:

* Exposed as a `coderd_` metric, consistent with other workspace-related metrics (e.g. `coderd_api_workspace_latest_build`: https://github.com/coder/coder/blob/main/coderd/prometheusmetrics/prometheusmetrics.go#L149).
* Every `defaultRefreshRate` (1 minute), DB query `GetRegularWorkspaceCreateMetrics` is executed to fetch all regular workspaces (not created from a prebuild pool).
* The counter is updated with the total from all time (not just since metric introduction). This differs from the histograms below, which only accumulate from their introduction forward.

### `coderd_workspace_creation_duration_seconds` & `coderd_prebuilt_workspace_claim_duration_seconds`

* Metric types: Histogram
* Names:
  * `coderd_workspace_creation_duration_seconds`
    * Labels: `organization_name`, `template_name`, `preset_name`, `type` (`regular`, `prebuild`)
  * `coderd_prebuilt_workspace_claim_duration_seconds`
    * Labels: `organization_name`, `template_name`, `preset_name`

We already have `coderd_provisionerd_workspace_build_timings_seconds`, which tracks build run times for all workspace builds handled by the provisioner daemon. However, in the context of this issue, we are only interested in creation and claim build times, not all transitions; additionally, this metric does not include `preset_name`, and adding it there would significantly increase cardinality. Therefore, separate more focused metrics are introduced here:

* `coderd_workspace_creation_duration_seconds`: Build time to create a workspace (either a regular workspace or the build into a prebuild pool, for prebuild initial provisioning build).
* `coderd_prebuilt_workspace_claim_duration_seconds`: Time to claim a prebuilt workspace from the pool.

The reason for two separate histograms is that:

* Creation (regular or prebuild): provisioning builds with similar time magnitude, generally expected to take longer than a claim operation.
* Claim: expected to be a much faster provisioning build.

#### Native histogram usage

Provisioning times vary widely between projects. Using static buckets risks unbalanced or poorly informative histograms. To address this, these metrics use [Prometheus native histograms](https://prometheus.io/docs/specs/native_histograms/):

* First introduced in Prometheus v2.40.0
* Recommended stable usage from v2.45+
* Requires Go client `prometheus/client_golang` v1.15.0+
* Experimental and must be explicitly enabled on the server (`--enable-feature=native-histograms`)

For compatibility, we also retain a classic bucket definition (aligned with the existing provisioner metric: https://github.com/coder/coder/blob/main/provisionerd/provisionerd.go#L182-L189).

* If native histograms are enabled, Prometheus ingests the high-resolution histogram.
* If not, it falls back to the predefined buckets.

Implementation notes:

* Unlike the counter, these histograms are updated in real-time at workspace build job completion.
* They reflect data only from the point of introduction forward (no historical backfill).
## Relates to

Closes: https://github.com/coder/coder/issues/19528

Native histograms tested in observability stack: https://github.com/coder/observability/pull/50
+12
-6
@@ -62,12 +62,6 @@ import (
	"github.com/coder/serpent"
	"github.com/coder/wgtunnel/tunnelsdk"

	"github.com/coder/coder/v2/coderd/entitlements"
	"github.com/coder/coder/v2/coderd/notifications/reports"
	"github.com/coder/coder/v2/coderd/runtimeconfig"
	"github.com/coder/coder/v2/coderd/webpush"
	"github.com/coder/coder/v2/codersdk/drpcsdk"

	"github.com/coder/coder/v2/buildinfo"
	"github.com/coder/coder/v2/cli/clilog"
	"github.com/coder/coder/v2/cli/cliui"
@@ -83,15 +77,19 @@ import (
	"github.com/coder/coder/v2/coderd/database/migrations"
	"github.com/coder/coder/v2/coderd/database/pubsub"
	"github.com/coder/coder/v2/coderd/devtunnel"
	"github.com/coder/coder/v2/coderd/entitlements"
	"github.com/coder/coder/v2/coderd/externalauth"
	"github.com/coder/coder/v2/coderd/gitsshkey"
	"github.com/coder/coder/v2/coderd/httpmw"
	"github.com/coder/coder/v2/coderd/jobreaper"
	"github.com/coder/coder/v2/coderd/notifications"
	"github.com/coder/coder/v2/coderd/notifications/reports"
	"github.com/coder/coder/v2/coderd/oauthpki"
	"github.com/coder/coder/v2/coderd/prometheusmetrics"
	"github.com/coder/coder/v2/coderd/prometheusmetrics/insights"
	"github.com/coder/coder/v2/coderd/promoauth"
	"github.com/coder/coder/v2/coderd/provisionerdserver"
	"github.com/coder/coder/v2/coderd/runtimeconfig"
	"github.com/coder/coder/v2/coderd/schedule"
	"github.com/coder/coder/v2/coderd/telemetry"
	"github.com/coder/coder/v2/coderd/tracing"
@@ -99,9 +97,11 @@ import (
	"github.com/coder/coder/v2/coderd/util/ptr"
	"github.com/coder/coder/v2/coderd/util/slice"
	stringutil "github.com/coder/coder/v2/coderd/util/strings"
	"github.com/coder/coder/v2/coderd/webpush"
	"github.com/coder/coder/v2/coderd/workspaceapps/appurl"
	"github.com/coder/coder/v2/coderd/workspacestats"
	"github.com/coder/coder/v2/codersdk"
	"github.com/coder/coder/v2/codersdk/drpcsdk"
	"github.com/coder/coder/v2/cryptorand"
	"github.com/coder/coder/v2/provisioner/echo"
	"github.com/coder/coder/v2/provisioner/terraform"
@@ -280,6 +280,12 @@ func enablePrometheus(
		}
	}

	provisionerdserverMetrics := provisionerdserver.NewMetrics(logger)
	if err := provisionerdserverMetrics.Register(options.PrometheusRegistry); err != nil {
		return nil, xerrors.Errorf("failed to register provisionerd_server metrics: %w", err)
	}
	options.ProvisionerdServerMetrics = provisionerdserverMetrics

	//nolint:revive
	return ServeHandler(
		ctx, logger, promhttp.InstrumentMetricHandler(
@@ -241,6 +241,8 @@ type Options struct {
	UpdateAgentMetrics func(ctx context.Context, labels prometheusmetrics.AgentMetricLabels, metrics []*agentproto.Stats_Metric)
	StatsBatcher       workspacestats.Batcher

	ProvisionerdServerMetrics *provisionerdserver.Metrics

	// WorkspaceAppAuditSessionTimeout allows changing the timeout for audit
	// sessions. Raising or lowering this value will directly affect the write
	// load of the audit log table. This is used for testing. Default 1 hour.
@@ -1930,6 +1932,7 @@ func (api *API) CreateInMemoryTaggedProvisionerDaemon(dialCtx context.Context, n
		},
		api.NotificationsEnqueuer,
		&api.PrebuildsReconciler,
		api.ProvisionerdServerMetrics,
	)
	if err != nil {
		return nil, err
@@ -184,6 +184,8 @@ type Options struct {
	OIDCConvertKeyCache cryptokeys.SigningKeycache
	Clock               quartz.Clock
	TelemetryReporter   telemetry.Reporter

	ProvisionerdServerMetrics *provisionerdserver.Metrics
}

// New constructs a codersdk client connected to an in-memory API instance.
@@ -604,6 +606,7 @@ func NewOptions(t testing.TB, options *Options) (func(http.Handler), context.Can
		Clock:                     options.Clock,
		AppEncryptionKeyCache:     options.APIKeyEncryptionCache,
		OIDCConvertKeyCache:       options.OIDCConvertKeyCache,
		ProvisionerdServerMetrics: options.ProvisionerdServerMetrics,
	}
}

@@ -2699,6 +2699,13 @@ func (q *querier) GetQuotaConsumedForUser(ctx context.Context, params database.G
	return q.db.GetQuotaConsumedForUser(ctx, params)
}

func (q *querier) GetRegularWorkspaceCreateMetrics(ctx context.Context) ([]database.GetRegularWorkspaceCreateMetricsRow, error) {
	if err := q.authorizeContext(ctx, policy.ActionRead, rbac.ResourceWorkspace.All()); err != nil {
		return nil, err
	}
	return q.db.GetRegularWorkspaceCreateMetrics(ctx)
}

func (q *querier) GetReplicaByID(ctx context.Context, id uuid.UUID) (database.Replica, error) {
	if err := q.authorizeContext(ctx, policy.ActionRead, rbac.ResourceSystem); err != nil {
		return database.Replica{}, err
@@ -2177,6 +2177,10 @@ func (s *MethodTestSuite) TestWorkspace() {
		dbm.EXPECT().GetWorkspaceAgentDevcontainersByAgentID(gomock.Any(), agt.ID).Return([]database.WorkspaceAgentDevcontainer{d}, nil).AnyTimes()
		check.Args(agt.ID).Asserts(w, policy.ActionRead).Returns([]database.WorkspaceAgentDevcontainer{d})
	}))
	s.Run("GetRegularWorkspaceCreateMetrics", s.Subtest(func(_ database.Store, check *expects) {
		check.Args().
			Asserts(rbac.ResourceWorkspace.All(), policy.ActionRead)
	}))
}

func (s *MethodTestSuite) TestWorkspacePortSharing() {
@@ -1356,6 +1356,13 @@ func (m queryMetricsStore) GetQuotaConsumedForUser(ctx context.Context, ownerID
	return consumed, err
}

func (m queryMetricsStore) GetRegularWorkspaceCreateMetrics(ctx context.Context) ([]database.GetRegularWorkspaceCreateMetricsRow, error) {
	start := time.Now()
	r0, r1 := m.s.GetRegularWorkspaceCreateMetrics(ctx)
	m.queryLatencies.WithLabelValues("GetRegularWorkspaceCreateMetrics").Observe(time.Since(start).Seconds())
	return r0, r1
}

func (m queryMetricsStore) GetReplicaByID(ctx context.Context, id uuid.UUID) (database.Replica, error) {
	start := time.Now()
	replica, err := m.s.GetReplicaByID(ctx, id)
@@ -2851,6 +2851,21 @@ func (mr *MockStoreMockRecorder) GetQuotaConsumedForUser(ctx, arg any) *gomock.C
	return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetQuotaConsumedForUser", reflect.TypeOf((*MockStore)(nil).GetQuotaConsumedForUser), ctx, arg)
}

// GetRegularWorkspaceCreateMetrics mocks base method.
func (m *MockStore) GetRegularWorkspaceCreateMetrics(ctx context.Context) ([]database.GetRegularWorkspaceCreateMetricsRow, error) {
	m.ctrl.T.Helper()
	ret := m.ctrl.Call(m, "GetRegularWorkspaceCreateMetrics", ctx)
	ret0, _ := ret[0].([]database.GetRegularWorkspaceCreateMetricsRow)
	ret1, _ := ret[1].(error)
	return ret0, ret1
}

// GetRegularWorkspaceCreateMetrics indicates an expected call of GetRegularWorkspaceCreateMetrics.
func (mr *MockStoreMockRecorder) GetRegularWorkspaceCreateMetrics(ctx any) *gomock.Call {
	mr.mock.ctrl.T.Helper()
	return mr.mock.ctrl.RecordCallWithMethodType(mr.mock, "GetRegularWorkspaceCreateMetrics", reflect.TypeOf((*MockStore)(nil).GetRegularWorkspaceCreateMetrics), ctx)
}

// GetReplicaByID mocks base method.
func (m *MockStore) GetReplicaByID(ctx context.Context, id uuid.UUID) (database.Replica, error) {
	m.ctrl.T.Helper()
@@ -306,6 +306,9 @@ type sqlcQuerier interface {
	GetProvisionerLogsAfterID(ctx context.Context, arg GetProvisionerLogsAfterIDParams) ([]ProvisionerJobLog, error)
	GetQuotaAllowanceForUser(ctx context.Context, arg GetQuotaAllowanceForUserParams) (int64, error)
	GetQuotaConsumedForUser(ctx context.Context, arg GetQuotaConsumedForUserParams) (int64, error)
	// Count regular workspaces: only those whose first successful 'start' build
	// was not initiated by the prebuild system user.
	GetRegularWorkspaceCreateMetrics(ctx context.Context) ([]GetRegularWorkspaceCreateMetricsRow, error)
	GetReplicaByID(ctx context.Context, id uuid.UUID) (Replica, error)
	GetReplicasUpdatedAfter(ctx context.Context, updatedAt time.Time) ([]Replica, error)
	GetRunningPrebuiltWorkspaces(ctx context.Context) ([]GetRunningPrebuiltWorkspacesRow, error)
@@ -7309,7 +7309,7 @@ const getPrebuildMetrics = `-- name: GetPrebuildMetrics :many
SELECT
	t.name as template_name,
	tvp.name as preset_name,
	o.name as organization_name,
	o.name as organization_name,
	COUNT(*) as created_count,
	COUNT(*) FILTER (WHERE pj.job_status = 'failed'::provisioner_job_status) as failed_count,
	COUNT(*) FILTER (
@@ -20131,6 +20131,75 @@ func (q *sqlQuerier) GetDeploymentWorkspaceStats(ctx context.Context) (GetDeploy
	return i, err
}

const getRegularWorkspaceCreateMetrics = `-- name: GetRegularWorkspaceCreateMetrics :many
WITH first_success_build AS (
  -- Earliest successful 'start' build per workspace
  SELECT DISTINCT ON (wb.workspace_id)
    wb.workspace_id,
    wb.template_version_preset_id,
    wb.initiator_id
  FROM workspace_builds wb
  JOIN provisioner_jobs pj ON pj.id = wb.job_id
  WHERE
    wb.transition = 'start'::workspace_transition
    AND pj.job_status = 'succeeded'::provisioner_job_status
  ORDER BY wb.workspace_id, wb.build_number, wb.id
)
SELECT
  t.name AS template_name,
  COALESCE(tvp.name, '') AS preset_name,
  o.name AS organization_name,
  COUNT(*) AS created_count
FROM first_success_build fsb
  JOIN workspaces w ON w.id = fsb.workspace_id
  JOIN templates t ON t.id = w.template_id
  LEFT JOIN template_version_presets tvp ON tvp.id = fsb.template_version_preset_id
  JOIN organizations o ON o.id = w.organization_id
WHERE
  NOT t.deleted
  -- Exclude workspaces whose first successful start was the prebuilds system user
  AND fsb.initiator_id != 'c42fdf75-3097-471c-8c33-fb52454d81c0'::uuid
GROUP BY t.name, COALESCE(tvp.name, ''), o.name
ORDER BY t.name, preset_name, o.name
`

type GetRegularWorkspaceCreateMetricsRow struct {
	TemplateName     string `db:"template_name" json:"template_name"`
	PresetName       string `db:"preset_name" json:"preset_name"`
	OrganizationName string `db:"organization_name" json:"organization_name"`
	CreatedCount     int64  `db:"created_count" json:"created_count"`
}

// Count regular workspaces: only those whose first successful 'start' build
// was not initiated by the prebuild system user.
func (q *sqlQuerier) GetRegularWorkspaceCreateMetrics(ctx context.Context) ([]GetRegularWorkspaceCreateMetricsRow, error) {
	rows, err := q.db.QueryContext(ctx, getRegularWorkspaceCreateMetrics)
	if err != nil {
		return nil, err
	}
	defer rows.Close()
	var items []GetRegularWorkspaceCreateMetricsRow
	for rows.Next() {
		var i GetRegularWorkspaceCreateMetricsRow
		if err := rows.Scan(
			&i.TemplateName,
			&i.PresetName,
			&i.OrganizationName,
			&i.CreatedCount,
		); err != nil {
			return nil, err
		}
		items = append(items, i)
	}
	if err := rows.Close(); err != nil {
		return nil, err
	}
	if err := rows.Err(); err != nil {
		return nil, err
	}
	return items, nil
}

const getWorkspaceACLByID = `-- name: GetWorkspaceACLByID :one
SELECT
	group_acl as groups,
@@ -230,7 +230,7 @@ HAVING COUNT(*) = @hard_limit::bigint;
SELECT
	t.name as template_name,
	tvp.name as preset_name,
	o.name as organization_name,
	o.name as organization_name,
	COUNT(*) as created_count,
	COUNT(*) FILTER (WHERE pj.job_status = 'failed'::provisioner_job_status) as failed_count,
	COUNT(*) FILTER (
@@ -923,3 +923,36 @@ SET
	user_acl = @user_acl
WHERE
	id = @id;

-- name: GetRegularWorkspaceCreateMetrics :many
-- Count regular workspaces: only those whose first successful 'start' build
-- was not initiated by the prebuild system user.
WITH first_success_build AS (
  -- Earliest successful 'start' build per workspace
  SELECT DISTINCT ON (wb.workspace_id)
    wb.workspace_id,
    wb.template_version_preset_id,
    wb.initiator_id
  FROM workspace_builds wb
  JOIN provisioner_jobs pj ON pj.id = wb.job_id
  WHERE
    wb.transition = 'start'::workspace_transition
    AND pj.job_status = 'succeeded'::provisioner_job_status
  ORDER BY wb.workspace_id, wb.build_number, wb.id
)
SELECT
  t.name AS template_name,
  COALESCE(tvp.name, '') AS preset_name,
  o.name AS organization_name,
  COUNT(*) AS created_count
FROM first_success_build fsb
  JOIN workspaces w ON w.id = fsb.workspace_id
  JOIN templates t ON t.id = w.template_id
  LEFT JOIN template_version_presets tvp ON tvp.id = fsb.template_version_preset_id
  JOIN organizations o ON o.id = w.organization_id
WHERE
  NOT t.deleted
  -- Exclude workspaces whose first successful start was the prebuilds system user
  AND fsb.initiator_id != 'c42fdf75-3097-471c-8c33-fb52454d81c0'::uuid
GROUP BY t.name, COALESCE(tvp.name, ''), o.name
ORDER BY t.name, preset_name, o.name;
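The selection rule in the query above (take each workspace's earliest successful `start` build, then drop workspaces where that build was initiated by the prebuilds system user) can be illustrated in memory. This is only a sketch of the logic, not how coderd evaluates it: the `build` type and `countRegularCreations` are hypothetical names, the join to templates and the grouping by organization, template, and preset are left out, and the string IDs are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// prebuildsSystemUser mirrors the UUID literal used in the query.
const prebuildsSystemUser = "c42fdf75-3097-471c-8c33-fb52454d81c0"

// build is a simplified, hypothetical stand-in for a workspace_builds row
// joined with the status of its provisioner job.
type build struct {
	WorkspaceID string
	BuildNumber int
	Transition  string // "start", "stop", or "delete"
	JobStatus   string // "succeeded", "failed", ...
	InitiatorID string
}

// countRegularCreations counts workspaces whose earliest successful 'start'
// build was not initiated by the prebuilds system user.
func countRegularCreations(builds []build) int {
	// Order by workspace, then build number, like the DISTINCT ON ordering.
	sorted := append([]build(nil), builds...)
	sort.SliceStable(sorted, func(i, j int) bool {
		if sorted[i].WorkspaceID != sorted[j].WorkspaceID {
			return sorted[i].WorkspaceID < sorted[j].WorkspaceID
		}
		return sorted[i].BuildNumber < sorted[j].BuildNumber
	})

	seen := map[string]bool{}
	count := 0
	for _, b := range sorted {
		if b.Transition != "start" || b.JobStatus != "succeeded" {
			continue
		}
		if seen[b.WorkspaceID] {
			continue // only the first successful start per workspace matters
		}
		seen[b.WorkspaceID] = true
		if b.InitiatorID != prebuildsSystemUser {
			count++
		}
	}
	return count
}

func main() {
	builds := []build{
		{"ws-a", 1, "start", "succeeded", "user-1"},            // regular creation
		{"ws-b", 1, "start", "succeeded", prebuildsSystemUser}, // prebuild creation
		{"ws-b", 2, "start", "succeeded", "user-1"},            // later build, ignored
		{"ws-c", 1, "start", "failed", "user-1"},               // failed first attempt
		{"ws-c", 2, "start", "succeeded", "user-1"},            // first success, counted
		{"ws-d", 1, "start", "failed", "user-1"},               // never succeeded
	}
	fmt.Println(countRegularCreations(builds))
}
```

In this sample, `ws-a` and `ws-c` are counted, while `ws-b` is excluded because its first successful start belongs to the prebuilds user even though a regular user built it later.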
@@ -165,6 +165,18 @@ func Workspaces(ctx context.Context, logger slog.Logger, registerer prometheus.R
		return nil, err
	}

	workspaceCreationTotal := prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Namespace: "coderd",
			Name:      "workspace_creation_total",
			Help:      "Total regular (non-prebuilt) workspace creations by organization, template, and preset.",
		},
		[]string{"organization_name", "template_name", "preset_name"},
	)
	if err := registerer.Register(workspaceCreationTotal); err != nil {
		return nil, err
	}

	ctx, cancelFunc := context.WithCancel(ctx)
	done := make(chan struct{})

@@ -200,6 +212,27 @@ func Workspaces(ctx context.Context, logger slog.Logger, registerer prometheus.R
				string(w.LatestBuildTransition),
			).Add(1)
		}

		// Update regular workspaces (without a prebuild transition) creation counter
		regularWorkspaces, err := db.GetRegularWorkspaceCreateMetrics(ctx)
		if err != nil {
			if errors.Is(err, sql.ErrNoRows) {
				workspaceCreationTotal.Reset()
			} else {
				logger.Warn(ctx, "failed to load regular workspaces for metrics", slog.Error(err))
			}
			return
		}

		workspaceCreationTotal.Reset()

		for _, regularWorkspace := range regularWorkspaces {
			workspaceCreationTotal.WithLabelValues(
				regularWorkspace.OrganizationName,
				regularWorkspace.TemplateName,
				regularWorkspace.PresetName,
			).Add(float64(regularWorkspace.CreatedCount))
		}
	}

	// Use time.Nanosecond to force an initial tick. It will be reset to the
@@ -424,6 +424,107 @@ func TestWorkspaceLatestBuildStatuses(t *testing.T) {
	}
}

func TestWorkspaceCreationTotal(t *testing.T) {
	t.Parallel()

	for _, tc := range []struct {
		Name               string
		Database           func() database.Store
		ExpectedWorkspaces int
	}{
		{
			Name: "None",
			Database: func() database.Store {
				db, _ := dbtestutil.NewDB(t)
				return db
			},
			ExpectedWorkspaces: 0,
		},
		{
			// Should count only the successfully created workspaces
			Name: "Multiple",
			Database: func() database.Store {
				db, _ := dbtestutil.NewDB(t)
				u := dbgen.User(t, db, database.User{})
				org := dbgen.Organization(t, db, database.Organization{})
				insertTemplates(t, db, u, org)
				insertCanceled(t, db, u, org)
				insertFailed(t, db, u, org)
				insertFailed(t, db, u, org)
				insertSuccess(t, db, u, org)
				insertSuccess(t, db, u, org)
				insertSuccess(t, db, u, org)
				insertRunning(t, db, u, org)
				return db
			},
			ExpectedWorkspaces: 3,
		},
		{
			// Should not include prebuilt workspaces
			Name: "MultipleWithPrebuild",
			Database: func() database.Store {
				ctx := context.Background()
				db, _ := dbtestutil.NewDB(t)
				u := dbgen.User(t, db, database.User{})
				prebuildUser, err := db.GetUserByID(ctx, database.PrebuildsSystemUserID)
				require.NoError(t, err)
				org := dbgen.Organization(t, db, database.Organization{})
				insertTemplates(t, db, u, org)
				insertCanceled(t, db, u, org)
				insertFailed(t, db, u, org)
				insertSuccess(t, db, u, org)
				insertSuccess(t, db, prebuildUser, org)
				insertRunning(t, db, u, org)
				return db
			},
			ExpectedWorkspaces: 1,
		},
		{
			// Should include deleted workspaces
			Name: "MultipleWithDeleted",
			Database: func() database.Store {
				db, _ := dbtestutil.NewDB(t)
				u := dbgen.User(t, db, database.User{})
				org := dbgen.Organization(t, db, database.Organization{})
				insertTemplates(t, db, u, org)
				insertCanceled(t, db, u, org)
				insertFailed(t, db, u, org)
				insertSuccess(t, db, u, org)
				insertRunning(t, db, u, org)
				insertDeleted(t, db, u, org)
				return db
			},
			ExpectedWorkspaces: 2,
		},
	} {
		t.Run(tc.Name, func(t *testing.T) {
			t.Parallel()
			registry := prometheus.NewRegistry()
			closeFunc, err := prometheusmetrics.Workspaces(context.Background(), testutil.Logger(t), registry, tc.Database(), testutil.IntervalFast)
			require.NoError(t, err)
			t.Cleanup(closeFunc)

			require.Eventually(t, func() bool {
				metrics, err := registry.Gather()
				assert.NoError(t, err)

				sum := 0
				for _, m := range metrics {
					if m.GetName() != "coderd_workspace_creation_total" {
						continue
					}
					for _, metric := range m.Metric {
						sum += int(metric.GetCounter().GetValue())
					}
				}

				t.Logf("count = %d, expected == %d", sum, tc.ExpectedWorkspaces)
				return sum == tc.ExpectedWorkspaces
			}, testutil.WaitShort, testutil.IntervalFast)
		})
	}
}

func TestAgents(t *testing.T) {
	t.Parallel()

@@ -897,6 +998,7 @@ func insertRunning(t *testing.T, db database.Store, u database.User, org databas
		Transition:        database.WorkspaceTransitionStart,
		Reason:            database.BuildReasonInitiator,
		TemplateVersionID: templateVersionID,
		InitiatorID:       u.ID,
	})
	require.NoError(t, err)
	// This marks the job as started.
@@ -0,0 +1,177 @@
package provisionerdserver

import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"

	"cdr.dev/slog"
)

type Metrics struct {
	logger                   slog.Logger
	workspaceCreationTimings *prometheus.HistogramVec
	workspaceClaimTimings    *prometheus.HistogramVec
}

type WorkspaceTimingType int

const (
	Unsupported WorkspaceTimingType = iota
	WorkspaceCreation
	PrebuildCreation
	PrebuildClaim
)

const (
	workspaceTypeRegular  = "regular"
	workspaceTypePrebuild = "prebuild"
)

type WorkspaceTimingFlags struct {
	IsPrebuild   bool
	IsClaim      bool
	IsFirstBuild bool
}

func NewMetrics(logger slog.Logger) *Metrics {
	log := logger.Named("provisionerd_server_metrics")

	return &Metrics{
		logger: log,
		workspaceCreationTimings: prometheus.NewHistogramVec(prometheus.HistogramOpts{
			Namespace: "coderd",
			Name:      "workspace_creation_duration_seconds",
			Help:      "Time to create a workspace by organization, template, preset, and type (regular or prebuild).",
			Buckets: []float64{
				1, // 1s
				10,
				30,
				60, // 1min
				60 * 5,
				60 * 10,
				60 * 30, // 30min
				60 * 60, // 1hr
			},
			NativeHistogramBucketFactor: 1.1,
			// Max number of native buckets kept at once to bound memory.
			NativeHistogramMaxBucketNumber: 100,
			// Merge/flush small buckets periodically to control churn.
			NativeHistogramMinResetDuration: time.Hour,
			// Treat tiny values as zero (helps with noisy near-zero latencies).
			NativeHistogramZeroThreshold:    0,
			NativeHistogramMaxZeroThreshold: 0,
		}, []string{"organization_name", "template_name", "preset_name", "type"}),
		workspaceClaimTimings: prometheus.NewHistogramVec(prometheus.HistogramOpts{
			Namespace: "coderd",
			Name:      "prebuilt_workspace_claim_duration_seconds",
			Help:      "Time to claim a prebuilt workspace by organization, template, and preset.",
			// Higher resolution between 1–5m to show typical prebuild claim times.
			// Cap at 5m since longer claims diminish prebuild value.
			Buckets: []float64{
				1, // 1s
				5,
				10,
				20,
				30,
				60,  // 1m
				120, // 2m
				180, // 3m
				240, // 4m
				300, // 5m
			},
			NativeHistogramBucketFactor: 1.1,
			// Max number of native buckets kept at once to bound memory.
			NativeHistogramMaxBucketNumber: 100,
			// Merge/flush small buckets periodically to control churn.
			NativeHistogramMinResetDuration: time.Hour,
			// Treat tiny values as zero (helps with noisy near-zero latencies).
			NativeHistogramZeroThreshold:    0,
			NativeHistogramMaxZeroThreshold: 0,
		}, []string{"organization_name", "template_name", "preset_name"}),
	}
}

func (m *Metrics) Register(reg prometheus.Registerer) error {
	if err := reg.Register(m.workspaceCreationTimings); err != nil {
		return err
	}
	return reg.Register(m.workspaceClaimTimings)
}

func (f WorkspaceTimingFlags) count() int {
	count := 0
	if f.IsPrebuild {
		count++
	}
	if f.IsClaim {
		count++
	}
	if f.IsFirstBuild {
		count++
	}
	return count
}

// getWorkspaceTimingType returns the type of the workspace build:
//   - isPrebuild: if the workspace build corresponds to the creation of a prebuilt workspace
//   - isClaim: if the workspace build corresponds to the claim of a prebuilt workspace
//   - isWorkspaceFirstBuild: if the workspace build corresponds to the creation of a regular workspace
//     (not created from the prebuild pool)
func getWorkspaceTimingType(flags WorkspaceTimingFlags) WorkspaceTimingType {
	switch {
	case flags.IsPrebuild:
		return PrebuildCreation
	case flags.IsClaim:
		return PrebuildClaim
	case flags.IsFirstBuild:
		return WorkspaceCreation
	default:
		return Unsupported
	}
}

// UpdateWorkspaceTimingsMetrics updates the workspace timing metrics based on the workspace build type
func (m *Metrics) UpdateWorkspaceTimingsMetrics(
	ctx context.Context,
	flags WorkspaceTimingFlags,
	organizationName string,
	templateName string,
	presetName string,
	buildTime float64,
) {
	m.logger.Debug(ctx, "update workspace timings metrics",
		"organizationName", organizationName,
		"templateName", templateName,
		"presetName", presetName,
		"isPrebuild", flags.IsPrebuild,
		"isClaim", flags.IsClaim,
		"isWorkspaceFirstBuild", flags.IsFirstBuild)

	if flags.count() > 1 {
		m.logger.Warn(ctx, "invalid workspace timing flags",
			"isPrebuild", flags.IsPrebuild,
			"isClaim", flags.IsClaim,
			"isWorkspaceFirstBuild", flags.IsFirstBuild)
		return
	}

	workspaceTimingType := getWorkspaceTimingType(flags)
	switch workspaceTimingType {
	case WorkspaceCreation:
		// Regular workspace creation (without prebuild pool)
		m.workspaceCreationTimings.
			WithLabelValues(organizationName, templateName, presetName, workspaceTypeRegular).Observe(buildTime)
	case PrebuildCreation:
		// Prebuilt workspace creation duration
		m.workspaceCreationTimings.
			WithLabelValues(organizationName, templateName, presetName, workspaceTypePrebuild).Observe(buildTime)
	case PrebuildClaim:
		// Prebuilt workspace claim duration
		m.workspaceClaimTimings.
			WithLabelValues(organizationName, templateName, presetName).Observe(buildTime)
	default:
		m.logger.Warn(ctx, "unsupported workspace timing flags")
	}
}
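The routing performed by `UpdateWorkspaceTimingsMetrics` can be condensed into a small standalone sketch. It mirrors the `count()` guard and the `switch` from the file above using only the standard library; `timingFlags` and `route` are hypothetical names introduced here, and the returned strings are just labels for the example rather than real metric handles.

```go
package main

import "fmt"

// timingFlags mirrors the shape of WorkspaceTimingFlags from the diff.
type timingFlags struct {
	IsPrebuild   bool
	IsClaim      bool
	IsFirstBuild bool
}

// route reports which histogram an observation would be recorded in.
// ok is false when nothing would be recorded.
func route(f timingFlags) (target string, ok bool) {
	set := 0
	for _, b := range []bool{f.IsPrebuild, f.IsClaim, f.IsFirstBuild} {
		if b {
			set++
		}
	}
	// Same guard as flags.count() > 1: ambiguous flag sets are dropped.
	if set > 1 {
		return "", false
	}
	switch {
	case f.IsPrebuild:
		return `coderd_workspace_creation_duration_seconds{type="prebuild"}`, true
	case f.IsClaim:
		return "coderd_prebuilt_workspace_claim_duration_seconds", true
	case f.IsFirstBuild:
		return `coderd_workspace_creation_duration_seconds{type="regular"}`, true
	default:
		return "", false
	}
}

func main() {
	cases := []timingFlags{
		{IsFirstBuild: true},
		{IsPrebuild: true},
		{IsClaim: true},
		{IsPrebuild: true, IsFirstBuild: true},
		{},
	}
	for _, c := range cases {
		target, ok := route(c)
		fmt.Printf("%+v -> %q recorded=%v\n", c, target, ok)
	}
}
```

One consequence worth keeping in mind when reading the call site: because the guard rejects any flag set with more than one `true` field, a caller that sets `IsFirstBuild` together with `IsPrebuild` (for instance, if a prebuild's initial build also has build number 1) would only produce a warning and no observation.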
@@ -129,6 +129,8 @@ type server struct {

	heartbeatInterval time.Duration
	heartbeatFn       func(ctx context.Context) error

	metrics *Metrics
}

// We use the null byte (0x00) in generating a canonical map key for tags, so
@@ -178,6 +180,7 @@ func NewServer(
	options Options,
	enqueuer notifications.Enqueuer,
	prebuildsOrchestrator *atomic.Pointer[prebuilds.ReconciliationOrchestrator],
	metrics *Metrics,
) (proto.DRPCProvisionerDaemonServer, error) {
	// Fail-fast if pointers are nil
	if lifecycleCtx == nil {
@@ -248,6 +251,7 @@ func NewServer(
		heartbeatFn:           options.HeartbeatFn,
		PrebuildsOrchestrator: prebuildsOrchestrator,
		UsageInserter:         usageInserter,
		metrics:               metrics,
	}

	if s.heartbeatFn == nil {
@@ -2281,6 +2285,50 @@ func (s *server) completeWorkspaceBuildJob(ctx context.Context, job database.Pro
		}
	}

	// Update workspace (regular and prebuild) timing metrics
	if s.metrics != nil {
		// Only consider 'start' workspace builds
		if workspaceBuild.Transition == database.WorkspaceTransitionStart {
			// Get the updated job to report the metrics with correct data
			updatedJob, err := s.Database.GetProvisionerJobByID(ctx, jobID)
			if err != nil {
				s.Logger.Error(ctx, "get updated job from database", slog.Error(err))
			} else
			// Only consider 'succeeded' provisioner jobs
			if updatedJob.JobStatus == database.ProvisionerJobStatusSucceeded {
				presetName := ""
				if workspaceBuild.TemplateVersionPresetID.Valid {
					preset, err := s.Database.GetPresetByID(ctx, workspaceBuild.TemplateVersionPresetID.UUID)
					if err != nil {
						if !errors.Is(err, sql.ErrNoRows) {
							s.Logger.Error(ctx, "get preset by ID for workspace timing metrics", slog.Error(err))
						}
					} else {
						presetName = preset.Name
					}
				}

				buildTime := updatedJob.CompletedAt.Time.Sub(updatedJob.StartedAt.Time).Seconds()
				s.metrics.UpdateWorkspaceTimingsMetrics(
					ctx,
					WorkspaceTimingFlags{
						// Is a prebuilt workspace creation build
						IsPrebuild: input.PrebuiltWorkspaceBuildStage.IsPrebuild(),
						// Is a prebuilt workspace claim build
						IsClaim: input.PrebuiltWorkspaceBuildStage.IsPrebuiltWorkspaceClaim(),
						// Is a regular workspace creation build
						// Only consider the first build number for regular workspaces
						IsFirstBuild: workspaceBuild.BuildNumber == 1,
					},
					workspace.OrganizationName,
					workspace.TemplateName,
					presetName,
					buildTime,
				)
			}
		}
	}

	msg, err := json.Marshal(wspubsub.WorkspaceEvent{
		Kind:        wspubsub.WorkspaceEventKindStateChange,
		WorkspaceID: workspace.ID,
@@ -4144,6 +4144,7 @@ func setup(t *testing.T, ignoreLogErrors bool, ov *overrides) (proto.DRPCProvisi
		},
		notifEnq,
		&op,
		provisionerdserver.NewMetrics(logger),
	)
	require.NoError(t, err)
	return srv, db, ps, daemon
@@ -143,9 +143,12 @@ deployment. They will always be available from the agent.
| `coderd_oauth2_external_requests_rate_limit_total` | gauge | DEPRECATED: use coderd_oauth2_external_requests_rate_limit instead | `name` `resource` |
| `coderd_oauth2_external_requests_rate_limit_used` | gauge | The number of requests made in this interval. | `name` `resource` |
| `coderd_oauth2_external_requests_total` | counter | The total number of api calls made to external oauth2 providers. 'status_code' will be 0 if the request failed with no response. | `name` `source` `status_code` |
| `coderd_prebuilt_workspace_claim_duration_seconds` | histogram | Time to claim a prebuilt workspace by organization, template, and preset. | `organization_name` `preset_name` `template_name` |
| `coderd_provisionerd_job_timings_seconds` | histogram | The provisioner job time duration in seconds. | `provisioner` `status` |
| `coderd_provisionerd_jobs_current` | gauge | The number of currently running provisioner jobs. | `provisioner` |
| `coderd_workspace_builds_total` | counter | The number of workspaces started, updated, or deleted. | `action` `owner_email` `status` `template_name` `template_version` `workspace_name` |
| `coderd_workspace_creation_duration_seconds` | histogram | Time to create a workspace by organization, template, preset, and type (regular or prebuild). | `organization_name` `preset_name` `template_name` `type` |
| `coderd_workspace_creation_total` | counter | Total regular (non-prebuilt) workspace creations by organization, template, and preset. | `organization_name` `preset_name` `template_name` |
| `coderd_workspace_latest_build_status` | gauge | The current workspace statuses by template, transition, and owner. | `status` `template_name` `template_version` `workspace_owner` `workspace_transition` |
| `go_gc_duration_seconds` | summary | A summary of the pause duration of garbage collection cycles. | |
| `go_goroutines` | gauge | Number of goroutines that currently exist. | |
@@ -185,3 +188,19 @@ deployment. They will always be available from the agent.
| `promhttp_metric_handler_requests_total` | counter | Total number of scrapes by HTTP status code. | `code` |

<!-- End generated by 'make docs/admin/integrations/prometheus.md'. -->

### Note on Prometheus native histogram support

The following metrics support native histograms:

* `coderd_workspace_creation_duration_seconds`
* `coderd_prebuilt_workspace_claim_duration_seconds`

Native histograms are an **experimental** Prometheus feature that removes the need to predefine bucket boundaries and allows higher-resolution buckets that adapt to deployment characteristics.
Whether a metric is exposed as classic or native depends entirely on the Prometheus server configuration (see [Prometheus docs](https://prometheus.io/docs/specs/native_histograms/) for details):

* If native histograms are enabled, Prometheus ingests the high-resolution histogram.
* If not, it falls back to the predefined buckets.

⚠️ Important: classic and native histograms cannot be aggregated together. If Prometheus is switched from classic to native at a certain point in time, dashboards may need to account for that transition.
For this reason, it’s recommended to follow [Prometheus’ migration guidelines](https://prometheus.io/docs/specs/native_histograms/#migration-considerations) when moving from classic to native histograms.
@@ -300,6 +300,7 @@ Coder provides several metrics to monitor your prebuilt workspaces:
- `coderd_prebuilt_workspaces_desired` (gauge): Target number of prebuilt workspaces that should be available.
- `coderd_prebuilt_workspaces_running` (gauge): Current number of prebuilt workspaces in a `running` state.
- `coderd_prebuilt_workspaces_eligible` (gauge): Current number of prebuilt workspaces eligible to be claimed.
- `coderd_prebuilt_workspace_claim_duration_seconds` ([_native histogram_](https://prometheus.io/docs/specs/native_histograms) support): Time to claim a prebuilt workspace from the prebuild pool.

#### Logs

@@ -361,6 +361,7 @@ func (api *API) provisionerDaemonServe(rw http.ResponseWriter, r *http.Request)
		},
		api.NotificationsEnqueuer,
		&api.AGPL.PrebuildsReconciler,
		api.ProvisionerdServerMetrics,
	)
	if err != nil {
		if !xerrors.Is(err, context.Canceled) {
@@ -26,6 +26,7 @@ import (
	"github.com/coder/coder/v2/coderd/audit"
	"github.com/coder/coder/v2/coderd/autobuild"
	"github.com/coder/coder/v2/coderd/coderdtest"
	"github.com/coder/coder/v2/coderd/coderdtest/promhelp"
	"github.com/coder/coder/v2/coderd/database"
	"github.com/coder/coder/v2/coderd/database/dbauthz"
	"github.com/coder/coder/v2/coderd/database/dbfake"
@@ -2873,6 +2874,133 @@ func TestPrebuildActivityBump(t *testing.T) {
	require.Zero(t, workspace.LatestBuild.MaxDeadline)
}

func TestWorkspaceProvisionerdServerMetrics(t *testing.T) {
	t.Parallel()

	// Setup
	log := testutil.Logger(t)
	reg := prometheus.NewRegistry()
	provisionerdserverMetrics := provisionerdserver.NewMetrics(log)
	err := provisionerdserverMetrics.Register(reg)
	require.NoError(t, err)
	client, db, owner := coderdenttest.NewWithDatabase(t, &coderdenttest.Options{
		Options: &coderdtest.Options{
			IncludeProvisionerDaemon:  true,
			ProvisionerdServerMetrics: provisionerdserverMetrics,
		},
		LicenseOptions: &coderdenttest.LicenseOptions{
			Features: license.Features{
				codersdk.FeatureWorkspacePrebuilds: 1,
			},
		},
	})

	// Given: a template and a template version with a preset without prebuild instances
	presetNoPrebuildID := uuid.New()
	versionNoPrebuild := coderdtest.CreateTemplateVersion(t, client, owner.OrganizationID, nil)
	_ = coderdtest.AwaitTemplateVersionJobCompleted(t, client, versionNoPrebuild.ID)
	templateNoPrebuild := coderdtest.CreateTemplate(t, client, owner.OrganizationID, versionNoPrebuild.ID)
	presetNoPrebuild := dbgen.Preset(t, db, database.InsertPresetParams{
		ID:                presetNoPrebuildID,
		TemplateVersionID: versionNoPrebuild.ID,
	})

	// Given: a template and a template version with a preset with a prebuild instance
	presetPrebuildID := uuid.New()
	versionPrebuild := coderdtest.CreateTemplateVersion(t, client, owner.OrganizationID, nil)
	_ = coderdtest.AwaitTemplateVersionJobCompleted(t, client, versionPrebuild.ID)
	templatePrebuild := coderdtest.CreateTemplate(t, client, owner.OrganizationID, versionPrebuild.ID)
	presetPrebuild := dbgen.Preset(t, db, database.InsertPresetParams{
		ID:                presetPrebuildID,
		TemplateVersionID: versionPrebuild.ID,
		DesiredInstances:  sql.NullInt32{Int32: 1, Valid: true},
	})
	// Given: a prebuild workspace
	wb := dbfake.WorkspaceBuild(t, db, database.WorkspaceTable{
		OwnerID:    database.PrebuildsSystemUserID,
		TemplateID: templatePrebuild.ID,
	}).Seed(database.WorkspaceBuild{
		TemplateVersionID: versionPrebuild.ID,
		TemplateVersionPresetID: uuid.NullUUID{
			UUID:  presetPrebuildID,
			Valid: true,
		},
	}).WithAgent(func(agent []*proto.Agent) []*proto.Agent {
		return agent
	}).Do()

	// Mark the prebuilt workspace's agent as ready so the prebuild can be claimed
	// nolint:gocritic
	ctx := dbauthz.AsSystemRestricted(testutil.Context(t, testutil.WaitLong))
	agent, err := db.GetWorkspaceAgentAndLatestBuildByAuthToken(ctx, uuid.MustParse(wb.AgentToken))
	require.NoError(t, err)
	err = db.UpdateWorkspaceAgentLifecycleStateByID(ctx, database.UpdateWorkspaceAgentLifecycleStateByIDParams{
		ID:             agent.WorkspaceAgent.ID,
		LifecycleState: database.WorkspaceAgentLifecycleStateReady,
	})
	require.NoError(t, err)

	organizationName, err := client.Organization(ctx, owner.OrganizationID)
	require.NoError(t, err)
	user, err := client.User(ctx, "testUser")
	require.NoError(t, err)

	// Given: no histogram value for prebuilt workspaces claim
	prebuiltWorkspaceHistogramMetric := promhelp.MetricValue(t, reg, "coderd_prebuilt_workspace_claim_duration_seconds", prometheus.Labels{
		"organization_name": organizationName.Name,
		"template_name":     templatePrebuild.Name,
		"preset_name":       presetPrebuild.Name,
	})
	require.Nil(t, prebuiltWorkspaceHistogramMetric)

	// Given: the prebuilt workspace is claimed by a user
	claimedWorkspace, err := client.CreateUserWorkspace(ctx, user.ID.String(), codersdk.CreateWorkspaceRequest{
		TemplateVersionID:       versionPrebuild.ID,
		TemplateVersionPresetID: presetPrebuildID,
		Name:                    coderdtest.RandomUsername(t),
	})
	require.NoError(t, err)
	coderdtest.AwaitWorkspaceBuildJobCompleted(t, client, claimedWorkspace.LatestBuild.ID)
	require.Equal(t, wb.Workspace.ID, claimedWorkspace.ID)

	// Then: the histogram value for prebuilt workspace claim should be updated
	prebuiltWorkspaceHistogram := promhelp.HistogramValue(t, reg, "coderd_prebuilt_workspace_claim_duration_seconds", prometheus.Labels{
		"organization_name": organizationName.Name,
		"template_name":     templatePrebuild.Name,
		"preset_name":       presetPrebuild.Name,
	})
	require.NotNil(t, prebuiltWorkspaceHistogram)
	require.Equal(t, uint64(1), prebuiltWorkspaceHistogram.GetSampleCount())

	// Given: no histogram value for regular workspaces creation
	regularWorkspaceHistogramMetric := promhelp.MetricValue(t, reg, "coderd_workspace_creation_duration_seconds", prometheus.Labels{
		"organization_name": organizationName.Name,
		"template_name":     templateNoPrebuild.Name,
		"preset_name":       presetNoPrebuild.Name,
		"type":              "regular",
	})
	require.Nil(t, regularWorkspaceHistogramMetric)

	// Given: a user creates a regular workspace (without prebuild pool)
	regularWorkspace, err := client.CreateUserWorkspace(ctx, user.ID.String(), codersdk.CreateWorkspaceRequest{
		TemplateVersionID:       versionNoPrebuild.ID,
		TemplateVersionPresetID: presetNoPrebuildID,
		Name:                    coderdtest.RandomUsername(t),
	})
	require.NoError(t, err)
	coderdtest.AwaitWorkspaceBuildJobCompleted(t, client, regularWorkspace.LatestBuild.ID)

	// Then: the histogram value for regular workspace creation should be updated
	regularWorkspaceHistogram := promhelp.HistogramValue(t, reg, "coderd_workspace_creation_duration_seconds", prometheus.Labels{
		"organization_name": organizationName.Name,
		"template_name":     templateNoPrebuild.Name,
		"preset_name":       presetNoPrebuild.Name,
		"type":              "regular",
	})
	require.NotNil(t, regularWorkspaceHistogram)
	require.Equal(t, uint64(1), regularWorkspaceHistogram.GetSampleCount())
}

// TestWorkspaceTemplateParamsChange tests a workspace with a parameter that
// validation changes on apply. The params used in create workspace are invalid
// according to the static params on import.
@@ -715,6 +715,37 @@ coderd_workspace_latest_build_status{status="failed",template_name="docker",temp
coderd_workspace_builds_total{action="START",owner_email="admin@coder.com",status="failed",template_name="docker",template_version="gallant_wright0",workspace_name="test1"} 1
coderd_workspace_builds_total{action="START",owner_email="admin@coder.com",status="success",template_name="docker",template_version="gallant_wright0",workspace_name="test1"} 1
coderd_workspace_builds_total{action="STOP",owner_email="admin@coder.com",status="success",template_name="docker",template_version="gallant_wright0",workspace_name="test1"} 1
# HELP coderd_workspace_creation_total Total regular (non-prebuilt) workspace creations by organization, template, and preset.
# TYPE coderd_workspace_creation_total counter
coderd_workspace_creation_total{organization_name="{organization}",preset_name="",template_name="docker"} 1
# HELP coderd_workspace_creation_duration_seconds Time to create a workspace by organization, template, preset, and type (regular or prebuild).
# TYPE coderd_workspace_creation_duration_seconds histogram
coderd_workspace_creation_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",type="prebuild",le="1"} 0
coderd_workspace_creation_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",type="prebuild",le="10"} 1
coderd_workspace_creation_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",type="prebuild",le="30"} 1
coderd_workspace_creation_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",type="prebuild",le="60"} 1
coderd_workspace_creation_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",type="prebuild",le="300"} 1
coderd_workspace_creation_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",type="prebuild",le="600"} 1
coderd_workspace_creation_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",type="prebuild",le="1800"} 1
coderd_workspace_creation_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",type="prebuild",le="3600"} 1
coderd_workspace_creation_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",type="prebuild",le="+Inf"} 1
coderd_workspace_creation_duration_seconds_sum{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",type="prebuild"} 4.406214
coderd_workspace_creation_duration_seconds_count{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",type="prebuild"} 1
# HELP coderd_prebuilt_workspace_claim_duration_seconds Time to claim a prebuilt workspace by organization, template, and preset.
# TYPE coderd_prebuilt_workspace_claim_duration_seconds histogram
coderd_prebuilt_workspace_claim_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",le="1"} 0
coderd_prebuilt_workspace_claim_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",le="5"} 1
coderd_prebuilt_workspace_claim_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",le="10"} 1
coderd_prebuilt_workspace_claim_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",le="20"} 1
coderd_prebuilt_workspace_claim_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",le="30"} 1
coderd_prebuilt_workspace_claim_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",le="60"} 1
coderd_prebuilt_workspace_claim_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",le="120"} 1
coderd_prebuilt_workspace_claim_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",le="180"} 1
coderd_prebuilt_workspace_claim_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",le="240"} 1
coderd_prebuilt_workspace_claim_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",le="300"} 1
coderd_prebuilt_workspace_claim_duration_seconds_bucket{organization_name="{organization}",preset_name="Falkenstein",template_name="docker",le="+Inf"} 1
coderd_prebuilt_workspace_claim_duration_seconds_sum{organization_name="{organization}",preset_name="Falkenstein",template_name="docker"} 4.860075
coderd_prebuilt_workspace_claim_duration_seconds_count{organization_name="{organization}",preset_name="Falkenstein",template_name="docker"} 1
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 2.4056e-05