Commit Graph

1321 Commits

Author SHA1 Message Date
Ethan 6e7f65bc59 fix(cli): properly handle build log streaming during coder ping (#15600)
Closes #15584.

- The `Collecting Diagnostics` spinner now starts after the workspace
build logs (if any) have finished streaming.
- Removes network interfaces with negative MTUs from `healthsdk`
diagnostics.
- Improves the wording on diagnostics for MTUs below the 'safe' value to
indicate that direct connections may be degraded, or rendered unusable
(i.e. if every packet is dropped).
2024-11-20 15:50:12 +11:00
Danielle Maywood 576e1f48fe feat!: allow disabling notifications (#15509)
Resolves https://github.com/coder/coder/issues/15513

Disables notifications when both `$CODER_NOTIFICATIONS_WEBHOOK_ENDPOINT` and `$CODER_EMAIL_SMARTHOST` are unset.

Breaking change: `$CODER_EMAIL_SMARTHOST` is no longer set by default as `localhost:587`, meaning any deployments that make use of this default value will need to add it back.

---------

Co-authored-by: Danny Kopping <danny@coder.com>
Co-authored-by: Mathias Fredriksson <mafredri@gmail.com>
2024-11-19 15:05:12 +00:00
Steven Masley c3c23ed3d9 chore: add query to fetch top level idp claim fields (#15525)
Adds an api endpoint to grab all available sync field options for IDP
sync. This is for autocomplete on idp sync forms. This is required for
organization admins to have some insight into the claim fields available
when configuring group/role sync.
2024-11-18 14:31:39 -06:00
Spike Curtis 5861e516b9 chore: add standard test logger ignoring db canceled (#15556)
Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests.  This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes.

Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.
2024-11-18 14:09:22 +04:00
Spike Curtis 747f7ce173 feat: add support for WorkspaceUpdates to WebsocketDialer (#15534)
closes #14730

Adds support for WorkspaceUpdates to the WebsocketDialer. This allows us to dial the new endpoint added in #14847 and connect it up to a `tailnet.Controllers` to connect to all agents over the tailnet.

I refactored the fakeWorkspaceUpdatesProvider to a mock and moved it to `tailnettest` so it could be more easily reused.  The Mock is a little more full-featured.
2024-11-18 10:54:11 +04:00
Joobi S B 4cb807670d chore: generate countries.tsx from Go code (#15274)
Closes https://github.com/coder/coder/issues/15074

We have a hard-coded list of countries at
https://github.com/coder/coder/blob/main/site/src/pages/SetupPage/countries.tsx.
This means Go code in coder/coder doesn't have an easy way of utilizing
it.

## Solution
Generate countries.tsx from Go code. Generated by `scripts/apitypings`
2024-11-15 12:05:21 -06:00
Sas Swart 814dd6f854 feat(coderd): update API to allow filtering provisioner daemons by tags (#15448)
This PR provides new parameters to an endpoint that will be necessary
for #15048
2024-11-15 11:33:22 +02:00
Spike Curtis 40802958e9 fix: use explicit api versions for agent and tailnet (#15508)
Bumps the Tailnet and Agent API version 2.3, and creates some extra controls and machinery around these versions.

What happened is that we accidentally shipped two new API features without bumping the version.  `ScriptCompleted` on the Agent API in Coder v2.16 and `RefreshResumeToken` on the Tailnet API in Coder v2.15.

Since we can't easily retroactively bump the versions, we'll roll these changes into API version 2.3 along with the new WorkspaceUpdates RPC, which hasn't been released yet.  That means there is some ambiguity in Coder v2.15-v2.17 about exactly what methods are supported on the Tailnet and Agent APIs.  This isn't great, but hasn't caused us major issues because 

1. RefreshResumeToken is considered optional, and clients just log and move on if the RPC isn't supported. 
2. Agents basically never get started talking to a Coderd that is older than they are, since the agent binary is normally downloaded from Coderd at workspace start.

Still it's good to get things squared away in terms of versions for SDK users and possible edge cases around client and server versions.

To mitigate against this thing happening again, this PR also:

1. adds a CODEOWNERS for the API proto packages, so I'll review changes
2. defines interface types for different API versions, and has the agent explicitly use a specific version.  That way, if you add a new method, and try to use it in the agent without thinking explicitly about versions, it won't compile.

With the protocol controllers stuff, we've sort of already abstracted the Tailnet API such that the interface type strategy won't work, but I'll work on getting the Controller to be version aware, such that it can check the API version it's getting against the controllers it has -- in a later PR.
2024-11-15 11:16:28 +04:00
Mathias Fredriksson e55e8ee1b2 fix(cli): add backwards compat for old telemetry env and tests (#15476) 2024-11-14 01:07:52 +02:00
Steven Masley 782214bcd8 chore: move organizatinon sync to runtime configuration (#15431)
Moves the configuration from environment to database backed, to allow
configuring organization sync at runtime.
2024-11-08 08:44:14 -06:00
Spike Curtis e5661c2748 feat: add support for multiple tunnel destinations in tailnet (#15409)
Closes #14729

Expands the Coordination controller used by the CLI client to allow multiple tunnel destinations (agents).  Our current client uses just one, but this unifies the logic so that when we add Coder VPN, 1 is just a special case of "many."
2024-11-08 13:32:07 +04:00
Spike Curtis 718722af1b chore: refactor tailnetAPIConnector to tailnet.Controller (#15361)
Refactors `workspacesdk.tailnetAPIConnector` as a `tailnet.Controller` to reuse all the reconnection and graceful disconnect logic.

chore re: #14729
2024-11-08 10:10:54 +04:00
Spike Curtis 2d061e698d chore: refactor tailnetAPIConnector to use dialer (#15347)
refactors `tailnetAPIConnector` to use the `Dialer` interface in `tailnet`, introduced lower in this stack of PRs. This will let us use the same Tailnet API handling code across different things that connect to the Tailnet API (CLI client, coderd, workspace proxies, and soon: Coder VPN).

chore re: #14729
2024-11-07 17:24:19 +04:00
Bruno Quaresma 7f510051fb refactor: increase group name limit to 255 (#15377)
Close https://github.com/coder/coder/issues/15184
2024-11-06 14:39:50 -03:00
Spike Curtis 335e4ab6bf chore: refactor sending telemetry (#15345)
Implements a tailnet API Telemetry controller by refactoring from `workspacesdk`.

chore re: #14729
2024-11-06 20:23:23 +04:00
Spike Curtis 9126cd78a6 chore: refactor DERP setting loop (#15344)
Implements a Tailnet API DERP controller by refactoring from `workspacesdk`

chore re: #14729
2024-11-06 20:04:05 +04:00
Vincent Vielle 4fe2c5f09a fix: improve password validation flow (#15132)
Refers to #14984 

Currently, password validation is done backend side and is not explicit
enough so it can be painful to create first users.
We'd like to make this validation easier - but also duplicate it
frontend side to make it smoother.

Flows involved : 
- First user set password
- New user set password
- Change password

---------

Co-authored-by: BrunoQuaresma <bruno_nonato_quaresma@hotmail.com>
2024-11-05 17:22:32 +01:00
Spike Curtis 886dcbec84 chore: refactor coordination (#15343)
Refactors the way clients of the Tailnet API (clients of the API, which include both workspace "agents" and "clients") interact with the API.  Introduces the idea of abstract "controllers" for each of the RPCs in the API, and implements a Coordination controller by refactoring from `workspacesdk`.

chore re: #14729
2024-11-05 13:50:10 +04:00
Bruno Quaresma e232aee011 feat(site): add agent connection timings (#15276)
Local preview:

<img width="1260" alt="Screenshot 2024-10-29 at 16 16 01"
src="https://github.com/user-attachments/assets/10fdb20d-1f2a-4b0a-a8a1-171050ee620d">


Close https://github.com/coder/internal/issues/116

---------

Co-authored-by: Danny Kopping <danny@coder.com>
2024-11-01 13:29:00 -03:00
Ethan b1298a3c1e feat: add WorkspaceUpdates tailnet RPC (#14847)
Closes #14716
Closes #14717

Adds a new user-scoped tailnet API endpoint (`api/v2/tailnet`) with a new RPC stream for receiving updates on workspaces owned by a specific user, as defined in #14716. 

When a stream is started, the `WorkspaceUpdatesProvider` will begin listening on the user-scoped pubsub events implemented in #14964. When a relevant event type is seen (such as a workspace state transition), the provider will query the DB for all the workspaces (and agents) owned by the user. This gets compared against the result of the previous query to produce a set of workspace updates. 

Workspace updates can be requested for any user ID, however only workspaces the authorised user is permitted to `ActionRead` will have their updates streamed.
Opening a tunnel to an agent requires that the user can perform `ActionSSH` against the workspace containing it.
2024-11-01 14:53:53 +11:00
Ethan 31506e694b chore: send workspace pubsub events by owner id (#14964)
We currently send empty payloads to pubsub channels of the form `workspace:<workspace_id>` to notify listeners of updates to workspaces (such as for refreshing the workspace dashboard).

To support https://github.com/coder/coder/issues/14716, we'll instead send `WorkspaceEvent` payloads to pubsub channels of the form `workspace_owner:<owner_id>`. This enables a listener to receive events for all workspaces owned by a user.
This PR replaces the usage of the old channels without modifying any existing behaviors.

```
type WorkspaceEvent struct {
	Kind        WorkspaceEventKind `json:"kind"`
	WorkspaceID uuid.UUID          `json:"workspace_id" format:"uuid"`
	// AgentID is only set for WorkspaceEventKindAgent* events
	// (excluding AgentTimeout)
	AgentID *uuid.UUID `json:"agent_id,omitempty" format:"uuid"`
}
```

We've defined `WorkspaceEventKind`s based on how the old channel was used, but it's not yet necessary to inspect the types of any of the events, as the existing listeners are designed to fire off any of them.

```
WorkspaceEventKindStateChange     WorkspaceEventKind = "state_change"
WorkspaceEventKindStatsUpdate     WorkspaceEventKind = "stats_update"
WorkspaceEventKindMetadataUpdate  WorkspaceEventKind = "mtd_update"
WorkspaceEventKindAppHealthUpdate WorkspaceEventKind = "app_health"

WorkspaceEventKindAgentLifecycleUpdate  WorkspaceEventKind = "agt_lifecycle_update"
WorkspaceEventKindAgentLogsUpdate       WorkspaceEventKind = "agt_logs_update"
WorkspaceEventKindAgentConnectionUpdate WorkspaceEventKind = "agt_connection_update"
WorkspaceEventKindAgentLogsOverflow     WorkspaceEventKind = "agt_logs_overflow"
WorkspaceEventKindAgentTimeout          WorkspaceEventKind = "agt_timeout"
```
2024-11-01 14:17:05 +11:00
Colin Adler 088f21965b feat: add audit logs for dormancy events (#15298) 2024-10-31 17:55:42 -05:00
Danielle Maywood 330acd1270 chore: create ResourceNotificationMessage and AsNotifier (#15301)
Closes https://github.com/coder/coder/issues/15213

This PR enables sending notifications without requiring the auth system
context, instead using a new auth notifier context.
2024-10-31 17:01:51 +00:00
Danielle Maywood 823a2ea22e chore(cli): drop 'notification' prefix for configuring email auth (#15270)
Closes https://github.com/coder/coder/issues/14644
2024-10-30 10:06:10 +00:00
Jon Ayers cd890aa3a0 feat: enable key rotation (#15066)
This PR contains the remaining logic necessary to hook up key rotation
to the product.
2024-10-25 17:14:35 +01:00
Steven Masley ccfffc6911 chore: add tx metrics and logs for serialization errors (#15215)
Before db_metrics were all or nothing. Now `InTx` metrics are always recorded, and query metrics are opt in.


Adds instrumentation & logging around serialization failures in the database.
2024-10-25 12:14:15 -04:00
Danielle Maywood 5076161078 fix: show audit logs for forgot password flow (#15181)
Fixes https://github.com/coder/coder/issues/15150

Audit logs for requesting a password reset, and a user updating their
password, now show up in the audit log.
2024-10-22 13:47:30 +01:00
Bruno Quaresma 9c8ecb82a3 feat(coderd): return agent script timings (#14923)
Add the agent script timings into the
`/workspacebuilds/:workspacebuild/timings` response.

Close https://github.com/coder/coder/issues/14876
2024-10-14 09:31:03 -03:00
Danielle Maywood 4369f2b4b5 feat: implement api for "forgot password?" flow (#14915)
Relates to https://github.com/coder/coder/issues/14232

This implements two endpoints (names subject to change):
- `/api/v2/users/otp/request`
- `/api/v2/users/otp/change-password`
2024-10-04 11:53:25 +01:00
Spike Curtis 7d9f5ab81d chore: add Coder service prefix to tailnet (#14943)
re: #14715

This PR introduces the Coder service prefix: `fd60:627a:a42b::/48` and refactors our existing code as calling the Tailscale service prefix explicitly (rather than implicitly).

Removes the unused `Addresses` agent option. All clients today assume they can compute the Agent's IP address based on its UUID, so an agent started with a custom address would break things.
2024-10-04 10:04:10 +04:00
Benjamin Peinhardt 20bfd1f874 fix: fix bug with trailing version info not being properly stripped (#14963)
Fixes a bug where excess version info was not being stripped properly from
documentation links.
2024-10-03 17:30:25 +00:00
Jon Ayers 21b92ef893 feat: add cache abstraction for fetching signing keys (#14777)
- Adds the database implementation for fetching and caching keys
used for JWT signing. It's been merged into the `keyrotate` pkg and
renamed to `cryptokeys` since they're coupled concepts.
2024-10-01 11:04:51 -05:00
Danny Kopping 11f7b1b3f5 chore: remove notifications experiment (#14869)
Notifications have proved stable in the [mainline release of
v2.15](https://github.com/coder/coder/releases/tag/v2.15.0), and in
preparation for v2.16 we're moving this to stable.
2024-10-01 13:43:47 +00:00
Spike Curtis d6766f706d fix: sort provisioner key tags in cli output (#14875)
I'm seeing flakes like

```
    provisionerkeys_test.go:68: 2024-09-30 05:58:44.686: cmd: matched newline = "CREATED AT            NAME          TAGS            "
    provisionerkeys_test.go:72: 2024-09-30 05:58:44.686: cmd: matched newline = "2024-09-30T05:58:44Z  dont-test-me  my=way foo=bar  "
    provisionerkeys_test.go:74: 
        	Error Trace:	/Users/runner/work/coder/coder/enterprise/cli/provisionerkeys_test.go:74
        	Error:      	"2024-09-30T05:58:44Z  dont-test-me  my=way foo=bar  " does not contain "foo=bar my=way"
        	Test:       	TestProvisionerKeys/CRUD
```

e.g.
https://github.com/coder/coder/actions/runs/11100237276/job/30835714478?pr=14855

Since the tags are a map, we weren't outputting them in a consistent
order on the CLI, leading to flakes.

This sorts the tags by key when converting to a string, for a
consistent, canonical output.
2024-10-01 09:11:19 +04:00
Steven Masley 2c8b264d78 chore: remove multi-organization and custom role experiment (#14862)
Closes https://github.com/coder/coder/issues/14704

---------

Co-authored-by: Kayla Washburn-Love <mckayla@hey.com>
2024-09-27 14:06:16 -05:00
Garrett Delfosse 5cc5bbea04 fix: improve provisioner key cli usability (#14834)
What this changes:
- Unhides the `--key` flag on provisioner start
- Deprecates and hides `provisionerd` command group in favor of
`provisioner(s)`
- Removes org id from `coder provisioner keys list`
2024-09-27 10:34:41 -05:00
Danielle Maywood ae522c558d feat: add agent timings (#14713)
* feat: begin impl of agent script timings

* feat: add job_id and display_name to script timings

* fix: increment migration number

* fix: rename migrations from 251 to 254

* test: get tests compiling

* fix: appease the linter

* fix: get tests passing again

* fix: drop column from correct table

* test: add fixture for agent script timings

* fix: typo

* fix: use job id used in provisioner job timings

* fix: increment migration number

* test: behaviour of script runner

* test: rewrite test

* test: does exit 1 script break things?

* test: rewrite test again

* fix: revert change

Not sure how this came to be, I do not recall manually changing
these files.

* fix: let code breathe

* fix: wrap errors

* fix: justify nolint

* fix: swap require.Equal argument order

* fix: add mutex operations

* feat: add 'ran_on_start' and 'blocked_login' fields

* fix: update testdata fixture

* fix: refer to agent_id instead of job_id in timings

* fix: JobID -> AgentID in dbauthz_test

* fix: add 'id' to scripts, make timing refer to script id

* fix: fix broken tests and convert bug

* fix: update testdata fixtures

* fix: update testdata fixtures again

* feat: capture stage and if script timed out

* fix: update migration number

* test: add test for script api

* fix: fake db query

* fix: use UTC time

* fix: ensure r.scriptComplete is not nil

* fix: move err check to right after call

* fix: uppercase sql

* fix: use dbtime.Now()

* fix: debug log on r.scriptCompleted being nil

* fix: ensure correct rbac permissions

* chore: remove DisplayName

* fix: get tests passing

* fix: remove space in sql up

* docs: document ExecuteOption

* fix: drop 'RETURNING' from sql

* chore: remove 'display_name' from timing table

* fix: testdata fixture

* fix: put r.scriptCompleted call in goroutine

* fix: track goroutine for test + use separate context for reporting

* fix: appease linter, handle trackCommandGoroutine error

* fix: resolve race condition

* feat: replace timed_out column with status column

* test: update testdata fixture

* fix: apply suggestions from review

* revert: linter changes
2024-09-24 10:51:49 +01:00
Danielle Maywood 86f68b220e feat: add 'display_name' column to 'workspace_agent_scripts' (#14747)
* feat: add 'display_name' column to 'workspace_agent_scripts'

* fix: backfill from workspace_agent_log_sources

* fix: run 'make gen'
2024-09-20 14:26:13 +01:00
Ethan 37885e2e82 fix: make cli respect deployment --docs-url (#14568) 2024-09-18 21:47:53 +10:00
Ethan fccf6f1e0e feat!: add --default-token-lifetime (#14631) 2024-09-18 21:23:42 +10:00
Jon Ayers 45160c7679 feat: add schema for key rotation (#14662) 2024-09-17 18:08:18 +01:00
Steven Masley ce21b2030a feat: implement patch and get api methods for role sync (#14692)
* feat: implement patch and get api methods for role sync
2024-09-17 10:38:42 -05:00
Garrett Delfosse 335eb05223 feat: add keys to organization provision daemons (#14627) 2024-09-16 20:02:08 +00:00
Bruno Quaresma 705b9ccda8 feat(coderd): add workspace timings endpoint (#14648) 2024-09-16 16:31:05 -03:00
Steven Masley c330af0e4d chore: add group_ids filter to /groups endpoint (#14688)
Allow filtering groups by IDs.
2024-09-16 13:01:46 -05:00
Kayla Washburn-Love 5ed065d88d feat: get and update group IdP Sync settings (#14647)
---------

Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2024-09-16 12:01:37 -05:00
Spike Curtis 2df9a3e554 fix: fix tailnet remoteCoordination to wait for server (#14666)
Fixes #12560

When gracefully disconnecting from the coordinator, we would send the Disconnect message and then close the dRPC stream.  However, closing the dRPC stream can cause the server not to process the Disconnect message, since we use the stream context in a `select` while sending it to the coordinator.

This is a product bug uncovered by the flake, and probably results in us failing graceful disconnect some minority of the time.

Instead, the `remoteCoordination` (and `inMemoryCoordination` for consistency) should send the Disconnect message and then wait for the coordinator to hang up (on some graceful disconnect timer, in the form of a context).
2024-09-16 09:24:30 +04:00
Spike Curtis fb3523b37f chore: remove legacy AgentIP address (#14640)
Removes the support for the Agent's "legacy IP" which was a hardcoded IP address all agents used to use, before we introduced "single tailnet". Single tailnet went GA in 2.7.0.
2024-09-12 07:40:19 +04:00
Ethan 2a9234e9ba fix: remove coderdtest dependency from codersdk (#14633) 2024-09-10 20:55:50 +10:00
Danielle Maywood 25f1ddbf5e feat: add 'hidden' option to 'coder_app' to hide app from UI (#14570)
Add 'hidden' property to 'coder_app' resource to allow hiding apps from the UI.
2024-09-09 14:39:32 +01:00