Refactors our use of `slogtest` to instantiate a "standard logger" across most of our tests. This standard logger incorporates https://github.com/coder/slog/pull/217 to also ignore database query canceled errors by default, which are a source of low-severity flakes.
Any test that has set non-default `slogtest.Options` is left alone. In particular, `coderdtest` defaults to ignoring all errors. We might consider revisiting that decision now that we have better tools to target the really common flaky Error logs on shutdown.
Bumps the Tailnet and Agent API version 2.3, and creates some extra controls and machinery around these versions.
What happened is that we accidentally shipped two new API features without bumping the version. `ScriptCompleted` on the Agent API in Coder v2.16 and `RefreshResumeToken` on the Tailnet API in Coder v2.15.
Since we can't easily retroactively bump the versions, we'll roll these changes into API version 2.3 along with the new WorkspaceUpdates RPC, which hasn't been released yet. That means there is some ambiguity in Coder v2.15-v2.17 about exactly what methods are supported on the Tailnet and Agent APIs. This isn't great, but hasn't caused us major issues because
1. RefreshResumeToken is considered optional, and clients just log and move on if the RPC isn't supported.
2. Agents basically never get started talking to a Coderd that is older than they are, since the agent binary is normally downloaded from Coderd at workspace start.
Still it's good to get things squared away in terms of versions for SDK users and possible edge cases around client and server versions.
To mitigate against this thing happening again, this PR also:
1. adds a CODEOWNERS for the API proto packages, so I'll review changes
2. defines interface types for different API versions, and has the agent explicitly use a specific version. That way, if you add a new method, and try to use it in the agent without thinking explicitly about versions, it won't compile.
With the protocol controllers stuff, we've sort of already abstracted the Tailnet API such that the interface type strategy won't work, but I'll work on getting the Controller to be version aware, such that it can check the API version it's getting against the controllers it has -- in a later PR.
Refactors the way clients of the Tailnet API (clients of the API, which include both workspace "agents" and "clients") interact with the API. Introduces the idea of abstract "controllers" for each of the RPCs in the API, and implements a Coordination controller by refactoring from `workspacesdk`.
chore re: #14729
* feat: begin impl of agent script timings
* feat: add job_id and display_name to script timings
* fix: increment migration number
* fix: rename migrations from 251 to 254
* test: get tests compiling
* fix: appease the linter
* fix: get tests passing again
* fix: drop column from correct table
* test: add fixture for agent script timings
* fix: typo
* fix: use job id used in provisioner job timings
* fix: increment migration number
* test: behaviour of script runner
* test: rewrite test
* test: does exit 1 script break things?
* test: rewrite test again
* fix: revert change
Not sure how this came to be, I do not recall manually changing
these files.
* fix: let code breathe
* fix: wrap errors
* fix: justify nolint
* fix: swap require.Equal argument order
* fix: add mutex operations
* feat: add 'ran_on_start' and 'blocked_login' fields
* fix: update testdata fixture
* fix: refer to agent_id instead of job_id in timings
* fix: JobID -> AgentID in dbauthz_test
* fix: add 'id' to scripts, make timing refer to script id
* fix: fix broken tests and convert bug
* fix: update testdata fixtures
* fix: update testdata fixtures again
* feat: capture stage and if script timed out
* fix: update migration number
* test: add test for script api
* fix: fake db query
* fix: use UTC time
* fix: ensure r.scriptComplete is not nil
* fix: move err check to right after call
* fix: uppercase sql
* fix: use dbtime.Now()
* fix: debug log on r.scriptCompleted being nil
* fix: ensure correct rbac permissions
* chore: remove DisplayName
* fix: get tests passing
* fix: remove space in sql up
* docs: document ExecuteOption
* fix: drop 'RETURNING' from sql
* chore: remove 'display_name' from timing table
* fix: testdata fixture
* fix: put r.scriptCompleted call in goroutine
* fix: track goroutine for test + use separate context for reporting
* fix: appease linter, handle trackCommandGoroutine error
* fix: resolve race condition
* feat: replace timed_out column with status column
* test: update testdata fixture
* fix: apply suggestions from review
* revert: linter changes
This adds the ability for `TunnelAuth` to also authorize incoming wireguard node IPs, preventing agents from reporting anything other than their static IP generated from the agent ID.
Changes the agent to use the new v2 API for sending logs, via the logSender component.
We keep the PatchLogs function around, but deprecate it so that we can test the v1 endpoint.
`agentsdk` depends on `agent/proto` because it needs to get the version to dial.
Therefore, the conversion routines need to live in `agentsdk` so that we can convert to and from the Manifest.
I briefly considered refactoring the agent to only reference `proto.Manifest`, but decided against it because we might have multiple protocol versions in the future, its useful to have a protocol-independent data structure.
This one is huge, and I'm sorry.
The problem is that once I change `tailnet.Conn` to start doing v2 behavior, I kind of have to change it everywhere, including in CoderSDK (CLI), the agent, wsproxy, and ServerTailnet.
There is still a bit more cleanup to do, and I need to add code so that when we lose connection to the Coordinator, we mark all peers as LOST, but that will be in a separate PR since this is big enough!
* Adds agenttest.New() helper function
* Makes sure agent gets closed on test cleanup
* Makes sure you don't forget to set session token
* Sets the agent and client logger automatically
* chore: add /v2 to import module path
go mod requires semantic versioning with versions greater than 1.x
This was a mechanical update by running:
```
go install github.com/marwan-at-work/mod/cmd/mod@latest
mod upgrade
```
Migrate generated files to import /v2
* Fix gen
* chore: rename startup logs to agent logs
This also adds a `source` property to every agent log. It
should allow us to group logs and display them nicer in
the UI as they stream in.
* Fix migration order
* Fix naming
* Rename the frontend
* Fix tests
* Fix down migration
* Match enums for workspace agent logs
* Fix inserting log source
* Fix migration order
* Fix logs tests
* Fix psql insert