Fixes https://github.com/coder/internal/issues/214#15475 missed that we also write to `rpty` after starting
`rpty.lifecycle()`.
This PR moves the function call right at the end. Hopefully this should
address the data races before we go resorting to mutexes.
Bumps the Tailnet and Agent API version 2.3, and creates some extra controls and machinery around these versions.
What happened is that we accidentally shipped two new API features without bumping the version. `ScriptCompleted` on the Agent API in Coder v2.16 and `RefreshResumeToken` on the Tailnet API in Coder v2.15.
Since we can't easily retroactively bump the versions, we'll roll these changes into API version 2.3 along with the new WorkspaceUpdates RPC, which hasn't been released yet. That means there is some ambiguity in Coder v2.15-v2.17 about exactly what methods are supported on the Tailnet and Agent APIs. This isn't great, but hasn't caused us major issues because
1. RefreshResumeToken is considered optional, and clients just log and move on if the RPC isn't supported.
2. Agents basically never get started talking to a Coderd that is older than they are, since the agent binary is normally downloaded from Coderd at workspace start.
Still it's good to get things squared away in terms of versions for SDK users and possible edge cases around client and server versions.
To mitigate against this thing happening again, this PR also:
1. adds a CODEOWNERS for the API proto packages, so I'll review changes
2. defines interface types for different API versions, and has the agent explicitly use a specific version. That way, if you add a new method, and try to use it in the agent without thinking explicitly about versions, it won't compile.
With the protocol controllers stuff, we've sort of already abstracted the Tailnet API such that the interface type strategy won't work, but I'll work on getting the Controller to be version aware, such that it can check the API version it's getting against the controllers it has -- in a later PR.
re: #14730
Adds support for the workspace updates protocol controller to also program DNS names for each agent.
Right now, we only program names like `myagent.myworkspace.me.coder` and `myworkspace.coder.` (if there is exactly one agent in the workspace). We also want to support `myagent.myworkspace.username.coder.`, but for that we need to update WorkspaceUpdates RPC to also send the workspace owner's username, which will be in a separate PR.
Move claims from a `debug` column to an actual typed column to be used.
This does not functionally change anything, it just adds some Go typing to build
on.
The hardcoded image is an anti-pattern, leading to weird errors if the
`docker` group is absent. We should either provide a better error
in-product or just have a better image.
@matifali - also down to use a Devcontainers universal image instead or
make this a parameter. Let me know what you think the best "default
install" is
re: #14730
Adds a protocol controller for WorkspaceUpdates RPC that takes all the agents we learn about over the RPC, and programs them into the Coordination controller, so that we set up tunnels to all the agents.
Handling DNS is in a PR up the stack, as is actually wiring it up to anything.
- Adds documentation for how to correctly hold --parameter with list(string)
- Adds tests for the aforementioned documented correct finger positions for --parameter list(string)
Adds a `--things` flag to our `multi-select` example prompt command.
```
go run ./cmd/coder exp prompt-example multi-select --things=Code,Bike,Potato=mashed
"Code, Bike, Potato=mashed" are nice choices.
```
Relates to https://github.com/coder/coder/issues/15082
The old implementation of `GetWorkspacesEligibleForTransition` returns
many workspaces that are not actually eligible for transition. This new
implementation reduces this number significantly (at least on our
dogfood instance).
Related to #15309
As we already are doing for agent logs - this PR is enabling the logs
rotation for coderd logs.
Currently keeping the same logic than we had for agent - with 5MB as the
file size for rotation.
- Assert rbac in fake notifications enqueuer
- Move fake notifications enqueuer to separate notificationstest package
- Update dbauthz rbac policy to allow provisionerd and autostart to create and read notification messages
- Update tests as required
Fixes test flakes _a la_
```
t.go:108: 2024-11-05 09:52:37.996 [erro] workspacestats: failed to load template schedule bumping activity, defaulting to bumping by 60min request_id=f14215d2-73dc-47ba-aa81-422c62f257e4 workspace_id=545d73c7-3a62-4466-8c08-b6abb12867b7 template_id=49747428-3abb-40e4-a6b2-03653e9f2506 ...
error= fetch object:
github.com/coder/coder/v2/coderd/database/dbauthz.(*querier).GetTemplateByID.fetch[...].func1
/home/runner/work/coder/coder/coderd/database/dbauthz/dbauthz.go:497
- pq: canceling statement due to user request
*** slogtest: log detected at level ERROR; TEST FAILURE ***
```
seen here on main: https://github.com/coder/coder/actions/runs/11681605747/job/32527006174
Whilst the `networking-troubleshooting` docs page already mentions that
a direct connection can be established over a private network, even if
there are no STUN servers, it's worth this is the case at the end of the
ping output.
This also removes a print statement that was dirtying up the diagnostic
output, and corrects the name of the `--disable-direct-connections`
flag.
Closes https://github.com/coder/coder/issues/12612
The problem in the linked issue was caused due to a mismatch of when the
Web UI tooltip shows up (2 hours before an autostop requirement) and the
leeway in the `autostop_requirement` algorithm (workspace builds must be
1 hour before an autostop requirement to skip them).
Now, restarting your workspace whilst the tooltip is showing will skip
the upcoming autostop requirement.
This also could have been fixed by only showing the tooltip one hour
before the autostop requirement, but it looks like 1 hour was chosen
arbitrarily, and it doesn't hurt to give users more time to skip the
autostop.
Fixes https://github.com/coder/coder/issues/12687
There was a race condition where we would start the rpty lifecycle
before generating the ID, leading to a data race where we would try to
concurrently read and write the struct field.