Commit Graph

3372 Commits

Author SHA1 Message Date
Steven Masley 79d5c238cc fix: always return a clean http client for promoauth (#11963)
* fix: add unit test to verify default client is not broken

* always return a clean http client
* No need to clone the tripper
2024-02-01 11:13:34 -05:00
Spike Curtis 1aa117b9ec chore: rename client Listen to ConnectRPC (#11916)
ConnectRPC seems more appropriate for this function
2024-02-01 14:44:11 +04:00
Spike Curtis d5a98cc6d7 fix: avoid race in TestPGPubsub_Metrics by using Eventually (#11973)
Annoyingly, prometheus Registry collects metrics async, which is causing our test to be racy.  They also don't export enough from the Metric interface for us to replicate a synchronous collect, so we have to use Eventually to test.
2024-02-01 12:10:19 +04:00
Spike Curtis 5a359d50dd feat: add metrics to PGPubsub (#11971)
Adds prometheus metrics to PGPubsub for monitoring its health and performance in production.

Related to #11950 --- additional diagnostics to help figure out what's happening
2024-02-01 11:25:03 +04:00
Colin Adler 3ace7982aa fix: rewrite url to agent ip in single tailnet (#11810)
This restores previous behavior of being able to cache connections
across agents in single tailnet.
2024-02-01 00:25:52 -06:00
Colin Adler 4ed1f5581a chore(coderd): add logging to agent rpc yamux conn (#11965) 2024-01-31 23:17:20 -06:00
Spike Curtis b79785c86f feat: move agent v2 API connection monitoring to yamux layer (#11910)
Moves monitoring of the agent v2 API connection to the yamux layer.

Present behavior monitors this at the websocket layer, and closes the websocket on completion. This can cause yamux to hit unexpected errors since the connection is closed underneath it.

This might be the cause of yamux errors that some customers are seeing

![image.png](https://graphite-user-uploaded-assets-prod.s3.amazonaws.com/tCz4CxRU9jhAJ7zH8RTi/53b8b5ef-e9e5-44a5-b559-99c37c136071.png)

In any case, it's more graceful to close yamux first and let yamux close the underlying websocket.  That should limit yamux error logging to truly unexpected/error cases.

The only downside is that the yamux `Close()` doesn't accept a reason, so if the agent becomes outdated and we close the API connection, the agent just sees the connection close without a reason.  I'm not sure we log this at the agent anyway, but it would be nice.  I think more accurate logging on Coderd are more important.

I've also added some logging when the monitor disconnects for reasons other than the context being canceled (e.g. agent outdated, failed pings).
2024-02-01 08:18:35 +04:00
Steven Masley ac64155282 fix: strip timezone information from a date in dau response (#11962)
* fix: strip timezone information from a date in dau response

Timezone information is lost, so do not forward it to the client.

* fix: timezone offset should be flipped
* Make tests deterministic
2024-01-31 16:01:50 -06:00
Marcin Tojek 13cbca679e feat: support template bundles as zip archives (#11839) 2024-01-31 14:49:55 +01:00
Mathias Fredriksson b25deaae20 fix(coderd/database): fix limit in GetUserWorkspaceBuildParameters (#11954) 2024-01-31 13:56:36 +02:00
Spike Curtis a34cada09a feat: add logging to pgPubsub (#11953)
Should be helpful for #11950

Adds a logger to pgPubsub and logs various events, most especially connection and disconnection from postgres.
2024-01-31 15:49:16 +04:00
Jon Ayers 0c30dde9b5 feat: add customizable upgrade message on client/server version mismatch (#11587) 2024-01-30 17:11:37 -06:00
Ammar Bandukwala adbb025e74 feat: add user-level parameter autofill (#11731)
This PR solves #10478 by auto-filling previously used template values in create and update workspace flows.

I decided against explicit user values in settings for these reasons:

* Autofill is far easier to implement
* Users benefit from autofill _by default_ — we don't need to teach them new concepts
* If we decide that autofill creates more harm than good, we can remove it without breaking compatibility
2024-01-30 16:02:21 -06:00
Colin Adler 2fd1a726aa fix: only delete expired agents on success (#11940) 2024-01-30 14:11:45 -06:00
Colin Adler 27f3b7a814 fix: add timeout to listening ports request (#11935)
This can potentially hang for 15m if the agent is unreachable.
2024-01-30 13:53:52 -06:00
Bruno Quaresma dcab6fa5a4 feat(site): display user avatar (#11893)
* add owner API to workspace and workspace build responses
* display user avatar in workspace top bar

Co-authored-by: Cian Johnston <cian@coder.com>
2024-01-30 17:07:06 +00:00
Spike Curtis 0fc177203e feat: use agent v2 API to update app health (#11889)
Use the Agent v2 API to update App Health
2024-01-30 11:35:12 +04:00
Spike Curtis 2599850e54 feat: use agent v2 API to post startup (#11877)
Uses the v2 Agent API to post startup information.
2024-01-30 11:23:28 +04:00
Spike Curtis da8bb1c198 feat: use agent v2 API to fetch manifest (#11832)
Agent uses the v2 API to obtain the manifest, instead of the HTTP API.
2024-01-30 10:11:28 +04:00
Spike Curtis 0eff646c31 chore: move proto to sdk conversion to agentsdk (#11831)
`agentsdk` depends on `agent/proto` because it needs to get the version to dial.

Therefore, the conversion routines need to live in `agentsdk` so that we can convert to and from the Manifest.

I briefly considered refactoring the agent to only reference `proto.Manifest`, but decided against it because we might have multiple protocol versions in the future, its useful to have a protocol-independent data structure.
2024-01-30 09:04:56 +04:00
Spike Curtis 1e8a9c09fe chore: remove legacy wsconncache (#11816)
Fixes #8218

Removes `wsconncache` and related "is legacy?" functions and API calls that were used by it.

The only leftover is that Agents still use the legacy IP, so that back level clients or workspace proxies can dial them correctly.

We should eventually remove this: #11819
2024-01-30 07:56:36 +04:00
Spike Curtis 13e24f21e4 feat: use Agent v2 API for Service Banner (#11806)
Agent uses the v2 API for the service banner, rather than the v1 HTTP API.

One of several for #10534
2024-01-30 07:44:47 +04:00
Jon Ayers 4f5a2f0a9b feat: add backend for jfrog xray support (#11829) 2024-01-29 19:30:02 -06:00
Spike Curtis 207328ca50 feat: use appearance.Fetcher in agentapi (#11770)
This PR updates the Agent API to use the appearance.Fetcher, which is set by entitlement code in Enterprise coderd.

This brings the agentapi into compliance with the Enterprise feature.
2024-01-29 21:22:50 +04:00
Spike Curtis b2bc3fff33 fix: wait for new template version before promoting (#11874)
Fixes a test flake due to not waiting for the correct template version prior to promoting it.
2024-01-29 19:29:56 +04:00
Steven Masley 04a23261e6 chore: ensure github uids are unique (#11826) 2024-01-29 09:13:46 -06:00
Steven Masley d66e6e78ee fix: always attempt external auth refresh when fetching (#11762) (#11830)
* fix: always attempt external auth refresh when fetching
* refactor validate to check expiry when considering "valid"
2024-01-29 08:55:15 -06:00
Spike Curtis bc4ae53261 chore: refactor Appearance to an interface callable by AGPL code (#11769)
The new Agent API needs an interface for ServiceBanners, so this PR creates it and refactors the AGPL and Enterprise code to achieve it.

Before we depended on the fact that the HTTP endpoint was missing to serve an empty ServiceBanner on AGPL deployments, but that won't work with dRPC, so we need a real interface to call.
2024-01-29 12:17:31 +04:00
Marcin Tojek aacb4a2b4c feat: use map instead of slice in metrics aggregator (#11815) 2024-01-29 09:12:41 +01:00
Cian Johnston 42e997d39e fix(coderd/rbac): do not cache context cancellation errors (#11840)
#7439 added global caching of RBAC results.
Calls are cached based on hash(subject, object, action).
We often use dbauthz.AsSystemRestricted to handle "internal" authz calls, and these are often repeated with similar arguments and are likely to get cached.
So a transient error doing an authz check on a system function will be cached for up to a minute.
I'm just starting off with excluding context.Canceled but there's likely a whole suite of different errors we want to also exclude from the global cache.
2024-01-26 16:19:55 +00:00
Dean Sheather 29707099d7 chore: add agentapi tests (#11269) 2024-01-26 07:04:19 +00:00
Steven Masley 005c014f13 chore: instrument additional github api calls (#11824)
* chore: instrument additional githubapi calls

This only affects github as a login source, not external auth.
2024-01-25 18:34:46 -06:00
Ammar Bandukwala 79568bf628 Revert "fix: always attempt external auth refresh when fetching (#11762)"
This reverts commit 0befc0826a.
2024-01-25 14:22:47 -06:00
Steven Masley 0befc0826a fix: always attempt external auth refresh when fetching (#11762)
* fix: always attempt external auth refresh when fetching
* refactor validate to check expiry when considering "valid"
2024-01-25 10:54:56 -06:00
Cian Johnston 8eae4f83bf fix(coderd/provisionerdserver): fix test flake in TestHeartbeat (#11808) 2024-01-25 12:05:57 +00:00
Cian Johnston 4616ccf462 fix(coderd): alter return signature of convertWorkspace, add check for requesterID (#11796) 2024-01-24 14:13:14 +00:00
Cian Johnston f92336c4d5 feat(coderd): allow workspace owners to mark workspaces as favorite (#11791)
- Adds column `favorite` to workspaces table
- Adds API endpoints to favorite/unfavorite workspaces
- Modifies sorting order to return owners' favorite workspaces first
2024-01-24 13:39:19 +00:00
Spike Curtis 5cbb76b47a fix: stop spamming DERP map updates for equivalent maps (#11792)
Fixes 2 related issues:

1. wsconncache had incorrect logic to test whether to send DERPMap updates, sending if the maps were equivalent, instead of if they were _not equivalent_.
2. configmaps used a bugged check to test equality between DERPMaps, since it contains a map and the map entries are serialized in random order. Instead, we avoid comparing the protobufs and instead depend on the existing function that compares `tailcfg.DERPMap`. This also has the effect of reducing the number of times we convert to and from protobuf.
2024-01-24 16:27:15 +04:00
Spike Curtis f5dbc718a7 fix: accept agent RPC connection without version query parameter (#11790)
Fixes an issue where Coder v2.7.1 agents connect to /api/v2/workspaceagents/me/rpc without a version query parameter
2024-01-24 09:10:16 +04:00
Colin Adler 13beb04521 fix: disable keepalives in workspaceapps transport (#11789)
Connection caching causes requests to hit the wrong workspaces. See
comment.

Fixes https://github.com/coder/coder/issues/11767
2024-01-24 14:46:59 +10:00
Jon Ayers 383eed93f8 fix: use correct logger for lifecycle_executor (#11763) 2024-01-23 14:33:55 -06:00
Steven Masley d6ba0dfecb feat: add "updated" search param to workspaces (#11714)
* feat: add "updated" search param to workspaces
* rego -> sql needs to specify which <table>.organization_id
2024-01-23 11:52:06 -06:00
Steven Masley 081fbef097 fix: code-server path based forwarding, defer to code-server (#11759)
Do not attempt to construct a path based port forward url.
Always defer to code server, as it has it's own proxy method.
2024-01-23 11:36:44 -06:00
Spike Curtis 059e533544 feat: agent uses Tailnet v2 API for DERPMap updates (#11698)
Switches the Agent to use Tailnet v2 API to get DERPMap updates.

Subsequent PRs will do the same for the CLI (`codersdk`) and `wsproxy`.
2024-01-23 14:42:07 +04:00
Spike Curtis 3e0e7f8739 feat: check agent API version on connection (#11696)
fixes #10531

Adds a check for `version` on connection to the Agent API websocket endpoint.  This is primarily for future-proofing, so that up-level agents get a sensible error if they connect to a back-level Coderd.

It also refactors the location of the `CurrentVersion` variables, to be part of the `proto` packages, since the versions refer to the APIs defined therein.
2024-01-23 14:27:49 +04:00
Spike Curtis eb12fd7d92 feat: make ServerTailnet set peers lost when it reconnects to the coordinator (#11682)
Adds support to `ServerTailnet` to set all peers lost before attempting to reconnect to the coordinator. In practice, this only really affects `wsproxy` since coderd has a local connection to the coordinator that only goes down if we're shutting down or change licenses.
2024-01-23 13:17:56 +04:00
Asher 3014777d2a feat: add endpoints to oauth2 provider applications (#11718)
These will show up when configuring the application along with the
client ID and everything else.  Should make it easier to configure the
application, otherwise you will have to go look up the URLs in the
docs (which are not yet written).

Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2024-01-22 13:25:25 -09:00
Steven Masley 8e0a153725 chore: implement device auth flow for fake idp (#11707)
* chore: implement device auth flow for fake idp
2024-01-22 20:46:05 +00:00
Asher 16c6cefde8 chore: pass lifetime directly into api key generate (#11715)
Rather than passing all the deployment values.  This is to make it
easier to generate API keys as part of the oauth flow.

I also added and fixed a test for when the lifetime is set and the
default and expiration are unset.

Co-authored-by: Steven Masley <stevenmasley@gmail.com>
2024-01-22 11:42:55 -09:00
Dean Sheather 15a90f028e chore: collect more template telemetry to gauge feature usage
We don't have visibility into some feature usage, so this adds a lot of fields missing from `database.Template` to `telemetry.Template`. Deprecation message is not collected, just whether it's set or not.
2024-01-22 18:55:27 +10:00