Commit Graph

22 Commits

Author SHA1 Message Date
Zach b94a0aebcd fix(coderd/externalauth): isolate TestValidateToken transports to fix flake (#25015)
This change uses separate http clients/transports in TestValidateToken
subtests. Previously parallel subtests of TestValidateToken shared
a http.DefaultTransport. When one subtest's httptest.Server.Close() ran in
t.Cleanup, it called http.DefaultTransport.CloseIdleConnections, which
could interrupt connection(s) used in another subtest.
2026-05-07 08:38:20 -06:00
Jason Barnett da6e708bd2 fix(coderd/externalauth): detect concurrent refresh race to prevent cache poisoning (#24228)
<!--

If you have used AI to produce some or all of this PR, please ensure you
have read our [AI Contribution
guidelines](https://coder.com/docs/about/contributing/AI_CONTRIBUTING)
before submitting.

-->

Fixes https://github.com/coder/coder/issues/17069

Builds on #24332 and #24334 which addressed token persistence and rate
limit handling.

## Problem

When multiple concurrent requests race to refresh an expiring external
auth token, providers with single-use refresh tokens (e.g., GitHub Apps)
reject all but the first refresh attempt with `bad_refresh_token`. The
losing request caches this transient error in the
`oauth_refresh_failure_reason` database column and clears the refresh
token, blocking all subsequent refresh attempts until the user manually
re-authenticates.

This is common for users with multiple terminals, IDE connections, or
workspaces open, all of which poll the external auth endpoint and
trigger concurrent refreshes when the token nears expiry. Database
analysis showed 5 of 7 affected users failed within 5-10 seconds of
token expiry, matching the Go oauth2 library's `expiryDelta` window.

## Fix

Before caching a `bad_refresh_token` failure, re-read the external auth
link from the database. If the refresh token has changed (indicating a
concurrent caller already refreshed successfully), return the winner's
updated link instead of writing a failure. An empty-string guard ensures
a token cleared by another loser isn't mistaken for a winner's
successful refresh.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Garrett Delfosse <garrett@coder.com>
2026-05-04 14:02:07 -04:00
Mathias Fredriksson 1926b7e658 fix(coderd/externalauth): detect rate-limit 403/429 and narrow isFailedRefresh (#24334)
ValidateToken treated all 403 responses as "token invalid," including
GitHub rate limits. isFailedRefresh included 403 in the status code
fallthrough, destroying tokens on rate-limited refresh attempts.

Split the combined 401/403 check in ValidateToken into a switch on
status code. On 403, inspect X-RateLimit-Remaining and Retry-After
headers; if either indicates a rate limit, return optimistically valid.
Handle 429 the same way. Plain 403 without rate-limit headers preserves
the existing invalid-token behavior.

Add incorrect_client_credentials and invalid_client to isFailedRefresh
error code switch. Remove 403 from the status code fallthrough since no
known provider returns 403 from the token endpoint.
2026-04-28 18:03:35 +03:00
Mathias Fredriksson 2a1984f0e8 fix(coderd/externalauth): save refreshed token before validation (#24332)
GitHub rotates refresh tokens on use, invalidating the old token
immediately. If post-refresh validation fails (e.g. rate-limited
403 from /user), the new token was silently discarded because the
DB save only happened after successful validation. The next refresh
attempt would use the stale refresh token, fail permanently, and
destroy the token.

Move the UpdateExternalAuthLink call to immediately after
TokenSource.Token() succeeds. The post-validation save block is
removed (dead code after the early save). The DB write uses a
detached context (context.WithoutCancel) so a canceled request
cannot prevent persistence of the already-consumed refresh token.
2026-04-18 14:28:29 +03:00
Cian Johnston bc27274aba feat(coderd): refactors github pr sync functionality (#22715)
- Adds `_API_BASE_URL` to `CODER_EXTERNAL_AUTH_CONFIG_`
- Extracts and refactors existing GitHub PR sync logic to new packages
`coderd/gitsync` and `coderd/externalauth/gitprovider`
- Associated wiring and tests

Created using Opus 4.6
2026-03-10 18:46:01 +00:00
Kyle Carberry 53e52aef78 fix(externalauth): prevent race condition in token refresh with optimistic locking (#22904)
## Problem

When multiple concurrent callers (e.g., parallel workspace builds) read
the same single-use OAuth2 refresh token from the database and race to
exchange it with the provider, the first caller succeeds but subsequent
callers get `bad_refresh_token`. The losing caller then **clears the
valid new token** from the database, permanently breaking the auth link
until the user manually re-authenticates.

This is reliably reproducible when launching multiple workspaces
simultaneously with GitHub App external auth and user-to-server token
expiration enabled.

## Solution

Two layers of protection:

### 1. Singleflight deduplication (`Config.RefreshToken` +
`ObtainOIDCAccessToken`)

Concurrent callers for the same user/provider share a single refresh
call via `golang.org/x/sync/singleflight`, keyed by `userID`. The
singleflight callback re-reads the link from the database to pick up any
token already refreshed by a prior in-flight call, avoiding redundant
IDP round-trips entirely.

### 2. Optimistic locking on `UpdateExternalAuthLinkRefreshToken`

The SQL `WHERE` clause now includes `AND oauth_refresh_token =
@old_oauth_refresh_token`, so if two replicas (HA) race past
singleflight, the loser's destructive UPDATE is a harmless no-op rather
than overwriting the winner's valid token.

## Changes

| File | Change |
|------|--------|
| `coderd/externalauth/externalauth.go` | Added `singleflight.Group` to
`Config`; split `RefreshToken` into public wrapper +
`refreshTokenInner`; pass `OldOauthRefreshToken` to DB update |
| `coderd/provisionerdserver/provisionerdserver.go` | Wrapped OIDC
refresh in `ObtainOIDCAccessToken` with package-level singleflight |
| `coderd/database/queries/externalauth.sql` | Added optimistic lock
(`WHERE ... AND oauth_refresh_token = @old_oauth_refresh_token`) |
| `coderd/database/queries.sql.go` | Regenerated |
| `coderd/database/querier.go` | Regenerated |
| `coderd/database/dbauthz/dbauthz_test.go` | Updated test params for
new field |
| `coderd/externalauth/externalauth_test.go` | Added
`ConcurrentRefreshDedup` test; updated existing tests for singleflight
DB re-read |

## Testing

- **New test `ConcurrentRefreshDedup`**: 5 goroutines call
`RefreshToken` concurrently, asserts IDP refresh called exactly once,
all callers get same token.
- All existing `TestRefreshToken/*` subtests updated and passing.
- `TestObtainOIDCAccessToken` passing.
- `dbauthz` tests passing.
2026-03-10 13:52:55 -04:00
Danielle Maywood f91475cd51 test: remove unnecessary dbauthz.AsSystemRestricted calls in tests (#22663) 2026-03-05 20:29:49 +00:00
Steven Masley 8fefd91e4a feat!: support PKCE in the oauth2 client's auth/exchange flow (#21215)
**Breaking Change:** Existing oauth apps might now use PKCE. If an
unknown IdP type was being used, and it does not support PKCE, it will
break.

To fix, set the PKCE methods on the external auth to `none`
```
export CODER_EXTERNAL_AUTH_1_PKCE_METHODS=none
```
2025-12-15 17:41:47 +00:00
Paweł Banaszewski 439b041780 feat: add best effort attempt to revoke oauth access token in external auth provider (#19775)
Solves #15575
Adds OAuth access token revocation when unlinking external auth
provider. Due to revocation not being consistently implemented by
providers this is only best effort attempt. Unsuccessful revocation
won't influence link removal.
2025-09-19 16:27:02 +02:00
Dean Sheather 6eb02d1c2a chore: wire up usage tracking for managed agents (#19096)
Wires up the usage collector and publisher to coderd.

Relates to coder/internal#814
2025-08-20 23:38:09 +10:00
Steven Masley 4926410146 feat: keep original token refresh error in external auth (#19339)
External auth refresh errors lose the original error thrown on the first
refresh. This PR saves that error to the database to be raised on
subsequent refresh attempts
2025-08-14 09:50:31 -05:00
ケイラ fae30a00fd chore: remove unnecessary redeclarations in for loops (#18440) 2025-06-20 13:16:55 -06:00
Hugo Dutka 1d48131e98 chore(coderd/externalauth): remove dbmem from tests (#18147)
Related to https://github.com/coder/coder/issues/15109
2025-06-02 14:42:18 +02:00
Cian Johnston e744cde86f fix(coderd): ensure that clearing invalid oauth refresh tokens works with dbcrypt (#15721)
https://github.com/coder/coder/pull/15608 introduced a buggy behaviour
with dbcrypt enabled.
When clearing an oauth refresh token, we had been setting the value to
the empty string.
The database encryption package considers decrypting an empty string to
be an error, as an empty encrypted string value will still have a nonce
associated with it and thus not actually be empty when stored at rest.

Instead of 'deleting' the refresh token, 'update' it to be the empty
string.
This plays nicely with dbcrypt.

It also adds a 'utility test' in the dbcrypt package to help encrypt a
value. This was useful when manually fixing users affected by this bug
on our dogfood instance.
2024-12-03 13:26:31 -06:00
Steven Masley 78f9f43c97 chore: do not refresh tokens that have already failed refreshing (#15608)
Once a token refresh fails, we remove the `oauth_refresh_token` from the
database. This will prevent the token from hitting the IDP for
subsequent refresh attempts.

Without this change, a bad script can cause a failing token to hit a
remote IDP repeatedly with each `git` operation. With this change, after
the first hit, subsequent hits will fail locally, and never contact the
IDP.

The solution in both cases is to authenticate the external auth link. So
the resolution is the same as before.
2024-11-20 20:13:07 -06:00
Steven Masley 24ba81930b chore: return failed refresh errors on external auth as string (was boolean) (#13402)
* chore: return failed refresh errors on external auth

Failed refreshes should return errors. These errors are captured
as validate errors.
2024-06-03 09:33:49 -05:00
Steven Masley 566f8f231d chore: add unit test for pass through external auth query params (#12928)
* chore: verify pass through external auth query params

Unit test added to verify behavior of query params set in the
auth url for external apps. This behavior is intended to specifically
support Auth0 audience query param.
2024-04-10 13:58:29 -05:00
Steven Masley 50b78e3325 chore: instrument external oauth2 requests (#11519)
* chore: instrument external oauth2 requests

External requests made by oauth2 configs are now instrumented into prometheus metrics.
2024-01-10 09:13:30 -06:00
Kyle Carberry bb4ce87242 fix: add support for custom auth header with client secret (#10513)
This fixes OAuth2 with JFrog Artifactory.
2023-11-03 16:26:30 +00:00
Kyle Carberry 5abfe5afd0 chore: rename dbfake to dbmem (#10432) 2023-10-30 17:42:20 +00:00
Kyle Carberry 863c2e7b64 feat: allow storing extra oauth token properties in the database (#10152) 2023-10-09 18:49:30 -05:00
Kyle Carberry 45b53c285f feat: allow external services to be authable (#9996)
* feat: allow external services to be authable

* Refactor external auth config structure for defaults

* Add support for new config properties

* Change the name of external auth

* Move externalauth -> external-auth

* Run gen

* Fix tests

* Fix MW tests

* Fix git auth redirect

* Fix lint

* Fix name

* Allow any ID

* Fix invalid type test

* Fix e2e tests

* Fix comments

* Fix colors

* Allow accepting any type as string

* Run gen

* Fix href
2023-10-03 14:04:39 +00:00