## Description
`Provider.InjectAuthHeader` is no longer needed. With the addition of `KeyFailoverConfig` in #24920, authentication is now applied per-attempt by `KeyFailoverTransport` on passthrough routes. This PR removes the dead method from the `Provider` interface, all implementations (`Anthropic`, `OpenAI`, `Copilot`), and the test mock.
The orphaned `InjectAuthHeader` unit tests are replaced with `Test{Anthropic,OpenAI,Copilot}_KeyFailoverConfig`. `TestPassthrough_KeyFailover` is also extended to cover Copilot in the BYOK scenario.
Related to: https://linear.app/codercom/issue/AIGOV-334/aibridge-follow-ups-from-key-failover-prs
> [!NOTE]
> Initially generated by Claude Opus 4.7, modified and reviewed by @ssncferreira
## Description
Cleans up how key pool errors are represented and how they get turned into HTTP responses. Consolidates two error types into a single type with a kind tag, and gives the response helpers in both providers consistent names.
## Changes
- Replaced the keypool sentinel and transient error struct with one error type that carries a kind and a retry-after duration.
- Updated `KeyFailoverConfig.BuildKeyPoolResponse` to take the typed key pool error, so each provider can shape the exhaustion response in its own format.
- Removed the per-provider `MarkKey` callback from `KeyFailoverConfig` since providers can rely on the shared `MarkKeyOnStatus` helper.
- Renamed the response-error helpers so OpenAI and Anthropic use the same naming.
Related to: https://linear.app/codercom/issue/AIGOV-334/aibridge-follow-ups-from-key-failover-prs
> [!NOTE]
> Initially generated by Claude Opus 4.7, modified and reviewed by @ssncferreira
Documents the known race in `EventStream.IsStreaming()` and the
resulting flake in
`TestStreamingInterception_AgenticLoopFailover/agentic_all_keys_fail `,
accepted rather than fixed since the inner agentic loop is on track to
be removed as part of the reverse proxy migration in coder/aibridge#223.
Full reasoning in coder/internal#1524.
My agent added `//nolint:testpackage` to a test file on one of my PRs.
Again. This PR cleans it up across the entire repo and updates the
in-repo conventions so future agents stop doing it.
The repo already has a precedent for white-box tests that need to touch
unexported symbols: `*_internal_test.go` (145+ existing files). The
`testpackage` linter's default `skip-regexp` exempts that filename
suffix, so the `//nolint:testpackage` directive is unnecessary in every
case where someone reached for it. This PR renames 51 such files to
`*_internal_test.go` via `git mv` so blame and history follow, and
strips the dead directive from 2 files that were already correctly named
(`coderd/oauth2provider/authorize_internal_test.go`,
`coderd/x/chatd/advisor_internal_test.go`).
`.claude/docs/TESTING.md` now documents the rule explicitly under *Test
Package Naming*, which is imported into the root `AGENTS.md` via
`@.claude/docs/TESTING.md`. The rule: prefer `package foo_test`; if you
need internal access, rename the file to `*_internal_test.go` rather
than adding a nolint directive.
*Disclaimer: implemented by a Coder Agent using Claude Opus 4.6/4.7*
Fixes
[coder/aibridge#280](https://github.com/coder/aibridge/issues/280).
Claude Opus 4.7 (and future adaptive-only Bedrock models) reject the
legacy `thinking.type: "enabled"` + `budget_tokens` shape with a 400.
Claude Code falls back to that shape when it cannot read the upstream
model's capability metadata, which is exactly the case when AI Bridge
sits between the client and Bedrock. Pinning back to Opus 4.6 is the
only operator workaround today.
This is the counterpart to the `adaptive -> enabled` conversion added in
[coder/aibridge#225](https://github.com/coder/aibridge/pull/225) for
older Bedrock models.
## Behavior
- New `bedrockModelRequiresAdaptiveThinking()` helper matches Opus 4.7
(covers `us.anthropic.claude-opus-4-7`, ARN-style application inference
profile names that include the model ID, etc.).
- New `RequestPayload.convertEnabledThinkingForBedrock()` rewrites
`thinking: {type: enabled, budget_tokens: N}` to `thinking: {type:
adaptive}`. The budget hint is dropped; an explicit
`output_config.effort` from the caller is preserved naturally because we
never touch that field. We deliberately do **not** derive an effort
label from the budget (see decision log).
- `removeUnsupportedBedrockFields` learns a variadic `exemptFields`
parameter. Adaptive-only models support `output_config` natively (no
beta flag required), so `augmentRequestForBedrock` exempts that field
for those models.
- Bedrock Opus 4.7 accepts `output_config.effort` but rejects
`output_config.format` (structured outputs) with the same "Extra inputs
are not permitted" 400. The generic strip pass operates at top-level
granularity only, so a small targeted pass drops `output_config.format`
after the top-level strip for adaptive-only models.
The whole Bedrock thinking-type shim block carries a header comment
flagging it as temporary; a planned native Bedrock provider removes the
impedance mismatch and lets us delete it.
## Out of scope
The issue calls out a possible follow-up around `Anthropic-Beta:
interleaved-thinking-2025-05-14` for adaptive-only models; best evidence
is that Opus 4.7 still accepts those flags, so this PR is a no-op there.
<details>
<summary>Decision log</summary>
- `bedrockModelSupportsAdaptiveThinking` now also returns `true` for
adaptive-only models. That keeps the existing
`convertAdaptiveThinkingForBedrock` branch from running on Opus 4.7
(which would otherwise be incorrect; `adaptive` is the supported native
type there), and the new `convertEnabledThinkingForBedrock` runs only
for adaptive-only models via the explicit
`bedrockModelRequiresAdaptiveThinking` switch case. The two model sets
are disjoint by construction.
- The reverse conversion does **not** derive `output_config.effort` from
`budget_tokens / max_tokens`. The two thinking shapes encode different
intents (`enabled+budget` is "give me exactly N tokens,"
`adaptive[+effort]` is "model, pick a budget, optionally biased") and
there is no canonical mapping between them. An earlier draft of this PR
derived effort via midpoints of an invented anchor table; it was
symmetric-looking but lossy and required a lot of scaffolding (sorted
anchors, init-time invariant guard, round-trip tests) to keep two halves
consistent. The reverse direction now just rewrites the shape, which is
honest about the information loss and matches platform-defined adaptive
behavior when no effort hint is present.
- `output_config.format` is stripped only for adaptive-only models.
Other Bedrock models either don't get `output_config` through at all
(top-level strip handles them) or accept it via a beta flag that may
imply broader feature support. Easy to widen if the same 400 shows up
elsewhere.
- I chose `variadic exemptFields ...string` over passing the model down
to `removeUnsupportedBedrockFields`, to keep that function focused on
stripping and to localise the model-aware policy in
`augmentRequestForBedrock`.
</details>
## Description
Adds automatic key failover for passthrough routes for the Anthropic and OpenAI providers. A new `keyFailoverTransport` wraps the reverse-proxy transport: centralized requests walk the configured key pool and retry with the next key on key-specific failures (401/403/429), reusing the same key-marking semantics as the bridged routes.
BYOK passthrough requests run as a single attempt with no failover.
## Changes
- New `keypool.KeyFailoverConfig` carrying the `Pool` to walk and the provider-specific closures (`IsBYOK`, `InjectAuthKey`, `MarkKey`, `BuildExhaustedResponse`).
- New `keypool.NewKeyFailoverTransport`: wraps an inner `http.RoundTripper`. Returns `inner` unchanged when `Pool` is nil, otherwise produces a transport that buffers the request body once, walks the pool per request, and replays each attempt with the next key.
- New `Provider.KeyFailoverConfig(logger)` interface method. Anthropic injects `X-Api-Key`; OpenAI injects `Authorization: Bearer ...`; Copilot returns an empty config.
- `passthrough.go` wires `NewKeyFailoverTransport` around the existing apidump middleware, so every retry attempt is recorded.
## Related Issues
Related to: https://github.com/coder/internal/issues/1446
Related to: https://linear.app/codercom/issue/AIGOV-197/aibridge-automatic-key-failover-for-bridged-and-passthrough-routes
## Follow-up PRs
- Remove dead `Provider.InjectAuthHeader` method now that all auth is applied per-attempt by `KeyFailoverTransport`.
- Bedrock multi-key support.
- Refactor provider vs interceptor config separation.
- Record the actually-used key in the interception credential hint after failover.
> [!NOTE]
> Initially generated by Claude Opus 4.7, modified and reviewed by @ssncferreira
## Description
Adds automatic key failover for centralized OpenAI provider, covering both chat completions and responses APIs. Same shape as the Anthropic PR: each upstream call walks the configured key pool, keys are marked **temporary** on 429 (with cooldown from `Retry-After`) and **permanent** on 401/403. Each agentic-loop iteration gets its own fresh walker so a tool-call continuation can fail over independently of the initial request.
BYOK is unchanged: BYOK requests run as a single attempt with no failover.
## Changes
- `config.OpenAI` carries a `KeyPool`. `Key` remains for BYOK Authorization Bearer set per interception.
- Chat completions blocking interceptor: walks the pool via `newChatCompletionWithKeyFailover`, marks keys on key-specific failures, returns on first success or non-failover error.
- Chat completions streaming interceptor: per-iteration walker. Pre-stream failures fail over to the next key; mid-stream errors are relayed as SSE events.
- Responses blocking interceptor: extracts `newResponseWithKeyFailover` parallel to chatcompletions.
- Responses streaming interceptor: per-iteration walker, retains the existing buffer-then-forward design.
## Related Issues
Related to: https://github.com/coder/internal/issues/1446
Related to: https://linear.app/codercom/issue/AIGOV-197/aibridge-automatic-key-failover-for-bridged-and-passthrough-routes
## Follow-up PRs
- Bedrock multi-key support.
- Refactor provider vs interceptor config separation.
- Record the actually-used key in the interception credential hint after failover.
> [!NOTE]
> Initially generated by Claude Opus 4.7, modified and reviewed by @ssncferreira
## Description
Adds automatic key failover for centralized Anthropic provider. When a key pool is configured, each upstream call walks the pool and tries keys in order until one succeeds or the pool is exhausted. Keys are marked **temporary** on 429 (with cooldown from `Retry-After`) and **permanent** on 401/403. Errors that aren't key-specific don't trigger failover. Each agentic-loop iteration gets its own fresh walker, so a tool-call continuation can fail over independently of the initial request.
BYOK is unchanged: BYOK requests run as a single attempt with no failover.
## Changes
- `config.Anthropic` carries a `KeyPool`. `Key` remains for BYOK X-Api-Key set per interception.
- Blocking interceptor: walks the pool, marks keys on key-specific failures, returns on first success or non-failover error.
- Streaming interceptor: per-iteration walker. Pre-stream failures fail over to the next key; mid-stream errors are relayed as SSE events.
- New `keypool` error types: `TransientExhaustionError` (carries soonest cooldown) and `ErrPermanentExhaustion`. Replace the prior `ErrAllKeysExhausted`.
- Error responses now consistently include the outer `"type": "error"` field.
## Related Issues
Related to: https://github.com/coder/internal/issues/1446
Related to: https://linear.app/codercom/issue/AIGOV-197/aibridge-automatic-key-failover-for-bridged-and-passthrough-routes
## Follow-up PRs
- Bedrock multi-key support.
- Refactor provider vs interceptor config separation.
- Record the actually-used key in the interception credential hint after failover.
> [!NOTE]
> Initially generated by Claude Opus 4.7, modified and reviewed by @ssncferreira
## Description
Removes 429 (Too Many Requests) from the circuit breaker failure conditions. Rate limiting is now handled by automatic key failover instead of tripping the circuit breaker.
## Changes
`DefaultIsFailure` no longer treats 429 as a circuit breaker failure. The circuit breaker now only trips on server overload responses (503, 529).
Tests and integration tests updated to use 503 instead of 429 for tripping circuits. Description strings in deployment config updated to reflect the change.
Closes https://github.com/coder/internal/issues/1445
> [!NOTE]
> Initially generated by Coder Agents, modified and reviewed by @ssncferreira
## Description
Adds the `aibridge/keypool` package, a thread-safe key pool with per-key state tracking and cooldown expiry. This PR introduces the package only; wiring it into the aibridge providers and coder configuration will happen in upstream PRs.
## Changes
Each key is in one of three states: **Valid** (available), **Temporary** (rate-limited with cooldown expiry), or **Permanent** (revoked/unauthorized, terminal until restart). The state is derived from the key fields rather than stored explicitly: once a cooldown expires, the key is valid again without any external action.
A `Walker` provides per-request key traversal using a primary-with-fallback strategy: each request walks the pool from index 0, skipping unavailable keys, so the first key is always preferred when healthy. Walkers are independent, so concurrent requests traverse the pool without interfering with each other.
`MarkTemporary` preserves the longer cooldown when concurrent requests both mark the same key.
Relates to https://github.com/coder/internal/issues/1445
> [!NOTE]
> Initially generated by Coder Agents, modified and reviewed by @ssncferreira
> AI tools where used when creating this PR
This PR removes environment variable parsing from `/aibridge` directory.
Added env variables/flags for dump dir as coder options.
Only added to new indexed provider options
(`CODER_AIBRIDGE_PROVIDER_<N>_*`) not to deprecated legacy env variables
(`CODER_AIBRIDGE_ANTHROPIC_*` and `CODER_AIBRIDGE_OPENAI_KEY_*`).
Reverted adding `MaxRetries` option as it will be removed soon due to
key failover work:
https://github.com/coder/coder/pull/24783#discussion_r3155544808
> AI tools where used when creating this PR
This PR:
* removes references to aibridge repository from coder docs
* updates aibdrige/README.md
* makes it clear aibridge (keeping old name) is a handler not a separate
process
* updates outdated sections about: metrics, recorded interface and
supported paths.
---------
Co-authored-by: Susana Ferreira <susana@coder.com>
> AI tools where used when creating this PR
Updates `AGENTS.md` to reflect that `/aibridge` is now a directory in
coder/coder repo not a separate repo.
*Disclaimer: implemented by a Coder Agent using Claude Opus 4.6*
Porting https://github.com/coder/aibridge/pull/277 to coder/coder after
the [aibridge code move](https://github.com/coder/coder/pull/24190).
## Summary
Fixes client detection and session ID tracking for the [Charm
Crush](https://github.com/charmbracelet/crush) AI coding client.
## Changes
### Bug fix: User-Agent matching
The actual Crush user-agent is `Charm-Crush/{version}
(https://charm.land/crush)` (hyphenated), but `GuessClient` only checked
for `charm crush/` (space-separated). After lowercasing,
`Charm-Crush/0.2.0` becomes `charm-crush/0.2.0`, which did not match the
`charm crush/` prefix.
Now matches both formats for backwards compatibility.
### Session ID tracking
Adds an explicit `ClientCrush` case to `GuessSessionID`. Crush does not
currently send a session ID header to upstream AI providers, so this
returns `nil` (consistent with how `ClientZed`, `ClientRoo`, and
`ClientCursor` are handled).
### Tests
- Added `charm_crush_hyphen` test case for `GuessClient` using the real
user-agent format.
- Added `crush_returns_empty` test case for `GuessSessionID`.
This PR merges code from `coder/aibridge` repository into `coder/coder`.
It was split into 4 PRs for easier review but stacked PRs will need to
be merged into this PR so all checks pass.
* https://github.com/coder/coder/pull/24190 -> raw code copy (this PR,
before merging PRs on top of it, it was just 1 commit:
https://github.com/coder/coder/commit/70d33f33200c7e77df910957595715f81f9bec24)
* https://github.com/coder/coder/pull/24570 -> update imports in
`coder/coder` to use copied code
* https://github.com/coder/coder/pull/24586 -> linter fixes and CI
integration (also added README.md)
* https://github.com/coder/coder/pull/24571 -> added exclude to
scripts/check_emdash.sh check
Original PR message (before PR squash):
Moves coder/aibridge code into coder/coder repository.
Omitted files:
- `go.mod`, `go.sum`, `.gitignore`, `.github/workflows/ci.yml,`
`Makefile`, `LICENSE`, `README.md` (modified README.md is added later)
- `.github`, `example`, `buildinfo,` `scripts` directories
Simple verification script (will list omitted files)
```
tmp=$(mktemp -d)
echo "$tmp"
git clone --depth=1 https://github.com/coder/aibridge "$tmp/aibridge"
git clone --depth=1 --branch pb/aibridge-code-move https://github.com/coder/coder "$tmp/coder"
diff -rq --exclude=.git "$tmp/aibridge" "$tmp/coder/aibridge"
# rm -rf "$tmp"
```