feat(codersdk): add circuit breaker configuration support for aibridge (#21546)

## Summary

Add circuit breaker support for AI Bridge to protect against cascading
failures from upstream AI provider rate limits (HTTP 429, 503, and
Anthropic's 529 overloaded responses).

## Changes

- Add 5 new CLI options for circuit breaker configuration:
  - `--aibridge-circuit-breaker-enabled` (default: false)
  - `--aibridge-circuit-breaker-failure-threshold` (default: 5)
  - `--aibridge-circuit-breaker-interval` (default: 10s)
  - `--aibridge-circuit-breaker-timeout` (default: 30s)
  - `--aibridge-circuit-breaker-max-requests` (default: 3)
- Update aibridge dependency to include circuit breaker support
- Add tests for pool creation with circuit breaker providers

## Notes

- Circuit breaker is **disabled by default** for backward compatibility
- When enabled, applies to both OpenAI and Anthropic providers
- Uses sony/gobreaker internally via the aibridge library

## Testing

```
make test RUN=TestPoolWithCircuitBreakerProviders
```
This commit is contained in:
Kacper Sawicki
2026-01-20 14:59:29 +01:00
committed by GitHub
parent bfae5b03dc
commit ed679bb3da
12 changed files with 341 additions and 16 deletions
+4
View File
@@ -121,6 +121,10 @@ AI BRIDGE OPTIONS:
See
https://docs.claude.com/en/docs/claude-code/settings#environment-variables.
--aibridge-circuit-breaker-enabled bool, $CODER_AIBRIDGE_CIRCUIT_BREAKER_ENABLED (default: false)
Enable the circuit breaker to protect against cascading failures from
upstream AI provider rate limits (429, 503, 529 overloaded).
--aibridge-retention duration, $CODER_AIBRIDGE_RETENTION (default: 60d)
Length of time to retain data such as interceptions and all related
records (token, prompt, tool use).