Rate Limiting

VibeWarden enforces two independent token-bucket rate limits on every proxied request:

per_ip — applied to every request, keyed by client IP address.
per_user — applied only to authenticated requests, keyed by user ID.

Both limits must pass. A request that fails either limit receives 429 Too Many Requests with a Retry-After header and a JSON body:

Retry-After: <seconds>

{"error":"rate_limit_exceeded","retry_after_seconds":<N>}

Rate limit activity is captured as structured events in the AI-readable log stream (event_type: rate_limit.blocked).
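The two-limit decision described above can be sketched as follows. This is a minimal illustration, not VibeWarden's implementation: the `TokenBucket` and `Limiter` names are hypothetical, a new bucket is assumed to start full, and the check order (per-IP first, then per-user) is an assumption. The rates and bursts are the documented defaults.

```python
class TokenBucket:
    """Refills at `rate` tokens per second and holds at most `burst` tokens."""

    def __init__(self, rate, burst, now):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # assumption: a new bucket starts full
        self.updated = now

    def allow(self, now):
        # Refill for the elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class Limiter:
    """Hypothetical limiter holding one bucket per IP and one per user ID."""

    def __init__(self, ip_rate=10, ip_burst=20, user_rate=100, user_burst=200):
        self.ip_cfg = (ip_rate, ip_burst)
        self.user_cfg = (user_rate, user_burst)
        self.ip_buckets = {}
        self.user_buckets = {}

    @staticmethod
    def _bucket(table, key, cfg, now):
        if key not in table:
            table[key] = TokenBucket(cfg[0], cfg[1], now)
        return table[key]

    def check(self, ip, user_id, now):
        """Return None when allowed, otherwise "ip" or "user"."""
        if not self._bucket(self.ip_buckets, ip, self.ip_cfg, now).allow(now):
            return "ip"
        # per_user only applies when the request is authenticated.
        if user_id is not None:
            bucket = self._bucket(self.user_buckets, user_id, self.user_cfg, now)
            if not bucket.allow(now):
                return "user"
        return None
```

Time is passed in explicitly so the behaviour is easy to reason about: an unauthenticated client sending requests at the same instant is allowed 20 times and then blocked by the per-IP limit until tokens refill.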

Backing stores

VibeWarden supports two backing stores, selected by rate_limit.store:

Store	State scope	On restart	Across instances
`memory`	Per-process	Resets counters	Not shared
`redis`	Redis server	Preserved	Shared

Single instance: in-memory store (default)

No external dependencies. Counters live in process memory and reset when VibeWarden restarts. This is the right choice for:

Local development
Single-instance deployments where occasional counter resets on redeploy are acceptable
Any setup where you do not want to operate a Redis server

rate_limit:
  enabled: true
  store: memory   # default; this line is optional

  per_ip:
    requests_per_second: 10
    burst: 20

  per_user:
    requests_per_second: 100
    burst: 200

Consistency note: In-memory counters are not shared across instances and do not survive restarts. If you restart VibeWarden (e.g. during a deploy), every client's counter resets. This is acceptable for single-instance workloads.

Multi-instance: Redis store for shared counters

When you run more than one VibeWarden instance behind a load balancer, each instance keeps its own in-memory counters. A client whose requests are spread across instances can therefore exceed the intended limit, in the worst case by a factor equal to the number of instances. The Redis store solves this by keeping all counters in a shared Redis server.

The token bucket logic runs as an atomic Lua script inside Redis, so no two instances can race on the same key.
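To make the difference concrete, the sketch below counts how many simultaneous requests from one client are admitted with per-instance counters versus a single shared counter. It is a simplified model under stated assumptions: the function name is hypothetical, the load balancer is assumed to distribute requests round-robin, and all requests arrive at the same instant so refill is ignored.

```python
def admitted_instant_burst(requests, burst, instances, shared):
    """Count how many of `requests` simultaneous requests are admitted."""
    # One bucket in total when shared, otherwise one bucket per instance.
    buckets = [burst] if shared else [burst] * instances
    admitted = 0
    for i in range(requests):
        idx = i % len(buckets)  # assumption: round-robin load balancing
        if buckets[idx] > 0:
            buckets[idx] -= 1
            admitted += 1
    return admitted


# 100 simultaneous requests, burst of 20, three instances.
per_instance = admitted_instant_burst(100, 20, 3, shared=False)
shared_store = admitted_instant_burst(100, 20, 3, shared=True)
```

With these inputs the per-instance model admits 60 requests (20 per instance), while the shared model admits 20, which is the intended burst.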

rate_limit:
  enabled: true
  store: redis

  redis:
    address: "localhost:6379"   # or use url: (see below)
    password: ""
    db: 0
    pool_size: 10
    key_prefix: vibewarden
    fallback: true
    health_check_interval: 30s

  per_ip:
    requests_per_second: 10
    burst: 20

  per_user:
    requests_per_second: 100
    burst: 200

Local Redis via Docker Compose:

When rate_limit.store is redis and no rate_limit.redis.url is set, vibew init adds a redis service to the generated docker-compose.yml automatically. No manual configuration is needed for local dev.

Config reference

`rate_limit`

Field	Type	Default	Description
`enabled`	bool	`true`	Enable or disable rate limiting
`store`	string	`memory`	Backing store: `memory` or `redis`
`trust_proxy_headers`	bool	`false`	Read `X-Forwarded-For` for the real client IP. Enable only when VibeWarden is behind a trusted proxy.
`exempt_paths`	list	`[]`	Glob patterns that bypass rate limiting. `/_vibewarden/*` is always exempt.

`rate_limit.per_ip` and `rate_limit.per_user`

Field	Type	Default (`per_ip`)	Default (`per_user`)	Description
`requests_per_second`	float	`10`	`100`	Sustained token refill rate
`burst`	int	`20`	`200`	Maximum tokens that can accumulate

burst should always be >= requests_per_second. Setting burst to requests_per_second disables any burst tolerance — every request must arrive no faster than the sustained rate.
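The relationship between the two fields can be worked through with a little arithmetic. The helpers below are a sketch that assumes a standard token bucket with continuous refill and a bucket that starts full; the function names are hypothetical and not part of VibeWarden.

```python
import math


def max_admitted(requests_per_second, burst, window_seconds):
    """Upper bound on requests admitted in a window, starting from a full bucket."""
    # The client can spend the whole burst plus everything refilled in the window.
    return math.floor(burst + requests_per_second * window_seconds)


def seconds_until_next_token(requests_per_second, tokens):
    """Time until at least one whole token is available."""
    if tokens >= 1.0:
        return 0.0
    return (1.0 - tokens) / requests_per_second


# Default per_ip settings: 10 requests/second, burst 20.
one_second = max_admitted(10, 20, 1)    # 20 + 10 * 1
one_minute = max_admitted(10, 20, 60)   # 20 + 10 * 60
wait = seconds_until_next_token(10, 0.0)
```

Under these assumptions the default per_ip settings admit at most 30 requests in the first second and 620 in the first minute, and an empty bucket needs 0.1 seconds to earn its next token.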

`rate_limit.redis`

Only read when store is redis.

Field	Type	Default	Description
`url`	string	`""`	Full Redis URL (`redis://` or `rediss://` for TLS). Takes precedence over all other fields when set.
`address`	string	`localhost:6379`	Redis server address in `host:port` form. Used when `url` is empty.
`password`	string	`""`	Redis `AUTH` password. Leave empty for no-auth Redis. Ignored when `url` is set.
`db`	int	`0`	Logical database index (0–15). Ignored when `url` is set.
`pool_size`	int	`0` (auto)	Maximum number of socket connections in the pool. `0` lets go-redis choose based on CPU count.
`key_prefix`	string	`vibewarden`	Namespace prefix for all Redis keys. Full key format: `<key_prefix>:ratelimit:<n>:<identifier>`.
`fallback`	bool	`true`	Fail-open (`true`): fall back to in-memory on Redis failure. Fail-closed (`false`): deny all requests when Redis is unreachable.
`health_check_interval`	string	`30s`	How often the background goroutine probes Redis for recovery after a failure. Go duration string (`"30s"`, `"1m"`).

URL format:

redis://[:<password>@]<host>:<port>[/<db>]
rediss://[:<password>@]<host>:<port>[/<db>]    # TLS

Using url is the recommended approach for external providers because it puts the host, port, TLS setting, and credentials in a single string that can be stored in a secret manager and passed via an environment variable:

VIBEWARDEN_RATE_LIMIT_REDIS_URL=rediss://:mypassword@redis.example.com:6380/0
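To see which settings such a URL carries, the sketch below splits it into its parts with Python's standard library. It only illustrates the URL structure; the `describe_redis_url` helper is hypothetical and says nothing about how VibeWarden itself parses the value. Percent-encoded passwords are not decoded here.

```python
from urllib.parse import urlsplit


def describe_redis_url(url):
    """Split a redis:// or rediss:// URL into its connection settings."""
    parts = urlsplit(url)
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError("expected a redis:// or rediss:// URL")
    db = parts.path.lstrip("/")
    return {
        "tls": parts.scheme == "rediss",  # rediss:// means TLS
        "host": parts.hostname,
        "port": parts.port,               # None when the URL omits the port
        "password": parts.password,       # None when the URL has no credentials
        "db": int(db) if db else 0,
    }


settings = describe_redis_url("rediss://:mypassword@redis.example.com:6380/0")
```

For the example URL this yields TLS enabled, host `redis.example.com`, port 6380, password `mypassword`, and database 0.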

External Redis providers

Upstash

Upstash provides a serverless Redis compatible with standard clients. It is a good choice for apps deployed on platforms where running a sidecar Redis container is not practical.

Create a Redis database at console.upstash.com.
Copy the Endpoint and Password from the database details page.
Configure VibeWarden using the TLS URL (rediss://):

rate_limit:
  store: redis
  redis:
    url: "rediss://default:${UPSTASH_REDIS_PASSWORD}@<endpoint>.upstash.io:6380"

Set UPSTASH_REDIS_PASSWORD as an environment variable. Never commit the password to vibewarden.yaml.

Pool size note: The Upstash free tier limits concurrent connections. Set pool_size: 5 (or lower) to stay within the limit:

rate_limit:
  store: redis
  redis:
    url: "rediss://default:${UPSTASH_REDIS_PASSWORD}@<endpoint>.upstash.io:6380"
    pool_size: 5

AWS ElastiCache

ElastiCache runs inside your VPC. VibeWarden must run in the same VPC (or be connected via VPC Peering or Transit Gateway) to reach it.

ElastiCache Serverless (recommended for new workloads):

ElastiCache Serverless supports TLS and IAM-based auth. Use a rediss:// URL:

rate_limit:
  store: redis
  redis:
    url: "rediss://:${ELASTICACHE_PASSWORD}@<cluster-endpoint>.cache.amazonaws.com:6379"

ElastiCache Cluster Mode Disabled (single shard):

For non-cluster mode, use the Primary Endpoint:

rate_limit:
  store: redis
  redis:
    # rediss:// enables TLS, which the auth token requires (transit encryption)
    url: "rediss://:${ELASTICACHE_AUTH_TOKEN}@<primary-endpoint>.cache.amazonaws.com:6379"
    pool_size: 20

Enable in-transit encryption on the ElastiCache cluster and set an Auth token in the cluster settings. Without transit encryption, the password field and AUTH are not available.

Redis Cloud

Redis Cloud (formerly Redis Labs) provides a connection URL in the database dashboard. Use the rediss:// scheme so the connection is encrypted with TLS.

rate_limit:
  store: redis
  redis:
    url: "rediss://default:${REDIS_CLOUD_PASSWORD}@redis-<id>.c<num>.eu-west-1-1.ec2.redns.redis-cloud.com:12345"

Copy the full endpoint from the Connect button in the Redis Cloud console and substitute the password via an environment variable.

How rate limits work with circuit breakers and retries

Retry-After and client back-off

When VibeWarden returns 429, the response always includes:

Retry-After: <seconds>

Well-behaved clients (and many HTTP libraries) honour Retry-After. If your upstream app or a client library retries automatically on 429, make sure the retry logic reads Retry-After and backs off accordingly. Blind retries on 429 waste burst budget and make throttling worse.
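A client-side helper that respects the 429 response might look like the sketch below. It reads the Retry-After header first and falls back to the retry_after_seconds field of the JSON body. The function name, the plain-dict header lookup (exact key `Retry-After`), and the one-second default are assumptions for illustration.

```python
import json


def retry_delay_seconds(status, headers, body, default=1.0):
    """Return how long to wait before retrying, or None if no back-off is needed."""
    if status != 429:
        return None
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return max(0.0, float(value))
        except ValueError:
            pass  # unparseable header: fall through to the JSON body
    try:
        payload = json.loads(body)
        return max(0.0, float(payload["retry_after_seconds"]))
    except (ValueError, KeyError, TypeError):
        return default  # assumption: wait a conservative default
```

A retry loop would sleep for the returned number of seconds before sending the request again, rather than retrying immediately.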

Interaction with per-IP and per-user limits

Both limits are checked independently. A request can be blocked by either:

per_ip — the client IP has exhausted its token bucket.
per_user — the authenticated user has exhausted their token bucket.

The response body's error field is always rate_limit_exceeded. Check the structured log event (event_type: rate_limit.blocked, payload.limit_type: "ip" or "user") to determine which limit fired.

Redis fallback and circuit breaking

The Redis store is wrapped in a FallbackStore that acts as a circuit breaker:

VibeWarden assumes Redis is healthy at startup.
A background goroutine probes Redis every health_check_interval (default: 30s).
If the probe fails, the store is marked unhealthy and a rate_limit.store_fallback structured event is emitted.
Fail-open (fallback: true, default): rate limiting continues using the in-memory store. Counters are no longer shared across instances, but requests are not blocked.
Fail-closed (fallback: false): all requests are denied with 429 until Redis recovers.
When Redis recovers, the store switches back automatically and a rate_limit.store_recovered event is emitted.
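The behaviour listed above amounts to a small state machine, sketched here in Python. This is not the actual FallbackStore: the class name, constructor parameters, and the `events` list are hypothetical stand-ins, and the real store may detect failures in other ways than the periodic probe.

```python
class FallbackLimiter:
    """Hypothetical model of a Redis-backed limiter with an in-memory fallback."""

    def __init__(self, redis_allow, memory_allow, ping, fail_open=True):
        self.redis_allow = redis_allow    # callable(key) -> bool, uses Redis
        self.memory_allow = memory_allow  # callable(key) -> bool, in-memory
        self.ping = ping                  # callable() -> bool, health probe
        self.fail_open = fail_open        # mirrors `fallback: true|false`
        self.healthy = True               # Redis is assumed healthy at startup
        self.events = []                  # stand-in for the structured log

    def probe(self):
        """Run one health check; meant to be called every health_check_interval."""
        ok = self.ping()
        if self.healthy and not ok:
            self.healthy = False
            self.events.append("rate_limit.store_fallback")
        elif not self.healthy and ok:
            self.healthy = True
            self.events.append("rate_limit.store_recovered")

    def allow(self, key):
        if self.healthy:
            return self.redis_allow(key)
        if self.fail_open:
            return self.memory_allow(key)  # counters no longer shared
        return False                       # fail-closed: deny until recovery
```

In this model a failed probe flips the limiter to the fallback path and a later successful probe flips it back, emitting one event per transition.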

Choose fallback: false when rate-limiting correctness outweighs availability, for example in financial or compliance workloads.

Troubleshooting: why am I getting 429s?

1. Identify which limit fired

Check the structured logs for rate_limit.blocked events:

{
  "event_type": "rate_limit.blocked",
  "payload": {
    "limit_type": "ip",
    "identifier": "203.0.113.42",
    "remaining": 0,
    "retry_after_seconds": 1
  }
}

limit_type is "ip" or "user". identifier is the IP address or user ID.
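When there are many such events, a short script can summarise which identifiers are being blocked most often. The sketch below assumes the log stream is available as one JSON object per line; that layout, and the `count_blocked` name, are assumptions rather than documented behaviour.

```python
import json
from collections import Counter


def count_blocked(log_lines):
    """Count rate_limit.blocked events per (limit_type, identifier)."""
    counts = Counter()
    for line in log_lines:
        try:
            event = json.loads(line)
        except ValueError:
            continue  # skip lines that are not JSON
        if not isinstance(event, dict):
            continue
        if event.get("event_type") != "rate_limit.blocked":
            continue
        payload = event.get("payload")
        if not isinstance(payload, dict):
            continue
        counts[(payload.get("limit_type"), payload.get("identifier"))] += 1
    return counts


sample = [
    '{"event_type":"rate_limit.blocked","payload":{"limit_type":"ip","identifier":"203.0.113.42"}}',
    '{"event_type":"rate_limit.blocked","payload":{"limit_type":"ip","identifier":"203.0.113.42"}}',
    '{"event_type":"rate_limit.blocked","payload":{"limit_type":"user","identifier":"user-7"}}',
    'plain text line',
]
summary = count_blocked(sample)
```

Calling `summary.most_common()` then lists the busiest offenders first, which quickly shows whether one IP or one user accounts for most of the 429 responses.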

2. Check Prometheus metrics

vibewarden_rate_limit_hits_total{limit_type="ip"}
vibewarden_rate_limit_hits_total{limit_type="user"}

A sudden jump in how fast these counters grow indicates a traffic burst or a misbehaving client.

3. Verify the IP VibeWarden sees

If trust_proxy_headers is false (the default) and VibeWarden is behind a load balancer, every request appears to come from the load balancer's IP. All traffic then shares a single per-IP bucket, which ordinary traffic can exhaust quickly.

Fix: set trust_proxy_headers: true, but only when you trust all upstream proxies, because a client can otherwise forge X-Forwarded-For to evade the per-IP limit:

rate_limit:
  trust_proxy_headers: true

If trust_proxy_headers: true and you are still seeing 429, check that the load balancer is actually setting X-Forwarded-For and that the header reaches VibeWarden without being stripped.

4. Requests hitting `per_ip` when you expect `per_user`

per_user only applies to authenticated requests. If your auth mode is none, or the request does not carry a valid session/token, VibeWarden cannot extract a user ID and falls back to per_ip only.

Verify the request carries a valid credential:

Kratos mode: session cookie (ory_kratos_session) must be present and valid.
JWT mode: Authorization: Bearer <token> must be present and validate against JWKS.
API key mode: the configured header (default: X-API-Key) must contain a valid key.

5. Limits are too low for your traffic pattern

Increase burst to absorb legitimate traffic spikes without changing the sustained requests_per_second rate. For example, an API used by a single-page app that fires many requests on page load may need:

per_ip:
  requests_per_second: 20
  burst: 100

6. Redis fallback is active

Check logs for rate_limit.store_fallback events. If present, counters are no longer shared across instances. Identify and resolve the Redis connectivity issue (network policy, auth failure, cluster restart), then wait for the background probe to detect recovery (up to health_check_interval).

If you cannot wait for the next probe, restart VibeWarden to reconnect immediately. Keep in mind that a restart also resets any in-memory counters.

7. Exempting specific paths

Add paths that should never be rate limited to exempt_paths:

rate_limit:
  exempt_paths:
    - "/health"
    - "/public/*"
    - "/static/*"

The /_vibewarden/* prefix (health check, metrics) is always exempt regardless of this setting.
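To check which request paths a set of patterns would cover, the matching can be approximated with Python's `fnmatch` module. This is only an approximation under stated assumptions: VibeWarden's glob matcher may treat `*` and `/` differently (in `fnmatch`, `*` also matches across `/`), and the `is_exempt` helper is hypothetical.

```python
from fnmatch import fnmatchcase

# Always exempt according to the documentation, regardless of exempt_paths.
ALWAYS_EXEMPT = "/_vibewarden/*"


def is_exempt(path, exempt_paths):
    """Return True when `path` matches the built-in or a configured pattern."""
    patterns = [ALWAYS_EXEMPT, *exempt_paths]
    return any(fnmatchcase(path, pattern) for pattern in patterns)


configured = ["/health", "/public/*", "/static/*"]
checks = {
    path: is_exempt(path, configured)
    for path in ["/health", "/healthz", "/public/logo.png", "/api/orders", "/_vibewarden/metrics"]
}
```

Under this approximation `/health`, `/public/logo.png`, and `/_vibewarden/metrics` are exempt, while `/healthz` and `/api/orders` remain rate limited.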