# FastAPI Traffic
A rate limiting library for FastAPI that actually works in production. Pick your algorithm, pick your backend, and you're good to go.
## Why this library?
Most rate limiting solutions are either too simple (fixed window only) or too complex (you need a PhD to configure them). This one aims for the sweet spot:
- Five algorithms to choose from, depending on your use case
- Three storage backends: memory for development, SQLite for single-node, Redis for distributed
- Works how you'd expect: decorator for endpoints, middleware for global limits
- Fully async and type-checked with pyright
- Sensible defaults but configurable when you need it
## Installation

### Using pip

```shell
# Basic installation (memory backend only)
pip install fastapi-traffic

# With Redis support
pip install "fastapi-traffic[redis]"

# With all extras
pip install "fastapi-traffic[all]"
```
### Using uv

```shell
# Basic installation
uv add fastapi-traffic

# With Redis support
uv add "fastapi-traffic[redis]"

# With all extras
uv add "fastapi-traffic[all]"
```
## Quick Start

### Basic Usage with Decorator

```python
from fastapi import FastAPI, Request
from fastapi_traffic import rate_limit

app = FastAPI()

@app.get("/api/resource")
@rate_limit(100, 60)  # 100 requests per 60 seconds
async def get_resource(request: Request):
    return {"message": "Hello, World!"}
```
### Using Different Algorithms

```python
from fastapi_traffic import rate_limit, Algorithm

# Token Bucket - allows bursts
@app.get("/api/burst")
@rate_limit(100, 60, algorithm=Algorithm.TOKEN_BUCKET, burst_size=20)
async def burst_endpoint(request: Request):
    return {"message": "Burst allowed"}

# Sliding Window - precise rate limiting
@app.get("/api/precise")
@rate_limit(100, 60, algorithm=Algorithm.SLIDING_WINDOW)
async def precise_endpoint(request: Request):
    return {"message": "Precise limiting"}

# Fixed Window - simple and efficient
@app.get("/api/simple")
@rate_limit(100, 60, algorithm=Algorithm.FIXED_WINDOW)
async def simple_endpoint(request: Request):
    return {"message": "Fixed window"}
```
### Custom Key Extraction

```python
def api_key_extractor(request: Request) -> str:
    """Rate limit by API key instead of client IP."""
    return request.headers.get("X-API-Key", "anonymous")

@app.get("/api/by-key")
@rate_limit(1000, 3600, key_extractor=api_key_extractor)
async def api_key_endpoint(request: Request):
    return {"message": "Rate limited by API key"}
```
### Using SQLite Backend (Persistent)

```python
from fastapi_traffic import RateLimiter, SQLiteBackend
from fastapi_traffic.core.limiter import set_limiter

# Configure persistent storage
backend = SQLiteBackend("rate_limits.db")
limiter = RateLimiter(backend)
set_limiter(limiter)

@app.on_event("startup")
async def startup():
    await limiter.initialize()

@app.on_event("shutdown")
async def shutdown():
    await limiter.close()
```
### Using Redis Backend (Distributed)

```python
from fastapi_traffic import RateLimiter
from fastapi_traffic.backends.redis import RedisBackend
from fastapi_traffic.core.limiter import set_limiter

@app.on_event("startup")
async def startup():
    # RedisBackend.from_url is a coroutine, so create the backend
    # inside an async startup hook rather than at module level
    backend = await RedisBackend.from_url("redis://localhost:6379/0")
    limiter = RateLimiter(backend)
    set_limiter(limiter)
```
### Global Middleware

```python
from fastapi_traffic.middleware import RateLimitMiddleware

app.add_middleware(
    RateLimitMiddleware,
    limit=1000,
    window_size=60,
    exempt_paths={"/health", "/docs"},
    exempt_ips={"127.0.0.1"},
)
```
### Dependency Injection

```python
from fastapi import Depends
from fastapi_traffic.core.decorator import RateLimitDependency

rate_dep = RateLimitDependency(limit=100, window_size=60)

@app.get("/api/with-info")
async def endpoint_with_info(
    request: Request,
    rate_info=Depends(rate_dep),
):
    return {
        "remaining": rate_info.remaining,
        "reset_at": rate_info.reset_at,
    }
```
### Exception Handling

```python
from fastapi.responses import JSONResponse
from fastapi_traffic import RateLimitExceeded

@app.exception_handler(RateLimitExceeded)
async def rate_limit_handler(request: Request, exc: RateLimitExceeded):
    return JSONResponse(
        status_code=429,
        content={
            "error": "rate_limit_exceeded",
            "retry_after": exc.retry_after,
        },
        headers=exc.limit_info.to_headers() if exc.limit_info else {},
    )
```
## Algorithms

| Algorithm | Description | Use Case |
|---|---|---|
| `TOKEN_BUCKET` | Allows bursts up to bucket capacity | APIs that need burst handling |
| `SLIDING_WINDOW` | Precise request counting | High-accuracy rate limiting |
| `FIXED_WINDOW` | Simple time-based windows | Simple, low-overhead limiting |
| `LEAKY_BUCKET` | Smooths out request rate | Consistent throughput |
| `SLIDING_WINDOW_COUNTER` | Balance of precision and efficiency | General purpose (default) |
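To make the burst behavior in the table concrete, here is a standalone sketch of the token-bucket idea in plain Python (this illustrates the algorithm only; it is not the library's implementation):

```python
import time

class TokenBucket:
    """Minimal token-bucket sketch: `capacity` tokens, refilled at `rate` per second."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = time.monotonic()

    def allow(self, cost: int = 1) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(rate=1.0, capacity=5)
results = [bucket.allow() for _ in range(6)]  # a burst of 6 back-to-back requests
# The first 5 drain the full bucket; the 6th is rejected until tokens refill.
```

This is why `TOKEN_BUCKET` suits bursty traffic: a client that has been idle accumulates tokens and can briefly exceed the average rate, while a sustained flood is still capped at `rate` requests per second.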
## Backends

### MemoryBackend (Default)

- In-memory storage with LRU eviction
- Best for single-process applications
- No persistence across restarts

### SQLiteBackend

- Persistent storage using SQLite
- WAL mode for better performance
- Suitable for single-node deployments

### RedisBackend

- Distributed storage using Redis
- Required for multi-node deployments
- Supports atomic operations via Lua scripts
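This README doesn't spell out the backend interface, but conceptually every backend maintains per-key counters, and the memory backend additionally evicts idle keys. The sketch below is purely illustrative (the class and method names are hypothetical, not the library's API) and shows fixed-window counting with LRU eviction:

```python
import time
from collections import OrderedDict

class MemoryCounter:
    """Hypothetical sketch of an in-memory backend: per-key fixed-window
    counters with LRU eviction once `max_keys` is exceeded."""

    def __init__(self, max_keys: int = 1024):
        self.counters: "OrderedDict[str, tuple[float, int]]" = OrderedDict()
        self.max_keys = max_keys

    def incr(self, key: str, window_size: float) -> int:
        now = time.monotonic()
        window_start, count = self.counters.get(key, (now, 0))
        if now - window_start >= window_size:
            window_start, count = now, 0  # window expired: start a fresh one
        self.counters[key] = (window_start, count + 1)
        self.counters.move_to_end(key)  # mark key as most recently used
        while len(self.counters) > self.max_keys:
            self.counters.popitem(last=False)  # evict least recently used key
        return count + 1

backend = MemoryCounter(max_keys=2)
backend.incr("a", 60)
backend.incr("b", 60)
backend.incr("c", 60)  # exceeds max_keys, so "a" (least recently used) is evicted
```

Because each process holds its own counters, a per-process store like this under-counts as soon as requests spread across workers or nodes, which is why the README points multi-node deployments to Redis.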
## Configuration Options

```python
@rate_limit(
    limit=100,              # Max requests in window
    window_size=60.0,       # Window size in seconds
    algorithm=Algorithm.SLIDING_WINDOW_COUNTER,
    key_prefix="api",       # Prefix for rate limit keys
    key_extractor=func,     # Custom key extraction function
    burst_size=None,        # Burst size (token/leaky bucket)
    include_headers=True,   # Add rate limit headers to the response
    error_message="...",    # Custom error message
    status_code=429,        # HTTP status when limited
    skip_on_error=False,    # Skip limiting on backend errors
    cost=1,                 # Cost per request
    exempt_when=func,       # Function to check exemption
    on_blocked=func,        # Callback when a request is blocked
)
```
## Response Headers

When `include_headers=True`, responses include:

- `X-RateLimit-Limit`: Maximum requests allowed
- `X-RateLimit-Remaining`: Remaining requests in the window
- `X-RateLimit-Reset`: Unix timestamp when the limit resets
- `Retry-After`: Seconds until retry (only sent when rate limited)
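To illustrate the semantics of these headers (a standalone sketch, not the library's code): the first three are simple counters, while `Retry-After` is derived from the reset timestamp relative to the current time:

```python
import time

def rate_limit_headers(limit: int, remaining: int, reset_at: float, limited: bool) -> dict:
    """Illustrative sketch of building the headers described above."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(int(reset_at)),  # Unix timestamp of the reset
    }
    if limited:
        # Retry-After is whole seconds until the window resets, rounded up
        # and never negative.
        headers["Retry-After"] = str(max(0, int(reset_at - time.time()) + 1))
    return headers

headers = rate_limit_headers(limit=100, remaining=0, reset_at=time.time() + 30, limited=True)
```

Well-behaved clients can sleep for `Retry-After` seconds before retrying instead of polling a 429 response in a tight loop.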
## Development

See [DEVELOPMENT.md](DEVELOPMENT.md) for setting up a development environment and contributing.
## License
Apache License 2.0