High-performance load balancer and reverse proxy for AI/LLM APIs. Route requests intelligently, fail over automatically, and monitor everything in real time.
curl -fsSL https://raw.githubusercontent.com/mindbalancer/mindbalancer-labs/main/scripts/install.sh | bash
[Architecture: Any OpenAI SDK → MindBalancer (Load Balance • Route • Cache) → OpenAI, Anthropic, Ollama...]
Production-ready features for reliability, performance, and observability.
Distribute traffic across multiple providers based on weight, latency, or active connections. No more single points of failure.
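As a rough illustration (not MindBalancer's actual internals or configuration syntax), weighted and least-connections selection can be sketched in a few lines of Python:

import random
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    weight: int              # share of traffic under weighted routing
    active_connections: int = 0

providers = [Provider("openai", weight=3), Provider("anthropic", weight=1)]

def pick_weighted(pool):
    # Weighted random choice: a provider with weight 3 receives roughly
    # three times the traffic of a provider with weight 1.
    return random.choices(pool, weights=[p.weight for p in pool], k=1)[0]

def pick_least_connections(pool):
    # Route to whichever provider has the fewest in-flight requests.
    return min(pool, key=lambda p: p.active_connections)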
When a provider goes down, requests are automatically routed to healthy alternatives. Zero downtime, zero intervention.
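A minimal sketch of the failover idea, assuming hypothetical upstream URLs and the httpx client; the real health-checking logic lives inside MindBalancer:

import httpx

# Illustrative only: the URLs and retry policy below are assumptions,
# not MindBalancer's implementation.
UPSTREAMS = ["https://api.openai.com/v1", "https://backup.example.com/v1"]

def forward_with_failover(path: str, payload: dict) -> httpx.Response:
    last_error = None
    for base in UPSTREAMS:
        try:
            resp = httpx.post(f"{base}{path}", json=payload, timeout=30)
            resp.raise_for_status()
            return resp                # first healthy upstream wins
        except httpx.HTTPError as err:
            last_error = err           # treat as unhealthy, try the next one
    raise RuntimeError("all upstreams failed") from last_error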
Cache deterministic requests to reduce costs and latency. Same prompt with temperature=0? Get instant responses from cache.
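One way to build such a cache key, shown here as an illustrative sketch rather than MindBalancer's implementation, is to hash the model and messages and cache only when temperature is 0:

import hashlib
import json

cache: dict[str, str] = {}

def cache_key(model: str, messages: list, temperature: float) -> str | None:
    # Only deterministic requests (temperature=0) are safe to cache: the same
    # prompt is expected to yield the same completion. Anything else returns
    # None, meaning "do not cache".
    if temperature != 0:
        return None
    raw = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(raw.encode()).hexdigest()

# On each request: a cache hit returns immediately; a miss forwards upstream
# and stores the completion under the key before responding.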
Prometheus metrics, request logging, and a built-in web dashboard. Know exactly what's happening with your AI traffic.
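For example, assuming metrics are exposed at the conventional Prometheus /metrics path on the proxy port (check the documentation for the actual endpoint), a quick scrape looks like this:

import httpx

# Assumption: the conventional Prometheus /metrics path on the proxy port.
metrics = httpx.get("http://mindbalancer:6034/metrics").text
for line in metrics.splitlines():
    if line and not line.startswith("#"):   # skip HELP/TYPE comment lines
        print(line)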
Centralized API key management with AES-256 encryption. Per-user rate limits to prevent abuse and control costs.
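Per-user rate limiting is commonly built on a token bucket; the sketch below shows the general technique with made-up RATE and BURST values, not MindBalancer's code:

import time
from collections import defaultdict

RATE = 10    # tokens refilled per second (illustrative value)
BURST = 60   # maximum bucket size (illustrative value)

buckets: dict[str, tuple[float, float]] = defaultdict(
    lambda: (BURST, time.monotonic())
)

def allow(user: str) -> bool:
    tokens, last = buckets[user]
    now = time.monotonic()
    # Refill tokens for the time elapsed since the last request.
    tokens = min(BURST, tokens + (now - last) * RATE)
    if tokens < 1:
        buckets[user] = (tokens, now)
        return False                   # over the limit: reject the request
    buckets[user] = (tokens - 1, now)  # spend one token and admit
    return True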
MySQL-compatible admin interface. Manage servers, routing rules, and monitor stats with familiar SQL commands.
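As a hypothetical session (the port, credentials, and supported statements here are placeholders; consult the documentation for the real ones), any MySQL driver should be able to connect:

import pymysql

# All connection details below are placeholders, not documented defaults.
conn = pymysql.connect(host="mindbalancer", port=3306,
                       user="admin", password="changeme")
with conn.cursor() as cur:
    cur.execute("SHOW PROCESSLIST")   # a standard MySQL statement
    for row in cur.fetchall():
        print(row)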
Point your existing OpenAI SDK to MindBalancer. That's it. Your application doesn't need to know about multiple providers, failover logic, or caching.
from openai import OpenAI

client = OpenAI(
    base_url="http://mindbalancer:6034/v1",
    api_key="any-key",  # MindBalancer handles auth
)

response = client.chat.completions.create(
    model="gpt-4",  # Routes to the right provider
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
Use the same OpenAI-compatible API for everything.
Get started in under 5 minutes. Free and open source forever.