High-performance load balancer and reverse proxy for AI/LLM APIs. Route requests intelligently, fail over automatically, and monitor everything in real time.
curl -fsSL https://raw.githubusercontent.com/mindbalancer/mindbalancer-labs/main/scripts/install.sh | bash
[Architecture: Any OpenAI SDK → MindBalancer (Load Balance • Route • Cache) → OpenAI, Anthropic, Ollama...]
Production-ready features for reliability, performance, and observability.
Distribute traffic across multiple providers based on weight, latency, or active connections. No more single points of failure.
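As a rough illustration (not MindBalancer's actual internals or configuration syntax), weighted and least-connections selection can be sketched in a few lines of Python:

import random
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    weight: int              # share of traffic under weighted routing
    active_connections: int = 0

providers = [Provider("openai", weight=3), Provider("anthropic", weight=1)]

def pick_weighted(pool):
    # Weighted random choice: a provider with weight 3 receives roughly
    # three times the traffic of a provider with weight 1.
    return random.choices(pool, weights=[p.weight for p in pool], k=1)[0]

def pick_least_connections(pool):
    # Route to whichever provider has the fewest in-flight requests.
    return min(pool, key=lambda p: p.active_connections)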
When a provider goes down, requests are automatically routed to healthy alternatives. Zero downtime, zero intervention.
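A minimal sketch of the failover idea, assuming hypothetical upstream URLs and the httpx client; the real health-checking logic lives inside MindBalancer:

import httpx

# Illustrative only: the URLs and retry policy below are assumptions,
# not MindBalancer's implementation.
UPSTREAMS = ["https://api.openai.com/v1", "https://backup.example.com/v1"]

def forward_with_failover(path: str, payload: dict) -> httpx.Response:
    last_error = None
    for base in UPSTREAMS:
        try:
            resp = httpx.post(f"{base}{path}", json=payload, timeout=30)
            resp.raise_for_status()
            return resp                # first healthy upstream wins
        except httpx.HTTPError as err:
            last_error = err           # treat as unhealthy, try the next one
    raise RuntimeError("all upstreams failed") from last_error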
Cache deterministic requests to reduce costs and latency. Same prompt with temperature=0? Get instant responses from cache.
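One way to build such a cache key, shown here as an illustrative sketch rather than MindBalancer's implementation, is to hash the model and messages and cache only when temperature is 0:

import hashlib
import json

cache: dict[str, str] = {}

def cache_key(model: str, messages: list, temperature: float) -> str | None:
    # Only deterministic requests (temperature=0) are safe to cache: the same
    # prompt is expected to yield the same completion. Anything else returns
    # None, meaning "do not cache".
    if temperature != 0:
        return None
    raw = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(raw.encode()).hexdigest()

# On each request: a cache hit returns immediately; a miss forwards upstream
# and stores the completion under the key before responding.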
Prometheus metrics, request logging, and a built-in web dashboard. Know exactly what's happening with your AI traffic.
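For example, assuming metrics are exposed at the conventional Prometheus /metrics path on the proxy port (check the documentation for the actual endpoint), a quick scrape looks like this:

import httpx

# Assumption: the conventional Prometheus /metrics path on the proxy port.
metrics = httpx.get("http://mindbalancer:6034/metrics").text
for line in metrics.splitlines():
    if line and not line.startswith("#"):   # skip HELP/TYPE comment lines
        print(line)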
Centralized API key management with AES-256 encryption. Per-user rate limits to prevent abuse and control costs.
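Per-user rate limiting is commonly built on a token bucket; the sketch below shows the general technique with made-up RATE and BURST values, not MindBalancer's code:

import time
from collections import defaultdict

RATE = 10    # tokens refilled per second (illustrative value)
BURST = 60   # maximum bucket size (illustrative value)

buckets: dict[str, tuple[float, float]] = defaultdict(
    lambda: (BURST, time.monotonic())
)

def allow(user: str) -> bool:
    tokens, last = buckets[user]
    now = time.monotonic()
    # Refill tokens for the time elapsed since the last request.
    tokens = min(BURST, tokens + (now - last) * RATE)
    if tokens < 1:
        buckets[user] = (tokens, now)
        return False                   # over the limit: reject the request
    buckets[user] = (tokens - 1, now)  # spend one token and admit
    return True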
MySQL-compatible admin interface. Manage servers, routing rules, and monitor stats with familiar SQL commands.
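As a hypothetical session (the port, credentials, and supported statements here are placeholders; consult the documentation for the real ones), any MySQL driver should be able to connect:

import pymysql

# All connection details below are placeholders, not documented defaults.
conn = pymysql.connect(host="mindbalancer", port=3306,
                       user="admin", password="changeme")
with conn.cursor() as cur:
    cur.execute("SHOW PROCESSLIST")   # a standard MySQL statement
    for row in cur.fetchall():
        print(row)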
Point your existing OpenAI SDK to MindBalancer. That's it. Your application doesn't need to know about multiple providers, failover logic, or caching.
from openai import OpenAI

client = OpenAI(
    base_url="http://mindbalancer:6034/v1",
    api_key="any-key",  # MindBalancer handles auth
)

response = client.chat.completions.create(
    model="gpt-4",  # Routes to the right provider
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
Use the same OpenAI-compatible API for everything.
Get started in under 5 minutes. Free and open source forever.