60 free models · 11 providers · self-hosted

One key.
One billion free tokens.
Every month.

A self-hosted, OpenAI-compatible proxy that routes across every credible free-tier LLM provider. Bring your own keys; we just point requests at whichever provider still has budget left.
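The routing idea can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual implementation: the `Provider` class, `pick_provider`, `record_usage`, and the "most remaining budget wins" rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str            # hypothetical label, e.g. "mistral"
    monthly_budget: int  # free tokens per month (illustrative figure)
    used: int = 0        # tokens consumed so far this month

    @property
    def remaining(self) -> int:
        # Never report a negative budget, even if usage overshoots.
        return max(self.monthly_budget - self.used, 0)


def pick_provider(providers, estimated_tokens):
    """Return the provider with the most remaining budget that can
    still cover the request, or None if no provider can."""
    candidates = [p for p in providers if p.remaining >= estimated_tokens]
    return max(candidates, key=lambda p: p.remaining, default=None)


def record_usage(provider, tokens):
    """Charge a completed request against the provider's monthly budget."""
    provider.used += tokens
```

A caller would estimate the request size, call `pick_provider`, forward the request to that provider, then call `record_usage` with the actual token count. A real router would likely also track per-minute and per-day limits, which this sketch leaves out.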

Aggregate monthly free budget
1B tokens · per month
Across 60 free-tier models on 11 providers. Mistral alone contributes ~1B; everything else is bonus.
The catalog

Every free tier worth using

Only providers with recurring free quotas, no credit card required, and a self-serve API.

Google Gemini
~3M/mo · per model
Gemini 2.5 Flash · Flash-Lite · 3.1 Pro/Flash previews. 20 RPD per model.
OpenRouter
~6M/mo · per model
19 :free models — DeepSeek, Kimi, Qwen, Llama, Gemma, Nemotron, Tencent HY3 …
Cerebras
~30M/mo · shared
Qwen3 235B · GPT-OSS 120B · Llama 3.1 8B. 1M TPD · 30 RPM. Fastest tokens you'll ever see.
Groq
~30M/mo · per model
Llama 3.3 70B · Llama 4 Scout · GPT-OSS 120B/20B · Qwen3 32B. 1000 RPD.
Mistral La Plateforme
~1B/mo · shared
Mistral Large/Medium · Codestral · Devstral · Magistral. The biggest free pool of any provider.
GitHub Models
~18M/mo · est.
GPT-4.1 · GPT-4o. 50 RPD on the free Copilot tier. Higher caps with paid Copilot.
SambaNova
~3M/mo · shared
DeepSeek V3.1/V3.2 · Llama 4 Maverick · Llama 3.3 70B · Gemma 3 12B · GPT-OSS 120B.
Cloudflare Workers AI
~20M/mo · shared
Kimi K2.5/K2.6 · Qwen3 30B · GLM-4.7 Flash · Llama 4 Scout · IBM Granite 4.0. 10K Neurons/day.
Z.ai (Zhipu)
~30M/mo · shared
GLM-4.5 Flash · GLM-4.7 Flash. Both :free — perpetually, no card.
Cohere
~1-2M/mo · shared
Command R+. Trial key: 1000 calls/month, 20 RPM.
NVIDIA NIM
credits · disabled
Llama 3.1 70B. Disabled by default — credit-based, not recurring.
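As a rough check on how daily caps relate to the monthly figures in the catalog, the arithmetic can be sketched as follows. The 30-day month and the tokens-per-request value are assumptions for illustration; the catalog does not state how its estimates were derived, and the helper names are hypothetical.

```python
DAYS_PER_MONTH = 30  # assumption: a flat 30-day month


def monthly_from_tpd(tokens_per_day, days=DAYS_PER_MONTH):
    """Monthly token budget implied by a tokens-per-day (TPD) cap."""
    return tokens_per_day * days


def monthly_from_rpd(requests_per_day, tokens_per_request, days=DAYS_PER_MONTH):
    """Monthly token budget implied by a requests-per-day (RPD) cap,
    given an assumed average request size in tokens."""
    return requests_per_day * tokens_per_request * days


# Cerebras: 1M TPD over 30 days matches the listed ~30M/mo.
cerebras_monthly = monthly_from_tpd(1_000_000)

# Groq: 1000 RPD only becomes a token figure once a request size is assumed.
# 1,000 tokens per request is a guess, not a number from the catalog.
groq_monthly_per_model = monthly_from_rpd(1000, tokens_per_request=1000)
```

Request-capped providers are the least certain entries: halving or doubling the assumed request size halves or doubles the estimate.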
Three minutes to first token

Quick start

1. Clone & install

Node 20+ required. better-sqlite3 brings prebuilt binaries — no compile step.

2. Drop in your keys

Sign up for free at each provider, then paste the keys into the dashboard. No credit card needed for any of them.

3. Point your SDK

Set base_url to your local proxy (http://localhost:3001/v1). Any OpenAI-compatible client works: Cursor, the OpenAI SDK, curl, anything.

~/freellmapi · zsh
# 1. clone, install, run
$ git clone https://github.com/tashfeenahmed/freellmapi.git
$ cd freellmapi
$ npm install && npm run dev

# 2. open http://localhost:3001 — paste keys in the UI

# 3. call it like OpenAI
$ curl http://localhost:3001/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages":[{"role":"user","content":"hi"}]}'
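
The same request can be made from Python using only the standard library. This is a sketch that mirrors the curl call above: it sends no Authorization header and no model field because the curl example sends neither, and the helper names `build_chat_request` and `chat` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3001/v1"  # local proxy address from the quick start


def build_chat_request(content, base_url=BASE_URL):
    """Build the same POST the curl example sends, without sending it."""
    body = json.dumps(
        {"messages": [{"role": "user", "content": content}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(content, base_url=BASE_URL, timeout=60):
    """Send the request to the running proxy and return the parsed JSON."""
    request = build_chat_request(content, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the proxy running, `chat("hi")` returns the decoded JSON body. Since the proxy is described as OpenAI-compatible, the reply text would presumably sit under `choices[0]["message"]["content"]`, but that shape is an assumption worth checking against a real response.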