FreeLLMAPI — One key. One billion free LLM tokens. Every month.

The catalog

Every free tier worth using

Only providers with recurring free quotas, no credit card required, and a self-serve API.

Google Gemini

~3M/mo per model

Gemini 2.5 Flash · Flash-Lite · 3.1 Pro/Flash previews. 20 RPD per model.

OpenRouter

~6M/mo per model

19 :free models — DeepSeek, Kimi, Qwen, Llama, Gemma, Nemotron, Tencent HY3 …

Cerebras

~30M/mo shared

Qwen3 235B · GPT-OSS 120B · Llama 3.1 8B. 1M TPD · 30 RPM. Fastest tokens you'll ever see.

Groq

~30M/mo per model

Llama 3.3 70B · Llama 4 Scout · GPT-OSS 120B/20B · Qwen3 32B. 1000 RPD.

Mistral La Plateforme

~1B/mo shared

Mistral Large/Medium · Codestral · Devstral · Magistral. The biggest free pool of any provider.

GitHub Models

~18M/mo est.

GPT-4.1 · GPT-4o. 50 RPD on the free Copilot tier. Higher caps with paid Copilot.

SambaNova

~3M/mo shared

DeepSeek V3.1/V3.2 · Llama 4 Maverick · Llama 3.3 70B · Gemma 3 12B · GPT-OSS 120B.

Cloudflare Workers AI

~20M/mo shared

Kimi K2.5/K2.6 · Qwen3 30B · GLM-4.7 Flash · Llama 4 Scout · IBM Granite 4.0. 10K Neurons/day.

Z.ai (Zhipu)

~30M/mo shared

GLM-4.5 Flash · GLM-4.7 Flash. Both :free — perpetually, no card.

Cohere

~1-2M/mo shared

Command R+. Trial key: 1000 calls/month, 20 RPM.

NVIDIA NIM

credits · disabled

Llama 3.1 70B. Disabled by default — credit-based, not recurring.
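As a rough sanity check on the headline figure, the monthly estimates in the catalog can be tallied. This is a minimal sketch, not part of FreeLLMAPI: it assumes each "per model" figure is counted for a single model only, takes the lower end of Cohere's "~1-2M" range, and assumes a 30-day month when converting Cerebras's daily cap. The names `CATALOG_M_PER_MONTH` and `monthly_from_daily` are hypothetical.

```python
# Monthly free-token estimates in millions, copied from the catalog above.
# Assumptions: "per model" figures are counted once (one model each), and
# Cohere's "~1-2M" uses its lower end, so the total is a conservative floor.
CATALOG_M_PER_MONTH = {
    "Google Gemini": 3,
    "OpenRouter": 6,
    "Cerebras": 30,
    "Groq": 30,
    "Mistral La Plateforme": 1000,
    "GitHub Models": 18,
    "SambaNova": 3,
    "Cloudflare Workers AI": 20,
    "Z.ai (Zhipu)": 30,
    "Cohere": 1,
    # NVIDIA NIM is left out: it is credit-based, not a recurring quota.
}


def monthly_from_daily(tokens_per_day: int, days: int = 30) -> int:
    """Convert a daily token cap into a monthly estimate (30-day month assumed)."""
    return tokens_per_day * days


total_m = sum(CATALOG_M_PER_MONTH.values())
mistral_share = CATALOG_M_PER_MONTH["Mistral La Plateforme"] / total_m

print(f"Catalog floor: ~{total_m}M tokens/month")
print(f"Mistral share of that floor: {mistral_share:.0%}")
print(f"Cerebras 1M TPD over 30 days: {monthly_from_daily(1_000_000):,} tokens")
```

Under these assumptions the floor lands a little above one billion tokens per month, and most of it comes from the Mistral pool, so the practical total depends heavily on that one provider.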

Three minutes to first token

Quick start

Clone & install

Node 20+ required. better-sqlite3 brings prebuilt binaries — no compile step.

Drop in your keys

Point your SDK

Set base_url to your local proxy (http://localhost:3001/v1 with the default port). Use any OpenAI-compatible client: Cursor, the OpenAI SDK, curl, anything.


# 1. clone, install, run
$ git clone https://github.com/tashfeenahmed/freellmapi.git
$ cd freellmapi
$ npm install && npm run dev

# 2. open http://localhost:3001 — paste keys in the UI

# 3. call it like OpenAI
$ curl http://localhost:3001/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages":[{"role":"user","content":"hi"}]}'
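The same request can be sent from Python without any third-party package. The sketch below mirrors the curl call using only the standard library. It assumes the proxy is listening on localhost:3001 as in the steps above and that the request needs no extra headers; `build_chat_request` and `send` are hypothetical helper names, not part of FreeLLMAPI.

```python
import json
import urllib.request

# Assumption: the proxy runs locally on port 3001, as in the curl example.
BASE_URL = "http://localhost:3001/v1"


def build_chat_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request with the same body and header as the curl call."""
    payload = {"messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the proxy running, `print(send(build_chat_request("hi")))` prints the decoded response. Building the request separately from sending it makes it possible to inspect the URL and body before anything touches the network.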