FreeInference Documentation

Free LLM inference for coding agents and IDEs

FreeInference provides free access to state-of-the-art language models, tailored for coding agents and AI-powered IDEs such as Cursor and Roo Code.

Key Features

- Free Access: free inference for coding agents and development tools
- Multiple Models: access to GLM, Qwen, MiniMax, and other powerful models
- IDE Integration: easy setup with Cursor, Roo Code, Kilo Code, and more

Getting Started

  1. Get your API key - Register at https://freeinference.org and create your API key

  2. Choose your IDE - select Cursor, Roo Code, Kilo Code, or another supported tool

  3. Configure and start coding!

See the Quick Start guide for detailed setup instructions.
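Once you have an API key, requests can be made directly as well as through an IDE. The sketch below assumes FreeInference exposes an OpenAI-compatible chat-completions endpoint; the URL path, the `glm-5` model ID, and the `FREEINFERENCE_API_KEY` variable name are illustrative assumptions, so confirm all three against the Quick Start guide.

```python
import json
import os
import urllib.request

# Assumed endpoint and environment variable name -- verify against the
# Quick Start guide; these are not confirmed by this page.
API_URL = "https://freeinference.org/v1/chat/completions"
API_KEY = os.environ.get("FREEINFERENCE_API_KEY", "")


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completion request, assuming an OpenAI-compatible API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )


request = build_request("glm-5", "Write a Python hello-world script.")
# response = urllib.request.urlopen(request)  # uncomment to actually send
```

Keeping the key in an environment variable, rather than hard-coding it, matches how most IDE integrations expect credentials to be supplied.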

Available Models

| Model | Context Length | Best For |
| --- | --- | --- |
| GLM-5 (recommended) | 200K tokens | Most capable, bilingual |
| GLM-5.1 | 200K tokens | Latest GLM-5 generation |
| GLM-5 Turbo | 200K tokens | Faster GLM-5 variant |
| GLM-4.7 | 200K tokens | Long context, bilingual |
| MiniMax M2.5 (new) | 1M tokens | Ultra-long context, multimodal |
| MiniMax M2.7 | 196K tokens | Large codebases |
| Qwen3.6 27B | 65K tokens | Self-hosted code generation |
| Qwen3.6 35B | 65K tokens | Self-hosted code generation |

See Available Models for the complete list.
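When choosing a model for an agent, the context lengths in the table above are the main constraint. A small illustrative helper (model names and token limits copied from the table; these are display names, not necessarily the IDs the API expects) can filter models by required context:

```python
# Context windows from the table above (token counts). Display names only --
# the actual API model IDs may differ; check the Available Models page.
MODEL_CONTEXT = {
    "GLM-5": 200_000,
    "GLM-5.1": 200_000,
    "GLM-5 Turbo": 200_000,
    "GLM-4.7": 200_000,
    "MiniMax M2.5": 1_000_000,
    "MiniMax M2.7": 196_000,
    "Qwen3.6 27B": 65_000,
    "Qwen3.6 35B": 65_000,
}


def models_for_context(required_tokens: int) -> list[str]:
    """Return models whose context window fits the required prompt size."""
    return [m for m, limit in MODEL_CONTEXT.items() if limit >= required_tokens]


print(models_for_context(500_000))  # only the 1M-token MiniMax M2.5 qualifies
```

For a 500K-token codebase dump, for example, only MiniMax M2.5 fits; anything under 65K tokens runs on every listed model.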

Support

Need help? Check out: