FreeInference Documentation
Free LLM inference for coding agents and IDEs
FreeInference provides free access to state-of-the-art language models specifically designed for coding agents like Cursor, Roo Code, and other AI-powered IDEs.
Quick Links
Quick Start - Get started in 5 minutes
IDE & Coding Agent Integrations - Configure with Cursor, Roo Code, and other coding agents
Available Models - View available models
API Headers Reference - API headers reference
Key Features
- Free Access
Free inference for coding agents and development tools
- Multiple Models
Access GLM, Qwen, MiniMax, and other powerful models
- IDE Integration
Easy setup with Cursor, Roo Code, Kilo Code, and more
Getting Started
Get your API key - Register at https://freeinference.org and create your API key
Choose your IDE:
Cursor - AI-powered code editor
Roo Code / Kilo Code - VS Code extensions
Configure and start coding!
See the Quick Start guide for detailed setup instructions.
Available Models
Model |
Context Length |
Best For |
|---|---|---|
GLM-5 recommended |
200K tokens |
Most capable, bilingual |
GLM-5.1 |
200K tokens |
Latest GLM-5 generation |
GLM-5 Turbo |
200K tokens |
Faster GLM-5 variant |
GLM-4.7 |
200K tokens |
Long context, bilingual |
MiniMax M2.5 new |
1M tokens |
Ultra-long context, multimodal |
MiniMax M2.7 |
196K tokens |
Large codebases |
Qwen3.6 27B |
65K tokens |
Self-hosted code generation |
Qwen3.6 35B |
65K tokens |
Self-hosted code generation |
See the complete Available Models list for all available models.
Support
Need help? Check out:
IDE & Coding Agent Integrations - IDE setup guides
Available Models - Available models
GitHub Issues - Report bugs or request features