FreeInference Documentation

Free LLM inference for coding agents and IDEs

FreeInference provides free access to state-of-the-art language models, tailored for coding agents and AI-powered IDEs such as Cursor and Roo Code.

Key Features

- Free Access: free inference for coding agents and development tools
- Multiple Models: access to GLM, Qwen, MiniMax, and other powerful models
- IDE Integration: easy setup with Cursor, Roo Code, Kilo Code, and more

Getting Started

  1. Get your API key - Register at https://freeinference.org and create your API key

  2. Choose your IDE - select Cursor, Roo Code, Kilo Code, or another supported tool

  3. Configure and start coding!

See the Quick Start guide for detailed setup instructions.
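Once you have an API key, requests can be made directly as well as through an IDE. The sketch below assumes FreeInference exposes an OpenAI-compatible chat-completions endpoint; the URL path, the `glm-5` model ID, and the `FREEINFERENCE_API_KEY` variable name are illustrative assumptions, so confirm all three against the Quick Start guide.

```python
import json
import os
import urllib.request

# Assumed endpoint and environment variable name -- verify against the
# Quick Start guide; these are not confirmed by this page.
API_URL = "https://freeinference.org/v1/chat/completions"
API_KEY = os.environ.get("FREEINFERENCE_API_KEY", "")


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completion request, assuming an OpenAI-compatible API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )


request = build_request("glm-5", "Write a Python hello-world script.")
# response = urllib.request.urlopen(request)  # uncomment to actually send
```

Keeping the key in an environment variable, rather than hard-coding it, matches how most IDE integrations expect credentials to be supplied.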

Available Models

| Model | Context Length | Best For |
| --- | --- | --- |
| GLM-5 (recommended) | 200K tokens | Most capable, bilingual |
| GLM-5.1 | 200K tokens | Latest GLM-5 generation |
| GLM-5 Turbo | 200K tokens | Faster GLM-5 variant |
| GLM-4.7 | 200K tokens | Long context, bilingual |
| MiniMax M2.5 (new) | 1M tokens | Ultra-long context, multimodal |
| MiniMax M2.7 | 196K tokens | Large codebases |
| Qwen3.6 27B | 65K tokens | Self-hosted code generation |
| Qwen3.6 35B | 65K tokens | Self-hosted code generation |

See Available Models for the complete list.
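When choosing a model for an agent, the context lengths in the table above are the main constraint. A small illustrative helper (model names and token limits copied from the table; these are display names, not necessarily the IDs the API expects) can filter models by required context:

```python
# Context windows from the table above (token counts). Display names only --
# the actual API model IDs may differ; check the Available Models page.
MODEL_CONTEXT = {
    "GLM-5": 200_000,
    "GLM-5.1": 200_000,
    "GLM-5 Turbo": 200_000,
    "GLM-4.7": 200_000,
    "MiniMax M2.5": 1_000_000,
    "MiniMax M2.7": 196_000,
    "Qwen3.6 27B": 65_000,
    "Qwen3.6 35B": 65_000,
}


def models_for_context(required_tokens: int) -> list[str]:
    """Return models whose context window fits the required prompt size."""
    return [m for m, limit in MODEL_CONTEXT.items() if limit >= required_tokens]


print(models_for_context(500_000))  # only the 1M-token MiniMax M2.5 qualifies
```

For a 500K-token codebase dump, for example, only MiniMax M2.5 fits; anything under 65K tokens runs on every listed model.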

Support

Need help? Check out: