North Mini Code gives developers a small active-footprint coding model for agentic software engineering, terminal work, and code generation. This guide explains what it is, how it works, and where you can use it.

What Is North Mini Code?
North Mini Code is an open-weights coding model from Cohere and Cohere Labs.
Cohere built it as a 30B total parameter Mixture-of-Experts model with 3B active parameters. Developers can use it for code generation, repo-level code changes, terminal-based agent tasks, and local coding workflows.
The official model name is North-Mini-Code-1.0. The API model ID is north-mini-code-1-0.
Overview
| Item | Details |
|---|---|
| Model name | North Mini Code |
| Official model ID | north-mini-code-1-0 |
| Developer | Cohere and Cohere Labs |
| Model type | Sparse Mixture-of-Experts coding model |
| Size | 30B total parameters, 3B active parameters |
| Input | Text |
| Output | Text |
| Context window | 256K tokens |
| Max output | 64K tokens |
| License | Apache 2.0 |
| Open weights | Yes |
| API access | Cohere Chat V2, Chat V1, and Chat Completions |
| Pricing | Free until rate limits are reached; production use can run through Cohere Model Vault |
| Local deployment | Supported through tools such as vLLM and SGLang |
Features
Agentic Coding Focus
Cohere trained North Mini Code for agentic coding tasks. Developers can use it inside coding agents that inspect files, edit code, run commands, and continue across multiple steps.
Small Active Footprint
The model has 30B total parameters but activates 3B parameters during inference. That design helps teams test coding agents with lower inference cost than a dense model of the same total size.
Long Context for Repositories
North Mini Code supports a 256K token context window. Developers can give it larger code snippets, logs, file contents, and task instructions in one session.
Large Output Budget
The model supports up to 64K output tokens. That helps with long patches, multi-file explanations, generated scripts, and terminal-style task traces.
Tool Use for Coding Agents
North Mini Code supports tool use through chat templates in Transformers. Agent builders can pass tool descriptions with JSON schema and let the model call tools such as shell commands.
Local Deployment Options
Developers can run North Mini Code with local serving tools such as vLLM and SGLang. This matters for teams that want more control over infrastructure, latency, or data handling.
Use Cases
Repo-Level Code Changes
A software engineer can use North Mini Code in an agent harness to inspect a repository, understand a bug report, edit files, and run tests. This fits workflows built around SWE-Agent or OpenCode-style tools.
Terminal-Based Automation
A developer tools team can connect North Mini Code to shell tools for multi-step terminal tasks. The model can plan commands, read command output, and continue the task inside an agent loop.
Local Code Generation
An individual developer can run the model locally for Python scripts, utilities, and code examples. The 3B active parameter footprint makes local testing more practical than using a larger dense coding model.
Scientific and Algorithmic Coding
A researcher can use North Mini Code for programming tasks that need reasoning over code, data structures, or algorithms. This use case works outside a full agent loop.
Internal Developer Agents
A platform team can use North Mini Code as the model behind an internal coding assistant. The model can support code edits, terminal actions, and structured tool calls when the team wraps it in the right agent system.
Should You Use North Mini Code?
Use North Mini Code when you need an open-weights coding model for agentic software engineering, terminal tasks, or local code generation.
It makes the most sense for developers, agent builders, and engineering teams that want to test coding workflows with a model designed for tool use and long-context work.