Mistral releases Devstral 2 and Mistral Vibe CLI

TL;DR Highlight

Mistral released Devstral 2 (123B, open-source, SWE-bench 72.2%) and a locally runnable Devstral Small — serious open-source coding agent competition.

Who Should Read

Devs building coding agents or AI-assisted development tools, and engineers evaluating whether to self-host coding models vs paying for API access.

Core Mechanics

Devstral 2 is Mistral's 123B open-source coding model hitting 72.2% on SWE-bench Verified — one of the highest scores for an open-source model on this benchmark.
Devstral Small is the locally runnable variant, designed to run on consumer hardware while still being competitive for coding tasks.
Both models are released under an open-source license, making them viable for self-hosted coding agent pipelines without API cost overhead.
The models are specifically fine-tuned for agentic coding scenarios — file editing, bash execution, multi-step debugging — not just code completion.
A SWE-bench Verified score of 72.2% puts Devstral 2 ahead of many closed models and reportedly comparable to GPT-5.2-Codex on that specific benchmark.
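To make the agentic loop described above concrete (read a file, edit it, run a check, iterate), here is a minimal sketch. Everything in it is an illustrative assumption: the tool names, the in-memory file store, and `scripted_policy`, which stands in for the model and replays a fixed plan. It is not Mistral's harness or the Vibe CLI's internals.

```python
from typing import Callable

# In-memory "workspace" so the sketch runs without touching disk.
# The buggy file below is a made-up example.
def make_workspace() -> dict[str, str]:
    return {"calc.py": "def add(a, b):\n    return a - b\n"}

# --- Hypothetical tools an agent harness might expose -----------------
def read_file(files: dict[str, str], path: str) -> str:
    return files.get(path, "")

def edit_file(files: dict[str, str], path: str, content: str) -> str:
    files[path] = content
    return "ok"

def run_tests(files: dict[str, str]) -> str:
    # Stand-in for "bash execution": load the module and check one case.
    namespace: dict = {}
    exec(files["calc.py"], namespace)
    return "pass" if namespace["add"](2, 3) == 5 else "fail"

TOOLS: dict[str, Callable[..., str]] = {
    "read_file": read_file,
    "edit_file": edit_file,
    "run_tests": run_tests,
}

def scripted_policy(history: list) -> tuple[str, tuple]:
    """Stand-in for the model: replays a fixed plan, one action per step."""
    script = [
        ("read_file", ("calc.py",)),
        ("edit_file", ("calc.py", "def add(a, b):\n    return a + b\n")),
        ("run_tests", ()),
        ("done", ()),
    ]
    return script[len(history)]

def run_agent(policy, files: dict[str, str], max_steps: int = 10) -> list:
    """Ask the policy for an action, run the tool, feed the observation back."""
    history: list = []
    for _ in range(max_steps):
        tool, args = policy(history)
        if tool == "done":
            break
        observation = TOOLS[tool](files, *args)
        history.append((tool, args, observation))
    return history
```

Calling `run_agent(scripted_policy, make_workspace())` returns a three-step history ending in a passing test. In a real harness the policy would be a model call that sees the history, and the tools would run in a sandbox.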

Evidence

Benchmark numbers from Mistral's release post show Devstral 2 at 72.2% SWE-bench Verified, with community verification ongoing.
HN commenters were enthusiastic about the open-source release, noting that 72% on SWE-bench Verified is a level that until recently only frontier closed models reached.
Some commenters were skeptical of SWE-bench as a complete proxy for real-world coding agent usefulness, since its task distribution may not reflect typical developer work.
Commenters also discussed running a 4-bit quantized Devstral Small on consumer hardware (M-series Macs, consumer GPUs) with acceptable latency.
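As a rough illustration of why quantization matters for local use, the weight footprint can be estimated as parameters times bits per weight, divided by eight. The sketch below is back-of-the-envelope only: it ignores KV cache, activations, and runtime overhead. The only model size used is the 123B figure stated above, since this summary does not give Devstral Small's parameter count (pass your own value for that).

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Ignores KV cache, activations, and framework overhead, so real
    memory use will be higher.
    """
    return num_params * bits_per_weight / 8 / 1e9

DEVSTRAL_2_PARAMS = 123e9  # 123B, as stated in the release summary

def footprint_table(num_params: float) -> dict[int, float]:
    """Weight footprint at common precisions: 16-bit, 8-bit, 4-bit."""
    return {bits: weight_memory_gb(num_params, bits) for bits in (16, 8, 4)}

if __name__ == "__main__":
    for bits, gb in footprint_table(DEVSTRAL_2_PARAMS).items():
        print(f"{bits}-bit weights: ~{gb:.1f} GB")
```

By this estimate the 123B model needs about 61.5 GB for weights alone even at 4 bits, which is why the smaller variant is the one discussed for Macs and consumer GPUs.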

How to Apply

Swap Devstral 2 into your coding agent harness (e.g., SWE-agent, OpenHands) and benchmark against your closed-model baseline — the open-source gap has closed significantly.
For teams with on-prem requirements or data privacy constraints, Devstral Small running locally is now a credible option for AI-assisted coding.
Use Devstral 2 (optionally paired with a long-context model) for the planning phase and Devstral Small for quick local edits, a cost-effective hybrid agent architecture.
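One way to read the hybrid setup above is as a simple router: planning-heavy or wide-reaching tasks go to the large model, and small edits stay on the local one. The sketch below assumes that reading. The `Task` fields, the tier labels, and the two-file threshold are all hypothetical placeholders, not real model IDs or recommended settings.

```python
from dataclasses import dataclass

# Hypothetical tier labels; map these to whatever endpoints you actually run.
PLANNER_TIER = "large-hosted (e.g. Devstral 2)"
LOCAL_TIER = "small-local (e.g. Devstral Small)"

@dataclass
class Task:
    description: str
    files_touched: int      # how many files the change is expected to span
    needs_planning: bool    # multi-step design work vs. a direct edit

def choose_tier(task: Task, local_file_limit: int = 2) -> str:
    """Route planning or wide changes to the large model, the rest locally.

    The threshold is an arbitrary starting point to tune against your
    own workload.
    """
    if task.needs_planning or task.files_touched > local_file_limit:
        return PLANNER_TIER
    return LOCAL_TIER

def route_all(tasks: list[Task]) -> dict[str, list[str]]:
    """Group task descriptions by the tier they would be sent to."""
    routed: dict[str, list[str]] = {PLANNER_TIER: [], LOCAL_TIER: []}
    for task in tasks:
        routed[choose_tier(task)].append(task.description)
    return routed
```

For example, `choose_tier(Task("rename a variable", 1, False))` returns the local tier, while a task flagged `needs_planning=True` goes to the planner tier regardless of size. A heuristic router like this is easy to audit, and measuring how often the local tier's output needs rework tells you whether the threshold should move.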

Code Example

# Install Mistral Vibe CLI
curl -LsSf https://mistral.ai/vibe/install.sh | bash

# Use Devstral 2 with llm CLI
llm install llm-mistral
llm mistral refresh
llm -m mistral/devstral-2512 "Generate an SVG of a pelican riding a bicycle"

# For Nix users
nix run github:numtide/llm-agents.nix#mistral-vibe

Terminology

SWE-bench Verified: A benchmark where the model must autonomously fix real GitHub issues. 'Verified' means the test cases were human-reviewed for quality.

Agentic coding: AI-assisted coding where the model can take multi-step actions (reading files, running tests, editing code, and iterating), going beyond single-shot code generation.