2025: The Year in LLMs

TL;DR Highlight

Simon Willison's comprehensive 2025 LLM ecosystem retrospective covers reasoning models, agents, vibe coding, MCP, and everything else developers need to know.

Who Should Read

Developers who want a single well-synthesized summary of what changed in LLMs in 2025 and what it means for practitioners.

Core Mechanics

Reasoning models (o3, Claude 3.5 Sonnet thinking, Gemini 2.0 Flash Thinking) became mainstream — the ability to 'think before answering' measurably improves accuracy on complex tasks.
Agentic AI went from experimental to production — multi-step tool-using agents are now deployed in real workflows, with MCP (Model Context Protocol) emerging as a standardization layer.
Vibe coding became a real phenomenon: a meaningful fraction of shipped code in 2025 was written primarily by AI with humans in a supervisory role.
Context windows exploded — 1M+ token windows became available, changing what's possible for document processing and long-session agents.
Open-source models closed the gap with closed frontier models significantly — running frontier-tier performance locally became possible for the first time.
Multimodal capabilities (vision, audio, video) matured from toy features to practical tools in several product categories.
The MCP ecosystem grew rapidly — dozens of server implementations enabling Claude and other models to connect to external tools and data sources.

Evidence

Willison is a highly respected voice in the developer community (creator of Django, Datasette) — his annual reviews are widely read and trusted for their practicality.
The post synthesizes his own experiments plus broader community evidence, with links to specific papers, announcements, and examples throughout.
HN discussion validated most of his observations, with commenters adding specific experiences — particularly around agentic workflows and vibe coding adoption.
Several readers noted the retrospective is unusually balanced — acknowledging both genuine progress and real limitations without being either dismissive or hype-driven.

How to Apply

Use this as an orientation document for bringing teammates up to speed on the AI landscape — it's dense but well-structured.
For technical leads: use the reasoning model section to evaluate whether your current model choices are still appropriate, given how much the reasoning tier has improved.
The MCP section is particularly actionable — if you haven't evaluated the MCP ecosystem for your agent tooling needs, this is a good starting point.
For PMs: the 'vibe coding' and 'agentic AI' sections have concrete examples of what organizations are actually shipping — useful for calibrating what's realistic to build.

Terminology

Vibe codingA development workflow where the developer describes intent at a high level and AI generates most of the implementation, with the human primarily reviewing and directing.

MCP (Model Context Protocol)Anthropic's open protocol for connecting AI models to external tools, data sources, and APIs in a standardized way.

Reasoning modelAn LLM variant that generates an internal chain-of-thought before producing its final answer, improving accuracy on complex multi-step problems.