2025-12-10 · Cursor

Debug Mode, Plan Mode Improvements, Multi-Agent Judging, and Pinned Chats

agents


Source: Cursor · Date: 2025-12-10 · URL: https://cursor.com/changelog/2-2

Summary

Cursor 2.2 ships Debug Mode (runtime log instrumentation across stacks and models), a Browser Visual Editor for real-time CSS and component tree editing, Plan Mode with inline Mermaid diagrams and per-task agent delegation, and Multi-Agent Judging (automatic evaluation of parallel agent solutions, with a recommended approach). Plans are now saved to disk as editable files. AWS Bedrock performance improvements land in the same release.

Implications

Agent-IDE feature race. Multi-Agent Judging is the most distinctive capability here: automatic evaluation and recommendation across parallel agent solutions moves Cursor toward an orchestration layer that decides which agent output wins. In hindsight, this is the /best-of-n concept from 3.0 already visible in 2.2, which points to a consistent architectural thread. Aider and Codex CLI have no equivalent.
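To make the best-of-n idea concrete, here is a minimal Python sketch of a judge that scores several parallel candidate solutions and recommends one. It illustrates the general pattern only and is not Cursor's implementation: the `Candidate` and `Verdict` types, the `judge_best_of_n` function, and the toy `shortest_wins` criterion are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Candidate:
    agent: str      # hypothetical label for the agent that produced the output
    solution: str   # that agent's proposed solution


@dataclass(frozen=True)
class Verdict:
    winner: Candidate
    scores: dict[str, float]  # score per agent label, kept for inspection


def judge_best_of_n(
    candidates: Sequence[Candidate],
    score: Callable[[Candidate], float],
) -> Verdict:
    """Score every candidate and recommend the highest-scoring one."""
    if not candidates:
        raise ValueError("need at least one candidate")
    scores = {c.agent: score(c) for c in candidates}
    # max() returns the first maximal element, so ties go to the earliest candidate.
    winner = max(candidates, key=lambda c: scores[c.agent])
    return Verdict(winner=winner, scores=scores)


def shortest_wins(c: Candidate) -> float:
    # Toy criterion for illustration only: prefer shorter solutions.
    return -float(len(c.solution))


candidates = [
    Candidate("agent-a", "x = compute()\nreturn x"),
    Candidate("agent-b", "return compute()"),
]
verdict = judge_best_of_n(candidates, shortest_wins)
print(verdict.winner.agent)  # agent-b
```

Passing the scoring function in as a parameter is a deliberate choice in this sketch: it is the point at which a user-tunable judge would plug in, if the evaluation criteria were ever exposed.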

Model integration cadence. AWS Bedrock performance improvements signal that Cursor is investing in enterprise model routing beyond the direct Anthropic and OpenAI APIs. Bedrock support broadens the procurement surface for enterprise accounts with existing AWS relationships, a meaningful go-to-market (GTM) differentiator.

Watch: Whether Multi-Agent Judging exposes the evaluation criteria (enabling users to tune the judge) and whether Debug Mode’s runtime instrumentation expands into persistent observability (logs, traces, metrics).
