Two frontier models reshape the landscape
Claude Mythos Preview (April 7)
Anthropic’s most powerful model. Scores 93.9% on SWE-bench Verified (previous frontier ~85%) and 97.6% on USAMO 2026. Autonomously discovers and chains zero-day exploits across every major OS and browser. Has found thousands of previously unknown vulnerabilities.
Not generally available. Deployed via Project Glasswing to 50+ tech companies with $100M+ in credits for defensive security. Partners include NVIDIA, Amazon, Apple, Google, Microsoft, CrowdStrike.
Implications:
- New deployment model: directed use-case access, not open API
- SWE-bench ceiling moved dramatically (93.9% vs ~85%)
- A model judged too capable for general release sets a precedent for capability-based access restrictions
- When these capabilities reach Claude Code, “coding agent” means something different
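One way to read the SWE-bench jump is through the share of tasks left unresolved rather than the share solved. The sketch below is only an illustration: it treats the previous frontier as exactly 85%, although the figure above is given as "~85%", so the result is approximate.

```python
def relative_error_reduction(old_pct: float, new_pct: float) -> float:
    """Fraction by which the unresolved share shrinks when the score rises."""
    old_fail = 100.0 - old_pct
    new_fail = 100.0 - new_pct
    return (old_fail - new_fail) / old_fail


old_score = 85.0  # assumption: "~85%" taken as exactly 85
new_score = 93.9  # figure quoted above

reduction = relative_error_reduction(old_score, new_score)
print(f"Unresolved share: {100 - old_score:.1f}% -> {100 - new_score:.1f}%")
print(f"Relative reduction in unresolved tasks: {reduction:.0%}")
```

Under that assumption the unresolved share falls from 15.0% to 6.1%, a drop of roughly 59%, which is a larger shift than the roughly nine-point headline gap suggests.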
Meta Muse Spark (April 8)
First model from Meta Superintelligence Labs (Alexandr Wang). Small and fast. Natively multimodal. Tool use and multi-agent orchestration built in. “Contemplating mode” runs a squad of agents in parallel.
Proprietary. Private API only. After releasing Llama 1 through 4 as open-weight models, Meta goes closed.
Implications:
- Open-weight landscape contracts — Google (Gemma), Alibaba (Qwen), Zhipu (GLM) now primary producers
- Multi-agent as native model capability, not harness feature
- Llama future unclear — open-weight era may be ending at Meta
- No local option for consumer hardware
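For contrast with multi-agent orchestration as a native model capability, here is what the harness-level version typically looks like: the calling code fans a task out to several agents and gathers the results. This is a minimal sketch, not a description of how Muse Spark works internally. `run_agent`, `fan_out`, and the agent names are hypothetical, and the model call is a stub.

```python
import asyncio


async def run_agent(name: str, task: str) -> str:
    # Hypothetical stand-in for a model call; a real harness would call an API here.
    await asyncio.sleep(0)
    return f"{name}: draft answer for {task!r}"


async def fan_out(task: str, agent_names: list) -> list:
    # Launch every agent concurrently and collect results in input order.
    return await asyncio.gather(*(run_agent(n, task) for n in agent_names))


results = asyncio.run(
    fan_out("summarize the release notes", ["planner", "coder", "critic"])
)
for line in results:
    print(line)
```

When orchestration moves into the model, this fan-out and gather logic no longer lives in user code, which is the shift the second implication describes.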
Cross-cutting
The two announcements share a pattern: frontier models moving away from open access. Anthropic withholds for safety. Meta withholds for competitive advantage. The result is the same — the most capable models are not available for local inference or unrestricted API access.
The open-weight community (Gemma 4, Qwen 3.5, community fine-tuners) becomes more important as a hedge against this trend. Google’s Apache 2.0 shift for Gemma 4 looks prescient.
Cursor Bugbot update (April 8)
Bugbot learns from PR feedback and applies the learned rules to future reviews. It can use MCP tools to pull in review context. Reported resolution rate: 78%. A concrete case of agents improving themselves from task-specific data.
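The feedback loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not Cursor's implementation: `RuleStore`, the vote threshold, and the use of plain substring rules are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RuleStore:
    """Hypothetical store of review rules learned from PR feedback."""

    min_votes: int = 2  # assumed threshold before a rule becomes active
    votes: dict = field(default_factory=lambda: defaultdict(int))

    def record_feedback(self, rule: str, accepted: bool) -> None:
        # Accepted review comments strengthen a rule; dismissed ones weaken it.
        self.votes[rule] += 1 if accepted else -1

    def active_rules(self) -> list:
        return sorted(r for r, v in self.votes.items() if v >= self.min_votes)

    def review(self, diff_lines: list) -> list:
        # Flag each added line that contains the text of an active rule.
        findings = []
        for number, line in enumerate(diff_lines, start=1):
            for rule in self.active_rules():
                if line.startswith("+") and rule in line:
                    findings.append((number, rule))
        return findings


store = RuleStore()
store.record_feedback("print(", accepted=True)
store.record_feedback("print(", accepted=True)
store.record_feedback("TODO", accepted=True)
store.record_feedback("TODO", accepted=False)

diff = ["+print(result)", " unchanged = 1", "+# TODO: tidy up"]
findings = store.review(diff)
print(findings)
```

Here only the `print(` rule has enough accepted feedback to become active, so only the first diff line is flagged. A production system would presumably learn far richer rules than substrings.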